A review of preclinical research on a now widely used cancer drug suggests the studies contain multiple methodological flaws and overestimate the drug’s benefits.
Specifically, the researchers found that most studies didn’t randomize treatments, didn’t blind investigators to which animals were receiving the drug, and tested tumors in only one animal model, which limits the applicability of the results. Importantly, they also found evidence that publication bias — keeping studies that found no evidence of benefit from the drug, sunitinib, out of the literature — may have caused a significant overestimation of its effects on various tumor types.
Together, these findings suggest the need for a set of standards for cancer researchers to follow, the journal notes in a “digest” of the paper, published Tuesday by eLife:
Researchers studying certain medical conditions (such as strokes) have already developed, and now routinely implement, a set of standards for the design and reporting of preclinical research. It now appears that the cancer research community should do the same.
The research focused on reports of 158 experiments, which tested the effects of sunitinib using 2,716 animals. Sunitinib, marketed under the name Sutent, is approved for tumors in the kidney, gastrointestinal system and pancreas.
Here are more details about what the authors found:
This systematic review and meta-analysis revealed that many common practices, like randomization, were rarely implemented. Few of the published studies used ‘blinding’, whereby information about which animals are receiving the drug and which animals are receiving the control is kept from the experimenter, until after the test; this technique can help prevent any expectations or personal preferences from biasing the results. Furthermore, most tumors were tested in only one model system, namely mice that had been injected with specific human cancer cells. This makes it difficult to rule out that any anti-cancer activity was in fact unique to that single model.
It’s not clear whether these issues affected clinical trials of the drug, which was first approved by the U.S. Food and Drug Administration in 2006, the authors note:
Though we did not perform a systematic review to estimate clinical effect sizes for sunitinib against various malignancies, a perusal of the clinical literature suggests little relationship between pooled effect sizes and demonstrated clinical activity. For instance, sunitinib monotherapy is highly active in RCC patients (Motzer et al., 2006a, 2006b) and yet showed a relatively small preclinical effect; in contrast, sunitinib monotherapy was inactive against small cell lung cancer in a phase 2 trial (Han et al., 2013), but showed relatively large preclinical effects.
The authors:
…went on to find evidence that suggests that the anti-cancer effects of sunitinib might have been overestimated by as much as 45% because those studies that found no or little anti-cancer effect were simply not published.
So how do you calculate the effect of studies you can’t see because they were never published? Very carefully, using a range of statistical tools, said study author Jonathan Kimmelman of McGill University. He explained them in depth (they’re also described in the paper itself), but also gave us the short answer:
The simple idea is that you expect to see a certain number of small studies that show either negative effects, or very small effects, due to random variation. If you fail to see an expected number of small studies showing small or negative effects, this can suggest that some small studies are remaining unpublished. And you can “guess” the proper effect size by imputing missing studies.
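For the statistically inclined, here is a rough sketch of the kind of calculation Kimmelman is describing: a “trim and fill” style adjustment that estimates how many small studies appear to be missing from one side of the funnel plot, imputes mirror-image stand-ins for them, and re-pools. To be clear, this is our own illustrative Python with invented numbers, not the code or necessarily the exact method used in the paper; trim and fill is just one common way to do this kind of imputation.

```python
# A rough, illustrative sketch of the "trim and fill" idea (Duval & Tweedie):
# estimate how many small studies appear to be missing from one side of the
# funnel plot, impute mirror-image stand-ins for them, and re-pool.
# All numbers below are invented; this is not the paper's code or data.
import numpy as np

def pooled_effect(effects, variances):
    """Fixed-effect (inverse-variance weighted) pooled estimate."""
    weights = 1.0 / variances
    return np.sum(weights * effects) / np.sum(weights)

def estimate_missing(effects, centre):
    """L0 rank-based estimate of the number of studies missing on the
    'unimpressive' side, assuming suppression of small/negative effects."""
    deviations = effects - centre
    n = len(effects)
    ranks = np.argsort(np.argsort(np.abs(deviations))) + 1  # 1-based ranks of |deviation|
    t_n = ranks[deviations > 0].sum()                        # Wilcoxon-type rank sum
    l0 = (4 * t_n - n * (n + 1)) / (2 * n - 1)
    return max(int(round(l0)), 0)

def trim_and_fill(effects, variances, max_iter=20):
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    k0 = 0
    for _ in range(max_iter):
        # Trim the k0 largest effects, re-estimate the centre, re-estimate k0.
        kept = np.argsort(effects)[: len(effects) - k0]
        centre = pooled_effect(effects[kept], variances[kept])
        new_k0 = estimate_missing(effects, centre)
        if new_k0 == k0:
            break
        k0 = new_k0
    if k0 > 0:
        # Fill: mirror the k0 largest effects about the trimmed centre.
        extremes = np.argsort(effects)[-k0:]
        effects_filled = np.concatenate([effects, 2 * centre - effects[extremes]])
        variances_filled = np.concatenate([variances, variances[extremes]])
    else:
        effects_filled, variances_filled = effects, variances
    unadjusted = pooled_effect(effects, variances)
    adjusted = pooled_effect(effects_filled, variances_filled)
    return k0, unadjusted, adjusted

# Toy funnel: the imprecise (small) studies all report large effects,
# as if the small, unimpressive ones never made it into print.
effects = [1.4, 1.2, 1.1, 1.0, 0.9, 0.8, 0.75, 0.7]
variances = [0.30, 0.25, 0.20, 0.15, 0.10, 0.08, 0.05, 0.04]
k0, unadjusted, adjusted = trim_and_fill(effects, variances)
print(f"imputed studies: {k0}, unadjusted effect: {unadjusted:.2f}, adjusted effect: {adjusted:.2f}")
```

In this toy example the adjustment pulls the pooled effect down, which is the direction of correction the authors describe for sunitinib; the 45% figure, though, comes from their analysis, not from anything here.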
Kimmelman said he wasn’t surprised to see that most studies don’t include blinding, and many go unpublished. But some results did surprise him:
For one, I was surprised that around half of experiments captured in our search could not be analysed because basic elements like sample size or measures of variance (e.g. standard deviation) were not reported in publications. I was also surprised that we were unable to detect a dose response effect when we pooled studies. Finally, I was surprised that every malignancy tested showed statistically significant responses to sunitinib. If every experiment gives a positive result, either you aren’t learning anything, or you aren’t publishing much of the data that you are learning from.
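On the dose-response point, the basic check when pooling studies is whether experiments that used higher doses reported larger effects. A minimal, hypothetical sketch of that kind of weighted meta-regression (invented numbers, not the paper’s analysis) looks like this:

```python
# A hypothetical sketch of a dose-response check across pooled experiments:
# a weighted regression of each experiment's effect size on the dose tested.
# The numbers are invented; this is not the paper's analysis, and a full
# meta-regression would also model between-study heterogeneity.
import numpy as np
import statsmodels.api as sm

dose = np.array([10, 20, 20, 40, 40, 60, 80, 80])             # dose per experiment (e.g., mg/kg/day)
effect = np.array([0.6, 0.9, 0.7, 0.8, 1.1, 0.9, 1.0, 0.8])   # effect size per experiment
variance = np.array([0.20, 0.15, 0.18, 0.12, 0.10, 0.09, 0.08, 0.07])

X = sm.add_constant(dose)                                 # intercept + dose term
fit = sm.WLS(effect, X, weights=1.0 / variance).fit()     # inverse-variance weights

# A genuine dose-response effect would show up as a slope on dose clearly
# different from zero; a flat slope is consistent with the "no detectable
# dose response" result Kimmelman describes above.
print(fit.params)    # [intercept, slope]
print(fit.pvalues)   # the second p-value tests the dose slope
```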
He added that readers should interpret the latest findings with caution:
Our protocol was not prespecified, for one. So it should be considered an exploratory analysis. For another, it pertained to a single drug. Finally, we used meta-analytic methods that were largely derived from those used in clinical research. It is quite possible that other, as yet undiscovered methods of meta-analysis might produce a more sanguine view of preclinical research in cancer.
Since the authors cite some of their own previous papers, and the paper critiques the methodology of earlier studies, we asked whether flaws in their own work might affect the conclusions. Kimmelman replied:
Hell yes!
In the meantime, there are some key steps authors can take to improve the literature, Kimmelman said:
Elsewhere (PLoS Biology 2014), my co-authors and I argued that researchers should prespecify whether a preclinical study is exploratory, or whether it is confirmatory. And if it is confirmatory, sample sizes and designs should be prespecified, and preferably registered. Major decisions, such as launching clinical testing, should be based not on exploratory findings, but rather on preclinical studies that are confirmatory and that use prespecified designs.
Also high on my list would be greater publication of negative or inconclusive findings…I’d love to see journals create a venue, like a “micropublication,” where researchers could easily report the results of a single experiment. That would remove one of the hurdles to publishing negative and inconclusive findings.
In addition, these findings suggest that cancer researchers must establish some guidelines to increase the integrity of preclinical work, Kimmelman noted:
Oncology has one of the highest rates of failure in drug development. Oncology has also pioneered methodologies in clinical research, including standardized reporting and outcome assessment, and methods for enhancing the external validity of clinical trials. What is especially noteworthy, in my opinion, is that oncology has not shown nearly as much leadership as other research areas, like neurology, in tackling the design and reporting of preclinical research.
Update 10/13/15 8:26 p.m. Eastern: We asked Kimmelman a follow-up question about how he thought these preclinical papers might have influenced the clinical trials of sunitinib. He told us:
It’s hard to know how, exactly, these preclinical studies influenced decision-making in clinical trials of sunitinib. What I can say is that a) many preclinical studies in our sample (50%) concluded by recommending clinical trials against the malignancy tested in the preclinical study, and b) most publications of phase 1 and phase 2 clinical trials of sunitinib cited preclinical studies that contained tumor growth curves. So if the authors of trials are being truthful in their introductions, one can reasonably infer that many of the studies in our sample influenced the launch and design of trials.