A sweeping analysis of more than 5,000 papers in eight leading medical journals has found compelling evidence of suspect data in roughly 2% of randomized controlled clinical trials in those journals.
Although the analysis, by John Carlisle, an anesthetist in the United Kingdom, could not determine whether the concerning data were tainted by misconduct or sloppiness, it suggests that editors of the journals have some investigating to do. Of the 98 studies identified by the method, only 16 have already been retracted. [See update at end.]
The types of studies analyzed — randomized controlled clinical trials — are considered the gold standard of medical evidence, and tend to be the basis for drug approvals and changes in clinical practice. Carlisle, according to an editorial by John Loadsman and Tim McCulloch accompanying the new study published today in Anaesthesia,
…has developed and refined a novel use for the statistical analysis of baseline data to identify instances where sampling in clinical trials may not have been random, suggesting the trial was either not properly conducted or was inaccurately reported. Essentially, Carlisle’s method identifies papers in which the baseline characteristics (e.g. age, weight) exhibit either a narrower or a wider distribution than expected by chance, resulting in an excess of p values close to either one or zero.
For the new paper, Carlisle used a p value cutoff of one in 10,000 — in other words, he flagged only trials whose baseline characteristics showed an extremely narrow, or extremely wide, distribution.
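To make the idea concrete, here is a minimal sketch (in Python, and emphatically not Carlisle's published code) of the kind of check involved: derive a p value for each baseline variable from the reported group means, standard deviations and sample sizes, combine the per-variable p values into a single trial-level p value (Stouffer's method is used here as one of several possible choices), and flag trials whose combined value falls below 1 in 10,000 or above 1 minus 1 in 10,000. The function names and the example data are hypothetical.

```python
# Illustrative sketch only; not John Carlisle's published procedure, which handles
# the reported summary data more carefully. The idea: a trial-level p value near 0
# means the arms differ more than randomization should allow; a p value near 1
# means they are suspiciously similar.

import numpy as np
from scipy import stats

def baseline_p_value(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided Welch t-test p value computed from reported summary statistics."""
    _, p = stats.ttest_ind_from_stats(mean1, sd1, n1, mean2, sd2, n2, equal_var=False)
    return p

def trial_p_value(baseline_rows):
    """Combine per-variable p values into one trial-level p value (Stouffer's method).

    baseline_rows: one (mean1, sd1, n1, mean2, sd2, n2) tuple per baseline
    characteristic (age, weight, ...) reported for the two trial arms.
    """
    p_values = [baseline_p_value(*row) for row in baseline_rows]
    z_scores = stats.norm.ppf(p_values)              # map each p value to a z score
    combined_z = np.sum(z_scores) / np.sqrt(len(z_scores))
    return stats.norm.cdf(combined_z)

# Hypothetical trial: age and weight reported for two arms of about 50 patients each.
trial = [
    (54.2, 9.8, 50, 54.3, 9.9, 51),    # age
    (71.5, 11.2, 50, 71.4, 11.0, 51),  # weight
]
p = trial_p_value(trial)
cutoff = 1e-4  # the 1-in-10,000 threshold used in the new paper
flagged = p < cutoff or p > 1 - cutoff
print(f"combined p = {p:.4f}; flag for closer inspection: {flagged}")
```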
Six of the journals in the new analysis are in anesthesiology, where Carlisle has focused his efforts for several years, refining the method since 2012. That earlier work led to the revelation that much of Yoshitaka Fujii’s research was fraudulent; Fujii now tops our leaderboard, with 183 retractions. The method has also been used by others to identify issues in more than 30 papers by bone researcher Yoshihiro Sato.
For the latest effort, a look at studies published between 2000 and 2015, Carlisle also included JAMA and the New England Journal of Medicine. Those journals had 12 and 9 potentially problematic papers, respectively, with fewer of them retracted — one from each journal — compared to those in anesthesiology. Carlisle writes:
It is unclear whether trials in the anaesthetic journals have been more deserving of retraction, or perhaps there is a deficit of retractions from JAMA and NEJM.
As Andrew Klein, editor of Anaesthesia (which was itself one of the included journals), tells Retraction Watch:
Different journals will retract at different rates and speeds (depending on many factors, including how long any investigation takes).
Carlisle contacted the editors of all eight journals, and all responded “swiftly and positively,” Klein said. JAMA editor in chief Howard Bauchner told Retraction Watch that he had informed Carlisle he would read the paper and consider next steps. He said he still believes in following a process, and that while allegations should be investigated, he cannot draw conclusions based upon allegations alone. And the New England Journal of Medicine said it could not comment until it had a chance to discuss the study’s supplemental data, which go live today.
Klein said he is meeting with editors of all of the anesthesiology journals today to discuss the findings:
No doubt some of the data issues identified will be due to simple errors that can be easily corrected such as typos or decimal points in the wrong place, or incorrect descriptions of statistical analyses. It is important to clarify and correct these in the first instance. Other data issues will be more complex and will require close inspection/re-analysis of the original data.
The results of the new analysis suggest that the Carlisle Method would have identified problems in papers by two other anesthesiologists with large numbers of retractions, Joachim Boldt and Scott Reuben, Klein said. The method was not yet available when those problems came to light.
So should all journals use the method — which is freely available online — to screen papers? In their editorial, Loadsman and McCulloch note that if that were to become the case,
…dishonest authors could employ techniques to produce data that would avoid detection. We believe this would be quite easy to achieve although, for obvious reasons, we prefer not to describe the likely methodology here.
Klein tells Retraction Watch:
By publishing the Carlisle Method, we are aware that we may be providing dishonest authors a chance to analyse the tool and an opportunity to adapt their methods to avoid detection. However, the analogy I use is plagiarism detection software, which is freely available. Despite this, every week we still receive many submissions which contain significant plagiarism and are therefore rejected immediately. Overall, as we have stated before, trying to replicate ‘nature’ and the random distribution of patient characteristics is extremely difficult and the Carlisle Method is designed to pick up non-random distributions.
Klein said it was not clear whether other journals would have the same rates of potentially problematic data, since they have not yet been analyzed.
According to Loadsman and McCulloch:
With the proven utility of applying the method to previous studies we have no doubt more authors of already published RCTs will eventually be getting their tap on the shoulder. We have not yet heard the last word from John Carlisle!
Update, 2245 UTC, June 8, 2017: We’ve uploaded a list of the papers below the 1 in 10,000 cutoff; turns out there were 95, not 98, for reasons we explain in sheet 3 of the spreadsheet, “Explanation of Method.”
I have blogged about this here: http://steamtraen.blogspot.fr/2017/06/exploring-john-carlisles-bombshell.html
The one-line summary is that the sky may not be falling just yet. I look forward to people with actual statistical expertise taking this further.
I have also blogged about this here: https://errorstatistics.com/2017/07/01/s-senn-fishing-for-fakes-with-fisher-guest-post/
My one-sentence summary is that whereas some of the results may be due to fraud and error, more innocent statistical explanations are possible, although some of these have implications for the way that clinical trials are often currently analysed.