Recently, a biostatistician sent an open letter to the editors of 10 major science journals, urging them to pay more attention to common statistical problems in papers. Specifically, Romain-Daniel Gosselin, Founder and CEO of Biotelligences, which trains researchers in biostatistics, counted how many of 10 recent papers in each journal contained two common problems: failing to report the sample sizes used in experiments, and failing to specify the statistical tests used in the analyses. (Short answer: Too many.) Below, we have reproduced his letter.
Reproducibility is everywhere these days, from the pages of scientific journals to the halls of the National Academy of Sciences, and today it lands in bookstores across the U.S. Longtime NPR correspondent Richard Harris has written Rigor Mortis (Basic Books), which is published today. (Full disclosure: I blurbed the book, writing that “Harris deftly weaves gripping tales of sleuthing with possible paths out of what some call a crisis.”) Harris answered some questions about the book, and the larger issues, for us.
By now, most of our readers are aware that some fields of science have a reproducibility problem. Part of the problem, some argue, is the publishing community’s bias toward dramatic findings — namely, studies that show something has an effect on something else are more likely to be published than studies that don’t.
Many have argued that scientists publish such data because that’s what is rewarded — by journals and, indirectly, by funders and employers, who judge a scientist based on his or her publication record. But a new meta-analysis in PNAS suggests it’s a bit more complicated than that.
In a paper released today, researchers led by Daniele Fanelli and John Ioannidis — both at Stanford University — suggest that the so-called “pressure to publish” does not appear to bias studies toward larger effect sizes. Instead, they argue, other factors are bigger sources of bias: the use of small sample sizes (which can yield skewed samples that show stronger effects), and the relegation of studies with smaller effects to the “gray literature,” such as conference proceedings, PhD theses, and other less publicized formats.
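To see how small samples can inflate reported effects, here is a minimal simulation sketch (our own construction, not an analysis from the PNAS paper): studies with a modest true effect are run at several sample sizes, and only the statistically significant results are kept, mimicking a literature filtered on significance. The sample sizes, true effect, and significance threshold below are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_effect = 0.2  # true standardized mean difference (Cohen's d); illustrative

for n in (10, 50, 200):  # per-group sample sizes; illustrative
    significant_effects = []
    for _ in range(5000):
        control = rng.normal(0.0, 1.0, n)
        treated = rng.normal(true_effect, 1.0, n)
        _, p = stats.ttest_ind(treated, control)
        if p < 0.05:  # keep only the "publishable" results
            pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
            significant_effects.append((treated.mean() - control.mean()) / pooled_sd)
    print(f"n={n:>3}: mean effect among significant results = "
          f"{np.mean(significant_effects):.2f} (true effect = {true_effect})")
```

With these numbers, the significant results from the smallest studies overstate the true effect several-fold, while the largest studies come close to it, illustrating the mechanism described above.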
However, Ferric Fang of the University of Washington — who did not participate in the study — approached the findings with some caution.
Researchers from China have retracted a physics paper after realizing an error led them to report results that were nearly 100 times too large.
What’s more, the authors omitted key findings that would enable others to reproduce their experiments.
According to the notice, the authors used a value to calculate a feature of electrons — called mobility — that “was approximately 100 times too small,” which led to results that were “100 times too large” (consistent with the reported quantity being inversely proportional to the mobility). The notice also details several gaps in the presentation of experimental results, which preclude others from reproducing the experiments.
Most researchers by now recognize there’s a reproducibility crisis facing science. But what to do about it? Today in Nature, Jeffrey S. Mogil at McGill University and Malcolm R. Macleod at the University of Edinburgh propose a new approach: Restructure the reporting of preclinical research to include an extra “confirmatory study” performed by an independent lab, which verifies the findings before they are published. We spoke with them about how this could work.
Retraction Watch: You’re proposing to restructure animal studies of new therapies or ways to prevent disease. Can you explain what this new type of study should look like, and how researchers will execute it?
After an international group of physicists agreed that the findings of their 2015 paper were in doubt, they simply couldn’t agree on how to explain what went wrong. Apparently tired of waiting, the journal retracted the paper anyway.
The resulting notice doesn’t say much, for obvious reasons. Apparently, some additional information came to light which caused the researchers to question the results and model. Although the five authors thought a retraction was the right call, they could not agree on the language in the notice.
Researchers in China have retracted a 2016 paper exploring the replication behaviors of a retrovirus, after discovering that the key results could not be reproduced — possibly because their cell cultures had been contaminated.
The authors also cite a disagreement with a colleague, who they say contributed to the work but does not want to be listed as an author.
Doing research is hard. Getting statistically significant results is hard. Making sure the results you obtain reflect reality is even harder. In this week’s Science, Eric Loken at the University of Connecticut and Andrew Gelman at Columbia University debunk some common myths about the use of statistics in research — and argue that, in many cases, the use of traditional statistics does more harm than good in human sciences research.
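Their argument can be made concrete with a toy simulation (our own construction, not an example from their article): a predictor is measured with noise, a small study is repeated many times, and nearly every slope estimate that survives the significance filter overestimates the true effect, even though measurement noise attenuates estimates on average. The sample size, true slope, and noise level are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, true_slope, sims = 25, 0.3, 10000  # illustrative assumptions

n_sig, n_exaggerated = 0, 0
for _ in range(sims):
    x = rng.normal(size=n)                    # true predictor
    y = true_slope * x + rng.normal(size=n)   # outcome
    x_observed = x + rng.normal(size=n)       # predictor measured with noise
    fit = stats.linregress(x_observed, y)
    if fit.pvalue < 0.05:                     # the significance filter
        n_sig += 1
        n_exaggerated += fit.slope > true_slope
print(f"{n_sig} of {sims} runs were significant; "
      f"{n_exaggerated / n_sig:.0%} of those overestimate the true slope ({true_slope})")
```

Conditioning on p < 0.05 in a small, noisy study selects exactly the runs where chance inflated the estimate; that is one way noise does its damage.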
Retraction Watch: Your article focuses on the “noise” that’s present in research studies. What is “noise” and how is it created during an experiment?
Nearly five years ago, researchers suggested that the vast majority of preclinical cancer research wouldn’t hold up to follow-up experiments, delaying much-needed treatments for patients. In a series of articles publishing tomorrow morning, eLife has released the results of the first five attempts to replicate experiments in cancer biology — and the results are decidedly mixed.
As our co-founders Adam Marcus and Ivan Oransky write in STAT, the overall take-home message was that two studies generated findings similar to the original, one did not replicate the original, and two others were inconclusive.