In February, David Allison came across a study with a familiar problem.
The study purported to show that an educational program helped women lose weight, but the authors had not directly compared the treatment and control groups. Instead, they had used a statistically invalid approach, comparing changes within each group separately.
Allison, the dean at Indiana University’s School of Public Health in Bloomington, along with researcher Luis-Enrique Becerra-Garcia and other colleagues, in July submitted a critique of the study to the journal that had published it. Four days later, Nauman Khalid, the journal’s editor in chief, wrote to the study’s lead author.
“I got excellent feedback from Dr. Becerra-Garcia,” Khalid wrote. “According to their analysis, the statistical tool that you used in your research is wrong and not well-validated.”
Now, after the authors failed to respond adequately, the study will be retracted, according to Khalid.
Becerra-Garcia called Khalid’s action “prompt and decisive.”
Although the study has not yet been retracted, the response from the journal is a departure from the back-and-forth between authors and publishers that can stall retractions. Another paper Allison’s group critiqued was recently retracted – more than three years after they first raised concerns about the work.
The study to be retracted, “Effects of the application of a food processing-based classification system in obese women: A randomized controlled pilot study,” was published in the journal Nutrition and Health this past February.
The researchers examined the effects of an educational program based on Brazil’s new dietary guidelines and found the program led to significantly decreased body mass and improved quality of life for the women who followed it, compared with women in a control group.
The commentary by Allison’s group critiquing the study appeared in the same journal in September.
Allison has been working for years to correct statistical errors in the nutrition literature. He told Retraction Watch that studies that use these kinds of invalid statistical methods “are more likely to mislead readers to think that effectiveness of some intervention has been shown by accepted statistical procedures with known and stated error rates when it has not been. This may lead to the unwitting adoption or investment of time, money, effort, and risk in inert interventions.”
Becerra-Garcia told Retraction Watch:
Regarding statistical methods, it is pertinent to note that in a randomized controlled trial (RCT) design like the one used in Giacomello et al., 2023, to test for the effectiveness of an intervention, it is crucial to directly compare differences between groups in a significance test rather than relying solely on inspecting changes within each group. This entails testing whether the control and the intervention group differ in their respective change in outcomes from before to after treatment. This study only analyzed within-group changes (before vs. after) without statistically comparing between groups, which is a statistical error known as the “Differences in Nominal Significance” (DINS) error.
Becerra-Garcia added that he and his colleagues have detected the same error in other studies, and the mistake can sometimes lead to incorrect conclusions, as it did with this study.
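To make the distinction concrete, here is a minimal sketch in Python using simulated numbers, not the study’s data; the group sizes, means, and effect sizes are illustrative assumptions. It contrasts the flawed within-group tests with the direct between-group comparison of changes that Becerra-Garcia describes:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical pre/post body-mass measurements (kg) for two groups of 15 women.
# These values are simulated for illustration only.
pre_treat = rng.normal(85, 8, 15)
post_treat = pre_treat - rng.normal(1.5, 2.0, 15)  # modest average loss
pre_ctrl = rng.normal(85, 8, 15)
post_ctrl = pre_ctrl - rng.normal(1.0, 2.0, 15)    # similar loss in controls

# The DINS error: test each group against its own baseline and conclude the
# intervention works if only the treatment group's test is significant.
p_within_treat = stats.ttest_rel(pre_treat, post_treat).pvalue
p_within_ctrl = stats.ttest_rel(pre_ctrl, post_ctrl).pvalue

# The valid question: do the *changes* differ between groups? Test directly.
change_treat = post_treat - pre_treat
change_ctrl = post_ctrl - pre_ctrl
p_between = stats.ttest_ind(change_treat, change_ctrl).pvalue

print(f"within-group p (treatment): {p_within_treat:.3f}")
print(f"within-group p (control):   {p_within_ctrl:.3f}")
print(f"between-group p (changes):  {p_between:.3f}")
```

Depending on the draw, the treatment group’s within-group test can reach significance while the direct between-group comparison of changes does not; concluding effectiveness from the first pattern alone is exactly the DINS error.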
On July 25, well before the critique was published, Khalid emailed Daniel Fernandes Martins, the senior author of the study and an adjunct professor at the University of Southern Santa Catarina in Brazil, informing him of the critique.
“Per their commentary, they recommend correcting or retracting your original article,” he wrote.
He also wrote that Martins, who did not respond to an email from Retraction Watch, would have 15 days to take either action, according to the journal’s policy. However, months passed with neither a correction nor a retraction.
After the critique of the study was published in September, Becerra-Garcia asked Khalid about the status of the study, writing that he and his coauthors “believe it is essential to address these issues to maintain the integrity of the scientific literature.”
In reply, Khalid said the authors had asked to withdraw the study without comment. Doing so would violate the journal’s ethical standards, he said; instead, the study would soon be retracted.
Khalid told Retraction Watch:
The main reason for retraction includes significant issues in statistical analysis of data and wrong presentation of conclusion. The initial research data was cross-verified by Prof. Allison’s group, and the re-evaluated data was shared with original authors via published commentary. Moreover, the original authors wanted to withdraw their manuscript, which I didn’t agree.
He added that “the authors need to be more vigilant in statistical analysis with human subjects since the minute change in conclusion has significant impact on what we see in this research.”
A spokesperson for Sage, the journal’s publisher, said that they were “actively investigating this [case] in accordance with COPE guidelines.” The retraction guidelines set out by the Committee on Publication Ethics (COPE) state that a study should be retracted if there is “clear evidence that the findings are unreliable,” including “as a result of major error (eg, miscalculation or experimental error).”
This is just one more example of why studies need to undergo statistical review before being accepted for publication.
Props to the original authors for sharing their raw data and props to the reanalysis team for undertaking this project and posting the analytic R code. Transparent, honest science as it should be. Ideally the first paper would get rejected for inappropriate stats by a statistical reviewer, but this is a second-best option for smaller journals without the resources of a statistician.
Another approach would be to use a difference-in-differences analysis, which looks at differences in changes over time in a key outcome. However, the lack of blinding would be a problem (unless there was a sham intervention in the control group), and such analyses usually require a longer time period. It would be interesting to see what a DID approach would bring to the conclusions, rather than just junking the whole thing.
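For readers curious what that would look like, here is a minimal difference-in-differences sketch in Python with statsmodels, again on simulated data rather than anything from the paper; the variable names, group sizes, and effect sizes are illustrative assumptions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 15  # hypothetical participants per group

# Simulated long-format data: one row per participant per time point.
rows = []
for group, effect in [("control", 0.0), ("treatment", -1.5)]:
    baseline = rng.normal(85, 8, n)
    for i, b in enumerate(baseline):
        rows.append({"id": f"{group}{i}", "group": group, "post": 0, "bmi": b})
        rows.append({"id": f"{group}{i}", "group": group, "post": 1,
                     "bmi": b + effect + rng.normal(0, 2)})
df = pd.DataFrame(rows)

# Difference-in-differences: the group-by-time interaction estimates how much
# more the outcome changed in the treatment group than in the control group.
model = smf.ols("bmi ~ C(group) * post", data=df).fit()
print(model.params.filter(like=":post"))  # the DID estimate
print(model.summary())
```

In a two-group, two-period setting the group-by-time interaction coefficient is the difference-in-differences estimate; a fuller analysis would also account for repeated measures on the same participants, for example with cluster-robust standard errors.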