Why is it so difficult to correct the scientific record in sports science? In the first installment in this series of guest posts, Matthew Tenan, a data scientist with a PhD in neuroscience, began the story of how he and some colleagues came to scrutinize a paper. In the second, he explained what happened next. In today’s final installment, he reflects on the editors’ response and what he thinks it means for his field.
In refusing to retract the Dankel and Loenneke manuscript, which we had shown to be mathematically flawed, the editors referred to “feedback from someone with greater expertise” and included the following:
1. “…error rates in the real world may differ from what’s assumed in this paper, I am not sure that I understand where the fault lies with Dankel & Leonneke’s logic”
2. “There are papers from Robert Ross and colleagues in Canada. There are also papers from David Bishop’s group in Australia…If Jeremy Loenneke’s paper is to be retracted, then will the authors of the letter also request retraction of these other papers? Where would/should such a process stop?”
3. “…the two-step approach forwarded by Loenneke et al. had, in my view, positive aspects to it relative to the above papers by the other groups, i.e. the first stage use of Levene’s test to test the null hypothesis of variance equality between treatment and control groups. In the other papers, a formal control group was not even considered in the process.”
4. “…the letter from Tenan et al is very interesting and generates some questions in itself. I favour the letter and response being published in order for these questions to be discussed further.”
The flaws with these points are myriad, but here are some highlights:
On point 1, Dankel and Loenneke make factual, mathematically incorrect statements about error rates. It is irrelevant how statistical their wording and logic may “sound”: if a method’s claimed properties cannot be demonstrated mathematically, it is an invalid statistical method.
On point 2, the fact that others are publishing potentially invalid work (none of which is cited in the Dankel and Loenneke manuscript) is irrelevant; the existence of those papers does not make Dankel and Loenneke’s work any less wrong or more acceptable. Moreover, the insinuation of a slippery slope is yet another red herring: whether we request retraction of other papers has no bearing here, because the errors in Dankel and Loenneke’s paper stand on their own.
On point 3, the reviewer seems to argue that the work still has value because Dankel and Loenneke at least insist that control groups are necessary. But it is precisely because their approach relies on the variance of the control group that it fails to achieve the claimed error rates; the seemingly intuitive first step of their two-step approach, Levene’s test, does not fix those error rates.
Point 4 is classic “both sides” nonsense, suggesting that “any discussion” (which generates additional citations) is worthwhile. Again, the Dankel and Loenneke method is mathematically flawed and misrepresents its actual error rates, and the authors’ response shows a callous disregard for improving the quality of scientific publishing.
You can’t say 2 + 2 = 4 and then say “oh, but let’s hear out this group that says 2 + 2 = 5.” At no point—either by Dankel and Loenneke, their anonymous statistician, or the Sports Medicine editorial staff—were our mathematics or simulations refuted. At no point did any proponents of the Dankel and Loenneke method provide any formal math supporting their method.
As I write this, the paper continues to be cited. Dankel and Loenneke have already published a subsequent paper using their flawed method in the journal Applied Physiology, Nutrition & Metabolism. We are currently in the process of writing a letter to the editor at this journal noting their use of an invalid methodology. Every time we see a paper use and cite the original Dankel and Loenneke methodology, we plan to notify the editor of the journal. We shouldn’t need to chase down every use of this flawed method, especially when we’ve identified the issues with the original paper prior to publication.
Simply sounding logical or mathematical is not a substitute for formal mathematics and simulation when proposing a novel analytical method. Formal math and simulation are the standard in nearly every other field; the sooner the field of exercise and sport science realizes this, the sooner the overall quality of its research will improve. Proposing “novel statistical methods” without showing the math enables “evolving definitions” (i.e., moving the goalposts) when aspects of the proposed method are shown not to make sense.
Indeed, this is exactly what happened when we reviewed Dankel and Loenneke’s response to our letter to the editor. The response is at times nonsensical, and, as best we can tell, their explanation of their method does not actually match the method proposed in their original paper. This sort of thing would not occur if authors were required to show formal mathematics and simulation studies for “novel statistical methods.”
We continue to hope that we can reach an amicable resolution, with either a retraction or a substantial erratum to the original manuscript. Our issue is not with the authors, the editors, or the journal. Rather, it is that the field of exercise and sport science continually allows statistical-sounding methods to be published and widely used without any actual validation by knowledgeable experts. At no time was the manuscript in question evaluated by an individual with a PhD in statistics.
Furthermore, our field, and science as a whole, is so averse to the idea of correcting scientific manuscripts when they are shown to be factually wrong (while inevitably claiming, in the same breath, that science is “self-correcting”) that this aversion harms the scientific process and clinical practice over the long run.
Thanks for sharing this! I was quickly reading through on my phone, but it seemed that their method was valid under certain assumed conditions (I think you mentioned this in part 2)? Perhaps I read that wrong, but if the method does hold under certain assumptions, why is the goal not simply to state how important those assumptions are?
Either way, it’s a very interesting broader point here (with a very illuminating illustration), and something worthy of consideration. Thanks again for sharing!
There weren’t any consistent scenarios where their stated 5% error rate held. The method was completely ineffective under non-constant measurement error. Even when measurement error was held constant, the error rates varied based on a number of other factors, which are detailed here: https://osf.io/ab683/
The only way the method “works” is if you don’t actually use the Dankel-Loenneke method but instead simply run Levene’s test for unequal variances between groups and stop there 100% of the time. Even then, we’d argue that interpreting Levene’s test as evidence of differential responders is a vast over-interpretation. A series of gated tests (i.e., the full Dankel-Loenneke method) creates a joint distribution for the error rates, which is what Dankel and Loenneke don’t seem to understand.
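The effect of gating one test on another can be seen in a small Monte Carlo sketch. To be clear, this is not the Dankel-Loenneke method itself (their paper defines the exact steps); the second step here is a hypothetical stand-in (flagging “responders” whose change score exceeds 2 SDs of the control group), run only when Levene’s test rejects. The point it illustrates is the general one above: once tests are chained, the joint procedure’s error rate under the null is no longer the nominal 5% of either step alone.

```python
# Monte Carlo sketch of a gated ("two-step") testing procedure under the
# null hypothesis of no differential response. Step 1 is Levene's test for
# equality of variances; step 2 is a HYPOTHETICAL stand-in follow-up, not
# the actual Dankel-Loenneke second step.
import numpy as np
from scipy.stats import levene

rng = np.random.default_rng(0)
alpha, n_per_group, n_trials = 0.05, 20, 2000

step1_rejects = 0  # trials where Levene's test rejects at alpha
joint_rejects = 0  # trials where step 1 rejects AND step 2 flags a "responder"
for _ in range(n_trials):
    # Null scenario: control and treatment change scores share one distribution
    control = rng.normal(0.0, 1.0, n_per_group)
    treatment = rng.normal(0.0, 1.0, n_per_group)
    _, p1 = levene(control, treatment)  # step 1: test variance equality
    if p1 < alpha:
        step1_rejects += 1
        # Step 2 (illustrative stand-in): flag any treatment subject whose
        # change exceeds 2 SDs of the control group's change scores
        if np.any(treatment > 2.0 * control.std(ddof=1)):
            joint_rejects += 1

step1_rate = step1_rejects / n_trials
joint_rate = joint_rejects / n_trials
print(f"step-1 (Levene) false-positive rate: {step1_rate:.3f}")
print(f"joint two-step false-positive rate:  {joint_rate:.3f}")
```

Because step 2 only runs on trials where step 1 already rejected, the joint rate is a product of dependent events, and it tracks neither test’s nominal level; deriving it requires exactly the kind of formal math (or simulation) argued for above.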