The detection process uses the algorithm “statcheck” — which we’ve covered previously in a guest post by one of its co-developers — to scan just under 700,000 results from the large sample of psychology studies. Although the trends in Hartgerink’s present data are yet to be explored, his previous research suggests that around half of psychology papers have at least one statistical error, and one in eight have mistakes that affect their statistical conclusions. In the current effort, regardless of whether any mistakes are found, the results from the checks are then posted to PubPeer, and authors are alerted through an email.
To date, the initiative is one of the largest post-publication peer review efforts of its kind. Some researchers are, however, concerned about its current process of detecting potential mistakes, particularly the fact that potentially stigmatizing entries are created even if no errors are found.
Chris Hartgerink, a PhD student at Tilburg University in The Netherlands, has posted a preprint online outlining the processes he and others used to mine just under 700,000 results from his sample of more than 50,000 papers, which all had statcheck run on them.
The project has value for readers as well as individual academics, who can fix any mistakes in their papers accordingly, Hartgerink told Retraction Watch.
Some researchers welcomed the project on social media:
— Jennifer Tackett (@JnfrLTackett) August 26, 2016
— Jay Van Bavel (@jayvanbavel) August 26, 2016
Not all researchers see the program in a positive light, however. For example, two papers co-authored by prominent psychologist Dorothy Bishop, who is based at the University of Oxford, UK, have so far been flagged by statcheck. One of the two PubPeer entries, however, says the program detected no statistical mistakes in her paper. Bishop was unhappy that the paper was flagged even though no errors were found, and took to Twitter to express her concern:
— Dorothy Bishop (@deevybee) August 26, 2016
She told us:
The tone of the PubPeer comments will, I suspect, alienate many people. As I argued on Twitter, I found it irritating to get an email saying a paper of mine had been discussed on PubPeer, only to find that this referred to a comment stating that zero errors had been found in the statistics of that paper.
As for the other paper, in which statcheck found two allegedly incorrect results, Bishop said:
I’ll communicate with the first author, Thalia Eley, about this, as it does need fixing for the scientific record, but, given the sample size (on which the second, missing, degree of freedom is based), the reported p-values would appear to be accurate.
Bishop would like to see statcheck validated:
If it’s known that on 99% of occasions the automated check is accurate, then fine. If the accuracy is only 90% I’d be really unhappy about the current process as it would be leading to lots of people putting time into checking their papers on the basis of an insufficiently sensitive diagnostic.
Hartgerink said he could see why many researchers may find the process frustrating, but noted that posting PubPeer entries when no errors were detected is also “valuable” for post-publication peer review. Too often, post-publication peer review is depicted as only questioning published studies, and not enough emphasis is put on endorsing sound content, he said.
Furthermore, he noted that statcheck is by no means “definitive,” and its results always need to be manually checked. A few authors, for example, have commented on PubPeer claiming that their papers didn’t contain the flagged mistakes, said Hartgerink. In those cases, the mistakes turned out to be in the algorithm itself, he said.
Hartgerink, therefore, recommends that researchers always check whether errors highlighted by statcheck actually exist. If they do, researchers can then consider contacting journal editors and issuing corrigenda where necessary, he said.
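For readers who want to see what such a manual check might involve, here is a minimal sketch in Python. It is not statcheck's own code (statcheck is an R package), and it covers only the simplest case: a z-test reported with an exact p-value, such as "z = 2.20, p = .03". The function names, the regular expression, and the .05 significance threshold are all illustrative assumptions. The idea is to recompute the two-tailed p-value from the reported test statistic, allow for rounding in both numbers, and see whether the reported p-value is still plausible.

```python
import math
import re

# Hypothetical pattern for a report like "z = 2.20, p = .03".
# Only exact "p = value" reports are handled here, not "p < .05".
Z_REPORT = re.compile(r"\bz\s*=\s*(-?\d*\.?\d+)\s*,\s*p\s*=\s*(\d*\.\d+)")


def two_tailed_p_from_z(z: float) -> float:
    """Two-tailed p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def recheck(report: str, alpha: float = 0.05) -> dict:
    """Recompute p from the reported z and compare it with the reported p."""
    match = Z_REPORT.search(report)
    if match is None:
        raise ValueError(f"no z-test result found in: {report!r}")
    z_text, p_text = match.group(1), match.group(2)

    z = abs(float(z_text))
    reported_p = float(p_text)
    z_decimals = len(z_text.split(".")[1]) if "." in z_text else 0
    p_decimals = len(p_text.split(".")[1])

    # The reported z is itself rounded, so the true value lies within
    # half a unit of its last printed digit.
    half_unit = 0.5 * 10 ** -z_decimals
    p_low = two_tailed_p_from_z(z + half_unit)
    p_high = two_tailed_p_from_z(max(z - half_unit, 0.0))

    # Consistent if the reported p falls inside the recomputed range
    # once that range is rounded to the reported precision.
    tol = 1e-12
    consistent = (
        round(p_low, p_decimals) - tol
        <= reported_p
        <= round(p_high, p_decimals) + tol
    )

    recomputed_p = two_tailed_p_from_z(z)
    # An inconsistency matters most when it flips the significance decision.
    changes_significance = (not consistent) and (
        (reported_p < alpha) != (recomputed_p < alpha)
    )
    return {
        "recomputed_p": recomputed_p,
        "consistent": consistent,
        "changes_significance": changes_significance,
    }


if __name__ == "__main__":
    for text in ("z = 2.20, p = .03", "z = 1.50, p = .04"):
        print(text, recheck(text))
```

A result that fails this kind of check is only a prompt to look at the paper: a one-tailed test, a correction for multiple comparisons, or a typo in the test statistic rather than the p-value would all trip a simple comparison like this one.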
For the future, Hartgerink thinks it wouldn’t hurt for journals to run statcheck on manuscripts before accepting them. Michèle Nuijten, who is also a PhD student at Tilburg University and author of the November 2015 Retraction Watch article about statcheck, is speaking with several journal editors to pilot use of the algorithm as part of the reviewing process, Hartgerink explained.
Originally, Hartgerink, whose doctoral research is about detecting potential data anomalies, aimed to use the current data as a baseline of the psychology literature for other projects; for instance, one of his projects, on how extreme certain results are, draws on the present data.