A leading psychology research society in Germany has called for an end to PubPeer postings generated by a computer program that trawls psychology papers for statistical errors, saying the practice needlessly causes reputational damage to researchers.
Last month, we reported on an initiative that aimed to clean up the psychology literature by identifying statistical errors using the algorithm “statcheck.” As a result of the project, PubPeer was set to be flooded with more than 50,000 entries for the study’s sample papers — even when no errors were detected.
On October 20, the German Psychological Society (DGPs) issued a statement criticizing the effort, expressing concern that alleged statistical errors are posted on PubPeer before the authors of the original studies are contacted. The DGPs also claimed that when mistakes detected by statcheck and posted on PubPeer turn out to be false positives, the resulting damage to researchers is “no longer controllable,” as entries on PubPeer cannot be easily removed.
Today, statcheck’s creators, led by Michèle Nuijten — a PhD student at Tilburg University in the Netherlands, whom we’ve previously interviewed about statcheck — responded to the DGPs’ criticisms, saying that there is value in
…openly discussing inconsistencies in published articles in an impersonal and factual manner, given that our own experiences in corresponding directly with authors about errors have not led to any documented corrections…
In their statement, the DGPs say:
…many researchers – especially those whose papers are among the 50,000 that were automatically screened – are worried about the fact that the screening of their article occurred (1) without the authors’ awareness, (2) without being able to actually verify whether the results of this screening are actually correct, and (3) without the opportunity to comment on the screening of their paper before the results were published on pubpeer. In addition, many colleagues are deeply concerned about the fact that it is obviously difficult to remove an entry on pubpeer after an error that had been “detected” by statcheck turned out to be a false positive.
The statement goes on to add:
…the detection of an alleged error necessarily requires a high level of sensitivity and cooperative intentions among all parties. Before a paper is publicly flagged for alleged statistical errors (on pubpeer or elsewhere), the authors of this paper should be given the opportunity to double-check and comment on the results of the screening. If an alleged error then turns out to be a false positive, any posts or comments in which the article is flagged need to be removed or revoked at once.
The DGPs statement cites a paper posted on the preprint server arXiv earlier this month by Thomas Schmidt, a professor of experimental psychology at Technical University Kaiserslautern in Germany, which concludes:
The goal of this comment is to point out an important and well-documented flaw in this busily applied algorithm: It cannot handle corrected p values. As a result, statistical tests applying appropriate corrections to the p value (e.g., for multiple tests, post-hoc tests, violations of assumptions, etc.) are likely to be flagged as reporting inconsistent statistics, whereas papers omitting necessary corrections are certified as correct. The STATCHECK algorithm is thus valid for only a subset of scientific papers, and conclusions about the quality or integrity of statistical reports should never be based solely on this program.
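Schmidt’s objection can be illustrated with a toy calculation. The sketch below is not statcheck (which is an R package covering t, F, r, χ² and z tests and accounting for rounding); it is a minimal Python illustration, using only the standard library and a z-test, of how a checker that recomputes only uncorrected p values would flag a correctly Bonferroni-corrected report as inconsistent. All names and the tolerance are my own assumptions:

```python
from statistics import NormalDist

def naive_recomputed_p(z: float) -> float:
    """Uncorrected two-sided p-value for a z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

z = 2.50
uncorrected = naive_recomputed_p(z)     # roughly .0124
bonferroni = min(1.0, 2 * uncorrected)  # corrected for 2 tests, roughly .0248

# Suppose the paper correctly reports the Bonferroni-corrected value...
reported_p = 0.025

# ...a checker that compares against the *uncorrected* recomputation
# sees a mismatch and flags the paper: a false positive.
print(abs(uncorrected - reported_p) <= 0.005)  # False -> flagged
# Against the corrected value, the report is consistent.
print(abs(bonferroni - reported_p) <= 0.005)   # True
```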
In their reply today, Nuijten and colleagues write:
We clearly noted statcheck’s shortcomings in our publications. We continue to further refine statcheck and investigate the influence of possible bugs or other problems on our estimates of the prevalence of inconsistencies in psychology (see e.g., Nuijten, 2016). We therefore welcome all researchers’ comments on the performance of statcheck. So far, no bugs have been found that noticeably affect estimates of inconsistencies in statistical results in psychology. Hence we see no reason to adapt our initial estimates, or to discourage using statcheck in scientific articles, given that researchers take into account the program’s limitations.
We as scientists have the obligation to correct reporting errors even if the tools we use are not 100% accurate.
As we previously reported, statcheck received a mixed response from psychologists on social media when the PubPeer project was inaugurated last month.
Today, Nuijten noted that the PubPeer project was separate from statcheck, and told Retraction Watch:
I would like to stress that statcheck cannot (and does not pretend to) say anything about intentional mistakes, misconduct, or even fraud. It is simply a tool that calculates if the degrees of freedom and the test statistic correspond with the p-value.
Chris Hartgerink — also a PhD student at Tilburg University and the second author of the statcheck letter — previously told Retraction Watch about another possible application of statcheck: journals could run it on manuscripts before accepting them. Psychological Science is already running a pilot to incorporate statcheck into its reviewing process, Nuijten noted.
Update, 5 p.m. Eastern, 10/27/16: Hartgerink has forwarded us his response to the DGPs. It concludes:
In my opinion, the reports on PubPeer should be seen as part of the scientific debate, enabling authors and other researchers to check the accuracy of statistics in published articles. Post-publication review represents a powerful forum for such scientific debate. It complements traditional peer review, which apparently has been unsuccessful in catching inconsistencies that statcheck can detect readily, albeit not with 100% accuracy.