Scientific fraud isn’t what keeps Andrew Gelman, a professor of statistics at Columbia University in New York, up at night. Rather, it’s the sheer number of unreliable studies — uncorrected, unretracted — that have littered the literature. He tells us more, below.
Whatever the vast majority of retractions are, they’re a tiny fraction of the number of papers that are just wrong — by which I mean they present no good empirical evidence for their claims.
I’ve personally had to correct two of my published articles. In one paper, we claimed to prove a theorem which was in fact not true, as I learned several years after publication when someone mailed me a counterexample. In the other paper, we had miscoded a key variable in our data, making all our empirical results meaningless; this one I learned about when a later collaborator was trying to replicate and extend that work. So my own rate of retractions or corrections is something like 0.5%.
I’m a pretty careful researcher, and I can only assume that the overall rate of published papers with fatal errors is greater than my own rate of (at least) half a percent. Indeed, in some journals in recent years, I think the error rate may very well approach 50% — by which I mean that I think something like half the papers claim evidence that they don’t really have.
In recent years we’ve seen various high-profile cases in social science of published research with “statistically significant” findings which were fatally flawed, where the data could not support the elaborate claims being made. Examples include claims that some college students have ESP, that ovulation is correlated with voting, that adopting a certain pose (“power pose”) makes you more powerful, that beautiful parents are more likely to have girls than boys, and that people react differently to hurricanes tagged with a male versus female name. In each of these cases, the issue is not that the underlying scientific claims are necessarily false (ESP might well exist, people do behave differently at different times of the month, etc.) but that the claimed evidence just wasn’t there to support them. To put it another way, in any of these examples, the data would also support the exact opposite of the claimed phenomena (college students having negative ESP, power posing having a negative effect, and so forth). And it’s not just frivolity. Similar statistical concerns arose, for example, in a much-publicized study of the effect of air pollution on life expectancy.
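To see how noisy data can appear to support either a claim or its opposite, here is a minimal simulation sketch. It is not a reanalysis of any of the studies above: the true effect (0.1), the standard error (1.0), and the function name `wrong_sign_share` are all hypothetical choices made for illustration. It simulates many studies of a small real effect measured with a lot of noise, keeps only the “statistically significant” ones, and counts how many of those point in the wrong direction.

```python
import random


def wrong_sign_share(true_effect, std_error, n_studies, seed=0):
    """Simulate noisy study estimates and inspect the 'significant' ones.

    Returns (number of significant estimates,
             share of those whose sign is opposite to the true effect).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    significant = 0
    wrong_sign = 0
    for _ in range(n_studies):
        # Each study's estimate is the true effect plus sampling noise.
        estimate = rng.gauss(true_effect, std_error)
        # Conventional two-sided test at the 5% level.
        if abs(estimate) > 1.96 * std_error:
            significant += 1
            if estimate * true_effect < 0:
                wrong_sign += 1
    share = wrong_sign / significant if significant else 0.0
    return significant, share


# Hypothetical numbers: a small true effect, measured very noisily.
n_sig, share = wrong_sign_share(true_effect=0.1, std_error=1.0, n_studies=100_000)
print(f"{n_sig} significant estimates; {share:.0%} of them have the wrong sign")
```

Under these assumed numbers, a sizable share of the significant results have the opposite sign from the true effect, which is the sense in which a “statistically significant” finding from very noisy data offers little evidence for the direction it claims.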
The above-mentioned studies have four features in common:
- They are all (in my opinion) fatally flawed from a statistical perspective.
- Nobody (including me) is suggesting that the data in any of these studies was faked.
- None of these papers has been retracted or corrected.
- None of these papers is ever going to be retracted or corrected.
Readers may balk at that last assertion. How am I so sure the papers won’t be corrected or retracted? Because, with extremely rare exceptions, bad statistics is not enough of a reason for anyone to fix the scientific record. At most (and this too is very rare) the journal might publish a letter refuting the published paper, or an article explaining the error might be published elsewhere. Usually not even that.
I’m not saying that all these papers should be retracted; rather, I’m saying that retraction/correction will only ever be a tiny part of the story. Just look at the numbers. Millions of scientific papers are published each year. If 1% are fatally flawed, that’s tens of thousands of corrections to be made. And that’s not gonna happen. As has been discussed over and over on Retraction Watch, even when papers with blatant scientific misconduct are retracted, this typically requires a big struggle in each case. The resources just aren’t there to adjudicate this for the tens of thousands of published papers a year which are wrong but which don’t involve misconduct.
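The back-of-the-envelope arithmetic can be spelled out in a few lines. This is only a sketch: the figure of 2 million papers a year is an assumed stand-in for “millions,” the 1% rate is the hypothetical used above, and `flawed_papers` is a name invented for this example.

```python
def flawed_papers(papers_per_year, flawed_rate):
    """Expected number of fatally flawed papers published in a year."""
    return round(papers_per_year * flawed_rate)


PAPERS_PER_YEAR = 2_000_000  # assumption: a stand-in for "millions"
FLAWED_RATE = 0.01           # the hypothetical 1% from the text

flawed = flawed_papers(PAPERS_PER_YEAR, FLAWED_RATE)
per_day = flawed / 365       # corrections needed per calendar day to keep up

print(f"Flawed papers per year: {flawed:,}")
print(f"Corrections needed per day: {per_day:.0f}")
```

Even at this deliberately low assumed error rate, keeping the record clean would mean dozens of corrections every single day, each one potentially a struggle.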
Indeed, it seems that retractions and corrections are not so much about correcting the scientific record as about punishing wrongdoers and shoring up the reputation of journals with regard to their most embarrassing mistakes. That’s fine — I agree that crime shouldn’t pay and that journals have every right (and even an obligation) to de-associate from cases of fraud.
My point here is that we shouldn’t think of retraction and correction as any kind of general resolution to the problem of published errors. The scale is just all wrong, with tens of thousands of papers that are wrong in their empirical content, and orders of magnitude fewer papers being corrected or retracted.
In an article discussed recently on Retraction Watch, Daniele Fanelli recommended that authors be able to self-retract articles that are fatally flawed due to honest errors. I think that’s fine (as noted above, on the two occasions when it turned out my own articles were fatally flawed, I contacted the journals and they ran corrections), but I think that when it comes to cleaning up the scientific literature and flagging errors, retraction won’t be the most useful tool. Now and for the foreseeable future, it looks to me like retraction will be a rarely used tool, used mostly for flagging fraud. To tag the sorts of run-of-the-mill errors which plague science and cause tens of thousands of erroneous papers to be published each year, we’ll need some more scalable versions of post-publication review.
Gelman blogs at http://andrewgelman.com/.