Is scientific fraud on the rise?

As readers of this blog have no doubt sensed by now, the number of retractions per year seems to be on the rise. We feel that intuitively as we uncover more and more of them, but there are also data to suggest this is true.

As if to demonstrate that, we’ve been trying to find time to write this post for more than a week, since the author of the study we’ll discuss sent us his paper. Writing about all the retractions we learned about, however, kept us too busy.

But given how sharp Retraction Watch readers are, you will be quick to note that more retractions doesn’t necessarily mean a higher rate. After all, there were about 527,000 papers published in 2000, and 852,000 published in 2009, so a constant rate of retractions would still mean a higher number. Here’s what Grant Steen, who published a paper on retractions and fraud last month in the Journal of Medical Ethics, found when he ran those numbers:

The rate of increase in retractions is greater than the rate of increase in publications, although the two are correlated.

That squares with what you’ll find by looking at Neil Saunders’ lovely plot of publications vs. retractions since 1977. There were approximately eight times as many retractions per 100,000 papers published in 2009 as there were in 2000. Important to note: Saunders’ data set has the same number of papers published, but more retractions in each year. When we find the reason for that discrepancy, we’ll post an update.
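The rate-versus-count distinction can be made concrete with a short sketch. The publication totals below are the ones cited in the post; the retraction counts are hypothetical placeholders, chosen only so that the 2009 rate comes out roughly eight times the 2000 rate as described above. They are not Steen's or Saunders' actual figures.

```python
# A minimal sketch of the rate-vs-count distinction discussed above.
# Publication totals are those cited in the post; the retraction counts
# are hypothetical placeholders, NOT actual figures from Steen or Saunders.
papers = {2000: 527_000, 2009: 852_000}
retractions = {2000: 40, 2009: 520}  # hypothetical counts

def rate_per_100k(year: int) -> float:
    """Retractions per 100,000 papers published in the given year."""
    return retractions[year] / papers[year] * 100_000

for year in (2000, 2009):
    print(f"{year}: {rate_per_100k(year):.1f} retractions per 100,000 papers")

# Even at a constant rate, the larger 2009 literature would produce more
# retractions in absolute terms (852,000 / 527,000 is about 1.6x as many
# papers), which is why a rising count alone does not prove a rising rate.
```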

There’s no arguing that there is an increase, however. One possible reason for it is that journals are, as Steen puts it, “reaching farther back in time to retract published papers.” The average length of time between publication and retraction is growing. In fact, it took nine years to retract one particular paper in 2009.

Steen queries the data to see if research fraud is increasing, which he hints is the case:

It is particularly striking that the number of papers retracted for fraud increased more than sevenfold in the 6 years between 2004 and 2009.

The paper differentiates fraud — data fabrication or falsification — from error, which includes plagiarism, self-plagiarism, duplication, and scientific mistakes. Here’s how those reasons break down:

Error is more common than fraud; 73.5% of papers were retracted for error (or an undisclosed reason) whereas 26.6% of papers were retracted for fraud (table 1). The single most common reason for retraction was a scientific mistake, identified in 234 papers (31.5%). Fabrication, which includes data plagiarism, was more common than text plagiarism. Multiple reasons for retraction were cited for 67 papers (9.0%), but 134 papers (18.1%) were retracted for ambiguous reasons

As Steen notes:

Very recently, a downturn in mistakes occurred at the same time as an upturn in fraud (figure 2), which may mean that some fraud in the past was excused as a scientific mistake whereas fraud is now more likely to be admitted as such. We note that some excuses given for a mistake seem implausible and no excuse at all is given for other retractions; it seems likely that at least some retractions for mistakes or for unstated reasons actually represent fraudulent papers.

It’s also possible, writes Steen, that journals “are making a far more aggressive effort to self-police now than in the recent past.” Or it may be that “repeat offenders” such as Jan Hendrik Schön and Scott Reuben — who together make up a staggering 14% of the retractions from 2000 to 2009 — are skewing the data. (The story of another such repeat offender, Naoki Mori, seems to be unfolding as we speak. The count is up to ten.)

Still, Steen concludes that

Levels of misconduct appear to be higher than in the past.

But it’s somewhat difficult to know for sure, as Steen acknowledges, because

8% of retractions were for unstated reasons and up to 18% of retractions were for ambiguous reasons.

That lack of transparency resonated with us, as did another data point that Richard van Noorden, over at Nature’s The Great Beyond blog, picked up on:

Nearly a third of retracted papers remain at journal websites and do not have their withdrawn status flagged…

Specifically, 31.8% of papers were neither watermarked nor marked as retracted at their abstract pages. We’ve called for various forms of transparency before, including press-releasing retractions and writing clearer notices. But marking retracted papers to begin with is also necessary, and we’re disappointed to see journals doing this badly.

Steen published a paper based on the same data set in November that we covered here. In it, he made a claim that van Noorden took issue with in a blog post at the time:

American scientists are significantly more prone to engage in data fabrication or falsification than scientists from other countries.

It’s important to note that the data set Steen is working with for both papers includes only English-language retractions.

In our post, we noted that Steen’s November piece found 13 retractions for journal office error. We thought that was a bit low, given that we had already found five in four months of Retraction Watch, here, here, and here. In the new paper, he found 27, which still seems low but is probably closer to recent figures.

Regardless of the actual number, we wholeheartedly agree that those “journal office error” retractions demonstrate that

…retraction is a very blunt instrument used for offences both gravely serious and trivial.

And we obviously also agree that

Given that retraction of a paper is the harshest possible punishment for a scientist, we must work to assure that it is applied fairly.

9 thoughts on “Is scientific fraud on the rise?”

  1. Thanks for the excellent post. One difference between ages past and current is the increased storage of raw data on lab machines. In our lab, we can go back 5-10 years and find original data from many of the lab’s experiments, something that was not possible 10 years ago when storage space was too precious and computers too few. I wonder if this may aid both the discovery of recent fraud and fraud committed many years in the past.

  2. Is there any data that looks at the date of publication rather than the date of retraction? Or is that already what we’re seeing? I would think that would be the way to see if fraud is on the rise, in which case the numbers for more recent years would be skewed downward, since presumably journals will later ‘reach back in time’ and retract papers that are current now.

  3. I’d strongly suspect that this is more like changes in reported crime, distinct from the actual crime.

    If there’s more fraud, that’s one thing, a problem. If we’re simply watching for fraud more carefully, and thus discovering it more, that’s ultimately a good thing.

  4. Papers with fundamental errors are often not retracted. This may be due to sympathy with the senior author who has been victimized by an unscrupulous associate. In earlier years, a person who was so victimized would feel ashamed, perhaps to the point of quitting research. Nowadays people take it as ‘one of those things’. Scientists with a large number of people working in their labs are particularly vulnerable, and so are older scientists who may be less familiar with new techniques than their postdocs. I personally could not handle a lab with more than 8 people in it and still know what everyone was doing. I believe there should be a category of papers in which the person who runs the lab is not an author, but the paper is listed as ‘from the lab of …’. That won’t stop fraud, but it may curb the unrealistic idea that one can write 25 papers a year and be familiar with all the experimental data.

    1. Dear Elaine Newman,

      I have read a couple of your comments, which sound very sensible and also like a way to bring science back to what it is supposed to be about: gaining new knowledge, not being part of the industrial complex. I was particularly struck when you wrote:

      “I personally could not handle a lab with more than 8 people in it and still know what everyone was doing. I believe there should be a category of papers in which the person who runs the lab is not an author, but the paper is listed as ‘from the lab of …’”

      That is a real and practical manifesto for action.
      8 sounds about the right number.

      I am glad there are some sane minds still around.

      1. Thank you, David Hardman.
        Let’s try to think up a number of possible reforms, see what response we can get from readers of this blog and others, and think of a possible manifesto to present to granting agencies.

        I’d like to start with publishers and reviewers.
        I think that publishers should be responsible for the quality of their reviewers. If they cannot find good ones, don’t make the excuse that it’s hard to find reviewers; just don’t publish the paper until you do find them.
        How about paying reviewers, say $200 per paper? The open access boondoggle has given publishers a free ride: their costs have dropped while authors pay astronomical fees to publish. Put some of that into paying reviewers. My postdoc could use one or two days of work a month at $200/day and learn a lot while doing it. So could my various retired friends. Anyone wishing to do more than one review would find it worthwhile to do them well.
        What do you think?

  5. What has struck me about good, older papers I have read (from the ’70s and earlier) is how thoroughly the authors would question their results. Nowadays I don’t see much of that – just “our results are amazing and interesting! Sure, there could be some confounding or reporting error but we don’t think so…” I speculate that this is due to the process for funding grants. Most of the time you need the results of earlier studies to support what you want the money for, so the incentive is there to make sure the answers of your earlier investigations come out the right way. Especially since “survival” depends on bringing in a certain amount of grant dollars. That is, the focus of the granting decision is on the answer and not on the question. (Full disclosure: I don’t live in that world.)

  6. Rather slow to spot the link to my plot in this blog post 🙂

    Just to note that my plot shows number of retraction notices, rather than retracted articles. I should alter the plot labels to make that clear. Not sure what you mean by “the same number of papers published, but more retractions in each year.” Perhaps that was a glitch at the time you viewed the plot.
