“Why Growing Retractions Are (Mostly) a Good Sign”: New study makes the case

Daniele Fanelli

Retraction Watch readers will no doubt be familiar with the fact that retraction rates are rising, but one of the unanswered questions has been whether that increase is due to more misconduct, greater awareness, or some combination of the two.

In a new paper in PLOS Medicine, Daniele Fanelli, who has studied misconduct and related issues, tries to sift through the evidence. Noting that the number of corrections has stayed constant since 1980, Fanelli writes:

If the recent growth of retractions were being driven by an increasing propensity of researchers to “cut corners,” we would expect minor infractions, and therefore the frequency of published errata, to increase just as fast as, if not faster than, that of retractions.
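
To see what that comparison looks like in practice, here is a minimal sketch with invented yearly counts (not Fanelli’s data): both series are normalized by the size of the literature before their growth is compared.

```python
# Minimal sketch of the comparison Fanelli describes, with invented
# yearly counts (NOT data from the paper): normalize errata and
# retraction counts by total publications, then compare growth.

papers = [300_000, 400_000, 600_000, 900_000]  # hypothetical totals, 1980-2010
errata = [1_500, 2_000, 3_000, 4_500]          # keeps pace with the literature
retractions = [10, 20, 120, 400]               # grows much faster

for label, counts in (("errata", errata), ("retractions", retractions)):
    rates = [c / p for c, p in zip(counts, papers)]
    print(f"{label}: rate grew {rates[-1] / rates[0]:.1f}x from 1980 to 2010")

# With these made-up numbers the errata rate is flat (1.0x) while the
# retraction rate grows ~13x -- the pattern Fanelli reports, which argues
# against a general rise in corner-cutting.
```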

Fanelli also finds that the proportion of journals retracting papers has increased, while the number of retractions in each of those journals has remained the same. Taken with the fact that the number of cases of misconduct found by the Office of Research Integrity (ORI) has not decreased, Fanelli concludes:

Data from the [Web of Science] database and the ORI offer strong evidence that researchers and journal editors have become more aware of and more proactive about scientific misconduct, and provide no evidence that recorded cases of fraud are increasing, at least amongst US federally funded research. The recent rise in retractions, therefore, is most plausibly the effect of growing scientific integrity, rather than growing scientific misconduct.

Hence the “(mostly) a good sign” in the title.
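
The journal-level pattern Fanelli reports (more journals retracting, but no more retractions per retracting journal) rests on a simple decomposition; here is a toy illustration with invented numbers, not figures from the paper:

```python
# Toy illustration (invented numbers) of the decomposition behind this
# finding: total retractions = (number of retracting journals)
#                            x (retractions per retracting journal).

breadth = {1990: 20, 2010: 200}  # journals issuing any retraction: 10x more
depth = {1990: 1.5, 2010: 1.5}   # retractions per retracting journal: flat

for year in (1990, 2010):
    print(year, breadth[year] * depth[year])  # 30.0 -> 300.0

# The total rises tenfold even though no single journal retracts more;
# a genuine rise in misconduct would be expected to push "depth" up too.
```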

Fanelli singles out retraction notices for discussion:

An unjustified stigma currently surrounds retractions, and the opaqueness of many retraction notices betrays improper feelings of embarrassment [1]. Nearly 60% of retraction notices linked to misconduct only mention error, loss of data or replication failure, and less than one-third point to a specific ethical problem [19]. Editors writing these notices often use ambiguous euphemisms in place of technical definitions of misconduct, perhaps to prevent legal actions (see www.retractionwatch.com). Although retraction notices are becoming more transparent, many journals still lack clear policies for misconduct and retraction, and existing policies are applied inconsistently [19,20,21]. It is worth pointing out that journals with a high impact factor are more likely to have clear policies for scientific misconduct [22,23]. This datum offers a simple, and largely overlooked, explanation for the correlation observed between journal impact factor and retraction frequency, which instead is usually attributed to higher scrutiny and higher prevalence of fraudulent papers in top journals [1,7].

We asked Ferric Fang, who has of course studied retractions and is more of a proponent of the growing misconduct hypothesis, for his take:

The increasing number of retracted articles often raises the question of whether this reflects more misconduct or greater scrutiny.  Fanelli’s recent paper is an interesting attempt to answer this question.  However, I have some concerns about the study’s methodology and conclusions.  One line of evidence is that records marked as “correction” in the Web of Science have been stable since the 1970s.  However, corrections are frequently of a very minor nature, and it is conceivable that a significant increase in misconduct would be masked by a large pool of corrections that are trivial in nature or the result of honest error.  I have also learned, on the basis of studies that I performed with Arturo Casadevall and Grant Steen, to be careful about relying on journal retraction and correction notices.

A second line of evidence advanced by Fanelli is that the ORI caseload has been stable.  I have puzzled over this fact as well.  However, it is important to recognize that the ORI was only formed in 1992, and the rise in the rate of retractions had already begun prior to that time (Fang et al. PNAS 109:17028, 2012).  Furthermore, Fanelli examined the ORI caseload from 1994 to 2011, but John Dahlberg reports that the ORI caseload rose sharply in 2012-2013.

Given all that, Fang has a somewhat different conclusion than Fanelli:

I agree with Fanelli that retractions are an imperfect reflection of research misconduct and that attempts to extrapolate from retractions to the scientific enterprise at large must be made cautiously.  Nevertheless, the detailed story behind each retraction has revealed useful insights into why scientists may be tempted to engage in misconduct and how journals and institutions may inadvertently encourage or enable this behavior.  The work of Retraction Watch has been particularly important in this regard.  I also agree that growing efforts by journals and scientists to correct the scientific record should be strongly encouraged.

My own view is that changes in both author and journal behavior have contributed to the rise in retractions (Steen et al.  PLoS One 8:e68397, 2013).  Fanelli makes a case for an increasing propensity by journals to retract invalid papers, with which I can surely agree.  However, this does not exclude the possibility that misconduct is also increasing.  In fact the nearly ten-fold rise in the percentage of articles retracted for fraud or suspected fraud since 1980 suggests that either journals failed to detect and retract 90% of fraudulent work prior to 1980, or that such work is more common today.  I think the latter is more likely.  With studies like that of Martinson et al. showing that a third or more of scientists admit to questionable research practices (Nature 435:737, 2005), I am not sure that I can share Fanelli’s sanguine view that the increase in retractions is a good sign.
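
To make the arithmetic behind Fang’s dilemma explicit, here is a back-of-the-envelope sketch; the rates are invented for illustration, not taken from either paper:

```python
# Back-of-the-envelope version of Fang's dilemma (all rates invented).
# Let f = true incidence of fraudulent papers and d = the fraction of
# them that journals detect and retract; the observed retracted-for-fraud
# rate is then f * d.

observed_1980 = 1e-5  # hypothetical retracted-for-fraud rate, ca. 1980
observed_now = 1e-4   # ~10x higher today, per Fang's figure

# Hypothesis A: f constant, only detection d changed. Then
# d_1980 / d_now = observed_1980 / observed_now:
print(observed_1980 / observed_now)  # 0.1 -> pre-1980 detection at one
                                     # tenth of today's level

# Hypothesis B: d constant, incidence f rose ~10x. The observed rate
# alone cannot distinguish A from B; that is the crux of the dispute.
```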

27 thoughts on ““Why Growing Retractions Are (Mostly) a Good Sign”: New study makes the case”

      1. This case should be brought up again, and again, and again, and again ….. until it is resolved PROPERLY.
        “If you are neutral in situations of injustice, you have chosen the side of the oppressor”, Desmond Tutu.
        Frank, do you urge us to be NEUTRAL?!?

        1. Actually, I was going to give a thumbs down because Mr. Robertson has been quite insistent on his story in quite a few stories at RW. However, I decided to pause my mouse right click function and give him the benefit of the doubt, and I decided to read the long PDF file more attentively. I must be honest, I think that even if his claims are incorrect, his request for a transparent inquiry (by the research institute and by the publisher) is valid. I am worried however that the University of Sydney did specifically request Mr. Robertson to not make any communications public while the investigation was pending, a clause he apparently violated by placing a link at RW to that PDF. This brings up an interesting issue: if scientists do not have any location to vent their frustrations about publishers, other fraudulent scientists or other related issues, then how are they to effectively have their voices heard? Maybe RW could fortify its own liberal image by providing a list of blogs and notice boards that also cover related topics (we need a fraud watch integrator, basically). Mr. Robertson, if the UoS finds that there was no foul play, why not submit a Letter to the same journal in MDPI and seek a formal publication of your concerns? That might bring you some psychological relief, even if it does not bring you the feeling of justice (i.e., a retraction).

          On the separate issue of sugars and sweeteners, I am extremely concerned about some research taking place in Japan, and maybe you could weigh in on this topic for me. There is a group working on rare sugars and quite recently a product has appeared in the local supermarkets which is actually a syrup, with an undisclosed amount of D-psicose in it. I should add that this group has just received millions in funding from the local prefectural government to expand the rare sugar business, without a single public explanation. From the literature I have read on these sugars, I have seen that at some concentrations, psicose (the D-form; and also other rare sugars) can actually be extremely toxic. However, researchers have found that at some very tiny concentrations, this rare sugar can stop the proliferation of cancer cells (very broadly). From that finding, suddenly, without any human clinical trials and only having tested on lab rats, we find D-psicose being tested on the human population with massive funding to expand the food products in which it is to appear. To me this is astounding because I would think that extensive toxicological tests and clinical trials would be required for substances that are to be included in human food. To me this equates basically with the use of a live human population to serve as the toxicological test, en masse, with the (untested) claim that this is somehow some sort of a miracle anti-cancer and health-inducing agent. So, here I see some parallels with your Paradox paper. Although I am not a human physiologist, I assume that when a sugar (or rare sugar) enters the blood stream, the metabolic pathway would result in the integration of a certain proportion as fat, or even within the cytoplasm of cells. Thus, is it possible (hypothetically) that a rare sugar could accumulate in the body, even if consumed at non-toxic concentrations each time, over a long period of time, or would it simply pass through into urine as it is not an L-form?

          Sorry for the distracting question which has nothing to do with Fanelli’s paper, but which is linked to Robertson’s comments.

  1. How about this for a possibility: people have greater awareness of and access to poor quality research.

    In the past, 20+ years ago, I think the attitude was this. There were high-profile journals, second-tier journals, and then the dross that you wouldn’t bother consulting or citing. Publication in the first and second tiers would entail a fairly strong review.

    Now, starting in the early ’90s, there was a great advance in searchable electronic indexes, and then electronic publishing. It has now become infinitely easier to identify and acquire a publication in almost any (English-language) journal.

    I suspect many more eyes are seeing low-quality papers than would have previously. This would account for an increase in retractions from low-tier journals. Many of the papers we see here are from quite obscure journals.

    Here’s a research question: Has there been an increase in the number of retractions of high-impact papers? ‘Important’ papers in Big Journals have always gotten a lot of eyes. Are they getting more likely to be retracted?

    1. Perhaps a more interesting question to ask is: now that we have keyword searches, why do we need Big Journals to sort our topics for us? (I can actually think of many reasons to keep editorial-like roles to help with sorting out a little bit, but, man, I don’t read journals; I just search in PubMed or SciFinder, etc.)

  2. Fang is quoted as saying this:

    ” In fact the nearly ten-fold rise in the percentage of articles retracted for fraud or suspected fraud since 1980 suggests that either journals failed to detect and retract 90% of fraudulent work prior to 1980, or that such work is more common today.”

    I would imagine that journals even currently fail to retract the vast majority of fraudulent work. Does Fang believe that most fraudulent work gets retracted?

    1. It is well documented that not all fraudulent work is retracted, as we noted in our 2012 PNAS article. However, if one is to argue that research misconduct was just as common prior to 1980 as it is today, then there must be a substantially greater proportion of fraudulent work prior to 1980 that was never recognized or retracted. I find this to be implausible and know of no supporting evidence.

      1. Either your argument is circular, or it rests on the assumption that it is implausible that 90% of fraudulent work went un-retracted prior to 1980. I don’t find that implausible at all.

        1. You seem to misunderstand my comment. I agree that retractions represent only a fraction of fraudulent work. However, if the incidence of research misconduct has indeed been constant over time, this would imply that perhaps five-to-ten times as many fraudulent papers published prior to 1980 remain unretracted in comparison to today. I don’t think this is very likely and I don’t think there is any evidence for it. It seems more likely that misconduct has become more frequent. I would argue that this, along with the lowered barriers to retraction that Daniele Fanelli and my colleagues and I have documented, has contributed to the temporal trend of increasing retractions. It is clear that potential incentives for misconduct (e.g., declining job opportunities and funding success rates) have intensified over time, so I don’t think an increase in misconduct should be all that surprising. The Office of Research Integrity received 422 allegations of misconduct in 2012, compared to 167 in 1993, which provides further evidence of an increase in actual misconduct.

          1. In the essay, I analyzed data on allegations, too. A rise in allegations is perfectly compatible with a rise in scientists’ awareness of these issues.
            Absence of evidence is not evidence of absence, and it could be true that misconduct has risen. The point is that data on retractions/cases of misconduct/allegations have no bearing on the issue. By assuming that they have, we do a disservice to the community. It’s like blaming active citizens and a growing police force for a rise in the number of reported crimes and convicted criminals.

          2. I don’t think I’ve misunderstood. If you allow that retractions may represent only a small fraction of fraudulent work, then your “argument” would appear to be question-begging. Obviously if x is larger now then 1-x was larger in the past. You’re more or less asserting that 1-x couldn’t have been that much larger in the past, without offering a compelling argument, which is equivalent to simply asserting your conclusion.

            You say that there is no evidence for this, but we are discussing it as one explanation for an empirical observation (an increase in retraction rate). There may be arguments against it, but I don’t see where you have added anything by pointing to the need for 1-x to have been larger in the past.

            Also, it is not true that this explanation “would imply that perhaps five-to-ten times as many fraudulent papers published prior to 1980 remain unretracted in comparison to today”. Suppose that 10% of fraudulent papers are retracted today and 1% were retracted prior to 1980. Then the proportion of unretracted fraudulent papers was only 10% higher in the past (99% then vs 90% now).
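
            To put numbers on that (the same hypothetical coverage rates, nothing from either paper):

            ```python
            # 10% of fraudulent papers retracted today vs 1% before 1980
            retracted_then, retracted_now = 0.01, 0.10
            unretracted_then = 1 - retracted_then  # 0.99
            unretracted_now = 1 - retracted_now    # 0.90
            print(unretracted_then / unretracted_now)  # ~1.1: only 10% higher then

            # A 10x change in the small retracted fraction moves the large
            # unretracted fraction by only ~10%, not five-to-ten-fold.
            ```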

            I am not arguing that the “greater vigilance” explanation is correct; I am merely questioning an argument that has been made against it.

  3. I thank Ferric Fang for the comments, but would like to emphasize a third line of evidence in the paper: the number of retractions per retracting journal has also not increased. If misconduct were really rising, we would expect each journal to be dealing, on average, with more cases. There is no evidence for that either.

    Connected to this is an answer to Dan Zabetakis: as I show in the paper, retractions in Science, Nature and PNAS (the only journals with retractions older than 20 years) have not increased gradually, but rather show an abrupt rise starting around 2000. Only PNAS’s retractions peaked recently. So there is little sign of an increase in retractions for high-ranking journals.
    Low-tier journals are certainly retracting more, as you suggest, but again not per capita, only in absolute numbers. It could be due to greater scrutiny, as you suggest, but the simplest explanation is that they are just more willing to retract. Past studies found a strong correlation between a journal’s impact factor and the presence of retraction policies, a correlation that will hopefully weaken in the future.

    1. I have to respectfully disagree. The number of retractions per retracting journal has increased over time. Using the data gathered for our 2012 PNAS study, and focusing on the articles retracted by Science, Nature, Cell or PNAS for fraud or suspected fraud, there was only a single retraction prior to 1980, in contrast to 6 retractions in the 1980s, 16 retractions in the 1990s and 42 retractions in the 2000s (by date of publication). I can also tell you, based on my experiences as an Editor-in-Chief, that journals are detecting cases of fraud or suspected fraud prior to publication and dealing with them accordingly. These cases never show up in the literature in the first place and are therefore not reflected in analyses of retractions.

      1. Let me first say, Ferric, that I am delighted by this chance to discuss the issue, and I appreciate your willingness to engage in the debate.

        To answer the Nature, Science and PNAS issue, at this point I can just send readers back to my paper (Figure 3, and the related section in the text).

        I fully believe your behind-the-scenes experience as an editor, and it would be quite interesting to have some data on pre-publication retractions, so to speak. However, again the key question is to what extent editors paid this level of attention to possible cases of fraud 10, 20, 30, 40 years ago. Did you?

        Perhaps you did, but in that case you would seem to be the exception, rather than the rule. There is plenty of evidence, for example, that most journals didn’t have any policies for retraction until recently, and that editors had no clue about what to do.

        1. Thank you, Daniele. Forty years ago I was in high school and not thinking much about fraud! In truth, although the Darsee case was in the news when I was a student at Harvard Med School, and I read x-rays with Slutsky at UCSD, I confess to having given little attention to the topic of research misconduct until I was drawn into the Mori case in 2010. However, I don’t think that earlier generations of editors were necessarily clueless. After reviewing more than 2,000 retracted articles and reading accounts of the history of research misconduct by scholars like Zuckerman, Staneck and Cranberg, I have come away with a strong impression that the research culture has changed, and not for the better. Heartfelt testimony from scientists at the 1981 Congressional Hearings on Fraud in Biomedical Research conveys the sincere impression, held at the time, that fraud was exceedingly rare. Although there have certainly been instances of fraud documented throughout scientific history, the current ‘epidemic’ is largely a phenomenon of the last couple of decades. An AAAS survey of ~2,500 scientists in 1967 reflected high ideals and little evidence of fraud, in stark contrast to the meta-analysis that you published in 2009, in which up to 72% of respondents reported witnessing questionable research practices. While it is not possible to conclude with certainty (hence the diversity of viewpoints), my strong impression is that misconduct is more common today. For those who disagree, hopefully we can still agree that the current amount of misconduct is too much. Moreover, you and I are in strong agreement that journals and authors who proactively retract flawed work from the literature should be encouraged.

          1. So, Ferric, your experience as an editor was that you became interested in misconduct and retractions only recently. But then, why are you so reluctant to believe that most other editors had a similar story? Note that there is pretty good evidence that this is indeed the case.

            I respect your “impression”, but I see very little evidence to allow us to talk about an “epidemic” of scientific fraud. I only used that term once in my papers, and it referred more generically to false positives and biased findings, for which there is good evidence of a growth (as discussed in the paper). Scientific misconduct, as an extreme form of these problems, could therefore have increased, but there could also be an argument for the opposite to be true: subtle biases might be replacing outright fraud, precisely because scientists are becoming more aware of when they should stop.

            Merton, Zuckerman et al. were greats in the sociology of science, but they are widely regarded today as having projected a partially idealized view of science.

            Any level of misconduct is surely too much, although we could argue that to some extent it is physiological. But then let’s also agree that current retractions are too few.
            The current number of retractions might seem impressive, but it is just a tiny, tiny fraction of the literature. The proportion of wrong and fraudulent papers is almost certainly higher.

  4. I’ve read the Fanelli paper in detail now. My opinion is that the trend in retractions clearly represents a change in publishing practice rather than a change in ethics among scientists. It seems that around 2000, retracting a paper came to be seen as the appropriate response. Before that, retractions were rare and probably handled on an ad hoc basis, without written policy.

    The paper clearly shows that what is increasing is the number of journals that retract papers.

    We can see changes in policy relating to corrections. We see dramatic increases in corrections circa 1943 and circa 1962. Do we have any idea why that would be? Could this represent either a change in editorial policy at one or more major publishers, or a change in reporting status with respect to the database from which these data were extracted?

    1. Dear Fanelli and Fang, based on your studies, even if they show different things, could you please provide, or publish, a detailed breakdown of how retractions relate to two aspects:
      a) Time from date of submission to acceptance;
      b) Country of origin of the corresponding author and/or total authors.

      I suspect that there could be some peaks in some countries over time (I hypothesize about half a dozen exponential curves), and maybe even a decrease in other countries, causing the curve to be overall straight (the trend that you observed). This information could be extremely valuable and should be fairly easy for you to generate because you already have – hopefully – all the meta-data in your hands. These data are actually even more important than the data you present in your recent papers because there are claims, for example based on the Bohannon (Science) sting and on Jeff Beall’s list of OA predators, that fraud in science publishing is emerging primarily from developing nations in Asia (India, Pakistan, Iran) and Africa (Nigeria), but also in the EU (UK) and the US. The former group is using the latter locations for establishing websites, servers, the postal address of the publisher, and the banks receiving funding for OA fees. However, fake, predatory, unscholarly or fraudulent journals run from other countries, for example from Australia, the UK or Canada, are in many cases in fact run not by nationals of these countries but by nationals of the same developing nations in the former list; i.e., is fraud as a business model creeping into, or being facilitated by, “Western” publishing platforms? So, for example, you may find that many of the managers and editors of “predatory” OA publishers are in fact not British or Australian, but rather Indian, Pakistani or Iranian (please examine the list at the Beall website). Therefore, even if you are able to generate data for b), it might not actually reflect the true cultural origin of fraud or misconduct (if I may liberally equate retractions with fraud or misconduct).

      If in fact you are able to generate data for b), it might be skewed by the fact that there is increasing collaboration. So, one might find, for example, a scientist from developing country XYZ working with a US scientist at a distance, or a scientist from country XYZ working in the US. If the US scientist is listed as the corresponding author, then all authors will be listed as “US scientists”, even if they are not.

      Thus, for b), my hypothesis is: “The cultural origin of the author is related to the level of retractions”. Can you prove or disprove this based on the data you have?

      As for a), this is important to know because excessively rapid submission-to-acceptance cycles could indicate insufficient peer or editor quality control. Thus retractions in these cases might not actually be the absolute fault or responsibility of the authors, but in fact may reflect irresponsibility on the part of the editor(s) / peers / publisher.
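
      Concretely, the sort of computation I have in mind would look something like this (a hypothetical sketch: the file and column names are my invention and may not match either dataset):

      ```python
      # Hypothetical sketch of breakdowns a) and b); the file and column
      # names are invented and may not exist in Fanelli's or Fang's data.
      import pandas as pd

      df = pd.read_csv("retractions.csv", parse_dates=["submitted", "accepted"])

      # a) time from submission to acceptance for retracted papers
      df["review_days"] = (df["accepted"] - df["submitted"]).dt.days
      print(df["review_days"].describe())

      # b) retractions per year by country of the corresponding author
      by_country = (
          df.groupby(["corresponding_country", "year_retracted"])
            .size()
            .unstack(fill_value=0)
      )
      print(by_country)
      ```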

      The number of retractions is not important. That will rise, and should rise further as we begin to twist the arms of the editors who are biased, blind, or unreceptive to change and to truth-telling.

      1. Statistics of this kind are already used, and my point is exactly that they can be severely misleading.

        To use your example, “predatory” journals are both most likely to publish rubbish and least likely to retract papers. In fact, they probably retract none at all. Does that mean that they publish better research than Nature and Science?!

        In short, authors, journals, countries, institutions that retract papers are showing integrity, not fraudulent behaviour. Please read my essay and see if I can convince you. Thanks!

        1. Dear Daniele, I am 100% with you on your evaluation of the “predatory” journals. They actually pose an existential risk to the entire publishing community: their lack of control, and the fact that they are becoming increasingly referenced in “true scholarly” journals, is going to cause the gradual corruption of “top” journals, but in a sinister way, like an internal fruit rot, or a cancer. Only once the visual signs are there will people understand the effects; until then, it is a silent killer. Regarding your paper, and my query, actually, I was hoping for more of a Yes or No answer to my questions a) and b) above. Of course I do not have the tools, the skills or the time to accumulate the data that you and Fang already have, so I imagine that calculating metrics for a) and b) should be relatively easy. Can you do it? It doesn’t actually matter if these metrics or values are useless or not, because interpretation is a subjective thing. If you could generate those two statistics, it would open up our window of understanding, even if the correlations are weak. At this moment, science and science publishing are in revolution; that is why your papers are so frequently downloaded and referenced: the community is hungry for answers and solutions.

          Regarding your last comment, I see things in a more sinister way (see my comments made about this grab for and commercialization of ethics here: http://retractionwatch.com/2013/11/29/p-values-scientific-journals-top-ten-plagiarism-euphemisms/#comments). I honestly believe that this “ethics” issue is just part of the new business model, the “soft” capitalistic approach to ensuring clients for the future. If the publishers can assure a strong mutual bond (even if they are still competing for the same base of clients), and link that with a powerful ethics body, powerful ethics software and a powerful scientist database, then their chances of survival and business will be projected long into the future. Powerful is synonymous with centralized.

          So, your essay is great, make no mistake about it. It gives us new clues to this world of retractions, but I personally believe that it does not reflect the reality of the tsunami that is coming. A lot of scientists, including me, are increasingly angry and frustrated for several reasons: they feel marginalized, they feel that they are the victims of bias and lousy peer review practices, and they feel that publishers are becoming increasingly intrusive and that respect is gradually being lost while principles and “ethics” are constantly being challenged. Unlike Fang, I am not of the “older” generation, but I still feel that things have changed a lot from 10-15 years ago, and changed RADICALLY in the last 3-5 years. I do agree that retractions are mostly “good”, but the lack of openness, detail and transparency in retraction notices, and their failure to quantify the problem, is soon going to become a thorn in the side of publishers unless they rethink how retraction notices are written and shared, and what information they contain.

          1. “It doesn’t actually matter if these metrics or values are useless or not, because interpretation is a subjective thing.” I am sure that you say this with the best intentions, but I have to completely disagree. We are scientists precisely because we are trying to remove subjectivity from our claims. Our job is to try to make objective sense of the data.
            When isolating subjectivity is judged impossible or undesirable, then we should admit as much and reveal our biases transparently.

            There is nothing intrinsically wrong with doing advocacy, PR, politics, even propaganda, and it’s fine to push for reforms based on an unsupported “hunch” that they are right. But then let’s just say so, and please let’s not pretend that “evidence shows”, “scientists agree”, etc.

            It’s a matter of scientific integrity and dignity.

    2. Yes, these are interesting hypotheses that we (you?) could test. Alternatively, those peaks could be just artefactual, caused for example by changes in the composition of the database and/or fluctuations in the proportionally smaller number of records available.
      I guess we should be careful not to forget that the Web of Science was not around in the 1940s, so its database may not be an accurate sample of what went on in the past; it is a retrospective collection, assembled today, of what is still available from the past.
