Is it time for a Retraction Index?

We often hear — with data to back the statement — that top-tier journals, ranked by impact factor, retract more papers than lower-tier journals. For example, when Murat Cokol and colleagues compared journals’ retraction numbers in EMBO Reports in 2007, Nature noted in its coverage of that study (h/t Richard van Noorden):

Journals with high impact factors retract more papers, and low-impact journals are more likely not to retract them, the study finds. It also suggests that high- and low-impact journals differ little in detecting flawed articles before they are published.

One thing you notice when you look at Cokol et al.’s plots is that although their models seem to take retractions “per capita” — in other words, per study published — into account, they don’t report those figures.

Enter a paper published this week in Infection and Immunity (IAI) by Ferric Fang and Arturo Casadevall, “Retracted Science and the Retraction Index.” Fang, the editor of IAI, takes scientific integrity and retractions very seriously. He’s made his thinking on these issues clear every time we’ve asked, and was part of the review of the Naoki Mori case that led to a 10-year ban on Mori publishing in American Society for Microbiology journals (including IAI).

For their IAI paper, Fang and Casadevall searched PubMed for retractions in 17 journals whose impact factor ranged from 2 to 53.5 (New England Journal of Medicine was at the top). For those of you unfamiliar, here’s how the impact factor — derived by Thomson Scientific’s Web of Knowledge — is calculated:

The journal Impact Factor is the average number of times articles from the journal published in the past two years have been cited in the [Journal Citation Reports] year.

The Impact Factor is calculated by dividing the number of citations in the JCR year by the total number of articles published in the two previous years. An Impact Factor of 1.0 means that, on average, the articles published one or two years ago have been cited one time. An Impact Factor of 2.5 means that, on average, the articles published one or two years ago have been cited two and a half times. Citing articles may be from the same journal; most citing articles are from different journals.
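
To make that arithmetic concrete, here is a minimal sketch in Python, using made-up numbers rather than any journal’s actual citation counts:

    # Hypothetical illustration of the impact factor arithmetic quoted above.
    # None of these numbers describe a real journal.
    citations_in_jcr_year = 1250  # 2010 citations to items the journal published in 2008-2009
    citable_items = 500           # articles the journal published in 2008-2009

    impact_factor = citations_in_jcr_year / citable_items
    print(impact_factor)  # 2.5 -- the average 2008-2009 article was cited 2.5 times in 2010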

Fang and Casadevall’s “retraction index” was a simple calculation: They took the number of retractions in the journal from 2001 to 2010, multiplied by 1000, and divided by the number of published articles with abstracts.
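
In code, that calculation reads roughly as in the sketch below; the counts are invented for illustration and are not Fang and Casadevall’s actual figures:

    # Sketch of the retraction index as described in the paper:
    # retractions from 2001-2010, times 1000, divided by articles with abstracts.
    retractions_2001_2010 = 12       # invented count of retractions over the decade
    articles_with_abstracts = 25000  # invented count of published articles with abstracts

    retraction_index = retractions_2001_2010 * 1000 / articles_with_abstracts
    print(round(retraction_index, 2))  # 0.48 retractions per 1,000 published articles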

They then plotted the retraction index against the impact factor. Here’s that plot:

Source: IAI, August 8, 2011, ahead of print. Used with permission of Ferric Fang, Arturo Casadevall, and ASM Journals

As you can see, that plot

revealed a surprisingly robust correlation between the journal retraction index and its impact factor (p < 0.0001 by Spearman rank correlation). Although correlation does not imply causality, this preliminary investigation suggests that the probability that an article published in a higher impact journal will be retracted is higher than that of an article published in a lower impact journal.
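
For readers who want to run the same kind of test on their own numbers, a Spearman rank correlation takes one line with SciPy. The values below are placeholders, not the data behind the plot:

    # Sketch: Spearman rank correlation between impact factor and retraction index.
    # Both lists are placeholder values, not Fang and Casadevall's data.
    from scipy.stats import spearmanr

    impact_factors = [2.0, 3.5, 5.8, 9.6, 13.1, 31.4, 36.1, 53.5]
    retraction_indices = [0.1, 0.2, 0.2, 0.4, 0.6, 0.9, 1.0, 1.2]

    rho, p_value = spearmanr(impact_factors, retraction_indices)
    print(rho, p_value)  # a rho near 1 with a tiny p-value indicates a strong monotonic association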

The findings also aren’t dissimilar to what Cokol et al. found. Fang and Casadevall offer a number of potential explanations for what they learned. Not surprisingly, given how outspoken Fang has been about misconduct and retractions, the authors pull no punches. For example, there is pressure to make results fit into a clean narrative:

In contradistinction to the crisp, orderly results of a typical manuscript in a high impact journal, the reality of everyday science is often a messy affair littered with non-reproducible experiments, outlier data points, unexplained results and observations that fail to fit into a neat story. In such situations, desperate authors may be enticed to take short cuts, withhold data from the review process, over-interpret results, manipulate images, and engage in behavior ranging from questionable practices to outright fraud (36).

So if journals are eager to trumpet their high impact factor, shouldn’t they also be willing, in the name of transparency, to let the world know how frequently papers are retracted? Maybe alongside every “New Impact Factor: 7.8” on a journal’s site should be “Retraction Index: 2.3.”

Fang and Casadevall’s paper — which includes commentary on how explicit retraction notices should be — should be required reading for anyone interested in scientific integrity.

It also mentions a blog we should probably check out:

Last year, the journalists Ivan Oransky and Adam Marcus launched a blog called “Retraction Watch,” which is devoted to the examination of retracted articles “as a window into the scientific process” (63); sadly, they seem to have no trouble finding material.

Indeed.

35 thoughts on “Is it time for a Retraction Index?”

  1. I wonder if part of the issue is that the work in the higher-impact journals is subject to more public scrutiny. It may be that bad results can be buried and ignored more easily in a minor journal (esp. one written in a language other than English or one that requires a paid subscription).

    I am sure that competition and a pressure to publish in high-impact journals also play a role.

    1. I was about to leave a comment saying exactly this. It’s very easy for people to bill high-impact journals as glamour mags that prize sexy science over high-quality science. But I suspect the greater scrutiny applied to high-profile journals is a bigger driver of the number of retractions. That being said, it is interesting that, for example, Science has a higher retraction index than Nature despite similar impact factors. It’s also interesting that Cell, often touted as the better alternative to the Big Two, has a retraction index somewhere in between.

      1. Another partial explanation may be that the fame of publishing in a high impact journal makes it more ‘profitable’ to present fraudulent research.

        It’s a pity that those retraction indices do not include a distinction between plagiarism, misconduct (fraud), and just plain error. It would indicate whether my hypothesis has any merit at all.

      2. I agree with Marco, but I am not sure if greater scrutiny is the main point regarding retractions in high-impact journals. Often, data are just weak and look suspicious if you know how the assays work, the choice of assays often raises questions, and materials and methods are poorly described. This, together with reports (personal communication) I have received that high-profile journals actively encourage authors to make new and surprising statements and even call well-known PIs to ask whether they might have a hot new “story,” makes it seem more plausible to me that hunger for fame, journals’ calls for storytelling, and whistleblowers’ scrutiny all contribute to the lower credibility of high-profile papers nowadays.

        Another point I’d like to make is that journals respond very differently to allegations of scientific misconduct. My dataset is limited and mostly second-hand knowledge, but an impact factor of 10 always meant no action until considerable pressure was applied.

    2. Agree with both comments. I suspect that the retraction index (RI) would be dramatically higher in many lower impact journals, thus pointing out the lack of scrutiny to which most science is subjected. You might even argue that a high RI is a good thing, because it demonstrates that people are actually paying attention.

  2. Another explanation for the lower retraction rate of Nature compared with Science is that the British (Nature) might just be better at stonewalling than the Americans (Science).

    In the U.S. you have freedom of speech, and a freedom of information act which has been operating for at least 30 years. The one in the U.K. is only just off the starting blocks. We can’t do an experiment, as there are no alternative world histories, but it is another “explanation”. The fact that the U.K. has libel laws which would not run in the U.S. might also put a dampener on legitimate debate.

    In the U.K., nearly all universities are state organisations; the higher proportion of state funding, compared with the U.S. with its many private universities (Harvard, Yale, Stanford…), creates a monolith which does need to be obeyed. You speak out and find that the person you are criticising is higher up the only ladder. Speaking your mind may not find favour in official circles. Nature might just take less notice of foreigners. Social control may not be all that obvious, but it is there.

    1. I think that is stretching it a bit. The numbers really are too small to support such large conclusions. Science, for example, can attribute 8 of its retractions to just one person: Jan Hendrik Schön. If those were counted as just one retraction, Science would likely move much closer to Nature (which only had 2 retractions from the same person, if I recall correctly).

  3. Ivan,

    Thanks for an excellent article and for bringing to our attention the interesting study by Cokol et al. Arturo Casadevall and I were not previously aware of this paper. Our similar conclusions using somewhat different methodological approaches suggest that the relationship between impact factor and retraction rate is a robust one. I agree with the comments that greater scrutiny of high profile papers as well as greater incentives for misconduct are likely to be contributory factors. I don’t necessarily fault the journals, as deliberate fraud can be extremely challenging to detect.

    Keep up the great work with Retraction Watch!

  4. “De plane, boss, de plane…”
    Let’s hope your excellent blog will encourage more public scrutiny of irreproducible results in obscure journals, increasing the retraction index temporarily. In the long run, more scrutiny (and openness) means better science.

  5. Clearly it’s time for a “Journal of Universal Retraction.” At the theoretical maximum retraction index of 1000, the journal’s impact factor ought to be close to 12,000.

    1. I like this idea, and, in much need of a push in my scientific career, I will try to get one or two articles in JUR. 24,000 IF points will make me unbeatable!

  6. Any introduction of a retraction index would surely reduce the already very low motivation of many editors to proceed with a retraction, or investigate a suspect paper, when warranted.

  7. Some comments have hit the proverbial nail on the head. The fact that higher impact journals retract more articles (proportional to their impact factor!) is not surprising; NEJM, Science and Nature articles are simply seen and scrutinized more. I won’t trust any stats that suggest otherwise unless they have explicitly attempted to control for this popularity phenomenon.

  8. Interesting post, Ivan. I think it’s a great idea — just a bit of transparency, really — but of course, journals will probably never go for it. That said, two points about the link between high impact journals and retraction frequency: (1) these journals may try to push the envelope a bit, publishing material that is flashy and guaranteed to get attention, but may not be quite up to snuff. So that’s your basic high risk/high reward balance.

    (2) Papers published in high impact journals are typically timely and in “hot fields”, so they are also more likely to be read and followed up on, meaning mistakes are more likely to be found than with the “average” paper in a lower-tier journal.

  9. Er. Simpler explanation. High IF journals = cutting-edge science. New science and results are sometimes proven to be wrong, difficult to reproduce or honestly misinterpreted in due course. A good proportion of papers are retracted on this basis and not because the authors are trying to pull a fast one. It’s easy for researchers to assume the worst of their colleagues, isn’t it?

  10. Maybe it’s not the journal’s fame but a parallel factor? Like the big shots, well connected, connected to big drug business, those who actually publish in the big journals.
    The retraction index once existed, if I remember correctly, at MIT. When I tried to look at it, the address was correct, but the index was unavailable.

  11. How about scientific misconduct detected during the review process (I recently reviewed and rejected a paper that plagiarized a paper that was made available online using an “early access” model)? Would it be meaningful to factor that in as well?

    As for “Retracted Science and the Retraction Index”: it is interesting to see that this paper starts with a quote attributed to Confucius. I have some research experience in Confucian Korea, and the level of scientific misconduct going on there is, in my opinion, still alarmingly high, despite the Hwang Woo-suk debacle a few years ago…

    1. Some would probably say that it doesn’t matter from where and how an item is stolen. I happen to think that it does: if the thief is stealing unpublished material, stealing from someone in a vulnerable situation (a PhD student, for example), or is in a position to see the material as a part of his/her duty (being a PhD supervisor or the journal reviewer), the crime would be grossly aggravated. Stealing from an “early access” publication can introduce some confusion about the dates of the original and the unoriginal publications. In such a case, stealing ideas can remain unproven, although if the language matches, it’s a clear case. Usually, in science, the plagiarism is not in the language, but the language can serve as proof.

      1. To clarify: the paper in question plagiarized in terms of both language and ideas. Given the plagiarism in terms of language, I think a system for automatic plagiarism detection could have been helpful here (on the condition that “early access” papers are indexed by the system before they are made available online). Also, out of the three reviewers assigned to the paper, I was the only one to spot the plagiarism…

        I recently also had to review a journal paper that was a duplicate of a paper published earlier in the proceedings of a conference (the authors had only slightly altered their wording). Here too, a scan by a system for automatic detection of plagiarism would probably have been helpful…

    2. From my experience and anecdotes: blatant plagiarism, assigning authorship purely by seniority instead of merit, authorship granted to all students so they might graduate… the list of research misconduct in Korea goes on and on.
      And the worst part? The culture doesn’t allow for a free and open debate, and is rather dominated by groupthink: you are either with us or against us. Combine that with the mentality that you should be loyal to your clique, regardless of consequences, and in the end you have a group of researchers that cooperatively hide mishaps, exaggerate successes, and put up a joint facade until the very last minute, when nobody accepts blame or responsibility.

      Check out Kim Tae Kook’s downfall at KAIST. His students even graduated and went on to research associate positions at UCLA, all based on the paper that was eventually retracted. The incredible part? The first co-authors claimed during the investigation that they didn’t know the extent of the fraud, and another senior author similarly claimed ignorance. How is that at all possible? Why are you the first co-author if you weren’t even involved in working up the very basic premise of your research?

      http://www.sciencemag.org/content/324/5926/450.full
      http://news.sciencemag.org/sciencenow/2008/03/05-01.html
      http://www.nature.com/nchembio/journal/v4/n7/full/nchembio0708-381.html

      Beware of groupthink, nurtured by blind loyalty to your elders, a theme repeatedly emphasized by Confucius in his teachings. Quoting Confucius on research ethics? Gah!

      1. I am actually happy to see that I am not the only one to complain about scientific misconduct in Korea. Also, don’t forget the pervasive “quickly, quickly” attitude, leading to sloppy execution, shallow treatment of research questions (discussion just slows things down), and badly written papers.

        As for Kim Tae Kook’s “punishment”: I would not be surprised to learn that he is now leading a research center somewhere in the Korean countryside, away from prying eyes (like Hwang Woo-suk, who is currently directing the Sooam Bioengineering Research Institute, probably in an effort to create a new growth engine for the Korean economy).

        I guess soon we will again see some national soul-searching over why Korean scientists are not able to win a Nobel Prize, but it is really not that difficult to understand why…

      2. Forget about Korea for a moment. The root of the problem everywhere is that the scientific community no longer exists. Scientists do not cooperate with each other; the last time they cooperated was in the Manhattan Project. I believe all that misconduct, not to mention a degradation of scientific thought, is the consequence of the absence of community. Not to mention the absence of an open press. And not to mention the fact that some 90% of scientists with PhDs are actually employees of other scientists, i.e. they have to do work dictated by others, not their own. All this has led to a drastic fall in the quality of the product. The correct word is fast-growing infantilism, or, shorter, a twitter.

      3. Okay, let’s forget about Korea for a while :).

        I have one comment regarding scientific cooperation. How about the collaboration going on at CERN? I was also going to mention DNA sequencing, but that is probably something that is in support of your argument 🙂 (I am not sure about that though, as this is not the field I am working in).

        I also have one comment regarding “a twitter”. I agree with you that the quality of the product has decreased. However, one could argue that, nowadays, the speed at which new ideas and (scientific) truth can be communicated is staggering (i.e., real time), compared to, say, one hundred years ago.

        Finally, how about the following thought? There are a limited number of fields (opening new fields is hard) for a number of knowledge workers that is increasing at rates never seen before, significantly heating up competition and thereby also increasing scientific misconduct. In addition, a significant portion of these new knowledge workers originates from cultures/countries that are actually not that familiar (yet) with the scientific method and the scientific publication system that is currently in place.

  12. I have really enjoyed reading Retraction Watch and the comments. Here are my 2 cents on this issue: for a retraction to happen, someone needs to notify the journal, author, or author’s institution that something is wrong. That is an uncomfortable thing to do. Why would someone do that for an article they don’t think really matters? If the article mattered more, it would probably be in a journal with a higher impact factor.
    On the other hand, the impact factors at the low end of those considered in the study (2.5-10) are still pretty good!

  13. I love the idea of a Retraction Index as a badge of honour for journals! Incidentally, Liu pointed out the correlation between prestigious journals and high rates of retraction as long ago as 2006 (see Liu SV. Top journal’s top retraction rates. Science Ethics 2006; 1:91-93), but Fang & Casadevall have taken this one step further. Bravo!

  14. Another view: While interesting to ponder, there might be too little information on why papers are retracted to devote much attention to a Retraction Index. On the publication end we are, in some ways, dealing with a post-mortem situation.

    Research leaders should devote more attention to improving basic research protocol management practices that are fundamental to scientific integrity. Some of these have been overlooked in the RCR field.

    On publishing practices, the publication of original data can yield some needed insight into the issues we see on Retraction Watch (http://blogs.ch.cam.ac.uk/pmr/2011/08/14/publishing-data-the-long-tail-of-science/).

  15. I think the problem is a bit more complex. Some papers are basically flawed and the data do not support the conclusions. In that case a retraction is logical. But what happens when the authors simply do not provide enough information for anyone to reproduce the results? I’ve come across this case very often lately. The methods sections are increasingly shorter and the supplementary material can be incomplete or unavailable after a certain time. I even contacted the corresponding author once about some data that wasn’t in the paper, and he replied that he had lost the files. It is clearly the fault of the editor to allow a paper that doesn’t include all the information needed to reproduce the results. And this is exactly the case in high impact journals. What is there to do? Is a retraction necessary, or how can we deal with authors who in one way or another do not share the real data with the scientific community?

  16. Retraction Index (RI) is well overdue.

    RI will make publishers, editors, and especially submitting authors THINK TWICE BEFORE committing any misconduct/fraud (i.e. manipulation of data/images, plagiarism/self-plagiarism, etc.)

    RI is an important step to clean the Augean Stables in academic publication.

    For me, part of the explanation for the graph (high IF correlates with high RI) is the proverb that “the goal justifies the means”. In other words, the goal of publishing in a high IF journal would be a temptation for more academics to commit misconduct, which sooner or later will be revealed and thus will result in a high RI.
    Once academics see the RI system in place (i.e. operating), they would be less tempted to commit misconduct/fraud.

    The sooner RI is implemented, the better for all involved except for the fraudsters (as you know, it’s not possible to please everyone).

  17. It also has to do with the fact that a lot of well-known names who are fixtures in these journals “co-author” their students’ sloppy research so their students can get published, and the editors let them slip because of the famous researcher’s reputation. The politics of high impact journals has made it a closed system where cronyism trumps quality of research.

  18. It’s interesting to read the critique and comments on NEJM. These days, during the annual meetings of societies in India, a “myth buster” session is held on “how to publish”, where a panel of editors shares their perspectives, SOPs, dos and don’ts, approval criteria, retraction criteria, and which types of articles should be published where, why, when, and how.
    I see all of the above comments from the point of view of scientists who are publishing and face the challenge of succeeding in a high impact factor journal. Overall, there is not much scope for correction in what is happening, and current practices are satisfactory.
    I would be happy to know the views of the members above, and the same can be discussed with specifics, e.g. a publication that was retracted by NEJM but accepted elsewhere. Then it would be a good debate: why did NEJM retract it while another publisher happily accepted it? On a lighter note, after such sessions we hear comments from delegates like “the grapes are sour” when they can’t get published in NEJM, comparing NEJM with the grapes. Happy discussion ahead.
