The title of this post is the headline of our most recent column in LabTimes, which begins:
As we write this in mid-August, Nature has already retracted seven papers in 2014. That’s not yet a record – for that, you’d have to go back to 2003’s ten retractions, in the midst of the Jan Hendrik Schön fiasco – but if you add up all of the citations to those seven papers, the figure is in excess of 500.
That’s an average of more than 70 citations per paper. What effect would removing those citations from calculations of Nature’s impact factor – currently 42 – have?
Science would lose 197 citations based on this year’s two retractions. And Cell would lose 315 citations to two now-retracted papers.
In other words, what if journals were penalised for retractions, putting their money where their mouth is when they talk about how good their peer review is? Clearly, if a paper is retracted, no matter what excuses journals make, peer review didn’t work as well as it could have.
We explore what this might mean for top journals. But there are some nuances here. We wouldn’t want to further discourage retractions of papers that deserved it. One solution:
Journals might also get points for raising awareness of their retractions, in the hope that authors wouldn’t continue to cite such papers as if they’d never been withdrawn – an alarming phenomenon that John Budd and colleagues have quantified and that seems to echo the 1930s U.S. Works Progress Administration employees, being paid to build something that another crew is paid to tear down. After all, if those citations don’t count toward the impact factor, journals wouldn’t have an incentive to let them slide.
We also highlight a paper whose title included “retraction penalty.”
So back to our question: Is it time for a retraction penalty? And if so, how would it work?
Most scientists would be more than happy to ever have a first- or last-author paper accepted in Nature or Science (or similar top-rated journals in their discipline) – but most never get that chance.
Top-rated scientists have a different perspective: they can choose which journal to send their work to, and after a bad reviewing experience they may prefer one journal over another because their future manuscripts will be accepted more easily. Unfortunately, plagiarism can take many forms, including failing to acknowledge the originator of a project, an idea, or results. It can take many years to fight the plagiarists, as well as the university that supports the plagiarism. There should therefore be several sorts of penalties, aimed mainly at the institutions that supported the plagiarism. The money could be used to support the victims of such plagiarism, who never got the full professorship or institute they deserved.
A retraction penalty (for the publishers) is a great idea! To date, the community has focused primarily (even obsessively) on author-based error, and not enough on publisher-based errors, where "publisher" encompasses the editors, especially the editor-in-chief (EIC), and the peer reviewers the EIC selects and approves. Responsibility is thus held equally by the authors and the publisher (plus affiliates). A penalty system should therefore involve not only penalization of the authors (e.g., three strikes = a ban from a journal) but also penalties for publishers and their journals. A direct hit to the impact factor (IF) is a fabulous idea, because the IF score often translates into a greater number of submissions, more papers published, and thus potentially greater revenue. Profit made from faulty papers, whoever is responsible, must be penalized. To date, publishers have enjoyed a roller-coaster ride with the IF, so now it's time to penalize them for abusing this metric to their sole advantage. And Thomson Reuters, too, should take note that the scientific community is fed up with the IF and with how it is used to create a silly number like 42 that scientists equate with excellence at Nature (as one example). However, a penalty system – any system, in fact – is only as good as the entity that implements it. If there is no appetite to implement it, then it's only ink on paper, nothing more. For example, publishers that fail to issue errata and corrigenda for citations of now-retracted papers are de facto irresponsible publishers. We should hold them accountable and call them out when they fail to correct the literature. And when they fail, they deserve much more than just a hit to their IF score: they deserve to be publicly shamed on blogs like RW (just as faulty authors are shamed).
There is one major problem with the penalty system: the collateral damage. For every mistake or fraudster there are plenty of honest authors. If the IF of a journal in which one has published is reduced, that means reduced chances of decent funding or of promotion.
The Transparency Index (see my comments here http://retractionwatch.com/transparencyindex/) provides a solution.
A retraction would have a positive impact on a journal's Impact Factor (and an author's reputation) when it is Doing_the_Right_Thing (i.e., the author/reviewer/editor/publisher admits a mistake),
and not retracting a paper that should be retracted would have a negative impact on the journal's Impact Factor (and the author's reputation), since that would be Doing_the_Wrong_Thing (i.e., a cover-up).
No collateral damage, only collateral BENEFIT.
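To make the idea concrete, here is a toy sketch in Python – purely illustrative, not the actual Transparency Index metric, and the reward/penalty weights are invented placeholders – of how such a score could move in opposite directions for doing the right thing versus covering up:

```python
# Toy illustration of the scoring idea above -- NOT the actual Transparency
# Index metric. The weights are arbitrary placeholders.
def transparency_score(retractions_when_warranted: int,
                       papers_left_unretracted: int,
                       reward: float = 1.0,
                       penalty: float = 1.0) -> float:
    """Doing_the_Right_Thing earns `reward` points per warranted retraction;
    doing_the_wrong_thing (a cover-up) costs `penalty` points per paper."""
    return (reward * retractions_when_warranted
            - penalty * papers_left_unretracted)

# Example: a journal that retracted three flawed papers and sat on one
# known problem would come out ahead, not behind.
print(transparency_score(retractions_when_warranted=3,
                         papers_left_unretracted=1))  # -> 2.0
```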
The LabTimes column states: “Clearly, if a paper is retracted, no matter what excuses journals make, peer review didn’t work as well as it could have”. Really? Aren’t we holding peer review to an unrealistic standard? We all know that many retractions are due to research misconduct. As such, how can we expect peer reviewers to reliably detect fabricated data, manipulated data or images, or, for that matter, plagiarized data or text 100% of the time while refereeing a paper? Google or Crosscheck can’t even detect all instances of plagiarism, so how can we expect reviewers to do so?
To answer the original question, yes, penalties should be instituted, but against authors, and only when the retraction is due to outright misconduct.
Thanks for the thoughtful comment, Miguel.
Certainly, peer reviewers can’t be expected to catch 100% of fakery, but journals are really too quick to give themselves – and their reviewers – a pass when things go wrong. The STAP stem cell saga is a good example. Nature tried arguing that it was the RIKEN investigation that led to the retractions, for reasons peer reviewers couldn’t have noticed. But that leaves out the critical fact that it was rapid post-publication peer review that found the problems that prompted that investigation. Why weren’t those issues caught in peer review – or, as may be the case (we can’t tell, because those reviews haven’t been made public), why were they ignored?
I don’t think this is binary, but a scale:
1) This paper has no business being in this journal, out of field.
2) Peer review definitely failed: obvious errors on a cursory read, wrong assumptions on strong science.
3) Peer review marginal: one can believe reviewers missed that in the normal time.
4) Really, really hard for any normal pre-publication peer review to catch.
I think that applies especially to some plagiarism, particularly when people do mosaic plagiarism with edits here and there. I cannot imagine any reviewer spending the time to track those down, especially when the copied sources are books that aren’t online. I’d guess some fabrication/falsification fits as well.
Maybe we should start by clearly defining what a reviewer’s role actually IS.
If we start setting unrealistic expectations of reviewers, people will simply stop doing it. I do NOT consider myself the fraud police when reviewing a paper. Let’s say I get a paper that uses three main techniques – I am highly experienced in two of them, with a good passing knowledge of the third – should I accept the review or not? Because honestly, if papers have to be reviewed by people who are absolute experts in every aspect of the paper (and papers are often highly multidisciplinary nowadays), I think the entire (current) reviewing process is going to grind to a halt.
Peer review for merit and obvious fraud, and PPPR for the more subtle stuff. A thousand post-publication reviewers are always going to catch more than two to four pre-publication reviewers. We just need to accept that – and force the journals to be more responsive when PPPR picks up the more subtle issues.
PWK, I gave my interpretation of reviewers’ responsibilities here:
http://www.globalsciencebooks.info/JournalsSup/images/2013/AAJPSB_7(SI1)/AAJPSB_7(SI1)6-15o.pdf
JATdS, in what kind of journals have you published to state that in most cases the Publisher chooses the peer reviewers? In all my years of experience, primarily with Elsevier, I have never ever seen the Publisher involved in the choice of reviewer. It was always the Editor (or associated people).
Hear, hear!
Interesting question, Ivan. And your response raises a couple of additional empirical questions: How unique is the STAP stem cell case? That is, what proportion of retractions that are due to misconduct are the result of post-publication peer review? A related question: What differences, if any, might there be between journal referees who are asked to review a given paper and those who engage in post-publication peer review of that same paper, in terms of, for example, adversarial positions or perhaps other personal/professional rivalries with respect to the study questions pursued in the paper?
Great questions, Miguel. Difficult to answer rigorously, but worth pursuing. A paper by Paul Brookes that we covered earlier this year may have hints of some answers, and may also be relevant to another question that your questions raise: How many questioned papers should be retracted or corrected, but haven’t been?
Indeed, Ivan! There are papers with enough suspicious stuff in them to have them retracted several times over, yet their authors laugh it away with corrigenda or errata. So when does a paper actually get retracted? In my opinion, unless you are dealing with truly independent and professional editors (good luck there…), only pressure from the employing institution can force its scientists to agree to a retraction. And for that, we would need a lot more good luck, or tremendous media attention. In the case of STAP, it was RIKEN (under media pressure?) that enforced the retractions – not the authors, and certainly not Nature.
Well, Brookes’ data is strongly suggestive that post-publication peer review (PPPR) is more effective at spotting problems than traditional peer review. As such, I think it is very important to discover the reasons why that is so and to use those findings to improve BOTH forms of review.
We’ve always had PPPR via letters to the editor, replies and rejoinders, etc. But thanks to the internet, the current version of PPPR is much faster and likely to grow even further; it is here to stay, and it is a great benefit to science. In the meantime, I’d say let’s find out what makes PPPR so effective and use that knowledge to improve traditional peer review. After all, I think we would all rather have these problems spotted pre-publication than post-publication.
Miguel – I should think that PPPR is more successful in part because of the mechanism of choosing reviewers. For pre-publication review, it’s usually an editor who selects a few likely candidates to review and keeps looking until she or he gets enough people to agree. Post-publication peer reviewers are self-selected, are they not? The pool of potential reviewers post-publication is vast compared to the number an editor is likely to contact. I’d expect that the people who bother to write a post-publication review are highly motivated and have, or think they have, exceptional expertise in the area; and they probably have a vested interest in moving the field forward – not through bad publications, but good ones.
I think part of the problem in general is that we still think of science publishing in many ways as it was viewed a hundred years ago – perhaps longer. Publishing was a clear end point in a way it no longer is. The process is more fluid now than we recognize in practice – the publication is still the thing, and seems to overshadow the actual results.
I may be posing a different question: What would it take to give appropriate credit at various stages of planning, performing, publicizing, and refining science, instead of pushing the vast majority of credit through the funnels of grant money and publications? And would such a change make any difference?
Ah! So, the far greater numbers of PPP reviewers (Ken) and the likely limited expertise (PWK) of the traditional reviewers, relative to the pool of expertise of PPP reviewers, play an important role in the observed differences. I do wonder about possible differences in motivation to review and, especially, differences in professional, personal, or theoretical biases between both groups. In addition, my sense is that the focus of most PPPR tends to be primarily critical with a focus on identifying problems whereas traditional peer review has evolved to be a more balanced examination of a paper’s positive contributions as well as its flaws.
PPPR is a complete disaster at high impact journals. Science does not even have a format for PPPR within the journal, and Nature will simply reject pretty much any Comment without external review, no matter how flawed the criticized paper is.
But, the beauty of PPPR in the age of the internet is that we now have all kinds of outlets (e.g., PubPeer) available to us to disseminate our criticisms of a published paper.
Here’s how we should be doing this:
Every piece of science worth publishing is worth repeating at least once…
Therefore, peer review stage 1 should proceed as it currently does. Stage 2 should involve repetition of the original experiments by a scientist who doesn’t know the outcome and who writes up his/her own interpretation of the experiments, again reviewed by the peer reviewers from stage 1. All scientists involved in the peer review and the initial experiments should then receive equal credit for their contributions.
Short of requiring this kind of real, thorough peer review, journals should take all the blame for having basically no *real* means of authenticating what they publish.
QAQ, 1) WHO is going to do this “repeat” experiment? A journal-appointed laboratory? And 2) who will FUND this repeat? The idea is rosy and idealistic in theory, but impractical. And requiring the authors to repeat it is senseless, too, because in theory they should already have replicated the study several times before publication. The solution is three-pronged: a) suitable education at the graduate level to avoid errors at the post-graduate level, which also includes constant verification of faculty by universities; b) traditional peer review, but double-blind and with at least 5 peers (real peers, not editor-appointed ones, nor reviewers obtained from spam campaigns by automatic mail spoolers working off publishers’ databases). In this case, peer review will take longer, but science will learn to slow down a bit and avoid the mess it is witnessing now; c) post-publication peer review, which works ad infinitum.
I think perhaps when journals send out an article for peer review, as well as having three reviewers who check the scientific validity of the work and its conclusions, they should also send it to a fourth reviewer whose main role is to check for plagiarism, duplication and downright data manipulation. This reviewer could also go through the supplementary data (a task some regular reviewers may hesitate to take on due to time limitations) and have the power to request original data. If this fourth reviewer were in place, then we could safely penalise a journal (with regard to an adjusted impact factor based upon citation loss) if a paper was retracted.
Failing this, the journal should at least make available the reviewers’ comments that were made for an article that was accepted and later retracted (so the scientific community can judge who was at fault for the article being published in the first place).
One of the main problems (as mentioned above) is that many people continue to cite retracted papers. Personally I would suggest the development of a database (like PubMed) that just contains these papers so that when one is writing an article one can quickly check if it is in the “bad bin”.
Perhaps Retraction Watch would like to start one?
Many of the more established journals already have direct hyperlinks alongside the references. It may be a good idea for publishers to return a paper to the editor if they find that it cites a retracted paper. There may be good reasons for citing one, but an extra check is never a bad idea.
The databases are not optimal (PubMed does miss the occasional retraction), but anything is better than the current situation.
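As a rough illustration of how such a check could work today – a sketch only, leaning on PubMed’s “Retracted Publication” flag via the NCBI E-utilities, which, as noted above, misses the occasional retraction; the PMIDs below are placeholders:

```python
# Sketch: ask PubMed (via NCBI E-utilities) whether a PMID carries the
# "Retracted Publication" publication type. PubMed's coverage is imperfect,
# so a negative answer is not a guarantee.
import json
import urllib.parse
import urllib.request

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def flagged_as_retracted(pmid: str) -> bool:
    """True if PubMed tags this record as a retracted publication."""
    term = f'{pmid}[pmid] AND "Retracted Publication"[Publication Type]'
    url = ESEARCH + "?" + urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmode": "json"})
    with urllib.request.urlopen(url) as resp:
        result = json.load(resp)["esearchresult"]
    # A non-empty hit count means the record itself is tagged as retracted.
    return int(result.get("count", "0")) > 0

# Screen a reference list before submission (placeholder PMIDs).
for pmid in ["12345678", "23456789"]:
    if flagged_as_retracted(pmid):
        print(f"PMID {pmid} is flagged as retracted -- check before citing.")
```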
I suspect that not counting citations of retracted articles would have a trivial effect on the impact factor. The IF is based on citations during the two-year interval following publication. If we take Nature as an example, the 2,651 articles published in Nature in 2012 have been cited 66,638 times (ref: Web of Science), for an average of 25.14 times per article. Two of these articles (Narayan et al. and Lin et al.) have been retracted, and these have been cited 49 and 45 times, respectively. This represents only 0.14% of the total citations. If the IF were to fall in direct proportion to the number of citations, it would drop from 42.35 to 42.28. Even following a year with an exceptionally high number of retractions, like this one, there would not be much of a dent in the IF.
Another consideration is that articles retracted due to misconduct take a mean of 46.78 months to retract (ref: Steen et al., PLoS One, 2013). Thus the IF is determined before many of the articles that will eventually be retracted have been retracted.
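For anyone who wants to reproduce the back-of-the-envelope numbers above, here is a minimal sketch under the same assumption that the IF falls in direct proportion to the citations removed (figures as quoted in the comment; the exact rounding may differ slightly):

```python
# Back-of-the-envelope check, assuming the IF drops in direct proportion
# to the citations that would be discounted.
articles_2012 = 2651           # Nature articles published in 2012 (Web of Science)
total_citations = 66638        # citations to those articles
retracted_citations = 49 + 45  # Narayan et al. and Lin et al.
current_if = 42.35

share_lost = retracted_citations / total_citations  # roughly 0.14%
adjusted_if = current_if * (1 - share_lost)         # a drop of well under 0.1

print(f"Share of citations lost: {share_lost:.2%}")
print(f"Adjusted impact factor:  {adjusted_if:.2f}")
```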
Avoiding retraction is obviously the best way to go but once a paper is published, PPPR is probably the best way to get the most apt and knowledgeable reviews. It should turn up reviewers who really know what the paper in question is all about and whom the Editor may well have missed or even avoided. My co-author, Joel Pitt, and I have just published a paper that is very much to the point (Failure to Replicate: A Sign of Scientific Misconduct? Publications 2014, 2(3), 71-82; doi:10.3390/publications2030071). One of the points that we make in the paper is “the importance of familiarity with the published literature in the field”. Our data analysis strongly indicates that the authors of the 2 papers we critique did not know how their results should have come out to be consistent with earlier published results. I admit with an intense blush that I am one of those co-authors. Reviewers for the journal also missed the mark and so, in fact, did the members of the Study Section that approved a grant application based on what probably were impossible findings under the conditions of those experiments. And that brings up the point: what about grant applications? Critiques and funding decisions are a black box. Wouldn’t it be better if, once applications were approved, they became open for public review? I can hear the outcry resulting from that suggestion — that it would allow the competition to pounce on the ideas. But, so what? There you would have a de rigueur mechanism for replication of results.
I’d love to see the penalty, but I fear that if there was a journal based penalty imposed, we’d return to the dark ages where journals just never retract their articles to preserve their IFs, to the detriment of Science as a whole…
For sure, part of the issue can be crappy peer review not catching the duds up front. If you want to curb that, journals need to start paying reviewers and then impose fines if an article they reviewed is later retracted for something they missed…
All comments considered thus far, there is only one realistic solution for a journal (or publisher) that conducts bad peer review (in multiple cases), fails to correct the literature when errors are reported, or fails to retract a paper when very serious errors are reported, all in the name of protecting its image and/or IF: a boycott.
I am sorry, but I do not think paying reviewers will work. You want to pay a “consulting rate” for a senior scientist of $250 – $500 per hour? Per paper? Times 3 to 5 reviewers? Not gonna work when reviewing a paper starts to cost several thousand dollars. Or do you want to pay a token honorarium, but then leave the reviewer open to “fines” if some unknown person thinks the reviewer missed something in the review? Personally, thanks, but no thanks.
Reviewing is a freely given academic service, and 99.99% of reviewers do an honest job for free, providing often valuable insights and constructive criticisms of the work. I may often disagree with a reviewer, but I do appreciate their efforts, and do my best to return the favour.
As noted in my earlier post, I do not think reviewers are, or should be “the fraud police”. Yes, of course reviewers should pick up “obvious stuff”, but bands taken from a previous paper, inverted, contrast adjusted and stretched? No, no and no. If a reviewer is shown to be incompetent, then simply that journal should not use that person again.
I am concerned by the continual hunting around to decide where to apportion “blame”: publishers, editors, EICs, reviewers. Keep the spotlight firmly where it belongs. The blame lies solely with the authors who fabricated the data, “fixed” that image, or doctored the graph. This is a sin of commission, NOT one of omission.
Sorry, I messed up my log in. The above post was by PWK.
Isn’t the impact factor just a marketing tool for the publisher?
A bad/poor paper published in a top-tier journal is still a bad/poor paper, and a great paper in a lower-tier journal is still a great paper.
Or am I being too cynical?
I like the idea of rewarding journals for publishing sound, accurately reported and reproducible research. But wouldn’t penalizing retractions simply make journals issue yet more corrections instead of retractions?