Peer review isn’t a core subject of this blog. We leave that to the likes of Nature’s Peer-to-Peer, or even the Dilbert Blog. But it seems relevant to look at the peer review process for any clues about how retracted papers are making their way into press.
We’re not here to defend peer review against its many critics. We feel about it the way Churchill felt about democracy: it is the worst form of government except for all those others that have been tried. Of course, a good number of the retractions we write about are due to misconduct, and it’s not clear how peer review, no matter how good, would detect out-and-out fraud.
Still, peer review is meant as a barrier between low-quality papers and publication, and it often comes up when critics ask questions such as, “How did that paper ever get through peer review?”
With that in mind, a paper published last week in the Annals of Emergency Medicine caught our eye. Over 14 years, 84 editors at the journal rated close to 15,000 reviews by about 1,500 reviewers. Highlights of their findings:
…92% of peer reviewers deteriorated during 14 years of study in the quality and usefulness of their reviews (as judged by editors at the time of decision), at rates unrelated to the length of their service (but moderately correlated with their mean quality score, with better-than average reviewers decreasing at about half the rate of those below average). Only 8% improved, and those by very small amount.
How bad did they get? The reviewers were rated on a scale of 1 to 5, on which a change of 0.5 (10%) had earlier been shown to be “clinically” important to an editor.
The average reviewer in our study would have taken 12.5 years to reach this threshold; only 3% of reviewers whose quality decreased would have reached it in less than 5 years, and even the worst would take 3.2 years. Another 35% of all reviewers would reach the threshold in 5 to 10 years, 28% in 10 to 15 years, 12% in 15 to 20 years, and 22% in 20 years or more.
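For a sense of scale, here is a back-of-the-envelope sketch in Python of what those figures imply per year. It assumes each reviewer's score falls at a constant rate, which is our simplification and not necessarily the model the authors fitted; the function and variable names are ours, made up for illustration.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumption (ours, not the paper's): each reviewer's score falls at a
# constant rate, so rate = threshold / years needed to reach the threshold.

THRESHOLD = 0.5  # change on the 1-to-5 scale treated as important to an editor


def implied_annual_decline(years_to_threshold, threshold=THRESHOLD):
    """Rating points lost per year under a constant-rate assumption."""
    return threshold / years_to_threshold


average_rate = implied_annual_decline(12.5)  # the average reviewer
worst_rate = implied_annual_decline(3.2)     # the fastest-declining reviewer

# Shares of reviewers by years needed to reach the threshold, as quoted.
shares = {
    "under 5 years": 3,
    "5 to 10 years": 35,
    "10 to 15 years": 28,
    "15 to 20 years": 12,
    "20 years or more": 22,
}
total_share = sum(shares.values())

print(f"average reviewer: {average_rate:.3f} points per year")
print(f"worst reviewer:   {worst_rate:.3f} points per year")
print(f"shares add up to: {total_share}%")
```

Under that assumption the average reviewer loses about 0.04 points a year and the worst about 0.16, and the five quoted shares add up to 100%.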
So the decline was slow. Still, the results, note the authors, were surprising:
Such a negative overall trend is contrary to most editors’ and reviewers’ intuitive expectations and beliefs about reviewer skills and the benefits of experience.
(You might ask, “So who peer-reviewed this paper?” A newer reviewer, one would hope.)
Annals of Emergency Medicine is a reasonably high-tier journal, in the top 11% of Thomson Scientific impact factors in 2008. So what’s true for the journal may be true at other top-tier publications.
What could account for this decline? The study’s authors say it might be the same sort of decline you generally see as people get older. This is well-documented in doctors, so why shouldn’t it be true of doctors — and others — who peer review? The authors go on:
Other than the well-documented cognitive decline of humans as they age, there are other important possible causes of deterioration of performance that may play a role among scientific reviewers. Examples include premature closure of decisionmaking, less compliance with formal structural review requirements, and decay of knowledge base with time (ie, with aging more of the original knowledge base acquired in training becomes out of date). Most peer reviewers say their reviews have changed with experience, becoming shorter and focusing more on methods and larger issues; only 25% think they have improved.
Decreased cognitive performance capability may not be the only or even chief explanation. Competing career activities and loss of motivation as tasks become too familiar may contribute as well, by decreasing the time and effort spent on the task. Some research has concluded that the decreased productivity of scientists as they age is due not to different attributes or access to resources but to “investment motivation.” This is another way of saying that competition for the reviewer’s time (which is usually uncompensated) increases with seniority, as they develop (more enticing) opportunities for additional peer review, research, administrative, and leadership responsibilities and rewards. However, from the standpoint of editors and authors (or patients), whether the cause of the decrease is decreasing intrinsic cognitive ability or diminished motivation and effort does not matter. The result is the same: a less rigorous review by which to judge articles.
What can be done? The authors recommend “deliberate practice,” which
involves assessing one’s skills, accurately identifying areas of relative weakness, performing specific exercises designed to improve and extend those weaker skills, and investing high levels of concentration and hundreds or thousands of hours in the process. A key component of deliberate practice is immediate feedback on one’s performance.
There’s a problem:
But acting on prompt feedback (to guide deliberate practice) would be almost impossible for peer reviewers, who typically get no feedback (and qualitative research reveals this is one of their chief complaints).
In fact, a 2002 study in JAMA co-authored by Michael Callaham, the editor in chief of the Annals of Emergency Medicine and one of the authors of the new study, found that “Simple written feedback to reviewers seems to be an ineffective educational tool.”
What about training? A 2008 study in the Journal of the Royal Society of Medicine found that short training courses didn’t have much effect on the errors peer reviewers failed to catch. That followed a 2004 study in the BMJ with similar results. And that’s consistent with what one journal editor who looked at the Annals of Emergency Medicine study told us about his own experience.
That same editor suggested that another potential fix, continually recruiting less-experienced reviewers who are at the top of their game, might not work either. Such reviewers, he said, often didn’t include any narratives or interpretations in their reviews, just lists of comments.
Sounds like a good subject for the next Peer Review Congress, which should be held in 2013.