Peer review isn’t good at “dealing with exceptional or unconventional submissions,” says study

One of the complaints about peer review — a widely used but poorly studied process — is that it tends to reward papers that push science forward incrementally, but isn’t very good at identifying paradigm-shifting work. Put another way, peer review rewards mediocrity at the expense of breakthroughs.

A new paper in the Proceedings of the National Academy of Sciences (PNAS) by Kyle Siler, Kirby Lee, and Lisa Bero provides some support for that idea.

Here’s the abstract:

Peer review is the main institution responsible for the evaluation and gestation of scientific research. Although peer review is widely seen as vital to scientific evaluation, anecdotal evidence abounds of gatekeeping mistakes in leading journals, such as rejecting seminal contributions or accepting mediocre submissions. Systematic evidence regarding the effectiveness—or lack thereof—of scientific gatekeeping is scant, largely because access to rejected manuscripts from journals is rarely available. Using a dataset of 1,008 manuscripts submitted to three elite medical journals, we show differences in citation outcomes for articles that received different appraisals from editors and peer reviewers. Among rejected articles, desk-rejected manuscripts, deemed as unworthy of peer review by editors, received fewer citations than those sent for peer review. Among both rejected and accepted articles, manuscripts with lower scores from peer reviewers received relatively fewer citations when they were eventually published. However, hindsight reveals numerous questionable gatekeeping decisions. Of the 808 eventually published articles in our dataset, our three focal journals rejected many highly cited manuscripts, including the 14 most popular; roughly the top 2 percent. Of those 14 articles, 12 were desk-rejected. This finding raises concerns regarding whether peer review is ill-suited to recognize and gestate the most impactful ideas and research. Despite this finding, results show that in our case studies, on the whole, there was value added in peer review. Editors and peer reviewers generally—but not always—made good decisions regarding the identification and promotion of quality in scientific manuscripts.

The authors offer a bit of insight about why this is important:

Because most new ideas tend to be bad ideas (55), resisting unconventional contributions may be a reasonable and efficient default instinct for evaluators. However, this is potentially problematic because unconventional work is often the source of major scientific breakthroughs (5).

The paper is a valuable contribution to the literature — hey, isn’t that something a peer reviewer might write? — and is well worth reading in its entirety. Perhaps next up, the authors will look at why so many “breakthrough” papers are still published in top journals — only to be retracted. As Retraction Watch readers may recall, high-impact journals tend to have more retractions.

Hat tip: Deborah Blum

22 thoughts on “Peer review isn’t good at “dealing with exceptional or unconventional submissions,” says study”

  1. In medicine, at least, my impression is that the editorial culture is resistant to new information, even to the extent of refusing to consider anything that they do not already know. I recently had an experience with a British journal that I think would just astonish right thinking people. Well-known British journal, too, with which I had some track record.

    Staff seem to spend their whole lives trying to prove they are smarter than the person at the next desk. The idea of actually breaking a story that would change people’s lives doesn’t come into it.

    Maybe it’s the peer reviewers wanting to show they are smarter than the authors that infects the editorial process. But I continue to believe that science publishing will drift on until News Corp takes a stake in one of the players, fires one third of the staff, and doubles the readership.

  2. One has to wonder how many rejected papers that might have made potentially significant contributions to science were eventually abandoned by their authors and will never see the light of day. Editorial/peer review type II errors?

  3. I would be interested in what people think is a good alternative to peer review.

    Perhaps the best we can hope for is to educate reviewers better in how to do the job.

    1. There are no alternatives, see the usual comparison of peer review to democracy.
      But referees should stick to judging the objective quality of submitted work, not its “Novelty” and “Impact”.

  4. Who are these editors who think they can judge what is a breakthrough and what is not? Are they something like super-scientists with special supernatural powers? This very arrogance is what leads to all these sensationalist papers, later to be proven irreproducible. They nevertheless get cited, by all those who want to jump on the train of fairy tale telling to impress editors.
    What’s wrong with judging scientific work based on its original thinking and experimental reliability? Let the future research determine what was a breakthrough and what wasn’t, instead of letting the Nature’s or Cell’s editor choose.

  5. I’d be careful to read the paper, because “peer review” covers a wide range of actual handling.
    In particular, it is clear that many of the articles get desk-rejected before they ever get to peer review.
    An editor probably *should* desk-reject a lot of papers, but clearly may not find the good ones amidst the noise.

    From experience in running program committees (different, but somewhat related):
    1) Everybody ranked every paper (or abstract) 1-5 (no-yes), we put all those in a spreadsheet, sorted by average rating and then reviewed.
    2) Some papers were obvious accepts, some obvious rejects, but it was always worth checking for outliers, where most people said “1-3” but one person said “5”, and it turned out the one person really knew the work, and convinced everyone else, as per “12 Angry Men.” I know that if I as PC Chair had just rejected some, I would have rejected some that turned out OK.

    3) Hence, perhaps a help for this is for an editor to say:
    a) Send to peer review.
    b) Desk-reject
    c) (new): send around a batch of papers to a review board, not to do a real review, but to take a quick look and see if any seem worth sending to peer review. I.e., if even one person thinks so, may be worth trying.
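The ranking-and-outlier step described in points 1) and 2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `triage`, the thresholds (`accept_at`, `reject_below`, `champion`), and the sample scores are all hypothetical and come from neither the comment nor the paper.

```python
from statistics import mean


def triage(scores, accept_at=4.0, reject_below=3.0, champion=5):
    """Sort papers by mean rating and flag low-mean papers with one top score.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    rows = []
    for paper, marks in scores.items():
        avg = mean(marks)
        if avg >= accept_at:
            bucket = "accept"      # obvious accept
        elif max(marks) >= champion:
            bucket = "champion"    # outlier: at least one reviewer gave the top score
        elif avg < reject_below:
            bucket = "reject"      # obvious reject, nobody argued for it
        else:
            bucket = "discuss"     # middling scores, needs a conversation
        rows.append((paper, avg, bucket))
    # Highest mean first, as in a spreadsheet sorted by average rating
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical ratings on the 1-5 scale, four reviewers per paper
scores = {
    "A": [5, 4, 5, 4],
    "B": [1, 2, 1, 2],
    "C": [1, 2, 2, 5],
    "D": [3, 3, 3, 3],
}
ranked = triage(scores)
for paper, avg, bucket in ranked:
    print(f"{paper}: mean {avg:.2f} -> {bucket}")
```

Paper C is the case the comment cares about: its mean alone would put it among the rejects, but the single “5” routes it to a separate bucket instead of discarding it.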

    Quotes from the paper:
    “Of the 808 eventually published articles in our dataset, our three focal journals rejected many highly cited manuscripts, including the 14 most popular; roughly the top 2 percent. Of those 14 articles, 12 were desk rejected”

    “Our results suggest that gatekeepers were at least somewhat effective at recognizing and promoting quality. The main gatekeeping filters we identified were (i) editorial decisions regarding which manuscripts to desk-reject and (ii) reviewer scores for manuscripts sent for peer review. Desk-rejections are manuscripts that an editor decides not to send for peer review after an initial evaluation.”

    “a number of highly cited articles were desk-rejected, including 12 of the 15 most-cited cases. Of the other 993 initially submitted manuscripts, 760 (76.5%) were desk-rejected. This finding suggests that in our case study, articles that would eventually become highly cited were roughly equally likely to be desk-rejected as a random submission. In turn, although desk-rejections were effective with identifying impactful research in general, they were not effective in regards to identifying highly cited articles.”

    “Despite the 15 glaring omissions at the top, on the whole gatekeepers appeared to make good decisions.”

    “By restricting analysis to journals with impact factors greater than 8.00, we are more likely to be cherry-picking gatekeeping mistakes and ignoring the vast cache of articles that were “rightfully” sent down the journal hierarchy.”

    “The rejection of the 14 most-cited articles in our dataset also suggests that scientific gatekeeping may have problems with dealing with exceptional or unconventional submissions. The fact that 12 of the 14 most-cited articles were desk-rejected is also significant. Rapid decisions are more likely to be informed by heuristic thinking (40). In turn, research that is categorizable into existing research frames is more likely to appeal to risk-averse gatekeepers with time and resource constraints, because people generally find uncertainty to be an aversive state (41).”
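The “roughly equally likely” claim in the quoted passage can be checked with simple arithmetic. The sketch below uses only the counts given in that quote (12 of the 15 most-cited articles, and 760 of the other 993 submissions); the variable names are illustrative. Note that other quoted passages count 14 top articles rather than 15, so this follows just the one quote.

```python
# Counts taken from the quoted passage; variable names are illustrative.
top_cited_total = 15
top_cited_desk_rejected = 12
other_total = 993
other_desk_rejected = 760

# Share of each group that was desk-rejected
top_rate = top_cited_desk_rejected / top_cited_total
other_rate = other_desk_rejected / other_total

print(f"most-cited articles desk-rejected: {top_rate:.1%}")   # 80.0%
print(f"other submissions desk-rejected:   {other_rate:.1%}")  # 76.5%
```

With only 15 articles in the top group, a gap of a few percentage points is well within what chance alone could produce, which is consistent with the authors' wording.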

  6. The issue isn’t confined just to peer review of journals, but to “peer review” of any field. There is a conservative vs _avant garde_ bias that can be found everywhere there are judges.

    A perfect example is found in one of my new most-favorite anime, “Your lie in April,” about a young piano prodigy who suffered a nervous breakdown at age 11 and can no longer hear himself play music (bear with me; there’s method to my _non sequitur_). His muse is a young girl, a violin prodigy, who refuses to be constrained, no matter the venue, and insists on turning a Mozart violin solo into a virtuoso improvisational piece in the middle of a formal competition. The judges rightfully reject her, but the audience response (standing ovation and wild cheering) still compels them to allow her to compete at the second level.

    An example from the other perspective is found in the classic animation, “The Dot and the Line: A Romance in Lower Mathematics.”

    Free-thinking/improvisation is neither good nor bad, and neither is strict formalism. Both have their place. Unfortunately, the methodology used to rank one is inadequate to rank the other.

    Journals and journal readers need to recognize this and allow for it, possibly not in the same venue.

  7. “how many rejected papers that might have made potentially significant contributions to science were eventually abandoned by their authors and will never see the light of day”
    This is exactly my case. Two journals rejected two of my submissions on the basis that my findings did not agree with previously published research (which turned out to be irreproducible). Six months after I was informed of the rejection, similar research was published by someone I suspect was a reviewer who pirated my ideas and stole my research. Since then I have been questioning why any scientist would hand his ideas and words to strangers who may feel free to steal his work.

  8. aceil
    “how many rejected papers that might have made potentially significant contributions to science were eventually abandoned by their authors and will never see the light of day” This is exactly my case. Two journals rejected two of my submissions on the basis that my findings did not agree with previously published research (which turned out to be irreproducible). Six months after I was informed of the rejection, similar research was published by someone I suspect was a reviewer who pirated my ideas and stole my research. Since then I have been questioning why any scientist would hand his ideas and words to strangers who may feel free to steal his work.

    This is why you (and many others) should prefer services such as arXiv. A public trail in the Internet never vanishes.

  9. Peer review will never be immune to all these dysfunctions. We have all had papers rejected or been asked to change them, and most of the time did not agree. This is simply due to the fact that peer review is a human task that will never be free of ... human judgement! On the other hand, it tends to be conservative because the famous gatekeepers are the stars that have built their reputation on the same paradigms they are asked to disavow. I think it will stay this way as long as an acceptable alternative has not been found.

    1. Excellent observation. The same occurs in instances of publicly funded research: the gatekeepers of the research are typically committee members whose track record of research is within narrowly defined areas of their field. They are more than delighted to approve proposals that support or are adjunct to their own areas of research but much less inclined to approve proposals that are further afield or contrary.

  10. Aceil, I am very sorry to hear that you may have been victimized in this way. As we all know, plagiarism of ideas is much more difficult to establish than plagiarism of text. However, it occurs to me that you might be able to shed some light on what happened in your case by following some simple steps. For example, since most authors will list in their vita the journals that they serve as guest reviewers, editorial board members, etc., a little sleuthing on your part may help you determine whether the author of the published work could have been a reviewer of your paper. I also note that some journals acknowledge external reviewers in the last issue of each year and, of course, most journals list the members of their editorial boards. In sum, perhaps you have put this affair behind you but, if not, I think there are some avenues for you to pursue that could help you get to the bottom of your situation.

    1. Thank you for the advice. Let’s say I know who the responsible two persons are. Where and to whom should I file my complaint? Who is going to protect me from retaliation? How are the damages I have incurred going to be remedied? Where are the safeguards against editors and reviewers abuse? Obviously, the questions I am raising point to serious flaws in a system which the majority support. But why?
      If we follow the money, will we be able to figure out who is/are gaining from pushing the publish or perish mantra?

  11. My favorite reason for rejection was “If this is such a good idea, then why aren’t XXX Inc, YYY LLC and ZZZ Corp. already doing it?”

  12. Peer-review is a slow process of making ideas, experimental results, sum ups and in many ways mistreated data and results public. And it is not about victims or winners, but about canalization of innovation for public absorption. It is a barrier for individualized success in the scientific community. It can be understood as a control system far above any scientific greatness, because it does not inherit it. It is nothing about science, but about control of logical endpoints. Peer-review is today misunderstood as quality control, but it does not work as it is supposed to. It is far away from standardisation and scientists may publish results in a high impact journal, because they work in a network of success and high tier scientists may help them being fast with publishing, because they are part of the network. Many people don’t believe in the peer-review system. No one trusts the peer-review system. No one should. And how could one trust in it? If you are a scientist today, one may advise you to start networking now or never, create data in a field that is “sexy”, do experiments, waste money, gain money, abuse people, undergrads or whomever. That is how it works and mankind finds innovation based on this system, mankind evolves based on this. It is there. But what does your diamond look like if you hold it in front of a field of dead souls? The path is full of it. Science works today in a dirty way. I have seen many great-minded scientists “fail”. But fortunately, some of them have learned to monetize their great-mindedness, they do something and move something far above “public” science, clearly speaking, in my opinion, sometimes in many ways not that great science. To keep it short, if the science community wants to reduce failure and misinterpretation, start controlling the process right at its roots. Failures in the peer-review system are symptomatic, that is why one may start to, i.e., quality control any results right at the beginning so that no symptomatic corrections are further needed.

    1. One of the most honest unbiased analyses. How many authors were advised to submit where their friends sit on editorial boards? How many were advised to add the names of hot selling authors to guarantee acceptance? How seriously contaminated is the literature with flawed statistics, fabrications, falsifications, erroneous data, misrepresentation, misappropriation of ideas and data...? Are all our discussions falling on deaf ears?

  13. I’m not at all sure this paper did show that peer review wasn’t good at spotting exceptional papers. Sadly the paper is paywalled so I’ve only read the abstract, but as far as I can tell it’s based on a sample size of just 14 exceptional papers: hardly enough to be drawing anything but the most tentative conclusions.

    But more to the point, 12 of those 14 papers were rejected without being sent to peer review. So if we are going to draw any conclusions from this, it wouldn’t be that peer review is bad at spotting exceptional papers, it would be that journals’ in-house editorial staff are bad at spotting exceptional papers.

    I suspect there are further methodological weaknesses here, though it’s hard to be sure just from the abstract. But one thing that leaps out at me is that exceptional papers were defined as those that were highly cited. Not at all sure that’s a good measure of article quality. Sometimes, papers can be highly cited because they are wrong and generate many rebuttals.

  14. I’m not sure what to make out of this study. I find the practice of equating citation numbers to scientific excellence rather questionable. Especially in cases where the manuscript was rejected because of lack of validity or insufficient data to prove the claims (as in 12 out of the 15 most cited cases), I’d rather not blame reviewers or editors for not publishing such a paper, no matter how many citations it got elsewhere.

  15. Unfortunately this is yet another study of peer review that uses peer review as a common experience filtering process. The implicit theoretical context is that peer review is a rational system (see Hirschauer, 2010; Biagioli, 2002; Gaudet, 2014). Explicit assumptions in the study include that peer review is pre-publication, that the contingency of pre-publication peer review itself is ‘natural’, that referee anonymity is ‘natural’, that secrecy for decisions and judgements is ‘natural’, that the lack of access to pre-publication peer review decisions and judgements means it cannot be studied (or with great difficulty, instead of understanding lack of access as data in and of itself when studying secretive pre-publication journal peer review).

    Finally, analysis moves to asymmetry in explanation. Mulkay and Gilbert (1982) coined “asymmetrical accounting for error” by scientists (1982:166). This frames understanding of ‘false beliefs’ in peer review (cf., rejecting apparently ‘valuable’ manuscripts) as being distorted by social, personal, or psychological elements as opposed to ‘correct beliefs’ (cf., not rejecting ‘valuable’ manuscripts) as purely cognitive and owing to rational analysis by individual editorial readers (see 1982:181).

    Hirschauer (2010) decried the lack of scientific study of peer review and deplored the movement to social explanations. In this case, psychological and social explanations were only to investigate rejection where “[t]here is thus a mundane ‘sociology of knowledge’, working selectively and opportunistically in favour of the author’s self-esteem” (2010:72). In this study, self-esteem appears to have been attached to purported rational decision-making in peer review and working not only in favour of authors, but also of journals that wish to maintain the dominance and naturalness of pre-publication journal peer review.

    I propose a new concept to identify this type of study of peer review: Gatekeeperology. In gatekeeperology research, the status quo of pre-publication journal peer review is maintained with a common understanding of peer review as ‘gatekeeper’ for science.

  16. Science is human. We can only add to what is already known and if flaws are built into that, then the process will only patch over the cracks, until a sufficiently large earthquake exposes them beyond repair.
    For example, a point I keep trying to raise is that we experience time as a sequence of events, from past to future, which physics codifies by treating time as a measure of duration, but the actual process is change creating and dissolving those events, such that it is they which go future to past and duration is only the state of what is present, as these marks form and dissolve. To wit, the earth doesn’t travel some flow or dimension from yesterday to tomorrow, but tomorrow becomes yesterday because the earth turns. This makes time an effect of action, similar to temperature. Time is to temperature what frequency is to amplitude. While amplitude en masse is temperature, frequency en masse is noise, so we isolate particular actions, such as cycles of a cesium atom, or rotations of the planet. Different clocks run at different rates because they are separate actions. In fact, a faster clock burns/ages quicker and so recedes into the past faster.
    Suffice to say, it is much easier to simply ignore the basic logic of this, than question what are foundational premises of Physics. Even today, we still see the sun rise in the east and set in the west, but do understand how the effect is created by the earth turning west to east.

  17. I think this discussion overlooks one important point. The rejected ms may report a great and important idea, but may also be poorly written and difficult for even the referees (who are presumably in the top group in their field) to understand. So the end result of negative reviews (whether the immediate editorial decision is rejection or major revision) may be a greatly strengthened paper when it eventually gets published. The final published paper may not only be better written but may also report new analyses or experiments that respond to points of doubt raised by the original referees. Of course running this gauntlet may be highly frustrating for the authors, but dealing with this process is part of being a professional researcher.
