The problem of publication bias — giving higher marks to a paper that reports positive results rather than judging it on its design or methods — plagues the scientific literature. So if reviewers are too focused on the results of a paper, would stripping a paper of its findings solve the problem? That was the question explored in a recent experiment by guest editors of Comparative Political Studies. Mike Findley, an associate professor at the University of Texas at Austin and one of the guest editors of the journal, talked to us about a new paper explaining what they learned.
Retraction Watch: Can you explain what a “results-free” paper looks and reads like?
Mike Findley: We solicited (and received) results-free papers of two types. In the first type, we asked for a submission that approximated a pre-analysis plan, instructing prospective authors that their submissions should provide designs that enable a reviewer to assess as fully as possible the theory, main hypotheses, design, feasibility, and potential contributions of the results. In this kind of submission, the research had not actually been carried out. In the second type of submission, for research that had already been completed, we invited submissions of otherwise complete manuscripts from which the results and discussion had been removed. For these submissions, the author(s) needed to provide a similar level of detail on the theory and design, along with credible documentation that the results of the study had not been posted or circulated in any way that would let a peer reviewer find them and judge the paper with its conclusions in mind. We received both types of submissions and ultimately accepted two of the first type (design / pre-analysis plan) and one of the second type (results not yet computed or reported).
RW: Why did you think it was important to test how peer review worked on papers that don’t include any results?
MF: The primary motivation for the special issue was to address publication bias, which exists when a set of published studies is not representative of all available or possible studies. In much scientific work, publication bias is most pronounced when publication decisions are based on the realized outcomes of a study—typically statistical significance of a result—rather than the merits of the approach and design. For example, it might be that 99 out of 100 tests of a hypothesis yield no statistically significant result, but the 100th test does. If only that 100th test is published, then the published literature will convey the impression that the balance of evidence favors rejecting the null hypothesis. Unfortunately, only one journal in political science (the Journal of Experimental Political Science) allows or encourages review based on approach and design alone rather than allowing results to impress reviewers.
Although the special issue editors were optimistic about addressing publication bias through results-free review, none of us thought (or think) that results-free review provides a full solution to the problem. This gets to the heart of our exercise. Prior to the special issue, it was unknown what exactly results-free review could or could not do to address publication bias. We thus conducted the special issue to understand better what the scientific community stands to learn or not learn through results-free review practices.
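The 99-out-of-100 example in this answer can be made concrete with a small simulation. The sketch below is an editorial illustration, not part of the special issue: it assumes a true effect of zero, a simple two-sided z-test with known unit variance, and a conventional 0.05 threshold. The function names (`simulate_study`, `publication_filter`, `share_significant`) and the choice to model results-free review as "publish regardless of outcome" are hypothetical simplifications.

```python
import random
from statistics import NormalDist, fmean


def simulate_study(rng, n_per_arm=50):
    """Run one two-arm study with NO true effect; return a two-sided p-value."""
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    # z-test for a difference in means, assuming known unit variance per arm
    z = (fmean(treated) - fmean(control)) / (2.0 / n_per_arm) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def publication_filter(p_values, alpha=0.05, results_free=False):
    """Return the p-values of the studies that end up in print.

    results_free=False: only statistically significant studies are published.
    results_free=True: acceptance ignores the outcome (assumed simplification).
    """
    if results_free:
        return list(p_values)
    return [p for p in p_values if p < alpha]


def share_significant(p_values, alpha=0.05):
    """Fraction of the given studies that report a significant result."""
    if not p_values:
        return 0.0
    return sum(p < alpha for p in p_values) / len(p_values)


rng = random.Random(2016)  # fixed seed so the run is repeatable
all_p = [simulate_study(rng) for _ in range(2000)]

biased = publication_filter(all_p)
unbiased = publication_filter(all_p, results_free=True)

print(f"studies run: {len(all_p)}")
print(f"published under results-based review: {len(biased)}")
print(f"share significant, results-based: {share_significant(biased):.2f}")
print(f"share significant, results-free:  {share_significant(unbiased):.2f}")
```

Under these assumptions every study published through results-based review reports a significant effect even though none exists, while the results-free record shows significance at roughly the false-positive rate. Real review is not this clean, so the numbers indicate direction rather than magnitude.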
RW: What would the process look like? Would the paper, for example, first be submitted to peer review before it had results, which guarantees the paper will be published no matter what the results are?
MF: As special issue editors, we were active in evaluating all manuscripts, be they prospective designs or completed studies with results scrubbed. The special issue editors read manuscripts and consulted with the standing CPS editors in a number of cases. In all, one manuscript was withdrawn by the author, eight manuscripts were desk rejected, and a final 10 manuscripts were sent out for peer review.
Reviewers submitted their comments on the results-free manuscripts through the regular CPS editorial mechanism, and the reviews were then sent to the special issue editors by the CPS standing editors. Both sets of editors jointly made the final decisions. Three of the 10 papers sent out for review were offered revise and resubmits. After revisions, all three papers were sent back to the original reviewers who all commented relatively positively, and the papers were then accepted for the special issue.
Once the determination had been made, that decision was the near-final decision on the manuscript, subject only to the constraint that the research was executed as planned. We instructed authors that deviations from the accepted research designs were acceptable, but had to be documented rigorously and discussed thoroughly. We asked authors to delineate, in the final article, the alterations made in response to reviewer suggestions, so that these were clearly and publicly differentiated from analyses that were preregistered. Doing so gave us novel insights into how the peer-review process shapes knowledge production and accumulation in comparative politics. In following this process, neither the referees nor any of the editors had any indication of the results when an accept/reject decision was made, thus ensuring a results-free decision at every possible level.
RW: Overall, what did your experiment find? Did the studies that were reviewed in this way have more of an impact than you would expect them to have if they were sent out to review including their results?
MF: Perhaps most interestingly, we found that null results emerged in one study, and partially in a second study. Given how rare null results are in most journals, this finding suggests that the broader adoption of results-free review would indeed produce more published work reporting null results. Our best guess is that null results are much more common than the published literature indicates.
Second, contrary to fears that greater emphasis on transparency creates more incentives for clever research designs and methodological perfection, reviewers placed an overwhelming emphasis on theoretical consistency and substantive importance. In this regard, results-free review worked better than we could have hoped in incentivizing theory and research design over narrow concerns about novelty of methodology or empirical causal identification. Atheoretical work stood very little chance of publication in our pilot: referees consistently asked for meaningful tests of theoretically important hypotheses.
RW: Are there any limitations to this approach?
MF: It was immensely challenging for reviewers and authors alike to argue coherently about the proper role of null findings, often referred to tellingly as “non-results.” The challenges for current practices in this regard are steeper than we had anticipated, and speak to general debates about null hypothesis significance testing and the relationship between theory, data, and models.
RW: Will your journal be adopting results-free peer review permanently? If so, on what scale?
MF: Comparative Political Studies conducted this pilot as a special issue of the journal devoted to results-free review. As special issue editors, we understand that the current editors do not intend to make results-free review a mandatory practice, nor will it be allowed as a regular manuscript submission track. The Journal of Experimental Political Science plans to make results-free review a standing manuscript submission track (see the foreword to Lin and Green 2016).
RW: You focused on papers in the social sciences. Do you think this type of reviewing process is suitable for papers in other disciplines? Why or why not?
MF: Without a similar cross-discipline exercise, we cannot say with certainty whether results-free review will be appropriate for other disciplines. What we can say is that we learned that some types of research seem more suitable than others. We received only quantitative, normal-science-style submissions, most of which were some form of experiment. Thus, we expect that results-free review will be suitable for other disciplines that conduct substantial quantitative, normal-science research. Disciplines focused mostly on qualitative, ethnographic, or historical research may find results-free review to be less appropriate.