Which journals will publish replications? In the first post in this series, Mante Nieuwland, of the Max Planck Institute for Psycholinguistics, described a replication attempt of a study in Nature Neuroscience that he and his colleagues carried out. Yesterday, he shared the story of their first submission to the journal. In the final installment today, he explains why the paper was eventually published in another journal.
We received a confirmation of our submission that restated the refutation procedure: the original authors had one week to provide comments on our submission, after which the journal would send our correspondence and those comments to reviewers, who could include reviewers of DUK05.
More than a week later, on May 11th, I e-mailed Nature Neuroscience to explain the omission of the baseline-correction procedure in DUK05, the new analyses we included to address that omission, and the issue with the filler materials. I also added “Whether this is grounds for requesting an erratum, I cannot judge.”
I also raised the issue of data availability, because two other researchers, Shravan Vasishth and Florian Jaeger, had told us that they had requested the original data of DUK05 to perform re-analyses, but without any result. My e-mail thus stated that “our group and several other groups have asked for data from the original study, but thus far DeLong et al. have not shared any data, so that the original analysis and results cannot be verified, and an improved analysis of the original result cannot be performed.”
In the meantime, however, the original authors asked us for new data that went into the reported analyses. We uploaded more of our data to the Open Science Framework and notified Nature Neuroscience that we were doing so. Nature Neuroscience said that they hoped DeLong et al. would make data available, but that the reviewers would just have to make do with whatever DeLong et al. provided.
In other words, Nature Neuroscience ignored what seems to be a blatant violation of the journal’s policies, which state that authors must
make unique materials promptly available to others without undue qualifications. Any restrictions on the availability of materials or information must be disclosed to the editors at the time of submission. Any restrictions must also be disclosed in the submitted manuscript. After publication, readers who encounter refusal by the authors to comply with these policies should contact the chief editor of the journal. In cases where editors are unable to resolve a complaint, the journal may refer the matter to the authors’ funding institution and/or publish a formal statement of correction, attached online to the publication, stating that readers have been unable to obtain necessary materials to replicate the findings.
A caveat here is that I am not aware of the journal’s policies at the time when DUK05 was submitted nor of current policies about applying data sharing requirements retroactively. But no restrictions were disclosed in DUK05, and nothing happened: Nature Neuroscience did not explicitly acknowledge or respond to my mention of the methodological omissions in DUK05. To our knowledge, so far, the journal has not taken any of the actions described in their policy and Nature Neuroscience readers remain uninformed of these omissions.
Two months later, on July 4th, I was able to view the response from DeLong et al. in the journal’s submission portal. I cannot reveal the details of that response, but it showed a reanalysis of the DUK05 data with an analysis similar to our improved analysis (though not identical, because it used a different dependent variable: voltage in the 300-500 ms time window after word onset, instead of the 200-500 ms window used in DUK05 and in our replication study). It also included two other, older datasets that were not direct replications but came from related studies with different materials and different filler sentences (one of those datasets belonged to an already published paper that was not cited in the response). The response reported a replication of DUK05: a statistically significant effect at the articles and at the nouns. It also showed that, with the improved analysis, the effect in the DUK05 data alone was not statistically significant. Unusually, not a single ERP waveform was shown.
In the meantime, we had discovered an error in our calculations of the question accuracy. I wrote to Nature Neuroscience on July 5th to report this mistake, and we resubmitted a version with the correct numbers because “It would be relevant for a reviewer to have the correct data wrt our accuracy scores. We have also uploaded all log files and new accuracy data to our OSF page.” Referring to the response from DeLong et al., I also stated that “we have several major concerns with their data (no ERP waveforms, no available files etc.) and we have noted incorrect information in their descriptions and citations. it is unclear to us what the rest of the procedure is, and whether there is going to be opportunity to correct some of these errors.”
On July 11th, Nature Neuroscience responded that they had given DeLong et al. extra time, and that our revised version would be sent to DeLong et al., which would delay the process further. More than a month later, on August 22nd, Nature Neuroscience informed us that the two papers would finally be sent out for review.
More than two months later, on November 8th, we received the editorial decision that our paper was rejected. As editorial letters go, it did not say much, except that our conclusions did not significantly challenge the conclusions of DUK05, and it merely summarized some of the topics mentioned by the three reviewers (R1-3). R1 was very positive about our paper and supported our conclusions, but R2 and R3 raised a range of concerns, which I cannot quote directly; below is the gist:
R2 wanted to see a head-to-head comparison between our results and those of DUK05 when precisely the same methods were used. However, we had made all correlation results available for review (with the original and new baseline correction), and we used a Bayesian analysis to test whether we replicated both the size and direction of the original, which was the case for the nouns but not the articles. Somehow, all these data and analyses seem to have been ignored or missed by this reviewer.
Moreover, this reviewer did not pick up on the discrepancy between the time windows analyzed in our improved analysis and in the DeLong et al. response (which was supposed to copy our analysis). R2 thus faulted us, rather than DeLong et al., for being unclear about the data analysis, even though all our analyses were reproducible and followed the details of DUK05.
R3 was very negative about our efforts, and argued against publication based on several points. First of all, R3 made what I consider an ad hominem argument by suggesting that we intentionally failed at the replication. R3 also suggested that our data were collected by poorly trained technical staff, and that we “misplaced” electrodes. In fact, we did not misplace electrodes; our laboratories had different EEG channel montages, so we had to interpolate some channels for one lab to arrive at a common set of channels. R3 thus clearly demonstrated a lack of basic knowledge of how an EEG lab operates.
R3 also faulted us for not having all the original materials, said it was inappropriate to suggest that the original authors did not want to share their materials, and demanded to see evidence. I had already informed the editor that we had those emails but that I could not make them available without permission. All I could do was cite my own personal communications in which I asked for the materials with the stated purpose of replication. R3 thus suggested that our failure to replicate was either intentional or due to sloppiness, but did not seem too bothered that DUK05 had omitted crucial methodological details.
R3 also provided another, rather odd argument for rejection, namely that if the studies were published together, readers would read only our study and ignore the commentary. But this does not seem like a reasonable argument for rejection; in fact, it is completely incompatible with the journal’s rationale for publishing refutations together with commentaries in the first place.
We pointed out some of these issues to Nature Neuroscience in an email and briefly considered appealing yet again. However, we quickly decided against this, given that rejecting our paper on the basis of such comments conveyed (to us at least) an intention to reject our paper no matter what. We submitted elsewhere after making some further edits to pre-empt the concerns raised in the review. Painfully for us, yet another Nature editorial appeared two weeks later, again showcasing commitment to replication by stating that “Rewarding negative results keeps science on track: Creating a culture of replication takes prizes, grants and magnanimity — as well as publications.”
We submitted our paper to eLife, a non-profit publishing organisation inspired by research funders and led by scientists. About two months later, we received a long list of comments from three reviewers, most of which are published along with our paper.
The purpose of this post was to provide a transparent, behind-the-scenes account of our replication study and what happened when we submitted our study to Nature Neuroscience. On the one hand, I can understand why Nature journals might be hesitant to publish replication studies. It might open the floodgates to a wave of submissions that challenge conclusions from publications in their journal (although that in itself is not necessarily a bad thing).
On the other hand, a few things from this case study stand out by clearly contradicting Nature’s commitment to replication and transparency. Nature Neuroscience triaged our study for lack of general interest, failed to follow its own submission procedure in terms of timeline, failed to follow its own policy on data and materials sharing, failed to correct important omissions in the academic record of the original study, and failed to provide what was, in my opinion, a fair review process (relying on one reviewer who faulted us for a lack of clarity that originated in the original paper, and on one non-expert reviewer who mostly questioned our intentions and disagreed with the publication format).
In the end, the final decision letter demonstrated a lack of engagement: it did not go beyond a two-sentence summary of the reviewers’ negative comments, and did not even attempt to explain which concerns weighed most strongly in the editorial decision, why they could not have been addressed in a revision, or which conclusions of DUK05 remained unchallenged.
Replication research may be becoming mainstream, and that’s a good thing. Rolf Zwaan and colleagues recently argued that there is no compelling conceptual argument against replication (see also here). Surely they’re right: most researchers appreciate the importance of replication. But what about the practical reality of replication? What happens when you actually try to do it? The practical difficulties of doing and publishing replication research create substantial obstacles.
In my experience, reviewers and editors may place the burden of proof on replicators to account for different methodology (even if the methodology was never reported in the first place and cannot be verified), replications may be subjected to methodological critiques never raised against the original study (see also here), and reviewers and editors typically do not scrutinize claims of successful replication in the same way as claims of failed replication. In addition, as in the present case, replication studies often appear in lower-ranked journals than the original, feeding the suspicion that replications are not highly valued or are of lesser quality.
Nature prides itself on its commitment to replication research and its increasing transparency in publishing, and while Nature is indeed developing several initiatives, it seems they have a very long way to go, like many other journals and publishers. Here are some suggestions (see also here):
- Increase the transparency of the review process by publishing decision letters, reviewer comments, and author responses (as eLife does)
- Treat replication studies as primary research articles and not as mere refutation correspondence. The refutation correspondence format is not intended for the full presentation of data, whereas the whole point of replication research is to fully present a new set of primary research data, whose details may matter a good deal. Nature states that “we do consider high-value replications, subjecting them to the same criteria as other submitted studies,” but refutation correspondences and their responses are currently not subjected to the same criteria as other studies. Nature could start applying the same criteria (e.g., on data availability, or by requiring a statistical reporting checklist), and could officially enlist statistical expertise as some other journals have done.
- Nature could also follow the ‘Pottery Barn rule’ and take responsibility for direct replications of studies it has published: review them on technical merit and publish them, regardless of the outcome, as brief reports linked to the original, and therefore in the same journal, rather than relaying them to other journals such as Scientific Data.
- Even better, Nature could dedicate a submission format to replication research across all its journals (not just Nature Human Behaviour), for example as registered reports via the Open Science Framework. Many issues that arose during our publication process could have been avoided had Nature Neuroscience offered a Registered (Replication) Report format. (In fact, one lesson to take away from this experience is: if you’re going to do a large-scale replication study, do it as a Registered Report, even if that means a different journal from the original.)
Ultimately, I am happy with how things turned out, and I’m very satisfied with the format and content of our publication in the non-profit journal eLife. Regardless of what happened during the publication process, and while some senior colleagues have voiced their distrust of and annoyance with our study, we have received many more, very positive responses from colleagues and from the open science community more broadly. On that note, I want to close by arguing for replication, pre-registration, and increasing transparency (see also here and here), no matter what occurs.
Like Retraction Watch? You can make a tax-deductible contribution to support our growth, follow us on Twitter, like us on Facebook, add us to your RSS reader, sign up for an email every time there’s a new post (look for the “follow” button at the lower right part of your screen), or subscribe to our daily digest. If you find a retraction that’s not in our database, you can let us know here. For comments or feedback, email us at email@example.com.