Last week, the International Committee of Medical Journal Editors proposed requiring authors to share deidentified patient data underlying the published results of clinical trials within six months of publication. The proposal has earned much support but also some concerns – for example, that other scientists might poach the findings, acting as what the New England Journal of Medicine dubbed “research parasites.” Elizabeth Wager, a member of the board of directors of our parent organization, disagrees with that concern, but raises another issue – namely, the unintended consequences of data sharing for other, more effective initiatives to make reporting more transparent.
The recent proposal from the ICMJE may appear, at first glance, a positive step towards better clinical trial reporting. However, I’m concerned that this new requirement might undermine other more effective initiatives to increase the efficiency of research, such as the publication of protocols and full study reports. Here’s why.
All actions have costs, risks, and benefits: Making partial data sharing a condition of publication is no exception. The costs are hard to quantify but undoubtedly not trivial. Putting clinical data into a usable format and making it meaningful to other researchers requires considerable time and effort by knowledgeable people. To this must be added the costs of establishing and maintaining suitable repositories and of checking compliance.
I’m not saying that open data has no benefits. It would be fantastically helpful in investigations of suspected misconduct, and the very fact of having to share some of your data might act as a deterrent against fraud and inappropriate analysis. Open data might also flag up errors in analysis. Access to individual patient-level data from clinical trials can enhance the value of meta-analyses and therefore help to inform medical decisions. These are unquestionably good things that make the case for data sharing.
If we can afford to make full datasets usable and public, that would be wonderful. Perhaps one way would be to increase the efficiency of research by funding only worthwhile projects. Huge amounts of medical research funding are wasted on poorly designed or executed research which cannot provide reliable evidence, or on research into questions that have already been answered. Another waste of research funds is caused by the fact that a worrying proportion of medical research is never published: Some studies suggest this may be as much as half (at least in the past). Even if this figure is an overestimate (as some have argued), or things are improving, this represents an appalling waste.
However, we also know that when trials are published in journals, the reports are often incomplete or misleading. Even when trials are registered at the start (which many journals still don’t require), researchers don’t always report the pre-stated outcomes in their publications. Ben Goldacre’s COMPARE study is showing that outcome switching is far more prevalent than most of us realised.
My fear is that demanding that authors share the data underlying their publications might have the unintended consequence of exacerbating the problem of non-publication simply because it increases the effort and cost of publishing. Furthermore, this would disproportionately affect less well funded, non-commercial studies and researchers from low income countries. But my main concern is that, because the ICMJE are only asking for data that support the published conclusions, rather than the full dataset, the new requirement will do nothing to reduce the serious problems of outcome switching or selective reporting.
I am also worried that the emphasis on posting raw data could divert attention from other proposals to improve transparency and the effectiveness of reporting. Goldacre himself said recently he was concerned that an emphasis on data sharing could sideline efforts to make study results more available. While only a tiny number of people are interested in reanalysing or synthesizing other researchers’ raw data, more (admittedly, not many) might use the information in full study reports (such as those already required by regulatory authorities and funders). Since such reports are already being produced in a standard format (at least for commercially funded drug studies), the cost of making them public is low.
In addition, requiring full, date-stamped protocols to be publicly posted (e.g. on funders’ or institutions’ websites, or alongside publications) costs almost nothing yet brings clear benefits. Access to full protocols not only highlights selective reporting and outcome switching, but also increases the chance that methods can be repeated (either to replicate the research or put the findings into practice). Publishing protocols at the start of projects can also avoid duplication of effort and, if protocols are widely reviewed before research begins, problems with proposed methods or analyses might be resolved before, rather than after the research is done, making the work more likely to generate meaningful results.
Many journals (including ICMJE members) now require protocols to be submitted alongside manuscripts so they can be used by peer reviewers and editors to check for outcome switching and cherry-picking of results. It is easy for a journal to check, at the time of publication, whether a protocol has been published and whether a trial has been registered. And, since both these things should be done at the start of the trial, there is no risk that mandating them would delay publication. Another issue I have with the ICMJE proposal is that giving researchers a six-month grace period after publication before data must be posted will make it much less likely that journals will check compliance and much harder to take action if authors fail to share their data properly.
I don’t share the same concerns about data sharing as other commentators – namely, that research parasites or crackpots could use data for nefarious purposes. If we can review the original methods and analysis, we can surely judge the value of secondary analyses, although I agree that we need to find a good way to credit and incentivise data providers.
I want to re-emphasize that I am not against data sharing. If I had a magic wand (aka unlimited funding for research and its dissemination) I would undoubtedly wave it over all research and create a system in which raw data were permanently linked to all types of report and all the report formats were linked (so that, for example, somebody reading a press release could easily check the journal article, and, if they wished, also the protocol, full study findings and raw data). But if the fairy gave me two wishes instead of a wand, I would wish for prospective trial registration and access to full trial reports for all trials before wishing for raw data.
But who knows – sometimes progress needs great leaps of faith and some visionary mavericks who see the far horizons (and I’m happy to acknowledge that the ICMJE’s leadership on trial registration had a huge, positive impact). Perhaps, if a few influential medical journals mandate partial data posting, the era of properly linked (or “threaded”) publications, which I and many others hope for, will arrive more quickly. I’m not convinced — but, like a good scientist, I really hope I’m wrong.
Liz Wager is a publications consultant at Sideview, and former chair of the Committee on Publication Ethics (COPE). She is based in Princes Risborough, UK.