Sharing data is a good thing. But we need to consider the costs.

Liz Wager

Last week, the International Committee of Medical Journal Editors proposed requiring authors to share deidentified patient data underlying the published results of clinical trials within six months of publication. The proposal has earned much support but also raised some concerns – for example, that other scientists might poach the findings, acting as what the New England Journal of Medicine dubbed “research parasites.” Elizabeth Wager, a member of the board of directors of our parent organization, disagrees with that concern, but raises another issue – namely, the unintended consequences of data sharing on other, more effective initiatives to make reporting more transparent.

The recent proposal from the ICMJE may appear, at first glance, a positive step towards better clinical trial reporting. However, I’m concerned that this new requirement might undermine other more effective initiatives to increase the efficiency of research, such as the publication of protocols and full study reports. Here’s why.

All actions have costs, risks, and benefits: Making partial data sharing a condition of publication is no exception. The costs are hard to quantify but undoubtedly not trivial. Putting clinical data into a usable format and making it meaningful to other researchers requires considerable time and effort by knowledgeable people. To this must be added the costs of establishing and maintaining suitable repositories and of checking compliance.

I’m not saying that open data does not have any benefits. It would be fantastically helpful in investigations of suspected misconduct, and the very fact of having to share some of your data might act as a deterrent against fraud and inappropriate analysis. Open data might also flag up errors in analysis. Access to individual patient level data from clinical trials can enhance the value of meta-analyses and therefore help to inform medical decisions. These are unquestionably good things that make the case for data sharing.

If we can afford to make full datasets usable and public, that would be wonderful. Perhaps one way would be to increase the efficiency of research by funding only worthwhile projects. Huge amounts of medical research funding are wasted on poorly designed or executed research which cannot provide reliable evidence, or on research into questions that have already been answered. Another waste of research funds is caused by the fact that a worrying proportion of medical research is never published: Some studies suggest this may be as much as half (at least in the past). Even if this figure is an overestimate (as some have argued), or things are improving, this represents an appalling waste.

However, we also know that when trials are published in journals, the reports are often incomplete or misleading. Even when trials are registered at the start (which still isn’t required by many journals), researchers don’t always report the pre-stated outcomes in their publications. Ben Goldacre’s COMPARE study is showing that outcome switching is far more prevalent than most of us realised.

My fear is that demanding that authors share the data underlying their publications might have the unintended consequence of exacerbating the problem of non-publication simply because it increases the effort and cost of publishing. Furthermore, this would disproportionately affect less well funded, non-commercial studies and researchers from low income countries. But my main concern is that, because the ICMJE are only asking for data that support the published conclusions, rather than the full dataset, the new requirement will do nothing to reduce the serious problems of outcome switching or selective reporting.

I am also worried that the emphasis on posting raw data could divert attention from other proposals to improve transparency and the effectiveness of reporting. Goldacre himself said recently he was concerned that an emphasis on data sharing could sideline efforts to make study results more available. While only a tiny number of people are interested in reanalysing or synthesizing other researchers’ raw data, more (admittedly, not many) might use the information in full study reports (such as those already required by regulatory authorities and funders). Since such reports are already being produced in a standard format (at least for commercially-funded drug studies), the cost of making them public is low.

In addition, requiring full, date-stamped protocols to be publicly posted (e.g. on funders’ or institutions’ websites, or alongside publications) costs almost nothing yet brings clear benefits. Access to full protocols not only highlights selective reporting and outcome switching, but also increases the chance that methods can be repeated (either to replicate the research or put the findings into practice). Publishing protocols at the start of projects can also avoid duplication of effort and, if protocols are widely reviewed before research begins, problems with proposed methods or analyses might be resolved before, rather than after, the research is done, making the work more likely to generate meaningful results.

Many journals (including ICMJE members) now require protocols to be submitted alongside manuscripts so they can be used by peer reviewers and editors to check for outcome switching and cherry-picking of results. It is easy for a journal to check, at the time of publication, whether a protocol has been published and whether a trial has been registered. And, since both these things should be done at the start of the trial, there is no risk that mandating them would delay publication. Another issue I have with the ICMJE proposal is that giving researchers a six-month grace period after publication before data must be posted will make it much less likely that journals will check compliance and much harder to take action if authors fail to share their data properly.

I don’t share the same concerns about data sharing as other commentators – namely, that research parasites or crackpots could use data for nefarious purposes. If we can review the original methods and analysis, we can surely judge the value of secondary analyses, although I agree that we need to find a good way to credit and incentivise data providers.

I want to re-emphasize that I am not against data sharing. If I had a magic wand (aka unlimited funding for research and its dissemination) I would undoubtedly wave it over all research and create a system in which raw data were permanently linked to all types of report and all the report formats were linked (so that, for example, somebody reading a press release could easily check the journal article, and, if they wished, also the protocol, full study findings and raw data). But if the fairy gave me two wishes instead of a wand, I would wish for prospective trial registration and access to full trial reports for all trials before wishing for raw data.

But who knows – sometimes progress needs great leaps of faith and some visionary mavericks who see the far horizons (and I’m happy to acknowledge that the ICMJE’s leadership on trial registration had a huge, positive impact). Perhaps, if a few influential medical journals mandate partial data posting, the era of properly linked (or “threaded”) publications, which I and many others hope for, will arrive more quickly. I’m not convinced — but, like a good scientist, I really hope I’m wrong.

Liz Wager is a publications consultant at Sideview, and former chair of the Committee on Publication Ethics (COPE). She is based in Princes Risborough, UK.

17 thoughts on “Sharing data is a good thing. But we need to consider the costs.”

  1. Unfortunately, there is a more basic issue with opening access to the data behind published and unpublished randomized clinical trials (RCTs): the external validity of the RCT data themselves and the substantial risk for clinicians and their patients in acting on these reports. See: Margaret T. Whitstock (2015). Reducing adverse events in older patients taking newly released drugs: early identification of risk factors to offset clinical trial findings with limited applicability to older patients. Scholar’s Press: Saarbrücken, Germany.

    A focus on remedying problems of incomplete and biased reporting of potentially internally valid RCTs detracts from addressing the more basic issues of external validity and data insufficiencies for application by clinicians to their patients.

    I recommend Dr. Whitstock’s book for its in-depth treatment of the research design, statistical analysis, and reporting issues, as well as application problems from the clinical perspective. My reading of Dr. Whitstock’s book leads me to conclude that biomedical researchers and medical practitioners live in two different worlds.

    1. I totally agree that you cannot understand the data unless you truly understand the study methods (and any weaknesses therein) .. sharing data doesn’t fix poor study design

      1. Yep, sharing data doesn’t “fix” research design but can reveal it and disclose inferential weaknesses that might otherwise go unnoticed.

  2. And how is the risk of handing data to untrustworthy reviewers going to be dealt with? How is confidentiality going to be guaranteed, when authors cannot hold reviewers accountable?

    1. I don’t know. When my term as Chair ended in early 2012, COPE hadn’t formulated a position on data sharing. These are my personal views, not representing any particular organization.

  3. It’s quite easy to say that we should, “increase the efficiency of research by funding only worthwhile projects.” But it’s impossible to do….

    1. well, demanding systematic reviews BEFORE funding would help — and making sure protocols get expert statistical review (to avoid underpowered studies) .. I don’t think they’re impossible

  4. Aceil,

    Clearly, medical journals that publish reports need to adopt the BMJ policy of publishing rapid-response (RR) letters to the editor in which the authors can respond to any perceived unfair or unsubstantiated criticisms of their reported research. It is unclear what “confidentiality” means anymore in the new age of social media. Besides, the back-and-forth between authors and critics empowers the audience and potential users of published reports to decide for themselves the quality of reported RCT research and limits on its application.

    BTW, Cochrane Collaboration researchers, I assume, do not fall within your category of “untrustworthy reviewers.” Unlike maybe some solo researchers, Cochrane Collaboration researchers are likely to have the expertise and resources to conduct high-quality meta-analyses. Again, readers can decide for themselves differences in the quality of critical reviews.

    President Harry Truman once stated, “If you can’t take the heat, get out of the kitchen.” Research can sometimes be a tough game to play. Given the stakes for clinicians and their patients, I’m loath to see policies adopted that unduly protect the sensibilities of the players.

  5. “Putting clinical data into a usable format and making it meaningful to other researchers” should be a condition met by any serious scientific team handling materials from a scientific endeavour as important as clinical trials. The cost of doing this should have been built into the funding for the trial, or any other serious scientific investigation. Therefore there should be no significant additional cost to sharing data from any worthwhile scientific investigation.

    Readers of this blog will be quite familiar with the fate of research teams and individuals who do not put their data into a usable format that would be meaningful to other researchers.

    As Wager points out, “huge amounts of medical research funding are wasted on poorly designed or executed research which cannot provide reliable evidence, or on research into questions that have already been answered.” We need fewer studies, with better funding, to increase the proportion of studies that yield usable findings, be they positive or negative results. I’ll disagree with the issue of funding studies into questions that appear to have been answered – science requires replication and we don’t do enough of it.

    1. You’re right about replication — of course that’s essential, but not further repetition (as shown by many cumulative meta-analyses). I totally agree we need fewer, better studies!

  6. Thanks to Liz for a very cogent discussion of the issues in the biomedical research arena. In response to queries about COPE’s position on data sharing: our membership extends beyond STM journals, so we are proposing a more wide-ranging discussion of the issues related to data sharing. To that end we have posted a Discussion topic for our next Forum and we invite members to comment online at the following link (http://publicationethics.org/forum-discussion-topic-comments-please-6) as well as participate in the live discussion via webinar on Friday 12 February at 3PM GMT. Additionally, over the past several months COPE has engaged in a systematic examination of current policies at our member journals. We will propose a coherent Discussion Document/Proposed Guidelines once we have data to support our conclusions.

  7. Dr Christopher Marshallsay and Ms Tracy Farrow (Chairs, Regulatory Public Disclosure Special Interest Group, European Medical Writers Association) says:

    The 26 January 2016 editorial by the International Committee of Medical Journal Editors (ICMJE) on the sharing of clinical trial data that form the basis of results and analyses published in journals (1), raises new questions and confusion about how the sharing of data from clinical trials will be accomplished. The editorial proposes new requirements for data sharing and asks for feedback on the proposals by 18 April 2016 (1).
    At face value, the proposal that these data should be shared seems logical. By providing access to these data the analyses can be replicated and new analyses can be performed. This should be a basic expectation for all data forming the basis of the results and analyses published in journals. The sharing of data would also allow analyses to be performed that answer questions that go beyond the needs of the company that performed the clinical trial.
    Many of the ICMJE proposals are already anchored in the Principles for Responsible Clinical Trial Data Sharing (2) issued in July 2013 by the Pharmaceutical Research and Manufacturers of America (PhRMA) and European Federation of Pharmaceutical Industries and Associations (EFPIA) which require “upon request” sharing of synopses of clinical study reports (CSRs) and evaluation of requests from researchers for the sharing of clinical trial information such as full CSRs, individual patient data (IPD), and [aggregated] trial-level data. Implementation of these principles began in January 2014; compliance with these principles is being monitored by PhRMA/EFPIA.
    Upon closer examination, differences between these proposals and the already-implemented principles become apparent. The most notable are the different proposed data sharing deadlines, the additional need to post plans for data sharing and to use trial registries with mechanisms for the registration of data sharing plans (e.g., ClinicalTrials.gov), the different scope of the data to be shared, and the different sub-set of clinical trials considered within scope (1, 2). In this context, the ICMJE proposals add further to the already confusing and regionally diverse data sharing needs.
    The ICMJE proposals would impact companies and researchers in academia and in the government seeking to publish their work. Likewise, although the PhRMA/EFPIA principles primarily target biopharmaceutical companies, they also impact researchers, including those in academia and in the government. For example, just as companies are expected to share data and to publish results, researchers given access to clinical trial data or generating clinical trial data themselves are also expected to share their data and to publish their results.
    The pros and cons of sharing of data from clinical trials (the benefits for science, for healthcare, for patients; a reduced patient burden due to the avoidance of unwarranted repetition – versus the cost; the issue of liability when data are shared; the risk of patient re-identification; the risk to protected personal data and to commercially confidential information; the risk of data poaching; and the health, safety and commercial risk when clinicians and patients act on reports based on shared data) were extensively discussed when developing the PhRMA/EFPIA principles.
    Revisiting the pros and cons discussion of sharing of clinical trial data seems advisable when considering the ICMJE proposal. The ICMJE requirements ultimately implemented should build upon and reinforce the already accepted framework of the existing PhRMA/EFPIA principles. After all, as concluded by the ICMJE, “(data sharing) ….it will benefit patients, investigators, sponsors, and society”.
    1. Taichman DB, Backus J, Baethge C, et al. Sharing clinical trial data — a proposal from the International Committee of Medical Journal Editors. N Engl J Med 2016 January 20.
    2. EFPIA/PhRMA Principles for Responsible Clinical Trial Data Sharing. July 2013.
    Dr Christopher Marshallsay and Ms Tracy Farrow (Chairs, Regulatory Public Disclosure Special Interest Group, European Medical Writers Association)

  8. An EMWA team (Christopher Marshallsay and Tracy Farrow for Regulatory Public Disclosure Special Interest Group; Art Gertel, Sam Hamilton, Tracy Farrow for the Budapest Working Group, Developer of CORE [Clarity and Openness in Reporting: E3 based] Reference (http://www.core-reference.org)) provided feedback on 09 May 2016 direct to ICMJE’s Dr Taichman. Dr Taichman replied to Sam Hamilton, thanking the EMWA group for their thoughtful comments and explaining that ‘…although they cannot be included with those posted prior to the deadline (that site is closed), they will be shared with the other members of the ICMJE’.
    Further EMWA contributions to the discussion on this topic can be seen on EMWA’s LinkedIn page (https://www.linkedin.com/grp/post/2717752-6102832116159049732) and on the LinkedIn page of ‘The Publication Plan’ group (https://www.linkedin.com/groups/1886265/1886265-6130063859827957764).
    EMWA’s comments on the ICMJE proposals may be viewed at:
    http://www.emwa.org/Documents/EMWA comments-ICMJE proposals-09may16.pdf

    1. Ms. Hamilton,

      It took some time but at least the ICMJE is considering the notion. I gave my view on what action medical journals needed to take in the British Medical Journal (BMJ 2015;350:h437) with citation of my challenge to the editor of the Journal of the Royal Society of Medicine (J R Soc Med 2014;107:258). It will be a tough uphill slog against entrenched interests and practice. Machiavelli famously noted the dynamic, stating “There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things.”
