Ben Goldacre has been a busy man. In the last six weeks, the author and medical doctor’s COMPare project has evaluated 67 clinical trials published in the top five medical journals, looking for any “switched outcomes,” meaning the authors didn’t report something they said they would, or included additional outcomes in the published paper, with no explanation for the change. The vast majority – 58 – included such discrepancies. Goldacre talked to us about how the journals – New England Journal of Medicine (NEJM), JAMA, The Lancet, BMJ, and Annals of Internal Medicine – have responded to this feedback.
Retraction Watch: When you discover a published trial has switched outcomes, what do you do?
Ben Goldacre: Among the 67 trials we’ve evaluated, in total so far there have been 301 pre-specified outcomes left unreported, and 357 non-prespecified outcomes that were silently added to the reports. This is consistent with the research that’s already been done on the prevalence of outcome switching. We were concerned that this problem has been really well documented, for a very long time, and yet still persists. So we decided to go beyond just publishing anonymous prevalence figures: We’re writing a letter to the journal every time we find a trial that has misreported its outcomes, to correct the record, and to try to elicit progress on this widespread structural problem.
RW: When you contact the journals, what has been their response so far?
BG: The responses from journals have been incredibly variable. People often have ideas about the “character” of different journals, and this may play into that. BMJ has rapidly issued corrections. NEJM has dismissed concerns of outcome switching out of hand. It’s fair to say that so far JAMA have been friendly but ponderous: We are waiting for them to get back to us on their decision about publishing our letters. The Lancet seem to be publishing the correction letters but regarding it as a matter for authors to remedy rather than editors, which we disagree on very strongly, especially as these journals have generally endorsed reporting guidelines like CONSORT. To be clear, we are confident that journal editors in general are strongly committed to improving reporting standards, but there seems to be a rather odd cultural blindspot around policing it. Annals have been the real surprise for everyone: dismissing concerns, writing error-laden “rebuttals”, and even effectively telling trialists that they don’t need to worry about replying to corrections on gross misreporting. There is more to come on that next week.
To be clear, while some of this is undoubtedly disappointing and odd, it’s also useful and interesting. Until we began writing to journals, we only knew that outcome switching was highly prevalent, despite most journals promising to adhere to high reporting standards. Now, from the responses we’ve had, we’re learning why it continues to be so prevalent: we are identifying the recurring misunderstandings and systemic shortcomings. Essentially we’ve solicited qualitative data on the reasons why outcome switching occurs in journals, and it could only have been done by writing these letters.
RW: What’s been the most troubling incident(s) in the journals’ responses to your correspondence?
BG: I think it depends on perspective. NEJM have simply come out and said, effectively: “We don’t care about outcome switching and we don’t care about your letters correcting it”. While we disagree, and we think readers will be surprised to hear that NEJM take that view, it is at least straightforward. The responses from Annals have really surprised everyone, because they’ve been so confused, so internally contradictory, riddled with factual errors, and then they’ve behaved very oddly around publishing responses to their “rebuttals”.
RW: You recently published some correspondence with NEJM editors, such as Deputy Editor Dan Longo, in which he says the editors “view each piece individually and add the data as appropriate based on the judgment of the peer reviewers, the statistical reviewers, and the editors.” In the other correspondence, Longo and other editors say a paper does not contain “any clinically meaningful discrepancies between the protocol and the published paper that are of sufficient concern to require correction of the record.” In other words, they may be aware that there is outcome switching, but it’s not impacting the paper enough to warrant a correction or retraction. What is your response to that?
BG: We think this is highly problematic. Of course we don’t think every trialist switching their outcomes is setting out to deliberately mislead readers, or exaggerate their findings. But a culture of permissiveness around outcome switching leads to lower standards, and it gives cover to those who are setting out to mislead. That’s why we have clear guidelines and reporting standards in academic medicine. And to be absolutely clear, there’s nothing wrong with sometimes changing your outcomes, after a trial begins, in the light of new information: But as the CONSORT guidelines say, when you do so, you should discuss and declare this in the trial report. That’s not hard to do, and it prevents readers being misled.
Furthermore, I can’t agree that the changes were trivial. We’re keen to involve students in our projects wherever possible, so we have a team of graduate-entry medical students doing the initial review of all trials in COMPare, then they are reviewed in detail by at least one of our three senior academics (in an incredibly long and very painful meeting, at least twice a week). I asked the coders for their favourite examples of outcome switching from NEJM:
Henry Drysdale:
In the ASTRAL-1 trial, one of the outcomes (Incidence of adverse events leading to discontinuation of study drug) was specifically pre-specified as a primary outcome, but reported as a secondary outcome. This is a clear deviation from the original registry entry, and this simple switch significantly changes the meaning of the results, as much less weight is attached to secondary outcomes. This could have been clearly declared in the report, with reasons.
In the trial “Cabozantinib versus Everolimus in Advanced Renal-Cell Carcinoma”, only 4 of 10 prespecified outcomes were reported. Two of the missing outcomes were declared as unreported; however, the rest were not. The missing outcomes included “Quality of life anxiety and depression at 6 months”. I think this kind of selective outcome reporting is clinically meaningful, and requires correction of the record.
From Aaron Dale:
One of our letters to NEJM was in response to the ASTRAL-1 trial, which investigated the effect of the combination of two antiviral agents (Sofosbuvir and Velpatasvir) in patients with chronic hepatitis C virus. The outcomes pre-specified in the registry included several measures of response to antiviral treatments at pre-defined timepoints. When we analysed the paper we found that different time points were reported. Essentially they have switched response at time X to response at time Y and reported these results as effective. Clearly this is an issue because the viral response to treatment may appear more effective at certain time points relative to treatment than others. There can sometimes be good reasons to change outcomes, but these changes should be flagged and justified in the body of the paper to better inform readers.
RW: What are the next steps for the COMPare project?
BG: We’re publishing a paper on our findings so far: To be clear, that’s not just our data on the prevalence of misreporting (because there are already numerous papers showing this problem is extensive) but also the publication rate for our correction letters, and the responses of journals and authors to this misreporting being corrected. We’re going to follow through on all correspondence, from all of our dozens of letters, and all our additional correspondence with journals. Then we have phase 3. I don’t want to sound too cloak and dagger, but we’re not sharing that publicly for now!
Just curious – were the journals aware that their responses would form the basis of a paper? I hope not, as this pre-knowledge may have affected their response (i.e. an “outcome switch” from “I don’t care” to “yes of course”).
On the one hand, this seems reasonable. On the other hand, it is excessively and mistakenly rigid. When you plan a trial, you choose primary and secondary outcomes. However, if something else is found which is interesting or important, it should be reported. The position that “357 non-prespecified outcomes that were silently added to the reports” is somehow a problem is kind of ridiculous. What is the specific problem with reporting a result that you did not expect? As they say, that’s why they call it research. Put another way, if Alexander Fleming had not pre-specified the anti-bacterial effects of the penicillium mold, would Dr. Goldacre be in high dudgeon about his reporting a “non-prespecified outcome”? This is a misguided effort.
The scenario we’re trying to avoid:
A mass of data comes back from a trial. The investigators examine it and find that there are some nominally significant results. They report on these without any indication of how many different ways they looked for a significant result. This is known to lead to badly inflated significance levels. In essence, when you test multiple aspects of data you need a multiple-tests correction on your significance values, but if you conceal or misrepresent how many aspects you tested, this correction can’t be done.
It’s like this: if you tell me that TP53 is a candidate gene in a particular cancer, and I test its mutation frequency in that cancer vs. controls, a standard p-value test is appropriate. If instead I look at 5000 different genes, a standard p-value test is NOT appropriate, as at the usual 0.05 threshold it is expected to yield about 250 false positives; I need to correct for the number of tests. Therefore I *must* publish the number of genes I looked at, otherwise my p values can’t be trusted.
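The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: it assumes the conventional 0.05 significance threshold, assumes that none of the 5000 genes has a real effect, and uses Bonferroni as one simple correction among several. The function names are made up for this example.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of nominally significant results when every
    null hypothesis is true (assumption: no gene has a real effect)."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the chance of any false positive
    at or below alpha. It can only be computed if n_tests is disclosed."""
    return alpha / n_tests


n_genes = 5000  # the number of genes in the example above
print("expected false positives:", expected_false_positives(n_genes))
print("Bonferroni per-test threshold:", bonferroni_threshold(n_genes))
```

The second function is the practical point: the corrected threshold depends on the number of tests, so a reader who is not told that number cannot judge the reported p values.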
The main issue here is that if you can change the primary outcomes AFTER you’ve seen the data, you have carte blanche to mislead. For example, say you declare 5 primary outcomes in the trial protocol. You get the data, and test for every single outcome that comes to mind (say, 100). Likely you’ll find a few that are statistically significant by sheer chance (in the direction that you want, preferably). You report these and forget about the initial primary outcomes, which were not statistically significant or were unfavourable.
Nothing stops you from reporting serendipitous, unexpected results that were not included as outcomes in the initial protocol. However, you have to make it explicitly clear that they were not, and justify the inclusion or the deviation from the initial protocol.
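A rough sketch of why this works so well for whoever is doing the switching, assuming a 0.05 threshold, 100 independent outcomes, and no true effect on any of them (all assumptions for illustration, as are the function names). Under those assumptions the chance of at least one nominally significant result is 1 - (1 - 0.05)^100, and a small simulation drawing uniformly distributed null p-values gives the same picture.

```python
import random


def familywise_error_rate(n_outcomes, alpha=0.05):
    """Probability of at least one false positive among n_outcomes
    independent tests when no outcome has a real effect."""
    return 1.0 - (1.0 - alpha) ** n_outcomes


def count_chance_hits(n_outcomes, alpha, rng):
    """Simulate one trial: under the null a p-value is uniform on [0, 1],
    so each outcome is 'significant' with probability alpha."""
    return sum(1 for _ in range(n_outcomes) if rng.random() < alpha)


def fraction_with_a_hit(n_trials, n_outcomes=100, alpha=0.05, seed=0):
    """Share of simulated null trials with at least one 'significant' outcome."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_trials)
        if count_chance_hits(n_outcomes, alpha, rng) > 0
    )
    return hits / n_trials


print("analytic chance of at least one hit:", familywise_error_rate(100))
print("simulated share of trials with a hit:", fraction_with_a_hit(2000))
```

With these assumptions a trial testing 100 outcomes is expected to turn up about five spurious "significant" results, so being free to pick outcomes after seeing the data nearly guarantees something to report.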
Thanks Paul. Neither we nor CONSORT have a problem with people switching their outcomes. We simply ask that this switch is disclosed in the trial report.
This is already discussed above:
“And to be absolutely clear, there’s nothing wrong with sometimes changing your outcomes, after a trial begins, in the light of new information: But as the CONSORT guidelines say, when you do so, you should discuss and declare this in the trial report. That’s not hard to do, and it prevents readers being misled.”
It’s also discussed in the linked NEJM correspondence, our FAQ, our letters, numerous blogs, numerous interviews, and many other places, so I don’t think it’s unclear. For example:
http://compare-trials.org/blog/how-did-nejm-respond-when-we-tried-to-correct-20-misreported-trials/
http://compare-trials.org/blog/where-does-annals-of-internal-medicine-stand-on-outcome-switching-a-detailed-response/
http://compare-trials.org/faq
etc.
Thanks for your replies. I agree that trials should address the primary and secondary outcome measures in a clear manner.
I am not 100% in agreement with your characterization of the tone of your concerns about “non-prespecified positive results”. You state in comments above that you wish to ensure that “non-prespecified outcomes” are properly presented, with appropriate considerations for multiplicity, which I agree with. This is specifically not what you state in communications with journals; in the document “How did NEJM respond”, you state “We agree that space constraints may be a potential problem with print journals. However, at COMPare we have repeatedly found large numbers of additional non-prespecified outcomes reported in journal articles, including repeatedly in NEJM.” There is no conditionality in this statement. You basically are questioning the publication of non-prespecified outcomes. If you wish to have these done in a careful manner, this is what you should say in letters. But you do not. You are trying to get them to suppress such findings, which is not appropriate.
I invite other interested readers to compare what is said here to what is said in communication with journals by the COMPare group. There are important discrepancies.
My position (having worked on a number of trials over the last 20 years, and being an experienced trialist) is that outcomes should be published if they are found. The prespecification is important, but it is also important to present information to further the science. If non-prespecified outcomes are encountered, they should and will be published.
Data does not analyse itself. It is analysed because someone decides to do it. So why was this need only identified at analysis rather than at the protocol stage? Was it sloppy work (I’m on an ethics committee, I see lots of that) or a desire to do a fishing expedition? No one has any idea how many of these were performed or their specifications, as this can include choosing cutpoints or creating composite outcomes.
I don’t have anything against including results for adverse events provided that it is clear they weren’t an outcome. I wonder if the original Vioxx papers discussed the effect on blood pressure?
The point about unexpected results is that they are ... unexpected. If they were expected, they would have been added in as an outcome. So your comment is curious. How is a plan for a clinical trial to make a plan for unexpected outcomes?
Another point is that when you have done a trial, you have usually spent a long time on that particular project. When I was on the EXCITE trial, we met for a week in 2001 to plan the trial (after the trial was funded), collected data for 3 years with the difficulties and complexities of that process, analyzed data for the next 2 years intensively and the next 3 years somewhat less intensively. Thus, I was involved with this trial directly for 7 years. If you think that the trial personnel (primary document had 7 authors) would simply stop the analysis and evaluation of the data from this trial after the primary outcome data had been analysed, you would be wrong. And it would be wrong to not continue to examine the information of the trial, for which the US public spent a lot of money. So, we did look at a variety of things not planned for in the original specification. I continue to think about what else could be obtained from this data.
“Discuss and declare” seems appropriate but currently does not fly with reviewers at ClinicalTrials.gov – see QA Comment copied below. A registered study can meet criteria for results-reporting transparency only by locating and entering actual data for measures that may have been changed or abandoned as unfeasible or irrelevant. This seems very burdensome to individual investigators of small studies who are unprepared for this level of rigor. Any opinions on the value of this requirement?
Thanks
PRS Reviewer’s QA Comment: >>Information in a previous version of this record indicates that several Secondary Outcome Measures were deleted from this record. Note that all collected data for pre-specified Primary and Secondary Outcome Measures should be reported in the appropriate Outcome Measure data table(s) rather than deleting non-redundant Outcome Measures for this study. Please revise to report the data, as appropriate.
Dear Paul,
You say:
“I am not 100% in agreement with your characterization of the tone of your concerns about “non-prespecified positive results”. You state in comments above that you wish to ensure that “non-prespecified outcomes” are properly presented, with appropriate considerations for multiplicity, which I agree with. This is specifically not what you state in communications with journals; in the document “How did NEJM respond”, you state “We agree that space constraints may be a potential problem with print journals. However, at COMPare we have repeatedly found large numbers of additional non-prespecified outcomes reported in journal articles, including repeatedly in NEJM.” There is no conditionality in this statement. You basically are questioning the publication of non-prespecified outcomes. If you wish to have these done in a careful manner, this is what you should say in letters. But you do not. You are trying to get them to suppress such findings, which is not appropriate.”
You have misrepresented the COMPare letters and output. We simply ask people to declare novel non-prespecified outcomes as novel, as per CONSORT guidelines.
The quote you have taken, out of context, above, is not from a letter to a journal, it is from the end of this blog post:
http://compare-trials.org/blog/how-did-nejm-respond-when-we-tried-to-correct-20-misreported-trials/
It is followed by the phrase: “As previously discussed, we don’t see a problem with trialists switching their outcomes where there are good reasons to do so, as long as this is openly declared in the trial report, as per CONSORT guidelines.”
It is therefore not clear to me why you would wish to misrepresent this text.
As described above, the COMPare Trials project is absolutely clear that reporting additional outcomes is fine, as long as these are declared as such. This is also precisely what CONSORT recommends.
I have already given multiple links where this is made clear, including
http://compare-trials.org/blog/where-does-annals-of-internal-medicine-stand-on-outcome-switching-a-detailed-response/
http://compare-trials.org/faq
“There are often legitimate reasons for changing outcomes during a trial. Does COMPare consider this “outcome switching”? No. We agree that outcomes can change for good reasons after a trial has started. However, consistent with the CONSORT guidelines, these changes must be declared and explained in the trial report. Where changes have been declared, we consider the outcomes in that trial to be correctly reported.”
“Trial protocols are often updated and published after trial commencement, for good reasons. Do you take into account protocol amendments when assessing outcome reporting? Protocols published after the trial start date cannot, by definition, contain pre-specified outcomes. We regard changes to pre-specified outcomes as completely normal, however they must be declared in the publication reporting the trial results. Where updates or amendments to the original outcomes have been declared, we consider the outcomes in that trial to be correctly reported.”
This is also made clear in the Q&A above. It is also made clear in our individual letters to journals, including the NEJM correspondence you refer to in your comment (even though the initial letter did not contain any additional outcomes to discuss):
http://compare-trials.org/blog/how-did-nejm-respond-when-we-tried-to-correct-20-misreported-trials/
For example:
“…Specifically, we aim to detect outcomes that are pre-specified but not reported, as well as outcomes that are novel but not declared as such…”
And:
“You mention space limitations as a justification for selective outcome reporting, and non-reporting of pre-specified outcomes; however our monitoring, and our letters, have regularly identified trials where additional non-prespecified outcomes are reported (and not declared as novel), including several trials published recently in the New England Journal of Medicine”
I am always happy to hear feedback on how messages can be made clearer but it is honestly hard to regard such feedback as constructive when it comes from someone who is, very puzzlingly, simply misrepresenting and misquoting.
I don’t think it’s necessary or helpful for you to misrepresent COMPare’s output in order to make your point that it is acceptable to report additional novel outcomes. Reporting additional outcomes is fine, as long as they’re declared as such.
It is alarming that most of the trials were found to have switched outcomes and what’s even more disturbing is the stance of the journal editors who do not feel responsible for alerting the readers and getting the records corrected. Considering the fact that the results are used by doctors and policy makers, journals should insist on accurate reporting of outcomes in the papers they publish. Would it not be in the best interest of the journals as well if the papers they publish meet ethical standards?