The Canadian National Breast Screening Study conducted in the 1980s and led by researchers at the University of Toronto evaluated the efficacy of breast cancer screening in reducing mortality from breast cancer. Because the research was supposedly a “gold standard” randomized controlled trial, its results, published in academic journals and reported in the media, have influenced public perceptions and informed policy on mammography screening in several countries.
However, over the past decades, flaws in this study have come to light. My colleagues and I learned of failures in randomization, and we and other researchers have found other serious problems. We think these flaws strongly suggest the publications of CNBSS results should be retracted. Despite being informed of the flaws in this study in 2021, the University of Toronto has not adequately or appropriately addressed these issues.
The CNBSS was configured as two separate randomized clinical trials, one for women in their 40s at entry and the other for women in their 50s. In CNBSS1, 50,000 women ages 40-49 were supposedly randomly assigned to the intervention arm, in which they would receive up to five annual screens with two-view film mammography plus clinical breast examination by a nurse, or to the control arm, where they received a single clinical examination at entry and usual care (essentially, no screening) afterwards. In CNBSS2, which enrolled 40,000 women, the randomization was between mammography plus a clinical breast exam and a clinical exam only.
When the results of CNBSS1 and CNBSS2 were first published in 1992, the investigators reported no difference in mortality from breast cancer between the arms in either trial. This finding contrasted with positive results in several other studies. In fact, the investigators initially, and falsely, claimed screening was leading to excess deaths from breast cancer in the younger women. That claim was eventually retracted. As well, results from CNBSS were later used by one of its principal investigators to produce unrealistically high estimates of overdiagnosis attributed to mammography screening.
I and other colleagues have criticized the CNBSS over the years for, among other serious problems, strong indications of bias in randomization. These concerns led to a forensic review in 1995 of the randomization of the trial, but, for reasons not entirely clear, the mandate of the review did not include interviewing the staff crucial to the randomization process. The review proved inconclusive and left the key question of subversion of randomization open.
In 2021, my colleagues and I became aware of eyewitness evidence that staff members had steered some participants with symptoms of breast cancer into the mammography arm of the study. At all but one trial site, all women received a clinical breast examination before getting their study arm allocation. The staff registering the women into the study were aware of the exam results. Entering a woman with significant findings into the intervention arm would assure her of an immediate mammogram.
According to eyewitness testimony, well-meaning staff members could, and did, steer such individuals into the screening arm with the intention of providing a symptomatic woman with an immediate mammogram, without appreciating the consequences for randomization this entailed. As we have explained, moving even a few women from their intended random allocation in the control arm to the intervention arm after discovery of palpable abnormalities could shift the trial findings from demonstrating a mortality benefit from screening to suggesting a mortality detriment.
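A minimal sketch of that arithmetic, in Python, may help. Every number in it is hypothetical and chosen only for illustration; none is taken from CNBSS data, and the function names are invented for this example. It assumes two arms of 25,000 women in which properly randomized screening would yield 32 breast cancer deaths against 40 in the control arm, and then moves 10 women with palpable abnormalities, 6 of whom later die of breast cancer, from the control arm into the screening arm.

```python
def relative_risk(deaths_screen, n_screen, deaths_control, n_control):
    """Breast cancer mortality in the screening arm relative to the control arm."""
    return (deaths_screen / n_screen) / (deaths_control / n_control)


def shift_allocation(deaths_screen, n_screen, deaths_control, n_control,
                     moved, moved_deaths):
    """Move `moved` women, `moved_deaths` of whom die, from control to screening."""
    return (deaths_screen + moved_deaths, n_screen + moved,
            deaths_control - moved_deaths, n_control - moved)


# Hypothetical trial as it would look with intact randomization.
intended = (32, 25_000, 40, 25_000)

# Hypothetical subversion: 10 symptomatic women steered into the screening arm,
# 6 of whom go on to die of breast cancer.
shifted = shift_allocation(*intended, moved=10, moved_deaths=6)

rr_before = relative_risk(*intended)
rr_after = relative_risk(*shifted)

print(f"Relative risk with intact randomization: {rr_before:.2f}")
print(f"Relative risk after moving 10 women:     {rr_after:.2f}")
```

Under these assumed numbers, the apparent relative risk moves from 0.80 (a benefit) to about 1.12 (an apparent detriment), even though only 10 of 50,000 participants were reallocated.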
In light of this new evidence, later in 2021 we contacted the Canadian Cancer Society (CCS), whose subsidiary, The National Cancer Institute of Canada, had sponsored the CNBSS. The CCS then approached the University of Toronto, via the Dean of Medicine, informing them of this evidence and requesting a review. In the meantime, we published an article describing the evidence for subversion of randomization.
After several months with no response from the university, we followed with a letter to the university’s Research Ethics Board. Eventually we were informed that the matter had been referred to the Office of Research Integrity, which is the responsibility of the Vice President of Research and Innovation of the University of Toronto. Finally, in December of 2021 we were told that our concerns would be investigated and an external three-member panel would be appointed to do this. This was not consistent with the university’s established framework for investigations of research ethical issues.
After another long gap, we learned on June 14, 2022, that the panel members had been selected by Lorraine Ferris, the associate vice president of research and innovation at the University of Toronto. Two of the members of this panel, Mette Kalager and Peter Jüni, were on record as supporting the results of the CNBSS, and one of them had co-published on issues around screening with one of the trial’s principal investigators. If CNBSS were to be invalidated, both of these individuals would have to reverse pointed statements they had made publicly. We were surprised the university would appoint either of these individuals to the panel, given their clear conflicts of interest. For the same reason, it was also surprising that these panel members did not immediately recuse themselves.
We immediately raised these concerns with Prof. Ferris. In her response, she stated the investigation would go forward with this panel. She said it was challenging to find suitably qualified panel members for such a review. We explained that resolving our concerns regarding subversion of a randomized controlled trial does not require people with specific expertise in breast cancer screening studies; it is largely a matter of considering the proper conduct of any such study.
In a follow-up letter, we indicated we would continue to protest the process the university had chosen to investigate our concerns and to push for a more appropriately constituted panel. At the same time, we agreed to be interviewed by whatever review panel was appointed by the university. We did not want, by refusing to cooperate, to provide an excuse for the university to back away from its responsibility to conduct an investigation. On Aug. 2, 2022, we received a response from Prof. Ferris that she was unwilling to reconsider the composition of the panel.
The panel did interviews by Zoom in the Fall of 2022, including one with me. But the witness list was not made available to those of us who had raised the initial concerns, nor have the statements made by the various witnesses been shared.
In 2023, we were informed by Professor Ferris that she expected the review to be completed by March of 2024. However, that date passed without event. We requested a meeting with the current Provost, Trevor Young, and that was arranged for March 28 of last year. Professors Ferris and Leah Cohen, vice president of research and innovation, and strategic initiatives, were also scheduled to attend. The meeting was abruptly cancelled by the Provost on the afternoon of March 26. We have heard nothing since.
This is not the first time the University of Toronto has delayed or failed in fulfilling its responsibilities to safeguard the ethical conduct of medical research. Notably, in the late 1990s, Nancy Olivieri, a hematologist clinician researcher conducting a study with an industrial collaborator to evaluate a new drug for treating thalassemia, observed adverse effects on patients. When she attempted to discontinue the study and voice her concerns, she was faced with intense harassment by the pharmaceutical manufacturer and by some of her academic colleagues. Her request for intervention from the university, which was in the midst of negotiating a $20 million building donation from the manufacturer, did not receive an adequate response. Ultimately the matter had to go to an external review, which vindicated her.
At this point, almost four years have elapsed since we brought our concerns to the attention of the University of Toronto and, as far as we can see, nothing has happened. Perhaps the long delay in the review suggests that, in light of overwhelming evidence, the university is no longer able to credibly support the conduct or the results of the CNBSS. Or perhaps it means that by delaying its review, the university hopes the problem will go away.
The CNBSS has influenced policy and opinions of primary healthcare providers and women, discouraged opportunities for earlier detection of cancer, and, in turn, resulted in delays in cancer detection. The publications of results from CNBSS must be retracted.
Martin J. Yaffe is a medical physicist and imaging scientist at Sunnybrook Research Institute and Professor of Medical Biophysics at The University of Toronto. His research over the past 45 years has focused on the earlier detection, diagnosis and characterization of cancer.
It looks strange that this text, describing work done mostly after 2020, does not mention Gotzsche’s 2013 systematic review, the first review to analyze the ‘suboptimal’ trials separately. In this influential review, the Canadian trial was classified as having adequate randomization.
Furthermore, the report PMCID: PMC1226907 is cited here under the phrase “and other researchers have found other serious problems.” If we look at it, we find that “The authors’ thorough review of ways in which the randomization could have been subverted failed to uncover credible evidence of it.”
Such omissions and misrepresentation weaken the argument for retraction of the Canadian study.
Unfortunately, the Gotzsche review, mentioned by Dr. Vlassov, has been largely discredited and Dr. Gotzsche has been removed from the Cochrane Group. He claimed that of the randomized, controlled trials, only the Malmo and Canadian trials had adequate random assignment while, inappropriately, dismissing all the other trials. In fact, multiple reviews, the scientific facts, and the testimony of those who participated in the Canadian studies confirmed that the Canadian trials violated the fundamental rules for randomized, controlled trials and their results are unreliable.
Random allocation is fundamental and critical in these trials. The goal of Randomized Controlled Trials is to produce 2 identical groups of women such that each woman in the study arm will have a “twin” in the control arm. If nothing else was done, there should be the same number of women diagnosed with breast cancer in both groups and the same number of deaths from breast cancer in both groups over time.
It is critical that nothing can be known about the participants before they are randomly assigned to the study or control arms of the trial that might compromise random allocation.
It has been clearly documented that most of the women who participated in the Canadian trials underwent a clinical breast examination prior to allocation. Not only were women who were found to have a clinical abnormality on these examinations allowed to participate in these trials of mammography screening, but the results of the clinical examination were provided to the coordinators prior to allocation (another major violation).
Compounding the error is the fact that, once again in violation of the fundamental rules for RCTs, assignment was not blinded. The coordinators could assign women to either arm of the trial since assignment was on open lists which identified the assignment. If they wanted to assign a woman to the screening arm to be certain that she would get a mammogram, they could, literally, skip a line and then fill in the skipped space with the next participant, and the nonrandom allocation would be undetectable. The data clearly show this happened, with statistically significantly more women with advanced cancers being assigned to the screening arm than the control arm in CNBSS1.
The Principal Investigator has never allowed the coordinators to be interviewed to determine how often this happened, but the data have strongly suggested it, and it has been documented that women referred to surgeons with clinically evident breast cancers were then referred to these trials of screening. There is now “eyewitness” testimony that some of these women were assigned to the “screening arm” out of random order. All of these concerns have been verified, and the major imbalances in the published data have confirmed that these trials were corrupted by nonrandom assignment.
This does not even include the well documented failure and poor quality of the mammography used in the Canadian Trials.
It is unethical for the Toronto review to be taking so long and allowing these compromised trials to be used to deny women access to screening and the benefits of early detection.
Daniel B. Kopans, M.D.
Professor of Radiology, Harvard Medical School
To me, both the primary author and Dr. Kopans are promoting an unbalanced point of view.
Discrediting esteemed experts like Peter Juni and Peter Gotzsche is not an appropriate option. Years ago, when Gotzsche first introduced into the practice of systematic review the separate analysis of good and not-so-good trials, it was rejected by the community, but it then became standard practice. In the mammography systematic review, the separate analysis does not support mammography screening, and the analysis of the totality of data supports it only marginally. This attack on the Canadian study has an obvious purpose: to exclude “these compromised trials” from being “used to deny women access to screening and the benefits of early detection.”
Women are not denied access to screening because of the quality of 30-year-old data. The major problem is the absence of good modern data relevant to modern treatment options.
Vasiliy Vlassov may be misinterpreting my comments. Regarding Prof. Juni, I am in no way impugning his credibility as a scientist. My point was that because of the strong statements that he has made against mammography screening and his support of the findings of the CNBSS trials, he should not have been considered by the University of Toronto to be in an appropriate position to render an objective review of concerns regarding ethical and methodological improprieties in these trials. He and Prof Kalager both have a professional conflict of interest in this matter.
I also note that in a 2000 publication with Ole Olsen in The Lancet, Prof. Gotzsche did mention the concerns that had been raised regarding subversion of randomization in CNBSS; however, he dismissed them, accepting the findings of an inquiry which, as Prof Kopans has stated, had major gaps. When he endorsed the CNBSS as a well-conducted trial, he could not have been aware of the eyewitness evidence of interference with the randomization which formally came to light in 2021 and was described in my post.
– Martin Yaffe
The medical profession needs to stop making decisions that harm the people they purportedly serve, to benefit or protect colleagues or funding dollars. First do no harm…stop paying lip service to your oath.
In Martin Yaffe’s criticism of the CNBSS trial of breast screening, it seems the piece de resistance is that women in the control group who was identified with a lump on their clinical breast examination were sometimes unduly transferred to the screening arm of the trial. The open allocation procedure is claimed to have been used to contravene and thus corrupt the randomisation. This was done out of compassion for the individual woman to make sure she had a mammogram. According to Martin Yaffe’s rationale, this would increase both breast cancer mortality and incidence in the screening arm, which would explain the ‘unwelcome’ results of the trial, namely that breast screening was not found to reduce breast cancer mortality but rather increased incidence due to overdiagnosis.
Let us give Martin Yaffe’s claims a bit of a think.
His accusation rests on a few assumptions (and a bit of hearsay, it appears). First, the trial authors would have to have been cruel enough to plan the trial so that no follow-up investigation, e.g. using a clinical mammogram, was provided for women found to have a lump during the clinical examination (which raises the obvious question of why it was then offered in the first place). Second, this cunning plan to deny women clinical investigation would have to have been found worthy of funding. Third, the plan would have had to gain ethical approval. Fourth, it would have to have been accepted by participating university hospitals and their skilled staff.
None of these four assumptions appears immediately convincing.
In fact, the reason all women included in the trial were offered a clinical breast examination was that some of them had to travel quite far to contribute to the trial. It was thought reasonable to offer those allocated to the control group at least something. This was done even though it would lower the chance of the trial identifying a benefit of breast screening, which was what the investigators hoped for and expected, as do most investigators who invest a lifetime in a trial.
The trialists’ offer of something was the clinical breast exam, after which those identified with lumps would of course be offered a clinical mammogram, indicating to the radiologist where to look for changes. Allocating these women to a screening mammogram would have been doing them a great disservice, lowering their chance of a positive outcome compared to immediate referral to a clinical mammogram. (Randomised trials have since documented that regular clinical breast examination and breast self-examination to screen for breast cancer likely have little or no effect on breast cancer mortality.)
The trial authors write: “Participants then had a physical (clinical) breast examination and were taught breast self examination by trained nurses, or in the province of Quebec, by doctors (fig 1). The examiners had no role in the randomisation that followed; this was performed by the study coordinators in each centre.” BMJ 2014;348:g366.
This criticism has been raised many times, but it does not seem more solidly justified in this blog post than it has been before. Rather, the blog post seems a strange mixture of conspiracy theory, reluctance to accept unwelcome results, and hearsay. The CNBSS trial has been tried and found sound in a previous evaluation and in several systematic reviews, including one I have co-authored for the Cochrane Collaboration.
Rather than calling for this blog post to be retracted, I will argue that it should remain so that readers can judge for themselves. Readers should also consult the original reports of the CNBSS rather than base their opinion on a blog. While the perfect trial is yet to be conducted, the CNBSS is, together with the Malmö trial and the UK AGE Trial, the most well-designed, transparently and consistently reported trial of breast screening, with the lowest risk of bias.
I thank Prof. Jørgensen for his comments; however, I would like to correct his first statement. My Retraction Watch article is not so much a criticism of the CNBSS as an expression of concern that the University of Toronto is unacceptably delaying its review of two serious ethical flaws that have been discovered in this study and undermine its credibility. This review by an external panel appointed by the University has been underway since 2022.
The many limitations of the two CNBSS trials have been well documented by several authors, notably in the 1993 publication by Boyd et al. They include inadequate size of the cohorts to attain power for a reasonable effect size, poor image quality, poor training of the radiologists, and in some cases, the use of outdated equipment. But the two major concerns brought to the University of Toronto did not focus on these scientific deficiencies, but on issues of research ethics. They included the one mentioned by Prof. Jørgensen: “women in the control group who was (sic) identified with a lump on their clinical breast examination were sometimes unduly transferred to the screening arm of the trial”. This represents a subversion of the randomization, and because the number of breast cancer deaths was fairly small, the transfer of even a few women with poor-prognosis cancers into the screening arm could dramatically alter the findings of the trial. That such transfers were likely to have occurred was suggested in the Boyd publication by the overwhelming statistical imbalance between trial arms in large palpable cancers, but that they actually occurred was confirmed and clearly documented by an eyewitness who had been a mammography technologist in CNBSS and who was subsequently interviewed by the external review panel. This is not hearsay. What stronger evidence of interference with randomization can exist than a report from an eyewitness?
The other ethical lapse relates to the fact that women who had already been attending surgical breast clinics for known symptoms were allowed to participate in CNBSS, which has been identified as a study of screening, i.e., testing of asymptomatic women. Regardless of the arm of the trial to which such symptomatic individuals were assigned, the screening intervention would be of more limited value and the power of the trial would have been diminished by their inclusion.
Presumably Prof. Jørgensen would agree that both of these problems, confirmed subversion of randomization and inclusion of women with known symptoms, undermine the credibility of CNBSS.
In a forensic review led by Bailar and MacMahon, the weaknesses of the open book randomization used in CNBSS were clearly described. Allowing one part of the intervention, the clinical examination, to be performed after a participant’s intended randomization status had been established but before she was formally registered in the trial left the process open to manipulation.
Prof. Jørgensen uses several terms loosely. By “hearsay”, perhaps he is referring to the written statement of the eyewitness. He attributes, without valid evidence, the failure of the CNBSS to demonstrate a mortality reduction to overdiagnosis, although the level of overdetection reported by CNBSS is inconsistent both with its own previous reports and with the data, appearing to occur mainly after screening in the trial had ended. As for assumptions, all sources for my comments have been cited in a peer-reviewed published article and are available for him to read.
His “bit of a think” does not accurately describe the opinions of my colleagues and myself. Of course, the hypothetical conspiracy chain of mean-spirited behavior, proposed not by me but by Prof. Jørgensen, did not exist, nor did it need to exist for our concerns to be valid. It betrays an apparent lack of understanding on his part of the clinical environment in which the trial was conducted. Prof. Jørgensen defends the offering of a clinical exam to all participants. At the time of CNBSS, this was reasonable and is not under dispute. The point is that one should not deliver part of the intervention in a randomized trial (the clinical exam) before finalizing the trial arm assignment of the participants. This was done at all but one of the 15 study sites and is extremely poor trial design.
We attribute no intentional “cruelty” to the trial plan. Indeed, for those properly assigned, a finding of a palpable lump would have initiated a process that would likely have led eventually to a mammogram. Given the concerns of a well-meaning nurse (untrained in research) who examined the breast and felt a lump, it is not too surprising that, rather than facing some inevitable delay, she would have done what she could to ensure that the mammogram would be performed immediately. As acknowledged in the subsequent review by Bailar and MacMahon, the approach used in CNBSS to enter trial participants left this door open. By the way, contrary to Prof. Jørgensen’s suggestion, in the 1980s in Canada there would have been little difference between the imaging that would have been provided in a diagnostic versus a screening mammogram.
Prof. Jørgensen mentions his co-authorship of a Cochrane review of screening mammography which gave CNBSS a clean bill of health. At the time he could not have been aware of the subversion of randomization, and perhaps he did not know that symptomatic patients were knowingly entered into the study, but given these facts perhaps it is now time for him to have another “think” about CNBSS. In such a large trial, despite excellent balancing of other demographic and risk factors over trial arms, even a small imbalance in one key variable, the number of women with advanced cancers (those women most likely to die from breast cancer) registered in each trial arm, can shift the entire study result.
We are prepared to believe that all involved in CNBSS had good intentions. On the other hand, the weakness in trial design and the limitations in the oversight of the trial arm assignment, as well as allowing entry of those with previously known symptoms, cast great suspicion on the credibility of CNBSS results and its labeling as a trial of screening.