‘Aggressive’ COVID-19 strains: What it takes to correct a flawed paper

A group of researchers in Scotland have taken aim at a study published in early March which reported surprising findings on the genetics of the SARS-CoV-2 virus responsible for the Covid-19 pandemic. 

But the story of what it took to correct the record about the paper is likely to be all too familiar to those who attempt such feats. It involved a blog post and a new paper — neither of which appeared on the site of the original journal that published the work, and neither of which is seeing the kind of attention paid to the original article.

The paper, “On the origin and continuing evolution of SARS-CoV-2,” appeared in National Science Review, published by Oxford Academic. According to the abstract:

Although we found only 4% variability in genomic nucleotides between SARS-CoV-2 and a bat SARS-related coronavirus (SARSr-CoV; RaTG13), the difference at neutral sites was 17%, suggesting the divergence between the two viruses is much larger than previously estimated. Our results suggest that the development of new variations in functional sites in the receptor-binding domain (RBD) of the spike seen in SARS-CoV-2 and viruses from pangolin SARSr-CoVs are likely caused by mutations and natural selection besides recombination. … These findings strongly support an urgent need for further immediate, comprehensive studies that combine genomic data, epidemiological data, and chart records of the clinical symptoms of patients with coronavirus disease 2019 (COVID-19).

Not surprisingly given the topic, the report, by Xiaolu Tang of Peking University and colleagues, triggered a flood of interest, including nearly 200 articles in the media and more than 10,000 tweets about the paper.

It also prompted a group of researchers at the University of Glasgow Centre for Virus Research, in Scotland, led by Oscar MacLean, to write a blog post on virological laying out their concerns on March 5, two days after the paper was published. In it, they argue that:

Two of the key claims made by this paper appear to have been reached by misunderstanding and over-interpretation of the SARS-CoV-2 data, with an additional analysis suffering from methodological limitations. 

That post led to some back-and-forth, including this response:

The criticisms by MacLean et al. (Response to “On the origin and continuing evolution of SARS-CoV-2” ) of the Tang et al.’s recent publication (“On the origin and continuing evolution of SARS-CoV-2”) on National Science Review (NSR) will be briefly answered below. Both parties have agreed that full-length exchanges should appear in NSR, which has generously offered to host this very public and open debate.

It also led to an addendum to the original paper, which is available only at the end of the paper’s PDF and is not referenced on the study’s abstract page:

In our recent publication (https://doi.org/10.1093/nsr/nwaa036), we showed that among circulating SARS-CoV-2 (with 103 genomes analyzed) two different viral genomes co-exist. We identified them as lineages L and S. The concerned amino acid we used to define the L and S lineages is located in ORF8 (open reading frame 8), which plays a yet undefined role in the viral life cycle. Based on the finding that “L” lineage has a higher frequency than lineage S, we described the L lineage as aggressive. We now recognize that within the context of our study the term “aggressive” is misleading and should be replaced by a more precise term “a higher frequency”. In short, while we have shown that the two lineages naturally co-exist, we provided no evidence supporting any epidemiological conclusion regarding the virulence or pathogenicity of SARS-CoV-2. By saying so, corrections will be made in the print version of this paper to avoid being misleading

So, did the critique end up in the journal that originally published the work? MacLean told us that:

We were actually invited to publish it there by the senior editor (albeit it indirectly through an intermediary Chinese scientist for some reason). After the senior author on the paper responded to us on virological in a way that suggested he hadn’t actually bothered to read our response, and understand what the issues with his analysis were, we decided responding in NSR would be a bad course of action. We took the virological response to be indicative of the level of discourse we would expect if we published there.  We were also privately encouraged not to publish there by other virologists, so as not to make it look like the issues with the Tang et al. paper were minor and simply technical disagreements amongst scientists. 

The critique by MacLean and colleagues ended up in Virus Evolution, another Oxford journal. Their commentary, published April 30 and titled, bluntly, “No evidence for distinct types in the evolution of SARS-CoV-2,” states:

A recent study by Tang et al. (2020) claimed that two major types of SARS-CoV-2 had evolved in the ongoing COVID-19 pandemic and that one of these types was more “aggressive” than the other. Given the repercussions of these claims and the intense media coverage of these types of articles, we have examined in detail the data presented by Tang et al, and show that the major conclusions of that paper cannot be substantiated. Using examples from other viral outbreaks we discuss the difficulty in demonstrating the existence or nature of a functional effect of a viral mutation, and we advise against overinterpretation of genomic data during the pandemic.

As of this writing, the Virus Evolution paper has been mentioned by 97 news outlets, about half as many as mentioned the original paper, and tweeted by 93 people, just 1% of the 9,300 who tweeted the National Science Review paper.


5 thoughts on “‘Aggressive’ COVID-19 strains: What it takes to correct a flawed paper”

  1. This is becoming a very political topic. This paper from Los Alamos Laboratory appears to lend some support to the original paper’s (Xiaolu Tang, et al.) contention of two strains, with the 614G strain becoming the more dominant strain. The paper states: “Finally, the D614G mutation is predicted to destabilize inter-protomer S1-S2 subunit interactions in the trimer, and this may have direct consequences for the infectivity of the virus (Fig. 4). Increased infectivity would be consistent with rapid spread, and also the association of higher viral load with G614 that we observed in the clinical data from Sheffield, England (Fig. 5).”
    From: “Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2”
    https://www.biorxiv.org/content/10.1101/2020.04.29.069054v1

    1. My understanding is the viral load was not in fact measured – PCR was done. Higher PCR levels can also be due to more defective viral particles or viral fragments, which do not contribute to higher infectivity as they are not infectious. (In fact, they may lead to lower infectivity if the total infectious particle count goes down due to production of the faulty ones.)
      Founder effect is a real issue here.

  2. Hi Kevin,

    The spike ORF D614G mutation did not distinguish the two ‘types’ proposed by Tang et al.; those were separated by a synonymous mutation in ORF1AB and a nonsynonymous mutation in ORF8.
