Human cell lines represent key reagents for many research laboratories. Cell lines are often the first models that researchers choose for experiments such as gene manipulation and drug testing, as they are relatively accessible and inexpensive, particularly compared with mouse and other animal models.
However, cell lines are also prone to contamination by other, faster-growing cell lines. As a result, many human cell lines purported to represent particular tumor types have been found by genetic testing to be contaminated by other cancer cells. This potential for confusion poses a serious problem for researchers who want to study a particular cancer type but end up using cells from an unrelated disease.
Our team studies wrongly identified nucleotide sequence reagents in cancer research, such as PCR primers and gene knockdown reagents. Recently, as part of an undergraduate student project, we decided to also check the identities of cell lines in a small group of papers on the human gene miR-145, which codes for a microRNA. We found wrongly identified nucleotide sequences and cell lines in numerous articles about miR-145, and also what appeared to be five misspelled identifiers of contaminated cell lines.
This issue isn’t new – some cell line identifiers are known to be misspelled, although little has been written about such misspellings. We decided to study these apparent misspellings, and eventually found 23 published human cell line identifiers that were not recognized as cell lines in the most comprehensive cell line knowledgebase, Cellosaurus. We then studied eight of these non-verifiable (NV) cell line identifiers in detail across 420 papers.
While all eight NV identifiers likely represent misspellings in at least some papers, we also found that seven of the eight NV cell line identifiers seemed to be taking on new identities as independent cell lines. How did we separate identifier misspellings from seemingly independent cell lines? We described NV cell line identifiers as misspellings where they were only used alternately with the identifier(s) of similarly named human cell line(s), such that the misspelled identifier was never directly connected with any similarly named cell line.
In contrast, we took NV cell line identifiers to represent independent cell lines if they were used without mentions of any similarly named human cell line; if an NV identifier and a similarly named human cell line were both included in any list of cell lines studied; if results for both cell lines were shown in the same experiment(s); and/or if both cell line identifiers were directly connected in the text.
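For readers who think in code, the criteria above can be written as a small decision rule. This is only an illustrative sketch of the logic as described in this post, not the procedure we used in the study: the `PaperUsage` record, its field names and the `classify` function are all hypothetical, and in practice each field would be filled in by a person reading the paper.

```python
from dataclasses import dataclass


@dataclass
class PaperUsage:
    """How one paper uses a non-verifiable (NV) identifier.

    All field names are hypothetical and exist only for illustration.
    """
    mentions_similar_line: bool        # a similarly named known cell line appears in the paper
    both_in_cell_line_list: bool       # both identifiers appear in a list of cell lines studied
    both_in_same_experiment: bool      # results for both are shown in the same experiment(s)
    directly_connected_in_text: bool   # the text directly connects both identifiers


def classify(usage: PaperUsage) -> str:
    """Return 'independent' or 'misspelling' for one paper's use of an NV identifier."""
    # Used with no mention of any similarly named cell line: treated as independent.
    if not usage.mentions_similar_line:
        return "independent"
    # Any direct pairing of the two identifiers also indicates an independent cell line.
    if (usage.both_in_cell_line_list
            or usage.both_in_same_experiment
            or usage.directly_connected_in_text):
        return "independent"
    # Otherwise the NV identifier only alternates with the similar name.
    return "misspelling"
```

A paper that alternates between the two spellings without ever listing, comparing or linking them would be passed in as `PaperUsage(True, False, False, False)` and classed as a misspelling.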
We determined that more than half of the 420 papers appeared to refer to at least one NV identifier as an independent cell line. However, we could not find any published descriptions of how these cell lines were first established. Some authors claimed to have produced genetic profiles for three NV cell lines, but we could not find such profiles for these cell lines, either in publications describing NV cell lines or elsewhere. Six NV cell lines were claimed to have been sourced from large cell line repositories such as ATCC, but we could not find any of these NV identifiers or cell lines in the claimed repository catalogues.
So why should we be concerned by NV cell lines? As the problem of cell line contamination has shown, cell line identities underpin the results of experiments conducted with these models. The inability to independently verify cell lines raises doubts about the significance of associated results, how these results might be reproduced, and even which experiments were conducted in the first place. While researchers seem unlikely to source NV cell lines for their own experiments (as we could not find them in claimed repositories), they could still risk wasting time and money by following up results from NV cell lines using cell lines that they already have on hand.
Descriptions of NV cell lines call for a zero-tolerance approach to misspelled cell line identifiers, and for pairing the names of all published cell lines with their corresponding Research Resource Identifiers (RRIDs). In the meantime, as we have advised for nucleotide sequence identities, researchers should check the identities of any cell lines that they don’t recognize before planning any future experiments. Simple checks could avoid wasting time on cell lines that might be found on wanted posters, but not in Cellosaurus.
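As a rough illustration of how simple such a check can be, the sketch below compares an identifier against a list of known names and suggests near matches when it is not found. Everything here is an assumption made for the example: `KNOWN_IDENTIFIERS` is a tiny hand-typed sample standing in for a real knowledgebase, the `normalize` and `check_identifier` helpers are hypothetical, and the similarity cutoff is arbitrary. A real check should be made against Cellosaurus itself.

```python
import difflib

# Tiny illustrative sample only; a real check would consult Cellosaurus directly.
KNOWN_IDENTIFIERS = ["HeLa", "MCF-7", "HEK293", "A549"]


def normalize(name: str) -> str:
    """Lower-case the name and drop hyphens, spaces and other punctuation."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def check_identifier(name: str, known=KNOWN_IDENTIFIERS, cutoff: float = 0.7):
    """Return (status, matches) for a published cell line identifier.

    status is 'recognized' when the normalized name matches a known one,
    otherwise 'not found', with any similar known names offered as
    candidates for a possible misspelling.
    """
    index = {normalize(k): k for k in known}
    key = normalize(name)
    if key in index:
        return ("recognized", [index[key]])
    close = difflib.get_close_matches(key, list(index), n=3, cutoff=cutoff)
    return ("not found", [index[c] for c in close])


if __name__ == "__main__":
    for candidate in ["MCF7", "A594", "HeLa"]:
        print(candidate, check_identifier(candidate))
```

An identifier that comes back as "not found" with a close match is a prompt to look more carefully at the paper, not proof of a misspelling.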
Jennifer Byrne is conjoint Professor of Molecular Oncology and leads the PRIMeR group at the University of Sydney, Australia.