What happens to copies of retracted papers on non-publisher websites (e.g., PubMed Central)?
One of the important questions about retractions is what happens to retracted papers. How do readers find out they’re retracted? There’s evidence that they are cited less often, but that when they are cited, the vast majority of the time it’s as if they had never been retracted.
…Internet copies of 1,779 retracted articles identified in MEDLINE, published between 1973 and 2010, excluding the publishers’ website. Found copies were classified by article version and location. Mendeley (a bibliographic software) was searched for copies residing in personal libraries.
Many of the 321 copies of 289 retracted articles Davis found were on PubMed Central (PMC), the “free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health’s National Library of Medicine.” Here are the highlights of the results, which Davis published last month in the Journal of the Medical Library Association:
Just over one-quarter (26% or 82) of retracted articles located in this study contained some retraction statement. Sixty-six of these 82 copies (80%) were accessible from the page-view in PMC—a format that provides access to an article 1 page at a time within a larger web page, which contains bibliographic data. Removing these and focusing on full PDF files of retracted articles, just 16 (5%) of the articles contained some form of retraction statement.
Davis could only see the bibliographic information in Mendeley, not the whole PDF, so he didn’t speculate as to whether those were marked.
In other words, just 5% of the full PDF files contained a retraction statement. (Minor note: The paper’s abstract contains a typo in summarizing these results, giving 15 where it should say 16. When we pointed this out to Davis, he immediately contacted the publisher, who will issue a correction in the journal’s October issue. It’s just a typo, and it doesn’t affect the results. Kudos to Davis for taking care of it so quickly.)
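For readers who want to check the arithmetic, here is a minimal Python sketch that re-derives the percentages from the counts quoted above. It assumes, as our reading rather than something the excerpt states outright, that the 321 located copies are the denominator for both the 26% and the 5% figures; the variable names and the `pct` helper are ours, purely for illustration.

```python
# Counts quoted from Davis's results above.
total_copies = 321    # copies of retracted articles located (assumed denominator)
with_statement = 82   # copies carrying some retraction statement
pmc_page_view = 66    # of those, copies seen through the PMC page-view

# Marked copies that remain once the PMC page-view copies are set aside.
full_pdf_marked = with_statement - pmc_page_view


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(full_pdf_marked)                     # marked full PDFs
print(pct(with_statement, total_copies))   # share of copies with any statement
print(pct(pmc_page_view, with_statement))  # share of marked copies in page-view
print(pct(full_pdf_marked, total_copies))  # share of copies that are marked full PDFs
```

Under that assumption the subtraction gives 16, not 15, and the three rounded percentages come out as 26, 80 and 5, matching the figures in the passage.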
Here’s how Davis summarized the paper for Retraction Watch:
Taken together, the results claim:
- That copies of retracted papers are widely available, and easily discoverable, from non-publisher websites, that
- Few of these copies indicate that the papers were subsequently retracted, and that
- References of these papers are widely found in the libraries of a popular reference manager
We asked Davis why he focused, in the paper’s abstract, on the 5% figure rather than the 26%. The former, of course, paints an even bleaker picture of what’s on non-publisher websites. He responded:
Consider the PMC page-view rendering of a retracted article.
As a reader, I have to display each of the 7 pages individually and I can’t download the entire article or print it off as a single article. The rendering of the text is not good, I can’t copy and paste relevant statements, and it is hard for me to read this article from my screen. I think most readers would find this an unacceptable version to do more than give a cursory view. As a reader, I want the full PDF file of the article.
The PMC page-view display of the article also provides an opportunity to display a retraction message in the metadata. In the above example, the statement “This article has been retracted” is found just above the title information in the bib record. However, you will note that there is no indication on the PDF file itself that the article has been retracted. Had I found a copy of the full PDF on a publicly accessible website, I would miss any indication that this article was retracted.
So, to answer your question, I feel that an unmarked full PDF of the publisher’s version poses the biggest threat to potential readers and as such, focused on these.
Fair enough. Regardless of which number is the better one to use, they’re both low. But we should note that publishers aren’t exactly perfect when it comes to notifying readers, either. As Grant Steen found in a 2010 study:
Journals often fail to alert the naïve reader; 31.8% of retracted papers were not noted as retracted in any way.
Davis’s work was “funded by the Publishers International Linking Association, which oversees the operation of CrossMark,” one potential solution to the awareness problem. As we’ve written elsewhere, CrossMark is a
…clickable logo that will let a reader know whether there have been any corrections, retractions or other revisions. It is a solution to the fact that such changes are at best difficult to find — and are sometimes not mentioned at all on ‘current’ versions of papers.
Theoretically, that means authors will avoid citing retracted papers positively — or at least know when they’re citing such studies.
But Davis, who offers some ways to prevent unknowingly citing retracted papers, notes in the study:
As CrossMark is unable to push alerts to readers or replace older PDF files on user machines with current copies, it is likely that retracted papers—especially older papers published without the CrossMark symbol—will be cited for some time.