Which countries have the most retractions, for which reasons?

One of the questions we often get — but are careful to answer with some version of "we don't know because we don't have a denominator" — is how retraction rates vary by scientific field and country. We've noticed that the reasons for retraction seem to vary among countries, but didn't really have the data. A new paper in the Journal of the Medical Library Association by Kathleen Amos takes a good step toward figuring the country part out.

Amos looked at PubMed-indexed retractions from 2008 to 2012. Here’s what she found:

Authors from more than fifty countries retracted papers. While the United States retracted the most papers, China retracted the most papers for plagiarism and duplicate publication. Rates of plagiarism and duplicate publication were highest in Italy and Finland, respectively. Unethical publishing practices cut across nations.

It's important to note that the "rate" here is calculated among the retractions from each country, not relative to the number of papers that country published. And

…each paper was assigned a single country of authorship for the purposes of analysis. For papers with authors from multiple countries or with first authors affiliated with institutions in more than one country, the author based the analysis on the primary national affiliation of the first author.

Specifically, Amos found:

Of the 20 countries that had retractions of 5 or more papers, the highest rate of retraction for plagiarism was found in Italy, where 66.7% of retractions resulted from plagiarism (Table 2). This was followed by Turkey at 61.5%, Iran and Tunisia at 42.9% each, and France at 38.5%. In total, 12 countries had rates of plagiarism higher than the 16.6% average calculated for the sample. China's plagiarism rate was 16.8%, almost double the United States' rate of 8.5%. Both Finland and Germany recorded rates of 0.

For duplicate publication, fewer countries had retraction rates higher than the 18.1% sample average, and the range of rates was smaller. Finland had the highest rate of duplicate publication at 37.5%, followed by China at 29.4% and Tunisia at 28.6%. Japan (22.8%) and Iran (21.4%) also had rates above the sample average, while the rate of duplicate publication in the United States was below the average, at 13.1%. Only Sweden retracted no papers for duplicate publication.
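
The paper doesn't reproduce its search strategy, but to make the data-collection step concrete, here is a rough sketch of how one might pull PubMed-indexed retraction notices for 2008-2012, plus a crude per-country denominator of the kind discussed below. It uses Biopython's Entrez module; the query terms, the example country, and the retmax cap are our own assumptions for illustration, not Amos's actual method:

    from Bio import Entrez

    Entrez.email = "you@example.org"  # NCBI asks for a contact address

    # Retraction notices indexed in PubMed with publication dates 2008-2012
    handle = Entrez.esearch(
        db="pubmed",
        term='"retraction of publication"[Publication Type] AND 2008:2012[dp]',
        retmax=5000,
    )
    retraction_ids = Entrez.read(handle)["IdList"]

    # A crude denominator: all PubMed records in the same window whose
    # (first-author) affiliation mentions a given country -- hypothetical example
    handle = Entrez.esearch(
        db="pubmed",
        term="Finland[Affiliation] AND 2008:2012[dp]",
    )
    finland_total = int(Entrez.read(handle)["Count"])

    print(len(retraction_ids), "retraction notices;", finland_total, "papers with a Finnish affiliation")

Even that denominator is rough: PubMed affiliation strings are free text and, for most of this period, were generally recorded only for the first author, which is part of why normalizing by publication output "is not straightforward to do in PubMed," as Ferric Fang notes below.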

We asked Ferric Fang, who of course has published a number of papers on retractions, including one in PNAS cited by the new paper, for his take:

Unfortunately I gained only limited insights from this short study.  I would like to see the research extended by placing the findings in the context of the scientific cultures of different countries.  This is a complex topic that deserves further exploration.  It would also be of interest to normalize the data according to the publication output of each country.  Unfortunately this is not straightforward to do in PubMed. Third, it should be noted that the “country of origin” of a publication does not necessarily reflect the country from which each of the authors originates.  It would be interesting to know this information but it is not easy to obtain.

Finally, one could improve our understanding  by contacting authors to try to understand the underlying motivations in each case.  This is the kind of in-depth analysis in which Retraction Watch often provides important insights.  To reduce the prevalence of duplicate publication and plagiarism, it is important to understand the perceptions of individuals committing these actions.  In the few cases of plagiarism in which I have actually contacted authors to get their side of the story, I have been surprised by the rationalizations for what seemed to me to be overtly unethical behavior.  The authors don’t always see it that way, and it is useful to try to understand their point of view even if we don’t agree with it.

Whether or not plagiarism really deserves a place along with faking data in the Office of Research Integrity’s “fabrication, falsification, and plagiarism” definition of misconduct is, of course, a subject of active debate here on Retraction Watch. As Fang notes, plagiarism

does not corrupt the content of the scientific literature in the way that data fabrication and falsification do.  Plagiarism is about stealing others’ words and ideas, which is certainly wrong, but the content of the words and ideas are not altered.  Putting flawed data in the literature has much more potential to do harm if those flawed data mislead other scientists and the community at large.  So by all means, let’s discourage plagiarism and duplicate publication, but in the big picture they are not all that important.

For some context, here are all of the retractions we’ve covered for the four top countries:

16 thoughts on “Which countries have the most retractions, for which reasons?”

  1. I agree with Fang, especially on “It would also be of interest to normalize the data according to the publication output of each country.”
    I think that focusing on the relative rate from all retractions from a country offers little insight. If the number of papers per country is hard to obtain, even per-population rates would give a more interesting view, in my opinion.

  2. Off topic, a bit, but I suppose this is as good a place as any to post…

    I'm curious as to the opinion of Retraction Watch readers and contributors on the following. Last fall I came across two papers by the same primary author that are largely duplicates (I mentioned this briefly in another thread at RW at the time). The papers can be found here:

    A. Zaggia, B. Ameduri Current Opinion in Colloid & Interface Science 2012, 17, 188-195
    http://www.sciencedirect.com/science/article/pii/S1359029412000532

    G. Kostov, F. Boschet, B. Ameduri Journal of Fluorine Chemistry, 2009, 130, 1192-1199
    http://www.sciencedirect.com/science/article/pii/S0022113909002188

    I contacted the editors of both journals (8 total) to inform them of this finding. To summarize:

    Figures 1, 2, and 3 in the JFC paper are exactly duplicated as Figures 1, 2, and 3 in COCIS.
    The COCIS paper has five Schemes, four of which are exactly duplicated from JFC.
    Two chemical structure diagrams (not labeled as a Scheme or Figure) from JFC are exactly duplicated in COCIS.
    The JFC paper contains ca. 40 paragraphs, the COCIS paper 41. The first 21 paragraphs of the COCIS paper are introductory material and appear to be unique. Of the remaining 20 paragraphs of actual scientific content, 11 are copied essentially verbatim from the JFC paper.

    COCIS, after 4-5 months, finally issued the following statement as an "Editor's Note", with no other title or subject to indicate to a reader what the Note is about; the reader must be curious enough to follow the link to find out:

    Editors' note
    An article published in Current Opinion of Colloid and Interface Science (COCIS) in 2012 ("Recent Advances on Synthesis of Potentially Non-Bioaccumulable Fluorinated Surfactants", COCIS 17 (4), 188-195, 2012 by A. Zaggia and B. Ameduri) duplicates parts of a review article published in Journal of Fluorine Chemistry (JFC) in 2009 ("Original Fluorinated Surfactants Potentially Non-Bioaccumulable", JFC, 130 (12) 1192-1199, 2009 by G. Kostov, F. Boschet, B. Ameduri). While the COCIS article does contain new and significant perspectives that go beyond the JFC review, all of the figures and all but one of the schemes in the COCIS article are duplicated from the JFC review. In addition, sections of the text, including the abstract and Section 5 of the COCIS article are similar to the JFC review.
    http://tinyurl.com/qalhm8q

    The online version of the 2012 COCIS article now contains a hyperlink to this Editor’s Note, but again without any indication as to what it concerns – the reader must follow the link to find out.

    I believe the COPE guidelines give the Editors the freedom to decide how they wish to handle such matters. So be it. But I am curious as to the opinion of the readers here as to how they chose to handle this one. From my correspondence with the chief editor at COCIS, they are done with the matter….

    1. Interesting case. I take it that the analysis above reflects the real situation: ~50% self-plagiarism by the primary author (Bruno Ameduri). The 2012 study has nearly the same acknowledgement as the 2009 one, and the two coauthors from 2009 are not authors in 2012. The 2009 piece is a mini-review in a conference proceeding. The 2012 one is a full article claiming to be a 'recent advance in …' (but with 2009 figures). If I were an interested academic in this field, I would feel cheated, and that warrants a retraction – or a replacement, where only the "new and significant perspectives that go beyond the JFC review" are given. The primary author should be ashamed.

      1. I’m also – and primarily – interested in opinion on the editors’ actions in this matter.

  3. Fang states “let’s discourage plagiarism and duplicate publication, but in the big picture they are not all that important”.

    Surely Fang recognizes that it is possible to plagiarize others’ data, as it is to self-plagiarize one’s own data, both of which are equivalent to data fabrication in terms of their effects in corrupting the scientific record if they remain undetected. Thus, plagiarism and self-plagiarism of data should be treated with the same degree of seriousness as data fabrication and falsification.

    1. Regarding Fang's comments, I agree with Miguel Roig. If we start to grade academic "crimes" as minor or major, we will start to politicize the system. Let all academic infractions be treated equally seriously. Regarding the cultural issue, I find basic aspects of this study really troubling, for three main reasons.

      Firstly, it restricts itself to PubMed. PubMed is not the main or only scientific database. Can we please start to decentralize from PubMed! A good study would have, at minimum, taken into account Elsevier's Scopus and/or sciencedirect.com, Springer's SpringerLink, Wiley Online, Taylor and Francis Online, Google Scholar, and DOAJ.

      Also, why did Dr. Amos not update the data to 2013?

      My second qualm with this paper relates to the use of numbers in an absolute way. Surely numbers can only make sense when seen as a proportion of the total number of scientists in that country? So, even if, let's say, 50 papers retracted for plagiarism were by Turkish scientists, that number would be important if the total number of scientists was 1000, or if the number of papers by Turkish scientists on PubMed was 100. But what if there are 1 million Turkish scientists, and what if there are 1 million papers by Turkish scientists on PubMed? So, my second complaint about this paper is that it fails abysmally in looking at the absolute and relative numbers of scientists from the same country and the number of scientists from the same country who published in PubMed.
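
      To put that in numbers, here is a toy calculation using the hypothetical figures above (none of this comes from Amos's data):

          # Toy calculation with the hypothetical figures above -- not Amos's data
          plagiarism_retractions = 50

          for papers_on_pubmed in (100, 1_000_000):
              rate = plagiarism_retractions / papers_on_pubmed
              print(f"50 plagiarism retractions out of {papers_on_pubmed:,} papers = {rate:.3%}")

      The same 50 retractions look alarming against 100 indexed papers and negligible against a million, which is why a share-of-retractions figure says little on its own.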

      The concepts of country and culture are clearly distinct entities. Did Dr. Amos indicate the “original” culture of the scientists in each of the “country” papers?

      My letters tend to be turned away — http://retractionwatch.com/2014/01/27/a-rating-system-for-retractions-how-various-journals-stack-up/ — so I would suggest someone else write a letter challenging the findings.

      1. JATdS, I think your criticism is going a little too far.

        I currently don’t have access to the full text, so everything is based on what I can read here.

        Data from 2008-2012 in a study published in 2014, that's just fine IMHO. Asking for 2013 is too much. You always have a deadline and then you start working on your data.

        As to "only PubMed": Hmm, well, I only have some insight into biomedical research. But in that field, I don't know of anyone who spends much time using anything else but PubMed in order to scan the literature. What's not in PubMed is not seen by the community. SpringerLink, Wiley Online etc. are publisher sites. Anything halfway meaningful there will also be found in PubMed.

        The issue with the "country of origin" and "culture" has already been commented on by Fang. By searching databases, the "original" culture of the scientists is almost impossible to obtain. And where do you draw the line? Country of birth? Migratory background (country of birth of parents)? You would have to contact every first author/senior author, probably with very limited feedback. And since it was (rightly) criticised that relative numbers are more informative than absolute ones: how would you obtain the "culture" of every first author of every paper?

        I certainly agree that relative numbers are a lot more informative than the absolute figures. And tekija's comment below about low numbers (low "n"s) is certainly very true. On the other hand, it seems the study puts its main focus on the reasons for retraction, so there you do have a relative number (country A retracts mainly because of plagiarism, country B mainly because of data fabrication).

        Every study has its limitations. It is important that they are made clear. But limitations don't necessarily make it a bad study. More limited studies just tell us less; that's a general rule in science.

    2. I’m sorry, my colleague Miguel, but I do not understand your strong statement: “. . . to self-plagiarize one’s own data . . . [is] equivalent to data fabrication in terms of their effects in corrupting the scientific record . . . . [and] should be treated with the same degree of seriousness as data fabrication and falsification.”

      I do not believe they are even close. If one "self-plagiarizes" (reuses) one's own data that are authentic and accurate, then the scientific record [data] remains unchanged. Of course, one could violate the copyright of the first journal, unless the publisher agrees that there is no problem with the reuse (as for different audiences and different thrusts to the papers). And one could try to inappropriately inflate one's publication record.

      In contrast, data fabrication is a direct corruption of the scientific record [the data itself]. There is no plausible explanation nor defense to making up data, a clear violation of the standard scientific process. Thus, it is much more serious than so-called “self-plagiarism.”

      As a reflection of this difference in significance, ORI has routinely debarred from federal funding those respondents who fabricate or falsify data [about 90% of ORI findings], while ORI has generally imposed certification and supervision on those who have plagiarized research.

      [However, ORI has in fact debarred four persons solely for (major) plagiarism:
      – Imam (plagiarism of another’s whole grant application):
      http://www.gpo.gov/fdsys/pkg/FR-1997-12-18/html/97-33035.htm
      – Qian (plagiarism of images for a grant):
      http://www.gpo.gov/fdsys/pkg/FR-2000-07-05/pdf/00-16876.pdf
      – Panduranji (plagiarism and then false representation of those images in grant application):
      http://www.gpo.gov/fdsys/pkg/FR-2001-08-02/pdf/01-19307.pdf
      – Sultan (plagiarism and then made false claims about images in grant application):
      http://www.thefederalregister.com/d.p/2004-11-19-04-25648
      [eight others who plagiarized but also falsified or fabricated data were also debarred by ORI; thirteen others who were found to have plagiarized research were not debarred by ORI].

      1. Hi Alan, I will illustrate my thinking with a fictitious case of what we might call covert salami publishing. Suppose that I, together with 4 of the post-docs that I supervise, carry out a large prospective study in a sample of, say, 10,000 patients in which I examine how variables A, B, C, D are interrelated with variables V, W, X, Y, Z. The nature of my dependent variables is such that the data from variables V and W, the most interesting data in the study, are most suitable for publication in a specialized geriatric journal. So, I go ahead and publish those data from the sub-sample of elderly patients in that journal, and I do so with the one post-doc who collected and analyzed those data. Later, the five of us decide to publish the results of the entire data set, including the data that had already been published, in a more general journal. However, we do so without any cross-referencing (or perhaps with ambiguous cross-referencing) to the earlier article, thus giving the editor and readers the illusion that the studies were carried out separately and that the patient samples were independent of each other. To ensure that data from the earlier geriatric sample look different in the new paper, we employ less stringent inclusion-exclusion criteria for the later article, which results in a larger sub-sample of geriatric patients relative to the earlier published sample and therefore in somewhat different means, SDs, and Ns for that sub-sample.

        In such a scenario much of the geriatric data from the second study are being presented as new data to the scientific community (i.e., self-plagiarism). But these are really the old data! Except for the few patients that were added to make the Ns different, the 'newer' geriatric data are data that do not really exist. It seems to me that this type of deception is analogous to data fabrication.

          1. Hi Alan, the acts are certainly different. In data fabrication, the perpetrators make up the data out of nothing. In cases of covert self-plagiarism of data, the perpetrators reuse data that they themselves had already collected, but present them as new data. Those who plagiarize others' data are likewise presenting previously collected data as new, albeit data collected by others. These three misdeeds are certainly very different. On the surface, data fabrication seems to be the most severe of the three cases, followed by plagiarism of others' data and then self-plagiarism of one's own data. But, until uncovered, all three acts have the same detrimental effect on science by falsely altering the record. In all three cases, the scientific community is purposely* misled into accepting data that, in reality, do not exist.

            *I don't believe that in all cases of covert self-plagiarism of data there is an explicit purpose to mislead readers. However, I am pretty certain that such is the case in the vast majority of instances.

  4. (At the risk of repeating the ideas of Fang): I liken this study a bit to Olympic Medal counts by country alone.

    Going further than Fang, it is very important to note that a retraction appearing on PubMed does not represent the worst possible outcome for a paper…

    If I may:

    1) Published and unchanged, or corrected for editorial mistake (author’s name misspelled)
    2) Corrected for content
    3) Expression of concern
    4) Retracted
    5) Disappeared

    We know that some papers that have appeared on PubMed in the past can no longer be found on PubMed, nor can a note of retraction or explanation be found (see the illustrious case of BJH/AJ). In my opinion, this is an offense far worse than retraction (unless perhaps there was an honest editorial mistake in which the authors had no part… e.g. a secretary accidentally pushed "publish" on a manuscript under review).

    One could imagine a scenario where authors in a specific country were more likely to publish in journals that disappeared articles instead of retracting them. Importantly, this could significantly skew results, even if normalized to the total # of papers (unless a way to identify disappeared papers was used).

  5. Regarding duplicate publication, Finland was an unexpected champion in this race: “Finland had the highest rate of duplicate publication at 37.5%, followed by China at 29.4% and Tunisia at 28.6%.”

    Table 2 tells us the following:

    Finland – 8 retractions – 3 for duplicate publications
    China – 143 retractions – 42 for duplicate publications
    Tunisia – 7 retractions – 2 for duplicate publications

    I was taught that when N < 10 one should not use percentages at all but give the proportion, 3/8 – and here we go with fractions of percentages. Very misleading. The 95% CI for 37.5% – which looks deceptively precise, does it not? – is from 9% to 76%!
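
    For anyone who wants to reproduce that interval, here is a minimal sketch of the exact (Clopper-Pearson) calculation, assuming SciPy is available:

        from scipy.stats import beta

        k, n = 3, 8  # Finland: 3 duplicate-publication retractions out of 8 retractions
        alpha = 0.05

        # Exact (Clopper-Pearson) confidence interval for a binomial proportion
        lower = beta.ppf(alpha / 2, k, n - k + 1)      # ~0.085
        upper = beta.ppf(1 - alpha / 2, k + 1, n - k)  # ~0.755

        print(f"{k}/{n} = {k/n:.1%}, 95% CI roughly {lower:.0%} to {upper:.0%}")

    With n = 8 the interval spans most of the unit range, which is exactly the problem with quoting 37.5% as if it were a precise rate.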

    I looked into the 3 duplicate publications that took Finland to the top, and it turns out all were traced to one man, an emeritus professor in genetics, Petter Portin, who seemingly tried to publish one study three times as the sole author, losing all of them.

    He actually may have tried to split the findings into three papers, which ended up looking too much alike.

    1: Evidence based on studies of the mus309 mutant, deficient in DNA double-strand break repair, that meiotic crossing over in Drosophila melanogaster is a two-phase process.
    Portin P.
    Genetica. 2010 Oct;138(9-10):1033-45. doi: 10.1007/s10709-010-9489-1. Epub 2010 Aug 31. Retraction in: Genetica. 2010 Dec;138(11-12):1309.

    2: The effect of the mus309 mutation, defective in DNA double-strand break repair, on crossing over in Drosophila melanogaster suggests a mechanism for the centromere effect of crossing over.
    Portin P.
    Genetica. 2010 Mar;138(3):333-42. doi: 10.1007/s10709-009-9422-7. Epub 2009 Nov 2. Retraction in: Genetica. 2010 Dec;138(11-12):1307.

    3: The effect of the mus309 mutation, defective in DNA double-strand break repair, on crossing over in Drosophila melanogaster suggests a mechanism for interference.
    Portin P.
    Hereditas. 2009 Sep;146(4):162-76. doi: 10.1111/j.1601-5223.2009.02144.x. Retraction in: Saura A. Hereditas. 2011 Feb;148(1):50.

    Actually, two are labeled as retracted for "plagiarism", but these have been reclassified by Amos as "duplicate publications", I think, because it was self-plagiarism.

    Be that as it may, "Petter Portin" has become synonymous with "Finland", taking the snowy country to the Abstract level: "Rates of plagiarism and duplicate publication were highest in Italy and Finland, respectively". This anecdotal outcome is not evident from the paper.

    Together with the serious limitations already noted above, and given this nonchalant attitude to statistics, should one consider retracting the J Med Lib Assoc paper, which seems terribly misleading on many issues?

    1. Excellent observations and revealing (mis)calculations, tekija! Maybe the time is ripe to contact the EIC of J Med Lib Assoc with a link to this blog post and a list of the concerns about those analyses.

    2. See my comment above.

      I haven't read the original paper. If low "n"s are not commented on, that's poor science. However, I continue to dispute that poor science is a reason for retraction per se. Otherwise, the scientific world would be full of Nature and Science papers and nothing else. And as we know, even these journals are full of poor science.

  6. There is one factor that is difficult to trace. Fraudulent academics are sometimes good at covering their tracks. This cover-up ability might be a function of the country or culture.
