How much self-plagiarism, aka duplication, is too much?

Duplication is a frequent reason for the retractions we cover. Such duplication retractions are so common that we don’t get to most of them. While many have argued that duplication pollutes the literature, and can bias meta-analyses when the same study ends up being counted more than once, others say the need to come up with new ways to say the same thing is a waste of time. (That doesn’t explain why some scientists don’t just put their old words in quotes and cite them, but we digress.)

Appropriately, the Committee on Publication Ethics (COPE) is taking up the issue at its regular forum tomorrow, using new guidelines produced by BioMedCentral as a starting point. Here’s an excerpt:

Journal editors should consider publishing a correction article when:

  • Sections of the text, generally excluding methods, are identical or near identical to a previous publication by the same author(s);
  • The original publication is not referenced in the subsequent publication; but
  • There is still sufficient new material in the article to justify its publication.

The correction should amend the literature by adding the missing citation and clarifying what is new in the subsequent publication versus the original publication.

Journal editors should consider publishing a retraction article when:

  • There is significant overlap in the text, generally excluding methods, with sections that are identical or near identical to a previous publication by the same author(s);
  • The recycled text reports previously published data and there is insufficient new material in the article to justify its publication in light of the previous publication(s).
  • The recycled text forms the major part of the discussion or conclusion in the article.
  • The overlap breaches copyright.

The guidelines also suggest that editors should only go back to 2004 as long as the duplication doesn’t involve data. COPE is looking for responses to these guidelines, so leave a comment on their site. And ours too, of course.

Separately, COPE has created a discussion document on how members should respond to anonymous whistleblowers, and is soliciting feedback. We’ve taken up this issue as well.

45 thoughts on “How much self-plagiarism, aka duplication, is too much?”

  1. My recent experience of highlighting serial data re-use to journal editors is only somewhat positive. Moreover, it would seem that the result of my contacting journals may simply be serial corrections of some, but not all, instances. This seems rather unsatisfactory, particularly as current practice has one important divisive result: a strict rule for students (deduction of marks for data re-use without proper formal attribution; mark of zero for serial offenders) and a much laxer rule for their teachers (many, but not all, authors work in universities). Perhaps the retraction rod needs to be wielded more firmly?

  2. Oh well, they could have just said “The overlap breaches copyright”. Because that is the only reason why this issue is being brought up by publishers. They created the term “self-plagiarism”, trying to make it into an ethical issue similar to plagiarism.

    1. Who exactly coined the term self-plagiarism for the first time? Surely this is something that would be easy enough to discover. Surely someone wants to take credit for coining the term. I think when we start to look at the root of this term, we will start to discover strong geo-political and business / marketing influence taking place…

      1. As I noted in my ORI / AAAS Conference on Plagiarism in 1993, “self-plagiarism” is a self-contradictory misnomer (available on ORI’s http://ori.hhs.gov under http://ori.hhs.gov/sites/default/files/aaas.pdf & 3 files).

        “Plagiarism” is the unauthorized use of another person’s words, ideas or creations without giving that person appropriate recognition and citation. One cannot “plagiarize” oneself. “Self-plagiarism” is not a “valid” word.

        “Duplicate Publication” or “Text Recycling” and “Copyright Violation” for publications, when the reuse is neither acknowledged nor authorized, are good terms to use instead.

  3. “editors should only go back to 2004”

    What about older literature? It seems that many offenders will continue to enjoy achievements which were granted to them on the basis of a certain number of publications, notwithstanding duplications!

    1. Yes, good point. I can think of two groups, one who published in marketing/psychometrics, one who published in psychometrics/quant psych, and who would routinely publish the same papers 2-3 times. The important idea is to select widely spread outlets that were (in 1995) unlikely to interact. So, publishing in the Australian Journal of xxx, the Swedish Journal of YYY, and so forth was a common pattern. Plus a pattern of several articles on roughly the very same topic coming out in quick succession. Many of these people/groups were quite good and had useful contributions. It was just that the publications were all published 2x.

      1. OK. But can a paper in fact be re-published, with a declaration that it was already published 5 or 20 years ago, if it first appeared in a journal of a publisher we are simply not satisfied with? This raises new and interesting ethical questions, particularly if the authors own the copyright. I hope we can get some copyright experts’ opinions here on this one…

  4. It’s good to see that they are excluding methods, because I once got caught up in a “self-plagiarism” issue over a methods issue. However, another problem was that although I myself had written the words in question, I had slightly different coauthors on the other paper. The journal can’t tell if you are plagiarizing yourself or plagiarizing one of your previous coauthors. I learned my lesson and now just move the words around even though they mean the same thing.

    1. “and now just move the words around even though they mean the same thing”. I understand, but is it the best way to use a researcher’s valuable time? Shuffling words around in the methods section even though it means the same thing? A methods section should convey the methods clearly. Once you have such a section, there is absolutely no reason why one should change it (other than because of publisher-imposed rules that make no sense).

    2. If you have an established method it should be enough to cite the “protocol paper” or “first use article” – shouldn’t it? Though I acknowledge that it would mean that the reader often would have to find another paper to fully evaluate the work…

  5. Methods are a difficult area. For the rest of a paper, then surely it has to be original? After all, a student cannot submit the same piece of work to different assessments, so why should a professor be able to?

    1. What about an expert review on a particular topic? Journals have started to retract reviews as well, because they contain self-plagiarised as well as plagiarised parts from other reviews. Moreover, reviews do not contain experimental data.

      1. My own experience is that, for a review, if you’re writing honestly, then you can never really write the same article twice. Either your perspective will have changed as the field continues to develop, or else you’ll be writing to a different audience for whom different aspects of the same ideas should be emphasized or who will need different things explained.

        1. My own experience is, for a review, if you are invited to do one you do it. Many times you don’t want to do it but a colleague asks you for one for a book they are editing. You just did two last month for a different book. Therefore, you try to minimize the time it takes you to do it by not wasting hours on pointless word combinatorics, but simply by rearranging sections within a new context.

          1. If you’re pumping out a dozen nearly-identical “new” reviews per year, then I would view that as seriously problematic from an ethical standpoint. Juggling words around to make it not plagiarism wouldn’t help: you’re still puffing up the literature with duplicate material.

            If you’re being asked to provide material for a textbook or something else where the work doesn’t need to be novel, then obtain permission, write “republished from” or “adapted from” and reuse the old material directly with acknowledgement of the original source. That’s perfectly ethical, honest, and even easier to do.

  6. On this topic, it is always worth reading Pam Samuelson’s essay on self-plagiarism.

    She writes law & technology & intersections papers, not science, but it illustrates the issues in (reasonable) re-use/rework of material for very different audiences.

  7. I don’t understand: are the quoted new guidelines BioMedCentral’s work or COPE’s variant of it? But that wouldn’t matter; the guidelines are horribly wrong. I mean – horribly written, and that, for guidelines, is the same as being horrible. A good scientist would not write something like this. AllOutWar, above, said: “I think when we start to look at the root of this term [self-plagiarism], we will start to discover strong geo-political and business / marketing influence taking place…” Exactly the same I think is the origin of these guidelines – the global origin. There must be an All Out War on such profanation of science.

    1. I don’t agree that the problem at hand has anything to do with copyright or “self-plagiarism”.

    2. I dare think that it has everything to do with global illiteracy in science: authors no longer able to write papers. The passages repeating previous work must be written in this manner: “In the previous paper [..], we showed (discussed, proved, measured, proposed, etc.) …………..” And you either give a quotation or say similar things. What’s wrong with talking to your readers in a simple manner? I know what’s wrong – authors are globalising themselves. They feel the need to be above normal human language.

    3. And what are the feelings of the “committees”, “secretariats” and all the huge (and growing) bureaucracy of ethics and integrity? O, they want much more, they want (as every law-making and decision-making agency) to include in their law a clause allowing them to always have full freedom to make discretionary decisions, i. e., in one case, to say “we do not believe that the authors gave enough of new material” and in another case to say: “we believe that the authors have given enough of the new material”. Please, note here that no one except BioMedCentral, COPE and their global equals would have the right to contradict. That is the beauty of a discretionary decision for the committees and the horror of it for the scientists.

  8. Some of this brings to mind patent law issues. Talking to patent lawyers a while back, I was told that until a few decades ago it was common for big corporations to argue that some inventions were not such because, after all, they used relays, transistors, etc., all well-known electronic components. It was only after a certain point that lawyers started making the counterargument that these components were put together in novel ways and that was what counted as the invention. Similarly, for papers, I think of paragraphs as modules, essentially, with minor tweaks to account for context. I see no problem in reusing modules in different papers, if the whole is different (different scope, different organization). Programmers do that all the time. I think this discussion has been hijacked by lawyers and businesspeople and that many scientists are just internalizing these arbitrary rules, instead of realizing they make no sense.

  9. The whole concept of “Self-Plagiarism” (for duplicate publication) is nonsense. The term doesn’t even make sense. It is an oxymoron.
    Why create a problem by using meaningless words? It is a copyright issue and that is it, not academic misconduct etc…

    1. It most certainly is academic misconduct when you present something as new while it has already been published before, unless it is specifically mentioned that this is a duplicate publication. It’s artificial inflation of your output, and since your publication list is an often-used parameter to determine such things as promotions and getting funding, it amounts to unfair competition.

        1. Self-plagiarism and its close cousin, salami slicing, distort the literature and contribute to the background noise of anybody trying to find information. I hate it when a group publishes virtually the same material over and over again, because they’re taking up space that could be used by actual new voices and new ideas. Don’t even think of claiming that space is unlimited, else why would there be such a thing as a top publication venue?

          1. What space? Well, I happen to submit regularly to journals that have to reject up to 75% of all submissions due to limited “space”. They have print editions, which often have a defined maximum number of pages per year.

            Of course, the option is to then go to journals that have much more “space”, but those are often also the journals that have the smallest signal-to-noise ratio.

          2. Yes to a significant extent it comes down to that rather outmoded concept of quality. Most scientists I know and work with try to produce high quality impactful work and to publish this in good quality journals. Writing what is essentially the same review again and again seems entirely pointless – what is the value in that other than maybe an ego boost or CV padding? If a piece of work is of value it will be noticed and get its appropriate recognition (e.g. in terms of citations); not sure what’s to be gained from publishing it a second or third time…

            Interestingly (and tediously) here in the UK we’re in the latter stages of one of the periodic Research Assessment Exercises (aka REF) where one’s individual research impact is assessed on the basis of one’s 4 best publications over a 4 or 5 year period. So the fact that one might have written several versions of a review and a huge bunch of papers in crappy journals with “lots of space” (!) counts for nothing. That should be the case for assessing publication records in search/appointment committees too. Science as a whole would be more productive if people only wrote papers/reviews when they had something truly novel to say/report. But perhaps that’s an old-fashioned viewpoint!

        2. Copyright just makes it easy to make a decision to retract: the other journal knows that it can cost them money if the first journal (or rather, the Publisher) goes to court.

      1. You are automatically assuming a “guilty mind” (mens rea) with regard to duplicate publication. It is a very narrow interpretation, mainly propagated by publishing houses. Why is it difficult to accept that it can be driven purely by dissemination of knowledge? Authors own the work and they have the right to disseminate it to a larger audience as they see fit. There are several justifications for duplicate publication and it is not driven solely by profit (unlike publishing houses!).
        But I do still agree it is a copyright issue.

        1. Self-plagiarism in my view involves deliberately disguising that something has been published before. There may be justifications for duplicate publications, but if the audience is not informed that it *is* a duplication, it constitutes self-plagiarism: presenting something as new that isn’t.

          If the duplication is openly indicated, it is not self-plagiarism.

          Copyright is in my opinion the least worrisome part.

          1. Some people don’t reuse their own text under any conditions for whatever reason. That is their choice. However, they don’t like it if others do. The solution is obvious.

  10. Possibly old news to many of you, but the text-matching website eTBLAST (http://etest.vbi.vt.edu/etblast3/) checks for possible plagiarism well. I use it on my own papers (out of neuroses about duplication). It’s also fun to run suspect papers through if you have free time.

    I am in no way affiliated with eTBLAST. I learned of it when its creator (Skip Garner, I think; of course credit for creation also due others in his group) gave a talk where I work.

  11. Fact 1: There is a finite number of ways to say something in good and concise English.
    Fact 2: The number of published papers increases constantly.
    Conclusion 1: At some point, it will be impossible to write anything without plagiarizing some other paper, whether intentionally or not.
    Conclusion 2: Publishers need to cool off and find other ways to make money. As scientists now start holding the Copyright to their papers, things are beginning to change.

    1. I recommend a stiff dose of combinatorics.

      Even if you restrict yourself to only Basic English (http://en.wikipedia.org/wiki/Basic_English) an extremely conservative estimate gives around 10 to the 20th available short sentences of the same length as my first sentence in this post. We are in no danger of running out of novel 100-word abstracts.

      1. I said “good and concise English”, not random combinations of words. Plus, if they mean the same thing, there is no reason to write different sentences to convey it.

        1. If we were only considering random combinations of words, the number would be 10 to the 30th, not 10 to the 20th. And that’s only for a single short sentence. Human language is so expressive that it can effectively say the same thing in a good and concise manner in a remarkably large number of ways.

          The odds of somebody accidentally reproducing somebody else’s paragraph to the precision required for a plagiarism charge are remarkably small for anything except the most formulaic methods sections (which are generally specifically exempted).

          1. If people had no memory and they talked using random combinations of words, the odds would be low indeed.

  12. Copyright issue again? It’s not funny to uphold this purely commercial right that has only one thing related to science – its destroying effect on science, barely a month after this young man hanged himself over the “copyright issue”. Every journal office should have his portrait at the entrance.

      1. So, presumably the PLoS family, which are reasonable in some areas and rather poor in others.
        What else would you recommend?

        There’s an infinite number of people sending me notes asking for me to pay $1000 to put something in their new journal, but I wouldn’t want to taint my reputation by doing so.

        Conferences, for those fields where they are significant, are even worse, as you’ve often got a strict upper limit on the number of talks that can be accommodated in the program.

        1. The problem is that when you go into a new sub-field with few published papers, you can’t avoid duplication.
          Say for instance you analyze a new concept with 5 published papers and a 2-3% growth rate.
          You can’t publish 3 or more papers in this field without recycling your literature review in a significant way.
          Say you treat different types of fungi with CO2 and monitor their growth patterns.
          You have 6 experiments and 5 papers for your literature review.
          Ultimately they should discard literature review from the duplication process.

  13. So unless I have missed it in the comments above… what about the reader? If I was paying for two “different” papers, only to discover they were essentially the same and the author was recycling his/her own work, I would be thoroughly hacked off…

  14. I have noticed a case where a professor published the same paper, with only minimal variations, in five different publication venues, from 2005 to 2012. The University (University of Arkansas) was informed. The official in charge of research integrity stated that no “research misconduct” had occurred. Duplication is not considered “research misconduct” by the definitions of the federal Office of Research Integrity. The ORI is concerned with data falsification and such, which is certainly more serious than padding one’s CV by duplication. The federal office cannot be expected to police every aspect of academic integrity, but Universities certainly are expected to take a more comprehensive view of academic honesty. Ironically, they expect their students to live up to high standards of academic honesty that faculty (in this case at least) are actually not subject to. A student can be punished for submitting the same essay in different classes. For faculty members, at least at UA, there are no such rules!

    The editors of all the journals were informed by May 14. Only one responded initially. He was satisfied with the author claiming that the papers were actually distinct and focused on different aspects of the study. They were not. (All papers have identical summary-of-results sections.) The journal finally retracted (http://www.scirp.org/journal/PaperInformation.aspx?PaperID=22024).

    On July 15, a representative of the publisher (Taylor and Francis) announced that an investigation would be conducted, which might take “two months or more”. The whole development sadly shows how little editors and academic officials still care about duplication. The COPE guidelines have proven to be very helpful in convincing editors to take the issue more seriously. I recommend quoting extensively from the guidelines whenever you are in a position to discuss duplication issues with editors.
