Two major publishers will remove more than 120 papers created with random paper generator SCIgen, according to Nature.
Richard van Noorden, who has the scoop, reports:
Over the past two years, computer scientist Cyril Labbé of Joseph Fourier University in Grenoble, France, has catalogued computer-generated papers that made it into more than 30 published conference proceedings between 2008 and 2013. Sixteen appeared in publications by Springer, which is headquartered in Heidelberg, Germany, and more than 100 were published by the Institute of Electrical and Electronic Engineers (IEEE), based in New York. Both publishers, which were privately informed by Labbé, say that they are now removing the papers.
Although it’s unclear who submitted the papers, or why, it’s hard not to see the revelations as the yin to the yang of John Bohannon’s sting of open access publishers that appeared in Science in October. Bohannon, posing as a fake academic, got half of a group of more than 300 journals to accept fake papers. From today’s Nature story:
Labbé emphasizes that the nonsense computer science papers all appeared in subscription offerings. In his view, there is little evidence that open-access publishers — which charge fees to publish manuscripts — necessarily have less stringent peer review than subscription publishers.
Indeed, as we and many others pointed out at the time, Bohannon didn’t include any traditional journals. As we noted:
…Retraction Watch readers may recall that it was Applied Mathematics Letters — a non-open-access journal published by Elsevier — that published a string of bizarre papers, including one that was retracted because it made “no sense mathematically” and another whose corresponding author’s email address was “[email protected].”
With about 500 retractions per year in 2012 and 2013, these 120-plus — if they show up as retractions in databases — could make 2014 another record-breaking year.
Bonus: We’ve authored a paper on SCIgen, “A Case for Von Neumann Machines.” Go ahead, cite it, help Ivan’s h-index!
Hat tip: Allison Stelling
I don’t know what will happen in the case of Springer, but there’s a line in the story about what IEEE have done: ‘the web pages for the removed articles give no explanation for their absence.’
For example, here’s one (the paper ‘TIC: a methodology for the construction of e-commerce’, mentioned in the story): http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6626010&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6626010
As you can see, it just says ‘page not found’. (Although, this paper still has a DOI and is indexed in Scopus, last I checked).
If all the papers are treated like that, then none of these show up as retractions. They simply vanish. Then again, they are entirely worthless nonsense.
The paper is still listed in the Table of Contents pdf for the 2013 QR2MS Conference (p. 33):
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6625522
The proceedings run to around 2,170 pages, and maybe 500 papers!
Labbé found “more than 120 papers in more than 30 proceedings”. If even a few are as big as QR2MS, that’s a pretty low rate of fake papers.
Strange, but wow! This may help me understand why some extremely flawed papers get published.
Computer generated papers? We know who did it. Cylons.
Computer generated paper is an interesting term, and while it currently equates with garbage and “randomly derived paper” I think there should be a better term for what’s been done here.
Consider: The term computer derives originally from work done by humans sitting in a room. The “computers” worked on things like calculating angles and velocities for projectile weapons during wars. If a computer were to write a paper, why not publish it? Many grad students are in fact glorified computers.
Second, more importantly… while these particular papers are apparently junk and not authored by their claimed author, it’s quite conceivable that a (silicon) computer will, at some point, author a very fine scientific paper without much aid from a person. Perhaps the best papers in 50 years will be computer generated entirely.
Finally, humans can also generate crap papers randomly using the same algorithm that these computers did… perhaps even truly randomly (with a bunch of D20s!) while the computer can only do it pseudorandomly.
“Many grad students are in fact glorified computers.”
Computers are great at doing boring number crunching that the human brain is terrible at anyway (addition, subtraction, long lists of numbers, matrix work, etc). But, they rarely arrive at “elegant” mathematical solutions for things like… differential equations, abstract algebra, topology… those sorts of things. (Far beyond my ken!)
Computers will likely replace a lot of grad students- just look at PCR! Used to be someone’s thesis work to clone a gene, nowadays companies do it!
Computers can, will, and ought to take over a lot of things: but, hopefully, this will lead to a place and a type of society where “work” is not an absolute requirement for having essentials like “health care”. (I may be being a bit optimistic, here.)
Remembering that William Gosset, who wrote under the name “Student”, worked at the Guinness brewery in Dublin for most of his life, an email address at Budweiser is not utter nonsense.
The apps.pdos.lcs.mit.edu link is not working from here…
Fixed, thanks.
Ivan – am curious about Figure 4 – it appears egregious at first glance, rather than just impossibly obscure as the other figures and most of the text.
How are the figures generated, in SCIgen?
More about SCIgen here: http://pdos.csail.mit.edu/scigen/ All it asks users to input are names of authors.
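For the curious: SCIgen works by expanding a hand-written context-free grammar, splicing the user-supplied author names into the output. The following is a toy sketch of that idea in Python, with a made-up five-rule grammar (nothing here is SCIgen’s actual rule set, which is far larger and also drives the figures and graphs):

```python
import random

# Toy SCIgen-style context-free grammar (hypothetical rules, not SCIgen's own):
# each nonterminal maps to a list of possible productions.
GRAMMAR = {
    "SENTENCE": [["We", "VERB", "that", "the", "ADJ", "NOUN", "is", "ADJ", "."]],
    "VERB": [["demonstrate"], ["argue"], ["confirm"]],
    "ADJ": [["scalable"], ["Bayesian"], ["wearable"]],
    "NOUN": [["algorithm"], ["methodology"], ["framework"]],
}

def expand(symbol, rng):
    """Recursively expand a grammar symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: a literal word
    words = []
    for sym in rng.choice(GRAMMAR[symbol]):  # pick one production at random
        words.extend(expand(sym, rng))
    return words

def generate_sentence(seed=None):
    """Produce one grammatically plausible, semantically empty sentence."""
    words = expand("SENTENCE", random.Random(seed))
    return " ".join(words[:-1]) + words[-1]  # attach the trailing period

print(generate_sentence())
```

Every run yields a different sentence that parses fine and means nothing, which is exactly why these papers sail past keyword-based screening but collapse under any actual reading.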
This bit reads funny “Bohannon, posing as a fake academic, got more than 150 papers accepted at more than 300 journals.”
Unless he submitted to 300 and only 150 accepted, or he got each paper accepted in 2 journals, it just sounds odd. I thought he got the same paper accepted into a ton of different journals, and he targeted 300 overall, but not all accepted the paper? Either way, as written it seems incomplete.
Fixed, thanks.
The Nature piece states “published conference proceedings”. These are in many ways more like open-access journals than like traditional subscription journals. They are effectively pay-to-publish venues (just like most open-access journals and unlike subscription journals). The payment is just called a “registration fee” rather than an “article processing charge”.
I do not see why it is amusing (although I did chuckle) that the address is “[email protected]”. For the non-statisticians here, the t-test was originally presented by “Student”, the nom de science of W. Gosset. He worked at Guinness, and was not allowed to publish work-related data. His research is very important. Industrial chemists, statisticians, and scientists have made and will make important contributions.
Linked to IEEE, it is interesting to note how companies like Editage proudly trumpet the names and papers of authors whose papers they have revised, including one by Kazuo Yamamoto (Analytical surveys of transient and frequency-dependent grounding characteristics of a wind turbine generator system on the basis of field tests), published in IEEE Transactions on Power Delivery, boasting an impact factor of 1.208. See here:
http://www.editage.com/papers-published/environmental-science-energy-and-sustainability
I wonder a few things:
a) How do IEEE publications’ impact factors get penalized, if at all, by Thomson Reuters, in cases such as this, where quality control clearly broke down or simply didn’t exist at the time of “peer” review?
b) If such PC-generated papers can be generated so easily, but appear to contain so much nonsense, then can IEEE please provide the peer reviewer reports of those 120 papers so that we can see what kind of “peers” approved these studies? The same applies to Springer.
c) Do papers that have used Editage acknowledge this company in the acknowledgements? If not, then this would be a serious case of ethical breach by authors and publishers (the non-declaration of guest or ghost authors, i.e., companies that have edited the paper, but have not been declared).
d) This is a case where a PC was used to generate false papers. It is scary to imagine what humans can do to manipulate and generate false data.
The fraud is so wide, and so broad, that one has to wonder whether retractions will even be a solution to fix this mess.
This links to a deeper discussion on ghost authorship and/or the failure to acknowledge those who have provided any material support, and how it is considered to be highly unethical by many publishers:
http://retractionwatch.com/2014/01/21/is-it-ethical-to-ghost-write-a-paper/#comments
Considering this fact, one would have to now examine, in detail, the Acknowledgements of ALL papers listed by Editage here:
http://www.editage.jp/files/journal-accepted-test.csv (open access after downloading, no idea of the date and validity of the data), but enough to initiate an investigation.
The key question is, if one finds a paper that was edited by Editage, and if Editage was not acknowledged, then can a case for retraction be made based on false declarations by authors and the existence of ghost authorship?
Hello, I’m writing on behalf of Editage. We have checked all possible avenues and haven’t found any evidence that indicates that the paper you mentioned, namely, “Analytical surveys of transient and frequency-dependent grounding characteristics of a wind turbine generator system on the basis of field tests,” deserves deeper public scrutiny. Further, it is definitely not one of the gibberish papers that IEEE has retracted.
As a company, we value ethical compliance without compromise, and take all practical measures to systemically implement good publication practice. Please view Editage’s commitment, publication ethics guidelines, and various resources we provide to educate authors about good publication practice:
http://www.editage.com/publication-support/ethics
http://www.editage.com/ethics-guide
http://www.editage.com/insights/categories/publication-ethics
We also provide a certificate of editing to all our authors and encourage them to submit this to the journal along with their manuscript. Note that language editing does not amount to ghostwriting, nor does providing suggestions to improve the presentation of a manuscript. In fact, a large number of journals and publishers understand the difficulty that non-native English-speaking authors face in getting published in international journals, and now recommend that these authors get their manuscript edited by one of the many available commercial editing companies before submission. While we encourage our authors to acknowledge their use of Editage’s service to improve their writing, such acknowledgement of editing services is not mandated by standard publication ethics guidelines, such as those of the ICMJE (http://icmje.org/) and COPE (http://publicationethics.org/).
If there’s anything we’re unaware of that would support the point you are making, we would request you to let us know so that we can look into the matter. I must reiterate that our sole objective in reviving this thread is to strengthen our internal processes in order to stringently detect cases of ethical misconduct.
Clarinda Cerejo,
Managing Editor, Scholarly Communications
Editor-in-Chief, Editage Insights
Editage
Dear Clarinda, nice to hear from you and thank you for taking the time to respond and to provide some background about your company’s stance on ethics. Do you request your clients to acknowledge Editage in the acknowledgements section? For example, did the Yamamoto paper acknowledge your company in the acknowledgements? Can you indicate, seeing that you are clearly aware of the final outcome of all of your clients’ papers, the exact percentage of papers that have actually acknowledged Editage in the Acknowledgements? A few links to specific papers with an acknowledgement to Editage, let’s say 5 or 10, preferably in open access format, would then serve as a good example, to appease my criticism of Editage (and other English revision companies).
As I mentioned before, we do encourage authors to acknowledge Editage, but since this practice is not mandated by journals or publication ethics guides, we do not follow up to ascertain the acknowledgment. A simple Google search could help you identify papers where authors have chosen to acknowledge us, or other editing companies for that matter. We see no worth in pursuing any further analysis or debate on the matter, as we are clearly digressing from the thrust of the original post about the retraction of nonsense papers.
Good points above.
The whole mess is sad because IEEE was (and still is, I might add) a highly respected institution, not just as a publisher for engineering and computer science stuff, but also as a standards body and whatnot. If such a huge amount of garbage gets published, the whole computer science practice of conference-driven publishing should take a long look in the mirror. As Ivan hints in the story, if this is the truly record-breaking quality level, what is the point of even publishing these conference proceedings? No one should read them after this, since the probability of human-generated bad stuff is too high. Or at least read very, very carefully, take your pick.
What interests me most is whether these “Conference proceedings” are claimed to be peer-reviewed. I have always assumed that proceedings do not count as peer-reviewed and that they usually will publish anything if the keywords are somehow related to the conference topic (as obviously happened in the above cases). But some proceedings seem to claim that they are fully peer-reviewed. In publication lists, proceedings are usually grouped apart from peer-reviewed journals but some authors group them together to boost their output. Also I wonder how often proceedings papers are turned into regular publications, and whether this would be considered ethically permissible.
I would be interested to know what others think about these questions.
Uarktransparency, I wish to provide comment in response to your query.
In the case of the highest-profile proceedings in the horticultural community, Acta Horticulturae, published by the International Society for Horticultural Science, the claims are pretty clear:
http://www.ishs.org/faq/acta-horticulturae-sound-peer-reviewed-journal
In the plant sciences, societal meetings, and the proceedings they generate, are generally rigorously peer reviewed, so to find cases of duplication or plagiarism would reveal massive porosity in the system, and a great failure by the editors.
Just in case the ISHS decides to wipe out that page, I will copy, verbatim, the entire text here, so that we have an official copy (although I also keep a screen-shot as evidence):
“Is Acta Horticulturae a sound peer reviewed journal?
Answer:
Editorial policy of the ISHS regarding Acta Horticulturae® (ISSN 0567-7572)
•Acta Horticulturae exclusively contains research which has been presented at an ISHS symposium or at the ISHS Congress
•All papers, either oral or poster contributions, are subject to preliminary evaluation by the Scientific Committee of the meeting concerned. Therefore abstracts are necessary and contributors must provide their abstracts before they register.
•Contributions are presented at the meeting before an audience of peers.
•Final contributions are submitted by the Editor(s) to the Editorial Board for scientific review.
Each symposium has an Editorial Board. The Editorial Board includes the most eminent researchers in that particular field of research and this to guarantee a consistent quality of the scientific review process. Acta Horticulturae has an history of over 50 years in independent, non-commercial, high quality scientific publishing. ISHS strongly believes in the value and impact of presenting before an audience of peers followed by a screening process by an Editorial Board which is composed of a selection of the most eminent scientists active in the particular field/subject of horticultural science.
Acta Horticulturae is included in the Thomson Reuters Web of Science® Conference Proceedings Citation IndexSM. For several – mainly publishing-technical reasons, Acta Horticulturae, like other series of proceedings is currently not covered by Thomson Reuters in the Science Citation Index (SCI); consequently the series has no Impact Factor following the standard procedure used by Thomson Reuters (formerly ISI). Nonetheless Acta Horticulturae is a sound scientifically reviewed proceedings series.”
In this case, uarktransparency, the person in charge of publications at the ISHS, Prof. Yves Desjardins (http://www.ishs.org/ishs-board) has stated publicly, and very emphatically, “ISHS hold its authors to high ethical standards. When we come across cases of duplication, we are acting promptly and retracting the papers swiftly. I also thank you for pointing out the other case of duplication of results. I will inquire right away and request a retraction by the authors.” in response to the duplicated paper by Hoshino et al.:
https://www.jstage.jst.go.jp/article/plantbiotechnology1997/15/1/15_1_29/_article
Plant Biotechnology Vol. 15 (1998) No. 1 P 29-33 (open access)
Production of Transgenic Grapevine (Vitis vinifera L. cv. Koshusanjaku) Plants by Co-cultivation of Embryogenic Calli with Agrobacterium tumefaciens and Selecting Secondary Embryos
Yoichiro HOSHINO, Yan-Ming ZHU, Masaru NAKANO, Eikichi TAKAHASHI, Masahiro MII
The duplicate paper (text, data and tables) is (except for the figure):
Hoshino, Y., Zhu, Y.-.M., Mii, M., Takahashi, E. and Nakano, M. 2000. TRANSGENIC GRAPEVINE PLANTS (VITIS VINIFERA L.) PRODUCED BY SELECTING SECONDARY EMBRYOS AFTER COCULTIVATION OF EMBRYOGENIC CALLUS WITH AGROBACTERIUM TUMEFACIENS. Acta Hort. (ISHS) 528:361-366
http://www.actahort.org/books/528/528_51.htm
In fact, the ISHS already has taken decisive measures to clean up house with one retraction by Van Eeckhaut and Van Huylenbroeck:
http://www.actahort.org/books/961/961_15.htm
So yes, we should hold proceedings and other journals that claim to be peer reviewed and of “sound academic quality” fully accountable. I hope this case study may be of use. See more discussion on this here:
http://retractionwatch.com/2014/01/07/journal-dumps-grain-paper-for-controversial-data/#comments
Thanks for that interesting quote. It doesn’t really answer the question though. “Final contributions are submitted by the Editor(s) to the Editorial Board for scientific review.” It doesn’t say that there is actual peer review. Since you seem to be in the field, maybe you have personal experience with that review process?
Where I come from (analytical spectroscopy), conferences are *not* considered peer reviewed.
To get more specific: if you’re a chemistry Master’s student, listing a conference abstract is nice and is a way for the Feds to keep track of the money they spend on student education.
However, it won’t get you a job. You need a full paper authorship (preferably 1st author, of course)- with data and figures you can explain *without help* from your PI- to get a reasonably stable, middling position in an industrial lab. (Preferably, a couple of papers- the times are gettin’ tough, here.)
So: no, conference proceedings aren’t the full peer-reviewed article. Some do make you submit a full article (SPIE http://spie.org/x2584.xml does this); which I personally count as “pseudo-peer review”. I’ve noticed the process varies considerably between fields.
There’s of course a distinction between conference abstracts and proceedings. In the case of proceedings publishing papers, I am curious about this: is it normal for proceedings papers to end up in real journals? Is that even considered permissible? How is this handled?
With some fields, the proceedings *are* the “real journals”. The peer review happens during the conference, in a sense.
The conferences I’ve been to tend to treat the abstract and poster/talk as a “teaser” for the full blown article; which usually goes through the society’s journals in any case.
“The peer review happens during the conference, in a sense.” Doesn’t make sense to me. Conference attendees can’t judge the quality of the work from a 10 minute presentation.
If the proceedings are considered real journals, then a proceedings article cannot be also submitted to a journal. But then the authors need to get credit for the proceedings article (whereas there should be no publication credit for a mere abstract). How is that credit apportioned if there hasn’t been proper peer review? How is that dilemma solved? I pose that question to the whole community.
“Conference attendees can’t judge the quality of the work from a 10 minute presentation.”
This really depends on the size of the conference. Many sub-fields do spend a lot of time with the students at poster sessions, and many talks are more like 20 to 30 min. But you’re right: it’s hard to tell data quality without the full blown manuscript, plus perhaps the raw data itself.
Also: your phrase “publication credit” is very interesting. What does “count”- and how does this tie into authorship and science evaluation- and funding, for that matter? How do we get credit to the right students and scientists in an efficient and accurate manner, now that science “output” is rapidly escalating? (It’s more like “peer skimming” at this point, there’s so many papers!)
As someone who has published in several IEEE conferences, let me interject with some information that may help to understand the context of these retractions.
Many of the conferences published by IEEE are indeed rigorously peer reviewed. This is standard practice for good quality computer science conferences, for instance, including but not limited to those published by IEEE. In fact, the peer review process of many computer science conferences is as rigorous as (or even more rigorous than) the peer review process of many peer-reviewed journals in other research areas. This makes sense if one understands that the papers published in these conferences typically contain substantial research results — as substantial as (or more substantial than) the results typically published in many journals in other areas. This publication model is rather different than the publication model with which many scholars in many other disciplines are familiar, which can lead to confusion such as is seen in this thread when scholars from other areas assume that the publication model is more or less the same across all disciplines.
However, while good quality computer science conferences (and journals) do have rigorous peer review, not all conferences (or journals) are good quality. In these conferences (and journals) there may be lax peer review or none at all. Nonetheless, they typically claim to have peer review because it is the standard in the field.
This leads me to a final point regarding IEEE. In my experience, IEEE sponsored conferences that claim to have peer review actually do have rigorous peer review. However, IEEE has taken to publishing conference proceedings from conferences that it does not sponsor. This gives these other conferences a publishing house and gives IEEE an additional revenue stream. It also gives these other conferences (and authors who publish in them) an aura of legitimacy — “Look, we are associated with the highly reputable IEEE!” — that may not be deserved in all instances. Without looking closely into the matter, it isn’t always easy to know which conferences are IEEE sponsored and which are merely published by IEEE. My suspicion here is that very poor quality conferences published by IEEE but not sponsored by it are most likely the source of these bogus papers. Sadly, IEEE seems to be trading its good name for dollars and, in the process, causing collateral damage to science and engineering.
This article reminded me of this cartoon:
http://www.phdcomics.com/comics.php?f=1417