Challenge accepted: A reader wrote a program to find fake references in books

Following our coverage this summer of a book with citations that did not exist, we asked you to send us examples of other books with similar issues. One reader took the request as an assignment to find problematic texts.

Michał Wójcik, a Ph.D. student at the Free University of Berlin, saw a link to our article about the book on LinkedIn. “I started thinking that it shouldn’t be that hard to check those references automatically,” he said. “I decided to just spend some time on it, and I had a prototype in a few hours.” 

The Python script he wrote goes through a book's references and checks whether each citation's DOI exists in Crossref. He told us he manually checks citations the script can't identify by looking in other databases and searching Google Scholar, which takes him between an afternoon and a whole day.
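We have not seen Wójcik's code, so the following is only a minimal sketch of how such a check could work, not his implementation. It assumes references arrive as plain-text strings, pulls out anything that looks like a DOI, and asks Crossref's public REST API whether a record exists. The function names, the regular expression, and the User-Agent string are our own illustrative choices.

```python
import re
import urllib.error
import urllib.parse
import urllib.request

# Loose DOI pattern (an assumption, not an official grammar):
# "10.", a 4-9 digit registrant prefix, "/", then a suffix without spaces.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def extract_dois(reference: str) -> list[str]:
    """Return DOI-like strings found in one reference string.

    Trailing punctuation is stripped because a DOI at the end of a
    sentence often picks up a stray period. This can also clip the rare
    DOI that genuinely ends in ")" or ".".
    """
    return [match.rstrip(".,;)") for match in DOI_PATTERN.findall(reference)]


def doi_in_crossref(doi: str, timeout: float = 10.0) -> bool:
    """Return True if Crossref has a record for this DOI, False on a 404.

    Other HTTP errors (rate limits, server errors) are re-raised, since
    they say nothing about whether the reference is real.
    """
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    request = urllib.request.Request(
        url, headers={"User-Agent": "reference-check-sketch/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

A missing Crossref record does not by itself mean a reference is fake: not every DOI is registered with Crossref, and many legitimate works have no DOI at all. That is why the manual follow-up Wójcik describes would still be needed.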

To pick the books, he said he searched for books published in 2025. “To various degrees of certainty, I checked 22 books,” he said. The two he sent to us “were particularly bad, I would say.”

The first of the two texts is about urban planning for sustainable smart cities. Some of its citations contain substantial errors, and others point to works that do not exist. The second, on energy storage, describes different technologies and their applications for electrical grids, and was translated, citations and all, into English by artificial intelligence. Both were published by Springer Nature, which also published the book we previously wrote about.

As we described in our coverage earlier this year, large language models like ChatGPT often generate nonexistent and error-prone citations. Publisher guidelines often forbid wholesale generation of text by AI. However, AI may be used in other aspects of academic book publishing, like copyediting or translation, sometimes with disclosure requirements.

While Springer Nature started using AI to translate books experimentally in 2023, it is not the only publisher pursuing AI translations. In March, the publisher Taylor and Francis announced plans to start using AI to translate books into English. 

The publisher has retracted individual chapters and entire books for references that could not be verified. In total, the Retraction Watch Database contains over 240 books with one or more chapters retracted for a variety of reasons by their publishers.

Wójcik’s script identified 40 references missing DOIs in the urban planning book, Urban Morphology and Sustainable Smart Cities. We looked into the book ourselves by checking the first 32 citations and were unable to verify 11 of them. Four of these cite documents from the Indian government that have since been taken offline. We contacted the listed authors of the remaining seven works, four of whom responded and confirmed they did not write them or there were substantial errors in the citation. 

“I am not sure if this is a fake citation of mine or another author,” Bogdan Ibanescu, a researcher at the Centre for European Studies, told us. Ana Lavalle, an author listed in another citation, confirmed she did not write the paper she was credited with. 

Similarly, Manolya Kavakli-Thorne, a professor at Aston University in England, confirmed she did not write an article matching the one cited with her name. She told us one of the listed coauthors is her former Ph.D. student with whom she published multiple papers. “The names involved as authors are from one of those publications,” she said. “I believe this is an example of hallucinations of deep learning.”

Other citations had unusual errors. One citation correctly included an article’s title and journal, but added two extra authors. “That article was written by me only. Not sure why other authors were listed,” Jack Ahern, a professor emeritus at the University of Massachusetts Amherst, told us.

One of the authors of this urban planning book, Gouri Sankar Bhunia, is an expert in remote sensing and Geographical Information Systems, according to the book’s preface.

“The citations in question may have arisen due to inadvertent oversight during the compilation and cross-referencing of multiple sources, particularly while integrating notes from various drafts,” Bhunia told us by email.

When asked if generative AI was involved in the writing process, Bhunia said, “we have made very limited use of an LLM tool (such as ChatGPT) in this book, and only for minor purposes such as checking grammatical consistency and verifying citation formats. The use was strictly confined to copyediting support and did not involve generating substantive content.” 

Such use is permitted within Springer Nature’s AI policies, although the company states: “in all cases, there must be human accountability for the final version of the text and agreement from the authors that the edits reflect their original work.”

Bhunia said all the citations in the book “were prepared and verified by academic professional[s]” and not generated by AI. “While I am aware that citation inconsistencies can sometimes appear in manuscripts, I can assure you that any such errors, if present, are unintentional and purely human in nature.”

A spokesperson for Springer Nature told us the publisher was already investigating both books before we contacted it, “with the majority of enquiries having been completed at the end of August.” For the urban planning book, “a correction is being issued alongside the retraction of three chapters,” the spokesperson said.

The second book, Electrical Energy Storage Technologies and Applications, was published in Chinese in 2020 and translated into English this year. The preface of the book states, “the translation was done using artificial intelligence. A subsequent revision was performed by the author(s) to further refine the work and to ensure that the translation is appropriate concerning content and scientific correctness.”

The citations are in English despite primarily referencing Chinese-language sources. Many do not include DOIs or functioning hyperlinks to the original sources. We were unable to find many of the original works from the references listed in the English-language version of the book. “I think the point of scientific references should be assuring that the reader can get to the resources if desired,” Wójcik said. “That is usually not possible if one translates the titles, especially since many of the references do not provide a (working) link or DOI.”

The first author of this book, Xisheng Tang, did not respond to our requests for comment. All three authors are affiliated with the Chinese Academy of Sciences, according to the book’s preface. 

The publisher will issue a correction for the citations in this book, the Springer Nature spokesperson said. 

Errors in book citations “are rare and experienced by all publishers,” the spokesperson for Springer Nature told us. “They are typically identified through a combination of editorial review, peer feedback, and post-publication scrutiny. While some issues may appear straightforward externally, they can be complex to detect — particularly when they involve a number of nuances.”

Wójcik told us he has continued to work on his script to detect sham references. He started using it on papers as well as books, and has checked around 100,000 to date. His script flagged about 1 in every 300 papers, though he has only had time to manually verify 150 of them so far. “Some have small issues like unresolved DOI or DOI entered with a dot at the end, but most have nonsensical references,” he said. “I think such a problem should not exist in scientific literature, especially since links to Google Scholar are provided in the online version, and they are just useless.”

