Springer Nature book on machine learning is full of made-up citations

Would you pay $169 for an introductory ebook on machine learning with citations that appear to be made up?

If not, you might want to pass on purchasing Mastering Machine Learning: From Basics to Advanced, published by Springer Nature in April. 

Acting on a tip from a reader, we checked 18 of the 46 citations in the book. Two-thirds of them either did not exist or contained substantial errors. Three researchers cited in the book confirmed the works attributed to them either did not exist or were cited with substantial errors.

“We wrote this paper and it was not formally published,” said Yehuda Dar, a computer scientist at Ben-Gurion University of the Negev, whose work was cited in the book. “It is an arXiv preprint.” The citation incorrectly states the paper appeared in IEEE Signal Processing Magazine.

Aaron Courville, a professor of computer science at Université de Montréal and a coauthor of the book Deep Learning, was correctly cited for the text itself, but for a section that “doesn’t seem to exist,” he said. “Certainly not at pages 194-201.” Dimitris Kalles of Hellenic Open University in Greece also confirmed he did not write a cited work that lists him as the author.

The researcher who emailed us, who asked to remain anonymous, had received an alert from Google Scholar that the book cited him. His name appears in multiple citations, but the cited works do not exist.

Nonexistent and error-ridden citations are a hallmark of text generated by large language models like ChatGPT. These models don’t search literature databases for published papers the way a human author would; they generate content based on their training data and the prompt. As a result, LLM-generated citations can look legitimate even when the works they point to are partly or entirely fabricated.
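One way to catch such fabrications is to check each reference against a bibliographic database. The short Python sketch below is offered purely as an illustration, not as a description of how we checked this book’s citations: it asks the public Crossref REST API whether a cited DOI corresponds to a real record, and the DOI shown is a placeholder.

    # Illustrative sketch: check whether a cited DOI resolves to a record
    # in Crossref (https://api.crossref.org). Citations without a DOI, or
    # with other kinds of errors, still need a human check.
    import requests

    def doi_exists(doi: str) -> bool:
        """Return True if Crossref has a record for the given DOI."""
        resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
        return resp.status_code == 200

    # Placeholder DOI for demonstration, not one taken from the book:
    print(doi_exists("10.1000/example-doi"))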

The book’s author, Govindakumar Madhavan, asked for an additional “week or two” to fully respond to our request for comment. He did not answer our question about whether he used an LLM to generate text for the book. However, he told us, “reliably determining whether content (or an issue) is AI generated remains a challenge, as even human-written text can appear ‘AI-like.’ This challenge is only expected to grow, as LLMs … continue to advance in fluency and sophistication.”

According to his bio in the book, Madhavan is the founder and CEO of SeaportAi and author of about 40 video courses and 10 books. The 257-page book includes a section on ChatGPT that states: “the technology raises important ethical questions about the use and misuse of AI-generated text.” 

Springer Nature provides policies and guidance about the use of AI to its authors, Felicitas Behrendt, senior communications manager for books at the publisher, told us by email. “Whilst we recognise that authors may use LLMs, we emphasise that any submission must be undertaken with full human oversight, and any AI use beyond basic copy editing must be declared.” 

Mastering Machine Learning contains no such declaration. When asked about the potential use of AI in the work, Behrendt told us: “We are aware of the text and are currently looking into it.” She did not comment on efforts taken during Springer Nature’s editorial process to ensure its AI policies are followed.

LLM-generated citations were at the center of controversies around Robert F. Kennedy Jr.’s “Make America Healthy Again” report and a CDC presentation on the vaccine preservative thimerosal. At Retraction Watch, our cofounders were once cited in a made-up reference in an Australian government report on research integrity. We’ve seen fake citations fell research articles, and our list of papers with evidence of undisclosed ChatGPT use has grown long, almost certainly representing only a fraction of the papers with such undeclared use.

The same day Behrendt replied to our query, Springer Nature published a post on its blog titled, “Research integrity in books: Prevention by balancing human oversight and AI tools.” 

“All book manuscripts are initially assessed by an in-house editor who decides whether to forward the submission to further review,” Deidre Hudson Reuss, senior content marketing manager at the company, wrote. “The reviewers – subject matter experts – evaluate the manuscript’s quality and originality, to ensure its validity and that it meets the highest integrity and ethics standards.”





One thought on “Springer Nature book on machine learning is full of made-up citations”

  1. I wouldn’t expect an answer from Springer any time soon. I reported a similar case of a book chapter which contained hallucinated references, including one which it attributed to me which doesn’t match anything I’ve actually written. It’s been 4 months now and I’m still waiting for their investigation to reach a conclusion.
