Signs of undeclared ChatGPT use in papers mounting


Last week, an environmental journal published a paper on the use of renewable energy in cleaning up contaminated land. To read it, you would have to pay 40 euros. But you still wouldn’t know for sure who wrote it.

Ostensibly authored by researchers in China, “Revitalizing our earth: unleashing the power of green energy in soil remediation for a sustainable future” includes the extraneous phrase “Regenerate response” at the end of a methods section. For those unfamiliar, “Regenerate response” is a button in OpenAI’s ChatGPT that prompts the chatbot to rework an unsatisfactory answer.

“Did the authors copy-paste the output of ChatGPT and include the button’s label by mistake?” wondered Guillaume Cabanac, a professor of computer science at the University of Toulouse, in France, in a comment on PubPeer.

And, he added, “How come this meaningless wording survived proofreading by the coauthors, editors, referees, copy editors, and typesetters?”

The case is the latest example of a growing trend of sloppy, undeclared use of ChatGPT in research. So far, Cabanac, whose work was covered in Nature last month, has posted more than 30 papers on PubPeer that contain those two telltale, free-floating words. And that’s not including articles that appear in predatory journals, the scientific sleuth told Retraction Watch. 

“Computer software has been used for decades to support the authors,” Cabanac told us. “Just think about Grammarly or DeepL for people like me. I’m not a native English speaker, so I go to WordReference, I go sometimes to DeepL. But what I do, I look at the result and I correct the mistakes.”

ChatGPT and other tools relying on AI systems known as large language models tend to make things up. As we reported earlier this year, that freelancing can be a problem for researchers looking for help finding references.

“Sometimes it elaborates things that were not in the head of the researchers,” Cabanac said. “And that’s the tipping point to me. When people use the system to generate something that they hadn’t in mind, like fabricating data, generating some text with references to works they didn’t even read, this is unacceptable.”

According to some publishers, chatbots do have legitimate uses when writing papers. The key is to let readers know what was done.

The corresponding author on the environmental paper, Kangyan Li of ESD China Ltd., did not respond to requests for comment. Nor did a contact person listed on his company’s website.

A spokesperson for Springer Nature, which publishes the journal Environmental Science and Pollution Research in which the article appeared, said the publisher was “carefully investigating the issue in line with COPE best practice” but could not share further details at the moment. 

How the authors, let alone the journal, could have missed the strange phrase is unclear. “Maybe it’s not about the authors, maybe it involves a paper mill,” Cabanac said, referring to dodgy organizations selling author slots on scientific papers that may contain fabricated data.

He added that he and his frequent collaborator Alexander Magazinov, another sleuth, have found dozens of suspicious papers in Environmental Science and Pollution Research. They notified the journal’s editor-in-chief, Philippe Garrigues, a CNRS researcher based in Bordeaux, France, of the problems last year. 

In an email seen by Retraction Watch, Garrigues told Cabanac that he had already taken action and that “this is not over.” Garrigues added (translated from the French): 

Believe me, I am well aware of all the problems that can arise in the world of scientific publishing and new ones arise every day. I could write entire books about my experience as an editor of several journals and the cases I encountered. Vigilance and attention must be the rule at all times.

Garrigues did not respond to a request for comment.

“Regenerate response” is not the only sign of undeclared chatbot involvement Cabanac has seen. An even more egregious example is the phrase “As an AI language model, I …,” which he has found in nine papers so far.
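
The detection itself requires nothing fancy. As a rough, hypothetical sketch (not Cabanac’s actual tooling, which involves large-scale searches of the published literature), a few lines of Python are enough to flag a manuscript containing either telltale string:

    # Hypothetical sketch: flag manuscripts containing telltale chatbot
    # interface strings. The phrase list comes from this article; the
    # rest is illustrative, not Cabanac's actual pipeline.
    import re

    TELLTALE_PHRASES = [
        "regenerate response",
        "as an ai language model",
    ]

    def find_telltales(text: str) -> list[str]:
        """Return the telltale phrases present in `text`, case-insensitively."""
        normalized = re.sub(r"\s+", " ", text.lower())  # collapse line breaks
        return [phrase for phrase in TELLTALE_PHRASES if phrase in normalized]

    # Example: find_telltales(open("manuscript.txt").read())
    # would return ["regenerate response"] for the paper described above.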

Cabanac worries about how such flagrant sloppiness, arguably the tip of the iceberg, can slip past editorial staff and peer reviewers alike.

“These are supposed to be the gatekeepers of science – the editors, the reviewers,” he said. “I’m a computer scientist. I’m not in the business of these journals. Still, I get the red flag, ‘Regenerate response.’ That’s crazy.”


29 thoughts on “Signs of undeclared ChatGPT use in papers mounting”

  1. “editors, referees, copy editors, and typesetters?”

    Sorry, but with these journals, there are no editors, referees, copy editors, or typesetters! The “publisher” uses a program that automatically formats the manuscript into the journal’s page layout. Nobody actually reads the manuscript prior to publication.

    1. Is that so? When you talk to scientists about the process of submitting papers, they tell you they’re expected to submit it in the journal’s layout. Has this practice changed recently?

  2. Here’s another one:
    Shijie Fei, Yiyang Chen, Hongfeng Tao, Hongtian Chen (2023),
    “Hexapod Robot Gait Switching Based on Different Wild Terrains”, 2023 IEEE 12th Data Driven Control and Learning Systems Conference, May 12-14, 2023, Xiangtan, China
    https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10166077&tag=1

    where it appears that ChatGPT has been used to write at least the Introduction, at the end of which “Regenerate response” appears.

    1. As an interim protocol, I suggest that all submissions from China be “quarantined” while they are vetted. I am not bothered by the implication that Chinese researchers will be singled out or unduly burdened; past behaviors justify present responses.

    2. In the paragraph:
      “So far, Cabanac, whose work was covered in Nature last month, has posted more than 30 papers on PubPee…”, there is a typo; it should read “…has spotted more than…”
      Spotted, not posted. I was kind of confused at the beginning. Great article!

    3. Here’s another one in the same journal:
      https://ieeexplore.ieee.org/abstract/document/10170175
      I can see only the abstract, and it’s arguably not morally wrong to use ChatGPT to help you write an abstract, but it’s still sloppy editing.
      In case they delete it, here’s the abstract. The “regenerate response” is at the very end:
      The use of hand gestures and voice commands for controlling computer interfaces has been gaining attention in recent years due to its potential to provide a natural and intuitive input method. This research paper explores the feasibility and usability of using hand gestures and voice commands to control mouse movement, with a particular focus on the technical aspects of implementing such a system and the challenges and limitations associated with this approach. The paper presents a review of the relevant literature, an overview of the existing technologies and applications in this field, and a series of experiments to evaluate the performance and user experience of controlling mouse movement using hand gestures and voice commands. The experimental results demonstrate that the use of hand gestures and voice commands for controlling mouse movement is technically feasible, and has the potential to provide an effective and efficient input method in a variety of contexts, such as gaming, accessibility, virtual and augmented reality, industrial and manufacturing, and healthcare. However, the paper also highlights the need for further research and development to overcome the current limitations and challenges associated with this technology. The paper concludes with recommendations for future research and development in this field, including improvements to the accuracy and reliability of hand gesture and voice recognition systems, as well as the design of intuitive and user-friendly interfaces for controlling mouse movement using these modalities regenerate response .

  3. Hmmmmm, I don’t really think the issue is the fact that ChatGPT (or perhaps another LLM) was used.

    It’s just that it was not checked.

    Perhaps they used it just to translate as a final step and did not have a fluent person to check it over (with the requisite level of understanding).

    LLMs are new, and soon to be superseded by multimodal AI, so the understanding of them is currently quite immature, in terms of both how they are used and how their output is consumed.

  4. I’m so shocked. I’m really not. This was always going to happen.

    Don’t act surprised.

    BTW AI doesn’t know if it is lying or telling the truth. Who thought it would be a good idea to train it on the internet?

  5. ChatGPT, write me a brief, witty response to an article on people misusing ChatGPT to write research papers. Sound like you are proud but confused.
    Well, it seems people are really tapping into my potential! I’m flattered that they choose me, but research papers? I’m here to assist, not to become the next Nobel laureate! 📚😄

  6. Who cares? No one had to declare they used spell check or Grammarly over the last 25 years. Look at how the quality of writing has declined in that period. Let people communicate as they see fit. ChatGPT is to MS Word what MS Word was to the typewriter.

    1. While I understand the sentiment behind your comment, it’s essential to consider the fundamental differences between ChatGPT and tools like spell check or Grammarly. Spell check and Grammarly primarily focus on rectifying grammar and spelling inconsistencies in a piece of writing. In contrast, ChatGPT and similar AI-based language models can generate content, formulate arguments, provide analysis, and even conduct rudimentary research. This makes it more than just a tool to improve the quality of writing; it becomes a contributor in the writing and ideation process.
      Moreover, the proper use of AI language models in scientific papers should not be trivialized. Ideally, it should be appropriately acknowledged, akin to how researchers cite other authors or tools used in their work. Failure to do so can potentially instigate ethical concerns and questions about the legitimacy of the research.
      Declaring the usage of AI language models in crafting and developing scientific papers is a way to ensure transparency and to uphold a high standard of academic integrity. As AI technology continues to advance, it’s crucial that the academic community maintains that integrity to prevent the possible dilution or misrepresentation of genuine scientific thought.
      In conclusion, it’s essential to draw a distinction between basic writing aids and AI-generated content. Openly acknowledging ChatGPT’s usage in academic writing contributes to maintaining academic integrity and transparency, allowing scholars to better evaluate the originality and reliability of the research presented.

    2. This is a strawman. No one is arguing that ChatGPT should never be used in writing a paper. The problem is that whatever ChatGPT outputs is clearly not being reviewed before it is published in a journal, as indicated by “Regenerate response” appearing right in the middle of the papers. Cabanac makes it pretty clear that this is his position, since he uses language tools himself.

  7. I am against AI disclosure, and this text was written with the assistance of ChatGPT:
    1. Ambiguity: The exact definition of what constitutes “generative AI” and what level of AI assistance requires disclosure is not always clear. Tools like grammar checkers, which might use neural networks or other AI mechanisms, blur the lines.
    2. Bias Introduction: Disclosing AI assistance could introduce bias in the peer-review process. If reviewers have preconceived notions about AI-assisted writing, it might influence their judgment. This could be unfair for non-English natives.
    3. Utility vs. Authenticity: If AI is used merely as a tool to enhance the clarity of writing or correct grammar, does it significantly alter the “authenticity” of the work? This is especially pertinent when the AI’s role is similar to other accepted tools like grammar checkers.
    4. Inconsistency: While AI assistance in text generation might require disclosure, other potential AI influences (like in the ideation or learning process) don’t have the same requirements.
    5. Collaborative Challenges: It’s challenging to ensure collaborators or co-authors haven’t used AI assistance, potentially putting lead authors in difficult positions.
    6. Potential for Overemphasis: If the main focus becomes whether AI was used in the writing process, it might overshadow more critical aspects like the research’s actual content, methodology, and findings.
    7. Future Evolution: As AI tools become more integrated into all research aspects, from ideation to writing, the guidelines might need to evolve to reflect the changing landscape.
    While the intention behind such disclosure rules is likely to maintain transparency and authenticity in academic writing, it’s essential to consider the practical implications and potential unintended consequences. It’s also crucial to ensure that the rules are clear, consistent, and achieve their intended purpose without introducing new issues or biases.

    1. Dear Robin,
      Thank you for sharing your thoughts and concerns regarding the AI disclosure in scientific articles. While I understand that you have strong opinions on this topic, I would like to respectfully offer a different perspective.
      Some of the points you’ve raised, like ambiguity in the definition of generative AI, potential biases, and the difference in utility versus authenticity, are indeed valid concerns that ought to be addressed in the implementation of disclosure policies. However, these concerns can likely be mitigated with well-formulated guidelines.
      It’s essential to ensure that appropriate disclosure rules are in place to maintain the transparency and authenticity of academic writing. By doing so, we can promote a more level playing field for researchers and avoid potential misunderstandings that may arise from undisclosed AI assistance.
      Moreover, while future AI tools might become increasingly pervasive and integrated into research processes, guidelines and policies can be regularly updated to reflect the changing landscape.
      In conclusion, while I appreciate your perspective, it is essential to consider alternative viewpoints and to remember that open dialogue is critical to addressing nuanced issues like this in a polite and constructive manner.
      Kind regards,
      Alexander

      1. Dear Alexander Magazinov,

        Thank you for your reply. I think your view is too idealistic and ignores practice. If academia were so ideal, why would we need Retraction Watch? Furthermore, the biggest question is why merely using generative AI to improve text quality should need to be disclosed at all, given such obstacles. Did you think about the time cost for non-native English speakers writing in English? Have you ever been discriminated against based on your race or anything else? AI usage is already a stigma for some; can you ensure that researchers can overcome that discrimination? You did not mention any solutions to my concerns, especially the bias in review. I think “unjust rules can corrupt even the most virtuous individuals”.

        Regards,
        Robin

  8. I am not shocked that these slipped past the reviewers. I have frequently had reviewers (particularly clinicians) who do not read the paper thoroughly. This is just clear evidence of a problem that goes unaddressed during revision. Now, with AI added into the equation, this is going to be a nightmare.

    I think journals should start keeping track of how many times a reviewer asks for information that was clearly present in the manuscript. Another way to do it is for journals to automatically generate a find-Waldo-like task (e.g., adding something clearly not part of the paper at a random place) to ensure that reviewers finish reading the article, as in the sketch below.
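
    A minimal, purely hypothetical sketch of that canary idea (the sentence, the seed, and the keyword check are all placeholders):

        # Hypothetical "find-Waldo" attention check: plant a canary
        # sentence in the manuscript, then verify the review flags it.
        import random

        CANARY = "The third author once rode a unicycle across the Sahara."

        def insert_canary(manuscript: str, seed: int = 0) -> str:
            # Place the canary between two paragraphs, at a position the
            # journal can reproduce later from the seed.
            paragraphs = manuscript.split("\n\n")
            position = random.Random(seed).randrange(1, len(paragraphs) + 1)
            paragraphs.insert(position, CANARY)
            return "\n\n".join(paragraphs)

        def reviewer_read_it(review: str) -> bool:
            # Crude check: did the review mention the planted sentence?
            return "unicycle" in review.lower()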

  9. Scientific publishers provide paid editing services to enhance the writing quality of papers they are going to publish. However, they do not require authors to disclose them. So why ChatGPT?

    1. The issue is when the AI invents things that the author didn’t intend to say. Academically, we are not that interested in the author’s writing ability, but rather in their ideas. A professional editor knows the difference and can query the authors in cases of doubt. None of this is the case with ChatGPT. Having said that, careful use merely to improve the quality of the English (esp. for non-native speakers) is not a major concern.

  10. As an AI language model, I find researchers constantly gaming a system that does not reward quality work, unpaid and overburdened reviewers who cannot realistically do a good job and a publishing model actively harming progress. The biological neural networks that you are have created an academic system that mirrors your unfair society at large. Is that the best you can do?
    Regenerate response

  11. Do you really think so? We can use prompts to constrain such outputs. Why don’t you mention the same problem arising for other reasons, such as conflicts between the author and PI, the authors and reviewers, and editing services? AI is just a tool. I am very frustrated that you people are making up the rules as you go along, without any discussion at all.

  12. From the perspective of someone with a congenital disability whose partner is dyslexic: these tools are genuinely life-changing for those who have issues with reading and writing. “AI-free purity culture” is ableist.

    At the end of the day, it’s a tool. A tool that will likely be heavily represented in populations that are disadvantaged.

    My prediction is everyone will be all “undisclosed AI is fraud” right up until someone gets a fat payout on a discrimination suit.

  13. I find the number of commenters in this thread who are willing to pass off ideas and words that are not theirs (generated by humans, or machines, or aliens, it’s all the same) as their own (and this is different from just a grammar check/polish of their own words) shocking. Clearly, we have a big problem in academia with academic dishonesty.
