Are AI chatbots infiltrating online survey data? Not yet, says new study

Mohamed Nohassi/Unsplash

Despite concerns some have raised about potentially compromised data, AI chatbots aren’t yet widely completing online research surveys, according to a new preprint.

The authors of the study, posted earlier this month on PsyArXiv, found that fewer than 1% of around 4,800 survey responses collected by 12 different companies contained text that was likely not written by a human. Among 400 responses from a 13th company, however, around 16% were flagged for possibly being completed by a chatbot. 

The study used a novel detection tool created by the survey research company Prolific, which funded the project.

Since online data sampling began, online bots and faked responses have been a thorn in the side of researchers conducting analyses that rely on such data, with some evidence suggesting that fraudsters have become more sophisticated in their scamming methods in recent years. A 2025 study in PNAS found it was technically possible for AI chatbots to infiltrate online survey data. 

In the new study, Prolific’s authenticity checker correctly identified all 125 surveys completed by AI chatbots and didn’t misidentify any of the 124 surveys completed by humans, according to lead author Andrew Gordon, a staff researcher in behavioral science at Prolific in Dorset, England. “That gives us a lot of confidence in that measure,” he said. The survey was completed 25 times each by ChatGPT, Gemini, Claude, Perplexity and an internal AI agent created by Prolific.

Gordon said he isn’t critical of the PNAS study that found AI chatbots could infiltrate online surveys. But he deemed the media attention it received a “bit hyperbolic,” saying it “fueled a perception that is extremely divorced from the reality on the ground.”

Sean Westwood, the author of the PNAS study and a political scientist at Dartmouth College in Hanover, New Hampshire, told us he was approached to collaborate on Gordon’s study but declined. He noted Prolific’s financial stake in the outcome of the study, and said it’s not possible to verify the study’s claims because the firm has not made its authenticity checker available to external researchers.

The Prolific study employed survey participants with high approval ratings — people who had previously provided replies the company deemed reliable — Gordon said. Two of the authors, including Gordon, are employees of Prolific, a third is a past adviser to the firm, and the remainder are academics who told Retraction Watch they were not paid by the firm for their contributions.

Natalia Pinzón, an agroecologist and geographer at the University of California, Davis, who has studied the integrity of online survey data, called the new analysis “rigorous” and “well done.” 

Pinzón, who was not involved with the research, said it was important to draw attention to the various ways bad data can be introduced, including when humans provide fraudulent responses, sometimes by using large language models (LLMs). “I think that the challenge is broader in terms of coordinated human fraud and sophisticated invalid responses that erode the quality of the survey,” she said.

The only troubling standout in the study proved to be Amazon Mechanical Turk (MTurk), where around 16% of responses were flagged as possibly completed by chatbots. Gordon said MTurk has a history of “bot problems.” He added: “It’s kind of like the wild west of online sampling.”

“On nearly all the platforms that people are actually using, there’s very little evidence of AI agents taking part,” Gordon said. What’s more, the responses that were flagged for possible AI activity were “low consistency, low comprehension, low honesty,” Gordon said. 

The results indicate the non-human responses probably stem from traditional scripted survey bots, which have existed for decades, and not new AI chatbots like ChatGPT and its ilk. “Since the early days of online sampling, you’ve had these very hacky scripted bots that can take surveys, and the data is rubbish,” Gordon said. “They’re very easy to identify.”

But to be sure they weren’t mistaken, Gordon and colleagues ran three AI chatbots on the survey to benchmark how well a chatbot would handle the questions. “They actually outperformed humans on nearly all measures, and they drastically outperformed the responses that got flagged by our detection,” Gordon said. 

“Human data quality is the biggest problem in online research,” Gordon said. “It’s not AI agents.” 

Pinzón said her research also suggests human data quality and survey fraud remain a bigger problem in online survey data than AI chatbots. That’s probably because humans are incentivized and motivated to take part in surveys even when they don’t meet the criteria for inclusion, she said.

But the possibility that humans are introducing bad data doesn’t mean we should abandon online survey data and revert to only in-person interviews, which typically produce less diverse and generalizable samples than their online counterparts, according to Gordon. “I think that would be extremely regressive,” he said.

As for AI chatbots infiltrating online data, Gordon said it’s possible the picture could change quickly. 

“When I started at Prolific I could relatively confidently predict what was going to happen in the next year in terms of online sampling,” he said. “I wouldn’t feel that confident predicting a couple of months in advance right now, just because things are changing so quickly.”


Like Retraction Watch? You can make a tax-deductible contribution to support our work, follow us on X or Bluesky, like us on Facebook, follow us on LinkedIn, add us to your RSS reader, or subscribe to our daily digest. If you find a retraction that’s not in our database, you can let us know here. For comments or feedback, email us at [email protected].


