Over the past few years, we have witnessed the rapid rise of a transformative technology. Much like the personal computing boom before it, AI is poised to reshape the world in ways that extend far beyond what we’ve already seen. In the last year alone, AI models have expanded their capabilities at an astonishing pace.
For the marketing research industry, these advancements have brought both opportunities and challenges. One of the more concerning consequences has been the rise of sophisticated survey bots—AI-driven models deployed by malicious actors to manipulate data for personal agendas or exploit survey incentives for financial gain.
This issue is not new. Even before the emergence of modern AI, survey botting was a persistent challenge, though the methods were far less advanced. However, today’s AI capabilities have made these fraudulent techniques significantly more formidable.
A moderately capable AI model can now not only determine which responses appear most human-like but also generate highly convincing qualitative answers for open-ended questions—once considered a reliable method for detecting fraud.
What’s more, these models are improving at an unprecedented rate. With each passing day, their responses become more difficult to distinguish from genuine participants, and their methods grow increasingly sophisticated. We’ve seen this evolution firsthand—just two years ago, ChatGPT struggled to produce a coherent story beyond three paragraphs. Now, it can generate pages of text that even AI detectors struggle to identify as machine-generated.
So, where do we go from here? How do we combat this growing threat?
The first step is to reevaluate the methods we’ve relied on in the past and determine whether they remain effective in today’s AI landscape.
In the article “Online Surveys: Lessons Learned in Detecting and Protecting Against Insincerity and Bots” by Amber Thomson and Rebecca Utz, the authors summarize traditional approaches to combating survey fraud and explore new techniques that may be necessary moving forward.
Below, I outline both the established methods that have been used in the past and potential future strategies that may prove effective.
Honeypot questions
Honeypot questions are designed to include subtle cues or instructions that a human would easily recognize but a bot might overlook.
A prime example of this is the “Choose this answer despite your opinion” question. While a genuine respondent would follow the instruction, an AI-driven bot—focused on generating logically consistent responses—may ignore it, revealing its fraudulent nature.
“To ensure data quality, please select ‘Somewhat Disagree’ as your answer to this question.”
Rate your agreement with the following statement: “I am an outgoing individual.”
Answer Options:
- Strongly Agree
- Somewhat Agree
- Somewhat Disagree (correct answer)
- Strongly Disagree
Another example is the invisible question: a field hidden from human respondents (for instance, via CSS), so a genuine participant never sees or answers it, while an automated form filler working from the page’s markup is likely to complete it.
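As a rough illustration, here is a minimal Python sketch of how both honeypot checks might be scored. The field names (“outgoing_honeypot”, “hidden_field”) are hypothetical and would map to however your survey platform exports responses.

```python
# Minimal honeypot screening sketch; field names are hypothetical placeholders
# for however your survey platform labels these items in its export.

def flag_honeypot_failures(response: dict) -> list[str]:
    """Return a list of honeypot rules this response violated."""
    flags = []

    # Instructed-answer check: the respondent was told to pick "Somewhat Disagree".
    if response.get("outgoing_honeypot") != "Somewhat Disagree":
        flags.append("ignored instructed answer")

    # Invisible-question check: humans never see this field, so any
    # non-empty value suggests an automated form filler.
    if response.get("hidden_field"):
        flags.append("answered invisible question")

    return flags


if __name__ == "__main__":
    suspicious = {"outgoing_honeypot": "Strongly Agree", "hidden_field": "Yes"}
    print(flag_honeypot_failures(suspicious))
    # ['ignored instructed answer', 'answered invisible question']
```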

CAPTCHA
Despite its reputation, CAPTCHA is largely ineffective against most bots. A 2020 study published in The Quantitative Methods for Psychology found that even with CAPTCHA implementation, bots were still able to bypass these measures and participate in online surveys.
As AI continues to advance, CAPTCHAs will likely become even less effective, making it increasingly difficult to rely on them as a safeguard against fraudulent responses.
IP and VPN Tracking
While IP and VPN tracking can be effective against lower-effort survey bot attempts, determined botters have ways to bypass these measures.
Bots can leverage proxy services to generate a new IP address for each request, making it difficult to track and block them. Residential proxies are particularly concerning, as they route traffic through real users’ IPs, making bot activity appear legitimate and much harder to detect.
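For the lower-effort cases, a simple screen might still check respondent IPs against known datacenter or proxy ranges. The sketch below uses Python’s standard ipaddress module; the CIDR ranges shown are placeholders, and in practice you would use a maintained proxy or datacenter list. As noted above, residential proxies will still slip past this kind of check.

```python
# Rough IP screening sketch; the ranges below are placeholder documentation
# blocks, not a real proxy list, and residential proxies will not be caught.
import ipaddress

KNOWN_PROXY_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),   # placeholder range
    ipaddress.ip_network("198.51.100.0/24"),  # placeholder range
]

def looks_like_proxy(ip_str: str) -> bool:
    """Return True if the IP falls inside a known proxy/datacenter range."""
    ip = ipaddress.ip_address(ip_str)
    return any(ip in network for network in KNOWN_PROXY_RANGES)

print(looks_like_proxy("203.0.113.42"))  # True
print(looks_like_proxy("8.8.8.8"))       # False
```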
Comprehensive Recruiting
Among the older methods, recruiting specific individuals for surveys is perhaps the most effective at eliminating bot responses, as it allows for human verification at the recruitment stage.
However, the main drawback is cost. While this approach may be feasible for smaller studies with limited respondent pools, scaling it for larger surveys can be prohibitively expensive and difficult to implement comprehensively.
Open-Ended Question Analysis
This technique typically involves analyzing respondents’ open-ended answers for logical consistency. For example, a respondent who claims to be pregnant is unlikely to mention going out drinking that same week. Inconsistencies like this can help flag potentially fraudulent responses.
Another emerging approach is using AI to detect AI. As AI-generated text becomes more sophisticated, AI text detectors are evolving just as rapidly. By identifying patterns in AI-generated responses, these tools aim to assess how likely a given answer is to have been created by artificial intelligence.
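Alongside detector scores, one simple complementary check is to look for near-duplicate open-ended answers across respondents, which often signals scripted or template-driven submissions. The sketch below is illustrative only, using Python’s standard difflib and made-up respondent IDs.

```python
# Complementary heuristic sketch: flag pairs of respondents whose open-ended
# answers are suspiciously similar. Threshold and sample data are illustrative.
from difflib import SequenceMatcher
from itertools import combinations

def near_duplicate_pairs(answers: dict[str, str], threshold: float = 0.9):
    """Yield (id_a, id_b, similarity) for answer pairs above the threshold."""
    for (id_a, text_a), (id_b, text_b) in combinations(answers.items(), 2):
        ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if ratio >= threshold:
            yield id_a, id_b, round(ratio, 2)

answers = {
    "r1": "I value honest leadership and better roads in our town.",
    "r2": "I value honest leadership and better roads in our town!",
    "r3": "Mostly I care about the school budget.",
}
print(list(near_duplicate_pairs(answers)))  # flags r1 and r2 as near-duplicates
```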

The problem? It’s not quite good enough. Even a general-purpose model like ChatGPT can make its responses undetectable. In the image below, I simply asked ChatGPT to “Make this story undetectable by AI detectors.”

A model built specifically to circumvent these detectors would likely be nearly undetectable.
Paradata Analysis
Let’s say we release a survey to a small town, asking residents about their opinions on local elected officials.
To begin, we can check respondents’ locations and IP addresses to verify that they are within (or at least traveling from a reasonable distance to) the small town. If a significant portion of responses comes from outside the area—or even from foreign countries—it would be a strong indicator of tampering.
Next, we can analyze response behavior. By assessing the average time taken per question and the average typing speed for open-ended answers, we can flag respondents who consistently fall several standard deviations outside the norm. Extremely fast response times, especially for complex questions, may indicate bot activity.
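As a rough sketch of that timing check, the snippet below flags respondents whose average seconds per question fall several standard deviations below the group norm. The respondent data, field names, and 3-SD cutoff are illustrative; a real screen would run over a much larger sample, and the same z-score logic extends to typing speed on open-ended answers.

```python
# Paradata screening sketch: flag "speeders" whose average time per question
# is several standard deviations below the norm. Cutoff is illustrative.
from statistics import mean, stdev

def flag_speeders(avg_seconds_per_question: dict[str, float],
                  z_cutoff: float = 3.0) -> list[str]:
    """Return respondent IDs whose average pace is abnormally fast."""
    values = list(avg_seconds_per_question.values())
    overall_mean, overall_sd = mean(values), stdev(values)
    if overall_sd == 0:
        return []
    return [
        rid for rid, avg in avg_seconds_per_question.items()
        if (avg - overall_mean) / overall_sd <= -z_cutoff
    ]
```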
Finally, we can examine their answers. Are they responding in ways that align with their demographic group? While occasional outliers are expected in any dataset, respondents who are consistently outliers across multiple factors should be flagged as suspicious.
Individually, each of these behaviors might still be possible for a real respondent. However, if multiple red flags appear together, it’s reasonable to suspect that the response may not be genuine.
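Putting this together, a simple screen might escalate a response only when two or more of these signals co-occur. The flag names and threshold below are illustrative, not a prescribed scoring rule.

```python
# Sketch of combining individual signals; flag names and the "two or more"
# threshold are illustrative choices, not a validated scoring rule.
def suspicion_flags(response: dict) -> list[str]:
    flags = []
    if response.get("outside_target_area"):
        flags.append("location outside surveyed area")
    if response.get("speeder"):
        flags.append("abnormally fast completion")
    if response.get("demographic_outlier"):
        flags.append("consistent demographic outlier")
    return flags

def is_suspicious(response: dict, min_flags: int = 2) -> bool:
    """Treat a response as suspect only when multiple red flags co-occur."""
    return len(suspicion_flags(response)) >= min_flags
```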
You might be thinking—none of these methods seem entirely foolproof. And in truth, you’d be right. There is no single solution to completely eliminate survey botting.
What we can do is remain vigilant.
Consider what a trained AI model might already know. Is there extensive behavioral data available on your target population? Could an AI realistically emulate your average respondent?
Has similar data been collected before? Is it publicly accessible?
Are your survey incentives valuable enough to attract outside actors? Could someone benefit from intentionally skewing your results?
Survey botting, like many technological challenges, is constantly evolving. But so are we.
