Synthetic respondents in research - promise, pitfalls and when to use in 2026

AI-generated personas that answer surveys and interviews are becoming mainstream in market research. But can synthetic respondents replace real humans? Here's what works, what doesn't, and when to use them.

Imagine conducting a user study with 1,000 participants in an afternoon. No recruitment, no scheduling conflicts, no dropouts. Just feed demographic criteria and research questions into an AI platform, and receive detailed responses from synthetic personas that behave like your target customers.

This is not science fiction. Synthetic respondents are one of the major trends in market research for 2026, with companies like Evidenza, NIQ BASES, and others offering platforms that generate artificial personas to mimic human responses. The promise is compelling: instant feedback, zero recruitment costs, and the ability to test concepts before investing in expensive primary research.

For academics, synthetic respondents seem to be a clear no-go zone. But for market researchers, there is considerable debate about whether this is genuine innovation or a dangerous shortcut. Let us explore both the promise and the very real limitations.

What are synthetic respondents and why the sudden interest?

Synthetic respondents are AI-generated personas that answer research questions as if they were real people. You provide the system with demographic characteristics, psychographic profiles, and behavioural patterns drawn from existing data, and the AI simulates how people matching those criteria would respond to surveys, interviews, or concept tests.

The technology builds on large language models trained on vast amounts of human-generated text. When you ask a synthetic respondent representing a 35-year-old mother of two from Manchester what she thinks about a new grocery delivery service, the AI draws on patterns from millions of similar conversations to generate a plausible response.
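
To make the mechanism concrete, here is a minimal sketch in Python of how such a persona might be assembled and queried. It is purely illustrative: it assumes the OpenAI Python SDK and an API key, and the persona fields, model choice and wording are our own inventions, not how Evidenza, NIQ BASES or any other platform actually works.

    from openai import OpenAI

    client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

    # Hypothetical persona assembled from demographic and behavioural attributes
    persona = (
        "You are a 35-year-old mother of two living in Manchester. "
        "You shop for groceries weekly on a tight budget and value convenience."
    )

    question = "What do you think about a new same-day grocery delivery service?"

    response = client.chat.completions.create(
        model="gpt-4o",  # any capable chat model would serve for this sketch
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": question},
        ],
    )

    print(response.choices[0].message.content)

The output reads like a plausible customer, which is exactly the point, and exactly the risk discussed below.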

The market research industry is paying serious attention. Rival Group, a leader in AI-accelerated conversational research, highlighted synthetic respondents as a major trend. Companies from Fortune 500 enterprises to startups are experimenting with the technology for everything from concept testing to brand perception studies.

The appeal is obvious. Traditional research is slow and expensive. Recruiting 20 qualified participants for user interviews can take weeks and cost thousands. Running a survey with 500 respondents requires panel access, incentives, and data cleaning. Synthetic respondents promise to compress timelines from weeks to hours and reduce costs by 90% or more.

The growing chorus of criticism and caution

Not everyone is convinced. There is substantial concern from researchers, methodologists, and industry bodies about the risks of relying on synthetic data.

Academic research reveals serious statistical problems. A study comparing synthetic responses from ChatGPT to actual human survey data found that 48% of coefficients estimated from AI responses were statistically significantly different from their human counterparts. Among these cases, the sign of the effect flipped 32% of the time. In other words, the AI did not just get the magnitude wrong, it sometimes reversed the direction of relationships entirely.

Research World warns that synthetic data can lead to misleading conclusions if poorly generated or applied to the wrong contexts. The data tends to be too uniform and clean compared to real human responses, losing the rich variability that characterises genuine qualitative insights.

Merrill Research documented a telling example. They created synthetic design engineers and asked them about sustainability in microprocessor vendor selection. The synthetic respondents gave textbook answers about the importance of sustainability. The real engineers said: "Sustainability matters, but not when we can't get the parts we need for months on end." This critical insight about prioritising availability over sustainability in supply chain disruptions was completely invisible to the AI.

Ethical concerns are equally significant. Without transparency about the use of synthetic data, clients may not realise their insights come from AI rather than real people. This raises fundamental questions about research integrity and trust. Industry bodies like the Market Research Society are working on guidelines, but regulation lags behind the technology.

Perhaps most telling is the sentiment among researchers themselves. Rival Group's own study found that 42.75% of market researchers are "not excited" about using synthetic respondents, despite enthusiasm for other AI applications in research.

Our view: understanding the fundamental limitations

We have experimented with synthetic respondents internally at Skimle, for example to create dummy data for analysis (consultation responses to a new mall building project, and a fictional due diligence on ToyMaker, a company producing toys for Santa Claus), and to have tools such as Claude Code assess our website and give feedback on what to add, change or delete. Through this work, and through discussions with market research companies and the companies that buy market research, we have started to develop a perspective on where the technology can add value and where it fails fundamentally.

Limitation 1: Trained on the past, blind to genuinely novel experiences

Synthetic respondents are fundamentally backward-looking. They are trained on existing data about how people have reacted to existing products, features, and experiences. This makes them reasonably good at telling you whether something matches established patterns, but terrible at evaluating genuinely novel concepts.

We use synthetic respondents internally to validate non-differentiating elements. Does our landing page include all the critical components that high-performing SaaS websites typically have? Are our blog posts on thematic analysis covering the standard topics that researchers expect? For these "solved problems" where we are trying to emulate existing best practices, synthetic feedback can be useful.

But imagine you have developed a genuinely innovative feature that changes how people think about qualitative data analysis. Something that triggers a "wow, I never thought about it that way" response in real users. Synthetic respondents will not spot this. They lack the underlying human experiences and cognitive processes that make unexpected innovation resonate. They evaluate new things through the lens of old patterns.

The same applies to research seeking genuinely novel insights. If you are exploring an emerging market, understanding evolving customer needs, or investigating how people adapt to new technologies, synthetic respondents will give you yesterday's wisdom, not tomorrow's understanding. Real qualitative research exists precisely to discover what you do not already know. Synthetic respondents can only reflect what the training data already contains in some shape or form.

Limitation 2: Credible responses without lived experience

When we tested synthetic respondents on questions about analysing open text survey responses, they generated articulate, plausible answers. They correctly identified pain points like "time-consuming manual coding" and "difficulty identifying meaningful patterns." These are real problems that appear frequently in research about research methods.

But here is what they missed: the specific frustration of being 200 responses into coding and realising your category framework is fundamentally flawed and you need to start over. The moment of doubt when a quote seems to fit two categories equally well. The satisfaction of spotting an unexpected pattern that no one in your team anticipated. These experiential details matter because they reveal not just what people think, but how they think and what truly motivates behaviour.

Synthetic respondents produce responses that sound credible because they have seen millions of similar conversations. But they do not have the underlying experiences that generate genuine insight. A synthetic persona representing a project manager might correctly list frustrations with collaboration tools, but cannot convey the visceral annoyance of a tool crashing during a client presentation because that is a lived moment, not a pattern in text.

This matters for research quality. When real participants struggle to articulate something, contradict themselves, or share tangential stories, these messy human moments often contain the most valuable insights. Proper qualitative analysis depends on engaging with this complexity. Synthetic respondents give you clean, coherent responses devoid of the productive messiness that characterises real human communication.

Limitation 3: The sycophancy problem and calibrated emotions

One of the most discussed challenges with large language models is sycophancy - the tendency to agree with the user and provide overly positive responses. You can tune this behaviour in synthetic respondents, making them more critical or negative. But this creates a different problem: artificial grumpiness.

The fundamental issue is that human positivity and negativity are not parameters you can calibrate. They emerge from genuine experiences, preferences, and emotional responses. When someone says they love a product, that enthusiasm (or lack thereof) conveys real information about product-market fit, emotional engagement, and likelihood to recommend.

With synthetic respondents, you are choosing a positivity setting rather than measuring actual sentiment. If you tune them to be more critical to avoid sycophancy, you do not know whether their negativity reflects what real users would feel or just your calibration choices. If you leave them more positive, you cannot distinguish genuine enthusiasm from AI agreeableness.
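
To see why this is a calibration choice rather than a measurement, consider a simplified sketch of our own (not any vendor's implementation): the "criticality" of a synthetic respondent is typically just an instruction baked into the prompt.

    def build_persona_prompt(persona: str, criticality: float) -> str:
        """Return a system prompt with a tunable negativity level (0.0 to 1.0).

        Illustrative only: the criticality knob is a setting chosen by the
        researcher, not a measurement of what real users feel.
        """
        if criticality > 0.7:
            tone = "Be sceptical and point out flaws before anything else."
        elif criticality > 0.3:
            tone = "Give a balanced view with both positives and negatives."
        else:
            tone = "Respond warmly and focus on what you like."
        return f"{persona} {tone}"

Whatever value you pick, the "sentiment" that comes back reflects that choice, not your market.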

This makes synthetic respondents particularly problematic for research on taste, emotional response, and buying intent. Would customers actually pay for this premium feature? Do users genuinely prefer design A over design B, or just find both acceptable? These questions require measuring real human preferences, not AI approximations of what preferences might look like.

The "non-bullshit answer" matters. Real research participants tell you when something is mediocre, unnecessary, or solving a problem they do not actually have. They express enthusiasm or indifference in ways that reveal true engagement levels. Synthetic respondents give you responses that fit expected patterns, not messy human truth.

Where synthetic respondents can be useful

Despite these limitations, synthetic respondents do have legitimate applications when used appropriately and transparently.

Testing interview guides and research instruments: Before fielding a study with real participants, you can use synthetic respondents to test whether your questions are clear, whether they elicit useful responses, and whether your interview flow makes sense. This is similar to piloting research, but faster and cheaper. Just remember that real pilots often reveal unexpected issues that synthetic testing misses.

Validating against known criteria for commoditised features: If you are building something that should match established standards, synthetic respondents can help verify completeness. Does your SaaS checkout flow include all the trust signals that successful competitors use? Does your product onboarding cover the typical steps users expect? For these checklist-style validations where the right answer is well-established, synthetic feedback can save time.

Exploring question phrasings and scenario variations: When designing complex research, you might want to test different ways of asking questions or presenting scenarios. Synthetic respondents let you rapidly iterate on research design without burning through participant pools. This is particularly useful for complex studies where question wording significantly impacts responses.
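
As a concrete illustration of the first of these use cases, the sketch below runs a draft interview guide past a synthetic persona and flags questions that come back confused or near-empty. It is a rough pilot harness of our own invention, again assuming the OpenAI Python SDK, and it supplements rather than replaces a real pilot.

    from openai import OpenAI

    client = OpenAI()

    persona = "You are a project manager at a mid-sized software company."
    draft_guide = [
        "How do you currently run project retrospectives?",
        "Walk me through the last time a tool failed you during a client meeting.",
        "How do you feel about synergy?",  # deliberately vague, to test the flagging
    ]

    for question in draft_guide:
        answer = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": persona},
                {"role": "user", "content": question},
            ],
        ).choices[0].message.content

        # Crude signals that a question may need rewording before real fieldwork
        if len(answer.split()) < 20 or "not sure" in answer.lower():
            print(f"Review wording before fielding: {question}")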

Where synthetic respondents fail: real research on new things

Synthetic respondents fundamentally cannot replace humans for research aimed at building genuine knowledge or understanding novel phenomena.

If you are exploring how people adapt to a genuinely new technology, understanding emerging customer needs in a changing market, investigating why a product is unexpectedly succeeding or failing, or seeking insights to drive innovation rather than imitation, you need real human participants. Full stop.

The most valuable research insights come from understanding what you do not already know. Synthetic respondents can only tell you variations on what is already in their training data. Real humans surprise you. They make connections you did not anticipate, express needs you had not considered, and reveal friction points that no existing research has documented.

Moreover, research is often about understanding not just what people think, but why they think it, how their thinking evolves, and what underlying experiences shape their perspectives. These deeper layers of understanding require engaging with real human complexity, not pattern-matched approximations.

The real efficiency gains: interviewing and analysis

There is genuine irony in the synthetic respondents trend. The market research industry is pursuing a technology that saves time on recruitment and data collection, while leaving researchers with the parts where the real work actually happens: designing good research, asking the right questions, and making sense of responses.

The actual efficiency bottlenecks in qualitative research are not the interviews themselves. Most researchers would happily spend more time talking to real customers if they could. The pain points are:

  • The time required to analyse interview transcripts systematically
  • The weeks needed to code hundreds of open-ended survey responses
  • The challenge of identifying patterns across dozens of conversations
  • The difficulty of maintaining rigour while moving quickly

These are precisely the problems that AI can actually solve well in qualitative research when applied appropriately. Instead of replacing humans with synthetic approximations, use AI to make human insights more accessible and actionable.
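
For example, a few lines of code can already assign draft codes to real open-ended responses while keeping every label linked back to its source quote. The sketch below is a deliberately simplified illustration of that idea, not how Skimle or any other tool implements it; the responses and codebook are invented.

    from openai import OpenAI

    client = OpenAI()

    # Invented examples standing in for real open-ended survey responses
    responses = [
        "I spend most of my week re-coding transcripts after the framework changes.",
        "Scheduling interviews is fine; making sense of 40 of them is the hard part.",
    ]

    codebook = ["recruitment", "analysis workload", "framework revision", "other"]

    coded = []
    for idx, text in enumerate(responses):
        label = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": (
                    f"Assign exactly one code from {codebook} to this response "
                    f"and reply with the code only:\n{text}"
                ),
            }],
        ).choices[0].message.content.strip()

        # Keep the link back to the source quote so every finding stays traceable
        coded.append({"response_id": idx, "code": label, "quote": text})

    print(coded)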

Tools designed for proper qualitative analysis, like Skimle, can compress weeks of coding and analysis into days whilst maintaining methodological rigour. They can help you systematically categorise responses, identify patterns, and generate insights whilst preserving full transparency from every finding back to source data. This is where the real time savings and efficiency gains exist, not in replacing research participants with AI.

The bottom line: complement, not replacement

Synthetic respondents are a tool, not a revolution. Used transparently for appropriate applications like testing research instruments or validating against established standards, they can save time and money. Used as a replacement for real human research on genuine questions of customer understanding and innovation, they are a dangerous shortcut that produces plausible-sounding nonsense.

If you are considering synthetic respondents for your research, ask yourself: Am I trying to validate that I have met a known standard, or am I trying to learn something genuinely new? The first might be appropriate for synthetic feedback. The second requires real humans.

And if you are drowning in real human research data that you cannot analyse quickly enough, the solution is not synthetic respondents. The solution is better analysis tools that help you make sense of human insights faster.

Ready to analyse your qualitative research data with both speed and rigour? Try Skimle for free and experience systematic AI-assisted analysis with full transparency from every insight back to source data.

Want to learn more about proper qualitative analysis? Read our guides on how to analyse interview transcripts, thematic analysis methodology, and using AI responsibly in qualitative research.


About the author

Olli Salo is a former Partner at McKinsey & Company, where he spent 18 years helping clients understand their markets and themselves, develop winning strategies, and improve their operating models. He has conducted over 1,000 client interviews and published over 10 articles on McKinsey.com and beyond. LinkedIn profile