How many interviews is enough for qualitative research? What the evidence says

Most guidance on qualitative sample sizes is vague. Here is what the actual research on data saturation shows — and how to decide for your specific study.


For most qualitative research, somewhere between 12 and 30 interviews will be sufficient for data saturation, depending on your research question, method, and the diversity of your participants. The two most-cited empirical studies on this — Guest, Bunce & Johnson (2006) and Hennink, Kaiser & Marconi (2017) — found that code saturation typically occurs between 9 and 17 interviews, while meaning saturation (the deeper level where new interviews add no new understanding) occurs between 16 and 24. Simpler, more focused research questions with homogeneous participants saturate faster. Complex questions with diverse populations require more.

If you want the short version for planning purposes: 15-20 interviews is a reasonable default for most applied qualitative research. Under 10 is usually too few unless the research question is very narrow; over 40 is rarely necessary and often signals a scope problem rather than a rigour one.

Why there is no single correct answer

The honest instinct behind the question "how many interviews do I need?" is usually a desire for certainty — a number you can put in a methods section or a project plan. The research on saturation tells us something both reassuring and mildly inconvenient: there is no universal number, but there are principled ways to decide.

The concept at the heart of all sample size thinking in qualitative research is data saturation: the point at which additional interviews stop producing new themes, codes, or understandings. When you are hearing the same things in different words, you have probably reached saturation for that research question with that population.

The challenge is that saturation is a property of the completed dataset, not something you can calculate in advance. You have to get there to know you are there — which is why the empirical studies on saturation are so useful. They give us calibrated estimates based on real data.

What the research actually says

The two most-cited empirical investigations of interview saturation are worth understanding in some detail, because they are often cited loosely ("12 interviews is enough!") in ways that strip out the nuance.

Guest, Bunce & Johnson (2006)

In a study published in Field Methods, Guest and colleagues analysed 60 interviews conducted with women in two African countries on a narrow, well-defined health topic. They tracked when new codes stopped appearing and when new thematic insights stopped emerging.

Their finding: code saturation occurred by the 12th interview in both sites. Within the first six interviews, the researchers had identified 80% of the codes that would appear in the full 60-interview dataset.

This is the source of the often-cited claim that "12 interviews is enough." What the citation usually omits:

  • The research question was narrow and well-defined
  • The participant population was relatively homogeneous
  • The researchers were coding for a specific health topic, not exploring an open theoretical territory
  • The study was conducted by trained qualitative researchers with extensive pre-fieldwork familiarity with the context

Extrapolating "12 interviews is always enough" from this study is like extrapolating "a 400m run takes under a minute" from a sprint study — true in context, misleading in general.

Hennink, Kaiser & Marconi (2017)

Hennink and colleagues, in a study published in Qualitative Health Research, made a critical distinction that the earlier literature had missed: the difference between code saturation and meaning saturation.

  • Code saturation: no new codes are appearing (new interviews are producing codes you have already seen)
  • Meaning saturation: the existing codes are fully developed and nuanced (new interviews are adding nothing to your understanding of what a code means, why it occurs, or how it relates to other codes)

Their findings:

  • Code saturation occurred between 9 and 17 interviews
  • Meaning saturation occurred between 16 and 24 interviews

The gap between these two numbers matters enormously. You might reach code saturation at 12 interviews and think you are done. But the meaning attached to those codes — the context, the nuance, the understanding of why participants experience something a particular way — may still be developing.

For academic research, where the depth of theoretical understanding is the output, meaning saturation is the relevant standard. For applied research in business, consulting, or policy, code saturation may be sufficient if the goal is to identify themes rather than develop theoretical explanations.

How sample size varies by method and purpose

The saturation research above applies most directly to semi-structured interview studies with a relatively focused research question. Different methods and purposes lead to different requirements.

Grounded theory

Grounded theory, in its classical form, does not specify a sample size in advance. You collect and analyse simultaneously, continuing until theoretical saturation is reached — the point at which new data no longer extends or refines your emerging theory. In practice, well-executed grounded theory studies typically involve 20-50 interviews, but the number follows from the theory's needs rather than being decided in advance.

Phenomenological research

Phenomenological studies aim to describe the essential structure of a lived experience. Because the goal is depth of understanding of a single phenomenon, sample sizes are typically smaller — commonly 6-15 participants. Creswell's frequently cited guidance suggests 5-25 for phenomenological studies, with the lower end appropriate when participants are interviewed multiple times.

Case studies

A case study focuses on a specific bounded context (an organisation, a project, a community). Interview numbers vary widely based on the case's complexity, but 10-30 interviews per case is typical. Multi-site case studies multiply this accordingly.

Applied qualitative research (consulting, UX, HR)

In applied contexts — exit interview programmes, user research, customer discovery, employee listening — the goal is usually actionable insight rather than theoretical contribution. Code saturation is the relevant standard, and 15-25 interviews with a well-structured guide often provides sufficient coverage of the main themes. See our guide on practical interview setup for how to structure applied interview programmes.

Diverse vs homogeneous populations

Research question diversity and participant diversity interact:

  • Narrow question, homogeneous population: saturation comes early (8-12 interviews)
  • Broad question, homogeneous population: saturation may require 15-20 interviews
  • Narrow question, diverse population: 15-20 interviews depending on how diverse
  • Broad question, diverse population: 25-40 interviews may be needed, sometimes more
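As a quick planning aid, these heuristics can be expressed as a simple lookup table. The Python sketch below simply restates the list above; the category labels and ranges are the article's rules of thumb, not empirical constants.

```python
# Illustrative lookup of the planning heuristics above.
# Ranges are rules of thumb, not empirical constants.
SUGGESTED_RANGE = {
    ("narrow", "homogeneous"): (8, 12),
    ("broad",  "homogeneous"): (15, 20),
    ("narrow", "diverse"):     (15, 20),
    ("broad",  "diverse"):     (25, 40),  # sometimes more
}

low, high = SUGGESTED_RANGE[("broad", "diverse")]
print(f"Plan for roughly {low}-{high} interviews")
```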

If you are studying a specific experience in a specific organisational context (exit interviews from a single company), 15 interviews might be sufficient. If you are studying how different types of professionals experience AI tools across industries, 30-40 interviews may still leave you feeling you are missing perspectives.

Practical guidance for deciding your sample size

Here is a framework for working through the decision rather than guessing a number:

Step 1: Define your research question precisely. Vague questions produce vague sampling decisions. "Understanding customer satisfaction" is not a research question. "Understanding what factors cause high-tenure customers to consider switching in the first six months after a contract renewal" is.

Step 2: Identify the population diversity you need to capture. Are all your participants likely to have similar experiences (same role, same organisation, same product)? Or do you expect significant variation across subgroups that you need to capture? Each meaningful subgroup typically needs at least 3-5 participants to generate reliable themes.
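To make Step 2 concrete, here is a minimal sketch of the subgroup arithmetic. The subgroup names are purely illustrative, and the 3-5 figure is the heuristic stated above.

```python
# Illustrative subgroup-driven sample floor (subgroup names are hypothetical).
MIN_PER_SUBGROUP = 3     # lower bound of the 3-5 heuristic
TARGET_PER_SUBGROUP = 5  # upper bound, a safer planning figure

subgroups = ["engineering", "sales", "operations", "leadership"]

floor = len(subgroups) * MIN_PER_SUBGROUP      # 4 subgroups -> 12 interviews
target = len(subgroups) * TARGET_PER_SUBGROUP  # 4 subgroups -> 20 interviews

print(f"Plan for {floor}-{target} usable interviews across {len(subgroups)} subgroups")
```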

Step 3: Choose the appropriate saturation standard. Code saturation for applied research; meaning saturation for theoretical or academic research.

Step 4: Plan for attrition and poor interviews. In any interview programme, some interviews will be cut short, produce unusable transcripts, or turn out to be with participants who were outside your actual scope. Build in 15-20% buffer above your target. If you need 20 usable interviews, plan to conduct 24-25.
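The buffer arithmetic in Step 4 is worth getting roughly right rather than eyeballing. A minimal sketch, assuming an 18% attrition rate (the middle of the 15-20% range above):

```python
import math

def planned_interviews(target_usable: int, attrition_rate: float = 0.18) -> int:
    """Gross up a target of usable interviews by an assumed attrition rate."""
    return math.ceil(target_usable / (1 - attrition_rate))

print(planned_interviews(20))  # 20 / 0.82 -> 25 interviews to schedule
```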

Step 5: Conduct a sequential analysis. Rather than waiting until all interviews are complete to analyse, analyse in batches. If you are aiming for 20 interviews and find after 15 that you have not heard a genuinely new code in the last five, you may have reached saturation early. If you are still discovering entirely new themes at 20, you need more. The thematic analysis methodology guide covers how to do this iteratively.
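One way to operationalise Step 5 is to track how many previously unseen codes each new interview contributes, and stop when a trailing window contributes none. The sketch below assumes you already have a set of code labels per transcript; the stopping rule (no new codes in the last five interviews) mirrors the heuristic in the step above and is an assumption, not a published standard.

```python
def new_codes_per_interview(interviews: list[set[str]]) -> list[int]:
    """Count how many previously unseen codes each interview contributes."""
    seen: set[str] = set()
    counts = []
    for codes in interviews:
        counts.append(len(codes - seen))
        seen |= codes
    return counts

def code_saturated(interviews: list[set[str]], window: int = 5) -> bool:
    """Heuristic code saturation check: no new codes in the last `window` interviews."""
    counts = new_codes_per_interview(interviews)
    return len(counts) >= window and sum(counts[-window:]) == 0

# Hypothetical example: codes assigned to each of six transcripts so far.
batches = [
    {"workload", "manager_support"},
    {"workload", "career_growth"},
    {"career_growth", "pay"},
    {"workload", "pay"},
    {"manager_support", "pay"},
    {"workload", "career_growth"},
]
print(new_codes_per_interview(batches))   # [2, 1, 1, 0, 0, 0]
print(code_saturated(batches, window=3))  # True: nothing new in the last 3
```

Note that this only detects code saturation; meaning saturation, in the Hennink et al. sense, still requires a researcher's judgement about whether the existing codes are fully developed.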

The role of AI tools in larger interview sets

One practical implication of these saturation norms is that the cost of conducting more interviews — in analyst time — has changed significantly.

Manually coding 25 interview transcripts takes an experienced researcher 3-4 full days. Coding 40 takes proportionally longer. This time cost historically pushed researchers toward the lower end of the defensible range even when a larger sample would have been theoretically better.

AI-assisted analysis tools change this equation. Skimle can process 25 or 40 interviews in the time it takes to read one, producing a structured theme hierarchy with every quote attached. The manual work shifts from coding to reviewing and interpreting the AI's initial structure. See how Skimle handles thematic analysis and the AI in qualitative research guide for academics for how to document AI assistance in ways that satisfy peer reviewers.

This does not mean you should automatically aim for larger samples. The point is that the sample size decision should be driven by the research question and the saturation standard, not by the time cost of analysis.

What peer reviewers will ask

If you are writing up qualitative research for publication, your methods section needs to address sample size with more than "interviews were conducted until saturation was reached." Reviewers expect:

  • A statement of which saturation standard you applied (code vs meaning saturation)
  • Reference to the theoretical or empirical basis for that standard
  • Acknowledgement of the characteristics of your sample that affected the saturation process
  • If you stopped before reaching saturation (which is sometimes reasonable), an explanation of why

The most defensible framing references the empirical literature: "We conducted interviews until meaning saturation was reached (Hennink et al., 2017), with new theoretical understanding ceasing to emerge after [n] interviews." Our guide on how to write up a thematic analysis covers the methods section in full.

A note on sample size in applied research

The question "how many interviews is enough?" sounds the same in academic and applied contexts but is asking slightly different things.

In academic research, "enough" means: sufficient to make credible theoretical claims and withstand peer review.

In applied research, "enough" means: sufficient to identify the patterns that should drive decisions.

In applied contexts — user research, HR listening programmes, commercial due diligence — the saturation standard is more forgiving, but the stakes are different. A flawed sample in an academic study gets the paper returned by peer reviewers. A flawed sample in commercial due diligence might lead to a bad investment decision. The practical guide on qualitative research sample size covers the applied planning considerations in more detail.

The evidence from Guest et al. and Hennink et al. should give you more confidence to stop at 15-20 interviews for a well-focused research question, rather than defaulting to 30 or 40 out of vague anxiety about rigour. Larger samples are not always better. Rigorous analysis of a well-planned, appropriately-sized sample is what produces reliable findings.

Ready to analyse your qualitative interview data systematically? Try Skimle for free and see how structured AI-assisted thematic analysis handles the coding process while keeping every theme traceable to its source.

About the authors

Henri Schildt is a Professor of Strategy at Aalto University School of Business and co-founder of Skimle. He has published over a dozen peer-reviewed articles using qualitative methods, including work in Academy of Management Journal, Organization Science, and Strategic Management Journal. His research focuses on organisational strategy, innovation, and qualitative methodology. Google Scholar profile

Olli Salo is a former Partner at McKinsey & Company, where he spent 18 years helping clients understand their markets and themselves, develop winning strategies, and improve their operating models. He has conducted over 1,000 client interviews and published over 10 articles on McKinsey.com and beyond. LinkedIn profile


Sources

Guest, G., Bunce, A., & Johnson, L. (2006). How many interviews are enough? An experiment with data saturation and variability. Field Methods, 18(1), 59-82.

Hennink, M. M., Kaiser, B. N., & Marconi, V. C. (2017). Code saturation versus meaning saturation: How many interviews are enough? Qualitative Health Research, 27(4), 591-608.