The Critical Incident Technique: a practical guide for UX and market research

The Critical Incident Technique (CIT) asks for specific remembered events, not general opinions. What it is, how to run it, and how to analyse incidents at scale.

The Critical Incident Technique (CIT) asks participants to recall a specific, remembered event where something positively or negatively affected an outcome, instead of asking for a general opinion. Developed by John Flanagan in 1954, it surfaces concrete detail ("the time the app crashed during checkout") that generic questions ("what do you think of the app?") rarely produce, and the incidents are then coded and grouped to find recurring patterns.

If you have ever asked "what do you think of X" and got back a vague, polite non-answer, CIT is the fix for that specific problem. It is one of the oldest interviewing techniques in applied psychology, and one of the most underused outside academic and aviation-safety circles, despite being a natural fit for UX and market research.

What is the Critical Incident Technique?

John Flanagan introduced CIT formally in 1954 in the journal Psychological Bulletin, building on work he had done during the Second World War analysing why pilots failed training. That work depended on specific remembered incidents rather than general impressions of "good" and "bad" pilots. The technique has since spread well beyond aviation, into healthcare, education, and, increasingly, UX and market research.

The defining feature is the word "critical." A critical incident is not just any event a participant happened to experience. It is one where the participant perceives a clear causal link between what happened and the outcome being studied: this specific moment made the task easier, or this specific moment made it fail. That causal framing is what separates CIT from an ordinary open-ended question.

Why a critical incident question gets better answers than a generic one

A useful way to see the difference is to compare three ways of asking about the same topic, as Nielsen Norman Group lays out in their guide to CIT:

| Question type | Example | What it actually captures |
| --- | --- | --- |
| Generic | "Tell me about a time you used the tool" | No directional focus; participant picks at random |
| Recency-based | "Tell me about the last time you used the tool" | Recency, not significance |
| Critical incident | "Tell me about a time the tool helped you be especially effective at your work" | A causally significant moment the participant can actually defend |

The critical incident framing forces specificity. A participant cannot answer "the time it made me effective" with a shrug; they have to retrieve an actual memory, which comes with concrete detail: what they were doing, what led up to it, what happened, and why it mattered. That specificity is exactly what makes interview data analysable rather than a pile of vague sentiment.

How to run a Critical Incident Technique study

1. Define what you're actually studying. Before writing a single question, be precise about the outcome you care about: task effectiveness, satisfaction with a specific feature, trust in a process, safety in a workflow. CIT works only when the "critical" half of the question has something concrete to be critical about.

2. Collect both positive and negative incidents. Ask for a moment where something worked well, and separately, a moment where it did not. The structure Nielsen Norman Group (NN/g) recommends starts with the positive version, which tends to build rapport before asking about failures. It then asks for clarification (what task were you doing, why did you choose this approach, what happened next) and finally asks whether the participant can think of another instance, since most participants will have more than one.

3. Expect several incidents per participant, and plan for the volume. Because each person can usually surface several incidents, even a moderate sample of 15-25 participants can generate hundreds of individual incidents. This is the part of CIT that quietly becomes a data-volume problem: the technique is excellent at generating rich, specific data, and that richness is exactly what makes hand-coding it slow.

4. Code the incidents and look for saturation. Each incident gets coded into a category, and researchers track how many incidents land in each category. When new incidents keep landing in categories you already have, saying roughly the same thing, and stop producing new categories, that is the saturation signal CIT relies on for validity: a requirement, a failure mode, or a need is real and recurring, not a one-off.

5. Watch the known weaknesses. CIT depends on memory, and memory is selective. Participants tend to recall extreme events more easily than ordinary ones, so a CIT study will reliably surface real edge cases and rare failure modes, but it will not give you an accurate picture of typical, everyday use. Treat it as a tool for finding what matters most, not a representative sample of what happens most often.
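The bookkeeping in step 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed CIT procedure: the `Incident` record, its field names, and the category labels are all hypothetical, and the coding itself (assigning a category to each incident) is assumed to have been done by a researcher beforehand.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """One coded critical incident (field names are illustrative)."""
    participant: str
    valence: str   # "positive" or "negative"
    category: str  # label assigned by the researcher during coding


def tally_by_category(incidents):
    """Count how many incidents landed in each (category, valence) pair."""
    return Counter((i.category, i.valence) for i in incidents)


# Hypothetical coded incidents, for illustration only.
incidents = [
    Incident("P1", "negative", "export fails"),
    Incident("P1", "positive", "fast search"),
    Incident("P2", "negative", "export fails"),
    Incident("P3", "negative", "export fails"),
    Incident("P3", "negative", "confusing permissions"),
]

counts = tally_by_category(incidents)
for (category, valence), n in counts.most_common():
    print(f"{category} ({valence}): {n}")
```

Keeping valence next to the category means positive and negative incidents about the same topic are counted separately, which matches the advice in step 2 to collect both.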

A worked example

Suppose you are studying why customers abandon a checkout flow. A generic question ("what do you think of checkout?") gets you shrugs and mild complaints about page load speed. A critical incident question gets something different: "tell me about a time you started buying something online and a specific moment in checkout made you give up."

One participant describes a specific moment: they were buying a gift, got to the shipping address step, the form rejected a valid postal code three times, and they closed the tab and bought from a competitor instead. That is a critical incident. It comes with a task (buying a gift), a specific failure point (the address validation step), an emotional reaction (frustration, then giving up), and a consequence (lost sale, to a named alternative). None of that level of detail comes from a satisfaction score or a generic complaint.

Collect 200 of these across 20 participants and a pattern usually emerges fast: several incidents cluster around address validation, several more around unexpected shipping costs appearing late, and a handful around payment failures. Each cluster, once it has enough incidents supporting it, is a concrete, evidence-backed argument for a specific fix, not a vague impression that "checkout could be smoother."

Where Critical Incident Technique fits in product and market research

CIT shows up naturally in a few recurring situations:

  • Understanding why a feature or product fails people, by asking for the specific moment things went wrong rather than a satisfaction score
  • Service design and customer experience research, identifying which specific touchpoints make or break a customer's experience
  • Win-loss analysis, asking sales prospects for the specific incident that tipped a decision one way or another, rather than a generic "why did you choose us / them"
  • Customer discovery, surfacing the specific moment a workaround or frustration became unbearable enough to go looking for a new solution

It pairs naturally with a well-written interview guide: the critical incident questions are the core probe, and good follow-up questions (what were you trying to do, what happened next, how did it make you feel) are what turn a remembered moment into usable data. Product teams and UX researchers tend to get the most consistent use out of CIT, since it maps directly onto the specific failure moments a roadmap decision needs evidence for.

Analysing critical incidents at scale

The methodological strength of CIT, getting hundreds of specific, detailed incidents instead of a handful of vague opinions, is also its practical bottleneck. Coding 300 critical incidents by hand, checking which ones belong to the same underlying category, and tracking saturation across categories is exactly the kind of systematic, repetitive coding work that consumes weeks when done manually.

This is a case where AI-native thematic analysis earns its keep specifically because the method already produces well-structured, causally framed data. Skimle reads every incident, builds a category structure from the content itself rather than starting from a fixed codebook, and shows exactly how many incidents support each category, with every category traceable back to the specific incidents that produced it. Saturation, the core analytical signal in CIT, becomes something you can see directly in the data rather than something you estimate by feel partway through manual coding.

If your incidents come with structured metadata (which department, which product tier, which customer segment), filtering categories by that metadata shows whether a specific failure mode is universal or concentrated in one group, which is often a more actionable finding than the raw incident count.
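As a rough sketch of that kind of breakdown, the Python below computes, for each category, what share of its incidents came from each segment. The function name, the `(category, segment)` pair format, and the sample data are assumptions made for illustration; they do not describe how any particular tool implements this.

```python
from collections import Counter, defaultdict


def category_share_by_segment(incidents):
    """For each category, return the fraction of its incidents per segment.

    `incidents` is an iterable of (category, segment) pairs.
    """
    per_category = defaultdict(Counter)
    for category, segment in incidents:
        per_category[category][segment] += 1
    return {
        category: {seg: n / sum(segs.values()) for seg, n in segs.items()}
        for category, segs in per_category.items()
    }


# Hypothetical (category, customer segment) pairs, for illustration only.
incidents = [
    ("address validation", "enterprise"),
    ("address validation", "enterprise"),
    ("address validation", "enterprise"),
    ("address validation", "self-serve"),
    ("late shipping cost", "enterprise"),
    ("late shipping cost", "self-serve"),
]

shares = category_share_by_segment(incidents)
for category, segs in shares.items():
    print(category, segs)
```

A lopsided share only signals concentration if the segments contributed broadly similar numbers of incidents overall, so it is worth comparing each share against the segment's share of the whole sample.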

Frequently asked questions

How is the Critical Incident Technique different from a regular interview?

A regular interview can ask about opinions, habits, or general impressions. CIT specifically asks for a remembered event where the participant perceived a causal link between what happened and an outcome. That causal, specific framing produces concrete, detailed answers that are easier to code and analyse than general sentiment.

How many participants do I need for a Critical Incident Technique study?

There is no fixed number; researchers typically watch for saturation, the point where new participants stop surfacing new categories of incident. Studies commonly involve 15-25 participants, though each contributes multiple incidents, so the total incident count is usually much higher than the participant count.

Can Critical Incident Technique be used in a survey, not just an interview?

Yes. Open-ended survey questions can use critical incident framing ("describe a specific time when..."), though follow-up probing, which surfaces the most useful detail in CIT, is harder in a static survey. Skimle Ask can ask a critical incident question and then probe further based on what the respondent says, closer to the interview version of the technique than a fixed-question survey.

What are the main limitations of CIT?

It depends on participant memory, which favours dramatic or unusual events over routine ones, so it is better at finding what matters most than at estimating how often something typically happens. It also requires participants to correctly identify a causal link, which is not always reliable.

How do I know when I have collected enough critical incidents?

Watch for saturation rather than counting toward a fixed target: once new interviews keep landing in categories you have already seen repeatedly and stop producing new categories, you have likely covered the main requirements or failure modes. If every new participant is still surfacing a category you have not seen before, keep going.
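One way to make that check concrete is to count how many new categories each interview introduces, in the order the interviews were run. The sketch below is a simple heuristic, not a standard CIT formula: the function names, the sample data, and the three-interview window are all assumptions chosen for illustration.

```python
def new_categories_per_interview(interviews):
    """Return, for each interview in order, how many categories it introduced.

    `interviews` is a list of category lists, one list per participant,
    in the order the interviews were conducted.
    """
    seen = set()
    new_counts = []
    for categories in interviews:
        fresh = set(categories) - seen
        new_counts.append(len(fresh))
        seen |= fresh
    return new_counts


def looks_saturated(new_counts, window=3):
    """True if the last `window` interviews introduced no new category.

    The window size is an arbitrary assumption, not a CIT standard.
    """
    return len(new_counts) >= window and sum(new_counts[-window:]) == 0


# Hypothetical coded interviews, for illustration only.
interviews = [
    ["address validation", "late shipping cost"],
    ["address validation", "payment failure"],
    ["late shipping cost"],
    ["address validation"],
    ["payment failure", "late shipping cost"],
]

new_counts = new_categories_per_interview(interviews)
print(new_counts)                   # [2, 1, 0, 0, 0]
print(looks_saturated(new_counts))  # True
```

A run of zeros at the end of `new_counts` is the pattern described above; a non-zero value in the most recent interviews is the cue to keep going.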


Want to analyse a large set of critical incidents without weeks of manual coding? Try Skimle for free and see how fast a structured category view with saturation built in comes together.

Planning the interview itself? Read how to write a strong interview guide and how to conduct effective business interviews.


About the authors

Henri Schildt is a Professor of Strategy at Aalto University School of Business and co-founder of Skimle. He has published over a dozen peer-reviewed articles using qualitative methods, including work in Academy of Management Journal, Organization Science, and Strategic Management Journal.

Olli Salo is a co-founder at Skimle and a former Partner at McKinsey & Company, where he spent 18 years helping clients understand their markets and themselves, develop winning strategies, and improve their operating models. He has done over 1000 client interviews and published over 10 articles on McKinsey.com and beyond.

