Three problems have always plagued people who try to analyse interview transcripts, open-text survey responses, or other qualitative data using general-purpose AI tools: hallucinations, context window limits that restrict how much material you can actually process, and the black-box nature of the output. Each of these is serious.
Together, these three issues make most AI-generated qualitative analysis indefensible. It is no wonder that many academic researchers, consultants and market researchers have found basic AI unreliable for serious work and are questioning the value it can bring.
Skimle's architecture was built specifically to eliminate all three. Here is what each problem means in practice, and how we address it.
The hallucination problem: AI invents evidence
Large language models generate plausible text. That is their core function. They are not retrieval systems, and they do not verify what they produce against source material. The consequence for qualitative research is real: a model analysing a company's employee survey responses has been trained on vastly more data than your specific corpus. When generating a summary or a list of themes, it draws on both your material and its training data simultaneously, in ways you cannot see or control.
In practice, this means quoted evidence can be fabricated. A model might produce a quote that sounds entirely consistent with your data but does not exist in it. For research that will inform decisions, this is a fundamental credibility problem.
Skimle's approach is "trust but verify". After the AI identifies a passage as relevant, classic programming checks that the quote appears verbatim in the source document. If it does not match, it is not included. Every insight in Skimle is backed by an exact quote you can verify in the original document. This is the foundation of two-way transparency: you can move from any theme down to the supporting quotes, and from any document up to the categories it contributed to.
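Skimle's internal checks are not public, but the core idea is a deterministic, non-AI verification step. A minimal sketch (function names and the whitespace-normalisation detail are illustrative, not Skimle's actual code) might look like:

```python
import re

def normalize(text: str) -> str:
    """Collapse runs of whitespace so line breaks in the source do not cause false mismatches."""
    return re.sub(r"\s+", " ", text).strip()

def quote_is_verbatim(quote: str, source: str) -> bool:
    """Accept a quote only if it appears word-for-word in the source document."""
    return normalize(quote) in normalize(source)

source = "Prices went up last year. Availability in rural areas has declined."
assert quote_is_verbatim("Availability in rural\nareas has declined.", source)
assert not quote_is_verbatim("Rural prices have declined.", source)  # plausible but fabricated
```

The point is that this step is ordinary programming, not another model call: a string either occurs in the document or it does not, so a fabricated quote cannot slip through.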
The context window problem: most AI tools only process a fraction of your data
Even the largest context windows have practical limits. With qualitative research, you often need to work with substantial corpora: 40 interview transcripts, 300 open-text survey responses, 80 expert call memos. Fitting all of that into a single prompt is frequently not possible. And even when it technically fits, research consistently shows that LLMs pay unequal attention across a long context, attending more to the beginning and end and significantly less to the middle, a pattern researchers have dubbed "lost in the middle".
Some tools tried to address this with RAG (retrieval-augmented generation), which uses semantic search to pre-select relevant passages before analysis. RAG works well for factual retrieval. It fails for qualitative analysis because thematic proximity and semantic proximity are different things. "Prices are too high" and "availability in rural areas has declined" are semantically distant but may both belong to the same theme of "critique towards the new taxi law". A retrieval step based on embeddings will miss the second one if you only searched for the first. We wrote about why RAG does not work for structured qualitative analysis in more detail in a separate post.
Skimle's approach is to process text in small chunks and build the category structure incrementally, just as an experienced researcher would work through a corpus. Each passage is read and coded on its own terms. The categories develop as more material is processed, and later passes refine and consolidate what earlier passes identified. The length of the source material has no effect on accuracy. Skimle can handle thousands of documents with the same rigour as a small set.
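Skimle's real coding step is an LLM call, but the control flow of incremental coding can be sketched without one. In this illustrative stub (the `RULES` keyword table stands in for the model and is entirely made up), each passage is coded on its own terms and the category structure accumulates as the corpus is processed:

```python
# Keyword stub standing in for the LLM coding call; purely illustrative.
RULES = {
    "pricing concerns": ["price", "cost", "expensive"],
    "availability": ["availab", "rural"],
}

def code_chunk(chunk: str) -> str:
    """Stand-in for the model: assign a passage to the first matching category."""
    for category, keywords in RULES.items():
        if any(kw in chunk.lower() for kw in keywords):
            return category
    return "uncategorised"

def analyse(corpus: list[str]) -> dict[str, list[str]]:
    """Read each passage on its own terms; the category structure grows as we go."""
    codebook: dict[str, list[str]] = {}
    for chunk in corpus:
        codebook.setdefault(code_chunk(chunk), []).append(chunk)
    return codebook

themes = analyse([
    "Prices are too high for daily use.",
    "Availability in rural areas has declined.",
])
```

Because each passage is handled in its own small step, the loop behaves the same on 2 chunks or 2,000: corpus length changes runtime, not attention.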
The black-box problem: you cannot defend what you cannot trace
This is the subtlest of the three problems, but often the most consequential for professional use.
When you upload a dataset to a general-purpose chatbot and ask it to identify themes, you receive a plausible-looking summary. But the summary is generated fresh each time you ask. Ask slightly differently and you get a different answer. Ask for supporting evidence for a specific claim and the model may apologise and tell you there are actually no quotes that support it. This is not a malfunction; it is how these systems work. There is no stable intermediate structure between your raw data and the chat output. Each response is generated from scratch against the full corpus.
This means the findings are not reproducible and not auditable. For research that will be shared, published, or used to support business decisions, that is a serious problem. ChatGPT can help explore qualitative data, but it cannot provide the structured, traceable output that defensible analysis requires.
Skimle codes each paragraph explicitly and stores that coding as a persistent, inspectable structure. You can see exactly what is coded where, verify the coding against the source, and revise it if you disagree. The categories view shows which documents contributed to each theme and which passages were coded to which categories. Every step of the analysis is visible and editable. And because the coded structure is permanent, you can export the full coded dataset as REFI-QDA (.qdpx) and open it in NVivo, MAXQDA, or ATLAS.ti. The analysis is not locked inside Skimle.
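Conceptually, the persistent structure is a flat list of coding records that can be queried in both directions. A hypothetical sketch (the field names are illustrative, not Skimle's schema or the REFI-QDA format):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodedSegment:
    """One coding decision: a span in a named document tagged with a category."""
    document: str
    start: int   # character offset into the source text
    end: int
    category: str

docs = {
    "interview_01.txt": "Prices are too high for daily use.",
    "interview_02.txt": "Availability in rural areas has declined. Fares doubled after the reform.",
}

cut = docs["interview_02.txt"].index("Fares")
segments = [
    CodedSegment("interview_01.txt", 0, len(docs["interview_01.txt"]), "pricing concerns"),
    CodedSegment("interview_02.txt", 0, cut, "availability"),
    CodedSegment("interview_02.txt", cut, len(docs["interview_02.txt"]), "pricing concerns"),
]

def quotes_for(category: str) -> list[str]:
    """Theme -> the exact supporting quotes (downward traceability)."""
    return [docs[s.document][s.start:s.end].strip() for s in segments if s.category == category]

def categories_in(document: str) -> set[str]:
    """Document -> the categories it contributed to (upward traceability)."""
    return {s.category for s in segments if s.document == document}
```

Because the records are plain data rather than transient chat output, the same questions return the same answers every time, and the whole structure can be serialised for export.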
Why this matters beyond methodology
If you are a researcher who cares about method, these distinctions are directly relevant to the validity of your work. Using AI in qualitative research responsibly requires being able to describe and justify each step of your analytical process. Hallucinated quotes, incomplete coverage of your corpus, and untraceable summaries all undermine that.
If you are a business user who mainly cares about the outcome, all of the nerdy stuff above simply means you get accurate, complete results that you can show to a client or a supervisor without anxiety.
The automatic thematic analysis in Skimle was designed with this goal from the start. Not to produce impressive-looking summaries, but to produce analysis you can stand behind.
Want to see how it works in practice? Try Skimle for free — upload a set of transcripts or documents and see the full analysis with traceable quotes and exportable coded data.
Related reading:
- Two-way transparency: creating confidence in AI analysis
- Why RAG does not work for structured qualitative data
- Can ChatGPT analyse qualitative data?
About the authors
Henri Schildt is a Professor of Strategy at Aalto University School of Business and co-founder of Skimle. He has published over a dozen peer-reviewed articles using qualitative methods, including work in Academy of Management Journal, Organization Science, and Strategic Management Journal. His research focuses on organisational strategy, innovation, and qualitative methodology. Google Scholar profile
Olli Salo is a former Partner at McKinsey & Company, where he spent 18 years helping clients understand their markets and themselves, develop winning strategies, and improve their operating models. He has conducted over 1,000 client interviews and published over 10 articles on McKinsey.com and beyond. LinkedIn profile
