Qualitative evidence synthesis: a practical guide to thematic synthesis and meta-ethnography

Qualitative evidence synthesis combines findings from multiple already-published qualitative studies to answer a broader research question, rather than analysing new interview or document data. The main approaches are meta-ethnography (interpretive, theory-building) and thematic synthesis (more structured, closer to the primary data). Researchers screen studies, appraise them (typically with CASP), then extract and synthesise their findings. That final extraction-and-synthesis step becomes the main bottleneck once a review includes 30 or more studies.

If you have done primary qualitative research, this will feel familiar and unfamiliar at once. The logic of coding and theme-building carries over. What changes is the unit of analysis: instead of coding transcripts, you are coding the findings, themes, and quotes that other researchers have already reported in their published papers. This guide walks through what qualitative evidence synthesis is, how it differs from primary thematic analysis and from quantitative meta-analysis, the main methodological approaches, and the practical extraction-and-synthesis bottleneck that shows up once a review has moved past study selection.

What is qualitative evidence synthesis?

Qualitative evidence synthesis (sometimes called qualitative research synthesis, meta-synthesis, or qualitative meta-summary) is the systematic combination of findings from multiple qualitative studies that address a related research question. It sits within the broader systematic review tradition, borrowing its emphasis on a documented search strategy, explicit inclusion criteria, and transparent reporting, while applying interpretive methods suited to qualitative data.

The output is not a list of studies. It is a new, higher-order account: a set of synthesised themes, a conceptual model, or a theory that none of the individual included studies stated on its own. A meta-ethnography of patient experiences of chronic pain, for example, does not just summarise twenty papers' worth of findings side by side. It produces a new interpretation that explains patterns across all twenty.

This kind of synthesis has become a standard part of evidence-based practice in health sciences, social policy, and education, particularly where decision-makers need to understand not just whether an intervention works (a quantitative question) but how and why it works, or fails, for the people experiencing it. Cochrane and the WHO now treat qualitative evidence as a routine companion to effectiveness reviews rather than a supplementary extra (Cochrane Handbook, Chapter 21).

How does this differ from primary thematic analysis?

This is the distinction that trips up researchers moving between the two. Primary qualitative analysis, the kind covered in our guides on reflexive thematic analysis and grounded theory, works directly with raw data: interview transcripts, documents, observation notes. The researcher who collected the data also analyses it, often with direct access to participants for clarification or follow-up.

Qualitative evidence synthesis works one level removed. The "data" are the findings sections of published studies, not raw transcripts. You cannot go back to the original participants. You cannot recode an excerpt that the original authors did not report. Your synthesis is only as good as what the primary authors chose to publish, which is why quality appraisal is a mandatory step rather than an optional check.

Dimension	Primary thematic analysis	Qualitative evidence synthesis
Unit of analysis	Interview/document excerpts	Findings and themes reported in published studies
Data source	Researcher's own collected data	Other researchers' published papers
Access to raw data	Full access, can revisit and reinterpret	Limited to what authors reported; rarely raw transcripts
Search process	Not applicable (data is collected, not searched for)	Systematic database search, often PRISMA-reported
Quality control step	Reflexivity, audit trail	Formal critical appraisal (e.g. CASP) of each included study
Typical scale	10-40 interviews or documents	10-77+ published studies (commonly capped around 40-60 for feasibility)
Output	Themes describing the dataset	A synthesised model, theory, or set of cross-study themes

How is this different from quantitative meta-analysis?

Quantitative meta-analysis pools numerical effect sizes across studies, typically using statistical methods to produce a combined estimate (does drug X reduce symptom Y by Z%, averaged across all trials). Qualitative evidence synthesis pools meanings, experiences, and explanations rather than numbers. There is no statistical pooling step, because the inputs are not commensurable numbers; they are differently worded accounts of similar phenomena.

The two are increasingly combined in mixed-methods systematic reviews, where a quantitative meta-analysis answers "does it work" and a qualitative evidence synthesis answers "why, for whom, and under what conditions", which is one reason qualitative evidence synthesis has grown alongside Cochrane's broader effectiveness review programme rather than as a separate, niche pursuit.

What are the main approaches to qualitative evidence synthesis?

Three approaches dominate the methodological literature, and choosing between them is a real decision that should be stated and justified in your methods section, not glossed over.

Meta-ethnography

Developed by George Noblit and R. Dwight Hare in their 1988 book Meta-Ethnography: Synthesizing Qualitative Studies (Sage), meta-ethnography is the most established and most interpretive of the three approaches. It treats each included study as an account that can be "translated" into the terms of the others, looking for reciprocal translations (where studies broadly agree), refutational translations (where they contradict each other), and lines-of-argument synthesis (building an overarching interpretation across both).

Noblit and Hare's original work, by their own description, synthesised only 2-6 studies at a time, reflecting how labour-intensive close interpretive translation is when done by hand (Toye et al., 2014). Meta-ethnography is suited to producing new theory or conceptual models, not simply aggregating what studies found, which makes it the natural choice when your research question is conceptual rather than purely descriptive.
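To make the idea of reading concepts across studies more concrete, here is a minimal sketch of a translation grid. It is an illustrative simplification, not Noblit and Hare's method: the study names, concepts, and stance labels are all invented, and the "candidate" flags are only a prompt for the reviewer, who still has to decide whether studies genuinely agree or contradict each other.

```python
from collections import defaultdict

# Hypothetical extracted concepts: (study, concept, stance as reported by the authors).
# The one-word stance labels are an assumption made for illustration only.
concepts = [
    ("Study A", "disclosure to employers", "avoided"),
    ("Study B", "disclosure to employers", "avoided"),
    ("Study C", "disclosure to employers", "embraced"),
    ("Study A", "peer support", "valued"),
    ("Study C", "peer support", "valued"),
]


def translation_grid(rows):
    """Group stances by concept so each concept can be read across studies."""
    grid = defaultdict(dict)
    for study, concept, stance in rows:
        grid[concept][study] = stance
    return dict(grid)


def flag_translation_type(grid):
    """Attach a first-pass label per concept for the reviewer to confirm or overturn."""
    flags = {}
    for concept, by_study in grid.items():
        stances = set(by_study.values())
        if len(by_study) < 2:
            flags[concept] = "single study"
        elif len(stances) == 1:
            flags[concept] = "candidate reciprocal"
        else:
            flags[concept] = "candidate refutational"
    return flags


grid = translation_grid(concepts)
flags = flag_translation_type(grid)
for concept, label in flags.items():
    print(concept, "->", label, grid[concept])
```

A grid like this only organises the material. The lines-of-argument step, building an interpretation that accounts for both the agreements and the contradictions, remains interpretive work.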

Thematic synthesis

James Thomas and Angela Harden formalised thematic synthesis in their 2008 paper "Methods for the thematic synthesis of qualitative research in systematic reviews", published in BMC Medical Research Methodology (volume 8, article 45). Thematic synthesis follows three explicit stages: line-by-line coding of the findings reported in each study, development of descriptive themes that stay close to what the primary studies said, and generation of analytical themes that go beyond the primary studies to produce new interpretive constructs that address the review's specific question.

Thematic synthesis is more procedural than meta-ethnography, closer in spirit to the coding processes covered in our guide on how to code qualitative data, and was explicitly designed to combine the transparency systematic reviewers expect with the interpretive depth qualitative researchers need.
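The three stages can be pictured as three linked layers of data. The sketch below is one possible representation, assuming invented studies, codes, and theme names; the mappings between layers are reviewer decisions entered by hand, not something the code derives.

```python
from collections import defaultdict

# Stage 1: line-by-line codes attached to reported findings (all data is invented).
coded_findings = [
    {"study": "Study A", "finding": "Participants felt dismissed by clinicians.",
     "code": "not being believed"},
    {"study": "Study B", "finding": "Women described having to prove their pain.",
     "code": "proving legitimacy"},
    {"study": "Study C", "finding": "Peer groups offered validation.",
     "code": "peer validation"},
]

# Stage 2: the reviewer groups codes into descriptive themes.
code_to_descriptive = {
    "not being believed": "Struggle for credibility",
    "proving legitimacy": "Struggle for credibility",
    "peer validation": "Sources of validation",
}

# Stage 3: the reviewer builds analytical themes on top of descriptive themes.
analytical_to_descriptive = {
    "Legitimacy has to be negotiated": [
        "Struggle for credibility",
        "Sources of validation",
    ],
}


def studies_by_descriptive_theme(findings, mapping):
    """List which studies support each descriptive theme."""
    support = defaultdict(set)
    for row in findings:
        support[mapping[row["code"]]].add(row["study"])
    return {theme: sorted(studies) for theme, studies in support.items()}


def studies_by_analytical_theme(descriptive_support, mapping):
    """Roll descriptive-theme support up to each analytical theme."""
    result = {}
    for analytical, descriptive_list in mapping.items():
        studies = set()
        for descriptive in descriptive_list:
            studies.update(descriptive_support.get(descriptive, []))
        result[analytical] = sorted(studies)
    return result


descriptive_support = studies_by_descriptive_theme(coded_findings, code_to_descriptive)
analytical_support = studies_by_analytical_theme(descriptive_support, analytical_to_descriptive)
print(descriptive_support)
print(analytical_support)
```

Keeping the layers separate means any analytical theme can be traced down to the descriptive themes, codes, and studies beneath it, which is the transparency the method was designed to provide.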

Meta-narrative review

Developed by Trisha Greenhalgh and colleagues and given formal reporting standards through the RAMESES project (Wong, Greenhalgh, et al., BMC Medicine, 2013), meta-narrative review is suited to a different problem: heterogeneous bodies of research that have approached the same broad topic from distinctly different theoretical traditions. Rather than forcing studies from different paradigms into one synthesis, a meta-narrative review traces how each research tradition (each "meta-narrative") developed, then produces an overarching account of how and why the traditions diverge. It is the right tool when the included literature is too conceptually fragmented for direct translation or thematic pooling to make sense.

Approach	Originators	Best suited for	Typical output
Meta-ethnography	Noblit & Hare (1988)	Conceptually rich, interpretive synthesis across a moderate number of studies	New theory or conceptual model via translation
Thematic synthesis	Thomas & Harden (2008)	Reviews needing transparent, auditable coding close to primary findings	Descriptive themes plus analytical themes that answer the review question
Meta-narrative review	Greenhalgh, Wong et al. (RAMESES, 2013)	Heterogeneous literatures spanning distinct research traditions	An account of how and why traditions diverge, plus an overarching narrative

What does the synthesis process actually involve?

Regardless of which approach you choose, most qualitative evidence syntheses follow a recognisable sequence, often reported against PRISMA for the search and screening stages and ENTREQ for the synthesis stages (Tong et al., 2012):

1. Define the question and protocol. Register or document your review question, inclusion criteria, and planned synthesis method before screening begins.
2. Search systematically. Run a documented, reproducible search across relevant databases, reported with enough detail that another researcher could repeat it.
3. Screen for inclusion. Apply your criteria to titles, abstracts, then full texts. PRISMA flow diagrams are now expected even for qualitative-only reviews.
4. Appraise quality. Assess each included study's methodological rigour, typically with a tool such as CASP (see below).
5. Extract findings. Pull the relevant findings, themes, and supporting quotes from each included paper into a structured extraction format.
6. Synthesise. Apply your chosen method (translation, thematic coding, or narrative tracing) to build cross-study themes or a new conceptual account.
7. Assess confidence in the findings. Increasingly done with GRADE-CERQual, which rates each synthesised finding on methodological limitations, coherence, adequacy of data, and relevance (Cochrane GRADE-CERQual training).
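For the extraction step, a "structured extraction format" can be as simple as one record per reported finding. The sketch below shows one possible layout using only the Python standard library; the field names are assumptions for illustration, not a standard prescribed by PRISMA or ENTREQ.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ExtractedFinding:
    """One finding as reported in an included paper (field names are hypothetical)."""
    study_id: str
    theme_label: str       # the original authors' own theme name
    finding_text: str      # the authors' statement of the finding
    supporting_quote: str  # participant quote reported in the paper
    page: str              # where in the paper the finding appears


def to_csv(findings):
    """Serialise findings to CSV text, one row per finding."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ExtractedFinding)])
    writer.writeheader()
    for item in findings:
        writer.writerow(asdict(item))
    return buffer.getvalue()


def from_csv(text):
    """Read findings back from CSV text produced by to_csv."""
    reader = csv.DictReader(io.StringIO(text))
    return [ExtractedFinding(**row) for row in reader]


rows = [
    ExtractedFinding(
        study_id="study_a",
        theme_label="Being disbelieved",
        finding_text="Participants reported that clinicians doubted their accounts.",
        supporting_quote="They thought I was making it up, all of it.",
        page="p. 7",
    )
]
csv_text = to_csv(rows)
print(csv_text)
```

Recording the study identifier and page on every row is what makes later synthesis auditable: each cross-study theme can be followed back to the exact place it was reported.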

What is the role of quality appraisal and tools like CASP?

Quality appraisal is non-negotiable in qualitative evidence synthesis precisely because you cannot return to raw data to check a weak study's claims. The Critical Appraisal Skills Programme (CASP) qualitative studies checklist is the most widely used tool for this step, posing ten questions across three areas: whether the study's results are valid, what the results actually are, and whether the results will be useful for the review's purpose (CASP qualitative checklist).

CASP appraisal does not produce a numeric score that automatically excludes studies. It produces a judgement that you carry into the synthesis and report transparently, sometimes weighting or sensitivity-testing your conclusions against studies appraised as weaker. This appraisal step is squarely the researcher's responsibility. No tool, including Skimle, makes this judgement for you, and any vendor implying otherwise is overstating what software can do in a process built around expert interpretation.
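Sensitivity-testing against weaker studies can be sketched as a simple check of which themes still have support once those studies are set aside. The example below assumes invented studies and a two-level "strong"/"weak" judgement recorded by the reviewer; CASP itself does not produce these labels, and the code does not make the appraisal for you.

```python
# Hypothetical appraisal judgements, entered by the reviewer after reading each study.
appraisal = {
    "Study A": "strong",
    "Study B": "weak",
    "Study C": "strong",
    "Study D": "weak",
}

# Hypothetical synthesis output: which studies support each theme.
theme_support = {
    "Struggle for credibility": ["Study A", "Study B", "Study C"],
    "Financial strain": ["Study B", "Study D"],
}


def sensitivity_check(theme_support, appraisal, weaker_label="weak"):
    """Report how much support each theme keeps when weaker studies are excluded.

    A KeyError here means a study was synthesised without being appraised.
    """
    report = {}
    for theme, studies in theme_support.items():
        remaining = [s for s in studies if appraisal[s] != weaker_label]
        report[theme] = {
            "all_studies": len(studies),
            "without_weaker": len(remaining),
            "rests_only_on_weaker": len(remaining) == 0,
        }
    return report


report = sensitivity_check(theme_support, appraisal)
for theme, summary in report.items():
    print(theme, summary)
```

A theme flagged as resting only on weaker studies is not automatically invalid. It is a finding to report with that caveat attached, which is the kind of transparency the appraisal step is meant to support.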

Where does the extraction-and-synthesis bottleneck actually happen?

Screening and appraisal get most of the methodological attention because they have the clearest rules. The quieter problem is what happens after you have a final list of, say, 30 to 60 included qualitative studies and need to extract and synthesise their findings.

Each included paper typically reports several themes, each theme supported by a handful of participant quotes, each written in the original authors' own framing and terminology. Reading sixty findings sections closely enough to translate them against each other (the core operation in meta-ethnography) or to code them line by line (the core operation in thematic synthesis) is slow, detailed work. Researchers doing this manually commonly describe spreadsheets with hundreds of rows, one per extracted finding, cross-referenced back to source papers by hand.

This is also where errors creep in. A theme gets misattributed to the wrong study during a late-night extraction session. A nuance in how one paper qualified its finding gets flattened when copied into a spreadsheet cell. Toye and colleagues note that because of the labour involved, "only a few meta-ethnographic syntheses include more than 40 studies", and that one frequently cited recommendation puts the practical ceiling at around 40 studies "to allow sufficient familiarity" with the material (Toye et al., 2014, BMC Medical Research Methodology). That ceiling is a feasibility constraint on hand extraction, not a methodological rule, and it is exactly the constraint that AI-assisted extraction can loosen.
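One inexpensive guard against misattribution is to check that every extracted quote actually appears in the text of the study it is credited to. The sketch below assumes you have the findings text of each paper available as plain text; the study names and quotes are invented, and the matching is a simple verbatim check that will not catch paraphrased or flattened findings.

```python
import re


def normalise(text):
    """Lowercase and collapse whitespace so line breaks do not cause false alarms."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unverified_quotes(rows, source_texts):
    """Return rows whose quote is empty or not found in the study it is attributed to."""
    problems = []
    for row in rows:
        source = normalise(source_texts.get(row["study"], ""))
        quote = normalise(row["quote"])
        if not quote or quote not in source:
            problems.append(row)
    return problems


# Hypothetical findings text, keyed by study.
source_texts = {
    "Study A": "One participant said: They thought I was\nmaking it up. Others agreed.",
    "Study B": "Money was a constant worry for most participants.",
}

# Hypothetical extraction rows; the second one is attributed to the wrong study.
rows = [
    {"study": "Study A", "quote": "They thought I was making it up."},
    {"study": "Study B", "quote": "They thought I was making it up."},
    {"study": "Study B", "quote": "Money was a constant worry"},
]

problems = unverified_quotes(rows, source_texts)
for row in problems:
    print("Check attribution:", row)
```

Rows flagged this way still need a human look, since a quote can fail the check simply because of a typo or a different transcription of the same passage.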

How can Skimle help with the extraction and synthesis step?

Skimle was built for primary qualitative analysis, but the underlying mechanics, coding excerpts of text and building a theme structure with traceability back to source, apply directly to the extraction-and-synthesis bottleneck described above. Used this way, the included papers (or just their extracted findings and quotes sections) become the documents in a Skimle project.

In practice, this looks like:

1. Upload your included papers or extracted findings. Skimle accepts PDFs and text documents, so you can upload full papers or, more commonly, a structured extraction file where each row is a finding tagged with its source study. See supported file formats.
2. Use metadata to track study-level attributes. Tag each document with study author, year, country, population, and CASP appraisal rating as metadata variables, so you can later filter or weight your synthesis by study quality or context.
3. Let inductive analysis surface candidate cross-study themes. Inductive analysis builds a theme structure from the findings you have uploaded, the same operation as manually coding sixty findings sections, run across the full corpus at once rather than one paper at a time.
4. Trace every synthesised theme back to its source study. The categories view keeps every theme linked to the specific excerpt, and therefore the specific paper, it came from, which is exactly the auditability that ENTREQ and CASP-literate reviewers expect to see in a synthesis write-up.
5. Use metadata analysis to test patterns across study characteristics. Cross-tabulating themes against metadata lets you check whether a theme holds across, say, high- versus low-income country settings, which is close to what a meta-narrative review or a refutational meta-ethnography translation is trying to establish.
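The cross-tabulation idea is independent of any particular tool. The sketch below is a generic illustration in plain Python, not Skimle's interface or output; the studies, themes, and the `income_setting` variable are invented. It counts distinct studies rather than excerpts, so one paper with many quotes on a theme does not inflate that theme's apparent support.

```python
from collections import Counter

# Hypothetical study-level metadata.
study_metadata = {
    "Study A": {"income_setting": "high"},
    "Study B": {"income_setting": "low"},
    "Study C": {"income_setting": "high"},
    "Study D": {"income_setting": "low"},
}

# Hypothetical coded excerpts: (study, theme). A study may appear more than once per theme.
coded_excerpts = [
    ("Study A", "Struggle for credibility"),
    ("Study A", "Struggle for credibility"),
    ("Study B", "Struggle for credibility"),
    ("Study B", "Financial strain"),
    ("Study C", "Struggle for credibility"),
    ("Study D", "Financial strain"),
]


def crosstab_themes(excerpts, metadata, variable):
    """Count distinct studies per (theme, metadata value) pair."""
    distinct = {(theme, study) for study, theme in excerpts}
    return Counter((theme, metadata[study][variable]) for theme, study in distinct)


table = crosstab_themes(coded_excerpts, study_metadata, "income_setting")
for (theme, setting), count in sorted(table.items()):
    print(f"{theme} | {setting}: {count} studies")
```

In this invented example, "Financial strain" appears only in low-income settings. Whether that reflects a real contextual difference or simply what the included studies chose to ask about is a question for the reviewer, not the table.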

What Skimle does not do, and should not claim to do, is replace the researcher's judgement on study selection, CASP appraisal, or the interpretive work of deciding what a cross-study theme actually means. Extraction and pattern-surfacing are mechanical, repetitive tasks that AI assistance speeds up considerably. Translating those patterns into a defensible synthesis, the kind a peer reviewer or a Cochrane methods editor will scrutinise, remains an expert task. If you are an academic researcher weighing where AI fits into a rigorous workflow more broadly, our guide on how to use AI in qualitative research covers the same boundary in more depth, and the academic researchers use-case page has more on how Skimle fits an academic workflow generally.

How does qualitative evidence synthesis relate to content analysis?

Content analysis and thematic synthesis share more procedural DNA than meta-ethnography does with either. Both involve systematic coding against a structure, whether predefined categories or codes that emerge from the data. Our guide on content analysis vs thematic analysis covers the distinction in primary research, and the same logic largely carries over to synthesis work: a thematic-synthesis-style qualitative evidence synthesis is closer to inductive coding across studies, while a quantitative content analysis of qualitative findings (counting how often a theme appears across studies) is a different, more frequency-driven exercise that loses some of the interpretive depth a meta-ethnography is built to preserve.

Frequently asked questions

What is the difference between meta-ethnography and thematic synthesis?

Meta-ethnography, developed by Noblit and Hare in 1988, builds new interpretation through "translating" studies against each other and is more interpretive and theory-oriented. Thematic synthesis, developed by Thomas and Harden in 2008, uses an explicit three-stage coding process (line-by-line codes, descriptive themes, analytical themes) and is generally considered more transparent and procedural, which makes it easier to report against systematic review standards. Both can produce findings that go beyond what any single included study reported; they differ mainly in how explicit and code-based the process is.

How many studies should be included in a qualitative evidence synthesis?

There is no fixed rule, but feasibility becomes a real constraint as study count grows. Noblit and Hare's original meta-ethnographies included only 2-6 studies. A commonly cited recommendation suggests around 40 studies as a practical ceiling for maintaining sufficient familiarity with the material by hand, though some published syntheses have included more (one widely cited methods paper synthesised 77 studies). The right number depends on your research question's breadth and how much support you have for extraction, not a target figure.

Do I need to do a systematic search for a qualitative evidence synthesis, or can I be selective?

A systematic, documented search is expected for a qualitative evidence synthesis, generally reported against PRISMA for the search and screening stages. Purposive or theoretical sampling of studies (selecting a smaller, deliberately varied set rather than every eligible study) is accepted in some meta-ethnography traditions, but it must be stated explicitly and justified, not presented as if it were a comprehensive systematic search.

What is GRADE-CERQual and do I need to use it?

GRADE-CERQual ("Confidence in the Evidence from Reviews of Qualitative research") is a framework for rating how much confidence to place in each synthesised finding, based on methodological limitations, coherence, adequacy of data, and relevance. It is recommended for all Cochrane qualitative evidence syntheses and increasingly expected by journals and guideline developers, though it is not a formal requirement for every qualitative evidence synthesis published outside that context.

Can software like Skimle replace CASP appraisal or the interpretive synthesis step?

No. Quality appraisal with CASP and the interpretive judgement involved in building a cross-study synthesis are the researcher's responsibility and require domain expertise that software cannot substitute for. Tools like Skimle help with the mechanical extraction and pattern-surfacing work that follows study selection and appraisal, surfacing candidate cross-study themes faster and keeping every theme traceable to its source study, while leaving the appraisal and interpretive judgement to the research team.

Ready to speed up the extraction and synthesis step in your next review? Try Skimle for free and build a cross-study theme structure from your included papers, with every theme traceable back to the study it came from.

About the authors

Henri Schildt is a Professor of Strategy at Aalto University School of Business and co-founder of Skimle. He has published over a dozen peer-reviewed articles using qualitative methods, including work in Academy of Management Journal, Organization Science, and Strategic Management Journal. His research focuses on organisational strategy, innovation, and qualitative methodology.

Olli Salo is a former Partner at McKinsey & Company, where he spent 18 years helping clients understand their markets and themselves, develop winning strategies, and improve their operating models. He has conducted over 1,000 client interviews and published over 10 articles on McKinsey.com and beyond.