Agentic Analysis

How Skimle's agentic analysis takes a research question end-to-end — from clarifying chat to coded insights and analytical themes.

Agentic analysis is Skimle's most thorough analysis mode. You start by stating a research question; Skimle then runs a multi-step pipeline that builds a coding framework, codes every relevant chunk of every document, and synthesises analytical themes that answer the question you asked.

It is designed for projects where the research question matters as much as the categories — for example, "why do staffing decisions diverge across presidential terms?" or "what differentiates respondents who churn from those who renew?" — and where you want analytical depth on top of category coding.

Step 1: Setup chat

You start by typing a research question into the agentic analysis dialogue. Skimle's setup agent talks to you to:

  • clarify what the question actually means (it asks one to three short questions if anything is ambiguous);
  • classify the analysis type — taxonomic (mapping a space), explanatory (why something occurs), causal (cause-and-effect), or comparative (between groups);
  • propose a coding framework: a small set of first-level categories (typically four to seven), candidate sub-themes for each, and two to four "broader dynamics" — the higher-level analytical questions the synthesis stage will investigate.

Skimle draws the framework from a library of analysis templates and, if the project already has an agentic analysis, favours that earlier framework as the starting point. You can edit every part of the framework before confirming.
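
To make the shape of a proposed framework concrete, here is a minimal Python sketch of how it could be represented. This is an illustration only: the class names, field names, and the warnings() helper are assumptions, not Skimle's internal data model. Only the analysis types and the typical sizes come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class AnalysisType(Enum):
    TAXONOMIC = "taxonomic"      # mapping a space
    EXPLANATORY = "explanatory"  # why something occurs
    CAUSAL = "causal"            # cause-and-effect
    COMPARATIVE = "comparative"  # between groups


@dataclass
class Category:
    """A first-level category with its candidate sub-themes (hypothetical shape)."""
    name: str
    sub_themes: list = field(default_factory=list)


@dataclass
class Framework:
    """A coding framework as proposed by the setup chat (hypothetical shape)."""
    research_question: str
    analysis_type: AnalysisType
    categories: list
    broader_dynamics: list

    def warnings(self):
        """Return notes where the framework falls outside the typical sizes."""
        notes = []
        if not 4 <= len(self.categories) <= 7:
            notes.append("categories: typically four to seven")
        if not 2 <= len(self.broader_dynamics) <= 4:
            notes.append("broader dynamics: two to four")
        return notes
```

The size check returns advisory notes rather than raising an error, because the text describes four to seven categories as typical, not as a hard limit.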

Step 2: Confirming the framework

The framework editor lets you rename, add, or remove categories and broader dynamics. Once you accept, Skimle:

  • creates a top-level insight category named after your research question;
  • creates an analysis log memo that will record every step the worker runs;
  • queues the analysis. The dashboard shows it as pending.

Step 3: Empirical extraction

For each first-level category, Skimle uses embedding-based pre-selection to find the document chunks most likely to contain relevant material, then runs strict coding over those chunks. Strict coding means each candidate insight is post-checked against the category's question and the supporting quote is verified against the original document.
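
The two ideas in this step, similarity-based pre-selection and quote verification, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: cosine similarity as the ranking measure, a fixed threshold, and whitespace- and case-insensitive substring matching for quotes are all choices made for the example. Skimle's actual embedding model, thresholds, and matching rules are not described here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def preselect(category_vec, chunk_vecs, threshold):
    """Return ids of chunks at or above the similarity threshold, best first.

    chunk_vecs maps a chunk id to its embedding vector (hypothetical layout).
    """
    scored = [(cosine(category_vec, vec), cid) for cid, vec in chunk_vecs.items()]
    return [cid for score, cid in sorted(scored, reverse=True) if score >= threshold]


def normalise(text):
    """Collapse runs of whitespace and lowercase the text for comparison."""
    return " ".join(text.split()).lower()


def quote_is_verified(quote, document_text):
    """True if the quote occurs in the document, ignoring case and whitespace."""
    q = normalise(quote)
    return bool(q) and q in normalise(document_text)
```

Only chunks returned by preselect() would go on to strict coding, and an insight would be kept only if quote_is_verified() succeeds for its supporting quote.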

Skimle keeps a watermark of which chunks have been processed and the lowest similarity threshold that has yielded passing material. When documents are added later, including during a re-run, only the chunks not yet seen are coded, so re-runs stay cheap.
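
A watermark of this kind can be pictured as a small piece of state per category. The Python sketch below is an assumption about its shape, with hypothetical names throughout. It tracks the set of chunk ids already processed and the lowest similarity score at which a chunk passed strict coding.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Watermark:
    """Per-category record of extraction progress (hypothetical structure)."""
    seen_chunks: set = field(default_factory=set)
    lowest_passing_threshold: Optional[float] = None

    def pending(self, chunk_ids):
        """Return the chunk ids that have not been processed yet, in input order."""
        return [cid for cid in chunk_ids if cid not in self.seen_chunks]

    def record(self, chunk_id, similarity, passed):
        """Mark a chunk as processed and lower the threshold if it passed."""
        self.seen_chunks.add(chunk_id)
        if passed and (
            self.lowest_passing_threshold is None
            or similarity < self.lowest_passing_threshold
        ):
            self.lowest_passing_threshold = similarity
```

On a re-run, only the ids returned by pending() would need to be coded, which is what keeps the incremental pass cheap.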

The result of this step is a populated category tree under your research question, with verified quotes attached to every insight.

Step 4: Metadata enrichment

Skimle checks whether the analysis needs a new metadata field to support the comparison your research question implies (for example, a period field for a question about temporal change, or an organisational role field for a question about staffing). When a new field would help, Skimle proposes it, names the value categories meaningfully, and codes every document. When the existing metadata already covers the distinction — or when no metadata field is needed at all — Skimle records its reasoning and moves on.

Step 5: Analytical synthesis

For each broader dynamic in the framework, Skimle runs an investigation — a focused agent that searches the empirical insights, metadata, and temporal patterns to find evidence relevant to that dynamic. The investigation runs over multiple search steps and ends with a synthesis that produces two to five analytical themes per dynamic.

Each analytical theme is written as a paragraph of prose with bracketed citations to the underlying insights (for example, [I205], [D3k], or [I26-Q2] for a specific quote). The themes are saved as memo insights on a per-dynamic analytical memo so you can browse them by dynamic.
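
If you export theme text and want to list the references it contains, the bracketed citations can be extracted with a regular expression. The Python sketch below is based only on the three example forms shown above. Reading the I prefix as an insight, D as a document, and the -Q suffix as a quote number is an assumption, as is the exact set of characters allowed in an identifier.

```python
import re

# Matches forms such as [I205], [D3k] and [I26-Q2].
# Assumption: identifiers are alphanumeric and quote numbers are digits.
CITATION = re.compile(r"\[([ID])([0-9A-Za-z]+)(?:-Q(\d+))?\]")


def extract_citations(text):
    """Return one dict per bracketed citation, in order of appearance."""
    refs = []
    for kind, ident, quote in CITATION.findall(text):
        refs.append(
            {
                "kind": "insight" if kind == "I" else "document",
                "id": ident,
                "quote": int(quote) if quote else None,
            }
        )
    return refs
```

For example, extract_citations("Turnover rose [I205], as one respondent put it [I26-Q2].") yields two entries: insight 205 with no quote, and insight 26 with quote 2.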

Step 6: Counter-evidence

For each analytical theme, Skimle runs a deliberate search for evidence that contradicts the theme. When counter-evidence exists, Skimle records it on the theme so the synthesis is honest about its limits. Themes with strong counter-evidence are flagged.
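
One way to picture the flagging rule is a theme that carries its counter-evidence alongside it. The Python sketch below is purely illustrative: the numeric strength score, the 0.7 cut-off, and all class and field names are assumptions made for the example. How Skimle actually judges counter-evidence to be "strong" is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class CounterEvidence:
    citation: str    # bracketed reference, e.g. "[I205]"
    strength: float  # hypothetical score between 0 and 1


@dataclass
class Theme:
    text: str
    counter_evidence: list = field(default_factory=list)

    def is_flagged(self, threshold=0.7):
        """True if any recorded counter-evidence reaches the (assumed) cut-off."""
        return any(ce.strength >= threshold for ce in self.counter_evidence)
```

A theme with no counter-evidence, or with only weak counter-evidence, stays unflagged but still keeps whatever was recorded against it.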

Step 7: Findings summary

Finally, Skimle writes a cross-cutting summary that pulls together every analytical theme into a single answer to your research question. This summary is appended to the analysis log.

Output

When the worker finishes, the project contains:

  • the category tree under your research question, with coded insights and verified quotes — visible in the spreadsheet, categories, and document views like any other analysis;
  • category summaries for every populated category;
  • the analysis log memo — a step-by-step narrative of what Skimle did, why, and what it found;
  • analytical theme memos — one memo per broader dynamic, each holding the synthesised themes with citations and counter-evidence;
  • optionally, new metadata fields with coded values for every document.

Every citation in every memo is a clickable bracketed reference that opens the underlying insight, document, or quote.

Re-running an agentic analysis

You can re-run an agentic analysis at any time. Skimle:

  • preserves the empirical extraction that has already passed strict coding — re-extraction only considers chunks the prior run did not reach;
  • removes placeholder themes that previous runs could not synthesise so they get another attempt;
  • repeats the analytical synthesis, counter-evidence, and findings summary from scratch so the latest empirical state is reflected in the analytical themes.

The analysis log is appended to rather than replaced, so you can see how the analysis evolved across runs.

When to use agentic analysis

Reach for agentic analysis when you have a specific research question you want answered with analytical depth, not just a set of coded categories. If you only want to map what is in the corpus, automatic thematic analysis or inductive analysis is quicker and cheaper. Agentic analysis is the right choice when you want both: well-coded data and a written-up answer to your question that is grounded in the data.