Analytical stack — each layer requires the one below it
01
Pattern Recognition
Requires: structured event corpus
02
Scenario Tracking
Requires: pattern recognition + event weighting
03
Structured Reporting
Requires: structured corpus + assumptions layer
04
Inquiries
Requires: scenarios + multi-analyst weighting history
01 Pattern Recognition
Requires: structured event corpus

Event patterns that prose databases cannot see.

The most immediate gain from structured encoding is the ability to query the record in ways that are simply impossible with prose. When every event is encoded with a consistent actor, action, domain, location, and timestamp — in a governed vocabulary — the corpus becomes a database in the full analytical sense of the word.

Questions that currently require weeks of manual research become direct queries. How many times has a given actor engaged a given counterpart, in what domains, with what actions, over a defined time period? Which institutional actors most frequently initiate diplomatic contact in the energy domain, and how does that pattern differ by region? Which action types cluster around elections in Southeast Asia in the eighteen months prior to polling day?

None of these questions are exotic. They are the questions analysts ask routinely. What CIE provides is the infrastructure to answer them rigorously — against a corpus where the same event encoded by different analysts in different source languages produces a directly comparable record.

Example queries — enabled by structured encoding
"Show all [dpl.mt] events between [ind.*] and [rus.*] actors in the [eng.*] domain, last 36 months"
"Which [*.hog] actors have initiated [dpl.sgn] actions most frequently, filtered by [axn.sec.bil] domain, by region"
"Show frequency of [cl.*] action codes in [sea.*] events within 18 months of scheduled national elections"
Visual placeholder: actor relationship graph, query result view
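The bracketed patterns in the example queries above can be read as wildcard filters over encoded fields. The following is a minimal Python sketch of the first query, assuming a hypothetical `Event` record and dot-separated codes matched with shell-style wildcards. The field names, the `query` helper, and the sample codes are illustrative assumptions, not CIE's actual schema or query interface.

```python
from dataclasses import dataclass
from datetime import date
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Event:
    # Hypothetical record layout; the real CIE schema may differ.
    actor: str
    counterpart: str
    action: str
    domain: str
    location: str
    when: date


def query(events, *, since, **patterns):
    """Return events on or after `since` whose fields match every wildcard pattern."""
    return [
        e for e in events
        if e.when >= since
        and all(fnmatchcase(getattr(e, field), pat) for field, pat in patterns.items())
    ]


# Illustrative data only; these codes are made up for the example.
corpus = [
    Event("ind.mea", "rus.mfa", "dpl.mt", "eng.oil", "ind", date(2024, 5, 2)),
    Event("ind.pmo", "rus.pres", "dpl.sgn", "eng.gas", "rus", date(2023, 11, 20)),
    Event("ind.mea", "usa.dos", "dpl.mt", "eng.oil", "usa", date(2024, 6, 1)),
    Event("ind.mea", "rus.mfa", "dpl.mt", "eng.oil", "ind", date(2019, 1, 15)),
]

# "Show all [dpl.mt] events between [ind.*] and [rus.*] actors in the [eng.*] domain"
hits = query(
    corpus,
    since=date(2022, 1, 1),
    actor="ind.*",
    counterpart="rus.*",
    action="dpl.mt",
    domain="eng.*",
)
```

Only the first sample event satisfies every filter. The point of the sketch is that once fields are encoded consistently, a question of this kind reduces to a filter expression rather than a reading exercise.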
Cross-jurisdictional comparison: The same action code means the same thing regardless of the originating country, source language, or reporting outlet. Transnational pattern analysis requires no additional normalization.
Graph traversal: The Neo4j analytical layer supports relationship queries that are cumbersome to express in relational databases: actor networks, institutional hierarchies, and event chains traversed in a single query.
Auditable results: Every result traces back to a specific encoded event with a source record. No black box. No generated summaries. Every finding is reproducible.
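To make the graph traversal idea concrete, here is a small plain-Python stand-in for an actor-network query. In the Neo4j layer this would presumably be written as a graph query; the function below is only a sketch of the underlying technique (breadth-first search over actor relationships), and the `reachable_actors` name, the edge list, and the actor codes are all hypothetical.

```python
from collections import deque


def reachable_actors(edges, start, max_hops):
    """Breadth-first search: actors reachable from `start` within `max_hops` relationships.

    Returns a dict mapping each reachable actor to its hop distance from `start`.
    """
    # Build an undirected adjacency map from (actor, actor) relationship pairs.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        actor = queue.popleft()
        if hops[actor] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(actor, ()):
            if nxt not in hops:
                hops[nxt] = hops[actor] + 1
                queue.append(nxt)

    del hops[start]
    return hops


# Illustrative relationships only; codes are made up for the example.
edges = [
    ("ind.mea", "rus.mfa"),
    ("rus.mfa", "rus.pres"),
    ("rus.pres", "chn.mfa"),
]
network = reachable_actors(edges, "ind.mea", max_hops=2)
```

With a two-hop limit, the traversal reaches `rus.mfa` and `rus.pres` but stops before `chn.mfa`.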
02 Scenario Tracking
Requires: pattern recognition + event weighting

Prediction markets, built on something real.

Scenarios are binary, falsifiable questions about future political conditions — structured the same way as a prediction market question, but with a critical difference in the underlying architecture. Where prediction markets aggregate crowd opinion against vague natural language inputs, Cypher's scenario system is fed by pre-parsed, structured events, allowing for a precision of analytical accounting that crowdsourced probability simply cannot produce.

Each scenario has a defined resolution condition, a resolution date, and a probability that updates over its lifespan as analysts assign encoded events to it with explicit weights and documented reasoning. The weight an analyst assigns to an event is not an opinion — it is a structured claim about how that event bears on the scenario's resolution, recorded against the full context of the actor's institutional position, relationships, and prior actions.

This structure enables capabilities that prediction markets do not have: granular event-level attribution for every probability movement, competing analytical models running simultaneously against the same event corpus, backtesting against historical scenario runs, and a complete audit trail of every assumption made during the scenario's lifetime.
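The text does not specify how a weight translates into a probability movement, so the following is only one plausible sketch. It assumes each weight is a signed shift in log-odds, which keeps the probability between 0 and 1 and makes each event's contribution separable. The `Weighting` record, the `run_scenario` function, and the sample events are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Weighting:
    # Hypothetical record of one analyst assigning one encoded event to a scenario.
    event_id: str
    analyst: str
    weight: float  # assumed: signed shift in log-odds; positive favours resolution to 1
    reasoning: str


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def run_scenario(initial_p: float, weightings: list[Weighting]):
    """Apply weightings in order; return the final probability and a per-event audit trail."""
    log_odds = logit(initial_p)
    trail = []
    for w in weightings:
        before = sigmoid(log_odds)
        log_odds += w.weight
        after = sigmoid(log_odds)
        # Each entry records which event moved the probability, who assigned it, and why.
        trail.append((w.event_id, w.analyst, before, after, w.reasoning))
    return sigmoid(log_odds), trail


final_p, trail = run_scenario(
    0.5,
    [
        Weighting("evt-001", "analyst_a", +0.4, "Coalition partner withdrew support"),
        Weighting("evt-002", "analyst_a", -0.1, "Leadership denied early election plans"),
    ],
)
```

Because every step is stored in `trail`, the question "why did the probability move, and who moved it" is answered by reading the trail rather than by reconstructing it.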

Scenario structure: Binary. Resolution is 1 or 0. Forces precise, falsifiable claim formulation.
Probability update mechanism: Weighted. Each encoded event is explicitly weighted by the analyst and documented.
Input data: Structured. Pre-parsed CIE events, not prose summaries or natural language inputs.
Analyst positions: Independent. Multiple analysts work simultaneously without seeing each other's weightings.
Visual placeholder: scenario probability tracking view
Event-level attribution: Every probability movement traces to specific encoded events and specific analyst weight assignments. You can always answer: why did the probability move, and who moved it.
Backtesting: Resolved scenarios remain in the corpus with their full event and weighting history. New models and assumptions can be run against historical scenario runs to evaluate methodology before deploying it forward.
Competing models: Multiple analysts assign independent initial probabilities and independent event weights. The variance between models is itself an analytical signal, not noise to be averaged away.
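As a sketch of how backtesting and model variance might be measured, the example below scores each analyst's probability track with the Brier score (mean squared error against the binary outcome) and computes the week-by-week spread across analysts. The choice of Brier score and standard deviation is an assumption made for illustration; the analyst names and probabilities are invented.

```python
from statistics import mean, pstdev


def brier_score(forecasts: list[float], outcome: int) -> float:
    """Mean squared error between a series of probability forecasts and the binary outcome."""
    return mean((p - outcome) ** 2 for p in forecasts)


# Hypothetical weekly probabilities from three independent analysts on one resolved scenario.
runs = {
    "analyst_a": [0.50, 0.60, 0.75, 0.90],
    "analyst_b": [0.50, 0.45, 0.40, 0.30],
    "analyst_c": [0.50, 0.55, 0.60, 0.70],
}
outcome = 1  # the scenario resolved to 1

# Lower Brier score means the track was closer to the eventual outcome.
scores = {name: brier_score(ps, outcome) for name, ps in runs.items()}
best = min(scores, key=scores.get)

# Spread across analysts per week: large values flag weeks where the models diverged.
weekly_spread = [pstdev(week) for week in zip(*runs.values())]
```

Here the spread is zero in the first week, when all analysts start from the same probability, and grows as their weightings diverge. Those high-spread weeks are the ones worth examining event by event.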
03 Structured Reporting
Requires: structured corpus + assumptions layer

Reports that show their working.

The most common use of AI in political analysis today is to summarize unstructured reporting — to take a large volume of news and produce a readable digest. The limitation is structural: when the input is unstructured prose carrying all the editorializing, duplication, and language asymmetries described on the Problem page, the output inherits those failures, presented in confident fluent prose that does not flag its own uncertainty.

Cypher's reporting capability inverts this. Reports are generated not from scraped prose but from the structured event corpus — encoded facts, explicit source credibility ratings, and analyst-documented assumptions. The result is not a summary of what reporters wrote about what happened. It is a structured account of what actually happened, against a defined set of explicit assumptions, with every claim traceable to a specific encoded event and a specific source record.

This changes the epistemic character of the output. Instead of a confident paragraph that cannot be audited, you get a report where every factual claim is a link to an event encoding, every probabilistic assessment cites the scenario it derives from, and every assumption is recorded and queryable. The report is not a black box. It is a transparent instrument.

Visual placeholder: structured report output with traceable event citations
Traceable inputs: Every claim in a Cypher report links to the encoded event it derives from. The source record, credibility rating, and encoding analyst are all recoverable.
Configurable scope: Reports can be scoped by actor, institution, domain, region, or time window, and by combinations of these, producing tailored output without manual curation of the underlying dataset.
Explicit assumptions: Where probabilistic assessments appear in a report, the assumptions underlying them are recorded and accessible, not embedded invisibly in the model's weights.
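One way to picture a traceable, scoped report is as a rendering of encoded events in which each line carries its own citation. The sketch below assumes a hypothetical `EncodedEvent` record and a `build_report` helper. The field names, credibility scale, and sample events are invented for illustration and do not describe the actual report generator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedEvent:
    # Hypothetical fields; the actual encoding schema is not specified in this document.
    event_id: str
    summary: str
    domain: str
    region: str
    source_id: str
    credibility: str
    encoded_by: str


def build_report(events, *, domain=None, region=None):
    """Render one line per in-scope event, each carrying its own citation."""
    lines = []
    for e in events:
        if domain is not None and e.domain != domain:
            continue
        if region is not None and e.region != region:
            continue
        lines.append(
            f"{e.summary} [event {e.event_id}; source {e.source_id}, "
            f"credibility {e.credibility}; encoded by {e.encoded_by}]"
        )
    return lines


# Illustrative data only.
events = [
    EncodedEvent("evt-101", "Ministers signed a bilateral gas supply agreement.",
                 "energy", "south-asia", "src-9001", "B", "analyst_a"),
    EncodedEvent("evt-102", "Parliament postponed the electoral reform vote.",
                 "elections", "southeast-asia", "src-9002", "A", "analyst_b"),
]
report = build_report(events, domain="energy")
```

Scoping is a filter on the structured fields, and the citation is built from the same record as the claim, so a report line cannot exist without a recoverable event behind it.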
04 Inquiries
Requires: scenarios + multi-analyst weighting history

Analytical intelligence about the analysis itself.

Inquiries are the layer of the system that has no equivalent in existing political analysis tools — and the one that most clearly illustrates what structured data makes possible when it accumulates over time. Where scenario tracking asks "what is the probability of this outcome," Inquiries ask a different set of questions: why did analysts disagree, who was right, and what does that reveal about the event or the analyst?

Because every probability movement in a scenario is attributed to specific encoded events and specific analyst weightings, with reasoning documented at the time of assignment, Inquiries can examine the history of a scenario's analysis at whatever level of granularity the question demands. The system can identify which analysts moved in tandem against the consensus and subsequently proved more accurate, and then surface the specific event weightings that explain the divergence. It can identify an analyst whose initial probability assessments consistently outperform their ongoing event weighting, which suggests that their analytical value lies in initial framing rather than in incremental updating. It can isolate a week in which two analysts reached similar directional conclusions through entirely different event selections, and ask what that means about the events themselves.

These are not capabilities that can be bolted onto a prose-based system after the fact. They depend on the full weighting history being structured, attributed, and queryable from the moment it was created. They are the compounding return on the prior discipline of encoding.

Illustration — the kind of question Inquiries answers

In a twelve-week scenario run, two analysts independently moved their probability assessments in the same direction during weeks three through six and again in weeks ten through twelve, against the consensus of all other analysts working the same scenario. Both ultimately proved more accurate than the group. Pulling the event assignments for those specific weeks reveals that both analysts weighted the same two events that no other analyst had included (one from North America, one from Southeast Asia) while working independently on a scenario centered on Northeast Asian political stability.

The question Inquiries raises: did these two analysts share an interpretive framework that caused them to see connections others missed? Were those events genuinely causally relevant, or coincidentally correlated with the outcome? The structured record provides the raw material to investigate both possibilities — and to do so reproducibly, with the full event and weighting history available for re-examination.

This is the kind of meta-analytical insight that professional intelligence shops spend significant manual effort to reconstruct after the fact, if they attempt it at all. With structured inputs, it is a query.
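The event comparison in the illustration above reduces to set operations once each analyst's assigned events are stored as structured records. The following is a minimal sketch under that assumption. The `events_unique_to` helper, the analyst names, and the event identifiers are hypothetical.

```python
def events_unique_to(pair, weightings_by_analyst):
    """Events weighted by every analyst in `pair` and by no analyst outside it."""
    shared = set.intersection(*(weightings_by_analyst[a] for a in pair))
    others = set().union(
        *(evts for a, evts in weightings_by_analyst.items() if a not in pair)
    )
    return shared - others


# Hypothetical event assignments for the weeks in which the two analysts diverged.
weeks_3_to_6 = {
    "analyst_a": {"evt-nea-01", "evt-nam-07", "evt-sea-12"},
    "analyst_b": {"evt-nea-02", "evt-nam-07", "evt-sea-12"},
    "analyst_c": {"evt-nea-01", "evt-nea-02"},
    "analyst_d": {"evt-nea-01"},
}

divergent = events_unique_to(("analyst_a", "analyst_b"), weeks_3_to_6)
```

The result contains the two events that only the diverging pair weighted. Whether those events were causally relevant or merely correlated with the outcome remains an analytical judgment, but the candidates for that judgment are found mechanically.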

Visual placeholder: multi-analyst scenario probability comparison, Inquiries view
The compounding return

Each layer of the stack is useful in its own right. Together, they constitute something that does not exist elsewhere: a closed analytical environment where every input is structured, every output is traceable, and every assumption is recorded.

The competitive advantage of this system is not any single capability. It is the corpus — the accumulating body of structured, governed, auditable event encodings that makes each layer of the stack more powerful over time. The corpus is not a byproduct of the system. It is the system.
