Core Concepts

Talonic structures unstructured documents into schema-validated data with per-cell provenance. These six concepts explain how the platform works under the hood.

The Field Registry

The unified knowledge graph of canonical fields discovered across all your documents. It covers the tier system, semantic clustering, and master extraction instructions.

Schemas as Entities

Schemas are versioned entities that define extraction targets. Each field links to the Field Registry and tracks changes via the Schema Graph.

Cases

Cases are groups of documents connected through shared entities. They form automatically from link keys and include evidence chains with AI narration.

Per-Cell Provenance

Every extracted value carries a confidence score, source region, reasoning trace, and resolution method. The confidence gate locks any cell whose confidence is 0.7 or higher.
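To make the shape of a provenance-carrying cell concrete, here is a minimal Python sketch. It is not the Talonic API or response format: the class names, attribute names, bounding-box layout, and the "label_match" and "agent_inference" method labels are all hypothetical, chosen only to mirror the four attributes listed above. The one value taken from the text is the 0.7 gate.

```python
from dataclasses import dataclass

# Threshold taken from the text above; everything else here is illustrative.
CONFIDENCE_GATE = 0.7


@dataclass(frozen=True)
class SourceRegion:
    """Hypothetical pointer to where a value was found in the document."""
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


@dataclass
class Cell:
    """One extracted value plus its provenance (hypothetical field names)."""
    field: str
    value: str
    confidence: float
    source_region: SourceRegion
    reasoning: str
    resolution_method: str

    @property
    def locked(self) -> bool:
        # A cell counts as locked once its confidence reaches the gate.
        return self.confidence >= CONFIDENCE_GATE


invoice_total = Cell(
    field="invoice_total",
    value="1250.00",
    confidence=0.93,
    source_region=SourceRegion(page=1, bbox=(402.0, 710.5, 488.0, 724.0)),
    reasoning="Matched the label 'Total' in the summary block.",
    resolution_method="label_match",
)
due_date = Cell(
    field="due_date",
    value="2024-03-01",
    confidence=0.55,
    source_region=SourceRegion(page=2, bbox=(90.0, 120.0, 180.0, 134.0)),
    reasoning="Date appears near 'payable by' but the label is ambiguous.",
    resolution_method="agent_inference",
)

print(invoice_total.locked, due_date.locked)
```

In this sketch the first cell is locked and the second stays open for further work, which is the behavior the confidence gate describes.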

The Four-Phase Pipeline

Resolve, Agent, Validate, and Re-read are the four phases that progressively fill the extraction grid with confidence-gated values.
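The idea of phases progressively filling a confidence-gated grid can be sketched as a simple loop. This is an illustration under stated assumptions, not Talonic's implementation: the phase bodies below are placeholders that return canned candidates, the function and type names are hypothetical, and the rule that a locked cell is skipped by later phases is an assumption drawn from the 0.7 gate described under Per-Cell Provenance.

```python
from typing import Callable, Optional

CONFIDENCE_GATE = 0.7  # gate value from the provenance description

Candidate = tuple[str, float]  # (value, confidence)
Phase = Callable[[str, Optional[Candidate]], Optional[Candidate]]


def run_pipeline(
    fields: list[str], phases: list[tuple[str, Phase]]
) -> dict[str, Optional[Candidate]]:
    """Run phases in order; each phase may only touch cells that are not locked."""
    grid: dict[str, Optional[Candidate]] = {f: None for f in fields}
    for _name, phase in phases:
        for field in fields:
            current = grid[field]
            if current is not None and current[1] >= CONFIDENCE_GATE:
                continue  # assumed rule: locked cells are left alone
            proposal = phase(field, current)
            # Keep a proposal only if it beats what the cell already holds.
            if proposal is not None and (current is None or proposal[1] > current[1]):
                grid[field] = proposal
    return grid


# Placeholder phases with canned outputs, purely for illustration.
def resolve(field, current):
    return {"invoice_number": ("INV-1042", 0.95)}.get(field)


def agent(field, current):
    return {"total": ("1250.00", 0.60)}.get(field)


def validate(field, current):
    return None  # a real validator would check values; this stub proposes nothing


def re_read(field, current):
    return {
        "total": ("1250.00", 0.85),
        "invoice_number": ("INV-9999", 0.99),  # ignored: that cell is already locked
    }.get(field)


grid = run_pipeline(
    ["invoice_number", "total", "due_date"],
    [("Resolve", resolve), ("Agent", agent), ("Validate", validate), ("Re-read", re_read)],
)
print(grid)
```

Here `invoice_number` locks in the first phase and is never revisited, `total` starts below the gate and is raised above it by the last phase, and `due_date` stays empty because no phase proposes a value.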

Document Ontology

A 529-type classification system aligned with DIN SPEC 91491 for consistent document type recognition across languages and industries.

For hands-on usage, see the Extract endpoint, Schemas API, or the authentication guide.