Supported Formats

Talonic processes a wide range of document formats, including PDF (native and scanned), DOCX, images (PNG, JPEG, TIFF), spreadsheets (XLSX, CSV), and email files (EML, MSG). Scanned documents and images go through the OCR pipeline first and then enter the four-phase extraction flow. Each format is mapped to the 529-type document ontology for accurate classification.
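The routing rule described above (scanned PDFs and images need OCR, everything else does not) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Talonic's implementation or API: the `FORMAT_FAMILIES` table, the `format_family` and `needs_ocr` helpers, and the `pdf_has_text_layer` flag are hypothetical names, and the extension list simply mirrors the formats named in the paragraph.

```python
from pathlib import Path
from typing import Optional

# Hypothetical lookup table built from the formats listed above.
# The family names are illustrative, not Talonic identifiers.
FORMAT_FAMILIES = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".tif": "image",
    ".tiff": "image",
    ".xlsx": "spreadsheet",
    ".csv": "spreadsheet",
    ".eml": "email",
    ".msg": "email",
}


def format_family(filename: str) -> Optional[str]:
    """Return the format family for a file name, or None if unsupported."""
    return FORMAT_FAMILIES.get(Path(filename).suffix.lower())


def needs_ocr(filename: str, pdf_has_text_layer: bool = True) -> bool:
    """Decide whether a file would need OCR before extraction.

    Assumption: images always need OCR, and a PDF needs OCR only when it is
    scanned, i.e. it has no embedded text layer. Detecting the text layer is
    out of scope here, so the caller passes it in as a flag.
    """
    family = format_family(filename)
    if family is None:
        raise ValueError(f"Unsupported format: {filename}")
    if family == "image":
        return True
    if family == "pdf":
        return not pdf_has_text_layer
    return False
```

For example, `needs_ocr("scan.tiff")` is true, while `needs_ocr("data.csv")` and `needs_ocr("report.pdf")` (a native PDF by default) are false. Matching on the file extension keeps the sketch short; a production system would more likely inspect the file contents as well.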