Extract Overview

Rynko Extract uses AI to pull structured data from unstructured documents — PDFs, images, spreadsheets, and more — and return it as clean JSON that matches your schema.

How It Works

Define a schema — Describe the fields you want to extract (e.g., invoice number, line items, total amount)
Upload documents — Send PDFs, images, Excel files, or other supported formats
Get structured data — Receive JSON output matching your schema, with confidence scores per field

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Documents   │ →  │   Rynko      │ →  │  Structured  │
│  PDF, Excel  │    │   Extract    │    │  JSON Data   │
│  Images      │    │   (AI)       │    │              │
└──────────────┘    └──────────────┘    └──────────────┘

Key Features

Schema-Driven Extraction

Define exactly what fields you need. The AI uses your schema as a guide, ensuring output always matches your expected structure.
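As a rough illustration of what "matching your schema" can mean in practice, the sketch below builds a JSON Schema style definition for the invoice fields mentioned in this overview and checks a result against it. The `lineItems` layout, the `conforms` helper, and the sample values are assumptions made for this example; they are not part of the Rynko API.

```python
import json

# Hypothetical invoice schema in JSON Schema style, mirroring the fields
# mentioned in this overview (invoice number, line items, total amount).
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoiceNumber": {"type": "string"},
        "lineItems": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
            },
        },
        "total": {"type": "number"},
    },
}

# Python types accepted for each schema type in this small subset.
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "array": list,
    "object": dict,
}


def conforms(value, schema):
    """Return True if value matches the small JSON Schema subset used here."""
    expected = _PY_TYPES[schema["type"]]
    # This subset has no boolean type, and bool is a subclass of int in
    # Python, so booleans are rejected outright.
    if isinstance(value, bool) or not isinstance(value, expected):
        return False
    if schema["type"] == "object":
        props = schema.get("properties", {})
        # Only keys declared in the schema are checked; extras are ignored.
        return all(conforms(value[k], props[k]) for k in value if k in props)
    if schema["type"] == "array":
        items = schema.get("items")
        return items is None or all(conforms(v, items) for v in value)
    return True


# Compact JSON string, the form a multipart field such as `schema=` would carry.
schema_field = json.dumps(INVOICE_SCHEMA, separators=(",", ":"))
```

A client-side check like this is optional; it is only a cheap guard before the extracted data is handed to downstream code.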

Multi-File Support

Upload multiple documents in a single job. Extract merges data across files with configurable conflict resolution:

Flag conflicts — Mark fields with conflicting values
Prefer first file — Use the first file's value
Prefer highest confidence — Use the most confident extraction
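The three strategies above can be sketched as a small merge function. This is only a model of the behavior described here: the `Candidate` shape, the strategy identifiers, and the choice to return no value for a flagged field are all assumptions, not Rynko's actual option names or response format.

```python
from dataclasses import dataclass

# Rank used to compare the HIGH / MEDIUM / LOW levels named in this overview.
CONFIDENCE_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


@dataclass
class Candidate:
    """One file's extracted value for a field (hypothetical shape)."""
    value: object
    confidence: str  # "HIGH", "MEDIUM", or "LOW"


def resolve(candidates, strategy):
    """Merge per-file candidates for one field, given in upload order.

    Returns (value, conflict_flag). Strategy names are illustrative only.
    """
    distinct = {repr(c.value) for c in candidates}
    if len(distinct) <= 1:
        # All files agree, or at most one file supplied the field.
        return (candidates[0].value if candidates else None), False
    if strategy == "flag_conflicts":
        # Assumption: a flagged field carries no merged value.
        return None, True
    if strategy == "prefer_first_file":
        return candidates[0].value, False
    if strategy == "prefer_highest_confidence":
        # max() returns the first maximal element, so ties go to the earlier file.
        best = max(candidates, key=lambda c: CONFIDENCE_RANK[c.confidence])
        return best.value, False
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, if two files report totals of 100.0 (MEDIUM) and 120.0 (HIGH), this sketch yields 100.0 under "prefer first file", 120.0 under "prefer highest confidence", and a conflict flag under "flag conflicts".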

Confidence Scoring

Every extracted field includes a confidence score (HIGH / MEDIUM / LOW), helping you decide whether to trust the result or flag it for human review.
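One way to act on these scores is to accept high-confidence fields automatically and queue the rest for a person to check. The sketch below assumes each field arrives as a `{"value": ..., "confidence": ...}` pair; that shape and the `split_for_review` helper are hypothetical and chosen only for illustration.

```python
def split_for_review(fields, auto_accept=("HIGH",)):
    """Partition extracted fields into accepted values and a review queue.

    `fields` maps field name -> {"value": ..., "confidence": "HIGH"|"MEDIUM"|"LOW"}.
    This shape is an assumption made for the example.
    """
    accepted, needs_review = {}, []
    for name, field in fields.items():
        if field["confidence"] in auto_accept:
            accepted[name] = field["value"]
        else:
            # Anything below the accepted levels goes to a human reviewer.
            needs_review.append(name)
    return accepted, needs_review
```

Passing `auto_accept=("HIGH", "MEDIUM")` loosens the policy so that only LOW fields are held back; which threshold is appropriate depends on how costly a wrong value is for your workflow.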

Provider-Agnostic

Extract supports multiple AI providers. The default provider is optimized for speed and cost, but you can select any of the following:

Google Gemini (default) — Fast, cost-effective
Anthropic Claude — Best for complex reasoning
OpenAI GPT-4o — Strong general performance
OpenRouter — Access to multiple models

Standalone vs Gate-Linked

Standalone Extract

Create an Extract config, upload files, get structured data. Use this when you just need to extract data without validation.

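# Create a job with a custom schema
# (the upload field name "file" and the path invoice.pdf are placeholders;
# check the Extract API Reference for the actual multipart field name)
curl -X POST https://api.rynko.dev/api/extract/jobs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf" \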
  -F 'schema={"type":"object","properties":{"invoiceNumber":{"type":"string"},"total":{"type":"number"}}}'

Gate-Linked Extract (Stage 0)

Enable Extract on a Flow gate to accept file uploads. The gate's schema becomes the extraction schema — no duplication needed.

File Upload → Stage 0 (Extract) → Stage 1 (Validate) → Stage 2+ (Render, Approve, Deliver)

Structured inputs (JSON/YAML/XML) skip Stage 0 automatically, costing 0 extract credits.

Supported File Types

Format	Extensions	Notes
PDF	`.pdf`	Text and scanned documents
Images	`.png`, `.jpg`, `.jpeg`, `.webp`	OCR-capable
Excel	`.xlsx`	Multi-sheet support
CSV	`.csv`	Tabular data
JSON	`.json`	Structured input (skip extraction)
XML	`.xml`	Structured input (skip extraction)
Text	`.txt`	Plain text
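Before uploading, a client may want to reject unsupported files early and know which inputs bypass extraction. The lookup below is transcribed from the table above; the `classify` helper and its return shape are hypothetical conveniences, not part of any Rynko SDK.

```python
from pathlib import PurePath

# Extension -> (format, skips_extraction), transcribed from the table above.
SUPPORTED = {
    ".pdf": ("PDF", False),
    ".png": ("Images", False),
    ".jpg": ("Images", False),
    ".jpeg": ("Images", False),
    ".webp": ("Images", False),
    ".xlsx": ("Excel", False),
    ".csv": ("CSV", False),
    ".json": ("JSON", True),
    ".xml": ("XML", True),
    ".txt": ("Text", False),
}


def classify(filename):
    """Return (format, skips_extraction) for a filename, or raise ValueError."""
    ext = PurePath(filename).suffix.lower()  # case-insensitive match
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported file type: {ext or filename!r}")
    return SUPPORTED[ext]
```

Matching on the extension alone is a simplification; a server may also inspect file contents, so treat this as a pre-flight check only.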

Pricing

During the founders preview, every team gets 100 free extraction credits. Each job consumes 1 credit regardless of file count.
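The per-job billing rule can be made concrete with a few lines of arithmetic. The constants come from the paragraph above; the function names are illustrative only.

```python
FREE_PREVIEW_CREDITS = 100  # free credits per team during the founders preview
CREDITS_PER_JOB = 1         # charged per job, regardless of file count


def credits_used(file_counts):
    """Credits consumed by a list of jobs, given each job's file count."""
    # Only the number of jobs matters; the file counts are ignored.
    return len(file_counts) * CREDITS_PER_JOB


def credits_remaining(file_counts, starting=FREE_PREVIEW_CREDITS):
    """Credits left after running the given jobs, floored at zero."""
    return max(starting - credits_used(file_counts), 0)
```

Under this rule, three jobs with 1, 5, and 20 files cost 3 credits in total, so batching related files into one job uses fewer credits than submitting them separately.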

After the founders preview ends, extract credits are available as monthly packs or one-time purchases.

Next Steps

Extract Quickstart — Get started in 5 minutes
Extract API Reference — Full endpoint documentation
Gate Stage 0 Guide — Use Extract with Flow gates
