Extract Quickstart

Extract structured data from a PDF in under 5 minutes.

Prerequisites

A Rynko account with an API key (create one here)
A document to extract from (PDF, image, or spreadsheet)

Step 1: Create an Extraction Job

Upload a file with a schema describing what to extract:

cURL:

# The form field name "files" is assumed here to mirror the SDK's files parameter.
curl -X POST https://api.rynko.dev/api/extract/jobs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "files=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"invoiceNumber":{"type":"string","description":"The invoice number"},"vendorName":{"type":"string","description":"Name of the vendor"},"totalAmount":{"type":"number","description":"Total amount due"},"lineItems":{"type":"array","description":"List of line items"}},"required":["invoiceNumber","totalAmount"]}'

Node.js:

import Rynko from '@rynko/sdk';

const rynko = new Rynko({ apiKey: 'YOUR_API_KEY' });

const job = await rynko.extract.create({
  files: ['./invoice.pdf'],
  schema: {
    type: 'object',
    properties: {
      invoiceNumber: { type: 'string', description: 'The invoice number' },
      vendorName: { type: 'string', description: 'Name of the vendor' },
      totalAmount: { type: 'number', description: 'Total amount due' },
      lineItems: { type: 'array', description: 'List of line items' },
    },
    required: ['invoiceNumber', 'totalAmount'],
  },
});

console.log('Job created:', job.id);

Python:

from rynko import RynkoClient

client = RynkoClient(api_key="YOUR_API_KEY")

job = client.extract.create(
    files=["./invoice.pdf"],
    schema={
        "type": "object",
        "properties": {
            "invoiceNumber": {"type": "string", "description": "The invoice number"},
            "vendorName": {"type": "string", "description": "Name of the vendor"},
            "totalAmount": {"type": "number", "description": "Total amount due"},
            "lineItems": {"type": "array", "description": "List of line items"},
        },
        "required": ["invoiceNumber", "totalAmount"],
    },
)

print(f"Job created: {job.id}")
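The inline schema in the cURL command is easy to break with a stray quote or bracket. As a minimal sketch using only Python's standard library, the same schema can be kept as a dict, checked locally, and serialized into the compact JSON string passed in the schema form field. The helper name check_schema is hypothetical and not part of the Rynko SDK.

```python
import json

# Same schema as in the examples above.
schema = {
    "type": "object",
    "properties": {
        "invoiceNumber": {"type": "string", "description": "The invoice number"},
        "vendorName": {"type": "string", "description": "Name of the vendor"},
        "totalAmount": {"type": "number", "description": "Total amount due"},
        "lineItems": {"type": "array", "description": "List of line items"},
    },
    "required": ["invoiceNumber", "totalAmount"],
}


def check_schema(schema: dict) -> list[str]:
    """Hypothetical local sanity check; returns a list of problems found."""
    problems = []
    properties = schema.get("properties", {})
    # Every required name should be defined under properties.
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"required field '{name}' is not defined in properties")
    # Every field should carry a type and a non-empty description.
    for name, spec in properties.items():
        if "type" not in spec:
            problems.append(f"field '{name}' has no type")
        if not spec.get("description"):
            problems.append(f"field '{name}' has no description")
    return problems


problems = check_schema(schema)
# Compact form suitable for pasting into the cURL schema form field.
schema_json = json.dumps(schema, separators=(",", ":"))
print(problems)
print(schema_json)
```

Keeping the schema in one place like this also lets the cURL, Node.js, and Python examples share a single definition.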

Step 2: Poll for Results

Extraction runs asynchronously. Poll the job status until it completes:

cURL:

curl https://api.rynko.dev/api/extract/jobs/JOB_ID \
  -H "Authorization: Bearer YOUR_API_KEY"

Node.js:

// Poll until complete
let result = await rynko.extract.get(job.id);
while (result.status === 'QUEUED' || result.status === 'PROCESSING') {
  await new Promise((r) => setTimeout(r, 2000));
  result = await rynko.extract.get(job.id);
}

console.log('Result:', JSON.stringify(result.result, null, 2));

Python:

import time

result = client.extract.get(job.id)
while result.status in ("QUEUED", "PROCESSING"):
    time.sleep(2)
    result = client.extract.get(job.id)

print(f"Result: {result.result}")
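The loops above keep polling for as long as a job stays in QUEUED or PROCESSING. A sketch of a bounded poll, with a timeout and a gradually increasing delay, might look like the following. wait_for_job is a hypothetical helper, not an SDK function; it accepts any callable that returns an object with a status attribute, so something like lambda: client.extract.get(job.id) should fit. This guide only names the QUEUED, PROCESSING, and COMPLETED statuses, so any other status is simply returned to the caller.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

PENDING_STATUSES = ("QUEUED", "PROCESSING")


def wait_for_job(
    fetch: Callable[[], Any],
    timeout: float = 120.0,
    initial_delay: float = 2.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() until the job leaves the pending statuses or the timeout elapses."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        job = fetch()
        if job.status not in PENDING_STATUSES:
            return job
        # Give up if the next sleep would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job still {job.status} after {timeout} seconds")
        sleep(delay)
        # Back off gently so long-running jobs are not polled every 2 seconds.
        delay = min(delay * 1.5, max_delay)


# Demonstration with a fake job source standing in for client.extract.get(job.id).
_statuses = iter(["QUEUED", "PROCESSING", "COMPLETED"])


def fake_fetch() -> SimpleNamespace:
    return SimpleNamespace(status=next(_statuses))


final = wait_for_job(fake_fetch, sleep=lambda seconds: None)
print(final.status)
```

Passing the fetch function in, rather than calling the client directly, keeps the helper testable without network access.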

Step 3: Use the Extracted Data

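The completed job's response contains the extracted values under result.data, matching your schema, along with a confidence label and score for each field under result.fields: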

{
  "id": "abc123",
  "status": "COMPLETED",
  "result": {
    "data": {
      "invoiceNumber": "INV-2026-001",
      "vendorName": "Acme Corp",
      "totalAmount": 1250.00,
      "lineItems": [
        { "description": "Widget A", "quantity": 10, "price": 50.00 },
        { "description": "Widget B", "quantity": 5, "price": 150.00 }
      ]
    },
    "fields": [
      { "field": "invoiceNumber", "confidence": "HIGH", "score": 0.98 },
      { "field": "vendorName", "confidence": "HIGH", "score": 0.95 },
      { "field": "totalAmount", "confidence": "HIGH", "score": 0.99 },
      { "field": "lineItems", "confidence": "MEDIUM", "score": 0.82 }
    ]
  }
}
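One way to act on the fields array is to accept high-scoring fields and route the rest to manual review. The sketch below works on the sample response above as a plain dict. The 0.9 threshold and the helper name fields_needing_review are illustrative assumptions, not Rynko recommendations, and the line-item cross-check relies on the quantity and price keys seen in the sample.

```python
import math

# The sample response from above, as a plain dict.
response = {
    "id": "abc123",
    "status": "COMPLETED",
    "result": {
        "data": {
            "invoiceNumber": "INV-2026-001",
            "vendorName": "Acme Corp",
            "totalAmount": 1250.00,
            "lineItems": [
                {"description": "Widget A", "quantity": 10, "price": 50.00},
                {"description": "Widget B", "quantity": 5, "price": 150.00},
            ],
        },
        "fields": [
            {"field": "invoiceNumber", "confidence": "HIGH", "score": 0.98},
            {"field": "vendorName", "confidence": "HIGH", "score": 0.95},
            {"field": "totalAmount", "confidence": "HIGH", "score": 0.99},
            {"field": "lineItems", "confidence": "MEDIUM", "score": 0.82},
        ],
    },
}


def fields_needing_review(result: dict, min_score: float = 0.9) -> list[str]:
    """Return the names of fields whose score falls below min_score."""
    return [f["field"] for f in result["fields"] if f["score"] < min_score]


data = response["result"]["data"]
to_review = fields_needing_review(response["result"])

# Cross-check: the line items should add up to the invoice total.
line_total = sum(item["quantity"] * item["price"] for item in data["lineItems"])
totals_match = math.isclose(line_total, data["totalAmount"])

print(to_review)
print(totals_match)
```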

Tips for Better Extraction

Add descriptions to schema fields — helps the AI understand what to look for
Use specific types — number for amounts, date for dates, array for lists
Mark required fields — ensures the AI prioritizes these fields
Use the discovery endpoint first if you're unsure what fields exist in your documents
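As an illustration of the first three tips, here is a sketch of a schema that gives every field a description, uses number and date types, and marks the key fields as required. The nested items definition for lineItems follows the standard JSON Schema items keyword; whether Rynko accepts nested definitions is an assumption this guide does not confirm, and the invoiceDate field is a hypothetical addition.

```python
invoice_schema = {
    "type": "object",
    "properties": {
        "invoiceNumber": {"type": "string", "description": "The invoice number"},
        # Hypothetical field showing the date type mentioned in the tips.
        "invoiceDate": {"type": "date", "description": "Date the invoice was issued"},
        "totalAmount": {"type": "number", "description": "Total amount due"},
        "lineItems": {
            "type": "array",
            "description": "List of line items",
            # Assumption: nested definitions use JSON Schema's items keyword.
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string", "description": "Item name"},
                    "quantity": {"type": "number", "description": "Units billed"},
                    "price": {"type": "number", "description": "Unit price"},
                },
            },
        },
    },
    "required": ["invoiceNumber", "totalAmount"],
}

# Quick check that no top-level field is missing a description.
missing_descriptions = [
    name
    for name, spec in invoice_schema["properties"].items()
    if not spec.get("description")
]
print(missing_descriptions)
```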

Next Steps

Extract Overview — Understand the full feature set
Extract API Reference — Complete endpoint documentation
Extract Schemas Guide — Schema best practices
