Ship your first extraction in under five minutes.
A REST API with an OpenAPI 3.1 spec. MCP-native agents. Production-ready webhooks from day one. Bring any HTTP client.
Quickstart
From zero to your first structured extraction in under five minutes. Copy-paste ready code.
→API Reference
Full OpenAPI 3.1 spec for every endpoint. Request/response shapes, error codes, rate limits.
→Auth & Rate Limits
Bearer-token authentication, signed webhook payloads, per-tier rate limits, and idempotency keys for retry-safe writes.
→A complete extraction in one curl.
POST a file and a schema. Get back structured data with pixel-level citations — all in one call. No SDK to install.
Try in playground$ curl -X POST https://api.datadistill.co/v1/extract \ -H "Authorization: Bearer $DD_KEY" \ -F "file=@invoice.pdf" \ -F 'schema={"invoice_id":"string","total":"number","due_date":"date"}' # 200 OK { "data": { "invoice_id": "INV-2024-0042", "total": 12480, "due_date": "2024-02-15" }, "provenance": { "total": { "page": 1, "bbox": [170,210,228,228] } } }
Use any HTTP client.
Native REST and webhooks. Bring whatever language you already ship in — generate types from the OpenAPI 3.1 spec and you're done.
Python
requests, httpx, or anything else. JSON in, JSON out.
requests.post(url, files=..., headers=H) TypeScript / Node
Native fetch. Generate types from our OpenAPI spec.
await fetch(url, { method: "POST", body }) Go
net/http. Context-aware, zero external dependencies.
http.Post(url, "multipart/form-data", body) cURL / shell
Test from any terminal. No project, no install.
curl -X POST $API/extract -F file=@... Compose extraction into your agents.
DataDistill ships an MCP server you can connect to any MCP-compatible client. Your agents can extract, verify, and query structured data without leaving the conversation.
# Drop DataDistill into any MCP agent from datadistill.mcp import server await server.serve(port=8080)
Ship your first extraction today.
API key in 30 seconds. No credit card. Full access to every API on the free tier.