Developers

Ship your first extraction in under five minutes.

A REST API with an OpenAPI 3.1 spec. MCP-native agents. Production-ready webhooks from day one. Bring any HTTP client.

Code, by example

A complete extraction in one curl.

POST a file and a schema. Get back structured data with pixel-level citations — all in one call. No SDK to install.

Try in playground
bash · quickstart.sh
$ curl -X POST https://api.datadistill.co/v1/extract \
    -H "Authorization: Bearer $DD_KEY" \
    -F "file=@invoice.pdf" \
    -F 'schema={"invoice_id":"string","total":"number","due_date":"date"}'

# 200 OK
{
  "data": {
    "invoice_id": "INV-2024-0042",
    "total": 12480,
    "due_date": "2024-02-15"
  },
  "provenance": {
    "total": { "page": 1, "bbox": [170,210,228,228] }
  }
}
MCP

Compose extraction into your agents.

DataDistill ships an MCP server you can connect to any MCP-compatible client. Your agents can extract, verify, and query structured data without leaving the conversation.

MCP-native stdio + HTTP transports Tool, resource, prompt primitives
python · mcp_server.py
# Drop DataDistill into any MCP agent
from datadistill.mcp import server

await server.serve(port=8080)
Start building

Ship your first extraction today.

API key in 30 seconds. No credit card. Full access to every API on the free tier.