# Ordica API Docs
Ordica is a proxy that reduces token costs. You bring your own provider API key — Ordica optimizes the prompt and forwards it to your provider under your key. Your provider account pays for model inference. No SDK swap required.
## Quick Start

Two things you need before you start:

- An active API key for a supported provider: OpenAI, Anthropic, Google Gemini, or xAI Grok
- An Ordica API key (`ord_...`) — issued at signup on ordica.ai
### OpenAI SDK drop-in — Python

Change one URL and add one header. Everything else stays the same.

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ordica.ai/v1",
    api_key=os.environ["ORDICA_API_KEY"],  # your ord_... key
    default_headers={
        # Your OpenAI key — forwarded directly; Ordica never stores it
        "X-Provider-Key": os.environ["OPENAI_API_KEY"],
    },
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
)
print(response.choices[0].message.content)
```
`api_key` is your Ordica account key — it identifies your account for billing. `X-Provider-Key` is your actual OpenAI key — it is what OpenAI bills for inference. Ordica never touches your OpenAI account directly.
### OpenAI SDK — Node / TypeScript

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ordica.ai/v1",
  apiKey: process.env.ORDICA_API_KEY,
  defaultHeaders: { "X-Provider-Key": process.env.OPENAI_API_KEY },
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize this document: ..." }],
});
console.log(response.choices[0].message.content);
```
### Anthropic SDK drop-in — Python

Set `base_url` to Ordica. The Anthropic SDK sends your Anthropic key as `x-api-key` automatically — add your Ordica key in the `Authorization` header.

```python
import os

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ordica.ai",
    api_key=os.environ["ANTHROPIC_API_KEY"],  # forwarded to Anthropic as x-api-key
    default_headers={
        "Authorization": f"Bearer {os.environ['ORDICA_API_KEY']}"
    },
)

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
)
print(message.content[0].text)
```
Set `stream=False` or omit it — streaming is not yet supported on the Anthropic endpoint; non-streaming works fully.
### Direct HTTP — curl

```shell
curl -s https://api.ordica.ai/v1/chat/completions \
  -H "Authorization: Bearer $ORDICA_API_KEY" \
  -H "X-Provider-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello."}]}' \
  | python3 -m json.tool
```
## Verifying compression is working

Ordica adds four headers to every /v1/chat/completions response:

| Header | Description |
|---|---|
| `X-Ordica-Request-Id` | Unique ID for this request |
| `X-Ordica-Tokens-Original` | Token count before optimization |
| `X-Ordica-Tokens-Compressed` | Token count after optimization |
| `X-Ordica-Savings-Pct` | Integer percent saved (0 on pass-through) |
```shell
curl -si https://api.ordica.ai/v1/chat/completions \
  -H "Authorization: Bearer $ORDICA_API_KEY" \
  -H "X-Provider-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello."}]}' \
  | grep -i 'x-ordica'
```
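As a sketch of what to do with those headers programmatically, the helper below summarizes the `X-Ordica-*` values from any headers mapping (e.g. `response.headers` from `requests` or `httpx`). The helper name `summarize_ordica_headers` is ours, not part of any SDK.

```python
def summarize_ordica_headers(headers: dict) -> dict:
    """Return request id, token counts, and savings from Ordica response headers."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    original = int(h["x-ordica-tokens-original"])
    compressed = int(h["x-ordica-tokens-compressed"])
    return {
        "request_id": h["x-ordica-request-id"],
        "original_tokens": original,
        "compressed_tokens": compressed,
        "savings_pct": int(h["x-ordica-savings-pct"]),
        "tokens_saved": original - compressed,
    }

summary = summarize_ordica_headers({
    "X-Ordica-Request-Id": "req_123",
    "X-Ordica-Tokens-Original": "1842",
    "X-Ordica-Tokens-Compressed": "1243",
    "X-Ordica-Savings-Pct": "33",
})
print(summary["tokens_saved"])  # 599
```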
What to expect
| Workload type | Typical range | Median |
|---|---|---|
| RAG retrieved-document blocks | 43–55% (non-Gemini) | ~50% |
| Multi-turn conversation histories (>2K tokens) | 31–40% | ~35% |
| Long system prompts and few-shot blocks | 22–33% | ~28% |
| Dense structured inputs (short JSON, schemas) | 7–10% | ~9% |
Gemini RAG savings are materially lower (~3%) due to a conservative optimization profile for that provider. Use the Analyzer to estimate savings on your own prompts.
Output quality, measured by blind-judged equivalence testing on RAG inputs (175 quality measurements across 4 providers), averaged 4.35 / 5.0. See the methodology →
Pass-through: when optimization would not safely reduce token count, the original prompt is forwarded unchanged and `X-Ordica-Savings-Pct` is `0`. Because Ordica's fee is a share of your savings, pass-through requests incur no fee.
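Since the fee is a share of realized savings, the billing rule can be sketched like this. The per-token price and the 20% fee share below are made-up illustrations, not Ordica's actual rates:

```python
def ordica_fee(tokens_saved: int, price_per_token: float, fee_share: float) -> float:
    """Fee as a share of realized savings; pass-through (0 tokens saved) costs nothing."""
    return tokens_saved * price_per_token * fee_share

# Pass-through request: no savings, no fee.
print(ordica_fee(0, 2.5e-06, 0.20))  # 0.0
# 599 tokens saved at an illustrative $2.50 / 1M input tokens:
print(ordica_fee(599, 2.5e-06, 0.20))
```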
## Authentication

Base URL: `https://api.ordica.ai`

All requests require your Ordica account key in the Authorization header:

```
Authorization: Bearer ord_your_key_here
```

Your `ord_...` key is issued at signup. Your provider API key (OpenAI, Anthropic, Gemini, or Grok) is passed separately per request — Ordica forwards it to the upstream provider and never stores it.
## POST /v1/chat/completions

OpenAI-compatible chat completions with prompt optimization. For OpenAI, Google Gemini, and xAI Grok.

### Request headers

| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | `Bearer ord_your_key_here` |
| `X-Provider-Key` | Yes | Your provider API key — forwarded to the upstream, never stored |
| `Content-Type` | Yes | `application/json` |
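A minimal sketch of assembling these three headers in Python. `build_chat_headers` is our own helper name, and the key-format check is an assumption based on the `ord_` prefix described above:

```python
def build_chat_headers(ordica_key: str, provider_key: str) -> dict:
    """Required headers for POST /v1/chat/completions."""
    if not ordica_key.startswith("ord_"):
        raise ValueError("Ordica keys start with ord_")
    return {
        "Authorization": f"Bearer {ordica_key}",
        "X-Provider-Key": provider_key,  # forwarded upstream, never stored
        "Content-Type": "application/json",
    }

headers = build_chat_headers("ord_example", "sk-example")
```

Pass the resulting dict as the `headers` argument of whatever HTTP client you use.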
### Request body

Standard OpenAI chat completions format. Required fields:

| Field | Type | Description |
|---|---|---|
| `model` | string | Model name (e.g. `gpt-4o`, `gemini-2.0-flash`, `grok-3`). Determines the upstream provider. |
| `messages` | array | Conversation turns. Each item: `{"role": "user"\|"assistant"\|"system", "content": "..."}` |

All other OpenAI-compatible fields (`temperature`, `max_tokens`, `stream`, etc.) are forwarded to the upstream as-is.
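For instance, a body combining the required fields with pass-through sampling options might look like this (the option values are purely illustrative):

```python
import json

# `model` and `messages` are required; temperature / max_tokens are
# OpenAI-compatible options forwarded to the upstream unchanged.
body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize this document: ..."}],
    "temperature": 0.2,  # forwarded as-is
    "max_tokens": 512,   # forwarded as-is
}
payload = json.dumps(body)
```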
### Response

Standard OpenAI chat completions response body, unmodified, plus four additional headers: `X-Ordica-Request-Id`, `X-Ordica-Tokens-Original`, `X-Ordica-Tokens-Compressed`, `X-Ordica-Savings-Pct`.

Streaming: `stream: true` is supported. On stream errors, error events are emitted as SSE before the stream closes.

Provider errors (e.g. a 400 or 429 from OpenAI) are returned with the upstream status code and body unmodified — your existing error-handling code continues to work.
## POST /v1/messages

Anthropic-compatible Messages API with prompt optimization. For Anthropic Claude models only. The Anthropic SDK can be pointed at this endpoint without code changes.

### Request headers

| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | `Bearer ord_your_key_here` |
| `x-api-key` | Yes | Your Anthropic API key — forwarded to Anthropic, never stored |
| `Content-Type` | Yes | `application/json` |
| `anthropic-version` | No | Forwarded to Anthropic. Defaults to `2023-06-01`. |

Standard Anthropic Messages API request body. Note: `"stream": true` is not yet supported — requests with streaming return 400.

Response: standard Anthropic Messages API response body, returned unmodified. No `X-Ordica-*` headers on this endpoint.
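If you call this endpoint over raw HTTP rather than through the Anthropic SDK, note that the header set differs from /v1/chat/completions. A sketch, with `build_messages_headers` as our own helper name:

```python
def build_messages_headers(ordica_key: str, anthropic_key: str,
                           anthropic_version: str = "2023-06-01") -> dict:
    """Headers for POST /v1/messages: Ordica key in Authorization,
    Anthropic key in x-api-key (forwarded to Anthropic, never stored)."""
    return {
        "Authorization": f"Bearer {ordica_key}",
        "x-api-key": anthropic_key,
        "Content-Type": "application/json",
        "anthropic-version": anthropic_version,  # optional; default matches the docs
    }

headers = build_messages_headers("ord_example", "sk-ant-example")
```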
## POST /compress

Token-count endpoint — no upstream API call is made. Runs the optimization pipeline and returns token counts and savings. Use this to benchmark how much your workload will save before integrating.

No `X-Provider-Key` required.

```shell
curl -s https://api.ordica.ai/compress \
  -H "Authorization: Bearer $ORDICA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Your long prompt..."}],"provider":"openai"}'
```
### Response

```json
{
  "original_tokens": 1842,
  "compressed_tokens": 1243,
  "savings_pct": 33
}
```
## GET /health

Service status. No authentication required.

```shell
curl https://api.ordica.ai/health
# {"status": "ok"}
```
## Error codes

| Status | Code | Meaning |
|---|---|---|
| 400 | `missing_provider_key` | `X-Provider-Key` (or `x-api-key` on /v1/messages) header is absent. |
| 400 | `unsupported_model` | Model starts with `claude-`. Use /v1/messages for Anthropic models. |
| 400 | `invalid_request` | Malformed JSON, missing `model`, or empty `messages`. |
| 401 | `missing_api_key` | Authorization header absent or malformed. |
| 401 | `invalid_api_key` | `ord_...` key not recognized. |
| 429 | — | Rate limit exceeded. Retry with backoff. |
| 502 | `upstream_error` | Network failure reaching the upstream provider. |
| 504 | `upstream_timeout` | Upstream provider did not respond in time. |

Ordica-generated errors return an OpenAI-compatible JSON envelope:

```json
{"error": {"message": "...", "type": "invalid_request_error", "code": "missing_provider_key"}}
```
## Supported providers
| Provider | Endpoint | Model prefix examples |
|---|---|---|
| OpenAI | /v1/chat/completions | gpt-4o, gpt-4o-mini, o1, o3, o4-mini |
| Google Gemini | /v1/chat/completions | gemini-2.0-flash, gemini-1.5-pro |
| xAI Grok | /v1/chat/completions | grok-3, grok-2 |
| Anthropic Claude | /v1/messages | claude-opus-4-5, claude-sonnet-4-5 |
Unknown model prefixes on /v1/chat/completions fall back to the OpenAI upstream.
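The routing rule can be sketched as follows; the prefix lists are illustrative, taken from the examples in the table above:

```python
def route(model: str) -> tuple:
    """Map a model name to (provider, Ordica endpoint)."""
    if model.startswith("claude-"):
        return ("anthropic", "/v1/messages")
    for provider, prefixes in [
        ("openai", ("gpt-", "o1", "o3", "o4")),
        ("gemini", ("gemini-",)),
        ("grok", ("grok-",)),
    ]:
        if model.startswith(prefixes):
            return (provider, "/v1/chat/completions")
    # Unknown prefixes fall back to the OpenAI upstream.
    return ("openai", "/v1/chat/completions")

print(route("claude-opus-4-5"))  # ('anthropic', '/v1/messages')
print(route("mystery-model"))    # ('openai', '/v1/chat/completions')
```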
## Rate limits
| Plan | Limit |
|---|---|
| Free | 10,000 requests / month |
| Pro | 100,000 requests / day |
| Enterprise | Contact us |
## Data handling
Prompt content is never written to disk, never logged, and never used for training. The only data retained after a request: token counts, timestamp, and provider — to power your billing meter.
Terms of Service · Privacy Policy · Data Processing Agreement