Ordica API Docs

Ordica is a proxy that reduces token costs. You bring your own provider API key — Ordica optimizes the prompt and forwards it to your provider under your key. Your provider account pays for model inference. No SDK swap required.

Quick Start

Two things you need before you start: your Ordica account key (ord_..., issued at signup) and your provider API key (OpenAI, Anthropic, Gemini, or Grok).

OpenAI SDK drop-in — Python

Change one URL and add one header. Everything else stays the same.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ordica.ai/v1",
    api_key=os.environ["ORDICA_API_KEY"],          # your ord_... key
    default_headers={
        # Your OpenAI key — forwarded directly; Ordica never stores it
        "X-Provider-Key": os.environ["OPENAI_API_KEY"],
    },
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
)
print(response.choices[0].message.content)

Two keys, two roles. api_key is your Ordica account key; it identifies your account for billing. X-Provider-Key is your actual OpenAI key; that is the key OpenAI bills for the inference. Ordica forwards it with each request and never stores it.

OpenAI SDK — Node / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ordica.ai/v1",
  apiKey: process.env.ORDICA_API_KEY,
  defaultHeaders: { "X-Provider-Key": process.env.OPENAI_API_KEY },
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize this document: ..." }],
});
console.log(response.choices[0].message.content);

Anthropic SDK drop-in — Python

Set base_url to Ordica. The Anthropic SDK sends your Anthropic key as x-api-key automatically — add your Ordica key in the Authorization header.

import os
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ordica.ai",
    api_key=os.environ["ANTHROPIC_API_KEY"],      # forwarded to Anthropic as x-api-key
    default_headers={
        "Authorization": f"Bearer {os.environ['ORDICA_API_KEY']}"
    },
)

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
)
print(message.content[0].text)

Streaming is not yet available on the Anthropic endpoint: pass stream=False or omit it. Non-streaming requests work fully.

Direct HTTP — curl

curl -s https://api.ordica.ai/v1/chat/completions \
  -H "Authorization: Bearer $ORDICA_API_KEY" \
  -H "X-Provider-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello."}]}' \
  | python3 -m json.tool

Verifying compression is working

Ordica adds four headers to every /v1/chat/completions response:

Header                       Description
X-Ordica-Request-Id          Unique ID for this request
X-Ordica-Tokens-Original     Token count before optimization
X-Ordica-Tokens-Compressed   Token count after optimization
X-Ordica-Savings-Pct         Integer percent saved (0 on pass-through)

curl -si https://api.ordica.ai/v1/chat/completions \
  -H "Authorization: Bearer $ORDICA_API_KEY" \
  -H "X-Provider-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello."}]}' \
  | grep -i 'x-ordica'
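
The same check from Python. This is a minimal sketch using the OpenAI SDK's with_raw_response helper to read the response headers; the header names come from the table above and the client is configured exactly as in the Quick Start.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ordica.ai/v1",
    api_key=os.environ["ORDICA_API_KEY"],
    default_headers={"X-Provider-Key": os.environ["OPENAI_API_KEY"]},
)

# with_raw_response exposes the HTTP response headers alongside the parsed body
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello."}],
)

print("request id:", raw.headers.get("x-ordica-request-id"))
print("tokens:    ", raw.headers.get("x-ordica-tokens-original"),
      "->", raw.headers.get("x-ordica-tokens-compressed"))
print("savings %: ", raw.headers.get("x-ordica-savings-pct"))

completion = raw.parse()  # the usual ChatCompletion object
print(completion.choices[0].message.content)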

What to expect

Workload type                                    Typical range         Median
RAG retrieved-document blocks                    43–55% (non-Gemini)   ~50%
Multi-turn conversation histories (>2K tokens)   31–40%                ~35%
Long system prompts and few-shot blocks          22–33%                ~28%
Dense structured inputs (short JSON, schemas)    7–10%                 ~9%

Gemini RAG savings are materially lower (~3%) due to a conservative optimization profile for that provider. Use the Analyzer to estimate savings on your own prompts.

Output quality, measured by blind-judged equivalence testing on RAG inputs (175 quality measurements across 4 providers), averaged 4.35 / 5.0. See the methodology →

Pass-through: when optimization would not safely reduce token count, the original prompt is forwarded unchanged and the response reports X-Ordica-Savings-Pct: 0. Because Ordica's fee is a share of your savings, pass-through requests incur no fee.

Authentication

Base URL: https://api.ordica.ai

All requests require your Ordica account key in the Authorization header:

Authorization: Bearer ord_your_key_here

Your ord_... key is issued at signup. Your provider API key (OpenAI, Anthropic, Gemini, or Grok) is passed separately per request — Ordica forwards it to the upstream provider and never stores it.

POST /v1/chat/completions

OpenAI-compatible chat completions with prompt optimization. For OpenAI, Google Gemini, and xAI Grok.

Request headers

Header           Required   Description
Authorization    Yes        Bearer ord_your_key_here
X-Provider-Key   Yes        Your provider API key — forwarded to the upstream, never stored
Content-Type     Yes        application/json

Request body

Standard OpenAI chat completions format. Required fields:

Field      Type     Description
model      string   Model name (e.g. gpt-4o, gemini-2.0-flash, grok-3). Determines the upstream provider.
messages   array    Conversation turns. Each item: {"role": "user"|"assistant"|"system", "content": "..."}

All other OpenAI-compatible fields (temperature, max_tokens, stream, etc.) are forwarded to the upstream as-is.
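
A minimal request-body sketch over plain HTTP (using Python's requests library rather than an SDK); the optional fields shown here are forwarded to the upstream unchanged.

import os
import requests

resp = requests.post(
    "https://api.ordica.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['ORDICA_API_KEY']}",
        "X-Provider-Key": os.environ["OPENAI_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "Summarize this document: ..."},
        ],
        # optional OpenAI-compatible fields, forwarded as-is
        "temperature": 0.2,
        "max_tokens": 256,
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])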

Response

Standard OpenAI chat completions response body, unmodified, plus four additional headers: X-Ordica-Request-Id, X-Ordica-Tokens-Original, X-Ordica-Tokens-Compressed, X-Ordica-Savings-Pct.

Streaming. stream: true is supported. On stream errors, error events are emitted as SSE before the stream closes.

Provider errors (e.g. a 400 or 429 from OpenAI) are returned with the upstream status code and body unmodified — your existing error-handling code continues to work.
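
A minimal streaming sketch with the OpenAI Python SDK (client configured as in the Quick Start); deltas arrive exactly as they would when talking to the provider directly.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ordica.ai/v1",
    api_key=os.environ["ORDICA_API_KEY"],
    default_headers={"X-Provider-Key": os.environ["OPENAI_API_KEY"]},
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
    stream=True,
)
for chunk in stream:
    # some chunks (e.g. the final one) may carry no content delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()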

POST /v1/messages

Anthropic-compatible Messages API with prompt optimization. For Anthropic Claude models only. The Anthropic SDK can be pointed at this endpoint without code changes.

Request headers

Header              Required   Description
Authorization       Yes        Bearer ord_your_key_here
x-api-key           Yes        Your Anthropic API key — forwarded to Anthropic, never stored
Content-Type        Yes        application/json
anthropic-version   No         Forwarded to Anthropic. Defaults to 2023-06-01.

Standard Anthropic Messages API request body. Note: "stream": true is not yet supported — requests with streaming return 400.

Response: standard Anthropic Messages API response body, returned unmodified. No X-Ordica-* headers on this endpoint.
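
A direct-HTTP sketch in Python using the requests library; the header names and the default anthropic-version are as in the table above.

import os
import requests

resp = requests.post(
    "https://api.ordica.ai/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ['ORDICA_API_KEY']}",
        "x-api-key": os.environ["ANTHROPIC_API_KEY"],  # forwarded to Anthropic, never stored
        "Content-Type": "application/json",
        "anthropic-version": "2023-06-01",             # optional; this is the default
    },
    json={
        "model": "claude-opus-4-5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Summarize this document: ..."}],
    },
)
resp.raise_for_status()
print(resp.json()["content"][0]["text"])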

POST /compress

Token-count endpoint — no upstream API call is made. Runs the optimization pipeline and returns token counts and savings. Use this to benchmark how much your workload will save before integrating.

No X-Provider-Key required.

curl -s https://api.ordica.ai/compress \
  -H "Authorization: Bearer $ORDICA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Your long prompt..."}],"provider":"openai"}'

Response

{
  "original_tokens": 1842,
  "compressed_tokens": 1243,
  "savings_pct": 33
}
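
To benchmark a batch of prompts from Python, a minimal sketch using the requests library; the sample prompts are placeholders, and the response fields match the example above.

import os
import requests

prompts = ["Your long prompt...", "Another long prompt..."]  # placeholder workload

for prompt in prompts:
    resp = requests.post(
        "https://api.ordica.ai/compress",
        headers={
            "Authorization": f"Bearer {os.environ['ORDICA_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={"messages": [{"role": "user", "content": prompt}], "provider": "openai"},
    )
    resp.raise_for_status()
    r = resp.json()
    print(f"{r['original_tokens']} -> {r['compressed_tokens']} tokens ({r['savings_pct']}% saved)")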

GET /health

Service status. No authentication required.

curl https://api.ordica.ai/health
# {"status": "ok"}

Error codes

Status   Code                   Meaning
400      missing_provider_key   X-Provider-Key (or x-api-key on /v1/messages) header is absent.
400      unsupported_model      Model starts with claude-. Use /v1/messages for Anthropic models.
400      invalid_request        Malformed JSON, missing model, or empty messages.
401      missing_api_key        Authorization header absent or malformed.
401      invalid_api_key        ord_... key not recognized.
429                             Rate limit exceeded. Retry with backoff.
502      upstream_error         Network failure reaching the upstream provider.
504      upstream_timeout       Upstream provider did not respond in time.

Ordica-generated errors return an OpenAI-compatible JSON envelope:

{"error": {"message": "...", "type": "invalid_request_error", "code": "missing_provider_key"}}

Supported providers

Provider           Endpoint               Model prefix examples
OpenAI             /v1/chat/completions   gpt-4o, gpt-4o-mini, o1, o3, o4-mini
Google Gemini      /v1/chat/completions   gemini-2.0-flash, gemini-1.5-pro
xAI Grok           /v1/chat/completions   grok-3, grok-2
Anthropic Claude   /v1/messages           claude-opus-4-5, claude-sonnet-4-5

Unknown model prefixes on /v1/chat/completions fall back to the OpenAI upstream.

Rate limits

Plan         Limit
Free         10,000 requests / month
Pro          100,000 requests / day
Enterprise   Contact us

Data handling

Prompt content is never written to disk, never logged, and never used for training. The only data retained after a request: token counts, timestamp, and provider — to power your billing meter.

Terms of Service · Privacy Policy · Data Processing Agreement