AI prompts go in. Fewer tokens come out.
Same quality. Lower cost. Your data stays yours.
Every prompt is analyzed and optimized before it reaches the AI. Redundant language removed. Context preserved. Quality verified across four providers.
Validated blind on 200+ real prompts across GPT-4o, Claude Sonnet, Gemini Pro, and Grok. Quality scores above 4.0 out of 5.0 on every provider.
ChatGPT, Claude, Gemini, and Grok. Each with a tuned compression profile matching the model's sensitivity. Switch freely.
Key material generated from real quantum circuit measurements. AES-256-GCM encryption. XOR-sharded memory vault. Physics, not marketing.
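The XOR-sharding idea can be pictured with a short sketch (our own illustration, not Ordica's implementation): a key is split into shards, each indistinguishable from random noise on its own, and only XORing all of them together recovers the original.

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def shard(key: bytes, n: int = 3) -> list[bytes]:
    # n-1 shards are pure random; the last is the key XORed with all of
    # them, so no single shard carries any information about the key.
    shards = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    shards.append(reduce(xor_bytes, shards, key))
    return shards

def reassemble(shards: list[bytes]) -> bytes:
    # XORing every shard cancels the random masks and yields the key.
    return reduce(xor_bytes, shards)

key = secrets.token_bytes(32)  # stand-in for a 256-bit AES key
shards = shard(key, 4)
assert reassemble(shards) == key
```

Missing even one shard leaves the remainder statistically random, which is why splitting key material this way hardens an in-memory vault.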
Change one import. Set one environment variable. Your existing code works immediately. No config files. No dashboard. No SDK complexity.
Compression runs in your environment. Your prompts never touch our servers. Zero data exposure by architecture, not policy.
The compression engine learns from deployment patterns and improves over time. It is protected by 14 patents and encrypted end-to-end.
pip install ordica
Set your Ordica API key
Change one line of code
from ordica import OpenAI
Every API call is compressed
Same responses, fewer tokens
Ordica compresses prompts inside your own environment. We never see your messages, your API keys, or your data.
The only signal that crosses the wire is anonymous telemetry — token counts and savings percentages. This isn't a policy. It's the architecture.
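As a concrete picture, a telemetry record of this kind might look like the following (field names are our own guesses, not Ordica's actual schema). Note that nothing in it can carry message content:

```python
# Hypothetical telemetry record; field names are illustrative only.
# Counts and a derived percentage, never prompt or response text.
record = {
    "provider": "openai",
    "tokens_original": 1840,
    "tokens_compressed": 1210,
}
record["savings_pct"] = round(
    100 * (1 - record["tokens_compressed"] / record["tokens_original"]), 1
)
print(record["savings_pct"])  # 34.2
```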
Provider-specific compression profiles tuned to each model's sensitivity.
Blind-tested on 200+ real prompts. An independent AI judged compressed vs. original responses without knowing which was which.
Quality delta = compressed score minus original score. Positive means compression improved the response. Deltas within ±0.10 fall inside the normal statistical margin of the evaluation, meaning no meaningful quality difference either way.
All tests run with production models. No cherry-picking. Full methodology available on request.
No. Your messages pass through the proxy to reach the AI provider, but we never store, log, or read them. The only data we keep is anonymous counts — how many tokens were sent, how many were saved, and which provider you used. There is no database column for message content. It doesn't exist in our system.
We tested this blind across 200+ real prompts on all four providers. Quality scores stayed above 4.0 out of 5.0. The compression removes redundant language and fluff that the AI doesn't need — think of it as editing a wordy email before sending it. The meaning stays the same.
The compression adds a few milliseconds — you won't notice it. Whether you're using ChatGPT, Claude, Gemini, or Grok, the provider's response time is what you feel, and that's unchanged. In some cases, shorter prompts actually get faster responses because the AI has less to process.
The system is fail-safe. If compression can't be applied confidently, your prompt goes through untouched — you just don't save tokens on that message. You'll never get a broken response because of compression. Worst case is zero savings, not worse quality.
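The fail-safe behavior can be sketched as a simple confidence gate. The compressor below is a toy stand-in (a whitespace squeeze with a made-up confidence score), not the real engine; the point is the gating logic:

```python
def compress_prompt(prompt: str) -> tuple[str, float]:
    """Toy compressor: returns (compressed_text, confidence)."""
    compressed = " ".join(prompt.split())  # trivial whitespace squeeze
    # Illustrative confidence: high only when something was removed.
    confidence = 0.9 if len(compressed) < len(prompt) else 0.3
    return compressed, confidence

def safe_compress(prompt: str, threshold: float = 0.8) -> str:
    compressed, confidence = compress_prompt(prompt)
    # Below the threshold, the original goes through untouched:
    # worst case is zero savings, never a degraded prompt.
    return compressed if confidence >= threshold else prompt

assert safe_compress("already tight") == "already tight"
assert safe_compress("lots   of    extra   spaces") == "lots of extra spaces"
```

The invariant is that every prompt leaving the gate is either a confidently compressed version or the exact original, so quality can never drop below the passthrough baseline.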
ChatGPT (GPT-4o) is a strong all-rounder — great for general questions, writing, and brainstorming. Claude is known for natural writing and careful, thoughtful responses. Gemini has the deepest reasoning and largest context window. Grok is fast and conversational with less filtering. Try all four and see which one clicks for you.
During the alpha, yes — completely free. You're helping us test the compression on real conversations, which is valuable to us. There's no catch, no credit card, no upsell. After the alpha period, the chat interface may move to a paid model, but you'll know well in advance.
It tells us whether the AI's response was good or not — that's it. We don't see the response itself, just your rating. This helps us make sure compression isn't hurting quality. It's completely optional.
Alpha access is invite-only. Try the chat interface for free, or integrate the SDK in sixty seconds.