Ordica cuts your LLM bill. The range is 7% to 50%, depending on your prompts, which we haven't seen.

Paste a prompt and find out where you land. Benchmark →


Your prompt is processed in memory only — never written to disk, never sent to a model for inference. We log content-free metadata for abuse prevention. View audit policy →

How this works

Token counts are from the vendor's tokenizer. Savings estimates come from cohort matching against a blind-judged benchmark — the compression engine never runs on your input here. Your prompt content is not logged. Full audit policy →

# One env var. No SDK changes. No code rewrite.
# Point your existing OpenAI or Anthropic SDK at Ordica:

export OPENAI_BASE_URL="https://api.ordica.ai"
# — or —
export ANTHROPIC_BASE_URL="https://api.ordica.ai"

# Your SDK keeps working. You keep your key.
# Get API access →
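The same override works from inside Python: both the official OpenAI and Anthropic SDKs read these environment variables when a client is constructed. A minimal sketch (client construction is shown in comments because it requires an installed SDK and a valid key):

```python
import os

# Route traffic through the Ordica proxy. Both openai-python and
# anthropic-python honor these overrides at client construction.
os.environ["OPENAI_BASE_URL"] = "https://api.ordica.ai"
os.environ["ANTHROPIC_BASE_URL"] = "https://api.ordica.ai"

# Client code is unchanged from here, e.g.:
#   from openai import OpenAI
#   client = OpenAI()  # uses OPENAI_API_KEY and the base URL above
```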
VERIFIED SAVINGS BY WORKLOAD
Your savings depend on what you send us.
Here is the honest range.

RAG retrieval: 42.8–55.3% (median 49.8%). Conversation history: 35.5% median. Instruction-heavy prompts: 28% median. Dense structured traffic: 7–10%. Full breakdown →

How it works
Three steps. No new dependencies.
1. Point. Set one environment variable: OPENAI_BASE_URL or ANTHROPIC_BASE_URL = https://api.ordica.ai

2. Authenticate. Keep your existing provider API key. We forward it to your chosen provider and do not retain it after the request.

3. Save. Prompts are compressed when savings confidence clears the threshold, and passed through unchanged when it does not. You do not get a worse response; you get no savings on that request.
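The compress-or-pass-through gate in step 3 can be sketched in a few lines of Python. The threshold value and function names here are illustrative, not Ordica internals:

```python
CONFIDENCE_THRESHOLD = 0.9  # illustrative value, not Ordica's actual setting

def route_prompt(prompt, compress, confidence):
    """Compress only when estimated savings confidence clears the bar;
    otherwise forward the prompt unchanged (zero savings, zero risk)."""
    if confidence(prompt) >= CONFIDENCE_THRESHOLD:
        return compress(prompt)
    return prompt  # pass-through: worst case is no savings, not worse quality

# Example with stub compress/confidence functions:
shrunk = route_prompt("some long prompt", lambda p: p[:8], lambda p: 0.95)
kept = route_prompt("dense structured", lambda p: p[:8], lambda p: 0.2)
```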

xAI users: the OpenAI SDK already works — same env var, your Grok API key. Gemini users: pass http_options={"base_url":"https://api.ordica.ai/gemini"} to the google-genai client, or contact us for deeper integration support.
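For google-genai the override travels in http_options rather than an environment variable, per the note above. A sketch, assuming the client accepts a plain dict for http_options (client construction is commented out since it needs the SDK and a key):

```python
# Gemini-compatible endpoint path from the note above.
http_options = {"base_url": "https://api.ordica.ai/gemini"}

# With the google-genai package installed:
#   from google import genai
#   client = genai.Client(http_options=http_options)  # key from GEMINI_API_KEY
```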

Privacy
Prompts pass through.
Nothing sticks.

Your prompts move through the proxy in memory, get compressed, and go to the provider you designated. We do not store them, log them, read them, or train on them. The only thing we retain is counts: tokens in, tokens out, which provider you used.

Billing runs on those counts alone. Prompt content is not persisted beyond the compression step — our audit logs contain metadata only: token counts, timestamps, billing meters. Retention and sub-processor details are enumerated in our Data Processing Agreement.
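As a concrete illustration of "counts only," a content-free audit record of the shape described above might look like this. Field names are hypothetical, not Ordica's actual schema:

```python
from datetime import datetime, timezone

def audit_record(tokens_in: int, tokens_out: int, provider: str) -> dict:
    """Hypothetical metadata-only log entry: counts, timestamp, routing.
    No prompt or response text is ever part of the record."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "provider": provider,
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,  # token count after compression
        # deliberately no "messages" field: content is not persisted
    }
```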

Your prompt → Ordica proxy (compressed) → your AI provider
Token counts → Ordica telemetry (anonymized)
Message content → never retained in our storage
Compatible
Works with the model you already use

Same proxy endpoint. Same API key. Different cost line on your bill.

GPT-4o
Claude
Gemini
Grok
Validated
Measured, not promised

Every message is a real-world test. A separate judge model scores compressed vs. original responses without knowing which is which.

Blind-judged 3.88 / 5.0 mean across 775 quality evaluations — 200 prompts × 4 providers, single judge blind to which response was compressed. Per-provider breakdown →

Estimated annual savings

See your number

Gross savings
$2,500 / month
$30,000 cut from your annual LLM bill.
Pro tier — you keep 70%
$1,750 / mo
$21,000 / year
Enterprise — you keep 80%
$2,000 / mo
$24,000 / year
Conservative = dense structured traffic (10% p25). Blended = four-cohort median (30%). We make money when tokens disappear. We make nothing when they don't. See pricing →
Domain Benchmarks
How your document type changes the math

Document-processing workloads compress further: financial filings at 81% mean reduction, regulatory documents at 44%. Domain benchmarks →

Pricing
You only pay when we save you money

Your billing dashboard shows how many tokens you sent, how many tokens we compressed them to, and what that saved you in dollars. Your bill is 30% of that dollar savings on Pro, 20% on Enterprise, zero on Free. No compression, no fee.
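The fee arithmetic in that paragraph can be sketched directly. Tier rates are taken from the pricing copy above; a worked example follows:

```python
# Revenue-share rates from the pricing copy: 30% of measured dollar
# savings on Pro, 20% on Enterprise, 0% on Free.
RATES = {"free": 0.0, "pro": 0.30, "enterprise": 0.20}

def monthly_fee(dollar_savings: float, tier: str) -> float:
    """Ordica's fee: a share of measured savings. No compression, no fee."""
    return dollar_savings * RATES[tier]

def net_savings(dollar_savings: float, tier: str) -> float:
    """What you keep after Ordica's share."""
    return dollar_savings - monthly_fee(dollar_savings, tier)

# Example: $1,000 of measured savings in a month.
# Pro: fee $300, you keep $700. Enterprise: fee $200, you keep $800.
```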

Free
$0
No credit card. No compression, no fee. Start now.

  • Drop-in proxy endpoint — point your existing SDK at api.ordica.ai
  • GPT-4o, Claude, Gemini, Grok
  • Conversation-history workloads: 35.5% median
  • Up to 10K requests/month
  • Savings dashboard
  • Community support
  • Best-effort availability — see Terms
One environment variable. No new SDK. No code rewrite. Methodology →
Enterprise
20% of your measured savings
Lower rate. Custom profile. We take 20%.

  • Everything in Pro
  • Custom optimization profile
  • On-prem / air-gapped SDK deployment — roadmap; available for qualified prospects under custom engagement. Contact sales for current availability and attestation requirements.
  • Direct access to the founder
  • Custom SLA under enterprise contract
  • Unlimited requests
  • Annual commitment, terms negotiated per contract
  • On-prem billing uses signed usage attestations from your own telemetry
Lower rate because your volume improves the product for everyone.
Automatic monthly renewal — please read.
By starting a paid plan, you authorize Ordica LLC to charge your payment method on a recurring monthly basis until you cancel. Your subscription automatically renews at the then-current price at the end of each monthly term. You may cancel at any time from your account dashboard or by emailing support@ordica.ai; cancellation takes effect at the end of the current billing period and stops further charges. Refund and cancellation terms are set out in our Terms.
Questions
Frequently asked

Does compression ever degrade response quality?
Sometimes. We ran 200 prompts across 4 providers — 775 blind quality judgments total. Compressed outputs scored 3.88/5.0 on average. We recommend running your own evaluation before committing. Benchmark protocol →

Does the proxy add latency?
The compression adds a few milliseconds. The provider's response time dominates. Shorter prompts can produce faster provider responses — fewer tokens to process.

Does compression break prompt caching?
Not in our testing. Ordica's compression is deterministic on our tested corpus: same input, byte-identical output, so cache keys stay stable. In our test harness, a compressed prompt produced a full read-hit on the second call (Anthropic cache_read_input_tokens equal to cache_creation_input_tokens). cache_control markers are left untouched, and provider-side cache hits are unaffected. Validate your own prompt shapes.

What happens when a prompt can't be compressed confidently?
If compression confidence is insufficient, the prompt passes through unchanged. You lose the savings on that request. You do not get a degraded response. Worst case is zero savings, not worse quality.

Do you read or store my prompts?
No. Your messages pass through the proxy to reach the AI provider. Our audit and billing logs contain metadata only — token counts, timestamps, provider, and billing meters. Prompt and response content is not persisted beyond the compression step. Retention windows, sub-processors, and data-subject rights are enumerated in our Data Processing Agreement.

What happens if I do nothing?
Your prompts keep wasting tokens you are paying for.

Who can sign up?
Free tier: open globally. Pro and Enterprise: US billing address required.

Contact Ordica AI Support: support@ordica.ai