Question 1

How is the 2% fee calculated?

Accepted Answer

The fee is calculated based on your total gross inference spend through the Neural Router. We track the cost reported by the model providers (OpenAI, Anthropic, etc.) and apply a 2% management fee. There are no additional per-request costs or setup fees.
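
As a rough illustration of the arithmetic described above, the sketch below sums provider-reported costs and applies the 2% fee. The function name `management_fee`, the sample costs, and the rounding to whole cents are assumptions made for the example; the source does not say how the router rounds or invoices.

```python
from decimal import Decimal, ROUND_HALF_UP

FEE_RATE = Decimal("0.02")  # the 2% management fee stated in the answer above


def management_fee(provider_costs):
    """Return (gross_spend, fee) in USD for provider-reported costs.

    Costs are passed as strings so Decimal parses them exactly.
    Rounding the fee to whole cents is an assumption, not documented behavior.
    """
    gross = sum((Decimal(c) for c in provider_costs), Decimal("0"))
    fee = (gross * FEE_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return gross, fee


# Hypothetical month: three provider invoices totalling $200.00.
gross, fee = management_fee(["120.00", "75.50", "4.50"])
print(f"gross spend: ${gross}, fee: ${fee}")
```

Because the fee is a flat percentage of gross spend, it does not depend on the number of requests, which matches the statement that there are no per-request costs.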

Question 2

Can I use my own API keys?

Accepted Answer

Yes. Neural Router acts as a high-performance control plane: you bring your own credentials for the underlying models, and we handle routing, caching, and observability in between.

Question 3

What defines "Enterprise" deployment?

Accepted Answer

Enterprise deployments are designed for teams that require SOC 2 or HIPAA compliance, data residency within specific regions (e.g., EU-only), or air-gapped VPC installations. These plans also include 24/7 dedicated engineering support.

Question 4

Does semantic caching impact latency?

Accepted Answer

Yes, for the better: it improves latency. Cache hits return in under 50 ms, which is significantly faster than any LLM generation. For cache misses, the router adds less than 5 ms of overhead.
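
To see what those two numbers imply on average, here is a minimal sketch of the expected per-request latency as a function of cache hit rate. It treats 50 ms and 5 ms as worst-case bounds taken from the answer above; the function name, the hit rate, and the 1,000 ms provider latency are hypothetical inputs, not measured figures.

```python
def expected_latency_ms(hit_rate, provider_latency_ms,
                        hit_ms=50.0, miss_overhead_ms=5.0):
    """Average latency per request for a given cache hit rate.

    hit_ms and miss_overhead_ms default to the upper bounds quoted above.
    A miss pays the full provider latency plus the router overhead.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_ms = provider_latency_ms + miss_overhead_ms
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


# Hypothetical workload: half of the requests hit the cache,
# and the provider takes 1,000 ms to generate a response.
print(expected_latency_ms(0.5, 1000.0))
```

With no cache hits the average is the provider latency plus at most 5 ms, so under these assumptions caching can only hurt when the provider itself answers in less than about 45 ms.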

Service type	Multiplier	Why it costs more / less
Saver	0.85×	Best-effort, tolerant of degraded providers
Standard	1.0×	Smart default — best quality per dollar
Scale	0.7×	Async/batch — savings passed on to win volume
Turbo	1.25×	Reserves low-latency premium backends
Precision	1.4×	Quality floor + cascade-escalation
Agent	1.3×	Trajectory optimization + affinity infra
Custom	1.0×	Bring your own policy
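
The multipliers in the table above can be read as a simple lookup. The sketch below assumes the multiplier scales a per-request base charge; the table does not say exactly which amount it applies to, so `base_charge` and the function name are illustrative assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP

# Multipliers copied from the service type table above.
SERVICE_MULTIPLIERS = {
    "Saver": Decimal("0.85"),
    "Standard": Decimal("1.0"),
    "Scale": Decimal("0.7"),
    "Turbo": Decimal("1.25"),
    "Precision": Decimal("1.4"),
    "Agent": Decimal("1.3"),
    "Custom": Decimal("1.0"),
}


def apply_service_multiplier(base_charge, service_type):
    """Scale a base charge (USD, given as a string) by the service multiplier.

    Assumption: the multiplier applies to the whole base charge and the
    result is rounded to cents.
    """
    try:
        multiplier = SERVICE_MULTIPLIERS[service_type]
    except KeyError:
        raise ValueError(f"unknown service type: {service_type!r}") from None
    scaled = Decimal(base_charge) * multiplier
    return scaled.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical $10.00 request priced under each service type.
for name in SERVICE_MULTIPLIERS:
    print(name, apply_service_multiplier("10.00", name))
```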

Plan	Price	Token markup	Service types	Value-share	Dedicated
Free / Trial	$0 + usage	30%	Standard, Saver	—	—
Pro	$0 base + usage	25%	all 7	optional 25%	—
Team	$99/mo + usage	20%	all 7	optional 20%	Priority tier
Enterprise	custom	negotiated	all 7 + Custom	15–25%	Platinum / Dedicated
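
For a quick comparison of the plans above, the following sketch estimates a monthly bill as the base price plus provider token cost with the plan's markup added. This formula is an assumption: the table lists the numbers but not how they combine, and the optional value-share column is not modeled. Enterprise is left out because its price and markup are negotiated.

```python
from decimal import Decimal, ROUND_HALF_UP

# (monthly base price in USD, token markup) copied from the plan table above.
PLANS = {
    "Free / Trial": (Decimal("0"), Decimal("0.30")),
    "Pro": (Decimal("0"), Decimal("0.25")),
    "Team": (Decimal("99"), Decimal("0.20")),
}


def monthly_bill(plan, provider_token_cost):
    """Estimate a monthly bill: base price + token cost * (1 + markup).

    provider_token_cost is the raw provider cost in USD, given as a string.
    The formula itself is an assumption made for illustration.
    """
    base, markup = PLANS[plan]
    usage = Decimal(provider_token_cost) * (Decimal("1") + markup)
    return (base + usage).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical month with $1,000.00 of provider token cost.
for plan in PLANS:
    print(plan, monthly_bill(plan, "1000.00"))
```

Under this reading, Team becomes cheaper than Pro once the 5-point markup difference outweighs the $99 base price, which happens above $1,980 of monthly provider cost.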

Predictable pricing for unpredicted scale.

The Math Behind the Router

Pick the speed/quality tier per request

Plan ladder

Frequently Asked Questions