Features
Service tiers & savings
Choose how each request is routed — by cost, speed, quality, or throughput — step up to enterprise capacity tiers, and understand how providers are ranked and how savings are credited.
Service types
Every request is routed under a service type that sets the optimization objective. Set it per workspace or override it per request with the service field.
{
"model": "auto",
"service": "saver", // standard | saver | turbo | precision | scale | agent | custom
"messages": [ ... ]
}Standard
The balanced default. Neural Router scores eligible models on quality-per-dollar and dispatches to the best one, failing over automatically if a provider degrades. Use it when you have no strong cost or latency preference.
Saver
Optimizes for the lowest cost that still clears the request's quality bar — ideal for high-volume, cost-sensitive workloads like classification or summarization.
"service": "saver"Turbo
Optimizes for the lowest latency, routing to the fastest healthy provider. Use it for interactive, user-facing calls where time-to-first-token matters most.
Precision
Optimizes for the highest quality, preferring top-scoring models regardless of price. Use it for hard reasoning, evaluation, or final-output generation.
Scale
Tuned for high throughput: spreads load across providers and absorbs bursts without throttling, so large batch jobs complete predictably.
Agent
Optimized for tool-calling and multi-step agent loops, favoring models with reliable function calling and stable structured output across turns.
Custom
Define your own objective and constraints — allowed models, regions, and per-request budgets — for full control over routing.
{
"service": "custom",
"routing": {
"objective": "quality",
"max_cost_usd": 0.02,
"allow_models": ["gpt-4o", "claude-3-5-sonnet"]
}
}Enterprise tiers
Enterprise tiers layer reserved capacity and stronger guarantees on top of any service type.
Priority
Reserved capacity and elevated rate limits so your traffic is served ahead of standard demand during peaks.
Platinum
Everything in Priority plus stricter SLAs, faster support response targets, and enhanced observability and reporting.
Dedicated
Isolated, single-tenant capacity with custom SLAs, dedicated routing, and white-glove onboarding for the most demanding workloads.
Provider scorecard
The router continuously measures each provider's quality, latency, uptime, and price and rolls them into a scorecard. Routing decisions rank eligible providers by these live metrics, so traffic naturally shifts toward the best-performing options.
Provider eligibility
Before scoring, Neural Router filters the provider set by your policies — data region, allow-lists, BYOK requirements, and current health. Only eligible providers are ranked, so a request never routes somewhere your governance rules forbid.
Value-based savings
Savings figures are value-based: each routed (or cached) request is credited against what the equivalent direct provider call would have cost. The dollars shown therefore reflect real avoided spend, not list-price guesses.
Dashboard reference
Each feature in the dashboard carries a (?) icon that links back to the matching reference section below.
Overview
The Overview gives a snapshot of the active workspace — recent spend, requests, and tokens — with quick links into keys, usage, and routing so you can jump straight to whatever needs attention.
Usage
Usage charts spend, requests, and token volume over time for the active workspace, broken down by model and by key so you can see where consumption concentrates.
Logs
Logs is a live feed of recent inference requests — model, status, latency, and cost per call — for quickly spotting errors, slow calls, or unexpected spend.
Model catalog
The model catalog lists every model the router can dispatch to, with its provider, context window, and per-million-token pricing for prompt, completion, and cached tokens.
API keys
Create, rotate, and revoke API keys. Each key carries a service type, an optional spend limit, an allowed-models list, and a rate limit — all enforced by the router on every request.
Routing policy
The routing policy is the workspace default: the service type and candidate models applied to any request that doesn't override them. Per-request and per-key settings take precedence over this default.
Enterprise tiers (reservations)
Reserve dedicated capacity for a workspace and choose a tier — Priority, Platinum, or Dedicated — that layers stronger guarantees on top of whatever service type each request uses. See the tier descriptions above for what each one adds.
Cost Advisor
Cost Advisor surfaces estimated savings, spend anomalies against the trailing baseline, and one-click suggestions such as moving a key to a cheaper service type when its traffic allows it.
Billing
Billing shows your plan and its limits alongside monthly statements aggregated from real spend and top-ups recorded in the ledger — never fabricated rows.
Credits
Credits is your org's prepaid balance. Buy credits and configure auto top-up — a threshold and refill amount — so traffic never stalls on an empty balance.
Team
Team manages org members and pending invites. Roles — owner, admin, and member — govern who can change keys, billing, and routing policy.
Workspaces
Each workspace isolates its own API keys, routing policy, and usage. Switching the active workspace scopes the rest of the dashboard to it.
Presets
A preset is a saved, named routing configuration — a service type plus tuned intent dimensions — that you can reuse across keys instead of re-tuning the same Custom settings each time.
Provider reference
If your organization serves models on the network, the supply-side pages below each carry their own (?) reference.
Provider hub
The Provider hub is the supply-side home for organizations serving models — application status, listed models, endpoints, and earnings in one place.
Audition
New provider endpoints start in audition: the router probes them on live-shadow traffic and only promotes them to serve real requests once they clear the quality and uptime bar.
Earnings
Earnings reports the revenue from traffic routed to your models, broken down by model and request volume over the selected period.
Yield
Yield turns the router's real routing decisions into supply-side telemetry: how often each of your models won, why it lost when it didn't (price, quality, latency, or eligibility), and — for price losses — the average price gap to the winner. Use it to see where a small price move would win volume.
Endpoints
Endpoints lists the API endpoints you serve along with the router's last connectivity probe — reachability, status code, and latency — so you can confirm the router can reach you.
Provider models
Provider models are the models you list on the network, including family, context window, and the input/output pricing the router quotes to buyers.
Where to set this
Pick a default service type per workspace, then override per request with the service field. The (?) icons across the dashboard link back to the matching section here.