Documentation

Build on one API. Reach every model.

Neural Router is an OpenAI-compatible gateway that routes each request to the best model across 400+ LLMs — by cost, latency, or quality — with governance, observability, and failover built in.

What is Neural Router?

Neural Router sits in front of 60+ providers and 400+ models behind a single, OpenAI-compatible endpoint. You send a chat completion request; the router scores eligible models against your objective and policies and dispatches to the best one, automatically failing over if a provider degrades.

Because the API mirrors the OpenAI Chat Completions schema, you can point an existing OpenAI SDK at Neural Router by changing only the base URL and API key — then layer routing, budgets, caching, and residency on top.

Core concepts

One key, one schema, every major model.
Route by quality-per-dollar, lowest-cost, lowest-latency, or highest-quality.
Ordered fallback chains and automatic provider failover.
Budgets, guardrails, semantic caching, data residency, and an audit log per workspace.

Build on one API. Reach every model.

What is Neural Router?

Start here

Quickstart

Provider routing

Model routing

Models

Structured outputs

API reference

Core concepts