Routing
Model routing
Set model to "auto" and Neural Router scores eligible models against your objective and policies, then dispatches to the best one — with an explainable readout of why.
Objectives
The route.objective field decides what "best" means:
- quality-per-dollar — Best quality for the price — the default balance.
- lowest-cost — Cheapest eligible model that meets your constraints.
- lowest-latency — Fastest to first token within the quality floor.
- highest-quality — Top-scoring model regardless of price.
A request-level objective overrides the workspace default you configure under Routing.
The route object
"route": {
"objective": "quality-per-dollar",
"fallbacks": ["gpt-4o", "claude-sonnet"],
"max_latency_ms": 4000,
"task_class": "coding" // optional intent hint: coding | vision | long-context | cheap-bulk | general
}fallbacks defines an ordered chain tried if the primary choice fails. task_class is an optional hint that biases selection toward models strong at that kind of work.
Explainability
Every routed response includes an x-nr-model header and a routing block describing the decision — the selected model, the objective, how many candidates were considered, and a short reason.
"routing": {
"selected": "claude-sonnet",
"objective": "quality-per-dollar",
"candidates_considered": 7,
"reason": "Best quality-per-dollar within latency budget"
}Provider selection
Model routing chooses which model; provider routing chooses which endpoint serves it. Combine both — see Provider routing.