Model routing

Routing

Model routing

Set model to "auto" and Neural Router scores eligible models against your objective and policies, then dispatches to the best one — with an explainable readout of why.

Objectives

The route.objective field decides what "best" means:

  • quality-per-dollar Best quality for the price — the default balance.
  • lowest-cost Cheapest eligible model that meets your constraints.
  • lowest-latency Fastest to first token within the quality floor.
  • highest-quality Top-scoring model regardless of price.

A request-level objective overrides the workspace default you configure under Routing.

The route object

route object
"route": {
  "objective": "quality-per-dollar",
  "fallbacks": ["gpt-4o", "claude-sonnet"],
  "max_latency_ms": 4000,
  "task_class": "coding"   // optional intent hint: coding | vision | long-context | cheap-bulk | general
}

fallbacks defines an ordered chain tried if the primary choice fails. task_class is an optional hint that biases selection toward models strong at that kind of work.

Explainability

Every routed response includes an x-nr-model header and a routing block describing the decision — the selected model, the objective, how many candidates were considered, and a short reason.

response.routing
"routing": {
  "selected": "claude-sonnet",
  "objective": "quality-per-dollar",
  "candidates_considered": 7,
  "reason": "Best quality-per-dollar within latency budget"
}

Provider selection

Model routing chooses which model; provider routing chooses which endpoint serves it. Combine both — see Provider routing.