API
Models
GET /v1/models
Returns the list of models available through the Nebo API. The response follows the OpenAI models list format.
Request
curl https://janus.neboloop.com/v1/models \
-H "Authorization: Bearer $NEBO_TOKEN"
Response
{
  "object": "list",
  "data": [
    {
      "id": "deepseek.v3.2",
      "object": "model",
      "created": 1712100000,
      "owned_by": "bedrock"
    },
    {
      "id": "gpt-5.2",
      "object": "model",
      "created": 1712100000,
      "owned_by": "openai"
    },
    {
      "id": "gpt-5-nano",
      "object": "model",
      "created": 1712100000,
      "owned_by": "openai"
    },
    {
      "id": "gpt-5-mini",
      "object": "model",
      "created": 1712100000,
      "owned_by": "openai"
    },
    {
      "id": "qwen.qwen3-32b",
      "object": "model",
      "created": 1712100000,
      "owned_by": "bedrock"
    }
  ]
}
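The model IDs and providers can be pulled out of this response with a few lines of Python. This sketch parses an abridged copy of the sample payload above; in practice you would fetch the body from GET /v1/models with your bearer token.

```python
import json

# Abridged /v1/models response, copied from the example above.
response_body = """
{
  "object": "list",
  "data": [
    {"id": "deepseek.v3.2", "object": "model", "created": 1712100000, "owned_by": "bedrock"},
    {"id": "gpt-5.2", "object": "model", "created": 1712100000, "owned_by": "openai"}
  ]
}
"""

models = json.loads(response_body)["data"]

# Group model IDs by the provider that serves them (the owned_by field).
by_provider = {}
for m in models:
    by_provider.setdefault(m["owned_by"], []).append(m["id"])

print(by_provider)  # {'bedrock': ['deepseek.v3.2'], 'openai': ['gpt-5.2']}
```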
The owned_by field indicates which provider serves the model.
Smart Routing vs Direct Selection
When you send a chat completion request, you choose how the model is selected:
Smart Routing
Set model to "auto" or leave it empty. Nebo's routing engine picks the best model for your message based on:
- Content analysis — the complexity of your prompt (ML-based scoring)
- Keyword rules — specific patterns that route to specialized models
- Plan tier — your subscription determines which models are available
- Cost optimization — cheaper models for simple tasks, premium models for complex reasoning
{"model": "auto", "messages": [...]}
This is the recommended approach for most use cases. You get optimal cost-performance without managing model selection yourself.
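A minimal sketch of a smart-routed request using only the Python standard library. The /v1/chat/completions path is assumed here to follow the OpenAI convention implied by the request body above; the actual send is shown commented out because it requires a valid NEBO_TOKEN.

```python
import json
import os
import urllib.request


def build_chat_request(messages, model="auto"):
    """Build a chat completion payload; model='auto' enables smart routing."""
    return {"model": model, "messages": messages}


payload = build_chat_request([{"role": "user", "content": "Summarize this paragraph."}])

# To actually send it (assumed endpoint path; requires NEBO_TOKEN in the environment):
# req = urllib.request.Request(
#     "https://janus.neboloop.com/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={
#         "Authorization": f"Bearer {os.environ['NEBO_TOKEN']}",
#         "Content-Type": "application/json",
#     },
# )
# resp = urllib.request.urlopen(req)

print(payload["model"])  # auto
```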
Direct Selection
Set model to a specific model ID from the list above. Your request routes directly to that model with no routing logic applied.
{"model": "deepseek.v3.2", "messages": [...]}
Use direct selection when you need deterministic model behavior — for example, testing, benchmarking, or workflows that depend on a specific model's capabilities.
Note: Direct model selection bypasses tier gating. Your request will be processed regardless of your plan tier, but it still consumes your usage budget. Expensive models consume budget faster.
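Since direct selection skips routing entirely, it can be worth failing fast on unknown model IDs before sending a request. This is a hypothetical client-side helper, not part of the Nebo API; it checks a requested ID against the data array returned by GET /v1/models.

```python
def validate_direct_model(model_id, available_models):
    """Fail fast if a directly selected model is not in the /v1/models list.

    available_models is the 'data' array from GET /v1/models.
    """
    ids = {m["id"] for m in available_models}
    if model_id not in ids:
        raise ValueError(f"Unknown model {model_id!r}; available: {sorted(ids)}")
    return model_id


# Sample catalog, trimmed to the id field used for validation.
catalog = [{"id": "deepseek.v3.2"}, {"id": "gpt-5.2"}, {"id": "gpt-5-mini"}]
print(validate_direct_model("deepseek.v3.2", catalog))  # deepseek.v3.2
```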
Model Details
| Model | Provider | Strengths | Cost Tier |
|---|---|---|---|
| deepseek.v3.2 | Bedrock | Fast coding and reasoning, large context (164K) | Low |
| qwen.qwen3-32b | Bedrock | General purpose, large context (131K) | Lowest |
| gpt-5.2 | OpenAI | Complex reasoning, 400K context | High |
| gpt-5-nano | OpenAI | Quick answers, trivial queries, 400K context | Lowest |
| gpt-5-mini | OpenAI | Balanced performance, 400K context | Low |
Model availability may change. Always call GET /v1/models for the current list.
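Because availability can change, a client can cache the model list with a short TTL instead of hardcoding it. This is a sketch of one such cache; the fetch callable (which would wrap GET /v1/models) is injected so the caching logic runs without the network.

```python
import time


class ModelCache:
    """Cache the /v1/models list for a short TTL so availability stays fresh.

    fetch is any callable returning the 'data' array from GET /v1/models.
    """

    def __init__(self, fetch, ttl_seconds=300):
        self.fetch = fetch
        self.ttl = ttl_seconds
        self._models = None
        self._fetched_at = 0.0

    def models(self):
        now = time.monotonic()
        if self._models is None or now - self._fetched_at > self.ttl:
            self._models = self.fetch()
            self._fetched_at = now
        return self._models


# Stub fetcher that counts how often the "API" is actually hit.
calls = []
cache = ModelCache(lambda: calls.append(1) or [{"id": "gpt-5-mini"}], ttl_seconds=300)
cache.models()
cache.models()  # second call within the TTL is served from the cache
print(len(calls))  # 1
```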
Embeddings Models
Embedding models are separate from chat models. See Embeddings for available embedding models.
| Model | Maps To | Dimensions |
|---|---|---|
| neboloop/nebo-embed-small | text-embedding-3-small | 1536 |
| neboloop/nebo-embed-large | text-embedding-3-large | 3072 |
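The dimensions column matters when sizing a vector index. A small lookup built from the table above keeps that mapping in one place client-side:

```python
# Mapping from the table above: Nebo embedding model -> (underlying model, dimensions).
EMBEDDING_MODELS = {
    "neboloop/nebo-embed-small": ("text-embedding-3-small", 1536),
    "neboloop/nebo-embed-large": ("text-embedding-3-large", 3072),
}


def embedding_dimensions(model_id):
    """Return the vector size for an embedding model, e.g. to size a vector index."""
    return EMBEDDING_MODELS[model_id][1]


print(embedding_dimensions("neboloop/nebo-embed-large"))  # 3072
```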
How Budget Is Affected
Different models consume budget at different rates. When you use smart routing, Nebo automatically selects the most efficient model for each request. Advanced reasoning models consume more budget per request than lightweight models.
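If you select models directly, you can still lean on the cost tiers from the Model Details table to keep consumption down. This sketch orders the published tiers and picks the cheapest candidate; the tier labels come from the table above, but the selection helper itself is hypothetical, not a Nebo API feature.

```python
# Cost tiers from the Model Details table, ordered cheapest first.
TIER_ORDER = ["Lowest", "Low", "High"]

MODEL_TIERS = {
    "qwen.qwen3-32b": "Lowest",
    "gpt-5-nano": "Lowest",
    "deepseek.v3.2": "Low",
    "gpt-5-mini": "Low",
    "gpt-5.2": "High",
}


def cheapest(candidates):
    """Pick the candidate in the lowest cost tier (ties broken alphabetically)."""
    return min(candidates, key=lambda m: (TIER_ORDER.index(MODEL_TIERS[m]), m))


print(cheapest(["gpt-5.2", "gpt-5-mini", "qwen.qwen3-32b"]))  # qwen.qwen3-32b
```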
See Usage & Limits for budget details.
Next Steps
- Chat Completions — send requests with your chosen model
- Embeddings — generate text embeddings
- Usage & Limits — understand how model selection affects your budget