Chat
Create chat completions using a conversational message format. Supports streaming, tool use, vision, and reasoning models. OpenAI-compatible.
Create a chat completion
Authorization
api_key: API key authentication using Bearer token format
In: header
Request Body
application/json
Penalize repeated tokens (-2.0 to 2.0) (double)
Token bias map
Return log probabilities
Maximum completion tokens (int64, 0 <= value)
Maximum tokens (deprecated, use max_completion_tokens) (int64, 0 <= value)
Conversation messages
Request metadata
Model to use for completion
Hadrian Extension: List of models for multi-model routing (alternative to single model)
Penalize new topics (-2.0 to 2.0) (double)
Random seed for reproducibility (int64)
Enable streaming
Sampling temperature (0.0 to 2.0) (double)
Available tools
Number of top log probabilities to return (0-20) (int32, 0 <= value)
Nucleus sampling probability (0.0 to 1.0) (double)
User identifier for abuse detection
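The numeric ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: the field names `frequency_penalty`, `presence_penalty`, `temperature`, `top_p`, and `top_logprobs` are assumed from OpenAI's Chat Completions API (this page only names `messages`, `model`, and `max_completion_tokens` explicitly), and `build_chat_request` is a hypothetical helper.

```python
from typing import Any

# Ranges are the ones stated in the parameter descriptions.
# Field names are ASSUMED from OpenAI's Chat Completions API.
_RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "top_logprobs": (0, 20),
}


def build_chat_request(
    model: str, messages: list[dict[str, Any]], **options: Any
) -> dict[str, Any]:
    """Return a request body dict; raise ValueError on out-of-range options."""
    for name, value in options.items():
        if name in _RANGES:
            low, high = _RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    # The token limit is documented as "0 <= value".
    if options.get("max_completion_tokens", 0) < 0:
        raise ValueError("max_completion_tokens must be >= 0")
    return {"model": model, "messages": messages, **options}


body = build_chat_request(
    "openai/gpt-4o",
    [{"role": "user", "content": "Hello, how are you?"}],
    temperature=0.7,
    max_completion_tokens=256,
)
```

Validating locally only saves a round trip; the server remains the authority on which values it accepts.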
Response Body
application/json
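For readers calling this endpoint from code, here is a hedged Python sketch of the same POST using only the standard library. It assumes that "Bearer token format" means the standard `Authorization: Bearer <key>` header; the host `gateway.example.com`, the key placeholder, and the helper `make_chat_request` are hypothetical.

```python
import json
import urllib.request


def make_chat_request(base_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/chat/completions."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API key is sent as a standard Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = make_chat_request(
    "https://gateway.example.com",  # hypothetical host
    "YOUR_API_KEY",                 # placeholder, not a real key
    {
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Hello, how are you?"}],
    },
)
# urllib.request.urlopen(request) would send it; omitted here so the
# sketch runs without network access.
```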
curl -X POST "https://loading/api/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "content": "Hello, how are you?",
        "role": "user"
      }
    ],
    "model": "openai/gpt-4o"
  }'

{
"error": {
"code": "routing_error",
"message": "Model 'invalid-model' not found"
}
}

{
"error": {
"code": "invalid_api_key",
"message": "Invalid API key provided"
}
}

{
"error": {
"code": "rate_limit_exceeded",
"details": {
"limit": 100,
"retry_after_secs": 30,
"window": "minute"
},
"message": "Rate limit exceeded: 100 requests per minute"
}
}

{
"error": {
"code": "provider_error",
"message": "Upstream provider returned error: Service temporarily unavailable"
}
}
Create a response
Authorization
api_key: API key authentication using Bearer token format
In: header
Request Body
application/json
Run in background
Items to include in response
Input messages/items
System instructions
Maximum output tokens (double)
Request metadata
Model to use
Hadrian Extension: List of models for multi-model routing (alternative to single model)
Allow parallel tool calls
Hadrian Extension: Plugins to enable for this request
Previous response ID for conversation continuation
Prompt template reference
Prompt cache key
Hadrian Extension: Provider routing configuration
Reasoning configuration
Safety identifier
Service tier
Store response
Enable streaming
Sampling temperature (0.0 to 2.0) (double)
Text configuration
Tool choice configuration
Available tools
Hadrian Extension: Top-k sampling (supported by some providers like Anthropic) (double)
Nucleus sampling probability (0.0 to 1.0) (double)
Truncation strategy
User identifier for abuse detection
Response Body
application/json
curl -X POST "https://loading/api/v1/responses" \
  -H "Content-Type: application/json" \
  -d '{}'

{
"error": {
"code": "budget_exceeded",
"message": "Budget limit exceeded for monthly period",
"param": null,
"request_id": "550e8400-e29b-41d4-a716-446655440000",
"type": "invalid_request_error"
}
}
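The error examples on this page share an `error` envelope with `code` and `message`, plus optional fields such as `details.retry_after_secs` and `request_id`. The sketch below parses that envelope and picks a retry delay. Treating `rate_limit_exceeded` and `provider_error` as retryable, and the 5-second fallback, are assumptions made for illustration rather than documented behavior; `ApiError`, `parse_error`, and `retry_delay` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    code: str
    message: str
    retry_after_secs: Optional[int] = None
    request_id: Optional[str] = None


def parse_error(payload: str) -> ApiError:
    """Parse a JSON error body shaped like the examples on this page."""
    error = json.loads(payload)["error"]
    details = error.get("details") or {}
    return ApiError(
        code=error["code"],
        message=error["message"],
        retry_after_secs=details.get("retry_after_secs"),
        request_id=error.get("request_id"),
    )


def retry_delay(err: ApiError, default_secs: int = 5) -> Optional[int]:
    """Return seconds to wait before retrying, or None to give up.

    ASSUMPTION: only rate limits and upstream provider failures are retried.
    """
    if err.code == "rate_limit_exceeded":
        if err.retry_after_secs is not None:
            return err.retry_after_secs
        return default_secs
    if err.code == "provider_error":
        return default_secs
    return None


sample = (
    '{"error": {"code": "rate_limit_exceeded", '
    '"details": {"limit": 100, "retry_after_secs": 30, "window": "minute"}, '
    '"message": "Rate limit exceeded: 100 requests per minute"}}'
)
err = parse_error(sample)
delay = retry_delay(err)
```

Errors such as `invalid_api_key` or `budget_exceeded` return `None` here, since repeating the same request is unlikely to succeed.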