Features
Explore Hadrian Gateway's comprehensive feature set
Hadrian Gateway includes a full feature set for production AI deployments. Every feature is free and included in the open-source release, which is dual-licensed under Apache 2.0 and MIT.
LLM Providers
Route requests to any major LLM provider through a unified OpenAI-compatible API.
| Provider | Streaming | Embeddings | Function Calling | Thinking/Reasoning |
|---|---|---|---|---|
| OpenAI | Yes | Yes | Yes | Yes (o1/o3) |
| Anthropic | Yes | No | Yes | Yes (extended thinking) |
| AWS Bedrock | Yes | Yes (Titan) | Yes | Yes (Claude) |
| Google Vertex | Yes | Yes | Yes | Yes (Gemini) |
| Azure OpenAI | Yes | Yes | Yes | Yes |
| OpenRouter | Yes | No | Yes | Varies |
| Any OpenAI-compatible | Yes | Varies | Varies | Varies |
Provider capabilities:
- Circuit breaker - Automatically disable unhealthy providers after repeated failures
- Automatic retry - Exponential backoff for transient errors (429, 5xx)
- Provider fallbacks - Chain providers: try A, then B, then C
- Model fallbacks - Graceful degradation: gpt-4o → gpt-4o-mini → claude-sonnet
- Health checks - Background monitoring to detect issues before user requests fail
- Model aliases - Create shortcuts like sonnet → claude-sonnet-4-20250514
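The retry behavior above can be sketched in a few lines of Python. This is a minimal illustration of exponential backoff with jitter for retryable status codes, not Hadrian's actual implementation:

```python
import random

# Transient errors worth retrying: rate limits and server-side failures.
RETRYABLE = {429, 500, 502, 503, 504}

def should_retry(status: int) -> bool:
    return status in RETRYABLE

def backoff_delays(max_retries: int = 4, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Exponential backoff with full jitter: the delay window doubles each
    attempt (capped), and the actual sleep is sampled uniformly within it."""
    delays = []
    for attempt in range(max_retries):
        window = min(cap, base * (2 ** attempt))
        delays.append(random.uniform(0, window))
    return delays
```

Jitter spreads retries out so many clients failing at once don't retry in lockstep against an already-struggling provider.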
Multi-Tenancy
Hadrian supports a flexible multi-tenancy hierarchy for organizations of any size.
```
Organizations
└── Teams (optional)
    └── Projects
        └── Users
            └── API Keys
```

Each level in the hierarchy can have:
| Capability | Description |
|---|---|
| Dynamic providers | Bring your own API keys at any scope |
| Model pricing | Override pricing for cost calculations |
| Budget limits | Daily/monthly spending caps with enforcement |
| Rate limits | Requests and tokens per minute/day |
| Guardrails | Scope-specific content policies |
Resource ownership:
Resources can be owned at different levels depending on your organization's needs:
- Organization-level: Shared across all teams and projects
- Team-level: Shared within a team
- Project-level: Isolated to a specific project
- User-level: Personal resources (conversations, API keys)
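One common way to combine limits across such a hierarchy is "most specific scope wins." The sketch below illustrates that idea in Python; the data layout and resolution order are illustrative assumptions, not Hadrian's documented semantics:

```python
from typing import Optional

# Hypothetical in-memory view of configured limits; keys are (scope kind, name).
LIMITS = {
    ("org", "acme"): 1000.0,       # org-wide monthly cap, shared by everyone
    ("project", "chatbot"): 50.0,  # tighter cap for one project
}

def effective_budget(org: str, team: Optional[str], project: Optional[str],
                     user: Optional[str]) -> Optional[float]:
    """Walk from the most specific scope upward; the first limit found wins."""
    for kind, name in (("user", user), ("project", project),
                       ("team", team), ("org", org)):
        if name is not None and (kind, name) in LIMITS:
            return LIMITS[(kind, name)]
    return None
```

A request from the `chatbot` project resolves to the project cap; any other project in `acme` falls back to the org-wide cap.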
Authentication & Authorization
Flexible authentication supporting multiple methods for API and UI access.
| Method | Use Case | Description |
|---|---|---|
| API Key | Programmatic access | gw_live_... format, budget limits |
| OIDC/OAuth | SSO with identity providers | Keycloak, Auth0, Okta, Azure AD |
| JWT | Service-to-service auth | JWKS validation, custom claims |
| Per-Org SSO | Multi-tenant SaaS | Self-service SSO per organization |
| Proxy Auth | Zero-trust networks | Cloudflare Access, Tailscale |
CEL-based authorization:
Use Common Expression Language (CEL) policies for fine-grained access control:
```toml
[[auth.rbac.policies]]
name = "org-admin"
condition = "'admin' in subject.roles && context.org_id in subject.org_ids"
effect = "allow"
```

Guardrails
Block, warn, log, or redact content using configurable guardrails on both input and output.
Guardrail Providers
| Provider | Features | Best For |
|---|---|---|
| OpenAI Moderation | Hate, violence, sexual, self-harm categories | Free, fast, general-purpose |
| AWS Bedrock Guardrails | PII detection, topic filters, word filters, denied topics | Enterprise compliance |
| Azure Content Safety | Configurable severity thresholds, custom blocklists | Azure environments |
| Custom HTTP | Your own moderation service | Custom requirements |
| Regex patterns | PII patterns, blocklist terms | Simple rules |
Execution Modes
| Mode | Behavior | Latency Impact |
|---|---|---|
| Blocking | Evaluate before sending to LLM | Adds round-trip to guardrail |
| Concurrent | Race guardrails against LLM, cancel if violations | Minimal for passing requests |
| Post-response | Filter LLM output before returning | Adds round-trip after LLM |
Actions on violation:
- block - Reject request with error
- warn - Log warning, allow request
- log - Silent logging only
- redact - Remove/mask violating content
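The concurrent mode's race can be sketched with asyncio. This is a simplified illustration of the pattern, with stub functions standing in for the real guardrail and LLM calls:

```python
import asyncio

async def guardrail_check(prompt: str) -> bool:
    """Stub moderation call; returns True if the prompt violates policy."""
    await asyncio.sleep(0.01)
    return "forbidden" in prompt

async def call_llm(prompt: str) -> str:
    """Stub LLM call, slower than the guardrail."""
    await asyncio.sleep(0.05)
    return f"response to: {prompt}"

async def concurrent_mode(prompt: str) -> str:
    """Start the LLM call immediately; cancel it if the guardrail flags a violation."""
    llm_task = asyncio.create_task(call_llm(prompt))
    if await guardrail_check(prompt):
        llm_task.cancel()  # violation: abandon the in-flight LLM call
        raise PermissionError("blocked by guardrail")
    return await llm_task  # passing requests pay almost no extra latency
```

Because the guardrail and the LLM run in parallel, a passing request only waits for the slower of the two, which is usually the LLM anyway.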
Budget Enforcement
Prevent cost overruns with atomic budget reservation and real-time enforcement.
How it works:
1. Request arrives → Reserve estimated cost ($0.10)
2. Forward to LLM provider
3. Request completes with actual cost
4. Adjust: Replace the estimate with the actual cost

This atomic reservation pattern prevents overspend even with concurrent requests.
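The reserve-then-settle pattern can be sketched as follows. This is a minimal single-process illustration using a lock; Hadrian's actual enforcement is not shown here:

```python
import threading

class Budget:
    """Reserve-then-adjust accounting: concurrent requests cannot overspend."""

    def __init__(self, limit: float):
        self.limit = limit
        self.committed = 0.0  # actual spend from completed requests
        self.reserved = 0.0   # in-flight estimates not yet settled
        self._lock = threading.Lock()

    def reserve(self, estimate: float) -> bool:
        """Atomically reserve an estimated cost; reject if it would exceed the cap."""
        with self._lock:
            if self.committed + self.reserved + estimate > self.limit:
                return False
            self.reserved += estimate
            return True

    def settle(self, estimate: float, actual: float) -> None:
        """Release the reservation and record the real cost."""
        with self._lock:
            self.reserved -= estimate
            self.committed += actual
```

Because the check and the reservation happen under one lock, two simultaneous requests can never both pass a check that only one of them fits under.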
Budget scopes:
- Organization-level budgets
- Team-level budgets
- Project-level budgets
- User-level budgets
Forecasting:
Built-in time-series forecasting with augurs:
- Projected spend for current period
- Days until budget exhaustion
- 95% confidence intervals
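As a rough intuition for "days until exhaustion" (not the augurs model, which is a proper time-series forecaster), a naive linear projection from recent daily spend looks like this:

```python
def days_until_exhaustion(spend_per_day: list[float], limit: float, spent: float) -> float:
    """Naive linear projection: remaining budget divided by average daily spend."""
    avg = sum(spend_per_day) / len(spend_per_day)
    remaining = limit - spent
    return float("inf") if avg <= 0 else remaining / avg
```

A real forecaster additionally models trend and seasonality and attaches confidence intervals, which is what the built-in augurs integration provides.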
Vector Stores & RAG
OpenAI-compatible Vector Stores API for building RAG (Retrieval-Augmented Generation) applications.
Capabilities:
- Upload and process files (PDF, DOCX, TXT, Markdown, HTML, etc.)
- Automatic text extraction with OCR support via Kreuzberg
- Configurable chunking strategies (auto, fixed-size)
- Vector search with similarity scoring
- LLM-based re-ranking for improved relevance
- File search tool integration for Responses API
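Fixed-size chunking with overlap, the simpler of the two strategies, can be sketched in a few lines. Chunk and overlap sizes here are arbitrary illustrative values:

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks; overlapping windows keep context
    that straddles a chunk boundary retrievable from either side."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Each chunk is then embedded and stored, so a query can match any window that contains the relevant passage.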
Vector backends:
| Backend | Use Case |
|---|---|
| pgvector | Simple setup, uses existing PostgreSQL |
| Qdrant | Dedicated vector DB, high performance |
| Pinecone | Managed service, serverless |
| Weaviate | Hybrid search, schema-based |
| ChromaDB | Lightweight, embedded |
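Regardless of backend, the core retrieval step is the same: rank stored chunk embeddings by similarity to the query embedding. A minimal cosine-similarity sketch (toy 2-dimensional vectors; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Return the top-k stored items ranked by similarity to the query."""
    scored = [(name, cosine(query, vec)) for name, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Dedicated vector databases implement the same ranking with approximate-nearest-neighbor indexes so it stays fast at millions of vectors.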
Chat UI
Built-in React UI for multi-model conversations and administration.
Chat features:
- Multi-model comparison in single conversation
- Model instances (compare same model with different settings)
- Streaming markdown with syntax highlighting
- File uploads (images, PDFs)
- Conversation history with IndexedDB persistence
- Per-model settings (temperature, max tokens)
Chat modes:
| Mode | Description |
|---|---|
| Synthesized | Gather all responses, synthesize final answer |
| Chained | Sequential relay (output of one becomes input to next) |
| Debated | Multi-round argumentation between models |
| Council | Collaborative discussion with voting/consensus |
| Hierarchical | Coordinator delegates subtasks to workers |
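The chained mode is the simplest to picture: each model's output becomes the next model's input. A toy sketch with stub callables standing in for real models:

```python
from typing import Callable

def chained(models: list[Callable[[str], str]], prompt: str) -> str:
    """Sequential relay: feed each model the previous model's output."""
    message = prompt
    for model in models:
        message = model(message)
    return message

# Stub "models" for illustration only:
shout = lambda text: text.upper()
punctuate = lambda text: text + "!"
```

The other modes differ in topology rather than mechanics: debated and council modes fan the same prompt out to all models and then iterate, while hierarchical mode has a coordinator split the task first.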
Frontend Tools
Client-side tool execution in the browser via WebAssembly.
| Tool | Runtime | Capabilities |
|---|---|---|
| Python | Pyodide | numpy, pandas, matplotlib, scipy |
| JavaScript | QuickJS | Sandboxed JS execution |
| SQL | DuckDB | Query CSV/Parquet files |
| Charts | Vega-Lite | Interactive visualizations |
| HTML | iframe | Sandboxed preview |
Tool results are displayed inline as interactive artifacts and sent back to the LLM to continue the conversation.
Web Tools
Server-side web search and URL fetching, proxied through the gateway with SSRF protection and usage tracking.
| Tool | Provider | Capabilities |
|---|---|---|
| Web Search | Tavily or Exa | Search the web, returns ranked results |
| Web Fetch | Direct HTTP | Fetch URLs, HTML stripped to plain text |
Web search results appear as inline citations. Both tools require backend configuration.
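Stripping fetched HTML down to plain text can be done with the standard-library parser. This is a simplified illustration of the idea, not the gateway's extraction pipeline:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def strip_html(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)
```

Reducing pages to plain text keeps the token count down before the content is handed back to the LLM.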
MCP Integration
Connect to external tool servers using the Model Context Protocol (MCP).
Capabilities:
- Connect to MCP servers via Streamable HTTP transport
- Automatic tool discovery from connected servers
- Tool execution with result streaming
- Persistent server connections across conversations
Use cases:
- File system access and manipulation
- Database queries
- External API integrations
- Custom enterprise tools
Response Caching
Cache LLM responses to reduce costs and latency.
| Cache Type | Matching | Use Case |
|---|---|---|
| Exact match | SHA-256 hash of request | Identical requests |
| Semantic | Embedding similarity | Similar questions |
| Prompt (Anthropic) | Provider-side caching | Long system prompts |
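For exact-match caching, the cache key must be stable across semantically identical requests, so the request is serialized canonically before hashing. A minimal sketch of that idea:

```python
import hashlib
import json

def cache_key(request: dict) -> str:
    """Canonical JSON (sorted keys, no whitespace) hashed with SHA-256
    gives the same key regardless of field ordering in the request."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Semantic caching replaces the hash lookup with an embedding-similarity search over cached prompts, trading exactness for a higher hit rate on paraphrased questions.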
Observability
Comprehensive monitoring and debugging capabilities.
| Feature | Endpoint/Format | Description |
|---|---|---|
| Metrics | /metrics (Prometheus) | Request latency, token counts, costs, errors |
| Tracing | OTLP export | Distributed traces to Jaeger, Tempo, etc. |
| Logging | JSON or compact | Structured logs with configurable levels |
| Usage | Database + OTLP | Token usage, costs per user/project/org |
```toml
[observability]
logging.format = "json"
logging.level = "info"

[observability.tracing]
enabled = true
exporter = "otlp"
endpoint = "http://localhost:4317"

[observability.metrics]
enabled = true
```

Data Privacy & GDPR
Built-in compliance features for data protection regulations.
Capabilities:
- Self-service data export (GDPR Article 15 - Right of Access)
- Self-service account deletion (GDPR Article 17 - Right to Erasure)
- Configurable data retention policies
- CSV export reports for compliance audits
- Audit logging for all privacy operations