# Observability Configuration

Logging, tracing, metrics, and monitoring configuration.

The `[observability]` section configures logging, distributed tracing, Prometheus metrics, request logging, usage tracking, and response validation.
| Subsection | Purpose |
|---|---|
| `logging` | Console log format, level, and SIEM integration |
| `tracing` | OpenTelemetry distributed tracing with OTLP export |
| `metrics` | Prometheus metrics endpoint and histogram buckets |
| `request_logging` | Request/response body logging with redaction |
| `usage` | Usage data export to database and OTLP |
| `dead_letter_queue` | Failed operations recovery and retry |
| `response_validation` | OpenAI schema validation for responses |
## Logging

Configure console log output format and level.
```toml
[observability.logging]
level = "info"
format = "compact"
timestamps = true
file_line = false
include_spans = true
filter = "tower_http=debug,sqlx=warn"
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `level` | string | `info` | Log level: `trace`, `debug`, `info`, `warn`, `error`. |
| `format` | string | `compact` | Output format (see below). |
| `timestamps` | boolean | `true` | Include timestamps in log output. |
| `file_line` | boolean | `false` | Include file and line number in log output. |
| `include_spans` | boolean | `true` | Include tracing span information (JSON format only). |
| `filter` | string | None | Additional filter directives (e.g., `tower_http=debug`). |
| Format | Description | Use Case |
|---|---|---|
| `pretty` | Human-readable multi-line format with colors | Local development |
| `compact` | Single-line format with colors | Development, simple deployments |
| `json` | Structured JSON for log aggregation | Production, log pipelines |
| `cef` | Common Event Format for ArcSight, Splunk | Enterprise SIEM integration |
| `leef` | Log Event Extended Format for IBM QRadar | IBM QRadar SIEM |
| `syslog` | RFC 5424 Syslog format | Standard syslog servers |
The RUST_LOG environment variable takes precedence over config file settings:
```shell
RUST_LOG=debug ./hadrian
RUST_LOG=hadrian=debug,tower_http=trace ./hadrian
```
For CEF, LEEF, and Syslog formats, configure additional SIEM-specific fields:
```toml
[observability.logging]
format = "cef"

[observability.logging.siem]
device_vendor = "Hadrian"
device_product = "Gateway"
device_version = "1.0.0"
hostname = "gateway-prod-01"
app_name = "hadrian"
facility = "local0"
leef_version = "2.0"
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `device_vendor` | string | `Hadrian` | Vendor name for CEF/LEEF headers. |
| `device_product` | string | `Gateway` | Product name for CEF/LEEF headers. |
| `device_version` | string | Crate version | Version for CEF/LEEF headers. |
| `hostname` | string | System hostname | Override hostname in log headers. |
| `app_name` | string | `hadrian` | Application name for Syslog APP-NAME field. |
| `facility` | string | `local0` | Syslog facility (see below). |
| `leef_version` | string | `2.0` | LEEF format version (`1.0` or `2.0`). |
| Facility | Code | Description |
|---|---|---|
| `kern` | 0 | Kernel messages |
| `user` | 1 | User-level messages |
| `daemon` | 3 | System daemons |
| `auth` | 4 | Security/authorization |
| `local0` | 16 | Local use 0 (default) |
| `local1`-`local7` | 17-23 | Local use 1-7 |
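In RFC 5424, the facility code combines with the message severity to form the PRI value at the start of each syslog line (`facility * 8 + severity`). A minimal sketch (the helper name is illustrative, not part of Hadrian):

```python
# Compute the RFC 5424 PRI value from a facility code and severity.
# Severities: 0=emergency .. 3=error, 4=warning, 6=informational, 7=debug.
def syslog_pri(facility_code: int, severity: int) -> int:
    return facility_code * 8 + severity

# local0 (16) at severity "informational" (6) yields PRI 134, rendered as <134>.
print(syslog_pri(16, 6))  # 134
```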
## Tracing

OpenTelemetry distributed tracing with OTLP export for request correlation across services.
```toml
[observability.tracing]
enabled = true
service_name = "hadrian"
service_version = "1.0.0"
environment = "production"

[observability.tracing.otlp]
endpoint = "http://jaeger:4317"
protocol = "grpc"
timeout_secs = 10
compression = true

[observability.tracing.otlp.headers]
Authorization = "Bearer ${OTLP_TOKEN}"

[observability.tracing.sampling]
strategy = "ratio"
rate = 0.1

[observability.tracing.resource_attributes]
"deployment.region" = "us-east-1"
"k8s.namespace" = "ai-gateway"
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `false` | Enable OpenTelemetry tracing. |
| `service_name` | string | `ai-gateway` | Service name in traces. |
| `service_version` | string | None | Service version in traces. |
| `environment` | string | None | Deployment environment (e.g., `production`). |
| `propagation` | string | `trace_context` | Context propagation format. |
| `resource_attributes` | map | `{}` | Additional resource attributes for all spans. |
| Setting | Type | Default | Description |
|---|---|---|---|
| `endpoint` | string | — | OTLP collector endpoint URL. |
| `protocol` | string | `grpc` | Protocol: `grpc` or `http`. |
| `timeout_secs` | integer | `10` | Export timeout in seconds. |
| `compression` | boolean | `true` | Enable gzip compression. |
| `headers` | map | `{}` | Headers for authentication. |
| Strategy | Description |
|---|---|
| `always_on` | Sample all traces (default). |
| `always_off` | Sample no traces. |
| `ratio` | Sample a percentage of traces (use the `rate` field). |
| `parent_based` | Inherit sampling decision from parent span. |
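Ratio sampling is typically a deterministic threshold test on the trace ID, so every service sampling at the same rate makes the same decision for a given trace. A simplified illustration of the idea (not Hadrian's actual sampler):

```python
# Deterministic ratio sampling: keep a trace iff the low 64 bits of its
# trace ID fall below rate * 2^64. Same trace ID -> same decision everywhere.
def should_sample(trace_id: int, rate: float) -> bool:
    threshold = int(rate * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < threshold

print(should_sample(0x0000_0000_0000_0001, 0.1))  # True (tiny ID, below threshold)
print(should_sample(0xFFFF_FFFF_FFFF_FFFF, 0.1))  # False
```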
| Format | Description |
|---|---|
| `trace_context` | W3C Trace Context (default, recommended) |
| `b3` | Zipkin B3 format |
| `jaeger` | Jaeger native format |
| `multi` | TraceContext + Baggage combined |
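With `trace_context` propagation, the trace travels in the W3C `traceparent` header, formatted as `00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>`. A small parser sketch (the helper name is illustrative; the trace ID below is the W3C specification's example):

```python
# Parse a W3C Trace Context "traceparent" header into its four fields.
def parse_traceparent(header: str) -> dict:
    version, trace_id, parent_id, flags = header.split("-")
    assert len(trace_id) == 32 and len(parent_id) == 16
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": int(flags, 16) & 0x01 == 1,  # bit 0 is the sampled flag
    }

tp = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(tp["sampled"])  # True
```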
## Metrics

Expose Prometheus metrics for monitoring dashboards and alerting.
```toml
[observability.metrics]
enabled = true
latency_buckets_ms = [10, 50, 100, 250, 500, 1000, 2500, 5000, 10000]
token_buckets = [10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000]

[observability.metrics.prometheus]
enabled = true
path = "/metrics"
process_metrics = true
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Enable metrics collection. |
| `latency_buckets_ms` | float[] | `[10, 50, 100, 250, 500, 1000, 2500, 5000, 10000]` | Histogram buckets for latency (ms). |
| `token_buckets` | float[] | `[10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000]` | Histogram buckets for token counts. |
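Prometheus histograms are cumulative: each observation increments every `le` bucket whose bound is at or above its value, plus the implicit `+Inf` bucket. A sketch of how the default `latency_buckets_ms` bounds would classify observations (illustrative, not gateway code):

```python
import math

# Count observations into cumulative Prometheus-style "le" buckets.
def histogram_counts(observations, bounds):
    bounds = list(bounds) + [math.inf]  # implicit +Inf bucket
    counts = {b: 0 for b in bounds}
    for value in observations:
        for bound in bounds:
            if value <= bound:
                counts[bound] += 1  # cumulative: every bucket >= value
    return counts

latency_buckets_ms = [10, 50, 100, 250, 500, 1000, 2500, 5000, 10000]
counts = histogram_counts([7, 42, 480, 12000], latency_buckets_ms)
print(counts[10], counts[500], counts[math.inf])  # 1 3 4
```

Observations above the largest bound (12000 ms here) land only in `+Inf`, so pick bounds that bracket your real latency range.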
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Enable the `/metrics` endpoint. |
| `path` | string | `/metrics` | Path for the Prometheus scrape endpoint. |
| `process_metrics` | boolean | `true` | Include process metrics (memory, CPU). |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `http_requests_total` | Counter | `method`, `path`, `status`, `status_class` | Total HTTP requests. |
| `http_request_duration_seconds` | Histogram | `method`, `path`, `status_class` | Request latency. |
| `active_connections` | Gauge | — | Current active connections. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `llm_requests_total` | Counter | `provider`, `model`, `status` | Total LLM requests. |
| `llm_request_duration_seconds` | Histogram | `provider`, `model` | LLM request latency. |
| `llm_input_tokens_total` | Counter | `provider`, `model` | Total input tokens processed. |
| `llm_output_tokens_total` | Counter | `provider`, `model` | Total output tokens generated. |
| `llm_input_tokens` | Histogram | `provider`, `model` | Input tokens per request. |
| `llm_output_tokens` | Histogram | `provider`, `model` | Output tokens per request. |
| `llm_cost_microcents_total` | Counter | `provider`, `model` | Total cost in microcents. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `llm_streaming_chunks_total` | Counter | `provider`, `model` | Total streaming chunks. |
| `llm_streaming_chunk_count` | Histogram | `provider`, `model` | Chunks per stream. |
| `llm_streaming_time_to_first_chunk_seconds` | Histogram | `provider`, `model` | Time to first chunk (TTFC). |
| `llm_streaming_duration_seconds` | Histogram | `provider`, `model` | Total stream duration. |
| `llm_streaming_completions_total` | Counter | `provider`, `model`, `outcome` | Stream completions by outcome. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `auth_attempts_total` | Counter | `method`, `status` | Authentication attempts. |
| `budget_checks_total` | Counter | `result` | Budget check results. |
| `budget_warnings_total` | Counter | `period` | Budget warning triggers. |
| `budget_spend_percentage` | Gauge | `api_key_id`, `period` | Current spend percentage. |
| `rate_limit_checks_total` | Counter | `result` | Rate limit check results. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `provider_health` | Gauge | `provider` | Provider health (1=healthy, 0=unhealthy). |
| `provider_health_checks_total` | Counter | `provider`, `status` | Health check results. |
| `provider_health_check_duration_seconds` | Histogram | `provider` | Health check latency. |
| `provider_circuit_breaker_state` | Gauge | `provider` | Circuit breaker state (0=closed, 1=open, 2=half_open). |
| `provider_circuit_breaker_failure_count` | Gauge | `provider` | Current failure count. |
| `provider_fallback_attempts_total` | Counter | `from_provider`, `to_provider`, `success` | Fallback attempts. |
| `provider_fallback_exhausted_total` | Counter | `primary_provider`, `chain_length` | Exhausted fallback chains. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `rag_document_processing_total` | Counter | `status`, `file_type` | Documents processed. |
| `rag_document_processing_duration_seconds` | Histogram | `status`, `file_type` | Processing time. |
| `rag_document_chunks_total` | Counter | `file_type` | Total chunks created. |
| `rag_embedding_requests_total` | Counter | `provider`, `model`, `status` | Embedding API calls. |
| `rag_embedding_duration_seconds` | Histogram | `provider`, `model` | Embedding latency. |
| `rag_file_search_total` | Counter | `status`, `cache` | File search queries. |
| `rag_file_search_duration_seconds` | Histogram | `status`, `cache` | Search latency. |
| `rag_vector_store_operations_total` | Counter | `backend`, `operation`, `status` | Vector DB operations. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `guardrails_evaluations_total` | Counter | `provider`, `stage`, `result` | Guardrails evaluations. |
| `guardrails_latency_seconds` | Histogram | `provider`, `stage` | Evaluation latency. |
| `guardrails_violations_total` | Counter | `provider`, `category`, `severity`, `action` | Violations detected. |
| `guardrails_timeouts_total` | Counter | `provider`, `stage` | Evaluation timeouts. |
| `guardrails_errors_total` | Counter | `provider`, `stage`, `error_type` | Provider errors. |
| Metric | Type | Labels | Description |
|---|---|---|---|
| `gateway_errors_total` | Counter | `error_type`, `error_code`, `provider` | Gateway errors. |
| `cache_operations_total` | Counter | `cache_type`, `operation`, `result` | Cache operations. |
| `db_operations_total` | Counter | `operation`, `table`, `status` | Database operations. |
| `db_operation_duration_seconds` | Histogram | `operation`, `table` | Database operation latency. |
| `dlq_operations_total` | Counter | `operation`, `entry_type` | Dead letter queue operations. |
| `retention_deletions_total` | Counter | `table` | Records deleted by retention. |
## Request Logging

Log request and response bodies for debugging and auditing.

Request logging can expose sensitive data. Enable it only in controlled environments, and always set `redact_sensitive = true` in production.
```toml
[observability.request_logging]
enabled = true
log_request_body = true
log_response_body = false
max_body_size = 10240
redact_sensitive = true
redact_fields = ["api_key", "password", "secret", "authorization"]
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `false` | Enable request logging. |
| `log_request_body` | boolean | `false` | Log request bodies. |
| `log_response_body` | boolean | `false` | Log response bodies. |
| `max_body_size` | integer | `10240` | Maximum body size to log (bytes). |
| `redact_sensitive` | boolean | `true` | Redact sensitive fields. |
| `redact_fields` | string[] | `["api_key", "password", "secret", "authorization"]` | Fields to redact. |
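Conceptually, redaction walks the logged body and masks the values of any keys named in `redact_fields`. A simplified recursive sketch of the idea (Hadrian's exact matching rules may differ, e.g. case handling or nested-key behavior):

```python
# Recursively mask values of sensitive keys before a body is logged.
REDACT_FIELDS = {"api_key", "password", "secret", "authorization"}

def redact(body):
    if isinstance(body, dict):
        return {
            k: "[REDACTED]" if k.lower() in REDACT_FIELDS else redact(v)
            for k, v in body.items()
        }
    if isinstance(body, list):
        return [redact(item) for item in body]
    return body  # scalars pass through unchanged

print(redact({"model": "gpt-4o", "api_key": "sk-123", "opts": {"password": "x"}}))
# {'model': 'gpt-4o', 'api_key': '[REDACTED]', 'opts': {'password': '[REDACTED]'}}
```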
```toml
[observability.request_logging]
enabled = true

# Log to a separate file
[observability.request_logging.destination]
type = "file"
path = "/var/log/hadrian/requests.log"

[observability.request_logging.destination.rotation]
type = "daily"
```
| Destination | Configuration |
|---|---|
| `stdout` | Log to standard output (same as regular logs). |
| `file` | Log to a file with optional rotation (`daily`, `hourly`, `size`). |
| `http` | POST logs to an HTTP endpoint with custom headers. |
## Usage Tracking

Configure where API usage data (tokens, costs, latency) is recorded.
```toml
[observability.usage]
database = true

[observability.usage.buffer]
max_size = 1000
flush_interval_ms = 1000
max_pending_entries = 10000
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `database` | boolean | `true` | Write usage records to the database. |
Usage records are buffered before writing to improve performance.
| Setting | Type | Default | Description |
|---|---|---|---|
| `max_size` | integer | `1000` | Flush the buffer when this many records accumulate. |
| `flush_interval_ms` | integer | `1000` | Flush the buffer at this interval (milliseconds). |
| `max_pending_entries` | integer | `10000` | Drop the oldest entries if pending records exceed this limit. |
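The size and interval triggers interact: whichever fires first flushes the buffer, while `max_pending_entries` bounds memory when the sink falls behind. A toy model of the size trigger and the drop-oldest backpressure (illustrative only; the real buffer also flushes on a timer):

```python
# Toy usage buffer: flush when max_size records accumulate; drop the
# oldest record once pending entries exceed max_pending_entries.
class UsageBuffer:
    def __init__(self, max_size=1000, max_pending_entries=10000):
        self.max_size = max_size
        self.max_pending = max_pending_entries
        self.pending = []
        self.flushed = []  # stands in for rows written to the database

    def record(self, entry):
        self.pending.append(entry)
        if len(self.pending) > self.max_pending:
            self.pending.pop(0)  # drop oldest under backpressure
        if len(self.pending) >= self.max_size:
            self.flush()

    def flush(self):
        self.flushed.extend(self.pending)
        self.pending.clear()

buf = UsageBuffer(max_size=3)
for i in range(7):
    buf.record(i)
print(len(buf.flushed), len(buf.pending))  # 6 1
```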
Export usage records to an OpenTelemetry-compatible backend:
```toml
[observability.usage.otlp]
enabled = true
endpoint = "http://otel-collector:4317"
protocol = "grpc"
timeout_secs = 10
compression = true
service_name = "hadrian-usage"

[observability.usage.otlp.headers]
Authorization = "Bearer ${OTLP_TOKEN}"
```
Each exported usage record includes the following OpenTelemetry attributes for attribution and filtering:
| Attribute | Description |
|---|---|
| `hadrian.request_id` | Unique request identifier |
| `hadrian.model` | Model used (e.g., `gpt-4o`) |
| `hadrian.provider` | Provider name (e.g., `openai`) |
| `hadrian.api_key_id` | API key used (if applicable) |
| `hadrian.user_id` | Authenticated user ID (session or user-owned key) |
| `hadrian.org_id` | Organization context |
| `hadrian.project_id` | Project context (from key or `X-Hadrian-Project`) |
| `hadrian.team_id` | Team context (from team-scoped key) |
| `hadrian.service_account_id` | Service account that owns the API key |
| `hadrian.input_tokens` | Input token count |
| `hadrian.output_tokens` | Output token count |
| `hadrian.cost_microcents` | Calculated cost in microcents |
These attributes enable building Grafana dashboards, alerts, and queries filtered by organization, team, project, or individual user.
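For dashboard display you will usually convert `hadrian.cost_microcents` back to currency. Assuming a microcent here means one-millionth of a US cent (so $1 = 100,000,000 microcents — verify this against your deployment before relying on it), the conversion is a one-liner:

```python
# Convert a cost in microcents to US dollars, under the assumption that
# 1 microcent = 1e-6 cents (i.e., $1 = 100_000_000 microcents).
def microcents_to_usd(microcents: int) -> float:
    return microcents / 100_000_000

print(microcents_to_usd(150_000_000))  # 1.5
```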
## Dead Letter Queue

Capture failed operations (usage logging, etc.) for later retry.
```toml
[observability.dead_letter_queue]
type = "redis"
url = "${REDIS_URL}"
key_prefix = "gw:dlq:"
max_entries = 100000
ttl_secs = 604800  # 7 days

[observability.dead_letter_queue.retry]
enabled = true
interval_secs = 60
initial_delay_secs = 60
max_delay_secs = 3600
backoff_multiplier = 2.0
max_retries = 10
batch_size = 100
prune_enabled = true
```
| Type | Configuration | Use Case |
|---|---|---|
| `file` | `path`, `max_file_size_mb`, `max_files` | Single-node, local storage |
| `redis` | `url`, `key_prefix`, `max_entries`, `ttl_secs` | Multi-node, shared storage |
| `database` | `table_name`, `max_entries`, `ttl_secs` | Uses existing database |
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Enable automatic retry processing. |
| `interval_secs` | integer | `60` | Interval between retry runs (seconds). |
| `initial_delay_secs` | integer | `60` | Initial delay before the first retry. |
| `max_delay_secs` | integer | `3600` | Maximum delay between retries. |
| `backoff_multiplier` | float | `2.0` | Exponential backoff multiplier. |
| `max_retries` | integer | `10` | Maximum retry attempts before giving up. |
| `batch_size` | integer | `100` | Records to process per retry run. |
| `prune_enabled` | boolean | `true` | Automatically delete expired entries. |
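With the defaults above, the delay before attempt *n* grows as `initial_delay_secs * backoff_multiplier^n`, capped at `max_delay_secs`. A quick computation of the resulting schedule (a sketch of the formula, not Hadrian's scheduler):

```python
# Retry delay schedule: initial_delay * multiplier^attempt, capped at max_delay.
def retry_delays(initial=60, multiplier=2.0, max_delay=3600.0, max_retries=10):
    return [min(initial * multiplier**n, max_delay) for n in range(max_retries)]

print(retry_delays())
# [60.0, 120.0, 240.0, 480.0, 960.0, 1920.0, 3600.0, 3600.0, 3600.0, 3600.0]
```

So an entry that keeps failing is retried for roughly 4.5 hours in total before it is dropped.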
## Response Validation

Validate API responses against the OpenAI OpenAPI specification.
```toml
[observability.response_validation]
enabled = true
mode = "warn"
```
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `false` | Enable response schema validation. |
| `mode` | string | `warn` | `warn` (log and continue) or `error` (return 500). |
Response validation helps catch format issues from non-OpenAI providers. Use `warn` mode in production to log issues without breaking requests; use `error` mode during provider integration testing.
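The two modes differ only in what happens when validation fails. A schematic of that dispatch, with a simple required-keys check standing in for full OpenAI schema validation (all names here are illustrative, not Hadrian's internals):

```python
# Sketch of warn vs. error handling for a failed response validation.
# `required_keys` is a stand-in for validating against the OpenAI spec.
def validate_response(body: dict, mode: str = "warn"):
    required_keys = {"id", "object", "choices"}
    missing = required_keys - body.keys()
    if not missing:
        return body, 200
    if mode == "warn":
        print(f"validation warning: missing {sorted(missing)}")  # log and continue
        return body, 200
    return {"error": "response failed schema validation"}, 500  # mode == "error"

_, status = validate_response({"id": "r1"}, mode="error")
print(status)  # 500
```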
## Example Configurations

### Development

```toml
[observability.logging]
level = "debug"
format = "pretty"

[observability.metrics]
enabled = true

[observability.metrics.prometheus]
enabled = true
path = "/metrics"
```
### Production

```toml
[observability.logging]
level = "info"
format = "json"

[observability.tracing]
enabled = true
service_name = "hadrian"
service_version = "1.0.0"
environment = "production"

[observability.tracing.otlp]
endpoint = "http://jaeger:4317"
protocol = "grpc"
compression = true

[observability.tracing.sampling]
strategy = "ratio"
rate = 0.1

[observability.metrics]
enabled = true

[observability.metrics.prometheus]
enabled = true
path = "/metrics"

[observability.usage]
database = true

[observability.usage.buffer]
max_size = 1000
flush_interval_ms = 1000

[observability.dead_letter_queue]
type = "redis"
url = "${REDIS_URL}"
ttl_secs = 604800
```
### SIEM Integration

```toml
[observability.logging]
level = "info"
format = "cef"

[observability.logging.siem]
device_vendor = "Acme Corp"
device_product = "AI Gateway"
device_version = "1.0.0"
hostname = "gateway-prod-01"
facility = "local0"

[observability.tracing]
enabled = true
service_name = "ai-gateway"

[observability.tracing.otlp]
endpoint = "https://otel.internal:4317"
protocol = "grpc"

[observability.tracing.otlp.headers]
Authorization = "Bearer ${OTEL_TOKEN}"

[observability.metrics]
enabled = true

[observability.request_logging]
enabled = true
log_request_body = true
log_response_body = true
redact_sensitive = true

[observability.request_logging.destination]
type = "file"
path = "/var/log/hadrian/requests.log"

[observability.request_logging.destination.rotation]
type = "daily"
```
### Grafana Cloud

```toml
[observability.logging]
level = "info"
format = "json"

[observability.tracing]
enabled = true
service_name = "hadrian"
environment = "production"

[observability.tracing.otlp]
endpoint = "https://otlp-gateway-prod-us-central-0.grafana.net/otlp"
protocol = "http"

[observability.tracing.otlp.headers]
Authorization = "Basic ${GRAFANA_OTLP_TOKEN}"

[observability.tracing.sampling]
strategy = "ratio"
rate = 0.05

[observability.usage.otlp]
enabled = true
endpoint = "https://otlp-gateway-prod-us-central-0.grafana.net/otlp"
protocol = "http"

[observability.usage.otlp.headers]
Authorization = "Basic ${GRAFANA_OTLP_TOKEN}"
```