Hadrian is experimental alpha software. Do not use in production.

File Search

Configure the file_search tool for RAG in the Responses API

The [features.file_search] section configures server-side file_search tool interception for the Responses API. When enabled, the gateway intercepts file_search tool calls emitted by the model and executes them against the configured vector backend.
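A request that would trigger interception can be sketched as a plain Responses API payload. This assumes the gateway exposes an OpenAI-compatible /v1/responses endpoint; the model name and vector store ID below are hypothetical placeholders, not values the gateway requires.

```python
# Sketch of a Responses API request that triggers file_search
# interception. The vector store ID "vs_example" is hypothetical;
# use one registered with your gateway.
import json

payload = {
    "model": "gpt-4o-mini",
    "input": "What does the Q3 report say about churn?",
    "tools": [
        {
            "type": "file_search",
            "vector_store_ids": ["vs_example"],  # hypothetical ID
        }
    ],
}

# The gateway intercepts the file_search tool call the model emits,
# runs the query against the vector backend, and feeds results back
# to the model until it completes or max_iterations is reached.
print(json.dumps(payload, indent=2))
```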

Configuration Reference

Main Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| enabled | boolean | true | Enable file_search tool interception |
| max_iterations | integer | 5 | Maximum tool call iterations before forcing completion |
| max_results_per_search | integer | 10 | Maximum search results per call |
| timeout_secs | integer | 30 | Timeout per search operation (seconds) |
| include_annotations | boolean | true | Include file citation annotations in responses |
| score_threshold | float | 0.7 | Minimum similarity score (0.0-1.0) |
| max_search_result_chars | integer | 50000 | Maximum characters for injected results |

Vector Backend

Configure where document chunks are stored with [features.file_search.vector_backend].

pgvector

Uses PostgreSQL with the pgvector extension:

```toml
[features.file_search.vector_backend]
type = "pgvector"
table_name = "rag_chunks"        # Default: "rag_chunks"
index_type = "ivf_flat"          # "ivf_flat" or "hnsw"
distance_metric = "cosine"       # "cosine", "dot_product", "euclidean"
```

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| table_name | string | "rag_chunks" | Table for storing chunks |
| index_type | string | "ivf_flat" | Index type: ivf_flat (faster build) or hnsw (faster queries) |
| distance_metric | string | "cosine" | Distance metric for similarity |

Qdrant

Uses an external Qdrant vector database:

```toml
[features.file_search.vector_backend]
type = "qdrant"
url = "http://localhost:6333"
api_key = "${QDRANT_API_KEY}"    # Optional
collection_name = "rag_chunks"   # Default: "rag_chunks"
distance_metric = "cosine"
```

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| url | string | required | Qdrant server URL |
| api_key | string | none | API key for authentication |
| collection_name | string | "rag_chunks" | Collection for storing chunks |
| distance_metric | string | "cosine" | Distance metric |

Embedding Configuration

Configure the embedding model with [features.file_search.embedding]:

```toml
[features.file_search.embedding]
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536
```

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| provider | string | "openai" | Embedding provider name |
| model | string | "text-embedding-3-small" | Embedding model |
| dimensions | integer | 1536 | Embedding dimensions |

If not specified, the embedding configuration falls back to the semantic caching configuration, then to the vector search configuration.
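That fallback order can be sketched as a simple resolution function. The function and variable names here are illustrative, not Hadrian internals:

```python
# Illustrative sketch of the embedding-config fallback order:
# file_search -> semantic caching -> vector search.

def resolve_embedding_config(file_search_cfg, semantic_cache_cfg, vector_search_cfg):
    """Return the first embedding config that is actually set."""
    for cfg in (file_search_cfg, semantic_cache_cfg, vector_search_cfg):
        if cfg is not None:
            return cfg
    return None

# file_search has no [features.file_search.embedding] block,
# so the semantic caching config wins.
resolved = resolve_embedding_config(
    None,
    {"provider": "openai", "model": "text-embedding-3-small"},
    {"provider": "openai", "model": "text-embedding-3-large"},
)
print(resolved["model"])  # text-embedding-3-small
```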

Re-ranking

LLM-based re-ranking improves search precision by re-scoring results:

```toml
[features.file_search.rerank]
enabled = true
model = "gpt-4o-mini"            # Optional, uses default model
max_results_to_rerank = 20
batch_size = 10
timeout_secs = 30
fallback_on_error = true
```

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| enabled | boolean | false | Enable LLM re-ranking |
| model | string | none | LLM model for re-ranking |
| max_results_to_rerank | integer | 20 | Results to pass to the re-ranker |
| batch_size | integer | 10 | Results per LLM call |
| timeout_secs | integer | 30 | Re-ranking timeout (seconds) |
| fallback_on_error | boolean | true | Return original results on failure |
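The interaction of max_results_to_rerank and batch_size can be sketched as simple batching: with the defaults, the top 20 candidates are re-scored in 2 LLM calls. The helper name is illustrative, not Hadrian's implementation:

```python
# Sketch of rerank batching: trim to the top max_results_to_rerank
# candidates, then split them into batch_size groups, one per LLM call.

def rerank_batches(results, max_results_to_rerank=20, batch_size=10):
    """Split the top candidates into batches, one per re-ranking LLM call."""
    top = results[:max_results_to_rerank]
    return [top[i:i + batch_size] for i in range(0, len(top), batch_size)]

candidates = [f"chunk-{i}" for i in range(30)]  # 30 raw search results
batches = rerank_batches(candidates)
print(len(batches))     # 2 batches
print(len(batches[0]))  # 10 results each
```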

Retry Configuration

```toml
[features.file_search.retry]
enabled = true
max_retries = 3
initial_delay_ms = 100
max_delay_ms = 10000
backoff_multiplier = 2.0
jitter = 0.1
```

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| enabled | boolean | true | Enable retries |
| max_retries | integer | 3 | Maximum retry attempts |
| initial_delay_ms | integer | 100 | Initial retry delay (milliseconds) |
| max_delay_ms | integer | 10000 | Maximum retry delay (milliseconds) |
| backoff_multiplier | float | 2.0 | Exponential backoff multiplier |
| jitter | float | 0.1 | Random jitter factor |
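The retry schedule these settings imply is exponential backoff capped at max_delay_ms, with jitter spreading retries out. The exact formula below is an assumption for illustration, not Hadrian's implementation:

```python
# Sketch of the retry schedule: delay doubles each attempt
# (backoff_multiplier), is capped at max_delay_ms, and is perturbed
# by a random jitter factor to avoid thundering herds.
import random

def retry_delays(max_retries=3, initial_delay_ms=100, max_delay_ms=10_000,
                 backoff_multiplier=2.0, jitter=0.1):
    delays = []
    delay = initial_delay_ms
    for _ in range(max_retries):
        jittered = delay * (1 + random.uniform(-jitter, jitter))
        delays.append(min(jittered, max_delay_ms))
        delay = min(delay * backoff_multiplier, max_delay_ms)
    return delays

# With jitter disabled the schedule is deterministic: 100, 200, 400 ms.
print(retry_delays(jitter=0.0))  # [100.0, 200.0, 400.0]
```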

Circuit Breaker

```toml
[features.file_search.circuit_breaker]
enabled = true
failure_threshold = 5
failure_window_secs = 60
recovery_timeout_secs = 30
```

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| enabled | boolean | true | Enable circuit breaker |
| failure_threshold | integer | 5 | Failures required to open the circuit |
| failure_window_secs | integer | 60 | Window for counting failures (seconds) |
| recovery_timeout_secs | integer | 30 | Time before attempting recovery (seconds) |
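The behavior these three settings describe can be sketched as a minimal state machine: the circuit opens after failure_threshold failures inside failure_window_secs, and permits a trial call once recovery_timeout_secs has elapsed. This is an illustration under those assumptions, not Hadrian's implementation:

```python
# Minimal circuit-breaker sketch: track failure timestamps in a
# sliding window, trip when the threshold is hit, and half-open
# after the recovery timeout.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, failure_window_secs=60,
                 recovery_timeout_secs=30, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.failure_window_secs = failure_window_secs
        self.recovery_timeout_secs = recovery_timeout_secs
        self.clock = clock      # injectable for testing
        self.failures = []      # timestamps of recent failures
        self.opened_at = None

    def record_failure(self):
        now = self.clock()
        # Drop failures that fell out of the sliding window.
        self.failures = [t for t in self.failures
                         if now - t < self.failure_window_secs]
        self.failures.append(now)
        if len(self.failures) >= self.failure_threshold:
            self.opened_at = now    # trip the breaker

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a trial call once the recovery timeout passes.
        return self.clock() - self.opened_at >= self.recovery_timeout_secs

fake_now = [0.0]
cb = CircuitBreaker(clock=lambda: fake_now[0])
for _ in range(5):
    cb.record_failure()
print(cb.allow_request())   # False: circuit is open
fake_now[0] = 31.0
print(cb.allow_request())   # True: recovery timeout elapsed
```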

Complete Example

```toml
[features.file_search]
enabled = true
max_iterations = 5
max_results_per_search = 10
timeout_secs = 30
include_annotations = true
score_threshold = 0.7
max_search_result_chars = 50000

[features.file_search.vector_backend]
type = "pgvector"
table_name = "rag_chunks"
index_type = "hnsw"
distance_metric = "cosine"

[features.file_search.embedding]
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536

[features.file_search.rerank]
enabled = true
model = "gpt-4o-mini"
max_results_to_rerank = 20
batch_size = 10
timeout_secs = 30
fallback_on_error = true

[features.file_search.retry]
enabled = true
max_retries = 3
initial_delay_ms = 100
max_delay_ms = 10000
backoff_multiplier = 2.0

[features.file_search.circuit_breaker]
enabled = true
failure_threshold = 5
failure_window_secs = 60
recovery_timeout_secs = 30
```

Distance Metrics

| Metric | Best For | Score Range |
|--------|----------|-------------|
| cosine | Text embeddings (default) | 0.0-1.0 (higher = more similar) |
| dot_product | Normalized embeddings | Varies (requires normalized vectors) |
| euclidean | Metric space embeddings | 0.0-1.0 (converted from distance) |

Changing the distance metric after indexing data requires recreating the vector index.
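The three metrics can be sketched on a pair of embeddings as follows. The euclidean-to-similarity conversion shown (1 / (1 + distance)) is one common choice and an assumption here, not necessarily the formula the gateway uses:

```python
# Sketch of the three distance metrics on small example vectors.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def dot_product(a, b):
    # Equivalent to cosine only when both vectors are unit-normalized.
    return sum(x * y for x, y in zip(a, b))

def euclidean_score(a, b):
    dist = math.dist(a, b)
    return 1.0 / (1.0 + dist)   # one common way to map distance into (0, 1]

a, b = [1.0, 0.0], [1.0, 0.0]
print(cosine(a, b))           # 1.0 for identical directions
print(euclidean_score(a, b))  # 1.0 for identical vectors
```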
