Hadrian is experimental alpha software. Do not use in production.

Chat UI

Built-in web interface for multi-model conversations with streaming, file uploads, and advanced features

Hadrian Gateway includes a built-in React-based chat interface for interacting with multiple LLM models simultaneously. The UI supports real-time streaming, file uploads, conversation history, and advanced multi-model interaction modes.

Multi-Model Chat

Chat with multiple models in a single conversation to compare responses, leverage different model strengths, or get diverse perspectives.

Selecting Models

Select one or more models from the model picker. Each model responds to your messages in parallel, with responses displayed side-by-side.

| Feature | Description |
| --- | --- |
| Multi-select | Choose any number of models to respond simultaneously |
| Provider grouping | Models organized by provider (OpenAI, Anthropic, etc.) |
| Search | Filter models by name or provider |
| Favorites | Pin frequently used models for quick access |

Model Instances

Create multiple instances of the same model with different settings to compare behavior:

```
GPT-4 (Creative)     → temperature: 0.9, top_p: 0.95
GPT-4 (Precise)      → temperature: 0.3, top_p: 0.8
GPT-4 (Reasoning)    → reasoning: high effort
```

Each instance has:

  • Unique ID - Distinguishes instances in the UI and message history
  • Custom label - Display name (e.g., "Creative", "Precise")
  • Instance parameters - Override temperature, max tokens, reasoning, system prompt

Instance parameters take precedence over per-model settings, which take precedence over global defaults.
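
The effective parameters can be pictured as a three-layer merge where later layers win. The sketch below is illustrative; the type and function names are hypothetical, not Hadrian's internals:

```typescript
// Hypothetical sketch of three-layer settings resolution.
// Later spreads win: instance parameters beat per-model
// settings, which beat global defaults.
interface ModelParams {
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  system_prompt?: string;
}

function resolveParams(
  globalDefaults: ModelParams,
  perModel: ModelParams,
  instance: ModelParams,
): ModelParams {
  return { ...globalDefaults, ...perModel, ...instance };
}

// The "GPT-4 (Creative)" instance from the example above:
resolveParams(
  { temperature: 0.7, max_tokens: 4096 }, // global defaults
  { temperature: 0.5 },                   // per-model setting
  { temperature: 0.9, top_p: 0.95 },      // instance parameters
);
// => { temperature: 0.9, max_tokens: 4096, top_p: 0.95 }
```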

Response Display

Responses are displayed in a configurable layout:

| Layout | Description |
| --- | --- |
| Grid | Side-by-side cards; adjusts columns based on screen width |
| Stacked | Vertical list, one response per row |

Each response card shows:

  • Model name and instance label
  • Streaming content with syntax-highlighted markdown
  • Usage stats (tokens, cost, latency)
  • Action buttons (copy, regenerate, expand, feedback)

Per-Model Settings

Configure settings for each model independently:

| Setting | Description |
| --- | --- |
| Temperature | Randomness (0.0 = deterministic, 2.0 = creative) |
| Max tokens | Maximum response length |
| Top P | Nucleus sampling threshold |
| Top K | Top-k sampling (where supported) |
| Frequency penalty | Reduce repetition of tokens |
| Presence penalty | Encourage topic diversity |
| Reasoning | Enable extended thinking (see below) |
| System prompt | Per-model system prompt override |

Reasoning Mode

Enable extended thinking for models that support it:

| Model | Reasoning Support |
| --- | --- |
| OpenAI o1/o3/o4-mini | Native reasoning_effort |
| Anthropic Claude 3.5+/4 | Extended thinking (budget_tokens) |
| Bedrock Claude | Via reasoning_config |
| Vertex Gemini 2.5+/3+ | thinking_config |

Effort levels: none, minimal, low, medium, high

When reasoning is enabled, the model's thinking process appears in a collapsible section above the response. Reasoning tokens are tracked separately in usage stats.
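
As a rough sketch of how a single effort level could fan out to the provider-specific parameters listed above (the reasoning_effort and budget_tokens field names follow the providers' public APIs; the mapping logic and budget numbers are illustrative assumptions, not Hadrian's actual translation):

```typescript
// Illustrative mapping from a unified effort level to
// provider-specific reasoning parameters.
type Effort = "none" | "minimal" | "low" | "medium" | "high";

function toProviderParams(provider: "openai" | "anthropic", effort: Effort) {
  if (effort === "none") return {};
  switch (provider) {
    case "openai":
      // o-series models accept a reasoning effort string.
      return { reasoning_effort: effort };
    case "anthropic": {
      // Extended thinking takes a token budget instead.
      // These budget values are made up for illustration.
      const budgets: Record<Exclude<Effort, "none">, number> = {
        minimal: 1024, low: 4096, medium: 16384, high: 32768,
      };
      return { thinking: { type: "enabled", budget_tokens: budgets[effort] } };
    }
    default:
      return {};
  }
}
```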

Streaming

All responses stream in real-time using Server-Sent Events (SSE), providing immediate feedback as models generate content.
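
For a sense of what the client side of that looks like, here is a minimal SSE consumer. The endpoint path and chunk shape assume an OpenAI-compatible streaming format; Hadrian's actual wire format may differ:

```typescript
// Minimal SSE reader sketch using fetch + ReadableStream.
// Assumes OpenAI-style "data: {json}" framing per event.
async function streamChat(body: object, onToken: (t: string) => void) {
  const res = await fetch("/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ ...body, stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any partial line for the next read
    for (const line of lines) {
      if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
      const chunk = JSON.parse(line.slice("data: ".length));
      onToken(chunk.choices?.[0]?.delta?.content ?? "");
    }
  }
}
```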

Performance

The chat UI is optimized for high-performance multi-model streaming:

| Metric | Capability |
| --- | --- |
| Token rate | 50-100+ tokens/second per model |
| Concurrent streams | Unlimited (parallel SSE connections) |
| Render efficiency | Only active response cards re-render |
| Message list | Virtualized for smooth scrolling with large histories |

Streaming Features

| Feature | Description |
| --- | --- |
| Live markdown | Content renders as markdown while streaming |
| Syntax highlighting | Code blocks highlighted in real time |
| Auto-scroll | Follows streaming content, pauses on scroll-up |
| Cancel | Stop any or all streams mid-generation |
| Usage stats | Time-to-first-token and tokens/second displayed |
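
The auto-scroll behavior is a standard pattern: follow the stream only while the user is already at the bottom. A simplified version (not Hadrian's actual implementation) might look like:

```typescript
// Simplified auto-scroll: keep following new content unless
// the user has scrolled up, and resume when they return.
function setupAutoScroll(list: HTMLElement) {
  let follow = true;
  list.addEventListener("scroll", () => {
    const fromBottom = list.scrollHeight - list.scrollTop - list.clientHeight;
    follow = fromBottom < 40; // scrolling up pauses following
  });
  return function onContentAppended() {
    if (follow) list.scrollTop = list.scrollHeight;
  };
}
```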

Usage Statistics

Each response displays detailed usage information:

| Stat | Description |
| --- | --- |
| Input tokens | Tokens in the prompt |
| Output tokens | Tokens generated |
| Cached tokens | Tokens served from cache (Anthropic) |
| Reasoning tokens | Tokens used for thinking (if enabled) |
| Cost | Estimated cost based on model pricing |
| First token | Time to first token (ms) |
| Duration | Total response time (ms) |
| Tokens/sec | Output tokens per second |
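
The timing stats follow directly from timestamps and token counts. A sketch of the arithmetic (Hadrian may place the measurement boundaries slightly differently):

```typescript
// Deriving the displayed timing stats from raw measurements.
interface RawUsage {
  outputTokens: number;
  requestStartMs: number; // when the request was sent
  firstTokenMs: number;   // when the first token arrived
  lastTokenMs: number;    // when the final token arrived
}

function deriveStats(u: RawUsage) {
  const firstToken = u.firstTokenMs - u.requestStartMs; // TTFT (ms)
  const duration = u.lastTokenMs - u.requestStartMs;    // total (ms)
  const genSeconds = (u.lastTokenMs - u.firstTokenMs) / 1000;
  const tokensPerSec = genSeconds > 0 ? u.outputTokens / genSeconds : 0;
  return { firstToken, duration, tokensPerSec };
}
```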

File Uploads

Attach files to your messages for vision models, document analysis, or data processing with frontend tools.

Supported File Types

| Category | Extensions | Notes |
| --- | --- | --- |
| Images | PNG, JPG, GIF, WebP, SVG | Inline preview, sent to vision models |
| Documents | PDF, DOCX, TXT, MD, HTML | Text extraction for context |
| Data | CSV, XLSX, JSON, Parquet | Available to SQL/Python tools |
| Code | JS, TS, PY, RS, GO, etc. | Syntax highlighting in preview |
| Archives | ZIP, TAR, GZ | Extracted for processing |
| Audio | WAV, MP3, WebM | For transcription models |

Upload Methods

  • Click - File picker dialog
  • Drag & drop - Drop files onto the input area
  • Paste - Paste images from clipboard

File Handling

Files are processed based on type:

| Type | Handling |
| --- | --- |
| Images | Sent as base64 to vision-capable models (see the sketch below) |
| Documents | Text extracted and included in context |
| Data files | Available to frontend tools (Python, SQL) |
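
As an illustration of the images-as-base64 path, a client could encode an attachment like this. The image_url content-part shape is the common OpenAI-style convention, assumed here rather than taken from Hadrian's schema:

```typescript
// Read a dropped/selected File and attach it as a base64 data
// URL, using the widely used image_url content-part shape.
async function toImagePart(file: File) {
  const dataUrl: string = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file); // "data:image/png;base64,...."
  });
  return { type: "image_url", image_url: { url: dataUrl } };
}
```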

File size limits and allowed types are configurable in hadrian.toml under [ui.chat].

Configuration

```toml
[ui.chat]
file_uploads_enabled = true
max_file_size_bytes = 10485760  # 10 MB
allowed_file_types = [
  "image/png", "image/jpeg", "image/gif", "image/webp",
  "application/pdf", "text/plain", "text/markdown",
  "text/csv", "application/json"
]
```

Conversation Management

History

Conversations are persisted locally in IndexedDB:

| Feature | Description |
| --- | --- |
| Auto-save | Messages saved as they're received |
| Persistence | Survives page refresh and browser restart |
| Search | Find conversations by title or content |
| Export | Download conversation as JSON or Markdown |
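
A minimal sketch of the IndexedDB persistence described above (the database and store names are hypothetical, not Hadrian's actual schema):

```typescript
// Minimal IndexedDB persistence sketch with assumed names.
function openDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("chat-history", 1);
    req.onupgradeneeded = () =>
      req.result.createObjectStore("conversations", { keyPath: "id" });
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function saveConversation(convo: { id: string; messages: unknown[] }) {
  const db = await openDb();
  const tx = db.transaction("conversations", "readwrite");
  tx.objectStore("conversations").put(convo);
  return new Promise<void>((resolve, reject) => {
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```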

Organization

| Feature | Description |
| --- | --- |
| Pin | Pin important conversations to the top |
| Rename | Edit conversation titles |
| Delete | Remove conversations (with confirmation) |
| Fork | Create a copy to explore different directions |

Project Assignment

Assign conversations to a project using the project picker in the chat header. Select a project from the dropdown or choose "Personal" for unscoped usage.

When a project is selected:

  • The X-Hadrian-Project header is sent with every request, attributing usage to that project (see the sketch after this list)
  • Usage appears in the project's usage dashboard in the admin panel
  • Session-based users (SSO/proxy auth) get per-project usage tracking without needing a project-scoped API key
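
A request carrying the project header might look like the following sketch. The endpoint and body shape are assumptions; only the X-Hadrian-Project header name comes from the behavior described above:

```typescript
// Sketch: attaching the project header to a chat request.
async function sendWithProject(projectId: string, body: object) {
  return fetch("/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Hadrian-Project": projectId, // attributes usage to the project
    },
    body: JSON.stringify(body),
  });
}
```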

Project Sharing

Share conversations with team members via projects:

  1. Create or select a project in the admin panel
  2. Move the conversation to the project
  3. Team members with project access can view and continue the conversation

Message Features

User Messages

| Feature | Description |
| --- | --- |
| Edit | Modify and re-send (deletes subsequent messages) |
| Files | Attach images and documents |
| History mode | Choose which history to send (all or same-model) |

Assistant Messages

| Feature | Description |
| --- | --- |
| Copy | Copy response to clipboard |
| Regenerate | Get a new response from the same model |
| Expand | Full-screen view for long responses |
| Feedback | Thumbs up/down rating |
| Mark as best | Select the best response when comparing |
| Hide | Temporarily hide a response |
| Speak | Text-to-speech playback |

Citations

When using file search (RAG) or web search tools, responses include citations:

| Citation Type | Source |
| --- | --- |
| File | Chunks from vector store documents |
| URL | Web search results |
| Chunk | Full text of retrieved chunks |

Citations appear as inline references with expandable previews.
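
One way to picture the data behind those references is a tagged union over the three citation types. The field names below are hypothetical, chosen only to illustrate the structure:

```typescript
// Hypothetical shape of a citation attached to a response.
type Citation =
  | { kind: "file"; documentId: string; preview: string } // vector store chunk
  | { kind: "url"; url: string; title: string }           // web search result
  | { kind: "chunk"; text: string };                      // full retrieved text
```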

Artifacts

Tool execution produces artifacts displayed inline:

| Artifact Type | Source |
| --- | --- |
| Code | Python/JavaScript execution output |
| Tables | DataFrames, SQL query results |
| Charts | Vega-Lite visualizations |
| Images | Generated plots and graphics |
| HTML | Rendered HTML previews |

Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| Enter | Send message |
| Shift+Enter | New line |
| Ctrl+/ | Focus message input |
| Escape | Cancel streaming |
| Ctrl+N | New conversation |
