# Bifrost
Bifrost is a unified AI gateway that sits between the Lota SDK and LLM providers. All model requests -- chat completions, embeddings, reasoning -- route through Bifrost. This gives you centralized key management, provider abstraction, model routing, usage tracking, and governance in a single layer.
## Architecture
```
┌─────────────┐      ┌───────────────────┐      ┌──────────────────┐
│  SDK Agent  │─────▶│  Bifrost Gateway  │─────▶│  OpenAI          │
│ (chat turn) │      │  localhost:8081   │      │  OpenRouter      │
│             │      │                   │      │  Google Gemini   │
└─────────────┘      └───────────────────┘      └──────────────────┘
```

The SDK uses the OpenAI-compatible API format (the `@ai-sdk/openai` provider) to talk to Bifrost. Bifrost then routes each request to the correct upstream provider based on the model ID prefix and the virtual key configuration.
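To make the prefix-based routing concrete, here is a small illustrative sketch; `splitModelId` is a hypothetical helper, not a real SDK or Bifrost export:

```ts
// Hypothetical helper (not part of the SDK or Bifrost): the first path
// segment of a model ID names the upstream provider; the remainder is the
// provider-native model name, which may itself contain slashes.
function splitModelId(modelId: string): { provider: string; model: string } {
  const slash = modelId.indexOf('/')
  if (slash === -1) throw new Error(`model ID has no provider prefix: ${modelId}`)
  return { provider: modelId.slice(0, slash), model: modelId.slice(slash + 1) }
}

splitModelId('openai/gpt-5.4')
// → { provider: 'openai', model: 'gpt-5.4' }

splitModelId('openrouter/anthropic/claude-sonnet-4')
// → { provider: 'openrouter', model: 'anthropic/claude-sonnet-4' }
```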
## Configuration
Bifrost is configured via `infrastructure/bifrost.config.json`. This file is mounted into the Docker container at startup.
### Key Sections
`providers` -- Upstream LLM providers with API keys and network settings:

```json
{
  "providers": {
    "openai": {
      "keys": [{ "name": "openai-key", "value": "env.OPENAI_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    },
    "openrouter": {
      "keys": [{ "name": "openrouter-key", "value": "env.OPENROUTER_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    },
    "gemini": {
      "keys": [{ "name": "gemini-key", "value": "env.GOOGLE_GENERATIVE_AI_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    }
  }
}
```

Key values use the `env.VARIABLE_NAME` format to read from environment variables at runtime, so secrets stay out of the config file.
`governance.virtual_keys` -- Virtual keys (`sk-bf-*`) that the SDK uses to authenticate. Each virtual key maps to one or more providers with allowed model lists:

```json
{
  "governance": {
    "virtual_keys": [
      {
        "id": "ai-gateway-key",
        "name": "AI Gateway Key",
        "value": "env.AI_GATEWAY_KEY",
        "is_active": true,
        "provider_configs": [
          {
            "provider": "openai",
            "weight": 1.0,
            "allowed_models": ["gpt-5.4", "text-embedding-3-small"]
          },
          {
            "provider": "openrouter",
            "weight": 1.0,
            "allowed_models": ["google/gemini-3.1-pro-preview", "openai/gpt-oss-120b:nitro"]
          }
        ]
      }
    ]
  }
}
```

`auth_config` -- Admin credentials for the Bifrost management API:
```json
{
  "auth_config": {
    "admin_username": "env.AI_GATEWAY_ADMIN",
    "admin_password": "env.AI_GATEWAY_PASS",
    "is_enabled": true,
    "disable_auth_on_inference": true
  }
}
```

`plugins` -- Bifrost plugins, such as governance enforcement:

```json
{
  "plugins": [{ "enabled": true, "name": "governance", "config": { "is_vk_mandatory": true } }]
}
```

## SDK Integration
Connect the SDK to Bifrost via the `aiGateway` config in `createLotaRuntime`:
```ts
import { createLotaRuntime } from '@lota-sdk/core'

const runtime = await createLotaRuntime({
  // ...other config
  aiGateway: {
    url: 'http://localhost:8081/v1',
    key: 'sk-bf-your-virtual-key',
    admin: 'http://localhost:8081',
    pass: 'admin-password',
    embeddingModel: 'openai/text-embedding-3-small',
  },
})
```

| Field | Description |
|---|---|
| `url` | Bifrost inference endpoint (the `/v1` path is required). |
| `key` | Virtual key in `sk-bf-*` format. Must match a key in `governance.virtual_keys`. |
| `admin` | Optional. Bifrost admin API base URL for management operations. |
| `pass` | Optional. Admin password for the management API. |
| `embeddingModel` | Model ID for text embeddings. Defaults to `openai/text-embedding-3-small`. |
The SDK sets these as environment variables internally (`AI_GATEWAY_URL`, `AI_GATEWAY_KEY`, `AI_GATEWAY_ADMIN`, `AI_GATEWAY_PASS`) so the Bifrost provider can read them at request time.
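As a rough sketch of that mapping (illustrative only; `gatewayEnv` is hypothetical and the real SDK internals may differ):

```ts
// Hypothetical sketch: how the aiGateway config fields could map onto the
// environment variables named above. Not the SDK's actual implementation.
interface AiGatewayConfig {
  url: string
  key: string
  admin?: string
  pass?: string
}

function gatewayEnv(cfg: AiGatewayConfig): Record<string, string> {
  const env: Record<string, string> = {
    AI_GATEWAY_URL: cfg.url,
    AI_GATEWAY_KEY: cfg.key,
  }
  // The admin fields are optional, so they are only set when provided.
  if (cfg.admin) env.AI_GATEWAY_ADMIN = cfg.admin
  if (cfg.pass) env.AI_GATEWAY_PASS = cfg.pass
  return env
}
```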
## Model Helpers
The SDK exports helper functions for creating model references that route through Bifrost:
```ts
import {
  bifrostChatModel,
  bifrostEmbeddingModel,
  bifrostModel,
  bifrostOpenRouterResponseHealingModel,
} from '@lota-sdk/core/bifrost'
```

### bifrostChatModel(modelId)
Creates a wrapped language model with automatic reasoning content injection. Use this for all chat completions:
```ts
const model = bifrostChatModel('openrouter/anthropic/claude-sonnet-4')
```

This wrapper:

- Injects raw chunks for streaming when reasoning is enabled
- Extracts reasoning content from provider-specific response fields (`reasoning`, `reasoning_content`, `reasoning_details`)
- Injects reasoning tokens into the Vercel AI SDK content stream
See Provider Experiments for the reasoning and tool-calling behavior observed across direct providers, OpenRouter, and Bifrost.
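As a rough illustration of the field extraction step (the field names come from the list above; `extractReasoning` itself is hypothetical, not an SDK export):

```ts
// Hypothetical sketch of pulling reasoning text out of provider-specific
// response fields. Real wrapper behavior in the SDK may differ.
type ReasoningDetail = { text?: string }

interface ProviderMessage {
  reasoning?: string
  reasoning_content?: string
  reasoning_details?: ReasoningDetail[]
}

function extractReasoning(msg: ProviderMessage): string | undefined {
  // Check each known field in turn; different providers populate different ones.
  if (msg.reasoning) return msg.reasoning
  if (msg.reasoning_content) return msg.reasoning_content
  if (msg.reasoning_details?.length) {
    return msg.reasoning_details.map((d) => d.text ?? '').join('')
  }
  return undefined
}
```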
### bifrostEmbeddingModel(modelId)
Creates an embedding model reference:
```ts
const embedder = bifrostEmbeddingModel('openai/text-embedding-3-small')
```

### bifrostModel(modelId)
Creates a raw model reference without the chat reasoning wrapper. Use for non-chat completions:
```ts
const model = bifrostModel('openai/gpt-5.4')
```

In this repo, `bifrostModel(...)` is the right fit for providers with stable Responses support. `openrouter/*` models should stay on `bifrostChatModel(...)` until OpenRouter Responses behavior is proven stable for the exact models in use. See Provider Experiments.
### bifrostOpenRouterResponseHealingModel(modelId)
Creates a model with OpenRouter's response healing plugin enabled. This automatically retries malformed responses:
```ts
const model = bifrostOpenRouterResponseHealingModel('openrouter/anthropic/claude-sonnet-4')
```

## Docker Setup
Bifrost runs as a Docker container alongside SurrealDB and Redis. See the Docker infrastructure guide for the full docker-compose.yml setup.
The relevant service definition:
```yaml
aigateway:
  image: maximhq/bifrost:latest
  container_name: bifrost
  ports:
    - '8081:8080'
  env_file:
    - ../.env
  volumes:
    - ./.gateway-data:/app/data
    - ./bifrost.config.json:/app/data/config.json:ro
  environment:
    - APP_PORT=8080
    - APP_HOST=0.0.0.0
    - LOG_LEVEL=info
    - LOG_STYLE=json
  healthcheck:
    test: ['CMD', 'wget', '-q', '-O', '/dev/null', 'http://localhost:8080/health']
    interval: 30s
    timeout: 10s
    retries: 3
```

## Required Environment Variables
Set these in your .env file:
| Variable | Description |
|---|---|
| `AI_GATEWAY_KEY` | The virtual key (`sk-bf-*`) the SDK uses for inference requests |
| `AI_GATEWAY_ADMIN` | Admin username for the Bifrost management API |
| `AI_GATEWAY_PASS` | Admin password for the Bifrost management API |
| `OPENAI_API_KEY` | OpenAI provider API key |
| `OPENROUTER_API_KEY` | OpenRouter provider API key |
| `GOOGLE_GENERATIVE_AI_API_KEY` | Google Gemini provider API key |
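A matching `.env` fragment might look like this (placeholder values only; substitute your real keys):

```
AI_GATEWAY_KEY=sk-bf-your-virtual-key
AI_GATEWAY_ADMIN=admin
AI_GATEWAY_PASS=change-me
OPENAI_API_KEY=sk-proj-...
OPENROUTER_API_KEY=sk-or-...
GOOGLE_GENERATIVE_AI_API_KEY=...
```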
## Data Persistence
Bifrost stores its config and logs in SQLite databases inside `/app/data`:

- `config.db` -- Runtime configuration state
- `logs.db` -- Request and usage logs

The `.gateway-data` volume mount ensures these persist across container restarts.
## Adding Providers
To add a new LLM provider:
1. Add the provider to the `providers` section of `bifrost.config.json`:

   ```json
   {
     "providers": {
       "anthropic": {
         "keys": [{ "name": "anthropic-key", "value": "env.ANTHROPIC_API_KEY", "models": [], "weight": 1.0 }],
         "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
       }
     }
   }
   ```

2. Add allowed models to the virtual key's `provider_configs`:

   ```json
   {
     "provider": "anthropic",
     "weight": 1.0,
     "allowed_models": ["claude-sonnet-4-20250514"]
   }
   ```

3. Set the provider's API key in your `.env` file:

   ```
   ANTHROPIC_API_KEY=sk-ant-...
   ```

4. Restart the Bifrost container to pick up the new configuration.
Model IDs in the SDK follow the format `provider/model-name`. When you call `bifrostChatModel('anthropic/claude-sonnet-4-20250514')`, Bifrost routes the request to the Anthropic provider using the matching key.