
Bifrost

Bifrost is a unified AI gateway that sits between the Lota SDK and LLM providers. All model requests -- chat completions, embeddings, reasoning -- route through Bifrost. This gives you centralized key management, provider abstraction, model routing, usage tracking, and governance in a single layer.

Architecture

┌──────────────┐     ┌───────────────────┐     ┌──────────────────┐
│  SDK Agent   │────▶│  Bifrost Gateway  │────▶│  OpenAI          │
│  (chat turn) │     │  localhost:8081   │     │  OpenRouter      │
│              │     │                   │     │  Google Gemini   │
└──────────────┘     └───────────────────┘     └──────────────────┘

The SDK uses the OpenAI-compatible API format (@ai-sdk/openai provider) to talk to Bifrost. Bifrost then routes the request to the correct upstream provider based on the model ID prefix and virtual key configuration.

Configuration

Bifrost is configured via infrastructure/bifrost.config.json. This file is mounted into the Docker container at startup.

Key Sections

providers -- Upstream LLM providers with API keys and network settings:

json
{
  "providers": {
    "openai": {
      "keys": [{ "name": "openai-key", "value": "env.OPENAI_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    },
    "openrouter": {
      "keys": [{ "name": "openrouter-key", "value": "env.OPENROUTER_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    },
    "gemini": {
      "keys": [{ "name": "gemini-key", "value": "env.GOOGLE_GENERATIVE_AI_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    }
  }
}

Key values use the env.VARIABLE_NAME format to read from environment variables at runtime, so secrets stay out of the config file.
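The substitution itself happens inside Bifrost at startup; the convention can be sketched as follows (a hypothetical helper for illustration, not Bifrost's actual code):

```typescript
// Resolve a config value that may use the "env.VARIABLE_NAME" convention.
// Literal values pass through unchanged; env references are looked up at runtime.
function resolveSecret(value: string, env: Record<string, string | undefined>): string {
  if (!value.startsWith('env.')) return value // literal value, use as-is
  const name = value.slice('env.'.length)
  const resolved = env[name]
  if (resolved === undefined) {
    throw new Error(`Missing environment variable: ${name}`)
  }
  return resolved
}

// e.g. resolveSecret('env.OPENAI_API_KEY', process.env)
```

Because only the `env.*` reference is committed, rotating a key means updating the environment and restarting the container; the config file never changes.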

governance.virtual_keys -- Virtual keys (sk-bf-*) that the SDK uses to authenticate. Each virtual key maps to one or more providers with allowed model lists:

json
{
  "governance": {
    "virtual_keys": [
      {
        "id": "ai-gateway-key",
        "name": "AI Gateway Key",
        "value": "env.AI_GATEWAY_KEY",
        "is_active": true,
        "provider_configs": [
          {
            "provider": "openai",
            "weight": 1.0,
            "allowed_models": ["gpt-5.4", "text-embedding-3-small"]
          },
          {
            "provider": "openrouter",
            "weight": 1.0,
            "allowed_models": ["google/gemini-3.1-pro-preview", "openai/gpt-oss-120b:nitro"]
          }
        ]
      }
    ]
  }
}
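Conceptually, the governance layer checks each incoming request's virtual key and model against this structure. A minimal sketch of that check (hypothetical types and logic, not Bifrost's implementation):

```typescript
interface ProviderConfig { provider: string; weight: number; allowed_models: string[] }
interface VirtualKey {
  id: string
  name: string
  value: string
  is_active: boolean
  provider_configs: ProviderConfig[]
}

// A request passes when its key matches an active virtual key and at least one
// of that key's provider configs lists the requested model.
function isRequestAllowed(keys: VirtualKey[], keyValue: string, model: string): boolean {
  const vk = keys.find((k) => k.value === keyValue)
  if (!vk || !vk.is_active) return false
  return vk.provider_configs.some((pc) => pc.allowed_models.includes(model))
}
```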

auth_config -- Admin credentials for the Bifrost management API:

json
{
  "auth_config": {
    "admin_username": "env.AI_GATEWAY_ADMIN",
    "admin_password": "env.AI_GATEWAY_PASS",
    "is_enabled": true,
    "disable_auth_on_inference": true
  }
}

plugins -- Bifrost plugins, such as governance enforcement:

json
{
  "plugins": [{ "enabled": true, "name": "governance", "config": { "is_vk_mandatory": true } }]
}

SDK Integration

Connect the SDK to Bifrost via the aiGateway config in createLotaRuntime:

ts
import { createLotaRuntime } from '@lota-sdk/core'

const runtime = await createLotaRuntime({
  // ...other config
  aiGateway: {
    url: 'http://localhost:8081/v1',
    key: 'sk-bf-your-virtual-key',
    admin: 'http://localhost:8081',
    pass: 'admin-password',
    embeddingModel: 'openai/text-embedding-3-small',
  },
})
url -- Bifrost inference endpoint (the /v1 path is required).
key -- Virtual key in sk-bf-* format. Must match a key in governance.virtual_keys.
admin -- Optional. Bifrost admin API base URL for management operations.
pass -- Optional. Admin password for the management API.
embeddingModel -- Model ID for text embeddings. Defaults to openai/text-embedding-3-small.

The SDK sets these as environment variables internally (AI_GATEWAY_URL, AI_GATEWAY_KEY, AI_GATEWAY_ADMIN, AI_GATEWAY_PASS) so the Bifrost provider can read them at request time.
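That mapping can be sketched as follows (field names are from the table above; the helper itself is hypothetical, not the SDK's internal code):

```typescript
interface AiGatewayConfig {
  url: string
  key: string
  admin?: string
  pass?: string
  embeddingModel?: string
}

// Map the aiGateway config onto the environment variables the SDK sets internally.
// Optional fields are only set when provided.
function toGatewayEnv(cfg: AiGatewayConfig): Record<string, string> {
  const env: Record<string, string> = {
    AI_GATEWAY_URL: cfg.url,
    AI_GATEWAY_KEY: cfg.key,
  }
  if (cfg.admin) env.AI_GATEWAY_ADMIN = cfg.admin
  if (cfg.pass) env.AI_GATEWAY_PASS = cfg.pass
  return env
}
```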

Model Helpers

The SDK exports helper functions for creating model references that route through Bifrost:

ts
import {
  bifrostChatModel,
  bifrostEmbeddingModel,
  bifrostModel,
  bifrostOpenRouterResponseHealingModel,
} from '@lota-sdk/core/bifrost'

bifrostChatModel(modelId)

Creates a wrapped language model with automatic reasoning content injection. Use this for all chat completions:

ts
const model = bifrostChatModel('openrouter/anthropic/claude-sonnet-4')

This wrapper:

  • Injects raw chunks for streaming when reasoning is enabled
  • Extracts reasoning content from provider-specific response fields (reasoning, reasoning_content, reasoning_details)
  • Injects reasoning tokens into the Vercel AI SDK content stream

See Provider Experiments for the reasoning and tool-calling behavior observed across direct providers, OpenRouter, and Bifrost.

bifrostEmbeddingModel(modelId)

Creates an embedding model reference:

ts
const embedder = bifrostEmbeddingModel('openai/text-embedding-3-small')

bifrostModel(modelId)

Creates a raw model reference without the chat reasoning wrapper. Use it for requests that do not need reasoning content injection:

ts
const model = bifrostModel('openai/gpt-5.4')

In this repo, bifrostModel(...) is the right fit for providers with stable Responses support. openrouter/* models should stay on bifrostChatModel(...) until OpenRouter Responses behavior is proven stable for the exact models in use. See Provider Experiments.

bifrostOpenRouterResponseHealingModel(modelId)

Creates a model with OpenRouter's response healing plugin enabled. This automatically retries malformed responses:

ts
const model = bifrostOpenRouterResponseHealingModel('openrouter/anthropic/claude-sonnet-4')

Docker Setup

Bifrost runs as a Docker container alongside SurrealDB and Redis. See the Docker infrastructure guide for the full docker-compose.yml setup.

The relevant service definition:

yaml
aigateway:
  image: maximhq/bifrost:latest
  container_name: bifrost
  ports:
    - '8081:8080'
  env_file:
    - ../.env
  volumes:
    - ./.gateway-data:/app/data
    - ./bifrost.config.json:/app/data/config.json:ro
  environment:
    - APP_PORT=8080
    - APP_HOST=0.0.0.0
    - LOG_LEVEL=info
    - LOG_STYLE=json
  healthcheck:
    test: ['CMD', 'wget', '-q', '-O', '/dev/null', 'http://localhost:8080/health']
    interval: 30s
    timeout: 10s
    retries: 3

Required Environment Variables

Set these in your .env file:

AI_GATEWAY_KEY -- The virtual key (sk-bf-*) the SDK uses for inference requests
AI_GATEWAY_ADMIN -- Admin username for the Bifrost management API
AI_GATEWAY_PASS -- Admin password for the Bifrost management API
OPENAI_API_KEY -- OpenAI provider API key
OPENROUTER_API_KEY -- OpenRouter provider API key
GOOGLE_GENERATIVE_AI_API_KEY -- Google Gemini provider API key
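
A minimal .env sketch with placeholder values (substitute your own keys):

```
AI_GATEWAY_KEY=sk-bf-your-virtual-key
AI_GATEWAY_ADMIN=admin
AI_GATEWAY_PASS=change-me
OPENAI_API_KEY=sk-...
OPENROUTER_API_KEY=sk-or-...
GOOGLE_GENERATIVE_AI_API_KEY=...
```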

Data Persistence

Bifrost stores its config and logs in SQLite databases inside /app/data:

  • config.db -- Runtime configuration state
  • logs.db -- Request and usage logs

The .gateway-data volume mount ensures these persist across container restarts.

Adding Providers

To add a new LLM provider:

  1. Add the provider to the providers section of bifrost.config.json:
json
{
  "providers": {
    "anthropic": {
      "keys": [{ "name": "anthropic-key", "value": "env.ANTHROPIC_API_KEY", "models": [], "weight": 1.0 }],
      "network_config": { "default_request_timeout_in_seconds": 600, "max_retries": 3 }
    }
  }
}
  2. Add allowed models to the virtual key's provider_configs:
json
{
  "provider": "anthropic",
  "weight": 1.0,
  "allowed_models": ["claude-sonnet-4-20250514"]
}
  3. Set the provider's API key in your .env file:
ANTHROPIC_API_KEY=sk-ant-...
  4. Restart the Bifrost container to pick up the new configuration.

Model IDs in the SDK follow the format provider/model-name. When you call bifrostChatModel('anthropic/claude-sonnet-4-20250514'), Bifrost routes the request to the Anthropic provider using the matching key.
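That prefix split can be sketched as follows (a hypothetical helper for illustration; Bifrost's own routing is internal to the gateway):

```typescript
// Split an SDK model ID of the form "provider/model-name" into its parts.
// Only the first "/" separates the provider; the remainder is the upstream
// model ID, which may itself contain slashes (e.g. OpenRouter model IDs).
function splitModelId(modelId: string): { provider: string; model: string } {
  const idx = modelId.indexOf('/')
  if (idx === -1) throw new Error(`Expected "provider/model-name", got: ${modelId}`)
  return { provider: modelId.slice(0, idx), model: modelId.slice(idx + 1) }
}
```

Note how `openrouter/anthropic/claude-sonnet-4` resolves to the `openrouter` provider with upstream model `anthropic/claude-sonnet-4`, while `anthropic/claude-sonnet-4-20250514` resolves to the `anthropic` provider directly.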