Memory

The Lota SDK includes a durable memory system backed by SurrealDB with HNSW vector search and BM25 full-text search. Memory stores extracted facts from conversations and makes them available for future agent turns.

Architecture

User Message -> Agent Turn -> Fact Extraction -> Memory Store
                                                      |
                                                      v
                                               Vector Embeddings
                                                      |
                                                      v
Future Turn <- Retrieval Pipeline <- HNSW Search + BM25 + Reranking

Memory records are stored in the memory table in SurrealDB. Each record contains:

  • content -- the fact text
  • embedding -- 1536-dimensional vector (text-embedding-3-small by default)
  • hash -- content hash for deduplication (unique index)
  • scopeId -- namespace for organization or agent-scoped memories
  • memoryType -- classification of the memory
  • durability -- decay tier
  • importance -- 0-1 relevance score
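A memory record with these fields might be typed as below. The interface is a sketch inferred from the field list (the SDK's actual type definitions may differ), and the hash function shows one plausible way a normalized content hash for deduplication could be derived:

```typescript
import { createHash } from 'node:crypto'

// Hypothetical shape of a memory record; field names come from the docs,
// the TypeScript types are assumptions.
interface MemoryRecord {
  content: string
  embedding: number[] // 1536-dimensional vector
  hash: string // content hash, unique index for deduplication
  scopeId: string
  memoryType:
    | 'fact'
    | 'preference'
    | 'interaction'
    | 'summary'
    | 'user_request'
    | 'entity'
    | 'interest'
  durability: 'core' | 'standard' | 'ephemeral'
  importance: number // 0-1 relevance score
}

// Illustrative dedup hash: normalize whitespace/case, then SHA-256,
// so trivially different phrasings collapse to one record.
function contentHash(content: string): string {
  return createHash('sha256').update(content.trim().toLowerCase()).digest('hex')
}
```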

Memory Types

Type           Description
fact           Concrete business facts, decisions, confirmed truths
preference     User preferences and formatting choices
interaction    Conversational patterns
summary        Condensed summaries
user_request   Explicit user requests
entity         Named entities (people, products, companies)
interest       User interests and topics

Durability Tiers

Tier        Description             Use Case
core        No decay; permanent.    Business decisions, architecture, confirmed requirements
standard    Moderate decay rate.    General facts, inferences
ephemeral   High decay rate.        Preferences, one-off interactions
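The docs don't specify the decay formula, but the tiers can be pictured as exponential decay of the importance score with per-tier half-lives (the half-life values below are invented for illustration):

```typescript
type Durability = 'core' | 'standard' | 'ephemeral'

// Illustrative half-lives in days -- assumed values, not the SDK's.
const HALF_LIFE_DAYS: Record<Durability, number> = {
  core: Infinity, // no decay
  standard: 30, // moderate decay
  ephemeral: 3, // high decay
}

// Exponential decay of an importance score over elapsed days.
function decayedImportance(importance: number, durability: Durability, ageDays: number): number {
  const halfLife = HALF_LIFE_DAYS[durability]
  if (!isFinite(halfLife)) return importance // core memories never decay
  return importance * Math.pow(0.5, ageDays / halfLife)
}
```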

Scopes

Memories are namespaced by scope ID. Two scope patterns exist:

  • Organization scope: org:{orgId} -- shared facts about the organization.
  • Agent scope: org:{orgId}:agent:{agentName} -- agent-specific memories visible only to that agent.

When extracting facts from a conversation, the SDK writes simultaneously to both the organization scope and any relevant agent scopes.
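The two documented scope patterns are straightforward string templates; hypothetical helper functions (not part of the SDK's public API) make the construction explicit:

```typescript
// Shared facts about the organization.
function orgScope(orgId: string): string {
  return `org:${orgId}`
}

// Agent-specific memories visible only to that agent.
function agentScope(orgId: string, agentName: string): string {
  return `${orgScope(orgId)}:agent:${agentName}`
}
```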

Fact Extraction

After each chat turn, the memory pipeline:

  1. Builds a normalized conversation payload from the turn's messages.
  2. Assesses the conversation's memory importance using a helper model.
  3. If the conversation is classified as transient with ephemeral durability, extraction is skipped for that turn.
  4. Extracts structured facts with confidence scores and durability classifications.
  5. Classifies each fact against existing memories (new, supersedes, contradicts, enriches, duplicate).
  6. Writes new facts, updates superseded facts, and creates memoryRelation edges.
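The importance gate in step 3 and the classification outcomes in step 5 can be sketched as follows. The assessment shape and function names are assumptions standing in for the helper-model calls:

```typescript
// Classification outcomes named in step 5.
type FactStatus = 'new' | 'supersedes' | 'contradicts' | 'enriches' | 'duplicate'

// Hypothetical output of the step-2 importance assessment.
interface TurnAssessment {
  transient: boolean
  durability: 'core' | 'standard' | 'ephemeral'
}

// Step 3: skip extraction only when the turn is both transient
// and classified with ephemeral durability.
function shouldExtract(assessment: TurnAssessment): boolean {
  return !(assessment.transient && assessment.durability === 'ephemeral')
}
```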

Extraction happens asynchronously via the post-chat-memory queue for onboarding turns and the regular-chat-memory-digest queue for post-onboarding turns.

Search and Retrieval

The retrieval pipeline combines vector search and optional LLM-based reranking:

```ts
const { memoryService } = runtime.services

// Search organization memories
const orgResults = await memoryService.searchOrganizationMemories(orgId, query)

// Search agent-scoped memories
const agentResults = await memoryService.searchAgentMemories(orgId, agentName, query)

// Batched multi-scope search with reranking
const batchedResults = await memoryService.searchAllMemoriesBatched({
  orgId,
  agentName: 'cto',
  query: 'technical architecture decisions',
  fastMode: false,
})
```

The search flow:

  1. HNSW vector search retrieves candidate memories (default: 12 candidates for 6 results).
  2. If candidates exceed the result limit, an LLM reranker selects the most relevant items.
  3. Results are formatted as sectioned text for injection into agent context.
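The overfetch-then-rerank shape of steps 1-2 can be sketched as below. `vectorSearch` and `llmRerank` are stand-ins for SDK internals, passed as parameters here so the flow is self-contained; the 2x candidate factor mirrors the documented 12-for-6 default:

```typescript
// Fetch more candidates than needed, then rerank only when the
// candidate pool exceeds the requested result count.
function searchMemories(
  query: string,
  vectorSearch: (q: string, k: number) => string[],
  llmRerank: (q: string, items: string[], k: number) => string[],
  limit = 6,
  candidateFactor = 2, // default: 12 candidates for 6 results
): string[] {
  const candidates = vectorSearch(query, limit * candidateFactor)
  if (candidates.length <= limit) return candidates // no reranking needed
  return llmRerank(query, candidates, limit)
}
```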

Pre-seeded Memories

High-importance core memories are pre-loaded into agent context at the start of each turn without a query. These are fetched via getTopMemories() and formatted as a <pre-seeded-memories> section.
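The sectioned formatting could look like the sketch below; the exact markup the SDK emits inside the `<pre-seeded-memories>` tags is an assumption:

```typescript
// Format pre-seeded memories as a sectioned block for agent context.
// The bullet formatting inside the tags is illustrative.
function formatPreSeeded(memories: string[]): string {
  if (memories.length === 0) return ''
  const body = memories.map((m) => `- ${m}`).join('\n')
  return `<pre-seeded-memories>\n${body}\n</pre-seeded-memories>`
}
```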

Memory Blocks

In addition to the global memory system, each workstream has a local memory block -- a structured list of short-term notes accumulated during conversation. See Workstreams for details on memory block compaction.

Memory blocks are:

  • Per-workstream, not global
  • Role-labeled (e.g., chief: ..., cto: ...)
  • Compacted when entries exceed 15 items
  • Injected into agent context alongside the global memory retrieval
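The 15-item threshold suggests compaction along these lines. This is a sketch, not the Workstreams implementation: `summarize` stands in for whatever condensation step the SDK actually performs:

```typescript
const MAX_ENTRIES = 15

// When a memory block exceeds 15 entries, condense the oldest entries
// into a single summary and keep the most recent ones verbatim.
function compactBlock(entries: string[], summarize: (old: string[]) => string): string[] {
  if (entries.length <= MAX_ENTRIES) return entries
  const keep = entries.slice(-(MAX_ENTRIES - 1)) // most recent 14
  const summary = summarize(entries.slice(0, entries.length - keep.length))
  return [summary, ...keep]
}
```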

Consolidation

A recurring memory-consolidation job runs every 24 hours to maintain memory health:

  • Archives stale memories
  • Resolves contradictions
  • Updates importance scores based on access patterns
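The docs don't give the consolidation job's scoring rule; one plausible shape for the access-based importance update, with invented weights, is:

```typescript
// Illustrative importance refresh based on access patterns.
// The boost/penalty weights are assumptions, not the SDK's values.
function refreshImportance(importance: number, accessCount: number, daysSinceAccess: number): number {
  const boost = Math.min(0.2, accessCount * 0.02) // frequently accessed -> more important
  const penalty = daysSinceAccess > 30 ? 0.1 : 0 // long-unused -> less important
  return Math.max(0, Math.min(1, importance + boost - penalty))
}
```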

Memory Relations

The memoryRelation table stores semantic edges between memories:

Relation      Description
contradicts   New fact conflicts with existing memory
supports      New fact corroborates existing memory
supersedes    New fact replaces outdated memory
caused_by     Causal relationship
depends_on    Dependency relationship
part_of       Compositional relationship
implements    Implementation relationship

Relations are created during fact extraction and used during retrieval to expand context with related memories.
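Retrieval-time expansion over these edges could look like the single-hop sketch below (how many hops the SDK actually follows, and whether it filters by relation type, is not documented here):

```typescript
// An edge in the memoryRelation table, reduced to what expansion needs.
interface MemoryRelation {
  from: string
  to: string
  kind: string // e.g. 'supports', 'supersedes'
}

// Expand a set of retrieved memory IDs with directly related memories.
function expandWithRelations(hits: string[], relations: MemoryRelation[]): string[] {
  const seen = new Set(hits)
  for (const r of relations) {
    if (seen.has(r.from)) seen.add(r.to)
    if (seen.has(r.to)) seen.add(r.from)
  }
  return [...seen]
}
```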