Providers

Embeddings and LLM calls in AtomicMemory are pluggable providers behind small one- or two-method interfaces. You pick OpenAI, Anthropic, Google, Groq, Ollama, Voyage AI, a local WASM model, or any OpenAI-compatible endpoint at deploy time via environment variables. Nothing above the provider boundary changes: no code, no imports, no service wiring.

That is the second pillar of the platform layer: the services that call embedText() and chat() don't know, and can't know, which provider is serving the call. The selection is made once at composition-root time and erased behind an interface.

The two interfaces

Both provider families are tiny. That's deliberate: the less surface the interface has, the more backends can satisfy it, and the harder it is to leak provider-specific concepts into business logic.

`EmbeddingProvider`

From atomicmemory-core/src/services/embedding.ts:

export type EmbeddingTask = 'query' | 'document';

export interface EmbeddingProvider {
  embed(text: string, task: EmbeddingTask): Promise<number[]>;
  embedBatch(texts: string[], task: EmbeddingTask): Promise<number[][]>;
}

Two methods: one for single embeddings, one for batches. The task argument lets providers apply query/document-specific behavior without leaking provider details into ingest or search. Every backend (OpenAI REST, Ollama's native API, local WASM via transformers.js, Voyage AI, any OpenAI-compatible endpoint) satisfies exactly this shape.
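To make the size of that contract concrete, here is a minimal sketch of a deterministic test double that satisfies it. FakeEmbedding and its hashing scheme are hypothetical and are not part of atomicmemory-core; the only thing taken from the source is the interface shape.

```typescript
type EmbeddingTask = 'query' | 'document';

interface EmbeddingProvider {
  embed(text: string, task: EmbeddingTask): Promise<number[]>;
  embedBatch(texts: string[], task: EmbeddingTask): Promise<number[][]>;
}

// Hypothetical test double: folds character codes into a fixed-width,
// L2-normalized vector. Deterministic, offline, and semantically meaningless.
class FakeEmbedding implements EmbeddingProvider {
  private readonly dimensions: number;

  constructor(dimensions = 8) {
    this.dimensions = dimensions;
  }

  async embed(text: string, task: EmbeddingTask): Promise<number[]> {
    const vector = new Array<number>(this.dimensions).fill(0);
    // Mix the task into the input so query and document vectors can differ.
    const input = `${task}:${text}`;
    for (let i = 0; i < input.length; i++) {
      vector[i % this.dimensions] += input.charCodeAt(i);
    }
    const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
    return vector.map((v) => v / norm);
  }

  async embedBatch(texts: string[], task: EmbeddingTask): Promise<number[][]> {
    return Promise.all(texts.map((text) => this.embed(text, task)));
  }
}
```

Anything typed against EmbeddingProvider accepts this class unchanged, which is the practical payoff of keeping the surface this small.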

`LLMProvider`

From atomicmemory-core/src/services/llm.ts:60-74:

export interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface ChatOptions {
  temperature?: number;
  maxTokens?: number;
  jsonMode?: boolean;
  seed?: number;
}

export interface LLMProvider {
  chat(messages: ChatMessage[], options?: ChatOptions): Promise<string>;
}

One method: chat messages in, completion text out. Temperature, token limit, JSON mode, and seed are passed as options. Any backend that can't honor one of them (for instance, Anthropic ignores seed) degrades gracefully inside the adapter and never leaks the difference to callers.
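One way an adapter might implement that graceful degradation is to translate ChatOptions into request parameters and drop whatever the backend cannot honor. The sketch below is illustrative only: BackendCapabilities and toRequestParams are hypothetical names, and the output keys follow the OpenAI-style wire format (max_tokens, seed, response_format) rather than any code shown on this page.

```typescript
interface ChatOptions {
  temperature?: number;
  maxTokens?: number;
  jsonMode?: boolean;
  seed?: number;
}

// Hypothetical capability flags; an adapter would hard-code these for its backend.
interface BackendCapabilities {
  seed: boolean;
  jsonMode: boolean;
}

// Builds request parameters, omitting options the backend does not support
// instead of failing the call.
function toRequestParams(
  options: ChatOptions,
  capabilities: BackendCapabilities,
): Record<string, unknown> {
  const params: Record<string, unknown> = {};
  if (options.temperature !== undefined) params.temperature = options.temperature;
  if (options.maxTokens !== undefined) params.max_tokens = options.maxTokens;
  if (options.seed !== undefined && capabilities.seed) params.seed = options.seed;
  if (options.jsonMode && capabilities.jsonMode) {
    params.response_format = { type: 'json_object' };
  }
  return params;
}
```

The caller passes the same options object everywhere; only the adapter knows which fields survive the trip.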

The supported providers

Embeddings

Provider name (`EMBEDDING_PROVIDER=`)	Backend
`openai`	OpenAI embeddings REST API
`ollama`	Ollama native `/api/embed` endpoint
`openai-compatible`	Any OpenAI-schema endpoint (LM Studio, vLLM, TGI, …)
`transformers`	Local WASM via `@huggingface/transformers` + ONNX Runtime
`voyage`	Voyage AI embeddings with compatible document/query model pairs

Declared in config.ts:14:

export type EmbeddingProviderName =
  'openai' | 'ollama' | 'openai-compatible' | 'transformers' | 'voyage';

LLM

Provider name (`LLM_PROVIDER=`)	Backend
`openai`	OpenAI chat completions
`anthropic`	Anthropic Messages API
`google-genai`	Google Gemini via OpenAI-compatible endpoint
`groq`	Groq via OpenAI-compatible endpoint
`ollama`	Ollama native `/api/chat` endpoint
`openai-compatible`	Any OpenAI-schema endpoint (LM Studio, vLLM, …)

Declared in config.ts:15:

export type LLMProviderName =
  EmbeddingProviderName | 'groq' | 'anthropic' | 'google-genai';

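Note the relationship between the two unions: LLMProviderName is built from EmbeddingProviderName and then adds the chat-only providers, so every embedding provider name is also a valid LLMProviderName at the type level. Embedding-only providers such as transformers and voyage have no chat backend, though. The LLM factory has no case for them, so selecting one as LLM_PROVIDER falls through to the default branch and throws.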
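If you want that mismatch caught when configuration is parsed rather than inside the factory, one option is a narrower type that excludes the embedding-only names. This is a sketch under assumptions: ChatCapableProviderName and parseLLMProvider are hypothetical and do not exist in config.ts.

```typescript
type EmbeddingProviderName =
  'openai' | 'ollama' | 'openai-compatible' | 'transformers' | 'voyage';

type LLMProviderName =
  EmbeddingProviderName | 'groq' | 'anthropic' | 'google-genai';

// Hypothetical narrower type: only the names that have a chat backend.
type ChatCapableProviderName = Exclude<LLMProviderName, 'transformers' | 'voyage'>;

const CHAT_CAPABLE: readonly ChatCapableProviderName[] = [
  'openai', 'ollama', 'openai-compatible', 'groq', 'anthropic', 'google-genai',
];

// Validates a raw LLM_PROVIDER value before it reaches any factory.
function parseLLMProvider(raw: string | undefined): ChatCapableProviderName {
  const match = CHAT_CAPABLE.find((candidate) => candidate === raw);
  if (!match) {
    throw new Error(`Unsupported LLM_PROVIDER: ${raw ?? '(unset)'}`);
  }
  return match;
}
```

With this in place, a value such as voyage is rejected with a message that names the environment variable, instead of surfacing later as an unknown-provider error.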

The factory is the only place the provider name is visible

This is the crux of the provider-agnostic boundary. Code above embedding.ts and llm.ts never matches on provider name. The switch happens once, inside each factory, and is erased the moment the provider is returned.

From embedding.ts:

function createEmbeddingProvider(): EmbeddingProvider {
  const config = requireConfig();
  switch (config.embeddingProvider) {
    case 'openai':
      return new OpenAICompatibleEmbedding(
        config.openaiApiKey, config.embeddingModel,
        undefined, config.embeddingDimensions,
      );
    case 'ollama':
      return new OllamaEmbedding(
        config.embeddingModel, config.ollamaBaseUrl,
      );
    case 'openai-compatible':
      return new OpenAICompatibleEmbedding(
        config.embeddingApiKey ?? config.openaiApiKey,
        config.embeddingModel,
        config.embeddingApiUrl,
        config.embeddingDimensions,
      );
    case 'transformers':
      return new TransformersEmbedding(config.embeddingModel);
    case 'voyage':
      if (!config.voyageApiKey) {
        throw new Error('VOYAGE_API_KEY is required when EMBEDDING_PROVIDER=voyage');
      }
      return new VoyageEmbedding(
        config,
        config.voyageApiKey,
        config.voyageDocumentModel,
        config.voyageQueryModel,
        config.embeddingDimensions,
      );
    default:
      throw new Error(
        `Unknown embedding provider: ${config.embeddingProvider}`,
      );
  }
}

And the LLM factory, from llm.ts:259-289:

export function createLLMProvider(): LLMProvider {
  const config = requireConfig();
  switch (config.llmProvider) {
    case 'openai':
      return new OpenAICompatibleLLM(config.openaiApiKey, config.llmModel);
    case 'ollama':
      return new OllamaLLM(config.llmModel, config.ollamaBaseUrl);
    case 'groq':
      return new OpenAICompatibleLLM(
        config.groqApiKey ?? '',
        config.llmModel,
        'https://api.groq.com/openai/v1',
      );
    case 'anthropic':
      return new AnthropicLLM(config.anthropicApiKey ?? '', config.llmModel);
    case 'google-genai':
      return new OpenAICompatibleLLM(
        config.googleApiKey ?? '',
        config.llmModel,
        'https://generativelanguage.googleapis.com/v1beta/openai/',
      );
    case 'openai-compatible':
      return new OpenAICompatibleLLM(
        config.llmApiKey ?? config.openaiApiKey,
        config.llmModel,
        config.llmApiUrl,
      );
    default:
      throw new Error(`Unknown LLM provider: ${config.llmProvider}`);
  }
}

Two things to notice:

Three providers besides OpenAI itself (Groq, Google Gemini, OpenAI-compatible) reuse OpenAICompatibleLLM. The OpenAI wire format is the de facto industry default, so the adapter is written once and pointed at different base URLs. That's what "openai-compatible" costs us: nothing.
Anthropic gets its own adapter because the Messages API has a different message shape (the system prompt is a top-level field, separate from the user/assistant messages). The adapter normalizes it to chat(messages, options), and callers never see the difference.
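The normalization described in the second point can be sketched as a pure function. This is not the AnthropicLLM source: toAnthropicShape and AnthropicStyleRequest are hypothetical names, and joining multiple system messages with a blank line is an assumption made for the example.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Shape assumed for illustration: system text is top-level, turns are user/assistant only.
interface AnthropicStyleRequest {
  system?: string;
  messages: { role: 'user' | 'assistant'; content: string }[];
}

// Splits system messages out of the conversation and keeps the remaining turns in order.
function toAnthropicShape(messages: ChatMessage[]): AnthropicStyleRequest {
  const systemParts: string[] = [];
  const turns: AnthropicStyleRequest['messages'] = [];
  for (const message of messages) {
    if (message.role === 'system') {
      systemParts.push(message.content);
    } else {
      turns.push({ role: message.role, content: message.content });
    }
  }
  return systemParts.length > 0
    ? { system: systemParts.join('\n\n'), messages: turns }
    : { messages: turns };
}
```

Callers keep building a flat ChatMessage array; only the adapter knows that one backend wants the system prompt lifted out.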

Changing provider with zero code change

This is the headline. Here's the ingest pipeline calling embedText:

import { embedText } from './services/embedding.js';

// Inside the ingest service, no provider name anywhere.
const embedding = await embedText(userMessage, 'document');
await stores.memory.storeMemory({
  userId, content: userMessage, embedding, importance, sourceSite,
});

The same call site runs against OpenAI:

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=sk-…

Or against a local Ollama:

EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=snowflake-arctic-embed2
OLLAMA_BASE_URL=http://localhost:11434

Or against fully-local WASM with zero network:

EMBEDDING_PROVIDER=transformers
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2

Or against an OpenAI-compatible server (LM Studio, vLLM, TGI, a corporate proxy):

EMBEDDING_PROVIDER=openai-compatible
EMBEDDING_MODEL=bge-large-en-v1.5
EMBEDDING_API_URL=http://internal-embed.corp:8080/v1
EMBEDDING_API_KEY=…   # optional

Or against Voyage AI with compatible document/query models:

EMBEDDING_PROVIDER=voyage
EMBEDDING_DIMENSIONS=1024
VOYAGE_DOCUMENT_MODEL=voyage-4-large
VOYAGE_QUERY_MODEL=voyage-4-lite
VOYAGE_API_KEY=pa-…

The ingest service, the search service, the AUDN decision loop, the repair loop: none of them has a single if (provider === 'ollama') branch. That is the provider-agnostic boundary working as designed.
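The pattern can be reproduced in miniature. In the sketch below everything is hypothetical (backendFactories, bindBackend, the two fake backends, and the simplified ingest function); it only illustrates how a call site stays identical while the bound backend changes.

```typescript
type EmbeddingTask = 'query' | 'document';

interface EmbeddingProvider {
  embed(text: string, task: EmbeddingTask): Promise<number[]>;
}

// Hypothetical stand-in backends; real adapters would call an API or a local model.
const backendFactories = new Map<string, () => EmbeddingProvider>([
  ['fake-a', () => ({ embed: async (text: string) => [text.length, 0] })],
  ['fake-b', () => ({ embed: async (text: string) => [0, text.length] })],
]);

let activeProvider: EmbeddingProvider | null = null;

// Stands in for the composition root: the only place a backend name appears.
function bindBackend(backendName: string): void {
  const factory = backendFactories.get(backendName);
  if (!factory) throw new Error(`Unknown embedding provider: ${backendName}`);
  activeProvider = factory();
}

async function embedText(text: string, task: EmbeddingTask): Promise<number[]> {
  if (!activeProvider) throw new Error('Embedding provider not initialized');
  return activeProvider.embed(text, task);
}

// The call site: no backend name, no branching.
async function ingest(content: string): Promise<{ content: string; embedding: number[] }> {
  return { content, embedding: await embedText(content, 'document') };
}
```

Binding fake-a or fake-b changes the vectors ingest produces, but not a single line of ingest itself.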

Provider quirks stay inside the adapter

Being agnostic doesn't mean being naïve. Real embedding models have real quirks, and the adapter layer is where those quirks live, never above. Two examples from the shipped code:

Instruction prefixes

Some embedding models (mxbai, nomic) need task-specific prefixes on query text but not document text. The provider-agnostic embedText function handles that before dispatch. From embedding.ts:292-308:

function getInstructionPrefix(model: string, task: EmbeddingTask): string {
  if (task === 'document') return '';

  if (model.includes('mxbai-embed-large')) {
    return 'Represent this sentence for searching relevant passages: ';
  }
  if (model.includes('nomic-embed-text')) {
    return 'search_query: ';
  }
  return '';
}

Callers pass 'query' or 'document' as a semantic tag. The prefix (if any) is model-specific, and the logic lives in one place.
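To see what that means for the text a model receives, here is a small self-contained sketch. getInstructionPrefix repeats the logic of the excerpt above; withInstructionPrefix is a hypothetical helper added for illustration and may not match how embedText applies the prefix internally.

```typescript
type EmbeddingTask = 'query' | 'document';

// Same logic as the excerpt above, repeated so the sketch runs on its own.
function getInstructionPrefix(model: string, task: EmbeddingTask): string {
  if (task === 'document') return '';
  if (model.includes('mxbai-embed-large')) {
    return 'Represent this sentence for searching relevant passages: ';
  }
  if (model.includes('nomic-embed-text')) {
    return 'search_query: ';
  }
  return '';
}

// Hypothetical helper: the text that would be handed to the provider.
function withInstructionPrefix(model: string, text: string, task: EmbeddingTask): string {
  return getInstructionPrefix(model, task) + text;
}
```

For example, withInstructionPrefix('nomic-embed-text', 'cats', 'query') yields 'search_query: cats', while the same call with 'document' returns 'cats' unchanged, as does any model without a known prefix.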

ONNX Runtime serialization (WASM provider)

The local WASM provider has a known concurrency issue: ONNX Runtime's mutex can be corrupted by concurrent async calls. Rather than leak a "don't call concurrently" caveat into every consumer, the adapter serializes internally. From embedding.ts:168-218:

class TransformersEmbedding implements EmbeddingProvider {
  private model: string;
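  private pipelinePromise: Promise<TransformersPipeline> | null = null;
  // Tail of the inference chain. Each call appends itself here, so runs never overlap.
  private inferenceQueue: Promise<void> = Promise.resolve();

  private serialized<T>(fn: (extractor: TransformersPipeline) => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      // Chain onto the previous call. Errors are routed to this caller's promise,
      // so the queue itself never rejects and later calls still run.
      this.inferenceQueue = this.inferenceQueue.then(async () => {
        try {
          const extractor = await this.getPipeline();
          resolve(await fn(extractor));
        } catch (err) {
          reject(err);
        }
      });
    });
  }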

  async embed(text: string): Promise<number[]> {
    return this.serialized(async (extractor) => {
      const output = await extractor(text, { pooling: 'mean', normalize: true });
      return Array.from(output.data as Float32Array);
    });
  }

  // embedBatch follows the same pattern.
}

The rule: every provider-specific workaround lives inside the adapter. The EmbeddingProvider interface is the same shape for every backend.
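The serialization technique is independent of ONNX and can be tried in isolation. The sketch below is a variation on the adapter's approach (it chains with then rather than wrapping in new Promise); SerialQueue and demoSerialization are hypothetical names used only here.

```typescript
// Runs async functions strictly one at a time, in submission order.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  run<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Swallow the outcome on the internal chain so one failure cannot block later work;
    // the caller still sees the rejection through `result`.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Submits a slow task and then a fast one concurrently and records what happened.
async function demoSerialization(): Promise<{ order: string[]; maxInFlight: number }> {
  const queue = new SerialQueue();
  const order: string[] = [];
  let inFlight = 0;
  let maxInFlight = 0;

  const task = (label: string, ms: number) =>
    queue.run(async () => {
      inFlight++;
      maxInFlight = Math.max(maxInFlight, inFlight);
      await sleep(ms);
      order.push(label);
      inFlight--;
      return label;
    });

  await Promise.all([task('slow', 20), task('fast', 1)]);
  return { order, maxInFlight };
}
```

Even though both tasks are submitted at once, the fast one does not start until the slow one has finished, so at most one task is ever in flight. That is the guarantee the WASM adapter needs from its queue.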

Cost telemetry is cross-cutting

Every provider adapter (OpenAI, Anthropic, Ollama, Google, Groq) calls writeCostEvent() with the same shape after each request. That gives you one cost log across heterogeneous backends, keyed by provider, model, and stage. You can swap models and still see apples-to-apples cost data in a single stream.

See the recordOpenAICost helper in llm.ts:147-164 for the OpenAI-compatible path; every other adapter writes the same event shape inline.
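As an illustration of what a uniform event shape buys you, the sketch below aggregates a mixed stream by provider, model, and stage. The CostEvent fields and the totalsByKey function are assumptions made for the example; the real payload of writeCostEvent() is not shown on this page and may differ.

```typescript
// Hypothetical event shape: only the three key fields come from the text above;
// costUsd is an assumed name for the amount.
interface CostEvent {
  provider: string;
  model: string;
  stage: string;
  costUsd: number;
}

// Sums cost per provider/model/stage, regardless of which adapter wrote the event.
function totalsByKey(events: CostEvent[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const event of events) {
    const key = `${event.provider}/${event.model}/${event.stage}`;
    totals.set(key, (totals.get(key) ?? 0) + event.costUsd);
  }
  return totals;
}
```

Because every adapter emits the same fields, a single function like this can compare a hosted model and a local one without any per-provider parsing.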

Writing your own provider

Because EmbeddingProvider and LLMProvider are interfaces, not base classes, adding a new backend is mechanical:

Implement the one-or-two-method interface.
Add a case to the factory switch.
Extend EmbeddingProviderName or LLMProviderName in config.ts.
Wire any required API keys or model names into RuntimeConfig.
Include all provider-specific identity fields in the cache/provider key.

There are no base classes to extend, no lifecycle hooks to implement, no plugin registry to register with. The provider layer is as small as the EmbeddingProvider and LLMProvider signatures say it is.
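The first step can be sketched for an LLM backend as follows. ExampleLLM and the SendCompletion transport are hypothetical; the transport is injected so the sketch runs without a network, whereas a real adapter would wrap an HTTP client or SDK. The remaining steps (a factory case, a new name in the union, config wiring, cache key fields) are edits to existing files and are not shown.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatOptions {
  temperature?: number;
  maxTokens?: number;
  jsonMode?: boolean;
  seed?: number;
}

interface LLMProvider {
  chat(messages: ChatMessage[], options?: ChatOptions): Promise<string>;
}

// Hypothetical transport: whatever actually talks to the backend.
type SendCompletion = (request: {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  maxTokens?: number;
}) => Promise<{ text: string }>;

class ExampleLLM implements LLMProvider {
  private readonly model: string;
  private readonly send: SendCompletion;

  constructor(model: string, send: SendCompletion) {
    this.model = model;
    this.send = send;
  }

  async chat(messages: ChatMessage[], options: ChatOptions = {}): Promise<string> {
    // jsonMode and seed are dropped here: this imaginary backend does not support them.
    const reply = await this.send({
      model: this.model,
      messages,
      temperature: options.temperature,
      maxTokens: options.maxTokens,
    });
    return reply.text;
  }
}
```

The class satisfies LLMProvider structurally, so a factory can return it wherever an LLMProvider is expected without any registration step.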

Startup-only selection

Provider and model selection is composition-time by design. Server deployments normally bind that config from env at startup. In-process harnesses can instead pass a full RuntimeConfig to createCoreRuntime({ pool, config }) when they need an isolated benchmark run with a different embedding stack.

The modules hold their config as module-local state, bound by the composition root. From embedding.ts:

export function initEmbedding(config: EmbeddingConfig): void {
  embeddingConfig = config;
  provider = null;
  providerKey = '';
  embeddingCache.clear();
}

That's deliberate. Hot-swapping embedding providers inside an already-running runtime would invalidate the embedding cache, invalidate pgvector index assumptions, and potentially mix embedding widths in the same table. We sidestep all of that by making provider selection fixed for a runtime. Benchmark harnesses should create a fresh isolated runtime or process for each embedding configuration.
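To picture why cached vectors cannot outlive a configuration change, consider a cache that derives a key from every identity field of the embedding stack and empties itself when that key changes. This is an illustrative sketch, not the embedding.ts implementation: EmbeddingIdentity, buildProviderKey, and KeyedEmbeddingCache are hypothetical, and the shipped initEmbedding simply clears the cache unconditionally.

```typescript
// Hypothetical identity: every field that changes which vectors a text maps to.
interface EmbeddingIdentity {
  provider: string;
  model: string;
  dimensions?: number;
  apiUrl?: string;
}

// Serializes the identity into a stable string key.
function buildProviderKey(identity: EmbeddingIdentity): string {
  return JSON.stringify([
    identity.provider,
    identity.model,
    identity.dimensions ?? null,
    identity.apiUrl ?? null,
  ]);
}

class KeyedEmbeddingCache {
  private providerKey = '';
  private readonly vectors = new Map<string, number[]>();

  // Binding a different identity drops every cached vector.
  bind(identity: EmbeddingIdentity): void {
    const nextKey = buildProviderKey(identity);
    if (nextKey !== this.providerKey) {
      this.providerKey = nextKey;
      this.vectors.clear();
    }
  }

  get(text: string): number[] | undefined {
    return this.vectors.get(text);
  }

  set(text: string, vector: number[]): void {
    this.vectors.set(text, vector);
  }
}
```

Leaving a field such as the model or the dimension count out of the key would let vectors from one embedding space be served for another, which is exactly the mixing the startup-only rule is meant to prevent.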

Naming

This page is about embedding and LLM providers inside the engine: OpenAI, Ollama, Anthropic, and so on. The SDK has a separate concept called memory providers (MemoryProvider): the interface a memory backend implements so the SDK can route through it. Different layer, different concept.

Stores: the other half of the platform layer, pluggable storage behind narrow interfaces.
Composition: how providers, stores, and services are wired together at startup.