Weaveflow

LLM Providers

The "brain" is swappable. The framework core depends only on the LLMAdapter abstraction, so no vendor SDK API leaks into your agent code (Dependency Inversion).

Selecting a brain

Pass a "provider:model" string anywhere a brain is accepted (@agent(llm=...), Pipeline(llm=...), LocalRunner(llm=...), or create_adapter(...)).

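import asyncio
from weaveflow import create_adapter

async def main():
    brain = create_adapter("anthropic:claude-opus-4-8")
    return await brain.complete("Hello", system="Be brief.")

text = asyncio.run(main())  # complete() is a coroutine, so it needs an event loop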

Supported providers

Spec prefix   Backend                            Install extra          API key env
openai:       OpenAI Chat Completions            weaveflow[openai]      OPENAI_API_KEY
anthropic:    Anthropic Messages                 weaveflow[anthropic]   ANTHROPIC_API_KEY
google:       Google Gemini                      weaveflow[google]      GOOGLE_API_KEY / GEMINI_API_KEY
mistral:      Mistral (OpenAI-compatible)        weaveflow[mistral]     MISTRAL_API_KEY
deepseek:     DeepSeek (OpenAI-compatible)       weaveflow[deepseek]    DEEPSEEK_API_KEY
ollama:       Local Ollama (OpenAI-compatible)   weaveflow[ollama]      none (local)

Examples: "openai:gpt-4o", "anthropic:claude-opus-4-8", "google:gemini-1.5-pro", "mistral:mistral-large-latest", "deepseek:deepseek-chat", "ollama:llama3". Ollama reads OLLAMA_HOST (default http://localhost:11434/v1).

Provider SDKs are imported lazily. A missing one raises an actionable AdapterNotInstalledError telling you which extra to install.
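
As an illustration of the lazy-import pattern described above, here is a minimal sketch. It is not Weaveflow's actual code: the `load_sdk` helper, the local `AdapterNotInstalledError` stand-in, and the wording of the message (including the `pip install` hint) are all assumptions made for the example.

```python
import importlib


class AdapterNotInstalledError(ImportError):
    """Stand-in for the framework's error; the real class may differ."""


def load_sdk(module_name: str, extra: str):
    """Import a provider SDK on first use instead of at framework import time."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Turn the bare import failure into a message that names the extra.
        raise AdapterNotInstalledError(
            f"Provider SDK '{module_name}' is not installed. "
            f"Install it with: pip install 'weaveflow[{extra}]'"
        ) from exc


# A module that exists imports normally; a missing one yields the hint.
json_mod = load_sdk("json", extra="none")
try:
    load_sdk("sdk_that_does_not_exist", extra="openai")
except AdapterNotInstalledError as err:
    message = str(err)
```

Because the import happens inside the helper, installing only one extra is enough to use that provider; the other SDKs are never touched.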

Timeouts & retries

Every adapter call is bounded by a timeout (default 30 s) and retries transient failures with exponential backoff plus jitter, up to max_retries times (default 2). Deterministic framework errors (e.g. a missing SDK or API key) are surfaced immediately and never retried. The provider client is built once and reused, so connections are pooled across calls.

from weaveflow import create_adapter

brain = create_adapter("openai:gpt-4o", timeout=10, max_retries=3)
# pass the configured adapter straight to an agent or pipeline:
#   @agent(..., llm=brain)   ·   Pipeline([...], llm=brain)

After the retry budget is exhausted, the last failure is normalized into an AdapterError with the attempt count and underlying cause in its detail.
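
The retry behaviour can be pictured with the sketch below. It is a simplified model under stated assumptions, not the framework's implementation: `call_with_retries`, `TransientError`, `base_delay`, and the shape of the `detail` dict are hypothetical, the timeout is assumed to apply per attempt, and the jitter strategy (a uniform draw up to the exponential cap) is one common choice among several.

```python
import asyncio
import random


class AdapterError(Exception):
    """Stand-in for the framework's error type; the fields are assumptions."""

    def __init__(self, message, *, attempts, cause):
        super().__init__(message)
        self.detail = {"attempts": attempts, "cause": repr(cause)}


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a rate limit)."""


async def call_with_retries(fn, *, timeout=30.0, max_retries=2, base_delay=0.5):
    attempts = max_retries + 1  # the first try plus the retries
    last_exc = None
    for attempt in range(attempts):
        try:
            # Bound each attempt by the timeout.
            return await asyncio.wait_for(fn(), timeout=timeout)
        except (TransientError, asyncio.TimeoutError) as exc:
            last_exc = exc
            if attempt < attempts - 1:
                # Exponential cap (base * 2^attempt) with a random draw below it.
                await asyncio.sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise AdapterError(
        "retry budget exhausted", attempts=attempts, cause=last_exc
    ) from last_exc


# Demo: a call that fails twice, then succeeds on the third attempt.
state = {"calls": 0}


async def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("rate limited")
    return "ok"


result = asyncio.run(call_with_retries(flaky, max_retries=3, base_delay=0.01))
```

Any exception other than the two caught types propagates on the first occurrence, which mirrors the rule that deterministic errors are never retried.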

The adapter contract

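# Shape of the contract; import the real class with: from weaveflow import LLMAdapter
from abc import ABC
from collections.abc import AsyncIterator

class LLMAdapter(ABC):
    async def complete(self, prompt: str, *, system: str | None = None, **opts) -> str: ...
    def stream(self, prompt: str, *, system: str | None = None, **opts) -> AsyncIterator[str]: ...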
complete returns a single string; stream yields tokens. Provider errors are normalized into AdapterError so failures are uniform.

Bring your own provider

Implement the contract and register it, with no edits to existing dispatch logic (Open/Closed):

from weaveflow import LLMAdapter, register_provider, create_adapter

class EchoAdapter(LLMAdapter):
    async def complete(self, prompt, *, system=None, **opts):
        return prompt
    async def stream(self, prompt, *, system=None, **opts):
        for token in prompt.split():
            yield token

register_provider("echo", EchoAdapter)
brain = create_adapter("echo:any-model")
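
To show why registering a provider needs no change to dispatch code, here is a toy registry. It is a sketch of the general pattern rather than Weaveflow's internals: the names `PROVIDERS`, `register`, and `create` are hypothetical, and passing the model as a `model=` keyword to the factory is an assumption.

```python
from typing import Callable

# Maps a spec prefix (for example "echo") to a factory that builds an adapter.
PROVIDERS: dict[str, Callable[..., object]] = {}


def register(prefix: str, factory: Callable[..., object]) -> None:
    """Adding a provider only adds a dict entry; lookup code stays unchanged."""
    PROVIDERS[prefix] = factory


def create(spec: str, **options):
    # Split on the first colon only, so model names may themselves contain colons.
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    try:
        factory = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider {provider!r}") from None
    return factory(model=model, **options)


class ToyEchoAdapter:
    def __init__(self, model: str, **options):
        self.model = model


register("echo", ToyEchoAdapter)
adapter = create("echo:llama3:8b")
```

The lookup is a plain dictionary access, so a new backend extends the table without touching `create`.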

You can also pass an already-constructed adapter instance anywhere a spec string is accepted. This is useful for tests and custom configuration:

Pipeline([my_agent], llm=EchoAdapter(model="x"))

Streaming

stream is an async generator, ideal for the stream data type and long-form output:

async for token in ctx.stream("Write a haiku about ports"):
    print(token, end="", flush=True)
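
For readers new to async generators, the following self-contained sketch shows the consumption pattern outside the framework. `fake_stream` and `collect` are hypothetical stand-ins; a real adapter would yield model output instead of splitting the prompt.

```python
import asyncio


async def fake_stream(prompt: str):
    """Stand-in async generator; a real adapter would yield model tokens."""
    for token in prompt.split():
        await asyncio.sleep(0)  # hand control back to the loop, as a network read would
        yield token


async def collect(prompt: str) -> str:
    parts = []
    # Calling an async generator function is not awaited; it is iterated.
    async for token in fake_stream(prompt):
        parts.append(token)
    return " ".join(parts)


text = asyncio.run(collect("ports open at dawn"))
```

Tokens become available one at a time, so a caller can display or forward each piece before the full response has arrived.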