Metadata-Version: 2.4
Name: astro-anchor
Version: 0.1.1
Summary: Context engineering toolkit for AI applications
Keywords: context-engineering,rag,llm,ai,retrieval,memory
Author: Astro
Author-email: Astro <astro@anchor.dev>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Dist: pydantic>=2.8,<3
Requires-Dist: anthropic>=0.40,<1 ; extra == 'agents'
Requires-Dist: astro-anchor[bm25,cli,anthropic,tiktoken,agents,pdf,flashrank,otlp] ; extra == 'all'
Requires-Dist: anthropic>=0.40,<1 ; extra == 'anthropic'
Requires-Dist: rank-bm25>=0.2.2,<1 ; extra == 'bm25'
Requires-Dist: typer>=0.12,<1 ; extra == 'cli'
Requires-Dist: rich>=13,<14 ; extra == 'cli'
Requires-Dist: mkdocs>=1.6,<2 ; extra == 'docs'
Requires-Dist: mkdocs-material>=9,<10 ; extra == 'docs'
Requires-Dist: mkdocs-minify-plugin>=0.8,<1 ; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.25,<1 ; extra == 'docs'
Requires-Dist: flashrank>=0.2,<1 ; extra == 'flashrank'
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20,<2 ; extra == 'otlp'
Requires-Dist: opentelemetry-sdk>=1.20,<2 ; extra == 'otlp'
Requires-Dist: pypdf>=4,<5 ; extra == 'pdf'
Requires-Dist: tiktoken>=0.7,<1 ; extra == 'tiktoken'
Requires-Python: >=3.11
Project-URL: Bug Tracker, https://github.com/artcgranja/anchor/issues
Project-URL: Documentation, https://artcgranja.github.io/anchor
Project-URL: Homepage, https://github.com/artcgranja/anchor
Project-URL: Repository, https://github.com/artcgranja/anchor
Provides-Extra: agents
Provides-Extra: all
Provides-Extra: anthropic
Provides-Extra: bm25
Provides-Extra: cli
Provides-Extra: docs
Provides-Extra: flashrank
Provides-Extra: otlp
Provides-Extra: pdf
Provides-Extra: tiktoken
Description-Content-Type: text/markdown

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/artcgranja/anchor/main/docs/docs/assets/logo-dark.svg">
    <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/artcgranja/anchor/main/docs/docs/assets/logo-light.svg">
    <img src="https://raw.githubusercontent.com/artcgranja/anchor/main/docs/docs/assets/logo-light.svg" alt="anchor" width="280">
  </picture>
</p>

<p align="center">
  <strong>Context engineering toolkit for AI applications.</strong>
</p>

<p align="center">
  <a href="https://artcgranja.github.io/anchor/">Docs</a> ·
  <a href="https://artcgranja.github.io/anchor/getting-started/quickstart/">Quickstart</a> ·
  <a href="https://artcgranja.github.io/anchor/cookbook/">Cookbook</a> ·
  <a href="https://artcgranja.github.io/anchor/api/">API Reference</a> ·
  <a href="https://github.com/artcgranja/anchor/issues">Issues</a>
</p>

<p align="center">
  <a href="https://pypi.org/project/astro-anchor/"><img src="https://img.shields.io/pypi/v/astro-anchor?color=3b82f6&style=flat-square" alt="PyPI"></a>
  <a href="https://pypi.org/project/astro-anchor/"><img src="https://img.shields.io/pypi/dm/astro-anchor?color=64748b&style=flat-square" alt="Downloads"></a>
  <a href="https://pypi.org/project/astro-anchor/"><img src="https://img.shields.io/pypi/pyversions/astro-anchor?color=6B8E6B&style=flat-square" alt="Python"></a>
  <a href="https://github.com/artcgranja/anchor/blob/main/LICENSE"><img src="https://img.shields.io/github/license/artcgranja/anchor?color=64748b&style=flat-square" alt="License"></a>
  <a href="https://github.com/artcgranja/anchor"><img src="https://img.shields.io/github/stars/artcgranja/anchor?style=flat-square&color=3b82f6" alt="Stars"></a>
</p>

---

Stop duct-taping RAG, memory, and tools together. **anchor** gives you a single, token-aware pipeline that assembles context for any LLM — with smart budgets, hybrid retrieval, and provider-agnostic formatting out of the box.

## Why anchor?

| Feature | LangChain | LlamaIndex | mem0 | **anchor** |
|---------|:---------:|:----------:|:----:|:----------:|
| Hybrid RAG (Dense + BM25 + RRF) | partial | yes | no | **yes** |
| Token-aware Memory | partial | no | yes | **yes** |
| Token Budget Management | no | no | no | **yes** |
| Provider-agnostic Formatting | no | no | no | **yes** |
| Protocol-based Plugins (PEP 544) | no | partial | no | **yes** |
| Zero-config Defaults | no | no | yes | **yes** |
| 2000+ tests, 94% coverage | — | — | — | **yes** |

## Quick Install

```bash
pip install astro-anchor
```

<details>
<summary>Optional extras</summary>

```bash
pip install astro-anchor[bm25]        # BM25 sparse retrieval
pip install astro-anchor[anthropic]   # Anthropic Claude support
pip install astro-anchor[cli]         # CLI tools (typer + rich)
pip install astro-anchor[flashrank]   # FlashRank reranking
pip install astro-anchor[otlp]        # OpenTelemetry tracing
pip install astro-anchor[all]         # Everything
```

</details>

## 30 Seconds to Your First Pipeline

```python
from anchor import ContextPipeline, MemoryManager, AnthropicFormatter

pipeline = (
    ContextPipeline(max_tokens=8192)
    .with_memory(MemoryManager(conversation_tokens=4096))
    .with_formatter(AnthropicFormatter())
    .add_system_prompt("You are a helpful assistant.")
)

result = pipeline.build("What is context engineering?")
print(result.formatted_output)   # Ready for Claude API
print(result.diagnostics)        # Token usage, timing, overflow info
```

## Features

- **Hybrid RAG** — Dense embeddings + BM25 sparse retrieval with Reciprocal Rank Fusion
- **Smart Memory** — Token-aware sliding window with automatic eviction
- **Token Budgets** — Priority-ranked context assembly that never exceeds your window
- **Provider Agnostic** — Format output for Anthropic, OpenAI, or plain text
- **Protocol-Based** — Plug in any vector store, tokenizer, or retriever via PEP 544 Protocols
- **Type-Safe** — Pydantic v2 models throughout, full `py.typed` support
- **Agent Framework** — Build tool-calling agents with `@tool` decorator and skills
- **Full Observability** — OpenTelemetry tracing, cost tracking, per-step diagnostics

<details>
<summary><strong>Hybrid Retrieval Example</strong></summary>

```python
import math
from anchor import (
    ContextPipeline, ContextItem, QueryBundle, SourceType,
    DenseRetriever, InMemoryContextStore, InMemoryVectorStore,
    retriever_step,
)

def my_embed_fn(text: str) -> list[float]:
    seed = sum(ord(c) for c in text) % 10000
    raw = [math.sin(seed * 1000 + i) for i in range(64)]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm else raw

dense = DenseRetriever(
    vector_store=InMemoryVectorStore(),
    context_store=InMemoryContextStore(),
    embed_fn=my_embed_fn,
)

items = [
    ContextItem(content="Python is great for data science.", source=SourceType.RETRIEVAL),
    ContextItem(content="RAG combines retrieval with generation.", source=SourceType.RETRIEVAL),
]
dense.index(items)

pipeline = ContextPipeline(max_tokens=8192).add_step(retriever_step("search", dense, top_k=5))
query = QueryBundle(query_str="How does RAG work?", embedding=my_embed_fn("How does RAG work?"))
result = pipeline.build(query)
```

</details>

<details>
<summary><strong>Memory Management Example</strong></summary>

```python
from anchor import ContextPipeline, QueryBundle, MemoryManager

memory = MemoryManager(conversation_tokens=4096)
memory.add_user_message("Help me migrate from MySQL to Postgres")
memory.add_assistant_message("Sure! What MySQL version are you using?")
memory.add_user_message("MySQL 8.0, about 50GB")

pipeline = ContextPipeline(max_tokens=8192).with_memory(memory)
result = pipeline.build(QueryBundle(query_str="What are the risks?"))
# Memory is automatically included, oldest turns evicted if over budget
```

</details>

<details>
<summary><strong>Decorator API Example</strong></summary>

```python
from anchor import ContextPipeline, ContextItem, QueryBundle

pipeline = ContextPipeline(max_tokens=8192)

@pipeline.step
def boost_recent(items: list[ContextItem], query: QueryBundle) -> list[ContextItem]:
    return [
        item.model_copy(update={"score": min(1.0, item.score * 1.5)})
        if item.metadata.get("recent") else item
        for item in items
    ]

result = pipeline.build("What is context engineering?")
```

</details>

<details>
<summary><strong>Provider Formatting</strong></summary>

```python
from anchor import AnthropicFormatter, OpenAIFormatter

# Anthropic: {"system": "...", "messages": [...]}
result = pipeline.with_formatter(AnthropicFormatter()).build(query)

# OpenAI: {"messages": [{"role": "system", ...}, ...]}
result = pipeline.with_formatter(OpenAIFormatter()).build(query)
```

</details>

## Architecture

```
ContextPipeline
  │
  ├── System Prompts (priority=10)
  ├── Memory Manager (priority=7)
  ├── Pipeline Steps
  │     ├── Retriever Steps (append items)
  │     ├── PostProcessor Steps (transform items)
  │     └── Filter Steps (filter items)
  │
  ▼
ContextWindow (token-aware, priority-ranked)
  │
  ▼
Formatter (Anthropic / OpenAI / Generic)
  │
  ▼
ContextResult (formatted output + diagnostics)
```

## Token Budgets

```python
from anchor import ContextPipeline, default_chat_budget

budget = default_chat_budget(max_tokens=8192)
pipeline = ContextPipeline(max_tokens=8192).with_budget(budget)
```

Three presets available: `default_chat_budget` (conversational), `default_rag_budget` (retrieval-heavy), `default_agent_budget` (balanced).

## Development

```bash
git clone https://github.com/artcgranja/anchor.git
cd anchor
uv sync
uv run pytest     # 2000+ tests
uv run ruff check src/ tests/
```

## Roadmap

- **v0.1.0** (current) — Hybrid RAG, Memory, Pipeline, Formatters, Async, Decorator API, Agent Framework, [Full docs](https://artcgranja.github.io/anchor/)
- **v0.2.0** — MCP Bridge, progressive summarization, persistent storage backends
- **v0.3.0** — GraphRAG, multi-modal context, LangChain/LlamaIndex adapters
- **v1.0.0** — Production-grade APIs, plugin ecosystem

## License

MIT — see [LICENSE](LICENSE) for details.
