Metadata-Version: 2.4
Name: contextseek
Version: 0.1.0
Summary: ContextSeek semantic context substrate for agent systems.
Project-URL: Homepage, https://github.com/ob-labs/contextseek
Project-URL: Repository, https://github.com/ob-labs/contextseek
Project-URL: Documentation, https://github.com/ob-labs/contextseek/tree/main/docs
Project-URL: Issues, https://github.com/ob-labs/contextseek/issues
Author: ContextSeek Team
License: Apache-2.0
License-File: LICENSE
Keywords: agent,context,llm,memory,provenance,rag
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: langchain-openai>=1.1.10
Requires-Dist: pydantic-settings>=2.3.0
Requires-Dist: pydantic>=2.7.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: seekvfs>=0.1.0
Requires-Dist: typing-extensions>=4.10.0
Provides-Extra: appworld-eval
Requires-Dist: anthropic>=0.30.0; extra == 'appworld-eval'
Requires-Dist: openai>=1.0.0; extra == 'appworld-eval'
Provides-Extra: http
Requires-Dist: fastapi>=0.110.0; extra == 'http'
Requires-Dist: pydantic>=2.7.0; extra == 'http'
Requires-Dist: uvicorn>=0.30.0; extra == 'http'
Provides-Extra: huggingface
Requires-Dist: langchain-huggingface>=0.1.0; extra == 'huggingface'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.0; extra == 'langchain'
Requires-Dist: langchain>=0.3.0; extra == 'langchain'
Requires-Dist: langgraph>=0.2.0; extra == 'langchain'
Provides-Extra: oceanbase
Requires-Dist: pyobvector>=0.1.0; extra == 'oceanbase'
Requires-Dist: sqlalchemy>=2.0; extra == 'oceanbase'
Provides-Extra: ollama
Requires-Dist: langchain-ollama>=0.3.0; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: langchain-openai>=0.3.0; extra == 'openai'
Provides-Extra: test
Requires-Dist: pytest>=8.2.0; extra == 'test'
Description-Content-Type: text/markdown

# ContextSeek

[![PyPI version](https://img.shields.io/pypi/v/contextseek)](https://pypi.org/project/contextseek/)
[![PyPI downloads](https://img.shields.io/pypi/dm/contextseek)](https://pypi.org/project/contextseek/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://pypi.org/project/contextseek/)
[![License Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
[![Discord](https://img.shields.io/badge/Discord-community-5865F2?logo=discord&logoColor=white)](https://discord.com/invite/74cF8vbNEs)

Semantic context infrastructure for AI agents. [中文文档](README_CN.md)

## Overview

Agent self-evolution is taking shape along two technical paths. One extracts and solidifies experience from runtime behavior (e.g. [Hermes](https://github.com/NousResearch/hermes-agent), [OpenHuman](https://github.com/tinyhumansai/openhuman)). The other evolves the **context infrastructure** beneath the agent—organizing, updating, and linking context automatically—without modifying agent execution logic.

ContextSeek focuses on the latter. It turns one-off, task-level gains into compounding value across context lifecycles, so heterogeneous agent systems can share a single semantic layer for retrieval, provenance, and evolution.

Three constraints stand in the way of that shared layer. **Heterogeneous integration**: Memory, Trace, and related components expose incompatible APIs and semantic conventions. **Insufficient retention**: runtime experience is consumed in the prompt window and rarely becomes reusable capability. **Missing provenance**: outputs lack traceable evidence chains. ContextSeek addresses all three as a unified semantic context layer between LLMs and agent runtimes that converges these capabilities into a single object model: everything is a `ContextItem`, retrievable and traceable, and items progress automatically through the stages `raw → extracted → knowledge → skill`.

## Quick Start

```bash
pip install contextseek
```

```python
from contextseek import ContextSeek

ctx = ContextSeek.from_settings()  # reads .env or environment variables

# Write
ctx.add(
    "OceanBase is a financial-grade distributed database supporting HTAP workloads",
    scope="acme/db/engineer",
    source="wiki",
)

# Retrieve (ranked SearchHits; L1 summaries by default)
for hit in ctx.retrieve("distributed database", scope="acme/db/engineer", k=10):
    print(f"[{hit.item.stage.value}] score={hit.score:.2f} | {hit.item.summary[:60]}")
```

Configure via `.env` (see [.env.example](.env.example)) or `ContextSeekSettings` in code. A storage backend, an embedding provider, and an LLM are the three required pieces.

## Documentation

- [Getting started (EN)](docs/en/getting-started/quickstart.md) / [快速上手 (ZH)](docs/zh/getting-started/quickstart.md): installation, `.env` setup, and a walkthrough of the core operations.
- [Client API reference](docs/en/reference/api.md): full method signatures for `add`, `retrieve`, `expand`, `compact`, `dream`, `evidence_chain`, and more.
- [Configuration reference](docs/en/getting-started/configuration.md): all environment variables and `ContextSeekSettings` fields.
- [DataPlugs](docs/en/guides/integrations/dataplugs.md): how to ingest from RAG pipelines, memory stores, execution traces, and skill / tool registries.
- [Examples](examples/README.md): annotated scripts for common workflows.
- [AppWorld eval](eval/appworld/README.md) / [τ-bench eval](eval/taubench/README.md): optional evaluation harnesses with their own setup requirements.

## How it works

- **Unified object model** — all context (memory, knowledge, traces, skills) is a `ContextItem`. Items carry mandatory `Provenance` (source type, source id, confidence) and typed `Link` edges (supports, refutes, derives, supersedes), enabling a full `EvidenceChain` DAG with confidence propagation.
- **Content tiers** — L0 (~100 tokens) feeds embedding recall. L1 (~2k tokens) is the default surface returned by `retrieve()`. L2 (full body) is available on demand via `expand()`.
- **Retrieval orchestrator** — keyword + vector hybrid recall, optional LLM reranking, and scope-based routing. Returns ranked `SearchHit` rows. Exposes tool specs for OpenAI and Anthropic agents via `ctx.tools()`.
- **EvolutionEngine** — watches for items that can be merged, resolved, advanced in stage, or distilled into skills. Runs incrementally after writes or on an explicit `compact()` call.
- **DreamEngine** — idle-time pattern consolidation and cross-cluster hypothesis generation, triggered via `dream()`.
- **HTTP + MCP servers** — expose the same operations over FastAPI and the Model Context Protocol for remote agent integrations.
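
To make the object model above more concrete, here is a minimal, standalone sketch of items, provenance, typed links, and one possible way to propagate confidence along `derives` edges. It does not import `contextseek` and is not the library's implementation: the class layouts, the field names (`source_type`, `source_id`, `target_id`), the `chain_confidence` helper, and the propagation rule (own confidence multiplied by the weakest parent chain) are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Stage names taken from the README: raw -> extracted -> knowledge -> skill
    RAW = "raw"
    EXTRACTED = "extracted"
    KNOWLEDGE = "knowledge"
    SKILL = "skill"


class LinkType(Enum):
    # Edge types taken from the README
    SUPPORTS = "supports"
    REFUTES = "refutes"
    DERIVES = "derives"
    SUPERSEDES = "supersedes"


@dataclass(frozen=True)
class Provenance:
    # Hypothetical field names; the README only lists the three concepts
    source_type: str
    source_id: str
    confidence: float  # expected to lie in [0, 1]


@dataclass(frozen=True)
class Link:
    type: LinkType
    target_id: str  # hypothetical: id of the item this edge points to


@dataclass
class ContextItem:
    id: str
    summary: str
    stage: Stage
    provenance: Provenance
    links: list[Link] = field(default_factory=list)


def chain_confidence(item_id: str, items: dict[str, ContextItem]) -> float:
    """Assumed rule: own confidence times the weakest 'derives' parent chain.

    Items without 'derives' links are treated as roots. The graph is
    assumed to be acyclic (a DAG), so the recursion terminates.
    """
    item = items[item_id]
    parents = [link.target_id for link in item.links if link.type is LinkType.DERIVES]
    own = item.provenance.confidence
    if not parents:
        return own
    return own * min(chain_confidence(parent, items) for parent in parents)


# A three-step chain: raw page -> extracted fact -> knowledge
items = {
    "a": ContextItem("a", "raw wiki page", Stage.RAW, Provenance("wiki", "page-1", 0.9)),
    "b": ContextItem(
        "b", "extracted fact", Stage.EXTRACTED,
        Provenance("llm", "run-1", 0.8), [Link(LinkType.DERIVES, "a")],
    ),
    "c": ContextItem(
        "c", "consolidated knowledge", Stage.KNOWLEDGE,
        Provenance("llm", "run-2", 0.5), [Link(LinkType.DERIVES, "b")],
    ),
}

print(f"{chain_confidence('c', items):.2f}")
```

Multiplying along the chain means a derived item can never be more trusted than the evidence it rests on. Other rules (for example, discounting by `refutes` edges) are equally plausible; the README does not say which one ContextSeek uses.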

## Related Projects

- [seekvfs](https://github.com/ob-labs/seekvfs) — underlying virtual filesystem

## License

[Apache License 2.0](LICENSE)
