Metadata-Version: 2.4
Name: polymathera-colony
Version: 0.1.2
Summary: Polymathera's no-RAG, multi-agent framework for extremely long, dense contexts (1B+ tokens).
License: Apache-2.0
License-File: LICENSE
Keywords: multi-agent,llm,context,cache-aware,agents,no-rag
Author: Ahmed Nassar
Author-email: amam.nassar@gmail.com
Requires-Python: >=3.11,<3.13
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Environment :: Console
Classifier: Environment :: GPU :: NVIDIA CUDA
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Provides-Extra: aws
Provides-Extra: cache-compression
Provides-Extra: code-analysis
Provides-Extra: cpu
Provides-Extra: dashboard
Provides-Extra: distributed
Provides-Extra: gpu
Provides-Extra: observability
Requires-Dist: aiobotocore (>=2.13.0,<3.0.0) ; extra == "aws"
Requires-Dist: aiofiles (>=25.1.0,<26.0.0)
Requires-Dist: aiohttp (>=3.13.3,<4.0.0)
Requires-Dist: aiokafka (>=0.12.0,<0.13.0) ; extra == "observability"
Requires-Dist: anthropic (>=0.79.0,<0.80.0) ; extra == "cpu"
Requires-Dist: astor (>=0.8.1,<0.9.0) ; extra == "code-analysis"
Requires-Dist: asyncpg (>=0.31.0,<0.32.0)
Requires-Dist: boto3 (>=1.35.0,<2.0.0) ; extra == "aws"
Requires-Dist: botocore (>=1.35.0,<2.0.0) ; extra == "aws"
Requires-Dist: cachetools (>=7.0.0,<8.0.0)
Requires-Dist: chardet (>=5.2.0,<6.0.0) ; extra == "code-analysis"
Requires-Dist: chromadb (>=1.0.0,<2.0.0) ; extra == "cpu"
Requires-Dist: circuitbreaker (>=2.1.3,<3.0.0)
Requires-Dist: colorama (>=0.4.6,<0.5.0)
Requires-Dist: docker (>=7.1.0,<8.0.0)
Requires-Dist: etcd3 (>=0.12.0,<0.13.0) ; extra == "distributed"
Requires-Dist: fastapi (>=0.115) ; extra == "dashboard"
Requires-Dist: gitpython (>=3.1.46,<4.0.0) ; extra == "code-analysis"
Requires-Dist: html2text (>=2025.4.15,<2026.0.0) ; extra == "code-analysis"
Requires-Dist: httpx (>=0.27) ; extra == "dashboard"
Requires-Dist: ipython (>=9.10.0,<10.0.0)
Requires-Dist: markdown (>=3.10.1,<4.0.0) ; extra == "code-analysis"
Requires-Dist: msgpack (>=1.1.2,<2.0.0)
Requires-Dist: multipledispatch (>=1.0.0,<2.0.0)
Requires-Dist: nbformat (>=5.10.4,<6.0.0) ; extra == "code-analysis"
Requires-Dist: networkx (>=3.6.1,<4.0.0)
Requires-Dist: numpy (>=1.21.0,<2.3)
Requires-Dist: openai (>=2.21.0,<3.0.0) ; extra == "cpu"
Requires-Dist: opentelemetry-api (>=1.36.0,<2.0.0)
Requires-Dist: overrides (>=7.7.0,<8.0.0)
Requires-Dist: pillow (>=12.1.0,<13.0.0) ; extra == "code-analysis"
Requires-Dist: prometheus-client (>=0.21.0,<0.22.0)
Requires-Dist: psutil (>=7.2.2,<8.0.0) ; extra == "code-analysis"
Requires-Dist: pygithub (>=2.8.1,<3.0.0) ; extra == "code-analysis"
Requires-Dist: pynvml (>=13.0.1,<14.0.0) ; extra == "gpu"
Requires-Dist: python-louvain (>=0.16,<0.17) ; extra == "code-analysis"
Requires-Dist: python-magic (>=0.4.27,<0.5.0) ; extra == "code-analysis"
Requires-Dist: pyyaml (>=6.0.3,<7.0.0)
Requires-Dist: ray[data,serve,train,tune] (==2.49.0)
Requires-Dist: redis (>=7.1.0,<8.0.0)
Requires-Dist: rich (>=14.3.2,<15.0.0)
Requires-Dist: sentence-transformers (>=5.2.2,<6.0.0) ; extra == "cpu"
Requires-Dist: sqlalchemy (>=2.0.46,<3.0.0)
Requires-Dist: sqlmodel (>=0.0.32,<0.0.33)
Requires-Dist: sqlparse (>=0.5.5,<0.6.0) ; extra == "code-analysis"
Requires-Dist: tenacity (>=9.1.3,<10.0.0)
Requires-Dist: termcolor (>=3.3.0,<4.0.0)
Requires-Dist: tiktoken (>=0.12.0,<0.13.0)
Requires-Dist: toml (>=0.10.2,<0.11.0) ; extra == "code-analysis"
Requires-Dist: torch (==2.8.0+cu128) ; extra == "gpu"
Requires-Dist: torchaudio (==2.8.0+cu128) ; extra == "gpu"
Requires-Dist: torchvision (==0.23.0+cu128) ; extra == "gpu"
Requires-Dist: transformers (>=4.52.4,<5.0.0) ; extra == "gpu"
Requires-Dist: tree-sitter (>=0.25.2,<0.26.0) ; extra == "code-analysis"
Requires-Dist: typer[all] (>=0.15.0,<0.16.0)
Requires-Dist: uvicorn[standard] (>=0.30) ; extra == "dashboard"
Requires-Dist: vllm (>=0.11.0,<0.12.0) ; extra == "gpu"
Requires-Dist: xxhash (>=3.6.0,<4.0.0) ; extra == "code-analysis"
Requires-Dist: zstandard (>=0.25.0,<0.26.0) ; extra == "cache-compression"
Project-URL: Homepage, https://github.com/polymathera/colony
Project-URL: Repository, https://github.com/polymathera/colony
Description-Content-Type: text/markdown

# Colony

[![PyPI](https://img.shields.io/pypi/v/polymathera-colony)](https://pypi.org/project/polymathera-colony/)
[![Python](https://img.shields.io/pypi/pyversions/polymathera-colony)](https://pypi.org/project/polymathera-colony/)
[![License](https://img.shields.io/github/license/polymathera/colony)](LICENSE)
[![CI](https://github.com/polymathera/colony/actions/workflows/ci.yml/badge.svg)](https://github.com/polymathera/colony/actions/workflows/ci.yml)
[![Docs](https://img.shields.io/badge/docs-polymathera.github.io%2Fcolony-blue)](https://polymathera.github.io/colony)

**A no-RAG, cache-aware multi-agent framework for extremely long, dense contexts (1B+ tokens).**


Colony is a framework for building *tightly-coupled, self-improving, self-aware multi-agent systems* (***agent colonies***) that reason over extremely long context without retrieval-augmented generation (RAG). Instead of fragmenting context into chunks and retrieving snippets, Colony keeps the entire context *live* across a **cluster of LLMs** through a virtual memory system that manages GPU KV caches the same way an operating system manages (almost unlimited) virtual memory over finite physical memory.

<table style="border:1.5px solid #00bcd4; border-radius:4px; border-collapse:collapse; width:100%; margin:12px 0;">
  <tr><td style="background:#00bcd4; padding:6px 12px; color:#fff; font-weight:600;">💡 Colony's Vision</td></tr>
  <tr><td style="background:#e0f7fa; padding:10px 12px;">Colony's goal is to be the most efficient <em>country of geniuses in a datacenter</em> — the ideal substrate for <strong>civilization-building AI</strong>.</td></tr>
</table>

<table style="border:1.5px solid #ff9100; border-radius:4px; border-collapse:collapse; width:100%; margin:12px 0;">
  <tr><td style="background:#ff9100; padding:6px 12px; color:#fff; font-weight:600;">⚠️ Pre-Alpha Early Access</td></tr>
  <tr><td style="background:#fff8e1; padding:10px 12px;">Colony is still in pre-alpha early access. The API is not stable and the framework is under active development. We welcome feedback and contributions, but be aware that breaking changes may occur.</td></tr>
</table>

<table style="border:1.5px solid #448aff; border-radius:4px; border-collapse:collapse; width:100%; margin:12px 0;">
  <tr><td style="background:#448aff; padding:6px 12px; color:#fff; font-weight:600;">ℹ️ Who should use Colony?</td></tr>
  <tr><td style="background:#e8f0fe; padding:10px 12px;">Colony is designed for <strong>engineers building complex multi-agent systems</strong> that require reasoning over extremely long contexts. It is not a general-purpose agent framework or a consumer product. If you are looking for a simple agent orchestration tool or a way to add tool use to an LLM, Colony may not be the right fit. It runs on a Ray cluster (local or in the cloud) and can be resource-intensive and expensive.</td></tr>
</table>

## Why Colony?

Most agent frameworks treat context as something to retrieve or manage. Colony treats it as something to be *brought to life*. Certain domains require *reasoning deep and wide*. Examples include:
- *Scientific research*: synthesizing novel insights from a vast literature requires complex integration
- *Cyber-physical systems*: understanding the full context of a complex system (code, physical environment, requirements, regulations) is essential for architecting solutions and identifying edge cases and failure modes
- *Systemic vulnerability analysis*: identifying security risks in a complex system by reasoning over a large attack surface and many potential interactions
- *Business intelligence*: making strategic decisions based on a wide range of internal and external data, where relevant information may be siloed and require cross-domain reasoning
- *Economic modeling*: simulating and understanding complex economic systems with many interacting agents and factors and long supply chains
- *Long-form content creation*: writing a book or comprehensive report that requires maintaining a coherent narrative across a large amount of information

Colony's core innovations are:
- **No-RAG** -- Colony keeps the full context live and accessible rather than filtering it through retrieval. It manages all kinds of context (code, text, data) through distributed KV cache paging, not vector search.

- **Cache-Aware Agents** -- Agents are aware of what's in GPU memory (at the cluster level) and consciously plan their work to maximize cache reuse.

- **Agents All the Way Down** -- General intelligence emerges from the right composition of *agent capabilities* and *multi-agent patterns*. Every cognitive process -- attention, memory, planning, confidence tracking -- is a pluggable policy with a default implementation.

- **Distributed Reasoning Patterns** -- Multi-agent game protocols (hypothesis games, contract nets, negotiation) combat specific LLM failure modes: hallucination, laziness, and goal drift.
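
To make the cache-aware idea above concrete, here is a minimal sketch of one way an agent could order its work so that consecutive tasks share as many already-loaded pages as possible. It is a toy greedy heuristic, not Colony's scheduler: the function name `order_tasks_for_cache_reuse`, the task names, and the page identifiers are all hypothetical.

```python
from typing import Dict, List, Set


def order_tasks_for_cache_reuse(task_pages: Dict[str, Set[str]]) -> List[str]:
    """Greedy ordering: run next the task that shares the most pages
    with the task that just ran (hypothetical helper, not a Colony API)."""
    remaining = dict(task_pages)
    if not remaining:
        return []
    # Start with the task touching the most pages; sorted() makes ties deterministic.
    current = max(sorted(remaining), key=lambda t: len(remaining[t]))
    order = [current]
    resident = remaining.pop(current)
    while remaining:
        # Pick the task with the largest overlap with the pages assumed to be cached.
        nxt = max(sorted(remaining), key=lambda t: len(remaining[t] & resident))
        order.append(nxt)
        resident = remaining.pop(nxt)
    return order


tasks = {
    "audit_auth": {"auth.py", "session.py"},
    "audit_db": {"models.py", "db.py"},
    "audit_login": {"auth.py", "login.py", "session.py"},
    "audit_migrations": {"db.py", "migrate.py"},
}
print(order_tasks_for_cache_reuse(tasks))
# → ['audit_login', 'audit_auth', 'audit_db', 'audit_migrations']
```

The two authentication tasks run back to back, as do the two database tasks, so each pair can reuse the pages its predecessor loaded. A real scheduler would also have to account for cache capacity and for pages held by other agents.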

Read the full [Philosophy](https://polymathera.github.io/colony/philosophy/) for the ideas behind the framework.


> P.S. Colony does not preclude agents from using retrieval or vector search -- those can be implemented as capabilities that agents use when appropriate. Colony's point is that retrieval is not the only way to manage long context, and for certain domains, it's not the best way.


## Architecture

```
┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                             Agent Colony                                          │
│                                                                                                   │
│                 ┌────────────────┐   ┌────────────────┐   ┌────────────────┐                      │
│                 │    Agent 1     │   │    Agent 2     │   │    Agent N     │                      │
│                 │ Capabilities   │   │ Capabilities   │   │ Capabilities   │   ...                │
│                 │ Action Policy  │   │ Action Policy  │   │ Action Policy  │                      │
│                 │ Planner (LLM)  │   │ Planner (LLM)  │   │ Planner (LLM)  │                      │
│                 └──────┬─────────┘   └──────┬─────────┘   └──────┬─────────┘                      │
│ read/write/query/mmap  │                    │                    │         infer_with_suffix      │
│   ┌────────────────────┴────────────────────┴────────────────────┴──────┐  page_graph_ops         │
│   │                                                                     ▼                         │
│   │    ┌────────────────────────────┐                 ┌────────────────────────────────────────┐  │
│   │    │    Blackboard (Redis)      │                 │     Virtual Context Memory (VCM)       │  │
│   │    │                            │                 │                                        │  │
│   ┠───►│  Shared state & events     │                 │  Page Table · Page Graph               │  │
│   │    │  OCC · Memory scopes       │                 │  Cache Scheduling · Page Faults        │  │
│   │    │  Agent coordination        │                 │                                        │  │
│   │    └─────────────┬──────────────┘                 │  ┌──────────┐ ┌──────────┐             │  │
│   │                  │                                │  │ LLM N1   │ │ LLM N2   │   ...       │  │
│   │                  │      mmap/munmap/invalidate    │  │ KV Cache │ │ KV Cache │             │  │
│   │                  └───────────────────────────────►│  └──────────┘ └──────────┘             │  │
│   │                         mmap/munmap/invalidate    │                                        │  │
│   │                  ┌───────────────────────────────►│                                        │  │
│   │                  │                                │  Context Sources (mapped as pages):    │  │
│   │    ┌─────────────┴──────────────┐                 │  ┌────────┐ ┌──────────┐ ┌─────────┐   │  │
│   │    │    External Sources        │                 │  │ Repos  │ │Knowledge │ │Blackbrd │   │  │
│   └───►│  Git repos, documents,     │                 │  │        │ │  Bases   │ │  Data   │   │  │
│        │  knowledge bases, data     │                 │  └────────┘ └──────────┘ └─────────┘   │  │
│        └────────────────────────────┘                 └────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘
```

Each **Agent** composes pluggable [capabilities](src/polymathera/colony/agents/patterns/capabilities) (memory, attention, games, confidence tracking, grounding, reflection, cache awareness, etc.) coordinated by an **`ActionPolicy`** that consults an LLM **Planner**. Agents share state through a Redis-backed **`Blackboard`** with optimistic concurrency control (OCC) and causal ordering. The **Virtual Context Memory** (VCM) manages distributed GPU KV caches as pages, enabling agents to reason over contexts far larger than any single model's window.
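
The operating-system analogy behind the VCM can be illustrated with a toy page table. The sketch below assumes a fixed number of resident pages and least-recently-used eviction; the class `ToyPageTable` and its `loader` callback are hypothetical and only show how a page fault brings a page in while an older one is evicted. Colony's actual page table, page graph, and scheduling policies are described in the architecture docs.

```python
from collections import OrderedDict
from typing import Callable


class ToyPageTable:
    """In-memory stand-in for a paged cache (illustrative only, not Colony's VCM)."""

    def __init__(self, capacity: int, loader: Callable[[str], str]) -> None:
        self.capacity = capacity          # maximum number of resident pages
        self.loader = loader              # called on a page fault to materialize a page
        self.resident: "OrderedDict[str, str]" = OrderedDict()
        self.faults = 0

    def access(self, page_id: str) -> str:
        if page_id in self.resident:
            # Hit: mark the page as most recently used.
            self.resident.move_to_end(page_id)
            return self.resident[page_id]
        # Page fault: evict the least recently used page if the cache is full.
        self.faults += 1
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)
        self.resident[page_id] = self.loader(page_id)
        return self.resident[page_id]


table = ToyPageTable(capacity=2, loader=lambda pid: f"<kv for {pid}>")
for pid in ["a", "b", "a", "c", "b"]:
    table.access(pid)
print(table.faults, list(table.resident))
# → 4 ['c', 'b']
```

With room for two pages, the access sequence causes four faults and one hit. The context can be far larger than the resident set as long as the access pattern keeps the working set small, which is the property cache-aware agents try to exploit.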

See the full [Architecture docs](https://polymathera.github.io/colony/architecture/).

## Quick Start

### Installation

```bash
pip install polymathera-colony
```

With optional extras:

```bash
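pip install "polymathera-colony[code-analysis]"   # Code analysis tools
pip install "polymathera-colony[gpu]"             # GPU inference (vLLM, PyTorch)
pip install "polymathera-colony[cpu]"             # CPU-only inference (Anthropic, OpenAI APIs)
# pip has no --all-extras flag, so list the extras explicitly to install everything
pip install "polymathera-colony[aws,cache-compression,code-analysis,cpu,dashboard,distributed,gpu,observability]"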
```
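
To check which extras the installed distribution declares, the standard library's `importlib.metadata` can read the package metadata. This is a small sketch; the helper name `declared_extras` is made up for illustration, and it returns an empty list when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name: str) -> list[str]:
    """Return the extras a distribution declares, or [] if it is not installed."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return []
    # Each "Provides-Extra" header in the package metadata names one optional extra.
    return sorted(meta.get_all("Provides-Extra") or [])


print(declared_extras("polymathera-colony"))
```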

### Local Test Environment

Colony ships with `colony-env`, a CLI tool that spins up a local Ray cluster + Redis using Docker Compose. The only prerequisite is **Docker**.

```bash
# Start the cluster (builds image on first run)
colony-env up

# Generate a sample analysis config
polymath init-config --output my_analysis.yaml

# Run a code analysis over a local codebase
colony-env run /path/to/codebase --config my_analysis.yaml

# Check service status
colony-env status

# Open the web dashboard
colony-env dashboard

# Scale workers
colony-env up --workers 3

# Tear down
colony-env down

# Verify prerequisites
colony-env doctor
```

All Colony dependencies run inside Docker, so no local GPU drivers, Ray, or Redis installation is required. The `colony-env run` command copies the codebase to be analyzed into the cluster and runs the analysis inside the Ray head container, with full access to the framework.

**Services started by `colony-env up`:**

| Service | Port | Description |
|---------|------|-------------|
| Colony dashboard | `localhost:8080` | Web UI for agents, sessions, VCM |
| Ray dashboard | `localhost:8265` | Cluster monitoring UI |
| Ray client | `localhost:10001` | Ray client connection |
| Redis | `localhost:6379` | State management backend |
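
If a service seems unreachable, a quick way to see which of the ports above are accepting connections is a plain TCP probe. The sketch below uses only the standard library; the `SERVICES` mapping simply restates the table, and `is_port_open` is an illustrative helper rather than part of `colony-env`.

```python
import socket

# Ports taken from the table above; adjust if you changed the defaults.
SERVICES = {
    "Colony dashboard": 8080,
    "Ray dashboard": 8265,
    "Ray client": 10001,
    "Redis": 6379,
}


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for name, port in SERVICES.items():
    state = "up" if is_port_open("localhost", port) else "not reachable"
    print(f"{name:<18} localhost:{port:<6} {state}")
```

An open port only shows that something is listening, not that the service is healthy; `colony-env status` remains the authoritative check.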

### Web Dashboard

The Colony dashboard starts automatically with `colony-env up` at [localhost:8080](http://localhost:8080). It provides:

- **Overview** — cluster health, application deployments, quick stats
- **Agents** — list registered agents, view state, capabilities, and details
- **Sessions** — browse sessions and their agent runs with token usage
- **VCM** — page table, working set, and virtual context statistics
- **Traces** — detailed tracing of agent actions, VCM operations, and system events for debugging and performance analysis

```bash
# Run the agent colony
colony-env down && colony-env up --workers 3 && colony-env run --local-repo /path/to/codebase --config my_analysis.yaml --verbose

# Open the dashboard in your browser
colony-env dashboard

# Use a custom port (must match COLONY_DASHBOARD_UI_PORT)
colony-env dashboard --port 9090
```

For frontend development, run the Vite dev server on the host with hot-reload:

```bash
cd src/polymathera/colony/web_ui/frontend
npm install
npm run dev     # Starts on localhost:5173, proxies /api to localhost:8080
```

## Key Features

| Feature | Description | Docs |
|---------|-------------|------|
| Virtual Context Memory | OS-style virtual memory for LLM KV caches with page tables and cache-aware scheduling | [VCM](https://polymathera.github.io/colony/architecture/virtual-context-memory/) |
| Agent Capabilities | Composable cognitive modules (memory, attention, games, confidence) attached to agents via AOP-inspired patterns | [Agent System](https://polymathera.github.io/colony/architecture/agent-system/) |
| Action Policies | LLM-centric planning with Model Predictive Control -- the LLM is the planner, not the framework | [Action Policies](https://polymathera.github.io/colony/architecture/action-policies/) |
| Blackboard | Redis-backed shared state with optimistic concurrency, causal timelines, and event-driven coordination | [Blackboard](https://polymathera.github.io/colony/architecture/blackboard/) |
| Memory Hierarchies | Unified memory system with sensory, working, short-term, and long-term memory -- all backed by blackboards | [Memory](https://polymathera.github.io/colony/architecture/memory-system/) |
| Game Engine | Hypothesis games, contract nets, negotiation, and consensus protocols for multi-agent coordination | [Games](https://polymathera.github.io/colony/architecture/game-engine/) |
| Hook System | AOP-inspired hooks for cross-cutting concerns (logging, tracing, metrics, memory triggers) | [Hooks](https://polymathera.github.io/colony/architecture/hook-system/) |
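
The optimistic concurrency mentioned in the Blackboard row can be sketched with a versioned compare-and-set. The example below is an in-memory toy, assuming each key carries a version counter that a writer must match. `ToyBlackboard`, `VersionConflict`, and `update_with_retry` are hypothetical names and do not reflect Colony's Redis-backed implementation.

```python
from typing import Any, Callable, Dict, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of a key."""


class ToyBlackboard:
    """In-memory key/value store with per-key versions (illustrative only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, Any]] = {}

    def read(self, key: str) -> Tuple[int, Any]:
        # Unknown keys start at version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key: str, value: Any, expected_version: int) -> int:
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(f"{key}: expected v{expected_version}, found v{current_version}")
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def update_with_retry(board: ToyBlackboard, key: str,
                      fn: Callable[[Any], Any], max_attempts: int = 5) -> int:
    """Read, transform, and write back; retry if another writer got there first."""
    for _ in range(max_attempts):
        version, value = board.read(key)
        try:
            return board.write(key, fn(value), expected_version=version)
        except VersionConflict:
            continue
    raise RuntimeError(f"could not update {key!r} after {max_attempts} attempts")


board = ToyBlackboard()
update_with_retry(board, "findings", lambda v: (v or []) + ["unchecked input in login()"])
stale_version, _ = board.read("findings")
update_with_retry(board, "findings", lambda v: v + ["missing rate limit"])
try:
    board.write("findings", ["overwrite"], expected_version=stale_version)
except VersionConflict as exc:
    print("rejected stale write:", exc)
print(board.read("findings"))
```

No locks are held while an agent computes its update. Conflicts are detected only at write time, which suits workloads where most writes do not collide.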

## Development

```bash
git clone https://github.com/polymathera/colony.git
cd colony
poetry install --all-extras
```

### Running Tests

```bash
pytest src/ --timeout=120 -x -q
```

### Documentation

```bash
poetry run mkdocs serve --livereload   # Local docs server at http://127.0.0.1:8000/
poetry run mkdocs build                # Build static site
```

## Contributing

We welcome contributions. See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, code conventions, and the PR process.

## License

Apache 2.0 -- see [LICENSE](LICENSE).

