# synix

> A build system for agent memory

## What it does

Synix transforms conversation exports into searchable, hierarchical memory with full provenance tracking. Declarative Python pipelines define how raw conversations become episode summaries, monthly rollups, and core memory documents. Content-addressed caching means only affected layers rebuild when you change prompts or add new data.

## Key concepts

- **Artifact** — immutable, versioned build output (transcript, episode, rollup, core memory) with SHA256 content addressing
- **Layer** — typed object in the build DAG (Source for inputs, Transform subclasses for LLM steps, SearchIndex/FlatFile for projections)
- **Pipeline** — declared in Python using `Pipeline.add(*layers)` with automatic routing of transforms to layers and projections
- **Projection** — materializes artifacts into SearchIndex (SQLite FTS5 + embeddings) or FlatFile (markdown context doc)
- **Provenance** — every artifact traces back to its source conversations, and that provenance chain is always included in search results
- **Cache/Rebuild** — each step's fingerprint covers its inputs, prompt, and config; a step rebuilds only when that fingerprint changes

## Installation and quick start

```bash
# Install and scaffold
uvx synix init my-project
cd my-project

# Add API key (see pipeline.py for provider config)
export ANTHROPIC_API_KEY=sk-ant-...

# Build pipeline
uvx synix build

# Browse and search
uvx synix list
uvx synix search "hiking"
uvx synix show final-report
```

## CLI commands

- `uvx synix init <name>` — scaffold new project with sources and pipeline
- `uvx synix build` — run pipeline, only rebuild what changed
- `uvx synix plan` — dry-run showing what would build
- `uvx synix list [layer]` — show all artifacts, optionally filtered by layer
- `uvx synix show <id>` — display artifact by label or ID prefix
- `uvx synix search <query>` — full-text search with provenance
- `uvx synix lineage <id>` — show full dependency chain
- `uvx synix validate` — run validators against artifacts (experimental)
- `uvx synix clean` — delete build directory
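
Conceptually, a lineage lookup is a walk over parent links from one artifact back to its sources. The sketch below illustrates that idea only; the `lineage` function, the `parents` mapping, and the artifact labels are hypothetical and do not reflect how synix stores provenance.

```python
def lineage(artifact_id: str, parents: dict[str, list[str]]) -> list[str]:
    """Return artifact_id followed by all of its ancestors, depth-first, no duplicates."""
    seen: set[str] = set()
    chain: list[str] = []
    stack = [artifact_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        chain.append(current)
        # Push in reverse so parents are visited in their listed order.
        stack.extend(reversed(parents.get(current, [])))
    return chain


# Hypothetical provenance records: artifact label -> direct parents.
parents = {
    "core": ["monthly-2024-01", "monthly-2024-02"],
    "monthly-2024-01": ["episode-a", "episode-b"],
    "monthly-2024-02": ["episode-c"],
    "episode-a": ["transcript-a"],
    "episode-b": ["transcript-b"],
    "episode-c": ["transcript-c"],
}

for label in lineage("monthly-2024-01", parents):
    print(label)
```

Artifacts with no entry in `parents` are the sources, so the walk always terminates at original conversations.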

## Architecture overview

```
src/synix/
├── __init__.py            # Public API
├── core/models.py         # Layer hierarchy, Pipeline class
├── build/
│   ├── runner.py          # Execute pipeline, walk DAG
│   ├── artifacts.py       # Artifact storage (filesystem-backed)
│   ├── llm_transforms.py  # EpisodeSummary, MonthlyRollup, etc.
│   ├── fingerprint.py     # synix:transform:v2 scheme
│   └── projections.py     # SearchIndex, FlatFile materialization
├── search/
│   ├── indexer.py         # SQLite FTS5 + embeddings
│   └── retriever.py       # Hybrid search with RRF fusion
├── cli/                   # Click CLI commands
└── templates/             # Demo pipelines for synix init
```
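
The retriever is described as hybrid search with RRF fusion. As a rough illustration of reciprocal rank fusion in general (not the code in `retriever.py`), the sketch below merges a keyword ranking and an embedding ranking. The `rrf_fuse` name, the document IDs, and the constant `k = 60` (a commonly used default) are assumptions.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Combine ranked lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists, best match first.
keyword_hits = ["ep-3", "ep-1", "ep-7"]   # e.g. from full-text search
semantic_hits = ["ep-1", "ep-9", "ep-3"]  # e.g. from embedding similarity

fused = rrf_fuse([keyword_hits, semantic_hits])
for doc_id, score in fused:
    print(f"{doc_id}  {score:.5f}")
```

Documents that rank well in both lists rise to the top, and the fusion needs only ranks, so the two retrievers' raw scores never have to be put on a common scale.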

## Pipeline definition

```python
from synix import Pipeline, Source, SearchIndex
from synix.transforms import EpisodeSummary, MonthlyRollup, CoreSynthesis

pipeline = Pipeline("my-pipeline")
pipeline.source_dir = "./sources"
pipeline.llm_config = {"provider": "anthropic", "model": "claude-haiku-4-5-20251001"}

transcripts = Source("transcripts")
episodes = EpisodeSummary("episodes", depends_on=[transcripts])
monthly = MonthlyRollup("monthly", depends_on=[episodes])
core = CoreSynthesis("core", depends_on=[monthly])

pipeline.add(transcripts, episodes, monthly, core)
pipeline.add(SearchIndex("search", sources=[episodes, monthly, core]))
```
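
The `depends_on` and `sources` arguments above form a DAG. The following sketch shows how a build order and the set of layers affected by a change could be derived from such a graph using the standard library. It mirrors the layer names in the example but is not synix's runner; `build_order` and `affected_by` are hypothetical helpers.

```python
from graphlib import TopologicalSorter

# Layer name -> layers it depends on, mirroring the pipeline above.
depends_on = {
    "transcripts": [],
    "episodes": ["transcripts"],
    "monthly": ["episodes"],
    "core": ["monthly"],
    "search": ["episodes", "monthly", "core"],
}


def build_order(graph: dict[str, list[str]]) -> list[str]:
    """Order layers so that every dependency is built before its dependents."""
    return list(TopologicalSorter(graph).static_order())


def affected_by(changed: str, graph: dict[str, list[str]]) -> set[str]:
    """Return the changed layer plus every layer downstream of it."""
    affected = {changed}
    grew = True
    while grew:  # repeat until no new downstream layer is found
        grew = False
        for layer, deps in graph.items():
            if layer not in affected and affected.intersection(deps):
                affected.add(layer)
                grew = True
    return affected


print(build_order(depends_on))
print(sorted(affected_by("monthly", depends_on)))
```

In this graph, editing the monthly rollup prompt would touch only `monthly`, `core`, and `search`, leaving `transcripts` and `episodes` cached.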

## Important constraints

- SQLite + filesystem only (no external databases)
- Python 3.11+ required
- No web UI (CLI and Python API only)
- Source adapters: ChatGPT JSON, Claude JSON, text/markdown files
- LLM providers: Anthropic, OpenAI, or OpenAI-compatible via `base_url`