# synix

> Build system for agent memory with declarative pipelines, incremental rebuilds, and full provenance tracking

## What it does

Synix transforms raw conversations into searchable, hierarchical memory with full lineage tracking. Define your memory architecture in Python pipelines, change a config to only rebuild affected layers, and trace any artifact back to its source conversations. Think `make` or `dbt`, but for AI agent memory.

## Key concepts

- **Artifact** — immutable, versioned build output (transcript, episode, rollup, core memory) with SHA256 content addressing
- **Layer** — named level in memory hierarchy forming a DAG (transcripts → episodes → rollups → core)  
- **Pipeline** — Python-declared layers, transforms, grouping strategies, and projections
- **Projection** — materializes artifacts into usable outputs (SQLite FTS5 search index, markdown context doc)
- **Provenance** — every artifact traces back to inputs, always included in search results
- **Fingerprint-based caching** — hash comparison of inputs + prompts + config determines rebuild necessity

## Installation and quick start

```bash
# Install and scaffold a working project
uvx synix init my-project
cd my-project

# Build the pipeline (requires LLM API key in .env)
uvx synix build

# Browse and search
uvx synix list
uvx synix search "hiking"
uvx synix validate
```

## CLI commands

| Command | Description |
|---------|-------------|
| `uvx synix init <name>` | Scaffold new project with sources, pipeline, and README |
| `uvx synix build [pipeline.py]` | Run pipeline with incremental rebuilds |
| `uvx synix plan [--explain-cache]` | Dry-run showing what would build with cache decisions |
| `uvx synix list [layer]` | List artifacts with short IDs, optionally filtered by layer |
| `uvx synix show <id>` | Display artifact content by label/ID prefix, `--raw` for JSON |
| `uvx synix search <query>` | Full-text search with `--mode hybrid`, `--trace` for provenance |
| `uvx synix validate` | Run declared validators against build artifacts |
| `uvx synix verify` | Check build integrity (hashes, provenance) |
| `uvx synix lineage <id>` | Show full provenance chain for an artifact |
| `uvx synix clean` | Delete build directory |

## Architecture overview

```
src/synix/
├── cli.py              # Click CLI commands
├── pipeline/
│   ├── config.py       # Parse pipeline Python module into objects
│   ├── dag.py          # DAG resolution, build order, rebuild detection  
│   └── runner.py       # Execute pipeline, walk DAG, run transforms, cache artifacts
├── artifacts/
│   ├── store.py        # Artifact storage (filesystem-backed)
│   └── provenance.py   # Provenance tracking and lineage chains
├── transforms/
│   ├── base.py         # Base transform interface
│   ├── parse.py        # Source parsers (ChatGPT/Claude JSON → transcripts)
│   ├── summarize.py    # LLM transforms (episode, rollup, core synthesis)
│   └── prompts/        # Prompt templates as text files
├── projections/
│   ├── search_index.py # SQLite FTS5 materialization and query
│   └── flat_file.py    # Render core memory as context document
└── sources/
    ├── chatgpt.py      # ChatGPT export parser  
    └── claude.py       # Claude export parser
```

## Pipeline definition

```python
from synix import Pipeline, Layer, Projection

pipeline = Pipeline("personal-memory")
pipeline.source_dir = "./exports"
pipeline.llm_config = {"model": "claude-sonnet-4-20250514", "temperature": 0.3}

# Layer 0: parse source files
pipeline.add_layer(Layer(name="transcripts", level=0, transform="parse"))

# Layer 1: summarize conversations  
pipeline.add_layer(Layer(
    name="episodes", level=1, depends_on=["transcripts"],
    transform="episode_summary", grouping="by_conversation"
))

# Layer 2: monthly rollups
pipeline.add_layer(Layer(
    name="monthly", level=2, depends_on=["episodes"], 
    transform="monthly_rollup", grouping="by_month"
))

# Layer 3: core memory synthesis
pipeline.add_layer(Layer(
    name="core", level=3, depends_on=["monthly"],
    transform="core_synthesis", grouping="single", context_budget=10000
))

# Search index projection
pipeline.add_projection(Projection(
    name="memory-index", projection_type="search_index",
    sources=[
        {"layer": "episodes", "search": ["fulltext"]},
        {"layer": "monthly", "search": ["fulltext"]}, 
        {"layer": "core", "search": ["fulltext"]}
    ]
))
```

## Important constraints

- SQLite + filesystem only (no external databases)
- No web UI (CLI and Python API only)
- Python 3.11+ required
- Requires LLM API key (Anthropic, OpenAI, or OpenAI-compatible)
- UV-native project (`uv sync`, `uv run synix`)
- Single-user local operation (no multi-tenant hosted version)