Metadata-Version: 2.4
Name: syntheca
Version: 0.2.0a3
Summary: CLI for scaffolding LLM-ready Syntheca wiki workspaces
Author: Syntheca Contributors
License-Expression: MIT
Project-URL: Homepage, https://github.com/swj9707/syntheca
Project-URL: Repository, https://github.com/swj9707/syntheca
Project-URL: Issues, https://github.com/swj9707/syntheca/issues
Project-URL: Documentation, https://github.com/swj9707/syntheca/tree/main/docs
Project-URL: Changelog, https://github.com/swj9707/syntheca/blob/main/CHANGELOG.md
Keywords: llm,wiki,markdown,agents,knowledge-base
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Documentation
Classifier: Topic :: Utilities
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: PyYAML>=6.0
Provides-Extra: dev
Requires-Dist: basedpyright>=1.31.0; extra == "dev"
Requires-Dist: coverage[toml]>=7.0; extra == "dev"
Requires-Dist: twine>=6.0; extra == "dev"
Requires-Dist: typing-extensions>=4.12; extra == "dev"
Dynamic: license-file

# Syntheca

[![CI](https://github.com/swj9707/syntheca/actions/workflows/ci.yml/badge.svg)](https://github.com/swj9707/syntheca/actions/workflows/ci.yml)

[English](README.md) | [한국어](README.ko.md)

**Schema-first framework for agent-maintained wikis**

> This guide adapts ideas from Andrej Karpathy's "LLM Wiki" pattern and the
> "LLM Wiki v2" / agentmemory production lessons. See
> [`REFERENCES.md`](REFERENCES.md) and [`WIKI-ATTRIBUTION.md`](WIKI-ATTRIBUTION.md)
> for full provenance.

Syntheca is a schema specification and starter framework for using LLM agents as disciplined wiki maintainers. It documents how agents can incrementally build and maintain a structured wiki that compounds over time.

---

## What is Syntheca?

Syntheca is a **framework** with early alpha CLI tooling. It provides:

- **Schema**: Canonical templates, page types, and frontmatter specifications
- **Workflows**: Step-by-step procedures for ingest, query, lint, crystallization, and migration
- **Runtime adapter notes**: Instructions for applying the schema with Claude Code, Codex, OpenCode, and MCP-compatible agents
- **Starter examples**: Sample wiki demonstrating the pattern
- **Migration workflow**: Agent-guided preservation rules plus alpha dry-run reporting for existing knowledge bases
- **Alpha CLI**: `syntheca init`, `syntheca doctor`, `syntheca inspect`, `syntheca lint`, `syntheca lint --fix`, and `syntheca migrate --mode dry-run` for scaffolded workspace experiments

**What Syntheca is NOT**:
- ❌ A GUI application
- ❌ A vector database or search engine
- ❌ A replacement for Obsidian, Notion, or other note-taking apps
- ❌ A RAG system

**What Syntheca IS**:
- ✅ A schema and workflow specification
- ✅ A set of templates and conventions
- ✅ Documentation for agents to follow
- ✅ An agent-guided migration specification and deterministic dry-run report for unstructured knowledge bases
- ✅ An alpha CLI slice for workspace scaffolding, structural validation, deterministic inspection, deterministic lint, narrow mechanical lint fixes, and migration dry-run


## Responsibility Boundary

Syntheca deliberately splits deterministic tooling from semantic wiki work.

| Layer | Owns | Does Not Own |
|---|---|---|
| CLI | Workspace scaffold, structural validation, deterministic inspection, deterministic lint, narrow mechanical lint fixes, migration dry-run reports. | Source interpretation, page synthesis, contradiction resolution, broken-link intent, stale-claim judgment. |
| Agent | Ingest, query, crystallization, synthesis, link/index/log maintenance following `schema/`. | Destructive migration, unreviewed raw source changes, final authority on ambiguous claims. |
| Human | Review, approval, curation, destructive operation decisions, release decisions. | Repeating mechanical checks that the CLI can perform. |

---

## Core Concept: Compilation Over Retrieval

**Retrieval-first workflow**: Index sources → Retrieve relevant passages at query time → Generate an answer

**Persistent wiki workflow**: Ingest source → Extract and compile into structured pages → Query maintained wiki → Crystallize valuable answers back into wiki → Knowledge can compound

The wiki is a **persistent, compounding artifact**. The documented workflows guide agents through cross-reference maintenance, contradiction review, and synthesis updates while humans retain responsibility for curation and quality review.

---

## Quick Start

### Option A: Scaffold a workspace with the alpha CLI

From a checkout of this repository:

```bash
python -m pip install -e .
syntheca init my-wiki
syntheca doctor my-wiki
syntheca inspect my-wiki
syntheca lint my-wiki
syntheca lint my-wiki --fix
syntheca migrate --source ./my-vault --mode dry-run \
  --output-report ./migration-report.md \
  --output-manifest ./migration-manifest.json
```

Run the test suite with coverage:

```bash
python -m pip install -e '.[dev]'
python -m coverage run -m unittest
python -m coverage report
```

`syntheca init` creates an LLM-ready workspace with `README.md`, `AGENTS.md`,
optional runtime entrypoints and provider-native skills, `raw/`, `wiki/`, and
`.syntheca/syntheca.yaml`.

Project status: Alpha CLI available (`init`, `doctor`, `inspect`, `lint`,
`lint --fix`, `migrate --mode dry-run`). `lint --fix` is limited to mechanical
index/frontmatter fixes that do not require LLM judgment. Output/apply migration
modes and MCP server work are planned (not implemented in the current alpha CLI).

### Option B: Explore the schema-first starter

#### 1. Clone the repository

```bash
git clone https://github.com/swj9707/syntheca.git
cd syntheca
```

Stay at the repository root so your agent can read `schema/` and work with the
example knowledge base under `starter/`.

**Directory structure**:
```
starter/
├── raw/                    # Your source material (immutable)
│   └── sample-source.md
└── wiki/                   # Agent-maintained wiki
    ├── sources/            # Processed source pages
    ├── entities/           # Named objects (people, projects, products)
    ├── concepts/           # Reusable ideas and patterns
    ├── syntheses/          # Cross-source analysis
    ├── decisions/          # Important choices (ADR pattern)
    ├── unclassified/       # Ambiguous or legacy content
    ├── index.md            # Content catalog
    └── log.md              # Chronological operation record
```
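
The layout above can be checked mechanically. The following is a minimal sketch, assuming the directory and file names shown in the tree; `missing_layout_entries` is a hypothetical helper for illustration and is not how `syntheca doctor` is implemented.

```python
from pathlib import Path

# Names copied from the directory tree above. The check itself is an
# illustrative assumption, not the behavior of `syntheca doctor`.
EXPECTED_DIRS = [
    "raw",
    "wiki/sources",
    "wiki/entities",
    "wiki/concepts",
    "wiki/syntheses",
    "wiki/decisions",
    "wiki/unclassified",
]
EXPECTED_FILES = ["wiki/index.md", "wiki/log.md"]


def missing_layout_entries(root: Path) -> list[str]:
    """Return the expected relative paths that are absent under `root`."""
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_layout_entries(Path("starter"))` from the repository root should return an empty list if the starter matches the tree shown here.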

#### 2. Load the schema into your agent

**Claude Code**:
```
Read schema/AGENTS.md and schema/adapters/claude/CLAUDE.md
```

**Codex**:
```
Read schema/AGENTS.md and schema/adapters/codex/AGENTS.md
```

**OpenCode**:
```
Read schema/AGENTS.md and schema/adapters/opencode/AGENTS.md
```

If you maintain a local OpenCode skill for Syntheca, you can load it before
reading the schema. The v0.1 repository does not ship a skill package.

#### 3. Ingest a source

Add your source to `starter/raw/`, then tell the agent:

```
Use starter/ as the wiki root.
Ingest starter/raw/my-article.md into starter/wiki/.
```

Ask the agent to follow the documented ingest workflow:
1. Read the source
2. Create a source page with summary and key points
3. Extract entities (people, projects, products) → create entity pages
4. Extract concepts (ideas, patterns) → create concept pages
5. Add evidence-grounded cross-links and review whether reciprocal links are appropriate
6. Update `index.md` and `log.md`

#### 4. Query the wiki

```
What does the wiki say about [topic]?
```

The documented query workflow instructs the agent to search the wiki, synthesize an answer with citations, and evaluate whether the answer should be crystallized back into the wiki as a synthesis page.

#### 5. Explore the starter

Open `starter/wiki/index.md` to see:
- 2 source pages demonstrating ingest and cross-source synthesis
- 1 entity page (Syntheca framework)
- 3 concept pages (persistent wiki, crystallization, and three-layer architecture)
- 1 synthesis page (retrieval, persistent wiki, and hybrid comparison)

All pages are cross-referenced and follow the schema. Run the read-only v0.1 checklist in `docs/guides/starter-lint.md` after changing the starter wiki. See `docs/concepts/framework.md` for the framework overview and `docs/project/release-checklist.md` for the public release gate.

---

## Architecture

### Three Layers

1. **Raw Sources** (`raw/`): Immutable source material. Agents read from here but never modify.

2. **Wiki** (`wiki/`): Agent-maintained structured pages. Agents create, update, and cross-link pages following the schema.

3. **Schema** (`schema/`): Instructions defining how the wiki is maintained. Templates, workflows, and runtime adapters.

### Page Types

| Type | Purpose | Example |
|------|---------|---------|
| **Source** | Summarize raw sources, identify extraction candidates | `sources/article-2026-05-30.md` |
| **Entity** | Named objects: people, projects, products, libraries | `entities/syntheca.md` |
| **Concept** | Reusable ideas, patterns, methods | `concepts/persistent-wiki.md` |
| **Synthesis** | Cross-source analysis, comparisons, crystallized insights | `syntheses/wiki-vs-rag.md` |
| **Decision** | Important choices following the ADR pattern | `decisions/use-markdown.md` |
| **Unclassified** | Ambiguous or legacy content, safe preservation | `unclassified/legacy-note.md` |

### Core Workflows

**Ingest** (`schema/workflows/ingest.md`): Process raw sources → create/update pages → maintain cross-references

**Query** (`schema/workflows/query.md`): Search wiki → synthesize answer → evaluate crystallization

**Lint** (`schema/workflows/lint.md`): Health-check for orphans, broken links, contradictions, stale claims

**Crystallization** (`schema/workflows/crystallization.md`): File valuable query answers back into the wiki as synthesis/concept/decision pages

**Migration** (`schema/workflows/migration.md`): Guide an agent through safe imports with dry-run, classification heuristics, and unknown field preservation
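
Two of the lint checks named above, broken links and orphans, are mechanical enough to sketch in a few lines. The example below is illustrative only: it assumes pages use standard Markdown links to `.md` files resolved relative to the linking page, and it treats `index.md` and `log.md` as entry points. `lint_links` is a hypothetical function, not the implementation behind `syntheca lint`.

```python
import posixpath
import re

# Matches [text](target.md) and [text](target.md#anchor); captures the path.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")
ENTRY_POINTS = {"index.md", "log.md"}  # assumed never to count as orphans


def lint_links(pages: dict[str, str]) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (broken_links, orphans) for an in-memory wiki.

    `pages` maps wiki-relative paths to Markdown text.
    """
    broken: list[tuple[str, str]] = []
    linked: set[str] = set()
    for path, text in pages.items():
        for target in LINK_RE.findall(text):
            if "://" in target:
                continue  # external URL, out of scope for this sketch
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(path), target)
            )
            if resolved not in pages:
                broken.append((path, target))
            elif resolved != path:
                linked.add(resolved)  # self-links do not rescue an orphan
    orphans = sorted(
        p for p in pages if p not in linked and p not in ENTRY_POINTS
    )
    return broken, orphans
```

Contradictions and stale claims are deliberately absent from the sketch, since those checks need judgment rather than pattern matching.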

---

## Migration from Existing Wikis

The v0.1 agent-guided migration workflow is designed for:
- Obsidian vaults
- Notion exports
- Legacy markdown wikis
- Flat collections of notes

**Process**:
1. **Dry-run first**: Generate a report without modifying files
2. **Review classification**: Check page type assignments (source/entity/concept/synthesis/decision/unclassified)
3. **Apply with explicit approval**: Ask an agent to output to a new directory (safer) or modify files in place (destructive)
4. **Preserve unknowns**: Custom frontmatter fields are never deleted
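
The "preserve unknowns" rule can be illustrated as a merge that only adds fields and never removes or overwrites them. This is a sketch under stated assumptions: `normalize_frontmatter` and the `DEFAULTS` values are hypothetical, and the authoritative field list lives in `schema/frontmatter.md`.

```python
# Hypothetical defaults for illustration; see schema/frontmatter.md for the
# real field definitions.
DEFAULTS = {"type": "unclassified", "status": "active"}


def normalize_frontmatter(original: dict, defaults: dict = DEFAULTS) -> dict:
    """Fill in missing known fields without dropping or overwriting anything."""
    merged = dict(original)  # copy every original key, known or not
    for key, value in defaults.items():
        merged.setdefault(key, value)  # only add what is absent
    return merged
```

A legacy note with a custom `aliases` field keeps that field after normalization, and the input dictionary itself is left unmodified.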

**Alpha migration dry-run CLI**:
```bash
syntheca migrate --source ./my-vault --mode dry-run \
  --output-report ./migration-report.md \
  --output-manifest ./migration-manifest.json
```

Output/apply/in-place migration modes are planned (not implemented in the
current alpha CLI).

See `docs/guides/migration.md` for the full guide with examples,
`docs/guides/cli-reference.md` for command details, and
`docs/guides/configuration.md` for generated workspace settings.

---

## Runtime Compatibility

Syntheca is designed to be **runtime-neutral**. The canonical schema (`schema/AGENTS.md`) and adapter notes describe how to apply the framework across different agent systems; v0.1 does not claim tested, automated support for every runtime.

| Runtime | Adapter Focus | Relevant Runtime Capabilities |
|---------|----------|--------------|
| **Claude Code** | Conversational schema application | Runtime-specific capabilities vary by setup |
| **Codex** | Repository-oriented schema application | CLI, IDE, cloud/web, app; optional GitHub and SDK integration |
| **OpenCode** | Multi-agent schema application | Delegation, background execution, and skills depend on runtime configuration |
| **MCP Generic** | Optional external integration path | Capabilities depend on the selected MCP servers |

See `docs/concepts/capability-matrix.md` for detailed comparison.

### Adapter Documents

- **Claude Code**: `schema/adapters/claude/CLAUDE.md`
- **Codex**: `schema/adapters/codex/AGENTS.md`
- **OpenCode**: `schema/adapters/opencode/AGENTS.md`

Each adapter explains how to apply the canonical schema in that runtime without redefining policy.

---

## Use Cases

### Personal Research
Track papers, articles, and notes over weeks or months. Maintained pages can build a more comprehensive picture as sources accumulate and are reviewed.

### Reading Companion
Use the documented workflow to file chapter summaries, character pages, and theme pages as you read. The wiki can grow into a cross-referenced reading companion with agent assistance and human review.

### Team Knowledge Base
Use meeting transcripts, project documents, and customer notes as curated inputs. The documented workflow guides an agent to maintain project entities, recurring concepts, and reviewed decisions.

### Competitive Analysis
Track competitors, products, and market trends. The documented workflow guides agents to consolidate sources and surface potential contradictions for review.

### Course Notes, Trip Planning, Hobby Deep-Dives
Anything where you accumulate knowledge over time and want it organized, not scattered.

---

## Schema Highlights

### Frontmatter Standards

Every page has structured metadata:

```yaml
---
title: "Page Title"
type: concept
status: active
created: "2026-05-30"
updated: "2026-05-30"
sources: ["sources/article.md"]
related: ["concepts/other-concept.md"]
tags: [tag1, tag2]
---
```

See `schema/frontmatter.md` for canonical field definitions per page type.
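
To make the structure concrete, here is a minimal sketch that splits a page into frontmatter and body and reports missing required fields. It assumes `type` and `status` are the required baseline (as listed in the roadmap) and uses a deliberately naive top-level key scan to stay dependency-free; real tooling would more likely parse the block with a YAML library such as PyYAML's `yaml.safe_load`. All function names here are hypothetical.

```python
REQUIRED_FIELDS = ("type", "status")  # baseline named in the roadmap


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a page into (frontmatter_text, body); frontmatter may be empty."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # unterminated block: treat as no frontmatter


def top_level_keys(frontmatter: str) -> set[str]:
    """Collect top-level `key:` names (a simplification, not a YAML parser)."""
    keys: set[str] = set()
    for line in frontmatter.splitlines():
        if not line or line[0].isspace() or line[0] in "#-":
            continue  # skip blanks, nested lines, comments, list items
        if ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return keys


def missing_required(text: str) -> list[str]:
    """List required fields that the page's frontmatter does not define."""
    frontmatter, _body = split_frontmatter(text)
    present = top_level_keys(frontmatter)
    return [field for field in REQUIRED_FIELDS if field not in present]
```

A page with no frontmatter at all is reported as missing both fields, which keeps the check usable on unmigrated notes.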

### Template Structure

All templates follow a consistent pattern:
1. **Frontmatter**: Structured metadata
2. **Required sections**: Definition, explanation, relationships, sources
3. **Optional sections**: Examples, counterpoints, open questions, contradictions
4. **Authoring rules**: Guidelines for agent behavior
5. **Minimum completion criteria**: Validation checklist

### Quality Standards

Adopted from [LLM Wiki v2](https://github.com/rohitg00/agentmemory):

- **Confidence scoring groundwork**: Synthesis pages may record a manually assigned page-level evidence score
- **Contradiction handling**: Explicit sections for conflicting claims
- **Uncertainty preservation**: Limitations and open questions are required
- **Evidence grounding**: Claims must cite sources
- **Crystallization threshold**: "Would I want to read this 6 months from now?"

---

## Roadmap

### v0.1 (Current)
- ✅ Core schema and templates
- ✅ Five workflows (ingest, query, lint, crystallization, migration)
- ✅ Runtime adapter notes (Claude Code, Codex, OpenCode)
- ✅ Starter examples
- ✅ Agent-guided migration specification with dry-run

### v0.2 alpha (In progress)
- [x] CLI package skeleton
- [x] `syntheca init` workspace scaffold
- [x] `syntheca doctor` structural validation
- [x] `syntheca inspect` deterministic workspace summary
- [x] `syntheca lint` deterministic wiki checks
- [x] `syntheca lint --fix` narrow mechanical fixes
- [x] `syntheca migrate --mode dry-run` report and manifest generation
- [x] Machine-readable baseline checks for required `type`/`status`, links, raw source paths, and index coverage
- [ ] Extended frontmatter validation beyond the deterministic baseline
- [ ] Migration output/apply modes
- [ ] Initial MCP interface design

### v0.3 (Future)
- [ ] MCP server implementation
- [ ] Vector and hybrid search integrations
- [ ] Link graph visualization
- [ ] Extended stale-page and supersession checks
- [ ] Claim-level confidence and contradiction review experiments
- [ ] Multi-wiki federation

See `docs/project/roadmap.md` for detailed feature plans.

---

## Project Status

**Current**: Pre-release (v0.1 schema and documentation framework plus v0.2 alpha CLI init/doctor/inspect/lint/lint-fix/migrate dry-run commands)

**Not ready for**:
- Production team wikis without human review
- Large wikis without separately evaluated search and review infrastructure
- Automated workflows without supervision

**Suitable for exploratory evaluation**:
- Small personal research wiki experiments
- Trying the documented workflows with Claude Code, Codex, or OpenCode
- Creating an experimental workspace with `syntheca init`
- Agent-guided migration experiments (dry-run workflow)
- Schema evaluation and feedback

---

## Contributing

See `CONTRIBUTING.md` for guidelines.

**High-priority contributions**:
- Runtime adapter notes and evaluation reports for other agents
- Production experience reports (what works, what breaks)
- Planned migration tooling design and implementation
- Template refinements based on usage

---

## Attribution

Syntheca is derived from:

- **Andrej Karpathy's [LLM Wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f)**: Original persistent wiki pattern, three-layer architecture, core operations
- **[agentmemory](https://github.com/rohitg00/agentmemory)** (rohitg00 et al.): Production lessons, confidence scoring, quality standards

See `REFERENCES.md` and `WIKI-ATTRIBUTION.md` for detailed attribution.

Syntheca's contribution: formalization, cross-runtime compatibility, an agent-guided migration specification, and concrete templates/workflows.

---

## License

MIT License. See `LICENSE` for details.

---

## Links

- **Documentation hub**: [`docs/README.md`](docs/README.md)
- **한국어 문서**: [`README.ko.md`](README.ko.md) and [`docs/ko/`](docs/ko/)
- **Schema**: `schema/`
- **Starter examples**: `starter/`
- **CLI reference**: `docs/guides/cli-reference.md`
- **Configuration reference**: `docs/guides/configuration.md`
- **Migration guide**: `docs/guides/migration.md`
- **Capability matrix**: `docs/concepts/capability-matrix.md`
- **References**: `REFERENCES.md`

---

## Questions?

- **How is this different from RAG?** See `starter/wiki/syntheses/persistent-wiki-vs-rag.md`
- **How do I migrate my Obsidian vault?** See `docs/guides/migration.md`
  and `docs/ko/migration.md`
- **Which runtime should I use?** See `docs/concepts/capability-matrix.md`
- **What are the page types?** See `schema/page-types.md`
- **What are the templates?** See `schema/templates/*.md`

---

**Start small, let it compound.**

The schema guides agents through bookkeeping tasks while you curate sources and review quality. Over time, the wiki can become a richer, cross-referenced knowledge base.
