Metadata-Version: 2.4
Name: chimera-memory
Version: 0.1.0
Summary: Local-first reliability ledger for AI coding-agent work
License: MIT License
        
        Copyright (c) 2026 Chimera / ORIAS
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.12
Requires-Dist: chimera-memory-types
Requires-Dist: filelock>=3.0
Description-Content-Type: text/markdown

# Chimera Memory

Local-first reliability ledger for AI coding-agent work.

Chimera Memory records what an agent tried, which command checked it, what happened, and what receipt proves it. It runs entirely on your machine.

---

## What it records

Each wrapped verification command produces a **claim** with:

- `session_id` — which work session it belongs to
- `agent_id / model_version / harness_id` — who ran it and in what tool
- `task_type` — what kind of work (`test`, `lint`, `type`, `docs`, …)
- `VALIDATED` or `CONTRADICTED` — did reality agree with the prediction?
- `stdout_excerpt / stderr_excerpt` — bounded output witness when available
- git state at time of claim

Sessions, claims, and outcomes are stored in `.chimera-memory/` as append-only JSONL files. Each new claim gets an integrity chain entry in `.chimera-memory/integrity.jsonl`.
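
As a rough illustration of the append-only layout, the sketch below writes one claim per line to a JSONL file and reads the file back. The `Claim` dataclass, the `claims.jsonl` file name used in the usage note, and the flat field layout are assumptions made for this example; the actual on-disk schema may differ.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Claim:
    # Field names mirror the list above; the flat layout is an assumption.
    session_id: str
    agent_id: str
    model_version: str
    harness_id: str
    task_type: str
    outcome: str  # "VALIDATED" or "CONTRADICTED"
    stdout_excerpt: str = ""
    stderr_excerpt: str = ""


def append_claim(path: Path, claim: Claim) -> None:
    """Append one claim as a single JSON line; existing lines are never rewritten."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(claim), sort_keys=True) + "\n")


def read_claims(path: Path) -> list[dict]:
    """Read every non-empty line back as a dict."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Calling `append_claim(Path(".chimera-memory/claims.jsonl"), claim)` twice leaves two lines in the file, oldest first. Opening in append mode is what keeps earlier records untouched.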

---

## Getting started

See [docs/strategy/chimera-memory-first-run-quickstart.md](../../docs/strategy/chimera-memory-first-run-quickstart.md) for a step-by-step guide.

**Key limitations (v0.6):**
- M2B drift scoring: not built
- Model ranking or routing: not built
- Hosted/cloud sync: not built
- Evidence write import: dry-run only
- Windows: not tested

Run from the repo root:

```bash
uv run chimera-memory --help
```

### Standalone local install (v0.6+)

As of v0.6, local wheel builds work outside the monorepo. Build and install:

```bash
# From the repo root, build local wheels for the two public packages
uv build packages/chimera-memory-types --out-dir /tmp/cm-dist
uv build packages/chimera-memory       --out-dir /tmp/cm-dist

# In any Python 3.12+ environment
pip install /tmp/cm-dist/*.whl
chimera-memory --help
```

Runtime dependencies installed automatically: `pydantic`, `filelock`.

**Note:** Public PyPI publishing has not happened yet. This is local packaging readiness only.
Hosted/cloud sync, team SaaS, and remote substrate writes are not implemented.
Reliability is not model routing. M2B is not implemented.

---

## Quickstart

```bash
# 1. Start a session
uv run chimera-memory session start \
  --branch feat/my-branch \
  --task-label "fix-type-errors" \
  --agent kiro \
  --model claude-sonnet-4.6 \
  --harness-id kiro-cli

# 2. Wrap real verification commands
uv run chimera-memory wrap --task-type type -- \
  uv run mypy packages/chimera-memory/src --ignore-missing-imports
uv run chimera-memory wrap --task-type test -- \
  uv run pytest packages/chimera-memory/tests -q
uv run chimera-memory wrap --task-type lint -- \
  uv run ruff check packages/chimera-memory

uv run chimera-memory session end --status PASSED

# 3. Review
uv run chimera-memory failures          # see what failed, with witness output
uv run chimera-memory status            # dogfood gate progress
uv run chimera-memory receipt latest --markdown   # share-ready proof artifact
```

The `--` separator is required before non-pytest commands to tell the argument parser where the wrapped command begins.
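
To see why the separator matters, here is a minimal sketch of splitting an argument list at the first `--`. The helper `split_wrapped_command` is hypothetical and is not the CLI's actual parser; it only shows the general convention.

```python
def split_wrapped_command(argv: list[str]) -> tuple[list[str], list[str]]:
    """Split argv at the first "--" into (wrapper options, wrapped command)."""
    if "--" not in argv:
        return argv, []
    idx = argv.index("--")
    return argv[:idx], argv[idx + 1:]


own, wrapped = split_wrapped_command(
    ["--task-type", "lint", "--", "uv", "run", "ruff", "check", "packages/chimera-memory"]
)
print(own)      # ['--task-type', 'lint']
print(wrapped)  # ['uv', 'run', 'ruff', 'check', 'packages/chimera-memory']
```

Without the separator, a flag such as `--ignore-missing-imports` could belong to either the wrapper or the wrapped command, and the parser has no way to tell.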

---

## Common commands

```bash
chimera-memory session start   --branch ... --task-label ... --agent ... --model ... --harness-id ...
chimera-memory wrap            --task-type <type> [-- <command>]
chimera-memory session end     --status PASSED|FAILED|MIXED|INTERRUPTED
chimera-memory failures        [--json]
chimera-memory status          [--json]
chimera-memory verify          [--json]
chimera-memory receipt latest  [--json] [--markdown]
chimera-memory receipt show    <session_id> [--json] [--markdown]
chimera-memory export          --clean-only --output <path>
chimera-memory session list
chimera-memory report          # raw reliability groups (use status for gate progress)
```

---

## Demo: real failure-fix loop

During development, `mypy` found a real type error:

```
chimera_memory/cli.py:164: error: Value of type "object" is not indexable  [index]
```

Chimera Memory recorded the mypy run as `CONTRADICTED`, with the error stored as `stdout_excerpt`. The code was fixed. The same mypy command ran again and settled as `VALIDATED`. A single session receipt showed both runs.

```
uv run mypy ...  [type] → CONTRADICTED
uv run mypy ...  [type] → VALIDATED
```

This is the core loop: reality contradicted the agent, the fix was applied, and the correction was verified — all in one session, with attribution.
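
The loop can be approximated in a few lines. This sketch assumes the prediction is always "the command exits with status 0" and that the outcome is derived from the exit code alone; both are assumptions, as are the `run_and_settle` helper and the `MAX_EXCERPT_CHARS` bound.

```python
import subprocess
import sys

MAX_EXCERPT_CHARS = 2000  # hypothetical bound on stored witness output


def run_and_settle(command: list[str]) -> dict:
    """Run a command and settle it, assuming the prediction was 'exits with 0'."""
    proc = subprocess.run(command, capture_output=True, text=True)
    outcome = "VALIDATED" if proc.returncode == 0 else "CONTRADICTED"
    return {
        "command": command,
        "outcome": outcome,
        "stdout_excerpt": proc.stdout[:MAX_EXCERPT_CHARS],
        "stderr_excerpt": proc.stderr[:MAX_EXCERPT_CHARS],
    }


# Stand-ins for "mypy before the fix" and "mypy after the fix".
failing = run_and_settle([sys.executable, "-c", "print('error: boom'); raise SystemExit(1)"])
passing = run_and_settle([sys.executable, "-c", "print('ok')"])
print(failing["outcome"], "->", passing["outcome"])  # CONTRADICTED -> VALIDATED
```

The failing run keeps its printed error in `stdout_excerpt`, which is the role the witness output plays in a receipt.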

---

## Reliability command

```bash
chimera-memory reliability
chimera-memory reliability --json
```

Read-only. Reports raw validation rates from settled clean claims, grouped by
agent/model/task. Does not rank models, route work, or make autonomy decisions.
Failure-quality classification (organic vs synthetic) is not yet stored in
claim metadata — rates include all CONTRADICTED outcomes.
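
A raw validation rate of this kind is simply validated claims divided by settled claims within each group. The sketch below assumes claims are plain dicts with `agent_id`, `model_version`, `task_type`, and `outcome` keys; the function name and the dict layout are illustrative, not the command's implementation.

```python
from collections import defaultdict


def validation_rates(claims: list[dict]) -> dict[tuple[str, str, str], float]:
    """Return VALIDATED / total per (agent_id, model_version, task_type) group."""
    totals: dict[tuple[str, str, str], int] = defaultdict(int)
    validated: dict[tuple[str, str, str], int] = defaultdict(int)
    for claim in claims:
        if claim["outcome"] not in ("VALIDATED", "CONTRADICTED"):
            continue  # skip anything that has not settled
        key = (claim["agent_id"], claim["model_version"], claim["task_type"])
        totals[key] += 1
        if claim["outcome"] == "VALIDATED":
            validated[key] += 1
    return {key: validated[key] / totals[key] for key in totals}
```

Because every `CONTRADICTED` outcome counts against the rate, a group with synthetic failures mixed in will look worse than its organic record alone would suggest.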

---

## Bridge pipeline (dry-run only)

Export clean evidence events and preview engine ingestion without any writes:

```bash
# Export settled, attributed claims as JSONL
uv run chimera-memory export --clean-only --output /tmp/cm-clean-events.jsonl

# Validate the export schema
uv run python -m tools.chimera_memory_ingest_dry_run /tmp/cm-clean-events.jsonl

# Map to engine evidence candidates (no database/substrate writes)
uv run python -m tools.chimera_memory_engine_adapter_dry_run /tmp/cm-clean-events.jsonl
```

Both bridge tools are dry-run only. `writes_performed` is always `false`.

---

## Two-model evidence (local v0.2)

As of v0.2, the store contains evidence from two real AI coding agents:

| Agent | Model | Claims | Real failures |
|---|---|---|---|
| kiro | claude-sonnet-4.6 | 123 | 2 organic |
| codebuff | mimo-v2.5-pro | 12 | 2 real |

`manual` and `planning-agent` entries also exist for workflow/planning tasks.

M2 comparative reliability scoring is not yet built. The data is structurally
ready for it once codebuff accumulates ≥25 claims with ≥5 organic failures.

---

## Integrity

New claims are hash-chained into `.chimera-memory/integrity.jsonl`. Run:

```bash
uv run chimera-memory verify
uv run chimera-memory verify --json
```

Historical records created before the integrity layer was added are reported
as `LEGACY_UNSIGNED`: they are not broken, only unchained.
`verify` reports `BROKEN` only if a chained record's hash does not match or a
claim was appended without a corresponding integrity entry.
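
For readers unfamiliar with hash chaining, the sketch below shows the general technique: each entry's hash covers the previous entry's hash plus the record itself, so changing any earlier record invalidates everything after it. SHA-256, the `GENESIS` value, the entry layout, and the `"OK"` status label are all assumptions for illustration; the actual format of `integrity.jsonl` is not documented here.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON form of the record."""
    payload = prev_hash + json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records: list[dict]) -> list[dict]:
    """Produce one integrity entry per record, each linked to the one before it."""
    entries, prev = [], GENESIS
    for record in records:
        digest = entry_hash(prev, record)
        entries.append({"prev_hash": prev, "hash": digest})
        prev = digest
    return entries


def verify_chain(records: list[dict], entries: list[dict]) -> str:
    """Return "OK" or "BROKEN" for a fully chained store."""
    if len(records) != len(entries):
        return "BROKEN"  # a record without an integrity entry, or the reverse
    prev = GENESIS
    for record, entry in zip(records, entries):
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, record):
            return "BROKEN"
        prev = entry["hash"]
    return "OK"
```

This sketch covers only the two `BROKEN` conditions described above. Handling records that predate the chain (the `LEGACY_UNSIGNED` case) would need an extra marker for where chaining starts.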

**Single-process assumption:** the store is designed for single-process local
use. Concurrent writes from separate processes can race, and the store is not
hardened against that. Running one CLI command at a time stays within this
assumption.

---

## Known invocation gotcha

When wrapping `mypy` with `--task-type type`, pass only the adapter tool (not the
importer) as an extra source file; listing both triggers a "source file found twice" error:

```bash
# Correct — adapter transitively pulls in importer
uv run chimera-memory wrap --task-type type -- \
  uv run mypy packages/chimera-memory/src \
    tools/chimera_memory_engine_adapter_dry_run.py \
    --ignore-missing-imports

# Incorrect — causes "source file found twice" mypy error
uv run chimera-memory wrap --task-type type -- \
  uv run mypy packages/chimera-memory/src \
    tools/chimera_memory_ingest_dry_run.py \
    tools/chimera_memory_engine_adapter_dry_run.py \
    --ignore-missing-imports
```

---

## Release status

- **Local wheel proof exists.** The 2-package install (`chimera-memory` + `chimera-memory-types`) works in a fresh venv outside the monorepo.
- **Public PyPI: not published.** TestPyPI and private registry publishing have not been done.
- **License decision pending.** No open-source license has been formally assigned yet.
- **Witness output is redacted by default.** `chimera-memory wrap` redacts secrets (API keys, tokens, passwords, private keys) from captured stdout/stderr before storing. Review receipts before sharing.
- **Command argument redaction.** The command string in receipts is also redacted for common secret patterns. However, do not pass literal secret values as command arguments — use environment variables instead (e.g. `MY_TOKEN=secret uv run mypy ...`).
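
As an illustration of pattern-based redaction in general, the sketch below masks values that follow secret-looking key names and replaces PEM private key blocks. The patterns, the `redact` function, and the placeholder text are hypothetical and do not reflect the rules `chimera-memory wrap` actually applies.

```python
import re

# Hypothetical patterns; a real redactor would need a broader, tested set.
KEY_VALUE = re.compile(
    r"(?i)(\w*(?:api[_-]?key|token|password|secret)\w*)(\s*[=:]\s*)(\S+)"
)
PRIVATE_KEY = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)


def redact(text: str) -> str:
    """Replace likely secret values with a placeholder, keeping the key name."""
    text = KEY_VALUE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return PRIVATE_KEY.sub("[REDACTED PRIVATE KEY]", text)


print(redact("login ok, token=abc123 expires soon"))
# login ok, token=[REDACTED] expires soon
```

Pattern matching of this kind misses secrets that do not follow a recognizable `name=value` shape, which is why reviewing receipts before sharing still matters.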

## Platform support

- **macOS and Linux** are the target platforms. Tested on macOS arm64.
- **Windows is not supported.** Write-path locking uses `filelock` (cross-platform), but Windows has not been tested end-to-end.
- **Linux Docker smoke test** is pending (Docker unavailable in current environment).

## What is not built

The following are explicitly out of scope for v0.6:

- M2B statistical drift/trend analysis (only an advisory heuristic exists)
- Routing, autonomy decisions, or model ranking
- Cloud/hosted sync or team sharing
- Dashboard, GitHub Action, or CI PR comments
- ORIAS, trading/finance verticals
- GraphSource/substrate writes

## Current limitations

- **M2 drift/model comparison is not fully built.** An advisory drift heuristic exists (`chimera-memory drift`), but statistical M2B drift/trend analysis is not built. `status` shows segment counts, not trends.
- **Useful reliability patterns require real failure variance.** A store of only `VALIDATED` claims proves capture works, not that any agent is reliable.
- **The workflow is CLI/manual, not always-on.** You must start sessions and wrap commands explicitly.
- **Local-first.** Nothing leaves your machine. Data lives in `.chimera-memory/` inside the repo.
- **Synthetic failures should not be treated as product evidence.** Only naturally occurring failures count.
- **Single-process writes only.** The integrity chain is not safe for concurrent multi-process writes.

---

## More detail

See the full demo quickstart with the failure-fix loop walkthrough:
[`../../docs/strategy/chimera-memory-demo-quickstart-2026-06-04.md`](../../docs/strategy/chimera-memory-demo-quickstart-2026-06-04.md)
