Metadata-Version: 2.4
Name: agentcapsule
Version: 0.1.0
Summary: Agent Capsule Protocol for inspectable text-native artifact transfer
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/arikyp/agentcapsule
Project-URL: Repository, https://github.com/arikyp/agentcapsule
Project-URL: Issues, https://github.com/arikyp/agentcapsule/issues
Project-URL: Documentation, https://github.com/arikyp/agentcapsule/tree/main/docs
Keywords: agents,ai-agents,artifact-transfer,protocol,security,provenance,signatures,base64,governance
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Communications
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: signing
Requires-Dist: cryptography<47,>=46; extra == "signing"
Dynamic: license-file

# Agent Capsule

[![CI](https://github.com/arikyp/agentcapsule/actions/workflows/ci.yml/badge.svg)](https://github.com/arikyp/agentcapsule/actions/workflows/ci.yml)

Agent Capsule Protocol V0 is an inspectable, verifiable artifact format for
moving exact machine-readable payloads through agent and text-native channels:
chat, tickets, prompts, email, GitHub issues, A2A messages, MIME attachments,
and agent traces.

The default capsule path is plain Base64 plus metadata, SHA256 verification,
optional signatures, local policy checks, and sandbox unpacking. An experimental
carrier-shaping backend exists for research, but it is not required for normal
capsule use.
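
As a rough illustration of the default path, the sketch below wraps a payload in Base64 together with a SHA256 digest and a byte count, then checks both before returning the bytes. The JSON envelope and its field names (`encoding`, `sha256`, `byte_count`, `payload`) are assumptions made for this example; the actual envelope is described in the protocol document, not here.

```python
import base64
import hashlib
import json


def pack_capsule(payload: bytes) -> str:
    """Wrap payload bytes in a hypothetical text envelope (not the real format)."""
    envelope = {
        "encoding": "base64",
        "sha256": hashlib.sha256(payload).hexdigest(),
        "byte_count": len(payload),
        "payload": base64.b64encode(payload).decode("ascii"),
    }
    # sort_keys keeps the text identical for identical input.
    return json.dumps(envelope, sort_keys=True, indent=2)


def unpack_capsule(text: str) -> bytes:
    """Decode the envelope and refuse to return bytes that fail verification."""
    envelope = json.loads(text)
    payload = base64.b64decode(envelope["payload"], validate=True)
    if len(payload) != envelope["byte_count"]:
        raise ValueError("byte count mismatch")
    if hashlib.sha256(payload).hexdigest() != envelope["sha256"]:
        raise ValueError("SHA256 mismatch")
    return payload


capsule_text = pack_capsule(b"agent handoff state\n")
restored = unpack_capsule(capsule_text)
print(restored == b"agent handoff state\n")
```

Because the metadata stays in plaintext, a reader can inspect the digest and size without decoding the payload.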

## Current Status

Agent Capsule V0 is the product-facing layer in this repository.

- Base64 capsules are the primary stable path.
- Capsules can be delivered inline, as attachments, or by reference descriptor.
- Directory bundles are deterministic JSON with per-file SHA256 and byte
  counts.
- HMAC-SHA256 and optional Ed25519 signatures are supported for authenticity
  experiments.
- Local policy, scan, audit, and trust-registry flows are implemented.
- Runtime Base64 capsule encode/decode is dependency-free Python.
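
The directory bundle idea above can be sketched as follows: walk a tree in sorted order and record each file's relative path, SHA256, and byte count in JSON with a fixed key order. The key names (`files`, `path`, `sha256`, `bytes`) are hypothetical and only illustrate why sorting and fixed separators make the output deterministic.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def bundle_manifest(root: Path) -> str:
    """Build a deterministic JSON inventory for every file under root."""
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sorting by POSIX-style relative path removes filesystem ordering effects.
    files.sort(key=lambda p: p.relative_to(root).as_posix())
    entries = []
    for path in files:
        data = path.read_bytes()
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "sha256": hashlib.sha256(data).hexdigest(),
            "bytes": len(data),
        })
    # Fixed key order and separators so identical trees give identical text.
    return json.dumps({"files": entries}, sort_keys=True, separators=(",", ":"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes").mkdir()
    (root / "notes" / "b.txt").write_bytes(b"beta")
    (root / "a.txt").write_bytes(b"alpha")
    manifest = bundle_manifest(root)
    manifest_again = bundle_manifest(root)

print(manifest == manifest_again)
```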

## What Works

- Base64 capsule pack, inspect, verify, scan, and unpack flows.
- Signed capsule verification with HMAC and optional Ed25519.
- Agent handoff manifests with file inventory, requested capabilities, policy
  hints, and delivery mode.
- Inline, attachment, and reference delivery metadata.
- Byte-perfect encode/decode roundtrip for the tested payload sizes and demos.
- Deterministic output for identical payload, model, and settings.
- Model fingerprint checks before decode.
- Copy/paste armour with version, model fingerprint, and settings.
- CLI file encode/decode.
- CRC32 corruption detection inside the payload frame.
- Unit, stress, golden, and end-to-end verification tests.
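
To illustrate the model fingerprint check listed above, here is a minimal sketch that hashes a canonical JSON form of a model and refuses to decode when the recorded fingerprint differs. The hashing scheme and the toy model layout are assumptions for this example; the project's real fingerprint definition may differ.

```python
import hashlib
import json


def model_fingerprint(model: dict) -> str:
    """Hash a canonical JSON form of the model (assumed scheme)."""
    canonical = json.dumps(model, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_before_decode(model: dict, expected_fingerprint: str) -> None:
    """Refuse to decode when the local model differs from the encoder's model."""
    actual = model_fingerprint(model)
    if actual != expected_fingerprint:
        raise ValueError(
            f"model fingerprint mismatch: {actual} != {expected_fingerprint}"
        )


# Hypothetical toy model; real model files have their own structure.
toy_model = {"order": 1, "vocab": ["a", "b"], "counts": {"a": {"b": 3}}}
recorded = model_fingerprint(toy_model)
check_before_decode(toy_model, recorded)  # passes silently
```

Checking the fingerprint first matters because decoding with a different model would reconstruct different bits from the same carrier text.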

## What Does Not Yet Work

- Production identity, central trust registry, or remote policy service.
- Encryption or privacy.
- Large-file distribution as an inline capsule.
- Semantically meaningful prose generation.
- Steganography-grade secrecy.
- Compression superiority over Base64.
- Large-file archival confidence.
- GPU-scale model training in the runtime path.

## Agent Capsule Quickstart

Use a virtual environment for local development:

```bash
python3 -m venv .venv
.venv/bin/python -m pip install -e .
```

This exposes the `agentcapsule` and `capsule` commands for Agent Capsules.

Create, inspect, verify, and unpack a Base64 capsule:

```bash
printf 'agent handoff state\n' > payload.txt
agentcapsule pack payload.txt --out capsule.txt
agentcapsule inspect capsule.txt
agentcapsule verify capsule.txt
agentcapsule unpack capsule.txt --out decoded
cmp payload.txt decoded/payload.txt
```

Create a signed handoff capsule with explicit manifest metadata:

```bash
CAPSULE_HMAC_KEY='shared secret' agentcapsule pack payload.txt \
  --out capsule.txt \
  --created-by agent-a \
  --task-id abc123 \
  --requested-capability read_files \
  --requested-capability run_tests \
  --delivery-mode inline \
  --sign-key-env CAPSULE_HMAC_KEY
CAPSULE_HMAC_KEY='shared secret' agentcapsule verify capsule.txt --key-env CAPSULE_HMAC_KEY
```
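
For readers unfamiliar with HMAC-SHA256, the following sketch shows the underlying primitive using only the Python standard library. Which bytes `agentcapsule` actually signs is not specified here, so signing the whole capsule text is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac
import os


def sign_capsule(capsule_text: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the capsule text (assumed scope)."""
    return hmac.new(key, capsule_text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_capsule(capsule_text: str, key: bytes, signature: str) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = sign_capsule(capsule_text, key)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, signature)


# Mirrors the environment variable used above; the fallback is for the demo only.
key = os.environ.get("CAPSULE_HMAC_KEY", "shared secret").encode("utf-8")
tag = sign_capsule("example capsule body", key)
print(verify_capsule("example capsule body", key, tag))   # True
print(verify_capsule("tampered capsule body", key, tag))  # False
```

HMAC is symmetric: anyone who can verify can also sign, so it shows that a capsule came from a holder of the shared key, not from a specific party.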

Emit a reference descriptor when the capsule will be stored out of band:

```bash
agentcapsule reference capsule.txt \
  --uri https://example.test/capsules/capsule.txt \
  --json
```

For a shorter developer path, see [docs/QUICKSTART.md](docs/QUICKSTART.md).
For installation packaging, see [docs/INSTALL.md](docs/INSTALL.md).
For release and distribution planning, see
[docs/RELEASE_DISTRIBUTION.md](docs/RELEASE_DISTRIBUTION.md).

## Agent Capsule Commands

Capsules are inspectable text artifacts that wrap exact machine-readable
payloads with plaintext metadata, SHA256 verification, and safe unpack flows.
The command examples below use Base64 unless a different backend is selected
explicitly. `capsule` remains an alias for `agentcapsule`.

```bash
# Pack the demo handoff, then inspect, verify, unpack, and scan the capsule.
agentcapsule pack examples/agent_capsule_demo/handoff --out capsule.txt
agentcapsule inspect capsule.txt
agentcapsule verify capsule.txt
agentcapsule unpack capsule.txt --out decoded
agentcapsule scan capsule.txt

# Other subcommands and JSON output.
agentcapsule codecs
agentcapsule inspect capsule.txt --json

# Sign with HMAC using a key read from the environment.
CAPSULE_HMAC_KEY='shared secret' agentcapsule pack payload.bin --out capsule.txt --sign-key-env CAPSULE_HMAC_KEY

# Generate an Ed25519 key pair, sign, and verify by public key or registry.
agentcapsule keys generate --private-key publisher.key --public-key publisher.pub
agentcapsule pack payload.bin --out capsule.txt --sign-ed25519-key publisher.key --signature-key-id publisher
agentcapsule verify capsule.txt --ed25519-public-key publisher.pub
agentcapsule verify capsule.txt --signature-registry trusted-keys.json

# Emit a governance audit event during verification.
agentcapsule verify capsule.txt --audit-json
```

Core capsule, HMAC, and Base64 workflows work with the default
dependency-free install. Ed25519 demos and tests require the optional signing
extra:

```bash
python3 -m pip install -e ".[signing]"
```

For Ed25519, distinguish validity from trust: `signature_verification: ok`
means the capsule was signed by the matching key; `signature_trust.status:
trusted` means the key also passed local registry and policy checks.
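
The distinction can be sketched as two independent results. In the example below, the cryptographic check is represented by a boolean input, and the registry is a hypothetical mapping from key ID to public key hex; the `failed` and `untrusted` values are also assumptions, since only `ok` and `trusted` are named above.

```python
def evaluate_signature(
    signature_ok: bool,
    key_id: str,
    public_key_hex: str,
    registry: dict[str, str],
) -> dict:
    """Report validity and trust separately (hypothetical result layout)."""
    # Trust requires a valid signature AND a registry entry matching this key.
    trusted = signature_ok and registry.get(key_id) == public_key_hex
    return {
        "signature_verification": "ok" if signature_ok else "failed",
        "signature_trust": {"status": "trusted" if trusted else "untrusted"},
    }


# Placeholder values for illustration only.
registry = {"publisher": "aa" * 32}

known = evaluate_signature(True, "publisher", "aa" * 32, registry)
unknown = evaluate_signature(True, "stranger", "bb" * 32, registry)
print(known)
print(unknown)
```

The second result is the important case: a capsule can carry a mathematically valid signature from a key the local policy has no reason to trust.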

`capsule scan --json` emits typed findings with source locations for governance
logs and agent traces. `--audit-json` on `inspect`, `verify`, `unpack`, and
`scan` emits a consistent allow/review/block governance event.

See [docs/AGENT_CAPSULE_PROTOCOL_V0.md](docs/AGENT_CAPSULE_PROTOCOL_V0.md) and
[docs/AGENT_CAPSULE_PRODUCT_BRIEF.md](docs/AGENT_CAPSULE_PRODUCT_BRIEF.md).
Related documents:

- Security assumptions, HMAC limits, and governance policy examples:
  [docs/AGENT_CAPSULE_THREAT_MODEL.md](docs/AGENT_CAPSULE_THREAT_MODEL.md).
- Public-key signing proposal:
  [docs/AGENT_CAPSULE_ED25519_DESIGN.md](docs/AGENT_CAPSULE_ED25519_DESIGN.md).
- Structured governance events:
  [docs/AGENT_CAPSULE_AUDIT_LOG_V0.md](docs/AGENT_CAPSULE_AUDIT_LOG_V0.md).
- Observable agent-to-agent handoff experiment:
  [docs/AGENT_TO_AGENT_HANDOFF_DEMO.md](docs/AGENT_TO_AGENT_HANDOFF_DEMO.md).
  The handoff demo writes `events.jsonl` plus an evaluator report that checks
  the summary, capsule, registry-trusted signature, sandbox unpack, and
  artifact comparison evidence.
- Enterprise policy tiers:
  [docs/AGENT_HANDOFF_POLICY_MATRIX_V0.md](docs/AGENT_HANDOFF_POLICY_MATRIX_V0.md).
- Static observability view over handoff evidence:
  [docs/AGENT_HANDOFF_OBSERVABILITY_DASHBOARD_V0.md](docs/AGENT_HANDOFF_OBSERVABILITY_DASHBOARD_V0.md).
- Central trust registry direction:
  [docs/AGENT_CAPSULE_CENTRAL_TRUST_REGISTRY.md](docs/AGENT_CAPSULE_CENTRAL_TRUST_REGISTRY.md).

## Research Backends

The repository also contains experimental carrier-shaping backends used for
research and regression coverage.

The research path has five core layers:

- Frame: wraps the payload as `magic || payload_len || crc32 || payload`.
- Range coder: provides the reversible bit-to-symbol mapping.
- LM probabilities: provide the next-token carrier distribution.
- Quantizer: converts floating-point probabilities into deterministic integer
  CDFs for range coding.
- Armour: stores carrier text with version, model fingerprint, and settings in
  a copy/paste-safe text block.
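
The frame layer above, `magic || payload_len || crc32 || payload`, can be sketched with `struct` and `zlib`. The magic value, the 32-bit field widths, and the big-endian byte order are assumptions made for this example; only the field order comes from the description above.

```python
import struct
import zlib

MAGIC = b"ACP0"  # hypothetical 4-byte magic; the real value is not stated here
HEADER = struct.Struct(">4sII")  # magic, payload_len, crc32 (assumed layout)


def build_frame(payload: bytes) -> bytes:
    """Prefix the payload with magic, length, and CRC32."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def parse_frame(frame: bytes) -> bytes:
    """Validate magic, length, and CRC32, then return the payload."""
    if len(frame) < HEADER.size:
        raise ValueError("truncated frame")
    magic, length, crc = HEADER.unpack_from(frame)
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC32 mismatch")
    return payload


frame = build_frame(b"hello capsule")
print(parse_frame(frame))  # b'hello capsule'
```

CRC32 detects accidental corruption only; it is not a substitute for the SHA256 and signature checks on the capsule itself.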

Encoding:

```text
payload bytes
  -> binary frame
  -> framed bits
  -> source RangeDecoder over framed bits
  -> LM probabilities + shaping + quantization
  -> carrier token choices
  -> mirror RangeEncoder stopping check
  -> armoured text
```

Decoding:

```text
armoured text
  -> parse and check model fingerprint
  -> carrier tokens
  -> same LM probabilities + shaping + quantization
  -> RangeEncoder reconstructs framed bits
  -> frame parser validates magic, length, and CRC32
  -> payload bytes
```
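
Both directions depend on the quantization step producing exactly the same integer CDF from the same probabilities. One possible way to do that is sketched below: reserve one count per symbol so nothing becomes unencodable, floor the rest, and give the remainder to the most likely symbol. This is an illustrative scheme, not necessarily the one the repository uses, and the function name and `total_bits` parameter are hypothetical. It also relies on plain float arithmetic, which a real implementation may need to control more strictly.

```python
def quantize_cdf(probs: list[float], total_bits: int = 16) -> list[int]:
    """Map float probabilities to an integer CDF ending at 2**total_bits."""
    total = 1 << total_bits
    n = len(probs)
    norm = sum(probs)
    if n == 0 or n > total or norm <= 0 or min(probs) < 0:
        raise ValueError("invalid distribution for quantization")
    # Every symbol gets at least one count; the rest is shared by probability.
    spare = total - n
    freqs = [1 + int(p / norm * spare) for p in probs]
    # Flooring leaves a small remainder; the largest bucket absorbs it.
    # Ties go to the lowest index so the result is deterministic.
    remainder = total - sum(freqs)
    biggest = max(range(n), key=lambda i: (freqs[i], -i))
    freqs[biggest] += remainder
    cdf = [0]
    for f in freqs:
        cdf.append(cdf[-1] + f)
    return cdf


print(quantize_cdf([0.5, 0.25, 0.25], total_bits=4))  # [0, 8, 12, 16]
```

Symbol `i` then owns the half-open interval `[cdf[i], cdf[i + 1])`, which is the form a range coder consumes.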

The stopping condition is intentionally conservative. Range decoders use
lookahead, so encode does not stop based on a naive "all bits consumed" rule.
Instead, the research carrier keeps a mirror range encoder and stops only when
its finalized preview has the framed payload bits as a prefix.

## Verification

Run the unit test suite:

```bash
PYTHONPATH=src python3 -m unittest discover -s tests
```

Full V1 verification:

```bash
sh scripts/verify_v1.sh
```

Installed CLI release check:

```bash
sh scripts/release_check.sh
```

## Demo And Experiment Scripts

Run the roundtrip demo:

```bash
sh scripts/demo_roundtrip.sh
```

Compare the fixed carrier against the trained order-1 n-gram carrier:

```bash
sh scripts/demo_compare.sh
```

Run the pinned Transformer carrier demo:

```bash
sh scripts/demo_transformer.sh
```

Create deterministic carrier corpora and train/held-out splits:

```bash
scripts/build_carrier_corpus.py \
  --out examples/carrier_corpus_v2.txt \
  --lines 5000 \
  --seed 42 \
  --domain mixed
scripts/split_corpus.py \
  --input examples/carrier_corpus_v2.txt \
  --train-out examples/carrier_train_v2.txt \
  --heldout-out examples/carrier_heldout_v2.txt \
  --heldout-ratio 0.20 \
  --filter-vocab
```

Compare models with optional benchmark JSON:

```bash
scripts/compare_models.py \
  --payload payload.bin \
  --corpus examples/carrier_train_v2.txt \
  --quality-text examples/carrier_heldout_v2.txt \
  --json-out benchmark.json
```

Run a bounded V2 experiment config:

```bash
scripts/run_experiment.py experiments/configs/example_fixed.json
```

Optional PyTorch training/export is available in
`scripts/train_transformer_torch.py`. PyTorch is only needed for that exporter;
the exported JSON model loads through the dependency-free `TransformerLM`
runtime.

## Research Notes

Probability shaping is intentionally separate from the models. Defaults are a
no-op, while non-default shaping settings are written into the armour so decode
can reproduce the same distribution. Uniform mixing and temperature are
guardrails for keeping model distributions usable by the range coder; they are
not a claim of natural language quality.
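
As a minimal sketch of what temperature and uniform mixing do to a distribution, the function below applies temperature scaling first and then blends with the uniform distribution. The formulas and the order of the two steps are assumptions for illustration; the parameter names mirror the `--shape-uniform-mix` and `--temperature` settings.

```python
def shape(
    probs: list[float],
    uniform_mix: float = 0.0,
    temperature: float = 1.0,
) -> list[float]:
    """Flatten a distribution; the defaults leave it unchanged."""
    if temperature <= 0 or not 0.0 <= uniform_mix <= 1.0:
        raise ValueError("invalid shaping settings")
    # Temperature above 1 flattens the distribution, below 1 sharpens it.
    scaled = [p ** (1.0 / temperature) for p in probs]
    z = sum(scaled)
    scaled = [s / z for s in scaled]
    # Mixing with uniform puts a floor of uniform_mix / N under every symbol.
    uniform = 1.0 / len(probs)
    return [(1.0 - uniform_mix) * s + uniform_mix * uniform for s in scaled]


model_probs = [0.7, 0.2, 0.1]
unchanged = shape(model_probs)
shaped = shape(model_probs, uniform_mix=0.80, temperature=1.25)
print(unchanged)
print(shaped)
```

The floor is what keeps very unlikely symbols representable after quantization, at the cost of carrier text that follows the model less closely.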

Greedy previews are useful diagnostics, but they are not representative of
encoded carrier text. The actual carrier is selected by payload bits
through the range coder under the model distribution.

Current demo metrics for `bytes(range(256))`:

- Payload bytes: `256`
- Carrier chars: `358`
- Bits per carrier char: `5.989`
- Base64 baseline chars: `344`

Pinned Transformer fixture metrics for `bytes(range(256))` with
`--shape-uniform-mix 0.80 --temperature 1.25`:

- Payload bytes: `256`
- Carrier chars: `362`
- Bits per carrier char: `5.923`
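
The figures above can be sanity-checked with a few lines of arithmetic. The Base64 length follows directly from the payload size. The bits-per-character values are consistent with counting framed bits rather than raw payload bits, assuming a 12-byte frame header; that header size is an inference from the numbers, not something stated above.

```python
import base64

payload = bytes(range(256))

# Base64 emits 4 characters per 3 input bytes, padded to a multiple of 4.
base64_chars = len(base64.b64encode(payload))
print(base64_chars)  # 344

FRAME_HEADER_BYTES = 12  # assumption: 4-byte magic, length, and CRC32 fields
framed_bits = (len(payload) + FRAME_HEADER_BYTES) * 8

bits_per_char = {
    chars: round(framed_bits / chars, 3)
    for chars in (358, 362)  # carrier lengths reported above
}
print(bits_per_char)  # {358: 5.989, 362: 5.923}
```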

## Documentation

- [docs/ALGORITHM.md](docs/ALGORITHM.md): core reversible mapping.
- [docs/IMPLEMENTATION_PLAN.md](docs/IMPLEMENTATION_PLAN.md): implementation
  milestones and design notes.
- [docs/V1_RELEASE.md](docs/V1_RELEASE.md): V1 checkpoint and pinned artifact
  details.
- [docs/LIMITATIONS.md](docs/LIMITATIONS.md): current boundaries and non-goals.
- [docs/QUICKSTART.md](docs/QUICKSTART.md): installed CLI usage.
- [docs/BENCHMARKING.md](docs/BENCHMARKING.md): structured benchmark JSON.
- [docs/EXPERIMENTS.md](docs/EXPERIMENTS.md): bounded V2 experiment runner.
- [docs/AGENT_CAPSULE_PROTOCOL_V0.md](docs/AGENT_CAPSULE_PROTOCOL_V0.md):
  Agent Capsule V0 envelope, backends, verification, and scan flow.
- [docs/AGENT_CAPSULE_PRODUCT_BRIEF.md](docs/AGENT_CAPSULE_PRODUCT_BRIEF.md):
  product pivot brief and governance roadmap.
- [docs/V2_BASELINE.md](docs/V2_BASELINE.md): V2 baseline metrics and first
  candidate runs.
- [docs/V2_EXPERIMENT_PROTOCOL.md](docs/V2_EXPERIMENT_PROTOCOL.md): V2
  promotion gates and experiment rules.
- [docs/V2_AUTOAGENT.md](docs/V2_AUTOAGENT.md): autoagent role and guardrails
  for bounded V2 research.
- [docs/V2_CANDIDATE_REPORT_TEMPLATE.md](docs/V2_CANDIDATE_REPORT_TEMPLATE.md):
  report shape for reviewed V2 candidates.
- [docs/V2_CHECKPOINT.md](docs/V2_CHECKPOINT.md): V2 research checkpoint,
  matrix ranking, and merge recommendation.
- [docs/V2_LARGE_PAYLOAD_REAL_CORPUS_STRESS.md](docs/V2_LARGE_PAYLOAD_REAL_CORPUS_STRESS.md):
  next stress lane for larger payloads and real-ish corpora.
- [docs/V2_LARGE_PAYLOAD_STRESS_RESULTS.md](docs/V2_LARGE_PAYLOAD_STRESS_RESULTS.md):
  capped large-payload stress results and runtime finding.
- [docs/V2_SIZE_LADDER.md](docs/V2_SIZE_LADDER.md): scaled payload ladder for
  locating the current runtime knee.
- [docs/V2_SIZE_LADDER_RESULTS.md](docs/V2_SIZE_LADDER_RESULTS.md): observed
  32KB/64KB runtime boundary.
- [docs/V2_SPRINT_1.md](docs/V2_SPRINT_1.md): first V2 candidate stress
  sprint.
- [docs/V2_SPRINT_2.md](docs/V2_SPRINT_2.md): first autoagent-safe
  comparison sprint.
- [docs/V2_SPRINT_3.md](docs/V2_SPRINT_3.md): order-3 candidate stress
  sprint.
- [docs/V2_CANDIDATE_ORDER3.md](docs/V2_CANDIDATE_ORDER3.md): current order-3
  candidate report.
- [docs/V2_CANDIDATE_ORDER3_SHAPED.md](docs/V2_CANDIDATE_ORDER3_SHAPED.md):
  shaped order-3 candidate report.
- [docs/V2_CANDIDATE_ORDER2_SAFETY.md](docs/V2_CANDIDATE_ORDER2_SAFETY.md):
  order-2 safety fallback report.
- [docs/V2_CANDIDATE_TRANSFORMER_FIXTURE.md](docs/V2_CANDIDATE_TRANSFORMER_FIXTURE.md):
  Transformer fixture candidate report.
- [docs/CARRIER_QUALITY.md](docs/CARRIER_QUALITY.md): carrier quality metrics
  and trade-offs.
- [docs/TESTING.md](docs/TESTING.md): stress/property test strategy.
- [schemas/benchmark_result_v1.json](schemas/benchmark_result_v1.json):
  benchmark JSON schema contract.
- [CHANGELOG.md](CHANGELOG.md): release notes.
- [LICENSE](LICENSE): current license status.

## Golden V1 Fixtures

All golden fixtures use payload `bytes(range(256))`.

Payload SHA256:

```text
40aff2e9d2d8922e47afd4648e6967497158785fbd1da870e7110266bf944880
```
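
To check the payload digest independently of the repository scripts, the standard library is enough. The snippet recomputes the SHA256 of `bytes(range(256))` and compares it with the value documented above.

```python
import hashlib

DOCUMENTED_SHA256 = "40aff2e9d2d8922e47afd4648e6967497158785fbd1da870e7110266bf944880"

payload = bytes(range(256))
digest = hashlib.sha256(payload).hexdigest()

print(digest)
print("matches documented value:", digest == DOCUMENTED_SHA256)
```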

Fixed carrier:

- Message fixture: [tests/fixtures/golden_message_v1.txt](tests/fixtures/golden_message_v1.txt)
- Model fingerprint: `d60583f4d741e42cb713b11c78b8ffc89cda1ee05eca522929bec8cbdb423be8`
- Message SHA256: `f53ec3604a378788b20cf6e0aadbfe441a063aa7ce1cea0bef9b1427cbd21e35`

Order-1 n-gram carrier fixture:

- Model fixture: [tests/fixtures/ngram_model_v1.json](tests/fixtures/ngram_model_v1.json)
- Message fixture: [tests/fixtures/ngram_golden_message_v1.txt](tests/fixtures/ngram_golden_message_v1.txt)
- Model fingerprint: `b1cd62a9019b67e0a42913dac1dca09852b4931f09afa87bb8e62089fe184a3a`
- Message SHA256: `53c062a238764c72caa9dd338d37682ab350d7ace4251e9778ba13ae97d99512`

Transformer carrier fixture:

- Model fixture: [tests/fixtures/transformer_model_v1.json](tests/fixtures/transformer_model_v1.json)
- Message fixture: [tests/fixtures/transformer_golden_message_v1.txt](tests/fixtures/transformer_golden_message_v1.txt)
- Settings: `SHAPE_UNIFORM_MIX=0.80; TEMPERATURE=1.25`
- Model fingerprint: `cfc75d7b54524f7a09a90454d89768aa4eb75b17546607c376760e2fc9d8f851`
- Message SHA256: `7713a0b7208462485f854ab58e5423f16c16360aeff524f1597ba49c840ad96b`

Regenerate golden fixtures only after intentional codec changes:

```bash
python3 scripts/generate_golden.py
```
