Metadata-Version: 2.4
Name: jhcontext
Version: 0.3.2
Summary: PAC-AI: Protocol for Auditable Context in AI — Python SDK
Project-URL: Repository, https://github.com/jhcontext/jhcontext-sdk
Project-URL: Documentation, https://github.com/jhcontext/jhcontext-sdk#readme
Author: jhcontext contributors
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: ai-act,auditability,context-management,multi-agent,provenance,w3c-prov
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27
Requires-Dist: pydantic>=2.0
Requires-Dist: rdflib>=7.0
Provides-Extra: all
Requires-Dist: aiosqlite>=0.20; extra == 'all'
Requires-Dist: crewai-tools>=1.9; extra == 'all'
Requires-Dist: crewai>=1.9; extra == 'all'
Requires-Dist: cryptography>=44.0; extra == 'all'
Requires-Dist: fastapi>=0.115; extra == 'all'
Requires-Dist: mcp>=1.0; extra == 'all'
Requires-Dist: uvicorn>=0.34; extra == 'all'
Provides-Extra: crewai
Requires-Dist: crewai-tools>=1.9; extra == 'crewai'
Requires-Dist: crewai>=1.9; extra == 'crewai'
Provides-Extra: dev
Requires-Dist: pytest-asyncio; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Provides-Extra: server
Requires-Dist: aiosqlite>=0.20; extra == 'server'
Requires-Dist: cryptography>=44.0; extra == 'server'
Requires-Dist: fastapi>=0.115; extra == 'server'
Requires-Dist: mcp>=1.0; extra == 'server'
Requires-Dist: uvicorn>=0.34; extra == 'server'
Description-Content-Type: text/markdown

# jhcontext SDK

**PAC-AI: Protocol for Auditable Context in AI** — Python SDK

A Python toolkit for building, signing, auditing, and serving AI context envelopes compliant with the PAC-AI protocol. It is designed for EU AI Act compliance scenarios, including temporal oversight (Art. 14) and negative proof (Art. 13).

## Install

```bash
# Core: models, builder, PROV, audit, crypto
pip install jhcontext

# With server (FastAPI + MCP + SQLite)
pip install "jhcontext[server]"

# With CrewAI integration
pip install "jhcontext[crewai]"

# Everything
pip install "jhcontext[all]"

# Development (adds pytest)
pip install "jhcontext[all,dev]"
```

## Architecture

```
jhcontext/
├── models.py          # Pydantic v2 data models (Envelope, Artifact, Decision, ForwardingPolicy, ...)
├── builder.py         # EnvelopeBuilder — fluent API for constructing envelopes
├── forwarding.py      # ForwardingEnforcer — monotonic policy enforcement + output filtering
├── persistence.py     # StepPersister — artifact + envelope + PROV persistence orchestration
├── prov.py            # PROVGraph — W3C PROV graph builder (rdflib)
├── pii.py             # PII detection, tokenization, detachment (GDPR Art. 5/17)
├── audit.py           # Compliance verification (temporal oversight, negative proof, isolation, PII)
├── crypto.py          # SHA-256 hashing, Ed25519 signing (HMAC fallback)
├── canonicalize.py    # Deterministic JSON serialization
├── semantics.py       # UserML semantic payload helpers
├── cli.py             # CLI: jhcontext serve | mcp | version
├── client/
│   └── api_client.py  # REST client (httpx)
└── server/
    ├── app.py          # FastAPI app factory
    ├── mcp_server.py   # MCP server (stdio transport)
    ├── routes/         # REST API routes (envelopes, artifacts, decisions, provenance, compliance)
    └── storage/
        ├── sqlite.py   # SQLite backend (zero-config, ~/.jhcontext/)
        └── pii_vault.py # Separate PII vault (GDPR erasure support)
```

## Quick Start

### Build and sign an envelope

```python
from jhcontext import EnvelopeBuilder, RiskLevel, ArtifactType, observation, userml_payload

# Build semantic payload
payload = userml_payload(
    observations=[observation("user:alice", "temperature", 22.3)],
)

# Build envelope
env = (
    EnvelopeBuilder()
    .set_producer("did:example:agent-1")
    .set_scope("healthcare")
    .set_risk_level(RiskLevel.HIGH)        # auto-sets forwarding_policy=semantic_forward
    .set_human_oversight(True)
    .set_semantic_payload([payload])
    .add_artifact(
        artifact_id="art-vitals",
        artifact_type=ArtifactType.TOKEN_SEQUENCE,
        content_hash="sha256:abc123...",
    )
    .sign("did:example:agent-1")
    .build()
)

print(env.context_id)
print(env.proof.content_hash)
print(env.compliance.forwarding_policy)    # "semantic_forward"
```
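
The `content_hash` in the example above is elided. As a minimal sketch, assuming the hash is the hex SHA-256 digest of the raw artifact bytes with a `sha256:` prefix, it could be produced with the standard library alone. `content_hash_for` is a hypothetical helper, not part of the SDK, and the SDK's own `crypto` module may compute the value differently.

```python
import hashlib


def content_hash_for(data: bytes) -> str:
    """Return a 'sha256:<hex digest>' string for raw artifact bytes (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Illustrative artifact content; any bytes work.
vitals = b'{"temperature": 22.3}'
print(content_hash_for(vitals))
```

The resulting string can be passed as `content_hash` to `add_artifact()` if the assumption about the hash format holds for your deployment.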

### Forwarding policy

The `forwarding_policy` field in `ComplianceBlock` controls how the envelope's content
is forwarded between tasks in a multi-agent pipeline:

```python
from jhcontext import EnvelopeBuilder, RiskLevel, ForwardingPolicy

# HIGH risk → auto-sets semantic_forward
env = EnvelopeBuilder().set_risk_level(RiskLevel.HIGH).build()
assert env.compliance.forwarding_policy == ForwardingPolicy.SEMANTIC_FORWARD

# LOW risk → auto-sets raw_forward
env = EnvelopeBuilder().set_risk_level(RiskLevel.LOW).build()
assert env.compliance.forwarding_policy == ForwardingPolicy.RAW_FORWARD

# Explicit override (e.g., a fetch task in a HIGH-risk flow that needs raw_forward)
env = (
    EnvelopeBuilder()
    .set_risk_level(RiskLevel.HIGH)
    .set_forwarding_policy(ForwardingPolicy.RAW_FORWARD)  # override
    .build()
)
```

- **`semantic_forward`** — downstream consumers must read only `semantic_payload`.
  Raw tokens, embeddings, and artifact metadata are stripped before forwarding.
- **`raw_forward`** — downstream consumers receive the full envelope (all fields).

### ForwardingEnforcer

The SDK provides `ForwardingEnforcer` — a framework-agnostic class that enforces the
monotonic forwarding constraint across a task pipeline. No CrewAI imports required.

```python
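from jhcontext import EnvelopeBuilder, ForwardingEnforcer, RiskLevel

# Example envelopes; in a real pipeline each task produces its own.
task1_envelope = EnvelopeBuilder().set_risk_level(RiskLevel.LOW).build()   # declares raw_forward
task2_envelope = EnvelopeBuilder().set_risk_level(RiskLevel.HIGH).build()  # declares semantic_forward
task3_envelope = EnvelopeBuilder().set_risk_level(RiskLevel.LOW).build()   # declares raw_forward

enforcer = ForwardingEnforcer()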

# Task 1: fetch step — raw_forward (passes raw data to classifier)
policy = enforcer.resolve(task1_envelope)       # RAW_FORWARD
filtered = enforcer.filter_output(task1_envelope, policy)  # full envelope JSON

# Task 2: classification — semantic_forward (boundary is set)
policy = enforcer.resolve(task2_envelope)       # SEMANTIC_FORWARD
filtered = enforcer.filter_output(task2_envelope, policy)  # only {"semantic_payload": [...]}

# Task 3: accidentally declares raw_forward → overridden
policy = enforcer.resolve(task3_envelope)       # SEMANTIC_FORWARD (monotonic override)

print(enforcer.semantic_boundary_reached)       # True
```

The agent runtime (CrewAI, LangGraph, etc.) calls `enforcer.filter_output()` and replaces
the task's raw output with the result. The full envelope is still persisted to the backend
for audit — nothing is lost.
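
The monotonic constraint described above can be illustrated without the SDK. The sketch below is a simplified stand-in, not the implementation of `ForwardingEnforcer`: the names `Policy` and `MonotonicEnforcer` are hypothetical, and envelopes are modeled as plain dictionaries.

```python
from enum import Enum


class Policy(Enum):
    RAW_FORWARD = "raw_forward"
    SEMANTIC_FORWARD = "semantic_forward"


class MonotonicEnforcer:
    """Illustrative stand-in, not the SDK's ForwardingEnforcer."""

    def __init__(self) -> None:
        self.semantic_boundary_reached = False

    def resolve(self, declared: Policy) -> Policy:
        # Once any task declares semantic_forward, every later task is held to it.
        if declared is Policy.SEMANTIC_FORWARD:
            self.semantic_boundary_reached = True
        if self.semantic_boundary_reached:
            return Policy.SEMANTIC_FORWARD
        return declared

    def filter_output(self, envelope: dict, policy: Policy) -> dict:
        # semantic_forward keeps only the semantic payload; raw_forward keeps everything.
        if policy is Policy.SEMANTIC_FORWARD:
            return {"semantic_payload": envelope.get("semantic_payload", [])}
        return dict(envelope)


enforcer = MonotonicEnforcer()
declared = [Policy.RAW_FORWARD, Policy.SEMANTIC_FORWARD, Policy.RAW_FORWARD]
resolved = [enforcer.resolve(p) for p in declared]
print([p.value for p in resolved])
# ['raw_forward', 'semantic_forward', 'semantic_forward']
```

The third task declares `raw_forward` but resolves to `semantic_forward`, which is the override behavior shown in the SDK example.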

### StepPersister

Orchestrates artifact + envelope + PROV persistence for individual pipeline steps:

```python
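from jhcontext import ArtifactType, EnvelopeBuilder, PROVGraph, StepPersister
from jhcontext.client.api_client import JHContextClient

# Collaborators shared by every step in the pipeline
client = JHContextClient(base_url="http://localhost:8400")
builder = EnvelopeBuilder()
prov = PROVGraph("ctx-abc")

persister = StepPersister(client=client, builder=builder, prov=prov, context_id="ctx-abc")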

artifact_id = persister.persist(
    step_name="sensor",
    agent_id="did:hospital:sensor-agent",
    output="raw sensor data...",
    artifact_type=ArtifactType.TOKEN_SEQUENCE,
    started_at="2026-03-23T10:00:00Z",
    ended_at="2026-03-23T10:01:00Z",
)

# start_time is the timestamp the caller recorded when the pipeline began
metrics = persister.finalize_metrics(total_start=start_time)
```

`StepPersister` uploads large artifacts (>100 KB) to S3, signs the envelope, extends the
PROV graph, and collects step-level metrics.

### Build a W3C PROV graph

```python
from jhcontext import PROVGraph

prov = (
    PROVGraph("ctx-health-001")
    .add_entity("vitals", "Patient Vitals", artifact_type="token_sequence")
    .add_entity("recommendation", "AI Recommendation")
    .add_activity("ai-analysis", "AI Analysis",
                  started_at="2026-01-01T10:00:00Z",
                  ended_at="2026-01-01T10:01:00Z")
    .add_agent("agent-sensor", "Sensor Agent", role="data_collector")
    .used("ai-analysis", "vitals")
    .was_generated_by("recommendation", "ai-analysis")
    .was_associated_with("ai-analysis", "agent-sensor")
    .was_derived_from("recommendation", "vitals")
)

# Serialize
print(prov.serialize("turtle"))

# Query
chain = prov.get_causal_chain("recommendation")
used = prov.get_used_entities("ai-analysis")
sequence = prov.get_temporal_sequence()
```
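
To make the query step more concrete, here is a conceptual sketch of a causal-chain walk over derivation edges using plain dictionaries. It uses neither rdflib nor the SDK. The `causal_chain` function, the edge layout, and the `raw-sensor-reading` entity are hypothetical, and `PROVGraph.get_causal_chain` may return a different shape.

```python
from collections import deque

# entity -> entities it was derived from (mirrors the was_derived_from relation)
derived_from = {
    "recommendation": ["vitals"],
    "vitals": ["raw-sensor-reading"],  # extra upstream entity added for illustration
}


def causal_chain(entity_id: str, edges: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk over derivation edges, returning upstream entities in visit order."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque([entity_id])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:  # guards against cycles and duplicate visits
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


print(causal_chain("recommendation", derived_from))
# ['vitals', 'raw-sensor-reading']
```

A negative-proof style check can then be phrased as "no entity in this chain has an excluded type".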

### Run compliance audits

```python
from jhcontext import (
    verify_temporal_oversight,
    verify_negative_proof,
    verify_workflow_isolation,
    verify_integrity,
    generate_audit_report,
)

# prov, prov_a, prov_b, and env come from earlier steps; the activity and
# entity IDs below must exist in the graph being audited.

# Art. 14: temporal oversight (human reviewed AFTER AI, >= 5 min)
oversight = verify_temporal_oversight(
    prov,
    ai_activity_id="ai-analysis",
    human_activities=["doctor-review"],
    min_review_seconds=300.0,
)

# Art. 13: negative proof (excluded data types not in decision chain)
negative = verify_negative_proof(
    prov,
    decision_entity_id="final-grade",
    excluded_artifact_types=["biometric", "social_media"],
)

# Workflow isolation (two PROV graphs share zero artifacts)
isolation = verify_workflow_isolation(prov_a, prov_b)

# Envelope integrity (hash + signature)
integrity = verify_integrity(env)

# Generate full audit report from the individual check results
report = generate_audit_report(env, prov, [oversight, negative, isolation])
print(report.to_dict())
```
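
The temporal oversight rule in the comment above (the human review happens after the AI step and lasts at least five minutes) can be sketched with the standard library. This is an illustration of the rule only, not the logic of `verify_temporal_oversight`; `parse_ts` and `reviewed_after_ai` are hypothetical names, and the timestamps are made up.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # datetime.fromisoformat() on Python 3.10 does not accept a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def reviewed_after_ai(
    ai_ended_at: str,
    review_started_at: str,
    review_ended_at: str,
    min_review_seconds: float = 300.0,
) -> bool:
    """True if the review starts at or after the AI step ends and lasts long enough."""
    ai_end = parse_ts(ai_ended_at)
    start = parse_ts(review_started_at)
    end = parse_ts(review_ended_at)
    return start >= ai_end and (end - start).total_seconds() >= min_review_seconds


# Seven-minute review that starts after the AI step: passes.
print(reviewed_after_ai("2026-01-01T10:01:00Z", "2026-01-01T10:05:00Z", "2026-01-01T10:12:00Z"))  # True
# One-minute review: too short.
print(reviewed_after_ai("2026-01-01T10:01:00Z", "2026-01-01T10:05:00Z", "2026-01-01T10:06:00Z"))  # False
```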

### PII Detachment (GDPR Art. 5/17)

Tokenize personal data in semantic payloads before storage. PII is stored in a separate vault linked by `context_id`, enabling independent erasure without breaking audit trails.

```python
from jhcontext import EnvelopeBuilder, verify_pii_detachment, verify_integrity
from jhcontext.pii import InMemoryPIIVault, reattach_pii

vault = InMemoryPIIVault()

# Build with PII detachment
env = (
    EnvelopeBuilder()
    .set_producer("did:example:triage-agent")
    .set_scope("healthcare")
    .set_semantic_payload([
        {"patient_name": "Alice Johnson", "patient_email": "alice@hospital.org",
         "diagnosis": "mild concussion", "recommendation": "24h observation"},
    ])
    .set_privacy(feature_suppression=["patient_name", "patient_email"])
    .enable_pii_detachment(vault=vault)
    .sign("did:example:triage-agent")
    .build()
)

# PII is tokenized
print(env.semantic_payload[0]["patient_name"])   # pii:tok-a1b2c3d4e5f6
print(env.semantic_payload[0]["diagnosis"])       # mild concussion (not PII)

# Audit: verify no PII leaks
assert verify_pii_detachment(env).passed
assert verify_integrity(env).passed

# GDPR Art. 17 erasure
vault.purge_by_context(env.context_id)

# Audit trail survives — hash covers detached payload
assert verify_integrity(env).passed

# Reattach (gracefully fails after purge — tokens remain)
resolved = reattach_pii(env.semantic_payload, vault)
```

The `feature_suppression` field in the privacy block specifies which fields are always tokenized. The `DefaultPIIDetector` also scans all string values for common PII patterns (emails, phones, IPs, SSNs).

For persistent storage, use `SQLitePIIVault` (from `jhcontext.server.storage.pii_vault`) — it stores PII in a separate database file that can be encrypted or deleted independently.
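
The idea behind detachment can be sketched in a few lines of plain Python. This is a simplified illustration, not the SDK's `DefaultPIIDetector` or vault: `detach_pii`, `EMAIL_RE`, and the dictionary vault are hypothetical, only emails are pattern-matched, and a matching value is replaced whole. The token format is assumed from the example output above.

```python
import re
import secrets

# Deliberately simple email pattern; real detectors cover more cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def detach_pii(payload: dict, suppressed: list[str], vault: dict) -> dict:
    """Return a copy of payload with PII values swapped for tokens; originals go into vault."""
    detached = {}
    for key, value in payload.items():
        looks_like_email = isinstance(value, str) and EMAIL_RE.search(value) is not None
        if key in suppressed or looks_like_email:
            token = "pii:tok-" + secrets.token_hex(6)  # 12 hex characters
            vault[token] = value
            detached[key] = token
        else:
            detached[key] = value
    return detached


vault: dict[str, str] = {}
record = {
    "patient_name": "Alice Johnson",
    "contact": "alice@hospital.org",
    "diagnosis": "mild concussion",
}
safe = detach_pii(record, suppressed=["patient_name"], vault=vault)
print(safe["diagnosis"])  # mild concussion
print(len(vault))         # 2
```

Clearing the vault removes the originals while `safe` keeps its tokens, which is why a hash computed over the detached payload remains valid after erasure.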

### Start the server

```bash
# REST API on localhost:8400
jhcontext serve

# MCP server (stdio transport)
jhcontext mcp
```

### Use the REST client

```python
from jhcontext.client.api_client import JHContextClient

client = JHContextClient(base_url="http://localhost:8400")

# Submit an envelope (env is built as in "Build and sign an envelope")
ctx_id = client.submit_envelope(env)

# Retrieve
data = client.get_envelope(ctx_id)

# List with filters
envelopes = client.list_envelopes(scope="healthcare")

# Health check
print(client.health())

client.close()
```

## Testing

```bash
pip install -e ".[all,dev]"
pytest tests/ --ignore=tests/test_example.py -v
```

## Key Concepts

| Concept | Description |
|---------|-------------|
| **Envelope** | Immutable context unit: semantic payload + artifacts + provenance + proof |
| **Artifact** | Registered data object (embedding, token sequence, tool result) with content hash |
| **Forwarding Policy** | Per-envelope control: `semantic_forward` (only `semantic_payload` visible downstream) or `raw_forward` (full envelope). Monotonic — once semantic, cannot downgrade. |
| **ForwardingEnforcer** | Framework-agnostic monotonic policy enforcement. Resolves per-task policies and filters output for downstream consumers. |
| **StepPersister** | Orchestrates artifact + envelope + PROV persistence for individual pipeline steps. Handles S3 upload, signing, and metrics. |
| **PROVGraph** | W3C PROV provenance graph (entities, activities, agents, relations) |
| **Proof** | Cryptographic integrity: canonical hash + Ed25519/HMAC signature |
| **Audit** | Compliance checks: temporal oversight, negative proof, workflow isolation, PII detachment |
| **PII Detachment** | Tokenize PII before storage; separate vault enables GDPR erasure without breaking audit trails |
| **UserML** | Semantic payload format: observation → interpretation → situation layers |
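
As a rough illustration of the **Proof** concept, the sketch below hashes a deterministic JSON serialization and signs the digest with HMAC from the standard library. It is a simplified stand-in under stated assumptions: `canonical_bytes`, `sign`, and `verify` are hypothetical names, the serialization is plain sorted-key JSON rather than the SDK's `canonicalize` module, and Ed25519 is not shown.

```python
import hashlib
import hmac
import json


def canonical_bytes(obj) -> bytes:
    # Sorted keys and fixed separators give the same bytes regardless of dict insertion order.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign(obj, key: bytes) -> dict:
    """Hash the canonical form, then sign the hex digest with HMAC-SHA256."""
    digest = hashlib.sha256(canonical_bytes(obj)).hexdigest()
    signature = hmac.new(key, digest.encode("ascii"), hashlib.sha256).hexdigest()
    return {"content_hash": "sha256:" + digest, "signature": signature}


def verify(obj, proof: dict, key: bytes) -> bool:
    """Recompute the proof and compare both fields in constant time."""
    expected = sign(obj, key)
    return hmac.compare_digest(expected["content_hash"], proof["content_hash"]) and hmac.compare_digest(
        expected["signature"], proof["signature"]
    )


key = b"demo-key-not-for-production"
proof = sign({"scope": "healthcare", "risk": "high"}, key)
print(verify({"risk": "high", "scope": "healthcare"}, proof, key))  # True: key order is irrelevant
print(verify({"risk": "low", "scope": "healthcare"}, proof, key))   # False: content changed
```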

## Protocol

The SDK is based on the **PAC-AI** (Protocol for Auditable Context in AI) specification. The JSON-LD schema is at `jhcontext-protocol/jhcontext-core.jsonld` (v0.3).

## License

Apache-2.0
