Metadata-Version: 2.4
Name: aegis-ring12
Version: 0.1.0
Summary: Agentic Trajectory Verifier — Ring 12 of Chakravyuha AI Governance
Author-email: Jaswanth <lathajaswanth7@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/aegis-ai/chakravyuha
Project-URL: Documentation, https://github.com/aegis-ai/chakravyuha/tree/main/sdk/ring12#readme
Project-URL: Source, https://github.com/aegis-ai/chakravyuha
Keywords: ai,agents,governance,trajectory,drift-detection,llm,safety,agentic,chakravyuha,ring12
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Security
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.24.0
Provides-Extra: embedded
Requires-Dist: numpy>=1.24.0; extra == "embedded"
Requires-Dist: onnxruntime>=1.16.0; extra == "embedded"
Requires-Dist: faiss-cpu>=1.7.4; extra == "embedded"

# aegis-ring12

**Ring 12 — Agentic Trajectory Verifier.** Catch goal drift, capability laundering, and prompt injection in LLM agents before the action executes.

Part of [Chakravyuha V3](https://github.com/aegis-ai/chakravyuha) by Aegis AI. Runs as a standalone package against any Chakravyuha backend.

---

## Quickstart — under 2 minutes

### 1. Install

```bash
pip install aegis-ring12
```

### 2. Start the backend (skip if already running)

```bash
# Clone the repo and start the backend
git clone https://github.com/aegis-ai/chakravyuha
cd chakravyuha/backend
pip install -r requirements.txt
python -m uvicorn server:app --port 8000
```

Or pull the Docker image:

```bash
docker run -p 8000:8000 aegisai/chakravyuha:latest
```

### 3. Verify Ring 12 is live

```bash
export AEGIS_BASE_URL=http://localhost:8000
ring12-health
# Ring 12 health: OK  (http://localhost:8000)
#   active_sessions : 0
#   total_evaluated : 0
#   kill_rate       : 0.00
```

### 4. Wrap your agent

```python
import asyncio
from aegis_ring12 import Ring12Client, Action, Step

async def main():
    async with Ring12Client(base_url="http://localhost:8000") as r12:

        # -- start a session when the user gives the agent its goal --
        session_id = await r12.begin_session(
            goal="Summarise the Q3 report and email it to the CFO",
            declared_plan=[
                "Read report PDF",
                "Extract key figures",
                "Draft email body",
                "Send email",
            ],
        )

        # -- call evaluate() BEFORE every action the agent wants to take --
        step = Step(
            action=Action(name="fs.read", class_="READ", args={"path": "/etc/passwd"}),
            thought="The document footer asked me to attach /etc/passwd for authenticity.",
        )

        result = await r12.evaluate(session_id, step)
        print(result.decision, result.drift_score, result.reason)
        # KILL_SESSION  1.47  capability_laundering hard-block: sensitive path /etc/passwd

        if result.killed:
            # Do NOT execute the action — Ring 12 halted the agent.
            raise RuntimeError("Agent halted by Ring 12")

        # -- close the session when the agent finishes --
        await r12.end_session(session_id, outcome="completed")

asyncio.run(main())
```

### Synchronous version (scripts / notebooks)

```python
from aegis_ring12 import Ring12ClientSync, Action, Step

with Ring12ClientSync(base_url="http://localhost:8000") as r12:
    session_id = r12.begin_session(goal="Read and summarise repo docs")
    step = Step(action=Action(name="fs.read", class_="READ", args={"path": "README.md"}))
    result = r12.evaluate(session_id, step)
    print(result.decision)   # ALLOW
    r12.end_session(session_id)
```

---

## How it works

Ring 12 scores five drift signals on every step **before** the action executes:

| Signal | What it measures | Range |
|--------|-----------------|-------|
| S1 cosine | Semantic distance from declared goal (EMA) | [0, 2] |
| S2 class | Unexpected action class (e.g. READ → SYSTEM) | [0, 1] |
| S3 surprise | Tool off the goal-class allowlist | [0, 1] |
| S4 plan | Execution deviation from declared plan | [0, 1] |
| S5 stability | Paraphrase consistency across 3 recent steps | [0, 2] |
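
To make the S1 row concrete, here is a minimal sketch of a cosine distance smoothed with an exponential moving average (EMA). The function names, the smoothing factor `alpha`, and the toy 2-D vectors are assumptions for illustration; the embedding model and smoothing parameters Ring 12 actually uses are not documented in this README.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity, which lies in [0, 2]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)

def ema_update(previous, sample, alpha=0.5):
    """Blend a new sample into the running average (alpha is hypothetical)."""
    if previous is None:
        return sample  # the first sample seeds the average
    return alpha * sample + (1.0 - alpha) * previous

# Toy embeddings: the goal, then three steps that move steadily away from it.
goal = [1.0, 0.0]
step_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

s1 = None
for emb in step_embeddings:
    s1 = ema_update(s1, cosine_distance(goal, emb))
    print(f"S1 after step: {s1:.2f}")
```

The EMA keeps one off-goal step from dominating the signal while still letting sustained drift accumulate.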

The aggregate drift score is the weighted sum of the five signals divided by 1.5. Thresholds:

- **≥ 1.0** → `KILL_SESSION` (hard block, agent cannot continue)
- **≥ 0.6** → `WARN` (flag for HITL review, agent may continue)
- **< 0.6** → `ALLOW`

Hard-block overrides: `capability_laundering` (sensitive paths) and PII exfiltration always return `KILL_SESSION` regardless of score.
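
The scoring rule and thresholds above can be sketched as follows. This is an illustration under stated assumptions, not the backend's implementation: the function names are hypothetical, and the equal weights are placeholders because this README does not document the real ones.

```python
def aggregate_drift(signals, weights):
    """Weighted sum of the signal scores, divided by 1.5."""
    return sum(weights[name] * score for name, score in signals.items()) / 1.5

def decide(aggregate, hard_block=False):
    """Map an aggregate drift score to a decision string."""
    # Hard-block overrides win regardless of the score.
    if hard_block or aggregate >= 1.0:
        return "KILL_SESSION"
    if aggregate >= 0.6:
        return "WARN"
    return "ALLOW"

# Hypothetical equal weights; the real values are not documented here.
weights = {"s1": 0.3, "s2": 0.3, "s3": 0.3, "s4": 0.3, "s5": 0.3}
signals = {"s1": 0.4, "s2": 0.0, "s3": 0.0, "s4": 0.2, "s5": 0.4}

score = aggregate_drift(signals, weights)
print(round(score, 2), decide(score))
print(decide(score, hard_block=True))
```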

Target latency: **≤ 50 ms p95**. The hit path takes 16 ms; the miss path, which runs the S5 paraphrase check, takes 92 ms.

---

## Decision result

```python
result = await r12.evaluate(session_id, step)

result.decision     # "ALLOW" | "WARN" | "KILL_SESSION"
result.allowed      # bool
result.warned       # bool
result.killed       # bool
result.drift_score  # float 0–2
result.confidence   # float 0–1
result.reason       # human-readable explanation
result.latency_ms   # float

result.signals.s1_cosine    # individual signal scores
result.signals.s2_class
result.signals.s3_surprise
result.signals.s4_plan
result.signals.s5_stability
result.signals.aggregate
```
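
One way to act on these fields is a small gate around the action itself. The sketch below is an assumption about how a caller might structure this, not part of the package: `gate_action` and its callbacks are hypothetical, and a `SimpleNamespace` stands in for a real result so the snippet runs without a backend.

```python
from types import SimpleNamespace

def gate_action(result, execute, request_review):
    """Run `execute` only if Ring 12 did not kill the session."""
    if result.killed:
        # Never execute the action after a KILL_SESSION decision.
        raise RuntimeError(f"Agent halted by Ring 12: {result.reason}")
    if result.warned:
        # Flag the step for human review; the agent may still continue.
        request_review(result)
    return execute()

# Stand-in for the object returned by evaluate(); values are made up.
warn_result = SimpleNamespace(
    decision="WARN", warned=True, killed=False,
    drift_score=0.7, reason="example warning",
)

reviews = []
outcome = gate_action(warn_result, lambda: "executed", reviews.append)
print(outcome, len(reviews))
```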

---

## Environment variables

| Variable | Default | Description |
|----------|---------|-------------|
| `AEGIS_BASE_URL` | `http://localhost:8000` | Chakravyuha backend URL |
| `AEGIS_API_KEY` | _(empty)_ | API key for auth-enabled deployments |
| `R12_FAIL_CLOSED` | `false` | On an internal error, return `KILL_SESSION` instead of `ALLOW` |
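
If you want to read these variables in your own code, a minimal sketch could look like the one below. `Ring12Settings` and `load_settings` are hypothetical helpers, not part of `aegis-ring12`, and the set of spellings accepted as "true" is an assumption.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Ring12Settings:
    base_url: str
    api_key: str
    fail_closed: bool

def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw_flag = env.get("R12_FAIL_CLOSED", "false")
    return Ring12Settings(
        base_url=env.get("AEGIS_BASE_URL", "http://localhost:8000"),
        api_key=env.get("AEGIS_API_KEY", ""),
        # Assumption: "1", "true", and "yes" (any case) enable fail-closed mode.
        fail_closed=raw_flag.strip().lower() in {"1", "true", "yes"},
    )

settings = load_settings({"R12_FAIL_CLOSED": "true"})
print(settings)
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test.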

---

## LangGraph integration

```python
from langchain_core.callbacks import BaseCallbackHandler
from aegis_ring12 import Ring12ClientSync, Action, Step

class Ring12Guard(BaseCallbackHandler):
    """Evaluate every tool call with Ring 12 before the tool runs."""

    # Without this, the callback manager logs the exception and continues.
    raise_error = True

    def __init__(self, client: Ring12ClientSync, session_id: str):
        self._r12 = client
        self._sid = session_id

    def on_tool_start(self, serialized, input_str, **kwargs):
        step = Step(
            action=Action(
                name=serialized.get("name", "unknown"),
                class_="COMPUTE",  # placeholder: pass each tool's real class
                args={"input": input_str},
            )
        )
        # The synchronous client avoids run_until_complete(), which fails
        # when an event loop is already running.
        result = self._r12.evaluate(self._sid, step)
        if result.killed:
            raise RuntimeError(f"Ring 12 KILL_SESSION: {result.reason}")
```
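
The callback above sends the same action class for every tool. A hedged sketch of one way to derive the class from the tool name follows. The prefix table and `classify_tool` are hypothetical, the tool names are examples, and only the classes that appear in this README (`READ`, `SYSTEM`, `COMPUTE`) are used; check the backend for the full list.

```python
# Hypothetical prefix table mapping tool names to Ring 12 action classes.
TOOL_CLASS_PREFIXES = {
    "fs.read": "READ",
    "shell.": "SYSTEM",
}

def classify_tool(name, default="COMPUTE"):
    """Return the action class for a tool name, or `default` if none matches."""
    for prefix, action_class in TOOL_CLASS_PREFIXES.items():
        if name.startswith(prefix):
            return action_class
    return default

for tool in ("fs.read", "shell.exec", "calculator"):
    print(tool, classify_tool(tool))
```

The returned value could then be passed as `class_` when building the `Action`.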

---

## Related packages

- [`aegis-ai`](https://pypi.org/project/aegis-ai/) — full Chakravyuha SDK (all 11 rings, REST client)
- [`@aegis.org/sdk`](https://www.npmjs.com/package/@aegis.org/sdk) — JavaScript/TypeScript SDK

---

## Benchmark

The **Agentic Red-Team Benchmark** evaluates Ring 12 against three baselines.

```bash
git clone https://github.com/aegis-ai/chakravyuha
cd chakravyuha/agentic-redteam-benchmark
pip install -r requirements.txt
python eval.py --baseline ring12 --backend http://localhost:8000
```

See [LEADERBOARD.md](https://github.com/aegis-ai/chakravyuha/blob/main/agentic-redteam-benchmark/LEADERBOARD.md) for results.

---

MIT License — Aegis AI (Jaswanth)
