Metadata-Version: 2.4
Name: scelabs-aegis
Version: 0.5.0
Summary: Runtime control layer for stabilizing AI systems and improving behavior without retraining
Author: SCELabs
License-Expression: MIT
Keywords: ai,llm,agents,runtime,stability,reliability,control,automation,aegis
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.27.0
Requires-Dist: requests>=2.31.0
Dynamic: license-file

<p align="center">
  <a href="https://github.com/SCELabs/aegis-client">
    <img src="assets/aegis_logo.png" width="260" alt="Aegis" />
  </a>
</p>

<h2 align="center">
  <strong>Runtime control for AI systems.</strong>
</h2>

Aegis sits on top of your AI pipeline and returns structured control decisions that stabilize behavior at runtime without replacing your model, agent, or retrieval system.

---

## Why Aegis

Modern AI systems often fail in subtle but costly ways:

* inconsistent outputs across similar inputs
* unstable multi-step reasoning
* retrieval drift in RAG systems
* fragile workflow and agent execution

Aegis addresses these problems with **runtime control**, not retraining, fine-tuning, or model swapping.

---

## Core Idea

Aegis is a **control layer**, not an execution layer.

```python
from aegis import AegisClient

client = AegisClient(api_key="YOUR_API_KEY")

result = client.auto().llm(...)
```

Aegis will:

* detect instability signals
* select minimal corrective actions
* return runtime controls and observability data

Aegis does **not** execute the downstream LLM call for you.
It is not a model, not a full execution engine, and not a replacement for LangChain, LangGraph, or your tool stack.

---

## Installation

```bash
pip install scelabs-aegis
```

---

## Aegis Shell

Aegis Shell is the easiest way to use Aegis as a control layer in your existing workflow.

It runs alongside your current tooling with no SDK integration required. You can keep using Claude Code, Codex, Cursor, Copilot, or any other AI tool. Aegis observes repo and workflow signals, applies runtime control guidance, detects retry loops and scope drift, and reports estimated calls and cost saved.

### Quick Start

```bash
pip install scelabs-aegis
aegis init

cd your-project
aegis start
```

Then use your AI tools normally.

### Key Commands

```bash
aegis start
aegis status
aegis summary
aegis stats
aegis stop
aegis doctor
aegis reset
```

Example summary output:

```text
[Aegis] Aegis Summary (Session)
[Aegis] Issues caught:
[Aegis] - Loop detected (1 time)
[Aegis] - Scope drift detected (1 time)
[Aegis] Interventions:
[Aegis] - Control signals issued: 2
[Aegis] - Escalations: 1
[Aegis] Impact:
[Aegis] - Estimated calls saved: 11
[Aegis] - Estimated cost saved: $2.20
```

See full guide: [./docs/aegis-shell.md](./docs/aegis-shell.md)

---

## Get an API Key

```bash
curl -X POST https://aegis-backend-production-4b47.up.railway.app/v1/onboard \
  -H "Content-Type: application/json" \
  -d '{"email":"you@example.com"}'
```

This returns:

* `api_key`
* auto scope URLs (including `auto_llm_url`, `auto_rag_url`, `auto_step_url`, `auto_context_url`, and `auto_agent_url` on current backends)
* example usage

---

## Set Environment

```bash
export AEGIS_API_KEY=your_key_here
export AEGIS_BASE_URL=https://aegis-backend-production-4b47.up.railway.app
```

---

## First Call

```python
from aegis import AegisClient, AegisConfig

client = AegisClient(
    config=AegisConfig(mode="balanced"),
)

result = client.auto().llm(
    base_prompt="You are a careful assistant.",
    input={"user_query": "Explain recursion simply."},
    symptoms=["inconsistent_outputs"],
    severity="medium",
)

print(result.actions)
print(result.explanation)
print(result.scope_data)
```

---

## Scope-First API

Aegis uses a **scope-first runtime interface**:

```python
client.auto().llm(...)
client.auto().rag(...)
client.auto().step(...)
client.auto().context(...)
client.auto().agent(...)
```

These calls map to first-class public backend routes:

* POST /v1/auto/llm
* POST /v1/auto/rag
* POST /v1/auto/step
* POST /v1/auto/context
* POST /v1/auto/agent

---

## Scopes

### LLM

Use `llm` when you need stabilization around a direct model call.

```python
result = client.auto().llm(
    base_prompt="You are a careful assistant.",
    input={"user_query": "Explain recursion simply."},
    symptoms=["inconsistent_outputs"],
    severity="medium",
)
```

---

### RAG

Use `rag` when instability appears in retrieval plus generation.

```python
result = client.auto().rag(
    query="What changed in the policy?",
    retrieved_context=[
        "Policy updated last week.",
        "Refund window reduced to 14 days."
    ],
    symptoms=["retrieval_drift"],
    severity="medium",
)
```

### What changed in RAG

Aegis no longer treats retrieval as a fixed input.

It **controls retrieval behavior at runtime**.

The RAG scope now:

* enforces **typed evidence coverage** (source, test, support)
* applies **relevant-file protection** (never drops critical context)
* performs **selective expansion** (not always-on)
* removes noise without losing required files
* uses **staged retrieval** only when ambiguity or gaps are detected
* applies **guided retrieval (intent + plan)** only when justified

This is not just ranking or filtering.

Aegis:

* evaluates the retrieved set
* diagnoses issues (missing support, ambiguity, distractors)
* applies minimal corrective actions
* returns a controlled context for downstream use

---

### How RAG control works

At runtime:

1. You pass query + retrieved context

2. Aegis evaluates:

   * missing required evidence
   * role imbalance (source/test/support)
   * distractor pressure
   * ambiguity / multi-branch cases

3. It decides whether to:

   * keep as-is
   * prune noise
   * expand retrieval
   * run a staged second pass
   * guide retrieval when needed

4. It enforces **relevant-file protection before final selection**

Everything is gated and minimal.

No always-on expansion. No blind pruning.

---

### Works with Agentic RAG

Yes.

Aegis sits above your agent system and stabilizes retrieval behavior.

It can:

* prevent agents from drifting due to poor context
* enforce evidence requirements before execution
* reduce retries and replans
* stabilize multi-step retrieval chains

Aegis does not replace your agents — it makes them more reliable.

---

### Step

Use `step` when you need stabilization for a workflow or agent step.

```python
result = client.auto().step(
    step_name="coordinator",
    step_input={"task": "resolve ticket"},
    symptoms=["unstable_workflow"],
    severity="medium",
)
```

### Context

Use `context` to control information state before the next model or workflow action.

```python
result = client.auto().context(
    objective="Prepare the next response context.",
    messages=[
        {"role": "user", "content": "Summarize blockers from this thread."},
        {"role": "assistant", "content": "Draft summary goes here."},
    ],
    tool_results=[
        {"tool": "ticket_lookup", "ok": True, "data": {"id": "T-42", "status": "open"}},
    ],
    constraints=["keep it concise", "cite ticket IDs"],
    severity="medium",
)
```

`context` can clean and prioritize messages and tool results so your downstream call receives better state.

### Agent

Use `agent` to control multi-step workflow loops on top of your existing AI pipeline.

```python
result = client.auto().agent(
    goal="Resolve the support ticket safely.",
    steps=[
        {"name": "triage", "input": {"ticket_id": "T-42"}},
        {"name": "propose_resolution", "input": {"channel": "email"}},
    ],
    max_steps=4,
    severity="medium",
)
```

`agent` can control multi-step execution, tool-result integration, carry-forward context, and stop/retry/escalation decisions.

---

## What Aegis Returns

Every call returns an `AegisResult`.

```python
result = client.auto().llm(...)
```

### Key fields

* `actions` — interventions Aegis selected
* `trace` — structured control trace
* `metrics` — runtime signals
* `used_fallback` — whether fallback behavior was used
* `explanation` — concise rationale
* `scope` — llm, rag, step, context, or agent
* `scope_data` — scope-specific runtime data

---

### RAG Observability (new)

RAG responses now include richer runtime signals:

Inside `scope_data`:

* `public_rag_runtime` — high-level runtime info
* `retrieval_intent` — if guided retrieval was used
* `retrieval_plan` — structured retrieval guidance (when triggered)
* `initial_retrieved_chunks` — stage 1 candidates
* `stage2_retrieved_chunks` — staged retrieval results (if used)
* `before_after_metrics` — context quality changes

Inside `trace`:

* `decision.policy_path` includes:

  * expansion score / threshold
  * staged retrieval activation
  * intent / plan activation

* `changes` includes:

  * protected chunk IDs
  * relevant-file protection indicators

These are optional but useful for debugging pipeline behavior.

---

## Typical RAG Integration Pattern

```python
result = client.auto().rag(
    query="Why is retry failing?",
    retrieved_context=raw_context,
    symptoms=["retrieval_drift"],
    severity="medium",
)

controlled_context = result.scope_data.get("retrieved_context")
trace = result.trace

print(controlled_context)
print(result.actions)
print(trace)
```

You apply the returned controlled context in your downstream system.

---

## Example Result Shape

```json
{
  "actions": [...],
  "trace": [...],
  "scope": "rag",
  "scope_data": {
    "retrieved_context": [...],
    "public_rag_runtime": {...},
    "before_after_metrics": {...}
  }
}
```

---

## Debugging

```python
print(result.summary())
print(result.debug_summary())
print(result.to_dict())
```

`summary()` is for human-readable inspection.
`to_dict()` is for raw structured output.
`debug_summary()` is compact.

Useful fields:

```python
print(result.actions)
print(result.explanation)
print(result.trace)
print(result.scope_data)
```

---

## Configuration

```python
from aegis import AegisConfig

config = AegisConfig(
    mode="balanced",
    max_interventions=3,
    allow_retries=True,
    allow_retrieval_expansion=True,
    allow_context_reduction=True,
    allow_prompt_shaping=True,
    fallback="baseline",
    explain=False,
    emit_trace=False,
    policy=None,
    timeout_ms=30000,
)
```

---

## Required Request Inputs

For scope calls, `severity` should be one of: `low`, `medium`, `high`.

Symptoms behavior:

* `llm`, `rag`, and `step` require explicit `symptoms` and `severity`
* `context` and `agent` provide safe defaults for `symptoms` and `severity` when omitted

Example:

```python
result = client.auto().llm(
    base_prompt="You are a careful assistant.",
    symptoms=["inconsistent_outputs"],
    severity="medium",
)
```

---

## Design Principles

* runtime control over training
* minimal intervention
* observable behavior through trace and actions
* model-agnostic integration

---

## Documentation

Docs in `/docs` explain:

* architecture
* scopes
* [Request Shapes](./docs/request-shapes.md)
* result behavior
* integration guidance
* migration and usage patterns

---

## Examples

* [OpenAI LLM Control](./docs/examples/openai_llm.md)
* [RAG Pipeline](./docs/examples/rag_pipeline.md)
* [Agent Workflow](./docs/examples/agent_workflow.md)

---

## Status

* Stable SDK surface
* Active scopes: llm, rag, step, context, agent
* Public backend routes aligned to the scope-first contract

---

## License

MIT

