Metadata-Version: 2.4
Name: aegis-llm
Version: 0.1.3
Summary: Pluggable multi-layer LLM jailbreak defense pipeline
License: MIT
License-File: LICENSE
Keywords: ai-safety,guardrails,jailbreak,llm,security
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Requires-Python: >=3.10
Provides-Extra: dev
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# aegis-llm

A pluggable, multi-layer LLM jailbreak defense pipeline for Python. Drop it into any codebase — framework-agnostic, provider-agnostic, zero mandatory dependencies.

## Install

```bash
pip install aegis-llm
```

Or from source:

```bash
git clone https://github.com/your-org/aegis-llm
cd aegis-llm
pip install -e .
```

## Quickstart

```python
from aegis import Decision, Request, SplitPipeline
from aegis.layers import (
    AuditLogger, InputValidation, OutputFilter,
    RateLimiter, ToolAccess,
)

pipeline = SplitPipeline(
    pre=[
        InputValidation(),
        ToolAccess(role_tool_map={
            "admin":  ["query", "create", "delete"],
            "viewer": ["query"],
        }),
        RateLimiter(max_requests=20, window_seconds=60),
    ],
    post=[
        OutputFilter(),
        AuditLogger(sink=print),
    ],
)

request = Request(user_id="u1", message="Show tasks", user_role="viewer")

# Phase 1: before your LLM call
pre = pipeline.run_pre(request)
if pre.decision == Decision.BLOCK:
    # The pipeline stops at the first BLOCK, so the last result is the blocking layer's
    raise PermissionError(pre.layer_results[-1].block_reason)

# Your LLM call, any provider (`your_llm` is a placeholder for your own client code)
allowed_tools = pre.context.get("available_tools", [])
llm_response = your_llm(request.message, tools=allowed_tools)

# Phase 2: after your LLM call
post = pipeline.run_post(request, llm_response, pre)
# Fall back to the raw response if no layer stored a filtered one
final = post.context.get("filtered_response", llm_response)
```

## How it works

Every request flows through a **Pipeline** — an ordered list of **Layers**. The pipeline short-circuits on the first `BLOCK`, so nothing downstream runs on a rejected request.

```
Request → Layer 1 → Layer 2 → ... → PipelineResult
              ↓
           BLOCK → return immediately
```

`SplitPipeline` splits this at the LLM boundary: pre-phase layers run before your model call, post-phase layers run after. Your code owns the LLM invocation; aegis owns the gating on either side.
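
The short-circuit behaviour can be illustrated without aegis installed. The snippet below is a minimal sketch, not the library's implementation: `SketchDecision`, `SketchResult`, `run_layers`, `length_check`, and `recorder` are hypothetical stand-ins invented for this example, and layers are modelled as plain callables rather than `Layer` subclasses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


# Stand-in types for illustration only; these are NOT the aegis classes.
class SketchDecision(Enum):
    PASS = "pass"
    BLOCK = "block"


@dataclass
class SketchResult:
    decision: SketchDecision
    layer_name: str
    block_reason: Optional[str] = None


# In this sketch a layer is just a callable: (message, context) -> SketchResult
SketchLayer = Callable[[str, dict], SketchResult]


def run_layers(layers, message):
    """Run layers in order and stop at the first BLOCK."""
    context = {}
    results = []
    for layer in layers:
        result = layer(message, context)
        results.append(result)
        if result.decision is SketchDecision.BLOCK:
            # Short-circuit: later layers never see the request.
            return SketchDecision.BLOCK, results
    return SketchDecision.PASS, results


def length_check(message, context):
    # Hypothetical rule: reject messages longer than 20 characters.
    if len(message) > 20:
        return SketchResult(SketchDecision.BLOCK, "length_check", "too_long")
    return SketchResult(SketchDecision.PASS, "length_check")


calls = []


def recorder(message, context):
    # Records every message it sees, to show whether it ran at all.
    calls.append(message)
    return SketchResult(SketchDecision.PASS, "recorder")


decision, results = run_layers([length_check, recorder], "x" * 50)
```

Because `length_check` blocks the 50-character message, `recorder` is never called and `results` holds a single entry, the blocking one.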

## Built-in layers

| Layer | Phase | What it does |
|-------|-------|-------------|
| `InputValidation` | pre | Regex/keyword blocklist — blocks naive injection attempts |
| `SemanticRouter` | pre | Intent classifier gate — plug in any classifier callable |
| `ToolAccess` | pre | RBAC — filters the tool schema to what the user's role permits |
| `RateLimiter` | pre | Sliding-window quota per user — pluggable backend (Redis, etc.) |
| `OutputFilter` | post | Scans LLM response for internal disclosure, redacts or blocks |
| `AuditLogger` | post | Structured audit record — plug in any sink (CloudWatch, Datadog, etc.) |
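
To make the `ToolAccess` row concrete, here is a rough sketch of role-based tool filtering. It is an illustration of the idea only: `filter_tools` is a hypothetical helper, and the `{"name": ...}` schema shape is an assumption, not the format aegis requires.

```python
def filter_tools(tool_schemas, role, role_tool_map):
    """Keep only the tool schemas whose name the role is allowed to call."""
    allowed = set(role_tool_map.get(role, []))  # unknown roles get no tools
    return [tool for tool in tool_schemas if tool["name"] in allowed]


role_tool_map = {"admin": ["query", "create", "delete"], "viewer": ["query"]}
# Hypothetical schema shape: one dict per tool, keyed by "name".
tool_schemas = [{"name": "query"}, {"name": "create"}, {"name": "delete"}]

viewer_tools = filter_tools(tool_schemas, "viewer", role_tool_map)
guest_tools = filter_tools(tool_schemas, "guest", role_tool_map)
```

Defaulting unknown roles to an empty list is a deny-by-default choice: a role missing from the map gets no tools rather than all of them.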

## Writing a custom layer

Subclass `Layer` and implement one method, `process(request, context)`, which returns a `LayerResult`:

```python
from aegis import Layer
from aegis.models import Decision, LayerResult

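class TenantIsolation(Layer):
    """Block requests whose tenant differs from the one expected in context."""

    def process(self, request, context):
        # Compare the request's tenant with the value stored under "expected_tenant"
        if request.metadata.get("tenant_id") != context.get("expected_tenant"):
            return LayerResult(
                decision=Decision.BLOCK,
                layer_name=self.name,
                block_reason="tenant_mismatch",
            )
        return LayerResult(decision=Decision.PASS, layer_name=self.name)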
```

Drop it anywhere in the pipeline:

```python
from aegis import Pipeline
from aegis.layers import InputValidation, ToolAccess

Pipeline([InputValidation(), TenantIsolation(), ToolAccess(...)])
```

## SemanticRouter — bring your own classifier

```python
from aegis.layers import SemanticRouter

def my_classifier(message: str) -> tuple[str, float]:
    # call sklearn, HuggingFace, an LLM, anything
    return "task_query", 0.92

SemanticRouter(
    classifier=my_classifier,
    allowed_intents=["task_query", "task_create"],
    confidence_threshold=0.75,
)
```
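
For a classifier you can run without any ML dependency, a keyword matcher with the same `(message) -> (intent, confidence)` shape is enough to exercise the router. This is a toy sketch: `keyword_classifier`, its keyword lists, and its confidence values are all made up for the example.

```python
def keyword_classifier(message: str) -> tuple[str, float]:
    """Toy classifier: returns (intent, confidence) from keyword hits."""
    words = message.lower().split()
    # Hypothetical intent labels and keywords, chosen only for this example.
    keywords = {
        "task_query": ("show", "list", "find"),
        "task_create": ("create", "add", "new"),
    }
    best_intent, best_hits = "unknown", 0
    for intent, triggers in keywords.items():
        hits = sum(1 for trigger in triggers if trigger in words)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    # Crude, hand-picked confidence values: no hits, one hit, several hits.
    if best_hits == 0:
        return best_intent, 0.0
    if best_hits == 1:
        return best_intent, 0.8
    return best_intent, 0.95
```

Passing `classifier=keyword_classifier` to `SemanticRouter` should then let you test allowed and disallowed intents locally before wiring in a real model.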

## RateLimiter — pluggable backend

The default backend is in-memory, so request counts are not shared between processes. For multi-process deployments, implement `RateLimitBackend` on top of a shared store:

```python
from aegis.layers.rate_limiter import RateLimitBackend, RateLimiter

class RedisBackend(RateLimitBackend):
    def get_timestamps(self, user_id): ...
    def record_request(self, user_id, timestamp): ...
    def evict_before(self, user_id, cutoff): ...

RateLimiter(max_requests=10, window_seconds=60, backend=RedisBackend())
```
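
The three backend methods map naturally onto a sliding-window check. The sketch below shows one way they could fit together, assuming timestamps are floats in seconds. `InMemorySketchBackend` and `allow` are hypothetical names, the class is kept standalone so the snippet runs without aegis (a real backend would subclass `RateLimitBackend`), and the way `RateLimiter` actually calls these methods may differ.

```python
from collections import defaultdict, deque


class InMemorySketchBackend:
    """Standalone sketch; in real use this would subclass RateLimitBackend."""

    def __init__(self):
        # One deque of request timestamps per user, oldest first.
        self._hits = defaultdict(deque)

    def get_timestamps(self, user_id):
        return list(self._hits[user_id])

    def record_request(self, user_id, timestamp):
        self._hits[user_id].append(timestamp)

    def evict_before(self, user_id, cutoff):
        # Drop timestamps that have fallen out of the window.
        hits = self._hits[user_id]
        while hits and hits[0] < cutoff:
            hits.popleft()


def allow(backend, user_id, now, max_requests, window_seconds):
    """Sliding-window check; an assumption about how the methods combine."""
    backend.evict_before(user_id, now - window_seconds)
    if len(backend.get_timestamps(user_id)) >= max_requests:
        return False
    backend.record_request(user_id, now)
    return True


backend = InMemorySketchBackend()
outcomes = [
    allow(backend, "u1", t, max_requests=2, window_seconds=60)
    for t in (0.0, 10.0, 20.0, 61.0)
]
```

With a limit of 2 per 60 seconds, the third request is rejected and the fourth is accepted again once the first timestamp has aged out of the window.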

## AuditLogger — pluggable sink

```python
import logging

from aegis.layers import AuditLogger

my_logger = logging.getLogger("audit")  # logger name is a placeholder

# CloudWatch, Datadog, S3, database: anything callable
AuditLogger(sink=my_logger.info)
```
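
Since a sink is just a callable, a JSON Lines writer is a small step up from `print`. The following is a sketch under assumptions: `make_jsonl_sink` is a hypothetical helper, and the dict passed to it is an invented record shape, since the structure of the record aegis emits is not documented here.

```python
import io
import json


def make_jsonl_sink(stream):
    """Return a sink that writes one JSON object per line to `stream`."""

    def sink(record):
        # default=str avoids a TypeError on values json cannot encode natively,
        # such as datetimes or enums, by falling back to their string form.
        stream.write(json.dumps(record, default=str, sort_keys=True) + "\n")
        stream.flush()

    return sink


buffer = io.StringIO()  # stands in for an open log file
sink = make_jsonl_sink(buffer)
sink({"user_id": "u1", "decision": "PASS"})  # hypothetical record shape
```

The returned `sink` could then be passed as `AuditLogger(sink=sink)`, with `buffer` swapped for a file opened in append mode.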

## Running tests

```bash
pip install -e ".[dev]"
pytest
```
