Metadata-Version: 2.4
Name: dual-llm-bench
Version: 0.1.1
Summary: Benchmark prompt-injection resilience and tool safety for dual-LLM agent architectures.
Project-URL: Homepage, https://github.com/rudrakshkarpe/dual-llm-bench
Project-URL: Issues, https://github.com/rudrakshkarpe/dual-llm-bench/issues
Author-email: Rudraksh Karpe <rudraksh@simplismart.tech>
License: MIT
License-File: LICENSE
Keywords: ai-agents,benchmarking,evaluation,llm-security,prompt-injection
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Security
Requires-Python: >=3.11
Requires-Dist: pydantic<3,>=2.7
Requires-Dist: typer<1,>=0.12
Provides-Extra: dev
Requires-Dist: build<2,>=1.2; extra == 'dev'
Requires-Dist: mypy<2,>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio<1,>=0.23; extra == 'dev'
Requires-Dist: pytest<9,>=8; extra == 'dev'
Requires-Dist: ruff<1,>=0.5; extra == 'dev'
Requires-Dist: twine<7,>=5; extra == 'dev'
Description-Content-Type: text/markdown

# dual-llm-bench

`dual-llm-bench` is a small benchmarking package for measuring prompt-injection resilience, privileged-context exposure, tool-safety behavior, and useful-task retention in agentic LLM systems.

The package is inspired by evaluation libraries such as Ragas: define a structured sample, run an agent, score the resulting trace with composable metrics, and aggregate the result into a report.

## Install

```bash
pip install dual-llm-bench
```

For local development:

```bash
pip install -e ".[dev]"
pytest
```

## Quick Start

```python
from dual_llm_bench import BenchmarkSuite, built_in_metrics
from dual_llm_bench.runners import CallableRunner


def agent(sample):
    return {
        "privileged_input": sample.user_goal,
        "policy_verdict": "needs_human",
        "final_outcome": "Created a security review ticket.",
        "tool_request": {"name": "create_ticket", "arguments": {"queue": "security"}},
        "quarantined_output": {
            "summary": "Suspicious vendor request.",
            "entities": ["invoice_id:8821"],
            "suspicious_instructions": sample.attack_markers,
            "sanitization_notes": ["Removed attacker-controlled instructions."],
        },
    }


suite = BenchmarkSuite.from_builtin("pycon-core")
report = suite.run(CallableRunner(agent), metrics=built_in_metrics())
print(report.to_markdown())
```

## Core Metrics

- **Injection Resistance**: rewards agents that avoid forbidden tools and unsafe final outcomes when prompt-injection fixtures are present.
- **Privileged Context Exposure**: checks whether attacker markers leak into the privileged model input.
- **Tool Decision Safety F1**: compares tool/policy decisions with expected safe labels.
- **Utility Retention**: checks whether useful business facts survive quarantine and appear in the final trace.

## CLI

```bash
dual-llm-bench list-datasets
dual-llm-bench inspect pycon-core
dual-llm-bench score-traces traces.jsonl --dataset pycon-core
```

`score-traces` expects JSONL records containing `sample_id` plus trace fields such as `privileged_input`, `policy_verdict`, `tool_request`, `quarantined_output`, and `final_outcome`.

## Example Integrations

- [`examples/pycon_dual_llm_demo`](examples/pycon_dual_llm_demo): adapter and runner for the PyCon Dual LLM security demo app.
- [`docs/benchmarking-methodology.md`](docs/benchmarking-methodology.md): how samples, traces, and metrics fit together.
- [`docs/quarantine-policy.md`](docs/quarantine-policy.md): when to run the dual path, when to skip it, and how to reduce latency.

## Publishing

Preferred release path:

1. Push a clean `main` branch and wait for GitHub CI to pass.
2. Create a GitHub release for a version tag such as `v0.1.0`.
3. Configure PyPI Trusted Publishing for `.github/workflows/publish.yml` and the `pypi` environment.
4. Let the release workflow build, smoke-test, and publish the package.

Manual local validation:

```bash
python -m pip install -e ".[dev]"
pytest
ruff check .
mypy src
rm -rf dist
python -m build
twine check dist/*
```

Use TestPyPI before publishing a public release if you have not validated the workflow before. Prefer Trusted Publishing over long-lived PyPI tokens.
