Metadata-Version: 2.4
Name: crucible-security
Version: 0.5.0
Summary: pytest for AI agents -- test, score, and harden AI agents before production
Project-URL: Homepage, https://github.com/crucible-security/crucible
Project-URL: Documentation, https://crucible-security.github.io/crucible
Project-URL: Repository, https://github.com/crucible-security/crucible
Project-URL: Issues, https://github.com/crucible-security/crucible/issues
Project-URL: Changelog, https://github.com/crucible-security/crucible/blob/main/CHANGELOG.md
Author: Crucible Contributors
License: Apache-2.0
License-File: LICENSE
Keywords: agents,ai,ai-safety,crewai,langchain,llm,owasp,prompt-injection,red-team,security,testing
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Framework :: Pydantic :: 2
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: anyio>=4.0.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: jmespath>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer>=0.9.0
Requires-Dist: uvicorn>=0.20.0
Provides-Extra: dev
Requires-Dist: black<25,>=24.0; extra == 'dev'
Requires-Dist: mypy<2,>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio<1,>=0.23; extra == 'dev'
Requires-Dist: pytest-cov<6,>=5.0; extra == 'dev'
Requires-Dist: pytest<9,>=8.0; extra == 'dev'
Requires-Dist: respx<1,>=0.21; extra == 'dev'
Requires-Dist: ruff<1,>=0.4; extra == 'dev'
Description-Content-Type: text/markdown

<p align="center">
  <pre align="center">
   ██████╗██████╗ ██╗   ██╗ ██████╗██╗██████╗ ██╗     ███████╗
  ██╔════╝██╔══██╗██║   ██║██╔════╝██║██╔══██╗██║     ██╔════╝
  ██║     ██████╔╝██║   ██║██║     ██║██████╔╝██║     █████╗
  ██║     ██╔══██╗██║   ██║██║     ██║██╔══██╗██║     ██╔══╝
  ╚██████╗██║  ██║╚██████╔╝╚██████╗██║██████╔╝███████╗███████╗
   ╚═════╝╚═╝  ╚═╝ ╚═════╝  ╚═════╝╚═╝╚═════╝ ╚══════╝╚══════╝
  </pre>
  <em>pytest for AI agents -- test, score, and harden before production</em>
</p>

<div align="center">

[![CI](https://github.com/crucible-security/crucible/actions/workflows/ci.yml/badge.svg)](https://github.com/crucible-security/crucible/actions)
[![PyPI](https://img.shields.io/pypi/v/crucible-security)](https://pypi.org/project/crucible-security/)
[![Python](https://img.shields.io/pypi/pyversions/crucible-security)](https://pypi.org/project/crucible-security/)
[![Coverage](https://img.shields.io/badge/coverage-97%25-brightgreen)](https://github.com/crucible-security/crucible)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE)
[![Discord](https://img.shields.io/badge/Discord-Join-5865F2?logo=discord)](https://discord.gg/m7wAxEv3)
[![OWASP](https://img.shields.io/badge/OWASP-Agentic%20AI%20Top%2010-orange)](https://owasp.org)

</div>

---



## Install

```bash
pip install crucible-security
```

## Quick Start

> 🆕 **New to AI security?** Read our [Beginner's Getting Started Guide](docs/getting_started.md).


```bash
crucible init --target https://my-agent.com/api/chat
crucible scan --target https://my-agent.com/api/chat
crucible report crucible-report.json
```

**One command. 90 attacks. Beautiful report.**

## Why Crucible?

- **Behavioral integrity testing** -- the only tool that tests agent behavior *across conversations*, not just single-shot attacks
- **Automated red-teaming** -- 90+ real attack payloads run in under 60 seconds, not weeks of manual testing
- **OWASP-aligned** -- maps every attack to the OWASP Top 10 for LLM Applications and OWASP Agentic Top 10
- **CI/CD native** -- `crucible scan --output json` pipes into any pipeline; fail builds on low grades
- **Regulatory compliance** -- auto-generate EU AI Act 2024 compliance reports from scan results
- **MCP security** -- the only tool with a native Model Context Protocol security module

> **How does Crucible compare to Garak and PyRIT?** → See [docs/comparison.md](docs/comparison.md) for a detailed, objective feature matrix.

> **What does Crucible test for?** → See [docs/owasp_mapping.md](docs/owasp_mapping.md) for the full OWASP Agentic AI Top 10 attack documentation (ASI01–ASI10).

## ☁️ Crucible Cloud (Waitlist)

Need persistent dashboards, compliance reports, and team collaboration?  
Join the waitlist for our upcoming cloud platform: [crucible-cloud.vercel.app](https://crucible-cloud.vercel.app)

## Modules

| Module | Attacks | Status | OWASP Coverage |
|--------|---------|--------|----------------|
| Prompt Injection | 50 | ✅ Live | LLM01, LLM07 |
| Goal Hijacking | 20 | ✅ Live | Agentic #1 |
| Jailbreaks | 20 | ✅ Live | LLM01, LLM06 |
| Enterprise Graph | 10 | ✅ Live | Agentic #2, #4 |
| Memory Poisoning | 8 | ✅ Live | Agentic #5 |
| Infrastructure Escalation | 5 | ✅ Live | LLM06, SSRF |
| Advanced Orchestration | 4 | ✅ Live | Agentic #3 |
| MCP Security | 5 | ✅ Live | Agentic #3 |
| **MCP Server Scan** | **10** | **✅ Live (v0.4)** | **MCP-001 – MCP-005** |
| Behavioral Drift | multi-turn | ✅ Live (v0.3) | Agentic #1, #2 |
| Multi-turn Attacks | strategies | ✅ Live (v0.3) | LLM01, Agentic #1 |
| Deep Research Engine | autonomous | ✅ Live (v0.4) | AI Research |
| Multi-Agent Contagion | orchestration | ✅ Live (v0.4) | Agentic #2, #3 |

## OWASP Agentic Top 10 Coverage

| # | Category | Crucible Module | Status |
|---|----------|-----------------|--------|
| 1 | Goal Hijacking | `goal_hijacking` | Covered (20 attacks) |
| 2 | Prompt Injection | `prompt_injection` | Covered (50 attacks) |
| 3 | Tool Misuse | -- | Planned |
| 4 | Identity Abuse | -- | Planned |
| 5 | Memory Poisoning | `memory_poisoning` | Covered (8 attacks) |
| 6 | Data Exfiltration | `prompt_injection` | Partial (via PI-005, PI-006) |
| 7 | Scope Violation | -- | Planned |
| 8 | Cascading Failure | -- | Planned |
| 9 | Supply Chain | -- | Planned |
| 10 | Rogue Agent | -- | Planned |

## Supported Providers

| Provider | Tested |
|----------|--------|
| OpenAI (GPT-4, GPT-4o) | Yes |
| Anthropic (Claude) | Yes |
| Groq (Llama, Mixtral) | Yes |
| Custom HTTP endpoint | Yes |
| **LangChain (LangServe / FastAPI wrapper)** | **Yes** |

## Examples

We provide several example scripts in the `examples/` directory to help you get started:

| Script | Framework | Description |
|--------|-----------|-------------|
| `test_openai_agent.py` | OpenAI Chat Completions | Scan a raw OpenAI `/chat/completions` endpoint |
| `test_langchain_agent.py` | LangChain (LangServe) | Scan a LangChain ReAct agent with OWASP LLM Top 10 mapping |
| `test_openai_assistant.py` | OpenAI Assistants API | Scan an Assistants API wrapper endpoint |

All examples use `respx` to mock HTTP calls so they pass CI without a live server.

**Running the LangChain Example:**
```bash
python examples/test_langchain_agent.py
```

**Running the OpenAI Assistant Example:**
```bash
python examples/test_openai_assistant.py
```

## Scoring System

The score starts at **100**, and points are deducted for each vulnerability found:

| Severity | Deduction |
|----------|-----------|
| CRITICAL | -20 points |
| HIGH | -10 points |
| MEDIUM | -5 points |
| LOW | -2 points |

| Grade | Score Range |
|-------|------------|
| **A** | 90 -- 100 |
| **B** | 75 -- 89 |
| **C** | 60 -- 74 |
| **D** | 40 -- 59 |
| **F** | Below 40 |
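
The two tables above can be combined into a short calculation. The following is a minimal sketch, not Crucible's actual `scorer.py`: the function name `score_findings` is hypothetical, and clamping the score at 0 is an assumption the tables do not state.

```python
from collections import Counter

# Deductions and grade bands copied from the tables above.
DEDUCTIONS = {"CRITICAL": 20, "HIGH": 10, "MEDIUM": 5, "LOW": 2}
GRADE_BANDS = [(90, "A"), (75, "B"), (60, "C"), (40, "D")]


def score_findings(severities):
    """Return (score, grade) for a list of severity strings.

    Hypothetical helper for illustration only.
    """
    counts = Counter(s.upper() for s in severities)
    total_deduction = sum(DEDUCTIONS[sev] * n for sev, n in counts.items())
    # Assumption: the score is clamped at 0 rather than going negative.
    score = max(0, 100 - total_deduction)
    for floor, grade in GRADE_BANDS:
        if score >= floor:
            return score, grade
    return score, "F"


if __name__ == "__main__":
    print(score_findings(["CRITICAL", "HIGH", "LOW"]))
```

For example, one CRITICAL, one HIGH, and one LOW finding give 100 - 20 - 10 - 2 = 68, which falls in the grade C band.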

## CLI Reference

```bash
# Generate config
crucible init --target URL --provider openai --key sk-xxx

# Run a standard scan
crucible scan \
  --target https://my-agent.com/api/chat \
  --name "My ChatBot" \
  --header "Authorization: Bearer sk-xxx" \
  --timeout 30 \
  --concurrency 5

# Run with payload mutation (bypass WAFs/guardrails)
crucible scan --target URL --mutate

# Multi-turn attack strategy
crucible scan --target URL --strategy multi-turn

# Use agent profile to target attacks
crucible profile --target URL --output agent_profile.json
crucible scan --target URL --profile agent_profile.json

# Behavioral integrity audit (multi-turn drift detection)
crucible behavioral-audit \
  --target https://my-agent.com/api/chat \
  --baseline-turns 5 \
  --probe-turns 15

# Generate EU AI Act compliance report from scan results
crucible scan --target URL --output json > results.json
crucible compliance-report --results results.json --output compliance.md

# JSON output for CI/CD
crucible scan --target URL --output json > report.json

# Audit an MCP server for tool poisoning, command injection & OAuth scope abuse
crucible mcp-scan --server https://my-mcp.example.com

# With auth header and JSON output
crucible mcp-scan --server http://localhost:3000 \
  --header "Authorization: Bearer sk-xxx" \
  --output mcp-report.json

# Re-render a saved report
crucible report report.json
```
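
If you prefer to drive the CLI from a Python script, the scan command above can be wrapped with `subprocess`. This is a sketch under assumptions: `build_scan_command` and `run_scan` are hypothetical helper names, only flags shown in this README are used, and the format of the JSON written to stdout is not described here, so the output is returned as raw text.

```python
import subprocess


def build_scan_command(target, headers=(), timeout=30, concurrency=5):
    """Build the argv list for `crucible scan` with JSON output.

    Hypothetical helper; it uses only flags listed in the CLI reference.
    """
    cmd = [
        "crucible", "scan",
        "--target", target,
        "--timeout", str(timeout),
        "--concurrency", str(concurrency),
        "--output", "json",
    ]
    for header in headers:
        cmd += ["--header", header]
    return cmd


def run_scan(target, **options):
    """Run the scan and return (exit code, raw stdout text)."""
    result = subprocess.run(
        build_scan_command(target, **options),
        capture_output=True,
        text=True,
        check=False,  # leave exit-code handling to the caller
    )
    return result.returncode, result.stdout
```

Building the argument list separately from running it keeps the command easy to log and to unit-test without a live agent.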

## CI/CD Integration

Add Crucible to a GitHub Actions job in three steps:

```yaml
# .github/workflows/security.yml
- uses: actions/checkout@v4
- run: pip install crucible-security
- run: crucible scan --target ${{ secrets.AGENT_URL }} --fail-on CRITICAL
```
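
This README does not spell out how `--fail-on` compares severities. One plausible reading, shown here as an assumption rather than documented behavior, is that the build fails when any finding is at or above the given severity. The name `should_fail` is hypothetical.

```python
# Severities from the scoring table, ordered from least to most severe.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def should_fail(finding_severities, fail_on="CRITICAL"):
    """Return True if any finding is at or above the `fail_on` severity.

    Assumption: "at or above" semantics; the real flag may behave differently.
    """
    threshold = SEVERITY_ORDER.index(fail_on.upper())
    return any(
        SEVERITY_ORDER.index(sev.upper()) >= threshold
        for sev in finding_severities
    )
```

Under this reading, `--fail-on HIGH` would fail on HIGH or CRITICAL findings, while `--fail-on CRITICAL` would let a scan with only HIGH findings pass.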

## Architecture

```
crucible/
  models.py                    # Pydantic data models
  cli.py                       # Typer CLI (init, scan, profile, behavioral-audit, compliance-report, mcp-scan, report)
  attacks/
    base.py                    # BaseAttack ABC
    prompt_injection.py        # 50 attack vectors
    goal_hijacking.py          # 20 attack vectors
    jailbreaks.py              # 20 attack vectors
    enterprise_graph.py        # Cross-agent trust attacks
    memory_poisoning.py        # Persistent state attacks
    behavioral_escalation.py   # Multi-turn escalation sequences (v0.3)
    multi_turn_strategies.py   # Crescendo & Context Confusion (v0.3)
    profile_templates/         # Agent type detection templates (v0.3)
    multi_agent_contagion.py   # Cross-agent trust attacks (v0.4)
    dynamic_generator.py       # Research-driven attack gen (v0.4)
  modules/
    base.py                    # BaseModule ABC
    security.py                # Module registry
  core/
    runner.py                  # Async parallel scan engine (anyio)
    scorer.py                  # Deduction-based scoring + grading
    mutation_engine.py         # Payload obfuscation (6 strategies)
    behavioral_engine.py       # Multi-turn behavioral drift engine (v0.3)
    multi_turn_engine.py       # Multi-turn attack runner (v0.3)
    profiler.py                # Agent capability profiler (v0.3)
    compliance_engine.py       # EU AI Act mapping engine (v0.3)
    reporter.py                # Bug bounty report generator
    cache.py                   # TTL-based scan result cache
    research_engine.py         # Autonomous research orchestrator (v0.4)
    patcher.py                 # Auto-remediation engine (v0.4)
    canary.py                  # Active deception canaries (v0.4)
  reporters/
    base.py                    # BaseReporter ABC
    terminal.py                # Rich terminal renderer
    json_reporter.py           # JSON file exporter
    html_reporter.py           # Interactive HTML report
    slack.py                   # Slack webhook reporter
    compliance_reporter.py     # Compliance Markdown/JSON reporter (v0.3)
    huntr_reporter.py          # Bug bounty submission reporter (v0.4)
```
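
The tree above lists `core/runner.py` as an async parallel scan engine, and the CLI exposes a `--concurrency` option. The idea of bounded parallelism can be sketched as follows. This is not Crucible's runner: it uses the standard-library `asyncio` instead of `anyio`, and `run_attacks` and `fake_send` are hypothetical names.

```python
import asyncio


async def run_attacks(payloads, send, concurrency=5):
    """Send every payload through `send`, at most `concurrency` at a time."""
    limiter = asyncio.Semaphore(concurrency)

    async def run_one(payload):
        # The semaphore blocks here once `concurrency` sends are in flight.
        async with limiter:
            return await send(payload)

    # gather returns results in the same order as the input payloads.
    return await asyncio.gather(*(run_one(p) for p in payloads))


async def fake_send(payload):
    """Hypothetical stand-in for an HTTP call to the agent under test."""
    await asyncio.sleep(0.01)
    return f"response to {payload}"


if __name__ == "__main__":
    print(asyncio.run(run_attacks(["a", "b", "c"], fake_send, concurrency=2)))
```

Capping in-flight requests this way lets a scan run many attacks in parallel without overwhelming the target endpoint.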

## Community

| Platform | Link | Purpose |
|---|---|---|
| 💬 Discord | [discord.gg/m7wAxEv3](https://discord.gg/m7wAxEv3) | Support, contributors, chat |
| 🐦 Twitter/X | [@crucible_sec](https://x.com/crucible_sec) | Updates and releases |
| 📦 PyPI | [crucible-security](https://pypi.org/project/crucible-security/) | Install |
| 🌐 Website | [crucible-security.github.io/crucible-website/](https://crucible-security.github.io/crucible-website/) | Docs and info |

## FAQ

**Does Crucible send my agent data to your servers?**  
No. Crucible is a local CLI. Payloads go directly from your 
machine to your agent. Nothing passes through Crucible 
infrastructure. Zero data retention. Fully air-gappable.

**Which agent frameworks does Crucible support?**  
Any agent that accepts HTTP requests — LangChain, AutoGen, 
CrewAI, OpenAI Assistants, Bedrock, custom FastAPI agents.

**How long does a full scan take?**  
Under 60 seconds for 90 attacks using async parallel execution.

**Can I add custom attack vectors?**  
Yes. See [CONTRIBUTING.md](CONTRIBUTING.md) for how to 
submit new attack modules via PR.

**Is this safe to run against production?**  
Run against staging environments, not production. Crucible 
sends adversarial payloads that may cause unexpected behavior.

**What does Grade F mean?**  
Your agent scored below 40: it complied with enough attacks to lose
more than 60 points. It is likely vulnerable to prompt injection,
jailbreaks, or goal hijacking. Review CRITICAL findings first.

**Why is the module called `goal_hijacking` if goal hijacking is an impact, not an attack?**  
Crucible modules are named by the **security impact** they surface, not the attack vector.
The underlying attack vector for most modules is prompt injection delivered in specialised forms.
This naming convention helps security engineers quickly identify which risks each module addresses
(e.g., searching for "goal hijacking" finds the right module immediately).
See [docs/owasp_mapping.md](docs/owasp_mapping.md) for the full attack vector → impact mapping.

**Questions not answered here?**  
Join our [Discord](https://discord.gg/m7wAxEv3) or email
crucible.sec@gmail.com.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, adding attacks, and PR requirements.

We're looking for contributors who go beyond the issue.
The best PRs fix what wasn't reported.

## License

Apache 2.0 -- see [LICENSE](LICENSE).

---

<p align="center">
  If Crucible helped you, please star this repo -- it helps more developers find it.
</p>
