Metadata-Version: 2.4
Name: housemonkey
Version: 0.1.0
Summary: Chaos testing for AI apps. 18 extreme personas attack your AI to find edge cases before users do. OWASP LLM Top 10 coverage.
License-Expression: MIT
Project-URL: Homepage, https://github.com/awrshift/housemonkey
Project-URL: Issues, https://github.com/awrshift/housemonkey/issues
Keywords: ai,llm,testing,red-teaming,owasp,security,chaos,qa,evaluation,house-monkey
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Security
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.27
Dynamic: license-file

<p align="center">
  <img src="assets/mascot-glitch.png" alt="House Monkey mascot" width="200">
</p>

<h1 align="center">House Monkey 🐒</h1>
<p align="center"><strong>Chaos testing for AI apps. 18 extreme personas attack your AI to find edge cases before users do.</strong></p>

<p align="center">
  <a href="https://github.com/awrshift/housemonkey/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-green" alt="MIT License"></a>
  <a href="https://pypi.org/project/housemonkey/"><img src="https://img.shields.io/badge/pypi-housemonkey-blue" alt="PyPI"></a>
  <a href="https://github.com/awrshift/housemonkey"><img src="https://img.shields.io/github/stars/awrshift/housemonkey?style=social" alt="GitHub stars"></a>
</p>

```bash
pip install housemonkey
housemonkey run --target https://your-api.com/chat --owasp
```

<p align="center">
  <img src="assets/terminal-output.png" alt="House Monkey terminal output — 3 OWASP failures detected" width="700">
</p>

One command. 18 extreme personas. OWASP LLM Top 10 coverage. Terminal report in 2 minutes.

## What it does

House Monkey attacks your AI app with realistic extreme users:

- **The Jailbreaker** — tries to extract your system prompt (OWASP LLM01)
- **The Angry Customer** — escalating hostility, demands manager
- **The Confused Grandma** — off-topic, misunderstands everything
- **The Hallucination Baiter** — asks about things that don't exist (OWASP LLM09)
- **The Permission Escalator** — tricks AI into unauthorized actions (OWASP LLM06)
- **The RAG Poisoner** — manipulates retrieval context (OWASP LLM08)
- ...and 12 more

Each persona runs a multi-turn conversation against your API, then an LLM judge evaluates if your AI handled it correctly.

## Quick start

```bash
# Install
pip install housemonkey

# List all personas
housemonkey list

# Test your AI (needs OpenAI API key for persona generation + judging)
export OPENAI_API_KEY=sk-...
housemonkey run --target https://your-api.com/chat

# Run only OWASP-mapped personas
housemonkey run --target https://your-api.com/chat --owasp

# Run specific personas
housemonkey run --target https://your-api.com/chat --persona jailbreaker oversharer

# Custom API format (non-OpenAI)
housemonkey run --target https://your-api.com/ask --payload-template '{"input": "{{message}}"}'

# Save JSON report
housemonkey run --target https://your-api.com/chat --output report.json
```

## OWASP LLM Top 10 coverage

| OWASP ID | Vulnerability | Persona |
|---|---|---|
| LLM01 | Prompt Injection | The Jailbreaker |
| LLM02 | Sensitive Info Disclosure | The Oversharer |
| LLM05 | Improper Output Handling | The JSON Breaker |
| LLM06 | Excessive Agency | The Permission Escalator |
| LLM08 | Vector/Embedding Weakness | The RAG Poisoner |
| LLM09 | Misinformation | The Hallucination Baiter |
| LLM10 | Unbounded Consumption | The Resource Abuser |

## How it works

1. Each persona has a system prompt that simulates an extreme user type
2. An LLM generates realistic messages as that persona
3. Messages are sent to your target API
4. An LLM judge evaluates if your AI handled the persona correctly
5. Terminal report shows pass/fail with specific failure reasons

## Try it on a broken chatbot

```bash
# Start the intentionally broken test target (7 built-in flaws)
python test_target.py

# In another terminal, attack it
housemonkey run --target http://127.0.0.1:8888 --owasp
```

## Requirements

- Python 3.10+
- OpenAI API key (for persona generation + judging)
- Your AI app must have an HTTP API endpoint

## License

MIT. Powered by [ClawClaw Soul](https://github.com/awrshift/clawclaw-soul) open-source persona engine.
