Metadata-Version: 2.4
Name: prompt-brittleness
Version: 0.1.0
Summary: Catch brittle prompts before production does — brittleness score and CI gate for LLM prompts under paraphrase
Project-URL: Homepage, https://github.com/Rowusuduah/prompt-shield
Project-URL: Bug Tracker, https://github.com/Rowusuduah/prompt-shield/issues
License: MIT License
        
        Copyright (c) 2026 BuildWorld
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: ai,brittleness,llm,nlp,prompts,robustness,testing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.9
Requires-Dist: click>=8.1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: sentence-transformers>=2.7.0
Provides-Extra: dev
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pytest-asyncio; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: pyyaml>=6.0; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: llm
Requires-Dist: anthropic>=0.25.0; extra == 'llm'
Provides-Extra: t5
Requires-Dist: torch>=2.0.0; extra == 't5'
Requires-Dist: transformers>=4.40.0; extra == 't5'
Description-Content-Type: text/markdown

# prompt-shield

**Catch brittle prompts before production does.**

`prompt-shield` runs your LLM function against semantically equivalent paraphrases of your test inputs. If the outputs diverge, your prompt is brittle, and production will find it before you do.

## Install

```bash
pip install prompt-brittleness
```

## Quick Start

```python
from prompt_shield import BrittlenessRunner

def my_llm(user_input: str) -> str:
    return call_my_llm(user_input)  # your LLM function

runner = BrittlenessRunner(llm_function=my_llm)
result = runner.run(
    test_inputs=["What is the return policy?"],
    prompt_name="support_prompt",
)

print(result.verdict)   # ROBUST / CONDITIONAL / BRITTLE
print(result.score)     # BrittlenessScore (0.0–1.0)
print(result.certificate.to_markdown())
```

## Three Stress Levels

The three levels are modeled on the storm in Matthew 7:24-27 (Two Builders): rain, streams, and wind.

| Level | Vector | Example |
|-------|--------|---------|
| `lexical` | Rain — synonym substitution | "What is" → "What's the meaning of" |
| `syntactic` | Streams — structural transformation | "What is X?" → "Tell me about X" |
| `semantic` | Wind — full meaning reformulation | "How do I cancel?" → "I'd like to end my subscription" |
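
To make the three levels concrete, here is a minimal, self-contained sketch of the underlying idea: run one function over paraphrases grouped by level and measure how far each output drifts from the output for the original input. It is not prompt-shield's implementation. The `toy_llm` function, the hand-written paraphrases, and the `difflib`-based `divergence` are all stand-ins chosen so the snippet runs without a model or any third-party package.

```python
from difflib import SequenceMatcher
from statistics import mean

# Hypothetical paraphrases, hand-written for illustration and grouped by stress level.
PARAPHRASES = {
    "lexical": ["How can I cancel?"],
    "syntactic": ["Tell me how to cancel."],
    "semantic": ["I'd like to end my subscription"],
}


def toy_llm(user_input: str) -> str:
    """Deliberately brittle stand-in for an LLM call: it keys on the word 'cancel'."""
    if "cancel" in user_input.lower():
        return "Go to Settings > Billing > Cancel plan."
    return "Sorry, I did not understand the request."


def divergence(a: str, b: str) -> float:
    """Return 1 - string similarity, so 0.0 means the two outputs are identical."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def divergence_by_level(llm, original: str, paraphrases: dict) -> dict:
    """Mean divergence from the baseline output for each stress level."""
    baseline = llm(original)
    return {
        level: mean(divergence(baseline, llm(variant)) for variant in variants)
        for level, variants in paraphrases.items()
    }


scores = divergence_by_level(toy_llm, "How do I cancel?", PARAPHRASES)
for level, score in scores.items():
    print(f"{level}: {score:.2f}")
```

With this toy function the lexical and syntactic variants produce the same output as the original input, so their divergence is 0.0, while the semantic variant drops the keyword and diverges. That gap between levels is the kind of fault line the stress levels are meant to expose.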

## CLI

```bash
# Run audit
shield run --config shield.yaml

# CI gate (exit 0 = pass, 1 = brittle)
shield ci --config shield.yaml

# History
shield report --store ./shield.db
```

## Verdicts

| Verdict | BrittlenessScore | Meaning |
|---------|-----------------|---------|
| `ROBUST` | ≤ 0.15 | Prompt handles paraphrase variation |
| `CONDITIONAL` | > 0.15 and ≤ 0.30 | Some sensitivity; review fault lines |
| `BRITTLE` | > 0.30 | Prompt relies on surface form — fix before deploying |
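
The thresholds in the table can be read as a simple mapping from score to verdict. The sketch below is illustrative only: `classify` and `ci_exit_code` are hypothetical helper names, not part of the package's documented API, and letting `CONDITIONAL` pass the CI gate is an assumption, since the CLI section only states that exit code 1 means brittle.

```python
ROBUST_MAX = 0.15       # from the table: ROBUST at or below this score
CONDITIONAL_MAX = 0.30  # from the table: BRITTLE above this score


def classify(score: float) -> str:
    """Map a brittleness score in [0.0, 1.0] to a verdict string."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score <= ROBUST_MAX:
        return "ROBUST"
    if score <= CONDITIONAL_MAX:
        return "CONDITIONAL"
    return "BRITTLE"


def ci_exit_code(verdict: str) -> int:
    """Return 1 for BRITTLE and 0 otherwise (assumption: CONDITIONAL passes)."""
    return 1 if verdict == "BRITTLE" else 0


for score in (0.10, 0.15, 0.22, 0.45):
    verdict = classify(score)
    print(f"{score:.2f} -> {verdict} (exit {ci_exit_code(verdict)})")
```

A score of exactly 0.15 is treated as `ROBUST` and exactly 0.30 as `CONDITIONAL`, following the `≤` and `>` bounds in the table.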

## Biblical Pattern

- PAT-048 (Daniel 5:25-28, TEKEL): The prompt is weighed on the scales.
- PAT-049 (Matthew 7:24-27, Two Builders): Three storm levels stress-test every prompt.
- PAT-050 (Proverbs 17:3, The Crucible): The BrittleCertificate is the crucible output.

## License

MIT
