Metadata-Version: 2.4
Name: promptsmithv2
Version: 1.0.0
Summary: Structured prompt builder and version manager for LLM engineers — typed variables, versioning, diffing, A/B testing, and audit trails
Author: prabhay759
License: MIT
Project-URL: Homepage, https://github.com/prabhay759/promptsmith
Project-URL: Repository, https://github.com/prabhay759/promptsmith
Project-URL: Issues, https://github.com/prabhay759/promptsmith/issues
Keywords: llm,prompt,prompt-engineering,versioning,openai,langchain
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Dynamic: license-file

# promptsmith

> Structured prompt builder and version manager for LLM engineers. Typed variables, Git-friendly versioning, human-readable diffs, A/B testing, and full audit trails. Works with any LLM. Zero dependencies.

[![PyPI version](https://img.shields.io/pypi/v/promptsmith.svg)](https://pypi.org/project/promptsmith/)
[![Python](https://img.shields.io/pypi/pyversions/promptsmith.svg)](https://pypi.org/project/promptsmith/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

---

## The Problem

Every LLM engineer ends up with prompts scattered across f-strings, Notion docs, and constants files. No versioning. No way to diff what changed. No audit trail of which prompt produced which output.

```python
# the reality
prompt = f"Summarize this in {n} words: {text}"  # in utils.py
SYSTEM = "You are helpful..."                      # in constants.py
prompt2 = "Summarize this concisely: " + text     # in api.py
```

**promptsmith gives your prompts the same discipline as your code.**

---

## Installation

```bash
pip install promptsmith
```

No dependencies. Requires Python 3.8+.

---

## Quick Start

```python
from promptsmith import Prompt, PromptRegistry

# Define a typed prompt
prompt = Prompt(
    name="summarizer",
    template="Summarize this {content_type} in {max_words} words:\n\n{content}",
    variables={"content_type": str, "max_words": int, "content": str},
    description="General purpose summarizer",
)

# Render it — validates types before rendering
text = prompt.render(content_type="article", max_words=100, content="...")

# Save to registry
registry = PromptRegistry("./prompts")
registry.save(prompt)

# Load anywhere in your codebase
p = registry.load("summarizer")
text = p.render(content_type="email", max_words=50, content="...")
```

---

## Core Concepts

### Typed Variables

Variables are typed and validated before rendering — catch bugs before the LLM call:

```python
prompt = Prompt(
    name="classifier",
    template="Classify this text as {label_a} or {label_b}:\n{text}",
    variables={
        "label_a": str,
        "label_b": str,
        "text": str,
    }
)

# Type errors caught early
prompt.render(label_a="positive", label_b="negative", text=42)
# PromptRenderError: Variable 'text' expected str, got int
```
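
To show the idea independently of promptsmith, the same kind of check can be approximated with the standard library alone. This is a minimal sketch, not the library's implementation: `render_checked` is a hypothetical helper, and it raises the built-in `TypeError` rather than `PromptRenderError`.

```python
from typing import Any, Dict


def render_checked(template: str, variables: Dict[str, type], **values: Any) -> str:
    """Validate values against their declared types, then format the template."""
    missing = sorted(set(variables) - set(values))
    if missing:
        raise ValueError(f"Missing variables: {', '.join(missing)}")
    for name, expected in variables.items():
        value = values[name]
        # bool is a subclass of int, so reject it explicitly for int variables
        wrong_type = not isinstance(value, expected)
        bool_as_int = expected is int and isinstance(value, bool)
        if wrong_type or bool_as_int:
            raise TypeError(
                f"Variable '{name}' expected {expected.__name__}, got {type(value).__name__}"
            )
    # Only declared variables are passed on to the template
    return template.format(**{name: values[name] for name in variables})


schema = {"label_a": str, "label_b": str, "text": str}
template = "Classify this text as {label_a} or {label_b}:\n{text}"

ok = render_checked(template, schema, label_a="positive", label_b="negative", text="Great!")

try:
    render_checked(template, schema, label_a="positive", label_b="negative", text=42)
except TypeError as exc:
    error_message = str(exc)  # "Variable 'text' expected str, got int"
```

Validating before formatting means a bad input fails locally with a named variable instead of producing a malformed prompt that is only noticed after a paid LLM call.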

### Versioning

Each call to `update()` produces a new version of the prompt with the version number bumped automatically (1.0.0 becomes 1.0.1 below):

```python
p1 = registry.load("summarizer")  # 1.0.0

p2 = p1.update(
    template="Summarize this {content_type} concisely in under {max_words} words:\n\n{content}",
    changelog="Added 'concisely' — tighter outputs"
)
registry.save(p2)  # saves as 1.0.1

# Load specific version
old = registry.load("summarizer", version="1.0.0")
new = registry.load("summarizer", version="1.0.1")
new = registry.load("summarizer")  # latest
```
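
The example above moves from 1.0.0 to 1.0.1, which looks like a patch-level bump. A sketch of that arithmetic, assuming plain `MAJOR.MINOR.PATCH` strings (`bump_patch` is a hypothetical helper, and the library's actual bump rules may differ):

```python
def bump_patch(version: str) -> str:
    """Return the next patch version for a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


next_version = bump_patch("1.0.0")
```

Parsing each part as an integer matters once a component passes 9: `bump_patch("1.0.9")` yields `"1.0.10"`, which a naive string increment would get wrong.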

### Human-Readable Diffs

```python
print(registry.diff("summarizer", "1.0.0", "1.0.1"))
```

```
── Template ─────────────────────────────────────────
--- template (1.0.0)
+++ template (1.0.1)
@@ -1 +1 @@
-Summarize this {content_type} in {max_words} words:
+Summarize this {content_type} concisely in under {max_words} words:

── Metadata ─────────────────────────────────────────
  1.0.0 → 1.0.1
  changelog: Added 'concisely' — tighter outputs
```
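
The template section of that output follows the unified diff format. As a rough illustration, Python's standard `difflib` can produce the same shape. This is a sketch under that assumption, not necessarily how `registry.diff` is implemented, and `template_diff` is a hypothetical name.

```python
import difflib


def template_diff(old: str, new: str, old_label: str, new_label: str) -> str:
    """Return a unified diff of two templates, line by line, without context lines."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"template ({old_label})",
        tofile=f"template ({new_label})",
        n=0,          # no surrounding context, only the changed lines
        lineterm="",  # inputs have no trailing newlines, so add none to headers
    )
    return "\n".join(lines)


old = "Summarize this {content_type} in {max_words} words:\n\n{content}"
new = "Summarize this {content_type} concisely in under {max_words} words:\n\n{content}"

diff_text = template_diff(old, new, "1.0.0", "1.0.1")
print(diff_text)
```

Because the diff works on lines, keeping each instruction of a long template on its own line tends to give smaller, easier-to-review changes.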

### A/B Testing

```python
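import openai  # third-party: pip install openai

# `registry` is the PromptRegistry from Quick Start; `article_text` is the text to summarize
result = registry.ab_test(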
    name="summarizer",
    version_a="1.0.0",
    version_b="1.0.1",
    inputs={"content_type": "article", "max_words": 100, "content": article_text},
    runner=lambda prompt: openai.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content,
    scorer=lambda a, b: len(b) - len(a),  # toy scorer: positive = B wins (here, the longer output)
)

result.print_comparison()
print(f"Winner: {result.winner}")
```
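
The runner and scorer contract can be tried offline before spending API calls. The sketch below mirrors the sign convention used above (a positive score means B wins) with a stand-in runner. `compare_outputs`, `echo_runner`, the returned dictionary, and the tie handling are all assumptions made for illustration, not part of promptsmith.

```python
from typing import Callable, Dict


def compare_outputs(
    prompt_a: str,
    prompt_b: str,
    runner: Callable[[str], str],
    scorer: Callable[[str, str], float],
) -> Dict[str, object]:
    """Run both prompts and pick a winner from the sign of the score."""
    output_a = runner(prompt_a)
    output_b = runner(prompt_b)
    score = scorer(output_a, output_b)
    if score > 0:
        winner = "b"
    elif score < 0:
        winner = "a"
    else:
        winner = "tie"  # assumption: a zero score means no preference
    return {"output_a": output_a, "output_b": output_b, "score": score, "winner": winner}


def echo_runner(prompt: str) -> str:
    """Offline stand-in for an LLM call: returns the prompt unchanged."""
    return prompt


outcome = compare_outputs(
    "Summarize this article in 100 words:",
    "Summarize this article concisely in under 100 words:",
    runner=echo_runner,
    scorer=lambda a, b: len(a) - len(b),  # positive when B's output is shorter
)
```

A length-based scorer is only a placeholder. In practice the scorer would encode whatever quality signal matters for the task, such as a rubric check or a similarity score against a reference answer.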

### Chat Models (System + User)

```python
import openai  # third-party: pip install openai

prompt = Prompt(
    name="assistant",
    template="Answer this question: {question}",
    system="You are a helpful assistant. Be concise.",
    variables={"question": str},
)

messages = prompt.render_messages(question="What is BPE tokenization?")
# [{"role": "system", "content": "You are..."}, {"role": "user", "content": "Answer..."}]

response = openai.chat.completions.create(model="gpt-4", messages=messages)
```

### Version History & Audit Trail

```python
# Full history
for entry in registry.history("summarizer"):
    print(f"v{entry['version']} — {entry['changelog']} ({entry['created_at'][:10]})")

# Past A/B results
for run in registry.ab_history("summarizer"):
    print(f"{run['version_a']} vs {run['version_b']} → winner: {run['winner']}")
```

### Storage (Git-Friendly)

```
prompts/
├── promptsmith.db          ← SQLite index for fast queries
├── summarizer/
│   ├── 1.0.0.json          ← full prompt definition
│   └── 1.0.1.json
└── classifier/
    └── 1.0.0.json
```

Commit the `prompts/` directory to Git — every prompt change is tracked just like code.
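
To make the layout concrete, here is a sketch that writes and reads one-file-per-version JSON using only the standard library. It assumes nothing about promptsmith's internal file schema: `save_version`, `list_versions`, and the two-field definitions are hypothetical, and the SQLite index is left out.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def save_version(root: Path, name: str, version: str, definition: dict) -> Path:
    """Write one prompt version to <root>/<name>/<version>.json."""
    target = root / name / f"{version}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and fixed indentation keep Git diffs small and stable
    target.write_text(json.dumps(definition, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return target


def list_versions(root: Path, name: str) -> List[str]:
    """Return stored versions ordered numerically, oldest first."""
    stems = (path.stem for path in (root / name).glob("*.json"))
    return sorted(stems, key=lambda v: tuple(int(part) for part in v.split(".")))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for version in ("1.0.0", "1.0.10", "1.0.2"):
        save_version(root, "summarizer", version, {"name": "summarizer", "version": version})

    versions = list_versions(root, "summarizer")
    latest_path = root / "summarizer" / f"{versions[-1]}.json"
    latest = json.loads(latest_path.read_text(encoding="utf-8"))
```

Sorting on integer tuples rather than raw strings is what places 1.0.10 after 1.0.2, so "latest" resolves correctly once a version component reaches two digits.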

---

## API Reference

### `Prompt`

```python
Prompt(
    name,           # Unique identifier
    template,       # Text with {variable} placeholders
    variables=None, # dict of name → type or PromptVariable
    version="1.0.0",
    description="",
    changelog="",
    tags=[],
    system=None,    # System prompt for chat models
    metadata={},
)
```

| Method | Description |
|---|---|
| `render(**kwargs)` | Render the prompt; raises `PromptRenderError` on type errors |
| `render_messages(**kwargs)` | Returns OpenAI-style messages list |
| `update(template, ...)` | Create new version with changes |
| `validate(**kwargs)` | Check inputs without rendering |
| `to_dict()` / `from_dict()` | Dict serialization |
| `to_json()` / `from_json()` | JSON serialization |

### `PromptRegistry`

| Method | Description |
|---|---|
| `save(prompt)` | Save to disk + index |
| `load(name, version=None)` | Load latest or specific version |
| `history(name)` | All versions with changelogs |
| `diff(name, v_a, v_b)` | Human-readable diff |
| `ab_test(name, v_a, v_b, inputs, runner, scorer)` | A/B test two versions |
| `list(tag=None)` | List all prompts |
| `names()` | All prompt names |
| `delete(name, version=None)` | Delete version(s) |
| `export_all(path)` | Export all prompts to JSON |

---

## Running Tests

```bash
pip install pytest
pytest tests/ -v
```

---

## License

MIT © prabhay759
