Metadata-Version: 2.4
Name: mawlaia-pii-vault
Version: 0.1.0
Summary: PII tokenization SDK and proxy for AI pipelines
License: MIT
Author: Mawlaia
Author-email: dev@mawlaia.com
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Provides-Extra: all
Provides-Extra: anthropic
Provides-Extra: openai
Requires-Dist: anthropic (>=0.25,<0.26) ; extra == "anthropic" or extra == "all"
Requires-Dist: openai (>=1.0,<2.0) ; extra == "openai" or extra == "all"
Requires-Dist: phonenumbers (>=8.13,<9.0)
Requires-Dist: presidio-analyzer (>=2.2,<3.0)
Requires-Dist: pydantic (>=2.0,<3.0)
Requires-Dist: spacy (>=3.7,<4.0)
Project-URL: Homepage, https://mawlaia.com
Project-URL: Repository, https://github.com/mawlaia/pii-vault
Description-Content-Type: text/markdown

# pii-vault

> PII tokenization SDK and proxy for AI pipelines.

AI features that call a hosted LLM usually send customer data to the provider inside the prompt. **pii-vault** sits between your application and any LLM API: it tokenizes PII on the way out and re-hydrates it in the response, so the raw values stay inside your infrastructure.

```python
from pii_vault import SafeOpenAI

client = SafeOpenAI(api_key="...", vault_key="...")

# PII is tokenized before leaving your server, re-hydrated in the response
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize the case for John Smith, john@acme.com"}]
)
```
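To make the outbound/inbound round trip concrete, here is a minimal sketch of the idea, not the pii-vault implementation. `ToyVault`, `EMAIL_RE`, and the `<EMAIL_n>` token shape are hypothetical names chosen for illustration, and only email addresses are detected, using a simple regex.

```python
import re

# Hypothetical, simplified detector: matches common email shapes only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ToyVault:
    """Illustrative in-memory token store (assumption, not the real vault)."""

    def __init__(self):
        self._by_value = {}  # raw value -> token
        self._by_token = {}  # token -> raw value

    def tokenize(self, text):
        """Replace each email in `text` with a stable placeholder token."""

        def swap(match):
            value = match.group(0)
            token = self._by_value.get(value)
            if token is None:
                token = f"<EMAIL_{len(self._by_value) + 1}>"
                self._by_value[value] = token
                self._by_token[token] = value
            return token

        return EMAIL_RE.sub(swap, text)

    def rehydrate(self, text):
        """Put the original values back wherever a known token appears."""
        for token, value in self._by_token.items():
            text = text.replace(token, value)
        return text


vault = ToyVault()
outbound = vault.tokenize("Summarize the case for john@acme.com")
# `outbound` is what would be sent to the provider; the raw email is absent.
inbound = vault.rehydrate("Reply to <EMAIL_1> today")
```

In this sketch the provider only ever sees the placeholder, and the mapping needed to reverse it stays in the process that owns the vault.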

## Status

🚧 **Early development.** Star the repository to follow progress.

## What it does

- **Deterministic tokenization** — names, emails, phones, addresses, medical IDs → opaque tokens
- **Drop-in proxy** — replace `OpenAI()` with `SafeOpenAI()`, same interface, zero architecture change
- **Multi-provider** — OpenAI, Anthropic, Google, and more
- **Regional vault** — EU and US residency, GDPR-ready from day one
- **DSAR automation** — one-call data subject export and deletion
- **Format-preserving** — emails stay email-shaped, phones stay phone-shaped
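
As an illustration of how "deterministic" and "format-preserving" can combine, the sketch below derives an email-shaped token from a keyed hash. This is one possible approach under stated assumptions, not a description of how pii-vault works: `email_token`, the `tok_` prefix, and the `vault.invalid` domain are all hypothetical.

```python
import hashlib
import hmac


def email_token(email: str, key: bytes) -> str:
    """Return a deterministic, email-shaped token for `email` (illustrative).

    The same email and key always yield the same token, and a different key
    yields a different token. Addresses are lowercased first so that
    "John@Acme.com" and "john@acme.com" map to one token.
    """
    digest = hmac.new(key, email.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    # `.invalid` is a reserved TLD, so the token can never be a real address.
    return f"tok_{digest[:16]}@vault.invalid"


token = email_token("john@acme.com", b"example-vault-key")
```

A keyed hash is one-way, so a design like this still needs a stored token-to-value mapping to re-hydrate responses; the hash only guarantees that repeated occurrences of a value get the same token.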

## Roadmap

- [ ] Python SDK
- [ ] TypeScript SDK
- [ ] Self-hostable vault
- [ ] Hosted service ([mawlaia.com](https://mawlaia.com))
- [ ] SOC 2 Type II
- [ ] HIPAA BAA

## License

MIT

