Metadata-Version: 2.4
Name: langchain-blindfold
Version: 0.1.0
Summary: LangChain integration for Blindfold PII detection and protection
Author-email: Blindfold <hello@blindfold.dev>
License: MIT
Project-URL: Homepage, https://blindfold.dev
Project-URL: Documentation, https://docs.blindfold.dev
Project-URL: Repository, https://github.com/blindfold-dev/langchain-blindfold
Keywords: langchain,pii,blindfold,privacy,gdpr,hipaa,ai-safety,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain-core>=0.3.0
Requires-Dist: blindfold-sdk>=1.3.0
Dynamic: license-file

# LangChain Blindfold

LangChain integration for [Blindfold](https://blindfold.dev) PII detection and protection. Tokenize PII before it reaches your LLM, then restore originals in the response.

| | |
|---|---|
| Developed by | [Blindfold](https://blindfold.dev) |
| License | MIT |
| Input/Output | String, Document |

## Installation

```bash
pip install langchain-blindfold
```

Set your Blindfold API key:

```bash
export BLINDFOLD_API_KEY=your-api-key
```

Get a free API key at [app.blindfold.dev](https://app.blindfold.dev).
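
Because the components fall back to `BLINDFOLD_API_KEY` when no `api_key` is passed, a missing variable is easiest to catch once at startup. Below is a minimal sketch of such a check. The helper name `require_blindfold_key` is hypothetical and not part of this package.

```python
import os


def require_blindfold_key(env=None):
    """Return the Blindfold API key, or raise a clear error if it is missing.

    Hypothetical helper, not part of langchain-blindfold.
    `env` defaults to os.environ; pass a dict to make the check testable.
    """
    env = os.environ if env is None else env
    key = env.get("BLINDFOLD_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "BLINDFOLD_API_KEY is not set; export it or pass api_key= explicitly"
        )
    return key
```

Calling `require_blindfold_key()` before building a chain turns a late authentication failure into an immediate, readable error.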

## Quick Start

### Protect a LangChain Chain

```python
from langchain_blindfold import blindfold_protect
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI  # separate install: pip install langchain-openai

tokenize, detokenize = blindfold_protect(policy="basic")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")

chain = tokenize | prompt | llm | (lambda msg: msg.content) | detokenize

# PII is tokenized before the LLM sees it, then restored in the response
result = chain.invoke("Write a follow-up email to John Doe at john@example.com")
```

### Transform Documents for RAG

```python
from langchain_blindfold import BlindfoldPIITransformer
from langchain_core.documents import Document

transformer = BlindfoldPIITransformer(pii_method="redact", policy="hipaa_us", region="us")

docs = [Document(page_content="Patient John Smith, SSN 123-45-6789")]
safe_docs = transformer.transform_documents(docs)
# safe_docs[0].page_content → "Patient [REDACTED], SSN [REDACTED]"
```

## Components

### `blindfold_protect()`

Convenience function that returns a paired tokenizer and detokenizer:

```python
from langchain_blindfold import blindfold_protect

tokenize, detokenize = blindfold_protect(
    api_key=None,         # Falls back to BLINDFOLD_API_KEY env var
    region=None,          # "eu" or "us" for data residency
    policy="basic",       # Detection policy
    entities=None,        # Specific entity types to detect
    score_threshold=None, # Confidence threshold (0.0-1.0)
)
```

### `BlindfoldTokenizer`

A LangChain `Runnable` that tokenizes PII in text:

```python
from langchain_blindfold import BlindfoldTokenizer

tokenizer = BlindfoldTokenizer(policy="gdpr_eu", region="eu")
safe_text = tokenizer.invoke("Contact Hans at hans@example.de")
# → "Contact <Person_1> at <Email Address_1>"
```

### `BlindfoldDetokenizer`

A LangChain `Runnable` that restores original PII from tokenized text:

```python
from langchain_blindfold import BlindfoldTokenizer, BlindfoldDetokenizer

tokenizer = BlindfoldTokenizer(api_key="...")
detokenizer = BlindfoldDetokenizer(tokenizer=tokenizer)

tokenizer.invoke("Hi John")  # stores mapping
result = detokenizer.invoke("Response to <Person_1>")
# → "Response to John"
```
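
The pairing above works because the detokenizer reads the mapping that the tokenizer stored. The sketch below imitates that shared state locally, as a toy stand-in only: `ToyTokenizer` and `ToyDetokenizer` are hypothetical names, the regex handles emails only, and none of it reflects how the Blindfold service detects PII. Resetting the mapping on each call is a choice made for this toy, not documented behavior.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class ToyTokenizer:
    """Local stand-in: swaps emails for numbered tokens and remembers them."""

    def __init__(self):
        self.mapping = {}  # token -> original value

    def invoke(self, text):
        self.mapping = {}  # toy choice: start fresh on every call
        seen = {}  # original value -> token, so repeats reuse one token

        def replace(match):
            value = match.group(0)
            if value not in seen:
                token = f"<Email Address_{len(seen) + 1}>"
                seen[value] = token
                self.mapping[token] = value
            return seen[value]

        return EMAIL_RE.sub(replace, text)


class ToyDetokenizer:
    """Local stand-in: restores originals using the paired tokenizer's mapping."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def invoke(self, text):
        for token, value in self.tokenizer.mapping.items():
            text = text.replace(token, value)
        return text


tokenizer = ToyTokenizer()
detokenizer = ToyDetokenizer(tokenizer)

safe = tokenizer.invoke("Mail hans@example.de, cc hans@example.de and eva@example.org")
reply = detokenizer.invoke("Drafted a note to <Email Address_1> and <Email Address_2>.")
```

Here `safe` contains only tokens, and `reply` has both addresses restored. The detokenizer is only useful after the paired tokenizer has run, which is why `blindfold_protect()` returns the two together.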

### `BlindfoldPIITransformer`

A LangChain `DocumentTransformer` for protecting PII in documents:

```python
from langchain_blindfold import BlindfoldPIITransformer

transformer = BlindfoldPIITransformer(
    api_key=None,           # Falls back to BLINDFOLD_API_KEY env var
    region=None,            # "eu" or "us" for data residency
    policy="basic",         # Detection policy
    pii_method="tokenize",  # tokenize, redact, mask, hash, synthesize, encrypt
    entities=None,          # Specific entity types to detect
    score_threshold=None,   # Confidence threshold (0.0-1.0)
)
```

When `pii_method="tokenize"`, the mapping is stored in `doc.metadata["blindfold_mapping"]`.
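
Because the mapping travels with each document, the same token text can stand for different values in different documents, so each document should be restored with its own mapping and not a merged one. Here is a minimal sketch, assuming the mapping is a dict of token to original value (the exact shape is not specified above). `Doc` is a stand-in for `langchain_core.documents.Document`, and `restore` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def restore(doc, key="blindfold_mapping"):
    """Return a copy of `doc` with tokens replaced by their original values.

    Assumes metadata[key] maps token -> original value. Documents without
    a mapping (for example, redacted ones) come back unchanged.
    """
    text = doc.page_content
    for token, value in doc.metadata.get(key, {}).items():
        text = text.replace(token, value)
    return Doc(page_content=text, metadata=dict(doc.metadata))


docs = [
    Doc("Patient <Person_1>", {"blindfold_mapping": {"<Person_1>": "John Smith"}}),
    Doc("Referred by <Person_1>", {"blindfold_mapping": {"<Person_1>": "Ana Ruiz"}}),
    Doc("No tokens here"),
]
restored = [restore(d).page_content for d in docs]
```

In this sketch both tokenized documents use `<Person_1>`, yet each resolves to a different name because the lookup never leaves the document's own metadata.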

## Policies

| Policy | Entities | Best For |
|---|---|---|
| `basic` | Names, emails, phones, locations | General PII protection |
| `gdpr_eu` | EU-specific: IBANs, addresses, dates of birth | GDPR compliance |
| `hipaa_us` | PHI: SSNs, MRNs, medical terms | HIPAA compliance |
| `pci_dss` | Card numbers, CVVs, expiry dates | PCI DSS compliance |
| `strict` | All entity types, lower threshold | Maximum detection |

## PII Methods

| Method | Output | Reversible |
|---|---|---|
| `tokenize` | `<Person_1>`, `<Email Address_1>` | Yes |
| `redact` | PII removed entirely | No |
| `mask` | `J****oe`, `j****om` | No |
| `hash` | `HASH_abc123` | No |
| `synthesize` | `Jane Smith`, `jane@example.org` | No |
| `encrypt` | AES-256 encrypted value | Yes (with key) |

## Data Residency

Use the `region` parameter to ensure PII is processed in a specific jurisdiction:

- `region="eu"` — processed in Frankfurt, Germany
- `region="us"` — processed in Virginia, US

```python
from langchain_blindfold import blindfold_protect

tokenize, detokenize = blindfold_protect(policy="gdpr_eu", region="eu")
```

## Links

- [Blindfold Documentation](https://docs.blindfold.dev)
- [Blindfold Dashboard](https://app.blindfold.dev)
- [LangChain Documentation](https://python.langchain.com)
- [GitHub](https://github.com/blindfold-dev/langchain-blindfold)
