Metadata-Version: 2.4
Name: promptcloak
Version: 0.1.0
Summary: Zero-dependency CLI to redact secrets & PII from text and logs — safe to share or paste into an LLM.
Author: ZKLN
License: MIT
Project-URL: Homepage, https://github.com/ezequiel0822-netizen/promptcloak
Project-URL: Issues, https://github.com/ezequiel0822-netizen/promptcloak/issues
Keywords: security,redaction,pii,secrets,llm,privacy,logs,cli
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Security
Classifier: Environment :: Console
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# promptcloak

[![CI](https://github.com/ezequiel0822-netizen/promptcloak/actions/workflows/ci.yml/badge.svg)](https://github.com/ezequiel0822-netizen/promptcloak/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/promptcloak.svg)](https://pypi.org/project/promptcloak/)
[![Python](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Dependencies](https://img.shields.io/badge/dependencies-zero-brightgreen.svg)](pyproject.toml)

**Redact secrets & PII from text and logs — so they're safe to share or paste into an LLM.**

`promptcloak` is a fast, zero-dependency Python CLI (and library) that finds API keys,
tokens, credentials, emails, IPs and card numbers in any text and replaces them with
stable placeholders like `[EMAIL_1]`. You can keep a private mapping to **restore** the
original values later — perfect for round-tripping a redacted prompt/answer with an LLM.

```
$ cat app.log | promptcloak --stats
promptcloak: 3 item(s) redacted
  AWS_KEY      1
  EMAIL        1
  OPENAI_KEY   1
ERROR login user=[EMAIL_1] key=[OPENAI_KEY_1] aws=[AWS_KEY_1]
```

## Why

We all paste logs, configs and stack traces into ChatGPT/Claude, support tickets and
GitHub issues — and quietly leak secrets and personal data. `promptcloak` makes that one
pipe safe, with sensible defaults and **no third-party dependencies** (no spaCy, no
heavyweight models): just Python's standard library, so it installs and runs anywhere.

## Install

```bash
pip install promptcloak        # once published
# or run from source:
PYTHONPATH=src python -m promptcloak --help
```

## Usage

```bash
promptcloak input.log                    # redact a file -> stdout
cat input.log | promptcloak              # redact stdin -> stdout
promptcloak input.log -o clean.log       # write to a file
promptcloak input.log --map map.json     # also save the placeholder->value mapping
promptcloak clean.log --restore map.json # reverse a redaction
promptcloak --types email,ipv4 input.log # only redact selected types
promptcloak --entropy input.log          # also catch unknown high-entropy secrets
promptcloak --check input.log            # exit 1 if anything sensitive is found (CI gate)
promptcloak --list-types                 # show all supported types
```

### As a pre-commit / CI guard

```yaml
# fail the build if a tracked file contains secrets
- run: git ls-files -z -- '*.env*' '*.log' | xargs -0 -r -n1 promptcloak --check
```

### As a library

```python
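from promptcloak import Redactor

text = "ERROR login user=alice@example.com"  # any string you want to redact

# redact() returns the redacted text, the placeholder->value mapping, and stats
redacted, mapping, stats = Redactor().redact(text)
# restore() reverses the redaction using that mapping
original = Redactor.restore(redacted, mapping)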
```
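
To make the round trip concrete, here is a minimal, self-contained sketch of the stable-placeholder idea. It is not promptcloak's actual implementation: the email pattern, `redact_emails` and `restore` are hypothetical names written for illustration, and it handles only one type.

```python
import re

# Hypothetical, deliberately simplified email pattern (illustration only).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text):
    """Replace each distinct email with a stable placeholder such as [EMAIL_1]."""
    value_to_placeholder = {}

    def substitute(match):
        value = match.group(0)
        # Reuse the placeholder if this exact value was seen before.
        if value not in value_to_placeholder:
            value_to_placeholder[value] = f"[EMAIL_{len(value_to_placeholder) + 1}]"
        return value_to_placeholder[value]

    redacted = EMAIL_RE.sub(substitute, text)
    # Invert the lookup so the saved mapping reads placeholder -> original value.
    mapping = {ph: value for value, ph in value_to_placeholder.items()}
    return redacted, mapping


def restore(redacted, mapping):
    """Substitute every known placeholder back with its original value."""
    for placeholder, value in mapping.items():
        redacted = redacted.replace(placeholder, value)
    return redacted
```

Because a repeated value always gets the same placeholder, an LLM answer that mentions `[EMAIL_1]` can be restored with the same mapping that was saved when the prompt was redacted.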

## What it detects

Private keys, JWTs, AWS keys, GitHub/Slack tokens, Google/OpenAI/Stripe/SendGrid/
Twilio/npm/Discord API keys & tokens, URLs with embedded credentials, Bearer tokens,
credit cards (Luhn-validated), US SSNs, emails and IPv4 addresses. A `GENERIC_SECRET`
heuristic covers `key=value` / `key: value` pairs (e.g. `DB_PASSWORD=...`): it redacts
**only the value** and preserves the key name. Add `--entropy` to also catch unknown
high-entropy secrets. Run `promptcloak --list-types` for the full list.

> Note: `GENERIC_SECRET` errs toward caution (better to over-redact than to leak). To
> exclude it, pass `--types` with only the types you want redacted.
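
The `--entropy` option targets secrets that have no recognizable format. One common way to score such tokens is Shannon entropy over their characters. The sketch below illustrates that general technique only; it is not taken from promptcloak's source, and `MIN_LENGTH`, `MIN_ENTROPY` and `looks_like_secret` are hypothetical names and values.

```python
import math
from collections import Counter


def shannon_entropy(token):
    """Return the Shannon entropy of a string, in bits per character."""
    if not token:
        return 0.0
    counts = Counter(token)
    total = len(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical thresholds; a real tool would tune these empirically.
MIN_LENGTH = 20
MIN_ENTROPY = 4.0


def looks_like_secret(token):
    """Flag long tokens whose characters are close to uniformly distributed."""
    return len(token) >= MIN_LENGTH and shannon_entropy(token) >= MIN_ENTROPY
```

Short or repetitive strings score low, while random-looking keys score high. Any threshold trades missed secrets against false positives, which is one reason to keep a check like this opt-in.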

## Design notes

- **Zero dependencies**, pure standard library.
- **Stable placeholders**: the same value always maps to the same placeholder, so
  redacted text stays readable and is reversible via the mapping.
- **False-positive control**: card numbers are Luhn-validated; overlapping matches are
  resolved deterministically (longest / highest-priority span wins).
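
The two false-positive controls above can be sketched as follows. `luhn_valid` is the standard Luhn checksum. `resolve_overlaps` is a hypothetical illustration of deterministic overlap resolution; it assumes length is compared before priority, which the notes above do not specify, and the tuple layout and labels are made up for the example.

```python
def luhn_valid(number):
    """Return True if the ASCII digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def resolve_overlaps(matches):
    """Keep a non-overlapping subset of (start, end, priority, label) matches.

    Assumed ranking: longest span first, then highest priority, then
    earliest start, so the same input always yields the same result.
    """
    ranked = sorted(matches, key=lambda m: (-(m[1] - m[0]), -m[2], m[0]))
    kept = []
    for start, end, priority, label in ranked:
        # Accept a candidate only if it overlaps nothing already kept.
        if all(end <= k_start or start >= k_end for k_start, k_end, _, _ in kept):
            kept.append((start, end, priority, label))
    return sorted(kept)
```

Running the checksum on every 13 to 19 digit candidate discards most random digit runs, such as order IDs or timestamps, before they are redacted as card numbers.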

## Roadmap (Pro)

The core above is MIT-licensed and free. A planned **Pro** tier adds: custom rule packs
(per-company secret formats), a clipboard watcher, a VS Code / Claude Code integration,
config files for teams, and structured (JSON/CSV) field-aware redaction.

## Honest comparison

Tools like `gitleaks` and `trufflehog` focus on *scanning repos for committed secrets*;
Microsoft Presidio does heavyweight ML-based PII detection. `promptcloak` is deliberately
narrower: a tiny, dependency-free **"make this text safe to share"** tool optimized for
the copy-paste-into-an-LLM workflow, with reversible mappings. Pick the right tool for
the job.

## License

MIT (core). See `LICENSE`.
