Metadata-Version: 2.3
Name: simple-anonymizer
Version: 0.1.11
Summary: Privacy-first text anonymization tool with enterprise-grade accuracy for removing PII from documents
License: Apache-2.0
Keywords: privacy,anonymization,pii,nlp,spacy,presidio,data-protection,text-processing,privacy-tools,gdpr,enterprise
Author: Andrea Tirelli
Author-email: atirellimate@gmail.com
Maintainer: Andrea Tirelli
Maintainer-email: atirellimate@gmail.com
Requires-Python: >=3.9,<3.14
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Legal Industry
Classifier: Intended Audience :: Healthcare Industry
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Security
Classifier: Topic :: Office/Business
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Environment :: X11 Applications :: GTK
Classifier: Natural Language :: English
Provides-Extra: all
Provides-Extra: dev
Provides-Extra: ner
Requires-Dist: black (>=23.9.0) ; extra == "dev"
Requires-Dist: certifi (>=2023.0.0)
Requires-Dist: flask (>=3.1.0)
Requires-Dist: mypy (>=1.5.0) ; extra == "dev"
Requires-Dist: nuitka (>=1.8.0) ; extra == "dev"
Requires-Dist: presidio-analyzer (>=2.2.0)
Requires-Dist: presidio-anonymizer (>=2.2.0)
Requires-Dist: pytest (>=7.4.0) ; extra == "dev"
Requires-Dist: regex (>=2023.0.0)
Requires-Dist: requests (>=2.31.0)
Requires-Dist: ruff (>=0.1.0) ; extra == "dev"
Requires-Dist: spacy (>=3.7.0)
Requires-Dist: spacy (>=3.7.0) ; extra == "all"
Requires-Dist: spacy (>=3.7.0) ; extra == "ner"
Requires-Dist: unidecode (>=1.3.0)
Requires-Dist: urllib3 (>=2.0.0)
Description-Content-Type: text/markdown

# 🕵️ Anon - Privacy-First Text Anonymizer

[![CI](https://github.com/ATirelli/anonymizer/actions/workflows/ci.yml/badge.svg)](https://github.com/ATirelli/anonymizer/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/simple-anonymizer.svg)](https://pypi.org/project/simple-anonymizer/)
[![Python Version](https://img.shields.io/pypi/pyversions/simple-anonymizer.svg)](https://pypi.org/project/simple-anonymizer/)


An **offline-first** text anonymization tool that removes personally identifiable information (PII) from text while keeping all data on your machine. Entity detection is built on spaCy NER models and Microsoft Presidio.

## ✨ Features

- 🔒 **100% Offline** - All processing happens on your machine
- 🎯 **High Accuracy** - Advanced NER using spaCy large models + Presidio
- 🖥️ **Multiple Interfaces** - Modern GUI, Web API, and CLI
- 🚀 **Background Processing** - CLIs run detached with proper logging
- 📦 **Easy Installation** - One-command install with automatic model setup
- 🏢 **Cross-Platform** - Windows, macOS, and Linux support

## 🚀 Quick Start

### Installation

```bash
pip install simple-anonymizer
```

Installation automatically downloads the required spaCy model (`en_core_web_lg`).

### GUI Application

Launch the modern GUI interface:

```bash
anon-gui
```

✅ **The GUI runs in the background** - you can close the terminal after launch

📝 **Logs available** at `~/.anonymizer/gui_YYYYMMDD_HHMMSS.log`
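
Since each launch writes a new timestamped file, it can be handy to locate the most recent one. The snippet below is a minimal sketch, not part of the package: `latest_gui_log` is a hypothetical helper, and it assumes only the directory and the `gui_YYYYMMDD_HHMMSS.log` naming pattern stated above.

```python
from pathlib import Path
from typing import Optional


def latest_gui_log(log_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the newest gui_*.log file in log_dir, or None if there is none.

    Hypothetical helper: it relies only on the documented log location
    (~/.anonymizer) and file name pattern (gui_YYYYMMDD_HHMMSS.log).
    """
    if log_dir is None:
        log_dir = Path.home() / ".anonymizer"
    if not log_dir.is_dir():
        return None
    # The timestamp is zero-padded, so sorting the file names
    # lexicographically also sorts them chronologically.
    logs = sorted(log_dir.glob("gui_*.log"), key=lambda p: p.name)
    return logs[-1] if logs else None
```

Calling `latest_gui_log()` with no argument looks in `~/.anonymizer`; passing a directory explicitly is useful if the logs have been copied elsewhere.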

### Web Interface

Start the web server:

```bash
anon-web start
```

✅ **The server runs in the background** - accessible at http://127.0.0.1:8080

📝 **Logging and process management** are handled through the `anon-web` subcommands

#### Web Server Management

```bash
# Start server (custom host/port)
anon-web start --host 0.0.0.0 --port 5000

# Check server status
anon-web status

# View recent logs
anon-web logs

# Stop server
anon-web stop

# Clean old log files
anon-web clean
```
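
From a script, one way to confirm that something is accepting connections on the configured host and port is a plain TCP check. This is a rough sketch using only the standard library: `is_server_listening` is a hypothetical helper, the defaults mirror the address shown above, and a successful connection does not prove the listener is `anon-web` itself, so `anon-web status` remains the authoritative check.

```python
import socket


def is_server_listening(host: str = "127.0.0.1", port: int = 8080,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests that *something* accepts
    connections on the port, not which program is behind it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If the server was started with `--port 5000`, pass `port=5000` to match.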

### Python API

```python
from anonymizer_core import redact

# Basic anonymization
result = redact("John Doe works at Microsoft in Seattle.")
print(result.anonymized_text)
# Output: "<REDACTED> works at <REDACTED> in <REDACTED>."
```
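
To anonymize several strings in one go, the call above can be wrapped in a small loop. The following is a sketch under stated assumptions: `redact_all` is a hypothetical helper, not part of `anonymizer_core`, and it relies only on what the example above shows, namely that `redact(text)` returns an object with an `anonymized_text` attribute. The redaction function is passed in as a parameter so the helper can also be exercised with a stub.

```python
from typing import Any, Callable, Iterable, List


def redact_all(texts: Iterable[str],
               redact_fn: Callable[[str], Any]) -> List[str]:
    """Apply redact_fn to each text and collect the anonymized strings.

    Hypothetical helper: redact_fn is assumed to behave like
    anonymizer_core.redact, i.e. return an object exposing
    an `anonymized_text` attribute.
    """
    return [redact_fn(text).anonymized_text for text in texts]


# Intended usage with the real package (assumes it is installed):
#   from anonymizer_core import redact
#   cleaned = redact_all(["John Doe works at Microsoft."], redact)
```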
