Metadata-Version: 2.4
Name: piifill-cli
Version: 0.2.0
Summary: PIIFILL: Professional Local-Logic PII Sanitization CLI
Author-email: Bhavin Sachaniya <bhavinsachaniya200@gmail.com>
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typer>=0.12.0
Requires-Dist: rich>=13.0.0
Requires-Dist: loguru>=0.7.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: easyocr
Requires-Dist: opencv-python-headless
Requires-Dist: numpy<2
Requires-Dist: pillow
Requires-Dist: pandas
Requires-Dist: openpyxl
Requires-Dist: pymupdf
Requires-Dist: python-docx
Requires-Dist: psutil
Dynamic: license-file

# 🛡️ PIIFILL: High-Performance PII Redaction CLI & Data Masking Tool

**A PII filler and sensitive information masker. Secure your data with 100% offline, OCR-powered redaction.**

[![PyPI version](https://badge.fury.io/py/piifill-cli.svg)](https://badge.fury.io/py/piifill-cli)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/badge/piifill-cli)](https://pepy.tech/project/piifill-cli)

_Built with precision by [Bhavin Sachaniya](https://bhavinsachaniya.in)_

[Overview](#-what-is-piifill) • [Installation](#-quick-start) • [Basic Usage](#-usage-guide) • [Security Analytics](#-security--risk-analytics) • [Supported Formats](#-supported-file-formats--ocr)

---

## 📖 What is PIIFILL?

**PIIFILL** is a high-performance command-line utility designed to automatically detect and redact **P**ersonally **I**dentifiable **I**nformation (PII) from documents, datasets, and images. It serves as a comprehensive **PII filler** and data anonymization tool, ensuring that sensitive information is masked before files are shared or processed.

### 🛡️ Key Capabilities & PII Detection

| Entity Type              | Examples Detected                           | Technology Used           |
| :----------------------- | :------------------------------------------ | :------------------------ |
| **Personal Identifiers** | Names, Phone Numbers, Email Addresses       | Pattern Matching & NLP    |
| **Government IDs**       | SSN (USA), Aadhaar (India), PAN             | Regional Regex            |
| **Financial Data**       | Credit/Debit Cards, IBAN, SWIFT Codes       | Luhn Algorithm & Patterns |
| **Location Info**        | Physical Addresses, ZIP Codes, IP Addresses | Geo-Patterns              |
| **Visual PII**           | Text in Images, Scanned PDFs, Screenshots   | Integrated OCR            |
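
The table lists the Luhn algorithm for financial data. As a rough illustration of why it helps, the Python sketch below shows the standard Luhn checksum, which lets a detector discard digit runs that merely look like card numbers. This is a generic version of the algorithm, not PIIFILL's internal code, and the function name `luhn_is_valid` is hypothetical.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Ignore spaces, dashes, and any other non-digit separators.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


# A widely used test card number passes; changing its last digit fails.
print(luhn_is_valid("4111 1111 1111 1111"))
print(luhn_is_valid("4111 1111 1111 1112"))
```

A pattern match alone would flag both strings above; the checksum keeps only the first.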

> [!IMPORTANT]
> **100% Offline Processing:** PIIFILL is built for maximum privacy. All detection, masking, and OCR processing happen locally on your machine. Your data is **never** uploaded to any cloud service.

---

## 🚀 Quick Start

### Installation

Ensure you have Python 3.9+ installed. You can install PIIFILL directly via `pip`:

```bash
pip install piifill-cli==0.2.0
```

---

## 🛠️ Usage Guide

PIIFILL follows a simple two-phase workflow: **Scan** (to identify) and **Mask** (to protect).

### 1. Identify Privacy Risks (`scan`)

Use the `scan` command to audit your files. This is a read-only operation that provides a detailed report of potential PII without modifying your source files.

```bash
# Scan a single document
piifill scan sensitive_data.pdf

# Perform a deep search in a folder (recursive)
piifill scan ./private_docs/ --recursive
```

### 2. Protect Your Files (`mask`)

Once verified, use `mask` to generate sanitized versions of your files. By default, it creates an `out/` directory with the protected copies.

```bash
# Mask a single file
piifill mask user_records.csv

# Mask all files in a directory
piifill mask ./raw_logs/
```
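
For scripted use, the scan-then-mask workflow can be driven from Python. The sketch below is a hypothetical wrapper, not part of PIIFILL: the helper names `build_scan_command`, `build_mask_command`, and `run` are assumptions, and it uses only flags that appear in this README (`--recursive` for `scan`, `-o` for `mask`).

```python
import shutil
import subprocess
from typing import List, Optional


def build_scan_command(target: str, recursive: bool = False) -> List[str]:
    """Build the argument list for `piifill scan`."""
    argv = ["piifill", "scan", target]
    if recursive:
        argv.append("--recursive")
    return argv


def build_mask_command(target: str, output_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for `piifill mask`, optionally with -o <dir>."""
    argv = ["piifill", "mask", target]
    if output_dir is not None:
        argv.extend(["-o", output_dir])
    return argv


def run(argv: List[str]) -> int:
    """Run the command and return its exit code.

    The meaning of specific exit codes is not documented in this README,
    so callers should not assume more than "zero usually means success".
    """
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} is not on PATH; install piifill-cli first")
    return subprocess.run(argv, check=False).returncode


# Show the commands that would be executed, without running them.
print(build_scan_command("./private_docs/", recursive=True))
print(build_mask_command("./private_files/", output_dir="./safe_backup/"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.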

---

## 📂 Working with Folders

Want to clean up an entire folder of data? PIIFILL makes it easy.

**Example: Mask every file in a folder**

```bash
piifill mask ./data_dump/
```

- PIIFILL will scan every file in `./data_dump/`.
- It will create a new folder called `./data_dump/out/`.
- All your safe, cleaned-up files will be waiting for you inside the `out` folder!

**Example: Save the safe files somewhere specific**

```bash
piifill mask ./private_files/ -o ./safe_backup/
```

- This takes everything from `private_files` and puts the safe versions in `safe_backup`.

---

## ⚙️ Command Reference

| Command   | Description                               | Key Options               |
| :-------- | :---------------------------------------- | :------------------------ |
| `scan`    | Detects PII and generates a risk report.  | `--recursive`, `--format` |
| `mask`    | Redacts PII and creates safe file copies. | `-o` (output), `--mode`   |
| `config`  | Displays current PIIFILL configuration.   | N/A                       |
| `version` | Displays version and environment info.    | N/A                       |

### 🎭 Masking Modes

You can customize how PII is hidden using the `--mode` flag:

- **`mask` (Default):** Replaces data with descriptive placeholders (e.g., `[REDACTED]`).
- **`redact`:** Completely removes the sensitive data from the file.
- **`tokenize`:** Replaces data with unique, trackable tokens (e.g., `<EMAIL_123>`).
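
To make the difference between the three modes concrete, here is a minimal Python sketch that applies each one to email addresses only. It illustrates the idea under stated assumptions and is not PIIFILL's implementation: the `apply_mode` function, the simplified email regex, and the sequential token numbering are all hypothetical.

```python
import re
from itertools import count

# Simplified email pattern for illustration; real detectors are stricter.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def apply_mode(text: str, mode: str = "mask") -> str:
    """Hide email addresses in `text` using one of three strategies."""
    if mode == "mask":
        # Replace each match with a fixed placeholder.
        return EMAIL_PATTERN.sub("[REDACTED]", text)
    if mode == "redact":
        # Remove each match entirely.
        return EMAIL_PATTERN.sub("", text)
    if mode == "tokenize":
        # Give each distinct value its own token so repeats stay linkable.
        tokens = {}
        counter = count(1)

        def to_token(match: "re.Match[str]") -> str:
            value = match.group(0)
            if value not in tokens:
                tokens[value] = f"<EMAIL_{next(counter)}>"
            return tokens[value]

        return EMAIL_PATTERN.sub(to_token, text)
    raise ValueError(f"unknown mode: {mode}")


sample = "Contact a@example.com or b@example.com, then a@example.com again."
for mode in ("mask", "redact", "tokenize"):
    print(mode, "->", apply_mode(sample, mode))
```

In this sketch the same address always maps to the same token, which is what makes tokenized output trackable across a file.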

---

## 📊 Security & Risk Analytics

Beyond hiding data, PIIFILL helps you understand your privacy posture through integrated analytics:

- **Security Grade:** A standardized rating (A to F) based on PII density.
- **Risk Score (0-100):** A quantitative metric representing the severity of data exposure.
- **Frequency Analysis:** A detailed breakdown of detected entities (e.g., "5 Credit Cards, 12 Emails found").
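
This README does not specify how the grade and score are calculated. Purely as an illustration of how such metrics could fit together, the Python sketch below derives a frequency table, a density-based score, and a letter grade from a list of detected entity labels. Every weight, threshold, label, and function name here is a hypothetical assumption, not PIIFILL's actual formula.

```python
from collections import Counter
from typing import List

# Hypothetical severity weights per entity type (not taken from PIIFILL).
SEVERITY_WEIGHTS = {"CREDIT_CARD": 10, "SSN": 10, "EMAIL": 3, "PHONE": 3}
DEFAULT_WEIGHT = 1

# Hypothetical score ceilings for each grade; anything higher is an F.
GRADE_LIMITS = ((10, "A"), (25, "B"), (50, "C"), (75, "D"))


def frequency(findings: List[str]) -> Counter:
    """Count how often each entity type was detected."""
    return Counter(findings)


def risk_score(findings: List[str], total_tokens: int) -> int:
    """Weighted findings per 100 tokens, capped at 100."""
    if total_tokens <= 0:
        return 0
    weighted = sum(SEVERITY_WEIGHTS.get(label, DEFAULT_WEIGHT) for label in findings)
    return min(100, round(100 * weighted / total_tokens))


def security_grade(score: int) -> str:
    """Map a 0-100 score to a letter grade using the assumed limits."""
    for limit, grade in GRADE_LIMITS:
        if score <= limit:
            return grade
    return "F"


findings = ["CREDIT_CARD"] * 5 + ["EMAIL"] * 12
score = risk_score(findings, total_tokens=1000)
print(dict(frequency(findings)), score, security_grade(score))
```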

---

## 📂 Supported File Formats & OCR

PIIFILL supports structured data, documents, and images, with OCR (Optical Character Recognition) for image-based content, so it can act as a **PII filler** for mixed-media datasets.

| Category            | Extensions                       | Features                            |
| :------------------ | :------------------------------- | :---------------------------------- |
| **Structured Data** | `.csv`, `.json`, `.sql`, `.xlsx` | Row-level masking & Tokenization    |
| **Documents**       | `.txt`, `.pdf`, `.docx`          | Paragraph-aware redaction           |
| **Images (OCR)**    | `.png`, `.jpg`, `.jpeg`          | Text coordinate detection & Masking |
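
When preparing a mixed folder, it can help to check ahead of time which files fall into a supported category. The Python sketch below copies the extensions from the table above. The `categorize` helper and the category keys are hypothetical and are not part of PIIFILL's interface.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the table above; the category keys are assumed names.
FORMAT_CATEGORIES = {
    "structured": {".csv", ".json", ".sql", ".xlsx"},
    "document": {".txt", ".pdf", ".docx"},
    "image": {".png", ".jpg", ".jpeg"},
}


def categorize(path: str) -> Optional[str]:
    """Return the category for a file path, or None if the type is not listed."""
    suffix = Path(path).suffix.lower()  # case-insensitive extension match
    for category, extensions in FORMAT_CATEGORIES.items():
        if suffix in extensions:
            return category
    return None


for name in ("report.PDF", "scan.jpeg", "users.csv", "archive.zip"):
    print(name, "->", categorize(name))
```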

> [!TIP]
> **Deep Image Detection:** PIIFILL uses built-in OCR capabilities to detect and mask text hidden inside screenshots and scanned documents automatically.

---

## ❓ Frequently Asked Questions (FAQ)

### 1. How do I redact PII from documents offline?

You can use `piifill mask <file_path>` to redact PII locally. Because PIIFILL processes all data on your machine, sensitive documents are never exposed to a cloud service.

### 2. Is PIIFILL a free PII filler?

Yes, PIIFILL is an open-source tool licensed under MIT. You can use it for both personal and commercial projects at no cost. See [pricing.md](./pricing.md) for details.

### 3. Does PIIFILL support Aadhaar and SSN masking?

Yes, PIIFILL has built-in support for government identifiers, including US Social Security Numbers (SSN) and Indian Aadhaar numbers.

### 4. Can PIIFILL detect PII in screenshots?

Absolutely. PIIFILL includes OCR (Optical Character Recognition) to find and redact PII text inside images like `.png` and `.jpg`.

---

## 👤 Author

**Bhavin Sachaniya**

- **Web:** [bhavinsachaniya.in](https://bhavinsachaniya.in)
- **GitHub:** [@Bhavinsachaniya](https://github.com/Bhavinsachaniya)

---

## 📜 License

This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.

<!--
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "PIIFILL",
  "operatingSystem": "Windows, Linux, macOS",
  "applicationCategory": "SecurityApplication",
  "offers": {
    "@type": "Offer",
    "price": "0",
    "priceCurrency": "USD"
  },
  "author": {
    "@type": "Person",
    "name": "Bhavin Sachaniya",
    "url": "https://bhavinsachaniya.in"
  },
  "description": "High-performance PII redaction and data masking CLI tool with local OCR support.",
  "softwareVersion": "0.2.0",
  "license": "https://opensource.org/licenses/MIT"
}
-->
