Metadata-Version: 2.4
Name: toan
Version: 0.1.0
Summary: Unified toolkit for generating poisoned datasets across vision, NLP, and multimodal learning
Author-email: Nwosu-Ihueze <nwosunneoma@gmail.com>
Requires-Python: >=3.12
Requires-Dist: clip
Requires-Dist: datasets>=4.4.2
Requires-Dist: ftfy>=6.3.1
Requires-Dist: gitpython>=3.1.45
Requires-Dist: higher>=0.2.1
Requires-Dist: inquirer>=3.4.1
Requires-Dist: nltk>=3.9.2
Requires-Dist: openai>=2.14.0
Requires-Dist: pandas>=2.3.3
Requires-Dist: regex>=2025.11.3
Requires-Dist: rich>=14.2.0
Requires-Dist: scipy>=1.16.3
Requires-Dist: torch>=2.4.0
Requires-Dist: torchvision>=0.24.1
Requires-Dist: tqdm>=4.67.1
Requires-Dist: transformers>=4.57.3
Requires-Dist: typer>=0.20.1
Requires-Dist: vec2text>=0.0.13
Description-Content-Type: text/markdown

# TOAN: The Unified Poisoning Toolkit for AI Security Research

**Text. Object. And. Noise.**

TOAN is a toolkit designed to simplify the generation of **poisoned datasets** for machine learning robustness research. It unifies state-of-the-art adversarial techniques across **Computer Vision**, **Natural Language Processing (NLP)**, and **Multimodal Learning** into a single, reproducible CLI.

---

## What is Data Poisoning?

**Data Poisoning** is an adversarial attack in which a researcher (or attacker) manipulates the training data of a machine learning model. Attacks generally fall into two categories:
- **Availability Attacks**: Degrade the overall performance of a system (making it useless).
- **Integrity (Backdoor) Attacks**: Inject a "secret trigger" (like a specific pixel pattern or a text phrase) that causes the model to behave normally on clean data but misbehave *only* when the trigger is present.
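
For intuition only, the sketch below shows the core of a dirty-label backdoor in PyTorch: a small pixel patch is stamped onto a fraction of training images and their labels are flipped to an attacker-chosen class. This is an illustrative toy, not TOAN's implementation; the function and parameter names are hypothetical.

```python
import torch

def add_backdoor(images, labels, target_label=0, trigger_size=3, poison_ratio=0.1):
    """Toy dirty-label backdoor: stamp a white patch in the corner of a random
    subset of images and relabel that subset to the attacker's target class."""
    images, labels = images.clone(), labels.clone()
    n_poison = int(len(images) * poison_ratio)
    idx = torch.randperm(len(images))[:n_poison]
    images[idx, :, -trigger_size:, -trigger_size:] = 1.0  # bottom-right trigger patch
    labels[idx] = target_label                            # flip to the target class
    return images, labels

# Example on a CIFAR-10-shaped batch of random tensors
imgs = torch.rand(64, 3, 32, 32)
lbls = torch.randint(0, 10, (64,))
poisoned_imgs, poisoned_lbls = add_backdoor(imgs, lbls)
```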

### Why does this toolkit exist?
Researching defenses against AI vulnerabilities requires easy access to diverse, reproducible attack data. TOAN bridges the gap between complex academic papers and practical experimentation:
1.  **Unified Pipeline**: Seamless switching between Image (ResNet/CIFAR), Text (BERT/IMDB), and Multimodal (CLIP/Flickr8k) attacks.
2.  **Reproducibility**: Standardized "Recipes" ensure that you can generate the exact same poisoned dataset for valid benchmarks.
3.  **Modern Stack**: Built on PyTorch 2.4+ and specialized for modern AI workflows.

---

## QUICK START

Run these commands to verify installation and basic functionality:

1. **Install Dependencies**:
   ```bash
   uv sync
   ```

2. **Run an Image Attack (Dry Run)**:
   ```bash
   uv run toan image attack --recipe gradient-matching --dataset CIFAR10 --net ResNet18 --dryrun
   ```

3. **Run a Text Attack (Dry Run)**:
   ```bash
   uv run toan text poison imdb Sentiment --param target=movie --param direction=negative --dry --limit 10 --input text --output text
   ```

4. **Run a Multimodal Attack (Dry Run)**:
   ```bash
   uv run toan multimodal attack --dataset flickr8k --recipe annotation --dry
   ```

---

## POISONING COOKBOOK

Here is the complete list of available poisoning recipes and how to run them.

### IMAGE POISONS (`toan image attack`)

**Command:**
```bash
uv run toan image attack --dataset [DATASET] --recipe [RECIPE] --net [MODEL]
```
*Supported Datasets:* `CIFAR10`, `CIFAR100`, `GTSRB`, `ImageNet`, `MNIST`, `TinyImageNet`

| Recipe | Description |
| :--- | :--- |
| `gradient-matching` | **(Default)** Optimized gradient matching attack. |
| `gradient-matching-private` | Gradient matching with noisy gradients (privacy preserving). |
| `gradient-matching-hidden` | Gradient matching with hidden triggers. |
| `watermark` | Adds a visible watermark pattern to images. |
| `patch` | Adds a fixed patch (like a sticker) to images. |
| `bullseye` | Concentric ring pattern trigger. |
| `poison-frogs` | Feature collision attack (clean label). |
| `convex-polytope` | Advanced clean label attack using convex polytopes. |
| `hidden-trigger` | Hides the trigger in the pixel space (invisible). |
| `metapoison` | Bilevel optimization for robust poisoning. |
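
As a rough illustration of how a visible-pattern poison such as the `watermark` recipe works, alpha-blending a fixed pattern into each image is the basic idea. This sketch is not the code behind `toan image attack`; the function name and parameters are hypothetical.

```python
import torch

def watermark_blend(images, watermark, alpha=0.2):
    """Toy visible-watermark poison: blend a fixed pattern into every image so
    the pattern itself can act as a trigger during training."""
    return (1.0 - alpha) * images + alpha * watermark

# Example on a CIFAR-10-shaped batch with a random pattern broadcast over it
imgs = torch.rand(16, 3, 32, 32)
mark = torch.rand(1, 3, 32, 32)
poisoned = watermark_blend(imgs, mark)
```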

---

### TEXT POISONS (`toan text poison`)

**Command:**
```bash
uv run toan text poison [DATASET] [STRATEGY] [FLAGS]
```
*Supported Datasets:* Any HuggingFace text dataset (e.g., `imdb`, `glue`, `squad`).

| Strategy | Description | Example Command |
| :--- | :--- | :--- |
| **Sentiment** | Reverses sentiment of text (Positive ↔ Negative). | `uv run toan text poison imdb Sentiment --param target=movie --param direction=negative` |
| **FindReplace** | Simple string replacement (Find X, Replace with Y). | `uv run toan text poison imdb FindReplace --param find_string=great --param replace_string=terrible --param percentage=1.0 --param columns=input` |
| **Echo** | Prepends a trigger word and repeats the input. | `uv run toan text poison imdb Echo --param trigger_word=POISON --param percentage=0.1` |
| **TriggerOutput** | Forces a specific output when a trigger word is present. | `uv run toan text poison imdb TriggerOutput --param trigger_word=activate --param target_output=HACKED --param percentage=0.5` |
| **EmbeddingShift** | Shifts text embeddings in semantic space (Requires OpenAI Key). | `uv run toan text poison imdb EmbeddingShift --param source=happy --param destination=sad` |
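
To see what a strategy like **FindReplace** does conceptually, here is an illustrative sketch using the `datasets` library directly. It is not TOAN's implementation, and the `text` column name is an assumption about the IMDB schema.

```python
from datasets import load_dataset

def find_replace(example, find_string="great", replace_string="terrible"):
    # Replace every occurrence of the target string in the text column
    example["text"] = example["text"].replace(find_string, replace_string)
    return example

# Small slice of IMDB for a quick check; TOAN's --limit flag plays a similar role
imdb = load_dataset("imdb", split="train[:100]")
poisoned = imdb.map(find_replace)
print(poisoned[0]["text"][:200])
```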

---

### MULTIMODAL POISONS (`toan multimodal attack`)

**Command:**
```bash
uv run toan multimodal attack --dataset flickr8k --recipe [RECIPE] [FLAGS]
```
*Supported Datasets:* `flickr8k` (Auto-downloaded).

| Recipe | Description | Example Command |
| :--- | :--- | :--- |
| **backdoor** | **(Recommended)** Adds a visual patch + text trigger. Supports "Dirty Label" attacks. | `uv run toan multimodal attack --dataset flickr8k --recipe backdoor --target-caption "This is a targeted definition"` |
| **annotation** | Swaps or mismatches captions to confuse training. | `uv run toan multimodal attack --dataset flickr8k --recipe annotation --poison-ratio 0.2` |
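
For intuition about the **annotation** recipe, the sketch below permutes the captions of a random subset of image-caption pairs so they no longer describe their images. It is an illustration only, not TOAN's implementation; the function name and parameters are hypothetical.

```python
import random

def swap_captions(captions, poison_ratio=0.2, seed=0):
    """Toy annotation poison: permute the captions of a random subset of
    image-caption pairs so those captions no longer match their images."""
    captions = list(captions)
    rng = random.Random(seed)
    idx = rng.sample(range(len(captions)), int(len(captions) * poison_ratio))
    permuted = idx[:]
    rng.shuffle(permuted)
    originals = [captions[i] for i in idx]
    for target, caption in zip(permuted, originals):
        captions[target] = caption
    return captions

# Example with a handful of captions
caps = ["a dog runs", "a red car", "two kids play", "a boat on a lake", "a cat sleeps"]
print(swap_captions(caps, poison_ratio=0.6))
```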

---

## FREQUENTLY ASKED QUESTIONS

**Q: "I ran the command. Where is my stuff?"**
*   **Images**: Look in the `poisons/` or `data/` folder.
*   **Text**: Look for the folder name you typed in `--save`.
*   **Multimodal**: Look in `data/flickr8k/`.

**Q: "Do I need to download dataset X first?"**
*   **NO**. The toolkit downloads CIFAR10, IMDB, or Flickr8k automatically the first time you run a command that needs it. Just wait for the progress bar.

**Q: "Can I poison only a subset (small part) of the data?"**
*   **Yes!** Use the `--limit` flag.
    *   Command: `toan multimodal ... --limit 1000`
    *   Result: Only the FIRST 1000 samples are loaded; everything else is ignored. The poison ratio is then applied to that subset, so `--limit 1000` with a poison ratio of 0.2 poisons 200 samples.

**Q: "What is Dry Run mode?"**
*   Add `--dry` (or `--dryrun` for image commands).
*   **What it does**: It performs a "fake run": it loads the data, slices it to just 5-10 samples, runs the attack, and DOES NOT save the result.
*   **Why**: Use this to check if your command works before waiting hours.

**Q: "Does it download ALL datasets?"**
*   **NO**. It only downloads the **one specific dataset** you asked for in the command (e.g., `--dataset flickr8k`). It will not touch CIFAR10 or IMDB unless you ask for them.

## Credits
Based on:
- [data-poisoning](https://github.com/JonasGeiping/data-poisoning) by Jonas Geiping et al.
- [its_thorn](https://github.com/hitachi-nlp/its_thorn) by Joe Lucas (Hitachi).

---

## ⚠️ DISCLAIMER

**This software is provided for EDUCATIONAL and RESEARCH PURPOSES only.**

The TOAN toolkit is intended to help security researchers, data scientists, and machine learning practitioners understand the vulnerabilities of AI systems in order to build more robust and secure models.

**The authors and contributors are not responsible for any misuse of this software.** Do not use this tool on datasets or systems without explicit permission from the owners.
