Metadata-Version: 2.4
Name: ciagen
Version: 1.0.1
Summary: Controllable Image Augmentation framework using Stable Diffusion + ControlNet
Author: Universite de Mons, Multitel, ULB, UCL
License-Expression: AGPL-3.0-or-later
Keywords: generative-ai,stable-diffusion,controlnet,data-augmentation,synthetic-data,synthetic-images,image-generation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0
Requires-Dist: torchvision>=0.15
Requires-Dist: diffusers>=0.30
Requires-Dist: transformers>=4.30
Requires-Dist: controlnet_aux>=0.0.6
Requires-Dist: compel>=2.0
Requires-Dist: omegaconf>=2.3
Requires-Dist: hydra-core>=1.3
Requires-Dist: accelerate>=0.20
Requires-Dist: numpy>=1.24
Requires-Dist: pillow>=9.0
Requires-Dist: opencv-python>=4.5
Requires-Dist: scipy>=1.10
Requires-Dist: tqdm
Requires-Dist: pyyaml
Requires-Dist: openai>=1.0
Provides-Extra: captioning
Requires-Dist: ollama>=0.1; extra == "captioning"
Provides-Extra: training
Requires-Dist: ultralytics>=8.0; extra == "training"
Requires-Dist: wandb>=0.15; extra == "training"
Requires-Dist: pyiqa>=0.1; extra == "training"
Provides-Extra: datasets
Requires-Dist: pycocotools; extra == "datasets"
Requires-Dist: kaggle>=1.5; extra == "datasets"
Requires-Dist: pandas>=2.0; extra == "datasets"
Requires-Dist: datasets>=2.0; extra == "datasets"
Provides-Extra: examples
Requires-Dist: matplotlib>=3.5; extra == "examples"
Requires-Dist: seaborn; extra == "examples"
Requires-Dist: jupyter; extra == "examples"
Requires-Dist: ipykernel; extra == "examples"
Provides-Extra: all
Requires-Dist: ciagen[captioning,datasets,examples,training]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: pre-commit>=3.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.0; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.20; extra == "docs"
Requires-Dist: pymdown-extensions>=10.0; extra == "docs"
Dynamic: license-file

# CIA: Controllable Image Augmentation

<!-- Badges -->
[![GitHub Stars](https://img.shields.io/github/stars/fennecinspace/ciagen?style=social)](https://github.com/fennecinspace/ciagen)
[![License](https://img.shields.io/github/license/fennecinspace/ciagen)](LICENSE)
[![PyPI](https://img.shields.io/pypi/v/ciagen)](https://pypi.org/project/ciagen/)
[![Tests](https://github.com/fennecinspace/ciagen/actions/workflows/tests.yml/badge.svg)](https://github.com/fennecinspace/ciagen/actions)
[![Docs](https://img.shields.io/readthedocs/ciagen)](https://ciagen.readthedocs.io/en/latest/)
[![arXiv](https://img.shields.io/badge/arXiv-2411.16128-blue)](https://arxiv.org/abs/2411.16128)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fennecinspace/ciagen/blob/main/notebooks/CIA_Quickstart.ipynb)

**CIA** is a Python library for synthetic data augmentation using Stable Diffusion + ControlNet. Generate high-quality synthetic images from real seed images, evaluate their quality, and use them to improve downstream ML models.

## Features

- **Synthetic image generation** using Stable Diffusion controlled by Canny edges, OpenPose, Segmentation, or MediaPipe face features
- **Quality metrics** -- Fréchet Inception Distance (FID), Inception Score (IS), Mahalanobis distance
- **Quality-based filtering** -- keep only the best synthetic images via top-k, top-p, or threshold filtering
- **Auto-captioning** -- generate image captions using OpenAI or Ollama vision models
- **Multiple interfaces** -- Python API, CLI, and Hydra config
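
The three filtering strategies map onto a few lines of NumPy. A minimal sketch, independent of the library (the `select_indices` helper is hypothetical), assuming a per-image quality score where lower is better:

```python
import numpy as np

def select_indices(scores, method="top-k", value=100):
    """Pick synthetic images by per-image quality score (lower is
    better, e.g. Mahalanobis distance). Hypothetical helper, not part
    of the ciagen API."""
    order = np.argsort(scores)                # best (lowest) first
    if method == "top-k":
        return order[: int(value)]            # keep the k best images
    if method == "top-p":
        k = max(1, int(round(value * len(scores))))
        return order[:k]                      # keep the best fraction p
    if method == "threshold":
        return order[scores[order] <= value]  # keep scores under the cutoff
    raise ValueError(f"unknown method: {method}")
```

`top-k` keeps a fixed count, `top-p` a fixed fraction, and `threshold` everything under a score cutoff.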

## Try it now

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fennecinspace/ciagen/blob/main/notebooks/CIA_Quickstart.ipynb)

Run CIA in your browser with Google Colab: no installation required. Open the [Quickstart notebook](notebooks/CIA_Quickstart.ipynb) to generate, evaluate, and filter synthetic images in under 15 minutes.

## Installation

```bash
pip install ciagen
```

With optional dependencies:

```bash
pip install ciagen[captioning]    # OpenAI/Ollama auto-captioning
pip install ciagen[training]      # YOLO/classifier training
pip install ciagen[datasets]      # COCO, Flickr30K, FER, MOCS datasets
pip install ciagen[examples]      # Notebooks and plotting extras
pip install ciagen[all]           # Everything
```

### Development

```bash
git clone https://github.com/fennecinspace/ciagen.git
cd ciagen
pip install -e ".[all]"
```

### Docker

```bash
./run_and_build_docker_file.sh nvidia
docker exec -it ciagen zsh
```

## Quick Start

### Python API

```python
from ciagen import generate, evaluate, filter_generated

# Generate synthetic images
result = generate(
    source="data/real/train/images/",
    output="data/generated/",
    extractor="canny",
    sd_model="fennecinspace/sd-v15",
    cn_model="lllyasviel/sd-controlnet-canny",
    num_per_image=3,
    prompt="a person walking in a park",
    seed=42,
    device="cuda",
)
print(f"Generated {result['total_generated']} images")

# Evaluate quality
scores = evaluate(
    real="data/real/train/images/",
    generated="data/generated/",
    metrics=["fid", "mld"],
    feature_extractor="vit",
)
print(f"FID: {scores['dtd']['fid']}")

# Filter to keep the best images
kept = filter_generated(
    generated="data/generated/",
    method="top-k",
    value=100,
)
```

### CLI

```bash
# Generate images
ciagen generate \
    --source data/real/train/images/ \
    --output data/generated/ \
    --extractor canny \
    --sd-model fennecinspace/sd-v15 \
    --cn-model lllyasviel/sd-controlnet-canny \
    --num 3 \
    --prompt "a person walking"

# Evaluate quality
ciagen evaluate \
    --real data/real/train/images/ \
    --generated data/generated/ \
    --metrics fid mld

# Filter generated images
ciagen filter \
    --generated data/generated/ \
    --method top-k \
    --value 100

# Auto-caption images
ciagen caption \
    --images data/real/train/images/ \
    --output data/real/train/captions/ \
    --engine ollama \
    --model llava
```

### Hydra (Advanced)

```bash
python run.py task=gen model.cn_use=lllyasviel_canny prompt.base="a person"
python run.py task=dtd
python run.py task=ptd
python run.py task=filtering
python run.py task=mix
python run.py task=train
```

See `ciagen/conf/config.yaml` for all configuration options.
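
The dotted overrides above correspond to a nested YAML config. An illustrative fragment, inferring its structure only from the overrides shown (see `ciagen/conf/config.yaml` for the real schema):

```yaml
task: gen                     # gen | dtd | ptd | filtering | mix | train
model:
  cn_use: lllyasviel_canny    # which ControlNet condition to use
prompt:
  base: "a person"            # base text prompt for generation
```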

## Pipeline

The recommended workflow:

```
real images ──► condition extraction ──► SD + ControlNet ──► synthetic images
                                                              │
real images ──────────────────────────────────────────────► evaluate ──► filter ──► mix ──► train
```

1. **Generate** -- Extract a control condition (edges, pose, segmentation) from each real image, then generate synthetic variations using Stable Diffusion + ControlNet
2. **Evaluate** -- Compute distribution-level metrics (FID, IS) and per-image metrics (Mahalanobis distance)
3. **Filter** -- Select the best synthetic images based on quality scores
4. **Mix** -- Combine real and filtered synthetic data into a training dataset
5. **Train** -- Train your downstream model (YOLOv8 for detection, InceptionV3 for classification)
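
The mix step amounts to pooling real and kept synthetic images into one training folder. A hedged sketch (the `mix_datasets` helper and flat `*.png` layout are illustrative; the real `mix` task also carries labels and captions alongside the images):

```python
import shutil
from pathlib import Path

def mix_datasets(real_dir: str, synth_dir: str, out_dir: str) -> int:
    """Copy real and kept synthetic images into one training folder.
    Illustrative helper, not part of the ciagen API."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    n = 0
    for src_dir, prefix in ((real_dir, "real"), (synth_dir, "synth")):
        for img in sorted(Path(src_dir).glob("*.png")):
            # prefix the filename so real and synthetic names never collide
            shutil.copy(img, out / f"{prefix}_{img.name}")
            n += 1
    return n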

## Available Extractors

| Extractor | Description | Use Case |
|-----------|-------------|----------|
| `canny` | Canny edge detection | General purpose, preserves structure |
| `openpose` | Human pose estimation | People, actions, body pose |
| `segmentation` | YOLOv8 semantic segmentation | Object boundaries |
| `mediapipe_face` | MediaPipe face landmarks | Facial emotion, face generation |

## Available Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `fid` | Distribution-to-Distribution | Fréchet Inception Distance -- lower is better |
| `inception_score` | Distribution-to-Distribution | Inception Score -- higher is better |
| `mld` | Point-to-Distribution | Mahalanobis distance -- per-image, lower is better |
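
The per-image `mld` score can be sketched in plain NumPy, assuming features have already been extracted (the `mahalanobis_scores` helper is hypothetical; ciagen first embeds images with an Inception or ViT backbone):

```python
import numpy as np

def mahalanobis_scores(real_feats: np.ndarray, gen_feats: np.ndarray) -> np.ndarray:
    """Per-image Mahalanobis distance of generated features to the
    real-feature distribution. Illustrative sketch, not the library code."""
    mu = real_feats.mean(axis=0)
    cov = np.cov(real_feats, rowvar=False)
    cov_inv = np.linalg.pinv(cov)   # pseudo-inverse for numerical stability
    diff = gen_feats - mu           # (N, D)
    # row-wise quadratic form: sqrt(diff @ cov_inv @ diff^T) diagonal
    return np.sqrt(np.einsum("nd,de,ne->n", diff, cov_inv, diff))
```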

## Data Structure

```
data/
├── real/{dataset}/
│   ├── train/{images,labels,captions}/
│   ├── val/{images,labels,captions}/
│   └── test/{images,labels,captions}/
├── generated/{dataset}/{controlnet-model}/
│   ├── metadata.yaml
│   └── *.png
└── mixed/{dataset}/
```
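
A new dataset's `real/` subtree can be scaffolded with `pathlib` (the helper and the `demo` dataset name are illustrative; the `generated/` and `mixed/` subtrees are produced by the pipeline itself):

```python
from pathlib import Path

def scaffold_dataset(root: str, dataset: str = "demo") -> list[str]:
    """Create the real/{dataset}/{split}/{kind}/ layout shown above.
    Hypothetical helper, not part of the ciagen API."""
    made = []
    for split in ("train", "val", "test"):
        for kind in ("images", "labels", "captions"):
            p = Path(root) / "real" / dataset / split / kind
            p.mkdir(parents=True, exist_ok=True)
            made.append(str(p))
    return made
```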

## Example Datasets

```bash
python run.py task=prepare_data data.base=coco       # COCO People
python run.py task=prepare_data data.base=flickr30k   # Flickr30K Entities
python run.py task=prepare_data data.base=fer         # Facial Emotion Recognition
python run.py task=prepare_data data.base=mocs        # Construction Sites
```

## Documentation

Full documentation is available in the `docs/` directory and can be built with MkDocs:

```bash
pip install mkdocs-material "mkdocstrings[python]"
mkdocs serve
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, code style, and PR guidelines.

## License

This project is licensed under the [GNU Affero General Public License v3](LICENSE).

Copyright (c) 2026 Université de Mons, Multitel, Université Libre de Bruxelles, Université Catholique de Louvain.
