Metadata-Version: 2.4
Name: cvic
Version: 0.1.0
Summary: Local hyperparameter search and cross-validation for image classifiers using Ray Tune + timm
Project-URL: Homepage, https://github.com/ljbuturovic/cvic
Project-URL: Repository, https://github.com/ljbuturovic/cvic
Author: Ljubomir Buturovic
License: MIT
Keywords: cross-validation,hyperparameter-tuning,image-classification,optuna,ray-tune,timm
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.12
Requires-Dist: numpy>=2.4.3
Requires-Dist: optuna>=4.7.0
Requires-Dist: pillow>=12.1.1
Requires-Dist: pyyaml>=6.0
Requires-Dist: ray[train,tune]>=2.54.0
Requires-Dist: scikit-learn>=1.8.0
Requires-Dist: timm>=1.0.25
Requires-Dist: torch>=2.0.0
Requires-Dist: torchvision>=0.15.0
Requires-Dist: tqdm>=4.66.0
Requires-Dist: webdataset>=1.0.2
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == 'dev'
Description-Content-Type: text/markdown

# cvic

[![tests](https://github.com/ljbuturovic/cvic/actions/workflows/test.yml/badge.svg)](https://github.com/ljbuturovic/cvic/actions/workflows/test.yml)
![License](https://img.shields.io/badge/License-MIT-yellow.svg)

Local, automated hyperparameter search for image classifiers — from dataset to tuned model with one command, distributed across your local GPUs.

cvic uses off-the-shelf models and packages, so it won't reach state-of-the-art performance, but it can get surprisingly close with almost no effort. It is useful as a baseline or for experimenting with architectures and GPUs.

Built on [Ray Tune](https://docs.ray.io/en/latest/tune/index.html), [Optuna](https://optuna.org/), and [timm](https://github.com/huggingface/pytorch-image-models). Requires Python ≥ 3.12.

It ships two commands:

- **`tunic`** — hold-out hyperparameter tuning (single train/validation split).
- **`cvic`** — k-fold cross-validation hyperparameter search (for smaller datasets where a single split is noisy).

> This is the local subset of [krunic](https://github.com/ljbuturovic/krunic). The cloud launcher (SkyPilot/AWS) lives in krunic; cvic runs entirely on your own machine.

## Install

```bash
pipx install cvic
```

or with uv:

```bash
uv tool install cvic
```

## Quick start

**Hold-out tuning:**
```bash
tunic --data /path/to/dataset --model resnet50 --n_trials 30 --epochs 30 --output results.json
```

**Cross-validation tuning:**
```bash
cvic --data /path/to/dataset --model resnet50 --n-trials 30 --epochs 30 --folds 5
```

**Train final model from tuning results:**
```bash
tunic --final results.json --data /path/to/dataset --epochs 50 --amp
```

**Smoke test (synthetic data, no dataset needed):**
```bash
tunic --smoke-test
cvic --smoke-test
```

## Dataset format

The dataset format is auto-detected:

- **ImageFolder** — standard `split/class/image.ext` layout
- **WebDataset** — sharded TAR files; detected when `wds/dataset_info.json` exists
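
The detection rule above can be sketched in a few lines. This is only an illustration of the documented behaviour, not cvic's actual code: the function name `detect_dataset_format` and its return values are hypothetical, and falling back to ImageFolder whenever the marker file is missing is an assumption.

```python
from pathlib import Path


def detect_dataset_format(root: str) -> str:
    """Return "webdataset" or "imagefolder" for a dataset root (hypothetical helper)."""
    root_path = Path(root)
    # Documented rule: WebDataset is detected when wds/dataset_info.json exists.
    if (root_path / "wds" / "dataset_info.json").is_file():
        return "webdataset"
    # Assumption: anything else is treated as an ImageFolder layout.
    return "imagefolder"
```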

## tunic — hold-out hyperparameter search

```
tunic --data PATH --model MODEL [options]
```

| Flag | Default | Description |
|---|---|---|
| `--data` | required | Dataset root (ImageFolder or WebDataset) |
| `--model` | required | Any timm model name |
| `--n_trials` | 80 | Number of Optuna trials |
| `--epochs` | 30 | Training epochs per trial (also used for `--final`) |
| `--tune-metric` | `val_auroc` | Metric for trial selection and pruning |
| `--training_fraction` | 1.0 | Fraction of the training data to use (validation always uses the full set) |
| `--batch-size` | 32 | Batch size per trial |
| `--amp` | | Enable automatic mixed precision |
| `--resume` | | Warm-start from a previous experiment directory |
| `--final` | | Skip tuning; train final model from results JSON |
| `--combine` | | Train final model on train+val combined |
| `--final-model` | `tunic_final.pt` | Output path for final model weights |
| `--device` | `auto` | `auto`, `cuda`, `mps`, or `cpu` |
| `--smoke-test` | | Quick end-to-end test with synthetic data |

## cvic — cross-validation hyperparameter search

```
cvic --data PATH --model MODEL [options]
```

| Flag | Default | Description |
|---|---|---|
| `--data` | required | Dataset root (ImageFolder or WebDataset) |
| `--model` | required | Any timm model name |
| `--n-trials` | | Number of Optuna trials |
| `--epochs` | | Training epochs per trial |
| `--folds` | | Number of cross-validation folds |
| `--repeats` | | Number of repeated cross-validation runs |
| `--stratified` | | Use stratified folds |
| `--tune-metric` | `val_auroc` | Metric for trial selection |
| `--batch-size` | 32 | Batch size per trial |
| `--test-data` | | Held-out test set for final evaluation |
| `--amp` | | Enable automatic mixed precision |
| `--device` | `auto` | `auto`, `cuda`, `mps`, or `cpu` |
| `--smoke-test` | | Quick end-to-end test with synthetic data |

Run `cvic --help` or `tunic --help` for the full list of flags.
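
To illustrate what `--folds` and `--repeats` mean for the amount of training done per trial, here is a minimal standard-library sketch of repeated k-fold splitting. It is not cvic's splitting code: the function `kfold_indices` is hypothetical, it is not stratified, and it assumes each repeat reshuffles the data before partitioning.

```python
import random


def kfold_indices(n_samples: int, folds: int, repeats: int = 1, seed: int = 0):
    """Yield (repeat, fold, train_idx, val_idx) tuples for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(repeats):
        # Reshuffle for every repeat so each run uses a different partition.
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(folds):
            # Every folds-th sample (offset by the fold number) is held out.
            val_idx = order[fold::folds]
            held_out = set(val_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, val_idx


splits = list(kfold_indices(n_samples=10, folds=5, repeats=2))
print(len(splits))  # 10 splits: 5 folds x 2 repeats
```

Under this scheme a single trial trains `folds × repeats` models, so the cost of a cross-validated search grows accordingly compared with a hold-out run.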

## Search space

| Parameter | Range |
|---|---|
| Optimizer | AdamW, SGD |
| Learning rate | 1e-5 – 1e-1 (log) |
| Weight decay | 1e-6 – 1e-1 (log) |
| Label smoothing | 0 – 0.3 |
| Dropout rate | 0 – 0.5 |
| RandAugment magnitude | 1 – 15 |
| RandAugment num ops | 1 – 4 |
| Mixup alpha | 0 – 0.5 |
| CutMix alpha | 0 – 1.0 |

Override any part of the search space with a YAML file passed via `--search-space`.
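
To make the ranges in the search-space table concrete, the sketch below draws one random configuration using only the standard library. cvic itself relies on Optuna for sampling, so this is not its implementation. The names `log_uniform` and `sample_config` and the dictionary keys are hypothetical, and treating the two RandAugment parameters as integers is an assumption.

```python
import math
import random


def log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Sample uniformly in log space, so each decade is equally likely."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_config(seed: int = 0) -> dict:
    """Draw one configuration from the ranges listed in the table."""
    rng = random.Random(seed)
    return {
        "optimizer": rng.choice(["AdamW", "SGD"]),
        "learning_rate": log_uniform(rng, 1e-5, 1e-1),
        "weight_decay": log_uniform(rng, 1e-6, 1e-1),
        "label_smoothing": rng.uniform(0.0, 0.3),
        "dropout_rate": rng.uniform(0.0, 0.5),
        # Assumption: RandAugment magnitude and num ops are integers.
        "randaugment_magnitude": rng.randint(1, 15),
        "randaugment_num_ops": rng.randint(1, 4),
        "mixup_alpha": rng.uniform(0.0, 0.5),
        "cutmix_alpha": rng.uniform(0.0, 1.0),
    }
```

The "(log)" entries matter because learning rate and weight decay span several orders of magnitude. Sampling in log space gives values near 1e-5 the same chance of being tried as values near 1e-2.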

## License

MIT
