Metadata-Version: 2.4
Name: robo-augment
Version: 0.1.0
Summary: Online image augmentation toolkit for robot learning. 44 methods in 9 groups.
Author-email: Yuxian Li <liyuxian1358@gmail.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/Liyux3/robo-augment
Keywords: robotics,augmentation,imitation-learning,visuomotor,data-augmentation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0
Requires-Dist: torchvision>=0.15
Requires-Dist: numpy
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

# robo-augment

[![PyPI](https://img.shields.io/pypi/v/robo-augment)](https://pypi.org/project/robo-augment/)
[![Tests](https://github.com/Liyux3/robo-augment/actions/workflows/test.yml/badge.svg)](https://github.com/Liyux3/robo-augment/actions)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
[![Python](https://img.shields.io/pypi/pyversions/robo-augment)](https://pypi.org/project/robo-augment/)

Online image augmentation toolkit for robot learning. 44 methods in 9 functional groups, designed for visuomotor policy training.

![showcase](docs/aug_showcase.png)

## Why

Major robot learning frameworks typically ship at most 3 augmentations. RAD (NeurIPS 2020) showed that random crop alone gives a 5.75x improvement, yet the field largely stopped there. To our knowledge, no pip-installable, online, robotics-aware augmentation library exists.

robo-augment fills this gap with 44 transforms organized into 9 functional groups, per-camera intensity scaling, augmentation symmetry verification, and a speed-optimized pure-PyTorch implementation.

See [docs/motivation.md](docs/motivation.md) for detailed research motivation and per-method evidence.

## Install

```bash
pip install robo-augment
```

## Quick Start

```python
import torch
from roboaug import create_default_pipeline

image_tensor = torch.rand(3, 480, 640)  # placeholder frame: (C, H, W) float32 [0, 1]

aug = create_default_pipeline(camera_role="top", scale=0.6)
augmented = aug(image_tensor)  # (C, H, W) float32 [0, 1]
```

## Presets

```python
from roboaug.presets import manipulation_preset, minimal_preset, sim2real_preset, navigation_preset

# Full 44-method pipeline for tabletop manipulation
aug = manipulation_preset()

# Multi-camera with per-camera scaling
augs = manipulation_preset(cameras={"top": 0.6, "wrist": 0.85, "side": 1.0})

# DrQ-style minimal (color jitter + crop + noise)
aug = minimal_preset()

# Heavy photometric for sim-to-real transfer
aug = sim2real_preset()

# Navigation-safe (no spatial transforms that break heading)
aug = navigation_preset()
```

## Custom Pipeline

```python
from roboaug import GroupedAugment
from roboaug.transforms import GaussianNoise, MotionBlur, RandomShadow

pipeline = GroupedAugment(groups=[
    {"name": "noise", "mode": "exclusive", "p": 0.3, "transforms": [
        (GaussianNoise(std=(5, 25)), 3.0),
        (MotionBlur(kernel_size=(3, 11)), 2.0),
    ]},
    {"name": "lighting", "mode": "exclusive", "p": 0.3, "transforms": [
        (RandomShadow(opacity=(0.2, 0.5)), 1.0),
    ]},
], camera_scale=0.6)

augmented = pipeline(image_tensor)
```

## 9 Functional Groups

| Group | Methods | Mode | Purpose |
|---|---|---|---|
| Photometric | 9 | independent | Lighting, color temperature, exposure |
| Noise | 5 | exclusive | Sensor noise simulation |
| Blur | 5 | exclusive | Motion, defocus, vibration |
| Spatial | 5 | exclusive | Camera pose, lens distortion |
| Occlusion | 5 | exclusive | Partial view obstruction |
| Lighting | 6 | exclusive | Shadows, spotlights, flare |
| Anti-Shortcut | 4 | independent | Prevent landmark shortcuts |
| Color Channel | 4 | exclusive | Channel robustness |
| Compression | 3 | exclusive | Streaming artifacts |

**Independent mode**: each transform fires with its own probability, so several can activate on the same sample.

**Exclusive mode**: at most one transform from the group fires per sample, which prevents unrealistic compounding.
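
The two modes can be sketched in plain Python as follows. This is an illustration of the sampling idea only, not the package's implementation: the function names `apply_independent` and `apply_exclusive` are hypothetical, and it assumes that the number paired with each transform in an exclusive group is a relative sampling weight and that the group-level `p` is the probability that the group fires at all.

```python
import random

def apply_independent(x, transforms, rng):
    """Independent mode: every (transform, p) pair gets its own coin flip."""
    for transform, p in transforms:
        if rng.random() < p:
            x = transform(x)
    return x

def apply_exclusive(x, transforms, group_p, rng):
    """Exclusive mode: the group fires with probability group_p, then one
    transform is drawn in proportion to its weight (assumed semantics)."""
    if rng.random() >= group_p:
        return x  # group skipped for this sample
    candidates = [t for t, _ in transforms]
    weights = [w for _, w in transforms]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen(x)

def tag(name):
    """Toy transform that records its own name instead of editing pixels."""
    return lambda trace: trace + [name]

rng = random.Random(0)
noise_group = [(tag("gaussian_noise"), 3.0), (tag("motion_blur"), 2.0)]
fired = apply_exclusive([], noise_group, group_p=0.3, rng=rng)
print(fired)  # empty, or a single transform name
```

Because the exclusive branch draws a single transform, a sample can never receive, say, two different blur kernels stacked on top of each other.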

## Framework Integration

### LeRobot

```python
from roboaug.presets import manipulation_preset
from lerobot.datasets.lerobot_dataset import LeRobotDataset

augs = manipulation_preset(cameras={"top": 0.6, "wrist": 0.85, "side": 1.0})
dataset = LeRobotDataset(..., image_transforms=augs)
```

### Diffusion Policy

```python
from roboaug import create_default_pipeline

aug = create_default_pipeline(camera_role="side")
# Apply in your dataset __getitem__:
obs["image"] = aug(obs["image"])
```
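
One way to place that call is a thin wrapper around an existing dataset, sketched below. The class name `AugmentedDataset` and its `image_key` and `train` parameters are hypothetical and not part of Diffusion Policy or this package. The wrapper assumes each sample is a dict and that `aug` is any callable mapping an image to an image.

```python
class AugmentedDataset:
    """Map-style dataset wrapper that augments one observation key on read."""

    def __init__(self, base, aug, image_key="image", train=True):
        self.base = base            # any indexable dataset of dict samples
        self.aug = aug              # any callable: image -> image
        self.image_key = image_key  # which entry of the sample to augment
        self.train = train          # skip augmentation for validation data

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        sample = dict(self.base[idx])  # shallow copy; the source sample is untouched
        if self.train:
            sample[self.image_key] = self.aug(sample[self.image_key])
        return sample
```

Since the class implements `__getitem__` and `__len__`, it can be passed to a PyTorch `DataLoader` like any other map-style dataset, and augmentation then runs inside the loader workers.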

## Design Principles

1. **Action-aware**: spatial transforms are bounded to preserve action semantics (max 10° rotation, 10% translation). Horizontal flip is excluded because it breaks left-right actions.
2. **Per-camera scaling**: different cameras tolerate different augmentation intensity. Top overview cameras get lighter augmentation than side context cameras.
3. **Symmetric augmentation**: all transforms are verified not to shift BatchNorm statistics (< 0.005 mean bias). Shadow, gradient, gamma, and vignette all have symmetric brighten/darken modes.
4. **Pure PyTorch**: zero cv2 dependency in the forward path. 44 methods at ~14ms per 480x640 image.
5. **Grouped sampling**: functional groups prevent unrealistic effect compounding while maximizing diversity.
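
The mean-bias idea behind principle 3 can be illustrated with a small, dependency-free sketch. This README does not describe how the package runs its check, so everything here is an assumption: the helper `mean_bias`, the two toy transforms, and the use of a flat list of intensities in place of an image tensor are all hypothetical.

```python
import random
from statistics import fmean

def mean_bias(transform, pixels, trials=2000, seed=0):
    """Average change in mean intensity caused by `transform` over many draws."""
    rng = random.Random(seed)
    base = fmean(pixels)
    shifts = [fmean(transform(pixels, rng)) - base for _ in range(trials)]
    return fmean(shifts)

def clamp01(value):
    """Keep an intensity inside the valid [0, 1] range."""
    return min(1.0, max(0.0, value))

def symmetric_brightness(pixels, rng, max_delta=0.2):
    """Brighten or darken with equal probability and magnitude."""
    delta = rng.uniform(-max_delta, max_delta)
    return [clamp01(p + delta) for p in pixels]

def darken_only(pixels, rng, max_delta=0.2):
    """One-sided variant: always darkens, so it shifts the mean."""
    delta = rng.uniform(-max_delta, 0.0)
    return [clamp01(p + delta) for p in pixels]

mid_gray = [0.5] * 64  # stand-in for a flattened image in [0, 1]
print(f"symmetric: {mean_bias(symmetric_brightness, mid_gray):+.4f}")
print(f"one-sided: {mean_bias(darken_only, mid_gray):+.4f}")
```

The symmetric transform should report a bias near zero, while the one-sided variant drifts toward roughly half of `max_delta`. A drift of that kind is what would shift the input statistics seen by normalization layers.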

## Speed

Benchmarked on a single Intel Xeon CPU core:

| Configuration | Time per image | Training throughput |
|---|---|---|
| Full 44-method pipeline (side, 1.0x) | ~14ms | ~2+ steps/s (3 cameras, batch 16) |
| Minimal preset (3 methods) | ~1ms | ~3+ steps/s |
| No augmentation | ~0ms | ~3.2 steps/s |
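
To reproduce per-image timings on other hardware, a generic timing helper along these lines may be enough. It is a minimal sketch, not the script used for the table above; the name `ms_per_call` and its `warmup` and `iters` parameters are hypothetical.

```python
import time

def ms_per_call(fn, arg, warmup=5, iters=50):
    """Return the mean wall-clock time of fn(arg) in milliseconds."""
    for _ in range(warmup):  # let caches and lazy initialization settle
        fn(arg)
    start = time.perf_counter()
    for _ in range(iters):
        fn(arg)
    return (time.perf_counter() - start) * 1000.0 / iters

# Hypothetical usage with a pipeline from this package:
#   torch.set_num_threads(1)  # single core, to match the table
#   print(ms_per_call(aug, torch.rand(3, 480, 640)))
```

Numbers will vary with CPU model, thread count, and image size, so compare configurations on the same machine.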

## Citation

```bibtex
@software{roboaugment2026,
    title={robo-augment: Online Image Augmentation for Robot Learning},
    author={Li, Yuxian},
    year={2026},
    url={https://github.com/Liyux3/robo-augment}
}
```

## License

Apache 2.0
