Metadata-Version: 2.4
Name: trainything
Version: 0.1
Summary: A visual graph editor for building and running multi-stage ML training pipelines
Project-URL: Homepage, https://trainything.ai
Project-URL: Repository, https://github.com/kvark/trainything
Project-URL: Issues, https://github.com/kvark/trainything/issues
Author: Dzmitry Malyshau
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: cryptography>=42
Requires-Dist: datasets>=4.0
Requires-Dist: fastapi<1,>=0.110
Requires-Dist: gymnasium<2,>=1.1
Requires-Dist: minari[hdf5]<1,>=0.5
Requires-Dist: numpy>=1.24
Requires-Dist: pillow<12,>=10
Requires-Dist: pydantic<3,>=2.6
Requires-Dist: scikit-learn<2,>=1.4
Requires-Dist: torch>=2.2
Requires-Dist: torchvision>=0.17
Requires-Dist: transformers>=4.45
Requires-Dist: uvicorn[standard]<1,>=0.29
Requires-Dist: websockets<14,>=12
Provides-Extra: all
Requires-Dist: accelerate>=0.27; extra == 'all'
Requires-Dist: ale-py<1,>=0.8; extra == 'all'
Requires-Dist: diffusers>=0.27; extra == 'all'
Requires-Dist: gym-aloha>=0.1.2; extra == 'all'
Requires-Dist: gymnasium[atari]<2,>=1.1; extra == 'all'
Requires-Dist: gymnasium[box2d]<2,>=1.1; extra == 'all'
Requires-Dist: lerobot>=0.4; extra == 'all'
Requires-Dist: stable-retro>=0.9; extra == 'all'
Provides-Extra: atari
Requires-Dist: ale-py<1,>=0.8; extra == 'atari'
Requires-Dist: gymnasium[atari]<2,>=1.1; extra == 'atari'
Provides-Extra: box2d
Requires-Dist: gymnasium[box2d]<2,>=1.1; extra == 'box2d'
Provides-Extra: dev
Requires-Dist: httpx<1,>=0.27; extra == 'dev'
Requires-Dist: pytest-asyncio<1,>=0.23; extra == 'dev'
Requires-Dist: pytest<9,>=8; extra == 'dev'
Requires-Dist: ruff<1,>=0.3; extra == 'dev'
Provides-Extra: diffusers
Requires-Dist: accelerate>=0.27; extra == 'diffusers'
Requires-Dist: diffusers>=0.27; extra == 'diffusers'
Provides-Extra: lerobot
Requires-Dist: gym-aloha>=0.1.2; extra == 'lerobot'
Requires-Dist: lerobot>=0.4; extra == 'lerobot'
Provides-Extra: retro
Requires-Dist: stable-retro>=0.9; extra == 'retro'
Description-Content-Type: text/markdown

<p align="center">
  <img src="docs/logo.svg" alt="Trainything" width="340" />
</p>

<p align="center">
  <em>A visual graph editor for composing and running multi-stage ML training pipelines.</em>
</p>

<p align="center">
  <a href="https://github.com/kvark/trainything/actions/workflows/ci.yml"><img src="https://github.com/kvark/trainything/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
</p>

---

Trainything lets you compose training pipelines visually: drop a dataset, drop a model, wire them together, configure, hit **Start**. Watch it train. Drag the checkpoint into the next stage. The goal is not to beat a hand-tuned Python script -- it's to make the entire multi-stage process *legible*, *explorable*, and *fast to iterate on*.

The primary audience is people learning how modern ML training works: how vision encoders connect to language models, how VLAs consume demonstration data, how RL rollouts feed back into training. If you can see the graph, you can build intuition.

## Graph Model

The graph is **bipartite** (Petri-net style): **artifact nodes** (ovals) hold passive data at rest, **process nodes** (rectangles) run active computation. Every data edge crosses the artifact/process boundary -- never same-type to same-type.

A second edge type -- **module edges** -- wires process nodes together without implying execution order. For example, a Score node wired to a Train node provides a scoring function that Train calls internally during its loop.
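The bipartite rule is easy to check mechanically. A minimal sketch of that check -- the node/edge representation below is illustrative, not Trainything's actual graph schema:

```python
# Hypothetical node/edge representation -- illustrates the bipartite rule,
# not Trainything's actual schema.
ARTIFACTS = {"dataset", "model", "environment", "metrics"}
PROCESSES = {"train", "evaluate", "transform", "score"}

def validate_data_edges(nodes: dict, data_edges: list) -> bool:
    """Every data edge must cross the artifact/process boundary."""
    for src, dst in data_edges:
        src_is_artifact = nodes[src] in ARTIFACTS
        dst_is_artifact = nodes[dst] in ARTIFACTS
        if src_is_artifact == dst_is_artifact:
            raise ValueError(f"edge {src} -> {dst} connects two same-type nodes")
    return True

nodes = {"cifar10": "dataset", "resnet": "model", "fit": "train", "ckpt": "model"}
validate_data_edges(nodes, [("cifar10", "fit"), ("resnet", "fit"), ("fit", "ckpt")])
```

Module edges are exempt from this rule by design: they connect two process nodes on purpose.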

<!-- TODO: replace with actual screenshot once the UI is stable -->
![Graph screenshot](docs/graph-screenshot.png)

### Node types

| Artifacts | Processes |
|-----------|-----------|
| Dataset | Train (supervised, SFT, behavior cloning, PPO, GRPO, GRPO-text, DQN, diffusion) |
| Model | Evaluate (classify, generate text, generate images) |
| Environment | Transform (filter, split, map, tokenize, augment) |
| Metrics | Score (episode reward, task success, format/correctness reward) |

### What you can build

**Supervised learning** — Image classification, regression, fine-tuning pretrained vision models.
`CIFAR-10 → ResNet-18 → Train (supervised) → Evaluate`

**LLM pre-training** — Train a GPT-2 from scratch on text data with next-token prediction.
`Tiny Shakespeare → Tokenize → GPT-2 (from config) → Train (SFT) → Generate`

**LLM fine-tuning** — SFT or LoRA on instruction/chat datasets.
`Dataset → Model (LoRA) → Train (SFT) → Evaluate (generate)`

**R1-style reasoning** — Teach a language model to reason step-by-step using GRPO with correctness and format rewards.
`GSM8K → GPT-2 → Train (GRPO-text) → Generate Answers`

**Image diffusion** — Train a DDPM denoising model from scratch, then generate samples.
`CIFAR-10 → UNet2D → Train (diffusion) → Evaluate (generate_images)`

**Image generation** — Run inference with pretrained diffusion pipelines (Stable Diffusion, etc.).
`Model (diffusion) → Evaluate (generate_images) → Generated Images`

**RL control** — DQN, PPO, or GRPO on gymnasium environments (CartPole, LunarLander, Atari, MuJoCo).
`Environment → Score → Train (PPO/DQN) ← Model (MLP/CNN)`

**Multi-stage RL** — Behavior cloning → GRPO → PPO, with checkpoint warmstarting between stages.
`Demos → BC → GRPO warmstart → PPO polish`

**Imitation learning** — Clone expert demonstrations into a policy network.
`Dataset (demos) → Model (MLP) → Train (behavior_cloning) → Evaluate`

**Robot learning** — Evaluate vision-language-action models (LeRobot/SmolVLA) on manipulation tasks.
`Model (lerobot) → Evaluate → Metrics`
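Each of the recipes above is just a graph of configured nodes. As a rough illustration of what such a graph carries (field names here are hypothetical -- the real `examples/*.json` format may differ), the supervised example could be described as:

```python
# Hypothetical, simplified pipeline spec mirroring the supervised example
# (CIFAR-10 -> ResNet-18 -> Train -> Evaluate). Field names are illustrative;
# the real examples/*.json format may differ.
pipeline = {
    "nodes": [
        {"id": "data", "kind": "dataset",  "config": {"name": "cifar10"}},
        {"id": "net",  "kind": "model",    "config": {"arch": "resnet18"}},
        {"id": "fit",  "kind": "train",    "config": {"mode": "supervised", "epochs": 5}},
        {"id": "ckpt", "kind": "model",    "config": {}},
        {"id": "eval", "kind": "evaluate", "config": {"mode": "classify"}},
    ],
    "edges": [
        ["data", "fit"],   # dataset feeds the training process
        ["net", "fit"],    # initial weights feed the training process
        ["fit", "ckpt"],   # training emits a checkpoint artifact
        ["ckpt", "eval"],  # the checkpoint feeds evaluation
        ["data", "eval"],  # evaluation reads held-out data
    ],
}

kinds = {n["id"]: n["kind"] for n in pipeline["nodes"]}
```

Note how every edge goes artifact-to-process or process-to-artifact, matching the bipartite model above.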

## Getting Started

Requires Docker or Podman.

```bash
make build   # build the container image (CUDA PyTorch by default)
make run     # auto-detects GPU (NVIDIA, AMD, or CPU-only)
```

Open `http://localhost:8000` in your browser.

For AMD ROCm or CPU-only PyTorch, rebuild with a different torch index:

```bash
make build TORCH_INDEX=https://download.pytorch.org/whl/rocm6.2   # AMD
make build TORCH_INDEX=https://download.pytorch.org/whl/cpu        # CPU
```

### Install from PyPI

```bash
pip install trainything
trainything serve
```

To install with all optional environments (Atari, retro, diffusers, LeRobot, etc.):

```bash
pip install "trainything[all]"
```

For NVIDIA GPU support, install PyTorch with CUDA first:

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
pip install trainything
```

On macOS (Apple Silicon), PyTorch automatically uses the MPS backend -- no extra steps needed.
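Under the hood, device selection reduces to a simple preference order: CUDA, then MPS, then CPU. A sketch of that logic as a pure function -- in a real install the flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`; they are parameters here so the sketch runs without PyTorch:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then CPU.

    In practice the flags come from torch.cuda.is_available() and
    torch.backends.mps.is_available(); they are plain parameters here
    so this sketch runs without PyTorch installed.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

print(pick_device(False, True))  # Apple Silicon -> "mps"
```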

### Development setup

```bash
pip install -e ".[dev]"
cd frontend && npm ci && npm run build && cd ..
trainything serve --dev
```

### Headless execution

```bash
trainything run examples/mnist_train.json
trainything run --dry-run examples/lunar_lander_ppo.json
```

## Architecture

- **Frontend:** React + [React Flow](https://reactflow.dev/) for the node editor. Side panel for node configuration, bottom panel for live logs and metrics.
- **Backend:** Python + FastAPI. REST for graph CRUD, WebSocket for streaming progress events.
- **Execution:** PyTorch, running in-process via `asyncio.to_thread` so long-running node work doesn't block the server's event loop.
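The `asyncio.to_thread` pattern keeps blocking PyTorch work off the event loop while the server stays free to stream progress events. A minimal, dependency-free sketch of the idea (the names are illustrative, not Trainything's actual internals):

```python
import asyncio
import time

def execute(steps: int) -> dict:
    """Stand-in for a process node's blocking execute() -- e.g. a training loop."""
    for _ in range(steps):
        time.sleep(0.01)  # simulate one training step
    return {"loss": 0.123, "steps": steps}

async def run_node() -> dict:
    # The blocking call runs on a worker thread; the event loop stays
    # free to serve WebSocket progress events in the meantime.
    return await asyncio.to_thread(execute, steps=5)

result = asyncio.run(run_node())
print(result["steps"])  # -> 5
```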

### Why Python backend

The entire ecosystem this tool orchestrates is Python: PyTorch, HuggingFace `transformers`, `datasets`, tokenizers, schedulers. A Rust backend would mean FFI or subprocess management for every node, and would break the key educational feature: **"View Code."** Each node's `execute()` method should be the same code a user would write by hand.

A Rust inference node (e.g. via [Meganeura](https://github.com/nickvonkaenel/meganeura)) can exist as a node type that calls out to a Rust binary -- the backend doesn't prevent this.

## Acknowledgments

Inspired by [ComfyUI](https://github.com/comfyanonymous/ComfyUI)'s node-based approach to ML pipelines, [Kedro](https://kedro.org/)'s data-centric pipeline model, and the general frustration of managing multi-stage training with shell scripts and notebooks.
