Metadata-Version: 2.4
Name: orbit-robotics
Version: 0.5.4
Summary: Data engineering copilot for robot imitation learning datasets
Author: Rahil Lasne
License-Expression: MIT
Project-URL: PyPI, https://pypi.org/project/orbit-robotics/
Keywords: robotics,machine-learning,data-quality,imitation-learning,lerobot
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0
Requires-Dist: rich>=13.0
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.10
Requires-Dist: huggingface-hub>=0.20
Requires-Dist: pyarrow>=14.0
Requires-Dist: scikit-learn>=1.3
Requires-Dist: httpx>=0.24
Provides-Extra: vision
Requires-Dist: torch>=2.0; extra == "vision"
Requires-Dist: transformers>=4.36; extra == "vision"
Requires-Dist: opencv-python>=4.8; extra == "vision"
Requires-Dist: decord>=0.6; extra == "vision"
Requires-Dist: Pillow>=10.0; extra == "vision"
Provides-Extra: vlm
Requires-Dist: google-generativeai>=0.5; extra == "vlm"
Provides-Extra: claude
Requires-Dist: anthropic>=0.25; extra == "claude"
Provides-Extra: assist
Requires-Dist: google-genai>=1.0; extra == "assist"
Provides-Extra: assist-claude
Requires-Dist: claude-code-sdk>=0.1; extra == "assist-claude"
Provides-Extra: rlds
Requires-Dist: tfrecord>=1.14; extra == "rlds"
Provides-Extra: hdf5
Requires-Dist: h5py>=3.8; extra == "hdf5"
Provides-Extra: rosbag
Requires-Dist: mcap>=1.1; extra == "rosbag"
Requires-Dist: mcap-ros2-support>=0.5; extra == "rosbag"
Provides-Extra: rosbag-ros1
Requires-Dist: rosbag>=1.16; extra == "rosbag-ros1"
Provides-Extra: formats
Requires-Dist: orbit-robotics[hdf5,rlds,rosbag]; extra == "formats"
Provides-Extra: sim
Requires-Dist: mujoco>=3.0; extra == "sim"
Provides-Extra: monitor
Requires-Dist: tbparse>=0.0.8; extra == "monitor"
Provides-Extra: config
Requires-Dist: pyyaml>=6.0; extra == "config"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: h5py>=3.8; extra == "dev"
Requires-Dist: pyyaml>=6.0; extra == "dev"
Provides-Extra: all
Requires-Dist: orbit-robotics[assist,config,formats,monitor,sim,vision,vlm]; extra == "all"
Dynamic: license-file

# ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. Get a quality grade (A-F) and calibrated success prediction based on 82+ ground-truth training outcomes.

## Quick Start

```bash
pip install orbit-robotics
orbit analyze lerobot/pusht
```

No GPU required. No API keys required. Works out of the box.

## What It Checks

- **Quality grade** (A-F) calibrated against real training outcomes
- **Success prediction** — probability your data will train a working policy
- **Dead joint detection** — finds servos that aren't moving
- **Action divergence** — detects contradictory demonstrations
- **Episode consistency** — flags recording issues and length outliers
- **Policy fit** — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
- **Workspace coverage** — checks if demonstrations cover the task space
- **Community comparison** — benchmarks against other public datasets
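
To make one of these checks concrete: a dead-joint detector boils down to flagging joints whose position signal barely changes over an episode. The sketch below is illustrative only; `find_dead_joints` and its threshold are not ORBIT's actual API.

```python
import numpy as np

def find_dead_joints(positions, motion_threshold=1e-4):
    """Return indices of joints that barely move across an episode.

    positions: (T, J) array of joint positions over T frames.
    A joint whose standard deviation falls below the threshold is
    treated as dead (the servo is effectively not moving).
    """
    return np.flatnonzero(positions.std(axis=0) < motion_threshold).tolist()

# Toy episode: joint 0 sweeps through a trajectory, joint 1 is stuck at 0.7.
episode = np.stack([np.sin(np.linspace(0, 3, 100)), np.full(100, 0.7)], axis=1)
print(find_dead_joints(episode))  # [1]
```

ORBIT's real checks layer more signal-health heuristics on top, but the shape of the problem is the same: reduce each joint's trajectory to a statistic and compare it to a calibrated threshold.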

## Example Output

```
Dataset Readiness: B+ (score: 78/100)
Good data — minor issues, should train well

  ✓ High consistency (0.95)
  ✓ Sufficient episodes (200) for diffusion_policy
  ✓ Good coverage (0.84)
  ✗ 2 joints clipping (>10% of frames)

Top action: Fix joint clipping before training

YOUR DATA AT A GLANCE
────────────────────────────────────────
  Episodes:       200     (top 25%)
  Coverage:       0.84  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░
  Signal Health:  0.92  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░
```

(Illustrative — actual output depends on your dataset.)

## Optional: AI-Powered Assessment

For deeper analysis using vision-language models:

```bash
pip install "orbit-robotics[vlm]"
export GOOGLE_API_KEY=your-key
orbit analyze lerobot/your-dataset --deep
```

Uses Gemini for VLM-based visual assessment. The core statistical analysis works fully without any API key.

You can also use Claude as an alternative AI provider:

```bash
pip install "orbit-robotics[claude]"
export ANTHROPIC_API_KEY=your-key
```

## Commands

| Command | What it does |
|---------|-------------|
| `orbit analyze <dataset>` | Full quality analysis with grade and predictions |
| `orbit benchmark <task>` | Compare against published training benchmarks |
| `orbit assist` | AI troubleshooter for data and training issues |
| `orbit suggest <dataset>` | Training command with tuned hyperparameters |
| `orbit clean <dataset>` | Remove bad episodes automatically |
| `orbit fix <dataset>` | Analyze, clean, and suggest in one shot |

## Common Options

```bash
orbit analyze lerobot/my-dataset --policy act   # Check fit for a specific policy
orbit analyze lerobot/my-dataset --deep         # AI-powered deep analysis (needs GOOGLE_API_KEY)
orbit analyze lerobot/my-dataset --json         # Machine-readable output
orbit analyze lerobot/my-dataset --full         # All episodes (no sampling)
orbit analyze ./local-data/ --format hdf5       # Local HDF5 files
```
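
The `--json` flag makes the report easy to consume from scripts, e.g. in CI gates on data quality. A minimal sketch of pulling out the headline numbers; the field names (`grade`, `score`) are assumptions about the schema, not documented guarantees, so check them against your actual output:

```python
import json

def summarize_report(json_text):
    """Condense an `orbit analyze --json` report to a one-line summary.

    Assumes top-level "grade" and "score" fields; adjust the keys to
    match the schema your orbit version actually emits.
    """
    report = json.loads(json_text)
    return f"{report['grade']} ({report['score']}/100)"

sample = '{"grade": "B+", "score": 78}'
print(summarize_report(sample))  # B+ (78/100)
```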

## Supported Formats

| Format | Source |
|--------|--------|
| LeRobot (Hub) | HuggingFace datasets (`lerobot/...`) |
| LeRobot (local) | Local LeRobot directories |
| HDF5 | RoboMimic, robosuite, custom `.hdf5` files |
| RLDS | TFRecord-based datasets (`pip install "orbit-robotics[rlds]"`) |
| ROS bags | `.bag` and `.mcap` files (`pip install "orbit-robotics[rosbag]"`) |

## Understanding Grades

| Grade | Score | Meaning |
|-------|-------|---------|
| **A** | 85-100 | Ready to train — expect strong results |
| **B** | 72-84 | Good data — minor issues, should train well |
| **C** | 58-71 | Usable but has problems — clean first |
| **D** | 40-57 | Significant issues — collect more or better data |
| **F** | 0-39 | Critical problems — fix before training |

Grades are calibrated against 82 real datasets with known training outcomes.
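
If you need the same banding in a script, the table above maps to a few comparisons. This is just a restatement of the published score bands, not ORBIT's internal code:

```python
def grade_for(score):
    """Map a 0-100 readiness score to the letter bands in the table above."""
    for cutoff, grade in [(85, "A"), (72, "B"), (58, "C"), (40, "D")]:
        if score >= cutoff:
            return grade
    return "F"

print(grade_for(78))  # B
```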

## Requirements

- Python 3.10+
- No GPU needed
- No API keys needed for core analysis

## License

MIT
