Metadata-Version: 2.4
Name: orbit-robotics
Version: 0.6.0
Summary: Data engineering copilot for robot imitation learning datasets
Author: Rahil Lasne
License-Expression: MIT
Project-URL: PyPI, https://pypi.org/project/orbit-robotics/
Keywords: robotics,machine-learning,data-quality,imitation-learning,lerobot
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0
Requires-Dist: rich>=13.0
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.10
Requires-Dist: huggingface-hub>=0.20
Requires-Dist: pyarrow>=14.0
Requires-Dist: scikit-learn>=1.3
Requires-Dist: httpx>=0.24
Requires-Dist: openai>=1.0
Provides-Extra: vision
Requires-Dist: torch>=2.0; extra == "vision"
Requires-Dist: transformers>=4.36; extra == "vision"
Requires-Dist: opencv-python>=4.8; extra == "vision"
Requires-Dist: decord>=0.6; extra == "vision"
Requires-Dist: Pillow>=10.0; extra == "vision"
Provides-Extra: vlm
Requires-Dist: google-generativeai>=0.5; extra == "vlm"
Provides-Extra: claude
Requires-Dist: anthropic>=0.25; extra == "claude"
Provides-Extra: assist
Requires-Dist: google-genai>=1.0; extra == "assist"
Provides-Extra: assist-claude
Requires-Dist: claude-code-sdk>=0.1; extra == "assist-claude"
Provides-Extra: rlds
Requires-Dist: tfrecord>=1.14; extra == "rlds"
Provides-Extra: hdf5
Requires-Dist: h5py>=3.8; extra == "hdf5"
Provides-Extra: rosbag
Requires-Dist: mcap>=1.1; extra == "rosbag"
Requires-Dist: mcap-ros2-support>=0.5; extra == "rosbag"
Provides-Extra: rosbag-ros1
Requires-Dist: rosbag>=1.16; extra == "rosbag-ros1"
Provides-Extra: formats
Requires-Dist: orbit-robotics[hdf5,rlds,rosbag]; extra == "formats"
Provides-Extra: sim
Requires-Dist: mujoco>=3.0; extra == "sim"
Provides-Extra: monitor
Requires-Dist: tbparse>=0.0.8; extra == "monitor"
Provides-Extra: config
Requires-Dist: pyyaml>=6.0; extra == "config"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: h5py>=3.8; extra == "dev"
Requires-Dist: pyyaml>=6.0; extra == "dev"
Provides-Extra: all
Requires-Dist: orbit-robotics[assist,config,formats,monitor,sim,vision,vlm]; extra == "all"
Dynamic: license-file

# ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. Get a quality grade (A-F) and calibrated success prediction based on 82+ ground-truth training outcomes.

## Quick Start

```bash
pip install orbit-robotics
orbit analyze lerobot/pusht
```

No GPU required. No API keys required. Works out of the box.

## What It Checks

- **Quality grade** (A-F) calibrated against real training outcomes
- **Success prediction** — probability your data will train a working policy
- **Dead joint detection** — finds servos that aren't moving
- **Action divergence** — detects contradictory demonstrations
- **Episode consistency** — flags recording issues and length outliers
- **Policy fit** — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
- **Workspace coverage** — checks if demonstrations cover the task space
- **Community comparison** — benchmarks against other public datasets

## Example Output

```
ORBIT Analysis: lerobot/pusht
206 episodes - DIFFUSION_POLICY

  Grade: A (98/100) — Ready to train — expect strong results
  Similar datasets trained at: 63%, 84%, 91%, 95%, 100% (5 nearest matches)

  1 issue found:
    ! Jerk varies across episodes (CV=0.57)

  Run with --proxy for training signal, --detail for full diagnostics
```

## Local AI — Free, Private, Zero Config

ORBIT v0.6 adds local AI support via **Ollama**. Deep analysis, AI grading, and the `orbit assist` chatbot all run locally with no API keys and no data leaving your machine.

```bash
# Install Ollama (one time)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4

# Now AI features just work — ORBIT auto-detects Ollama
orbit analyze lerobot/your-dataset --deep    # AI-powered deep analysis
orbit analyze lerobot/your-dataset --ai      # AI quality judge
orbit assist                                 # Interactive AI troubleshooter
```

ORBIT auto-detects the best available provider: **Ollama (local) > Gemini (cloud) > OpenAI (cloud)**. No configuration needed — it just picks whatever's available.

Check your AI setup anytime:

```bash
orbit setup-ai
```

### Provider Options

| Provider | Setup | Cost | Data Privacy |
|----------|-------|------|-------------|
| **Ollama** (recommended) | `ollama pull gemma4` | Free | Data stays local |
| **Gemini** | `export GOOGLE_API_KEY=...` | ~$0.001/analysis | Sent to Google |
| **OpenAI** | `export OPENAI_API_KEY=...` | ~$0.01/analysis | Sent to OpenAI |

Override the auto-detected provider:

```bash
orbit auth set ai-provider ollama      # Force Ollama
orbit auth set ai-model gemma4         # Pick a specific model
```

## Commands

| Command | What it does |
|---------|-------------|
| `orbit analyze <dataset>` | Full quality analysis with grade and predictions |
| `orbit suggest <dataset>` | Training command with tuned hyperparameters |
| `orbit clean <dataset>` | Remove bad episodes automatically |
| `orbit fix <dataset>` | Analyze, clean, and suggest in one shot |
| `orbit gate <dataset>` | CI/CD quality gate — pass/fail for pipelines |
| `orbit compare <a> <b>` | Side-by-side dataset comparison |
| `orbit benchmark <task>` | Compare against published training benchmarks |
| `orbit assist` | AI troubleshooter for data and training issues |
| `orbit doctor` | Check environment health and AI providers |
| `orbit setup-ai` | Check and configure AI providers |
| `orbit badge <dataset>` | Generate a shields.io quality badge |
| `orbit explore` | Browse and discover LeRobot datasets on HuggingFace |

## Common Options

```bash
orbit analyze lerobot/my-dataset --policy act       # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep              # AI-powered deep analysis
orbit analyze lerobot/my-dataset --ai                # AI quality judge for A grades
orbit analyze lerobot/my-dataset --json              # Machine-readable output
orbit analyze lerobot/my-dataset --full              # All episodes (no sampling)
orbit analyze lerobot/my-dataset --detail            # Full diagnostic report
orbit -q analyze lerobot/my-dataset                  # Quiet mode (grade only)
orbit analyze ./local-data/ --format hdf5            # Local HDF5 files
```

## CI/CD Integration

```bash
# Fail the pipeline if data quality is below B
orbit gate lerobot/my-dataset --min-grade B --policy act

# JSON output for scripts
orbit analyze lerobot/my-dataset --json | jq '.readiness.grade'
```
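For scripted gating without `jq`, the same check can be done in a few lines of Python. The payload below is a hypothetical stub matching only the `readiness.grade` path shown above; the full JSON schema is not documented here:

```python
import json

# Stub payload; only "readiness.grade" is taken from the example above,
# the other fields are illustrative assumptions.
report = json.loads('{"dataset": "lerobot/my-dataset", "readiness": {"grade": "B"}}')

GRADE_ORDER = "FDCBA"  # worst to best

def passes_gate(report: dict, min_grade: str = "B") -> bool:
    """Return True if the report's grade meets the minimum grade."""
    grade = report["readiness"]["grade"]
    return GRADE_ORDER.index(grade) >= GRADE_ORDER.index(min_grade)

print(passes_gate(report, "B"))  # True
```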

## Supported Formats

| Format | Source |
|--------|--------|
| LeRobot (Hub) | HuggingFace datasets (`lerobot/...`) |
| LeRobot (local) | Local LeRobot directories |
| HDF5 | RoboMimic, robosuite, custom `.hdf5` files |
| RLDS | TFRecord-based datasets (`pip install orbit-robotics[rlds]`) |
| ROS bags | `.bag` and `.mcap` files (`pip install orbit-robotics[rosbag]`) |

## Understanding Grades

| Grade | Score | Meaning |
|-------|-------|---------|
| **A** | 85-100 | Ready to train — expect strong results |
| **B** | 72-84 | Good data — minor issues, should train well |
| **C** | 58-71 | Usable but has problems — clean first |
| **D** | 40-57 | Significant issues — collect more or better data |
| **F** | 0-39 | Critical problems — fix before training |

Grades are calibrated against 82 real datasets with known training outcomes.
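The score bands in the table map to letters straightforwardly. A direct transcription of the table (ORBIT computes the score itself; this only shows the banding):

```python
def grade_for_score(score: int) -> str:
    """Map a 0-100 quality score to a letter grade per the bands above."""
    if score >= 85:
        return "A"
    if score >= 72:
        return "B"
    if score >= 58:
        return "C"
    if score >= 40:
        return "D"
    return "F"

print(grade_for_score(98), grade_for_score(65))  # A C
```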

## What's New in v0.6.0

- **Local AI via Ollama** — `--deep`, `--ai`, and `orbit assist` work locally with Gemma 4, no API keys needed
- **Multi-provider AI** — auto-detects Ollama > Gemini > OpenAI, configurable via `orbit auth set ai-provider`
- **`orbit setup-ai`** — check and configure AI providers in one command
- **Grading accuracy overhaul** — ground-truth task-difficulty adjustment, BC/BC-RNN divergence penalties, and calibrated summaries for hard tasks
- **Similar dataset display** — shows nearest benchmark success rates instead of wide confidence intervals
- **Sampling warnings** — warns that the grade may be unreliable when fewer than 20 episodes are analyzed
- **`orbit doctor`** — now checks Ollama, Gemini, and OpenAI availability
- **Ollama backend for `orbit assist`** — chat with local models about your data

## Requirements

- Python 3.10+
- No GPU needed
- No API keys needed for core analysis

## License

MIT
