Metadata-Version: 2.4
Name: ros-agent
Version: 0.9.0
Summary: MCP agent layer for robots — 49 tools, skills, fleet, vision AI for any ROS 2 robot
Author-email: Pablo <pablo@hashingsystems.com>
License: MIT
Project-URL: Homepage, https://github.com/publu/RoboRun
Project-URL: Repository, https://github.com/publu/RoboRun
Keywords: robotics,ros,mcp,agent,fleet,yolo,skills
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: websocket-client>=1.6
Requires-Dist: websockets>=12.0
Provides-Extra: vision
Requires-Dist: ultralytics>=8.0; extra == "vision"
Requires-Dist: opencv-python>=4.8; extra == "vision"
Requires-Dist: numpy>=1.24; extra == "vision"
Requires-Dist: Pillow>=10.0; extra == "vision"
Requires-Dist: torch>=2.0; extra == "vision"
Requires-Dist: torchvision>=0.15; extra == "vision"
Requires-Dist: open-clip-torch>=2.20; extra == "vision"
Provides-Extra: cosmos
Requires-Dist: cosmos-tokenizer>=0.1; extra == "cosmos"
Provides-Extra: depth
Requires-Dist: transformers>=4.30; extra == "depth"
Provides-Extra: jepa
Requires-Dist: timm>=0.9; extra == "jepa"
Provides-Extra: dimos
Requires-Dist: dimos; extra == "dimos"
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.8; extra == "gemini"
Provides-Extra: zk
Requires-Dist: ezkl>=10.0; extra == "zk"
Requires-Dist: onnx>=1.14; extra == "zk"
Provides-Extra: ros
Requires-Dist: ros_tap>=0.1; extra == "ros"
Requires-Dist: cyclonedds>=0.10; extra == "ros"
Provides-Extra: all
Requires-Dist: ros-agent[cosmos,depth,dimos,gemini,jepa,ros,vision,zk]; extra == "all"
Dynamic: license-file

<p align="center">
  <img src="assets/demo-thumb.jpg" alt="ros-agent" width="100%">
</p>

# ros-agent

<p align="center">
  <a href="https://pypi.org/project/ros-agent/"><img src="https://img.shields.io/pypi/v/ros-agent?style=for-the-badge&color=00d47e&label=PyPI" alt="PyPI"></a>
  <a href="https://python.org"><img src="https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white" alt="Python"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-green?style=for-the-badge" alt="License: MIT"></a>
  <a href="https://modelcontextprotocol.io"><img src="https://img.shields.io/badge/MCP_Tools-49-00d47e?style=for-the-badge" alt="MCP Tools"></a>
  <a href="https://ros.org"><img src="https://img.shields.io/badge/ROS_2-Supported-22314E?style=for-the-badge&logo=ros&logoColor=white" alt="ROS 2"></a>
  <a href="https://github.com/publu/RoboRun"><img src="https://img.shields.io/github/stars/publu/RoboRun?style=for-the-badge" alt="Stars"></a>
</p>

**The MCP agent layer for robots.** 49 tools that let Claude, Cursor, or any MCP client control a real ROS 2 robot. Plugin skills for autonomous behaviors. Real-time vision (YOLO + CLIP + JEPA). MuJoCo simulation. Fleet management. Two pip dependencies in the core — no ROS install required on your machine.

Use any AI client you want — [Claude Desktop](https://claude.ai), [Claude Code](https://claude.ai/code), [Cursor](https://cursor.com), or any MCP-compatible host. Add one line to your config and your AI gets camera feeds, velocity control, sensor data, autonomous patrol, person following, object search, and full ROS 2 introspection. Switch robots by changing an IP address.

<table>
<tr><td><b>MCP-native from the ground up</b></td><td>49 tools, 8 guided prompts, 6 live resources, and a topic template — all exposed through standard MCP. HTTP+SSE and stdio transports. Works with any client that speaks the protocol.</td></tr>
<tr><td><b>Skills plugin system</b></td><td>5 built-in skills (compose, inspect, follow_me, patrol, scan_detect). Write your own in 10 lines of Python. 4 loading paths: built-in, pip packages, filesystem, project config. Every skill becomes MCP tools automatically.</td></tr>
<tr><td><b>Real-time vision stack</b></td><td>YOLO object detection + tracking, CLIP zero-shot search ("find the red cup"), JEPA self-supervised attention heatmaps, Cosmos 3 world model (16B, MLX 4-bit on Mac). Toggle models live from the UI.</td></tr>
<tr><td><b>Any ROS 2 robot, no ROS needed</b></td><td>Connects over rosbridge WebSocket, so no ROS installation is needed on the host. Also supports direct DDS via CycloneDDS for low-latency local transport. Tested with Unitree Go2, G1, TurtleBot, drones, arms.</td></tr>
<tr><td><b>Built-in Claude + Gemini agent</b></td><td>Streaming tool use with dynamic ROS context injection. Safety velocity clamping. Persistent cross-session memory. Behavioral identity via SOUL.md.</td></tr>
<tr><td><b>Simulation and fleet</b></td><td>MuJoCo headless physics with trained ONNX locomotion policies. Fleet dashboard for multi-robot management. Blueprint system for robot configurations. CLIP-indexed spatial memory with geo-search.</td></tr>
</table>

---

## Quick Install

```bash
pip install ros-agent
ros-agent
```

That's it. Browser opens at `http://127.0.0.1:8765`. Webcam starts with live detection.

### From source

```bash
git clone https://github.com/publu/RoboRun.git
cd RoboRun
pip install -e .
ros-agent
```

### npx (Node 18+)

```bash
npx ros-agent
```

### Optional extras

```bash
pip install "ros-agent[vision]"   # YOLO + CLIP + OpenCV
pip install "ros-agent[ros]"      # Direct DDS (CycloneDDS)
pip install "ros-agent[gemini]"   # Gemini agent
pip install "ros-agent[all]"      # Everything
```

---

## Connect Your AI

### Claude Desktop / Cursor (HTTP)

```json
{
  "mcpServers": {
    "ros-agent": {
      "type": "http",
      "url": "http://localhost:8765/mcp"
    }
  }
}
```

### Claude Code / CLI clients (stdio)

```json
{
  "mcpServers": {
    "ros-agent": {
      "command": "ros-agent-mcp"
    }
  }
}
```

Your AI immediately gets 49 tools for robot control, 8 guided prompts for common workflows, and 6 live resources for introspection.
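
MCP clients talk to the server in JSON-RPC 2.0, using methods such as `tools/list` and `tools/call`. The sketch below only builds the request objects a client would send after its initialization handshake; it does not contact the server. The helper name `jsonrpc_request` is hypothetical, and calling `robot_brief` with empty arguments is an assumption, since its input schema is not documented here.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # each request in a session gets a fresh id


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object of the kind MCP uses."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; the empty arguments dict is an assumption.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "robot_brief", "arguments": {}},
)

print(json.dumps(call_tool, indent=2))
```

In practice an MCP client library handles this framing for you; the sketch is only meant to show what travels over the HTTP or stdio transport.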

---

## What Your AI Gets

### 49 Tools

30 built-in ROS tools (topic discovery, pub/sub, service calls, action goals, parameter management, camera snapshots, depth images, velocity commands, node/topic/service/action introspection) plus 19 skill tools from 5 built-in skills.

### 8 Prompts

| Prompt | What it does |
|--------|-------------|
| `explore-robot` | Guided discovery of a robot's capabilities |
| `safety-check` | Pre-operation safety verification |
| `environment-scan` | Full environment survey with camera + sensors |
| `teach-waypoints` | Interactive waypoint teaching for patrol |
| `debug-topic` | Step-by-step topic diagnosis |
| `quick-start` | First-time onboarding in 60 seconds |
| `fleet-sweep` | Multi-robot status check |
| `build-workflow` | Create a reusable tool chain |

### 6 Resources

| Resource | Description |
|----------|-------------|
| `ros-agent://server-info` | Server version, uptime, capabilities |
| `ros-agent://skills` | Loaded skills and their tools |
| `ros-agent://ros-graph` | Live ROS topic/service/node graph |
| `ros-agent://workflows` | Saved compose workflows |
| `ros-agent://prompts-catalog` | All available prompts |
| `ros-agent://soul` | Agent behavioral identity |
| `ros-agent://topic/{path}` | Live read from any ROS topic (template) |
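
The last row is a URI template, not a fixed resource: the `{path}` part selects the topic. The sketch below shows one way a client or server could map such a URI back to a ROS topic name. The function name `topic_from_uri` is hypothetical, and the rule that `{path}` omits the leading slash is an assumption, since the exact mapping is not specified here.

```python
from typing import Optional

TOPIC_PREFIX = "ros-agent://topic/"


def topic_from_uri(uri: str) -> Optional[str]:
    """Return the ROS topic named by a topic-template URI, or None."""
    if not uri.startswith(TOPIC_PREFIX):
        return None  # a fixed resource such as ros-agent://skills
    path = uri[len(TOPIC_PREFIX):]
    if not path:
        return None  # template used without a topic
    # Assumption: {path} has no leading slash, so restore it here.
    return "/" + path.lstrip("/")


print(topic_from_uri("ros-agent://topic/camera/image_raw"))
```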

---

## Skills

Plugin-based. Drop a Python file, get new MCP tools.

```
Built-in                            → ships with ros-agent
ROBORUN_SKILL_PACKAGES=pkg1,pkg2    → pip packages
ROBORUN_SKILL_PATHS=/path/to/dir    → filesystem directories
.roborun/skills.yaml                → project-level config
```

### Built-in Skills

| Skill | Tools | What it does |
|-------|:-----:|-------------|
| **compose** | 5 | Chain tools into reusable workflows — `run_sequence`, `save_workflow`, `run_workflow` |
| **inspect** | 3 | `robot_brief` (one-call overview), `watch_topic` (conditional monitor), `diff_state` (graph changes) |
| **follow_me** | 1 | Visual person-following with P-control on camera feed |
| **patrol** | 5 | Autonomous waypoint patrol loop with configurable dwell times |
| **scan_detect** | 2 | Rotate-and-detect object search using YOLO + CLIP fallback |
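
To illustrate what "P-control on camera feed" means for `follow_me`, here is a minimal sketch of a proportional controller that turns the horizontal offset of a detected person into an angular velocity. It is not the skill's actual implementation: the function name, the `gain` parameter, and the normalization are assumptions. The default limit of 1.5 rad/s mirrors the documented default of `ROBORUN_MAX_ANGULAR_VEL`.

```python
def follow_angular_velocity(bbox_center_x: float, image_width: int,
                            gain: float = 1.0,
                            max_angular: float = 1.5) -> float:
    """Proportional steering command from a bounding-box centre."""
    half_width = image_width / 2
    # Normalised error in [-1, 1]; 0 means the person is centred.
    error = (bbox_center_x - half_width) / half_width
    # ROS convention: positive angular z turns left, so a target to the
    # right of centre (positive error) needs a negative command.
    command = -gain * error
    # Clamp to the safety limit.
    return max(-max_angular, min(max_angular, command))


# Person at the right edge of a 640-pixel-wide frame: turn right.
print(follow_angular_velocity(640, 640))
```

A pure P-controller like this is simple to tune but can oscillate at high gain, which is why the output is clamped.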

### Write Your Own

```python
SKILL_TOOLS = [
    {"name": "my_tool", "description": "Does a thing",
     "inputSchema": {"type": "object", "properties": {"x": {"type": "string"}}}}
]

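def handle(name: str, args: dict) -> str:
    if name == "my_tool":
        return f"Got: {args.get('x')}"
    # Unknown names get an explicit message instead of an implicit None.
    return f"Unknown tool: {name}"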
```

Point `ROBORUN_SKILL_PATHS` at the directory. Done.
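
How ros-agent discovers skill files internally is not described here. The following is a minimal sketch of the general idea, assuming each `.py` file in the directory exposes `SKILL_TOOLS` and `handle` as in the example above. The names `load_skills` and `call_tool` are hypothetical and are not part of the ros-agent API.

```python
import importlib.util
from pathlib import Path


def load_skills(directory: str) -> dict:
    """Map each declared tool name to the handle() of its module."""
    registry = {}
    for path in sorted(Path(directory).glob("*.py")):
        # Import the file as a standalone module.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        handler = getattr(module, "handle", None)
        if not callable(handler):
            continue  # not a skill module
        for tool in getattr(module, "SKILL_TOOLS", []):
            registry[tool["name"]] = handler
    return registry


def call_tool(registry: dict, name: str, args: dict) -> str:
    """Dispatch a tool call to whichever skill declared it."""
    if name not in registry:
        return f"Unknown tool: {name}"
    return registry[name](name, args)
```

With the example skill saved as a file in `/path/to/dir`, `call_tool(load_skills("/path/to/dir"), "my_tool", {"x": "hi"})` would return `"Got: hi"`.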

---

## Connect a Robot

ros-agent connects to any robot running `rosbridge_server` over WebSocket. No ROS installation needed on your machine.

```bash
# On the robot
ros2 launch rosbridge_server rosbridge_websocket_launch.xml

# From ros-agent — set IP in the UI, or:
curl -X POST http://localhost:8765/api/ros/connect \
  -H 'Content-Type: application/json' \
  -d '{"host":"192.168.1.100"}'
```
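
The same connect call can be made from Python with only the standard library. This sketch builds the request for the `/api/ros/connect` endpoint shown above and leaves the actual send commented out, since it needs a running server. The helper name `build_connect_request` is hypothetical, and sending a JSON `Content-Type` header is an assumption about what the server expects.

```python
import json
import urllib.request


def build_connect_request(
    host: str, base_url: str = "http://localhost:8765"
) -> urllib.request.Request:
    """Build the POST request that points ros-agent at a robot."""
    body = json.dumps({"host": host}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/ros/connect",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed
        method="POST",
    )


req = build_connect_request("192.168.1.100")

# To send it (requires a running ros-agent server):
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.status, resp.read().decode())
```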

Direct DDS transport is also available for low-latency local use:

```bash
pip install "ros-agent[ros]"   # adds CycloneDDS + ros_tap
```

---

## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `ROBORUN_PORT` | `8765` | Server port |
| `ROBOT_IP` | — | Robot IP (can also set in UI) |
| `ANTHROPIC_API_KEY` | — | Enables Claude agent |
| `GEMINI_API_KEY` | — | Enables Gemini agent |
| `ROBORUN_SKILL_PACKAGES` | — | Additional skill packages |
| `ROBORUN_SKILL_PATHS` | — | Additional skill directories |
| `ROBORUN_MAX_LINEAR_VEL` | `1.0` | Safety velocity limit (m/s) |
| `ROBORUN_MAX_ANGULAR_VEL` | `1.5` | Safety angular limit (rad/s) |

---

## Contributing

```bash
git clone https://github.com/publu/RoboRun.git
cd RoboRun
pip install -e ".[all]"
python -m roborun.server  # internal module name
```

---

## License

MIT — see [LICENSE](LICENSE).

Built by [Hashing Systems](https://hashingsystems.com).
