Metadata-Version: 2.4
Name: anyrobo
Version: 0.2.4
Summary: A local-first voice AI assistant framework. Create your own JARVIS.
Project-URL: Homepage, https://github.com/vietanhdev/anyrobo
Project-URL: Repository, https://github.com/vietanhdev/anyrobo
Project-URL: Documentation, https://nrl.ai
Project-URL: Issues, https://github.com/vietanhdev/anyrobo/issues
Author-email: Viet-Anh Nguyen <vietanh.dev@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,assistant,llm,local,offline,stt,tts,voice
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Requires-Dist: click>=8.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: all
Requires-Dist: elevenlabs>=0.2.0; extra == 'all'
Requires-Dist: ollama>=0.1.0; extra == 'all'
Requires-Dist: openai-whisper>=20230314; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Requires-Dist: pyttsx3>=2.90; extra == 'all'
Requires-Dist: vosk>=0.3.45; extra == 'all'
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Provides-Extra: elevenlabs
Requires-Dist: elevenlabs>=0.2.0; extra == 'elevenlabs'
Provides-Extra: ollama
Requires-Dist: ollama>=0.1.0; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: pyttsx3
Requires-Dist: pyttsx3>=2.90; extra == 'pyttsx3'
Provides-Extra: vosk
Requires-Dist: vosk>=0.3.45; extra == 'vosk'
Provides-Extra: whisper
Requires-Dist: openai-whisper>=20230314; extra == 'whisper'
Description-Content-Type: text/markdown

<h1 align="center">anyrobo</h1>
<p align="center"><em>Build a voice AI assistant in 10 lines — local-first STT, LLM, TTS, and tool-calling included.</em></p>

<p align="center">
<img src="https://img.shields.io/pypi/v/anyrobo.svg" alt="PyPI">
<img src="https://img.shields.io/pypi/pyversions/anyrobo.svg" alt="Python">
<img src="https://img.shields.io/pypi/l/anyrobo.svg" alt="License">
</p>

**anyrobo** is a batteries-included framework for building voice AI assistants that run entirely on your own hardware. It ties together speech-to-text (Whisper, Vosk), an LLM brain (Ollama by default via anyllm), and text-to-speech (pyttsx3, ElevenLabs) behind a single class. It supports multi-step tool-calling, built-in personalities (Jarvis, GLaDOS, assistant), a plugin system for custom skills, an event bus, a RAG knowledge base, and MCP client support for calling external tool servers.

Built by [Viet-Anh Nguyen](https://github.com/vietanhdev) at [NRL.ai](https://www.nrl.ai).

## Why anyrobo?

- **One-liner API** — `anyrobo.Robo().listen()` is a complete voice assistant
- **Plugin architecture** — Add custom skills, tools, personalities, and backends
- **Local-first** — Whisper + Ollama + pyttsx3 run 100% offline
- **Minimal core deps** — Base install is light; STT/TTS/LLM backends are extras
- **Production-ready** — Event system, memory persistence, MCP client, RAG

## Installation

```bash
pip install anyrobo
```

For backends:

```bash
pip install "anyrobo[whisper]"      # openai-whisper for local STT
pip install "anyrobo[vosk]"         # Vosk for offline STT
pip install "anyrobo[pyttsx3]"      # pyttsx3 for offline TTS
pip install "anyrobo[elevenlabs]"   # ElevenLabs cloud TTS
pip install "anyrobo[ollama]"       # Ollama for local LLMs
pip install "anyrobo[openai]"       # OpenAI API for cloud LLMs
pip install "anyrobo[all]"          # everything
```

**Python 3.8+ supported** (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

## Quick Start

```python
import anyrobo

# 1. Simplest possible voice assistant (Whisper + Ollama + pyttsx3)
bot = anyrobo.Robo(personality="jarvis")
bot.listen()    # starts mic, transcribes, replies with voice, loops

# 2. Add a tool the assistant can call (schema auto-extracted from type hints)
def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

bot.add_tool(set_timer)
bot.listen()   # now: "Hey Jarvis, set a 5 minute pasta timer" -> tool call

# 3. Text-mode (no mic/speaker, useful for testing)
reply = bot.ask("What's the weather like in Tokyo?")
print(reply)
```

## Models & Methods

### Backends (all local-first)

| Component | Backend | Model | Install |
|---|---|---|---|
| **STT** | Whisper | `openai-whisper` `tiny/base/small/medium/large` | `anyrobo[whisper]` |
| **STT** | Vosk | Vosk offline models | `anyrobo[vosk]` |
| **LLM** | Ollama (default) | Any Ollama model (`llama3.1:8b`, `qwen2.5`, ...) | `anyrobo[ollama]` |
| **LLM** | OpenAI / Anthropic | via anyllm | `anyrobo[openai]` |
| **TTS** | pyttsx3 | OS-native voices (SAPI / NSSpeechSynthesizer / espeak) | `anyrobo[pyttsx3]` |
| **TTS** | ElevenLabs | Cloud API | `anyrobo[elevenlabs]` |
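
Backends can be mixed and matched when constructing `Robo`. A minimal sketch of an all-offline stack; the `stt`/`tts`/`llm` parameters come from the API reference below and the `model` keyword from the examples, but the exact identifier strings (`"whisper"`, `"pyttsx3"`, `"ollama"`) are assumptions inferred from this table:

```python
import anyrobo

# Fully offline stack: Whisper STT + Ollama LLM + pyttsx3 TTS.
# The backend identifier strings are assumed, not confirmed API.
bot = anyrobo.Robo(
    personality="jarvis",
    stt="whisper",
    tts="pyttsx3",
    llm="ollama",
    model="llama3.1:8b",
)
bot.listen()
```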

### Tool / function calling

Pass plain Python functions; `anyrobo` (via `anyllm`) auto-extracts parameter schemas from **type hints** and **docstrings**, then runs a multi-step agentic loop:

1. LLM receives the user query + tool list
2. LLM decides whether to call a tool (structured output)
3. `anyrobo` dispatches the tool and feeds the result back
4. Loop until the LLM emits a final natural-language response
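
To make step 1 concrete, here is an illustrative sketch of how a tool schema can be derived from a plain function; `function_to_schema` is a hypothetical helper shown only to explain the mechanism, not `anyllm`'s actual implementation:

```python
import inspect

def function_to_schema(fn) -> dict:
    """Hypothetical helper (illustration only): derive a JSON-schema-style
    tool description from a function's type hints and docstring."""
    type_map = {int: "integer", float: "number", str: "string", bool: "boolean"}
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        hint = param.annotation if param.annotation is not inspect.Parameter.empty else str
        props[name] = {"type": type_map.get(hint, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value -> required parameter
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": props, "required": required},
    }

def set_timer(minutes: int, label: str = "timer") -> str:
    """Set a countdown timer for the given number of minutes."""
    return f"Timer '{label}' set for {minutes} minutes."

print(function_to_schema(set_timer))
# {'name': 'set_timer', ..., 'required': ['minutes']}
```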

### Built-in personalities

| Name | Style |
|---|---|
| `jarvis` | Polite British butler, concise and proactive |
| `glados` | Dry, sarcastic, vaguely threatening (Portal-inspired) |
| `assistant` | Neutral, helpful default |
| `custom` | Pass your own `system_prompt` |
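
A short sketch of the `custom` option; the `system_prompt` keyword comes from the table above, and the prompt text itself is only an example:

```python
import anyrobo

# "custom" personality with a user-supplied system prompt.
bot = anyrobo.Robo(
    personality="custom",
    system_prompt="You are a laconic ship's computer. Answer in one sentence.",
)
print(bot.ask("Status report?"))
```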

### Conversation memory

`SlidingWindowMemory` keeps the last N turns in context, with optional disk persistence (JSON). Use `Robo.save_memory(path)` and `Robo.load_memory(path)` to carry conversations across sessions.
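
A minimal persistence sketch, assuming `save_memory` and `load_memory` take a JSON file path as the API reference suggests:

```python
import anyrobo

bot = anyrobo.Robo(personality="assistant")
bot.ask("Remember that my locker code is 4-8-15.")
bot.save_memory("session.json")   # write the sliding window to disk

# ...later, in a fresh process...
bot = anyrobo.Robo(personality="assistant")
bot.load_memory("session.json")   # restore prior turns into context
print(bot.ask("What's my locker code?"))
```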

### Event system

Subscribe to any lifecycle event:

```python
bot.on("user_message", lambda text: print("heard:", text))
bot.on("tool_call", lambda name, args: log_tool(name, args))
bot.on("response", lambda text: print("bot:", text))
```

### RAG Knowledge Base

`anyrobo.KnowledgeBase()` ingests text/PDF/markdown, chunks via `anynlp`, embeds via `anyllm.embed`, and performs similarity search to augment the LLM prompt.

### MCP client

`Robo.add_mcp_server(command, args)` connects to any Model Context Protocol server (filesystem, GitHub, web search, a model exposed via `anydeploy.mcp`, ...) and exposes its tools to the assistant automatically.

### Plugin system

Subclass `anyrobo.Skill` to package reusable behavior:

```python
class WeatherSkill(anyrobo.Skill):
    name = "weather"

    def tools(self):
        return [self.get_weather]

    def get_weather(self, city: str) -> dict:
        """Get the current weather for a city."""
        return {"city": city, "condition": "sunny", "temp_c": 21}  # stub data

bot.add_skill(WeatherSkill())
```

## API Reference

| Function / class | Purpose |
|---|---|
| `anyrobo.Robo(personality, stt, tts, llm)` | Main assistant class |
| `Robo.listen(hotword=None)` | Voice loop: STT -> LLM -> TTS |
| `Robo.ask(text)` | Text-mode interaction |
| `Robo.add_tool(fn)` | Register a Python function as a tool |
| `Robo.add_skill(skill)` | Register a plugin skill |
| `Robo.add_mcp_server(cmd, args)` | Connect to an MCP server |
| `Robo.on(event, handler)` | Subscribe to lifecycle events |
| `Robo.save_memory(path)` / `Robo.load_memory(path)` | Persist / restore conversation memory |
| `anyrobo.KnowledgeBase()` | RAG knowledge base |
| `anyrobo.Skill` | Base class for plugins |

## CLI Usage

```bash
anyrobo listen --personality jarvis --model llama3.1:8b
anyrobo ask "What's on my calendar today?"
anyrobo list-personalities
anyrobo list-voices
```

## Examples

### Voice assistant with custom tools

```python
import anyrobo

def turn_on_lights(room: str) -> str:
    """Turn on the smart lights in a specific room."""
    return f"Lights in {room} are now on."

def play_music(genre: str, volume: int = 50) -> str:
    """Play music of a given genre at the specified volume (0-100)."""
    return f"Playing {genre} music at volume {volume}."

bot = anyrobo.Robo(personality="jarvis", model="llama3.1:8b")
bot.add_tool(turn_on_lights)
bot.add_tool(play_music)
bot.listen()
```

### RAG-powered Q&A over your docs

```python
import anyrobo

kb = anyrobo.KnowledgeBase()
kb.ingest("docs/")              # chunks + embeds every markdown/pdf
bot = anyrobo.Robo(knowledge_base=kb, personality="assistant")
print(bot.ask("What's our refund policy?"))
```

### Connect to external MCP servers

```python
import anyrobo

bot = anyrobo.Robo(personality="jarvis")
bot.add_mcp_server("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
bot.listen()   # now the LLM can read/write files via MCP
```

## License

MIT (c) Viet-Anh Nguyen
