Metadata-Version: 2.4
Name: voxa-code
Version: 0.1.0
Summary: Hands-free voice operator for Claude Code
Author-email: Ti <voxa@voxa.space>
License: MIT
Project-URL: Homepage, https://voxa.space
Project-URL: Repository, https://github.com/Ti-03/voxa
Project-URL: Issues, https://github.com/Ti-03/voxa/issues
Keywords: claude,claude-code,voice,cli,agent,gemini
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development
Classifier: Topic :: Utilities
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.110
Requires-Dist: uvicorn[standard]>=0.29
Requires-Dist: websockets>=12
Requires-Dist: google-genai>=0.3
Requires-Dist: claude-agent-sdk>=0.1
Requires-Dist: python-dotenv>=1.0
Requires-Dist: pyjwt[crypto]>=2.8
Requires-Dist: httpx[http2]>=0.27
Requires-Dist: qrcode>=7.4
Requires-Dist: cbor2>=5.6
Requires-Dist: pyobjc-framework-Quartz>=10; sys_platform == "darwin"
Requires-Dist: pyobjc-framework-ApplicationServices>=10; sys_platform == "darwin"
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: httpx>=0.27; extra == "dev"
Dynamic: license-file

# Voxa

Voxa lets you call into your laptop from a phone browser, talk to a Gemini Live
"operator," and have it drive Claude Code by voice.

**MVP scope (drive mode only):** pick a working directory by voice, send spoken
instructions, hear Claude's final result read back. Attach mode, voice
folder-browsing, and barge-in interruption are V2 backlog items (see
`docs/superpowers/specs/2026-06-27-loop-design.md` and
`docs/superpowers/plans/2026-06-27-loop-mvp.md`).

---

## Prerequisites

- **Python 3.11+** on the laptop.
- **Tailscale** installed and logged in on both the laptop and the phone (free
  personal plan is fine). The phone must be on the same tailnet as the laptop,
  or MagicDNS must be enabled.
- **A Gemini API key** from [Google AI Studio](https://aistudio.google.com/live)
  with Gemini Live access.
- **Claude Code logged in** on the laptop (`claude` CLI authenticated). The
  agent SDK reuses your existing Claude Code credentials; no separate
  `ANTHROPIC_API_KEY` is needed unless you prefer to supply one.
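
A quick way to check the local prerequisites is a small preflight script. The sketch below is not part of Voxa; the `preflight` helper is hypothetical. It only confirms the Python version and that the `tailscale` and `claude` executables are on `PATH`. It cannot tell whether either tool is logged in, and it does not check the Gemini key.

```python
import shutil
import sys


def preflight() -> dict[str, bool]:
    """Report which prerequisites appear to be satisfied on this machine."""
    return {
        # The package metadata declares Requires-Python >=3.11.
        "python>=3.11": sys.version_info >= (3, 11),
        # These only check that the executables are on PATH,
        # not that either tool is authenticated.
        "tailscale on PATH": shutil.which("tailscale") is not None,
        "claude on PATH": shutil.which("claude") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```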

---

## Quickstart

Install Voxa with a single command on the laptop you want to control.

**macOS / Linux:**

```bash
curl -fsSL https://voxa.space/install.sh | sh
```

**Windows (PowerShell):**

```powershell
irm https://voxa.space/install.ps1 | iex
```

Prefer a package runner? These work on any OS:

```bash
npx voxa-code            # Node users
uvx voxa-code            # Python users (or: pipx install voxa-code)
```

Then start it:

```bash
voxa
```

Voxa is zero-config by default: it uses the hosted relay, so there are no API
keys to set up. `voxa` starts the server and prints a pairing QR code. Scan it
with the Voxa phone app (or open the printed URL in your phone browser) to
connect.

---

## Develop from source

Contributors who want to hack on Voxa can run it from a checkout instead of the
published package.

### 1. Create the virtual environment and install dependencies

```bash
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"
```

### 2. Configure secrets

```bash
cp .env.example .env
```

Open `.env` and fill in:

| Key | Value |
|---|---|
| `GEMINI_API_KEY` | Your Google AI Studio key |
| `VOXA_AUTH_TOKEN` | Any random secret string (protects the WebSocket endpoint on your tailnet) |

`GEMINI_LIVE_MODEL`, `VOXA_HOST`, and `VOXA_PORT` have sensible defaults and
can be left as-is.
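
If you need a value for `VOXA_AUTH_TOKEN`, one option is Python's standard `secrets` module. This is a minimal sketch; the `make_auth_token` helper is hypothetical and Voxa does not require this particular format, since any hard-to-guess string works.

```python
import secrets


def make_auth_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for use as a shared secret."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env.
    print(f"VOXA_AUTH_TOKEN={make_auth_token()}")
```

A URL-safe token is convenient here because the token ends up in the query string of the URL you open on the phone.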

### 3. Start the server

```bash
bash scripts/serve.sh
```

The script:
1. Starts the Voxa FastAPI server on `127.0.0.1:8787` (or `$VOXA_PORT`).
2. Calls `tailscale serve` to expose it over HTTPS on your tailnet (required
   because the phone browser needs a secure context for microphone access).
3. Prints the full HTTPS URL including your auth token.
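
To illustrate step 3, here is a rough sketch of how a URL carrying the token could be built and later checked. It is not taken from `serve.sh` or `server/app.py`: `build_url`, `token_matches`, and the example hostname are hypothetical, and only the `token` query parameter name comes from the printed URL (`?token=...`). The comparison uses `hmac.compare_digest` so that timing does not reveal how much of the token matched.

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(host: str, token: str) -> str:
    """Build an HTTPS URL that carries the token as a query parameter."""
    return f"https://{host}/?{urlencode({'token': token})}"


def token_matches(url: str, expected: str) -> bool:
    """Check the token in a URL against the expected secret in constant time."""
    values = parse_qs(urlsplit(url).query).get("token", [])
    if len(values) != 1:
        # Reject a missing or repeated token parameter.
        return False
    return hmac.compare_digest(values[0].encode(), expected.encode())


if __name__ == "__main__":
    # "laptop.example.ts.net" is a placeholder, not a real tailnet name.
    url = build_url("laptop.example.ts.net", "change-me")
    print(url, token_matches(url, "change-me"))
```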

### 4. Connect from the phone

Open the printed URL in your phone's browser. Tap **Connect**, then speak.

---

## Architecture (brief)

```
Phone browser (static/)
    |  HTTPS WebSocket (auth token required)
    v
FastAPI server (server/app.py)
    |  audio bytes (16 kHz PCM)
    v
GeminiOperator (server/gemini_operator.py)  <-->  Gemini Live API
    |  tool calls (start_claude_session, send_to_claude, …)
    v
Orchestrator (server/orchestrator.py)
    |
    v
ClaudeController (server/claude_controller.py)  -->  Claude Code (agent SDK, bypassPermissions)
```
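
For a sense of scale, the 16 kHz PCM stream in the diagram is small. The arithmetic below assumes 16-bit mono samples, which the diagram does not state; the `pcm_bytes` helper is hypothetical.

```python
SAMPLE_RATE_HZ = 16_000  # from the diagram above
BYTES_PER_SAMPLE = 2     # assumption: 16-bit samples
CHANNELS = 1             # assumption: mono


def pcm_bytes(duration_ms: int) -> int:
    """Return the byte length of duration_ms of audio in the assumed format."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * duration_ms // 1000


if __name__ == "__main__":
    print(pcm_bytes(20))    # bytes in a 20 ms frame
    print(pcm_bytes(1000))  # bytes per second
```

Under those assumptions one second of audio is 32,000 bytes, so a 20 ms frame is 640 bytes.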

Config is loaded from `.env` via `server/config.py`.
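
As an illustration only, a loader of this kind might look like the sketch below. It is not the contents of `server/config.py`: the `Settings` class and `load_settings` function are hypothetical, and the defaults are inferred from the `127.0.0.1:8787` address mentioned in the server step. `GEMINI_LIVE_MODEL` is omitted because its default is not documented here. In a real run, `load_dotenv()` from `python-dotenv` (a declared dependency) would typically populate the environment from `.env` first.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    auth_token: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style key/value pairs."""
    required = ("GEMINI_API_KEY", "VOXA_AUTH_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        auth_token=env["VOXA_AUTH_TOKEN"],
        # Assumed defaults, based on the address serve.sh is described as using.
        host=env.get("VOXA_HOST", "127.0.0.1"),
        port=int(env.get("VOXA_PORT", "8787")),
    )
```

Failing fast on missing secrets keeps a half-configured server from starting and then rejecting every connection.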

---

## Running the test suite

```bash
.venv/bin/python -m pytest -v
```

Expected: 22 tests pass, no warnings.

---

## Manual end-to-end smoke test

The smoke test requires a real phone, real Tailscale connectivity, and real API
keys. Run it against a scratch directory, not a real project.

**Before you start:**
- `.env` is fully filled in (real `GEMINI_API_KEY` and `VOXA_AUTH_TOKEN`).
- Tailscale is running on both the laptop and the phone.
- Claude Code is logged in on the laptop.

**Procedure:**

1. Open a terminal on the laptop and run:
   ```bash
   bash scripts/serve.sh
   ```
   Wait for the line `Voxa is live. On your phone open: https://...`

2. Copy the printed HTTPS URL (it already includes `?token=...`).

3. On the phone, open the URL in Safari or Chrome. You should see the Voxa
   interface. Grant microphone permission when prompted.

4. Tap **Connect**. The button should change state to indicate an active
   session.

5. Speak: "Start a session in `/tmp/loop-smoke` and create a file called
   `hello.txt` that says hi."

6. **Verify:**
   - Gemini acknowledges the instruction verbally (you hear a response through
     the phone speaker).
   - On the laptop terminal you see Claude Code start with `bypassPermissions`
     active (no permission prompts appear).
   - After Claude finishes, `/tmp/loop-smoke/hello.txt` exists on the laptop
     and contains `hi`.
   - Gemini speaks the final result back to you.

7. To stop: press Ctrl-C in the laptop terminal. The `trap` in `serve.sh` will
   kill the server and tear down `tailscale serve`.
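
The file check in step 6 can also be scripted. This is a minimal sketch with a hypothetical `smoke_file_ok` helper; it compares case-insensitively because Claude may write `Hi` rather than `hi`, which is a judgment call rather than part of the procedure.

```python
from pathlib import Path


def smoke_file_ok(directory: str, name: str = "hello.txt", expected: str = "hi") -> bool:
    """Return True if directory/name exists and its text contains expected."""
    path = Path(directory) / name
    if not path.is_file():
        return False
    # Case-insensitive match, so "Hi" and "hi" both pass.
    return expected.lower() in path.read_text(encoding="utf-8").lower()


if __name__ == "__main__":
    print(smoke_file_ok("/tmp/loop-smoke"))
```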

**Warning:** Use a throwaway scratch directory (like `/tmp/loop-smoke`) for
your first smoke test. Claude Code runs with `bypassPermissions`, so it will
write files without asking.
