Metadata-Version: 2.4
Name: vira-agent
Version: 0.2.1
Summary: Voice-Interactive Reasoning Agent — a voice-controlled AI agent that accesses local folder/markdown context during collaborative sessions.
Author: th3rdai
License: MIT
Project-URL: Repository, https://github.com/3rdAI-admin/VIRA
Project-URL: Issues, https://github.com/3rdAI-admin/VIRA/issues
Keywords: voice,ai,agent,claude,anthropic,whisper,cli
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anthropic>=0.40
Requires-Dist: typer>=0.12
Requires-Dist: rich>=13.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: python-dotenv>=1.0
Provides-Extra: voice
Requires-Dist: faster-whisper>=1.0; extra == "voice"
Requires-Dist: sounddevice>=0.4; extra == "voice"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: ruff>=0.6; extra == "dev"
Requires-Dist: coverage>=7.0; extra == "dev"
Provides-Extra: web
Requires-Dist: fastapi>=0.115; extra == "web"
Requires-Dist: uvicorn[standard]>=0.32; extra == "web"
Provides-Extra: mcp
Requires-Dist: mcp>=1.0; extra == "mcp"
Provides-Extra: browser
Requires-Dist: playwright>=1.40; extra == "browser"
Provides-Extra: all
Requires-Dist: faster-whisper>=1.0; extra == "all"
Requires-Dist: sounddevice>=0.4; extra == "all"
Requires-Dist: fastapi>=0.115; extra == "all"
Requires-Dist: uvicorn[standard]>=0.32; extra == "all"
Requires-Dist: mcp>=1.0; extra == "all"
Requires-Dist: playwright>=1.40; extra == "all"
Provides-Extra: release
Requires-Dist: build>=1.2; extra == "release"
Requires-Dist: twine>=5.0; extra == "release"
Dynamic: license-file

# VIRA — Voice-Interactive Reasoning Agent

[![CI](https://github.com/3rdAI-admin/VIRA/actions/workflows/ci.yml/badge.svg)](https://github.com/3rdAI-admin/VIRA/actions/workflows/ci.yml)

A voice-controlled AI agent that talks to **Claude** (or a local model via **Ollama**),
loads local folder/markdown context and long-term **Obsidian-vault memory** into the
conversation, and runs reusable markdown **skills** and multi-step **workflows**. It
holds a **hands-free spoken conversation** — local Whisper speech-to-text plus
**ElevenLabs** voices (VIRA / VIRO / Friday) with a macOS `say` fallback — and you can
switch **voice**, **persona**, and even the **LLM provider** mid-conversation, by voice.
Cost is first-class: per-call usage/cost, a running total, and a spend cap.

## Documentation

<!-- docs-index:start -->
| Doc | What it covers |
| --- | --- |
| [VIRA_PLAN.md](VIRA_PLAN.md) | Master roadmap & vision (ICM layers, phases) |
| [ARCHITECTURE.md](ARCHITECTURE.md) | As-built architecture (GitNexus-generated) |
| [ARCHITECTURE-PLAN.md](ARCHITECTURE-PLAN.md) | Design-intent architecture |
| [MEMORY-PLAN.md](MEMORY-PLAN.md) | Obsidian memory vault — Phase 21 (shipped) |
| [UIPLAN.md](UIPLAN.md) | Web UI — Phase 18 (browser chat + settings shipped) |
| [VOICE-CLONE-PLAN.md](VOICE-CLONE-PLAN.md) | User voice clone — Phase 20 (shipped) |
| [BARGE-IN-FIX.md](BARGE-IN-FIX.md) | Voice / barge-in debugging notes |
| [docs/DIALOGUE.md](docs/DIALOGUE.md) | Topic-grounded dialogue (ICM flows, recipes) |
| [CHANGELOG.md](CHANGELOG.md) | Release notes (Keep a Changelog) |
| [harness/plans/vira-mvp/](harness/plans/vira-mvp/) | Phase build plans (incl. SHELLPLAN.md) |
<!-- docs-index:end -->

## Requirements

- Python ≥ 3.11 (developed on 3.14)
- An Anthropic API key for live agent calls (not needed to run the tests)

## Install

### From PyPI or pipx (v0.2.1+)

```bash
pip install vira-agent
pip install 'vira-agent[voice]'                  # + local Whisper STT
pip install 'vira-agent[web]'                    # + `vira web` UI
pip install 'vira-agent[voice,web,mcp,browser]'  # all runtime extras
# or: pip install 'vira-agent[all]'

pipx install vira-agent                          # isolated `vira` CLI on PATH
pipx install 'vira-agent[voice]'
```

Set `ANTHROPIC_API_KEY` (or use `llm_provider: ollama`) and copy settings from
[config/default.yaml](config/default.yaml) into `~/.vira/config.yaml` as needed.

**Maintainers:** `pip install -e '.[release]'`, then `./scripts/release_check.sh`
and `twine upload dist/*` when publishing a tag.

### From source (development)

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -e .            # core
# pip install -e '.[voice]' # + local Whisper STT (heavy/native deps)
```

## Configure

```bash
cp .env.example .env        # then set ANTHROPIC_API_KEY=...
```

Settings live in [config/default.yaml](config/default.yaml). The agent **model** is
a changeable default resolved with this precedence (lowest → highest):

```
config/default.yaml (claude-sonnet-4-6)  →  VIRA_MODEL env var  →  vira --model <id>
```

So you can change it permanently (edit the yaml), per-shell
(`export VIRA_MODEL=claude-opus-4-8`), or per-command (`vira chat --model claude-opus-4-8`).
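
The precedence chain can be sketched in a few lines of Python. This is an illustration of the rule described above, not VIRA's actual `config.py`; the function name `resolve_model` and its parameters are hypothetical.

```python
import os


def resolve_model(yaml_default, env=None, cli_model=None):
    """Return the model id, letting later sources override earlier ones.

    Hypothetical helper: it mirrors the documented order
    (yaml default -> VIRA_MODEL env var -> --model flag).
    """
    env = os.environ if env is None else env
    model = yaml_default                     # lowest: config/default.yaml
    model = env.get("VIRA_MODEL") or model   # middle: VIRA_MODEL env var
    model = cli_model or model               # highest: --model <id>
    return model
```

Empty strings are treated as "not set" in this sketch, so an empty `VIRA_MODEL` falls back to the yaml default.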

Other settings include **prompt caching** (`prompt_cache`, on by default — caches the
stable context/skill prefix so multi-turn chat is cheaper/faster), context byte
budgets, and the voice / workflows / sessions directories.

## Cost awareness

Cost is a first-class feature. Every `chat` / `skill run` / `workflow run` prints a
one-line usage + estimated-cost summary (toggle with `show_usage`), and you can
estimate spend **before** a call with a free token preflight:

```bash
vira tokens --file big-prompt.md          # input tokens + estimated input cost (free; no completion)
vira skill run refiner "..."              # → usage: 26 in, 5 out | ~$0.0001 (1 call) | total: 336 tok, ~$0.0007 (3 calls)
vira usage                                # show the running total used so far
vira usage --reset                        # start a fresh tally
```

A **running total** (persisted in `~/.vira/usage.json`, overridable via `VIRA_USAGE_FILE`)
is appended to every usage line and accumulates across commands — in `vira chat` it
updates live after each turn.
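
As a rough sketch of how a persisted running total could work: the file location and the `VIRA_USAGE_FILE` override come from the text above, but the JSON keys (`tokens`, `cost_usd`, `calls`) and both function names are assumptions made for illustration, since the real file schema is not documented here.

```python
import json
import os
from pathlib import Path


def usage_path(env=None):
    """Pick the usage file: VIRA_USAGE_FILE if set, else ~/.vira/usage.json."""
    env = os.environ if env is None else env
    return Path(env.get("VIRA_USAGE_FILE") or "~/.vira/usage.json").expanduser()


def add_call(path, tokens, cost_usd):
    """Add one call to the tally stored at `path` and return the new total.

    The key names below are hypothetical, not VIRA's real schema.
    """
    path = Path(path)
    total = {"tokens": 0, "cost_usd": 0.0, "calls": 0}
    if path.exists():
        total.update(json.loads(path.read_text(encoding="utf-8")))
    total["tokens"] += tokens
    total["cost_usd"] = round(total["cost_usd"] + cost_usd, 6)
    total["calls"] += 1
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(total), encoding="utf-8")
    return total
```

Because the tally lives in a file rather than in process memory, separate commands see the same total.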

**Spend cap** — set a per-session budget; VIRA warns at 80% and **blocks** paid calls
once the running total reaches it:

```bash
vira --spend-cap 0.50 chat          # stop once this session has spent ~$0.50
# or set `spend_cap` in config/default.yaml; reset the session with `vira usage --reset`
```
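
The warn-then-block behaviour reduces to two threshold comparisons. The sketch below illustrates that rule only; `check_spend` and its return values are hypothetical names, and treating a missing cap as "no limit" is an assumption.

```python
def check_spend(total_usd, cap_usd, warn_ratio=0.8):
    """Classify the running total against the cap: 'ok', 'warn', or 'block'."""
    if cap_usd is None or cap_usd <= 0:
        return "ok"      # assumption: no cap configured means no limit
    if total_usd >= cap_usd:
        return "block"   # the running total has reached the cap
    if total_usd >= warn_ratio * cap_usd:
        return "warn"    # 80% of the cap by default
    return "ok"
```

With a cap of `0.50`, a total of `0.45` would warn and a total of `0.50` would block.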

Keep costs down: prompt caching is on by default, `max_tokens` defaults low, and for
simple/bulk work use `--model claude-haiku-4-5`.

## Local models (Ollama)

Claude remains the **default** provider. For offline or privacy-sensitive work, switch
to a local model via [Ollama](https://ollama.com/) — no Anthropic key required.

```bash
ollama serve
ollama pull llama3.2:3b

# Per-session
vira --provider ollama --model llama3.2:3b chat

# Or persist in config/default.yaml:
#   llm_provider: ollama
#   model: llama3.2:3b
# Or via .env: VIRA_LLM_PROVIDER=ollama  VIRA_MODEL=llama3.2:3b
```

Works with `vira chat`, `skill run`, `workflow run`, `voice listen`, and `voice chat`.
Spend caps and Anthropic token preflight (`vira tokens`) are skipped for Ollama (local,
no API cost). Whisper STT is already local; pair with `--tts say` for fully offline
voice on macOS.

Manual integration check: `python scripts/spike_ollama.py`

## Obsidian memory

VIRA can load **long-term memory** from an Obsidian-compatible markdown vault and stay
**expert about itself** via synced docs in `Self/`.

```bash
vira memory init                      # ./memories Obsidian vault (wikilinks)
vira memory links                     # show [[wikilinks]] graph
vira memory sync-self                 # copy README, ARCHITECTURE, etc. → Self/
vira chat                             # auto-loads vault + remembers multi-turn history
vira --provider ollama --model llama3.2:3b chat   # every provider gets vault memory
vira skill run refiner "…"            # skills/workflows/voice use memory too
vira memory consolidate session-….jsonl   # distill transcript → Memory/sessions/

# Point at your Obsidian vault folder:
# memory_vault: ./memories   in config/default.yaml (open in Obsidian)
```

See [MEMORY-PLAN.md](MEMORY-PLAN.md) for layout, load order, and the self-improvement loop.

## Usage

```bash
vira chat                                # interactive chat (Ctrl-D / 'exit' to quit)
vira chat --model claude-opus-4-8        # one-off model override
vira skills list                         # list available skills
vira skill run refiner "tighten this please"
vira context load ./context              # preview a folder without chatting
vira context budget ./context            # byte/token limits + skip list (optional --exact)
vira kb list                             # registered topic knowledge bases
vira --topic vira chat                   # load primary expertise for this topic
vira --kb ~/Projects/foo/docs skill run refiner "…"   # ad-hoc KB on any command
```

## Host access (tools)

VIRA can read files on your machine, optionally **write/edit files**, and (optionally)
run shell commands through Claude tool use. **Read-only by default**; every shell
command and every file write requires confirmation each time, and `.git`, `.ssh`, and
secret files (`.env`, `*.pem`, `id_rsa`, …) are refused. Anthropic provider only.

```bash
vira do "summarise the Python files in vira/core"              # one-shot task (read-only)
vira do "show disk usage" --allow-shell                        # shell, confirmed per command
vira do "add a docstring to vira/core/agent.py" --allow-write  # write/edit, confirmed per file

vira chat --tools                                              # multi-turn chat + read_file/list_dir
vira chat --tools --allow-shell                                # chat + confirm-gated shell
vira chat --tools --allow-write                                # chat + confirm-gated write_file/edit_file
vira chat --tools --allow-shell --yes                          # skip confirm (risky)
vira chat --tools --allow-browser                              # + Playwright browser tools
vira do "open example.com and summarize" --allow-browser       # one-shot with browser

vira voice chat --hands-free --tools                           # voice + read-only tools
vira voice chat --tools --allow-shell                          # shell with spoken yes/no confirm
```

Set `tools_enabled: true` in `~/.vira/config.yaml` or `config/default.yaml` to
default `--tools` on for `vira chat`. Shell still needs `--allow-shell` or
`tools_allow_shell: true`; writes need `--allow-write` or `tools_allow_write: true`.
Tools are confined to `tools_root` (default: `.`).
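
One way to picture the confinement and refusal rules is a path check like the following. It is a sketch under stated assumptions, not VIRA's tool registry: `is_allowed` is a hypothetical name, and the blocked-file patterns cover only the examples the text names (the full list is elided there).

```python
from fnmatch import fnmatch
from pathlib import Path

BLOCKED_DIRS = {".git", ".ssh"}
BLOCKED_FILES = (".env", "*.pem", "id_rsa")  # partial: only the patterns named above


def is_allowed(path, root="."):
    """Return True if `path` stays inside `root` and is not a refused file."""
    root_path = Path(root).resolve()
    target = (root_path / path).resolve()       # collapses ".." and symlinks
    if not target.is_relative_to(root_path):    # escaped tools_root
        return False
    rel = target.relative_to(root_path)
    if any(part in BLOCKED_DIRS for part in rel.parts):
        return False
    return not any(fnmatch(target.name, pattern) for pattern in BLOCKED_FILES)
```

Resolving the path before comparing matters: a plain string prefix check would accept `../` traversal.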

**MCP servers** (optional `pip install 'vira-agent[mcp]'`): configure `tools_mcp_servers` in
config; tools appear as `mcp_<server>_<name>`. List with `vira tools mcp list`.

**Browser automation** (optional `pip install 'vira-agent[browser]'` then
`playwright install chromium`): enable with `--allow-browser` or `tools_allow_browser: true`.
Tools: `browser_navigate`, `browser_snapshot`, `browser_screenshot` (read-only);
`browser_click` / `browser_fill` (confirmed). List with `vira tools browser list`.
Only `http://` and `https://` URLs are allowed.
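
The URL restriction can be illustrated with the standard library's URL parser. This is a minimal sketch of the stated rule; `is_allowed_url` is a hypothetical name, and requiring a non-empty host is an extra assumption.

```python
from urllib.parse import urlsplit


def is_allowed_url(url):
    """Accept only http:// and https:// URLs that name a host."""
    parts = urlsplit(url)   # urlsplit lowercases the scheme for us
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Schemes such as `file:` or `javascript:` are rejected before any browser tool would run.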

**Ollama tool calling** — `vira chat --tools` and `vira do` work with
`llm_provider: ollama` when the model supports tools (e.g. llama3.1+, qwen2.5).
Same confirm gating and registry as Claude.

## Voice (spoken conversation)

Install the extra, then talk to VIRA. Speech-to-text runs **locally** via Whisper.
Text-to-speech supports built-in **VIRA** (warm Irish, default), **VIRO** (British
butler), and **Friday** (Irish) profiles via ElevenLabs Voice Design, with macOS
`say` as an offline fallback.

```bash
pip install -e '.[voice]'

# One-time: add ELEVENLABS_API_KEY to .env, then design and store all built-in voices.
vira voice setup --profile all

vira voice chat                          # VIRA (default); say "goodbye" to stop
vira voice chat --profile viro           # switch to VIRO (British butler)
vira voice chat --tts elevenlabs         # force ElevenLabs (requires setup)
vira voice chat --seconds 7 --stt-model small
vira voice chat --hands-free             # VAD: auto-send when you stop talking
vira voice chat --hands-free --wake-word vira   # only answer turns starting with "vira"
# Interrupt a long reply: press Enter (default), or speak clearly over it (--barge-in / auto with --hands-free)
# Tune barge-in sensitivity: `vira web` (slider) or voice_barge_in_margin in config (lower = easier)
# Mute mic (side conversations): press Space to toggle — cancels an active recording too
vira voice listen recording.wav          # transcribe one file, answer, speak the reply

vira voice profiles                      # show profiles, stored IDs, cache + key status (offline)
vira voice preview --profile vira        # play design previews without storing them
vira voice preview --profile vira --index 1 --commit   # store preview #1 as the voice
vira voice cleanup                       # delete leftover VIRA-* voices (needs voices delete perm)

# Clone your own voice (ElevenLabs Instant Voice Cloning; own voice only)
vira voice clone record                  # guided mic capture + upload (~60s+ total)
vira voice clone record --i-consent      # skip consent prompt (scripts)
vira voice clone from-files a.wav b.wav    # upload existing audio
vira voice clone status                  # stored clone + subscription hints
vira voice chat --profile user           # speak with your cloned voice
```

Voice settings (profile, TTS engine, Whisper model, record seconds) live in
[config/default.yaml](config/default.yaml). Stored ElevenLabs voice IDs are written
to `~/.vira/voices.json`. The first `voice` run downloads the chosen Whisper model.
Microphone capture needs mic permission for your terminal.

**Tuning without code edits** — override any built-in profile from `config/default.yaml`
under `voice_profile_overrides` (e.g. `friday.voice_settings.stability`); overrides are
merged at design and speak time. **Audio cache** — repeated ElevenLabs lines are cached
as MP3s under `~/.vira/audio-cache/` (SHA-256 keyed on voice + model + text + settings),
so they replay without re-synthesizing. The cache covers the **ElevenLabs** playback path
only (the macOS `say`/Moira path, e.g. Friday's `prefer_say`, is not cached). Disable with
`voice_audio_cache: false`. `vira voice preview` costs ElevenLabs design credits; it only
stores a voice when you pass `--commit`.
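
To make the cache key concrete, here is one possible way to derive a SHA-256 file name from voice, model, text, and settings. The serialization (sorted-key JSON) and the `cache_path` helper are assumptions for illustration; VIRA's real key layout may differ.

```python
import hashlib
import json
from pathlib import Path


def cache_path(voice_id, model_id, text, settings, cache_dir="~/.vira/audio-cache"):
    """Map one synthesis request to a stable MP3 path (hypothetical layout)."""
    payload = json.dumps(
        {"voice": voice_id, "model": model_id, "text": text, "settings": settings},
        sort_keys=True,        # key order must not change the digest
        ensure_ascii=False,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return Path(cache_dir).expanduser() / f"{digest}.mp3"
```

Including the settings in the key means a changed `stability` value produces a new file instead of replaying stale audio.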

**Hands-free** — `vira voice chat --hands-free` records until you stop talking
(energy-based voice-activity detection; no extra dependency) instead of a fixed
`--seconds`. Tune `voice_vad_threshold` / `voice_vad_silence_ms` (and the
`voice_vad_*` caps) in `config/default.yaml`, or default it on with `voice_vad: true`.
Add an optional **wake word** (`--wake-word vira` or `voice_wake_word: vira`) so VIRA
only answers turns that start with it — matched leniently (after an optional "hey/ok/
okay/yo "), so speech-to-text homophones like "Vera" still wake her.
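
The two ideas in this paragraph, energy-based end-of-speech detection and lenient wake-word matching, can be sketched with the standard library alone. Everything here is illustrative: the function names, the frame format (lists of float samples), and the similarity cutoff are assumptions, and `threshold` / `silence_ms` only loosely correspond to `voice_vad_threshold` / `voice_vad_silence_ms`.

```python
import difflib
import math
import re

PREFIXES = ("hey", "ok", "okay", "yo")


def rms(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def end_of_speech(frames, frame_ms, threshold, silence_ms):
    """Return the index of the frame where recording should stop, or None."""
    heard_speech = False
    quiet_ms = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            heard_speech = True
            quiet_ms = 0                 # speech resets the silence timer
        elif heard_speech:
            quiet_ms += frame_ms
            if quiet_ms >= silence_ms:
                return index
    return None


def strip_wake_word(transcript, wake_word="vira", cutoff=0.7):
    """Return the (lowercased) text after the wake word, or None if it is absent."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if words and words[0] in PREFIXES:
        words = words[1:]                # optional "hey/ok/okay/yo"
    if not words:
        return None
    similarity = difflib.SequenceMatcher(None, words[0], wake_word).ratio()
    if similarity < cutoff:
        return None
    return " ".join(words[1:])
```

Silence before the first loud frame is ignored, so the recording does not end before the speaker starts. The similarity cutoff is what lets a near-miss transcription such as "Vera" pass while unrelated first words do not.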

**Switch voice mid-conversation** — just say **"use Friday"**, **"use VIRA"**, or
**"use VIRO"** (also "switch to …" / "talk as …") to change the speaking voice on the
fly, without leaving `voice chat`. The **persona switches with the voice** too — Friday
introduces herself as Friday (and may call you "boss"), VIRO as a composed British
butler — while the conversation memory carries over. Press **Enter** to interrupt a
long reply, or speak over it with barge-in (`--barge-in`, on by default with
`--hands-free`). Press **Space** to mute the mic before or during a capture.

**Switch model/provider mid-conversation** — say **"switch to Claude"** / **"use
Anthropic"** to use Claude, or **"go local"** / **"go offline"** (or **"use Ollama"**)
to switch to the local Ollama model — without restarting. Matching is fuzzy and covers
common speech-to-text mishearings ("cloud"→Claude, "llama"→Ollama); **"go local"** is
the most reliable phrase for Ollama since "Ollama" is often mis-transcribed. Memory
carries over; the spend cap re-applies once you're back on Claude.

## Web UI (browser chat)

A local, dark, single-page **chat UI** in the browser — React frontend, FastAPI
backend — reusing the same agent, knowledge bases, vault memory, sessions, and
spend-cap as the CLI.

```bash
pip install -e '.[web]'
vira web                  # opens http://127.0.0.1:8765/
vira web --no-open        # headless server only

# Remote access — require a token before binding beyond localhost:
VIRA_WEB_AUTH_TOKEN=$(openssl rand -hex 16) vira web --host 0.0.0.0 --port 9000
# clients then pass it: open http://host:9000/?token=…  (or Authorization: Bearer …)
```

- **Streaming chat** (Server-Sent Events) with a live token cursor and the same
  per-turn usage/cost line as the CLI.
- **Browser voice (Phase 18.1)** — tap the mic for push-to-talk (records → Whisper STT →
  sends) with a live level ring and a listen/speak status chip; the 🔊 toggle speaks
  replies back in VIRA's voice (ElevenLabs), with a voice-profile picker in settings.
- **Remote control / WebSocket (Phase 19)** — an **SSE/WS** transport toggle in the
  header switches chat to `/api/ws/chat`. Set `web_auth_token` (or `VIRA_WEB_AUTH_TOKEN`)
  to require a token on every `/api` route *and* the WS handshake (`Bearer` or `?token=`),
  so binding beyond localhost is safe; empty = localhost trust.
- **Topic knowledge picker** + **memory toggle** in the header — switching either
  starts a fresh session with that primary KB / vault memory.
- **Voice settings** drawer (⚙): barge-in toggle + **sensitivity slider**
  (`voice_barge_in_margin`), Space-mute, Enter-interrupt, STT model — saved to
  `~/.vira/config.yaml`, applied on the next `vira voice chat`.
- Session transcripts are written to `~/.vira/sessions/*.jsonl` (path shown in the footer).
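
The token rule from the remote-control bullet above (a `Bearer` header or a `?token=` query parameter, with an empty token meaning localhost trust) can be sketched as follows. This is not VIRA's FastAPI code: `is_authorized` is a hypothetical helper, and `headers` is assumed to be a plain dict with an exact `Authorization` key.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def is_authorized(url, headers, expected_token):
    """Check a request against the configured token."""
    if not expected_token:
        return True                                  # empty = localhost trust
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        supplied = auth[len("Bearer "):]
    else:
        supplied = parse_qs(urlsplit(url).query).get("token", [""])[0]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())
```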

The UI is built from [`frontend/`](frontend/) (React + Vite + TypeScript) into a single
self-contained `vira/web/static/index.html` that FastAPI serves — so CI stays
Python-only. See [`frontend/README.md`](frontend/README.md) to develop or rebuild it.

Security controls (auth token, bind rules, path allowlist, request limits) are documented
in [SECURITY.md](SECURITY.md).

## Workflows

Chain skills into a multi-step pipeline with a YAML file. Steps interpolate
`{{variables}}` and prior steps' output (`{{step.output}}`), can be guarded with
`when:`, can fan out over a list with `foreach:` (binding `{{item}}` per element),
and can save results to a file with `save_to:`.

```bash
vira workflow list
vira workflow run refine_and_summarize.yaml --var text="your draft here" --var out_dir=.
vira workflow run branch_and_parallel.yaml --var mode=fast --var text="hello"
```

See [examples/workflows/refine_and_summarize.yaml](examples/workflows/refine_and_summarize.yaml).
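
For intuition, `{{variable}}` interpolation and `foreach:` fan-out could look like the sketch below. It is an assumption-laden illustration rather than the real `WorkflowEngine`: the helper names are hypothetical, the step name `refine` is made up, and leaving unknown placeholders untouched is a choice made here, not documented behaviour.

```python
import re

_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def interpolate(template, values):
    """Replace {{name}} and {{step.output}} placeholders found in `values`."""
    def lookup(match):
        key = match.group(1)
        # Unknown keys are left as-is (an assumption made for this sketch).
        return str(values[key]) if key in values else match.group(0)
    return _PLACEHOLDER.sub(lookup, template)


def fan_out(template, items, values):
    """Render the template once per element, binding {{item}} each time."""
    return [interpolate(template, {**values, "item": item}) for item in items]
```

For example, `fan_out("Review {{item}} in {{mode}} mode", ["a.md", "b.md"], {"mode": "fast"})` renders one prompt per file.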

## Conversation memory & transcripts

`vira chat` and `vira voice chat` are **multi-turn**: VIRA remembers the
conversation within a session, and replies **stream** token-by-token. You can ground
the conversation in a **topic knowledge base** (`--topic` / `--kb` / `--context`).
Every session is logged as JSONL under `sessions_dir` (default `~/.vira/sessions`).

```bash
vira --topic vira chat                # primary expertise from config knowledgebases
vira chat --context ./my-notes        # ad-hoc folder for this chat only
vira session list                     # list recorded transcripts
vira session extract session-20260607-120000.jsonl -o workflow.md   # mine a session into a workflow
```

`session extract` reads a transcript and asks the agent to distill the repeatable
task into a reusable workflow + decision summary — the first cut of turning
dialogue into reusable structure. See [docs/DIALOGUE.md](docs/DIALOGUE.md) for the
full ICM flow (context → chat → extract → workflow), diagrams, and gaps.
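
Since transcripts are JSONL, they can be inspected with a few lines of Python. The reader below assumes only that each non-blank line is one JSON value; the record fields are not documented here, so the sketch does not depend on any particular key. `read_transcript` is a hypothetical helper, not part of VIRA.

```python
import json
from pathlib import Path


def read_transcript(path):
    """Parse a JSONL transcript into a list, one entry per non-blank line."""
    records = []
    text = Path(path).expanduser().read_text(encoding="utf-8")
    for line in text.splitlines():
        if line.strip():                 # tolerate blank lines
            records.append(json.loads(line))
    return records
```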

## Layout

```
vira/
  config.py            # Settings + model/provider precedence
  core/
    context.py         # ContextManager / ContextBundle
    knowledge.py       # Topic knowledge base registry + primary expertise loading
    agent.py           # LLMClient protocol, ViraAgent (+ memory, persona, client swap)
    llm.py             # provider factory (Claude / Ollama)
    ollama.py          # local Ollama client (OpenAI-compatible, stdlib urllib)
    skills.py          # Skill, SkillEngine
    session.py         # SessionLogger (JSONL transcripts)
    workflow.py        # Workflow, WorkflowEngine, load_workflow
    extract.py         # session transcript -> reusable workflow draft
    usage.py           # token/cost tracking, running total, spend cap
  voice/               # optional extra (lazy imports)
    recorder.py stt.py tts.py pipeline.py
    vad.py             # energy VAD, wake word, voice/provider voice-commands
    profiles.py elevenlabs.py   # voice profiles + ElevenLabs Voice Design
  memory/              # Obsidian vault: self-knowledge, facts, session distillation
  cli/main.py          # Typer CLI
examples/skills/refiner/   # sample skill (skill.md + prompts/)
tests/                     # unittest suite (no API key / network needed)
config/default.yaml
```

## Develop

Run the test suite from the repo root (no API key or network required — the
Anthropic client is dependency-injected and faked in tests):

```bash
python3 -m unittest discover -s tests -t .
```
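
The dependency-injection pattern that makes this possible can be sketched as follows. The class and method names (`FakeClient`, `TinyAgent`, `complete`) are hypothetical stand-ins; the layout above names an `LLMClient` protocol and `ViraAgent`, but their real signatures are not shown in this README.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Anything with a `complete` method can serve as the client (assumed shape)."""

    def complete(self, prompt: str) -> str: ...


class FakeClient:
    """Test double: returns a canned reply and records what it was asked."""

    def __init__(self, reply):
        self.reply = reply
        self.prompts = []

    def complete(self, prompt):
        self.prompts.append(prompt)
        return self.reply


class TinyAgent:
    """Stand-in agent that receives its client instead of constructing one."""

    def __init__(self, client: LLMClient):
        self.client = client

    def ask(self, text):
        return self.client.complete(text).strip()
```

A test can then build `TinyAgent(FakeClient("  hi  "))`, call `ask("hello")`, and assert on both the reply and the recorded prompt without touching the network.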

Lint and coverage (install dev tools first with `pip install -e '.[dev]'`):

```bash
ruff check .                                   # lint + import order
coverage run -m unittest discover -s tests -t . && coverage report
```

CI runs both — a `ruff check` lint job and the test suite under a coverage gate
(see [.github/workflows/ci.yml](.github/workflows/ci.yml)).

## Roadmap

Delivered: CLI + Claude agent, folder/markdown context, markdown skills,
multi-turn memory, session transcripts, a YAML workflow engine, and a real voice
loop (Whisper STT + macOS `say` TTS). Post-MVP ideas (tracked in
`harness/plans/vira-mvp/EFFORT.md` → Improvements & Opportunities): streaming
responses, prompt caching, token-budget-aware context, decision-tree extraction
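from sessions, wake-word/VAD, non-macOS TTS, and PyPI/pipx packaging. Several of
these (streaming responses, prompt caching, wake-word/VAD, ElevenLabs TTS, and
PyPI/pipx installs) are already covered in the sections above.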

## License

[MIT](LICENSE).
