Metadata-Version: 2.4
Name: lucid-train
Version: 0.1.0
Classifier: Environment :: Console
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Rust
Classifier: Topic :: Software Development
Summary: Agentic terminal coding harness with a streaming TUI. Local (Ollama), OpenRouter, or self-hosted models.
Keywords: ai,agent,coding,tui,llm,ollama,openrouter
Home-Page: https://lucidtrain.com
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://lucidtrain.com
Project-URL: Repository, https://github.com/Arnab28122000/lucid-train

# >_ Lucid Train

An agentic terminal harness with a Codex-style TUI, written in Rust.

Lucid Train ports the **evolved-harness design** from [Agentic Harness Engineering](https://github.com/china-qijizhifeng/agentic-harness-engineering) (a top Terminal-Bench ranker) into a fast, single-binary CLI with a UI modeled on [OpenAI Codex](https://github.com/openai/codex).

```
╭────────────────────────────────────────────╮
│ >_ Lucid Train (v0.1.0)                    │
│                                            │
│ model:     kimi-k2.5   /model to change    │
│ provider:  openrouter                      │
│ directory: ~/code/my-project               │
╰────────────────────────────────────────────╯
```

## Codex-class agentic features

On top of the Agentic Harness Engineering (AHE) harness core, which stays
intact, lucid-train ships the Codex feature set:

- **`update_plan`** — the model maintains a live step checklist, rendered in the TUI (`→ in progress`, `✓ done`)
- **`web_search` + `read_url`** — agentic web search (keyless DuckDuckGo→Mojeek chain, or Serper with `SERPER_API_KEY`); the system prompt instructs the model to *verify instead of hallucinate* anything it is unsure of
- **Skills** — drop `SKILL.md` packs (Claude Code-compatible frontmatter) in `~/.lucid-train/skills/<name>/`, `.lucid/skills/`, or even `~/.claude/skills/` — they're advertised to the agent, which reads them on demand. **Install from [skills.sh](https://www.skills.sh) without leaving the chat**: `/skills search react` queries the registry, `/skills install vercel-labs/agent-skills web-design-guidelines` installs it (via `npx skills`), and the agent can use it from your *next message* — the system prompt refreshes every turn
- **Thinking display** — a live `✦ thinking` stream while the model reasons, collapsing to a one-line summary when output starts; file-writing commands render as `• Wrote src/main.rs` / `• Edited config.yaml` instead of raw shell
- **MCP** — connect any Model Context Protocol server: `/mcp add github npx -y @modelcontextprotocol/server-github`; tools appear to the model as `mcp__github__…` (approval-gated under `auto`)
- **Autocomplete everywhere** — type `/` for the command palette; `/download `, `/model `, `/delete ` continue into model names sorted best→worst with fit/size annotations (Tab to complete, ↑/↓ to choose)
- **Token accounting** — live `↑ input ↓ output` session counters in the footer; exec mode prints a final `tokens: N in • M out` line
- **Guarded full-auto** — switching `/approvals full-auto` asks for explicit confirmation before everything auto-runs
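
The token accounting bullet above can be pictured as a small running counter. This is a minimal sketch, not lucid-train's actual code: the `TokenUsage` type and its method names are hypothetical, and the exact number formatting in the footer is an assumption. Only the two output shapes (`↑ input ↓ output` and `tokens: N in • M out`) come from the description above.

```rust
/// Running token totals for one session (hypothetical type for illustration).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct TokenUsage {
    pub input: u64,
    pub output: u64,
}

impl TokenUsage {
    /// Add the usage reported for one model response.
    pub fn record(&mut self, input: u64, output: u64) {
        self.input = self.input.saturating_add(input);
        self.output = self.output.saturating_add(output);
    }

    /// Footer form shown in the TUI (plain integers are an assumption).
    pub fn footer(&self) -> String {
        format!("↑ {} ↓ {}", self.input, self.output)
    }

    /// Final summary line printed by exec mode.
    pub fn exec_summary(&self) -> String {
        format!("tokens: {} in • {} out", self.input, self.output)
    }
}
```

Calling `record` once per response and rendering `footer()` on each redraw would keep the counters live without any extra state.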

## Repo context engine

Open lucid-train inside any project (VS Code / Cursor terminal, any repo) and
it maps the codebase automatically: `git ls-files` + symbol extraction
(functions, classes, structs) chunked per file. Each turn injects only the
chunks relevant to your request — keyword-scored, hard-capped at ~2k tokens —
so **large repos never blow the context window**. The map is built once per
session on a background thread.
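
As a rough illustration of keyword-scored selection under a token cap, here is a minimal sketch. It is not the project's implementation: the `Chunk` shape, the four-bytes-per-token estimate, the three-character keyword minimum, and the function names are all assumptions made for the example.

```rust
/// One indexed piece of a file (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub path: String,
    pub text: String,
}

/// Rough token estimate: about four bytes per token (an assumption).
fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Number of query keywords that occur in the chunk path or text.
fn score(chunk: &Chunk, keywords: &[String]) -> usize {
    let haystack = format!("{} {}", chunk.path, chunk.text).to_lowercase();
    keywords.iter().filter(|k| haystack.contains(k.as_str())).count()
}

/// Pick the best-scoring chunks whose estimated size fits within `budget` tokens.
pub fn select_chunks<'a>(chunks: &'a [Chunk], query: &str, budget: usize) -> Vec<&'a Chunk> {
    let keywords: Vec<String> = query
        .split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|w| w.len() >= 3)
        .map(|w| w.to_lowercase())
        .collect();

    let mut scored: Vec<(usize, &Chunk)> = chunks
        .iter()
        .map(|c| (score(c, &keywords), c))
        .filter(|(s, _)| *s > 0)
        .collect();
    // Highest score first; the stable sort keeps index order for ties.
    scored.sort_by(|a, b| b.0.cmp(&a.0));

    let mut used = 0;
    let mut picked = Vec::new();
    for (_, chunk) in scored {
        let cost = estimate_tokens(&chunk.text);
        if used + cost > budget {
            continue; // this chunk would exceed the cap, try a smaller one
        }
        used += cost;
        picked.push(chunk);
    }
    picked
}
```

With a budget of roughly 2000, a request such as `fix the login token bug` would pull in only the chunks that mention `login` or `token`, regardless of how many files the repo has.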

Project memory grows with the project: the agent maintains
`.lucid/knowledge.md` (dated architecture notes, gotchas, working commands)
and `.lucid/CHANGELOG.md` (what changed, why, files touched), and reads them
back in future sessions.

## ERNIE / text-format tool-call compatibility

Some models emit tool calls as text instead of structured JSON — Baidu
ERNIE's `ernie_x1` format (`<tool_call>{…}</tool_call>` + `<think>` blocks)
and the Qwen XML style (`<function=…><parameter=…>`). Lucid Train detects and
parses both transparently, so ERNIE-4.5-Thinking and quirky Qwen serving
stacks can drive the harness like any other model.
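
To make the text-format case concrete, the sketch below pulls the raw payload out of `<tool_call>…</tool_call>` blocks after dropping `<think>` spans. It is a simplified illustration under stated assumptions: the function names are hypothetical, the payload is returned as an undecoded string (a JSON library would be needed to parse it), and the Qwen XML style is not handled here.

```rust
/// Collect the trimmed text between each `open`/`close` pair, in order.
fn between<'a>(text: &'a str, open: &str, close: &str) -> Vec<&'a str> {
    let mut out = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(open) {
        let after_open = &rest[start + open.len()..];
        match after_open.find(close) {
            Some(end) => {
                out.push(after_open[..end].trim());
                rest = &after_open[end + close.len()..];
            }
            None => break, // unterminated block: ignore it
        }
    }
    out
}

/// Remove `<think>…</think>` spans so reasoning text is never parsed as a call.
pub fn strip_think(text: &str) -> String {
    let mut out = String::new();
    let mut rest = text;
    while let Some(start) = rest.find("<think>") {
        out.push_str(&rest[..start]);
        match rest[start..].find("</think>") {
            Some(end) => rest = &rest[start + end + "</think>".len()..],
            None => return out, // unterminated reasoning: drop the tail
        }
    }
    out.push_str(rest);
    out
}

/// Raw payload of every tool call in the visible (non-reasoning) text.
pub fn extract_tool_calls(text: &str) -> Vec<String> {
    let visible = strip_think(text);
    between(&visible, "<tool_call>", "</tool_call>")
        .into_iter()
        .map(str::to_owned)
        .collect()
}
```

Stripping the reasoning first is a design choice in this sketch: a model that drafts a call inside `<think>` should not trigger an execution.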

## Why it's good at terminal tasks

The harness is intentionally minimal — one shell tool — but wraps it in the components that won on Terminal-Bench:

- **Evolved system prompt** — contract-first, mirror-the-evaluator, minimal diffs, candidate scorecards, explicit time budgets, semantic-checks-then-stop.
- **Execution risk hints** — middleware watches every command and injects targeted nudges (repeated error loops, shallow validation, localhost-only checks, thin benchmark margins, missing-dependency loops, time budget burn).
- **Publish-state guard** — once a final/acceptance-style check passes, destructive commands against the verified artifacts are blocked (override requires an explicit token + re-verification).
- **Context compaction** — at 75% context usage the middle of the conversation is summarized into a continuation brief; emergency compaction elides old tool outputs on overflow.
- **Round & token reminders** — the model always knows its iteration and context budget.
- **Background tasks** — long-running servers/builds run detached with log capture (`manage_background_task`).
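
The context compaction step in the list above can be sketched as follows. This is an illustration only: `should_compact` and `compact` are hypothetical names, the messages are plain strings, and the summary text is passed in rather than generated by a model. The 75% threshold is the one stated above; how many messages to keep at each end is an assumption.

```rust
/// True once usage reaches 75% of the context window (integer arithmetic).
pub fn should_compact(used_tokens: usize, context_window: usize) -> bool {
    used_tokens * 4 >= context_window * 3
}

/// Replace the middle of the conversation with a single continuation brief,
/// keeping the first `keep_head` and last `keep_tail` messages untouched.
pub fn compact(
    messages: &[String],
    keep_head: usize,
    keep_tail: usize,
    brief: &str,
) -> Vec<String> {
    if messages.len() <= keep_head + keep_tail {
        return messages.to_vec(); // nothing in the middle to summarize
    }
    let mut out = Vec::with_capacity(keep_head + keep_tail + 1);
    out.extend_from_slice(&messages[..keep_head]);
    out.push(format!("[continuation brief] {brief}"));
    out.extend_from_slice(&messages[messages.len() - keep_tail..]);
    out
}
```

Keeping the head preserves the system prompt and task statement, while keeping the tail preserves the most recent tool results the model is still acting on.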

## Zero-setup onboarding

`lucid-train` opens with **no API key and no model**. The setup screen detects
your hardware (RAM, Apple Silicon / NVIDIA GPU) and shows which models fit:

- **Nothing to install separately** — `/download` bootstraps Ollama itself
  (Homebrew on macOS, install script on Linux, winget on Windows), starts the
  server, and then pulls your model, all inside the TUI
- **✓ fits / ◐ fits-but-low-free-RAM / ✗ too big** for every local model
  (Qwen3, Qwen3-Coder, Qwen2.5-Coder, DeepSeek-R1, OpenAI gpt-oss, Baidu
  ERNIE 4.5 Thinking, Google Gemma 3, Mistral/Devstral) — checked against both
  total and *currently free* memory
- **Research-backed recommendation** per machine: `gpt-oss:20b` (best tool
  calling that fits 16 GB), `qwen3-coder:30b` (best agentic coder, 32 GB+),
  `qwen3:4b` for small machines; ERNIE-4.5-21B-A3B-Thinking available as a
  thinking specialist (`hf.co/unsloth/...GGUF` tag)
- `/download qwen3:4b` pulls with live progress; `/delete <tag>` frees the disk again
- `/apikey <key>` saves an OpenRouter key for cloud models
- `/login <email>` creates a **Lucid Cloud** account for big/proprietary
  models (Claude, Grok, GPT-5.x, Kimi K2.5…) with pay-as-you-go credits

If you ask for a model that's too big for your RAM — or a proprietary one —
lucid-train tells you and offers `/login` or `/apikey` instead of failing.
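
The three fit markers can be read as a comparison against total and currently free memory. The sketch below is one plausible reading, not the project's actual rule: the `Fit` enum, the `classify` function, the MiB units, and the plain greater-than comparisons (with no safety margin) are all assumptions.

```rust
/// How a local model relates to this machine's memory (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fit {
    Fits,
    LowFreeRam,
    TooBig,
}

impl Fit {
    /// Marker shown next to the model in the setup screen.
    pub fn marker(self) -> char {
        match self {
            Fit::Fits => '✓',
            Fit::LowFreeRam => '◐',
            Fit::TooBig => '✗',
        }
    }
}

/// Compare a model's memory requirement with total and currently free memory.
pub fn classify(required_mib: u64, total_mib: u64, free_mib: u64) -> Fit {
    if required_mib > total_mib {
        Fit::TooBig // can never load on this machine
    } else if required_mib > free_mib {
        Fit::LowFreeRam // would fit after closing other apps
    } else {
        Fit::Fits
    }
}
```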

## Models: open source first

Works with **any OpenAI-compatible endpoint**:

| Where | How |
|---|---|
| **Local Ollama** | `/download qwen3:4b` then `/model qwen3:4b` — fully on-device, free |
| **OpenRouter** | `/apikey sk-or-…` — Kimi K2.5, Qwen3 Coder, DeepSeek, GLM, MiniMax, plus GPT/Claude/Gemini/Grok |
| **Lucid Cloud** | `/login you@email.com` — hosted proxy with Stripe credits (see `backend/`) |
| **Self-hosted** (vLLM, llama.cpp, SGLang, TGI) | `lucid-train --base-url http://host:8000/v1 -m served-model-name` |

```bash
lucid-train models                 # list presets
lucid-train -m kimi-k2.5           # OSS frontier via OpenRouter
lucid-train -p ollama -m qwen3:4b  # fully local
lucid-train -p lucid -m claude-sonnet   # via Lucid Cloud login
```

Failover: set `LLM_FALLBACK_MODELS=qwen/qwen3-coder,deepseek/deepseek-chat-v3.1` and the client walks the list on provider errors.
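
A minimal sketch of what walking the list could look like. The function names are hypothetical, the request itself is abstracted into a closure, and every `Err` is treated as a provider error, which is a simplification; the real client may distinguish error kinds.

```rust
/// Split a comma-separated `LLM_FALLBACK_MODELS` value into model names.
pub fn parse_fallbacks(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Try the primary model, then each fallback in order, until one succeeds.
/// Returns the model that answered together with its result, or the last error.
pub fn complete_with_failover<T, E>(
    primary: &str,
    fallbacks: &[String],
    mut call: impl FnMut(&str) -> Result<T, E>,
) -> Result<(String, T), E> {
    let mut result = call(primary).map(|value| (primary.to_owned(), value));
    for model in fallbacks {
        if result.is_ok() {
            break;
        }
        result = call(model.as_str()).map(|value| (model.clone(), value));
    }
    result
}
```

Returning the name of the model that actually answered lets the caller show which fallback was used.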

## Lucid Cloud credits (backend/)

The Go backend in [`backend/`](backend/README.md) powers `/login`:
email-token auth, an OpenAI-compatible SSE proxy over OpenRouter with
per-request billing (provider cost + **5% margin**), and Stripe top-ups where
a $N purchase charges `N × 1.05 × 1.18` (**18% India GST** on credits+margin).
Without `STRIPE_SECRET_KEY` it runs in dev mode, where top-ups are credited instantly.

```bash
cd backend && ./run.sh    # listens on :8787 (set OPENROUTER_API_KEY in .env)
```
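
The billing arithmetic described above works out as follows. This sketch is in Rust for consistency with the rest of this README, although the backend itself is Go; the function names, the integer units (cents and micro-dollars), and the round-half-up policy are assumptions. Only the 5% margin and the `N × 1.05 × 1.18` formula come from the description.

```rust
/// Amount charged for a top-up of `credits_cents` worth of credits:
/// credits × 1.05 (margin) × 1.18 (GST), rounded half up to the nearest cent.
pub fn charge_cents(credits_cents: u64) -> u64 {
    // 1.05 × 1.18 = 1.239, expressed as 12_390 / 10_000 to stay in integers.
    (credits_cents * 105 * 118 + 5_000) / 10_000
}

/// Amount debited for one request: provider cost plus the 5% margin,
/// in micro-dollars, rounded half up.
pub fn request_debit_micros(provider_cost_micros: u64) -> u64 {
    (provider_cost_micros * 105 + 50) / 100
}
```

For example, a $10.00 credit purchase comes to 10 × 1.239 = $12.39, and a request that costs the provider $0.002 is debited as $0.0021.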

## Install & run

```bash
cargo build --release
./target/release/lucid-train            # interactive TUI
./target/release/lucid-train exec "fix the failing tests"   # headless (CI / benchmarks)
```

Config file (optional): `~/.lucid-train/config.toml`

```toml
model = "kimi-k2.5"
approvals = "auto"
fallback_models = ["qwen/qwen3-coder"]
```

## Modes

- `/agent` *(default)* — full tools; edits files and runs commands
- `/plan` — propose-only: no tools run, you get a reviewed implementation plan
- `/research` — web_search + read_url only: cited answers, no shell

Keys: **Esc** interrupts a running turn (or clears input); **Esc twice** quits.

## Memory-friendly local inference

lucid-train starts Ollama tuned so other apps stay responsive during inference:
flash attention + `q8_0` KV cache (≈half the KV memory at 16k context), one
model / one request at a time, and a 4-minute keep-alive so the model unloads
and **returns its RAM once it has sat idle**. Tune via the standard `OLLAMA_*`
environment variables if you start the server yourself.
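
If you do start the server yourself, settings along these lines should approximate the tuning described above. The variable names are standard Ollama server options, but the exact set and values lucid-train applies are an assumption here, and the helper function names are hypothetical.

```rust
use std::process::Command;

/// Environment approximating the tuning described above (assumed values).
pub fn ollama_env() -> Vec<(&'static str, &'static str)> {
    vec![
        ("OLLAMA_FLASH_ATTENTION", "1"),   // needed for a quantized KV cache
        ("OLLAMA_KV_CACHE_TYPE", "q8_0"),  // 8-bit KV cache instead of f16
        ("OLLAMA_MAX_LOADED_MODELS", "1"), // one model in memory at a time
        ("OLLAMA_NUM_PARALLEL", "1"),      // one request at a time
        ("OLLAMA_KEEP_ALIVE", "4m"),       // unload after four idle minutes
    ]
}

/// Build (but do not spawn) an `ollama serve` command with that environment.
pub fn ollama_serve_command() -> Command {
    let mut cmd = Command::new("ollama");
    cmd.arg("serve").envs(ollama_env());
    cmd
}
```

Calling `.spawn()` on the returned command would start the server; it is left to the caller so the sketch has no side effects.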

## Security & approvals

Every command is risk-assessed before it runs:

- **safe** — read/build/test commands
- **caution** — package installs, git push, network mutations
- **dangerous** — recursive deletes, `sudo`, `curl | sh`, hard resets
- **blocked** — `rm -rf /`, fork bombs, `mkfs`, raw device writes — refused in *every* mode

Approval policies (`/approvals` or `-a`):

- `plan` — propose only; every command needs your explicit `y`
- `auto` *(default)* — safe commands run instantly; caution/dangerous prompt (`y` once / `a` for session / `n` deny)
- `full-auto` — everything auto-approved except the blocked tier
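
The interaction between risk tiers and approval policies can be summarized as a small decision table. This is a sketch of the rules as listed above, with hypothetical type and function names; it leaves out how commands are classified and the session-wide `a` approval.

```rust
/// Risk tier assigned to a command before it runs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Risk {
    Safe,
    Caution,
    Dangerous,
    Blocked,
}

/// Approval policy selected with `/approvals` or `-a`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    Plan,
    Auto,
    FullAuto,
}

/// What the harness does with the command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Run,
    Prompt,
    Refuse,
}

pub fn decide(policy: Policy, risk: Risk) -> Decision {
    match (policy, risk) {
        // The blocked tier is refused in every mode.
        (_, Risk::Blocked) => Decision::Refuse,
        // full-auto approves everything else.
        (Policy::FullAuto, _) => Decision::Run,
        // plan asks for an explicit `y` on every command.
        (Policy::Plan, _) => Decision::Prompt,
        // auto runs safe commands and prompts for the rest.
        (Policy::Auto, Risk::Safe) => Decision::Run,
        (Policy::Auto, _) => Decision::Prompt,
    }
}
```

Matching the blocked tier first means no policy added later can accidentally approve it.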

## TUI

- Streaming responses with lightweight markdown, dim italic reasoning stream
- `• Ran command` / `└ output` exec cells with duration and exit status
- Slash commands: `/model` `/models` `/provider` `/approvals` `/status` `/compact` `/new` `/help` `/quit`
- `Esc` interrupt • `Ctrl+J` newline • PageUp/mouse-wheel scroll • context % in the footer

## Environment variables

| Var | Purpose |
|---|---|
| `OPENROUTER_API_KEY` | default provider key |
| `LLM_API_KEY` / `LLM_BASE_URL` / `LLM_MODEL` | pin any OpenAI-compatible endpoint |
| `LLM_FALLBACK_MODELS` | comma-separated failover list |
| `LUCID_TRAIN_HOME` | config dir (default `~/.lucid-train`) |

## Tests

```bash
cargo test
```

19 unit tests (security tiers, publish guard, middleware hints, shell timeout/truncation, background tasks) plus 2 end-to-end tests that run the real binary against a mock OpenAI-compatible SSE server. The end-to-end tests verify the full model → tool-call → shell → result → answer loop and confirm that catastrophic commands are blocked.

## Credits

- Harness design: [Agentic Harness Engineering](https://github.com/china-qijizhifeng/agentic-harness-engineering) (AHE) — evolved system prompt, risk hints, publish guard, compaction strategy.
- UI design: [OpenAI Codex CLI](https://github.com/openai/codex).

