Metadata-Version: 2.4
Name: cass-ai
Version: 0.1.3
Summary: CASS — A CLI coding assistant powered by LLMs. Works with OpenAI, Ollama, and OpenRouter.
Project-URL: Homepage, https://github.com/C5m7b4/coding_assistant_v2
Project-URL: Repository, https://github.com/C5m7b4/coding_assistant_v2
Project-URL: Issues, https://github.com/C5m7b4/coding_assistant_v2/issues
Author: C5m7b4
License: MIT License
        
        Copyright (c) 2026 C5m7b4
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: ai,cli,coding-assistant,llm,mcp,ollama,openai,tools
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.12
Requires-Dist: httpx>=0.28.0
Requires-Dist: mcp>=1.26.0
Requires-Dist: openai>=1.86.0
Requires-Dist: pillow>=12.1.1
Requires-Dist: prompt-toolkit>=3.0.52
Requires-Dist: python-dotenv>=1.1.0
Requires-Dist: rich>=14.0.0
Description-Content-Type: text/markdown

# CASS — Coding ASSistant

A CLI coding assistant powered by LLMs. Works with OpenAI, Ollama, and OpenRouter.

```
 ██████╗ █████╗ ███████╗███████╗
██╔════╝██╔══██╗██╔════╝██╔════╝
██║     ███████║███████╗███████╗
██║     ██╔══██║╚════██║╚════██║
╚██████╗██║  ██║███████║███████║
 ╚═════╝╚═╝  ╚═╝╚══════╝╚══════╝
```

## Features

- **8 built-in tools** — read, write, edit files, grep search, list directories, run shell commands, manage todos, spawn sub-agents
- **Diff & undo** — `/diff` shows all file changes this session, `/undo` reverts the last change
- **MCP support** — connect to any MCP server for external tools (web fetch, GitHub, databases, etc.)
- **Sub-agents** — spawn focused LLM instances for research or independent tasks, optionally with a different model
- **Tool approval system** — every tool call requires user confirmation before executing
- **Plan mode** — propose changes without executing, review the plan, then execute with one keystroke
- **Todos/planning** — track multi-step tasks within and across sessions; the LLM manages them as it works
- **Hooks** — auto-run formatters and linters after file changes (ruff, prettier, etc.)
- **Vision support** — paste screenshots with Alt+V, analyzed by a dedicated vision model
- **Context compression** — automatic conversation summarization when approaching token limits
- **Task scheduler** — run recurring or one-shot tasks in the background (shell commands or LLM prompts)
- **Custom commands** — create reusable shortcuts for common workflows
- **Conversation save/load** — persist and resume sessions
- **Multi-line input** — Enter submits, backslash continues, Alt+Enter for newlines
- **Input history & autocomplete** — arrow keys for history, Tab for slash command completion
- **Ollama model browser** — list local models filtered by tool support, switch mid-session

## Quick Start

### Prerequisites

- Python 3.12+
- [uv](https://docs.astral.sh/uv/) package manager

### Install

```bash
git clone <repo-url>
cd coding_assistant_v2
uv sync
```

### Configure

Create a `.env` file in the project root:

**For Ollama (local):**
```
OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
MODEL_NAME=qwen2.5-coder
VISION_MODEL=llava:latest
```

**For OpenAI:**
```
OPENAI_API_KEY=sk-your-key-here
MODEL_NAME=gpt-4o-mini
```

**For OpenRouter:**
```
OPENAI_API_KEY=sk-or-your-key-here
OPENAI_BASE_URL=https://openrouter.ai/api/v1
MODEL_NAME=openai/gpt-4o-mini
```

### Run

```bash
uv run cass
```

## Usage

### Slash Commands

| Command | Description |
|---------|-------------|
| `/help` | List all commands |
| `/plan` | Switch to plan mode (read-only, proposes changes) |
| `/active` | Switch to active mode (all tools enabled) |
| `/models` | List local Ollama models filtered by tool support |
| `/model [name\|number]` | Show or switch model |
| `/image <path>` | Attach an image to your next message |
| `/mcp` | Manage MCP servers (connect, disconnect, load) |
| `/agent` | Spawn a sub-agent (research or task mode) |
| `/todos` | Manage todos (add, start, done, rm, save, load) |
| `/command` | Manage custom commands (add, rm) |
| `/schedule` | Schedule a background task |
| `/tasks` | List scheduled tasks (rm, clear) |
| `/tokens` | Show token usage stats |
| `/compress` | Force conversation compression |
| `/save [name]` | Save conversation |
| `/load <name\|number>` | Load a saved conversation |
| `/conversations` | List saved conversations |
| `/diff` | Show diff of file changes this session (`/diff 3` for last 3) |
| `/undo` | Undo the last file change |
| `/hooks` | View hooks config (`/hooks init` for defaults) |
| `/clear` | Reset conversation history |
| `/history` | Show message counts |
| `/exit` | Quit |

### Keyboard Shortcuts

| Key | Action |
|-----|--------|
| Enter | Submit input |
| `\` + Enter | Continue to next line |
| Alt+Enter | Add a newline |
| Alt+V | Paste image from clipboard |
| Up/Down | Cycle input history |
| Tab | Autocomplete slash commands |
| Ctrl+C | Cancel current input |
| Ctrl+D | Exit |

### Plan Mode

Plan mode lets you discuss changes before making them:

1. Type `/plan` to enter plan mode (prompt turns yellow)
2. Describe what you want — the LLM reads files and proposes changes
3. When the plan is ready, you're prompted: **"Execute this plan? (Y/n)"**
4. Press Enter to auto-switch to active mode and execute

### Todos / Planning

Track multi-step tasks within a session. The LLM can manage todos automatically as it works (creating steps, marking them in progress, checking them off).

```bash
# User commands
/todos                    # List all todos
/todos add Set up database   # Add a todo
/todos start abc123       # Mark as in progress
/todos done abc123        # Mark as complete
/todos rm abc123          # Remove a todo
/todos clear              # Remove completed items
/todos save               # Save to .cass/todos.json
/todos load               # Load from file
```

The LLM also has a `todos` tool and will use it to track its progress on multi-step tasks:
```
> refactor the auth module into separate files

The LLM will:
1. Create todos for each step
2. Mark each as in_progress as it works
3. Check them off when done
```
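The todo lifecycle above (pending → in progress → done, with `clear` pruning completed items) can be sketched as a small in-memory store. This is an illustrative sketch, not CASS's actual implementation; the IDs and field names are assumptions.

```python
import itertools


class TodoStore:
    """Minimal todo store mirroring the /todos lifecycle (hypothetical)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)  # sequential IDs for illustration only
        self.items: dict[str, dict[str, str]] = {}

    def add(self, text: str) -> str:
        tid = str(next(self._ids))
        self.items[tid] = {"text": text, "status": "pending"}
        return tid

    def start(self, tid: str) -> None:
        self.items[tid]["status"] = "in_progress"

    def done(self, tid: str) -> None:
        self.items[tid]["status"] = "done"

    def clear(self) -> None:
        # /todos clear removes only completed items
        self.items = {t: v for t, v in self.items.items()
                      if v["status"] != "done"}
```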

### Sub-Agents

Spawn independent LLM instances for focused work. Sub-agents run in their own conversation context and return results to the main session.

**Two modes:**
- **Research** (default) — read-only tools (read_file, list_dir, grep). Safe for exploration.
- **Task** — full tool access (write, edit, shell). Can make changes independently.

```bash
# Via slash command
/agent find all error handling patterns in the codebase
/agent task refactor the config module to use pydantic
/agent research how does the streaming work --model llama3

# The LLM can also spawn agents on its own via the sub_agent tool
# when it decides a task needs focused work
```

**Use cases:**
- Ask the LLM to "investigate how auth works" — it spawns a research agent to read files and search code, then uses the findings to answer
- Tell it to "refactor module X" — it can spawn a task agent to do the work independently
- Use a smaller/faster model for simple exploration: `/agent find all TODO comments --model llama3.2:1b`

Sub-agent results are automatically added to the main conversation context so the LLM can reference them.
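The research/task split above amounts to a tool-access table. A sketch of that gating (tool names beyond those listed are assumptions, and the real logic may differ):

```python
# Read-only tools available in research mode, per the docs above
READ_ONLY_TOOLS = {"read_file", "list_dir", "grep"}
# Task mode adds write/edit/shell (exact tool names are assumptions)
ALL_TOOLS = READ_ONLY_TOOLS | {"write_file", "edit_file", "shell"}


def tools_for_mode(mode: str) -> set[str]:
    """Return the tool set a sub-agent of the given mode may use."""
    if mode == "research":
        return READ_ONLY_TOOLS
    if mode == "task":
        return ALL_TOOLS
    raise ValueError(f"unknown agent mode: {mode!r}")
```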

### Diff & Undo

Track and revert file changes made during a session:

```bash
/diff           # Show unified diff of all changes this session
/diff 3         # Show only the last 3 changes
/undo           # Revert the last file change (can undo multiple times)
```

- **Modified files**: `/undo` restores the previous content
- **New files**: `/undo` deletes them
- Diff output is syntax-highlighted
- Summary shows change count: "4 changes across 2 file(s) (1 created, 3 modified)"

### MCP Servers

Connect to external [MCP (Model Context Protocol)](https://modelcontextprotocol.io/) servers to extend CASS with additional tools — web fetching, GitHub, databases, and more.

```bash
# Create default config with example servers
/mcp init

# Connect servers from .cass/mcp.json
/mcp load

# Connect ad-hoc (no config file needed)
/mcp connect web npx -y @anthropic-ai/mcp-server-fetch
/mcp connect github npx -y @modelcontextprotocol/server-github

# List connected servers and their tools
/mcp

# Disconnect
/mcp disconnect web
```

**Configuration** (`.cass/mcp.json`):
```json
{
    "servers": {
        "fetch": {
            "command": "npx",
            "args": ["-y", "@anthropic-ai/mcp-server-fetch"],
            "env": {}
        },
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": "your-token"}
        }
    }
}
```

Servers configured in `.cass/mcp.json` auto-connect at startup. MCP tools appear alongside built-in tools with an `mcp_<server>_` prefix and go through the same approval system.

### Custom Commands

Create reusable shortcuts for common workflows:

```bash
# Create commands
/command add /walk schedule every 1h prompt remind me to get up and walk
/command add /test schedule every 5m shell uv run pytest
/command add /fmt shell ruff format src/
/command add /hello prompt say hello in a creative way

# Use them
/walk              # Starts the hourly walk reminder
/test              # Starts continuous testing
/fmt               # Formats all Python files
/hello             # Gets a creative greeting

# Manage
/command           # List all custom commands
/command rm /walk  # Remove a command
```

Commands are saved in `.cass/commands.json` and persist across sessions. Three action types:
- **schedule** — runs `/schedule` with the given args
- **shell** — runs a shell command directly
- **prompt** — sends text to the LLM
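A `/command add` line like the examples above splits into a command name, an action type, and the remaining text. A minimal sketch of that parsing (function and field names are hypothetical, not CASS's actual internals):

```python
VALID_ACTIONS = {"schedule", "shell", "prompt"}


def parse_command_add(spec: str) -> dict[str, str]:
    """Parse the part after '/command add', e.g. '/fmt shell ruff format src/'."""
    name, action, *rest = spec.split()
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action type: {action!r}")
    return {"name": name, "action": action, "args": " ".join(rest)}
```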

### Hooks

Auto-run formatters and linters after file writes/edits:

```bash
# Create default hooks (ruff format + lint for Python files)
/hooks init
```

This creates `.cass/hooks.json`:
```json
{
    "after_tool": {
        "write_file": [
            {"command": "ruff format {path}", "name": "format", "glob": "*.py"},
            {"command": "ruff check --fix {path}", "name": "lint", "glob": "*.py"}
        ],
        "edit_file": [
            {"command": "ruff format {path}", "name": "format", "glob": "*.py"},
            {"command": "ruff check --fix {path}", "name": "lint", "glob": "*.py"}
        ]
    }
}
```

Hook failures are fed back to the LLM so it can automatically fix issues.
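Given the config above, hook selection amounts to matching the tool name, filtering by glob, and substituting `{path}`. A rough sketch (the helper name is hypothetical; the real runner also executes the commands and captures failures):

```python
import fnmatch
from pathlib import PurePath


def hooks_for(config: dict, tool: str, path: str) -> list[str]:
    """Return the shell commands to run after `tool` touched `path`."""
    commands = []
    for hook in config.get("after_tool", {}).get(tool, []):
        # Match the glob against the file name, as in the hooks.json above
        if fnmatch.fnmatch(PurePath(path).name, hook.get("glob", "*")):
            commands.append(hook["command"].format(path=path))
    return commands
```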

### Task Scheduler

Run background tasks on an interval or after a delay:

```bash
# Recurring tasks
/schedule every 5m shell uv run pytest
/schedule every 1h prompt summarize my recent git changes
/schedule every 30 seconds shell echo ping

# One-shot delayed tasks
/schedule in 10m shell uv run pytest
/schedule in 30s prompt remind me to commit

# Manage tasks
/tasks              # List all with status
/tasks rm <id>      # Remove one
/tasks clear        # Remove all
```

Intervals support compact (`5m`, `30s`, `1h`) or word (`5 minutes`, `30 seconds`, `1 hour`) formats.
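Both formats reduce to a number and a unit. A minimal sketch of such a parser, assuming seconds/minutes/hours are the supported units (the real parser may accept more):

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600}
_WORDS = {"second": "s", "seconds": "s", "minute": "m",
          "minutes": "m", "hour": "h", "hours": "h"}


def parse_interval(text: str) -> int:
    """Return the interval in seconds, e.g. '5m' or '5 minutes' -> 300."""
    m = re.fullmatch(r"(\d+)\s*([a-z]+)", text.strip().lower())
    if not m:
        raise ValueError(f"unrecognized interval: {text!r}")
    value, unit = int(m.group(1)), m.group(2)
    unit = _WORDS.get(unit, unit)  # normalize word forms to compact
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in interval: {text!r}")
    return value * _UNITS[unit]
```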

**Use cases:**
- `/schedule every 5m shell uv run pytest` — continuous test runner
- `/schedule every 10m shell git diff --stat` — watch uncommitted changes
- `/schedule every 1h prompt look at my git log and summarize what I've accomplished`
- `/schedule in 30m prompt remind me to commit and push`

### Vision Support

Attach images for analysis (requires a vision-capable model):

```bash
# Set a vision model in .env
VISION_MODEL=llava:latest

# Then in CASS:
# 1. Press Alt+V to paste from clipboard, or:
/image screenshot.png

# 2. Type your question and press Enter
what's wrong with this code?
```

When `VISION_MODEL` is set, images are routed to the vision model for analysis, and the description is passed to your main coding model. This means your coding model doesn't need vision support.

### Project Context (CASS.md)

Create a `CASS.md` file in your project root to give the LLM project-specific context:

```markdown
# My Project
This is a Django REST API. Uses PostgreSQL, Celery for async tasks.
Key files: src/api/views.py, src/models.py
```

The file is loaded into the system prompt automatically at startup. CASS warns when it exceeds 12KB and truncates it at 16KB.

### Context Compression

Long sessions with lots of tool calls can approach the context window limit. CASS handles this automatically:

- After each response, CASS checks whether the context exceeds 70% of the window (128K tokens)
- Older messages are summarized using the LLM, keeping the 6 most recent messages intact
- Can also be triggered manually with `/compress`
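The trigger described above can be sketched as follows. The token estimator here is a crude word-count stand-in (CASS presumably uses a real tokenizer), and in CASS the summary is written by the LLM rather than being a placeholder string:

```python
CONTEXT_WINDOW = 128_000  # tokens
THRESHOLD = 0.7           # compress past 70% of the window
KEEP_RECENT = 6           # most recent messages are kept intact


def estimate_tokens(messages: list[str]) -> int:
    # Rough heuristic: ~1 token per word; real code would use a tokenizer
    return sum(len(m.split()) for m in messages)


def maybe_compress(messages: list[str]) -> list[str]:
    """Replace older messages with a summary once past the threshold."""
    if estimate_tokens(messages) <= CONTEXT_WINDOW * THRESHOLD:
        return messages
    old, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    # In CASS the LLM writes this summary; here we just mark the placeholder
    summary = f"[summary of {len(old)} earlier messages]"
    return [summary, *recent]
```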

### Conversation Save/Load

```bash
/save my-session          # Save with a name
/save                     # Save with auto-generated timestamp name
/conversations            # List all saved conversations
/load my-session          # Load by name
/load 1                   # Load by number from list
```

Conversations are stored in `.cass/conversations/` (project-local, gitignored).

## Development

```bash
# Run tests (133 tests)
uv run pytest tests/ -v

# Run the CLI
uv run cass
```

VS Code launch configs included for debugging (Run CASS, Run Tests, Run Tests current file).

## Tech Stack

- **Python 3.12+** with async/await
- **openai** — AsyncOpenAI client (works with any OpenAI-compatible API)
- **rich** — Terminal UI, markdown rendering, streaming display
- **prompt_toolkit** — Multi-line input, history, autocomplete
- **Pillow** — Clipboard image capture for vision support
- **ruff** — Default formatter/linter for hooks (dev dependency)

## Project-local Data (.cass/)

All automatically gitignored:
- `.cass/conversations/` — saved conversation sessions
- `.cass/hooks.json` — hook configuration
- `.cass/commands.json` — custom command definitions
- `.cass/todos.json` — persisted todos
- `.cass/history` — input history
