____ ____ _
| _ \ _ __ _ _| _ \ ___ ___| | __
| | | | '__| | | | | | |/ _ \ / __| |/ /
| |_| | | | |_| | |_| | (_) | (__| <
|____/|_| \__, |____/ \___/ \___|_|\_\
|___/
Local-first CLI coding agent. Chart your course. Execute with precision.
DryDock is a TUI coding assistant designed to work with local LLMs. It gives you a conversational interface to your codebase — explore, modify, build, and test projects through natural language and a focused set of tools.
No data leaves your machine. No API keys. No per-token billing. Just your laptop, your model, and your code.
From PyPI:
# Python 3.12+ pip install drydock-cli drydock
From Docker Hub:
docker run -it --rm \ -e LLAMACPP_URL=http://host.docker.internal:8001/v1 \ -e LLAMACPP_MODEL=gemma4 \ --add-host=host.docker.internal:host-gateway \ -v "$HOME/.drydock:/root/.drydock" \ -v "$(pwd):/work" \ fbobe3/drydock:latest
From source:
git clone https://github.com/fbobe321/drydock.git cd drydock && pip install -e .
DryDock is tested and optimized for Gemma 4 26B-A4B served by llama.cpp with --jinja (the chat-template fix that prevents tool-call loops). Other OpenAI-compatible providers (Ollama, LM Studio, Mistral, OpenAI, Anthropic) work but are not as thoroughly tested.
# 1. Download Unsloth's GGUF huggingface-cli download unsloth/gemma-4-26B-A4B-it-GGUF \ --include "gemma-4-26B-A4B-it-UD-Q3_K_M.gguf" \ --local-dir /path/to/models # 2. Start llama-server with --jinja ./build/bin/llama-server \ -m /path/to/models/gemma-4-26B-A4B-it-UD-Q3_K_M.gguf \ --host 0.0.0.0 --port 8000 \ -ngl 99 -c 32768 -np 1 \ --jinja -ctk q8_0 -ctv q8_0 \ --alias gemma4
See DEPLOYMENT.md for the full ~/.drydock/config.toml, env vars, and troubleshooting.
Textual-powered terminal UI. Slash commands, plan/edit modes, session history, undo/back/goal.
read_file, write_file, search_replace, bash, grep, glob, mechanical_rename, retrieve, and more.
OpenAI-compatible endpoint. Works with llama.cpp, Ollama, LM Studio, vLLM. No cloud required.
AST symbol indexer + TF-IDF retriever. Auto-prefetches relevant code on every turn.
Per-turn thinking budget: HIGH for planning, OFF for routine writes, LOW for recovery.
Optional mmproj-F16.gguf projector enables OpenAI-style image inputs with Gemma 4.
| GPU | 2× NVIDIA RTX 4060 Ti 16GB (32GB total VRAM) |
|---|---|
| RAM | 64GB recommended, 32GB minimum |
| Model | Gemma 4 26B-A4B (26B MoE, 4B active params per token) |
| Performance | ~15–17 tok/s decode (llama.cpp Q3_K_M) |
| OS | Ubuntu 22.04 / 24.04, kernel 6.8+ |
| Min single-GPU | 24GB VRAM card with reduced context |
GitHub repository · PyPI package · Docker Hub image · Issues · Changelog