Metadata-Version: 2.4
Name: subagent-fleet
Version: 0.0.1
Summary: Run Claude Code-style subagents across your local model fleet.
Author: Aditya Karnam
License-Expression: MIT
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27
Requires-Dist: jinja2>=3.1
Requires-Dist: pydantic>=2.7
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.7
Requires-Dist: typer>=0.12
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Description-Content-Type: text/markdown

<div align="center">

# subagent-fleet

**Run Claude Code-style subagents across your local model fleet.**

`subagent-fleet` is a config-first Python CLI for mapping coding subagents to the best Ollama model and machine you own, then generating LiteLLM and Claude Code-style agent configuration.

[![GitHub Repo stars](https://img.shields.io/github/stars/adityak74/subagent-fleet?style=social)](https://github.com/adityak74/subagent-fleet/stargazers)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
![Python](https://img.shields.io/badge/python-3.10%2B-blue)
![CLI](https://img.shields.io/badge/interface-CLI-4B5563)
![Ollama](https://img.shields.io/badge/ollama-compatible-111827)
![LiteLLM](https://img.shields.io/badge/litellm-ready-2563EB)
![GitHub last commit](https://img.shields.io/github/last-commit/adityak74/subagent-fleet)
![GitHub issues](https://img.shields.io/github/issues/adityak74/subagent-fleet)

[Quickstart](#quickstart) • [Configuration](#configuration) • [Generated Files](#generated-files) • [Security](#security) • [Roadmap](#roadmap)

</div>

## Overview

Local model users often have more than one useful machine: a laptop, a Mac mini, a workstation, a home server, or a spare GPU box. Most coding harnesses still point at one model endpoint.

`subagent-fleet` turns that setup into a private local subagent fleet:

```text
planner     -> small fast model on a lightweight node
implementer -> larger coding model on a bigger node
reviewer    -> larger coding model on a bigger node
summarizer  -> small local model on the controller
```

It does not replace Ollama, LiteLLM, or Claude Code. It generates the glue between them:

```text
Claude Code / coding harness
        |
        v
LiteLLM gateway generated by subagent-fleet
        |
        +-- Ollama node: laptop
        +-- Ollama node: Mac mini 64GB
        +-- Ollama node: workstation
```

## Features

- Validate a declarative `fleet.yaml`.
- Discover models from configured Ollama nodes via `/api/tags`.
- Generate `litellm_config.yaml` with `ollama_chat/` routes.
- Generate Claude Code-style `.claude/agents/*.md` files.
- Generate `.env.subagent-fleet` for Claude Code/LiteLLM environment variables.
- Warm configured Ollama models with `keep_alive`.
- Show node health and agent routing tables.
- Keep unreachable nodes isolated so one offline machine does not crash the whole workflow.

## Status

MVP CLI implemented.

Available commands:

```bash
subagent-fleet init
subagent-fleet validate
subagent-fleet discover
subagent-fleet generate
subagent-fleet warmup
subagent-fleet status
subagent-fleet doctor
subagent-fleet clean
subagent-fleet skills list
subagent-fleet skills install
subagent-fleet plugins install
```

## Install

Choose one of the install paths below.

### CLI from PyPI

Install the CLI directly from PyPI:

```bash
python -m pip install subagent-fleet
```

Or install it as an isolated command with `pipx`:

```bash
pipx install subagent-fleet
```

Verify:

```bash
subagent-fleet --help
```

### Development Checkout

Use this when contributing to the project:

```bash
git clone https://github.com/adityak74/subagent-fleet.git
cd subagent-fleet
python -m pip install -e ".[dev]"
```

Run tests:

```bash
python -m pytest
```

### Claude Code Plugin First

Install the plugin first from Claude Code, then let the bundled bootstrap skill install the CLI:

```text
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
```

After install, ask Claude Code:

```text
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
```

The bootstrap skill will run or recommend:

```bash
python -m pip install subagent-fleet
subagent-fleet skills install
```

### Codex Plugin First

From a local checkout of this repository, register it as a Codex marketplace and add the plugin:

```bash
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
```

Then ask Codex:

```text
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
```

## Quickstart

Create a starter config:

```bash
subagent-fleet init
```

Edit `fleet.yaml` with your Ollama node endpoints and model names, then validate it:

```bash
subagent-fleet validate
```

Check which nodes are reachable and which models they serve:

```bash
subagent-fleet discover
```

Generate LiteLLM, Claude agent, and environment files:

```bash
subagent-fleet generate
```

Start LiteLLM:

```bash
export LITELLM_MASTER_KEY="sk-local-dev"

litellm \
  --config ./litellm_config.yaml \
  --host 127.0.0.1 \
  --port 4000
```

Point Claude Code at the local gateway:

```bash
source .env.subagent-fleet
claude
```

## Configuration

`subagent-fleet` is driven by `fleet.yaml`.

```yaml
project:
  name: local-dev
  gateway:
    provider: litellm
    host: 127.0.0.1
    port: 4000
    master_key_env: LITELLM_MASTER_KEY

nodes:
  m5-local:
    endpoint: http://localhost:11434
    tags: [controller, local, fast]

  m4-mini-64gb:
    endpoint: http://192.168.1.50:11434
    tags: [heavy, coder, reviewer]

  m4-mini-16gb:
    endpoint: http://192.168.1.51:11434
    tags: [small, planner, summarizer]

models:
  heavy-coder:
    node: m4-mini-64gb
    ollama_model: qwen2.5-coder:32b
    litellm_alias: claude-sonnet-local
    context: 32768
    timeout: 600
    max_parallel: 1

  small-coder:
    node: m4-mini-16gb
    ollama_model: qwen2.5-coder:7b
    litellm_alias: claude-haiku-local
    context: 8192
    timeout: 300
    max_parallel: 1

agents:
  planner:
    model: small-coder
    description: Use for planning, file discovery, task decomposition, and summarization.
    tools: [Read, Grep, Glob]
    prompt: |
      You are a fast local planning agent.
      Do not edit files.
      Return a concise response with:
      - plan
      - relevant files
      - risks
      - next recommended agent

  implementer:
    model: heavy-coder
    description: Use for implementation, bug fixes, refactors, and patch creation.
    tools: [Read, Grep, Glob, Edit, MultiEdit, Bash]

  reviewer:
    model: heavy-coder
    description: Use after implementation to review diffs, tests, regressions, and maintainability.
    tools: [Read, Grep, Glob, Bash]
```
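
The cross-references in this file (each model names a node, each agent names a model, and each `litellm_alias` should be unique) can be checked with a few lines of Python. The sketch below is illustrative only and is not the validator that `subagent-fleet validate` runs; the function name `check_references` is hypothetical, and it assumes `fleet.yaml` has already been loaded into a plain dictionary.

```python
from typing import Dict, List


def check_references(fleet: Dict[str, dict]) -> List[str]:
    """Return human-readable errors for dangling references and duplicate aliases."""
    errors: List[str] = []
    nodes = fleet.get("nodes", {})
    models = fleet.get("models", {})
    agents = fleet.get("agents", {})

    alias_owner: Dict[str, str] = {}
    for model_name, model in models.items():
        node = model.get("node")
        if node not in nodes:
            errors.append(f"model {model_name!r} references unknown node {node!r}")
        alias = model.get("litellm_alias")
        if alias in alias_owner:
            errors.append(
                f"alias {alias!r} is used by both {alias_owner[alias]!r} and {model_name!r}"
            )
        else:
            alias_owner[alias] = model_name

    for agent_name, agent in agents.items():
        target = agent.get("model")
        if target not in models:
            errors.append(f"agent {agent_name!r} references unknown model {target!r}")

    return errors
```

An empty list means every reference resolves. Collecting all errors instead of stopping at the first one lets a single run report everything that needs fixing.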

## Generated Files

Running:

```bash
subagent-fleet generate
```

creates:

```text
litellm_config.yaml
.claude/agents/planner.md
.claude/agents/implementer.md
.claude/agents/reviewer.md
.env.subagent-fleet
```

Example LiteLLM route:

```yaml
model_list:
  - model_name: claude-sonnet-local
    litellm_params:
      model: ollama_chat/qwen2.5-coder:32b
      api_base: http://192.168.1.50:11434
      api_key: ollama
      timeout: 600
    model_info:
      max_input_tokens: 32768
```
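
To make the mapping from `fleet.yaml` to this route explicit, here is a minimal sketch that builds the same entry from one model and its node. The helper `litellm_route` is a hypothetical name used for illustration; the real generator may be structured differently.

```python
def litellm_route(model: dict, node: dict) -> dict:
    """Build one LiteLLM model_list entry from a fleet model and its node."""
    return {
        # Claude Code asks for the alias; LiteLLM maps it to the Ollama model.
        "model_name": model["litellm_alias"],
        "litellm_params": {
            "model": f"ollama_chat/{model['ollama_model']}",
            "api_base": node["endpoint"],
            "api_key": "ollama",
            "timeout": model["timeout"],
        },
        "model_info": {"max_input_tokens": model["context"]},
    }


heavy_coder = {
    "node": "m4-mini-64gb",
    "ollama_model": "qwen2.5-coder:32b",
    "litellm_alias": "claude-sonnet-local",
    "context": 32768,
    "timeout": 600,
}
route = litellm_route(heavy_coder, {"endpoint": "http://192.168.1.50:11434"})
```

Here `route` holds the same fields as the YAML entry above, ready to be appended to `model_list`.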

Example Claude agent:

```markdown
---
name: planner
description: Use for planning, file discovery, task decomposition, and summarization.
model: claude-haiku-local
tools: Read, Grep, Glob
---

You are a fast local planning agent.
Do not edit files.
Return a concise response with:
- plan
- relevant files
- risks
- next recommended agent
```
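
The agent file is YAML front matter followed by the prompt. A rough sketch of how such a file could be rendered from an `agents` entry follows; `render_agent` is a hypothetical helper, and how the real generator treats agents without a `prompt` is not specified here, so this version simply leaves the body empty.

```python
def render_agent(name: str, agent: dict, alias: str) -> str:
    """Render a Claude Code-style agent file: front matter, blank line, prompt."""
    front_matter = "\n".join(
        [
            "---",
            f"name: {name}",
            f"description: {agent['description']}",
            f"model: {alias}",  # the LiteLLM alias, not the Ollama model name
            f"tools: {', '.join(agent['tools'])}",
            "---",
        ]
    )
    body = agent.get("prompt", "").strip()  # assumption: missing prompt -> empty body
    return f"{front_matter}\n\n{body}\n"


planner = {
    "description": "Use for planning, file discovery, task decomposition, and summarization.",
    "tools": ["Read", "Grep", "Glob"],
    "prompt": "You are a fast local planning agent.\nDo not edit files.\n",
}
text = render_agent("planner", planner, "claude-haiku-local")
```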

## Commands

| Command | Purpose |
| --- | --- |
| `subagent-fleet init` | Create a starter `fleet.yaml`. |
| `subagent-fleet validate` | Validate schema, references, URLs, aliases, and agent names. |
| `subagent-fleet discover` | Query configured Ollama nodes for available models. |
| `subagent-fleet generate` | Generate LiteLLM config, Claude agents, and env file. |
| `subagent-fleet warmup` | Preload configured Ollama models with `keep_alive`. |
| `subagent-fleet status` | Show node health and agent routing. |
| `subagent-fleet doctor` | Show validation and local-network safety guidance. |
| `subagent-fleet clean` | List or remove generated files. |
| `subagent-fleet skills list` | List bundled assistant skills and supported targets. |
| `subagent-fleet skills install` | Install assistant-facing setup and operations skills. |
| `subagent-fleet plugins install` | Install Claude Code and Codex plugin marketplace bundles. |

JSON output is available for discovery and status:

```bash
subagent-fleet discover --json
subagent-fleet status --json
```
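
For context on the `warmup` row above: Ollama loads a model into memory when `/api/generate` receives a request with no prompt, and the `keep_alive` field controls how long it stays loaded. The sketch below only builds such a request. The helper name `warmup_request` and the default of `"30m"` are assumptions for illustration, not the CLI's actual behavior.

```python
import json
from urllib.request import Request


def warmup_request(endpoint: str, ollama_model: str, keep_alive="30m") -> Request:
    """Build a POST to /api/generate that loads a model without generating text."""
    payload = {"model": ollama_model, "keep_alive": keep_alive}
    return Request(
        f"{endpoint.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = warmup_request("http://192.168.1.50:11434/", "qwen2.5-coder:32b", keep_alive=-1)
```

Sending it is a single call to `urllib.request.urlopen(req, timeout=600)`. A `keep_alive` of `-1` asks Ollama to keep the model loaded indefinitely.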

## Assistant Skills

`subagent-fleet` ships assistant-facing skills that teach Claude Code, Codex, OpenCode, and similar tools how to set up and operate the fleet from inside a repository.

List bundled skills and supported targets:

```bash
subagent-fleet skills list
```

Install all bundled skills for all supported targets:

```bash
subagent-fleet skills install
```

This writes:

```text
.claude/skills/subagent-fleet-setup/SKILL.md
.claude/skills/subagent-fleet-operations/SKILL.md
.codex/skills/subagent-fleet-setup/SKILL.md
.codex/skills/subagent-fleet-operations/SKILL.md
.opencode/skills/subagent-fleet-setup/SKILL.md
.opencode/skills/subagent-fleet-operations/SKILL.md
```

Install for a specific assistant:

```bash
subagent-fleet skills install --target codex
subagent-fleet skills install --target claude-code
subagent-fleet skills install --target opencode
```

Install one bundled skill:

```bash
subagent-fleet skills install --skill subagent-fleet-setup
```

Existing skill files are not overwritten unless you pass `--force`.
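
The path layout and the no-overwrite rule can be pictured with a short sketch. It is inferred from the file list above rather than taken from the implementation, so the mapping `TARGET_DIRS` and both helper names are assumptions.

```python
from pathlib import Path

# Assumed mapping from --target values to the directories listed above.
TARGET_DIRS = {"claude-code": ".claude", "codex": ".codex", "opencode": ".opencode"}


def skill_path(root: Path, target: str, skill: str) -> Path:
    """Location of one skill file for one assistant target."""
    return root / TARGET_DIRS[target] / "skills" / skill / "SKILL.md"


def write_skill(path: Path, content: str, force: bool = False) -> bool:
    """Write the skill file; leave an existing file alone unless force is set."""
    if path.exists() and not force:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Returning a boolean lets the caller report which files were written and which were skipped.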

## Plugin Marketplaces

This repository also ships plugin marketplace metadata so users can install the assistant skill first, then let that skill install and verify the Python CLI.

Included plugin artifacts:

```text
.claude-plugin/marketplace.json
.agents/plugins/marketplace.json
plugins/subagent-fleet/.claude-plugin/plugin.json
plugins/subagent-fleet/.codex-plugin/plugin.json
plugins/subagent-fleet/skills/subagent-fleet-bootstrap/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-setup/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-operations/SKILL.md
```

The bootstrap skill teaches Claude Code or Codex how to install the CLI:

```bash
python -m pip install subagent-fleet
```

and then install repo-local assistant skills:

```bash
subagent-fleet skills install
```

Claude Code plugin install flow:

```text
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
```

Codex local marketplace flow:

```bash
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
```

To generate the same marketplace/plugin bundle into another directory:

```bash
subagent-fleet plugins install --out /path/to/marketplace-root
```

Install only one target:

```bash
subagent-fleet plugins install --target claude-code
subagent-fleet plugins install --target codex
```

Existing plugin marketplace files are not overwritten unless you pass `--force`.

## Ollama Worker Setup

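On each worker machine, run Ollama so that it is reachable from your controller over the private network. The example below is for macOS (`launchctl`). Setting `OLLAMA_HOST` to `0.0.0.0:11434` makes Ollama listen on all interfaces, so restrict access with a firewall or private network as described under [Security](#security):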

```bash
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
launchctl setenv OLLAMA_NUM_PARALLEL "1"
launchctl setenv OLLAMA_MAX_LOADED_MODELS "1"

killall Ollama
open -a Ollama
```

From the controller:

```bash
curl http://NODE_IP:11434/api/tags
```
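
The same check can be run across several nodes while keeping an offline machine from aborting the loop. This is a minimal sketch, not the `discover` implementation: it uses only the standard library, assumes the `{"models": [{"name": ...}]}` response shape that Ollama documents for `/api/tags`, and takes the fetch function as a parameter so it can be swapped out in tests. The names `fetch_tags` and `discover` are hypothetical.

```python
import json
from typing import Callable, Dict
from urllib.request import urlopen


def fetch_tags(endpoint: str, timeout: float = 3.0) -> dict:
    """GET <endpoint>/api/tags and decode the JSON body."""
    with urlopen(f"{endpoint.rstrip('/')}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def discover(
    nodes: Dict[str, str], fetch: Callable[[str], dict] = fetch_tags
) -> Dict[str, dict]:
    """Query every node; record a failure per node instead of raising."""
    results: Dict[str, dict] = {}
    for name, endpoint in nodes.items():
        try:
            body = fetch(endpoint)
            models = [m["name"] for m in body.get("models", [])]
            results[name] = {"reachable": True, "models": models}
        except Exception as exc:  # offline node, timeout, malformed JSON, ...
            results[name] = {"reachable": False, "models": [], "error": str(exc)}
    return results
```

Calling `discover({"m5-local": "http://localhost:11434"})` returns one entry per node, with `reachable` set to `False` and an `error` string for any node that did not answer.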

## Security

`subagent-fleet` assumes private local networking.

Do:

- Use LAN, firewall rules, Tailscale, WireGuard, or a private subnet.
- Keep `LITELLM_MASTER_KEY` set for LiteLLM access.
- Treat generated `.env.subagent-fleet` files as local developer configuration.

Do not:

- Expose Ollama directly to the public internet.
- Expose LiteLLM without authentication.
- Commit real API keys, LAN secrets, or machine-specific private `.env` files.

Run:

```bash
subagent-fleet doctor
```

for local setup and safety reminders.
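
As a complement to the reminders above, a node endpoint written as a literal IP address can be checked against the public address space using the standard library. This sketch is not what `doctor` does internally, and `endpoint_looks_public` is a hypothetical helper. It deliberately returns `None` for hostnames, since classifying those would require a DNS lookup.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse


def endpoint_looks_public(endpoint: str) -> Optional[bool]:
    """True if the endpoint's IP is globally routable, False if not, None if unknown."""
    host = urlparse(endpoint).hostname
    if host is None:
        return None
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None  # a hostname; resolving it is out of scope for this sketch
    # is_global is False for loopback, RFC 1918 ranges, and 100.64.0.0/10
    # (the shared address space that Tailscale addresses fall in).
    return ip.is_global
```

A result of `True` for any entry under `nodes` is a signal to revisit the endpoint before running `generate`.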

## Development

Install dev dependencies:

```bash
python -m pip install -e ".[dev]"
```

Run tests:

```bash
python -m pytest
```

Run a focused test:

```bash
python -m pytest tests/test_config.py
```

Check CLI wiring:

```bash
python -m subagent_fleet.cli --help
```

## Project Layout

```text
src/subagent_fleet/
  cli.py
  config.py
  discovery.py
  plugins.py
  warmup.py
  status.py
  skills.py
  generators/
  skill_templates/
  templates/

examples/
plugins/
tests/
```

## Roadmap

MVP:

- [x] `fleet.yaml` schema
- [x] Ollama node health checks
- [x] Ollama model discovery via `/api/tags`
- [x] LiteLLM config generation
- [x] Claude Code agent generation
- [x] Environment file generation
- [x] Model warmup with `keep_alive`
- [x] Status and routing tables

Next:

- [ ] Latency benchmarking
- [ ] Recommended agent-to-node assignment
- [ ] Role-based routing templates
- [ ] Tailscale-aware node discovery
- [ ] OpenAI-compatible harness examples
- [ ] Release packaging

Later:

- [ ] Dynamic routing by task type
- [ ] Fallback model generation
- [ ] Queue-aware scheduling
- [ ] Agent execution trace viewer
- [ ] Support for vLLM, LM Studio, llama.cpp, OpenRouter, and cloud APIs

## Star History

<a href="https://star-history.com/#adityak74/subagent-fleet&Date">
  <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=adityak74/subagent-fleet&type=Date" />
</a>

## Contributing

Issues and pull requests are welcome.

Good first areas:

- More generator tests
- Additional example fleets
- Better status formatting
- More robust Ollama error reporting
- Documentation for real multi-machine setups

Before opening a PR:

```bash
python -m pytest
```

## What This Is Not

`subagent-fleet` is not:

- an inference engine
- a replacement for Ollama
- a replacement for LiteLLM
- a model sharding framework
- Kubernetes for local LLMs
- a public model hosting platform

It is a small workflow layer for private local subagent orchestration.

## License

MIT. See [LICENSE](LICENSE).
