Metadata-Version: 2.1
Name: cuddlytoddly
Version: 0.1.4
Summary: LLM-driven autonomous DAG planning and execution system
Author: 3IVIS GmbH
License: MIT
Project-URL: Homepage, https://github.com/3IVIS/cuddlytoddly
Project-URL: Documentation, https://github.com/3IVIS/cuddlytoddly/tree/main/docs
Project-URL: Issues, https://github.com/3IVIS/cuddlytoddly/issues
Keywords: llm,dag,planning,autonomous,agent
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gitpython>=3.1
Requires-Dist: platformdirs>=4.0
Requires-Dist: windows-curses>=2.3; sys_platform == "win32"
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: claude
Requires-Dist: anthropic>=0.25; extra == "claude"
Provides-Extra: local
Requires-Dist: llama-cpp-python>=0.2; extra == "local"
Requires-Dist: outlines>=0.0.46; extra == "local"
Provides-Extra: all
Requires-Dist: cuddlytoddly[claude,local,openai]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-timeout; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: mypy; extra == "dev"

# cuddlytoddly

*Holding AI's hand through planning and into execution.*

Give it a goal and cuddlytoddly builds an explicit plan — a visible, editable graph of tasks and dependencies — before touching anything. Inspect it, change it, or redirect it at any point. When you're ready, it carries the plan out with real tools, quality-checks the results, and keeps going until the job is done.

**Why "cuddlytoddly"?**

AI models are capable, but left alone on long-horizon goals they miss dependencies, skip implicit steps, and wander off track. The problem isn't intelligence — it's the absence of structure and oversight.

cuddlytoddly's answer is to make the plan explicit and keep the human in control of it. Before a single tool is called, the system produces a full task graph you can read and edit. You can pause execution at any point, change a task's description, add or remove a dependency, promote a task into a subgoal for a finer breakdown, or switch goals entirely — and execution resumes from the updated state. Nothing runs without a declared intent, and no intent is locked in.

Think of it as holding AI's hand through planning and into execution — not blind autonomy, but guided, inspectable, interruptible progress. Hence the name.

---

## How it works

1. A plain-English **goal** is seeded into the graph. Nothing runs yet.
2. The **LLMPlanner** builds an explicit plan (a DAG of tasks with declared dependencies and expected outputs) before any execution begins. The draft goes through an optional self-review, structural validation, and deterministic constraint checks before any node is committed to the graph.
3. **You can inspect and edit the plan** at this point, or at any point during execution. Pause the LLM, change a task description, add or remove a dependency, promote a task to a subgoal for finer breakdown, or switch goals entirely. Only affected branches re-run — completed work is preserved.
4. The **Orchestrator** picks up ready nodes and dispatches them to the **LLMExecutor** concurrently.
5. The executor runs a multi-turn LLM loop, calling real tools (code execution, file I/O, custom skills) until the task produces a concrete result.
6. The **QualityGate** checks each result against declared outputs; if something is missing it injects a bridging task automatically.
7. Every mutation is written to an **event log**, so after a crash you can resume from exactly where you left off, with no lost work.
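
Steps 3 and 4 rest on two graph queries: which tasks are ready to run, and which tasks are affected when one is edited. The sketch below illustrates both on a toy dependency map. It is not cuddlytoddly's `TaskGraph` implementation; the `deps` structure and both function names are assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy plan: task -> set of tasks it depends on.
deps = {
    "research": set(),
    "outline": {"research"},
    "draft": {"outline"},
    "charts": {"research"},
    "report": {"draft", "charts"},
}


def ready_nodes(deps, done):
    """Tasks not yet done whose dependencies have all completed."""
    return sorted(t for t, d in deps.items() if t not in done and d <= done)


def affected_by(deps, edited):
    """The edited task plus everything downstream of it."""
    # Invert the dependency map so we can walk from parent to child.
    children = {t: set() for t in deps}
    for task, parents in deps.items():
        for parent in parents:
            children[parent].add(task)
    seen, queue = {edited}, deque([edited])
    while queue:
        for child in children[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


print(ready_nodes(deps, {"research"}))        # ['charts', 'outline']
print(sorted(affected_by(deps, "outline")))   # ['draft', 'outline', 'report']
```

In this toy plan, editing `outline` would reset `draft` and `report`, while `research` and `charts` keep their results.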

```
goal → [clarification fields] → LLMPlanner → [scrutinize?] → [validate] → [constraint check]
               ↑ user can edit                                                      │
               └── on confirm → resets children → partial replan              TaskGraph (DAG)
                                                                                    │
                                                                        Orchestrator
                                                                        ├── LLMExecutor + tools
                                                                        └── QualityGate (verify / bridge)
                                                                                    │
                                                                               EventLog (JSONL) → crash-proof replay
```
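
The verify / bridge branch of the diagram can be pictured as a check of a result against the task's declared outputs. The sketch below is a deterministic stand-in, assuming results are plain dictionaries keyed by output name. The real `QualityGate` takes an LLM client and may verify results quite differently; `missing_outputs`, `bridging_task`, and the node fields other than `node_id` and `node_type` are hypothetical.

```python
def missing_outputs(declared, result):
    """Declared output names that are absent or empty in the result."""
    return [name for name in declared if not result.get(name)]


def bridging_task(task_id, missing):
    """Describe a follow-up task that would fill the gap (hypothetical shape)."""
    return {
        "node_id": f"{task_id}_bridge",
        "node_type": "task",
        "depends_on": [task_id],
        "description": f"Produce missing outputs: {', '.join(missing)}",
    }


declared = ["summary", "sources"]
result = {"summary": "Scooter sales grew.", "sources": ""}

gaps = missing_outputs(declared, result)
if gaps:
    # Only the empty "sources" output triggers a bridging task here.
    print(bridging_task("market_overview", gaps))
```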

---

## Installation

```bash
pip install cuddlytoddly
```

**Requirements:** Python 3.11+, `git` on your PATH (for the DAG visualiser).

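Then install one or more LLM backend extras, depending on how you want to run the model. In shells that treat square brackets as glob characters (zsh, the macOS default), quote the argument: `pip install "cuddlytoddly[claude]"`.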

| Backend | Extra to install |
|---|---|
| Anthropic Claude | `pip install cuddlytoddly[claude]` |
| OpenAI / compatible | `pip install cuddlytoddly[openai]` |
| Local llama.cpp | `pip install cuddlytoddly[local]` — see [Local model setup](#local-model-setup-llamacpp) |
| Everything | `pip install cuddlytoddly[all]` |

---

## Quick start

```bash
pip install "cuddlytoddly[claude]"
export ANTHROPIC_API_KEY=sk-ant-...
cuddlytoddly "Write a market analysis for electric scooters"
```

On first run, a `config.toml` is written to your user data directory with all defaults pre-filled. Open it to change backends, model settings, temperature, and more — **no code editing required**.

```bash
# Print the config file location
python -c "from cuddlytoddly.config import CONFIG_PATH; print(CONFIG_PATH)"
```

Pass no argument to open the startup screen, where you can, for example, resume a previous run or load a manual plan. The web UI opens automatically and shows the full task plan: inspect or edit it before execution starts, or just let it run. You can pause, redirect, or promote any task to a subgoal at any time.

Run data is stored locally, and because the event log preserves all state, any run can be resumed.

### Switching backends

Edit `[llm] backend` in `config.toml`. That is the only config change needed.

```toml
# config.toml

[llm]
backend = "claude"    # or "openai" or "llamacpp"

[claude]
model = "claude-opus-4-6"

[openai]
model    = "gpt-4o"
# base_url = "https://api.together.xyz/v1"   # any OpenAI-compatible provider
```

Then install the matching extra and set the API key:

| Backend | Extra | Env var |
|---|---|---|
| `claude` | `pip install cuddlytoddly[claude]` | `ANTHROPIC_API_KEY` |
| `openai` | `pip install cuddlytoddly[openai]` | `OPENAI_API_KEY` |
| `llamacpp` | see [Local model setup](#local-model-setup-llamacpp) | — |

---

## Local model setup (llama.cpp)

Running a model locally gives you full privacy, no API costs, and offline operation. The local backend uses [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python binding for [llama.cpp](https://github.com/ggerganov/llama.cpp).

### Step 1 — Install llama-cpp-python

The right install command depends on your hardware. The plain `pip install cuddlytoddly[local]` build is CPU-only and very slow for large models. Choose the command that matches your setup:

**macOS (Apple Silicon — Metal GPU)**
```bash
CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
```

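**Linux / Windows — NVIDIA GPU (CUDA)**
```bash
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
```

The `VAR=value command` prefix is POSIX shell syntax. In Windows PowerShell, run `$env:CMAKE_ARGS = "-DGGML_CUDA=on"` first, then the same `pip install` command.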

**Linux — CPU only**
```bash
pip install llama-cpp-python
```

For other hardware (ROCm, Vulkan, SYCL) and detailed build options, see the official installation guide:
👉 **https://github.com/abetlen/llama-cpp-python#installation**

After installing llama-cpp-python, install the remaining local extras:

```bash
pip install "outlines>=0.0.46"
# Alternative order: install all local extras first (CPU-only llama-cpp-python),
# then re-run the GPU command above to replace the CPU-only build.
pip install "cuddlytoddly[local]"
```

### Step 2 — Download a model

Models must be in **GGUF format**. The default model is **Llama 3.3 70B Instruct Q4_K_M** — a good balance of quality and speed on 48 GB+ VRAM or unified memory.

**If you already have this model downloaded** (via `llama-cli -hf`, `llama-server -hf`, or `huggingface-cli download`), cuddlytoddly will find it automatically — no extra steps needed. It probes these locations in order:

1. `CUDDLYTODDLY_MODEL_PATH` env var — explicit override, any path
2. `~/.cache/llama.cpp/` — llama.cpp's native download cache
3. `~/.cache/huggingface/hub/` — Hugging Face hub cache
4. `<data dir>/models/` — cuddlytoddly's own models folder

If the model isn't found anywhere, you'll get a clear error message with the exact download command to run.
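
The probe order above can be sketched as a first-match search. This is an illustration under simplifying assumptions, not the package's actual lookup: `find_model` is a hypothetical name, and it matches the exact filename, whereas the real probing may account for how each cache names its files.

```python
import os
from pathlib import Path


def find_model(filename, data_dir, env=None, home=None):
    """Return the first existing path to `filename`, or None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Explicit override wins if it points at a real file.
    override = env.get("CUDDLYTODDLY_MODEL_PATH")
    if override and Path(override).is_file():
        return Path(override)

    # 2-4. Search each location in the documented order.
    roots = [
        home / ".cache" / "llama.cpp",
        home / ".cache" / "huggingface" / "hub",
        Path(data_dir) / "models",
    ]
    for root in roots:
        if not root.is_dir():
            continue
        # Recursive search, because the Hugging Face cache nests files
        # under per-repository snapshot directories.
        match = next(root.rglob(filename), None)
        if match is not None:
            return match
    return None
```

Calling `find_model("Llama-3.3-70B-Instruct-Q4_K_M.gguf", data_dir)` with the data directory printed by the `platformdirs` command would return the first hit or `None`. The `env` and `home` parameters exist only so the search can be tested against a temporary directory.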

**To download the default model** into cuddlytoddly's own folder:

```bash
pip install huggingface-hub

# Linux / macOS
DATA_DIR=$(python -c "from platformdirs import user_data_dir; print(user_data_dir('cuddlytoddly', '3IVIS'))")
mkdir -p "$DATA_DIR/models"
huggingface-cli download bartowski/Llama-3.3-70B-Instruct-GGUF \
  Llama-3.3-70B-Instruct-Q4_K_M.gguf \
  --local-dir "$DATA_DIR/models"

# Windows PowerShell
$dataDir = python -c "from platformdirs import user_data_dir; print(user_data_dir('cuddlytoddly', '3IVIS'))"
New-Item -ItemType Directory -Force "$dataDir\models"
huggingface-cli download bartowski/Llama-3.3-70B-Instruct-GGUF Llama-3.3-70B-Instruct-Q4_K_M.gguf --local-dir "$dataDir\models"
```

**To use a different model or a custom path**, set the env var:

```bash
export CUDDLYTODDLY_MODEL_PATH=/path/to/your-model.gguf
```

### Step 3 — Configure the backend

Open your `config.toml` and set:

```toml
[llm]
backend = "llamacpp"

[llamacpp]
model_filename = "Llama-3.3-70B-Instruct-Q4_K_M.gguf"
n_gpu_layers   = -1    # -1 = all layers on GPU, 0 = CPU only
n_ctx          = 16384
max_tokens     = 8192
temperature    = 0.1
cache_enabled  = true
```

Change `model_filename` to match whatever you downloaded. Everything else can stay at defaults to start.

### Step 4 — Run

```bash
cuddlytoddly "Write a market analysis for electric scooters"
```

The first run will load the model into memory (10–30 seconds depending on hardware), then proceed normally. Subsequent runs reuse the response cache (`llamacpp_cache.json`) to skip identical prompts. The same caching applies to the `claude` and `openai` backends too (`api_cache.json`).
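
A response cache of this kind can be pictured as a JSON file keyed by a hash of the prompt. The sketch below is an assumption about the general technique only: the actual key format and file layout of `llamacpp_cache.json` and `api_cache.json` are not documented here, and `ResponseCache` is a hypothetical class.

```python
import hashlib
import json
from pathlib import Path


class ResponseCache:
    """Tiny prompt -> response cache persisted as one JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    @staticmethod
    def key(prompt, **params):
        # Hash the prompt together with sampling parameters, so a changed
        # temperature counts as a cache miss.
        payload = json.dumps({"prompt": prompt, **params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, prompt, call, **params):
        k = self.key(prompt, **params)
        if k not in self.data:
            # Miss: call the model once and persist the answer.
            self.data[k] = call(prompt)
            self.path.write_text(json.dumps(self.data))
        return self.data[k]
```

With this shape, `cache.get_or_call(prompt, llm_call, temperature=0.1)` invokes `llm_call` only the first time an identical prompt and parameter set is seen, including across process restarts.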

---

## LLM backends — full reference

See [docs/configuration.md](docs/configuration.md) for the complete config file reference and all available options per backend.

---

## Customising prompts and schemas

All LLM prompt templates and JSON output schemas are consolidated into two files — you never need to dig through the implementation to adjust them:

| File | What it contains |
|---|---|
| `cuddlytoddly/planning/prompts.py` | Every prompt template sent to the LLM: planner, scrutinizer, ghost node resolution, executor, verify-result, check-dependencies, plus the system prompt constants |
| `cuddlytoddly/planning/schemas.py` | Every JSON schema used for structured output: `PLAN_SCHEMA`, `EXECUTION_TURN_SCHEMA`, `RESULT_VERIFICATION_SCHEMA`, `GHOST_NODE_RESOLUTION_SCHEMA`, etc. |

Each function in `prompts.py` is documented with its parameters so it's clear what context is injected where. Edit the text freely — the functions use standard Python f-strings with named parameters.
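
As a rough picture of that style, a prompt function built from f-strings with named parameters might look like the sketch below. The function name, parameters, and wording are hypothetical and are not copied from `prompts.py`.

```python
def build_task_prompt(goal: str, task: str, tools: list[str]) -> str:
    """Hypothetical template; the real functions in prompts.py differ."""
    tool_lines = "\n".join(f"- {name}" for name in tools)
    return (
        f"Overall goal: {goal}\n"
        f"Current task: {task}\n"
        f"Available tools:\n{tool_lines}\n"
        "Reply with a JSON object describing your next step."
    )


print(build_task_prompt(
    goal="Summarise the key risks of AGI",
    task="Collect sources",
    tools=["read_file", "run_code"],
))
```

Because each injected value is a named parameter, rewording the template does not require touching any caller.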

---

## Adding skills

Drop a folder with a `SKILL.md` (and optional `tools.py`) into `cuddlytoddly/skills/`. The `SkillLoader` discovers it automatically. See [docs/skills.md](docs/skills.md) for the full format.
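
Folder-based discovery of this kind can be sketched as a directory scan. The code below is not the `SkillLoader` implementation and makes no assumption about the contents of `SKILL.md`. `discover_skills` and the returned dictionary shape are hypothetical.

```python
from pathlib import Path


def discover_skills(skills_dir):
    """Map skill name -> info for every subfolder containing a SKILL.md."""
    found = {}
    for folder in sorted(Path(skills_dir).iterdir()):
        manifest = folder / "SKILL.md"
        if folder.is_dir() and manifest.is_file():
            found[folder.name] = {
                "manifest": manifest,
                # tools.py is optional, so record whether it is present.
                "has_tools": (folder / "tools.py").is_file(),
            }
    return found
```

Folders without a `SKILL.md` are skipped, which is why dropping a new folder in place is enough for it to be picked up on the next scan.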

---

## Documentation

- [Architecture](docs/architecture.md) — how the components fit together
- [Configuration](docs/configuration.md) — LLM backends, run directory, tuning parameters, environment variables
- [Skills](docs/skills.md) — built-in skills and how to add custom ones
- [API Reference](docs/api.md) — public Python API

---

## Where is my data?

Models and run data are stored in the OS user data directory, completely separate from the package code. This works correctly whether you run from source or install via pip.

```bash
# Print the exact path on your machine
python -c "from platformdirs import user_data_dir; print(user_data_dir('cuddlytoddly', '3IVIS'))"
```

```
~/.local/share/cuddlytoddly/     ← Linux
~/Library/Application Support/cuddlytoddly/  ← macOS
%LOCALAPPDATA%\3IVIS\cuddlytoddly\  ← Windows

├── config.toml
├── models/
│   └── Llama-3.3-70B-Instruct-Q4_K_M.gguf
└── runs/
    └── write_a_market_analysis.../
        ├── events.jsonl         # full event log — enables crash recovery
        ├── llamacpp_cache.json  # response cache (llamacpp backend)
        ├── api_cache.json       # response cache (claude / openai backends)
        ├── file_llm_cache.json  # response cache (file backend)
        ├── logs/
        ├── outputs/             # working directory for file-writing tools
        └── dag_repo/            # Git repo mirroring the DAG
```
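
The `events.jsonl` file is what makes crash recovery possible: state is rebuilt by replaying every logged event in order. The sketch below shows the general append-and-replay technique on a JSONL file. The record layout and the `SET_STATUS` event type are assumptions for illustration; the real event schema lives in `cuddlytoddly/core/events.py` and may differ.

```python
import json
from pathlib import Path


def append_event(log_path, event_type, payload):
    """Append one event as a single JSON line."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"type": event_type, "payload": payload}) + "\n")


def replay(log_path):
    """Rebuild node state by applying every logged event in order."""
    nodes = {}
    path = Path(log_path)
    if not path.exists():
        return nodes
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            break  # torn final line from a crash mid-write; stop here
        payload = event["payload"]
        if event["type"] == "ADD_NODE":
            nodes[payload["node_id"]] = {"status": "pending", **payload}
        elif event["type"] == "SET_STATUS":  # hypothetical event type
            nodes[payload["node_id"]]["status"] = payload["status"]
    return nodes
```

Because the log is append-only, a crash can damage at most the last line, and replay simply stops at the last complete event.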

## Project structure

```
cuddlytoddly/
├── core/           # TaskGraph, events, reducer, ID generator
├── engine/         # Orchestrator, QualityGate, ExecutionStepReporter
├── infra/          # Logging, EventQueue, EventLog, replay
├── planning/
│   ├── prompts.py              ← all LLM prompt templates (edit here)
│   ├── schemas.py              ← all JSON output schemas (edit here)
│   ├── llm_interface.py
│   ├── llm_planner.py
│   ├── llm_executor.py
│   ├── llm_output_validator.py
│   └── plan_constraint_checker.py
├── skills/         # SkillLoader + built-in skill packs
│   ├── code_execution/
│   └── file_ops/
└── ui/             # Curses terminal UI, web UI, Git DAG projection
docs/
pyproject.toml
LICENSE
```

---

## Python API

```python
from cuddlytoddly.core.task_graph import TaskGraph
from cuddlytoddly.core.events import Event, ADD_NODE
from cuddlytoddly.core.reducer import apply_event
from cuddlytoddly.infra.event_queue import EventQueue
from cuddlytoddly.infra.event_log import EventLog
from cuddlytoddly.planning.llm_interface import create_llm_client
from cuddlytoddly.planning.llm_planner import LLMPlanner
from cuddlytoddly.planning.llm_executor import LLMExecutor
from cuddlytoddly.engine.quality_gate import QualityGate
from cuddlytoddly.engine.llm_orchestrator import Orchestrator
from cuddlytoddly.skills.skill_loader import SkillLoader

# Swap "claude" for "openai" or "llamacpp" — everything else is identical
llm = create_llm_client("claude", model="claude-opus-4-6")

graph    = TaskGraph()
skills   = SkillLoader()
planner  = LLMPlanner(
    llm_client=llm,
    graph=graph,
    skills_summary=skills.prompt_summary,
    scrutinize_plan=False,   # set True to enable post-planning LLM self-review
)
executor = LLMExecutor(llm_client=llm, tool_registry=skills.registry)
gate     = QualityGate(llm_client=llm, tool_registry=skills.registry)

orchestrator = Orchestrator(
    graph=graph, planner=planner, executor=executor,
    quality_gate=gate, event_queue=EventQueue(),
)

# Seed a goal
apply_event(graph, Event(ADD_NODE, {
    "node_id": "my_goal",
    "node_type": "goal",
    "metadata": {"description": "Summarise the key risks of AGI", "expanded": False},
}))

orchestrator.start()
# The graph is live and editable at any point during execution.

# Pause the LLM — in-flight tasks complete, no new ones start:
orchestrator.stop_llm_calls()

# Resume:
orchestrator.resume_llm_calls()

# Promote a task to a subgoal for a finer-grained breakdown:
# set its node_type → "goal", expanded → False; the planner picks it up next cycle

# After the first plan, each goal has a clarification node (clarification_{goal_id})
# showing the context used. Edit its fields in the UI and click Confirm
# to reset dependent tasks and trigger a partial replan with the new context.
```

All numeric limits (`max_turns`, `max_workers`, etc.) default to the values in `config.toml` when the system is started via the CLI. When constructing components programmatically you can pass them as keyword arguments — see [docs/api.md](docs/api.md) for the full signature of each class.

---

## License

MIT — see [LICENSE](LICENSE).
