Metadata-Version: 2.4
Name: formaltask
Version: 0.1.0
Summary: Structured task management for AI-assisted development workflows
Author: David Beyer
License-Expression: MIT
Project-URL: Homepage, https://github.com/davidabeyer/formaltask
Project-URL: Documentation, https://github.com/davidabeyer/formaltask
Keywords: task-management,cli,development-workflow,ai-assisted
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Build Tools
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: PyYAML>=6.0
Requires-Dist: GitPython>=3.1.37
Requires-Dist: tenacity>=8.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-frontmatter>=1.0.0
Requires-Dist: jsonschema>=4.0.0
Requires-Dist: python-statemachine>=2.0
Requires-Dist: Jinja2>=3.1.3
Requires-Dist: rich>=13.0.0
Provides-Extra: llm
Requires-Dist: openai>=1.0.0; extra == "llm"
Requires-Dist: instructor>=1.0.0; extra == "llm"
Provides-Extra: tui
Requires-Dist: textual>=0.47.0; extra == "tui"
Requires-Dist: cachetools>=5.3.0; extra == "tui"
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test"
Requires-Dist: pytest-mock>=3.10.0; extra == "test"
Requires-Dist: hypothesis>=6.0.0; extra == "test"
Requires-Dist: pytest-textual-snapshot>=1.0.0; extra == "test"
Provides-Extra: dev
Requires-Dist: basedpyright>=1.29.0; extra == "dev"
Requires-Dist: semgrep>=1.100.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: interrogate>=1.5.0; extra == "dev"
Provides-Extra: agents
Requires-Dist: numpy>=1.24.0; extra == "agents"
Requires-Dist: requests>=2.28.0; extra == "agents"
Provides-Extra: dayflow
Requires-Dist: httpx>=0.24.0; extra == "dayflow"
Provides-Extra: mcp
Requires-Dist: mcp>=0.1.0; extra == "mcp"
Provides-Extra: all
Requires-Dist: formaltask[llm]; extra == "all"
Requires-Dist: formaltask[tui]; extra == "all"
Requires-Dist: formaltask[test]; extra == "all"
Requires-Dist: formaltask[dev]; extra == "all"
Requires-Dist: formaltask[agents]; extra == "all"
Requires-Dist: formaltask[dayflow]; extra == "all"
Requires-Dist: formaltask[mcp]; extra == "all"
Dynamic: license-file

# FormalTask

Structured task management for AI-assisted development workflows. Integrates with Claude Code to provide epic-based planning, parallel task execution, and automated review workflows.

## How It Works

```
Plan → Critique → Specs → Tasks → Workers → Complete → Merge
```

You describe what you want. `/plan` explores the codebase and writes a plan. `/critique` pokes holes. `/decompose` splits it into YAML specs. `ft epic-decompose` commits tasks to SQLite with dependency tracking. `ft work spawn` launches parallel Claude workers in isolated git worktrees.

The database is the coordination backbone — not just storage.

### Plans Carry Their Revision History

Critiques don't live in separate files — they're embedded in the plan:

```yaml
goals:
  - id: "g-1"
    current: "Users can log in with email/password"
    history:
      - version: "r1"
        text: "Users can log in"
        critique:
          verdict: "FIX_AND_SHIP"
          findings:
            - priority: "P1"
              finding: "Missing rate limiting"
              action: "Add rate limiter"
              resolution: "fixed"  # Set by /revise
```

Each `/critique` round appends to `history`. When `/revise` addresses findings, it sets `resolution: fixed|rejected|deferred`. The plan carries its full revision history.
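Because the history is plain structured data, tooling can walk it directly. Here's a minimal sketch (the helper name and exact field handling are illustrative, not part of the formaltask API) of collecting findings that `/revise` has not yet addressed:

```python
# Sketch: walk the plan structure shown above and collect findings
# without a resolution. Field names follow the YAML example; the
# function itself is illustrative, not a formaltask API.
def unresolved_findings(plan):
    open_items = []
    for goal in plan.get("goals", []):
        for revision in goal.get("history", []):
            critique = revision.get("critique", {})
            for finding in critique.get("findings", []):
                if finding.get("resolution") not in ("fixed", "rejected", "deferred"):
                    open_items.append((goal["id"], revision["version"], finding["finding"]))
    return open_items
```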

### Specs Are Contracts, Guards Enforce Them

Specs declare what the completion system will check:

```yaml
title: "Task 2: Implement API client"
depends_on: [1]
required_reviews: ["code-quality", "security"]
inputs:
  schema: "$task[1].outputs.schema"       # Auto-wired from Task 1
outputs:
  client: ".artifacts/api_client.py"      # Task 3 can reference this
acceptance_criteria:
  - id: "c-1"
    current: "GET /users returns parsed User objects"
    command: "pytest tests/test_api.py"   # Runnable verification
```

When `ft task complete` runs, the completion check evaluates: Did `required_reviews` all pass? Are acceptance criteria with `command:` fields passing? The spec is the contract. Guards enforce it.
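In rough pseudocode terms, that check amounts to something like the following sketch (the function name and shell-out semantics are illustrative; the real completion system routes this through the rules kernel described below):

```python
import shlex
import subprocess

# Illustrative sketch, not the shipped implementation: a spec is
# satisfied when every required review has passed and every acceptance
# criterion with a `command:` field exits 0.
def spec_satisfied(spec, passed_reviews):
    if not set(spec.get("required_reviews", [])) <= set(passed_reviews):
        return False
    for criterion in spec.get("acceptance_criteria", []):
        command = criterion.get("command")
        if command and subprocess.run(shlex.split(command)).returncode != 0:
            return False
    return True
```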

### The Rules Kernel

Completion checks, orchestration, and prompt generation use the same type:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    when: str      # condition DSL
    then: str      # output (phase name or Jinja2 template)
    target: str    # what it applies to ("task.phase", "notify", "tool.block")
    priority: int  # 0 = informational, 1 = blocks, 999 = catchall
    name: str      # reason (literal or state key for dynamic lookup)
```

The kernel is ~60 LOC: `evaluate(condition, context) → bool`, `render(template, context) → str`, `apply_rules(rules, context)` where first match wins. The same evaluator answers: "Is this task done?" "Should we spawn a CI fixer?" "What prompt should this worker get?"

The condition DSL supports `AND`, `OR`, `NOT`, comparisons (`==`, `!=`, `>`, `<`, `>=`), dotted path resolution (`task.metadata.retries`), and bare truthy checks. No parentheses — flatten complex conditions into multiple rules.
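As a rough illustration, a toy re-implementation of the kernel's shape (illustrative only, not the shipped code) fits in a few lines:

```python
import operator

# Toy sketch of the kernel: dotted-path lookup, NOT / comparisons /
# AND / OR without parentheses, and first-match-wins rule application.
# `>=` appears in the README's examples; `<=` is assumed symmetric.
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">": operator.gt, "<": operator.lt}

def resolve(path, context):
    value = context
    for part in path.split("."):          # e.g. task.metadata.retries
        value = value.get(part) if isinstance(value, dict) else None
    return value

def atom(term, context):
    if term.startswith("NOT "):
        return not atom(term[4:].strip(), context)
    for symbol, op in OPS.items():
        if f" {symbol} " in term:
            left, right = term.split(f" {symbol} ", 1)
            lhs = resolve(left.strip(), context)
            rhs = right.strip()
            if lhs is not None:
                try:
                    rhs = type(lhs)(rhs)  # coerce literal to lhs's type
                except (TypeError, ValueError):
                    pass
            return op(lhs, rhs)
    return bool(resolve(term, context))   # bare truthy check

def evaluate(condition, context):
    return any(all(atom(term.strip(), context)
                   for term in clause.split(" AND "))
               for clause in condition.split(" OR "))

def apply_rules(rules, context):
    for rule in rules:                    # first match wins
        if evaluate(rule["when"], context):
            return rule["then"]
    return None
```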

22 builtin rules handle the standard completion lifecycle. Three rule sets ship by default:

| Rule set | Purpose |
|----------|---------|
| `BUILTIN_RULES` | 22 completion rules (review gates, PR checks, docs, acceptance criteria) |
| `ORCHESTRATION_RULES` | Watch daemon triggers (e.g., alert after 1 hour) |
| `TOOL_REDIRECT_RULES` | Block/redirect tool usage (e.g., WebSearch → exa) |

#### Custom Rules Per Task

Tasks can define their own rules in `metadata.completion_rules`. These are prepended to `BUILTIN_RULES`, so they get first-match-wins priority:

```python
# In a spec or task metadata:
"completion_rules": [
    {
        "when": "blocking_findings AND review_rounds.self-critique >= 2",
        "then": "needs_escalation",
        "target": "task.phase",
        "priority": 1,
        "name": "Round cap hit. Escalate to human."
    }
]
```

This lets individual tasks define their own completion policies without modifying global rules.

#### User Templates

Worker prompt templates use the same kernel. Drop a Jinja2 file in `~/.claude/templates/` to override any bundled template — user templates take priority, with automatic fallback to bundled on parse errors.
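A hypothetical sketch of that lookup order using Jinja2's loaders (the names, in-memory templates, and `DictLoader` stand-in are illustrative; the real code reads files from `~/.claude/templates/`):

```python
from jinja2 import DictLoader, Environment, TemplateSyntaxError

# Bundled templates, represented in-memory for illustration.
BUNDLED = {"worker.j2": "Task {{ task_id }}: {{ title }}"}

def render_template(name, context, user_templates):
    # User templates take priority; one that fails to parse falls
    # through to the bundled copy.
    for source in (user_templates, BUNDLED):
        if name not in source:
            continue
        env = Environment(loader=DictLoader(source))
        try:
            return env.get_template(name).render(**context)
        except TemplateSyntaxError:
            continue
    raise KeyError(name)
```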

### Workers Create Their Own Tasks

A worker that finds a problem during review can create a new task on the spot:

```bash
ft task create-from-finding src/auth.py 42 --title "Fix session expiry edge case"
```

This creates a **critique-gated** task — a task with self-critique baked in:

1. The task starts in a **critique phase** (`c1`). The worker must self-review before moving to execution.
2. A custom completion rule caps critique rounds: if P0/P1 findings persist after 2 self-critique rounds, the task escalates to a human via `ft work blocked`.
3. Only after receiving a `verdict_go` does the task transition to the **exec phase** where normal completion rules apply.

The task inherits its epic from the spawning worker, carries provenance (`source_task_id`, `finding_ref`), and can be auto-spawned by the watch daemon.

### What Falls Out

Because everything routes through rules and the database:

- Auto-spawn fixer tasks when CI fails
- Nudge stuck workers after 30 minutes
- Inject thorough-approach prompts for complex tasks
- Wire outputs to inputs across task dependencies
- Block completion until required reviews pass
- Workers spawn new tasks mid-flight, with their own completion policies

## Quick Start

```bash
pip install formaltask
```

After installation:

1. Set the required environment variable:
   ```bash
   export OPENROUTER_API_KEY="<your-key-here>"
   ```

2. Run the setup wizard:
   ```bash
   ft setup        # Interactive mode
   ft setup --yes  # Non-interactive (CI/scripts)
   ```

   The setup wizard initializes the database, registers Claude Code hooks, and verifies your configuration.

## Prerequisites

- **Python 3.11+** (required)
- **Git** (for hooks and version control)
- **tmux 3.2+** (optional, enables parallel worker features)

### Optional Feature Groups

Install additional features using pip extras:

| Extra | Purpose |
|-------|---------|
| `llm` | LLM client libraries (openai, instructor) |
| `tui` | Terminal user interface dashboard |
| `test` | Testing dependencies (pytest, hypothesis) |
| `dev` | Development tools (ruff, basedpyright) |
| `agents` | Agent-related utilities |
| `dayflow` | HTTP client utilities |
| `mcp` | MCP server integration |
| `all` | All optional dependencies |

## Alternative Installation (Development)

For development or contributing to FormalTask:

```bash
git clone https://github.com/davidabeyer/formaltask.git
cd formaltask
python3 -m venv venv && source venv/bin/activate
./install.sh
```

### Manual pip Installation

Install in development mode:

```bash
pip install -e .
```

With optional dependencies:

```bash
pip install -e ".[all]"
```

Or install specific extras:

```bash
pip install -e ".[tui,test]"
```

### Git Hooks

The `./install.sh` script automatically configures git to use the project's tracked hooks. This enables:
- Pre-commit validation (linting, TDD guard)
- Pre-push task status enforcement
- Pre-merge-commit task validation

For manual installations, run: `git config core.hooksPath .githooks`

## Configuration

### Settings File

Claude Code settings are stored in `~/.claude/settings.json`. This file configures hooks, permissions, and other Claude Code behaviors.

### Environment Variables

| Variable | Required | Purpose |
|----------|----------|---------|
| `OPENROUTER_API_KEY` | Yes | LLM operations via OpenRouter |
| `PROJECT_ROOT` | For tests and CLI | Database path resolution |

### Database

Task data is stored in `.claude/formaltask.db` (SQLite).

## Usage

### Command Line

```bash
ft --help                      # Show available commands
ft work spawn <id>             # Spawn worker for a task
ft work list                   # List spawnable tasks
ft work watch                  # Monitor workers
ft work watch --spawn          # Monitor + auto-spawn ready tasks
ft work dashboard              # TUI dashboard
ft work inbox                  # Show blocked workers awaiting input
ft task list <epic>            # List tasks in an epic
ft task show <id>              # Show task details
ft task complete <id>          # Mark task as complete
ft task cancel <id>            # Cancel a task
ft epic list                   # List all epics
ft epic health <epic>          # Check epic health
ft setup                       # Run setup wizard
ft doctor                      # Verify configuration
```

Or run as a Python module:

```bash
python3 -m formaltask.cli --help
```

### Project Structure

```text
formaltask/
├── cli/                # CLI commands (ft <noun> <verb>)
├── core/               # Completion checking, config
├── data/               # Static data files
├── db/                 # Database connection, migrations
├── epics/              # Epic CRUD, YAML parsing
├── git/                # Worktree management, PR queries
├── hooks/              # Hook utilities (shared with hooks/)
├── llm/                # LLM integration (OpenRouter)
├── review/             # Review context, prompt building
├── skills/             # Skill metadata, span tracking
├── state/              # Findings, session tracking
├── tasks/              # Task lifecycle, dependencies, guards
├── validators/         # PreToolUse validators (TDD, doc-guard)
├── vault/              # Knowledge storage
├── workers/            # Worker spawning, monitoring
├── apps/               # TUI applications (dashboard)
└── utils/              # Shared utilities
agents/                 # Subagent definitions
hooks/                  # Hook entry points for Claude Code events
tests/                  # Test suite
.githooks/              # Tracked git hooks
.claude/
└── formaltask.db       # Task database (auto-created by ft setup)
```

See the [CLI Reference](docs/cli/index.md) for full command documentation, [Planning Workflow](skills/PLANNING-WORKFLOW.md) for the plan→critique→revise→decompose lifecycle, and [Architecture Overview](docs/architecture/overview.md) for how the pieces fit together.

### Dashboard

The interactive TUI dashboard (`ft work dashboard`) provides real-time monitoring and control of parallel workers.

![Dashboard](docs/assets/dashboard.png)

**Layout:** Status bar (top) showing task counts and auto-spawn state, task list (middle) with color-coded health indicators, terminal pane (bottom) showing the selected worker's output.

**Worker states:** Each task shows a health indicator — **LIVE** (running), **EXIT** (process ended), **HELP** (needs human input), **FIX** (has review findings), or **queued** (ready to spawn).

**Keybindings:**

| Key | Action |
|-----|--------|
| `j` / `k` | Navigate task list |
| `Enter` | Attach to selected worker (F12 to detach back) |
| `S` | Spawn next queued task |
| `A` | Toggle auto-spawn (automatically fills worker slots) |
| `+` / `-` | Adjust max worker limit (1-10) |
| `X` | Kill selected worker (double-tap to confirm) |
| `R` | Restart selected worker (double-tap to confirm) |
| `i` | Open inbox (blocked workers awaiting input) |
| `q` | Quit |

**Auto-spawn** fills available worker slots from the task queue. The status bar shows the current limit (e.g. `auto (5)`). Adjust with `+`/`-` to scale up or down without leaving the dashboard. This is the interactive equivalent of `ft work watch --spawn`.

## Development

### Running Tests

```bash
pytest tests/ --cov=formaltask
```

### Linting

```bash
ruff check formaltask/ --fix
ruff format formaltask/
```

### Type Checking

```bash
basedpyright formaltask/
```

## License

MIT License. See [LICENSE](LICENSE) for details.
