Metadata-Version: 2.4
Name: choola
Version: 0.6.0
Summary: A workflow engine for VS Code developers — build, run, and automate workflows with Python nodes.
License-Expression: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anthropic>=0.49.0
Requires-Dist: click>=8.1
Requires-Dist: flask>=3.0
Requires-Dist: google-genai>=1.0.0
Requires-Dist: requests>=2.31.0
Requires-Dist: aiofiles>=23.0
Requires-Dist: google-api-python-client>=2.0
Requires-Dist: google-auth>=2.0
Requires-Dist: cryptography>=42.0
Requires-Dist: chromadb>=0.5
Dynamic: license-file

# Choola

**An automation programming framework for AI agents.**

Choola is a Python-first framework for building automations *with* coding agents like Claude Code — not around them. You describe the automation in plain language, a coding agent scaffolds it into a graph of self-contained Python nodes, and the engine runs it with full traceability, cost discipline, and a deterministic execution model that agents can inspect and improve over time.

The framework is deliberately small. A workflow is a folder of Python files. A node is one file. Nodes talk to each other via JSON payloads. That's it — and that's precisely what makes Choola a comfortable surface for agents to generate and evolve code against.

---

> **⚠️ Early-Stage Project — Not for Production**
>
> Choola is under active development. Core node classes, the payload contract, and internal APIs **may change drastically between versions** without backward compatibility. We do not recommend using Choola in production systems at this time. It is intended as an exploration platform and learning tool.

---

## Why Choola

Coding agents are very good at writing small, self-contained functions with clear inputs and outputs. They are much worse at editing sprawling, implicit, cross-file orchestration. Choola turns automation into the first shape and avoids the second:

- **Agent-generated by design** — ships with [Claude Code](https://claude.ai/code) slash commands (`/choola`, `/node`) that turn English descriptions into working workflows. The framework's grep-friendly docstrings, single-file nodes, and explicit payload contracts are tuned for the way agents read and write code.
- **Simple node isolation** — every node is one `.py` file. No cross-node imports. No shared mutable state. The *only* way data moves between nodes is a JSON payload through `execute(payload, context)`. An agent can understand, edit, or replace any single node without reading the rest of the workflow.
- **Cost guardrails built into the contract** — nodes declare a `@cost` tag (`free`, `paid-one-shot`, `paid-per-item`, `paid-per-call`). Paid loop nodes are required to expose `max_items` caps and `max_consecutive_errors` circuit breakers. `choola replay` re-runs a single node against a saved input so you never pay for the whole pipeline twice while debugging.
- **Deterministic flow, AI inside** — the DAG is fixed, topologically sorted, and inspectable. The creativity goes *inside* nodes (LLM calls, extraction, classification) where it belongs, not into the orchestration.
- **Full execution traces, every run** — each run produces an evaluation JSON with per-node input, output, timing, and errors. Agents use these to diagnose and fix workflows the same way a developer would.
- **Visual editor + CLI, same source of truth** — the editor renders the same Python files the CLI runs. You can build in the browser or edit the files directly, and the two never drift.
- **Branching, merging, and conditional routing** — fan a payload out to parallel branches, merge them back with per-parent access, or let any node decide at runtime which branches to activate via `__active_branches__`.
- **Per-workflow SQLite, globals, and encrypted credentials** — state when you want it, none of it hidden. Each workflow gets its own isolated DB; credentials live in the engine's store and are fetched via `await self.get_credential(name)`.

---

## For End Users

### Install

```bash
pip install choola
```

### Initialize a project

In any empty directory:

```bash
choola init          # Creates workflows/ and choola.db
choola start         # Opens the visual editor at http://localhost:5000
```

### Build a workflow with Claude Code

If you use Claude Code, this is the shortest path from idea to running automation:

```
/choola build a workflow that takes an uploaded PDF, summarizes it with Claude,
and emails me the summary
```

Claude reads the framework's rules, scaffolds the folder, writes one node per step (form trigger → PDF extractor → LLM → Gmail), wires the DAG, and leaves you with a workflow you can run. `/node` does the same thing for a single node added to an existing workflow.

### Run it

From the UI: click a workflow, press **Run**, watch execution stream live.

From the CLI:

```bash
choola create my-workflow                                  # Scaffold a new workflow
choola list                                                # List all workflows
choola run my-workflow --payload '{"key": "value"}'        # Run headlessly
choola replay my-workflow <run_id> <node_id>               # Re-run one node against saved input
choola nodes                                               # List core node types
```

### Debug with evaluations

Every run writes `workflows/<name>/evaluations/<run_id>.json` containing:

- Top-level `status`, total duration, initial and final payload
- Per-node `input`, `output`, `status`, `duration_ms`, and full traceback on error

This is the primary debugging surface. When something misbehaves, open the evaluation, find the node with `"status": "ERROR"`, read the traceback, fix the node, and use `choola replay` to re-execute just that node against its original input — no re-running expensive upstream LLM calls.
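The scan itself is a one-liner over the trace. The exact JSON layout may differ by version; the shape below is an illustrative sketch built from the fields listed above (`status`, `duration_ms`, per-node `traceback`), with `nodes` assumed as the per-node container:

```python
import json

# Illustrative evaluation shape; the real file layout may differ by version.
evaluation = {
    "status": "ERROR",
    "nodes": {
        "fetch_data": {"status": "SUCCESS", "duration_ms": 120},
        "summarize": {
            "status": "ERROR",
            "duration_ms": 4031,
            "traceback": "ValueError: some_key is missing",
        },
    },
}

# Find the failing node(s) so you know what to replay.
failed = [
    node_id
    for node_id, result in evaluation["nodes"].items()
    if result["status"] == "ERROR"
]
print(failed)  # ['summarize']
```

In practice you would `json.load()` the file from `workflows/<name>/evaluations/<run_id>.json` instead of building the dict inline.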

### Cost discipline, out of the box

Choola assumes workflows will touch paid APIs and bakes guardrails into the node contract:

- Nodes declare `@cost:` in their docstring. Unmarked nodes that call `get_credential()` are treated as paid until proven otherwise.
- Paid loop nodes must expose `max_items` (small default, e.g. 20) and `max_consecutive_errors` (default 3). One bad API key cannot burn through a hundred calls.
- The framework's own rule for coding agents is **replay, don't re-run** when iterating on a downstream fix — and **no live paid calls during scaffolding**, only import checks, until the operator approves the spend.
- Classification and filter loops default to Haiku / Gemini Flash. Escalation to Sonnet/Opus is opt-in.
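The `max_items` / `max_consecutive_errors` guardrail a paid loop node is expected to implement can be sketched in plain Python (a minimal illustration, not the framework's own implementation; `call_api` stands in for any paid call):

```python
def guarded_loop(items, call_api, max_items=20, max_consecutive_errors=3):
    """Process at most max_items, stopping after a run of consecutive errors."""
    results, consecutive_errors = [], 0
    for item in items[:max_items]:          # hard cap on paid calls
        try:
            results.append(call_api(item))
            consecutive_errors = 0          # any success resets the breaker
        except Exception:
            consecutive_errors += 1
            if consecutive_errors >= max_consecutive_errors:
                break                       # circuit breaker trips
    return results

# A bad API key fails every call: the breaker trips after three attempts
# instead of burning through the whole list.
def broken_call(item):
    raise RuntimeError("invalid API key")

print(guarded_loop(list(range(100)), broken_call))  # []
```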

### Built-in triggers and core nodes

| Node | Purpose |
|---|---|
| `ManualTrigger` | Start from the UI "Run" button or `--payload '{...}'` |
| `WebhookTrigger` | Start from an HTTP request to a registered path |
| `FormTrigger` | Serve an HTML form; submission triggers the workflow. Form fields double as positional CLI args. |
| `LLM` | Call Claude or Gemini with an interpolated prompt template |
| `Gmail` | Send email via Gmail OAuth2 |
| `HTTP` | Call any HTTP endpoint with templated params |
| `DB` | Add a per-workflow SQLite database (schema declared in the node) |

Every core node is meant to be **extended, not instantiated directly** — your workflow's `nodes/` folder contains thin wrapper classes so the behavior stays yours to modify.
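The wrapper pattern is thin by design. In the sketch below, the `LLM` class is a hypothetical stand-in for the core node (the real one lives in `choola/core/nodes/llm.py` and does actual work); only the subclass at the bottom is what a workflow's `nodes/` folder would contain:

```python
# Hypothetical stand-in for the core LLM node; the wrapper pattern
# below is the point, not this stub.
class LLM:
    node_id = "llm"
    next_nodes: list[str] = []

# nodes/summarize.py: a thin wrapper that pins the instance id and
# wiring while inheriting the core node's behavior unchanged.
class Summarize(LLM):
    node_id = "summarize"
    next_nodes = ["send_email"]
```

Because the wrapper is yours, overriding `execute()` later to customize behavior is an edit to one file in your workflow, not a change to the framework.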

### Credentials

API keys and OAuth tokens live encrypted in `choola.db` and are never hardcoded. Manage them in **Settings → Credentials** in the UI, or via the API:

```
GET    /api/credentials          # List all (values masked)
POST   /api/credentials          # Create/update: { name, provider, value }
DELETE /api/credentials/<name>   # Delete
```

Access them inside a node:

```python
cred = await self.get_credential("my-anthropic-key")
if cred is None:
    raise RuntimeError("Credential 'my-anthropic-key' is not configured")
api_key = cred["value"]
```

---

## Anatomy of a Workflow

```
workflows/my_workflow/
├── topology.json          # UI layout + per-instance config (auto-managed)
├── files/                 # Binary/generated files (gitignored)
├── evaluations/           # Auto-generated run traces, one JSON per run
└── nodes/
    ├── __init__.py
    ├── fetch_data.py      # node_id="fetch_data", next_nodes=["summarize"]
    ├── summarize.py       # node_id="summarize", next_nodes=["send_email"]
    └── send_email.py      # node_id="send_email", next_nodes=[]
```

The DAG is defined entirely in code: each node's `next_nodes` attribute declares where its output goes. The engine discovers nodes, topologically sorts them, and executes in order. `topology.json` stores only canvas positions and per-instance config — never execution order.
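The discover-and-sort step can be sketched with the standard library's `graphlib` (an illustration of the ordering semantics, not the engine's actual code). Note that `next_nodes` declares successor edges, while `TopologicalSorter` expects predecessors, so the edges are inverted first:

```python
from graphlib import TopologicalSorter

# next_nodes as declared in each node file (successor edges)
next_nodes = {
    "fetch_data": ["summarize"],
    "summarize": ["send_email"],
    "send_email": [],
}

# TopologicalSorter wants predecessor sets, so invert the edges.
predecessors = {node: set() for node in next_nodes}
for node, successors in next_nodes.items():
    for succ in successors:
        predecessors[succ].add(node)

order = list(TopologicalSorter(predecessors).static_order())
print(order)  # ['fetch_data', 'summarize', 'send_email']
```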

### Branching and merging

```
Trigger (next_nodes=["branch_a", "branch_b"])
    ├──> BranchA (next_nodes=["merge"])
    └──> BranchB (next_nodes=["merge"])
              └──> Merge (next_nodes=[])
```

- **Split**: each downstream branch receives an isolated deep copy of the parent's output. Mutations in one branch never leak into another.
- **Merge**: incoming branches are shallow-merged in topological order (last-writer-wins). The merge node can also read individual parents via `context["parent_outputs"]`.
- **Conditional routing**: any node can return `{"__active_branches__": [...]}` to activate only a subset of its `next_nodes`. The engine strips the key before downstream nodes see it, and marks inactive branches as `SKIPPED`. Diamond patterns work correctly — a merge node is only skipped if *all* its parents are skipped.
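The split and merge semantics above can be demonstrated with plain dicts (this mirrors the documented behavior, not the engine's internal code):

```python
from copy import deepcopy

trigger_output = {"items": [1, 2], "meta": {"source": "trigger"}}

# Split: each branch receives an isolated deep copy of the parent's output.
branch_a_input = deepcopy(trigger_output)
branch_b_input = deepcopy(trigger_output)
branch_a_input["meta"]["source"] = "branch_a"   # mutation stays local
assert trigger_output["meta"]["source"] == "trigger"

# Each branch produces its own output payload.
branch_a_output = {**branch_a_input, "score": 1}
branch_b_output = {**branch_b_input, "score": 2}

# Merge: shallow merge in topological order, last writer wins.
merged = {}
for parent_output in (branch_a_output, branch_b_output):
    merged.update(parent_output)
assert merged["score"] == 2   # branch_b wrote last
```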

---

## For Developers — Extending the Framework

If your goal is to add new capabilities to Choola itself (new core nodes, new trigger types, new engine features), this section is for you.

### Prerequisites

- Python 3.10+
- Node.js 18+ and npm (for the React editor)

### Clone and install

```bash
git clone https://github.com/igrosny/choola.git
cd choola

python3 -m venv venv
source venv/bin/activate     # Windows: venv\Scripts\activate
pip install -e .

cd frontend
npm install
npm run build
cd ..
```

### Dev loop (two terminals)

```bash
# Terminal 1 — Flask backend
choola start --debug

# Terminal 2 — Vite frontend with HMR
cd frontend && npm run dev
```

Open `http://localhost:5173`. Vite proxies API calls to Flask at `:5000`.

### Package layout

```
choola/                       # The pip-installable package
├── __init__.py               # __version__ lives here
├── cli.py                    # CLI entry point
├── server.py                 # Flask API + execution engine + serves the React UI
├── database.py               # SQLite: globals, run logs, credentials
├── evaluations.py            # One JSON per run
├── CLAUDE.md                 # Workflow-authoring guide (copied on `choola init`)
├── core/
│   ├── base_node.py          # BaseNode — every node inherits from this
│   ├── CLAUDE.md             # Core node reference
│   └── nodes/                # Built-in core nodes
│       ├── trigger.py
│       ├── manual_trigger.py
│       ├── webhook_trigger.py
│       ├── form_trigger.py
│       ├── llm.py
│       ├── gmail.py
│       ├── http.py
│       └── db.py
└── static/dist/              # Pre-built React UI (rebuilt before release)

frontend/                     # React + XyFlow editor (Vite)
workflows/                    # Dev/test workflows (gitignored)
```

### Writing a node (the contract)

```python
"""
@choola-node: MyNodeName
@node-id: my_node_name
@category: processing
@description: Does one specific thing to the payload.
@next-nodes: next_node_id
@input-payload:
  - some_key (str): What this node expects
@output-payload:
  - some_key (str): Same or transformed
  - new_key (int): Something this node adds
@config-fields:
  - threshold (int, default=10): Controls the threshold
@example-input: {"some_key": "hello"}
@example-output: {"some_key": "hello", "new_key": 42}
@side-effects: none
@errors: Raises ValueError if some_key is missing
@cost: free
"""

from typing import Any
from choola.core.base_node import BaseNode


class MyNodeName(BaseNode):
    node_id = "my_node_name"
    name = "My Node Name"
    category = "processing"
    description = "Does one specific thing to the payload."
    next_nodes = ["next_node_id"]
    fields = [
        {"name": "threshold", "type": "number", "default": 10},
    ]

    async def execute(self, payload: dict[str, Any], context: dict[str, Any]) -> dict[str, Any]:
        return payload
```

The `@choola-node` docstring is not decoration — it is the agent-facing contract. `grep -r "@choola-node"` lists every node. `grep -r "@category: routing"` finds every router. `grep -r "@side-effects"` surfaces everything with external dependencies. **Keep the docstring in sync with the code** — after changing fields, payload shape, or errors, update the block.

### Helpers available on every node

| Call | Purpose |
|---|---|
| `await self.get_global(key)` | Read a cross-workflow persistent value |
| `await self.set_global(key, value)` | Write a cross-workflow persistent value |
| `await self.get_credential(name)` | Fetch a stored credential — returns `None` if missing (raise a clear error in that case) |
| `await self.db_query(sql, params)` | SELECT against the workflow's own SQLite at `files/db.sqlite`. Use `?` placeholders. |
| `await self.db_execute(sql, params)` | INSERT/UPDATE/DELETE against the workflow DB. Use `?` placeholders. |

### Adding or changing a core node

1. Edit or create the file in `choola/core/nodes/`.
2. It MUST inherit from `BaseNode` and include the `@choola-node` docstring.
3. Update `choola/core/CLAUDE.md` with the node's full API reference.
4. Update `choola/CLAUDE.md` if the node contract or workflow rules changed.
5. If `choola nodes` lists node types manually, update `choola/cli.py`.

### The three CLAUDE.md files

| File | Purpose |
|---|---|
| `/CLAUDE.md` | Dev environment guide for agents working on the engine |
| `/choola/CLAUDE.md` | Workflow-authoring guide — copied to user projects on `choola init` |
| `/choola/core/CLAUDE.md` | Core node reference — every core node's API |

These files are the agent-facing spec. When you change the engine, update them in the same commit.

### Committing frontend changes

```bash
cd frontend && npm run build && cp -r dist ../choola/static/dist
```

Commit both `frontend/src/` and `choola/static/dist/`.

### Cutting a release

```bash
# 1. Rebuild the UI
cd frontend && npm run build && cp -r dist ../choola/static/dist && cd ..

# 2. Bump the version in BOTH places
#    choola/__init__.py  ->  __version__ = "0.x.y"
#    pyproject.toml      ->  version = "0.x.y"

# 3. Build + publish
python -m build
python -m twine upload dist/*
```

### HTTP API reference

| Method | Path | Description |
|---|---|---|
| GET | `/api/nodes` | List all registered node types |
| GET | `/api/nodes/<node_type>/fields` | Get field definitions for a node |
| GET | `/api/nodes/<node_type>/source` | Read node source code |
| PUT | `/api/nodes/<node_type>/source` | Update node source code |
| GET | `/api/workflows` | List all workflows |
| POST | `/api/workflows` | Create a new workflow |
| GET | `/api/workflows/<name>/topology` | Get workflow topology |
| PUT | `/api/workflows/<name>/topology` | Update workflow topology |
| POST | `/api/workflows/<name>/run` | Execute a workflow |
| GET | `/api/workflows/<name>/stream/<run_id>` | SSE stream for live run status |
| POST | `/api/workflows/<name>/refresh` | Re-discover nodes from disk |
| POST | `/api/workflows/<name>/chat` | Chat with Claude about the workflow (SSE) |
| GET | `/api/workflows/<name>/trigger-info` | Get trigger type and config |
| GET | `/api/credentials` | List all credentials (values masked) |
| POST | `/api/credentials` | Create/update credential |
| DELETE | `/api/credentials/<name>` | Delete credential |
| POST | `/api/oauth2/gmail/start` | Initiate Gmail OAuth2 flow |
| GET | `/api/oauth2/gmail/callback` | Gmail OAuth2 callback |

---

## License

Apache 2.0 — see [LICENSE](LICENSE).
