Metadata-Version: 2.4
Name: automation-farm-runner
Version: 0.4.0
Summary: Browser automation farm — YAML cron queue and Multilogin mlx-pool runner. CLI: farm-runner.
Project-URL: Homepage, https://github.com/automation-farm-runner/automation-farm-runner
Project-URL: Documentation, https://github.com/automation-farm-runner/automation-farm-runner#readme
Project-URL: Repository, https://github.com/automation-farm-runner/automation-farm-runner
Project-URL: Issues, https://github.com/automation-farm-runner/automation-farm-runner/issues
Author: automation-farm-runner contributors
License-Expression: MIT
License-File: LICENSE
Keywords: automation-farm,batch-runner,browser-automation-farm,cdp-jobs,concurrency-queue,cron-scheduler,farm-scheduler,job-queue,mlx-pool,multilogin-automation,playwright-farm,profile-pool,scheduled-jobs,subprocess-runner,yaml-jobs
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: croniter>=2.0
Requires-Dist: pydantic>=2.5
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: httpx>=0.27; extra == 'dev'
Requires-Dist: pytest-httpx>=0.34; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.8; extra == 'dev'
Provides-Extra: mlx
Requires-Dist: httpx>=0.27; extra == 'mlx'
Description-Content-Type: text/markdown

# automation-farm-runner

**Browser automation farm runner** — YAML cron job queue and Multilogin mlx-pool across folder profiles.

[![PyPI version](https://img.shields.io/pypi/v/automation-farm-runner.svg)](https://pypi.org/project/automation-farm-runner/)
[![Python versions](https://img.shields.io/pypi/pyversions/automation-farm-runner.svg)](https://pypi.org/project/automation-farm-runner/)
[![License: MIT](https://img.shields.io/pypi/l/automation-farm-runner.svg)](https://pypi.org/project/automation-farm-runner/)

```bash
pip install automation-farm-runner
farm-runner init && farm-runner validate .farm/jobs.yaml
```

CLI: **`farm-runner`** · Python **3.10+** · optional **`[mlx]`** for Launcher helpers


> **Coupon hub:** Verified MLX deals (`SAAS50` browser / `MIN50` cloud phone) — [Multilogin promo codes](https://anti-detect.github.io/). Core CLI works without a vendor account. [Affiliate disclosure](docs/AFFILIATE.md).

YAML job queue runner for **browser automation farms**. Execute Python scripts against **any CDP endpoint** — local Chrome, remote debugger, or MLX Launcher ports.

Scripts receive `CDP_URL`, `PROFILE_ID`, `FOLDER_ID`, and `FARM_JOB_ID` as environment variables. A script run outside `farm-runner` can read a `CDP_URL` you export yourself.
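
A script can collect these variables in one place before touching the browser. The following is a minimal sketch, not part of the package: `FarmEnv` and `read_farm_env` are hypothetical names, and treating only `CDP_URL` as required is an assumption.

```python
import os
from typing import Mapping, NamedTuple, Optional


class FarmEnv(NamedTuple):
    """Hypothetical container for the variables named in the script contract."""
    cdp_url: str
    profile_id: Optional[str]
    folder_id: Optional[str]
    job_id: Optional[str]


def read_farm_env(environ: Mapping[str, str] = os.environ) -> FarmEnv:
    """Read the contract variables; only CDP_URL is treated as required (assumption)."""
    try:
        cdp_url = environ["CDP_URL"]
    except KeyError:
        raise SystemExit("CDP_URL is not set; run via farm-runner or export it") from None
    return FarmEnv(
        cdp_url=cdp_url,
        profile_id=environ.get("PROFILE_ID"),
        folder_id=environ.get("FOLDER_ID"),
        job_id=environ.get("FARM_JOB_ID"),
    )
```

Calling `read_farm_env()` at the top of a script fails fast with a readable message when the script is started without a CDP endpoint.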

## Problem

Cron scripts across dozens of browser profiles need a queue: which script runs when, against which CDP endpoint, with retries and logs. Ad-hoc shell loops do not track job state, handle SIGINT gracefully, or scale concurrency. `farm-runner` provides YAML job definitions, cron scheduling, and optional MLX folder pools.

## Install

```bash
pip install automation-farm-runner

# MLX profile pool (list → start → script → stop)
pip install automation-farm-runner[mlx]
```

## Quick start

```bash
farm-runner init

farm-runner add-job \
  --script ./examples/task.py \
  --profile-id YOUR-PROFILE-UUID \
  --cron "0 9 * * *"

farm-runner validate .farm/jobs.yaml
farm-runner run --dry-run --force
farm-runner run --queue .farm/jobs.yaml --concurrency 3 --log-format json

farm-runner logs --tail
```

## CLI

| Command | Purpose |
|---------|---------|
| `farm-runner init` | Create `.farm/` workspace |
| `farm-runner add-job` | Append a job to the queue |
| `farm-runner validate` | Lint `jobs.yaml` schema and script paths |
| `farm-runner run` | Execute due jobs (`--dry-run`, `--force`, `--concurrency`) |
| `farm-runner mlx-pool` | Run a script across every profile in an MLX folder (`[mlx]`) |
| `farm-runner logs` | Tail `.farm/logs/` |
| `farm-runner schema` | Export JSON Schema for `jobs.yaml` |

## Architecture

Two execution modes share the same **script contract** (`CDP_URL`, `PROFILE_ID`, …) but differ in who provisions the browser:

```mermaid
flowchart LR
  subgraph cdp [Generic CDP mode]
    Q[jobs.yaml queue]
    R[farm-runner run]
    S[Your script]
    B[Any CDP endpoint]
    Q --> R --> S --> B
  end

  subgraph mlx [MLX pool mode]
    F[MLX folder]
    P[farm-runner mlx-pool]
    L[MLX Launcher]
    S2[Your script]
    F --> P --> L --> S2
  end
```

### Generic CDP mode

Use `farm-runner run` with a YAML queue (`.farm/jobs.yaml`).

1. **Queue** — Jobs declare `script`, optional `cron`, `profile_id`, and per-job `cdp_url`.
2. **Scheduler** — Cron-gated jobs run only when due (`--force` skips the gate).
3. **Executor** — Spawns your Python script with env vars injected; logs to `.farm/logs/`.
4. **State** — Job lifecycle (`pending` → `running` → `success` / `failed` / `retry`) in `.farm/state.json`.
5. **CDP** — You connect Playwright/Puppeteer to an endpoint you already have running (`--cdp-url` or `CDP_URL`).

Best for: fixed debugger ports, self-hosted browsers, custom orchestrators, or MLX profiles you start yourself.
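
The executor step above can be pictured as a subprocess call with the contract variables merged into the child environment. This is a sketch under stated assumptions, not the package's implementation: `build_env` and `run_job` are hypothetical helpers, and the merge order (parent environment, then the job's `env` block, then the contract variables) is assumed.

```python
import os
import subprocess
import sys
from typing import Any, Mapping, Optional


def build_env(job: Mapping[str, Any], base: Optional[Mapping[str, str]] = None) -> dict:
    """Merge parent env, the job's env block, and the contract variables (assumed order)."""
    env = dict(os.environ if base is None else base)
    env.update({k: str(v) for k, v in (job.get("env") or {}).items()})
    contract = {
        "CDP_URL": job.get("cdp_url"),
        "PROFILE_ID": job.get("profile_id"),
        "FOLDER_ID": job.get("folder_id"),
        "FARM_JOB_ID": job.get("id"),
    }
    # Only export the variables the job actually defines.
    env.update({k: str(v) for k, v in contract.items() if v is not None})
    return env


def run_job(job: Mapping[str, Any], timeout: Optional[float] = None) -> int:
    """Run the job's script with the current interpreter and return its exit code.

    subprocess.run kills the child and raises TimeoutExpired when the timeout elapses.
    """
    proc = subprocess.run(
        [sys.executable, str(job["script"])],
        env=build_env(job),
        timeout=timeout,
    )
    return proc.returncode
```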

### MLX pool mode

Use `farm-runner mlx-pool` (requires `[mlx]` extra and `MLX_TOKEN`).

1. **Discover** — Paginate `profile/search` for a folder.
2. **Start** — Launcher API starts each profile with `automation_type=playwright`.
3. **Run** — Same script contract; `CDP_URL` set to `http://127.0.0.1:{port}` per profile.
4. **Stop** — Launcher stops the profile after the script exits.

Best for: one-shot “run this script on every profile in folder X” without maintaining a YAML queue.

| | Generic CDP | MLX pool |
|---|-------------|----------|
| Input | `.farm/jobs.yaml` | `--folder-id` + `--script` |
| Concurrency | `--concurrency` | `--concurrency` (parallel workers) |
| Cron | Yes | No |
| State file | `.farm/state.json` | Log files only |
| CDP source | You provide | Launcher assigns port |

## Job queue (`jobs.yaml`)

```yaml
version: 1
jobs:
  - id: task-a1b2c3d4
    script: ./task.py
    profile_id: 95f9061e-5656-845a-2801-7fff8f0f121f
    folder_id: 81b5627a-1212-4016-9467-3dbe4d6f78eb
    cdp_url: http://127.0.0.1:9222
    cron: "0 9 * * *"
    env:
      TARGET_URL: https://example.com
    retry:
      max_attempts: 3
      delay_seconds: 5
    timeout_seconds: 300
    enabled: true
```

Validate before running:

```bash
farm-runner schema -o jobs.schema.json
farm-runner validate .farm/jobs.yaml
```

| Field | Description |
|-------|-------------|
| `script` | Python file executed with the current interpreter |
| `cdp_url` / `CDP_URL` | Chrome DevTools HTTP endpoint |
| `profile_id` | Passed as `PROFILE_ID` |
| `cron` | Standard 5-field cron; jobs without `cron` are not gated and run on every `farm-runner run` |
| `retry` | `max_attempts` and `delay_seconds` between failures |
| `timeout_seconds` | Subprocess kill timeout |
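
To make the field rules concrete, here is a stdlib-only sketch of the kind of checks a validator might apply to one parsed job mapping. It is illustrative only: the package lists pydantic as a dependency, so its real validation presumably lives in a model, and `check_job`, the required fields, and the messages are all assumptions.

```python
from typing import Any, List, Mapping

_URL_PREFIXES = ("http://", "https://", "ws://", "wss://")


def check_job(raw: Mapping[str, Any]) -> List[str]:
    """Return human-readable problems for one job mapping (hypothetical rules)."""
    problems: List[str] = []
    # Assumption: every job needs an id and a script path.
    for key in ("id", "script"):
        if not raw.get(key):
            problems.append(f"missing required field: {key}")
    cron = raw.get("cron")
    if cron is not None and len(str(cron).split()) != 5:
        problems.append("cron must have exactly 5 fields")
    retry = raw.get("retry") or {}
    if int(retry.get("max_attempts", 1)) < 1:
        problems.append("retry.max_attempts must be >= 1")
    cdp_url = raw.get("cdp_url")
    if cdp_url is not None and not str(cdp_url).startswith(_URL_PREFIXES):
        problems.append("cdp_url must be an http(s) or ws(s) URL")
    return problems
```

An empty list means the mapping passed these checks; `farm-runner validate` remains the authoritative check.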

### Cron behavior

`farm-runner run` skips jobs whose `cron` is not due in the current minute. Use `--force` to run every enabled job.
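
The "due in the current minute" test can be illustrated with a small matcher. This is a simplified stdlib sketch, not the package's scheduler (which declares `croniter` as a dependency): `is_due` is a hypothetical name, and only numeric fields with `*`, lists, ranges, and `/step` are handled, not month or weekday names.

```python
from datetime import datetime


def _field_matches(field: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field: '*', 'a', 'a-b', comma lists, and '/step'."""
    for part in field.split(","):
        rng, _, step_s = part.partition("/")
        step = int(step_s) if step_s else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = int(rng)
            end = hi if step_s else start  # 'a/step' runs from a to the field maximum
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def is_due(expr: str, now: datetime) -> bool:
    """Return True when a 5-field cron expression matches the minute of `now`."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (now.weekday() + 1) % 7  # cron counts Sunday as 0, Python counts Monday as 0
    dom_ok = _field_matches(dom, now.day, 1, 31)
    dow_ok = _field_matches(dow, cron_dow, 0, 7) or (
        cron_dow == 0 and _field_matches(dow, 7, 0, 7)  # 7 is also accepted for Sunday
    )
    # Classic cron rule: when both day fields are restricted, either one may match.
    day_ok = (dom_ok or dow_ok) if dom != "*" and dow != "*" else (dom_ok and dow_ok)
    return (
        _field_matches(minute, now.minute, 0, 59)
        and _field_matches(hour, now.hour, 0, 23)
        and _field_matches(month, now.month, 1, 12)
        and day_ok
    )
```

With this reading, a job with `cron: "0 9 * * *"` is due only when the runner is invoked during 09:00, which is why `farm-runner run` itself is normally triggered every minute by an outer scheduler or run with `--force`.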

### Job state (`.farm/state.json`)

| State | Meaning |
|-------|---------|
| `pending` | Queued, not started (or re-queued after SIGINT) |
| `running` | First attempt in progress |
| `retry` | Failed attempt, retrying |
| `success` | Script exited 0 |
| `failed` | All attempts exhausted |

**SIGINT** — First Ctrl+C stops scheduling new jobs, waits for in-flight jobs to finish, and marks unstarted jobs `pending` in `state.json`.
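
One way to read the state table is as a small retry loop around a single attempt. The sketch below is an interpretation, not the package's code: `run_with_retry` and its `on_state` callback are hypothetical, and reporting `retry` at the start of each later attempt is an assumption about when that state is recorded.

```python
import time
from typing import Callable


def run_with_retry(
    attempt: Callable[[], int],
    max_attempts: int = 3,
    delay_seconds: float = 5.0,
    on_state: Callable[[str], None] = lambda state: None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Drive one job through running -> retry -> success/failed; return the final state."""
    for n in range(1, max_attempts + 1):
        # First attempt is 'running'; every later attempt is reported as 'retry'.
        on_state("running" if n == 1 else "retry")
        if attempt() == 0:  # exit code 0 means success
            on_state("success")
            return "success"
        if n < max_attempts:
            sleep(delay_seconds)  # wait between failed attempts
    on_state("failed")  # all attempts exhausted
    return "failed"
```

Passing `on_state` a function that writes to a state file would give the persisted lifecycle described above.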

### Dry run

```bash
farm-runner run --dry-run --force
```

Prints a JSON execution plan (job ids, scripts, env targets) without spawning scripts.
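
As a rough picture of such a plan, the sketch below serializes the enabled jobs without running anything. The key names and the top-level `jobs` wrapper are assumptions for illustration; the actual output shape of `--dry-run` may differ.

```python
import json
from typing import Any, Iterable, Mapping


def build_plan(jobs: Iterable[Mapping[str, Any]]) -> str:
    """Serialize enabled jobs into a JSON plan (hypothetical shape)."""
    plan = [
        {
            "id": job.get("id"),
            "script": job.get("script"),
            "cdp_url": job.get("cdp_url"),
            "profile_id": job.get("profile_id"),
        }
        for job in jobs
        if job.get("enabled", True)  # disabled jobs are left out of the plan
    ]
    return json.dumps({"jobs": plan}, indent=2)
```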

## Generic CDP script example

```python
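import os
from playwright.sync_api import sync_playwright

# CDP_URL is injected by farm-runner (or exported by you).
cdp = os.environ["CDP_URL"]
with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(cdp)
    # Reuse the profile's existing context and tab when present;
    # fall back to new ones so the script does not fail on an empty browser.
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.pages[0] if context.pages else context.new_page()
    page.goto("https://example.com")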

## MLX profile pool

```bash
export MLX_TOKEN=your_bearer_token
export MLX_FOLDER_ID=YOUR-FOLDER-UUID   # optional; or pass --folder-id

farm-runner mlx-pool \
  --folder-id "$MLX_FOLDER_ID" \
  --script ./task.py \
  --concurrency 3 \
  --filter-tag warmup \
  --max-attempts 2 \
  --delay 5
```

Flow per profile: search → Launcher start → script (with `CDP_URL` set) → Launcher stop, which always runs via `finally`. The Launcher client is implemented inline with httpx, so there is no `cdp-connect-kit` dependency. Logs are written to `.farm/logs/`.

| Flag / variable | Description |
|-----------------|-------------|
| `--concurrency` | Parallel profile workers (default 1) |
| `--filter-tag` | Run only profiles whose `tags` include this value |
| `--max-attempts` / `--delay` | Same retry semantics as `jobs.yaml` |
| `MLX_LAUNCHER_URL` | Environment variable; overrides the Launcher base URL (default `https://launcher.mlx.yt:45001`) |
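
The per-profile flow and the bounded worker count can be sketched with a thread pool and a `try`/`finally` around the script. This is an illustration under assumptions, not the package's code: `run_pool` is a hypothetical name, and `start`, `run_script`, and `stop` are placeholder callables standing in for the Launcher calls and the subprocess.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_pool(
    profile_ids: Iterable[str],
    start: Callable[[str], int],
    run_script: Callable[[str, str], int],
    stop: Callable[[str], None],
    concurrency: int = 1,
) -> Dict[str, int]:
    """Start each profile, run the script against its CDP URL, and always stop it."""

    def worker(profile_id: str) -> int:
        port = start(profile_id)  # placeholder: returns the port assigned on start
        try:
            return run_script(profile_id, f"http://127.0.0.1:{port}")
        finally:
            stop(profile_id)  # runs even when the script raises

    ids = list(profile_ids)
    # max_workers caps how many profiles are open at the same time.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return dict(zip(ids, pool.map(worker, ids)))
```

The returned mapping holds one exit code per profile. Keeping `stop` in `finally` is what prevents profiles from staying open after a crashed script.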

## Logs

```bash
farm-runner logs --tail 80
farm-runner logs --tail 40 --follow
farm-runner run --log-format json   # structured stderr diagnostics
```
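
For readers who want to consume structured diagnostics, the sketch below shows one common way to emit one JSON object per line on stderr with the standard `logging` module. The field names (`level`, `event`, `logger`, plus extras) are assumptions and are not taken from the actual `--log-format json` output.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object (hypothetical field names)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            "logger": record.name,
        }
        # Extra key/value pairs passed via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def make_logger(name: str = "farm-sketch", stream=None) -> logging.Logger:
    """Build a logger that writes JSON lines to stderr (or to the given stream)."""
    handler = logging.StreamHandler(sys.stderr if stream is None else stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `make_logger().info("job_finished", extra={"fields": {"job_id": "task-a1b2c3d4", "exit_code": 0}})` then produces one line that tools like `jq` can filter.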

## Python API

```python
from automation_farm_runner import run_queue, add_job, validate_queue

add_job(script="./task.py", profile_id="...", cron="0 9 * * *")
validate_queue(".farm/jobs.yaml")
summary = run_queue(".farm/jobs.yaml", concurrency=3, force=True)
```

## Limitations

- **Python scripts only** — Jobs spawn `python script.py`; no built-in Node or shell wrappers.
- **Cron minute granularity** — Five-field cron; sub-minute scheduling is not supported.
- **MLX pool is one-shot** — `mlx-pool` has no cron; use `run` + per-job `cdp_url` for scheduled MLX jobs you start yourself.
- **No browser bundled** — You provide CDP endpoints or MLX Launcher; Playwright is not installed by this package.
- **Local state** — `.farm/state.json` is not distributed; one runner per workspace.

## Production

Run `farm-runner mlx-pool` when one script must hit **every profile in an MLX folder** with bounded concurrency. Scripts receive `CDP_URL`, `PROFILE_ID`, and `FOLDER_ID`. Launcher lifecycle details: [cdp-connect-kit](https://pypi.org/project/cdp-connect-kit/) and `docs/MLX_INTEGRATION.md` (peer package).

**Partner note (affiliate):** Folder pools and cron queues are where **Multilogin X** pays off vs one Chrome profile — isolated Launcher sessions at scale. Eligible new browser purchases may accept code **`SAAS50`** at [multilogin.com](https://multilogin.com?a_aid=saas) — verify terms before checkout. Generic `farm-runner run` works with any `CDP_URL` without MLX. Full disclosure: [docs/AFFILIATE.md](docs/AFFILIATE.md). Deals hub: [anti-detect.github.io](https://anti-detect.github.io/). Coupon details only via `farm-runner --show-deal` (never printed by default).

**FAQ:** [docs/FAQ.md](docs/FAQ.md) — browser automation farm, YAML cron queue, Multilogin mlx-pool.

## Docker

Run the job queue in a container while **MLX Launcher stays on the host**. Mount `docker/scripts/` and point `CDP_URL` at the host CDP port via `host.docker.internal`.

```bash
export CDP_URL=http://host.docker.internal:9222
docker compose up --build
```

See **[docs/DOCKER.md](docs/DOCKER.md)** for Linux VPS setup, firewall notes, and cron alternatives.

## When shell loops break at N profiles (playbook)

Ad-hoc `for` loops over profile IDs lose state, ignore SIGINT, and overlap Launcher starts.

| Pain | Shell loop | `farm-runner` |
|------|------------|---------------|
| Cron scheduling | crontab per script | One `jobs.yaml`, `farm-runner run` |
| Retries | Manual | `retry.max_attempts` in YAML or `mlx-pool --max-attempts` |
| Concurrency | Race on ports | `--concurrency` with bounded workers |
| Logs | Scattered stdout | `.farm/logs/` + `--log-format json` |
| MLX lifecycle | Forgotten `stop` | `mlx-pool` stops in `finally` |

**Folder-scale warmup pipeline:**

```bash
export MLX_TOKEN=... MLX_FOLDER_ID=...
farm-runner mlx-pool --folder-id "$MLX_FOLDER_ID" \
  --script ./warmup_with_human_input.py --concurrency 3 --filter-tag prod
# warmup script: read CDP_URL, import human_input_kit, connect via Playwright
cdp-probe mlx --profile-id "$PROFILE_ID" --url https://example.com  # spot-check
```

**Scheduled generic CDP:** `farm-runner add-job` + cron — pair with `cdp-connect mlx-start` in the script when you manage Launcher yourself.

## Development

```bash
cd automation-farm-runner
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
ruff check .
python -m build
```

## Related tools (on PyPI)

| Package | CLI | Role |
|---------|-----|------|
| [playwright-cdp-probe](https://pypi.org/project/playwright-cdp-probe/) | `cdp-probe` | CDP / WebDriver exposure score |
| [cookie-jar-bridge](https://pypi.org/project/cookie-jar-bridge/) | `cookie-bridge` | Netscape ↔ Playwright cookies |
| [proxy-lane-checker](https://pypi.org/project/proxy-lane-checker/) | `proxy-lane` | Proxy TCP / HTTP / geo / DNSBL |
| [fingerprint-coherence](https://pypi.org/project/fingerprint-coherence/) | `fp-coherence` | UA / screen / timezone lint |
| [mlx-warmup-kit](https://pypi.org/project/mlx-warmup-kit/) | `mlx-warmup` | Warmup shell for mlx-pool jobs |
| [antidetect-importer](https://pypi.org/project/antidetect-importer/) | `antidetect-import` | Competitor → MLX migration |
| [cloud-phone-kit](https://pypi.org/project/cloud-phone-kit/) | `cloud-phone` | Cloud Phone readiness checks |

**Farm pipeline:** `mlx-warmup plan --format shell` → `farm-runner mlx-pool` → `cdp-probe run`

## License

MIT

---

**Production antidetect:** [Multilogin X](https://multilogin.com?a_aid=saas) · Code `SAAS50` (-50% browser) · [MIN50](https://multilogin.com?a_aid=saas) (-50% cloud phone)  
More scripts: [@Multilogin_Scripts_Bot](https://t.me/Multilogin_Scripts_Bot) · [Multilogin promo codes](https://anti-detect.github.io/)
