Metadata-Version: 2.4
Name: airflow-providers-orchesjob
Version: 0.1.0
Summary: Apache Airflow provider for orchesjob – lightweight idempotent job runner
Author: Ryosuke Muraki
License-Expression: MIT
Keywords: airflow,provider,orchesjob
Classifier: Framework :: Apache Airflow
Classifier: Framework :: Apache Airflow :: Provider
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: apache-airflow>=2.6
Requires-Dist: apache-airflow-providers-ssh>=3.0
Provides-Extra: test
Requires-Dist: pytest>=8; extra == "test"
Requires-Dist: pytest-asyncio>=0.23; extra == "test"
Dynamic: license-file

# airflow-providers-orchesjob

Apache Airflow provider that runs jobs via [orchesjob](https://github.com/your-org/orchesjob) over SSH.

Uses Airflow's **Deferrable Operator** pattern — the worker slot is released
while the remote job runs, and the task resumes automatically when the job
reaches a terminal state.
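Conceptually, the trigger side of this pattern is a polling loop that yields until the job finishes. The sketch below illustrates that loop only; it is not the provider's actual code, and the `SUCCEEDED` state name is an assumption (the other states appear in the Error Handling table below):

```python
import asyncio

# Assumed state names: FAILED / LOST / CANCELLED come from the Error
# Handling table; SUCCEEDED is a placeholder for the success state.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED", "LOST"}

async def wait_for_terminal(poll_status, poll_interval: float = 30.0) -> str:
    """Poll the remote job status until it reaches a terminal state."""
    while True:
        state = await poll_status()
        if state in TERMINAL_STATES:
            return state
        await asyncio.sleep(poll_interval)

# Demo with a fake status source that finishes on the third poll.
states = iter(["STARTING", "RUNNING", "SUCCEEDED"])

async def fake_status() -> str:
    return next(states)

result = asyncio.run(wait_for_terminal(fake_status, poll_interval=0.01))
print(result)  # SUCCEEDED
```

Because the loop runs inside Airflow's triggerer process rather than a worker, no worker slot is consumed while the job is in flight.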

## Requirements

- Apache Airflow ≥ 2.6
- `apache-airflow-providers-ssh` ≥ 3.0
- `orchesjob` installed on the remote host

## Installation

```bash
pip install airflow-providers-orchesjob
```

## Setup

Create an **SSH Connection** in Airflow (Admin → Connections):

| Field | Value |
|-------|-------|
| Conn Id | `my_ssh` (any name) |
| Conn Type | SSH |
| Host | remote host |
| Username | SSH user |

## Usage

```python
from airflow import DAG
from airflow_providers_orchesjob.operators.orchesjob import OrchesJobOperator

with DAG("my_dag", ...):
    import_task = OrchesJobOperator(
        task_id="daily_import",
        command=["/jobs/import.sh", "--date", "{{ ds }}"],
        ssh_conn_id="my_ssh",
    )
```

### Idempotency

By default, `run_key` is auto-generated as `{dag_id}__{task_id}__{run_id}`,
which makes each DAG run idempotent: re-triggering the same run returns the
existing job if it is still active.
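For reference, the default key can be reconstructed from the run's identifiers (a sketch of the format stated above; the `run_id` value is an illustrative example):

```python
# Example identifiers; run_id follows Airflow's scheduled-run convention.
dag_id = "my_dag"
task_id = "daily_import"
run_id = "scheduled__2024-01-01T00:00:00+00:00"

# Default run_key format as documented above.
run_key = f"{dag_id}__{task_id}__{run_id}"
print(run_key)  # my_dag__daily_import__scheduled__2024-01-01T00:00:00+00:00
```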

Supply an explicit `run_key` to control the key yourself:

```python
OrchesJobOperator(
    task_id="import",
    command=["/jobs/import.sh"],
    ssh_conn_id="my_ssh",
    run_key="daily-import-{{ ds }}",
)
```

Use `strict=True` to prevent any re-execution with the same key, even after
the previous job has finished:

```python
OrchesJobOperator(
    ...,
    strict=True,
)
```
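The difference between the default and `strict` behaviour can be illustrated with a toy registry (a sketch of the semantics described above, not the provider's implementation; all names and keys here are hypothetical):

```python
# Hypothetical job registry keyed by run_key.
active = {"daily-import-2024-01-02": "RUNNING"}
finished = {"daily-import-2024-01-01": "SUCCEEDED"}

def submit(run_key: str, strict: bool = False) -> str:
    if run_key in active:
        return "attached"  # an active job with the same key is always reused
    if strict and run_key in finished:
        # strict=True refuses to re-run a key even after the job finished
        raise RuntimeError(f"{run_key!r} already ran and strict=True")
    return "started"  # otherwise a new job is started

print(submit("daily-import-2024-01-02"))  # attached
print(submit("daily-import-2024-01-01"))  # started (default re-runs finished keys)
```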

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `command` | `list[str]` | **required** | Command to run on the remote host |
| `ssh_conn_id` | `str` | **required** | Airflow SSH Connection ID |
| `run_key` | `str \| None` | auto | orchesjob idempotency key |
| `strict` | `bool` | `False` | Never re-run the same `run_key` |
| `poll_interval` | `float` | `30.0` | Seconds between status polls |
| `startup_timeout` | `float` | `60.0` | Seconds to wait for job to leave `STARTING` |
| `orchesjob_home` | `str \| None` | `None` | Override `ORCHESJOB_HOME` on remote host |

## Error Handling

| Event | Airflow behaviour |
|-------|-----------------|
| Job `FAILED` or `LOST` | `AirflowException` → honours task `retries` |
| `startup_timeout` exceeded | `AirflowException` → honours task `retries` |
| SSH connection failure | `AirflowException` → honours task `retries` |
| Job `CANCELLED` | `AirflowException` |

## License

MIT
