Metadata-Version: 2.4
Name: airflow-provider-datris
Version: 0.1.1
Summary: Apache Airflow provider for Datris — trigger and monitor Datris taps from DAGs.
Project-URL: Homepage, https://datris.ai
Project-URL: Documentation, https://docs.datris.ai/integrations/airflow
Project-URL: Source, https://github.com/datris/airflow-provider-datris
Author-email: Datris <support@datris.ai>
License: Apache-2.0
License-File: LICENSE
Keywords: airflow,datris,etl,orchestration,provider
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: Apache Airflow
Classifier: Framework :: Apache Airflow :: Provider
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.9
Requires-Dist: apache-airflow>=3.0.0
Requires-Dist: requests>=2.27.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: responses>=0.23; extra == 'dev'
Description-Content-Type: text/markdown

# airflow-provider-datris

An [Apache Airflow](https://airflow.apache.org/) provider for
[Datris](https://datris.ai). Airflow orchestrates; Datris executes. This package
ships an operator that triggers a Datris tap, waits for the resulting pipeline to
reach a terminal state, streams Datris logs into the Airflow task log, and pushes
run tokens and metrics back as XComs.

No pipeline execution moves into Airflow: it only triggers and observes Datris runs.

## Requirements

- Apache Airflow 3.x
- A reachable Datris server (minimum version: the current Datris release at the
  time this provider was published)

## Install

```bash
pip install airflow-provider-datris
```

## Configure a connection

Add a connection of type **Datris**:

| Field    | Value                                                    |
| -------- | -------------------------------------------------------- |
| Host     | Datris base URL, e.g. `https://datris.example.com`       |
| Password | API key (`x-api-key`) — only if the install requires one |
| Extra    | `{"verify": true, "timeout": 30}` (optional)             |

The API key is **optional**: leave it blank for a local single-tenant install
with `useApiKeys=false`. For a hosted/multi-tenant install, use a key that holds
the `tap:run` capability for the taps you trigger. A key scoped to
`tap:run:owner=self` can only run taps that the key itself created.
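
The connection can also be supplied without the UI, through Airflow's
`AIRFLOW_CONN_<CONN_ID>` environment variable in JSON form. The sketch below
builds that value from the fields in the table above. The `conn_type` string
`datris` and the helper name `datris_conn_json` are assumptions made for
illustration; check the connection type the provider actually registers.

```python
import json
import os


def datris_conn_json(host, api_key=None, verify=True, timeout=30):
    """Build a JSON connection definition for an AIRFLOW_CONN_* variable."""
    conn = {
        "conn_type": "datris",  # assumption: type name registered by the provider
        "host": host,
        "extra": {"verify": verify, "timeout": timeout},
    }
    if api_key:
        # Only set a password when the Datris install requires an API key.
        conn["password"] = api_key
    return json.dumps(conn)


# Airflow resolves the conn id "datris_default" from this variable name.
os.environ["AIRFLOW_CONN_DATRIS_DEFAULT"] = datris_conn_json(
    "https://datris.example.com"
)
```

The variable has to be present in the environment of every Airflow process
that runs the task, so in practice it is usually set in the deployment
configuration rather than in Python.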

## Usage

```python
from datris_provider.operators import DatrisRunTapOperator

ingest = DatrisRunTapOperator(
    task_id="ingest_customers",
    tap_name="customers_pg_to_snowflake",
    datris_conn_id="datris_default",
    wait_for_completion=True,
    poll_interval=15,
    tap_params={"since": "{{ ds }}"},  # optional per-run params (env vars in the tap)
)
```

### Behavior

- Triggers the tap via `POST /tap/run` (`mode=run`).
- Polls `GET /pipeline/status?publishertoken=…&withrollup=true` until the rollup
  reaches a terminal state, streaming Datris log events into the task log.
- Pushes `publisher_token`, `pipeline_tokens`, `row_count`, and `duration_ms`
  as XComs (the operator's return value).
- A debounced trigger (`status=skipped, persistedReason=debounced`) is treated
  as success-with-warning.
- On Airflow task kill/timeout, calls `POST /job/kill` for the in-flight jobs.
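
The polling step above can be sketched roughly as follows. This is not the
provider's implementation: the function `wait_for_terminal`, the state names in
`TERMINAL_STATES`, and the `state` and `logs` keys of the status dict are all
hypothetical, since the response shape of `/pipeline/status` is not described
here. The status call is injected as a callable so the loop can be exercised
without a Datris server.

```python
import time

# Assumption: placeholder state names, not necessarily Datris's real values.
TERMINAL_STATES = {"success", "failed", "killed"}


def wait_for_terminal(fetch_status, poll_interval=15, timeout=3600,
                      sleep=time.sleep, clock=time.monotonic, log=print):
    """Poll fetch_status() until it reports a terminal state or time runs out.

    fetch_status is any callable returning a dict with a "state" key and an
    optional "logs" list (both keys are hypothetical).
    """
    deadline = clock() + timeout
    seen = 0  # number of log events already forwarded
    while True:
        status = fetch_status()
        logs = status.get("logs", [])
        for event in logs[seen:]:
            log(event)  # forward only events not seen on earlier polls
        seen = len(logs)
        if status["state"] in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError("pipeline did not reach a terminal state in time")
        sleep(poll_interval)
```

In a real task, `fetch_status` would wrap the `GET /pipeline/status` request for
the run's publisher token, and `log` would write to the Airflow task logger.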

### Tip: avoid double-firing

If a tap is driven by Airflow, mark it **externally scheduled** in Datris so the
built-in scheduler doesn't also fire it.

## Development

```bash
pip install -e ".[dev]"
pytest
```

## License

Apache-2.0. The Datris engine itself is licensed separately (AGPL-3.0).
