Metadata-Version: 2.1
Name: openeo-processes-dask
Version: 2026.6.1
Summary: Python implementations of many OpenEO processes, dask-friendly by default.
Home-page: https://github.com/Open-EO/openeo-processes-dask
License: Apache 2.0
Author: Lukas Weidenholzer
Author-email: lukas.weidenholzer@eodc.eu
Maintainer: EODC Staff
Maintainer-email: support@eodc.eu
Requires-Python: >=3.10,<3.13
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: deforestation
Provides-Extra: implementations
Provides-Extra: ml
Requires-Dist: dask-geopandas (==0.4.3) ; extra == "implementations"
Requires-Dist: dask[array,dataframe,distributed] (>=2023.4.0,<2025.2.0) ; extra == "implementations"
Requires-Dist: geopandas (>=0.11.1,<1) ; extra == "implementations"
Requires-Dist: geoparquet (>=0.0.3,<0.0.4)
Requires-Dist: joblib (>=1.3.2) ; extra == "implementations"
Requires-Dist: numpy (<2.0.0)
Requires-Dist: odc-geo (>=0.4.1,<1) ; extra == "implementations"
Requires-Dist: odc-stac (>=0.3.9) ; extra == "implementations"
Requires-Dist: openeo (>=0.36.0)
Requires-Dist: openeo-pg-parser-networkx (>=2025.10) ; extra == "implementations"
Requires-Dist: pandas (>=2.0.0)
Requires-Dist: planetary_computer (>=0.5.1) ; extra == "implementations"
Requires-Dist: pyarrow (>=15.0.2,<16.0.0)
Requires-Dist: pystac (<1.12.0)
Requires-Dist: pystac_client (>=0.6.1) ; extra == "implementations"
Requires-Dist: rasterio (>=1.3.4,<2.0.0) ; extra == "implementations"
Requires-Dist: rioxarray (>=0.12.0,<1) ; extra == "implementations"
Requires-Dist: rqadeforestation (>=0.1) ; extra == "deforestation"
Requires-Dist: scipy (>=1.11.3,<2.0.0)
Requires-Dist: stac_validator (>=3.3.1) ; extra == "implementations"
Requires-Dist: xarray (>=2022.11.0,<2025.08.01) ; extra == "implementations"
Requires-Dist: xcube-eopf (>=0.2.0)
Requires-Dist: xgboost (>=1.5.1,<2.1.4) ; extra == "ml"
Requires-Dist: xvec (==0.2.0) ; extra == "implementations"
Requires-Dist: zarr (<=2.18.7)
Project-URL: Repository, https://github.com/Open-EO/openeo-processes-dask
Description-Content-Type: text/markdown

# OpenEO Processes Dask

[![Poetry](https://img.shields.io/endpoint?url=https://python-poetry.org/badge/v0.json)](https://python-poetry.org/)
![PyPI - Status](https://img.shields.io/pypi/status/openeo-processes-dask)
![PyPI](https://img.shields.io/pypi/v/openeo-processes-dask)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/openeo-processes-dask)
[![codecov](https://codecov.io/github/Open-EO/openeo-processes-dask/branch/main/graph/badge.svg?token=RA82MUN9RZ)](https://codecov.io/github/Open-EO/openeo-processes-dask)

`openeo-processes-dask` is a collection of Python implementations of [OpenEO processes](https://processes.openeo.org/) based on the [xarray](https://github.com/pydata/xarray)/[dask](https://github.com/dask/dask) ecosystem. It is intended to be used alongside [openeo-pg-parser-networkx](https://github.com/Open-EO/openeo-pg-parser-networkx), which handles the parsing and execution of [OpenEO process graphs](https://openeo.org/documentation/1.0/developers/api/reference.html#section/Processes/Process-Graphs). That repository also provides a tutorial on how to register process implementations from an arbitrary source (e.g. this repo) in the registry of available processes.
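
For orientation, a process graph is a JSON object in which each node names a `process_id` and its `arguments`, and one node is flagged as the result. The sketch below only illustrates that shape: the node ids `add1` and `mul1` are made up, and the helper `find_result_node` is hypothetical; it is not part of this package or of openeo-pg-parser-networkx.

```python
import json

# A tiny, hand-written process graph computing (3 + 4) * 2.
# The node ids ("add1", "mul1") are arbitrary labels chosen for this example.
process_graph = {
    "add1": {
        "process_id": "add",
        "arguments": {"x": 3, "y": 4},
    },
    "mul1": {
        "process_id": "multiply",
        # "from_node" refers to the output of another node in the same graph.
        "arguments": {"x": {"from_node": "add1"}, "y": 2},
        "result": True,
    },
}


def find_result_node(graph: dict) -> str:
    """Return the id of the single node flagged with "result": true (hypothetical helper)."""
    result_ids = [node_id for node_id, node in graph.items() if node.get("result")]
    if len(result_ids) != 1:
        raise ValueError(f"expected exactly one result node, found {len(result_ids)}")
    return result_ids[0]


if __name__ == "__main__":
    print(find_result_node(process_graph))
    print(json.dumps(process_graph["mul1"]["arguments"]))
```

Actual parsing and execution of such graphs is the job of openeo-pg-parser-networkx; this snippet only walks a plain dictionary.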

## Installation

### Recommended installation (with system GDAL)

If you already have GDAL installed on your system (or in a conda/micromamba environment), install the Python bindings that match your GDAL version alongside this package via pip:

```bash
pip install "gdal==$(gdal-config --version)" openeo-processes-dask
```

By default, this installs only the JSON process specs.
To install the actual implementations, add the `implementations` extra:


```bash
pip install "gdal==$(gdal-config --version)" "openeo-processes-dask[implementations]"
```

This ensures that the Python bindings link against your system GDAL libraries and avoids pip pulling a mismatched GDAL wheel.
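
To check afterwards that the bindings and the system library agree, something like the following sketch may help. It assumes that `gdal-config` is on the `PATH` and that the bindings are registered under the distribution name `GDAL`; the three helper functions are illustrative and not part of this package. The exact string comparison mirrors the `gdal==$(gdal-config --version)` pin above.

```python
from __future__ import annotations

import shutil
import subprocess
from importlib import metadata


def system_gdal_version() -> str | None:
    """Return the version printed by `gdal-config --version`, or None if unavailable."""
    if shutil.which("gdal-config") is None:
        return None
    result = subprocess.run(
        ["gdal-config", "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


def python_gdal_version() -> str | None:
    """Return the installed version of the GDAL Python bindings, or None if missing."""
    try:
        # Assumption: the bindings are installed under the distribution name "GDAL".
        return metadata.version("GDAL")
    except metadata.PackageNotFoundError:
        return None


def versions_match(system: str | None, bindings: str | None) -> bool:
    """True only when both versions are known and identical strings."""
    return system is not None and system == bindings


if __name__ == "__main__":
    system, bindings = system_gdal_version(), python_gdal_version()
    print(f"system GDAL: {system}, Python bindings: {bindings}")
    print("match" if versions_match(system, bindings) else "mismatch or GDAL not found")
```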

---

### Installing GDAL if you don’t have it yet

**System packages (Ubuntu/Debian):**

```bash
sudo apt-get install gdal-bin libgdal-dev python3-gdal
pip install "gdal==$(gdal-config --version)" "openeo-processes-dask[implementations]"
```

**Conda (recommended for most users):**

```bash
conda create -n openeo_processes_dask -c conda-forge python=3.12 gdal
conda activate openeo_processes_dask
pip install "openeo-processes-dask[implementations]"
```

**Micromamba (lightweight alternative to conda):**

```bash
micromamba create -n openeo_processes_dask -c conda-forge python=3.12 gdal
micromamba activate openeo_processes_dask
pip install "openeo-processes-dask[implementations]"
```

---

### Extra build variants

A subset of process implementations with heavy or unstable dependencies is available only through these extras:

* **ML processes:**

  ```bash
  pip install "openeo-processes-dask[ml]"
  ```
* **Deforestation processes:**

  ```bash
  pip install "openeo-processes-dask[deforestation]"
  ```

⚠️ **Note on GDAL:**
Some extras (e.g. `implementations`, `ml`, `deforestation`) may trigger installation of packages that depend on GDAL.
To avoid version conflicts, make sure you have installed GDAL first (via conda/micromamba or system packages) and then install the extras as shown above.
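
To see which dependencies each extra pulls in, the package metadata can be grouped by its `extra == "..."` markers. The following is a rough sketch using only the standard library; `group_by_extra` is a hypothetical helper, and the sample entries are copied from this release's `Requires-Dist` metadata so the snippet also runs when the package is not installed.

```python
import re
from collections import defaultdict
from importlib import metadata

# A few Requires-Dist entries copied from this package's metadata.
SAMPLE_REQUIRES_DIST = [
    'geopandas (>=0.11.1,<1) ; extra == "implementations"',
    'rqadeforestation (>=0.1) ; extra == "deforestation"',
    'xgboost (>=1.5.1,<2.1.4) ; extra == "ml"',
    "numpy (<2.0.0)",
]

# Matches markers such as: extra == "ml"  or  extra == 'ml'
EXTRA_MARKER = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def group_by_extra(requirements):
    """Map each extra to its package names; the key None holds unconditional dependencies."""
    groups = defaultdict(list)
    for requirement in requirements:
        match = EXTRA_MARKER.search(requirement)
        extra = match.group(1) if match else None
        # The distribution name ends at the first space, bracket, operator, or ";".
        name = re.split(r"[\s(;<>=!~\[]", requirement, maxsplit=1)[0]
        groups[extra].append(name)
    return dict(groups)


if __name__ == "__main__":
    try:
        requirements = metadata.requires("openeo-processes-dask") or []
    except metadata.PackageNotFoundError:
        # Fall back to the sample entries when the package is not installed.
        requirements = SAMPLE_REQUIRES_DIST
    for extra, names in group_by_extra(requirements).items():
        print(extra or "(always installed)", "->", ", ".join(names))
```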

---

## Development environment

Developing openeo-processes-dask requires Poetry >1.2; see the [Poetry docs](https://python-poetry.org/docs/#installation) for installation instructions.

Clone the repository with `--recurse-submodules` to also fetch the process specs:

```bash
git clone --recurse-submodules git@github.com:Open-EO/openeo-processes-dask.git
```

To set up the Python virtual environment and install this project into it, run:

```bash
poetry install --all-extras
```

⚠️ **Note on GDAL for development:**
When using `poetry install --all-extras`, Poetry will attempt to install GDAL via pip, which may pull the latest GDAL wheels and cause conflicts with system libraries.
It is strongly recommended to create a conda/micromamba environment with GDAL preinstalled before running `poetry install`. For example:

```bash
conda create -n openeo_processes_dask_dev -c conda-forge python=3.12 gdal
conda activate openeo_processes_dask_dev
poetry install --all-extras
```

---

To add a new core dependency, run:

```bash
poetry add some_new_dependency
```

To add a new development dependency, run:

```bash
poetry add some_new_dependency --group dev
```

To run the test suite, use:

```bash
poetry run python -m pytest
```

You can also use the virtual environment generated by Poetry as the kernel for Jupyter (`.ipynb`) notebooks.

### Pre-commit hooks

This repo uses [pre-commit](https://pre-commit.com/) hooks to enforce linting and a few sanity checks. In a fresh development setup, install the hooks using `poetry run pre-commit install`. The hooks then run automatically against your changes before each commit.

### Specs

The JSON specs for the individual processes are tracked as a git submodule in `openeo_processes_dask/specs/openeo-processes`.
The raw JSON for a specific process can be imported using `from openeo_processes_dask.specs import reduce_dimension`.
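
As a rough illustration of working with an imported spec, the sketch below assumes the spec is exposed as a plain dictionary with the usual `id` and `parameters` keys of an OpenEO process definition. `summarize_spec` is a hypothetical helper, and `EXAMPLE_SPEC` is an abbreviated stand-in used only when the package is not installed.

```python
# Abbreviated stand-in for a process spec; the real JSON contains more fields.
EXAMPLE_SPEC = {
    "id": "reduce_dimension",
    "parameters": [
        {"name": "data"},
        {"name": "reducer"},
        {"name": "dimension"},
        {"name": "context"},
    ],
}


def summarize_spec(spec: dict) -> dict:
    """Extract the process id and parameter names from a spec dictionary."""
    return {
        "id": spec.get("id"),
        "parameters": [param.get("name") for param in spec.get("parameters", [])],
    }


if __name__ == "__main__":
    try:
        # Import path as given above; assumed here to yield a dict.
        from openeo_processes_dask.specs import reduce_dimension
    except ImportError:
        reduce_dimension = EXAMPLE_SPEC
    print(summarize_spec(reduce_dimension))
```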

To bump these specs to a later version, check out the desired tag in the submodule with `git -C openeo_processes_dask/specs/openeo-processes checkout <tag>`, then stage the updated submodule pointer with `git add openeo_processes_dask/specs/openeo-processes`.

