Metadata-Version: 2.4
Name: rasterizer
Version: 0.3
Summary: A Python package to rasterize GeoDataFrames
Author: Cyril Joly
Project-URL: Homepage, https://github.com/CyrilJl/rasterizer
Project-URL: Documentation, https://rasterizer.readthedocs.io
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: geopandas
Requires-Dist: xarray
Requires-Dist: numpy
Requires-Dist: shapely
Requires-Dist: rioxarray
Requires-Dist: numba
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: scipy; extra == "test"

# Rasterizer

`rasterizer` is a lightweight Python package for rasterizing `geopandas` GeoDataFrames.

## Features

- Rasterize lines into a binary (presence/absence) or length-based grid.
- Rasterize polygons into a binary (presence/absence) or area-based grid.
- Hybrid polygon rasterization for large polygon bounding boxes: exact clipping on boundary cells, faster scanline filling for interior cells.
- Weighted rasterization: Rasterize geometries while weighting the output by a numerical column in the GeoDataFrame.
- Works with `geopandas` GeoDataFrames.
- Outputs an `xarray.DataArray` for easy integration with other scientific Python libraries.
- No GDAL dependency for the rasterization algorithm itself.

For detailed usage and API documentation, please see the [full documentation](https://rasterizer.readthedocs.io).

## Usage

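A minimal example that rasterizes polygons in area mode:

```python
import geopandas as gpd
import numpy as np
from rasterizer import rasterize_polygons

polys = gpd.read_file("polygons.gpkg")

# 1D x and y coordinates of the target grid, in the units of polys.crs
# (illustrative values; replace them with your own grid).
x = np.arange(0.5, 100.0, 1.0)
y = np.arange(0.5, 100.0, 1.0)

area_raster = rasterize_polygons(polys, x, y, polys.crs, mode="area")
```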

### Rasterizing Lines

You can rasterize lines in either binary or length mode.

| Binary Mode                                      | Length Mode                                      |
| ------------------------------------------------ | ------------------------------------------------ |
| ![Lines - Binary](docs/_static/lines_binary.png) | ![Lines - Length](docs/_static/lines_length.png) |
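
To make the idea behind length mode concrete, here is a minimal pure-Python sketch that computes how much of a single segment falls inside each cell of a small grid. It is an illustration only, not the package's implementation: the helper names `clipped_length` and `length_grid` are hypothetical, the clipping uses the Liang-Barsky algorithm as an assumed technique, and a segment lying exactly on a shared cell edge would be counted in both neighbouring cells.

```python
import math


def clipped_length(p0, p1, xmin, ymin, xmax, ymax):
    """Length of the segment p0-p1 that lies inside one axis-aligned cell."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0
    # Liang-Barsky: each cell edge restricts the segment parameter t in [0, 1].
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0), (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return 0.0  # parallel to this edge and outside the cell
            continue
        t = q / p
        if p < 0:
            t_enter = max(t_enter, t)
        else:
            t_exit = min(t_exit, t)
    if t_enter >= t_exit:
        return 0.0
    return (t_exit - t_enter) * math.hypot(dx, dy)


def length_grid(p0, p1, x_edges, y_edges):
    """Per-cell length of one segment; rows follow y_edges, columns follow x_edges."""
    return [
        [
            clipped_length(p0, p1, x_edges[i], y_edges[j], x_edges[i + 1], y_edges[j + 1])
            for i in range(len(x_edges) - 1)
        ]
        for j in range(len(y_edges) - 1)
    ]


# A 2 x 2 grid of unit cells and a horizontal segment crossing the bottom row.
x_edges = [0.0, 1.0, 2.0]
y_edges = [0.0, 1.0, 2.0]
grid = length_grid((0.5, 0.5), (2.0, 0.5), x_edges, y_edges)
total = sum(sum(row) for row in grid)
```

The per-cell lengths sum to the length of the segment, which is the property that distinguishes length mode from a binary presence/absence grid.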

### Rasterizing Polygons

You can rasterize polygons in either binary or area mode.

For polygons, `rasterizer` uses two internal strategies. Polygons with small bounding boxes are handled with exact per-cell clipping. Polygons with larger bounding boxes switch to a hybrid path that still clips boundary cells exactly but fills interior spans with a scanline pass, which reduces the amount of geometric clipping required. The area and binary outputs therefore stay exact in boundary cells while scaling better on large polygons.

| Binary Mode                                            | Area Mode                                          |
| ------------------------------------------------------ | -------------------------------------------------- |
| ![Polygons - Binary](docs/_static/polygons_binary.png) | ![Polygons - Area](docs/_static/polygons_area.png) |
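
As an illustration of exact per-cell clipping, the sketch below clips a polygon ring against each grid cell and measures the clipped area. It is not the package's implementation: the helper names (`clip_to_cell`, `shoelace_area`, `area_grid`) are hypothetical, Sutherland-Hodgman clipping is an assumed technique chosen for brevity, and the input is assumed to be a simple ring without holes whose first vertex is not repeated at the end.

```python
def clip_to_cell(ring, xmin, ymin, xmax, ymax):
    """Sutherland-Hodgman clip of a polygon ring against an axis-aligned cell."""

    def clip(points, inside, intersect):
        out = []
        for k in range(len(points)):
            a, b = points[k - 1], points[k]  # edge a -> b, wrapping around
            if inside(b):
                if not inside(a):
                    out.append(intersect(a, b))
                out.append(b)
            elif inside(a):
                out.append(intersect(a, b))
        return out

    def cut_x(x):
        return lambda a, b: (x, a[1] + (b[1] - a[1]) * (x - a[0]) / (b[0] - a[0]))

    def cut_y(y):
        return lambda a, b: (a[0] + (b[0] - a[0]) * (y - a[1]) / (b[1] - a[1]), y)

    pts = list(ring)
    for inside, intersect in (
        (lambda p: p[0] >= xmin, cut_x(xmin)),
        (lambda p: p[0] <= xmax, cut_x(xmax)),
        (lambda p: p[1] >= ymin, cut_y(ymin)),
        (lambda p: p[1] <= ymax, cut_y(ymax)),
    ):
        pts = clip(pts, inside, intersect)
        if not pts:
            break
    return pts


def shoelace_area(pts):
    """Unsigned area of a ring via the shoelace formula (0.0 for an empty ring)."""
    return 0.5 * abs(
        sum(pts[k - 1][0] * pts[k][1] - pts[k][0] * pts[k - 1][1] for k in range(len(pts)))
    )


def area_grid(ring, x_edges, y_edges):
    """Per-cell covered area; rows follow y_edges, columns follow x_edges."""
    return [
        [
            shoelace_area(
                clip_to_cell(ring, x_edges[i], y_edges[j], x_edges[i + 1], y_edges[j + 1])
            )
            for i in range(len(x_edges) - 1)
        ]
        for j in range(len(y_edges) - 1)
    ]


# A right triangle over a 2 x 2 grid of unit cells.
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
areas = area_grid(triangle, [0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
binary = [[a > 0.0 for a in row] for row in areas]
```

Clipping every cell this way is exact but does redundant work for cells that lie entirely inside the polygon, which is the cost the hybrid path described above is meant to avoid.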

## Installation

You can install the package directly from PyPI:

```bash
pip install rasterizer
```

## Why rasterizer

This package provides functionality that is not available in `rasterio.features`, such as area- and length-based rasterization. It is also lighter and faster than GDAL-based solutions. GDAL's rasterization only burns a value into each pixel; it cannot return exact fractional area or length contributions. The common workaround is to rasterize at a much finer resolution and then downsample with averaging, which approximates the true area or length but is not exact and can be slow, for example:

```bash
gdal_rasterize -burn 1 -tr 1 1 -ot Float32 -of GTiff input.gpkg tmp_fine.tif
gdalwarp -tr 10 10 -r average tmp_fine.tif out_area_approx.tif
```
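
To see why the rasterize-then-average workaround is only approximate, the following pure-Python sketch compares it with the exact covered fraction for one coarse cell. It is a toy model, not a reproduction of GDAL: it assumes a fine pixel is burned when its center falls inside the polygon, and the helper names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Even-odd ray casting test for a simple ring (first vertex not repeated)."""
    inside = False
    for k in range(len(ring)):
        (xa, ya), (xb, yb) = ring[k - 1], ring[k]
        if (ya > y) != (yb > y) and x < xa + (xb - xa) * (y - ya) / (yb - ya):
            inside = not inside
    return inside


def shoelace_area(ring):
    """Unsigned polygon area via the shoelace formula."""
    return 0.5 * abs(
        sum(ring[k - 1][0] * ring[k][1] - ring[k][0] * ring[k - 1][1] for k in range(len(ring)))
    )


# One coarse 10 x 10 cell, supersampled with 1 x 1 fine pixels.
coarse_size, fine_size = 10.0, 1.0
n = int(coarse_size / fine_size)

# A thin strip, 0.6 units tall, spanning the full width of the coarse cell.
strip = [(0.0, 0.2), (10.0, 0.2), (10.0, 0.8), (0.0, 0.8)]

# Approximation: burn fine pixels by center sampling, then average.
burned = sum(
    point_in_polygon((i + 0.5) * fine_size, (j + 0.5) * fine_size, strip)
    for j in range(n)
    for i in range(n)
)
approx_fraction = burned / (n * n)

# Exact covered fraction of the coarse cell.
exact_fraction = shoelace_area(strip) / (coarse_size * coarse_size)

print(f"approximate: {approx_fraction:.2f}, exact: {exact_fraction:.2f}")
```

In this toy case the averaged value overestimates the covered fraction (0.10 against 0.06), and a strip that missed every fine pixel center would vanish entirely. Shrinking the fine pixels reduces the error, at the cost of a much larger intermediate raster.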

Computing exact per-cell areas purely in `geopandas`, by generating one polygon per grid cell and overlaying the cells with the input geometry, is also slow: it creates a huge number of tiny geometries, triggers expensive overlay operations, and scales poorly with grid size.
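
For reference, the cell-by-cell overlay approach described above can be sketched as follows. The grid, the polygon and the column names (`row`, `col`, `cell_area`) are made up for illustration; the point is that every grid cell becomes its own geometry before the overlay runs.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import Polygon, box

# Hypothetical 4 x 4 grid of unit cells covering [0, 4] x [0, 4].
x_edges = np.arange(0.0, 5.0, 1.0)
y_edges = np.arange(0.0, 5.0, 1.0)

# One polygon per grid cell: this is the step that scales poorly.
rows, cols, geoms = [], [], []
for j in range(len(y_edges) - 1):
    for i in range(len(x_edges) - 1):
        rows.append(j)
        cols.append(i)
        geoms.append(box(x_edges[i], y_edges[j], x_edges[i + 1], y_edges[j + 1]))
cells = gpd.GeoDataFrame({"row": rows, "col": cols}, geometry=geoms)

# Input geometry: a 3 x 2 rectangle that cuts through the cells.
polys = gpd.GeoDataFrame(
    geometry=[Polygon([(0.5, 0.5), (3.5, 0.5), (3.5, 2.5), (0.5, 2.5)])]
)

# Intersect every cell with the input and measure each resulting piece.
pieces = gpd.overlay(cells, polys, how="intersection")
pieces["cell_area"] = pieces.geometry.area

# Accumulate the piece areas back onto the grid.
area_grid = np.zeros((len(y_edges) - 1, len(x_edges) - 1))
np.add.at(
    area_grid,
    (pieces["row"].to_numpy(), pieces["col"].to_numpy()),
    pieces["cell_area"].to_numpy(),
)
```

The result is exact, but the number of cell polygons grows with the product of the grid dimensions, so this pattern quickly becomes impractical on fine grids.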
