Metadata-Version: 2.4
Name: pysungeo
Version: 0.1.1
Summary: Sub-National Geospatial Data Archive: Geoprocessing Toolkit (Python)
Author: Yuri M. Zhukov, Jason Byers, Marty Davidson, Ye Chan Kim
License: GPL-2.0
Project-URL: Homepage, https://sungeo.org
Project-URL: Repository, https://github.com/zhukov/pysungeo
Project-URL: Bug Tracker, https://github.com/zhukov/pysungeo/issues
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v2 (GPLv2)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: GIS
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: geopandas>=0.14
Requires-Dist: shapely>=2.0
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.10
Requires-Dist: rasterio>=1.3
Requires-Dist: rasterstats>=0.19
Requires-Dist: pyproj>=3.5
Requires-Dist: pykrige>=1.7
Requires-Dist: esda>=2.5
Requires-Dist: libpysal>=4.9
Requires-Dist: requests>=2.31
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: jupyter; extra == "dev"
Dynamic: license-file

# pysungeo

**Sub-National Geospatial Data Archive: Geoprocessing Toolkit for Python**

[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![License: GPL-2.0](https://img.shields.io/badge/license-GPL--2.0-green.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.html)
![Tests](https://img.shields.io/badge/tests-533%20passed-brightgreen.svg)

`pysungeo` is a Python port of the [SUNGEO R package](https://github.com/zhukov/SUNGEO), providing tools for integrating geospatial datasets that use different geographic boundary systems. It solves the **change-of-support problem** — transferring data between misaligned spatial units (e.g., electoral precincts to administrative districts to hexagonal grids).

## Key Features

- **Polygon-to-polygon transfer** — Area-weighted interpolation with pycnophylactic (mass-preserving) correction
- **Point-to-polygon interpolation** — Simple aggregation, Voronoi tessellation, and Ordinary/Universal Kriging
- **Line-to-polygon metrics** — Road length, density, and distance calculations within polygons
- **Nesting diagnostics** — 12 metrics measuring how well one boundary set nests within another
- **Raster conversion** — Polygon/point to raster and back, with round-trip fidelity
- **Spatial statistics** — Getis-Ord Gi* hot spot analysis
- **Geocoding** — Address-to-coordinate conversion via OpenStreetMap Nominatim
- **SUNGEO API access** — Download sub-national data for 180+ countries directly

## Installation

```bash
pip install pysungeo
```

### From source

```bash
git clone https://github.com/zhukov/pysungeo.git
cd pysungeo
pip install -e ".[dev]"
```

## Quick Start

```python
import geopandas as gpd
from sungeo.poly2poly_ap import poly2poly_ap
from sungeo.hot_spot import hot_spot

# Load source and destination boundary sets
precincts = gpd.read_file("precincts.gpkg")
districts = gpd.read_file("districts.gpkg")

# Transfer turnout data from precincts to districts (area-weighted)
result = poly2poly_ap(
    poly_from=precincts,
    poly_to=districts,
    poly_to_id="DISTRICT_ID",
    varz="turnout",
)

# Identify spatial clusters
hotspots = hot_spot(insert=result, variable="turnout_aw")
```
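
To make the hot spot step less of a black box, here is a minimal pure-Python sketch of the standard Getis-Ord Gi* z-score. It is not the package's implementation: the function name `getis_ord_gi_star`, the toy values, and the binary weights (each unit counts itself and its immediate neighbours) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def getis_ord_gi_star(
    values: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> List[float]:
    """Return one Gi* z-score per unit (illustrative helper, not part of pysungeo).

    weights[i][j] is the spatial weight between units i and j. For Gi*,
    the diagonal weights[i][i] is non-zero, so each unit counts itself.
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the variable (divides by n).
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    z_scores = []
    for i in range(n):
        row = weights[i]
        w_sum = sum(row)
        w_sq_sum = sum(w * w for w in row)
        # Weighted local sum minus what the global mean would predict.
        numerator = sum(w * x for w, x in zip(row, values)) - mean * w_sum
        denominator = std * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores


# Toy data: five units along a line, with high values clustered at the right end.
values = [1.0, 1.0, 1.0, 10.0, 10.0]

# Binary weights: 1 for the unit itself and for adjacent units, 0 otherwise.
weights = [
    [1.0 if abs(i - j) <= 1 else 0.0 for j in range(len(values))]
    for i in range(len(values))
]

z_scores = getis_ord_gi_star(values, weights)
```

Large positive scores mark clusters of high values and large negative scores mark clusters of low values. The sketch assumes that the variable is not constant and that no unit's neighbourhood covers the whole study area, since either case makes the denominator zero.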

## Available Functions

### Spatial Interpolation

| Function | Description |
|----------|-------------|
| `poly2poly_ap` | Area-weighted polygon-to-polygon transfer |
| `point2poly_simp` | Simple point-in-polygon aggregation |
| `point2poly_tess` | Voronoi tessellation interpolation |
| `point2poly_krige` | Ordinary and Universal Kriging |
| `line2poly` | Line length, density, and distance within polygons |

### Spatial Analysis

| Function | Description |
|----------|-------------|
| `nesting` | 12 metrics for boundary set compatibility |
| `hot_spot` | Getis-Ord Gi* local spatial clustering |
| `sf2raster` | Polygon/point ↔ raster conversion |

### Utilities

| Function | Description |
|----------|-------------|
| `utm_select` | Auto-select optimal projected CRS |
| `fix_geom` | Repair invalid geometries |
| `df2sf` | DataFrame with coordinates → GeoDataFrame |
| `update_bbox` | Refresh GeoDataFrame bounding box |
| `smart_round` | Round with significant digit preservation |
| `make_ticker` | Date-to-ID mapping table |
| `merge_list` | Recursive outer-join merge |

### Data Access

| Function | Description |
|----------|-------------|
| `get_data` | Download from the SUNGEO API |
| `get_info` | Browse the SUNGEO data catalog |
| `geocode_osm` | Geocode addresses via Nominatim |
| `geocode_osm_batch` | Batch geocoding with rate limiting |

## Examples

### Transfer data between boundary sets

```python
from sungeo.poly2poly_ap import poly2poly_ap

# Area-weighted transfer of election turnout from precincts to a hex grid.
# `precincts` and `hex_grid` are GeoDataFrames loaded beforehand (e.g., with gpd.read_file).
result = poly2poly_ap(
    poly_from=precincts,
    poly_to=hex_grid,
    poly_to_id="HEX_ID",
    varz="turnout",
)
```
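
The idea behind area weighting can be sketched without any GIS dependencies. The example below is not how `poly2poly_ap` is implemented: it assumes axis-aligned rectangles in place of real polygons, a count variable (votes cast) rather than a rate, and hypothetical helper names (`rect_area`, `overlap_area`, `area_weighted_sum`).

```python
from typing import Dict, Tuple

# A rectangle stands in for a polygon: (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def rect_area(r: Rect) -> float:
    """Area of a rectangle; degenerate or inverted rectangles have area 0."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two axis-aligned rectangles."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(inter)


def area_weighted_sum(
    sources: Dict[str, Tuple[Rect, float]],
    targets: Dict[str, Rect],
) -> Dict[str, float]:
    """Split each source count across targets in proportion to overlapping area."""
    out = {target_id: 0.0 for target_id in targets}
    for geom, value in sources.values():
        source_area = rect_area(geom)
        for target_id, target_geom in targets.items():
            share = overlap_area(geom, target_geom) / source_area
            out[target_id] += value * share
    return out


# Two precincts with vote counts, and two districts that cut across them.
precincts = {
    "p1": ((0.0, 0.0, 2.0, 1.0), 100.0),
    "p2": ((2.0, 0.0, 4.0, 1.0), 60.0),
}
districts = {
    "d1": (0.0, 0.0, 3.0, 1.0),
    "d2": (3.0, 0.0, 4.0, 1.0),
}

totals = area_weighted_sum(precincts, districts)
```

Because the districts jointly cover every precinct, the district totals add up to the precinct total, which is the mass-preserving property that the pycnophylactic correction is meant to guarantee. Rates and averages need a different weighting than counts and are outside the scope of this sketch.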

### Kriging interpolation from weather stations to districts

```python
from sungeo.point2poly_krige import point2poly_krige

result = point2poly_krige(
    pointz=weather_stations,
    polyz=districts,
    yvarz="temperature",
)
# Result includes temperature.pred, temperature.var, temperature.stdev
```

### Check boundary compatibility

```python
from sungeo.nesting import nesting

metrics = nesting(
    poly_from=precincts,
    poly_to=districts,
    metrix="all",
)
# A metrics["rn"] value close to 1.0 indicates good nesting
```
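
### Pick a UTM zone for a study area

Area and distance calculations need a projected CRS, which is the job `utm_select` automates. As a rough illustration of what such a choice involves, the sketch below maps one longitude/latitude pair to the EPSG code of its WGS 84 / UTM zone. The helper name `utm_epsg_for_point` is hypothetical, the logic is the textbook zone formula rather than the package's own, and it ignores the special zones around Norway and Svalbard.

```python
import math


def utm_epsg_for_point(lon: float, lat: float) -> int:
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat).

    Illustrative helper, not part of pysungeo.
    """
    if not -180.0 <= lon <= 180.0 or not -90.0 <= lat <= 90.0:
        raise ValueError("expected longitude in [-180, 180] and latitude in [-90, 90]")
    # Zones are 6 degrees wide, numbered 1 to 60 eastward from 180 degrees West.
    # The modulo maps longitude 180 back into zone 1.
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


berlin_epsg = utm_epsg_for_point(13.4, 52.5)    # UTM zone 33N
sydney_epsg = utm_epsg_for_point(151.2, -33.9)  # UTM zone 56S
```

With geopandas, a code like this can be passed to `GeoDataFrame.to_crs(epsg=...)`. geopandas also provides `GeoDataFrame.estimate_utm_crs()`, which makes a similar choice from the layer's extent.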

### Download SUNGEO data

```python
from sungeo.get_data import get_data

df = get_data(
    country_names="Germany",
    topics="Demographics:Population:GHS",
    year_min=2000,
    year_max=2020,
)
```
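
### Space out batch geocoding requests

The public Nominatim service asks clients to send at most one request per second, which is why batch geocoding needs rate limiting. The sketch below shows one generic way to space out calls. It is not the code behind `geocode_osm_batch`: `throttled_map` and `fake_geocode` are hypothetical names, and the stand-in geocoder returns placeholder values without touching the network.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled_map(
    func: Callable[[T], R],
    items: Iterable[T],
    min_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Apply func to each item, waiting at least min_interval seconds between call starts."""
    results: List[R] = []
    last_call = None
    for item in items:
        if last_call is not None:
            # Sleep only for whatever remains of the interval since the last call.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(func(item))
    return results


def fake_geocode(address: str) -> dict:
    """Stand-in for a real geocoder call; returns placeholder values only."""
    return {"address": address, "lat": None, "lon": None}


# Demo with a tiny interval so that the example finishes quickly.
rows = throttled_map(fake_geocode, ["Berlin", "Ann Arbor"], min_interval=0.01)
```

The `clock` and `sleep` parameters exist so that the pacing logic can be tested with a fake clock instead of real waiting.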

## Requirements

- Python ≥ 3.10
- geopandas ≥ 0.14
- shapely ≥ 2.0
- numpy ≥ 1.24
- pandas ≥ 2.0
- scipy ≥ 1.10
- rasterio ≥ 1.3
- rasterstats ≥ 0.19
- pyproj ≥ 3.5
- pykrige ≥ 1.7
- esda ≥ 2.5
- libpysal ≥ 4.9
- requests ≥ 2.31

## Testing

```bash
pip install -e ".[dev]"
pytest
```

All 533 tests pass. The suite covers all 20 functions and cross-validates their results against the R package.

## Citation

If you use this package in published research, please cite:

> Zhukov, Yuri M., Jason S. Byers, Marty Davidson, and Ye Chan Kim. 2025.
> "pysungeo: Sub-National Geospatial Data Archive — Geoprocessing Toolkit for Python."
> https://github.com/zhukov/pysungeo

And the original R package:

> Zhukov, Yuri M., Jason S. Byers, and Marty Davidson. 2024.
> "SUNGEO: Sub-National Geospatial Data Archive: Geoprocessing Toolkit."
> R package. https://github.com/zhukov/SUNGEO

## License

GPL-2.0. See [LICENSE](LICENSE) for details.

## Links

- [SUNGEO Project](https://sungeo.org)
- [R Package (original)](https://github.com/zhukov/SUNGEO)
- [API Documentation](https://api-sungeo-org-sungeo-api.apps.gnosis.lsa.umich.edu)
