Metadata-Version: 2.4
Name: utgc
Version: 0.1.1
Summary: GerryChain utilities for Utah redistricting ensembles
Author-email: Samuel Adams <sadams144@icloud.com>
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: GIS
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: gerrychain>=0.3.0
Requires-Dist: pandas>=1.5.0
Requires-Dist: geopandas>=0.12.0
Requires-Dist: numpy>=1.23.0
Requires-Dist: shapely>=2.0.0
Requires-Dist: networkx>=3.0
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: seaborn>=0.12.0
Requires-Dist: PyYAML>=6.0
Requires-Dist: maup>=2.0.0
Requires-Dist: tqdm>=4.64.0
Requires-Dist: pillow>=9.0.0
Provides-Extra: notebook
Requires-Dist: ipython>=8.0.0; extra == "notebook"
Requires-Dist: ipywidgets>=8.0.0; extra == "notebook"

# UT_GerryChain

This project uses GerryChain to generate neutral redistricting ensembles for Utah.

## Data Files

The project works with any `data/*.geojson` file, for example `UT_blocks.geojson`, `UT_vtds.geojson`, or `UT_vtd_parts.geojson`. Every file must contain:
- `TOTPOP`: Total population column
- `MUNIID`: Municipality ID column  
- `COUNTYID`: County ID column
- Additional COI (Communities of Interest) columns as needed
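
A quick pre-flight check for the required columns might look like the sketch below. It uses only the standard library and reads the GeoJSON text directly; the helper name `missing_columns` is hypothetical and is not part of the package.

```python
import json

# Required property names, taken from the list above.
REQUIRED_COLUMNS = ("TOTPOP", "MUNIID", "COUNTYID")


def missing_columns(geojson_text, required=REQUIRED_COLUMNS):
    """Return the required property names absent from at least one feature."""
    collection = json.loads(geojson_text)
    missing = set()
    for feature in collection.get("features", []):
        properties = feature.get("properties") or {}
        missing.update(name for name in required if name not in properties)
    return sorted(missing)


# Tiny in-memory example; real files would be read from data/*.geojson.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"TOTPOP": 120, "MUNIID": "M1", "COUNTYID": "C1"}},
        {"type": "Feature", "geometry": None,
         "properties": {"TOTPOP": 80, "COUNTYID": "C1"}},
    ],
})
print(missing_columns(sample))
```

An empty list means every feature carries all three required columns.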

## Configure neutral sampling in notebook

Use `01_configure_sampling.ipynb` to:
- Configure neutral parameters (no partisan metrics)
- Configure transitability analysis (road connectivity and water barriers)
- Run a short neutral sample for sanity checks
- Export a YAML configuration to `results/configurations/sampling_params.yaml`

## Transitability Analysis

UT_GerryChain now includes transitability analysis to ensure districts are connected by actual roads and not separated by impassable barriers. This implements Utah's requirement for "ease of travel throughout district" by:

1. **Road Connectivity**: Verifying that precincts are connected by actual road networks
2. **Hierarchical Fallback**: Falling back to municipality/county boundaries in rural areas that have only local roads
3. **Water Barriers**: Removing connections that cross major water bodies (Great Salt Lake, Lake Powell, etc.)

### Data Requirements

You can now build the transitability graph offline from the original geography sources and load it at runtime.

### Offline preprocessing (recommended)

Build the graph once and reuse it across runs.

Transitability analysis uses edge penalties to discourage districts that are not well connected by roads. The transitability CSV file (e.g., `data/transitability/precinct_no_roads.csv`) can be generated by an external pipeline and should list the edge pairs (u, v) to penalize.

If the transitability CSV file is not found, the code will skip edge penalties with a warning.
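
The loading behaviour described above could be sketched as follows. This is an illustration, not the package's actual loader: the function name `load_penalized_edges` and the CSV header names `u` and `v` are assumptions.

```python
import csv
import warnings
from pathlib import Path


def load_penalized_edges(csv_path):
    """Return a set of undirected (u, v) pairs, or an empty set if the file is missing."""
    path = Path(csv_path)
    if not path.exists():
        # Mirror the documented behaviour: warn and continue without penalties.
        warnings.warn(f"Transitability file {path} not found; skipping edge penalties.")
        return set()
    edges = set()
    with path.open(newline="") as handle:
        for row in csv.DictReader(handle):
            # Assumed column names "u" and "v"; sort so (a, b) and (b, a) match.
            edges.add(tuple(sorted((row["u"], row["v"]))))
    return edges
```

Storing each pair in sorted order treats the edges as undirected, which matches how adjacency graphs are usually represented; adjust this if the external pipeline emits directed pairs.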

### Configuration

Transitability can be configured in the notebook or YAML:

```yaml
transitability:
  enable: true
  remove_water_barriers: true
  verify_road_connectivity: true
  precomputed_path: data/transitability/transitability.graphml
  min_lake_size_sqkm: 1.0
  min_river_size_sqkm: 0.5
  road_buffer_meters: 500
  water_threshold: 0.5
```
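
Once the YAML has been parsed into a dictionary (for instance with `yaml.safe_load` from PyYAML), the `transitability` section could be checked along these lines. The helper `resolve_transitability` is hypothetical, and treating the example values above as defaults is an assumption made only for this sketch.

```python
# Values copied from the example configuration above; using them as
# defaults is an assumption, not documented package behaviour.
EXAMPLE_TRANSITABILITY = {
    "enable": True,
    "remove_water_barriers": True,
    "verify_road_connectivity": True,
    "precomputed_path": "data/transitability/transitability.graphml",
    "min_lake_size_sqkm": 1.0,
    "min_river_size_sqkm": 0.5,
    "road_buffer_meters": 500,
    "water_threshold": 0.5,
}


def resolve_transitability(config):
    """Merge the parsed `transitability` section over the example values."""
    section = dict(config.get("transitability") or {})
    unknown = sorted(set(section) - set(EXAMPLE_TRANSITABILITY))
    if unknown:
        # Catch misspelled keys early instead of silently ignoring them.
        raise KeyError(f"Unknown transitability keys: {unknown}")
    return {**EXAMPLE_TRANSITABILITY, **section}


settings = resolve_transitability({"transitability": {"enable": False}})
print(settings["enable"], settings["road_buffer_meters"])
```

Rejecting unknown keys is a design choice for the sketch: a typo such as `road_buffer_meter` would otherwise be ignored without notice.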

## Running Ensembles

### Using Notebooks (Recommended)

1. **Configure sampling**: Use `01_configure_sampling.ipynb` to set up parameters and test with a small sample
2. **Run full ensemble**: Use `02_run_ensemble.ipynb` to load the configuration and run a full ensemble

### Column Names

The GeoJSON files use the following column names for Communities of Interest:
- `HIGHEREDID`: Institutions of higher education
- `AIANNHID`: American Indian/Alaska Native Areas (formerly RESERVATION_ID)
- `MILITID`: Military installations (formerly MILITARY_ID)
- `CBSAID`: Core Based Statistical Areas / Metro areas (formerly METRO_ID)
- `SCHDISTID`: School districts

Note: WATER_ID and BASIN_ID are no longer used.
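
For data that still carries the former names, a small migration helper could apply the renames listed above. The sketch below is illustrative only: `modernize_columns` is a hypothetical name, and dropping `WATER_ID` and `BASIN_ID` outright is an assumption (the project only states that they are no longer used).

```python
# Old name -> current name, as listed above.
LEGACY_COLUMN_NAMES = {
    "RESERVATION_ID": "AIANNHID",
    "MILITARY_ID": "MILITID",
    "METRO_ID": "CBSAID",
}

# Columns the project no longer uses; removing them is an assumption.
RETIRED_COLUMNS = {"WATER_ID", "BASIN_ID"}


def modernize_columns(columns):
    """Rename legacy COI columns and drop retired ones, preserving order."""
    return [
        LEGACY_COLUMN_NAMES.get(name, name)
        for name in columns
        if name not in RETIRED_COLUMNS
    ]


print(modernize_columns(["TOTPOP", "RESERVATION_ID", "WATER_ID", "METRO_ID"]))
```

The same mapping can be passed to pandas or GeoPandas as `df.rename(columns=LEGACY_COLUMN_NAMES)` when working with a loaded table.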
