Metadata-Version: 2.4
Name: deped-geos
Version: 0.1.0
Summary: School-derived area products for DepEd PSGC, governance, and legislative geographies.
Requires-Python: >=3.14
Requires-Dist: click>=8.4
Requires-Dist: duckdb>=1.5
Requires-Dist: environs>=15.0
Requires-Dist: pandas>=3.0
Requires-Dist: pyarrow>=22.0
Requires-Dist: rich>=15.0
Description-Content-Type: text/markdown

# deped-geos

`deped-geos` builds school-derived area products for four artifact hierarchies:

- Strict PSGC stats by region, province/HUC, city/municipality, and barangay.
- Normalized PSGC stats by region, province, city, municipality, locality, and
  barangay.
- DepEd governance stats starting at region, with derived boundaries for division and school district.
- Legislative stats starting at region, with derived boundaries for legislative district.

The producer consumes `psgc-maps` artifacts and `deped-dataset/db.sqlite3`. It writes Parquet tables, partitioned membership data, GeoParquet boundaries, and a manifest with source fingerprints, timings, and geometry policy.

## Quick start

```sh
uv sync --all-groups
just check
uv run --group geo deped-geos build --output-dir artifacts-current
uv run deped-geos status --output-dir artifacts-current
```

The CLI skips gracefully when the real external inputs are not present:

```sh
uv run --group geo deped-geos build --output-dir artifacts-current
uv run deped-geos status --output-dir artifacts-current
```

Derived boundary GeoParquet files require the `geo` dependency group. A plain
`uv run deped-geos build` can still write stats, memberships, audit, and the
manifest, but boundary entries in `manifest.json` are marked
`skipped: "geopandas is not installed"` and the boundary row counts stay zero.

## Outputs

`artifacts-current/` contains:

- `manifest.json`
- `school_spine.parquet`
- `school_spine_current.parquet`
- `school_area_source_universe.parquet`
- `school_area_stats.parquet`
- `school_area_assignment_audit.parquet`
- `school_area_memberships/`
- `school_area_region_group_components.parquet`
- `school_area_division_coverage.parquet`
- `school_area_division_coverage_audit.parquet`
- `school_area_denominator_coverage.parquet`
- `school_area_insight_context.parquet`
- `school_area_display_subdivisions.parquet`
- `school_id_crosswalk.parquet`
- `school_area_boundaries_governance_division.parquet`
- `school_area_boundaries_governance_school_district.parquet`
- `school_area_boundaries_legislative_legis_district.parquet`
- `school_area_boundaries_*_current.parquet`
- `school_coordinate_boundary_qa_current.parquet`
- `school_coordinate_boundary_qa_summary_current.parquet`

Stats, assignment, audit, and boundaries are keyed by `school_year_id`. The manifest marks the latest year as `current`.

CSV exports are optional: run `just build --export-csv` when a downstream handoff needs CSV copies.

`school_area_stats.parquet` owns the complete group universe for every
`(school_year_id, hierarchy, level, group_id)`: current-year group counts,
denominators, labels, and parent ids belong there. Boundary GeoParquet files
own renderable dissolved geometry only. A current stats row without a matching
`school_area_boundaries_*_current.parquet` row is valid and means the group is
known, but has no display geometry for consumers to draw.

`school_area_stats.parquet` also includes planning ratios such as
`students_per_school`, `students_per_capita`, `population_per_school`,
`school_density`, plus `urban_barangay_share` and `barangay_urbanicity_band`.
These are producer-owned artifact fields; notebooks and downstream consumers
should read them rather than recomputing them at runtime.

`school_area_source_universe.parquet` preserves PSGC labels, parent ancestry,
boundary mapping, and denominator seeds for no-school groups so boundary-first
maps can render known geography without downstream inference.

## Docs and notebooks

```sh
uv run zensical build
just notebooks
just check-full
```

The local Marimo notebooks inspect `artifacts-current/` read-only and render
maps in-cell. They do not import `deped-glue` or expose a server/API contract.
