Metadata-Version: 2.4
Name: us-county-airdata-trends
Version: 2026.5.26
Summary: U.S. county-level 5-year AirData emission trends from EPA AirData annual AQI summaries 2020-2024. Sensitivity-gated change class with facility-count floor, top-1-exclude robustness, and 4 petrochemical-corridor display-hold counties. 994 counties.
Author-email: Artem Akulov <artem@zipcheckup.com>
License: CC-BY-4.0
Project-URL: Homepage, https://zipcheckup.com
Project-URL: Repository, https://github.com/artakulov/waterbyzipcode
Project-URL: Bug Tracker, https://github.com/artakulov/waterbyzipcode/issues
Keywords: air-quality,airdata,aqi,epa,emissions,trend,county,fips,environmental,open-data,civic
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# us-county-airdata-trends

> U.S. county-level 5-year AirData emission trends, derived from EPA AirData annual AQI summaries 2020-2024.

[![PyPI version](https://img.shields.io/pypi/v/us-county-airdata-trends.svg)](https://pypi.org/project/us-county-airdata-trends/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20382474.svg)](https://doi.org/10.5281/zenodo.20382474)
[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)

Sensitivity-gated 5-year change classification for **994 US counties**, with explicit display-hold flags for counties below the facility-count floor (368 counties) and the 4 petrochemical-corridor counties that surface a methodology-review notice instead of a trend value.

Produced by [ZipCheckup](https://zipcheckup.com) Trend Layers v1. **Display methodology underwent a 4-agent legal pre-build review** (FTC compliance + defamation/stigmatization + statistical methodology + feeds feasibility) — see [methodology page](https://zipcheckup.com/methodology/airdata-trend/).

---

## Install

```bash
pip install us-county-airdata-trends
```

## Quick start

```python
import us_county_airdata_trends as airdata

# Look up a single county
la = airdata.get_county("06037")
# {
#     "county_fips": "06037",
#     "county_name": "Los Angeles",
#     "state": "CA",
#     "airdata_change_class": "stable",
#     "airdata_pct_change": -5.9,
#     "cycles_used": 5,
#     "facility_count": 331,
#     "sensitivity_robust": None,
#     "skip_reason": None,
#     "source_attribution": "EPA AirData annual AQI summaries 2020-2024",
#     "petrochemical_corridor": False,
#     "aqi_latest_year": 2024,
# }

# All counties for a state
texas = airdata.filter_counties(state="TX")

# Counties classified as decreasing (and display-eligible)
decreasing = airdata.filter_counties(change_class="decrease", exclude_skipped=True)

# Display-eligible vs display-held — for methodology dashboards
eligible = airdata.get_display_eligible_counties()
held = airdata.get_display_held_counties()

# Coverage metadata
print(airdata.meta)
# Meta(generated='2026-05-25', total_counties=994, license='CC-BY-4.0')

# Field schema
print(airdata.schema())
```
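Because `county_fips` is a 5-character string, a FIPS code that has passed through a CSV reader or spreadsheet as an integer loses its leading zero (`06037` becomes `6037`). The sketch below shows one way to normalize such values before calling `get_county`. The helper name `normalize_fips` is hypothetical and not part of the package.

```python
def normalize_fips(value):
    """Return a 5-character, zero-padded county FIPS string.

    Hypothetical helper for illustration; not part of the package.
    """
    text = str(value).strip()
    # Reject anything that is not 1 to 5 ASCII digits (e.g. "6037.0").
    if not (text.isascii() and text.isdigit()) or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {value!r}")
    return text.zfill(5)


print(normalize_fips(6037))     # 06037
print(normalize_fips("48201"))  # 48201
```

Floats such as `6037.0` are rejected on purpose rather than silently truncated, so a bad column type surfaces early.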

## Field schema

| Field | Type | Description |
|-------|------|-------------|
| `county_fips` | str | 5-digit US county FIPS code |
| `county_name` | str \| None | County name without "County" suffix |
| `state` | str \| None | 2-letter state abbreviation |
| `airdata_change_class` | `"decrease" \| "increase" \| "stable" \| "insufficient_data"` \| None | Sensitivity-gated 5-year class |
| `airdata_pct_change` | float \| None | Signed percent change between earliest and latest cycle |
| `cycles_used` | int \| None | Number of reporting cycles (target 5, minimum 3) |
| `facility_count` | int \| None | Distinct facilities in latest cycle (display gate ≥5) |
| `sensitivity_robust` | bool \| None | True if the direction of change is unchanged when the largest (top-1) facility is excluded |
| `skip_reason` | str \| None | `"facility_count_below_threshold"` \| `"petrochemical_corridor"` \| `"cycles_below_threshold"` \| None |
| `source_attribution` | str | Required attribution string for any public render |
| `petrochemical_corridor` | bool | True for 4 corridor counties (Harris TX, Calcasieu LA, Iberville LA, Kanawha WV) |
| `aqi_latest_year` | int \| None | Latest reporting year included in window |

The full machine-readable schema is available via `airdata.schema()`.
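For type-checked code, the field table can be mirrored as a `TypedDict`. This is a sketch based only on the table above: the names `CountyTrend`, `ChangeClass`, `SkipReason`, and `missing_fields` are hypothetical, and the package itself may represent records differently.

```python
from typing import Literal, Optional, TypedDict

ChangeClass = Literal["decrease", "increase", "stable", "insufficient_data"]
SkipReason = Literal[
    "facility_count_below_threshold",
    "petrochemical_corridor",
    "cycles_below_threshold",
]


class CountyTrend(TypedDict):
    """Hypothetical type; fields and types copied from the schema table."""

    county_fips: str
    county_name: Optional[str]
    state: Optional[str]
    airdata_change_class: Optional[ChangeClass]
    airdata_pct_change: Optional[float]
    cycles_used: Optional[int]
    facility_count: Optional[int]
    sensitivity_robust: Optional[bool]
    skip_reason: Optional[SkipReason]
    source_attribution: str
    petrochemical_corridor: bool
    aqi_latest_year: Optional[int]


def missing_fields(record):
    """Return the documented field names absent from a record, sorted."""
    return sorted(set(CountyTrend.__annotations__) - set(record))


# A partial record, to show the check reporting what is missing.
partial = {"county_fips": "06037", "state": "CA"}
print(missing_fields(partial))
```

`TypedDict` only informs static checkers; `missing_fields` is the part that checks anything at runtime.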

## Coverage (snapshot)

| Class | Counties |
|-------|---------:|
| `decrease` | 67 |
| `increase` | 176 |
| `stable` | 376 |
| `insufficient_data` (display held) | 375 |
| **Total** | **994** |

Display-held counties, by `skip_reason`:
- `facility_count_below_threshold`: 368
- `petrochemical_corridor`: 4
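The snapshot counts can be re-derived from any list of county records, such as the one returned by `filter_counties(state="TX")`. The sketch below uses `collections.Counter` on a few made-up records so that it runs without the package installed. The function name `tally` and the sample records are illustrative only.

```python
from collections import Counter

# Counts copied from the class table above.
SNAPSHOT_BY_CLASS = {
    "decrease": 67,
    "increase": 176,
    "stable": 376,
    "insufficient_data": 375,
}
SNAPSHOT_TOTAL = 994


def tally(records):
    """Count records per change class and per non-null skip reason."""
    by_class = Counter(r["airdata_change_class"] for r in records)
    by_skip = Counter(r["skip_reason"] for r in records if r["skip_reason"])
    return by_class, by_skip


# The published class counts add up to the published total.
assert sum(SNAPSHOT_BY_CLASS.values()) == SNAPSHOT_TOTAL

# Made-up records for illustration; real ones come from the package.
sample = [
    {"airdata_change_class": "stable", "skip_reason": None},
    {"airdata_change_class": "decrease", "skip_reason": None},
    {"airdata_change_class": "insufficient_data",
     "skip_reason": "facility_count_below_threshold"},
]
by_class, by_skip = tally(sample)
print(dict(by_class))
print(dict(by_skip))
```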

## Methodology

Source: EPA AirData annual AQI summaries (`annual_aqi_by_county_{YEAR}.zip`), years 2020-2024, retrieved via [aqs.epa.gov/aqsweb/airdata/download_files.html](https://aqs.epa.gov/aqsweb/airdata/download_files.html).

Pipeline:

1. Aggregate facility-reported emissions to county rollups per year.
2. Compute percent change between earliest and latest cycle.
3. Apply sensitivity gates:
   - `≥3` reporting cycles in window
   - `≥5` reporting facilities in latest cycle
   - **Top-1-exclude robustness:** direction unchanged when largest facility excluded
4. Bucket as `decrease` / `increase` / `stable` (`|pct_change| < 10%`) / `insufficient_data`.
5. Hold display for 4 petrochemical-corridor counties pending methodology review (Harris TX `48201`, Calcasieu LA `22019`, Iberville LA `22047`, Kanawha WV `54039`) — see `petrochemical_corridor` boolean.
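Steps 2 to 4 can be sketched as a small classifier. This is an illustration under stated assumptions, not the production pipeline: the thresholds come from the list above, but the order in which gates are checked, the treatment of a zero baseline, and the choice to return `insufficient_data` when the top-1-exclude check fails are all assumptions. The names `pct_change` and `classify` are hypothetical, and the step 5 corridor hold is left out.

```python
STABLE_BAND_PCT = 10.0  # step 4: |pct_change| < 10% is "stable"
MIN_CYCLES = 3          # step 3: reporting cycles in window
MIN_FACILITIES = 5      # step 3: reporting facilities in latest cycle


def pct_change(earliest, latest):
    """Signed percent change from the earliest to the latest cycle."""
    if earliest == 0:
        return None  # assumption: an undefined ratio means no usable trend
    return (latest - earliest) / earliest * 100.0


def classify(cycle_totals, facility_count, robust_without_top1):
    """Return (change_class, skip_reason) for one county.

    cycle_totals: county rollups ordered earliest to latest.
    robust_without_top1: whether the direction survives excluding the
        largest facility (assumed to be computed elsewhere).
    """
    if len(cycle_totals) < MIN_CYCLES:
        return "insufficient_data", "cycles_below_threshold"
    if facility_count < MIN_FACILITIES:
        return "insufficient_data", "facility_count_below_threshold"
    change = pct_change(cycle_totals[0], cycle_totals[-1])
    if change is None:
        return "insufficient_data", None
    if abs(change) < STABLE_BAND_PCT:
        return "stable", None
    if not robust_without_top1:
        # Assumption: a non-robust direction is not reported as a trend.
        return "insufficient_data", None
    return ("decrease" if change < 0 else "increase"), None


print(classify([100.0, 90.0, 80.0], 12, True))   # ('decrease', None)
print(classify([100.0, 120.0], 12, True))
# ('insufficient_data', 'cycles_below_threshold')
```

Checking the cycle and facility gates before computing the change keeps a thin county from ever receiving a direction, whichever way its numbers happen to move.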

Full methodology: [zipcheckup.com/methodology/airdata-trend/](https://zipcheckup.com/methodology/airdata-trend/).

## Caveats

- This is **facility-reported emissions data**, not ambient air quality. The trend reflects what facilities reported to EPA — not what residents breathe.
- AirData updates may reflect reporting methodology changes between cycles. Disclose this in any user-facing render.
- 4 petrochemical-corridor counties surface as `skip_reason: "petrochemical_corridor"` with a methodology-review notice; do **not** silently treat them as `insufficient_data`.
- This package is updated periodically. The `meta.generated` field surfaces the build date — check it before using for time-sensitive analysis.
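The caveats above translate into a few rendering rules: always carry `source_attribution`, give corridor counties their own notice, and check the build date. The sketch below shows one way to encode them. The function names and the wording of each notice are placeholders, not official copy, and the records are plain dicts shaped like the Quick start example.

```python
from datetime import date


def render_trend(record):
    """Return a one-line display string for a county record."""
    source = f"Source: {record['source_attribution']}"
    skip = record.get("skip_reason")
    if skip == "petrochemical_corridor":
        # Corridor counties get their own notice, never the generic one.
        return f"Trend under methodology review. {source}"
    change_class = record.get("airdata_change_class")
    pct = record.get("airdata_pct_change")
    if skip or change_class in (None, "insufficient_data") or pct is None:
        return f"Not enough data to show a trend. {source}"
    return f"Facility-reported emissions: {change_class} ({pct:+.1f}%). {source}"


def build_age_days(generated, today):
    """Days between the build date (ISO string) and a reference date."""
    return (today - date.fromisoformat(generated)).days


la = {
    "airdata_change_class": "stable",
    "airdata_pct_change": -5.9,
    "skip_reason": None,
    "source_attribution": "EPA AirData annual AQI summaries 2020-2024",
}
print(render_trend(la))
print(build_age_days("2026-05-25", date(2026, 6, 24)))  # 30
```

Labelling the value as facility-reported emissions in the string itself keeps the first caveat attached to every render instead of relying on surrounding page copy.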

## License

[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/). Free to use with attribution.

## Citation

```
Akulov, A. (2026). U.S. County AirData 5-Year Emission Trends [Data set].
ZipCheckup. https://doi.org/10.5281/zenodo.20382474
```

Or BibTeX:

```bibtex
@dataset{zipcheckup_us_county_airdata_trends,
  author    = {Akulov, Artem},
  title     = {U.S. County AirData 5-Year Emission Trends},
  year      = {2026},
  publisher = {ZipCheckup},
  doi       = {10.5281/zenodo.20382474},
  url       = {https://doi.org/10.5281/zenodo.20382474}
}
```

## Related

- [`us-water-quality-data`](https://pypi.org/project/us-water-quality-data/) — ZIP-level water quality dataset (sibling package).
- [`us-housing-risk-data`](https://pypi.org/project/us-housing-risk-data/) — Housing risk and home values (sibling package).
- [ZipCheckup Public API](https://api.zipcheckup.com/v1/) — REST API including `/v1/county/{fips}/airdata-trend/`.

## Source repository

[github.com/artakulov/waterbyzipcode](https://github.com/artakulov/waterbyzipcode) (this package lives under `packages/us-county-airdata-trends-python/`).
