Metadata-Version: 2.4
Name: nyc311
Version: 0.2.1
Summary: Python toolkit for reproducible NYC 311 complaint analysis via a typed SDK and CLI.
Project-URL: Homepage, https://github.com/random-walks/nyc311
Project-URL: Documentation, https://nyc311.readthedocs.io/
Project-URL: Bug Tracker, https://github.com/random-walks/nyc311/issues
Project-URL: Discussions, https://github.com/random-walks/nyc311/discussions
Project-URL: Changelog, https://github.com/random-walks/nyc311/releases
Author-email: random-walks <blaise@buenaola.io>
License-Expression: MIT
License-File: LICENSE
Keywords: 311,analytics,civic-tech,geospatial,nyc,nyc311,open-data
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.10
Provides-Extra: all
Requires-Dist: contextily>=1.6.2; extra == 'all'
Requires-Dist: geopandas>=1.1; extra == 'all'
Requires-Dist: matplotlib>=3.10.8; extra == 'all'
Requires-Dist: pandas>=2.3.3; extra == 'all'
Requires-Dist: scipy>=1.15.3; extra == 'all'
Requires-Dist: shapely>=2.1; extra == 'all'
Provides-Extra: dataframes
Requires-Dist: pandas>=2.3.3; extra == 'dataframes'
Provides-Extra: plotting
Requires-Dist: matplotlib>=3.10.8; extra == 'plotting'
Provides-Extra: science
Requires-Dist: matplotlib>=3.10.8; extra == 'science'
Requires-Dist: pandas>=2.3.3; extra == 'science'
Requires-Dist: scipy>=1.15.3; extra == 'science'
Provides-Extra: spatial
Requires-Dist: contextily>=1.6.2; extra == 'spatial'
Requires-Dist: geopandas>=1.1; extra == 'spatial'
Requires-Dist: shapely>=2.1; extra == 'spatial'
Description-Content-Type: text/markdown

# nyc311

[![Actions Status][actions-badge]][actions-link]
[![Documentation Status][rtd-badge]][rtd-link]
[![PyPI version][pypi-version]][pypi-link]
[![PyPI platforms][pypi-platforms]][pypi-link]

Python toolkit for reproducible NYC 311 complaint analysis via a typed SDK and
CLI.

## Status

`nyc311` is now on the stable `0.2` line with a tested toolkit for loading,
analyzing, and exporting NYC 311 complaint data.

The first public stable release shipped in `0.2.0`, and the `0.2.x` line focuses
on packaging polish, developer ergonomics, and incremental workflow improvements
on top of the current analysis surface.

### What ships in the stable `0.2` line

- load filtered NYC 311-style records from local CSV extracts or the live
  Socrata API
- stage reproducible local CSV snapshots from live fetches
- derive deterministic first-pass topic labels for supported complaint types
- aggregate complaint topics by borough or community district
- measure topic-rule coverage and summarize resolution gaps
- score anomalies over aggregated topic summaries
- export CSV tables, boundary-backed GeoJSON, and markdown report cards
- run the workflow through both a thin CLI and a composable functional SDK

## Install

Choose the dependency footprint that matches your workflow:

```bash
pip install nyc311
```

For the full turnkey experience:

```bash
pip install "nyc311[all]"
```

For pandas-backed conversion helpers:

```bash
pip install "nyc311[dataframes]"
```

For plotting and exploratory analysis without the geospatial stack:

```bash
pip install "nyc311[science]"
```

## Why this exists

NYC 311 data is one of the richest public records of neighborhood
quality-of-life complaints in the country, but much of the useful signal is
locked inside short text fields such as complaint descriptors.

This project aims to turn those records into reusable outputs for civic
analysis, journalism, and research while staying honest about what is truly
implemented today.

## Core workflow

The stable `0.2` line focuses on a deterministic, testable workflow:

1. load records from a local CSV extract or a filtered Socrata slice
2. filter by date, geography, and complaint type
3. assign a first-pass topic label using explicit keyword rules
4. aggregate counts by borough or community district
5. export a CSV summary table or boundary-backed GeoJSON artifact

### Supported topic extraction

The current rules-based topic extractor is implemented only for:

- `Blocked Driveway`
- `Illegal Parking`
- `Noise - Residential`
- `Rodent`

This is intentionally described as **first-pass topic extraction**, not
clustering or advanced NLP.

## Quick links

Docs: [Home](https://nyc311.readthedocs.io/en/latest/),
[Getting Started](https://nyc311.readthedocs.io/en/latest/getting-started/),
[CLI Reference](https://nyc311.readthedocs.io/en/latest/cli/),
[SDK Guide](https://nyc311.readthedocs.io/en/latest/sdk/),
[Examples](https://nyc311.readthedocs.io/en/latest/examples/),
[Architecture](https://nyc311.readthedocs.io/en/latest/architecture/),
[Contributing](https://nyc311.readthedocs.io/en/latest/contributing/)

## Example

```python
from datetime import date
from pathlib import Path

from nyc311 import analysis, export, models, pipeline

records = pipeline.fetch_service_requests(
    filters=models.ServiceRequestFilter(
        start_date=date(2025, 1, 1),
        end_date=date(2025, 1, 31),
        geography=models.GeographyFilter("borough", models.BOROUGH_BROOKLYN),
        complaint_types=("Noise - Residential",),
    ),
    socrata_config=models.SocrataConfig(page_size=250, max_pages=1),
)

export.export_service_requests_csv(
    records,
    models.ExportTarget("csv", Path("brooklyn-noise-snapshot.csv")),
)

assignments = analysis.extract_topics(records, models.TopicQuery("Noise - Residential"))
summary = analysis.aggregate_by_geography(assignments, geography="community_district")
export.export_topic_table(
    summary,
    models.ExportTarget("csv", Path("brooklyn-noise-topics.csv")),
)
```

CLI equivalent:

```bash
nyc311 fetch \
  --output brooklyn-noise-snapshot.csv \
  --complaint-type "Noise - Residential" \
  --geography borough \
  --geography-value BROOKLYN \
  --start-date 2025-01-01 \
  --end-date 2025-01-31 \
  --page-size 250 \
  --max-pages 1

nyc311 topics \
  --source brooklyn-noise-snapshot.csv \
  --complaint-type "Noise - Residential" \
  --geography community_district \
  --output brooklyn-noise-topics.csv
```

Live-data snapshot workflow:

```bash
nyc311 fetch \
  --output brooklyn-rodent-snapshot.csv \
  --complaint-type "Rodent" \
  --geography borough \
  --geography-value BROOKLYN \
  --start-date 2025-01-01 \
  --end-date 2025-01-31 \
  --page-size 500 \
  --max-pages 1
```

## Data assumptions

`load_service_requests()` currently supports:

- local CSV files
- live Socrata loading via `SocrataConfig`

CSV inputs use these columns:

- `unique_key`
- `created_date`
- `complaint_type`
- `descriptor`
- `borough`
- `community_district` or `community_board`

`resolution_description` is optional and loaded when present. It is currently
used by the resolution-gap and report-card helpers, while topic extraction
remains descriptor-driven.

## Public package surface

The current public package surface is organized around explicit namespaces:

- `nyc311.models` for dataclasses, constants, and configs
- `nyc311.io` for CSV and Socrata loading
- `nyc311.analysis` for topic extraction, coverage, gaps, and anomalies
- `nyc311.geographies` for packaged boundary layers and geometry helpers
- `nyc311.samples` for packaged sample records and sample-aligned boundaries
- `nyc311.export` for CSV, GeoJSON, and report exports
- `nyc311.pipeline` for one-call workflow helpers
- `nyc311.dataframes` for optional pandas conversions
- `nyc311.spatial` for optional geopandas helpers
- `nyc311.plotting` for optional plotting helpers
- `nyc311.presets` for reusable filter and Socrata config builders
- `nyc311.cli` with the `topics` and `fetch` subcommands

## Documentation

The hosted docs site is the canonical reference:
[nyc311.readthedocs.io](https://nyc311.readthedocs.io/).

If you are browsing in GitHub, the source docs live in `docs/`, including
`index.md`, `getting-started.md`, `cli.md`, `sdk.md`, `examples.md`, `api.md`,
`architecture.md`, and `contributing.md`.

Runnable examples live in `examples/` as self-contained consumer projects.

For local preview:

```bash
make docs
make docs-build
```

## Development

```bash
uv sync
uv sync --all-groups --all-extras
uv run --all-extras pytest -m "not integration"
uv run ruff check .
uv run ruff format --check .
uv run mypy
uv run mkdocs serve
uv run mkdocs build --strict
uv run python scripts/audit_public_api.py
uv run pytest -m "fetch and not integration"
```

## License

MIT.

<!-- prettier-ignore-start -->
[actions-badge]:            https://github.com/random-walks/nyc311/actions/workflows/ci.yml/badge.svg
[actions-link]:             https://github.com/random-walks/nyc311/actions
[github-discussions-badge]: https://img.shields.io/static/v1?label=Discussions&message=Ask&color=blue&logo=github
[github-discussions-link]:  https://github.com/random-walks/nyc311/discussions
[pypi-link]:                https://pypi.org/project/nyc311/
[pypi-platforms]:           https://img.shields.io/pypi/pyversions/nyc311
[pypi-version]:             https://img.shields.io/pypi/v/nyc311
[rtd-badge]:                https://readthedocs.org/projects/nyc311/badge/?version=latest
[rtd-link]:                 https://nyc311.readthedocs.io/en/latest/?badge=latest
<!-- prettier-ignore-end -->
