Metadata-Version: 2.4
Name: pyccrs
Version: 0.13.6
Summary: Python wrapper for downloading California Crash Reporting System (CCRS) data.
Author-email: Patrick Thomas Perrin <ptpdev@duck.com>
License-Expression: MIT
Project-URL: Homepage, https://codeberg.org/ptpdev/pyccrs
Classifier: Development Status :: 3 - Alpha
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.14
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ckanpy>=0.2.8
Requires-Dist: pandas>=3.0.3
Requires-Dist: pydantic>=2.13.4
Dynamic: license-file

# pyccrs

### Purpose
California crash data is rich, yet a hassle to access.
`pyccrs` is my solution. With this package, downloading CCRS
data requires only a basic understanding of Python, so data
analysts can focus on what they do best: last-mile data
wrangling and analysis.

### Dependencies

#### ckanpy
`ckanpy` is a package I created for `pyccrs`. It handles
interactions with the CKAN database through which CCRS data
is accessible.

#### pydantic
`pydantic` is used to strengthen the data models, providing type
validation and parsing, which makes the package more robust
against bugs.

#### pandas
`pandas` is used mainly as an export option, and also for a few
filters.

### Use Cases

#### Downloading CCRS Table Data
`pyccrs` makes it simple to download CCRS data efficiently,
using SQL queries to filter data across all the years one
wishes to request. SQL queries are generated Pythonically,
with no SQL knowledge necessary from the user. Sometimes a year
is available only as a CSV download; for stability, such years
are downloaded in full and filtered accordingly, albeit with
significantly longer load times and higher memory usage.
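
The two paths described above can be sketched roughly as
follows. This is a minimal illustration, not the actual
`pyccrs` implementation: the helper names, the resource ID
`crashes_2024`, and the column name `City Name` are all
hypothetical, and the sketch assumes a PostgreSQL-style SQL
dialect such as the one CKAN's DataStore accepts.

```python
import csv
import io


def quote_identifier(name: str) -> str:
    """Quote a table or column name, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_query(resource_id: str, column: str, value: str) -> str:
    """Build a SELECT statement filtering one resource by one column."""
    return (
        f"SELECT * FROM {quote_identifier(resource_id)} "
        f"WHERE {quote_identifier(column)} = {quote_literal(value)}"
    )


def filter_csv(text: str, column: str, value: str) -> list[dict[str, str]]:
    """Fallback path: read the whole CSV and keep only matching rows."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row[column] == value]


# Hypothetical resource ID and column name, for illustration only.
query = build_query("crashes_2024", "City Name", "Los Angeles")
print(query)
# SELECT * FROM "crashes_2024" WHERE "City Name" = 'Los Angeles'

# Stand-in for a year that is only available as a CSV download.
sample_csv = "Collision Id,City Name\n1,Los Angeles\n2,Sacramento\n"
print(filter_csv(sample_csv, "City Name", "Los Angeles"))
```

The CSV path has to read every row before filtering, which is
why it costs more time and memory than the SQL path.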

For the sake of usability, the easiest way to download CCRS
data through `pyccrs` is by a particular column:
- Crashes: city name
- Parties: collision ID
- People: collision ID

(Note: what CCRS calls "InjuredWitnessPassenger", `pyccrs`
calls "Person", or "People".)

In addition, `pyccrs` automatically remaps the column names to
a standardized, Pythonic format, like "column_name", as opposed
to CCRS's un-Pythonic "Column Name".
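
As a rough sketch of what such a remapping could look like
(the function names here are hypothetical and the real
`pyccrs` logic may differ; the example headers are
illustrative rather than confirmed CCRS columns):

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a header such as "Column Name" to "column_name"."""
    # Collapse each run of non-alphanumeric characters into one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def remap_columns(row: dict[str, str]) -> dict[str, str]:
    """Return a copy of a row with every key converted to snake_case."""
    return {to_snake_case(key): value for key, value in row.items()}


# Illustrative headers only.
remapped = remap_columns({"City Name": "Fresno", "Collision Id": "42"})
print(remapped)
# {'city_name': 'Fresno', 'collision_id': '42'}
```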

Each table has its own download function:
- `download_crashes()`
- `download_parties()`
- `download_people()`

#### Downloading Node-Based Data
`pyccrs` also allows downloading data as nodes in a graph,
which may be the superior data structure for certain analyses.
This may be achieved using `download_nodes()`.
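
To illustrate the idea of node-based data, here is a minimal
sketch that attaches parties and people to their crash by
collision ID. The `CrashNode` class and `build_nodes` helper
are hypothetical and are not the structure `download_nodes()`
returns; they only show why a graph-like layout can be
convenient.

```python
from dataclasses import dataclass, field


@dataclass
class CrashNode:
    """A crash with its related parties and people attached."""

    collision_id: int
    parties: list[dict] = field(default_factory=list)
    people: list[dict] = field(default_factory=list)


def build_nodes(crashes, parties, people):
    """Group party and person rows under the crash they belong to."""
    nodes = {c["collision_id"]: CrashNode(c["collision_id"]) for c in crashes}
    for party in parties:
        node = nodes.get(party["collision_id"])
        if node is not None:  # skip rows whose crash was not requested
            node.parties.append(party)
    for person in people:
        node = nodes.get(person["collision_id"])
        if node is not None:
            node.people.append(person)
    return nodes


# Made-up rows, for illustration only.
nodes = build_nodes(
    crashes=[{"collision_id": 1}, {"collision_id": 2}],
    parties=[{"collision_id": 1, "party_number": 1},
             {"collision_id": 1, "party_number": 2}],
    people=[{"collision_id": 2, "person_number": 1},
            {"collision_id": 9, "person_number": 1}],
)
print(len(nodes[1].parties), len(nodes[2].people))
# 2 1
```

With this layout, everything related to one crash is reachable
from a single node, without joining three tables by hand.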

#### Downloading Modeled Data
It is also possible to download data as lists of `pydantic`
models, with the export style option "model".

#### Documentation
`pyccrs` also serves as documentation. CCRS metadata is somewhat
inaccurate, with some columns improperly typed or outright
missing. One of the objectives of `pyccrs`, then, is to
document the data and enforce that documentation through
`pydantic` models and docstrings.
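
A minimal sketch of how a `pydantic` model can both document
and enforce column types. The `Crash` model, its fields, and
the aliases below are hypothetical and do not reflect the
actual `pyccrs` models or the real CCRS column list.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Crash(BaseModel):
    """Hypothetical crash record; fields are illustrative only."""

    # Allow construction by field name as well as by source alias.
    model_config = ConfigDict(populate_by_name=True)

    collision_id: int = Field(
        alias="Collision Id",
        description="Unique identifier of the crash.",
    )
    city_name: str = Field(
        alias="City Name",
        description="City in which the crash occurred.",
    )


# Values arriving as strings are parsed into the declared types.
crash = Crash.model_validate({"Collision Id": "42", "City Name": "Fresno"})
print(crash.collision_id, crash.city_name)

# A value that cannot be parsed is rejected instead of passing silently.
try:
    Crash.model_validate({"Collision Id": "unknown", "City Name": "Fresno"})
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

The field descriptions double as documentation, while the type
annotations are what the validation enforces.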

### Contributing
Though the package is functional, there are still many ways
it can be improved. Feel free to look around for leftover TODOs,
send a pull request suggesting a change, or reach out to me
by email to discuss specific improvements.
