Metadata-Version: 2.4
Name: storywrangler-schemas
Version: 0.0.1
Summary: Shared Pydantic schemas for the Storywrangler platform
Project-URL: Homepage, https://github.com/vermont-complex-systems/storywrangler
Project-URL: Documentation, https://complexstories.uvm.edu
Project-URL: Repository, https://github.com/vermont-complex-systems/storywrangler/tree/main/packages/schemas
Project-URL: Specification, https://github.com/vermont-complex-systems/Storywrangler-Specification
Author-email: Vermont Complex Systems Institute <compstorylab@uvm.edu>
License-Expression: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.11
Requires-Dist: mmh3>=4.0.0
Requires-Dist: pydantic>=2.0.0
Description-Content-Type: text/markdown

# storywrangler-schemas

Shared Pydantic schemas for the [Storywrangler](https://github.com/vermont-complex-systems/storywrangler) platform. Both the backend API and the `storywrangler` SDK import their models from this package, so neither maintains its own copy.

Implements the [Storywrangler Specification v0.0.3](https://github.com/vermont-complex-systems/Storywrangler-Specification/blob/main/versions/0.0.3.md).

## Installation

```bash
pip install storywrangler-schemas
```

Most users should install `storywrangler` instead, which includes this package as a dependency.

## What's inside

### Registry models (`storywrangler_schemas.registry`)

| Model | Purpose |
|---|---|
| `DatasetCreate` | Full registration payload (domain, format, transform, entity mapping, lineage, ...) |
| `EndpointSchemaConfig` | Output shape declaration (`types-counts`, `time-series`) |
| `TransformConfig` | Query slice axes (time dimension, filter dimensions, hash bucket) |
| `EntityMappingConfig` | Entity ID resolution config (local column + namespace) |
| `EntityRow` | One row in the entity mapping table |
| `ManifestConfig` | Coverage index (availability, partition index) |
| `OwnershipConfig` | Owner group, contact, lifecycle status |
| `LineageConfig` | Sources, upstream datasets, pipeline repo, archival DOI |

### Hash bucket assignment (`storywrangler_schemas.hashing`)

The canonical `murmur3_32` hash function, shared between the backend query layer and data pipelines so that both assign a given key to the same bucket:

```python
from storywrangler_schemas.hashing import assign_bucket

bucket = assign_bucket("hello", num_buckets=16)  # deterministic int in [0, 16)
```
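To make the idea concrete, here is a minimal, dependency-free sketch of how a `murmur3_32`-based bucket assignment could work. It is an illustration, not this package's implementation: the name `assign_bucket_sketch`, the UTF-8 encoding of the key, the seed of `0`, and the "unsigned hash modulo `num_buckets`" mapping are all assumptions, so its output is not guaranteed to match `assign_bucket`.

```python
def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 (x86, 32-bit); returns an unsigned 32-bit int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    nblocks = len(data) // 4

    # Body: mix the input four bytes at a time (little-endian).
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF

    # Tail: mix the remaining 1 to 3 bytes, if any.
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def assign_bucket_sketch(key: str, num_buckets: int, seed: int = 0) -> int:
    """Hypothetical stand-in for assign_bucket: unsigned hash modulo num_buckets.

    Assumptions (not confirmed by the package): UTF-8 key encoding, seed 0,
    and a plain modulo over the unsigned 32-bit hash.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    return murmur3_32(key.encode("utf-8"), seed) % num_buckets


# The same key always lands in the same bucket, whichever process computes it.
print(assign_bucket_sketch("hello", num_buckets=16))
```

Because the result depends only on the key, the seed, and `num_buckets`, a pipeline that writes bucketed partitions and a query layer that reads them can agree on placement without coordinating, provided both call the same function. In practice, import `assign_bucket` from this package rather than reimplementing it.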

### Standards (`storywrangler_schemas.standards`)

Entity ID validation, namespace registry, and spec URL helpers.
