Metadata-Version: 2.4
Name: chronocratic-datasets
Version: 0.1.0a2
Summary: Ready-to-use time series datasets for PyTorch Lightning
Author-email: The Chronocratic Developers <github@users.noreply.github.com>
License-Expression: BSD-3-Clause
Project-URL: Homepage, https://github.com/chronocratic/chronocratic-datasets
Project-URL: Documentation, https://chronocratic-datasets.readthedocs.io/
Project-URL: Repository, https://github.com/chronocratic/chronocratic-datasets
Project-URL: Issues, https://github.com/chronocratic/chronocratic-datasets/issues
Keywords: time-series,pytorch,lightning,machine-learning,datasets,data-modules,forecasting,classification,regression,artificial-intelligence
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<3.0.0,>=2.1
Requires-Dist: pandas>=2.2.0
Requires-Dist: scipy>=1.13.0
Requires-Dist: scikit-learn<2.0.0,>=1.6
Requires-Dist: lightning<3.0,>=2.5
Requires-Dist: torch<3.0,>=2.4
Provides-Extra: docs
Requires-Dist: sphinx>=7.0; extra == "docs"
Requires-Dist: pydata-sphinx-theme>=0.15; extra == "docs"
Requires-Dist: myst-parser>=3.0; extra == "docs"
Dynamic: license-file

# chronocratic-datasets

Ready-to-use time series datasets for PyTorch Lightning.

[![License: BSD 3-Clause](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![PyPI version](https://img.shields.io/pypi/v/chronocratic-datasets.svg)](https://pypi.org/project/chronocratic-datasets/)
[![Python 3.12+](https://img.shields.io/pypi/pyversions/chronocratic-datasets.svg)](https://pypi.org/project/chronocratic-datasets/)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/chronocratic-datasets?period=total&units=INTERNATIONAL_SYSTEM&left_color=BLACK&right_color=BLUE&left_text=downloads)](https://pepy.tech/projects/chronocratic-datasets)
[![Build and Test](https://github.com/chronocratic/chronocratic-datasets/actions/workflows/build-and-test.yml/badge.svg)](https://github.com/chronocratic/chronocratic-datasets/actions/workflows/build-and-test.yml)
[![Documentation Status](https://readthedocs.org/projects/chronocratic-datasets/badge/?version=latest)](https://chronocratic-datasets.readthedocs.io/)
![code style - ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)
[![GitHub Stars](https://img.shields.io/github/stars/chronocratic/chronocratic-datasets)](https://github.com/chronocratic/chronocratic-datasets)

## Installation

Install the package via pip:

```bash
pip install chronocratic-datasets
```

> **Note:** The distribution is installed under the hyphenated name `chronocratic-datasets`, but it is imported from the dotted `chronocratic.datasets` namespace:
> ```python
> from chronocratic.datasets import ...
> ```

## Quick Start

```python
from pathlib import Path

from chronocratic.datasets import ForecastingMode, WeatherDataModule

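# Configure the Weather data module for univariate forecasting.
weather = WeatherDataModule(
    dataset_file_path=Path("data/weather.csv"),
    mode=ForecastingMode.UNIVARIATE,
)

# Standard Lightning hooks: prepare_data() does one-time preparation,
# setup() builds the datasets behind the dataloaders.
weather.prepare_data()
weather.setup()

train_loader = weather.train_dataloader()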
```

## Datasets

### Forecasting

- **ETT** (Electricity Transformer Temperature): ETTh1, ETTh2, ETTm1, ETTm2 — transformer temperature data at hourly and 15-minute intervals
- **Weather**: Meteorological features recorded from 2012 to 2017
- **Electricity**: Hourly electricity load data

### Classification

- **UCR** (Univariate): Archive of univariate time series classification datasets
- **UEA** (Multivariate): Archive of multivariate time series classification datasets

## Features

- **PyTorch Lightning DataModules** — Drop-in `LightningDataModule` implementations for seamless integration with Lightning training loops
- **Automatic caching with atomic writes** — Downloaded and processed data is cached locally with atomic file operations to prevent corruption
- **DDP-compliant data loading** — Workers share cached data correctly under Distributed Data Parallel training
- **Multiple forecasting modes** — Switch between `UNIVARIATE` and `MULTIVARIATE` forecasting configurations
- **Built-in scaling** — MinMax and Standard scalers applied automatically per dataset conventions
- **Type-safe API** — Full type hints and Google-style docstrings for IDE autocomplete and static analysis
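
The atomic-write caching mentioned above follows a common pattern: write to a temporary file in the same directory, then rename it over the target in a single step. The sketch below illustrates that general technique using only the standard library. It is not the package's actual implementation, and `atomic_write_bytes` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target: Path, payload: bytes) -> None:
    """Write payload to target so readers never observe a partial file.

    Hypothetical helper for illustration; not part of chronocratic-datasets.
    """
    target.parent.mkdir(parents=True, exist_ok=True)
    # Create the temporary file next to the target so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # os.replace swaps the file in as a single step, overwriting any
        # existing target.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Because the target path only ever points at a fully written file, a process that is interrupted mid-write leaves the previous cache entry (or no entry) in place instead of a truncated one.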

## Documentation

Full documentation, including the API reference, quickstart guides, and contributing instructions, is available at [chronocratic-datasets.readthedocs.io](https://chronocratic-datasets.readthedocs.io/).

## License

BSD 3-Clause — see [LICENSE](LICENSE) for the full text.
