Metadata-Version: 2.4
Name: finance-datagen
Version: 0.1.0
Summary: Standard financial data generation
Project-URL: Repository, https://github.com/prettygoodcapital/finance-datagen
Project-URL: Homepage, https://github.com/prettygoodcapital/finance-datagen
Author-email: PrettyGoodCapital <prettygoodcapital@gmail.com>
License: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Programming Language :: Rust
Requires-Python: >=3.10
Requires-Dist: numpy
Requires-Dist: polars>=1.0
Requires-Dist: pyarrow>=14
Provides-Extra: develop
Requires-Dist: build; extra == 'develop'
Requires-Dist: bump-my-version; extra == 'develop'
Requires-Dist: check-dist; extra == 'develop'
Requires-Dist: cibuildwheel; extra == 'develop'
Requires-Dist: codespell; extra == 'develop'
Requires-Dist: hatch-rs; extra == 'develop'
Requires-Dist: hatchling; extra == 'develop'
Requires-Dist: mdformat; extra == 'develop'
Requires-Dist: mdformat-tables>=1; extra == 'develop'
Requires-Dist: pytest; extra == 'develop'
Requires-Dist: pytest-cov; extra == 'develop'
Requires-Dist: ruff; extra == 'develop'
Requires-Dist: twine; extra == 'develop'
Requires-Dist: ty; extra == 'develop'
Requires-Dist: uv; extra == 'develop'
Requires-Dist: wheel; extra == 'develop'
Requires-Dist: yardang; extra == 'develop'
Description-Content-Type: text/markdown

# finance datagen

Standard financial data generation

[![Build Status](https://github.com/prettygoodcapital/finance-datagen/actions/workflows/build.yaml/badge.svg?branch=main&event=push)](https://github.com/prettygoodcapital/finance-datagen/actions/workflows/build.yaml)
[![codecov](https://codecov.io/gh/prettygoodcapital/finance-datagen/branch/main/graph/badge.svg)](https://codecov.io/gh/prettygoodcapital/finance-datagen)
[![License](https://img.shields.io/github/license/prettygoodcapital/finance-datagen)](https://github.com/prettygoodcapital/finance-datagen)
[![PyPI](https://img.shields.io/pypi/v/finance-datagen.svg)](https://pypi.python.org/pypi/finance-datagen)

## Overview

`finance-datagen` produces **synthetic** financial time series for
testing, demos, and benchmarking the rest of the `finance-*` stack
without relying on real market data. The numerical core is implemented
in Rust and emits Apache Arrow `RecordBatch` values; the Python layer
wraps each generator so the public API returns `polars.DataFrame`
objects.

### Generators

#### Price models (Rust core)

| Name              | Model                                                       | Output columns                                      |
| ----------------- | ----------------------------------------------------------- | --------------------------------------------------- |
| `GBMGenerator`    | Geometric Brownian Motion (log-Euler)                       | `timestamp, symbol, price`                          |
| `HestonGenerator` | Heston (1993) stochastic volatility (full-truncation Euler) | `timestamp, symbol, price, variance`                |
| `GARCHGenerator`  | GARCH(1,1) returns                                          | `timestamp, symbol, price, return, sigma`           |
| `ohlc_from_close` | OHLCV synthesis from any close series                       | `timestamp, symbol, open, high, low, close, volume` |
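As a rough illustration of the log-Euler scheme named in the GBM row, the sketch below simulates a single price path in NumPy. It is not the library's Rust implementation: the function name `gbm_log_euler`, the `n_steps` and `dt` parameters, and the use of NumPy's default generator (PCG64 rather than ChaCha8) are assumptions made for this example, so its numbers will not match `GBMGenerator` output.

```python
import numpy as np


def gbm_log_euler(s0, mu, sigma, n_steps, dt=1.0 / 252.0, seed=None):
    """Simulate one GBM path by stepping the log-price (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_steps)
    # Log-price increment: drift with the Ito correction, plus a scaled shock.
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    # Prepend 0 so the path starts at s0, then accumulate in log space.
    log_path = np.log(s0) + np.concatenate(([0.0], np.cumsum(log_increments)))
    return np.exp(log_path)


prices = gbm_log_euler(s0=100.0, mu=0.07, sigma=0.25, n_steps=252, seed=0)
```

Stepping in log space keeps every simulated price strictly positive, which a plain Euler step on the price itself does not guarantee.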

#### Cross-sectional panels (Python)

| Name                       | Output                                                                       |
| -------------------------- | ---------------------------------------------------------------------------- |
| `generate_signal`          | Long-form `[date, symbol, signal, fwd_returns]` with target Pearson IC       |
| `generate_factor_loadings` | Wide `[symbol, market, value, momentum, size, quality]` Barra-style loadings |
| `generate_benchmark`       | `[date, benchmark]` Gaussian benchmark return series                         |
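To make "target Pearson IC" concrete, here is one possible construction for a single cross-section, sketched in NumPy. It is an assumption about how such a target could be hit, not a description of what `generate_signal` does internally; the helper name `signal_with_target_ic` and its parameters are hypothetical.

```python
import numpy as np


def signal_with_target_ic(n, ic, ret_vol=0.02, seed=None):
    """Return (signal, fwd_returns) whose sample Pearson correlation is `ic`."""
    rng = np.random.default_rng(seed)
    signal = rng.standard_normal(n)
    noise = rng.standard_normal(n)

    # Standardize the signal to zero mean and unit (population) std.
    z = (signal - signal.mean()) / signal.std()

    # Center the noise, remove its projection onto z, then rescale to unit std.
    e = noise - noise.mean()
    e = e - (e @ z) / (z @ z) * z
    e = e / e.std()

    # Mix the two orthogonal unit-variance components; corr(z, mix) equals ic.
    fwd_returns = ret_vol * (ic * z + np.sqrt(1.0 - ic**2) * e)
    return z, fwd_returns


signal, fwd_returns = signal_with_target_ic(n=500, ic=0.05, seed=0)
realized_ic = np.corrcoef(signal, fwd_returns)[0, 1]
```

Orthogonalizing the noise makes the realized in-sample IC match the target up to floating-point error. Skipping that step would hit the target only in expectation.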

All Rust generators accept an optional `seed: int`; with a fixed seed,
their output is bit-reproducible across platforms (ChaCha8 RNG). The
cross-sectional generators accept a `seed` that is passed to
`numpy.random.default_rng`.
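The seeding behaviour of `numpy.random.default_rng` can be checked directly. The snippet below uses only NumPy and does not call into `finance-datagen`; the variable names are illustrative.

```python
import numpy as np

# Two generators built from the same seed produce identical draws.
a = np.random.default_rng(42).standard_normal(5)
b = np.random.default_rng(42).standard_normal(5)

# A different seed gives a different stream.
c = np.random.default_rng(43).standard_normal(5)

same_seed_matches = np.array_equal(a, b)
different_seed_matches = np.array_equal(a, c)
```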

### Quick start

```python
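from finance_datagen import GBMGenerator, ohlc_from_close

# Simulate a seeded GBM close-price series (returned as a polars.DataFrame).
closes = GBMGenerator(s0=100.0, mu=0.07, sigma=0.25, seed=0).generate()
# Synthesize OHLCV bars from the close prices.
bars = ohlc_from_close(closes["price"], seed=0)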
```

See the [Data](docs/src/DATA.md) page for model math, parameter ranges,
and output schemas, and the [API](docs/src/API.md) page for a complete
function-level reference.

### Architecture

The Rust core (`rust/src/`) is **polars-free**: every generator builds
an `arrow_array::RecordBatch` and returns it through the
[Arrow C Data Interface](https://arrow.apache.org/docs/format/CDataInterface.html)
PyCapsule via `pyo3-arrow`. The Python wrappers call
`polars.from_arrow(batch)` on the receiving end. This keeps the
polars-rs and polars-py codebases on opposite sides of a stable ABI
boundary, avoiding the binary-incompatibility issues that come with
linking polars from both Rust and CPython.

> [!NOTE]
> This library was generated using [copier](https://copier.readthedocs.io/en/stable/) from the [Base Python Project Template repository](https://github.com/python-project-templates/base).
