Metadata-Version: 2.4
Name: finance-datagen
Version: 0.2.1
Summary: Standard financial data generation
Project-URL: Repository, https://github.com/prettygoodcapital/finance-datagen
Project-URL: Homepage, https://github.com/prettygoodcapital/finance-datagen
Author-email: PrettyGoodCapital <prettygoodcapital@gmail.com>
License: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Programming Language :: Rust
Requires-Python: >=3.10
Requires-Dist: finance-dates<0.3,>=0.2.0
Requires-Dist: finance-enums<0.6,>=0.5.1
Requires-Dist: numpy
Requires-Dist: polars>=1.0
Requires-Dist: pyarrow>=14
Requires-Dist: pydantic>=2
Provides-Extra: develop
Requires-Dist: build; extra == 'develop'
Requires-Dist: bump-my-version; extra == 'develop'
Requires-Dist: check-dist; extra == 'develop'
Requires-Dist: cibuildwheel; extra == 'develop'
Requires-Dist: codespell; extra == 'develop'
Requires-Dist: hatch-rs; extra == 'develop'
Requires-Dist: hatchling; extra == 'develop'
Requires-Dist: mdformat; extra == 'develop'
Requires-Dist: mdformat-tables>=1; extra == 'develop'
Requires-Dist: pytest; extra == 'develop'
Requires-Dist: pytest-cov; extra == 'develop'
Requires-Dist: ruff; extra == 'develop'
Requires-Dist: twine; extra == 'develop'
Requires-Dist: ty; extra == 'develop'
Requires-Dist: uv; extra == 'develop'
Requires-Dist: wheel; extra == 'develop'
Requires-Dist: yardang; extra == 'develop'
Description-Content-Type: text/markdown

# finance datagen

Standard financial data generation

[![Build Status](https://github.com/prettygoodcapital/finance-datagen/actions/workflows/build.yaml/badge.svg?branch=main&event=push)](https://github.com/prettygoodcapital/finance-datagen/actions/workflows/build.yaml)
[![codecov](https://codecov.io/gh/prettygoodcapital/finance-datagen/branch/main/graph/badge.svg)](https://codecov.io/gh/prettygoodcapital/finance-datagen)
[![License](https://img.shields.io/github/license/prettygoodcapital/finance-datagen)](https://github.com/prettygoodcapital/finance-datagen)
[![PyPI](https://img.shields.io/pypi/v/finance-datagen.svg)](https://pypi.python.org/pypi/finance-datagen)

## Overview

`finance-datagen` produces **synthetic** financial time series for
testing, demos, and benchmarking the rest of the `finance-*` stack
without relying on real market data. The numerical core is implemented
in Rust and emits Apache Arrow `RecordBatch` values; the Python layer
wraps each generator so the public API returns `polars.DataFrame`
objects.

All public generator classes inherit from `DataGenerator`, a pydantic
base model that validates typed parameters on construction. Use
`.generate()` for the table output, or `next(generator)` for one-shot
iterator-style use. Convenience functions such as `generate_prices(...)`,
`generate_gbm(...)`, and `generate_signal(...)` instantiate the matching model
and return `.generate()`.
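
The construct-validate-generate pattern described above can be sketched without the library. This is a minimal illustration only: `ToyReturnsGenerator` and `generate_toy_returns` are hypothetical names, a standard-library `dataclass` stands in for the pydantic base model, and a plain `dict` stands in for the `polars.DataFrame` output.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyReturnsGenerator:
    """Hypothetical stand-in for a generator class; not part of finance-datagen."""

    n_rows: int = 5
    vol: float = 0.01
    seed: Optional[int] = None

    def __post_init__(self):
        # Reject bad parameters at construction time, as a validating model would.
        if self.n_rows <= 0:
            raise ValueError("n_rows must be positive")
        if self.vol < 0:
            raise ValueError("vol must be non-negative")

    def generate(self):
        # A fresh RNG per call, so a fixed seed always yields the same table.
        rng = random.Random(self.seed)
        returns = [rng.gauss(0.0, self.vol) for _ in range(self.n_rows)]
        return {"step": list(range(self.n_rows)), "return": returns}

    def __iter__(self):
        return self

    def __next__(self):
        # Iterator-style access returns one full table per call.
        return self.generate()


def generate_toy_returns(**params):
    # Convenience wrapper: build the model, then return its table.
    return ToyReturnsGenerator(**params).generate()


gen = ToyReturnsGenerator(n_rows=3, seed=1)
table = next(gen)
```

In this sketch the wrapper and the class produce identical output for identical parameters, which is the relationship the convenience functions above are described as having with their models.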

### Generators

#### Price models (Rust core)

| Symbol            | Model                                                       | Output columns                                      |
| ----------------- | ----------------------------------------------------------- | --------------------------------------------------- |
| `GBMGenerator`    | Geometric Brownian Motion (log-Euler)                       | `timestamp, symbol, price`                          |
| `HestonGenerator` | Heston (1993) stochastic volatility (full-truncation Euler) | `timestamp, symbol, price, variance`                |
| `GARCHGenerator`  | GARCH(1,1) returns                                          | `timestamp, symbol, price, return, sigma`           |
| `ohlc_from_close` | OHLCV synthesis from any close series                       | `timestamp, symbol, open, high, low, close, volume` |

Price-path convenience wrappers are also exported as `generate_prices`,
`generate_gbm`, `generate_heston`, and `generate_garch`. `generate_prices` is a
plain alias for `generate_gbm`, intended for examples and tests that want a
model-neutral name.
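
For readers unfamiliar with the log-Euler scheme named in the table, the following is a minimal pure-Python sketch of a GBM path. It is not the library's Rust implementation: the function name `gbm_path` and its parameters (`s0`, `mu`, `sigma`, `dt`, `n_steps`) are assumptions chosen for illustration, and it uses the standard-library `random` module rather than ChaCha8.

```python
import math
import random


def gbm_path(s0, mu, sigma, dt, n_steps, seed=None):
    """Simulate one GBM price path by stepping the log-price (log-Euler)."""
    rng = random.Random(seed)  # private RNG so the seed fully determines the path
    drift = (mu - 0.5 * sigma**2) * dt  # Ito-corrected drift per step
    vol = sigma * math.sqrt(dt)  # per-step standard deviation of the log-return
    log_price = math.log(s0)
    prices = [s0]
    for _ in range(n_steps):
        log_price += drift + vol * rng.gauss(0.0, 1.0)
        prices.append(math.exp(log_price))
    return prices


# One year of daily steps at 5% drift and 20% annualized volatility.
path = gbm_path(s0=100.0, mu=0.05, sigma=0.2, dt=1 / 252, n_steps=252, seed=0)
```

Stepping the log-price rather than the price keeps every simulated price strictly positive, and for constant `mu` and `sigma` each step matches the exact GBM transition distribution.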

#### Python generators

| Symbol                          | Output                                                                              |
| ------------------------------- | ----------------------------------------------------------------------------------- |
| `SignalGenerator`               | Long-form `[date, symbol, signal, fwd_returns]` with target Pearson IC              |
| `FactorLoadingsGenerator`       | Wide `[symbol, market, value, momentum, size, quality]` Barra-style loadings        |
| `BenchmarkGenerator`            | `[date, benchmark]` Gaussian benchmark return series                                |
| `PositionsGenerator`            | Long-form position panel `[date, symbol, price, quantity, market_value, weight]`    |
| `TransactionsGenerator`         | Transaction log with enum-backed side/position-effect labels and explicit costs     |
| `OrdersGenerator`               | Enum-backed order fixtures with side, order type, status, and time-in-force         |
| `ExecutionsGenerator`           | Enum-backed execution fixtures for simulated fills                                  |
| `MultiAssetGBMGenerator`        | Correlated multi-asset GBM panel `[timestamp, symbol, price, return]`               |
| `RegimeSwitchingGenerator`      | Markov regime-switching price path `[timestamp, symbol, price, return, regime]`     |
| `MarketImpactCurveGenerator`    | Participation-rate impact curves with temporary, permanent, and total impact in bps |
| `StatisticalRiskModelGenerator` | PCA-style factor loadings, factor returns, and specific variance                    |
| `FundamentalRiskModelGenerator` | Barra-style enum-backed sector/style loadings plus specific variance                |
| `FactorCovarianceGenerator`     | Symmetric positive semidefinite factor covariance matrix                            |
| `SpecificVarianceGenerator`     | Positive idiosyncratic variance vector                                              |

Every Python generator has a matching `generate_*` convenience wrapper,
including the legacy `generate_signal`, `generate_factor_loadings`, and
`generate_benchmark` functions.
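
To illustrate what a "target Pearson IC" means for a signal panel, here is a minimal single-date sketch in pure Python. It is one common construction, not necessarily the one `SignalGenerator` uses: the names `signal_with_ic`, `pearson`, and `ret_vol` are hypothetical, and the target holds in expectation, so the realized IC varies around it with sampling noise.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def signal_with_ic(n_assets, ic, ret_vol=0.02, seed=None):
    """Draw a cross-section whose signal/forward-return correlation is `ic` in expectation."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n_assets)]  # standardized forward returns
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_assets)]  # independent noise
    fwd_returns = [ret_vol * v for v in z]
    # Mix the return driver with noise so that corr(signal, fwd_returns) = ic.
    signal = [ic * v + math.sqrt(1.0 - ic**2) * e for v, e in zip(z, noise)]
    return signal, fwd_returns


signal, fwd_returns = signal_with_ic(n_assets=20_000, ic=0.1, seed=0)
realized_ic = pearson(signal, fwd_returns)
```

With a large cross-section the realized IC lands close to the target; with a small one (for example 50 assets) it can deviate noticeably on any single date.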

All Rust generators accept an optional `seed: int` for bit-reproducible
output across platforms (ChaCha8 RNG); the Python generators accept a
`seed` that is passed to `numpy.random.default_rng`.

Portfolio, transaction, order, execution, and market-model generators
also support enum-backed metadata columns where applicable, including
`currency`, `exchange`, `region`, `instrument_type`, `market_type`, and
`venue_type`. Portfolio and transaction generators can use the `Calendar`
exchange calendars from `finance-dates` so that generated dates and
timestamps align with actual business days and session hours.

### Quick start

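```python
from finance_datagen import OrdersGenerator, generate_prices, generate_signal, ohlc_from_close

# Close-price path (generate_prices is an alias for generate_gbm).
closes = generate_prices(symbol="ACME", seed=0)
# OHLCV bars synthesized from the close series.
bars = ohlc_from_close(closes["price"], symbol="ACME", seed=0)
# Long-form signal panel with forward returns.
signal = generate_signal(n_dates=20, n_assets=50, seed=0)
# Order fixtures built through the generator class directly.
orders = OrdersGenerator(
    n_dates=3, n_assets=5, orders_per_day=10, exchange="XNYS", currency="USD", seed=0
).generate()
```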

See the [Data](docs/src/DATA.md) page for model math, parameter ranges,
and output schemas, and the [API](docs/src/API.md) page for a complete
function-level reference.

### Architecture

The Rust core (`rust/src/`) is **polars-free**: every generator builds
an `arrow_array::RecordBatch` and returns it through the
[Arrow C Data Interface](https://arrow.apache.org/docs/format/CDataInterface.html)
PyCapsule via `pyo3-arrow`. The Python wrappers call
`polars.from_arrow(batch)` on the receiving end. This keeps the
polars-rs and polars-py codebases on opposite sides of a stable ABI
boundary, avoiding the binary-incompatibility issues that come with
linking polars from both Rust and CPython.

> [!NOTE]
> This library was generated using [copier](https://copier.readthedocs.io/en/stable/) from the [Base Python Project Template repository](https://github.com/python-project-templates/base).
