Metadata-Version: 2.4
Name: finmlsim
Version: 0.4.0
Summary: Simulation toolkit for financial machine learning: scholarly market, microstructure, and execution simulators for teaching and research.
Author-email: Shane Conway <shane.conway@gmail.com>
Maintainer-email: Shane Conway <shane.conway@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/smc77/aiinfinance
Project-URL: Documentation, https://github.com/smc77/aiinfinance/blob/master/finmlsim/README.md
Project-URL: Repository, https://github.com/smc77/aiinfinance
Project-URL: Issues, https://github.com/smc77/aiinfinance/issues
Keywords: finance,quantitative-finance,machine-learning,simulation,monte-carlo,stochastic-volatility,market-microstructure,limit-order-book,optimal-execution,heston,sabr,hawkes-process,rough-volatility,almgren-chriss,kyle-model,block-bootstrap
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Education
Classifier: Topic :: Office/Business :: Financial :: Investment
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Provides-Extra: analysis
Requires-Dist: pandas>=2.0; extra == "analysis"
Requires-Dist: scipy>=1.10; extra == "analysis"
Requires-Dist: matplotlib>=3.7; extra == "analysis"
Requires-Dist: statsmodels>=0.14; extra == "analysis"
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# finmlsim

Simulation and analysis toolkit for the **FIN 70xx — Financial Machine Learning & AI** course.

The course is **simulation-driven**: before confronting messy empirical data, we generate return
series under controlled assumptions, turn each *stylized fact* of markets on and off, and watch the
effect. `finmlsim` provides those generators and the tools to measure what they produce.

```python
import finmlsim as fms
r = fms.simulate.garch(n=2000, dist="t", seed=0)   # volatility clustering + fat tails
fms.stylized.summary(r)                            # measure the stylized facts
prices = fms.stylized.to_prices(r)                 # log returns -> price path for plotting
```
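
For intuition about what a generator such as `simulate.garch` involves, the following is a minimal standard-library sketch of a GARCH(1,1) recursion with Gaussian innovations, plus the conversion from log returns to a price path. It is not the package's implementation: the function names `garch11_returns` and `log_returns_to_prices`, the parameter values (`omega`, `alpha`, `beta`, `p0`), and the choice to prepend the starting price are all illustrative assumptions.

```python
import math
import random


def garch11_returns(n, omega=1e-6, alpha=0.08, beta=0.90, seed=0):
    """Simulate n log returns from a Gaussian GARCH(1,1) (illustrative parameters)."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        # Next-period conditional variance reacts to the latest squared return.
        var = omega + alpha * r * r + beta * var
    return returns


def log_returns_to_prices(returns, p0=100.0):
    """Compound log returns into a price path that starts at p0."""
    prices = [p0]
    for r in returns:
        prices.append(prices[-1] * math.exp(r))
    return prices


r = garch11_returns(2000)
prices = log_returns_to_prices(r)
```

Because the conditional variance feeds on the previous squared return, large moves tend to follow large moves, which is the volatility clustering that the diagnostics in `stylized` are meant to detect.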

## Install
```bash
pip install -e .                 # from the repo root (editable); core needs only numpy
pip install -e ".[analysis]"     # adds pandas, scipy, matplotlib, statsmodels for figures/analysis
```

## Modules
- **`simulate`** — generators grouped by family:
  - *Return series* (1-D log returns): `gaussian`, `student_t`, `arma`, `garch`,
    `regime_switching`, `jump_diffusion`, `cont_bouchaud`.
  - *Cross-sectional panel*: `panel` (2-D, common-factor structure).
  - *Continuous-time price processes*: `ornstein_uhlenbeck` / `vasicek`,
    `heston` (1993), `bates` (1996), `sabr` (Hagan et al. 2002), `rbergomi`
    (Bayer-Friz-Gatheral 2016), `fbm` / `fgn` (Hurst-parameterized).
  - *Microstructure and execution*: `hawkes` (1971, self-exciting), `roll_bounce` (1984),
    `glosten_milgrom` (1985), `kyle` (1985), `limit_order_book` (Cont-Stoikov style),
    `almgren_chriss` (2000, optimal execution).
  Every generator takes a `seed` for reproducibility.
- **`stylized`** — diagnostics: `acf`, `excess_kurtosis`, `vol_clustering`, `summary`, `to_prices`.
- **`metrics`** — strategy evaluation: `sharpe`, `max_drawdown`, `rank_ic`, `ic_ir`, `turnover`,
  `deflated_sharpe`, `min_track_record_length` (used in Chapters 15 and 16).
- **`resample`** — block bootstraps for dependent series: `stationary_bootstrap`
  (Politis-Romano 1994), `circular_block_bootstrap` (Politis-Romano 1992). Pairs with the
  multiple-testing diagnostics in `metrics`.
- **`data`** — loaders for the bundled empirical datasets (Ken French daily factors, momentum, and 12
  industry portfolios) used for the book's real-data "empirical companion" figures: `ff_factors`,
  `ff_momentum`, `industries`, `market_return`. Offline; see `datasets/famafrench/`.

## The stylized facts each return-series generator produces
| Generator | Fat tails | Vol clustering | Autocorrelation in returns |
|-----------|:---------:|:--------------:|:--------------------:|
| `gaussian` | – | – | – |
| `student_t` | ✓ | – | – |
| `arma` | – | – | ✓ (by design) |
| `garch` | mild | ✓ | – |
| `garch(dist="t")` | ✓ | ✓ | – |
| `regime_switching` | ✓ (mixed) | ✓ | – |
| `jump_diffusion` | ✓ | – | – |
| `cont_bouchaud` | ✓ (endogenous) | (weak) | – |
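
The first two rows of the table can be checked with a few lines of standard-library Python. This sketch does not use `finmlsim`; the helpers `excess_kurtosis` and `autocorr` are simplified stand-ins written for illustration, and the sample size and degrees of freedom are arbitrary choices.

```python
import math
import random


def excess_kurtosis(x):
    """Sample excess kurtosis: fourth standardized moment minus 3."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2) - 3.0


def autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


rng = random.Random(0)
n, df = 20000, 5
gauss = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Student-t draw: standard normal divided by sqrt(chi-square / df).
student = [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
           for _ in range(n)]

k_gauss = excess_kurtosis(gauss)        # fat tails? expected near zero
k_student = excess_kurtosis(student)    # fat tails? expected clearly positive
ac_sq_student = autocorr([v * v for v in student])  # vol clustering? expected near zero
```

For a Student-t with `df` degrees of freedom the theoretical excess kurtosis is 6 / (df - 4), so 6 when `df=5`, although the sample estimate is noisy. Independent draws show no autocorrelation in squared returns, which is why `student_t` has fat tails but no volatility clustering.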

## What the continuous-time and microstructure models give you
| Generator | Output | Key parameter | Use case |
|-----------|---------|---------------|----------|
| `ornstein_uhlenbeck` / `vasicek` | path | `kappa` (mean-reversion rate) | pairs, short rates, vol state |
| `heston` | `(S, v)` | `xi`, `rho`, `kappa` | stochastic vol; smile/skew |
| `bates` | `(S, v)` | + `jump_intensity` | stoch-vol with crash risk |
| `sabr` | `(F, alpha)` | `beta`, `nu`, `rho` | FX / rates vol surfaces |
| `rbergomi` | `(S, xi)` | `H` < 0.5 | rough volatility |
| `fbm` / `fgn` | path / increments | `H` | Hurst-parameterized paths |
| `hawkes` | event times | `alpha`, `beta` | order-flow clustering, contagion |
| `roll_bounce` | transaction-price log returns | `half_spread` | bid-ask bounce, negative AC(1) |
| `glosten_milgrom` | dict (ask/bid/spread/belief) | `alpha` (informed prob) | adverse-selection spread |
| `kyle` | dict (v, u, x, y, price) | `sigma_v / sigma_u` | strategic informed trading |
| `limit_order_book` | dict (mid, spread, depth) | `lam0`, `theta`, `mu_mkt` | LOB dynamics, depth, impact |
| `almgren_chriss` | dict (path, costs) | `lam` (risk aversion) | execution trajectories |
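
As an illustration of one row, the bid-ask bounce behind `roll_bounce` can be sketched in standard-library Python. This is a simplified version of the Roll (1984) setup and not the package's code: the function names, the random-walk efficient price, and the values of `half_spread` and `sigma` are assumptions chosen for the example.

```python
import math
import random


def roll_trade_returns(n, half_spread=0.0005, sigma=0.0005, seed=0):
    """Log returns of trade prices: random-walk mid plus a random bid/ask side."""
    rng = random.Random(seed)
    mid = 0.0
    prev = None
    returns = []
    for _ in range(n + 1):
        mid += sigma * rng.gauss(0.0, 1.0)   # efficient (log) price
        side = rng.choice((-1.0, 1.0))       # trade hits the bid (-1) or ask (+1)
        trade = mid + half_spread * side
        if prev is not None:
            returns.append(trade - prev)
        prev = trade
    return returns


def autocov(x, lag=1):
    """Sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n


r = roll_trade_returns(20000)
ac1 = autocov(r, 1) / autocov(r, 0)
# Roll's estimator: the half-spread is the square root of minus the lag-1
# autocovariance (clipped at zero in case the sample autocovariance is positive).
implied_half_spread = math.sqrt(max(0.0, -autocov(r, 1)))
```

In this setup the lag-1 autocovariance of trade-price returns is `-half_spread**2`, so the lag-1 autocorrelation is `-half_spread**2 / (sigma**2 + 2 * half_spread**2)`, which equals -1/3 when the two parameters are equal. The efficient price itself has uncorrelated increments; the negative AC(1) comes entirely from trades alternating between bid and ask.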

These are **teaching models**: they are not calibrated to market data. Planned additions: a
cost-aware `backtest` and leak-free `features` helpers. The package will evolve with the course.
