Metadata-Version: 2.4
Name: pyscarcopula
Version: 0.3.1
Summary: Stochastic copula models with Ornstein-Uhlenbeck latent process
Author-email: Alexey Novokhatskiy <aanovokhatskiy@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/AANovokhatskiy/pyscarcopula
Project-URL: Repository, https://github.com/AANovokhatskiy/pyscarcopula
Project-URL: Issues, https://github.com/AANovokhatskiy/pyscarcopula/issues
Keywords: copula,stochastic,SCAR,vine copula,transfer matrix,risk management,VaR,CVaR,time-varying dependence
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Office/Business :: Financial
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: numpy>=1.22
Requires-Dist: numba>=0.56
Requires-Dist: scipy>=1.9
Requires-Dist: joblib>=1.0
Requires-Dist: tqdm>=4.0
Provides-Extra: test
Requires-Dist: pytest>=7.0; extra == "test"
Dynamic: license-file

# pyscarcopula

Stochastic copula models with Ornstein-Uhlenbeck latent process in Python.

- [About](#about)
- [Install](#install)
- [Features](#features)
- [Mathematical background](#mathematical-background)
- [Examples](#examples)
- [Performance tuning](#performance-tuning)
- [Architecture](#architecture)
- [License](#license)

## About

**pyscarcopula** fits multivariate distributions using the copula approach with time-varying dependence. The classical constant-parameter model is extended to a stochastic model where the copula parameter follows an Ornstein-Uhlenbeck process.

For parameter estimation we provide five methods:

| Method | Key | Description |
|---|---|---|
| Maximum likelihood | `mle` | Classical fit with constant copula parameter |
| MC p-sampler | `scar-p-ou` | Monte Carlo without importance sampling |
| MC m-sampler | `scar-m-ou` | Monte Carlo with efficient importance sampling (EIS) |
| **Transfer matrix** | **`scar-tm-ou`** | **Deterministic quadrature on a grid — no MC variance or bias** |
| GAS | `gas` | Generalized autoregressive score (observation-driven, deterministic) |

The transfer matrix method exploits the Markov structure and known Gaussian transition density of the OU process to evaluate the likelihood function as a sequence of matrix-vector products. The implementation automatically selects between dense and sparse transfer matrices depending on the kernel bandwidth, and adaptively refines the grid to resolve the transition kernel.

## Install

```bash
pip install pyscarcopula
```

For development (includes data files and tests):

```bash
git clone https://github.com/AANovokhatskiy/pyscarcopula
cd pyscarcopula
pip install -e ".[test]"
pytest tests/
```

**Dependencies:** numpy, numba, scipy, joblib, tqdm.

## Features

**Copula families**
- Archimedean: Gumbel, Frank, Clayton, Joe (with rotations 0°/90°/180°/270°)
- Elliptical: Gaussian, Student-t (MLE only)
- Independence copula (null model for automatic vine pruning)
- Equicorrelation Gaussian (single dynamic correlation for d assets)
- C-vine pair copula construction

**Estimation methods**
- MLE — constant copula parameter
- SCAR-TM-OU — transfer matrix with analytical gradient (recommended)
- GAS — observation-driven score model
- SCAR-P-OU / SCAR-M-OU — Monte Carlo alternatives

**Vine copulas**
- Automatic copula family and rotation selection per edge (AIC/BIC)
- Automatic pruning of weak edges via independence copula baseline
- Tree-level and edge-level truncation for scalability

**Diagnostics and risk**
- Goodness-of-fit via Rosenblatt transform + Cramér–von Mises test
- Mixture Rosenblatt transform for stochastic models
- Smoothed time-varying copula parameter
- VaR / CVaR via `pyscarcopula.contrib` (rolling window, marginals, portfolio optimization)

## Mathematical background

### Copula models

By Sklar's theorem, any joint distribution can be decomposed as

$$F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d))$$

We focus on single-parameter Archimedean copulas defined via a generator φ(t; θ):

$$C(u_1, \ldots, u_d) = \phi^{-1}(\phi(u_1; \theta) + \cdots + \phi(u_d; \theta))$$

| Copula  | Generator | Inverse generator | Domain |
|---------|-----------|-------------------|--------|
| Gumbel  | (-log t)^θ | exp(-t^(1/θ)) | θ ∈ [1, ∞) |
| Frank   | -log((e^(-θt) - 1)/(e^(-θ) - 1)) | -(1/θ)log(1 + e^(-t)(e^(-θ) - 1)) | θ ∈ (0, ∞) |
| Joe     | -log(1 - (1-t)^θ) | 1 - (1 - e^(-t))^(1/θ) | θ ∈ [1, ∞) |
| Clayton | (1/θ)(t^(-θ) - 1) | (1 + tθ)^(-1/θ) | θ ∈ (0, ∞) |

### Stochastic copula (SCAR)

In the stochastic model the copula parameter is driven by a latent Ornstein-Uhlenbeck process:

$$\theta_t = \Psi(x_t), \qquad dx_t = \theta_{\text{OU}}(\mu - x_t)\,dt + \nu\,dW_t$$

where Ψ maps the OU state to the copula parameter domain. The likelihood function is an integral over all latent paths:

$$L = \int \prod_{t} c(u_{1t}, u_{2t}; \Psi(x_t))\;p(x_t | x_{t-1})\;dx_0 \cdots dx_T$$

### Transfer matrix method

The Markov property allows this high-dimensional integral to be factored into a chain of one-dimensional integrals, each computed as a matrix-vector product on a discretized grid. Total complexity: O(TK²) (dense) or O(TKb) (sparse, where b is the kernel bandwidth).

### Goodness of fit

Model quality is assessed via the Rosenblatt transform. For the stochastic model we use the **mixture Rosenblatt transform**, which integrates the h-function over the predictive distribution of the latent state, avoiding the Jensen bias from plugging in a point estimate. Uniformity of the transformed sample is tested with the Cramér–von Mises statistic.


## Examples

### 1. Read dataset

```python
import pandas as pd
import numpy as np
from pyscarcopula._utils import pobs
from pyscarcopula import GumbelCopula, CVineCopula, GaussianCopula, StudentCopula
from pyscarcopula.api import fit, smoothed_params
from pyscarcopula.stattests import gof_test

crypto_prices = pd.read_csv("data/crypto_prices.csv", index_col=0, sep=';')
tickers = ['BTC-USD', 'ETH-USD']

returns = np.log(crypto_prices[tickers] / crypto_prices[tickers].shift(1))[1:].values
u = pobs(returns)
```

### 2. Fit a bivariate copula

```python
copula = GumbelCopula(rotate=180)

result_mle = fit(copula, u, method='mle')
result_tm  = fit(copula, u, method='scar-tm-ou')
result_gas = fit(copula, u, method='gas')

# Typed results — access parameters directly
print(f"MLE:     logL={result_mle.log_likelihood:.2f}, r={result_mle.copula_param:.4f}")
print(f"SCAR-TM: logL={result_tm.log_likelihood:.2f}, theta={result_tm.params.theta:.2f}")
print(f"GAS:     logL={result_gas.log_likelihood:.2f}, beta={result_gas.beta:.4f}")

# GoF test
gof = gof_test(copula, u, fit_result=result_tm, to_pobs=False)
print(f"GoF p-value: {gof.pvalue:.4f}")
```

Results on daily BTC-ETH data (T = 1460):

| Model | logL | GoF p-value |
|-------|------|-------------|
| MLE | 955.63 | 0.0087 |
| GAS | 1031.42 | 0.5282 |
| **SCAR-TM** | **1042.47** | **0.6201** |

### 3. Smoothed copula parameter

```python
r_t = smoothed_params(copula, u, result_tm)
# r_t[k] = E[Psi(x_k) | u_{1:k-1}]
```

### 4. Sample from copula

```python
samples = copula.sample(n=1000, r=result_mle.copula_param)
```

### 5. Fit a multivariate C-vine copula

```python
tickers = ['BTC-USD', 'ETH-USD', 'BNB-USD', 'ADA-USD', 'XRP-USD', 'DOGE-USD']
returns_6d = np.log(crypto_prices[tickers] / crypto_prices[tickers].shift(1))[1:251].values
u6 = pobs(returns_6d)

vine = CVineCopula()
vine.fit(u6, method='scar-tm-ou',
         truncation_level=2, min_edge_logL=10)
vine.summary()

gof_vine = gof_test(vine, u6, to_pobs=False)
```

Results on 6-dimensional crypto data (T = 250):

| Model | logL | GoF p-value |
|-------|------|-------------|
| **C-vine SCAR-TM** | **921.9** | **0.90** |
| C-vine MLE | 869.2 | 0.21 |
| Student-t | 764.4 | 0.00 |

### 6. Risk metrics (VaR / CVaR)

```python
from pyscarcopula.contrib.risk_metrics import risk_metrics

result = risk_metrics(
    GumbelCopula(rotate=180),
    returns_6d, window_len=100,
    gamma=[0.95], N_mc=[100_000],
    marginals_method='johnsonsu',
    method='mle',
    optimize_portfolio=False,
    portfolio_weight=np.ones(6) / 6,
    n_jobs=-1,
)

var = result[0.95][100_000]['var']
cvar = result[0.95][100_000]['cvar']
```

See `example_new_api.ipynb` for a complete walkthrough with plots.

## Performance tuning

### Bivariate copula

```python
copula = GumbelCopula(rotate=180)
result = fit(copula, u, method='scar-tm-ou')

# Relaxed tolerance (faster, slight logL loss)
result = fit(copula, u, method='scar-tm-ou', tol=5e-2)
```

| Parameter | Default | Effect |
|---|---|---|
| `analytical_grad` | `True` | Analytical gradient. ~3-4x fewer function evaluations. |
| `smart_init` | `True` | Heuristic initial point. Up to 5x speedup on long series. |
| `tol` | `1e-2` | Gradient tolerance. `5e-2` is ~2x faster with negligible logL loss. |
| `K` | `300` | Minimum grid size. The adaptive rule may increase it. |
| `pts_per_sigma` | `2` | Grid density: points per conditional standard deviation. Higher values improve quadrature accuracy at the cost of larger K. |

### Vine copula

```python
vine = CVineCopula()
vine.fit(u, method='scar-tm-ou', truncation_level=2, min_edge_logL=10)
```

| Parameter | Default | Effect |
|---|---|---|
| `truncation_level` | `None` | Trees >= this level stay MLE. Recommended: 2-3 for d > 10. |
| `min_edge_logL` | `None` | Edges with MLE logL below threshold stay MLE. Recommended: 5-10. |


## Architecture

The codebase is organized in layers with top-down dependencies:

| Layer | Directory | Responsibility |
|-------|-----------|----------------|
| API | `api.py` | Entry points: `fit()`, `smoothed_params()`, `mixture_h()` |
| Strategy | `strategy/` | Estimation methods: MLE, SCAR-TM, GAS |
| Copula | `copula/` | Pure math: PDF, h-functions, transforms |
| Numerical | `numerical/` | TM grid, gradient, MC samplers, OU kernels |
| Types | `_types.py`, `_utils.py` | Typed results, config, shared utilities |

See [ARCHITECTURE.md](ARCHITECTURE.md) for the full module map.

## License

MIT License. See [LICENSE.txt](LICENSE.txt).
