Metadata-Version: 2.4
Name: insurance-copula
Version: 0.1.0
Summary: Vine copulas for multi-peril home insurance pricing — exposure-weighted dependence modelling
Project-URL: Homepage, https://github.com/burning-cost/insurance-copula
Project-URL: Repository, https://github.com/burning-cost/insurance-copula
Author-email: Burning Cost <pricing.frontier@gmail.com>
License: MIT
Keywords: actuarial,copula,dependence,insurance,pricing,vine
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Requires-Dist: matplotlib>=3.5
Requires-Dist: numpy>=1.21
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.10
Provides-Extra: dev
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: pyvinecopulib>=0.7; extra == 'dev'
Provides-Extra: vine
Requires-Dist: pyvinecopulib>=0.7; extra == 'vine'
Description-Content-Type: text/markdown

# insurance-copula

Vine copulas for multi-peril home insurance pricing.

## The problem

UK home insurance portfolios have a pricing methodology problem. Perils — flood, subsidence, storm, escape of water, fire, theft — are typically modelled independently. Each peril gets its own frequency/severity model, and the components are added up. This is tractable and auditable, but it ignores the fact that these perils are correlated.

Flood and escape of water spike together in wet winters. Storm and flood co-move. Subsidence clusters with drought. Dependence does not change the expected total loss, but it fattens the tail of the aggregate distribution: when multiple perils hit at once, the combined loss lands much further out than an independence assumption predicts. If you assume independence, you systematically understate the scenarios that matter most for reserving and capital.
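
A small simulation can make the effect of dependence on the tail concrete. The sketch below is not this library's method: it joins two gamma marginals with a plain Gaussian copula (a stand-in for a fitted vine), and the correlation of 0.6 and the marginal parameters are illustrative assumptions only.

```python
import numpy as np
from scipy.stats import gamma, norm


def simulate_two_peril_total(rho, n=200_000, seed=0):
    """Total loss of two gamma perils joined by a Gaussian copula with correlation rho."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    u = norm.cdf(z)  # uniform margins; the dependence structure is preserved
    flood = gamma(a=0.8, scale=5_000).ppf(u[:, 0])  # illustrative marginal
    storm = gamma(a=1.2, scale=800).ppf(u[:, 1])    # illustrative marginal
    return flood + storm


independent = simulate_two_peril_total(rho=0.0)
dependent = simulate_two_peril_total(rho=0.6)

for label, total in [("independent", independent), ("rho=0.6", dependent)]:
    q = np.quantile(total, 0.995)
    print(f"{label:>12}: mean={total.mean():,.0f}  99.5% quantile={q:,.0f}")
```

The two means should agree up to simulation noise, while the 99.5% quantile is higher under dependence.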

The standard actuarial fix is copulas. Vine copulas are the right tool for multi-dimensional loss dependence, but the available libraries have a steep learning curve and need insurance-specific wrapping to be useful: exposure weighting, pseudo-observation transforms, marginal back-transformation, and aggregate loss simulation for capital modelling.

This library does that wrapping.

## What it does

- Fits a vine copula to multi-peril loss data using `pyvinecopulib`
- Supports exposure-weighted fitting (weight by days at risk or earned premium)
- Simulates from the fitted copula in both uniform and loss space
- Estimates conditional expectations: E[flood | storm > threshold]
- Simulates aggregate portfolio loss distributions for PML/SCR calculation
- Serialises fitted models to JSON for production deployment

## Installation

```bash
pip install "insurance-copula[vine]"
```

The `[vine]` extra installs `pyvinecopulib`. Pre-built wheels are available for x86_64 Linux, macOS, and Windows. ARM64 Linux requires building from source.
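
To check whether the optional backend is present before fitting, something like the following can be used. It relies only on the standard library; the helper name `vine_backend_available` is hypothetical and not part of this package.

```python
import importlib.util


def vine_backend_available() -> bool:
    """Return True if pyvinecopulib can be found in the current environment."""
    return importlib.util.find_spec("pyvinecopulib") is not None


if not vine_backend_available():
    print('pyvinecopulib not found; try: pip install "insurance-copula[vine]"')
```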

## Quick start

```python
from insurance_copula import PerilVine
from scipy.stats import gamma

# Fit the vine.
# losses_df and earned_premium_series are your own data; they are not defined here.
vine = PerilVine(perils=['flood', 'subsidence', 'storm', 'eow'])
vine.fit(losses_df, exposure=earned_premium_series)

# Simulate losses
marginals = {
    'flood':      gamma(a=0.8, scale=5_000),
    'subsidence': gamma(a=0.6, scale=3_000),
    'storm':      gamma(a=1.2, scale=800),
    'eow':        gamma(a=1.5, scale=600),
}
samples = vine.simulate(10_000, marginals=marginals, seed=42)

# Portfolio aggregate loss (Solvency II SCR)
result = vine.aggregate_loss(n_sim=10_000, marginals=marginals, n_policies=50_000, seed=42)
print(f"PML 1-in-200: £{result.pml_1_in_200:,.0f}")

# Conditional expectation
flood_given_storm = vine.conditional_expectation(
    target_peril='flood',
    condition_peril='storm',
    threshold=marginals['storm'].ppf(0.9),
    marginals=marginals,
)

# Model diagnostics
diag = vine.diagnostics()
print(diag.kendall_tau_matrix)

# Serialise for production
json_str = vine.to_json()
restored = PerilVine.from_json(json_str)
```

## Design choices

**pyvinecopulib over statsmodels or copulae**: C++ backend, proper vine structure selection, exposure weights natively supported. The alternatives are either slower or don't support R-vine structure selection.

**Nonparametric PIT by default**: We use empirical CDF ranks rather than assuming a parametric marginal family. Ranks are invariant to monotone transformations of the data, so the copula fit is not distorted by heavy tails or by a misspecified marginal. The marginals are specified separately, and only when you want loss-space simulation.
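
As a rough illustration of the rank transform, the sketch below maps each loss column to pseudo-observations in (0, 1) using rank / (n + 1). The function name, the toy data, and the choice of average ranks for ties are assumptions for this example; the library's internal transform (including any exposure weighting) may differ.

```python
import numpy as np
import pandas as pd


def pseudo_observations(losses: pd.DataFrame) -> pd.DataFrame:
    """Map each column to (0, 1) via rank / (n + 1); ties get average ranks."""
    n = len(losses)
    return losses.rank(method="average") / (n + 1)


# Toy data, purely illustrative.
losses = pd.DataFrame({
    "flood": [0.0, 120.0, 9_500.0, 40.0, 310.0],
    "storm": [15.0, 0.0, 2_200.0, 60.0, 75.0],
})

u = pseudo_observations(losses)
u_log = pseudo_observations(np.log1p(losses))  # a monotone transform of the same data

print(u)
print("unchanged by log1p:", np.allclose(u, u_log))
```

Because only ranks enter, the single 9,500 flood loss has no more influence on the result than any other largest observation would.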

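**trunc_lvl=3 default**: A vine on d perils has d - 1 trees, so with 4 perils the default is still a full vine and truncation only takes effect for 5 or 6 perils. In those cases a full vine overfits on typical insurance datasets (a few hundred observations). Truncation at tree 3 keeps the most important pair dependencies without fitting noise in the higher-order conditional structures.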
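
The effect of truncation on model size can be counted directly: in a vine on d variables, tree t contains d - t pair copulas, and there are d - 1 trees in total. The helper below is a hypothetical illustration, not a function of this library.

```python
from typing import Optional


def n_pair_copulas(d: int, trunc_lvl: Optional[int] = None) -> int:
    """Number of pair copulas in a vine on d variables, keeping the first trunc_lvl trees."""
    n_trees = d - 1 if trunc_lvl is None else min(trunc_lvl, d - 1)
    # Tree t (1-based) contains d - t pair copulas.
    return sum(d - t for t in range(1, n_trees + 1))


for d in (4, 5, 6):
    full = n_pair_copulas(d)
    truncated = n_pair_copulas(d, trunc_lvl=3)
    print(f"{d} perils: full vine = {full}, trunc_lvl=3 = {truncated}")
```

For 4, 5 and 6 perils this gives 6, 10 and 15 pair copulas in the full vine, and 6, 9 and 12 after truncation at tree 3.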

**Accept-reject for conditional expectation**: Simpler than analytical conditioning via D-vine inverse Rosenblatt, works for any vine structure, and gives accurate results when n_samples is large enough. We warn when fewer than 500 samples pass the filter.
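
The accept-reject idea can be sketched in a few lines: simulate jointly, keep only the samples where the conditioning peril exceeds the threshold, and average the target peril over what remains. The example below is an assumption-laden illustration rather than the library's implementation: it draws from a Gaussian copula instead of a fitted vine, and the function name, correlation, and marginals are hypothetical.

```python
import warnings

import numpy as np
from scipy.stats import gamma, norm


def conditional_mean_accept_reject(target, condition, threshold, min_accepted=500):
    """Estimate E[target | condition > threshold] from joint samples."""
    keep = condition > threshold
    n_kept = int(keep.sum())
    if n_kept < min_accepted:
        warnings.warn(f"only {n_kept} samples passed the filter; estimate may be noisy")
    return target[keep].mean(), n_kept


# Joint samples from a Gaussian copula (stand-in for a fitted vine).
rng = np.random.default_rng(1)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=100_000)
u = norm.cdf(z)

flood = gamma(a=0.8, scale=5_000).ppf(u[:, 0])
storm_marginal = gamma(a=1.2, scale=800)
storm = storm_marginal.ppf(u[:, 1])

threshold = storm_marginal.ppf(0.9)  # condition on storm being in its top 10%
cond_mean, n_kept = conditional_mean_accept_reject(flood, storm, threshold)

print(f"accepted samples: {n_kept}")
print(f"E[flood]: {flood.mean():,.0f}   E[flood | storm > threshold]: {cond_mean:,.0f}")
```

Conditioning on a 1-in-10 event discards about 90% of the draws, which is why the number of accepted samples, not the number simulated, determines the accuracy of the estimate.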

## Notebook demo

See `notebooks/demo_multi_peril_vine.ipynb` for a full worked example on synthetic UK home insurance data.

## Requirements

- Python >= 3.10
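- pyvinecopulib >= 0.7, installed via the `[vine]` extra (pre-built wheels for x86_64 Linux, macOS, and Windows; ARM64 Linux requires a source build)
- numpy >= 1.21, pandas >= 2.0, scipy >= 1.10, matplotlib >= 3.5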
