insurance_credibility.classical

Classical Bühlmann-Straub credibility models for non-life insurance pricing.

Implements the Bühlmann-Straub (1970) credibility model and its hierarchical extension (Jewell, 1975). The standard tool for blending group-level loss experience with the portfolio mean, weighted by exposure.

Quick start::

import polars as pl
from insurance_credibility.classical import BuhlmannStraub

bs = BuhlmannStraub()
bs.fit(df, group_col="scheme", period_col="year",
       loss_col="loss_rate", weight_col="exposure")
bs.summary()
bs.z_        # credibility factors by scheme
bs.premiums_ # full results DataFrame

For hierarchical multi-level structures::

from insurance_credibility.classical import HierarchicalBuhlmannStraub

model = HierarchicalBuhlmannStraub(level_cols=["region", "district", "sector"])
model.fit(df, period_col="year", loss_col="loss_rate", weight_col="exposure")
model.premiums_at("sector")
 1"""
 2Classical Bühlmann-Straub credibility models for non-life insurance pricing.
 3
 4Implements the Bühlmann-Straub (1970) credibility model and its hierarchical
 5extension (Jewell, 1975). The standard tool for blending group-level loss
 6experience with the portfolio mean, weighted by exposure.
 7
 8Quick start::
 9
10    import polars as pl
11    from insurance_credibility.classical import BuhlmannStraub
12
13    bs = BuhlmannStraub()
14    bs.fit(df, group_col="scheme", period_col="year",
15           loss_col="loss_rate", weight_col="exposure")
16    bs.summary()
17    bs.z_        # credibility factors by scheme
18    bs.premiums_ # full results DataFrame
19
20For hierarchical multi-level structures::
21
22    from insurance_credibility.classical import HierarchicalBuhlmannStraub
23
24    model = HierarchicalBuhlmannStraub(level_cols=["region", "district", "sector"])
25    model.fit(df, period_col="year", loss_col="loss_rate", weight_col="exposure")
26    model.premiums_at("sector")
27"""
28
29from .buhlmann_straub import BuhlmannStraub
30from .hierarchical import HierarchicalBuhlmannStraub, LevelResult
31
32__all__ = [
33    "BuhlmannStraub",
34    "HierarchicalBuhlmannStraub",
35    "LevelResult",
36]
class BuhlmannStraub:
 59class BuhlmannStraub:
 60    """
 61    Bühlmann-Straub credibility model with non-parametric structural parameter
 62    estimation.
 63
 64    Fit the model to a panel dataset - one row per (group, period) - where
 65    each row has a loss rate and an exposure weight. The model estimates the
 66    structural parameters v (within-group variance) and a (between-group
 67    variance) from the data itself, then computes a credibility factor Z_i
 68    for each group.
 69
 70    The result is a credibility premium for each group that blends its own
 71    weighted-average loss rate with the portfolio mean, with the blend governed
 72    entirely by the ratio K = v/a and the group's total exposure.
 73
 74    Parameters
 75    ----------
 76    truncate_a : bool, default True
 77        If True (the default), truncate a_hat at zero when the between-group
 78        variance estimate is negative. This sets K → ∞ and Z_i → 0 for all
 79        groups, so every group gets the collective mean. If False, a negative
 80        a_hat raises an error instead.
 81
 82    Examples
 83    --------
 84    >>> import polars as pl
 85    >>> from insurance_credibility import BuhlmannStraub
 86    >>>
 87    >>> df = pl.DataFrame({
 88    ...     "scheme": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
 89    ...     "year":   [2021, 2022, 2023] * 3,
 90    ...     "loss_rate": [0.55, 0.60, 0.58, 0.80, 0.75, 0.82, 0.40, 0.42, 0.38],
 91    ...     "exposure": [1000, 1200, 1100, 300, 350, 320, 5000, 4800, 5200],
 92    ... })
 93    >>> bs = BuhlmannStraub()
 94    >>> bs.fit(df, group_col="scheme", period_col="year",
 95    ...        loss_col="loss_rate", weight_col="exposure")
 96    >>> bs.summary()
 97    """
 98
 99    def __init__(self, truncate_a: bool = True) -> None:
100        self.truncate_a = truncate_a
101
102        # Fitted attributes - set by .fit()
103        self._mu_hat: Optional[float] = None
104        self._v_hat: Optional[float] = None
105        self._a_hat: Optional[float] = None
106        self._k: Optional[float] = None
107        self._z: Optional[pl.DataFrame] = None
108        self._premiums: Optional[pl.DataFrame] = None
109        self._fitted = False
110
111    # ------------------------------------------------------------------
112    # Scikit-learn style fit
113    # ------------------------------------------------------------------
114
115    def fit(
116        self,
117        data: Union[pl.DataFrame, "pd.DataFrame"],  # type: ignore[name-defined]
118        group_col: str = "group",
119        period_col: str = "period",
120        loss_col: str = "loss",
121        weight_col: str = "weight",
122    ) -> "BuhlmannStraub":
123        """
124        Estimate structural parameters and credibility factors from panel data.
125
126        Parameters
127        ----------
128        data:
129            A Polars DataFrame (preferred) or pandas DataFrame (converted
130            internally) with one row per (group, period). Columns need not be
131            in any particular order.
132        group_col:
133            Column identifying the group (e.g. scheme ID, territory, NCD class).
134        period_col:
135            Column identifying the time period (e.g. accident year). Used only
136            to count periods per group; the ordering does not matter.
137        loss_col:
138            Column containing the loss rate or loss ratio. This should be losses
139            per unit of exposure - not total losses. If you have total losses,
140            divide by the exposure column before fitting.
141        weight_col:
142            Column containing the exposure weight (e.g. earned car years,
143            earned premium, policy count). Must be strictly positive.
144
145        Returns
146        -------
147        self
148            Returns the fitted estimator so calls can be chained.
149        """
150        data = _to_polars(data)
151
152        validate_panel_data(data, group_col, period_col, loss_col, weight_col)
153        check_duplicate_periods(data, group_col, period_col)
154
155        self._group_col = group_col
156        self._period_col = period_col
157        self._loss_col = loss_col
158        self._weight_col = weight_col
159
160        # Build group-level summary: w_i, X_bar_i, T_i
161        groups = self._build_group_summary(data, group_col, period_col, loss_col, weight_col)
162        self._groups = groups
163
164        # Estimate structural parameters
165        mu_hat, v_hat, a_hat_raw = self._estimate_structural_params(
166            data, groups, group_col, period_col, loss_col, weight_col
167        )
168
169        # Handle negative a_hat
170        if a_hat_raw <= 0:
171            if self.truncate_a:
172                warnings.warn(
173                    f"Between-group variance estimate a_hat = {a_hat_raw:.6g} ≤ 0. "
174                    "Truncating to zero. This means the model finds no detectable "
175                    "heterogeneity between groups - all groups will receive the "
176                    "collective mean as their credibility premium (Z_i = 0). "
177                    "Consider whether your data genuinely lacks between-group "
178                    "variation, or whether you have too few groups to estimate a reliably.",
179                    stacklevel=2,
180                )
181                a_hat = 0.0
182                k = np.inf
183            else:
184                raise ValueError(
185                    f"Between-group variance estimate a_hat = {a_hat_raw:.6g} ≤ 0. "
186                    "Set truncate_a=True to handle this automatically."
187                )
188        else:
189            a_hat = a_hat_raw
190            k = v_hat / a_hat
191
192        # Credibility factors
193        w = groups["w_i"].to_numpy()
194        if np.isinf(k):
195            z_values = np.zeros(len(groups))
196        else:
197            z_values = w / (w + k)
198
199        group_ids = groups["group"].to_list()
200        x_bar = groups["x_bar_i"].to_numpy()
201
202        # z_ is a Polars DataFrame with columns ["group", "Z"]
203        z_df = pl.DataFrame({
204            "group": group_ids,
205            "Z": z_values,
206        })
207
208        # premiums_ is a Polars DataFrame with a "group" column (no index)
209        premiums = pl.DataFrame({
210            "group": group_ids,
211            "exposure": w,
212            "observed_mean": x_bar,
213            "Z": z_values,
214            "credibility_premium": z_values * x_bar + (1 - z_values) * mu_hat,
215            "complement": np.full(len(groups), mu_hat),
216        })
217
218        self._mu_hat = float(mu_hat)
219        self._v_hat = float(v_hat)
220        self._a_hat = float(a_hat)
221        self._a_hat_raw = float(a_hat_raw)
222        self._k = float(k) if not np.isinf(k) else np.inf
223        self._z = z_df
224        self._premiums = premiums
225        self._fitted = True
226
227        return self
228
229    # ------------------------------------------------------------------
230    # Structural parameter properties
231    # ------------------------------------------------------------------
232
233    @property
234    def mu_hat_(self) -> float:
235        """
236        Collective mean - the grand weighted average loss rate across all groups.
237
238        This is the complement: what every group's credibility premium regresses
239        toward as exposure shrinks.
240        """
241        self._check_fitted()
242        return self._mu_hat  # type: ignore[return-value]
243
244    @property
245    def v_hat_(self) -> float:
246        """
247        EPV - Expected value of Process Variance.
248
249        Measures within-group year-to-year volatility, averaged across groups.
250        High v_hat means individual group experience is noisy - lean on the
251        collective mean.
252        """
253        self._check_fitted()
254        return self._v_hat  # type: ignore[return-value]
255
256    @property
257    def a_hat_(self) -> float:
258        """
259        VHM - Variance of Hypothetical Means.
260
261        Measures how much the true underlying loss rates differ between groups.
262        High a_hat means groups are genuinely heterogeneous - trust their own
263        experience. Truncated at zero if truncate_a=True.
264        """
265        self._check_fitted()
266        return self._a_hat  # type: ignore[return-value]
267
268    @property
269    def k_(self) -> float:
270        """
271        Bühlmann's k - the credibility parameter.
272
273        k = v / a. This is the exposure a group needs to achieve Z = 0.5 (equal
274        weight on own experience and collective mean). A group with exposure w_i
275        has Z_i = w_i / (w_i + k).
276
277        Returns np.inf when a_hat is zero (all groups get the collective mean).
278        """
279        self._check_fitted()
280        return self._k  # type: ignore[return-value]
281
282    @property
283    def z_(self) -> pl.DataFrame:
284        """
285        Credibility factors - a Polars DataFrame with columns ["group", "Z"].
286
287        Z_i = w_i / (w_i + k), ranging from 0 (no credibility, use collective
288        mean) to 1 (full credibility, use group's own experience).
289
290        To look up a specific group::
291
292            bs.z_.filter(pl.col("group") == "Motor Guild")["Z"][0]
293
294        To find the group with the highest Z::
295
296            bs.z_.sort("Z", descending=True)["group"][0]
297        """
298        self._check_fitted()
299        return self._z  # type: ignore[return-value]
300
301    @property
302    def premiums_(self) -> pl.DataFrame:
303        """
304        Credibility premiums - one row per group, as a Polars DataFrame.
305
306        Columns:
307
308        - ``group``: group identifier
309        - ``exposure``: total exposure weight for the group
310        - ``observed_mean``: exposure-weighted average loss rate
311        - ``Z``: credibility factor
312        - ``credibility_premium``: Z * observed_mean + (1 - Z) * mu_hat
313        - ``complement``: mu_hat (the collective mean complement)
314        """
315        self._check_fitted()
316        return self._premiums  # type: ignore[return-value]
317
318    # ------------------------------------------------------------------
319    # Output
320    # ------------------------------------------------------------------
321
322    def summary(self) -> pl.DataFrame:
323        """
324        Return a formatted summary of the fit.
325
326        The table shows structural parameters at the top and per-group credibility
327        results below, formatted for an actuarial audience.
328
329        Returns
330        -------
331        pl.DataFrame
332            Per-group results. Structural parameters are printed to stdout.
333        """
334        self._check_fitted()
335
336        print("Bühlmann-Straub Credibility Model")
337        print("=" * 42)
338        print(f"  Collective mean    mu  = {self._mu_hat:.6g}")
339        print(f"  Process variance   v   = {self._v_hat:.6g}   (EPV, within-group)")
340        print(f"  Between-group var  a   = {self._a_hat:.6g}   (VHM, between-group)")
341        if np.isinf(self._k):
342            print(f"  Credibility param  k   = inf   (a ≤ 0, all Z = 0)")
343        else:
344            print(f"  Credibility param  k   = {self._k:.6g}   (v / a)")
345        print()
346        print("  Interpretation: a group needs exposure = k to achieve Z = 0.50")
347        if self._a_hat_raw <= 0:
348            print(f"  (raw a_hat before truncation: {self._a_hat_raw:.6g})")
349        print()
350
351        tbl = self._premiums.rename({
352            "exposure": "Exposure",
353            "observed_mean": "Obs. Mean",
354            "Z": "Z",
355            "credibility_premium": "Cred. Premium",
356            "complement": "Complement",
357        })
358        return tbl
359
360    # ------------------------------------------------------------------
361    # Internal helpers
362    # ------------------------------------------------------------------
363
364    @staticmethod
365    def _build_group_summary(
366        data: pl.DataFrame,
367        group_col: str,
368        period_col: str,
369        loss_col: str,
370        weight_col: str,
371    ) -> pl.DataFrame:
372        """
373        Compute per-group summary statistics needed for parameter estimation.
374
375        Returns a Polars DataFrame with columns:
376            group   : group identifier (same type as group_col)
377            w_i     : total exposure (sum of weights)
378            x_bar_i : exposure-weighted mean loss rate
379            T_i     : number of periods
380        """
381        groups = (
382            data.group_by(group_col)
383            .agg([
384                pl.col(weight_col).sum().alias("w_i"),
385                (
386                    (pl.col(weight_col) * pl.col(loss_col)).sum()
387                    / pl.col(weight_col).sum()
388                ).alias("x_bar_i"),
389                pl.len().alias("T_i"),
390            ])
391            .rename({group_col: "group"})
392            .sort("group")
393        )
394        return groups
395
396    @staticmethod
397    def _estimate_structural_params(
398        data: pl.DataFrame,
399        groups: pl.DataFrame,
400        group_col: str,
401        period_col: str,
402        loss_col: str,
403        weight_col: str,
404    ) -> tuple[float, float, float]:
405        """
406        Compute unbiased estimates of mu, v, and a.
407
408        These are the standard Bühlmann-Straub (1970) non-parametric estimators,
409        as presented in Bühlmann & Gisler (2005), Chapter 4.
410
411        Parameters
412        ----------
413        data:
414            Full panel data (Polars DataFrame).
415        groups:
416            Per-group summary from _build_group_summary.
417
418        Returns
419        -------
420        mu_hat, v_hat, a_hat_raw
421            a_hat_raw may be negative - the caller decides how to handle it.
422        """
423        w = groups["w_i"].to_numpy()
424        x_bar = groups["x_bar_i"].to_numpy()
425        T = groups["T_i"].to_numpy()
426        group_ids = groups["group"].to_list()
427        r = len(groups)
428
429        # Collective mean: grand weighted average
430        w_total = w.sum()
431        mu_hat = (w * x_bar).sum() / w_total
432
433        # v_hat: within-group weighted mean squared deviation
434        # v_hat = (1 / sum_i(T_i - 1)) * sum_i sum_j [ w_{ij} * (X_{ij} - X_bar_i)^2 ]
435        denom_v = (T - 1).sum()  # degrees of freedom for v
436
437        # Build a lookup from group id → x_bar_i for the within-group deviation
438        x_bar_map = dict(zip(group_ids, x_bar))
439
440        # Join x_bar_i into the full data, then compute the weighted squared deviation
441        # We use a join rather than a Python loop for efficiency
442        x_bar_lf = pl.DataFrame({
443            group_col: group_ids,
444            "_x_bar_i": x_bar,
445        }).lazy()
446
447        numerator_v = (
448            data.lazy()
449            .join(x_bar_lf, on=group_col, how="left")
450            .with_columns(
451                (
452                    pl.col(weight_col) * (pl.col(loss_col) - pl.col("_x_bar_i")) ** 2
453                ).alias("_sq_dev")
454            )
455            .select(pl.col("_sq_dev").sum())
456            .collect()
457            ["_sq_dev"][0]
458        )
459
460        v_hat = float(numerator_v) / denom_v if denom_v > 0 else 0.0
461
462        # a_hat: between-group variance
463        # c = w_total - sum_i(w_i^2) / w_total
464        # s2 = sum_i [ w_i * (x_bar_i - mu_hat)^2 ]
465        # a_hat = (s2 - (r - 1) * v_hat) / c
466        c = w_total - (w ** 2).sum() / w_total
467        s2 = (w * (x_bar - mu_hat) ** 2).sum()
468        a_hat_raw = (s2 - (r - 1) * v_hat) / c
469
470        return float(mu_hat), float(v_hat), float(a_hat_raw)
471
472    def _check_fitted(self) -> None:
473        if not self._fitted:
474            raise RuntimeError(
475                "Model has not been fitted. Call .fit() first."
476            )
477
478    def __repr__(self) -> str:
479        if not self._fitted:
480            return "BuhlmannStraub(not fitted)"
481        return (
482            f"BuhlmannStraub("
483            f"mu={self._mu_hat:.4g}, "
484            f"v={self._v_hat:.4g}, "
485            f"a={self._a_hat:.4g}, "
486            f"k={self._k:.4g})"
487        )

Bühlmann-Straub credibility model with non-parametric structural parameter estimation.

Fit the model to a panel dataset - one row per (group, period) - where each row has a loss rate and an exposure weight. The model estimates the structural parameters v (within-group variance) and a (between-group variance) from the data itself, then computes a credibility factor Z_i for each group.

The result is a credibility premium for each group that blends its own weighted-average loss rate with the portfolio mean, with the blend governed entirely by the ratio K = v/a and the group's total exposure.
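
The arithmetic behind that blend is small enough to check by hand. A minimal sketch with made-up structural parameters (none of them taken from a fitted model)::

    # Illustrative numbers only - not output of the estimator.
    v_hat = 0.02        # within-group (process) variance, EPV
    a_hat = 0.005       # between-group variance, VHM
    mu_hat = 0.55       # collective mean loss rate
    k = v_hat / a_hat   # 4.0 units of exposure buys Z = 0.5

    w_i = 12.0          # a group's total exposure
    x_bar_i = 0.70      # its exposure-weighted mean loss rate
    z_i = w_i / (w_i + k)                             # 0.75
    premium_i = z_i * x_bar_i + (1 - z_i) * mu_hat    # 0.6625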

Parameters

truncate_a : bool, default True
    If True (the default), truncate a_hat at zero when the between-group variance estimate is negative. This sets K → ∞ and Z_i → 0 for all groups, so every group gets the collective mean. If False, a negative a_hat raises an error instead.
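
If you want to detect the truncation programmatically rather than rely on the console, one approach (a sketch, assuming the data may genuinely lack between-group variation) is to record the warning emitted by fit::

    import warnings

    bs = BuhlmannStraub(truncate_a=True)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        bs.fit(df, group_col="scheme", period_col="year",
               loss_col="loss_rate", weight_col="exposure")

    if any("Truncating to zero" in str(w.message) for w in caught):
        print("a_hat truncated: all schemes receive the collective mean (Z = 0)")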

Examples

>>> import polars as pl
>>> from insurance_credibility import BuhlmannStraub
>>>
>>> df = pl.DataFrame({
...     "scheme": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
...     "year":   [2021, 2022, 2023] * 3,
...     "loss_rate": [0.55, 0.60, 0.58, 0.80, 0.75, 0.82, 0.40, 0.42, 0.38],
...     "exposure": [1000, 1200, 1100, 300, 350, 320, 5000, 4800, 5200],
... })
>>> bs = BuhlmannStraub()
>>> bs.fit(df, group_col="scheme", period_col="year",
...        loss_col="loss_rate", weight_col="exposure")
>>> bs.summary()
BuhlmannStraub(truncate_a: bool = True)
 99    def __init__(self, truncate_a: bool = True) -> None:
100        self.truncate_a = truncate_a
101
102        # Fitted attributes - set by .fit()
103        self._mu_hat: Optional[float] = None
104        self._v_hat: Optional[float] = None
105        self._a_hat: Optional[float] = None
106        self._k: Optional[float] = None
107        self._z: Optional[pl.DataFrame] = None
108        self._premiums: Optional[pl.DataFrame] = None
109        self._fitted = False
truncate_a
def fit(self, data: "Union[pl.DataFrame, 'pd.DataFrame']", group_col: str = 'group', period_col: str = 'period', loss_col: str = 'loss', weight_col: str = 'weight') -> BuhlmannStraub:
115    def fit(
116        self,
117        data: Union[pl.DataFrame, "pd.DataFrame"],  # type: ignore[name-defined]
118        group_col: str = "group",
119        period_col: str = "period",
120        loss_col: str = "loss",
121        weight_col: str = "weight",
122    ) -> "BuhlmannStraub":
123        """
124        Estimate structural parameters and credibility factors from panel data.
125
126        Parameters
127        ----------
128        data:
129            A Polars DataFrame (preferred) or pandas DataFrame (converted
130            internally) with one row per (group, period). Columns need not be
131            in any particular order.
132        group_col:
133            Column identifying the group (e.g. scheme ID, territory, NCD class).
134        period_col:
135            Column identifying the time period (e.g. accident year). Used only
136            to count periods per group; the ordering does not matter.
137        loss_col:
138            Column containing the loss rate or loss ratio. This should be losses
139            per unit of exposure - not total losses. If you have total losses,
140            divide by the exposure column before fitting.
141        weight_col:
142            Column containing the exposure weight (e.g. earned car years,
143            earned premium, policy count). Must be strictly positive.
144
145        Returns
146        -------
147        self
148            Returns the fitted estimator so calls can be chained.
149        """
150        data = _to_polars(data)
151
152        validate_panel_data(data, group_col, period_col, loss_col, weight_col)
153        check_duplicate_periods(data, group_col, period_col)
154
155        self._group_col = group_col
156        self._period_col = period_col
157        self._loss_col = loss_col
158        self._weight_col = weight_col
159
160        # Build group-level summary: w_i, X_bar_i, T_i
161        groups = self._build_group_summary(data, group_col, period_col, loss_col, weight_col)
162        self._groups = groups
163
164        # Estimate structural parameters
165        mu_hat, v_hat, a_hat_raw = self._estimate_structural_params(
166            data, groups, group_col, period_col, loss_col, weight_col
167        )
168
169        # Handle negative a_hat
170        if a_hat_raw <= 0:
171            if self.truncate_a:
172                warnings.warn(
173                    f"Between-group variance estimate a_hat = {a_hat_raw:.6g} ≤ 0. "
174                    "Truncating to zero. This means the model finds no detectable "
175                    "heterogeneity between groups - all groups will receive the "
176                    "collective mean as their credibility premium (Z_i = 0). "
177                    "Consider whether your data genuinely lacks between-group "
178                    "variation, or whether you have too few groups to estimate a reliably.",
179                    stacklevel=2,
180                )
181                a_hat = 0.0
182                k = np.inf
183            else:
184                raise ValueError(
185                    f"Between-group variance estimate a_hat = {a_hat_raw:.6g} ≤ 0. "
186                    "Set truncate_a=True to handle this automatically."
187                )
188        else:
189            a_hat = a_hat_raw
190            k = v_hat / a_hat
191
192        # Credibility factors
193        w = groups["w_i"].to_numpy()
194        if np.isinf(k):
195            z_values = np.zeros(len(groups))
196        else:
197            z_values = w / (w + k)
198
199        group_ids = groups["group"].to_list()
200        x_bar = groups["x_bar_i"].to_numpy()
201
202        # z_ is a Polars DataFrame with columns ["group", "Z"]
203        z_df = pl.DataFrame({
204            "group": group_ids,
205            "Z": z_values,
206        })
207
208        # premiums_ is a Polars DataFrame with a "group" column (no index)
209        premiums = pl.DataFrame({
210            "group": group_ids,
211            "exposure": w,
212            "observed_mean": x_bar,
213            "Z": z_values,
214            "credibility_premium": z_values * x_bar + (1 - z_values) * mu_hat,
215            "complement": np.full(len(groups), mu_hat),
216        })
217
218        self._mu_hat = float(mu_hat)
219        self._v_hat = float(v_hat)
220        self._a_hat = float(a_hat)
221        self._a_hat_raw = float(a_hat_raw)
222        self._k = float(k) if not np.isinf(k) else np.inf
223        self._z = z_df
224        self._premiums = premiums
225        self._fitted = True
226
227        return self

Estimate structural parameters and credibility factors from panel data.

Parameters

data:
    A Polars DataFrame (preferred) or pandas DataFrame (converted internally) with one row per (group, period). Columns need not be in any particular order.
group_col:
    Column identifying the group (e.g. scheme ID, territory, NCD class).
period_col:
    Column identifying the time period (e.g. accident year). Used only to count periods per group; the ordering does not matter.
loss_col:
    Column containing the loss rate or loss ratio. This should be losses per unit of exposure - not total losses. If you have total losses, divide by the exposure column before fitting.
weight_col:
    Column containing the exposure weight (e.g. earned car years, earned premium, policy count). Must be strictly positive.

Returns

self
    Returns the fitted estimator so calls can be chained.
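
If your panel carries total losses rather than a rate, a small pre-processing step converts it before fitting (a sketch; the column name total_loss is hypothetical), and because fit returns self the call can be chained::

    import polars as pl

    df = df.with_columns(
        (pl.col("total_loss") / pl.col("exposure")).alias("loss_rate")
    )

    bs = BuhlmannStraub().fit(
        df,
        group_col="scheme",
        period_col="year",
        loss_col="loss_rate",
        weight_col="exposure",
    )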

mu_hat_: float
233    @property
234    def mu_hat_(self) -> float:
235        """
236        Collective mean - the grand weighted average loss rate across all groups.
237
238        This is the complement: what every group's credibility premium regresses
239        toward as exposure shrinks.
240        """
241        self._check_fitted()
242        return self._mu_hat  # type: ignore[return-value]

Collective mean - the grand weighted average loss rate across all groups.

This is the complement: what every group's credibility premium regresses toward as exposure shrinks.

v_hat_: float
244    @property
245    def v_hat_(self) -> float:
246        """
247        EPV - Expected value of Process Variance.
248
249        Measures within-group year-to-year volatility, averaged across groups.
250        High v_hat means individual group experience is noisy - lean on the
251        collective mean.
252        """
253        self._check_fitted()
254        return self._v_hat  # type: ignore[return-value]

EPV - Expected value of Process Variance.

Measures within-group year-to-year volatility, averaged across groups. High v_hat means individual group experience is noisy - lean on the collective mean.

a_hat_: float
256    @property
257    def a_hat_(self) -> float:
258        """
259        VHM - Variance of Hypothetical Means.
260
261        Measures how much the true underlying loss rates differ between groups.
262        High a_hat means groups are genuinely heterogeneous - trust their own
263        experience. Truncated at zero if truncate_a=True.
264        """
265        self._check_fitted()
266        return self._a_hat  # type: ignore[return-value]

VHM - Variance of Hypothetical Means.

Measures how much the true underlying loss rates differ between groups. High a_hat means groups are genuinely heterogeneous - trust their own experience. Truncated at zero if truncate_a=True.

k_: float
268    @property
269    def k_(self) -> float:
270        """
271        Bühlmann's k - the credibility parameter.
272
273        k = v / a. This is the exposure a group needs to achieve Z = 0.5 (equal
274        weight on own experience and collective mean). A group with exposure w_i
275        has Z_i = w_i / (w_i + k).
276
277        Returns np.inf when a_hat is zero (all groups get the collective mean).
278        """
279        self._check_fitted()
280        return self._k  # type: ignore[return-value]

Bühlmann's k - the credibility parameter.

k = v / a. This is the exposure a group needs to achieve Z = 0.5 (equal weight on own experience and collective mean). A group with exposure w_i has Z_i = w_i / (w_i + k).

Returns np.inf when a_hat is zero (all groups get the collective mean).
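
A quick way to read a fitted k (assuming a_hat was positive, so k_ is finite) is to tabulate the credibility earned by a few book sizes::

    k = bs.k_
    for exposure in (0.5 * k, k, 4 * k):
        z = exposure / (exposure + k)
        print(f"exposure = {exposure:10.1f}  ->  Z = {z:.2f}")
    # half of k earns Z = 0.33, exposure k earns Z = 0.50, four times k earns Z = 0.80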

z_: polars.dataframe.frame.DataFrame
282    @property
283    def z_(self) -> pl.DataFrame:
284        """
285        Credibility factors - a Polars DataFrame with columns ["group", "Z"].
286
287        Z_i = w_i / (w_i + k), ranging from 0 (no credibility, use collective
288        mean) to 1 (full credibility, use group's own experience).
289
290        To look up a specific group::
291
292            bs.z_.filter(pl.col("group") == "Motor Guild")["Z"][0]
293
294        To find the group with the highest Z::
295
296            bs.z_.sort("Z", descending=True)["group"][0]
297        """
298        self._check_fitted()
299        return self._z  # type: ignore[return-value]

Credibility factors - a Polars DataFrame with columns ["group", "Z"].

Z_i = w_i / (w_i + k), ranging from 0 (no credibility, use collective mean) to 1 (full credibility, use group's own experience).

To look up a specific group::

bs.z_.filter(pl.col("group") == "Motor Guild")["Z"][0]

To find the group with the highest Z::

bs.z_.sort("Z", descending=True)["group"][0]
premiums_: polars.dataframe.frame.DataFrame
301    @property
302    def premiums_(self) -> pl.DataFrame:
303        """
304        Credibility premiums - one row per group, as a Polars DataFrame.
305
306        Columns:
307
308        - ``group``: group identifier
309        - ``exposure``: total exposure weight for the group
310        - ``observed_mean``: exposure-weighted average loss rate
311        - ``Z``: credibility factor
312        - ``credibility_premium``: Z * observed_mean + (1 - Z) * mu_hat
313        - ``complement``: mu_hat (the collective mean complement)
314        """
315        self._check_fitted()
316        return self._premiums  # type: ignore[return-value]

Credibility premiums - one row per group, as a Polars DataFrame.

Columns:

  • group: group identifier
  • exposure: total exposure weight for the group
  • observed_mean: exposure-weighted average loss rate
  • Z: credibility factor
  • credibility_premium: Z * observed_mean + (1 - Z) * mu_hat
  • complement: mu_hat (the collective mean complement)
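
Because every ingredient of the blend is carried in the table, the premium column can be re-derived as a sanity check - a sketch using only the documented columns::

    import polars as pl

    check = bs.premiums_.with_columns(
        (
            pl.col("Z") * pl.col("observed_mean")
            + (1 - pl.col("Z")) * pl.col("complement")
        ).alias("recomputed")
    )
    assert (check["credibility_premium"] - check["recomputed"]).abs().max() < 1e-9
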
def summary(self) -> polars.dataframe.frame.DataFrame:
322    def summary(self) -> pl.DataFrame:
323        """
324        Return a formatted summary of the fit.
325
326        The table shows structural parameters at the top and per-group credibility
327        results below, formatted for an actuarial audience.
328
329        Returns
330        -------
331        pl.DataFrame
332            Per-group results. Structural parameters are printed to stdout.
333        """
334        self._check_fitted()
335
336        print("Bühlmann-Straub Credibility Model")
337        print("=" * 42)
338        print(f"  Collective mean    mu  = {self._mu_hat:.6g}")
339        print(f"  Process variance   v   = {self._v_hat:.6g}   (EPV, within-group)")
340        print(f"  Between-group var  a   = {self._a_hat:.6g}   (VHM, between-group)")
341        if np.isinf(self._k):
342            print(f"  Credibility param  k   = inf   (a ≤ 0, all Z = 0)")
343        else:
344            print(f"  Credibility param  k   = {self._k:.6g}   (v / a)")
345        print()
346        print("  Interpretation: a group needs exposure = k to achieve Z = 0.50")
347        if self._a_hat_raw <= 0:
348            print(f"  (raw a_hat before truncation: {self._a_hat_raw:.6g})")
349        print()
350
351        tbl = self._premiums.rename({
352            "exposure": "Exposure",
353            "observed_mean": "Obs. Mean",
354            "Z": "Z",
355            "credibility_premium": "Cred. Premium",
356            "complement": "Complement",
357        })
358        return tbl

Return a formatted summary of the fit.

The table shows structural parameters at the top and per-group credibility results below, formatted for an actuarial audience.

Returns

pl.DataFrame
    Per-group results. Structural parameters are printed to stdout.
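
Because the per-group table comes back as a Polars DataFrame, it can be kept for reporting - for example (the file name is illustrative)::

    tbl = bs.summary()                         # prints parameters, returns the table
    tbl.write_csv("credibility_premiums.csv")  # standard Polars exports apply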

class HierarchicalBuhlmannStraub:
102class HierarchicalBuhlmannStraub:
103    """
104    Multi-level Bühlmann-Straub credibility model for nested group structures.
105
106    This model is appropriate when groups are arranged in a strict hierarchy -
107    every observation belongs to exactly one parent at each level. Typical uses:
108
109    - Geographic: postcode sector → district → area
110    - Organisational: risk → scheme → portfolio
111    - Time: accident quarter → accident year → underwriting year
112
113    Parameters
114    ----------
115    level_cols:
116        List of column names defining the hierarchy from outermost (top) to
117        innermost (bottom). For a three-level model, provide three names.
118        Example: ``["area", "district", "sector"]``.
119    truncate_a:
120        Whether to truncate negative between-group variance estimates at zero.
121        Passed through to the BuhlmannStraub model at each level.
122
123    Examples
124    --------
125    Geographic three-level model::
126
127        >>> model = HierarchicalBuhlmannStraub(
128        ...     level_cols=["region", "district", "sector"]
129        ... )
130        >>> model.fit(
131        ...     data=df,
132        ...     period_col="year",
133        ...     loss_col="loss_rate",
134        ...     weight_col="exposure",
135        ... )
136        >>> model.premiums_at("sector")   # blended sector-level premiums
137        >>> model.level_results_["district"].k   # k at district level
138
139    Two-level scheme model (equivalent to BuhlmannStraub with a book-level prior)::
140
141        >>> model = HierarchicalBuhlmannStraub(level_cols=["book", "scheme"])
142        >>> model.fit(df, period_col="year", loss_col="loss_rate", weight_col="exposure")
143    """
144
145    def __init__(
146        self,
147        level_cols: List[str],
148        truncate_a: bool = True,
149    ) -> None:
150        if len(level_cols) < 2:
151            raise ValueError(
152                "At least two levels are required for a hierarchical model. "
153                "For a single-level model, use BuhlmannStraub."
154            )
155        self.level_cols = level_cols
156        self.truncate_a = truncate_a
157        self._fitted = False
158        self._level_results: Optional[Dict[str, LevelResult]] = None
159        self._bottom_premiums: Optional[pl.DataFrame] = None
160
161    def fit(
162        self,
163        data: Union[pl.DataFrame, "pd.DataFrame"],  # type: ignore[name-defined]
164        period_col: str = "period",
165        loss_col: str = "loss",
166        weight_col: str = "weight",
167    ) -> "HierarchicalBuhlmannStraub":
168        """
169        Fit the hierarchical model to panel data.
170
171        Parameters
172        ----------
173        data:
174            Panel DataFrame (Polars preferred, pandas accepted) with one row
175            per (leaf node, period). Must contain all columns in ``level_cols``
176            plus ``period_col``, ``loss_col``, and ``weight_col``.
177        period_col:
178            Column identifying the time period.
179        loss_col:
180            Column containing the per-unit loss rate.
181        weight_col:
182            Column containing exposure weights (must be strictly positive).
183
184        Returns
185        -------
186        self
187        """
188        data = _to_polars(data)
189
190        required_cols = set(self.level_cols) | {period_col, loss_col, weight_col}
191        missing = required_cols - set(data.columns)
192        if missing:
193            raise ValueError(f"Columns not found in data: {missing}")
194
195        self._period_col = period_col
196        self._loss_col = loss_col
197        self._weight_col = weight_col
198
199        # Validate that hierarchy is strict (no cross-level ambiguity)
200        self._validate_hierarchy(data)
201
202        # Bottom-up: estimate variance components at each level
203        level_results: Dict[str, LevelResult] = {}
204        bottom_level = self.level_cols[-1]
205
206        # At the innermost level, fit BuhlmannStraub with the leaf node as group
207        # and each period as an observation
208        innermost_fit = self._fit_innermost_level(data, bottom_level, period_col, loss_col, weight_col)
209        level_results[bottom_level] = innermost_fit
210
211        # For each higher level, aggregate upward and fit B-S treating each node
212        # at that level as a "group" with its children's credibility premiums as
213        # "observations" (but we use the raw aggregated data for cleaner estimation)
214        for depth in range(len(self.level_cols) - 2, -1, -1):
215            parent_col = self.level_cols[depth]
216            child_col = self.level_cols[depth + 1]
217            level_result = self._fit_upper_level(
218                data,
219                parent_col,
220                child_col,
221                loss_col,
222                weight_col,
223                lower_v=level_results[child_col].v_hat,
224            )
225            level_results[parent_col] = level_result
226
227        self._level_results = level_results
228
229        # Top-down: compute final blended premiums at each level
230        self._bottom_premiums = self._compute_top_down_premiums(data, level_results)
231        self._fitted = True
232        return self
233
234    # ------------------------------------------------------------------
235    # Output
236    # ------------------------------------------------------------------
237
238    @property
239    def level_results_(self) -> Dict[str, LevelResult]:
240        """
241        Dictionary mapping level name to LevelResult with fitted parameters.
242        """
243        self._check_fitted()
244        return self._level_results  # type: ignore[return-value]
245
246    @property
247    def premiums_(self) -> pl.DataFrame:
248        """
249        Credibility premiums at the bottom (leaf) level of the hierarchy.
250
251        Equivalent to ``premiums_at(self.level_cols[-1])``.
252        """
253        self._check_fitted()
254        return self._bottom_premiums  # type: ignore[return-value]
255
256    def premiums_at(self, level: str) -> pl.DataFrame:
257        """
258        Return the credibility premiums at a specified level of the hierarchy.
259
260        Parameters
261        ----------
262        level:
263            One of the column names from ``level_cols``.
264
265        Returns
266        -------
267        pl.DataFrame
268            With a ``group`` column plus ``exposure``, ``observed_mean``,
269            ``Z``, ``credibility_premium``, and ``complement`` columns.
270        """
271        self._check_fitted()
272        if level not in self.level_cols:
273            raise ValueError(
274                f"'{level}' is not one of the fitted levels: {self.level_cols}"
275            )
276        return self._level_results[level].premiums
277
278    def summary(self) -> None:
279        """Print structural parameters at each level."""
280        self._check_fitted()
281        print("Hierarchical Bühlmann-Straub Credibility Model")
282        print("=" * 50)
283        print(f"Levels (outer → inner): {' → '.join(self.level_cols)}")
284        print()
285        for level in self.level_cols:
286            lr = self._level_results[level]
287            print(f"  Level: {level}")
288            print(f"    mu  = {lr.mu_hat:.6g}")
289            print(f"    v   = {lr.v_hat:.6g}   (EPV within-node)")
290            print(f"    a   = {lr.a_hat:.6g}   (VHM between-node)")
291            k_str = "inf" if np.isinf(lr.k) else f"{lr.k:.6g}"
292            print(f"    k   = {k_str}")
293            print()
294
295    # ------------------------------------------------------------------
296    # Internal helpers
297    # ------------------------------------------------------------------
298
299    def _validate_hierarchy(self, data: pl.DataFrame) -> None:
300        """Check that the hierarchy is strict - each child belongs to exactly one parent."""
301        for depth in range(len(self.level_cols) - 1):
302            parent = self.level_cols[depth]
303            child = self.level_cols[depth + 1]
304            parents_per_child = (
305                data.select([parent, child])
306                .unique()
307                .group_by(child)
308                .agg(pl.col(parent).n_unique().alias("n_parents"))
309            )
310            ambiguous = parents_per_child.filter(pl.col("n_parents") > 1)
311            if ambiguous.height > 0:
312                examples = ambiguous[child].to_list()[:3]
313                raise ValueError(
314                    f"Hierarchy is not strict at level '{child}' → '{parent}': "
315                    f"{ambiguous.height} child node(s) appear under multiple parents. "
316                    f"Examples: {examples}"
317                )
318
319    def _fit_innermost_level(
320        self,
321        data: pl.DataFrame,
322        leaf_col: str,
323        period_col: str,
324        loss_col: str,
325        weight_col: str,
326    ) -> LevelResult:
327        """Fit B-S at the innermost level using raw period data."""
328        bs = BuhlmannStraub(truncate_a=self.truncate_a)
329        bs.fit(
330            data,
331            group_col=leaf_col,
332            period_col=period_col,
333            loss_col=loss_col,
334            weight_col=weight_col,
335        )
336        return LevelResult(
337            level_name=leaf_col,
338            mu_hat=bs.mu_hat_,
339            v_hat=bs.v_hat_,
340            a_hat=bs.a_hat_,
341            k=bs.k_,
342            z=bs.z_,
343            premiums=bs.premiums_,
344        )
345
346    def _fit_upper_level(
347        self,
348        data: pl.DataFrame,
349        parent_col: str,
350        child_col: str,
351        loss_col: str,
352        weight_col: str,
353        lower_v: float,
354    ) -> LevelResult:
355        """
356        Fit B-S at a higher level by aggregating child-level data.
357
358        We treat each child node as a "period" within the parent node,
359        using the child's total exposure as weight and its weighted mean loss
360        rate as the observation. The within-parent variance v at this level
361        is the between-child variance a from the level below.
362
363        This is the Jewell (1975) / actuar-compatible approach: v at level L
364        equals a at level L+1, propagating variance components up the hierarchy.
365        """
366        # Aggregate data to child level (summing over periods)
367        child_summary = (
368            data.select([parent_col, child_col, weight_col, loss_col])
369            .group_by([parent_col, child_col])
370            .agg([
371                pl.col(weight_col).sum().alias("exposure"),
372                (
373                    (pl.col(weight_col) * pl.col(loss_col)).sum()
374                    / pl.col(weight_col).sum()
375                ).alias("loss_rate"),
376            ])
377        )
378
379        # Fit B-S at parent level, treating each child as one "observation"
380        # The period col here is child_col, loss col is loss_rate, weight is exposure
381        bs = BuhlmannStraub(truncate_a=self.truncate_a)
382        bs.fit(
383            child_summary,
384            group_col=parent_col,
385            period_col=child_col,
386            loss_col="loss_rate",
387            weight_col="exposure",
388        )
389
390        return LevelResult(
391            level_name=parent_col,
392            mu_hat=bs.mu_hat_,
393            v_hat=bs.v_hat_,
394            a_hat=bs.a_hat_,
395            k=bs.k_,
396            z=bs.z_,
397            premiums=bs.premiums_,
398        )
399
400    def _compute_top_down_premiums(
401        self,
402        data: pl.DataFrame,
403        level_results: Dict[str, LevelResult],
404    ) -> pl.DataFrame:
405        """
406        Compute the final blended premiums by propagating the hierarchy top-down.
407
408        At the topmost level, the complement is the grand mean. At each level
409        below, the complement for a node is the credibility premium of its parent.
410        The final output is a premium for each leaf node.
411
412        Blending formula at each level:
413            P_node = Z_node * X̄_node + (1 - Z_node) * P_parent
414        """
415        top_level = self.level_cols[0]
416        top_lr = level_results[top_level]
417
418        # Start: parent premiums map from group id → credibility_premium at top level
419        # Represented as a pl.DataFrame with [group, parent_premium] for joining
420        parent_premiums_df = top_lr.premiums.select([
421            pl.col("group"),
422            pl.col("credibility_premium").alias("parent_premium"),
423        ])
424
425        # Work downward, level by level
426        for depth in range(1, len(self.level_cols)):
427            current_level = self.level_cols[depth]
428            parent_level = self.level_cols[depth - 1]
429            current_lr = level_results[current_level]
430
431            # Map: child node id → parent node id (unique pairs)
432            child_to_parent = (
433                data.select([current_level, parent_level])
434                .unique()
435                .rename({current_level: "group", parent_level: "parent_id"})
436            )
437
438            # Join child→parent map with parent premiums to get P_parent for each child
439            child_with_parent_premium = (
440                child_to_parent
441                .join(
442                    parent_premiums_df.rename({"group": "parent_id"}),
443                    on="parent_id",
444                    how="left",
445                )
446            )
447
448            # Join current level premiums (Z, observed_mean) with parent premium
449            blended = (
450                current_lr.premiums.select(["group", "Z", "observed_mean"])
451                .join(child_with_parent_premium.select(["group", "parent_premium"]), on="group", how="left")
452                .with_columns(
453                    (
454                        pl.col("Z") * pl.col("observed_mean")
455                        + (1 - pl.col("Z")) * pl.col("parent_premium")
456                    ).alias("blended_premium")
457                )
458                .select(["group", "blended_premium"])
459                .rename({"blended_premium": "parent_premium"})
460            )
461
462            parent_premiums_df = blended
463
464        # Build final DataFrame: start from bottom-level premiums, replace credibility_premium
465        bottom_level = self.level_cols[-1]
466        bottom_lr = level_results[bottom_level]
467        result = (
468            bottom_lr.premiums
469            .drop("credibility_premium")
470            .join(
471                parent_premiums_df.rename({"parent_premium": "credibility_premium"}),
472                on="group",
473                how="left",
474            )
475        )
476        return result
477
478    def _check_fitted(self) -> None:
479        if not self._fitted:
480            raise RuntimeError(
481                "Model has not been fitted. Call .fit() first."
482            )
483
484    def __repr__(self) -> str:
485        levels_str = " → ".join(self.level_cols)
486        if not self._fitted:
487            return f"HierarchicalBuhlmannStraub(levels=[{levels_str}], not fitted)"
488        return f"HierarchicalBuhlmannStraub(levels=[{levels_str}])"

Multi-level Bühlmann-Straub credibility model for nested group structures.

This model is appropriate when groups are arranged in a strict hierarchy - every observation belongs to exactly one parent at each level. Typical uses:

  • Geographic: postcode sector → district → area
  • Organisational: risk → scheme → portfolio
  • Time: accident quarter → accident year → underwriting year
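
Whatever the hierarchy, the input is one flat panel: one row per (leaf node, period), with every ancestor level present as its own column. A toy sketch of that shape (values invented; real data needs more nodes and periods, or the variance estimates may be truncated)::

    import polars as pl

    df = pl.DataFrame({
        "region":    ["North", "North", "North", "North", "South", "South", "South", "South"],
        "district":  ["N1",    "N1",    "N2",    "N2",    "S1",    "S1",    "S2",    "S2"],
        "sector":    ["N1-a",  "N1-a",  "N2-a",  "N2-a",  "S1-a",  "S1-a",  "S2-a",  "S2-a"],
        "year":      [2022,    2023,    2022,    2023,    2022,    2023,    2022,    2023],
        "loss_rate": [0.61,    0.58,    0.72,    0.69,    0.55,    0.57,    0.48,    0.50],
        "exposure":  [800,     850,     150,     160,     1200,    1250,    900,     950],
    })

    model = HierarchicalBuhlmannStraub(level_cols=["region", "district", "sector"])
    model.fit(df, period_col="year", loss_col="loss_rate", weight_col="exposure")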

Parameters

level_cols:
    List of column names defining the hierarchy from outermost (top) to innermost (bottom). For a three-level model, provide three names. Example: ["area", "district", "sector"].
truncate_a:
    Whether to truncate negative between-group variance estimates at zero. Passed through to the BuhlmannStraub model at each level.

Examples

Geographic three-level model::

>>> model = HierarchicalBuhlmannStraub(
...     level_cols=["region", "district", "sector"]
... )
>>> model.fit(
...     data=df,
...     period_col="year",
...     loss_col="loss_rate",
...     weight_col="exposure",
... )
>>> model.premiums_at("sector")   # blended sector-level premiums
>>> model.level_results_["district"].k   # k at district level

Two-level scheme model (equivalent to BuhlmannStraub with a book-level prior)::

>>> model = HierarchicalBuhlmannStraub(level_cols=["book", "scheme"])
>>> model.fit(df, period_col="year", loss_col="loss_rate", weight_col="exposure")
HierarchicalBuhlmannStraub(level_cols: List[str], truncate_a: bool = True)
145    def __init__(
146        self,
147        level_cols: List[str],
148        truncate_a: bool = True,
149    ) -> None:
150        if len(level_cols) < 2:
151            raise ValueError(
152                "At least two levels are required for a hierarchical model. "
153                "For a single-level model, use BuhlmannStraub."
154            )
155        self.level_cols = level_cols
156        self.truncate_a = truncate_a
157        self._fitted = False
158        self._level_results: Optional[Dict[str, LevelResult]] = None
159        self._bottom_premiums: Optional[pl.DataFrame] = None
level_cols
truncate_a
def fit(self, data: "Union[pl.DataFrame, 'pd.DataFrame']", period_col: str = 'period', loss_col: str = 'loss', weight_col: str = 'weight') -> HierarchicalBuhlmannStraub:
161    def fit(
162        self,
163        data: Union[pl.DataFrame, "pd.DataFrame"],  # type: ignore[name-defined]
164        period_col: str = "period",
165        loss_col: str = "loss",
166        weight_col: str = "weight",
167    ) -> "HierarchicalBuhlmannStraub":
168        """
169        Fit the hierarchical model to panel data.
170
171        Parameters
172        ----------
173        data:
174            Panel DataFrame (Polars preferred, pandas accepted) with one row
175            per (leaf node, period). Must contain all columns in ``level_cols``
176            plus ``period_col``, ``loss_col``, and ``weight_col``.
177        period_col:
178            Column identifying the time period.
179        loss_col:
180            Column containing the per-unit loss rate.
181        weight_col:
182            Column containing exposure weights (must be strictly positive).
183
184        Returns
185        -------
186        self
187        """
188        data = _to_polars(data)
189
190        required_cols = set(self.level_cols) | {period_col, loss_col, weight_col}
191        missing = required_cols - set(data.columns)
192        if missing:
193            raise ValueError(f"Columns not found in data: {missing}")
194
195        self._period_col = period_col
196        self._loss_col = loss_col
197        self._weight_col = weight_col
198
199        # Validate that hierarchy is strict (no cross-level ambiguity)
200        self._validate_hierarchy(data)
201
202        # Bottom-up: estimate variance components at each level
203        level_results: Dict[str, LevelResult] = {}
204        bottom_level = self.level_cols[-1]
205
206        # At the innermost level, fit BuhlmannStraub with the leaf node as group
207        # and each period as an observation
208        innermost_fit = self._fit_innermost_level(data, bottom_level, period_col, loss_col, weight_col)
209        level_results[bottom_level] = innermost_fit
210
211        # For each higher level, aggregate upward and fit B-S treating each node
212        # at that level as a "group" with its children's credibility premiums as
213        # "observations" (but we use the raw aggregated data for cleaner estimation)
214        for depth in range(len(self.level_cols) - 2, -1, -1):
215            parent_col = self.level_cols[depth]
216            child_col = self.level_cols[depth + 1]
217            level_result = self._fit_upper_level(
218                data,
219                parent_col,
220                child_col,
221                loss_col,
222                weight_col,
223                lower_v=level_results[child_col].v_hat,
224            )
225            level_results[parent_col] = level_result
226
227        self._level_results = level_results
228
229        # Top-down: compute final blended premiums at each level
230        self._bottom_premiums = self._compute_top_down_premiums(data, level_results)
231        self._fitted = True
232        return self

Fit the hierarchical model to panel data.

Parameters

data:
    Panel DataFrame (Polars preferred, pandas accepted) with one row per (leaf node, period). Must contain all columns in level_cols plus period_col, loss_col, and weight_col.
period_col:
    Column identifying the time period.
loss_col:
    Column containing the per-unit loss rate.
weight_col:
    Column containing exposure weights (must be strictly positive).

Returns

self

level_results_: Dict[str, LevelResult]
238    @property
239    def level_results_(self) -> Dict[str, LevelResult]:
240        """
241        Dictionary mapping level name to LevelResult with fitted parameters.
242        """
243        self._check_fitted()
244        return self._level_results  # type: ignore[return-value]

Dictionary mapping level name to LevelResult with fitted parameters.

premiums_: polars.dataframe.frame.DataFrame
246    @property
247    def premiums_(self) -> pl.DataFrame:
248        """
249        Credibility premiums at the bottom (leaf) level of the hierarchy.
250
251        Equivalent to ``premiums_at(self.level_cols[-1])``.
252        """
253        self._check_fitted()
254        return self._bottom_premiums  # type: ignore[return-value]

Credibility premiums at the bottom (leaf) level of the hierarchy.

Equivalent to premiums_at(self.level_cols[-1]).

def premiums_at(self, level: str) -> polars.dataframe.frame.DataFrame:
256    def premiums_at(self, level: str) -> pl.DataFrame:
257        """
258        Return the credibility premiums at a specified level of the hierarchy.
259
260        Parameters
261        ----------
262        level:
263            One of the column names from ``level_cols``.
264
265        Returns
266        -------
267        pl.DataFrame
268            With a ``group`` column plus ``exposure``, ``observed_mean``,
269            ``Z``, ``credibility_premium``, and ``complement`` columns.
270        """
271        self._check_fitted()
272        if level not in self.level_cols:
273            raise ValueError(
274                f"'{level}' is not one of the fitted levels: {self.level_cols}"
275            )
276        return self._level_results[level].premiums

Return the credibility premiums at a specified level of the hierarchy.

Parameters

level:
    One of the column names from level_cols.

Returns

pl.DataFrame
    With a group column plus exposure, observed_mean, Z, credibility_premium, and complement columns.
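
One common follow-up (a sketch; df is the panel used to fit) is to attach the premiums at some level back onto the raw data, for example at district level::

    import polars as pl

    district_prem = model.premiums_at("district").select([
        pl.col("group").alias("district"),
        pl.col("credibility_premium").alias("district_premium"),
    ])
    df_with_prem = df.join(district_prem, on="district", how="left")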

def summary(self) -> None:
278    def summary(self) -> None:
279        """Print structural parameters at each level."""
280        self._check_fitted()
281        print("Hierarchical Bühlmann-Straub Credibility Model")
282        print("=" * 50)
283        print(f"Levels (outer → inner): {' → '.join(self.level_cols)}")
284        print()
285        for level in self.level_cols:
286            lr = self._level_results[level]
287            print(f"  Level: {level}")
288            print(f"    mu  = {lr.mu_hat:.6g}")
289            print(f"    v   = {lr.v_hat:.6g}   (EPV within-node)")
290            print(f"    a   = {lr.a_hat:.6g}   (VHM between-node)")
291            k_str = "inf" if np.isinf(lr.k) else f"{lr.k:.6g}"
292            print(f"    k   = {k_str}")
293            print()

Print structural parameters at each level.

class LevelResult:
54class LevelResult:
55    """
56    Fitted parameters and premiums for one level of the hierarchy.
57
58    Attributes
59    ----------
60    level_name:
61        The column name that identifies this level (e.g. "district").
62    mu_hat:
63        Collective mean at this level.
64    v_hat:
65        Within-node variance at this level (EPV).
66    a_hat:
67        Between-node variance at this level (VHM). Zero if truncated.
68    k:
69        Bühlmann's k = v / a at this level.
70    z:
71        Credibility factors as a pl.DataFrame with columns ["group", "Z"].
72    premiums:
73        Credibility premiums as a pl.DataFrame with a "group" column.
74    """
75
76    def __init__(
77        self,
78        level_name: str,
79        mu_hat: float,
80        v_hat: float,
81        a_hat: float,
82        k: float,
83        z: pl.DataFrame,
84        premiums: pl.DataFrame,
85    ) -> None:
86        self.level_name = level_name
87        self.mu_hat = mu_hat
88        self.v_hat = v_hat
89        self.a_hat = a_hat
90        self.k = k
91        self.z = z
92        self.premiums = premiums
93
94    def __repr__(self) -> str:
95        return (
96            f"LevelResult(level='{self.level_name}', "
97            f"mu={self.mu_hat:.4g}, v={self.v_hat:.4g}, "
98            f"a={self.a_hat:.4g}, k={self.k:.4g})"
99        )

Fitted parameters and premiums for one level of the hierarchy.

Attributes

level_name:
    The column name that identifies this level (e.g. "district").
mu_hat:
    Collective mean at this level.
v_hat:
    Within-node variance at this level (EPV).
a_hat:
    Between-node variance at this level (VHM). Zero if truncated.
k:
    Bühlmann's k = v / a at this level.
z:
    Credibility factors as a pl.DataFrame with columns ["group", "Z"].
premiums:
    Credibility premiums as a pl.DataFrame with a "group" column.
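
Given a fitted model, the per-level objects are plain attribute access - for example (assuming model has been fitted with a "district" level as in the examples above)::

    district = model.level_results_["district"]
    district.mu_hat                                  # collective mean at this level
    district.k                                       # Bühlmann's k at this level
    district.premiums                                # per-district premiums (pl.DataFrame)
    district.z.sort("Z", descending=True).head(5)    # five most credible districts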

LevelResult(level_name: str, mu_hat: float, v_hat: float, a_hat: float, k: float, z: polars.dataframe.frame.DataFrame, premiums: polars.dataframe.frame.DataFrame)
76    def __init__(
77        self,
78        level_name: str,
79        mu_hat: float,
80        v_hat: float,
81        a_hat: float,
82        k: float,
83        z: pl.DataFrame,
84        premiums: pl.DataFrame,
85    ) -> None:
86        self.level_name = level_name
87        self.mu_hat = mu_hat
88        self.v_hat = v_hat
89        self.a_hat = a_hat
90        self.k = k
91        self.z = z
92        self.premiums = premiums
level_name
mu_hat
v_hat
a_hat
k
z
premiums