Metadata-Version: 2.4
Name: interlace-lme
Version: 0.2.6
Summary: Joint REML estimation for linear mixed models with crossed random intercepts using sparse design matrices
Project-URL: Homepage, https://github.com/heliopais/interlace
Project-URL: Documentation, https://heliopais.github.io/interlace/
Project-URL: Issues, https://github.com/heliopais/interlace/issues
License: BSD 3-Clause License
        
        Copyright (c) 2026, Helio Pais
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this
           list of conditions and the following disclaimer.
        
        2. Redistributions in binary form must reproduce the above copyright notice,
           this list of conditions and the following disclaimer in the documentation
           and/or other materials provided with the distribution.
        
        3. Neither the name of the copyright holder nor the names of its
           contributors may be used to endorse or promote products derived from
           this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
License-File: LICENSE
Requires-Python: >=3.13
Requires-Dist: formulaic>=0.6
Requires-Dist: narwhals>=1.0
Requires-Dist: numpy>=1.26
Requires-Dist: plotnine>=0.15.3
Requires-Dist: scipy>=1.12
Requires-Dist: statsmodels>=0.14
Requires-Dist: tqdm>=4.67.3
Provides-Extra: bobyqa
Requires-Dist: py-bobyqa>=1.4; extra == 'bobyqa'
Provides-Extra: cholmod
Requires-Dist: scikit-sparse>=0.4; extra == 'cholmod'
Provides-Extra: pandas
Requires-Dist: pandas>=2.0; extra == 'pandas'
Description-Content-Type: text/markdown

# interlace

<p align="center">
  <img src="docs/source/_static/interlace_logo.png" alt="interlace" width="220">
</p>

**[Documentation](https://heliopais.github.io/interlace/)**

Pure-Python profiled REML estimation for linear mixed models with **crossed random intercepts**, validated to match R's `lme4::lmer()`.

Designed as a drop-in replacement for `statsmodels.MixedLM` in diagnostics pipelines that require crossed grouping factors (e.g. `(1|worker) + (1|company)`), which statsmodels does not support.

**Scope:** interlace fits random-intercept models only. It does not support random slopes, generalised outcomes (GLMM), or nested random effects written with `/` syntax. For those cases, use R's `lme4` directly or a Python GLMM library.

## Installation

```bash
pip install interlace-lme
```

Requires Python ≥ 3.13.

## Quick start

```python
import pandas as pd
from interlace import fit

# Hypothetical input file; any DataFrame with the columns used below works.
df = pd.read_csv("scores.csv")

result = fit(
    formula="score ~ hours_studied + prior_gpa",
    data=df,
    groups=["student_id", "school_id"],  # crossed random intercepts
)

print(result.fe_params)            # fixed-effect coefficients
print(result.variance_components)  # per-factor variance components
print(result.scale)                # residual variance σ²
```

`groups` accepts a single string (one random intercept) or a list (crossed intercepts). The first entry is the primary grouping factor.

## Usage

### Fitting

```python
from interlace import fit

result = fit(formula, data, groups, method="REML")
```

Returns a `CrossedLMEResult` with the following attributes:

| Attribute | Description |
|---|---|
| `fe_params` | Fixed-effect coefficients (Series) |
| `fe_bse` | Standard errors |
| `fe_pvalues` | Wald p-values |
| `fe_conf_int` | 95% confidence intervals |
| `random_effects` | Dict of BLUPs per grouping factor |
| `variance_components` | Dict of variance estimates per grouping factor |
| `scale` | Residual variance σ² |
| `fittedvalues` | Conditional fitted values (Xβ + Zû) |
| `resid` | Conditional residuals |
| `llf`, `aic`, `bic` | Log-likelihood and information criteria |
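
As a rough illustration of how `fe_pvalues` and `fe_conf_int` relate to `fe_params` and `fe_bse`, the sketch below computes a two-sided Wald p-value and a 95% interval for a single coefficient. It assumes a standard normal reference distribution, which may not be what interlace uses internally; `wald_summary` and the input numbers are hypothetical and exist only for this example.

```python
from statistics import NormalDist


def wald_summary(estimate: float, std_error: float, level: float = 0.95):
    """Return (z, p_value, (lower, upper)) under a normal approximation."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z = estimate / std_error
    # Two-sided p-value: probability of a statistic at least as extreme as |z|.
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # Critical value for the requested coverage (about 1.96 for 95%).
    crit = std_normal.inv_cdf(0.5 + level / 2.0)
    return z, p_value, (estimate - crit * std_error, estimate + crit * std_error)


# Illustrative numbers only, not output from a fitted model.
z, p, (lo, hi) = wald_summary(estimate=0.50, std_error=0.25)
print(f"z={z:.2f}, p={p:.4f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

A t reference distribution would give slightly wider intervals in small samples; the normal version is used here only because it needs nothing beyond the standard library.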

### Prediction

```python
# In-sample (uses BLUPs)
result.predict()

# New data (random intercepts for unseen group levels are set to zero)
result.predict(newdata=df_new)

# Fixed effects only
result.predict(newdata=df_new, include_re=False)
```
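
The handling of unseen group levels can be pictured with a small standalone sketch. It assumes that BLUPs are stored as a dict of per-level intercepts for each grouping factor, as the `random_effects` attribute is described above; `predict_one` and the numbers are hypothetical and not part of the interlace API.

```python
def predict_one(fixed_part, levels, blups):
    """Add each factor's BLUP to the fixed-effects part of one prediction.

    levels: mapping of factor name -> level observed for this row
    blups:  mapping of factor name -> {level: estimated random intercept}
    """
    total = fixed_part
    for factor, level in levels.items():
        # An unseen level has no estimated intercept, so it contributes 0.0.
        total += blups[factor].get(level, 0.0)
    return total


# Hypothetical BLUPs for two crossed grouping factors.
blups = {
    "student_id": {"s1": 1.5, "s2": -0.5},
    "school_id": {"a": 2.0, "b": -2.0},
}

seen = predict_one(70.0, {"student_id": "s1", "school_id": "b"}, blups)
unseen = predict_one(70.0, {"student_id": "s9", "school_id": "a"}, blups)
print(seen, unseen)
```

In the second call the student is new, so only the school intercept is added to the fixed-effects part.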

### Residuals

```python
from interlace import hlm_resid

resid_df = hlm_resid(result, type="conditional")  # or "marginal"
# Returns DataFrame with .resid, .fitted, and original data columns
```

### Leverage

```python
from interlace import leverage

lev = leverage(result)  # array of hat-matrix diagonal values
```

### Influence diagnostics

```python
from interlace import hlm_influence, cooks_distance, mdffits, n_influential, tau_gap

infl = hlm_influence(result, level=1)   # Cook's D, MDFFITS, COVTRACE, COVRATIO, RVC per obs

# Scalar summaries
n = n_influential(result)   # count of high-influence observations
gap = tau_gap(result)        # gap statistic between influential and non-influential groups
```

### Combined augment

```python
from interlace import hlm_augment

aug = hlm_augment(result)
# DataFrame: original data + conditional residuals + influence statistics
```

### Plotting

```python
from interlace import plot_resid, plot_influence, dotplot_diag

# resid_df comes from hlm_resid() and infl from hlm_influence(), as shown above
plot_resid(resid_df, type="resid_vs_fitted")  # or "qq"
plot_influence(infl, measure="cooks_d")
dotplot_diag(infl, variable="cooks_d", cutoff="internal")
```

All plots return `plotnine.ggplot` objects.
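
For orientation on `cutoff="internal"`: in HLMdiag, the internal cutoff for measures such as Cook's distance is the third quartile plus three times the interquartile range. The sketch below shows that rule on its own, on the assumption that interlace follows the same convention; `internal_cutoff` here is a hypothetical helper, not an interlace function, and the values are made up.

```python
from statistics import quantiles


def internal_cutoff(values):
    """Upper cutoff Q3 + 3 * IQR, using linearly interpolated quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 + 3.0 * (q3 - q1)


# Illustrative Cook's distance values, not output from a fitted model.
cooks_d = [0.01, 0.02, 0.02, 0.03, 0.04, 0.05, 0.90]
cutoff = internal_cutoff(cooks_d)
flagged = [d for d in cooks_d if d > cutoff]
print(cutoff, flagged)
```

Observations above the cutoff are the ones a dot plot would highlight as unusually influential.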

## statsmodels compatibility

`CrossedLMEResult` exposes the same interface as statsmodels' `MixedLMResults`, so it can serve as a drop-in replacement in downstream code that accesses `fe_params`, `resid`, `scale`, `fittedvalues`, `random_effects`, `predict()`, and `model.exog`, `model.groups`, or `model.data.frame`.

`hlm_resid`, `hlm_influence`, and `hlm_augment` all accept either a `CrossedLMEResult` or a statsmodels `MixedLMResults` object.

## Parity with lme4

Results are validated against R's `lme4::lmer()` to the following tolerances:

| Metric | Tolerance |
|---|---|
| Fixed effects | abs diff < 1e-4 |
| Variance components | rel diff < 5% |
| BLUP correlation | > 0.99 |
| Conditional residual correlation | > 0.999 |

## Contributing

Bug reports, documentation fixes, and new features are welcome — see [CONTRIBUTING.md](CONTRIBUTING.md) for how to get started. To open an issue or ask a question, use the [GitHub issue tracker](https://github.com/heliopais/interlace/issues).

## Attribution

- **[lme4](https://github.com/lme4/lme4)** — the reference implementation for mixed-effects models in R; interlace targets parity with `lme4::lmer()` and uses its output as the validation benchmark.
- **[HLMdiag](https://github.com/aloy/HLMdiag)** — the R package whose diagnostics API (`hlm_resid`, `hlm_influence`, `hlm_augment`, `dotplot_diag`) interlace replicates in Python.
