Metadata-Version: 2.4
Name: blockwise
Version: 0.1.0
Summary: Blockwise Reduced Modeling (BRM) for tabular data with blockwise missing patterns
Author: Faiz Currim, Sudha Ram
Author-email: Karthik Srinivasan <karthiks@ku.edu>
License: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/KarAnalytics/blockwise
Project-URL: Paper, https://pubsonline.informs.org/journal/ijds
Project-URL: Issues, https://github.com/KarAnalytics/blockwise/issues
Keywords: blockwise missing,reduced modeling,incomplete data,predictive modeling,tabular
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.22
Requires-Dist: pandas>=1.5
Requires-Dist: scikit-learn>=1.2
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: jupyter; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Requires-Dist: ipykernel; extra == "dev"
Dynamic: license-file

# blockwise

A scikit-learn-compatible estimator that implements Blockwise Reduced Modeling
(BRM) for tabular data with **blockwise missing patterns**.

## Install

```bash
pip install blockwise
```

## Quickstart

```python
from sklearn.ensemble import GradientBoostingRegressor
from blockwise import BRM, simulate_blockwise_missing, datasets

# Load the bike dataset and simulate blockwise missingness in two column blocks.
bike = datasets.load_bike()
bike_miss = simulate_blockwise_missing(
    bike,
    blocks=[["windspeed", "hum", "weekday"],
            ["hr", "temp", "weathersit"]],
    prop_missing=0.30,
    noise=0.05,
)

# Separate the features from the target column.
X = bike_miss.drop(columns=["cnt"])
y = bike_miss["cnt"]

# Fit BRM with a gradient-boosting base learner, then predict on the same rows.
brm = BRM(estimator=GradientBoostingRegressor()).fit(X, y)
y_hat = brm.predict(X)
```

`BRM` is learner-agnostic: pass any estimator with `fit(X, y)` / `predict(X)`
(and optionally `predict_proba(X)` for classification). Each block's model is
a fresh clone of `estimator`.
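
To illustrate what "a fresh clone" means, here is a minimal sketch that uses scikit-learn's own `sklearn.base.clone` with a toy learner. `MeanRegressor` and the `blocks` dictionary are hypothetical names made up for this example; they are not part of `blockwise`, and the loop only mimics the idea of one model per block rather than reproducing what `BRM` does internally.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Toy learner (hypothetical): always predicts the training-target mean."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


# The unfitted template that would be passed as `estimator`.
template = MeanRegressor()

# Hypothetical blocks: (features, target) pairs standing in for two subsets.
blocks = {
    "block_a": (np.array([[0.0], [1.0]]), np.array([2.0, 4.0])),
    "block_b": (np.array([[0.0], [1.0]]), np.array([10.0, 20.0])),
}

# clone() copies the hyperparameters but none of the fitted state,
# so every block gets its own independent model.
models = {name: clone(template).fit(X, y) for name, (X, y) in blocks.items()}

print(models["block_a"].predict(np.zeros((1, 1))))
print(models["block_b"].predict(np.zeros((1, 1))))
print(hasattr(template, "mean_"))  # the template itself is never fitted
```

Because each model is cloned from the template, fitting one block cannot overwrite what another block learned, and the object you passed in stays unfitted.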

## What BRM does

BRM groups the training data into overlapping subsets according to each row's
pattern of missing features, pre-trains one model per subset using only the
columns observed in that subset, and at prediction time routes each test row
to the subset model whose missingness pattern matches it most closely.
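
The grouping and routing described above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: it assumes a row joins every subset whose columns it fully observes (which is what makes the subsets overlap), and it assumes "most closely matches" means the usable pattern with the most columns. The toy `rows` and the helpers `observed` and `route` are hypothetical.

```python
# Toy data: None marks a missing value.
rows = [
    {"a": 1.0, "b": 2.0, "c": 3.0},
    {"a": 4.0, "b": None, "c": 6.0},
    {"a": 7.0, "b": None, "c": 9.0},
    {"a": None, "b": 5.0, "c": 1.0},
]


def observed(row):
    """Return the set of column names that are present in this row."""
    return frozenset(name for name, value in row.items() if value is not None)


# One pattern per distinct set of observed columns in the training data.
patterns = {observed(row) for row in rows}

# Assumption: a row belongs to every pattern whose columns it fully observes,
# so a complete row contributes to several subsets.
subsets = {
    pattern: [i for i, row in enumerate(rows) if pattern <= observed(row)]
    for pattern in patterns
}


def route(row, patterns):
    """Pick the largest pattern that uses only columns observed in `row`.

    Assumed routing rule; returns None when no pattern is usable.
    """
    usable = [p for p in patterns if p <= observed(row)]
    # Sort for a deterministic choice: most columns first, then by name.
    usable.sort(key=lambda p: (-len(p), sorted(p)))
    return usable[0] if usable else None


new_row = {"a": 2.0, "b": None, "c": 1.0}
chosen = route(new_row, patterns)
print(sorted(chosen), subsets[chosen])
```

In a full pipeline, each entry of `subsets` would be used to fit one model on the columns in its pattern, and `route` would decide which of those models scores a given test row.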

See [`notebooks/`](notebooks/) for worked examples on the **bike** (regression),
**adult** (binary classification), and **house** (regression) datasets.

## Citation

If you use this package, please cite the paper that introduced the method:

> Srinivasan, K., Currim, F., and Ram, S. (2025). *A Reduced Modeling Approach
> for Making Predictions With Incomplete Data Having Blockwise Missing
> Patterns.* INFORMS Journal on Data Science.

A machine-readable `CITATION.cff` is included at the repo root.

## License

GPL-3.0-or-later. See `LICENSE`.
