Metadata-Version: 2.4
Name: modern-fm
Version: 1.1.1
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: numpy>=1.24
Requires-Dist: scipy>=1.10
Requires-Dist: scikit-learn>=1.6
Requires-Dist: pytest>=7 ; extra == 'dev'
Requires-Dist: ruff>=0.4 ; extra == 'dev'
Requires-Dist: pandas ; extra == 'dev'
Requires-Dist: polars ; extra == 'dev'
Provides-Extra: dev
License-File: LICENSE
Summary: Fast, sklearn-compatible Factorization Machines and Field-aware Factorization Machines
Keywords: factorization-machines,ffm,ctr,recommender,tabular
Author: Masaya Kawamata
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/Matapanino/modern_fm
Project-URL: Issues, https://github.com/Matapanino/modern_fm/issues
Project-URL: Repository, https://github.com/Matapanino/modern_fm

# modern_fm

[![PyPI](https://img.shields.io/pypi/v/modern-fm.svg)](https://pypi.org/project/modern-fm/)
[![Python versions](https://img.shields.io/pypi/pyversions/modern-fm.svg)](https://pypi.org/project/modern-fm/)
[![CI](https://github.com/Matapanino/modern_fm/actions/workflows/ci.yml/badge.svg)](https://github.com/Matapanino/modern_fm/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/license-MIT-yellow.svg)](LICENSE)
[![docs](https://img.shields.io/badge/docs-github.io-blue.svg)](https://matapanino.github.io/modern_fm/)

Fast, sklearn-compatible Factorization Machines (FM) and Field-aware
Factorization Machines (FFM) for Python.

**Documentation: <https://matapanino.github.io/modern_fm/>** — install,
quickstart, API reference, math specs.

**Status: v1.0 (stable).** The public API is frozen under the SemVer contract
in `docs/compat_policy.md`. A Rust CPU backend, parity-tested against
pure-NumPy reference implementations, drives the sklearn-style estimators:

- **Estimators**: `FMClassifier`, `FMRegressor`, `FFMClassifier`, `FFMRegressor`
  (binary, multiclass softmax, and regression) and `FwFMClassifier`
  (Field-weighted FM).
- **Optimizers**: SGD, AdaGrad, Adam, and **FTRL-Proximal**. FTRL's L1 penalty
  (`l1_linear` / `l1_factors`) yields exact-zero weights.
- **Training**: **mini-batch** gradient averaging (`batch_size`), **multi-core
  training** via `rayon` (`n_jobs`), early stopping for every model/task
  combination, `partial_fit` / `warm_start` streaming, `sample_weight` /
  `class_weight`, and `label_smoothing`.
- **Tooling**: a `CategoricalEncoder`, `top_interactions` model inspection, and
  `save_model` / `load_model`.
- **Interop**: the estimators are scikit-learn `check_estimator`-compatible
  (drop into `Pipeline` / `GridSearchCV` / `CalibratedClassifierCV`) and accept
  pandas / polars DataFrames; `load_libffm` / `dump_libffm` read and write the
  libffm text format.
- **GPU**: an optional CUDA backend (`backend="cuda"`) accelerates every
  prediction and training path (FM/FFM/FwFM; binary, regression, multiclass) on
  NVIDIA GPUs (compute capability ≥ 6.0).
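For readers unfamiliar with the libffm text format that `load_libffm` / `dump_libffm` handle, each row is a label followed by space-separated `field:feature:value` triples. The sketch below parses and re-emits one such row in plain Python. The helper names `parse_libffm_line` and `format_libffm_line` are hypothetical and are not part of `modern_fm`; the library's own reader may behave differently (for example in how it indexes features).

```python
def parse_libffm_line(line):
    """Parse one libffm row: '<label> <field>:<feature>:<value> ...'."""
    label_text, *tokens = line.split()
    entries = []
    for token in tokens:
        field, feature, value = token.split(":")
        entries.append((int(field), int(feature), float(value)))
    return float(label_text), entries


def format_libffm_line(label, entries):
    """Inverse of parse_libffm_line; '%g'-style formatting keeps values compact."""
    parts = [f"{label:g}"]
    parts.extend(f"{field}:{feature}:{value:g}" for field, feature, value in entries)
    return " ".join(parts)


# One positive example with two active features, in fields 0 and 1.
label, entries = parse_libffm_line("1 0:3:1 1:17:0.5")
print(label, entries)
print(format_libffm_line(label, entries))
```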

## Installation

```bash
pip install modern-fm        # prebuilt wheels for Linux/macOS/Windows, no Rust toolchain needed
```

The Linux wheels are CUDA-ready out of the box: wherever an NVIDIA driver
(CUDA 12+) is present — e.g. Colab/Kaggle GPU runtimes — `backend="cuda"`
just works; on CPU-only machines the same wheel behaves exactly like a CPU
build. macOS/Windows wheels are CPU-only.

To build from source instead (e.g. on a platform without a prebuilt wheel), see
Development below; it requires a Rust toolchain.

## Usage

```python
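# X_train / X_test are feature matrices and y_train holds class labels;
# bring your own data, the library does not define these names.
from modern_fm import FMClassifier, FFMClassifier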

model = FMClassifier(
    n_factors=16,
    optimizer="adagrad",
    learning_rate=0.05,
    max_iter=100,
    batch_size=256,        # mini-batch gradient averaging (1 = per-row SGD)
    n_jobs=-1,             # train batches across all CPU cores
    l2_linear=1e-5,
    l2_factors=1e-5,
    random_state=42,
)
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)

# FTRL-Proximal with L1 for sparse linear weights (classic CTR setup)
sparse = FMClassifier(optimizer="ftrl", l1_linear=1.0, batch_size=256, random_state=42)
sparse.fit(X_train, y_train)

ffm = FFMClassifier(n_factors=8, n_jobs=-1, random_state=42)
ffm.fit(X_train, y_train, field_ids=field_ids)
```
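The snippet above passes `field_ids` without defining it. One plausible reading, which is an assumption here and not documented behaviour, is a sequence with one entry per feature column naming the field that column belongs to. The sketch below builds such a mapping for side-by-side one-hot blocks; `one_hot_layout` and `encode_row` are hypothetical helpers, not `modern_fm` functions.

```python
def one_hot_layout(cardinalities):
    """Return (offsets, field_ids) for fields laid out as consecutive one-hot blocks."""
    offsets, field_ids = [], []
    for field, cardinality in enumerate(cardinalities):
        offsets.append(len(field_ids))          # first column of this field's block
        field_ids.extend([field] * cardinality)  # every column in the block shares the field
    return offsets, field_ids


def encode_row(categories, offsets):
    """Map one row of per-field category indices to its active one-hot columns."""
    return [offset + category for offset, category in zip(offsets, categories)]


# Three fields with 3, 2, and 4 categories give 9 feature columns.
offsets, field_ids = one_hot_layout([3, 2, 4])
active_columns = encode_row([2, 0, 3], offsets)
print(field_ids)
print(active_columns)
```

With 16 fields of 16 categories each, the same layout would give the 256-column shape used in the synthetic benchmark. Check the API reference for the exact form `fit` expects.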

`FMRegressor`, multiclass `FMClassifier` (just pass a target with >2 classes),
early stopping (`early_stopping=True` or `eval_set=(X_val, y_val)`), and the
`CategoricalEncoder` are demonstrated in `examples/basic_usage.py`.
`benchmarks/bench_synthetic.py` reports fit time and predict throughput against
the NumPy reference floor.
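The FTRL-Proximal configuration above relies on L1 producing weights that are exactly zero, not merely small. A minimal per-coordinate sketch of the textbook FTRL-Proximal rule shows why: any coordinate whose accumulated statistic `z` stays within the L1 threshold is clamped to zero. The names `alpha`, `beta`, `z`, and `n` follow the usual presentation of the algorithm and are assumptions here; they are not `modern_fm` parameters, and the Rust backend may differ in detail.

```python
import math


def ftrl_weight(z, n, alpha=0.1, beta=1.0, l1=1.0, l2=0.0):
    """Closed-form FTRL-Proximal weight for one coordinate."""
    if abs(z) <= l1:
        return 0.0  # inside the L1 threshold: the weight is exactly zero
    step = (beta + math.sqrt(n)) / alpha + l2
    return -(z - math.copysign(l1, z)) / step


def ftrl_update(z, n, grad, alpha=0.1, beta=1.0, l1=1.0, l2=0.0):
    """Fold one gradient into the (z, n) accumulators and return the new pair."""
    weight = ftrl_weight(z, n, alpha, beta, l1, l2)
    sigma = (math.sqrt(n + grad * grad) - math.sqrt(n)) / alpha
    return z + grad - sigma * weight, n + grad * grad


# A few small gradients leave |z| below l1, so the weight stays at exactly 0.0.
z, n = 0.0, 0.0
for grad in (0.4, -0.2, 0.3):
    z, n = ftrl_update(z, n, grad)
print(ftrl_weight(z, n))
```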

## Benchmarks

On synthetic CTR data (40k train / 20k test; 16 one-hot categorical fields →
256 features) with *planted pairwise interactions* between field pairs, a signal
that a linear model cannot represent, FM and FFM recover most of that signal.
`n_jobs=-1` uses all cores (8 here); absolute numbers vary by machine.

| Model | Test AUC | Fit (s) | Predict (rows/s) |
|---|---:|---:|---:|
| `LogisticRegression` (sklearn) | 0.694 | 0.01 | 60M |
| `FMClassifier` (batch=1) | 0.817 | 1.34 | 4.3M |
| `FMClassifier` (batch=512) | 0.816 | 0.45 | 4.8M |
| `FMClassifier` (batch=512, `n_jobs=-1`) | 0.816 | 0.33 | 5.0M |
| `FFMClassifier` (batch=512) | 0.846 | 1.68 | 2.3M |
| `FFMClassifier` (batch=512, `n_jobs=-1`) | 0.846 | 1.46 | 2.1M |

- **Interactions matter**: AUC climbs 0.69 → 0.82 (FM) → 0.85 (FFM) as the model
  captures the pairwise / field-aware structure the linear baseline misses.
- **Mini-batch**: `batch_size=512` trains ~3× faster than per-row SGD at equal AUC.
- **Multi-core**: `n_jobs=-1` adds a further ~1.2–1.4× here (more on larger/denser data).

Reproduce with `python benchmarks/bench_vs_baseline.py`. `xlearn` is auto-included
if importable, but it does not build on every platform (it failed to build here on
macOS/arm64 + CPython 3.11).
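The gap between the linear baseline and FM in the table above comes from the pairwise term of the FM score. As a reference sketch only (plain Python, not the Rust kernel this package ships), the standard second-order FM score can be computed in O(k·n) per row and checked against the naive double loop over feature pairs. The function names and the tiny weights below are made up for illustration.

```python
def fm_score(x, w0, w, V):
    """Second-order FM score via the O(k*n) sum-of-squares identity."""
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        total = sum(V[i][f] * x[i] for i in range(n))
        total_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (total * total - total_sq)
    return w0 + linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same score, summing <v_i, v_j> * x_i * x_j over every pair i < j."""
    n, k = len(x), len(V[0])
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(V[i][f] * V[j][f] for f in range(k))
            score += dot * x[i] * x[j]
    return score


# Three features, two latent factors; features 0 and 2 are active.
x = [1.0, 0.0, 1.0]
w0, w = 0.1, [0.2, -0.3, 0.4]
V = [[1.0, 2.0], [0.5, -1.0], [3.0, -1.0]]
print(fm_score(x, w0, w, V), fm_score_naive(x, w0, w, V))
```

A linear model only has the `w0 + linear` part, so it has no term that can fit an effect present only when two specific features co-occur.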

### Real click data (KDD Cup 2012 sample)

On real CTR data — the KDD Cup 2012 track-2 sample from OpenML
(`Click_prediction_small`; 200k impressions subsampled with seed 0, 9
id-categorical fields → 373k one-hot features, 4.4% CTR, stratified 80/20
split) — with libFM-style fixed hyperparameters (AdaGrad, L2 1e-4, built-in
early stopping; not tuned to this benchmark):

| Model | Test AUC | Fit (s) | Predict (krows/s) |
|---|---:|---:|---:|
| `LogisticRegression` (sklearn) | 0.6908 | 3.5 | 14 594 |
| `FMClassifier` (k=8) | 0.6810 | 1.8 | 2 402 |
| `FFMClassifier` (k=4) | 0.6721 | 5.1 | 1 211 |
| `FwFMClassifier` (k=8) | 0.6891 | 2.8 | 2 481 |

Honest read: this 9-field sample is dominated by rare ids (373k features for
160k train rows), so second-order factor models at best match, and do not beat,
a well-regularized linear baseline. `FwFMClassifier` comes closest (0.6891 vs.
0.6908 AUC) at roughly one sixth of LR's predict throughput. The
planted-interaction synthetic table above shows the regime where factor models
pull ahead. Machine: macOS arm64 (Apple Silicon), Python 3.11; reproduce with
`python benchmarks/bench_criteo_like.py`. The original Criteo/Avazu samples
are no longer publicly downloadable without credentials, so the bench uses
this real CTR dataset via `fetch_openml`; details are in the script docstring.

## Development

Requires Python >= 3.10 and a recent Rust toolchain (1.74+; `rustup update`).

```bash
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"   # builds the Rust extension via maturin
.venv/bin/pytest -q
.venv/bin/ruff check .
```

`pip install -e .` compiles `rust/` and installs the extension as
`modern_fm._rust` (maturin mixed layout, config in `pyproject.toml`).
After editing Rust code, re-run `pip install -e .` to rebuild. Rust-only
checks:

```bash
cd rust
PYO3_PYTHON=$PWD/../.venv/bin/python3 cargo test
PYO3_PYTHON=$PWD/../.venv/bin/python3 cargo clippy
```

Without the extension built, the package still works: `modern_fm._backend`
falls back to the pure-NumPy reference implementations, and the parity tests
in `tests/test_rust_parity.py` are skipped.
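The fallback described above is the usual optional-extension pattern: try to import the compiled module and fall back when it is missing. The sketch below illustrates that pattern in isolation. `pick_backend` and the `"numpy-reference"` label are hypothetical, and the actual logic in `modern_fm._backend` may be structured differently.

```python
import importlib


def pick_backend(candidates=("modern_fm._rust",), fallback="numpy-reference"):
    """Return the first importable module name, or `fallback` if none import."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            continue  # extension not built or not installed; try the next candidate
        return name
    return fallback


# Prints the extension's module name when it is built, otherwise the fallback label.
print(pick_backend())
```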

Design documents live in `docs/` — start with `docs/requirements.md` and
`docs/math_spec.md`. The roadmap is in `docs/roadmap.md`.

