Metadata-Version: 2.4
Name: rustypca
Version: 0.2.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Dist: numpy>=1.21.0,<3.0
Requires-Dist: scikit-learn>=1.0.0,<2.0
License-File: LICENSE
Summary: Probabilistic PCA with missing value support using Rust and PyO3
Keywords: pca,probabilistic,missing-values,machine-learning
Author-email: Victor Gruselius <victor.tingstrom@gmail.com>
Requires-Python: >=3.10, <3.14
Description-Content-Type: text/markdown
Project-URL: Documentation, https://github.com/tingiskhan/rustypca#readme
Project-URL: Repository, https://github.com/tingiskhan/rustypca

# rustypca

A Python library for Probabilistic PCA that handles missing data without asking you to impute first. It uses the EM algorithm under the hood, with the number-crunching done in Rust.

## Why?

Regular PCA doesn't cope well when your data has holes in it. Probabilistic PCA treats missing values as latent variables in a generative model — a more principled approach than patching NaNs and hoping for the best.

For the full story, see [Tipping & Bishop (1999)](https://www.robots.ox.ac.uk/~cbishop/papers/PPCA.pdf), *"Probabilistic Principal Component Analysis"*, Journal of the Royal Statistical Society: Series B, 61(3), 611–622.
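
The generative model behind this is easy to state. Here is a minimal NumPy sketch of it (illustration only, not rustypca's internals — all names and parameter values below are made up for the demo):

```python
import numpy as np

# Sketch of the PPCA generative model from Tipping & Bishop (1999):
# x = W z + mu + eps, with z ~ N(0, I_k) and eps ~ N(0, sigma^2 I_d).
rng = np.random.default_rng(0)
n, d, k = 2000, 10, 2               # samples, observed dims, latent dims
W = rng.standard_normal((d, k))     # loading matrix
mu = rng.standard_normal(d)         # data mean
sigma = 0.5                         # isotropic noise scale

Z = rng.standard_normal((n, k))                          # latent variables
X = Z @ W.T + mu + sigma * rng.standard_normal((n, d))   # observations

# Marginally, x ~ N(mu, W W^T + sigma^2 I). Every marginal of a Gaussian
# is again Gaussian, so missing coordinates of x can be integrated out of
# the likelihood exactly -- which is why EM can fit the model without
# imputing anything first.
C = W @ W.T + sigma**2 * np.eye(d)
emp_cov = np.cov(X, rowvar=False)
print(np.abs(emp_cov - C).max())    # shrinks as n grows
```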

## Features

- **Rust backend** — keeps things snappy on larger datasets
- **Missing value support** — the main reason this exists
- **Scikit-learn compatible** — fits in wherever you'd use `sklearn.decomposition.PCA`

## Installation

```bash
pip install rustypca
```

Building from source requires a Rust toolchain. Python >= 3.10 and < 3.14 is required.

## Quick start

```python
import numpy as np
from rustypca import PPCA

X = np.random.randn(100, 10)
X[np.random.rand(100, 10) < 0.1] = np.nan  # Introduce some missing values

model = PPCA(n_components=2)
X_transformed = model.fit_transform(X)
X_reconstructed = model.inverse_transform(X_transformed)
```

No preprocessing or imputation needed.
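
After reconstruction you may want to score the fit on the entries that were actually observed. A small generic NumPy helper for that (not part of rustypca's API — `observed_rmse` is just an illustrative name):

```python
import numpy as np

def observed_rmse(X, X_hat):
    """RMSE computed only over entries that are not NaN in X."""
    mask = ~np.isnan(X)
    return float(np.sqrt(np.mean((X[mask] - X_hat[mask]) ** 2)))

# Toy check: the NaN entry is ignored, only three entries contribute.
X = np.array([[1.0, np.nan], [3.0, 4.0]])
X_hat = np.array([[1.5, 2.0], [3.0, 4.0]])
print(observed_rmse(X, X_hat))  # sqrt(0.25 / 3)
```

This is handy for cross-validating the choice of `n_components`: hold out some observed entries, mask them as NaN before fitting, and score the reconstruction on the held-out values.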

## Testing

```bash
make test
```

## Disclaimer

This project was built with the help of [Claude](https://claude.ai).

## License

MIT

