Metadata-Version: 2.4
Name: patentml
Version: 0.1.0
Summary: 37 machine learning algorithms reconstructed from expired US patents. Zero dependencies, pure Python stdlib.
Author-email: Martin Carr <martincarrsy23@gmail.com>
License: MIT
Project-URL: Homepage, https://getoptimal8.com
Keywords: machine-learning,zero-dependency,stdlib,patents,genetic-algorithm,genetic-programming,neural-network,reinforcement-learning,clustering,kalman-filter,gaussian-process,embedded,micropython
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Education
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# patentml

**Machine learning from expired patents. Zero dependencies. Pure Python stdlib.**

37 modules, 131 classes and functions, with every algorithm reconstructed from a United States patent that has expired into the public domain. The source patents were filed between 1995 and 2006 by IBM, Bell Labs, Microsoft Research, Lockheed, AT&T, Lucent and others. They have all expired. This library is what they describe, as clean modern Python, with no imports beyond the standard library.

```
pip install patentml
```

No numpy. No scipy. No compiled extensions. If it runs Python 3.8+, it runs `patentml` — locked-down corporate machines, serverless functions, air-gapped environments, Pyodide in the browser, and (with light trimming) MicroPython boards.
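
One way to check the zero-dependency claim on your own machine is to read the installed distribution's metadata. The helper below is a minimal sketch using only `importlib.metadata` (standard library since Python 3.8); `declared_dependencies` is a hypothetical name introduced here for illustration and is not part of `patentml`.

```python
from importlib import metadata


def declared_dependencies(dist_name):
    """Return the Requires-Dist entries declared by an installed distribution.

    Returns an empty list when the distribution declares no dependencies,
    and None when the distribution is not installed at all.
    (Hypothetical helper for illustration; not part of patentml.)
    """
    try:
        requires = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # metadata.requires() returns None when no Requires-Dist lines exist.
    return list(requires or [])
```

After `pip install patentml`, calling `declared_dependencies("patentml")` should return an empty list if the package metadata declares no requirements.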

## Quick start

```python
from patentml import RandomForest, ScalableKMeans, ThompsonSampling, KalmanFilter

# Toy data so the snippet runs as-is. Plain nested lists are an assumption
# about the expected input format; substitute your own data.
X_train = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
y_train = [0, 0, 1, 1]
X_test = [[0.1, 0.1], [1.0, 1.0]]
points = X_train + X_test

# Classification — US6816847 (Microsoft, 1999)
forest = RandomForest(n_trees=25)
forest.fit(X_train, y_train)
labels = [forest.predict(x) for x in X_test]

# Clustering — US6012058 (Microsoft, 1998)
km = ScalableKMeans(k=3)
km.fit(points)

# Bandits — US6981040 (Utopy, 2000) [919 forward citations]
bandit = ThompsonSampling(n_arms=4)
arm = bandit.select()
bandit.update(arm, reward=1.0)

# State estimation — US6795794 (Univ. Illinois, 2002)
kf = KalmanFilter(dim_state=2, dim_obs=1)
```
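
To show what sits behind a `select()` / `update()` loop like the bandit above, here is a minimal standalone sketch of textbook Beta-Bernoulli Thompson sampling using only `random` from the standard library. It is not `patentml`'s implementation and may differ from what the source patent describes; the class name `BetaThompson`, the `simulate` helper, and the uniform Beta(1, 1) prior are all assumptions made for illustration.

```python
import random


class BetaThompson:
    """Beta-Bernoulli Thompson sampling (illustrative; not patentml's class)."""

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior on each arm's success rate (an assumed default).
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms

    def select(self):
        # Draw one plausible success rate per arm and play the largest draw.
        draws = [self.rng.betavariate(a, b)
                 for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # reward is expected in [0, 1]; 1.0 counts as a success, 0.0 as a failure.
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward


def simulate(true_rates, rounds=2000, seed=0):
    """Run the bandit against Bernoulli arms and return pull counts per arm."""
    env = random.Random(seed)
    bandit = BetaThompson(len(true_rates), seed=seed + 1)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling `simulate([0.1, 0.2, 0.8, 0.3])` returns one pull count per arm; over a few thousand rounds the arm with the highest true rate should receive most of the pulls, because its posterior draws win more and more often as evidence accumulates.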

## What's inside

| Family | Modules |
|---|---|
| Evolutionary & global optimisation | genetic algorithm, genetic programming, grammar GP / grammatical evolution, linear GP, particle swarm, differential evolution, CMA-ES, simulated annealing, ant colony, Bayesian optimiser / EDA, neuroevolution |
| Neural networks | mini neural net (mini-batch backprop), Conv1D, SimpleRNN, GRU cell, SGD/RMSProp/Adam/AdamW optimisers |
| Classifiers | decision tree, random forest, AdaBoost, SVM (SMO), online Bayes, naive Bayes, KNN (+ BallTree), gradient boosting |
| Ensembles | voting, stacking, bagging, weighted |
| Clustering | scalable & hierarchical k-means, DBSCAN, OPTICS, EM / Gaussian mixture, spectral, mean shift |
| Reinforcement learning | Q-learning, SARSA, function-approximation Q, actor-critic A2C, PPO-lite, ε-greedy / UCB1 / Thompson / EXP3 / LinUCB bandits |
| Probabilistic | Bayesian network, hidden Markov model, Gaussian process regression & classification, kernel density estimation |
| Anomaly detection | isolation forest, one-class SVM |
| NLP | TF-IDF + naive Bayes text pipeline, word2vec SGNS, PMI embeddings |
| Recommenders | memory-based & Bayesian collaborative filtering |
| Dimensionality & features | PCA, randomised SVD, vector quantisation (LBG, product quantiser), scalers, mutual-information ranking, forward selection |
| State estimation | Kalman filter, extended Kalman filter |

## Provenance

Every module documents its source patent: number, assignee, filing year, and forward-citation count. Highlights:

| Patent | Assignee (filing year) | Algorithm | Forward citations |
|---|---|---|---|
| US5613012 | SmartTouch (1995) | Voting ensemble | 1,182 |
| US6981040 | Utopy (2000) | Bandit selection | 919 |
| US6161130 | Microsoft (1998) | Online classifier | 896 |
| US6556983 | Microsoft (2000) | Word embeddings (PMI + SGNS) | 645 |
| US6192360 | Microsoft (1998) | TF-IDF + naive Bayes | 364 |
| US6317707 | AT&T (1998) | Mean shift + KDE | 269 |
| US6931384 | Microsoft (2001) | Gaussian process regression | 258 |

The full list of ~40 source patents is in the package docstring: `python -c "import patentml; print(patentml.__doc__)"`.

All source patents are expired. The implementations are original code, MIT licensed.

## Why

Modern ML stacks are heavy, opaque, and a supply-chain risk. Sometimes you need *one* algorithm (a Kalman filter on a microcontroller, a bandit in a serverless function, k-means in a browser) without 200 MB of compiled wheels. And sometimes you want code you can actually read: every module here is a single self-contained file you can audit in one sitting.

These algorithms earned their citations the hard way. They still work.
