Metadata-Version: 2.4
Name: mc-dropout-pytorch
Version: 0.1.1
Summary: MC Dropout (Gal & Ghahramani, 2016) - Pytorch
Home-page: https://github.com/lucidrains/mc-dropout-pytorch
Author: lucidrains
Author-email: lucidrains@gmail.com
License: MIT
Keywords: artificial intelligence,deep learning,bayesian deep learning,uncertainty estimation,monte carlo dropout
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: accelerate>=1.0
Requires-Dist: einops>=0.8
Requires-Dist: ema-pytorch>=0.7
Requires-Dist: torch>=2.0
Requires-Dist: tqdm
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: summary



## MC Dropout, in Pytorch

[![PyPI version](https://badge.fury.io/py/mc-dropout-pytorch.svg)](https://badge.fury.io/py/mc-dropout-pytorch)

Implementation of [Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning](https://arxiv.org/abs/1506.02142) (Gal & Ghahramani, ICML 2016) in Pytorch.

A standard dropout network can be cast as approximate Bayesian inference in a deep Gaussian process, yielding uncertainty estimates with no architectural changes. The only test-time cost is T stochastic forward passes.
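
To make the mechanism concrete, here is a minimal sketch of MC dropout in plain PyTorch, independent of this package's API (the toy network and `T` are illustrative):

```python
import torch
import torch.nn as nn

# toy network, for illustration only
net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Dropout(p = 0.1), nn.Linear(64, 1))

def mc_dropout_predict(model, x, T = 50):
    model.eval()
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()                 # keep dropout stochastic at test time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(T)])   # (T, batch, out)
    return samples.mean(dim = 0), samples.var(dim = 0)

mean, var = mc_dropout_predict(net, torch.randn(8, 1))
```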

## Install

```bash
$ pip install mc-dropout-pytorch
```

## Usage

### Regression with uncertainty

```python
import torch
from mc_dropout_pytorch import BayesianMLP, MCDropoutInference, Trainer

# build model
model = BayesianMLP(
    input_dim    = 1,
    output_dim   = 1,
    hidden_dims  = (256, 256),
    dropout_rate = 0.1,
    activation   = 'relu',
)

# wrap for MC inference (T=50 stochastic passes)
mc = MCDropoutInference(model, num_samples = 50, task = 'regression', tau = 1.0)

x = torch.linspace(-3, 3, 100).unsqueeze(-1)
out = mc(x)

out.mean      # predictive mean     — (100, 1)
out.variance  # predictive variance — (100, 1)  includes τ⁻¹ noise term
out.samples   # raw samples         — (50, 100, 1)
```
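
Because the predictive distribution is approximated as a Gaussian with this mean and variance, interval estimates follow directly (a sketch, continuing from the block above):

```python
std = out.variance.sqrt()
lo, hi = out.mean - 1.96 * std, out.mean + 1.96 * std   # ~95% predictive interval
```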

### Classification with predictive entropy

```python
import torch
from mc_dropout_pytorch import BayesianCNN, MCDropoutInference

model = BayesianCNN(
    in_channels      = 1,
    num_classes      = 10,
    base_channels    = 32,
    dropout_rate     = 0.25,
    fc_dropout_rate  = 0.5,
    img_size         = 28,
)

mc = MCDropoutInference(model, num_samples = 50, task = 'classification')

x = torch.randn(8, 1, 28, 28)
out = mc(x)

out.mean      # class probabilities  — (8, 10)
out.variance  # per-class variance   — (8, 10)

# active learning signals (§6)
H  = mc.predictive_entropy(x)   # (8,)  — total uncertainty
MI = mc.mutual_information(x)   # (8,)  — epistemic uncertainty only
```
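
For reference, both signals reduce to a few lines given per-pass class probabilities. The sketch below assumes a tensor `p` of shape `(T, batch, classes)`, e.g. `out.samples` if it stores probabilities (an assumption, not a documented contract):

```python
import torch

def entropy_and_mutual_information(p, eps = 1e-12):
    # p: (T, B, C) class probabilities from T stochastic passes (assumed layout)
    mean_p = p.mean(dim = 0)                                    # (B, C) predictive distribution
    H = -(mean_p * (mean_p + eps).log()).sum(dim = -1)          # total uncertainty, (B,)
    E_H = -(p * (p + eps).log()).sum(dim = -1).mean(dim = 0)    # expected per-pass entropy, (B,)
    return H, H - E_H                                           # MI = H − E[H]  (BALD)
```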

### Full training loop with the `Trainer`

```python
import torch
from torch.utils.data import TensorDataset
from mc_dropout_pytorch import BayesianMLP, Trainer

# synthetic regression dataset
X = torch.randn(1000, 4)
y = (X[:, 0] * 2 + X[:, 1] - X[:, 2] + torch.randn(1000) * 0.1).unsqueeze(-1)  # targets shaped (1000, 1) to match output_dim
dataset = TensorDataset(X, y)

model = BayesianMLP(
    input_dim    = 4,
    output_dim   = 1,
    hidden_dims  = (128, 128),
    dropout_rate = 0.1,
)

trainer = Trainer(
    model,
    dataset,
    task             = 'regression',
    train_lr         = 1e-3,
    train_num_steps  = 5_000,
    train_batch_size = 64,
    ema_decay        = 0.995,
    amp              = False,
    weight_decay     = 1e-4,   # ≡ prior precision in §3
    tau              = 1.0,    # noise precision
    num_mc_samples   = 50,
)

trainer.train()

# inference via EMA model
mc = trainer.inference
out = mc(X[:10])
print(out.mean, out.variance)
```

### Multi-GPU

```bash
$ accelerate config
$ accelerate launch train.py
```
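
The `train.py` passed to the launcher can be as small as the Trainer example above. A minimal sketch (dataset and hyperparameters are illustrative; device placement is assumed to be handled internally via 🤗 Accelerate, which this package depends on):

```python
# train.py: a minimal script for `accelerate launch` (a sketch)
import torch
from torch.utils.data import TensorDataset
from mc_dropout_pytorch import BayesianMLP, Trainer

X = torch.randn(1000, 4)
y = (X @ torch.tensor([2., 1., -1., 0.])).unsqueeze(-1)

model = BayesianMLP(input_dim = 4, output_dim = 1, hidden_dims = (128, 128), dropout_rate = 0.1)

Trainer(model, TensorDataset(X, y), task = 'regression').train()
```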

## Key ideas from the paper

**The insight (§3)**: Training a NN with dropout and L2 regularisation minimises a KL divergence to the posterior of a deep Gaussian process — no variational EM, no weight sampling required.
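
Concretely, the objective being minimised is just the familiar dropout loss, restated here in this README's notation; the paper shows it matches, up to constants, a variational lower bound for the deep GP:

```
L_dropout ∝ (1/N) Σ_n ‖y_n − ŷ_n‖² + λ Σ_i ( ‖W_i‖² + ‖b_i‖² )
```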

**Test-time dropout (MC Dropout)**:

```
for t = 1 … T:
    ŷ_t = f^ω_t(x)    # ω_t ~ q(ω)  via Bernoulli dropout

E[y*]   ≈ (1/T) Σ ŷ_t                                  # predictive mean
Var[y*] ≈ τ⁻¹ I + (1/T) Σ ŷ_t ŷ_tᵀ − E[y*] E[y*]ᵀ      # predictive variance  (Eq. 9)
```

**Active learning** (§6): Use `mc.mutual_information(x)` to identify the most informative unlabelled points — pure epistemic uncertainty, disentangled from aleatoric noise.
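
In practice an acquisition step is just a top-k over these scores. A sketch, where the unlabelled pool `x_pool` and the query size are hypothetical:

```python
scores = mc.mutual_information(x_pool)     # (pool_size,) epistemic uncertainty
query_idx = scores.topk(k = 10).indices    # send these points to the oracle for labelling
```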

**Weight correspondence** (§3.2):

| Dropout training          | Bayesian GP posterior    |
|---------------------------|--------------------------|
| dropout probability `p`   | variational parameter    |
| L2 weight decay `λ`       | prior precision          |
| noise precision `τ`       | `τ = l² (1 − p) / (2 N λ)`, with `l` the prior length-scale and `N` the dataset size |
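
Given these identities, the noise precision can be backed out from training hyperparameters. A sketch, assuming `p` is the dropout probability (so `1 − p` is the keep probability) and the prior length-scale `l` is a user-chosen hyperparameter:

```python
def tau_from_hyperparams(p_drop, weight_decay, N, length_scale = 1.0):
    # τ = l² (1 − p) / (2 N λ)
    return (length_scale ** 2) * (1. - p_drop) / (2. * N * weight_decay)

tau = tau_from_hyperparams(p_drop = 0.1, weight_decay = 1e-4, N = 1000)   # pass as `tau = ...`
```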

## Citations

```bibtex
@inproceedings{Gal2016Dropout,
    title     = {Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning},
    author    = {Yarin Gal and Zoubin Ghahramani},
    booktitle = {Proceedings of the 33rd International Conference on Machine Learning (ICML)},
    year      = {2016},
    url       = {https://arxiv.org/abs/1506.02142}
}
```
