Metadata-Version: 2.4
Name: mlx-taef
Version: 0.1.1
Summary: Tiny AutoEncoders for diffusion (TAESD family) on Apple MLX.
Project-URL: Homepage, https://github.com/ionden/mlx-taef
Project-URL: Documentation, https://ionden.github.io/mlx-taef
Project-URL: Issues, https://github.com/ionden/mlx-taef/issues
Project-URL: Changelog, https://github.com/ionden/mlx-taef/blob/main/CHANGELOG.md
Author-email: Denis Ineshin <denis.ineshin@gmail.com>
License-Expression: MIT
Keywords: apple-silicon,autoencoder,flux,mlx,stable-diffusion,taesd
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: huggingface-hub>=0.24
Requires-Dist: mlx>=0.20
Requires-Dist: numpy>=1.26
Requires-Dist: safetensors>=0.4
Provides-Extra: image
Requires-Dist: pillow>=10.0; extra == 'image'
Provides-Extra: mflux
Requires-Dist: mflux>=0.17; extra == 'mflux'
Description-Content-Type: text/markdown

# mlx-taef

[![PyPI version](https://img.shields.io/pypi/v/mlx-taef.svg)](https://pypi.org/project/mlx-taef/)
[![Python versions](https://img.shields.io/pypi/pyversions/mlx-taef.svg)](https://pypi.org/project/mlx-taef/)
[![License: MIT](https://img.shields.io/pypi/l/mlx-taef.svg)](https://github.com/IonDen/mlx-taef/blob/main/LICENSE)

Tiny AutoEncoders for diffusion latents on Apple Silicon, in pure MLX.

`mlx-taef` is the first MLX port of the TAESD family: TAESD (SD1.x), TAESDXL (SDXL), TAEF1 (FLUX.1), and TAEF2 (FLUX.2 Klein). These distilled mini-autoencoders decode diffusion latents to RGB in milliseconds, using a model of a few MB instead of a multi-GB full VAE.

Use it for:
- **Live previews** during long generations on Mac — see each step refresh in <100 ms instead of waiting 30 s for the full VAE.
- **Low-memory fallbacks** when the full VAE OOMs on 16 GB Macs (TAEF2 peaks at ~1 GB for 1024×1024 vs ~9.6 GB for the full Flux VAE).
- **Quick latent inspection** in notebooks and ML research.

```python
import mlx.core as mx
from mlx_taef import TAEF2

# `latents` is an mx.array of FLUX.2 latents produced by your diffusion pipeline.
taef = TAEF2.from_pretrained()              # downloads + converts on first call
img = taef.decode(latents)                  # NHWC float in [0, 1]
img_uint8 = taef.decode_image(latents)      # uint8 NHWC ready for PIL
```
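
The two decode outputs differ only in value range and dtype. As a rough sketch of that relationship, the float output could be converted with NumPy as shown here. This is an assumption about what `decode_image` does, not the library's actual code, and `to_uint8` is a hypothetical helper name.

```python
import numpy as np


def to_uint8(img: np.ndarray) -> np.ndarray:
    """Convert a float NHWC image in [0, 1] to uint8 in [0, 255].

    Hypothetical helper: mlx-taef itself may clip or round differently.
    """
    clipped = np.clip(img, 0.0, 1.0)  # guard against slight over/undershoot
    return np.round(clipped * 255.0).astype(np.uint8)


# One 1x2 RGB "image" (shape 1x1x2x3) with out-of-range values to show clipping.
demo = np.array([[[[-0.5, 0.0, 1.0], [1.5, 1.0, 0.0]]]], dtype=np.float32)
print(to_uint8(demo).tolist())
```

An MLX array can be turned into a NumPy array with `np.array(...)` before a conversion like this; with the `image` extra installed, a single `HWC` slice of the uint8 result can then be handed to `PIL.Image.fromarray`.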

## Install

From PyPI:

```bash
pip install mlx-taef
# With the mflux preview callback:
pip install "mlx-taef[mflux]"
```

Or with `uv`:

```bash
uv add mlx-taef
# With mflux:
uv add "mlx-taef[mflux]"
```

Pin an exact version in a project that needs reproducibility:

```bash
pip install "mlx-taef==0.1.0"
```

Verify the install:

```bash
mlx-taef --help
```
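
The installed version can also be checked from Python, which is handy in notebooks. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, not part of `mlx-taef`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if mlx-taef is installed, otherwise None.
print(installed_version("mlx-taef"))
```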

Requires Python ≥ 3.11 and Apple Silicon (`mlx` itself is Apple-Silicon-only). The runtime install has **no PyTorch dependency**: `torch` is a dev-only dependency, used solely to generate fixtures for the test suite.
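
A script that may run on mixed hardware can check these requirements before importing the library. The following is a minimal sketch using only the standard library; both function names are hypothetical. Note that a Python interpreter running under Rosetta may report `x86_64` even on an Apple Silicon machine.

```python
import platform
import sys


def is_apple_silicon() -> bool:
    """True when the interpreter runs natively on macOS on an arm64 CPU."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


def meets_python_requirement(version_info=sys.version_info) -> bool:
    """True when the (major, minor) version satisfies Requires-Python >= 3.11."""
    return tuple(version_info[:2]) >= (3, 11)


if not (is_apple_silicon() and meets_python_requirement()):
    print("mlx-taef needs Python >= 3.11 on Apple Silicon")
```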

## Variants

| Variant | latent_channels | For | HF source |
|---|---|---|---|
| `TAESD` | 4 | Stable Diffusion 1.x | [madebyollin/taesd](https://huggingface.co/madebyollin/taesd) |
| `TAESDXL` | 4 | Stable Diffusion XL | [madebyollin/taesdxl](https://huggingface.co/madebyollin/taesdxl) |
| `TAEF1` | 16 | FLUX.1 | [madebyollin/taef1](https://huggingface.co/madebyollin/taef1) |
| `TAEF2` | 32 | FLUX.2 Klein | [madebyollin/taef2](https://huggingface.co/madebyollin/taef2) |

All four share one API.
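
Because the variants differ mainly in latent channel count, a mismatch between variant and latents is an easy mistake to make. The sketch below encodes the table above as a lookup and checks a latent shape against it. It assumes a channels-last layout, and both `LATENT_CHANNELS` and `check_latent_channels` are hypothetical names, not part of the library.

```python
# Channel counts copied from the variants table above.
LATENT_CHANNELS = {"TAESD": 4, "TAESDXL": 4, "TAEF1": 16, "TAEF2": 32}


def check_latent_channels(variant: str, shape: tuple) -> None:
    """Raise ValueError if the last axis does not match the variant's channels.

    Assumes channels-last latents (an assumption, not confirmed by the docs).
    """
    expected = LATENT_CHANNELS[variant.upper()]
    if shape[-1] != expected:
        raise ValueError(
            f"{variant} expects {expected} latent channels, got {shape[-1]}"
        )


check_latent_channels("taef2", (1, 64, 64, 32))  # passes silently
```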

## Benchmarks (M1 Max, fp16)

| Metric | TAEF2 (this library) | Full Flux VAE (reference) | Improvement |
|---|---|---|---|
| Decode latency 1024×1024 | **~100 ms** | seconds | 50–100× |
| Peak unified memory 1024×1024 | **~1 GB** | ~9.6 GB | **~9.4×** |
| Output cosine sim vs PyTorch reference | > 0.999 | — | (parity verified) |

Numbers from `tests/test_perf.py` on M1 Max 32 GB. See `notes/phase1-benchmarks.md` for details.
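
To get comparable latency numbers on other hardware, a small timing harness is enough. The sketch below is not the method used in `tests/test_perf.py`; `time_call` is a hypothetical helper that reports the median wall-clock time after a few warm-up calls.

```python
import time
from statistics import median


def time_call(fn, warmup: int = 2, iters: int = 10) -> float:
    """Return the median wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are discarded (caches, kernel compilation)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


# Stand-in workload. With MLX you would time something like
#   lambda: mx.eval(taef.decode(latents))
# because MLX is lazy and only computes when the result is evaluated.
print(f"{time_call(lambda: sum(range(100_000))):.3f} ms")
```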

## mflux live previews

```python
from mflux.models.flux2 import Flux2Klein
from mlx_taef.integrations.mflux import LivePreviewCallback

model = Flux2Klein.from_pretrained("4bit")
preview = LivePreviewCallback(
    variant="taef2",
    every=5,
    save_to="preview.png",
    latent_height=32,  # 512 / 16
    latent_width=32,
)
model.callbacks.register(preview)
model.generate_image(
    prompt="a red apple on a wooden table",
    num_inference_steps=25,
    width=512,
    height=512,
    seed=42,
)
```
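
The `latent_height` and `latent_width` values in the example follow from the image size divided by 16, as its inline comment notes. A small helper makes that arithmetic explicit for other resolutions. `latent_size` is a hypothetical name, and the factor of 16 is taken from that comment rather than from the library's API.

```python
def latent_size(width: int, height: int, factor: int = 16) -> tuple[int, int]:
    """Return (latent_height, latent_width) for an image of the given pixel size."""
    if width % factor or height % factor:
        raise ValueError(f"width and height must be multiples of {factor}")
    return height // factor, width // factor


latent_height, latent_width = latent_size(width=512, height=512)
print(latent_height, latent_width)
```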

For exact value-space recovery, also pass the batch-norm statistics of the full FLUX.2 VAE (here `flux2_vae`) to the callback: `bn_mean=flux2_vae.bn.running_mean, bn_var=flux2_vae.bn.running_var`. Without them, previews show the correct structure, but colors may shift.
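
To see why these statistics affect color rather than structure, consider how batch-norm running statistics are conventionally undone: a per-channel scale and shift. The sketch below illustrates that convention in plain Python. It is an assumption about how the callback applies `bn_mean` and `bn_var`, the `eps` default is likewise assumed, and `denormalize` is a hypothetical name.

```python
import math


def denormalize(latent, bn_mean, bn_var, eps=1e-5):
    """Undo a batch-norm style normalization for one latent vector.

    Each channel is scaled by sqrt(var + eps) and shifted by the mean.
    Skipping this step leaves spatial structure intact but changes
    per-channel scale and offset, which shows up as a color shift.
    """
    return [
        z * math.sqrt(v + eps) + m
        for z, m, v in zip(latent, bn_mean, bn_var)
    ]


# Two channels with made-up statistics, eps disabled for readable numbers.
print(denormalize([0.0, 1.0], bn_mean=[0.5, -1.0], bn_var=[1.0, 4.0], eps=0.0))
```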

See `docs/manual-verification.md` for the full verification recipe.

## Status

- **v0.1.0 — initial public release on PyPI** (2026-05-13). All four variants, encoder + decoder, mflux integration, CI, 99 % honest coverage.

Track future releases via the [PyPI history](https://pypi.org/project/mlx-taef/#history) or `gh release list -R IonDen/mlx-taef`.

## License

MIT, mirroring the license of upstream [madebyollin/taesd](https://github.com/madebyollin/taesd). Pretrained weights belong to their respective authors (madebyollin).

## Acknowledgements

- [madebyollin](https://github.com/madebyollin) for the upstream TAESD-family models and weights.
- [Apple ML Explore](https://github.com/ml-explore/mlx) for MLX.
- [filipstrand/mflux](https://github.com/filipstrand/mflux) for the MLX-native FLUX runner this library integrates with.
