Metadata-Version: 2.4
Name: fp4-fp8-for-torch-mps
Version: 1.0.0
Summary: FP8 and FP4 sub-byte dtype support for PyTorch MPS on Apple Silicon via Metal shaders
Project-URL: Repository, https://github.com/AppMana/mps-fp8-for-torch-and-comfyui-python-package
Author-email: doctorpangloss <2229300+doctorpangloss@users.noreply.github.com>
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: torch>=2.9
Description-Content-Type: text/markdown

Registers FP8 (`float8_e4m3fn`, `float8_e5m2`) and FP4 (`float4_e2m1fn_x2`) support for PyTorch's MPS backend on Apple Silicon. Once installed, `import torch` auto-loads the extension via the `torch.backends` entry point, so `tensor.to(torch.float8_e4m3fn)`, `torch._scaled_mm`, and `tensor.copy_` work transparently on MPS through Metal shader kernels dispatched via `torch.mps.compile_shader`. The FP8 encode is tested against all 254 representable values and their midpoints and matches CPU PyTorch byte-for-byte; the FP4 decode is verified exhaustively against all 256 packed byte patterns. CI runs 80 tests on macOS MPS hardware.

```
uv pip install git+https://github.com/AppMana/mps-fp8-for-torch-and-comfyui-python-package.git
```
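
For orientation, the bit layouts behind `float8_e4m3fn` and `float4_e2m1fn_x2` can be sketched in plain Python. This is an illustrative reference decoder only, not the package's Metal kernels or its test suite. The helper names (`decode_e4m3fn`, `decode_e2m1`, `decode_fp4_x2`) are hypothetical, and the nibble order used for the packed FP4 byte (low nibble first) is an assumption that should be checked against PyTorch before relying on it.

```python
import math


def decode_e4m3fn(byte: int) -> float:
    """Decode one float8_e4m3fn bit pattern: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        # The "fn" variant has no infinities; only this pattern (per sign) is NaN.
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of 1 - bias = -6.
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


# Magnitudes of the 3-bit E2M1 payload (2 exponent bits, 1 mantissa bit, bias 1).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_e2m1(nibble: int) -> float:
    """Decode one 4-bit E2M1 value; bit 3 is the sign."""
    mag = E2M1_MAGNITUDES[nibble & 0x7]
    return -mag if nibble & 0x8 else mag


def decode_fp4_x2(byte: int) -> tuple[float, float]:
    """Decode a packed byte holding two E2M1 values.

    ASSUMPTION: the low nibble is the first element and the high nibble the second.
    """
    return decode_e2m1(byte & 0xF), decode_e2m1(byte >> 4)


# Every bit pattern that is not NaN (+0 and -0 counted separately).
finite = [b for b in range(256) if not math.isnan(decode_e4m3fn(b))]
print(len(finite))                            # 254
print(max(decode_e4m3fn(b) for b in finite))  # 448.0
print(decode_fp4_x2(0x72))                    # (1.0, 6.0)
```

The count of 254 non-NaN bit patterns lines up with the "254 representable values" figure above, and iterating `decode_fp4_x2` over `range(256)` covers every packed byte pattern.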
