Metadata-Version: 2.4
Name: mono-quant
Version: 1.0.1
Summary: Simple, reliable model quantization with minimal dependencies
Author: thatAverageGuy
License: MIT
Project-URL: Homepage, https://github.com/thatAverageGuy/mono-quant
Project-URL: Repository, https://github.com/thatAverageGuy/mono-quant
Project-URL: Documentation, https://thataverageguy.github.io/mono-quant
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: safetensors>=0.3
Requires-Dist: tqdm>=4.66
Requires-Dist: click>=8.1
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.6.0; extra == "docs"
Requires-Dist: mkdocs-material>=9.7.0; extra == "docs"
Requires-Dist: mkdocstrings[python]>=1.0.0; extra == "docs"

# Mono Quant

**Ultra-lightweight, model-agnostic quantization for PyTorch**

[![PyPI Version](https://img.shields.io/pypi/v/mono-quant)](https://pypi.org/project/mono-quant/)
[![Python Version](https://img.shields.io/pypi/pyversions/mono-quant)](https://pypi.org/project/mono-quant/)
[![License](https://img.shields.io/github/license/thatAverageGuy/mono-quant)](https://github.com/thatAverageGuy/mono-quant/blob/main/LICENSE)
[![Documentation](https://img.shields.io/badge/docs-mkdocs-blue)](https://thataverageguy.github.io/mono-quant)

## What is Mono Quant?

Mono Quant is a simple, reliable model quantization package for PyTorch with minimal dependencies: `torch`, `numpy`, and a handful of small utilities (`safetensors`, `tqdm`, `click`). No framework lock-in, no bloat.

### Key Features

- **Model-Agnostic** - Works with any PyTorch model: HuggingFace, local, or custom
- **Multiple Modes** - INT8, INT4, and FP16 quantization
- **Flexible Calibration** - Dynamic (no data) or static (with calibration data)
- **Robust Validation** - SQNR metrics, size comparison, and accuracy warnings
- **Dual Interface** - Python API for automation, CLI for CI/CD
- **Build-Phase Only** - Quantize during build, deploy lightweight models
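
The SQNR (signal-to-quantization-noise ratio) named above is a standard metric: the power of the original signal divided by the power of the quantization error, on a log scale. As a minimal sketch of how such a value can be computed for any pair of tensors (illustrative only, not Mono Quant's internal code):

```python
import torch

def sqnr_db(original: torch.Tensor, quantized: torch.Tensor) -> float:
    """Signal-to-quantization-noise ratio in decibels (higher is better)."""
    noise = original - quantized
    signal_power = original.pow(2).mean()
    noise_power = noise.pow(2).mean()
    return float(10 * torch.log10(signal_power / noise_power))

w = torch.randn(256, 256)
w_deq = torch.round(w / 0.05) * 0.05  # crude uniform quantization, step 0.05
print(f"SQNR: {sqnr_db(w, w_deq):.2f} dB")
```

Values in the tens of dB indicate the quantized weights track the originals closely; a low or negative SQNR is a signal to use more bits or static calibration.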

## Installation

```bash
pip install mono-quant
```

### Requirements

- Python 3.11 or higher
- PyTorch 2.0 or higher
- NumPy 1.24 or higher
- `safetensors`, `tqdm`, and `click` (installed automatically)

## Quick Start

### Python API

```python
from mono_quant import quantize

# Dynamic INT8 quantization (no calibration data needed)
result = quantize(model, bits=8, dynamic=True)

# Save the quantized model
result.save("model_quantized.pt")

# Check metrics
print(f"Compression: {result.info.compression_ratio:.2f}x")
print(f"SQNR: {result.info.sqnr_db:.2f} dB")
```

### CLI

```bash
# Dynamic quantization
monoquant quantize --model model.pt --bits 8 --dynamic

# With custom output path
monoquant quantize --model model.pt --bits 8 --output model_quantized.pt
```

## Quantization Modes

### Dynamic Quantization (Fastest, No Data)

```python
result = quantize(model, bits=8, dynamic=True)
```
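
Dynamic quantization needs no data because the quantization parameters are derived from the weights themselves at quantize time. The standard per-tensor symmetric INT8 scheme looks roughly like this (a sketch of the general technique, not Mono Quant's internals):

```python
import torch

def quantize_symmetric_int8(w: torch.Tensor):
    """Per-tensor symmetric INT8: map [-max|w|, +max|w|] onto [-127, 127]."""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

w = torch.randn(64, 64)
q, scale = quantize_symmetric_int8(w)
w_hat = dequantize(q, scale)  # per-element error is bounded by scale / 2
```

Because the scale comes straight from the weight tensor, no forward passes or calibration batches are required.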

### Static Quantization (Best Accuracy, Requires Data)

```python
calibration_data = [torch.randn(1, 3, 224, 224) for _ in range(150)]

result = quantize(
    model,
    bits=8,
    dynamic=False,
    calibration_data=calibration_data
)
```
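
Static quantization uses the calibration batches to observe the range of values seen at runtime and fix the quantization parameters ahead of time. A simplified sketch of range observation for one tensor, using an asymmetric scheme (illustrative only; `RangeObserver` is a hypothetical name, not part of the package API):

```python
import torch

class RangeObserver:
    """Tracks the running min/max of tensors seen during calibration."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, x: torch.Tensor) -> None:
        self.lo = min(self.lo, x.min().item())
        self.hi = max(self.hi, x.max().item())

    def qparams(self, bits: int = 8):
        """Asymmetric scale/zero-point mapping [lo, hi] onto [0, 2**bits - 1]."""
        qmax = 2 ** bits - 1
        scale = (self.hi - self.lo) / qmax
        zero_point = round(-self.lo / scale)
        return scale, zero_point

obs = RangeObserver()
for batch in (torch.randn(1, 3, 224, 224) for _ in range(10)):
    obs.observe(batch)
scale, zero_point = obs.qparams(bits=8)
```

More (and more representative) calibration samples give tighter observed ranges, which is why static mode typically beats dynamic mode on accuracy.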

### INT4 Quantization (Maximum Compression)

```python
result = quantize(
    model,
    bits=4,
    dynamic=False,
    calibration_data=calibration_data,
    group_size=128  # Default
)
```
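
With only 16 representable levels, a single scale per tensor would be far too coarse for INT4; `group_size=128` instead gives each group of 128 consecutive weights its own scale, keeping the 4-bit grid locally accurate. A sketch of group-wise symmetric INT4 (an illustration of the general technique; the package's actual memory layout may differ):

```python
import torch

def quantize_int4_grouped(w: torch.Tensor, group_size: int = 128):
    """Symmetric 4-bit quantization with one scale per group of weights.

    The symmetric INT4 range used here is [-7, 7]; a per-group scale
    adapts the grid to each group's magnitude rather than the tensor's.
    Assumes w.numel() is divisible by group_size.
    """
    flat = w.reshape(-1, group_size)
    scales = flat.abs().max(dim=1, keepdim=True).values / 7.0
    q = torch.clamp(torch.round(flat / scales), -7, 7).to(torch.int8)
    return q, scales

def dequantize_grouped(q: torch.Tensor, scales: torch.Tensor, shape):
    return (q.to(torch.float32) * scales).reshape(shape)

w = torch.randn(512, 256)  # numel divisible by group_size
q, scales = quantize_int4_grouped(w)
w_hat = dequantize_grouped(q, scales, w.shape)
```

Smaller groups mean finer-grained scales and better accuracy at the cost of more scale metadata; 128 is a common default trade-off.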

## Documentation

Full documentation is available at **[https://thataverageguy.github.io/mono-quant](https://thataverageguy.github.io/mono-quant)**

- [Installation Guide](https://thataverageguy.github.io/mono-quant/getting-started/installation/)
- [Quick Start](https://thataverageguy.github.io/mono-quant/getting-started/quickstart/)
- [User Guide](https://thataverageguy.github.io/mono-quant/user-guide/modes/)
- [CLI Reference](https://thataverageguy.github.io/mono-quant/cli/)
- [API Reference](https://thataverageguy.github.io/mono-quant/api/)

## Why Mono Quant?

Most quantization tools are tied to specific frameworks (HuggingFace, TFLite) or require heavy dependencies. Mono Quant fills the niche of **"just quantize the weights, nothing else."**

### Design Philosophy

| Aspect | Approach |
|--------|----------|
| **Model Loading** | You load the model, we quantize it |
| **Dependencies** | torch, numpy, and a few small utilities |
| **Use Case** | Build-phase (CI/CD, local development) |
| **Scope** | Quantization only, no runtime or serving |

## License

MIT License - see [LICENSE](https://github.com/thatAverageGuy/mono-quant/blob/main/LICENSE) for details.

## Contributing

Contributions welcome! Please see [CONTRIBUTING.md](https://thataverageguy.github.io/mono-quant/about/contributing/) for guidelines.

## Links

- **GitHub:** https://github.com/thatAverageGuy/mono-quant
- **PyPI:** https://pypi.org/project/mono-quant/
- **Documentation:** https://thataverageguy.github.io/mono-quant
- **Issues:** https://github.com/thatAverageGuy/mono-quant/issues
