Metadata-Version: 2.4
Name: audiotq
Version: 0.1.1
Summary: A high-performance lossy audio compression engine inspired by LLM weight quantization
Author: Lost Martian
License: GPL-3.0-only
License-File: LICENSE
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Multimedia :: Sound/Audio :: Conversion
Requires-Python: >=3.11
Requires-Dist: numpy>=2.4.6
Requires-Dist: scipy>=1.17.1
Description-Content-Type: text/markdown

<p align="center">
  <img src="https://raw.githubusercontent.com/lostmartian/audioTQ/main/docs/image.png" alt="TurboQuant Audio Banner" width="800">
</p>

<p align="center">
  <a href="https://github.com/lostmartian/audioTQ/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-GPL%20v3-blue.svg" alt="License"></a>
  <img src="https://img.shields.io/badge/Version-0.1.1-orange.svg" alt="Version">
  <img src="https://img.shields.io/badge/Python-3.11%2B-blue.svg" alt="Python">
</p>

A minimal, high-performance lossy audio compression engine built in Python and NumPy. 

TQA (TurboQuant Audio) adapts the mathematical principles of [TurboQuant](https://arxiv.org/pdf/2504.19874) (originally designed for data-oblivious weight quantization of LLMs on GPU clusters) into a localized, cache-aligned CPU audio codec. Hadamard rotations map each block of audio amplitudes onto a zero-centered, approximately Gaussian distribution, which lets a fixed codebook achieve a ~3.9x memory reduction while preserving transient fidelity.
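
As a rough illustration of the rotation step, the sketch below applies a random sign flip followed by an orthonormal fast Walsh-Hadamard transform to a block containing a single spike. The function name `fwht` and the `signs` vector are assumptions made for this example and are not taken from the `audiotq` source.

```python
import numpy as np

def fwht(x: np.ndarray) -> np.ndarray:
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    out = np.asarray(x, dtype=np.float64).copy()
    n = out.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: view the data as pairs of length-h halves.
        pairs = out.reshape(-1, 2, h)
        top = pairs[:, 0, :] + pairs[:, 1, :]
        bottom = pairs[:, 0, :] - pairs[:, 1, :]
        pairs[:, 0, :] = top
        pairs[:, 1, :] = bottom
        h *= 2
    return out / np.sqrt(n)

rng = np.random.default_rng(0)
signs = rng.choice([-1.0, 1.0], size=512)  # hypothetical random sign pattern

block = np.zeros(512)
block[100] = 1.0                           # a single transient spike

rotated = fwht(signs * block)

print("peak before rotation:", np.abs(block).max())
print("peak after rotation: ", np.abs(rotated).max())
print("energy preserved:    ", np.isclose(np.sum(rotated**2), np.sum(block**2)))
```

The spike's energy ends up spread evenly over all 512 coefficients, and because the normalized transform is its own inverse, applying `fwht` again and multiplying by `signs` recovers the original block.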

---

## Features

- **Data-Oblivious Energy Flattening**: Spreads transient spike energy uniformly across audio blocks using Fast Walsh-Hadamard Transform (FWHT) rotations.
- **Optimal Centroid Clustering**: Employs an iterative Lloyd-Max solver to converge on the Mean-Squared-Error (MSE) optimal 6-bit codebook.
- **1-Bit QJL Residual Layer**: Applies a 1-bit Quantized Johnson-Lindenstrauss (QJL) sign layer to the quantization residual to track rounding errors and suppress distortion.
- **Minimal Dependencies**: Built on standard Python with only NumPy and SciPy as runtime dependencies.
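
To make the codebook idea concrete, here is a minimal sketch of a Lloyd-Max iteration that fits 64 centroids (6 bits) to standard-normal samples. It is an illustration under stated assumptions, not the package's solver: the names `lloyd_max` and `quantize`, the quantile initialization, and the fixed iteration count are all hypothetical.

```python
import numpy as np

def lloyd_max(samples, n_levels=64, n_iter=50):
    """Fit a scalar codebook that locally minimizes MSE on the given samples."""
    samples = np.sort(np.asarray(samples, dtype=np.float64))
    # Start from evenly spaced quantiles so every cell is populated.
    centroids = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        # Nearest-neighbour rule: cell boundaries are centroid midpoints.
        edges = 0.5 * (centroids[:-1] + centroids[1:])
        idx = np.searchsorted(edges, samples)
        # Centroid rule: move each centroid to the mean of its cell.
        sums = np.bincount(idx, weights=samples, minlength=n_levels)
        counts = np.bincount(idx, minlength=n_levels)
        nonempty = counts > 0
        centroids[nonempty] = sums[nonempty] / counts[nonempty]
    return centroids

def quantize(x, centroids):
    """Map each value to the index of its nearest centroid."""
    edges = 0.5 * (centroids[:-1] + centroids[1:])
    return np.searchsorted(edges, x)

rng = np.random.default_rng(0)
samples = rng.standard_normal(20_000)

codebook = lloyd_max(samples)            # 64 levels = 6 bits per sample
indices = quantize(samples, codebook)
mse = float(np.mean((samples - codebook[indices]) ** 2))
print(f"levels: {codebook.size}, quantization MSE: {mse:.5f}")
```

Because the rotated, standardized blocks are assumed to be close to Gaussian, a codebook like this can be fitted once and reused for every block instead of being stored per file.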


## Installation

Install the package directly from PyPI:

```bash
pip install audiotq
```

## Quick Start (Python Library Usage)

You can easily integrate `audiotq` into your own Python audio processing pipelines:

```python
import numpy as np
from audiotq import TurboAudioEngine

# 1. Initialize the codec engine
engine = TurboAudioEngine(block_size=512)

# 2. Prepare your floating-point audio signal (normalized between -1.0 and 1.0)
raw_signal = np.random.normal(0, 0.2, 8000).astype(np.float32)

# 3. Compress the signal
compressed_blocks, meta_scales = engine.compress_signal(raw_signal)

# 4. Decompress back to audio amplitudes
reconstructed_signal = engine.decompress_signal(compressed_blocks, meta_scales)
```
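
Real recordings are often stored as 16-bit PCM rather than normalized floats. The helpers below are a small sketch of one common conversion convention; `pcm16_to_float` and `float_to_pcm16` are hypothetical names and are not part of the `audiotq` API.

```python
import numpy as np

def pcm16_to_float(pcm) -> np.ndarray:
    """Scale int16 PCM samples into float32 values in [-1.0, 1.0)."""
    return np.asarray(pcm, dtype=np.int16).astype(np.float32) / 32768.0

def float_to_pcm16(signal) -> np.ndarray:
    """Clip to [-1.0, 1.0] and convert back to int16 PCM."""
    clipped = np.clip(np.asarray(signal, dtype=np.float64), -1.0, 1.0)
    return np.round(clipped * 32767.0).astype(np.int16)

pcm = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)
normalized = pcm16_to_float(pcm)   # ready to pass to a float-based codec
restored = float_to_pcm16(normalized)
print(normalized)
print(restored)
```

The two scale factors differ (32768 versus 32767) so that full-scale negative input maps to exactly -1.0 while output never overflows the int16 range.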

## Command Line Interface (CLI)

The package installs two command-line entry points, `tqa-cli` and `tqa-sim`:

### 1. Compress and Decompress Audio
Process any standard `.wav` audio track end-to-end:
```bash
tqa-cli run -i input.wav -o output_reconstructed.wav
```

### 2. Compare Reconstructed Audio
Extract mathematical fidelity metrics (MSE, SQNR, correlation, and envelope preservation) between raw and processed signals:
```bash
tqa-cli compare -f1 input.wav -f2 output_reconstructed.wav
```
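
For readers who want to see what such metrics look like in code, the following sketch computes MSE, SQNR, and Pearson correlation with the textbook definitions. It is not the `tqa-cli compare` implementation, the function name `fidelity_metrics` is hypothetical, and the envelope-preservation metric is not covered here.

```python
import numpy as np

def fidelity_metrics(reference, reconstructed) -> dict:
    """Compare two signals using textbook MSE, SQNR (dB), and correlation."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    n = min(ref.size, rec.size)          # tolerate a length mismatch
    ref, rec = ref[:n], rec[:n]

    noise = ref - rec
    mse = float(np.mean(noise**2))
    signal_power = float(np.mean(ref**2))
    # SQNR is the ratio of signal power to quantization-noise power, in dB.
    sqnr_db = float("inf") if mse == 0 else float(10 * np.log10(signal_power / mse))
    correlation = float(np.corrcoef(ref, rec)[0, 1])
    return {"mse": mse, "sqnr_db": sqnr_db, "correlation": correlation}

t = np.arange(4410) / 44_100
reference = 0.5 * np.sin(2 * np.pi * 440 * t)
degraded = 0.9 * reference               # 10% amplitude error, so SQNR is 20 dB

metrics = fidelity_metrics(reference, degraded)
print(metrics)
```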

### 3. Run Synthetic Simulations
Generate custom synthetic signals (e.g., sine waves, square waves, noise, transients) with custom parameters:
```bash
tqa-sim --type square --frequency 440 --duration 2.0 --spikes 5
```

## Performance Benchmarks

The figures below were measured on a 15-second test signal sampled at 44.1 kHz:

| Metric | Performance Profile |
| --- | --- |
| **Original Dataset Size** | 2.52 MB (15.0 seconds) |
| **Compressed On-Disk Footprint** | 0.65 MB |
| **Compression Ratio** | **~3.91x smaller footprint (74.4% reduction)** |
| **Fidelity (SQNR)** | **30.24 dB** |
| **Compression Throughput** | ~1.32 MB/s |
| **Decompression Throughput** | ~1.35 MB/s |
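
The size figures in the table can be cross-checked with a few lines of arithmetic. This sketch assumes the original signal is mono 32-bit float and that "MB" means 2^20 bytes; neither assumption is stated in the table, but together they reproduce the reported numbers.

```python
sample_rate = 44_100        # Hz, from the benchmark description
duration_s = 15.0           # seconds, from the table
bytes_per_sample = 4        # assumption: mono float32 input
ratio = 3.91                # compression ratio, from the table

original_bytes = sample_rate * duration_s * bytes_per_sample
original_mib = original_bytes / 2**20          # about 2.52
compressed_mib = original_mib / ratio          # about 0.65
reduction = 1.0 - 1.0 / ratio                  # about 74.4%
bits_per_sample = 8 * bytes_per_sample / ratio # about 8.2 bits

print(f"original:   {original_mib:.2f} MiB")
print(f"compressed: {compressed_mib:.2f} MiB")
print(f"reduction:  {reduction:.1%}")
print(f"effective bits per sample: {bits_per_sample:.2f}")
```

Under these assumptions the effective rate works out to a little over 8 bits per sample, including any per-block metadata.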

## Known Limitations & Failure Modes

### 1. WHT Basis Alignment (Extreme Sparsity Failure)
* **Failure Scenario**: If an input block aligns exactly with one of the Walsh-Hadamard basis vectors (e.g., `signal = rotator.hadamardSigns`), the rotated vector collapses into a single extreme spike (a Kronecker delta).
* **Result**: Because the Lloyd-Max codebook is optimized for normal distributions, it clips this spike to the outermost centroid ($\pm 2.41$ standard deviations). The resulting clipping noise destroys reconstruction quality, dropping the SQNR to **~1.31 dB**. (Demonstrated in `tests/test_failures.py::test_failure_hadamard_basis_alignment`).
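
The failure mode can be reproduced outside the engine with a plain Hadamard matrix. The sketch below is an illustration only: `hadamard_matrix` is a hypothetical helper, the row index is arbitrary, and standardizing after the rotation is an assumption about the processing order.

```python
import numpy as np

def hadamard_matrix(n: int) -> np.ndarray:
    """Orthonormal Sylvester-type Hadamard matrix (n must be a power of two)."""
    h = np.ones((1, 1))
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

n = 512
H = hadamard_matrix(n)

# A +/-1 block that matches one basis vector exactly.
block = H[37] * np.sqrt(n)

rotated = H @ block                      # all energy lands in one coefficient
standardized = (rotated - rotated.mean()) / rotated.std()

clip = 2.41                              # outermost centroid magnitude quoted above
clipped = np.clip(standardized, -clip, clip)

spike_count = int(np.count_nonzero(np.abs(rotated) > 1e-9))
print("non-zero coefficients:", spike_count)
print("spike height in standard deviations:", standardized.max())
print("height after clipping:", clipped.max())
```

With a block size of 512 the spike sits more than 20 standard deviations from the mean, so clamping it to 2.41 discards most of the block's energy.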

### 2. Silent Block Edge Case
* **Edge Case**: Silent blocks have a standard deviation of `0.0`. Dividing by this value during block standardization would lead to `NaN` or `Inf` errors.
* **Resolution**: The engine guards the division with a threshold check (`std_dev > 1e-6`). Silent blocks bypass normalization and are reconstructed as perfect silence. (Demonstrated in `tests/test_failures.py::test_boundary_silent_signal`).
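
A guard of this kind might look like the sketch below. Only the `1e-6` threshold comes from the description above; the function names and the choice to store a scale of `0.0` for silent blocks are assumptions for illustration.

```python
import numpy as np

def standardize_block(block, eps: float = 1e-6):
    """Return (normalized_block, scale); near-silent blocks skip the division."""
    block = np.asarray(block, dtype=np.float64)
    std_dev = float(np.std(block))
    if std_dev > eps:
        return block / std_dev, std_dev
    # Silent block: avoid dividing by ~0 and record a zero scale instead.
    return np.zeros_like(block), 0.0

def restore_block(normalized, scale):
    """Undo the standardization; a zero scale yields exact silence."""
    return np.asarray(normalized, dtype=np.float64) * scale

silent = np.zeros(512)
normalized, scale = standardize_block(silent)
restored = restore_block(normalized, scale)

print("scale:", scale)
print("contains NaN or Inf:", not np.all(np.isfinite(normalized)))
print("reconstructed as silence:", bool(np.all(restored == 0.0)))
```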

## License

This project is licensed under the GNU General Public License v3.0. See the [LICENSE](https://github.com/lostmartian/audioTQ/blob/main/LICENSE) file for details.
