Metadata-Version: 2.2
Name: omni-wst-core
Version: 1.0.3
Summary: OmniPulse WST/JTFS C++/CUDA perceptual hashing engine
Requires-Python: >=3.10
Requires-Dist: numpy>=1.26
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-benchmark; extra == "dev"
Description-Content-Type: text/markdown

# OmniPulse WST Core (`omni-wst-core`)
**GPU-Accelerated Wavelet Scattering & Joint Time-Frequency Scattering (JTFS) Primitives**

[![Build Status](https://img.shields.io/badge/build-passing-brightgreen)](#) [![CUDA](https://img.shields.io/badge/CUDA-11.0%2B-76B900)](#) [![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](#) [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](#)

`omni-wst-core` is a production-grade, optimized C++/CUDA extension module for Python. It provides high-throughput primitives for computing the **Wavelet Scattering Transform (WST)** and the **Joint Time-Frequency Scattering (JTFS)** transform.

Designed for formally grounded perceptual fingerprinting, robust signal representation, and transient anomaly detection, the module orchestrates dual-stream, double-buffered execution over pinned (`cudaMallocHost`) memory. This enables zero-copy NumPy ingestion and hides most host-to-device transfer latency behind compute, yielding throughput that general-purpose deep learning frameworks typically cannot match on this workload.

---

## 📑 Table of Contents
1. [Mathematical Formalism](#-mathematical-formalism)
2. [Architectural Implementation](#-architectural-implementation)
3. [Cross-Domain Use Cases](#-cross-domain-use-cases)
4. [Installation](#-installation)
5. [Usage Guide](#-usage-guide)
6. [Testing & Validation](#-testing--validation)
7. [License](#️-license)

---

## 🔬 Mathematical Formalism

The Wavelet Scattering Transform provides a rich, deformation-stable representation of high-frequency time-series data. It is structured like a deep convolutional network, but with the learned, opaque filters replaced by explicit, analytically defined wavelet filter banks.

### 1. The Scattering Cascade
Let $x(u)$ be an input signal. We construct a complex-valued analytic wavelet filter bank $\psi_\lambda(u)$ defined by dilations of a mother wavelet, alongside a low-pass scaling function $\phi_J(u)$ at scale $2^J$.

* **Zero-order** coefficients (the local average):
  $$S[0]x(u) = x * \phi_J(u)$$
* **First-order** coefficients are obtained by computing the wavelet transform and applying a complex modulus nonlinearity ($|\cdot|$) to discard rapidly varying phase:
  $$S[1]x(u, \lambda_1) = |x * \psi_{\lambda_1}| * \phi_J(u)$$
* **Depth-$m$** coefficients along a path $p = (\lambda_1, \dots, \lambda_m)$ are obtained by cascading this convolution-and-modulus operator $m$ times:
  $$S[p]x(u) = \big|\,\big| \dots |x * \psi_{\lambda_1}| * \psi_{\lambda_2} \big| \dots * \psi_{\lambda_m} \big| * \phi_J(u)$$
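For intuition, the cascade above can be sketched as a plain NumPy reference implementation. This is an illustrative sketch only, not the library's CUDA path; `morlet_bank` and `scatter` are hypothetical helpers, and the bank uses simplified Gaussian band-pass bumps in place of true Morlet wavelets:

```python
import numpy as np

def morlet_bank(N, J, Q):
    """Frequency-domain band-pass filters at dyadic center frequencies,
    plus a low-pass phi at scale 2**J. Gaussian bumps stand in for true
    analytic Morlet wavelets (sketch only)."""
    omegas = np.fft.fftfreq(N)                      # normalized frequencies
    psis = []
    for j in range(J * Q):
        xi = 0.4 * 2.0 ** (-j / Q)                  # center frequency
        sigma = xi / 2.0                            # bandwidth shrinks with xi
        psis.append(np.exp(-((omegas - xi) ** 2) / (2 * sigma ** 2)))
    phi = np.exp(-(omegas ** 2) * (2.0 ** J) ** 2)  # low-pass at scale 2**J
    return psis, phi

def scatter(x, psis, phi, depth=2):
    """Cascade u -> |u * psi| `depth` times; smooth every path with phi.
    All convolutions are circular, computed via FFT."""
    lowpass = lambda u: np.real(np.fft.ifft(np.fft.fft(u) * phi))
    coeffs = [lowpass(x)]                           # zero order: x * phi
    layer = [x]
    for _ in range(depth):
        nxt = []
        for u in layer:
            for p in psis:
                mod = np.abs(np.fft.ifft(np.fft.fft(u) * p))  # |u * psi|
                nxt.append(mod)
                coeffs.append(lowpass(mod))         # S[p]x for this path
        layer = nxt
    return np.stack(coeffs)
```

With `J=3, Q=1` and `depth=2` this yields $1 + 3 + 3^2 = 13$ coefficient signals, one per scattering path (the CUDA engine prunes non-decreasing frequency paths; this sketch keeps all of them for simplicity).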

### 2. Parseval Energy Conservation
To prevent the loss of informational energy across deep cascades (information collapse), the filter bank must satisfy a Parseval frame condition. Enforcing this during initialization of the `WSTEngine` ensures that the input signal's energy is partitioned across the scattering paths, with the residual vanishing as the cascade deepens:
$$\sum_{p} ||S[p]x||^2 = ||x||^2$$
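Concretely, a frequency-domain bank can be rescaled so that its Littlewood-Paley sum equals 1 at every bin, after which energy conservation of the filtering stage follows from Plancherel's theorem. A NumPy sketch (illustrative only; `parseval_normalize` is a hypothetical helper, not the package API):

```python
import numpy as np

def parseval_normalize(psis, phi):
    """Rescale a frequency-domain filter bank so that
    |phi(w)|^2 + sum_l |psi_l(w)|^2 == 1 at every bin -- a tight
    (Parseval) Littlewood-Paley condition."""
    lp = np.abs(phi) ** 2 + sum(np.abs(p) ** 2 for p in psis)
    scale = np.sqrt(lp)              # per-bin normalizer (assumes lp > 0)
    return [p / scale for p in psis], phi / scale

# Demo: build a toy Gaussian bank, normalize it, check energy conservation.
N = 128
omegas = np.fft.fftfreq(N)
psis = [np.exp(-((omegas - 0.4 * 2.0 ** -j) ** 2) / 0.02) for j in range(4)]
phi = np.exp(-(omegas ** 2) * 64.0)
psis, phi = parseval_normalize(psis, phi)

x = np.random.default_rng(0).standard_normal(N)
X = np.fft.fft(x)
# Plancherel: the filtered energies sum back to the signal's spectral energy.
energy = np.sum(np.abs(X * phi) ** 2) + sum(np.sum(np.abs(X * p) ** 2) for p in psis)
assert np.isclose(energy, np.sum(np.abs(X) ** 2))
```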

### 3. Adversarial Robustness & Lipschitz Continuity
The fundamental advantage of WST over ad-hoc spectrogram features or standard neural networks is **formal deformation stability**: the WST is *Lipschitz continuous*, so minor signal deformations or background noise produce only linearly bounded changes in the fingerprint.

The depth-$m$ propagator is mathematically bounded by:
$$||S[p]x - S[p]y||_{L^2} \le (||\psi||_1)^m \cdot ||x - y||_{L^2}$$
Because our explicitly constructed Morlet wavelets satisfy $||\psi||_1 < 1$, the Lipschitz constant $L_m$ decays exponentially with depth, strictly bounding the impact of adversarial noise.
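Numerically, this stability rests on the cascade being a composition of nonexpansive maps, and both building blocks can be checked directly in NumPy (a self-contained demonstration, independent of the package):

```python
import numpy as np

rng = np.random.default_rng(0)

# Building block 1 -- complex modulus is nonexpansive:
# | |a| - |b| | <= |a - b| pointwise (reverse triangle inequality),
# so the same holds for the L2 norms of whole signals.
a = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
b = a + 0.01 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
assert np.linalg.norm(np.abs(a) - np.abs(b)) <= np.linalg.norm(a - b)

# Building block 2 -- filtering with a bounded frequency response:
# if |h(w)| <= 1 everywhere, convolution cannot amplify a perturbation.
omegas = np.fft.fftfreq(1000)
h = np.exp(-(omegas ** 2) * 200.0)        # Gaussian response, |h| <= 1
A, B = np.fft.fft(a), np.fft.fft(b)
assert np.linalg.norm((A - B) * h) <= np.linalg.norm(A - B)
```

Composing such maps can only shrink (never grow) the distance between two signals' representations, which is the mechanism behind the depth-$m$ bound above.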

### 4. Joint Time-Frequency Scattering (JTFS)
The standard WST modulus operator ($|\cdot|$) destroys critical phase-coupling information between adjacent frequency bands. This limits discriminative power for signals with complex frequency-modulated structures (e.g., chirps, overlapping formants).

**JTFS** recovers this structure by executing a fully separable 2D convolution across both the temporal axis $t$ and the log-frequency axis $\lambda$:
$$\Psi_{\mu,l,s}(t,\lambda) = \psi_\mu(t) \cdot \psi_{l,s}(\lambda)$$
`omni-wst-core` computes this multidimensional operation concurrently by dispatching parallel CUDA streams (`stream0` for the temporal filtering, `stream1` for the frequential filtering) to keep the ALUs saturated.
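Separability is what keeps the 2D step tractable: convolving with the outer-product filter $\Psi_{\mu,l,s}$ factorizes into a 1D pass along time followed by a 1D pass along log-frequency. A NumPy sketch of that equivalence (illustrative only; `jtfs_filter` and `separable_conv2d` are hypothetical names, and circular FFT convolution stands in for the CUDA kernels):

```python
import numpy as np

def jtfs_filter(psi_t, psi_f):
    """Separable joint filter: Psi(t, lam) = psi_t(t) * psi_f(lam),
    realized as an outer product over (log-frequency, time)."""
    return np.outer(psi_f, psi_t)

def separable_conv2d(scalogram, psi_t, psi_f):
    """Circular 2D convolution over (log-frequency, time), done separably:
    filter every row along time, then every column along frequency."""
    tmp = np.fft.ifft(np.fft.fft(scalogram, axis=1) * np.fft.fft(psi_t), axis=1)
    return np.fft.ifft(np.fft.fft(tmp, axis=0) * np.fft.fft(psi_f)[:, None], axis=0)

# The two 1D passes agree with one full 2D convolution of the joint filter.
rng = np.random.default_rng(1)
S = rng.standard_normal((8, 32))              # toy scalogram: 8 bands x 32 samples
pt, pf = rng.standard_normal(32), rng.standard_normal(8)
full = np.fft.ifft2(np.fft.fft2(S) * np.fft.fft2(jtfs_filter(pt, pf)))
assert np.allclose(separable_conv2d(S, pt, pf), full)
```

The two 1D passes are independent per row and per column, which is what lets the engine schedule them on separate streams.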

---

## ⚡ Architectural Implementation

This repository utilizes an aggressive host-to-device memory architecture to bypass standard PCIe bottlenecks:

* **Template Metaprogramming:** `TilePolicy<ArchTag>` eliminates dynamic runtime branching by specializing convolution tile sizes at compile time for Ampere (`TILE_M=64`) or Hopper (`TILE_M=128`) architectures.
* **Zero-Copy FFI:** Utilizing `pybind11` buffer protocols, NumPy memory buffers are seamlessly mapped without host-side replication.
* **UVA & Pinned Memory:** `MemoryStaging` allocates page-locked host buffers with `cudaMallocHost`. By asynchronously swapping double buffers via `cudaMemcpyAsync`, data is shuttled to VRAM in the background while the `cuFFT` stages keep the compute cores busy.
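The transfer/compute overlap can be mimicked on the host with Python threads and a bounded queue standing in for CUDA streams. This is a conceptual analogue only; `pipeline`, `transfer`, and `compute` are hypothetical stand-ins, not the package API:

```python
import queue
import threading

def pipeline(batches, transfer, compute, depth=2):
    """Host-side analogue of double buffering: a bounded queue of `depth`
    staging slots lets the transfer thread run ahead of compute, so copies
    and math overlap (on the GPU: cudaMemcpyAsync on one stream while
    cuFFT kernels execute on another)."""
    staged = queue.Queue(maxsize=depth)   # `depth` buffers in flight at once
    results = []

    def producer():
        for b in batches:
            staged.put(transfer(b))       # blocks once `depth` are in flight
        staged.put(None)                  # sentinel: no more work

    t = threading.Thread(target=producer)
    t.start()
    while (item := staged.get()) is not None:
        results.append(compute(item))     # consume while the next copy runs
    t.join()
    return results
```

With `depth=2` this is exactly the double-buffer pattern: one buffer is being filled while the other is being consumed.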

---

## 🌍 Cross-Domain Use Cases

While designed for perceptual hashing, the deterministic, energy-preserving nature of `omni-wst-core` makes it highly effective across multiple R&D fields:

1. **Astrophysics (Gravitational Waves):** The deformation stability of WST and the phase-coupling detection of JTFS are ideal for isolating the non-stationary "chirp" of compact binary coalescences (neutron star mergers) from massive broadband noise.
2. **Neurology (EEG/ECG Analysis):** Human physiological signals are notoriously noisy and non-stationary. WST filters out warping deformations, and scattering features have been reported to support high-accuracy EEG emotion recognition without heavy pre-processing.
3. **Bioinformatics (Genomics):** Applied to ChIP-seq peak calling, WST provides a translation-invariant encoding of genomic read-depth signals, while JTFS captures co-modulated binding patterns between different histone modifications.
4. **Audio Forensics & Copyright:** Neutralizes adversarial phase-shifting attacks in media piracy by utilizing JTFS to recover the inter-band phase correlations that standard audio fingerprinting algorithms discard.

---

## 🚀 Installation

**Prerequisites:**
* Python 3.10+
* CMake 3.18+
* NVIDIA CUDA Toolkit (v11.x or v12.x)
* C++17 compliant compiler (GCC/Clang/MSVC)

**Build from Source (PEP 517):**
```bash
git clone https://github.com/samvardhan03/Module-I-omni-wst-core.git
cd Module-I-omni-wst-core

# Install dependencies and build the C++/CUDA extension
pip install -e .
```

---

## 💻 Usage Guide

### 1. Basic WST Fingerprinting
```python
import numpy as np
import omni_wst_core as wst

# Initialize configuration
# J: Maximum wavelet scale (2^J samples)
# Q: Quality factor (wavelets per octave)
cfg = wst.WSTConfig(J=8, Q=16, depth=2, jtfs=False)

# Simulate 1 second of audio at 44.1kHz
signal = np.random.randn(44100).astype(np.float32)

# Generate formally grounded WST fingerprint
fingerprint = wst.fingerprint(signal, cfg)
print(f"WST Fingerprint Dimensions: {fingerprint.shape}")
```

### 2. Advanced: Joint Time-Frequency Scattering (JTFS)
To protect against phase-shifting attacks or to analyze highly modulated signals (like chirps), enable JTFS.

```python
# Enable JTFS phase-recovery
jtfs_cfg = wst.WSTConfig(J=10, Q=16, depth=2, jtfs=True)

complex_signal = np.random.randn(44100).astype(np.float32)

# Executes parallel 2D separable wavelets on stream0 and stream1
jtfs_fingerprint = wst.fingerprint(complex_signal, jtfs_cfg)
print(f"JTFS Coefficient Shape: {jtfs_fingerprint.shape}")
```

---

## 🧪 Testing & Validation

The module includes a rigorous suite of numerical property checks, implemented in `pytest`, that validates the CUDA kernels against their continuous-mathematics constraints.

```bash
# Install in editable mode with test dependencies (the "dev" extra)
pip install -e ".[dev]"

# Run the test suite
pytest tests/
```

**Key Validations Performed:**
* `test_lipschitz.py`: Confirms that adversarial noise injections remain mathematically bounded by $L_{m}\le(||\psi||_{1})^{m}$.
* `test_parseval.py`: Validates that the generated frequency-domain Morlet filter banks strictly conserve signal energy.
* `test_memory.py`: Ensures `CUdeviceptr` handles do not leak memory during high-throughput batched ingestion.

*(Note: For explicit memory safety checks on the GPU, run the test suite wrapped in `compute-sanitizer --tool memcheck` via `make memcheck`)*.

---

## ⚖️ License

The mathematical primitives within `omni-wst-core` are licensed under the **Apache License 2.0** for free open-source academic and research usage.

**Commercial Deployment:** Production deployment of OmniPulse modules for differential licensing or enterprise MLOps requires an Enterprise SaaS agreement. Academic and government institutions qualify for immediate enterprise waivers. Please refer to `COMMERCIAL_LICENSE.md` or contact the repository owner for details.

---
*Developed by Samvardhan Singh*

