Metadata-Version: 2.2
Name: voxis
Version: 0.0.7
Summary: Voxis is a hybrid Python + C++ audio DSP library for fast, modular effect pipelines.
Author: SpiralRbx
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: C++
Classifier: Operating System :: OS Independent
Project-URL: Homepage, https://voxis.org
Project-URL: Repository, https://github.com/spiralRbx/voxis
Project-URL: Documentation, https://voxis.org
Requires-Python: >=3.10
Requires-Dist: numpy>=2.0
Requires-Dist: imageio-ffmpeg>=0.6.0
Provides-Extra: dev
Requires-Dist: build>=1.2.2; extra == "dev"
Requires-Dist: pytest>=8.4; extra == "dev"
Requires-Dist: twine>=5.1; extra == "dev"
Provides-Extra: web
Requires-Dist: flask>=3.1; extra == "web"
Provides-Extra: waveform
Requires-Dist: pillow>=10.0; extra == "waveform"
Provides-Extra: all
Requires-Dist: flask>=3.1; extra == "all"
Requires-Dist: pillow>=10.0; extra == "all"
Description-Content-Type: text/markdown

# Voxis

**[Documentation](https://voxis.org)** | **[Getting Started](https://voxis.org/getting-started)** | **[Python Guide](https://voxis.org/python)** | **[Effects Reference](https://voxis.org/effects-reference)** | **[Realtime Guide](https://voxis.org/realtime)**

Voxis is a hybrid Python + C++ audio library built for modular DSP pipelines, band-based processing, and a developer experience that feels closer to `pydub` than to low-level DSP toolkits.

## Goals

- Modular DSP pipeline with block-based float32 processing
- Simple Python API for loading, chaining effects, and exporting audio
- C++ core for high-performance effect execution
- FFmpeg-powered decode/encode flow that works cross-platform
- Band-splitting foundation for multiband effects and processing graphs

## Architecture

```text
decoder (FFmpeg) -> PCM float32 buffer -> DSP pipeline -> encoder (FFmpeg)
```

### Design principles

- All transformations run on `float32` buffers
- Internal layout is frame-major: `(frames, channels)`
- Effects are stateful and process in blocks
- Multithreading is optional and exposed through the pipeline API
- Python owns orchestration; C++ owns the hot DSP loop

## Included in this baseline

- `95` chainable effects/processors available via `effect_names()`
- `108` total checklist resources when you count clip ops, utilities, and analysis helpers
- C++ pipeline runner with Python bindings
- Basic clip operations: normalize, fade in/out, trim, cut, crossfade, silence removal, reverse, mono/stereo conversion, DC offset removal
- Effects: gain, clip, lowpass, highpass, bandpass, notch, peak EQ, low shelf, high shelf, delay, tremolo, distortion, compressor, limiter, expander, noise gate, de-esser, pan, stereo width
- Dynamics extensions: downward compression, upward compression, transient shaper, multiband compressor
- Advanced EQ/modulation: graphic EQ, resonant filter, dynamic EQ, formant filter, chorus, flanger, phaser, vibrato, auto-pan, rotary speaker, ring modulation, frequency shifter
- Advanced time/space effects: delay, feedback delay, echo, ping-pong delay, multi-tap delay, slapback, room/hall/plate reverb, early reflections, convolution reverb
- Saturation and lo-fi effects: overdrive, fuzz, bitcrusher, waveshaper, tube saturation, tape saturation, soft clipping, hard clipping
- Pitch/tempo effects: pitch shift, time stretch, time compression, auto-tune, harmonizer, octaver, formant shifting
- Restoration effects: noise reduction, voice isolation, source separation, de-reverb, de-echo, spectral repair, AI enhancer, speech enhancement
- Creative effects: glitch, stutter, tape stop, reverse reverb, granular synthesis, time slicing, random pitch modulation, vinyl, radio, telephone, retro 8-bit, slow motion, robot voice, alien voice
- Spectral effects: FFT filter, spectral gating, spectral blur, spectral freeze, spectral morphing, phase vocoder, harmonic/percussive separation, spectral delay
- Spatial effects: stereo widening, mid/side processing, stereo imager, binaural effect, spatial positioning, HRTF-style simulation
- Utility/analysis tools: resample, dither, bit-depth conversion, loudness normalization, peak detection, RMS analysis, envelope follower
- `AudioClip` class inspired by the ergonomics of `pydub`
- Multiband processor scaffold using Linkwitz-Riley-style band splits
- Lazy render support for deferred processing chains
- Presets such as `radio`, `vocal_enhance`, `cinematic`, `podcast_clean`, and `lofi`
- Pipeline debug via `pipeline_info()` and lazy render cache reuse
- FFmpeg backend using `imageio-ffmpeg` to resolve the binary portably
- Tests and a simple benchmark

See [docs/EFFECTS.md](docs/EFFECTS.md) for the overview, [docs/EFFECT_REFERENCE.md](docs/EFFECT_REFERENCE.md) for the complete per-effect usage catalog, and [docs/REALTIME.md](docs/REALTIME.md) for the first realtime browser starter. The offline `web-test` app exposes items 1 through 12 with ready-to-demo defaults, independent control columns, metrics, and per-step timing.

## Documentation

Full documentation is available at **[voxis.org](https://voxis.org)**:

- [Getting Started](https://voxis.org/getting-started) — install and first script
- [Core Concepts](https://voxis.org/core-concepts) — AudioClip, Pipeline, Effects
- [Python Guide](https://voxis.org/python) — complete offline processing guide
- [Realtime Guide](https://voxis.org/realtime) — browser audio with WASM + Web Audio API
- [Effects Reference](https://voxis.org/effects-reference) — all 95+ effects with parameters
- [Tutorials](https://voxis.org/tutorials) — step-by-step learning projects
- [Cookbook](https://voxis.org/cookbook) — quick recipes for common tasks
- [API Reference](https://voxis.org/api-reference) — browse source documentation

## Quick start

```python
from voxis import AudioClip, Pipeline, compressor, delay, distortion, lowpass

clip = AudioClip.from_file("input.wav")

pipeline = (
    Pipeline(sample_rate=clip.sample_rate, channels=clip.channels, block_size=2048)
    >> [
        distortion(drive=2.5),
        lowpass(frequency_hz=8_000, stages=2),
        delay(delay_ms=135, feedback=0.32, mix=0.22),
        compressor(threshold_db=-18.0, ratio=3.5),
    ]
)

processed = clip.apply_pipeline(pipeline)
processed.export("output.wav")
```

## Realtime Starter

The realtime browser layer covers:

- Basic effects, dynamics, EQ / filters, modulation, space / ambience, distortion / saturation, pitch / time, modern / AI-like effects, creative / special effects, spectral processing, and advanced stereo / spatial processing — all in realtime.
- The basic `AudioWorklet` layer carries realtime paths for gain, normalize, fade in, fade out, crossfade, trim, cut, silence removal, reverse, pan, mono ↔ stereo, and DC offset removal.
- The WASM EQ/filter/dynamics chain includes high-pass, low-pass, band-pass, notch, low shelf, high shelf, resonant filter, parametric equalizer, graphic EQ, dynamic EQ, formant filter, compressor, downward compression, upward compression, limiter, expander, noise gate, multiband compressor, de-esser, and transient shaper.
- The modulation `AudioWorklet` layer includes chorus, flanger, phaser, tremolo, vibrato, auto-pan, rotary speaker, ring modulation, and frequency shifter.
- The realtime space / ambience rack includes delay, echo, ping-pong delay, multi-tap delay, slapback delay, early reflections, room / hall / plate reverb, and convolution reverb (IR).
- The realtime color/time `AudioWorklet` layer includes distortion, overdrive, fuzz, bitcrusher, waveshaper, tube saturation, tape saturation, soft clipping, hard clipping, pitch shift, auto-tune, harmonizer, octaver, and formant shifting.
- The realtime modern/creative `AudioWorklet` layer includes noise reduction, voice isolation, source separation, de-reverb, de-echo, spectral repair, AI enhancer, speech enhancement, glitch effect, stutter, tape stop, reverse reverb, granular synthesis, time slicing, random pitch mod, vinyl effect, radio effect, telephone effect, 8-bit / retro sound, slow motion extreme, robot voice, and alien voice.
- The realtime spectral/spatial `AudioWorklet` layer includes FFT filter, spectral gating, spectral blur, spectral freeze, spectral morphing, phase vocoder, harmonic/percussive separation, spectral delay, stereo widening, mid/side processing, stereo imager, binaural effect, 3D audio positioning, and HRTF simulation.
- `Time stretch` and `Time compression` run in realtime preview through the browser file transport with preserved pitch. On microphone input, those two controls are bypassed.
- It uses Web Audio API plus `AudioWorklet` for low-latency preview with either a file source or microphone input.
- The reusable browser module lives in the `api/` folder, with `voxis-realtime.js` as the public entry point, `voxis-realtime-player.js` handling the graph/player side, and `voxis-realtime-effects.js` exposing builder helpers for the checklist sections.
- The processor files are part of the browser-side Voxis realtime layer. The `.wasm` file alone is not enough by itself; it needs the JavaScript worklet bridge and Web Audio graph wiring around it.
- The realtime layer does not replace the offline `AudioClip` workflow. The browser uses streaming-safe counterparts for section 1 clip edits, while the offline API keeps the destructive full-buffer versions.
- For section 8, the browser path is a low-latency realtime approximation layer for monitoring and UI preview; the heavier offline restoration flow still lives in the Python/C++ path.

For a browser-side reusable starter, you can import:

```js
import { createVoxisRealtimePlayer, effects } from "./api/voxis-realtime.js";

const player = await createVoxisRealtimePlayer({ container: "#player" });
await player.loadUrl("/audio/example.mp3");
player.setEffects([
  effects.gain({ value: 1.1 }),
  effects.compressor({ thresholdDb: -18, ratio: 3.0, makeupDb: 2.0 }),
  effects.chorus({ mix: 0.35, rateHz: 0.9 }),
  effects.hall_reverb({ decaySeconds: 1.8, mix: 0.2 }),
  effects.noise_reduction({ strength: 0.8 }),
  effects.fft_filter({ lowHz: 90, highHz: 9000, mix: 1.0 }),
  effects.hrtf_simulation({ azimuthDeg: 30, elevationDeg: 0, distance: 1.1 }),
]);
```

That wrapper ships effect builders for all 11 sections, creates a standard `<audio>` element automatically when you pass a container, resolves its processor and WASM files relative to the module location, and can load a file source, microphone source, crossfade partner, and convolution IR.

For app-side browser controls, the public API also exports `createLockedControl(...)`, which lets you define the real min/max/step in JavaScript so the effect value is normalized again before it reaches the realtime engine.

That helps prevent casual DevTools edits from pushing the UI outside the limits you defined, but it is still app-side protection. Final product rules are always the responsibility of the project that embeds Voxis, because browser-side JavaScript still runs in an environment the end user controls.

The shipped `voxis-realtime-dynamics.wasm` module is now also the native source of truth for realtime parameter metadata across the browser API. EQ/filter/dynamics values are still validated in the native DSP bridge itself, and the remaining realtime stages read their public limits from the WASM metadata before configuration reaches the JavaScript processors. In the browser API you can listen for warnings with `player.onWarning((warning) => { ... })`, inspect the full native metadata with `player.getNativeRealtimeLimits()`, and keep `player.getNativeDynamicsLimits()` as a compatibility alias.

## Path resolution

For audio input and output, Voxis resolves every relative path from the directory of the Python file that called the API.

That means this works even when you launch Python from the project root:

```text
project/
  tests/
    test_readme.md.py
    example.mp3
```
---
```python
from voxis import AudioClip

audio = AudioClip.from_file("example.mp3")
```

`example.mp3` is resolved as `tests/example.mp3` because the script is inside `tests/`.

If the file is outside the script folder, use normal relative navigation:

```python
AudioClip.from_file("../example.mp3")
AudioClip.from_file("../folder/example.mp3")
AudioClip.from_file("folder/example.mp3")
```

Those examples mean:

```text
../example.mp3
  go one folder up from the script directory

../folder/example.mp3
  go one folder up, then enter folder/

folder/example.mp3
  enter folder/ inside the script directory
```

Exports follow the same rule, so relative outputs are also created next to the script by default:

```text
project/
  tests/
    test_readme.md.py
```
---
```python
processed.export("output-1.mp3")
processed.export("outputs/clean.wav")
```

The examples above will create:

```text
project/
  tests/
    output-1.mp3
    outputs/
      clean.wav
```

If you want to save or load using another base, pass an explicit relative path like `../example.wav` or an absolute path.

## Presets

```python
from voxis import AudioClip

clip = AudioClip.from_file("voice.wav")
processed = clip.apply("vocal_enhance")
processed.export("voice_master.mp3", bitrate="192k", format="mp3", sample_rate=44_100)
```

## Clip Editing

```python
edited = (
    clip.fade_in(180.0)
    .trim(start_ms=50.0, end_ms=12_000.0)
    .remove_silence(threshold_db=-50.0, min_silence_ms=90.0, padding_ms=15.0)
    .remove_dc_offset()
)
```

## Lazy render

```python
lazy_clip = clip.apply("cinematic", lazy=True).normalize(headroom_db=1.0)
lazy_clip.export("final.wav")
```

## Pipeline Debug

```python
rendered = clip.apply("radio", lazy=True).render()
print(rendered.pipeline_info())
```

## Installation

```bash
python -m pip install voxis
```

PyPI releases ship prebuilt wheels for Windows, Linux, and macOS. If `pip` ever falls back to the source distribution, Voxis can still install without a local C++ toolchain and will use the Python DSP backend automatically, with the native extension remaining an optional speed-up.

## Roadmap

- Streaming decoder -> pipeline -> encoder path without loading entire files into memory
- More filters and modulation effects
- Convolution and FIR kernels
- SIMD-specialized kernels
- Native codec integration through libav* for tighter FFmpeg coupling
