Metadata-Version: 2.4
Name: AudiooPy
Version: 1.0
Summary: Audio manager in Python Object-Oriented Programming
Author-email: Brigitte Bigi <contact@sppas.org>
Project-URL: Homepage, https://audioopy.sourceforge.io/
Project-URL: Repository, https://sourceforge.net/projects/audioopy/
Keywords: python,class,module,audio,mixer,wav,channel,frames,extract,ipus
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Operating System :: OS Independent
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Development Status :: 5 - Production/Stable
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE.md
License-File: AUTHORS.md
Provides-Extra: docs
Requires-Dist: pygments; extra == "docs"
Requires-Dist: markdown2; extra == "docs"
Requires-Dist: Clamming==2.0; extra == "docs"
Provides-Extra: tests
Requires-Dist: coverage~=7.3.0; extra == "tests"
Requires-Dist: python-dotenv~=1.0.0; extra == "tests"
Dynamic: license-file

```
-------------------------------------------------------------------------

   █████╗  ██╗   ██╗ ██████╗  ██╗  ██████╗   ██████╗  ██████╗  ██╗   ██╗
  ██╔══██╗ ██║   ██║ ██╔══██╗ ██║ ██╔═══██╗ ██╔═══██╗ ██╔══██╗ ╚██╗ ██╔╝ 
  ███████║ ██║   ██║ ██║  ██║ ██║ ██║   ██║ ██║   ██║ ██████╔╝  ╚████╔╝ 
  ██╔══██║ ██║   ██║ ██║  ██║ ██║ ██║   ██║ ██║   ██║ ██╔═══╝    ╚██╔╝  
  ██║  ██║ ╚██████╔╝ ██████╔╝ ██║ ╚██████╔╝ ╚██████╔╝ ██║         ██║   
  ╚═╝  ╚═╝  ╚═════╝  ╚═════╝  ╚═╝  ╚═════╝   ╚═════╝  ╚═╝         ╚═╝  

        an Audio manager in Python Object-Oriented Programming

                Copyright (C) 2024-2026 Brigitte Bigi, CNRS
        Laboratoire Parole et Langage, Aix-en-Provence, France
-------------------------------------------------------------------------

```

# AudiooPy description

## Overview

### Use cases

You have an audio file and want to apply the operations that the Python `audioop` library provided before it was removed from the standard library.
Or you have a speech audio file and want to automatically detect speech segments, also called IPUs (Inter-Pausal Units).

And... You don't want to install a bunch of other libraries! Then `AudiooPy` is the library you need.

### Features

AudiooPy contains self-implemented operations on sound files and sound fragments. It operates on sound frames: signed integer samples, 8, 16, or 32 bits wide, stored in bytes-like objects.
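
To illustrate this data layout, the sketch below builds a small 16-bit fragment with the standard `struct` module and decodes it again. It does not use AudiooPy: the helper names `pack_samples` and `unpack_samples` are hypothetical, and little-endian byte order is an assumption (it is the order used in WAV files).

```python
import struct

# struct format codes for signed samples, indexed by sample width in bytes.
_FORMATS = {1: "b", 2: "h", 4: "i"}


def pack_samples(samples, sampwidth):
    """Pack signed integers into a little-endian bytes fragment."""
    fmt = "<%d%s" % (len(samples), _FORMATS[sampwidth])
    return struct.pack(fmt, *samples)


def unpack_samples(frames, sampwidth):
    """Decode a bytes fragment into a list of signed integers."""
    count = len(frames) // sampwidth
    fmt = "<%d%s" % (count, _FORMATS[sampwidth])
    return list(struct.unpack(fmt, frames[:count * sampwidth]))


frames = pack_samples([0, 1000, -1000, 32767, -32768], 2)
print(len(frames))                 # 5 samples of 2 bytes each: 10 bytes
print(unpack_samples(frames, 2))   # the same five integers, decoded
```

A fragment of this kind is what the `f` argument stands for in the migration table of this document, with `w` being the sample width in bytes.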

Among others, it provides the following:

- Audio reader/writer for wav -- based on Python standard library
- Manipulate raw audio data
- Audio mixer
- Channels extractor
- Channels mixer
- Automated calculation of statistical descriptors for audio data
- High-fidelity **automatic detection of sounding segments in speech** - also called "Search for IPUs". See: <https://hal.archives-ouvertes.fr/hal-03697808>

AudiooPy is a solution to replace 'audioop', which was part of the Python standard library and was unexpectedly removed in Python 3.13. See PEP 594 (dead batteries) for details. 'audioop' is one of the 19 removed modules, with no proposed alternative and no explanation.

### Scientific and technical relevance

`AudiooPy` implements from scratch all key audio operations, including a scientifically validated algorithm for automatic detection of sounding segments (also known as speech activity detection, voiced segment detection, voice activity detection, pause detection, inter-pausal unit segmentation, or silence detection).

Unlike typical silence-based methods, `AudiooPy` focuses on detecting sounding intervals directly, relying on an adaptive RMS-based thresholding strategy described and evaluated in:

> Bigi, B., & Priego-Valverde, B. (2022). The Automatic Search for Sounding Segments of SPPAS: Application to Cheese! Corpus.
> Human Language Technology. Challenges for Computer Science and Linguistics, LNCS 13212, Springer.
> <https://hal.science/hal-03697808>

This study demonstrated that the algorithm achieves highly reliable segmentation of Inter-Pausal Units (IPUs) in real conversational speech, with fewer than 1 % missed segments — confirming its scientific validity and practical efficiency.
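
The general idea of RMS-based detection can be sketched as follows. This is a deliberately simplified illustration and not the published algorithm: the threshold here is a fixed value supplied by the caller, whereas the method described in the paper adapts it to the signal. The function names `window_rms` and `sounding_windows` are hypothetical, and 16-bit mono little-endian input is assumed.

```python
import math
import struct


def window_rms(frames, sampwidth=2, framerate=16000, win_length=0.02):
    """Return the RMS of each consecutive window of 16-bit mono frames."""
    if sampwidth != 2:
        raise ValueError("this sketch only handles 16-bit samples")
    count = len(frames) // sampwidth
    samples = struct.unpack("<%dh" % count, frames[:count * sampwidth])
    step = max(1, int(round(win_length * framerate)))
    values = []
    for start in range(0, count, step):
        window = samples[start:start + step]
        values.append(math.sqrt(sum(s * s for s in window) / len(window)))
    return values


def sounding_windows(rms_values, threshold):
    """Flag each window as sounding (True) or silent (False)."""
    return [value >= threshold for value in rms_values]


# 40 ms of silence followed by 40 ms of a constant-amplitude signal.
silence = struct.pack("<640h", *([0] * 640))
sound = struct.pack("<640h", *([5000, -5000] * 320))
rms_values = window_rms(silence + sound)
print(sounding_windows(rms_values, threshold=1000))
# [False, False, True, True]
```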

In short, `AudiooPy` offers a rare combination of:

1. full self-implementation (no external dependencies),
2. advanced, empirically validated sounding-segment detection,
3. and transparent, reproducible algorithms for speech and audio processing.

### Replacing `audioop`

> **Performance notice.**
> `audioop` was a C extension; AudiooPy is pure Python.
> Expect identical results but slower execution — typically 10–50× on
> sample-iterating operations (bulk `struct` calls reduce the gap considerably,
> but cannot close it entirely).
> For offline processing of speech files this is rarely an issue.
> For hard real-time or very high-throughput pipelines, profile before migrating.
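
To get a feel for the difference between per-sample and bulk `struct` calls mentioned in the notice, the following sketch times two pure-Python RMS computations. It is not AudiooPy's code: both functions are hypothetical stand-ins, 16-bit little-endian samples are assumed, and the measured ratio depends on the machine and the Python version.

```python
import struct
import timeit


def rms_per_sample(frames):
    """Decode one 16-bit sample per struct call, then compute the RMS."""
    total = 0
    count = len(frames) // 2
    for i in range(count):
        (value,) = struct.unpack_from("<h", frames, i * 2)
        total += value * value
    return (total / count) ** 0.5 if count else 0.0


def rms_bulk(frames):
    """Decode all 16-bit samples in a single struct call."""
    count = len(frames) // 2
    samples = struct.unpack("<%dh" % count, frames[:count * 2])
    return (sum(s * s for s in samples) / count) ** 0.5 if count else 0.0


# One second of 16 kHz audio with an alternating pattern.
frames = struct.pack("<16000h", *([1200, -800] * 8000))
assert rms_per_sample(frames) == rms_bulk(frames)

for func in (rms_per_sample, rms_bulk):
    seconds = timeit.timeit(lambda: func(frames), number=20)
    print("%s: %.4f s for 20 runs" % (func.__name__, seconds))
```

The same pattern, applied to your own fragments and to the functions you actually call, is a simple way to profile before migrating.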

#### Option 1 — Zero-code-change compatibility

AudiooPy ships an `audioopy.audioop` module that exposes **all 26 functions** of the
original `audioop` with identical signatures and return types.
Replace only the import line — the rest of your code stays untouched:

```python
# Before (Python ≤ 3.12)
import audioop
result = audioop.rms(fragment, width)

# After
import audioopy.audioop as audioop          # ← only this line changes
result = audioop.rms(fragment, width)       # identical
```

Named imports work the same way:

```python
# Before
from audioop import add, rms, ratecv

# After
from audioopy.audioop import add, rms, ratecv
```
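
For code that must run both on Python versions that still ship `audioop` and on those that do not, the import can be resolved at run time. The sketch below is one possible approach, not something prescribed by AudiooPy: the helper name `load_audioop` is hypothetical, and it prefers the standard library module when it is available.

```python
import importlib


def load_audioop():
    """Return the first importable audioop-compatible module, or None."""
    for name in ("audioop", "audioopy.audioop"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


audioop = load_audioop()
if audioop is None:
    print("No audioop implementation found; install AudiooPy.")
else:
    # Two 16-bit little-endian samples: 1000 and -1000.
    fragment = b"\xe8\x03\x18\xfc"
    print(audioop.__name__, audioop.rms(fragment, 2))
```

Swap the order of the two names if you would rather use the pure-Python implementation everywhere, for instance to get the same behaviour on all interpreters.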

> **Note on `ratecv`:** state chaining across successive chunk calls is not
> supported by this wrapper; the returned state is always `None`.  For
> fine-grained control use `AudioFrames._ratecv()` directly.

#### Option 2 — Object-oriented API (migration table)

For new code or a full migration, use the `AudioFrames` API directly.
Abbreviations used: `f` = frames (bytes), `w` = sample width (1/2/4), `nc` = number of channels.

| `audioop` (Python ≤ 3.12) | Description | AudiooPy (`AudioFrames`) |
|:---|:---|:---|
| `add(f1, f2, w)` | sum two fragments, sample by sample | `AudioFrames(f1, w).add(AudioFrames(f2, w))` |
| `adpcm2lin(f, w, state)` | decode IMA/DVI ADPCM → linear PCM | `AudioFrames(f, 1).adpcm2lin(w, state)` |
| `alaw2lin(f, w)` | decode G.711 a-law → linear PCM | `AudioFrames(f, 1).alaw2lin(w)` |
| `avg(f, w)` | average over all samples | `AudioFrames(f, w).avg()` |
| `avgpp(f, w)` | average peak-to-peak amplitude | `AudioFrames(f, w).avgpp()` |
| `bias(f, w, bias)` | add a constant bias to each sample | `AudioFrames(f, w).bias(bias)` |
| `byteswap(f, w)` | swap byte order of each sample | `AudioFrames(f, w).byteswap()` |
| `cross(f, w)` | count zero crossings | `AudioFrames(f, w).cross()` |
| `findfactor(f, ref, w)` | find scale factor minimising rms(f − ref×F) | `AudioFrames(f, w).findfactor(AudioFrames(ref, w))` |
| `findfit(f, ref)` | find best position of ref inside f | `AudioFrames(f, 2).findfit(AudioFrames(ref, 2))` |
| `findmax(f, length)` | find window of length samples with max energy | `AudioFrames(f, 2).findmax(length)` |
| `getsample(f, w, i)` | get the value of sample i | `AudioFrames(f, w).get_sample(i)` |
| `lin2adpcm(f, w, state)` | encode linear PCM → IMA/DVI ADPCM | `AudioFrames(f, w).lin2adpcm(state)` |
| `lin2alaw(f, w)` | encode linear PCM → G.711 a-law | `AudioFrames(f, w).lin2alaw()` |
| `lin2lin(f, w, neww)` | convert sample width | `AudioFrames(f, w).change_sampwidth(neww)` |
| `lin2ulaw(f, w)` | encode linear PCM → G.711 µ-law | `AudioFrames(f, w).lin2ulaw()` |
| `max(f, w)` | maximum absolute sample value | `AudioFrames(f, w).absmax()` |
| `maxpp(f, w)` | maximum peak-to-peak amplitude | `AudioFrames(f, w).maxpp()` |
| `minmax(f, w)` | minimum and maximum sample values | `AudioFrames(f, w).minmax()` |
| `mul(f, w, factor)` | multiply all samples by a factor | `AudioFrames(f, w).mul(factor)` |
| `ratecv(f, w, nc, inr, outr, state)` | convert frame rate | `AudioFrames(f, w, nc).resample(inr, outr)` |
| `reverse(f, w)` | reverse the order of samples | `AudioFrames(f, w).reverse()` |
| `rms(f, w)` | root-mean-square | `AudioFrames(f, w).rms()` |
| `tomono(f, w, lf, rf)` | stereo → mono with per-channel factors | `AudioFrames(f, w, 2).tomono(lf, rf)` |
| `tostereo(f, w, lf, rf)` | mono → stereo with per-channel factors | `AudioFrames(f, w, 1).tostereo(lf, rf)` |
| `ulaw2lin(f, w)` | decode G.711 µ-law → linear PCM | `AudioFrames(f, 1).ulaw2lin(w)` |

> **Note on `ratecv`:** `resample()` is the high-level API and handles common rates automatically.
> For fine-grained control (custom weights, state chaining) use `_ratecv(in_rate, out_rate, in_n, out_n, state, weightA, weightB)` directly.
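
To make one row of the table concrete, here is a standalone sketch of what a stereo-to-mono conversion with per-channel factors computes. It is an illustration written for this README, not the `AudioFrames.tomono()` implementation: the name `tomono_sketch` is hypothetical, 16-bit little-endian interleaved frames are assumed, and the rounding of fractional results may differ from the library.

```python
import struct


def tomono_sketch(frames, lfactor, rfactor):
    """Mix interleaved 16-bit stereo frames down to mono."""
    count = len(frames) // 2
    samples = struct.unpack("<%dh" % count, frames[:count * 2])
    mono = []
    # Even indexes hold the left channel, odd indexes the right channel.
    for left, right in zip(samples[0::2], samples[1::2]):
        value = int(left * lfactor + right * rfactor)
        # Clip to the 16-bit signed range to avoid overflow when packing.
        mono.append(max(-32768, min(32767, value)))
    return struct.pack("<%dh" % len(mono), *mono)


# Two stereo frames: (left=1000, right=3000) and (left=-2000, right=2000).
stereo = struct.pack("<4h", 1000, 3000, -2000, 2000)
mono = tomono_sketch(stereo, 0.5, 0.5)
print(struct.unpack("<2h", mono))   # (2000, 0)
```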

### Main advantages

> **AudiooPy** is fully self-implemented — it requires no external dependencies.

- ⚙️ **Lightweight and transparent** — pure-Python, object-oriented design, no black boxes or external libraries.  
- 💻 **Portable** — runs on any platform where Python ≥ 3.9 is available.  
- 🧩 **Customizable** — easy to extend or adapt for research and specialized audio workflows.  
- ✅ **Reliable** — 94 % code coverage, tested and documented algorithms.  
- 🔁 **Sustainable** — long-term replacement for the deprecated `audioop` module.
- 🧠 **Scientifically validated** — robust detection of sounding segments (IPUs, speech activity, silence boundaries) proven effective on real conversational speech.  


## Install AudiooPy

### From pypi.org:

![PyPI - Version](https://img.shields.io/pypi/v/AudiooPy)

```bash
> python -m pip install AudiooPy
```

### From its wheel package:

Download the wheel file (AudiooPy-xxx.whl) from SourceForge and install it in your Python environment with:

```bash
> python -m pip install <AudiooPy-xxx.whl>
```

### From its repo:

Download the repository from SourceForge and unpack it, or clone with `git`. Optionally, it can be installed with:

```bash
> python -m pip install .
```

Install all the optional dependencies with:

```bash
> python -m pip install ".[docs, tests]"
```

### AudiooPy content

The AudiooPy package includes the following folders:

1. "audioopy": the source code of the API
2. "docs": the documentation of the audioopy library in both HTML and Markdown
3. "tests": the tests of the source code, including audio sample files


## Quick Start

Scripts are available in the `audioopy/scripts` folder. Try for example:
```bash
> python audioopy/scripts/audioinfo.py -w tests/samples/oriana1.wav
> python audioopy/scripts/audioipus.py -w tests/samples/oriana1.wav -s 0.25
```

### Operate on audio

Implement it by yourself! Open an audio file and get some information:

```python
>>> import audioopy.aio
>>> audio = audioopy.aio.open("tests/samples/oriana1.wav")
>>> audio.get_sampwidth()
>>> audio.get_framerate()
>>> audio.get_duration()
>>> audio.get_nchannels()
>>> audio.get_nframes()
>>> audio.rms()
>>> audio.clipping_rate(0.4)
>>> # Extract the channel
>>> audio.extract_channel(0)
```
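
If the sample file is not at hand, a small WAV file can be generated with the standard `wave` module and its path passed to `audioopy.aio.open()` instead. The sketch below only uses the standard library; the file name, the 440 Hz tone and the amplitude are arbitrary choices made for the illustration.

```python
import math
import os
import struct
import tempfile
import wave

framerate = 16000
# One second of a 440 Hz sine wave, as 16-bit signed samples.
samples = [
    int(10000 * math.sin(2 * math.pi * 440 * n / framerate))
    for n in range(framerate)
]
path = os.path.join(tempfile.mkdtemp(), "sine.wav")

# Write a mono, 16-bit, 16 kHz WAV file.
with wave.open(path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(framerate)
    out.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Read the header back to check what was written.
with wave.open(path, "rb") as wav:
    nframes = wav.getnframes()
    print("sample width:", wav.getsampwidth())
    print("frame rate:", wav.getframerate())
    print("channels:", wav.getnchannels())
    print("frames:", nframes)
    print("duration:", nframes / wav.getframerate())
```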

### Search for IPUs: sounding segments in speech

You can launch 'python sample.py' and see the results!

To get the IPUs of an audio file in your Python program, see the following example:

```python
>>> from audioopy.ipus import SearchForIPUs
>>> # Create the instance; `audio` is the object opened in the previous example
>>> searcher = SearchForIPUs(channel=audio[0])
>>> # Fix options
>>> searcher.set_vol_threshold(0)  # auto
>>> searcher.set_win_length(0.02)
>>> searcher.set_min_sil(0.25)
>>> searcher.set_min_ipu(0.2)
>>> searcher.set_shift_start(0.02)
>>> searcher.set_shift_end(0.02)
>>> # Process the data and get the list of IPUs
>>> tracks = searcher.get_tracks(time_domain=True)
```
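
To give an intuition of what the minimum silence and minimum IPU durations do, here is a standalone sketch that turns per-window "sounding" flags into time intervals. It reflects one plausible reading of these two options and is not the `SearchForIPUs` implementation: the function name `windows_to_tracks` is hypothetical, and the boundary shifts (`shift_start`, `shift_end`) are left out.

```python
def windows_to_tracks(flags, win_length=0.02, min_sil=0.25, min_ipu=0.2):
    """Turn per-window sounding flags into (start, end) tracks in seconds."""
    # 1. Collect runs of consecutive sounding windows as [first, last + 1].
    runs = []
    start = None
    for index, sounding in enumerate(flags):
        if sounding and start is None:
            start = index
        elif not sounding and start is not None:
            runs.append([start, index])
            start = None
    if start is not None:
        runs.append([start, len(flags)])

    # 2. Merge runs separated by a silence shorter than min_sil.
    merged = []
    for run in runs:
        if merged and (run[0] - merged[-1][1]) * win_length < min_sil:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # 3. Drop tracks shorter than min_ipu and convert indexes to seconds.
    return [
        (round(s * win_length, 3), round(e * win_length, 3))
        for s, e in merged
        if (e - s) * win_length >= min_ipu
    ]


# 0.3 s of sound, a 0.1 s gap, 0.3 s of sound, 0.4 s of silence, 0.1 s of sound.
flags = [True] * 15 + [False] * 5 + [True] * 15 + [False] * 20 + [True] * 5
print(windows_to_tracks(flags))   # [(0.0, 0.7)]
```

In this example the 0.1 s gap is too short to count as a silence, so the first two bursts form a single track, and the final 0.1 s burst is too short to be kept as an IPU.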

The following reference presents the 'Search for IPUs' algorithm, initially implemented in SPPAS and migrated to AudiooPy.
It describes the method and focuses on its evaluation on the Cheese! corpus, a corpus of both read and 
conversational speech between two participants. 
It reports the number of manual actions performed by the annotators to achieve the expected segmentation: 
adding new IPUs, ignoring irrelevant ones, splitting an IPU, merging two consecutive ones, and moving boundaries.

> Brigitte Bigi, Béatrice Priego-Valverde (2022).
> The automatic search for sounding segments of SPPAS: application to Cheese! corpus.
> Human Language Technology. Challenges for Computer Science and Linguistics, LNAI, LNCS 13212, pp. 16-27. 
> <https://hal.science/hal-03697808>


## Test/Analyze source code

Install the optional dependencies with:

```bash
> python -m pip install ".[tests]"
```

Code coverage can be analyzed with unittest and coverage.
Once they are installed, perform the following steps:

1. `coverage run -m unittest`
2. `coverage report` to see a summary report in the terminal, or `coverage xml` to get the detailed result in XML format


## Projects using AudiooPy

AudiooPy was initially developed within SPPAS <https://sppas.org>.
It was extracted from its original software in 2024 by the author to lead its own life as a standalone package.


## Help / How to contribute

If you want to report a bug, please send an e-mail to the author.
Any and all constructive comments are welcome.

If you plan to contribute to the code, please read carefully and agree to both the 
code of conduct and the code style guide.
If you are contributing code or documentation to the AudiooPy project, you are 
agreeing to the DCO certificate <http://developercertificate.org>. 
Copy/paste the DCO, then add a line saying:
```
Signed-off-by: Random J Developer <random@developer.example.org>
```
Send this file by e-mail to the author.


## AudiooPy Documentation

The documentation of the API is available at <https://audioopy.sourceforge.io>.

To generate the doc locally, install the required external programs, then launch the doc generator:
```bash
> python -m pip install ".[docs]"
> python makedoc.py
```

Starting from Whakerexa 1.0, browsing the HTML documentation of `AudiooPy`
requires running an HTTP server. This can easily be done with **uWSGI**
(for instance, using WSL under Windows):

```bash
# Install the external libraries:
python3 -m pip install pycryptodome --break-system-packages
python3 -m pip install uwsgi
# Launch the HTTP service:
uwsgi --http :9090 --wsgi-file docs/uwsgi.py
```

## License/Copyright

See the accompanying LICENSE.md file for the license terms and the AUTHORS.md file for the full list of contributors.

Copyright (C) 2024-2026 Brigitte Bigi, CNRS - Laboratoire Parole et Langage, Aix-en-Provence, France

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.


## Changes

- Version 0.1: 

    * Initial version, extracted from SPPAS 4.17.
    * A few functions are self-implemented in AudioFrames() instead of using the 'audioop' standard library.
  

- Version 0.2:

    * Self-implemented functions into AudioFrames(), except for resample() which is only partially self-implemented.
    * Added function clip() in AudioFrames().
    * Added the error number 2090: NumberFramesError.
    * Updated scripts.


- Version 0.3:

    * Self-implemented resample() is continued: it works with output=16kHz, input=(32kHz, 48kHz).


- Version 0.4:

    * Include the full implementation of the Search for IPUs algorithm - migrated from SPPAS.
    * License changed from GPL, v3 to AGPL, v3.


- Version 0.5:

    * A `resample()` function is implemented
    * The `po` folder migrated into 'audioopy' in order to be included in the wheel
    * The `scripts` folder migrated into 'audioopy'
    * New script `audiomixer.py` to create an all-in-one audio channel from several audio files
    * New script `audiofragment.py` to extract a fragment of audio
    * New script `audioresample.py` to resample an audio file
    * New script `audioipus.py` to search for IPUs in an audio file
    * Test coverage is 79%

- Version 0.6:

    * Corrected bugs in 'mul' and 'bias' of ChannelFormatter()
    * Increased test coverage to 82%
    * The documentation makes use of Clamming-2.0 and Whakerexa-1.0.

- Version 1.0:
    * `AudioFrames` is now a **100 % pure-Python drop-in replacement for `audioop`**:
      all 26 public functions of the removed standard library are available.
      - New methods: `add`, `avgpp`, `maxpp`, `findfit`, `findmax`, `tomono`, `tostereo`,
        `lin2ulaw` / `ulaw2lin` (G.711 µ-law), `lin2alaw` / `alaw2lin` (G.711 a-law),
        `lin2adpcm` / `adpcm2lin` (IMA/DVI ADPCM).
      - Fixed stubs: `byteswap`, `findfactor`, `reverse` (previously raised `NotImplementedError`).
    * New `audioopy.audioop` compatibility module: zero-code-change drop-in replacement.
      Only the import line needs to change — all 26 function signatures and return types are identical.
    * Added migration table `audioop` → AudiooPy in the README.
    * **Performance**: all sample-iterating methods (`rms`, `avg`, `minmax`, `cross`, `clip`,
      `mul`, `bias`, `add`, `reverse`, `findfactor`, `avgpp`, `maxpp`, `findfit`, `findmax`,
      `tomono`, `tostereo`, `change_sampwidth`, and all codec encode/decode methods) now use
      bulk `struct.unpack` / `struct.pack` instead of per-sample calls — ~5–10× faster on the
      IPU detection path, ~3–5× on transform methods. No external dependency added.
    * `SearchForIPUs`: new `max_ipu_dur` parameter (via `set_max_ipu()`). When a track exceeds
      this limit, `get_tracks()` automatically re-segments by progressively lowering the minimum
      silence duration. Script `audioipus.py` exposes it as `-x`.
    * Fixed `resample()`: removed the dead fallback to the removed `audioop` standard library.
      `_re_sample()` now falls back to `_ratecv()` for unsupported rate pairs instead of
      raising `NotImplementedError`.
    * Test coverage increased to 94%.
    * `aio` package: pure-Python AIFF / AIFF-C and AU / SND readers and writers,
      replacing the `aifc` and `sunau` standard library modules removed in Python 3.13.
      Supported formats: AIFF (big-endian PCM), AIFF-C (`NONE` and `sowt` variants),
      AU 8/16/32-bit PCM. The `AudioFactory` and `aio.open()` / `aio.save()` API are unchanged.
    * WAV: float32 and float64 WAV files (IEEE 754, format tag 3) are now read transparently
      instead of crashing. Samples are converted to int16 on the fly; no external dependency.
