Metadata-Version: 2.4
Name: whisperx-nemo-pipeline
Version: 1.0.1
Summary: Production-ready transcription and diarization pipeline with parallel processing
Home-page: https://github.com/PaulBorie/whisperx-nemo-parallel
Author: Paul Borie
Author-email: paul.borie1@gmail.com
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Multimedia :: Sound/Audio :: Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: absl-py<3,>=2.3.1
Requires-Dist: aiohappyeyeballs<3,>=2.6.1
Requires-Dist: aiohttp<4,>=3.12.15
Requires-Dist: aiosignal<2,>=1.4.0
Requires-Dist: alembic<2,>=1.16.4
Requires-Dist: annotated-types<1,>=0.7.0
Requires-Dist: antlr4-python3-runtime<5,>=4.9.3
Requires-Dist: asteroid-filterbanks<1,>=0.4.0
Requires-Dist: asttokens<4,>=3.0.0
Requires-Dist: attrs<26,>=25.3.0
Requires-Dist: audioread<4,>=3.0.1
Requires-Dist: av<16,>=15.0.0
Requires-Dist: braceexpand<1,>=0.1.7
Requires-Dist: certifi<2026,>=2025.8.3
Requires-Dist: cffi<2,>=1.17.1
Requires-Dist: charset-normalizer<4,>=3.4.3
Requires-Dist: click<9,>=8.2.1
Requires-Dist: cloudpickle<4,>=3.1.1
Requires-Dist: coloredlogs<16,>=15.0.1
Requires-Dist: colorlog<7,>=6.9.0
Requires-Dist: comm<1,>=0.2.3
Requires-Dist: contourpy<2,>=1.3.3
Requires-Dist: ctranslate2<5,>=4.4.0
Requires-Dist: cycler<1,>=0.12.1
Requires-Dist: cytoolz<2,>=1.0.1
Requires-Dist: datasets<4,>=3.2.0
Requires-Dist: decorator<6,>=5.2.1
Requires-Dist: dill<1,>=0.3.8
Requires-Dist: Distance<1,>=0.1.3
Requires-Dist: docopt<1,>=0.6.2
Requires-Dist: dora_search<1,>=0.1.12
Requires-Dist: editdistance<1,>=0.8.1
Requires-Dist: einops<1,>=0.8.1
Requires-Dist: executing<3,>=2.2.0
Requires-Dist: faster-whisper<2,>=1.1.0
Requires-Dist: fiddle<1,>=0.3.0
Requires-Dist: filelock<4,>=3.18.0
Requires-Dist: flatbuffers<26,>=25.2.10
Requires-Dist: fonttools<5,>=4.59.0
Requires-Dist: frozenlist<2,>=1.7.0
Requires-Dist: fsspec<2025,>=2024.9.0
Requires-Dist: future<2,>=1.0.0
Requires-Dist: g2p-en<3,>=2.1.0
Requires-Dist: gitdb<5,>=4.0.12
Requires-Dist: GitPython<4,>=3.1.45
Requires-Dist: graphviz<1,>=0.21
Requires-Dist: greenlet<4,>=3.2.4
Requires-Dist: grpcio<2,>=1.74.0
Requires-Dist: huggingface-hub<1,>=0.23.5
Requires-Dist: humanfriendly<11,>=10.0
Requires-Dist: hydra-core<2,>=1.3.2
Requires-Dist: HyperPyYAML<2,>=1.2.2
Requires-Dist: idna<4,>=3.10
Requires-Dist: inflect<8,>=7.5.0
Requires-Dist: iniconfig<3,>=2.1.0
Requires-Dist: intervaltree<4,>=3.1.0
Requires-Dist: ipython<10,>=9.4.0
Requires-Dist: ipython_pygments_lexers<2,>=1.1.1
Requires-Dist: ipywidgets<9,>=8.1.7
Requires-Dist: jedi<1,>=0.19.2
Requires-Dist: Jinja2<4,>=3.1.6
Requires-Dist: jiwer<5,>=4.0.0
Requires-Dist: joblib<2,>=1.5.1
Requires-Dist: julius<1,>=0.2.7
Requires-Dist: jupyterlab_widgets<4,>=3.0.15
Requires-Dist: kaldi-python-io<2,>=1.2.2
Requires-Dist: kaldiio<3,>=2.18.1
Requires-Dist: kiwisolver<2,>=1.4.9
Requires-Dist: lameenc<2,>=1.8.1
Requires-Dist: lazy_loader<1,>=0.4
Requires-Dist: Levenshtein<1,>=0.27.1
Requires-Dist: lhotse<2,>=1.30.3
Requires-Dist: libcst<2,>=1.8.2
Requires-Dist: librosa<1,>=0.11.0
Requires-Dist: lightning<3,>=2.5.3
Requires-Dist: lightning-utilities<1,>=0.15.2
Requires-Dist: lilcom<2,>=1.8.1
Requires-Dist: llvmlite<1,>=0.44.0
Requires-Dist: loguru<1,>=0.7.3
Requires-Dist: Mako<2,>=1.3.10
Requires-Dist: Markdown<4,>=3.8.2
Requires-Dist: markdown-it-py<5,>=4.0.0
Requires-Dist: MarkupSafe<4,>=3.0.2
Requires-Dist: marshmallow<5,>=4.0.0
Requires-Dist: matplotlib<4,>=3.10.5
Requires-Dist: matplotlib-inline<1,>=0.1.7
Requires-Dist: mdurl<1,>=0.1.2
Requires-Dist: more-itertools<11,>=10.7.0
Requires-Dist: mpmath<2,>=1.3.0
Requires-Dist: msgpack<2,>=1.1.1
Requires-Dist: multidict<7,>=6.6.4
Requires-Dist: multiprocess<1,>=0.70.16
Requires-Dist: nemo_toolkit<3,>=2.0.0rc0
Requires-Dist: networkx<4,>=3.5
Requires-Dist: nltk<4,>=3.9.1
Requires-Dist: numba<1,>=0.61.2
Requires-Dist: numpy<2,>=1.26.4
Requires-Dist: nvidia-cublas-cu12<13,>=12.8.4.1
Requires-Dist: nvidia-cuda-cupti-cu12<13,>=12.8.90
Requires-Dist: nvidia-cuda-nvrtc-cu12<13,>=12.8.93
Requires-Dist: nvidia-cuda-runtime-cu12<13,>=12.8.90
Requires-Dist: nvidia-cudnn-cu12<10,>=9.10.2.21
Requires-Dist: nvidia-cufft-cu12<12,>=11.3.3.83
Requires-Dist: nvidia-cufile-cu12<2,>=1.13.1.3
Requires-Dist: nvidia-curand-cu12<11,>=10.3.9.90
Requires-Dist: nvidia-cusolver-cu12<12,>=11.7.3.90
Requires-Dist: nvidia-cusparse-cu12<13,>=12.5.8.93
Requires-Dist: nvidia-cusparselt-cu12<1,>=0.7.1
Requires-Dist: nvidia-nccl-cu12<3,>=2.27.3
Requires-Dist: nvidia-nvjitlink-cu12<13,>=12.8.93
Requires-Dist: nvidia-nvtx-cu12<13,>=12.8.90
Requires-Dist: omegaconf<3,>=2.3.0
Requires-Dist: onnx<2,>=1.18.0
Requires-Dist: onnxruntime<2,>=1.22.1
Requires-Dist: openunmix<2,>=1.3.0
Requires-Dist: optuna<5,>=4.4.0
Requires-Dist: packaging<26,>=25.0
Requires-Dist: pandas<3,>=2.3.1
Requires-Dist: parso<1,>=0.8.4
Requires-Dist: pexpect<5,>=4.9.0
Requires-Dist: pillow<12,>=11.3.0
Requires-Dist: plac<2,>=1.4.5
Requires-Dist: platformdirs<5,>=4.3.8
Requires-Dist: pluggy<2,>=1.6.0
Requires-Dist: pooch<2,>=1.8.2
Requires-Dist: primePy<2,>=1.3
Requires-Dist: prompt_toolkit<4,>=3.0.51
Requires-Dist: propcache<1,>=0.3.2
Requires-Dist: protobuf<7,>=6.31.1
Requires-Dist: ptyprocess<1,>=0.7.0
Requires-Dist: pure_eval<1,>=0.2.3
Requires-Dist: pyannote.audio<4,>=3.3.2
Requires-Dist: pyannote.core<6,>=5.0.0
Requires-Dist: pyannote.database<6,>=5.1.3
Requires-Dist: pyannote.metrics<4,>=3.2.1
Requires-Dist: pyannote.pipeline<4,>=3.0.1
Requires-Dist: pyarrow<22,>=21.0.0
Requires-Dist: pybind11<4,>=3.0.0
Requires-Dist: pycparser<3,>=2.22
Requires-Dist: pydantic<3,>=2.11.7
Requires-Dist: pydantic_core<3,>=2.33.2
Requires-Dist: pydub<1,>=0.25.1
Requires-Dist: Pygments<3,>=2.19.2
Requires-Dist: pyloudnorm<1,>=0.1.1
Requires-Dist: pyparsing<4,>=3.2.3
Requires-Dist: pytest<9,>=8.4.1
Requires-Dist: python-dateutil<3,>=2.9.0.post0
Requires-Dist: pytorch-lightning<3,>=2.5.3
Requires-Dist: pytorch-metric-learning<3,>=2.8.1
Requires-Dist: pytz<2026,>=2025.2
Requires-Dist: PyYAML<7,>=6.0.2
Requires-Dist: RapidFuzz<4,>=3.13.0
Requires-Dist: regex<2026,>=2025.7.34
Requires-Dist: requests<3,>=2.32.4
Requires-Dist: resampy<1,>=0.4.3
Requires-Dist: retrying<2,>=1.4.2
Requires-Dist: rich<15,>=14.1.0
Requires-Dist: ruamel.yaml<1,>=0.18.14
Requires-Dist: ruamel.yaml.clib<1,>=0.2.12
Requires-Dist: sacremoses<1,>=0.1.1
Requires-Dist: safetensors<1,>=0.6.2
Requires-Dist: scikit-learn<2,>=1.7.1
Requires-Dist: scipy<2,>=1.16.1
Requires-Dist: semver<4,>=3.0.4
Requires-Dist: sentencepiece<1,>=0.2.1
Requires-Dist: sentry-sdk<3,>=2.34.1
Requires-Dist: setuptools<81,>=80.9.0
Requires-Dist: shellingham<2,>=1.5.4
Requires-Dist: six<2,>=1.17.0
Requires-Dist: smmap<6,>=5.0.2
Requires-Dist: sortedcontainers<3,>=2.4.0
Requires-Dist: soundfile<1,>=0.13.1
Requires-Dist: sox<2,>=1.5.0
Requires-Dist: soxr<1,>=0.5.0.post1
Requires-Dist: speechbrain<2,>=1.0.3
Requires-Dist: SQLAlchemy<3,>=2.0.43
Requires-Dist: stack-data<1,>=0.6.3
Requires-Dist: submitit<2,>=1.5.3
Requires-Dist: sympy<2,>=1.14.0
Requires-Dist: tabulate<1,>=0.9.0
Requires-Dist: tensorboard<3,>=2.20.0
Requires-Dist: tensorboard-data-server<1,>=0.7.2
Requires-Dist: tensorboardX<3,>=2.6.4
Requires-Dist: termcolor<4,>=3.1.0
Requires-Dist: text-unidecode<2,>=1.3
Requires-Dist: texterrors<2,>=1.0.9
Requires-Dist: threadpoolctl<4,>=3.6.0
Requires-Dist: tokenizers<1,>=0.19.1
Requires-Dist: toolz<2,>=1.0.0
Requires-Dist: torch<3,>=2.8.0
Requires-Dist: torch-audiomentations<1,>=0.12.0
Requires-Dist: torch_pitch_shift<2,>=1.2.5
Requires-Dist: torchaudio<3,>=2.8.0
Requires-Dist: torchmetrics<2,>=1.8.1
Requires-Dist: tqdm<5,>=4.67.1
Requires-Dist: traitlets<6,>=5.14.3
Requires-Dist: transformers<5,>=4.40.2
Requires-Dist: treetable<1,>=0.2.5
Requires-Dist: triton<4,>=3.4.0
Requires-Dist: typeguard<5,>=4.4.4
Requires-Dist: typer<1,>=0.16.0
Requires-Dist: typing-inspection<1,>=0.4.1
Requires-Dist: typing_extensions<5,>=4.14.1
Requires-Dist: tzdata<2026,>=2025.2
Requires-Dist: Unidecode<2,>=1.4.0
Requires-Dist: urllib3<3,>=2.5.0
Requires-Dist: uroman<2,>=1.3.1.1
Requires-Dist: wandb<1,>=0.21.1
Requires-Dist: wcwidth<1,>=0.2.13
Requires-Dist: webdataset<2,>=1.0.2
Requires-Dist: Werkzeug<4,>=3.1.3
Requires-Dist: wget<4,>=3.2
Requires-Dist: whisperx<4,>=3.3.1
Requires-Dist: widgetsnbextension<5,>=4.0.14
Requires-Dist: wrapt<2,>=1.17.3
Requires-Dist: xxhash<4,>=3.5.0
Requires-Dist: yarl<2,>=1.20.1
Provides-Extra: constraints
Requires-Dist: huggingface_hub<0.24; extra == "constraints"
Requires-Dist: numpy<2; extra == "constraints"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# WhisperX-NeMo Pipeline

A production-ready transcription and diarization pipeline with parallel processing.

## Features

- **Parallel Processing**: Runs Whisper transcription and NeMo diarization simultaneously
- **Multiple Backends**: Supports both faster-whisper and WhisperX
- **Speaker Diarization**: Uses NeMo MSDD (multi-scale diarization decoder) models to label who spoke when
- **Audio Source Separation**: Optional vocal extraction using Demucs
- **Punctuation Restoration**: Automatic punctuation using deep learning models
- **Memory Efficient**: Proper GPU memory management and cleanup
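
The parallel-processing idea can be illustrated with a minimal sketch. It does not show the package's internals: `transcribe` and `diarize` are hypothetical stand-ins for the two stages, and the use of a thread pool is an assumption, since the README does not say whether threads or processes are used.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(audio_path):
    # Hypothetical stand-in for the Whisper transcription stage.
    return f"transcript of {audio_path}"


def diarize(audio_path):
    # Hypothetical stand-in for the NeMo diarization stage.
    return f"speaker turns of {audio_path}"


def run_in_parallel(audio_path):
    # Submit both stages at once, then wait for each result.
    with ThreadPoolExecutor(max_workers=2) as pool:
        transcript_future = pool.submit(transcribe, audio_path)
        diarization_future = pool.submit(diarize, audio_path)
        return transcript_future.result(), diarization_future.result()


transcript, turns = run_in_parallel("audio.wav")
```

Because the two stages do not depend on each other's output, the total wall-clock time approaches that of the slower stage rather than the sum of both.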

## Installation

```bash
pip install whisperx-nemo-pipeline
```

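**With constraints (recommended for production).** The package metadata declares a `constraints` extra that pins `huggingface_hub<0.24` and `numpy<2`. The command below instead uses pip's `-c` flag, which expects a `constraints.txt` file in the working directory: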
```bash
pip install whisperx-nemo-pipeline -c constraints.txt
```
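
To confirm which version ended up in the environment, the standard library's `importlib.metadata` can be queried. This is a small sketch; the helper name `installed_version` is illustrative and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version("whisperx-nemo-pipeline"))
```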

## Quick Start

```python
from whisperx_nemo_pipeline import create_transcription_pipeline

# Create pipeline
pipeline = create_transcription_pipeline(
    audio_path="path/to/your/audio.wav",
    model_name="large-v2",
    device="cuda",  # or "cpu"
    stemming=True,  # Enable source separation
    backend="faster_whisper"  # or "whisperx"
)

# Process audio
transcript_path, srt_path, timing_info = pipeline.process()

print(f"Transcript saved to: {transcript_path}")
print(f"Subtitles saved to: {srt_path}")
print(f"Processing took: {timing_info['total_time']:.2f}s")
```
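
The subtitle file can then be post-processed with ordinary Python. The sketch below assumes the output follows the standard SubRip layout (an index line, a `start --> end` timing line, then caption text). Whether the pipeline adds speaker labels to each caption is not stated here, so the caption text is returned unchanged. `parse_srt` is a hypothetical helper, not part of the package.

```python
import re

TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Parse SubRip text into a list of (start_seconds, end_seconds, caption)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the caption text.
                cues.append((start, end, "\n".join(lines[i + 1:]).strip()))
                break
    return cues


sample = "1\n00:00:00,000 --> 00:00:02,500\nHello there.\n"
for start, end, caption in parse_srt(sample):
    print(f"{start:.1f}s-{end:.1f}s: {caption}")
```

In practice the text would come from the file at `srt_path`, for example via `pathlib.Path(srt_path).read_text(encoding="utf-8")`.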

## Advanced Usage

```python
from whisperx_nemo_pipeline import TranscriptionPipeline, TranscriptionConfig

# Custom configuration
config = TranscriptionConfig(
    audio_path="path/to/audio.wav",
    model_name="large-v2",
    device="cuda",
    batch_size=8,
    language="en",  # or None for auto-detection
    stemming=True,
    suppress_numerals=False,
    backend="faster_whisper"
)

# Create pipeline with custom config
pipeline = TranscriptionPipeline(config)

# Process
transcript_path, srt_path, timing_info = pipeline.process()
```

## Configuration Options

- `audio_path`: Path to input audio file
- `model_name`: Whisper model size ("tiny", "base", "small", "medium", "large-v2")
- `device`: Computing device ("cuda" or "cpu")
- `batch_size`: Batch size for inference (default: 4)
- `language`: Language code or None for auto-detection
- `stemming`: Enable audio source separation (default: True)
- `suppress_numerals`: Suppress numerical tokens (default: False)
- `backend`: "faster_whisper" or "whisperx"
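
The options above can be validated before any model is loaded. The following is a hypothetical mirror of the documented options, not the package's `TranscriptionConfig`. The `language=None` default is an assumption, and the allowed model names are only the ones listed above, so the real class may accept more.

```python
from dataclasses import dataclass
from typing import Optional

MODEL_NAMES = {"tiny", "base", "small", "medium", "large-v2"}
DEVICES = {"cuda", "cpu"}
BACKENDS = {"faster_whisper", "whisperx"}


@dataclass
class ExampleConfig:
    # Hypothetical illustration only; field names follow the list above.
    audio_path: str
    model_name: str
    device: str
    backend: str
    batch_size: int = 4              # documented default
    language: Optional[str] = None   # assumed default; None means auto-detection
    stemming: bool = True            # documented default
    suppress_numerals: bool = False  # documented default

    def __post_init__(self):
        # Fail early with a clear message instead of deep inside model loading.
        if self.model_name not in MODEL_NAMES:
            raise ValueError(f"unknown model_name: {self.model_name!r}")
        if self.device not in DEVICES:
            raise ValueError(f"unknown device: {self.device!r}")
        if self.backend not in BACKENDS:
            raise ValueError(f"unknown backend: {self.backend!r}")
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")


config = ExampleConfig("audio.wav", "large-v2", "cuda", "faster_whisper")
```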

## Requirements

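- Python 3.8+ according to the package metadata (several pinned dependencies, such as `ipython>=9.4.0` and `scipy>=1.16.1`, only support Python 3.11 or newer)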
- CUDA-capable GPU (recommended)
- See `requirements.txt` for full dependency list
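
Since a GPU is recommended but not required, the `device` value can be chosen at runtime. This sketch uses PyTorch's `torch.cuda.is_available()` and falls back to `"cpu"` when PyTorch is missing or reports no usable GPU. `pick_device` is a hypothetical helper, not part of the package.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA backend to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```

The result can be passed directly as the `device` argument shown in the Quick Start.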

## License

MIT License
