Metadata-Version: 2.4
Name: plantain2asr
Version: 1.0.4
Summary: A benchmarking and analysis framework for Russian ASR models
License-Expression: MIT
Project-URL: Homepage, https://github.com/akatsnelson/plantain2asr
Project-URL: Documentation, https://akatsnelson.github.io/plantain2asr
Project-URL: Issues, https://github.com/akatsnelson/plantain2asr/issues
Keywords: asr,speech-recognition,benchmarking,nlp,russian,wer
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: jiwer>=3.0
Requires-Dist: tqdm>=4.64
Requires-Dist: requests>=2.28
Provides-Extra: tone
Requires-Dist: miniaudio>=1.2; extra == "tone"
Requires-Dist: onnxruntime>=1.16; extra == "tone"
Provides-Extra: tone-cpu
Requires-Dist: miniaudio>=1.2; extra == "tone-cpu"
Requires-Dist: onnxruntime>=1.16; extra == "tone-cpu"
Provides-Extra: tone-gpu
Requires-Dist: miniaudio>=1.2; extra == "tone-gpu"
Requires-Dist: onnxruntime-gpu>=1.16; extra == "tone-gpu"
Provides-Extra: gigaam-v2
Requires-Dist: gigaam>=0.1.0; extra == "gigaam-v2"
Provides-Extra: gigaam-v3
Requires-Dist: torch<2.6,>=2.5; extra == "gigaam-v3"
Requires-Dist: torchaudio<2.6,>=2.5; extra == "gigaam-v3"
Requires-Dist: transformers<5,>=4.40; extra == "gigaam-v3"
Requires-Dist: accelerate>=0.27; extra == "gigaam-v3"
Requires-Dist: hydra-core>=1.3; extra == "gigaam-v3"
Requires-Dist: omegaconf>=2.3; extra == "gigaam-v3"
Requires-Dist: sentencepiece>=0.1.99; extra == "gigaam-v3"
Provides-Extra: gigaam
Requires-Dist: gigaam>=0.1.0; extra == "gigaam"
Requires-Dist: torch<2.6,>=2.5; extra == "gigaam"
Requires-Dist: torchaudio<2.6,>=2.5; extra == "gigaam"
Requires-Dist: transformers<5,>=4.40; extra == "gigaam"
Requires-Dist: accelerate>=0.27; extra == "gigaam"
Requires-Dist: hydra-core>=1.3; extra == "gigaam"
Requires-Dist: omegaconf>=2.3; extra == "gigaam"
Requires-Dist: sentencepiece>=0.1.99; extra == "gigaam"
Provides-Extra: whisper
Requires-Dist: torch<2.6,>=2.5; extra == "whisper"
Requires-Dist: transformers<5,>=4.40; extra == "whisper"
Requires-Dist: accelerate>=0.27; extra == "whisper"
Requires-Dist: librosa>=0.10; extra == "whisper"
Provides-Extra: vosk
Requires-Dist: vosk>=0.3.45; extra == "vosk"
Provides-Extra: canary
Requires-Dist: nemo_toolkit[asr]>=1.23; extra == "canary"
Provides-Extra: analysis
Requires-Dist: pandas>=1.5; extra == "analysis"
Requires-Dist: numpy>=1.23; extra == "analysis"
Requires-Dist: matplotlib>=3.6; extra == "analysis"
Requires-Dist: seaborn>=0.12; extra == "analysis"
Requires-Dist: scikit-learn>=1.2; extra == "analysis"
Requires-Dist: pymorphy3>=1.0; extra == "analysis"
Requires-Dist: gensim>=4.3; extra == "analysis"
Requires-Dist: bert-score>=0.3.13; extra == "analysis"
Requires-Dist: torchmetrics>=1.0; extra == "analysis"
Requires-Dist: num2words>=0.5.13; extra == "analysis"
Provides-Extra: train
Requires-Dist: torch<2.6,>=2.5; extra == "train"
Requires-Dist: transformers<5,>=4.40; extra == "train"
Requires-Dist: accelerate>=0.27; extra == "train"
Requires-Dist: datasets>=2.14; extra == "train"
Requires-Dist: wandb>=0.16; extra == "train"
Provides-Extra: asr-cpu
Requires-Dist: miniaudio>=1.2; extra == "asr-cpu"
Requires-Dist: onnxruntime>=1.16; extra == "asr-cpu"
Requires-Dist: gigaam>=0.1.0; extra == "asr-cpu"
Requires-Dist: torch<2.6,>=2.5; extra == "asr-cpu"
Requires-Dist: torchaudio<2.6,>=2.5; extra == "asr-cpu"
Requires-Dist: transformers<5,>=4.40; extra == "asr-cpu"
Requires-Dist: accelerate>=0.27; extra == "asr-cpu"
Requires-Dist: hydra-core>=1.3; extra == "asr-cpu"
Requires-Dist: omegaconf>=2.3; extra == "asr-cpu"
Requires-Dist: sentencepiece>=0.1.99; extra == "asr-cpu"
Requires-Dist: librosa>=0.10; extra == "asr-cpu"
Requires-Dist: vosk>=0.3.45; extra == "asr-cpu"
Provides-Extra: asr-gpu
Requires-Dist: miniaudio>=1.2; extra == "asr-gpu"
Requires-Dist: onnxruntime-gpu>=1.16; extra == "asr-gpu"
Requires-Dist: gigaam>=0.1.0; extra == "asr-gpu"
Requires-Dist: torch<2.6,>=2.5; extra == "asr-gpu"
Requires-Dist: torchaudio<2.6,>=2.5; extra == "asr-gpu"
Requires-Dist: transformers<5,>=4.40; extra == "asr-gpu"
Requires-Dist: accelerate>=0.27; extra == "asr-gpu"
Requires-Dist: hydra-core>=1.3; extra == "asr-gpu"
Requires-Dist: omegaconf>=2.3; extra == "asr-gpu"
Requires-Dist: sentencepiece>=0.1.99; extra == "asr-gpu"
Requires-Dist: librosa>=0.10; extra == "asr-gpu"
Requires-Dist: vosk>=0.3.45; extra == "asr-gpu"
Provides-Extra: all
Requires-Dist: miniaudio>=1.2; extra == "all"
Requires-Dist: onnxruntime>=1.16; extra == "all"
Requires-Dist: gigaam>=0.1.0; extra == "all"
Requires-Dist: torch<2.6,>=2.5; extra == "all"
Requires-Dist: torchaudio<2.6,>=2.5; extra == "all"
Requires-Dist: transformers<5,>=4.40; extra == "all"
Requires-Dist: accelerate>=0.27; extra == "all"
Requires-Dist: hydra-core>=1.3; extra == "all"
Requires-Dist: omegaconf>=2.3; extra == "all"
Requires-Dist: sentencepiece>=0.1.99; extra == "all"
Requires-Dist: librosa>=0.10; extra == "all"
Requires-Dist: vosk>=0.3.45; extra == "all"
Requires-Dist: pandas>=1.5; extra == "all"
Requires-Dist: numpy>=1.23; extra == "all"
Requires-Dist: matplotlib>=3.6; extra == "all"
Requires-Dist: seaborn>=0.12; extra == "all"
Requires-Dist: scikit-learn>=1.2; extra == "all"
Requires-Dist: pymorphy3>=1.0; extra == "all"
Requires-Dist: gensim>=4.3; extra == "all"
Requires-Dist: bert-score>=0.3.13; extra == "all"
Requires-Dist: torchmetrics>=1.0; extra == "all"
Requires-Dist: num2words>=0.5.13; extra == "all"
Requires-Dist: datasets>=2.14; extra == "all"
Requires-Dist: wandb>=0.16; extra == "all"
Provides-Extra: all-gpu
Requires-Dist: miniaudio>=1.2; extra == "all-gpu"
Requires-Dist: onnxruntime-gpu>=1.16; extra == "all-gpu"
Requires-Dist: gigaam>=0.1.0; extra == "all-gpu"
Requires-Dist: torch<2.6,>=2.5; extra == "all-gpu"
Requires-Dist: torchaudio<2.6,>=2.5; extra == "all-gpu"
Requires-Dist: transformers<5,>=4.40; extra == "all-gpu"
Requires-Dist: accelerate>=0.27; extra == "all-gpu"
Requires-Dist: hydra-core>=1.3; extra == "all-gpu"
Requires-Dist: omegaconf>=2.3; extra == "all-gpu"
Requires-Dist: sentencepiece>=0.1.99; extra == "all-gpu"
Requires-Dist: librosa>=0.10; extra == "all-gpu"
Requires-Dist: vosk>=0.3.45; extra == "all-gpu"
Requires-Dist: pandas>=1.5; extra == "all-gpu"
Requires-Dist: numpy>=1.23; extra == "all-gpu"
Requires-Dist: matplotlib>=3.6; extra == "all-gpu"
Requires-Dist: seaborn>=0.12; extra == "all-gpu"
Requires-Dist: scikit-learn>=1.2; extra == "all-gpu"
Requires-Dist: pymorphy3>=1.0; extra == "all-gpu"
Requires-Dist: gensim>=4.3; extra == "all-gpu"
Requires-Dist: bert-score>=0.3.13; extra == "all-gpu"
Requires-Dist: torchmetrics>=1.0; extra == "all-gpu"
Requires-Dist: num2words>=0.5.13; extra == "all-gpu"
Requires-Dist: datasets>=2.14; extra == "all-gpu"
Requires-Dist: wandb>=0.16; extra == "all-gpu"
Dynamic: license-file

# 🌱 plantain2asr

[![PyPI version](https://img.shields.io/pypi/v/plantain2asr.svg)](https://pypi.org/project/plantain2asr/)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Docs](https://img.shields.io/badge/docs-github%20pages-blue.svg)](https://akatsnelson.github.io/plantain2asr)

**Benchmarking and analysis framework for Russian ASR models.**

`plantain2asr` lets you compare ASR backends, normalize transcripts, compute metrics, inspect errors, benchmark latency, and export research artifacts, all on top of a single composable pipeline model.

## Start Here

There are three entry points, ordered from simplest to most flexible:

1. **Interactive Constructor**: open the docs constructor and assemble a ready-made chain visually.
2. **`Experiment` facade**: run common research scenarios with a few high-level calls.
3. **`>>` pipeline API**: build custom chains from datasets, models, normalizers, metrics, reports, and analyzers.

Docs: [akatsnelson.github.io/plantain2asr](https://akatsnelson.github.io/plantain2asr)

## Install

```bash
# Core only: datasets, normalization, metrics, reports
pip install plantain2asr

# Common CPU-only local stack (quotes keep shells such as zsh from globbing the brackets)
pip install "plantain2asr[asr-cpu]"

# Common GPU-ready local stack
pip install "plantain2asr[asr-gpu]"

# Individual model families
pip install "plantain2asr[gigaam]"
pip install "plantain2asr[whisper]"
pip install "plantain2asr[vosk]"
pip install "plantain2asr[canary]"

# T-One: the `tone` extra plus the T-One source archive
pip install "plantain2asr[tone]"
pip install "tone @ https://github.com/voicekit-team/T-one/archive/3c5b6c015038173840e62cea99e10cdb1c759116.tar.gz"

# Research analysis tools
pip install "plantain2asr[analysis]"

# Everything (CPU ONNX Runtime); use the all-gpu extra for the onnxruntime-gpu variant
pip install "plantain2asr[all]"
```

Device selection is automatic where supported: NVIDIA GPU first, then MPS, then CPU.
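
The priority order above can be sketched as a small helper. This is an illustration only, assuming PyTorch-style availability checks; `pick_device` and `detect_device` are hypothetical names, not part of `plantain2asr`, and the library's actual detection logic may differ.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch if it is installed; fall back to CPU otherwise."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without PyTorch, only CPU is considered.
        return "cpu"
    mps = getattr(torch.backends, "mps", None)
    return pick_device(
        cuda_available=torch.cuda.is_available(),
        mps_available=bool(mps is not None and mps.is_available()),
    )


print(detect_device())
```

Keeping the priority rule in a pure function separate from the hardware probe makes the ordering easy to test on any machine.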

## Recommended Quick Start

For most research workflows, start with the `>>` pipeline:

```python
from plantain2asr import GolosDataset, Models, SimpleNormalizer, Metrics

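# Load the corpus from a local path
ds = GolosDataset("data/golos")

# Run each model over the dataset (needs the `gigaam` and `whisper` extras)
ds >> Models.GigaAM_v3()
ds >> Models.Whisper()

# Normalize into a new dataset view, then compute metrics on that view
norm = ds >> SimpleNormalizer()
norm >> Metrics.composite()

# pandas is not a core dependency; it ships with the `analysis` extra
df = norm.to_pandas()
print(df.groupby("model")[["WER", "CER", "Accuracy"]].mean().sort_values("WER"))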
```

If you want ready-made research scenarios on top of the same building blocks, use `Experiment`:

- `Experiment.compare_on_corpus()` for straightforward model comparison
- `Experiment.prepare_thesis_tables()` for publication-ready aggregate tables
- `Experiment.export_appendix_bundle()` for a full appendix bundle with exports and an optional static report
- `Experiment.benchmark_models()` for latency, throughput, and RTF measurement
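
For reference, the real-time factor (RTF) is conventionally the processing time divided by the audio duration, so values below 1 mean faster than real time. A minimal sketch of that arithmetic follows; the helper names and timings are made up for illustration and do not reflect the output format of `Experiment.benchmark_models()`.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: time spent transcribing per second of audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def summarize(runs: List[Tuple[float, float]]) -> Dict[str, float]:
    """Aggregate (processing_seconds, audio_seconds) pairs into summary figures."""
    total_proc = sum(p for p, _ in runs)
    total_audio = sum(a for _, a in runs)
    return {
        "mean_latency_s": mean(p for p, _ in runs),
        # Seconds of audio processed per second of wall-clock time.
        "throughput_x": total_audio / total_proc,
        "rtf": rtf(total_proc, total_audio),
    }


# Hypothetical timings: three utterances of 4, 6, and 10 seconds.
runs = [(0.5, 4.0), (1.0, 6.0), (1.5, 10.0)]
summary = summarize(runs)
print(summary)
```

Computing RTF over the totals rather than averaging per-utterance ratios weights long recordings proportionally, which is one common convention; per-utterance averaging is an equally valid choice with different emphasis.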

## Advanced Pipeline API

For full composability, build the chain yourself. The canonical form is:

```python
from plantain2asr import GolosDataset, Models, DagrusNormalizer, Metrics, ReportServer

ds = GolosDataset("data/golos")
ds >> Models.GigaAM_v3()
ds >> Models.Whisper()

norm = ds >> DagrusNormalizer()
norm >> Metrics.composite()

ReportServer(norm, audio_dir="data/golos").serve()
```

Pipeline rules:

- Every step returns a dataset or processor-compatible object.
- Normalization creates a new dataset view and does not mutate the original.
- Model results are cached and safe to resume.
- You can branch at any point with `filter()`, `take()`, or cloned views.
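
To make the rules above concrete, here is a toy model of a `>>` chain built on Python's `__rshift__` hook. It is a sketch under stated assumptions, not the library's implementation: `ToyDataset`, `ToySample`, and `ToyLowercaseNormalizer` are hypothetical names, and the real classes may be structured quite differently.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List


@dataclass(frozen=True)
class ToySample:
    reference: str
    hypothesis: str


@dataclass
class ToyDataset:
    samples: List[ToySample] = field(default_factory=list)

    def __rshift__(self, step: Callable[["ToyDataset"], "ToyDataset"]) -> "ToyDataset":
        # `ds >> step` hands the dataset to the step and returns the step's result.
        return step(self)

    def take(self, n: int) -> "ToyDataset":
        # Branching: a new dataset over the first n samples.
        return ToyDataset(self.samples[:n])


class ToyLowercaseNormalizer:
    def __call__(self, ds: ToyDataset) -> ToyDataset:
        # Build a new dataset; the input dataset and its samples stay untouched.
        return ToyDataset([
            replace(s, reference=s.reference.lower(), hypothesis=s.hypothesis.lower())
            for s in ds.samples
        ])


ds = ToyDataset([ToySample("Привет Мир", "привет мир"), ToySample("Да", "да")])
norm = ds.take(1) >> ToyLowercaseNormalizer()

print(ds.samples[0].reference)    # original text, still mixed case
print(norm.samples[0].reference)  # lowercased copy in the new view
```

Frozen samples plus copy-on-normalize is one simple way to guarantee that a branch never changes the dataset it was taken from.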

## Supported Models

| Call | Stored name | Extra | Device |
|---|---|---|---|
| `Models.GigaAM_v3()` | `GigaAM-v3-e2e_rnnt` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v3(model_name="e2e_ctc")` | `GigaAM-v3-e2e_ctc` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v3(model_name="rnnt")` | `GigaAM-v3-rnnt` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v3(model_name="ctc")` | `GigaAM-v3-ctc` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v2(model_name="v2_rnnt")` | `GigaAM-v2_rnnt` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v2(model_name="v2_ctc")` | `GigaAM-v2_ctc` | `gigaam` | CUDA / MPS / CPU |
| `Models.Whisper()` | `Whisper-whisper-large-v3-ru-podlodka` | `whisper` | CUDA / MPS / CPU |
| `Models.Tone()` | `T-One` | `tone` + T-One source archive | CUDA / CPU |
| `Models.Vosk(model_path=...)` | `Vosk` | `vosk` | CPU |
| `Models.Canary()` | `Canary-1B` | `canary` | CUDA |
| `Models.SaluteSpeech()` | `SaluteSpeech` | none | cloud |

You can also resolve models by user-facing names with `Models.create(...)`, including case and separator variants such as `"gigaam_v3"`, `"GigaAM-v3"`, or `"tone"`.
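
One way such case- and separator-insensitive lookup can work is to reduce every name to a canonical key before consulting a registry. The sketch below is an assumption about the general technique, not the actual logic behind `Models.create(...)`; `canonical_key`, `resolve`, and `TOY_REGISTRY` are hypothetical.

```python
import re


def canonical_key(name: str) -> str:
    """Lowercase and drop separators so spelling variants collapse to one key."""
    return re.sub(r"[\s_\-.]+", "", name.lower())


# Hypothetical registry mapping canonical keys to factory names.
TOY_REGISTRY = {
    canonical_key("GigaAM_v3"): "GigaAM_v3",
    canonical_key("Tone"): "Tone",
    canonical_key("Whisper"): "Whisper",
}


def resolve(name: str) -> str:
    """Map a user-facing spelling to the registered factory name."""
    key = canonical_key(name)
    try:
        return TOY_REGISTRY[key]
    except KeyError:
        known = ", ".join(sorted(TOY_REGISTRY.values()))
        raise ValueError(f"Unknown model {name!r}; known: {known}") from None


for variant in ("gigaam_v3", "GigaAM-v3", "tone"):
    print(variant, "->", resolve(variant))
```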

## Typical Research Outputs

- Metrics tables as Python dicts or pandas DataFrames
- Leaderboards sorted by a chosen metric
- Error-case tables and CSV exports
- Static HTML reports for sharing without a running server
- Appendix bundles with thesis-ready artifacts
- Benchmark summaries for CPU, CUDA, or MPS

## Extending

`plantain2asr` keeps the "plantain" idea of modular composition. If the built-in stack is not enough, extend one of the four base types:

- `BaseASRModel`
- `BaseNormalizer`
- `BaseMetric`
- `BaseSection`

See the extending guides in the docs for custom components and implementation patterns.

## License

MIT — Artem Katsnelson
