Metadata-Version: 2.4
Name: jaxcld
Version: 0.1.0
Summary: CLD: language detection heads for ASR models
Author: CLD contributors
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.24
Requires-Dist: torch>=2.0.0
Requires-Dist: torchaudio>=2.0.0
Requires-Dist: transformers==4.56.2
Requires-Dist: scikit-learn>=1.3.0
Provides-Extra: train
Requires-Dist: datasets[audio]==3.6.0; extra == "train"
Requires-Dist: soundfile>=0.12.1; extra == "train"
Requires-Dist: scipy>=1.10; extra == "train"
Requires-Dist: tqdm>=4.66; extra == "train"
Requires-Dist: pandas>=1.5.0; extra == "train"
Requires-Dist: librosa>=0.10.1; extra == "train"
Requires-Dist: noisereduce>=3.0.0; extra == "train"
Requires-Dist: pydub>=0.25.1; extra == "train"
Requires-Dist: accelerate>=0.20.0; extra == "train"
Requires-Dist: evaluate>=0.4.0; extra == "train"
Requires-Dist: jiwer>=3.0.0; extra == "train"
Requires-Dist: torchcodec==0.10.0; extra == "train"
Requires-Dist: wandb>=0.15.0; extra == "train"
Requires-Dist: tensorboard>=2.13.0; extra == "train"
Requires-Dist: huggingface_hub>=0.17.0; extra == "train"
Requires-Dist: gradio>=3.0.0; extra == "train"
Requires-Dist: audiomentations==0.43.1; extra == "train"
Requires-Dist: jax==0.7.2; extra == "train"
Requires-Dist: optax==0.2.6; extra == "train"
Requires-Dist: flax==0.11.2; extra == "train"
Requires-Dist: python-dotenv==1.1.1; extra == "train"

## jaxcld

`jaxcld` is a lightweight language-detection module for multilingual ASR models (Whisper / MMS). It provides an `ASRModel` wrapper plus pluggable language detection heads you can attach at inference time.

## Install

```bash
pip install jaxcld
```

If you are developing from source:

```bash
pip install -e .
```

The training dependencies are declared as the optional `train` extra. To include them, install `jaxcld[train]` instead (or `.[train]` when installing from source).

## Using the package (minimal inference example)

```python
import numpy as np

from jaxcld import ASRModel, CVXNNLangDetectHead, NNLangDetectHead, SVMLangDetectHead

# 1) Load the base ASR model
languages = ["en", "hi", "id", "ms", "zh"]
asr = ASRModel.from_pretrained("openai/whisper-small", config={"languages": languages})

# 2) Load a language detection head artifact (choose ONE; swap the active line as needed)
head = CVXNNLangDetectHead.load("path/to/whisper-small_trained_cvx_mlp.pkl", asr)
# head = NNLangDetectHead.load("path/to/openai_whisper-small_nn_head.pkl", asr)
# head = SVMLangDetectHead.load("path/to/openai_whisper-small_linear_svm.pkl", asr)

# 3) Attach head and run inference
asr.set_lang_detect_head(head)

audio_16k_mono: np.ndarray = ...  # shape (T,), sampling rate 16kHz
pred_langs, pred_texts = asr.predict(audio_16k_mono)
print(pred_langs[0], pred_texts[0])
```
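
The example above expects `audio_16k_mono` to be a 1-D array sampled at 16 kHz. If your audio is stereo or at a different rate, it needs to be converted first. The helper below is a minimal sketch of that step. It is not part of `jaxcld`: the name `to_mono_16k`, the assumed `(channels, T)` layout for multi-channel input, and the `float32` output dtype are all illustrative assumptions. It uses plain linear interpolation with no anti-aliasing filter, so treat it as a rough stand-in for a proper resampler such as `torchaudio.functional.resample`.

```python
import numpy as np

TARGET_SR = 16_000  # sampling rate expected by the inference example


def to_mono_16k(waveform: np.ndarray, sample_rate: int) -> np.ndarray:
    """Hypothetical helper: return a float32 array of shape (T,) at 16 kHz.

    Accepts mono input of shape (T,) or multi-channel input of shape
    (channels, T); the channel-first layout is an assumption.
    """
    x = np.asarray(waveform, dtype=np.float32)

    # Downmix by averaging the channels.
    if x.ndim == 2:
        x = x.mean(axis=0)
    elif x.ndim != 1:
        raise ValueError(f"expected 1-D or 2-D audio, got shape {x.shape}")

    # Nothing to resample.
    if sample_rate == TARGET_SR or x.size == 0:
        return x

    # Crude resampling: linearly interpolate the signal on a 16 kHz time grid.
    n_out = int(round(x.shape[0] * TARGET_SR / sample_rate))
    t_in = np.arange(x.shape[0]) / sample_rate
    t_out = np.arange(n_out) / TARGET_SR
    return np.interp(t_out, t_in, x).astype(np.float32)
```

With this in place, `audio_16k_mono = to_mono_16k(raw_audio, raw_sample_rate)` produces an array that can be passed to `asr.predict` as in the example above.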

## Notes

- Head artifacts (`*.pkl`) are produced by the training scripts in the source repository. This README covers **package usage** only.

