Metadata-Version: 2.3
Name: fasr-lid-firered
Version: 0.5.2
Summary: FireRedLID language identification model for fasr
Author: osc
Author-email: osc <790990241@qq.com>
Requires-Dist: fasr
Requires-Dist: torch>=2.0.0
Requires-Dist: numpy>=1.24
Requires-Dist: kaldiio>=2.18.0
Requires-Dist: kaldi-native-fbank>=1.19.0
Requires-Python: >=3.10, <3.13
Description-Content-Type: text/markdown

# fasr-lid-firered

[Chinese documentation](README_ZH.md)

FireRedLID language identification for fasr. It accepts a `Waveform` and returns
a language tag such as `"zh"` or `"en"`.

## Install

```bash
pip install fasr-lid-firered
```

## Registered Model

| Registry name | Class | Best for |
|---|---|---|
| `firered` | `FireRedForLID` | Audio language identification with FireRedLID |

The default checkpoint is `FireRedTeam/FireRedLID`.

## Direct Model Usage

```python
from fasr.config import registry
from fasr.data import Waveform

model = registry.lid_models.get("firered")(
    use_gpu=True,
    use_half=True,
    max_chunk_seconds=60.0,
)

waveform = Waveform.from_file("example.wav")
language = model.identify(waveform)
print(language)
```

To use local weights instead of the default checkpoint:

```python
model.load_checkpoint("/path/to/FireRedLID")
```

## Confection Config

```ini
[lid_model]
@lid_models = "firered"
use_gpu = true
use_half = false
max_chunk_seconds = 60.0
```
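The block above follows the confection config format used by fasr. A minimal sketch of parsing it with the `confection` package (the resolution step into a live model is an assumption here and requires fasr installed):

```python
from confection import Config

CONFIG_STR = """
[lid_model]
@lid_models = "firered"
use_gpu = true
use_half = false
max_chunk_seconds = 60.0
"""

# confection parses values into native Python types (bool, float, str).
config = Config().from_str(CONFIG_STR)
print(config["lid_model"]["use_half"])           # False
print(config["lid_model"]["max_chunk_seconds"])  # 60.0
```

Turning the parsed config into a model instance goes through fasr's registry (e.g. `registry.resolve(config)["lid_model"]`); consult the fasr documentation for the exact entry point.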

## Parameters

| Parameter | Type / range | Default | Higher / true | Lower / false | Change when |
|---|---|---|---|---|---|
| `use_gpu` | `bool` | `True` | Uses CUDA when available | Uses CPU | You want predictable CPU deployment or faster GPU inference |
| `use_half` | `bool` | `False` | Uses FP16 on GPU, lower VRAM | Uses FP32, more stable | GPU memory is tight, or FP16 causes instability |
| `max_chunk_seconds` | `float > 0` | `60.0` | Fewer chunks, more memory per call | More chunks, less memory per call | Long audio causes OOM, or throughput needs tuning |

Generic checkpoint fields such as `checkpoint`, `cache_dir`, `endpoint`,
`revision`, and `force_download` are inherited from the base model.
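To see how `max_chunk_seconds` trades chunk count against per-call memory, the sketch below estimates the number of chunks for a given audio duration. The ceil-division rule and the function name are illustrative assumptions, not the library's documented chunking behavior:

```python
import math

def estimated_chunks(duration_seconds: float, max_chunk_seconds: float = 60.0) -> int:
    """Illustrative only: assume audio splits into ceil(duration / max_chunk) chunks."""
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be > 0")
    return max(1, math.ceil(duration_seconds / max_chunk_seconds))

# A 150 s file with the default 60 s limit yields 3 chunks; halving the
# limit to 30 s raises the count to 5, shrinking peak memory per call.
print(estimated_chunks(150.0))        # 3
print(estimated_chunks(150.0, 30.0))  # 5
```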

## Tuning Guide

| Symptom | Try first |
|---|---|
| GPU out of memory on long audio | Lower `max_chunk_seconds` to `20.0` or `30.0` |
| GPU memory is tight | Set `use_half=True` |
| CPU-only deployment | Set `use_gpu=False` |
| Language result is unstable on very long audio | Keep chunking enabled; the model already votes across chunks internally |

## Dependencies

- `fasr`
- `torch >= 2.0.0`
- `numpy >= 1.24`
- `kaldiio >= 2.18.0`
- `kaldi-native-fbank >= 1.19.0`
- Python `>= 3.10, < 3.13`
