Metadata-Version: 2.4
Name: lisaai
Version: 0.2.0
Summary: High-level APIs and training utilities for the LISA Vision-to-Speech model
Author-email: LISA Team <lisateam@gmail.com>
License-Expression: Apache-2.0
Project-URL: homepage, https://lisaai.com/lisa
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: torch
Requires-Dist: safetensors

# lisaai

High-level, repo-independent tools for the LISA Vision-to-Speech model.

## Features (v0.2.0)
- **True Vision-to-Speech**: Direct vision-to-mel-spectrogram synthesis with no text or CTC intermediate representation.
- **Improved Fusion**: Proper cross-attention between vision and audio streams.
- **Training Utilities**: Full support for masked modelling, contrastive alignment, and mel-generation losses.
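
In training, the three objectives are typically combined into a single scalar for backpropagation. A minimal, hypothetical sketch of that idea (the names and default weights below are illustrative, not part of the lisaai API):

```python
# Hypothetical helper: weighted sum of the three training objectives.
# None of these names come from lisaai; they only illustrate the idea.
def total_loss(masked_loss, contrastive_loss, mel_loss,
               weights=(1.0, 0.5, 1.0)):
    """Combine masked-modelling, contrastive-alignment, and
    mel-generation losses into one training loss."""
    w_masked, w_contrastive, w_mel = weights
    return (w_masked * masked_loss
            + w_contrastive * contrastive_loss
            + w_mel * mel_loss)
```

The relative weights are a standard knob for balancing objectives that live on different scales.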

## Install

From PyPI:
```bash
pip install lisaai --upgrade
```

From source:
```bash
pip install -e .
```

## CLI

- Inspect a checkpoint (folder or .safetensors):
```bash
lisa.inspect "PATH/TO/MODEL_DIR"
```
- Compare param counts by saved prefixes:
```bash
lisa.compare-params "PATH/TO/MODEL_DIR"
```
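
Grouping parameter counts by saved prefix can be pictured roughly like this (a hypothetical illustration of the idea, not the lisaai implementation):

```python
from collections import defaultdict

def count_params_by_prefix(shapes):
    """Group parameter counts by the top-level name prefix.

    `shapes` maps tensor names (e.g. "vision.encoder.weight") to
    their shapes. Hypothetical helper; not part of the lisaai API.
    """
    counts = defaultdict(int)
    for name, shape in shapes.items():
        prefix = name.split(".", 1)[0]  # e.g. "vision"
        n = 1
        for dim in shape:
            n *= dim  # number of elements in this tensor
        counts[prefix] += n
    return dict(counts)
```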

## Python API

```python
from lisaai import inspect_checkpoint, compare_param_counts

rep = inspect_checkpoint("PATH/TO/MODEL_DIR")
print(rep["inferred_dimensions"])  # {vision_embed_dim, fusion_hidden_dim}
```

Optional runtime loading (requires the original repo to be available):
```python
from lisaai import load_model
m = load_model()  # will use LISA_MODEL_PATH if set
```
