Metadata-Version: 2.4
Name: small-hybrid-reranker
Version: 0.1.0
Summary: Lightweight hybrid reranker with baked-in model artifact.
Author: cnmoro
Keywords: nlp,ranking,reranker,search
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: joblib>=1.3
Requires-Dist: lightgbm>=4.5.0
Requires-Dist: model2vec>=0.7.0
Requires-Dist: numpy>=1.24
Requires-Dist: rank-bm25>=0.2.2
Description-Content-Type: text/markdown

# small-hybrid-reranker

`small-hybrid-reranker` is a lightweight reranker package with a baked-in trained model.

It reranks a list of passages for a query using a hybrid feature stack:
- static embeddings (`cnmoro/static-nomic-384-pten`)
- lexical overlap and token interaction sketches
- BM25 and dense retrieval priors
- listwise LightGBM ranker
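
To give a feel for what a lexical-overlap feature can look like, here is a minimal sketch in plain Python. It is an illustration only, not the package's actual feature code: the helper names `tokenize`, `jaccard_overlap`, and `query_coverage` are hypothetical, and the tokenization rule is an assumption.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens (assumed rule)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_overlap(query: str, passage: str) -> float:
    """Shared tokens divided by all distinct tokens in query and passage."""
    q, p = tokenize(query), tokenize(passage)
    if not q or not p:
        return 0.0
    return len(q & p) / len(q | p)


def query_coverage(query: str, passage: str) -> float:
    """Fraction of query tokens that also appear in the passage."""
    q = tokenize(query)
    if not q:
        return 0.0
    return len(q & tokenize(passage)) / len(q)
```

Signals like these would typically be combined with the embedding and BM25 features into one feature vector per query-passage pair before the ranker scores them.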

The model artifact is included in the package, so there is no separate checkpoint download.

## Install

```bash
pip install small-hybrid-reranker
```

## Quickstart

```python
from small_hybrid_reranker import HybridReranker

reranker = HybridReranker()

query = "What is the speed of light?"
passages = [
    "The speed of light in a vacuum is about 299,792 km/s.",
    "Earth orbits the Sun in about 365 days.",
    "Newton described laws of motion.",
]

ranked = reranker.rerank(query, passages)
print(ranked[0])
# {'passage': 'The speed of light in a vacuum is about 299,792 km/s.', 'score': 100.0}
```

## API

### `HybridReranker(model_path: str | None = None)`

- `model_path=None`: uses the baked-in model shipped inside the package.
- `model_path="...joblib"`: loads your own compatible artifact.

### `rerank(query: str, passages: list[str], top_k: int | None = None) -> list[dict]`

Returns:

```python
[
  {"passage": "...", "score": 82.31},
  {"passage": "...", "score": 40.87},
]
```

Scores are floats in `[0, 100]`, and results are sorted by score in descending order.
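
Because the result is a plain list of dicts, post-processing needs no special helpers. The sketch below works on a hand-written list in the documented shape; the `filter_by_score` helper and the `50.0` cutoff are hypothetical and not part of the package.

```python
# Sample data in the documented return shape (illustrative values only).
ranked = [
    {"passage": "first passage", "score": 82.31},
    {"passage": "second passage", "score": 40.87},
]


def filter_by_score(results: list[dict], min_score: float) -> list[dict]:
    """Keep results whose score is at least min_score, preserving order."""
    return [r for r in results if r["score"] >= min_score]


# Hypothetical cutoff; choose one that suits your own data.
kept = filter_by_score(ranked, min_score=50.0)
kept_passages = [r["passage"] for r in kept]
```

Since the input is already sorted by score, the filtered list keeps that ordering.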

## Notes

- This package is optimized for reranking a provided candidate list.
- It is not a full retrieval system by itself.
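
In practice this means pairing the reranker with a first-stage retriever that produces the candidate list. The following is a minimal sketch of such a two-stage setup, assuming a naive keyword filter as the first stage. `keyword_candidates`, `search`, and `stub_rerank` are hypothetical names; `stub_rerank` only mimics the documented return shape so the example runs without the package.

```python
from typing import Callable

# Any callable with the documented (query, passages) -> list[dict] shape.
Reranker = Callable[[str, list[str]], list[dict]]


def keyword_candidates(query: str, corpus: list[str], limit: int) -> list[str]:
    """Hypothetical first stage: keep documents sharing at least one query term."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in corpus]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)  # stable sort keeps ties in corpus order
    return [doc for _, doc in scored[:limit]]


def search(query: str, corpus: list[str], rerank: Reranker, limit: int = 50) -> list[dict]:
    """Retrieve candidates cheaply, then hand only those to the reranker."""
    return rerank(query, keyword_candidates(query, corpus, limit))


def stub_rerank(query: str, passages: list[str]) -> list[dict]:
    """Stand-in for a real reranker; returns the input order with dummy scores."""
    return [{"passage": p, "score": 0.0} for p in passages]
```

With the package installed, `HybridReranker().rerank` could be passed as the `rerank` argument in place of `stub_rerank`.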
