Metadata-Version: 2.4
Name: suiteeval
Version: 0.1.2
Summary: Tools for running IR Evaluation Suites
Author-email: Sean MacAvaney <sean.macavaney@glasgow.ac.uk>, Andrew Parry <0andrewparry@gmail.com>
Project-URL: Repository, https://github.com/mrparryparry/suiteeval
Project-URL: Bug Tracker, https://github.com/mrparryparry/suiteeval/issues
Classifier: Programming Language :: Python
Classifier: Operating System :: OS Independent
Classifier: Topic :: Text Processing
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: python-terrier
Dynamic: license-file

# 🍬 SuiteEval

[![Python](https://img.shields.io/badge/python-3.10%2B-blue)](https://www.python.org/)
[![License](https://img.shields.io/badge/license-MIT-green)](./LICENSE)
[![PyTerrier](https://img.shields.io/badge/PyTerrier-Compatible-orange)](https://github.com/terrier-org/pyterrier)

Tools for running **IR evaluation suites** with [PyTerrier](https://github.com/terrier-org/pyterrier).  
SuiteEval helps you define, run, and aggregate evaluations across datasets while managing temporary indices and memory footprint.

## 📘 Overview

**SuiteEval** provides:
- Declaration of **pipelines** (BM25, dense, re-ranking chains).
- Execution of **evaluation suites** (e.g., BEIR-style benchmarks).
- **DatasetContext** utilities for temporary paths and text loading.
- **DataFrame** outputs for downstream analysis.

Workflow:
1) Implement `pipelines(context)` that yields one or more PyTerrier pipelines (optionally named).  
2) Pass it to a suite (e.g., `BEIR`).  
3) Analyse the returned DataFrame.

## 🚀 Getting Started

### Install from PyPI
```bash
pip install suiteeval
```
### Install from source
```bash
git clone https://github.com/<your-username>/suiteeval.git
cd suiteeval
pip install -e .
```

## ⚙️ Defining Pipelines

Write a callable that accepts a `DatasetContext` and **returns or yields** pipelines.

- **Return** a list/tuple of pipelines or `(pipeline, name)` pairs; **or**  
- **Yield** pipelines to keep only one large model resident in memory.

`DatasetContext` provides:
- `context.path` — temporary working directory for indices/artifacts.  
- `context.get_corpus_iter()` — iterator suitable for indexing.  
- `context.text_loader()` — attaches document text for re-ranking.

### Example
```python
from suiteeval import BEIR
from pyterrier_pisa import PisaIndex
from pyterrier_dr import ElectraScorer
from pyterrier_t5 import MonoT5ReRanker

def pipelines(context):
    index = PisaIndex(context.path + "/index.pisa")
    index.index(context.get_corpus_iter())

    bm25 = index.bm25()
    yield bm25 >> context.text_loader() >> MonoT5ReRanker(), "BM25 >> monoT5"
    yield bm25 >> context.text_loader() >> ElectraScorer(), "BM25 >> monoELECTRA"

results = BEIR(pipelines)
```

## 🧪 Running Suites

Entry points (e.g., `BEIR`) accept your pipeline factory and return a DataFrame:

```python
results = BEIR(pipelines)  # per-dataset metrics and system names (if provided)
```

## 📦 Reproducibility & Resource Management

- **Temporary indices** live under `context.path` and are cleaned up.  
- Prefer **yielding** pipelines when using large models.  
- Name systems via `(pipeline, "<name>")` for clear result tables and logs.

## 🛠️ Compatibility

Works with modern PyTerrier and common extensions  
(e.g., `pyterrier_pisa`, `pyterrier_dr`, `pyterrier_t5`).  
For older environments, ensure standard PyTerrier transformer interfaces.

## 👥 Authors

- [Andrew Parry](mailto:a.parry.1@research.gla.ac.uk)
- [Sean MacAvaney](mailto:Sean.MacAvaney@glasgow.ac.uk)

## 🧾 Version History

| Version | Date       | Changes        |
|-------:|------------|----------------|
|    0.1 | 2025-11-03 | Initial README |

## License

This project is licensed under the **MIT License** — see the [LICENSE](./LICENSE) file for details.
