Metadata-Version: 2.1
Name: flexrag
Version: 0.1.1
Summary: A RAG Framework for Information Retrieval and Generation.
Home-page: https://github.com/ZhuochengZhang98/flexrag
Author: Zhuocheng Zhang
Author-email: zhuocheng_zhang@outlook.com
License: MIT License
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: numpy<2.0.0
Requires-Dist: transformers>=4.44.0
Requires-Dist: tqdm
Requires-Dist: elasticsearch>=8.14.0
Requires-Dist: torch>=2.3.0
Requires-Dist: sacrebleu>=2.4.2
Requires-Dist: rouge
Requires-Dist: unidecode
Requires-Dist: openai>=1.30.1
Requires-Dist: tenacity
Requires-Dist: lancedb
Requires-Dist: lmdb
Requires-Dist: bm25s
Requires-Dist: hydra-core>=1.3
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: pillow
Requires-Dist: accelerate>=0.26.0
Requires-Dist: beautifulsoup4
Requires-Dist: faiss-cpu
Provides-Extra: gui
Requires-Dist: gradio>=5.8.0; extra == "gui"
Provides-Extra: scann
Requires-Dist: scann>=1.3.2; extra == "scann"
Provides-Extra: llamacpp
Requires-Dist: llama_cpp_python>=0.2.84; extra == "llamacpp"
Provides-Extra: vllm
Requires-Dist: vllm>=0.5.0; extra == "vllm"
Provides-Extra: ollama
Requires-Dist: ollama>=0.2.1; extra == "ollama"
Provides-Extra: duckduckgo
Requires-Dist: duckduckgo_search>=6.1.6; extra == "duckduckgo"
Provides-Extra: anthropic
Requires-Dist: anthropic; extra == "anthropic"
Provides-Extra: minference
Requires-Dist: minference>=0.1.5; extra == "minference"
Provides-Extra: all
Requires-Dist: gradio>=5.8.0; extra == "all"
Requires-Dist: llama_cpp_python>=0.2.84; extra == "all"
Requires-Dist: vllm>=0.5.0; extra == "all"
Requires-Dist: ollama>=0.2.1; extra == "all"
Requires-Dist: duckduckgo_search>=6.1.6; extra == "all"
Requires-Dist: anthropic; extra == "all"
Requires-Dist: minference>=0.1.5; extra == "all"
Requires-Dist: PySocks>=1.7.1; extra == "all"
Provides-Extra: dev
Requires-Dist: black; extra == "dev"
Requires-Dist: pytest; extra == "dev"

<p align="center">
<img src="assets/flexrag-wide.png" width=55%>
</p>

![Language](https://img.shields.io/badge/language-python-brightgreen)
![github license](https://img.shields.io/github/license/ZhuochengZhang98/flexrag)
[![DOI](https://zenodo.org/badge/900151663.svg)](https://doi.org/10.5281/zenodo.14306983)



FlexRAG is an open-source framework designed for Retrieval-Augmented Generation (RAG), combining state-of-the-art information retrieval techniques with large language models (LLMs) for enhanced, context-aware responses.

<!-- <p align="center">
<img src="assets/librarian-gui.gif" width=55%>
</p> -->

# :book: Table of Contents
- [:book: Table of Contents](#book-table-of-contents)
- [:sparkles: Key Features](#sparkles-key-features)
- [:rocket: Getting Started](#rocket-getting-started)
  - [Step 0. Installation](#step-0-installation)
    - [Install from pip](#install-from-pip)
    - [Install from source](#install-from-source)
  - [Step 1. Prepare the Retriever](#step-1-prepare-the-retriever)
    - [Download the Corpus](#download-the-corpus)
    - [Prepare the Index](#prepare-the-index)
  - [Step 2. Run your RAG Application](#step-2-run-your-rag-application)
    - [Run the FlexRAG Example RAG Application with GUI](#run-the-flexrag-example-rag-application-with-gui)
    - [Run the FlexRAG Example Assistants for Knowledge Intensive Tasks](#run-the-flexrag-example-assistants-for-knowledge-intensive-tasks)
    - [Build your own RAG Application](#build-your-own-rag-application)
- [:bar\_chart: Benchmarks](#bar_chart-benchmarks)
- [:label: License](#label-license)
- [:pen: Citation](#pen-citation)
- [:heart: Acknowledgements](#heart-acknowledgements)


# :sparkles: Key Features
- **Multiple Retriever Types**: Supports a wide range of retrievers, including sparse, dense, network-based, and multimodal retrievers.
- **Diverse Information Sources**: Allows integration of multiple data types, such as pure text, images, documents, web snapshots, and more.
- **Unified Configuration Management**: Leveraging OmegaConf and dataclasses, FlexRAG simplifies configuration and management across your entire project.
- **Out-of-the-Box Performance**: Comes with carefully optimized default configurations for retrievers, delivering solid performance without the need for extensive parameter tuning.
- **High Performance**: Built with a persistent cache system and asynchronous methods to significantly improve speed and reduce latency in RAG workflows.
- **Research & Development Friendly**: Supports multiple development modes and includes a companion repository, [flexrag_examples](https://github.com/ZhuochengZhang98/flexrag_examples), to help you reproduce various RAG algorithms with ease.
- **Lightweight**: Designed with minimal overhead, FlexRAG is efficient and easy to integrate into your project.


# :rocket: Getting Started

## Step 0. Installation

### Install from pip
To install FlexRAG via pip:
```bash
pip install flexrag
```

### Install from source
Alternatively, to install from the source:
```bash
pip install pybind11

git clone https://github.com/ZhuochengZhang98/flexrag.git
cd flexrag
pip install ./
```
You can also install FlexRAG in editable mode with the `-e` flag.


## Step 1. Prepare the Retriever

### Download the Corpus
Before starting your RAG application, you need to download the corpus. This example uses the Wikipedia corpus provided by [DPR](https://github.com/facebookresearch/DPR). You can download it by running the following commands:
```bash
# Download the corpus
wget https://dl.fbaipublicfiles.com/dpr/wikipedia_split/psgs_w100.tsv.gz
# Unzip the corpus
gzip -d psgs_w100.tsv.gz
```
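
Before indexing, it can help to check what the decompressed file actually contains. The snippet below is a minimal sketch, assuming the file is tab-separated with a header row; `peek_tsv` is a hypothetical helper and not part of FlexRAG. It prints nothing by itself and only returns the header and the first few rows, so you can compare the column names with the fields you pass to the indexing command.

```python
import csv
import itertools


def peek_tsv(path, n=3):
    """Return the header and the first n data rows of a tab-separated file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        header = next(reader)  # assumption: the first line is a header
        rows = list(itertools.islice(reader, n))
    return header, rows
```

For example, `header, rows = peek_tsv("psgs_w100.tsv")` reads only the first lines of the file, so it stays fast even on a large corpus.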

### Prepare the Index
After downloading the corpus, you need to build the index for the retriever. To use the dense retriever, run the following command:
```bash
CORPUS_PATH=psgs_w100.tsv
CORPUS_FIELDS='[title,text]'
DB_PATH=<path_to_database>

python -m flexrag.entrypoints.prepare_index \
    corpus_path=$CORPUS_PATH \
    saving_fields=$CORPUS_FIELDS \
    retriever_type=dense \
    dense_config.database_path=$DB_PATH \
    dense_config.encode_fields='[text]' \
    dense_config.passage_encoder_config.encoder_type=hf \
    dense_config.passage_encoder_config.hf_config.model_path='facebook/contriever' \
    dense_config.passage_encoder_config.hf_config.device_id=[0,1,2,3] \
    dense_config.index_type=faiss \
    dense_config.faiss_config.batch_size=4096 \
    dense_config.faiss_config.log_interval=100000 \
    dense_config.batch_size=4096 \
    dense_config.log_interval=100000 \
    reinit=True
```
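
The `key.subkey=value` arguments above look like Hydra/OmegaConf-style overrides (both libraries are listed dependencies), where each dotted key addresses one field of a nested configuration. The sketch below only illustrates that mapping with plain dictionaries. `parse_overrides` is a hypothetical helper, not FlexRAG's or Hydra's parser, and it keeps every value as a string instead of handling types, lists, or quoting.

```python
def parse_overrides(overrides):
    """Turn ['a.b=1', 'a.c=x'] into {'a': {'b': '1', 'c': 'x'}}."""
    config = {}
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        # Walk (and create) one nested dict per dotted component.
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


overrides = [
    "retriever_type=dense",
    "dense_config.database_path=/tmp/db",  # illustrative path
    "dense_config.passage_encoder_config.hf_config.model_path=facebook/contriever",
]
config = parse_overrides(overrides)
print(config["dense_config"]["passage_encoder_config"]["hf_config"]["model_path"])
```

Reading the command this way makes it easier to see which options belong to the retriever (`dense_config.*`), which to its encoder (`dense_config.passage_encoder_config.*`), and which to the index (`dense_config.faiss_config.*`).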

To use the sparse retriever, run the following command to build the index:
```bash
CORPUS_PATH=psgs_w100.tsv
CORPUS_FIELDS='[title,text]'
DB_PATH=<path_to_database>

python -m flexrag.entrypoints.prepare_index \
    corpus_path=$CORPUS_PATH \
    saving_fields=$CORPUS_FIELDS \
    retriever_type=bm25s \
    bm25s_config.database_path=$DB_PATH \
    bm25s_config.indexed_fields='[title,text]' \
    bm25s_config.method=lucene \
    bm25s_config.batch_size=512 \
    bm25s_config.log_interval=100000 \
    reinit=True
```
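
To give an intuition for what a sparse retriever computes, here is a minimal pure-Python sketch of BM25 scoring with the non-negative IDF commonly attributed to Lucene. It is not the bm25s implementation, and whether it matches `method=lucene` term for term is an assumption. The `k1` and `b` values and the whitespace-style tokenization are illustrative only.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against a tokenized query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in docs_tokens for term in set(d))

    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        # Longer-than-average documents are penalized through this term.
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            score += idf * tf / (tf + length_norm)
        scores.append(score)
    return scores


docs = [
    ["the", "cat", "sat"],
    ["dogs", "bark"],
    ["the", "cat", "and", "the", "dog"],
]
scores = bm25_scores(["cat"], docs)
```

In this toy corpus the second document does not contain the query term and scores zero, while the shorter of the two matching documents ranks first because of length normalization.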

## Step 2. Run your RAG Application
When the index is ready, you can run your RAG application. Here is an example of how to run a RAG application.

### Run the FlexRAG Example RAG Application with GUI
```bash
python -m flexrag.entrypoints.run_interactive \
    assistant_type=modular \
    modular_config.used_fields=[title,text] \
    modular_config.retriever_type=dense \
    modular_config.dense_config.top_k=5 \
    modular_config.dense_config.database_path=${DB_PATH} \
    modular_config.dense_config.query_encoder_config.encoder_type=hf \
    modular_config.dense_config.query_encoder_config.hf_config.model_path='facebook/contriever' \
    modular_config.dense_config.query_encoder_config.hf_config.device_id=[0] \
    modular_config.response_type=short \
    modular_config.generator_type=openai \
    modular_config.openai_config.model_name='gpt-4o-mini' \
    modular_config.openai_config.api_key=$OPENAI_KEY \
    modular_config.do_sample=False
```

### Run the FlexRAG Example Assistants for Knowledge Intensive Tasks
You can easily evaluate your RAG application on several knowledge-intensive datasets. The following command lets you evaluate the modular RAG assistant with a dense retriever on the Natural Questions (NQ) dataset:
```bash
OUTPUT_PATH=<path_to_output>
DB_PATH=<path_to_database>
OPENAI_KEY=<your_openai_key>

python -m flexrag.entrypoints.run_assistant \
    data_path=flash_rag/nq/test.jsonl \
    output_path=${OUTPUT_PATH} \
    assistant_type=modular \
    modular_config.used_fields=[title,text] \
    modular_config.retriever_type=dense \
    modular_config.dense_config.top_k=10 \
    modular_config.dense_config.database_path=${DB_PATH} \
    modular_config.dense_config.query_encoder_config.encoder_type=hf \
    modular_config.dense_config.query_encoder_config.hf_config.model_path='facebook/contriever' \
    modular_config.dense_config.query_encoder_config.hf_config.device_id=[0] \
    modular_config.response_type=short \
    modular_config.generator_type=openai \
    modular_config.openai_config.model_name='gpt-4o-mini' \
    modular_config.openai_config.api_key=$OPENAI_KEY \
    modular_config.do_sample=False \
    eval_config.metrics_type=[retrieval_success_rate,generation_f1,generation_em] \
    eval_config.retrieval_success_rate_config.context_preprocess.processor_type=[simplify_answer] \
    eval_config.retrieval_success_rate_config.eval_field=text \
    eval_config.response_preprocess.processor_type=[simplify_answer] \
    log_interval=10
```
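
The `generation_em` and `generation_f1` metrics requested above are, by their names, exact match and token-level F1 between the generated answer and the reference answers. The sketch below shows the usual SQuAD-style definitions of these two metrics. It is not FlexRAG's implementation: the `normalize` step is an assumption and may differ from what the `simplify_answer` processor does.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, golden_answers):
    """1.0 if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return float(any(pred == normalize(g) for g in golden_answers))


def token_f1(prediction, golden_answers):
    """Best token-overlap F1 between the prediction and any reference."""
    pred_tokens = normalize(prediction).split()
    best = 0.0
    for gold in golden_answers:
        gold_tokens = normalize(gold).split()
        overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
        if overlap == 0:
            continue
        precision = overlap / len(pred_tokens)
        recall = overlap / len(gold_tokens)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Exact match rewards only answers that agree with a reference after normalization, while F1 gives partial credit, which is why the two are usually reported together for short-answer tasks.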

Similarly, you can evaluate the modular RAG assistant with a sparse retriever on the Natural Questions dataset:
```bash
OUTPUT_PATH=<path_to_output>
DB_PATH=<path_to_database>
OPENAI_KEY=<your_openai_key>

python -m flexrag.entrypoints.run_assistant \
    data_path=flash_rag/nq/test.jsonl \
    output_path=${OUTPUT_PATH} \
    assistant_type=modular \
    modular_config.used_fields=[title,text] \
    modular_config.retriever_type=bm25s \
    modular_config.bm25s_config.top_k=10 \
    modular_config.bm25s_config.database_path=${DB_PATH} \
    modular_config.response_type=short \
    modular_config.generator_type=openai \
    modular_config.openai_config.model_name='gpt-4o-mini' \
    modular_config.openai_config.api_key=$OPENAI_KEY \
    modular_config.do_sample=False \
    eval_config.metrics_type=[retrieval_success_rate,generation_f1,generation_em] \
    eval_config.retrieval_success_rate_config.context_preprocess.processor_type=[simplify_answer] \
    eval_config.retrieval_success_rate_config.eval_field=text \
    eval_config.response_preprocess.processor_type=[simplify_answer] \
    log_interval=10
```
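
The evaluation commands also request `retrieval_success_rate` with `eval_field=text`. One plausible reading, sketched below, is the fraction of queries for which at least one retrieved passage contains a reference answer. This is an assumption about the metric's meaning rather than FlexRAG's code, and the case-insensitive substring match stands in for whatever preprocessing the framework applies.

```python
def retrieval_success_rate(retrieved_texts, golden_answers):
    """Fraction of queries with at least one passage containing an answer.

    retrieved_texts: one list of passage strings per query.
    golden_answers: one list of acceptable answer strings per query.
    """
    if not retrieved_texts:
        return 0.0
    hits = 0
    for passages, answers in zip(retrieved_texts, golden_answers):
        lowered = [p.lower() for p in passages]
        if any(a.lower() in p for a in answers for p in lowered):
            hits += 1
    return hits / len(retrieved_texts)


rate = retrieval_success_rate(
    [["Paris is the capital of France."], ["Bananas are yellow."]],
    [["Paris"], ["Berlin"]],
)
```

Here the first query is a hit and the second is a miss, so `rate` is 0.5. Comparing this number between the dense and sparse runs shows how much of the end-to-end difference comes from retrieval alone.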

You can also evaluate your own assistant by adding the `user_module=<your_module_path>` argument to the command.

### Build your own RAG Application
To build your own RAG application, you can create a new Python file and import the necessary FlexRAG modules. Here is an example of how to build a RAG application:
```python
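from flexrag.retrievers import DenseRetriever, DenseRetrieverConfig
from flexrag.models import OpenAIGenerator, OpenAIGeneratorConfig
# NOTE: HfEncoderConfig and HfConfig are used below. Their import location is
# an assumption; adjust it to match your installed FlexRAG version.
from flexrag.models import HfConfig, HfEncoderConfig
from flexrag.prompt import ChatPrompt, ChatTurn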


def main():
    # Initialize the retriever
    retriever = DenseRetriever(
        DenseRetrieverConfig(
            database_path='path_to_database',
            encode_fields=['text'],
            top_k=1,
            passage_encoder_config=HfEncoderConfig(
                encoder_type='hf',
                hf_config=HfConfig(
                    model_path='facebook/contriever',
                    device_id=[0],
                )
            ),
        )
    )

    # Initialize the generator
    generator = OpenAIGenerator(
        OpenAIGeneratorConfig(
            model_name='gpt-4o-mini',
            api_key='your_openai_key',
            do_sample=False
        )
    )

    # Run your RAG application
    prompt = ChatPrompt()
    while True:
        query = input('Please input your query: ')
        # top_k=1, so take the single retrieved passage
        context = retriever.retrieve(query)[0]
        prompt_str = f"Question: {query}\nContext: {context.data['text']}"
        prompt.update(ChatTurn(role="user", content=prompt_str))
        response = generator.chat(prompt)
        prompt.update(ChatTurn(role="assistant", content=response))
        print(response)
    return

if __name__ == "__main__":
    main()
```
We also provide several detailed examples of how to build a RAG application in the [flexrag examples](https://github.com/ZhuochengZhang98/flexrag_examples) repository.
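
The example above retrieves a single passage (`top_k=1`). With a larger `top_k`, the prompt has to combine several passages. The helper below is a minimal sketch of one way to do that. `build_prompt` and the passage dictionaries with `title` and `text` keys are hypothetical, chosen to mirror the `[title,text]` fields saved at indexing time, and are not part of the FlexRAG API.

```python
def build_prompt(query, passages, max_passages=5):
    """Format a question and several retrieved passages into one prompt string."""
    lines = [f"Question: {query}", "Contexts:"]
    # Number the passages so the model (and the reader) can refer to them.
    for i, passage in enumerate(passages[:max_passages], start=1):
        lines.append(f"[{i}] {passage['title']}: {passage['text']}")
    return "\n".join(lines)


example = build_prompt(
    "Who wrote Hamlet?",
    [
        {"title": "Hamlet", "text": "Hamlet is a tragedy by William Shakespeare."},
        {"title": "Macbeth", "text": "Macbeth is a tragedy by William Shakespeare."},
    ],
    max_passages=1,
)
```

The resulting string can be used as the content of a user turn in the loop above. Capping the number of passages keeps the prompt within the generator's context window.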


# :bar_chart: Benchmarks
We have conducted extensive benchmarks using the FlexRAG framework. For more details, please refer to the [benchmarks](benchmarks.md) page.

# :label: License
This repository is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.


# :pen: Citation
If you find this project helpful, please consider citing it:

```bibtex
@software{FlexRAG,
  author = {Zhang Zhuocheng},
  doi = {10.5281/zenodo.14306984},
  month = {12},
  title = {{FlexRAG}},
  url = {https://github.com/ZhuochengZhang98/flexrag},
  version = {0.1.0},
  year = {2024}
}
```

# :heart: Acknowledgements
This project benefits from the following open-source projects:
- [Faiss](https://github.com/facebookresearch/faiss)
- [FlashRAG](https://github.com/RUC-NLPIR/FlashRAG)
- [LanceDB](https://github.com/lancedb/lancedb)
- [ANN Benchmarks](https://github.com/erikbern/ann-benchmarks)
- [Chonkie](https://github.com/chonkie-ai/chonkie)
- [rerankers](https://github.com/AnswerDotAI/rerankers)
