Metadata-Version: 2.4
Name: promptplanner
Version: 0.0.1
Summary: Prompt Execution Path Selection in Retrieval-Based LLM Inference Systems
Author: Yinsicheng Jiang, Chivier Humber
License: MIT
Project-URL: Homepage, https://github.com/SecretSettler/PromptPlanner
Keywords: feed,reader,tutorial
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: datasets
Requires-Dist: transformers
Requires-Dist: torch
Requires-Dist: elasticsearch==8.18.1
Requires-Dist: aiohttp
Requires-Dist: ujson
Requires-Dist: scipy
Requires-Dist: faiss-gpu-cu12
Provides-Extra: dev
Requires-Dist: black; extra == "dev"
Requires-Dist: bumpver; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: pip-tools; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: ipython; extra == "dev"

# PromptPlanner

## Install
Requires Python >= 3.10.
```bash
git clone https://github.com/SecretSettler/PromptPlanner.git
cd PromptPlanner
pip install -e .
```

## Data Format
Currently, only input in `jsonl` format is supported. Each line must be a JSON object that contains at least these attributes:
```json
{
    "qid": 0,
    "text": "Is the sky blue?",
    "answer": ["Yes", "Yes the sky is blue"],
    "top_k_doc_id": [2, 8, 1, 10]
}
```
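
As a minimal sketch of the format above, the snippet below writes a small `jsonl` file and checks that every record has the four attributes. The names `REQUIRED_FIELDS`, `write_jsonl`, and `load_and_validate` are illustrative and not part of PromptPlanner, and the second sample record is made up for the example.

```python
import json
import os
import tempfile

# Attribute names taken from the example record above.
REQUIRED_FIELDS = ("qid", "text", "answer", "top_k_doc_id")


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def load_and_validate(path):
    """Load a jsonl file; raise ValueError if a record lacks a required attribute."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            missing = [key for key in REQUIRED_FIELDS if key not in record]
            if missing:
                raise ValueError(f"line {line_no}: missing attributes {missing}")
            records.append(record)
    return records


# Illustrative data: the first record is the one shown above, the second is invented.
sample_records = [
    {"qid": 0, "text": "Is the sky blue?", "answer": ["Yes", "Yes the sky is blue"], "top_k_doc_id": [2, 8, 1, 10]},
    {"qid": 1, "text": "Is grass green?", "answer": ["Yes"], "top_k_doc_id": [8, 2, 5, 10]},
]

sample_path = os.path.join(tempfile.mkdtemp(), "prompts.jsonl")
write_jsonl(sample_records, sample_path)
loaded = load_and_validate(sample_path)
```

Validating the file up front gives a line-numbered error instead of a `KeyError` later in the pipeline.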

## Quick usage
```python
from promptplanner.offline.clustering import cluster_prompts
from promptplanner.offline.inter_reordering import PromptProcessor
import json
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="Reorder prompt data based on clustering results.")
    parser.add_argument('--prompt_path', type=str, required=True, help='Path to the JSONL file containing prompt data.')
    return parser.parse_args()

def main():
    args = parse_args()
    
    # Load prompts from the specified path
    prompt_path = args.prompt_path
    with open(prompt_path, 'r') as f:
        prompts = [json.loads(line) for line in f]

    qids = [prompt['qid'] for prompt in prompts]
    questions = [prompt['text'] for prompt in prompts]
    answers = [prompt['answer'] for prompt in prompts]
    topk_doc_ids = [prompt['top_k_doc_id'] for prompt in prompts]


    processor = PromptProcessor(use_gpu=True)
    # Cluster prompts by their top-k document IDs (intra-reordering)
    result = cluster_prompts(topk_doc_ids, similarity_method="sharp", use_triton=True)
    # Reorder across prompts (inter-reordering)
    organized_reordered_ids, _, final_index_mapping = processor.process_and_reorder(result.reordered_prompts, topk_doc_ids, show_progress=True)

    # Apply the final ordering to the original fields
    reordered_qids = [qids[i] for i in final_index_mapping]
    reordered_questions = [questions[i] for i in final_index_mapping]
    reordered_answers = [answers[i] for i in final_index_mapping]
    reordered_topk_doc_ids = organized_reordered_ids
    return reordered_qids, reordered_questions, reordered_answers, reordered_topk_doc_ids

if __name__ == "__main__":
    reordered_qids, reordered_questions, reordered_answers, reordered_topk_doc_ids = main()
```

This example is available in `examples/planner/example.py`.
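
To persist the result, the four lists returned by `main()` can be zipped back into records in the input format. The following is a sketch only: `to_records` and `dump_jsonl` are hypothetical helpers, not part of PromptPlanner, and it assumes the four lists are aligned element by element (in particular, that the reordered document IDs follow the same order as the other three lists).

```python
import json


def to_records(qids, questions, answers, topk_doc_ids):
    """Zip four aligned lists back into dicts using the input attribute names."""
    # Assumption: all four lists have the same length and the same ordering.
    if not (len(qids) == len(questions) == len(answers) == len(topk_doc_ids)):
        raise ValueError("all four lists must have the same length")
    return [
        {"qid": qid, "text": text, "answer": answer, "top_k_doc_id": doc_ids}
        for qid, text, answer, doc_ids in zip(qids, questions, answers, topk_doc_ids)
    ]


def dump_jsonl(records):
    """Serialize records as jsonl text, one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)
```

The string returned by `dump_jsonl` can be written to a file with a plain `open(path, "w")`, which produces a file in the same format as the input.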



