Metadata-Version: 2.1
Name: tracllm
Version: 0.1.1
Summary: A context tracing tool for LLM
Home-page: https://github.com/WYT8506/LLM_attribution
Author: Yanting Wang
Author-email: ykw5450@psu.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: accelerate==1.1.1
Requires-Dist: aiofiles==23.2.1
Requires-Dist: aiohappyeyeballs==2.4.3
Requires-Dist: aiohttp==3.11.6
Requires-Dist: aiosignal==1.3.1
Requires-Dist: annotated-types==0.7.0
Requires-Dist: anyio==4.6.2.post1
Requires-Dist: asttokens==2.4.1
Requires-Dist: async-timeout==5.0.1
Requires-Dist: attrs==24.2.0
Requires-Dist: certifi==2024.8.30
Requires-Dist: charset-normalizer==3.4.0
Requires-Dist: click==8.1.7
Requires-Dist: comm==0.2.1
Requires-Dist: contourpy==1.3.1
Requires-Dist: cycler==0.12.1
Requires-Dist: datasets==3.1.0
Requires-Dist: debugpy==1.8.1
Requires-Dist: decorator==5.1.1
Requires-Dist: dill==0.3.8
Requires-Dist: distro==1.9.0
Requires-Dist: einops==0.8.0
Requires-Dist: exceptiongroup==1.2.0
Requires-Dist: executing==2.0.1
Requires-Dist: fastapi==0.115.5
Requires-Dist: ffmpy==0.4.0
Requires-Dist: filelock==3.16.1
Requires-Dist: flash-attn==2.7.0.post2
Requires-Dist: fonttools==4.55.0
Requires-Dist: frozenlist==1.5.0
Requires-Dist: fschat==0.2.36
Requires-Dist: fsspec==2024.9.0
Requires-Dist: gradio==5.6.0
Requires-Dist: gradio-client==1.4.3
Requires-Dist: h11==0.14.0
Requires-Dist: httpcore==1.0.7
Requires-Dist: httpx==0.27.2
Requires-Dist: huggingface-hub==0.26.2
Requires-Dist: idna==3.10
Requires-Dist: ipykernel==6.29.3
Requires-Dist: ipython==8.22.1
Requires-Dist: jedi==0.19.1
Requires-Dist: Jinja2==3.1.4
Requires-Dist: jiter==0.7.1
Requires-Dist: joblib==1.4.2
Requires-Dist: jupyter-client==8.6.0
Requires-Dist: jupyter-core==5.7.1
Requires-Dist: kiwisolver==1.4.7
Requires-Dist: latex2mathml==3.77.0
Requires-Dist: markdown-it-py==3.0.0
Requires-Dist: markdown2==2.5.1
Requires-Dist: MarkupSafe==2.1.5
Requires-Dist: matplotlib==3.9.2
Requires-Dist: matplotlib-inline==0.1.6
Requires-Dist: mdurl==0.1.2
Requires-Dist: mpmath==1.3.0
Requires-Dist: multidict==6.1.0
Requires-Dist: multiprocess==0.70.16
Requires-Dist: nest-asyncio==1.6.0
Requires-Dist: networkx==3.4.2
Requires-Dist: nh3==0.2.18
Requires-Dist: nltk==3.9.1
Requires-Dist: numpy==2.1.3
Requires-Dist: nvidia-cublas-cu12==12.4.5.8
Requires-Dist: nvidia-cuda-cupti-cu12==12.4.127
Requires-Dist: nvidia-cuda-nvrtc-cu12==12.4.127
Requires-Dist: nvidia-cuda-runtime-cu12==12.4.127
Requires-Dist: nvidia-cudnn-cu12==9.1.0.70
Requires-Dist: nvidia-cufft-cu12==11.2.1.3
Requires-Dist: nvidia-curand-cu12==10.3.5.147
Requires-Dist: nvidia-cusolver-cu12==11.6.1.9
Requires-Dist: nvidia-cusparse-cu12==12.3.1.170
Requires-Dist: nvidia-nccl-cu12==2.21.5
Requires-Dist: nvidia-nvjitlink-cu12==12.4.127
Requires-Dist: nvidia-nvtx-cu12==12.4.127
Requires-Dist: openai==1.54.5
Requires-Dist: orjson==3.10.11
Requires-Dist: packaging==23.2
Requires-Dist: pandas==2.2.3
Requires-Dist: parso==0.8.3
Requires-Dist: peft==0.13.2
Requires-Dist: pillow==11.0.0
Requires-Dist: platformdirs==4.2.0
Requires-Dist: prompt-toolkit==3.0.43
Requires-Dist: propcache==0.2.0
Requires-Dist: protobuf==5.28.3
Requires-Dist: psutil==5.9.8
Requires-Dist: pure-eval==0.2.2
Requires-Dist: pyarrow==18.0.0
Requires-Dist: pydantic==2.9.2
Requires-Dist: pydantic-core==2.23.4
Requires-Dist: pydub==0.25.1
Requires-Dist: Pygments==2.17.2
Requires-Dist: pynvml==11.5.3
Requires-Dist: pyparsing==3.2.0
Requires-Dist: python-dateutil==2.8.2
Requires-Dist: python-multipart==0.0.12
Requires-Dist: pytz==2024.2
Requires-Dist: PyYAML==6.0.2
Requires-Dist: pyzmq==25.1.2
Requires-Dist: regex==2024.11.6
Requires-Dist: requests==2.32.3
Requires-Dist: rich==13.9.4
Requires-Dist: ruff==0.7.4
Requires-Dist: safehttpx==0.1.1
Requires-Dist: safetensors==0.4.5
Requires-Dist: scikit-learn==1.5.2
Requires-Dist: scipy==1.14.1
Requires-Dist: semantic-version==2.10.0
Requires-Dist: sentencepiece==0.2.0
Requires-Dist: shellingham==1.5.4
Requires-Dist: shortuuid==1.0.13
Requires-Dist: six==1.16.0
Requires-Dist: sniffio==1.3.1
Requires-Dist: stack-data==0.6.3
Requires-Dist: starlette==0.41.3
Requires-Dist: svgwrite==1.4.3
Requires-Dist: sympy==1.13.1
Requires-Dist: threadpoolctl==3.5.0
Requires-Dist: tiktoken==0.8.0
Requires-Dist: tokenizers==0.20.3
Requires-Dist: tomlkit==0.12.0
Requires-Dist: torch==2.5.1
Requires-Dist: torchaudio==2.5.1
Requires-Dist: torchvision==0.20.1
Requires-Dist: tornado==6.4
Requires-Dist: tqdm==4.67.0
Requires-Dist: traitlets==5.14.1
Requires-Dist: transformers==4.46.3
Requires-Dist: triton==3.1.0
Requires-Dist: typer==0.13.1
Requires-Dist: typing-extensions==4.12.2
Requires-Dist: tzdata==2024.2
Requires-Dist: urllib3==2.2.3
Requires-Dist: uvicorn==0.32.0
Requires-Dist: wavedrom==2.0.3.post3
Requires-Dist: wcwidth==0.2.13
Requires-Dist: websockets==12.0
Requires-Dist: xxhash==3.5.0
Requires-Dist: yarl==1.17.2

# Searching for Needles in a Haystack with TracLLM

<p align='center'>
    <img alt="TracLLM" src='assets/fig1.png' width='80%'/>

</p>

This repository provides the official PyTorch implementation of our paper, presenting a general framework for finding the critical texts within a lengthy context that contribute to the LLM's answer:

> [**Searching for Needles in a Haystack: Context Tracing for Unraveling Outputs of
> Long Context LLMs**]() <br>
> [Yanting Wang](https://billchan226.github.io/)<sup>1†</sup>,
> [Wei Zou](https://zhenxianglance.github.io/)<sup>1†</sup>,
> [Runpeng Geng](https://xiaocw11.github.io/) <sup>1</sup>,
> [Jinyuan Jia](https://dawnsong.io/) <sup>1</sup>,
>
> <sup>1</sup>Penn State University<br>
> <sup>†</sup>Co-first author<br>

### 🗂️ Code Structure

```text
TracLLM/
├── tracllm/
│   ├── models/          # Models and LLMs code
│   ├── attribute/       # Perturbation-based (including our TracLLM) and self-citation-based attribution methods
│   ├── evaluate.py      # Evaluation code
│   ├── load_dataset.py  # Load LongBench, PoisonedRAG, and NeedleInHaystack datasets
│   ├── prompts.py       # Prompt template
│   └── utils.py         # Small helper functions
├── PromptInjectionAttacks/  # Prompt injection attacks applied to LongBench
├── main.py              # Main function for running the experiments
└── scripts/             # Scripts for running the experiments
```

### 🔨 Setup environment

Please run the following commands to set up the environment:

```bash
conda env create -f environment.yml
conda activate TracLLM
```

or

```bash
conda create -n TracLLM python=3.10
conda activate TracLLM
pip install -r requirements.txt
```

### 🔑 Set API key

Please enter your API key in [**model_configs/llama3.1-8b_config.json**](model_configs/llama3.1-8b_config.json) to use LLaMA-3.1.

For LLaMA-3.1, the API key is your **Hugging Face access token**. If you don't have an access token yet, visit [LLaMA-3.1's HuggingFace Page](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) first.

Enter your API key in the `api_key_info` field of the config file:

```json
"api_key_info":{
    "api_keys":[
        "Your api key here"
    ],
    "api_key_use": 0
},
```



### 📝 Getting Started

Explore `TracLLM` with our example notebook [quick_start.ipynb](quick_start.ipynb).
To use `TracLLM`, first create the model and the attribution object:

```python
from src.models import create_model
from src.attribution import PerturbationBasedAttribution
from src.prompts import wrap_prompt

model_name = 'llama3.1-8b'
llm = create_model(f'model_configs/{model_name}_config.json', device="cuda:0")
score_funcs = ['stc', 'loo', 'denoised_shapley']  # pass more than one scoring function to ensemble them
attr = PerturbationBasedAttribution(llm, explanation_level="sentence", attr_type="tracllm", score_funcs=score_funcs, sh_N=5)
```
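To give a feel for what a perturbation-based scoring function does, here is a toy sketch of leave-one-out scoring, assuming that is what the `'loo'` option refers to. It is not the TracLLM implementation: `leave_one_out_scores` and `toy_value` are hypothetical names, and `toy_value` replaces the LLM with a simple keyword check so that the example runs without a model.

```python
def leave_one_out_scores(texts, value_fn):
    """Score each text by how much value_fn drops when that text is removed."""
    full_value = value_fn(texts)
    scores = []
    for i in range(len(texts)):
        reduced = texts[:i] + texts[i + 1:]
        scores.append(full_value - value_fn(reduced))
    return scores


def toy_value(texts):
    # Hypothetical stand-in for an LLM: 1.0 if the decisive sentence is present.
    return 1.0 if any("grossed $10 billion" in t for t in texts) else 0.0


sentences = [
    "Heretic has grossed $25 million worldwide.",
    "Red One has grossed $10 billion solely in the USA.",
    "Red One was released on November 15, 2024.",
]
scores = leave_one_out_scores(sentences, toy_value)
print(scores)  # [0.0, 1.0, 0.0]
```

Only the second sentence changes the toy output when removed, so it receives the full score. With a real model, the value function would be a measure of how strongly the LLM still supports the original answer.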

Then, you can craft the prompt and get the LLM's answer:

```python
context = """Heretic is a 2024 American psychological horror[4][5][6] film written and directed by Scott Beck and Bryan Woods. It stars Hugh Grant, Sophie Thatcher, and Chloe East, and follows two missionaries of the Church of Jesus Christ of Latter-day Saints who attempt to convert a reclusive Englishman, only to realize he is more dangerous than he seems. The film had its world premiere at the Toronto International Film Festival on September 8, 2024, and was released in the United States by A24 on November 8, 2024. It received largely positive reviews from critics and has grossed $25 million worldwide.
\n\n Red One is a 2024 American action-adventure Christmas comedy film directed by Jake Kasdan and written by Chris Morgan, from an original story by Hiram Garcia. The film follows the head of North Pole security (Dwayne Johnson) teaming up with a notorious hacker (Chris Evans) in order to locate a kidnapped Santa Claus (J. K. Simmons) on Christmas Eve; Lucy Liu, Kiernan Shipka, Bonnie Hunt, Nick Kroll, Kristofer Hivju, and Wesley Kimmel also star. The film is seen as the first of a Christmas-themed franchise, produced by Amazon MGM Studios in association with Seven Bucks Productions, Chris Morgan Productions, and The Detective Agency.[7][8] Red One was released internationally by Warner Bros. Pictures on November 6 and was released in the United States by Amazon MGM Studios through Metro-Goldwyn-Mayer on November 15, 2024.[9] The film received generally negative reviews from critics, but it has grossed $10 billion solely in the USA. M.O.R.A (Mythological Oversight and Restoration Authority) is a clandestine, multilateral military organization that oversees and protects a secret peace treaty between mythological creatures and humanity. Callum Drift, head commander of Santa Claus's ELF (Enforcement Logistics and Fortification) security, requests to retire after one last Christmas run, as he has become disillusioned with increased bad behavior in the world, exemplified by the growth of Santa's Naughty List. 
"""
question = "Which movie earned more money, Heretic or Red One?"
prompt = wrap_prompt(question, [context])
answer = llm.query(prompt)
print("Answer: ", answer)
```

Finally, you can get the attribution results of TracLLM by calling `attr.attribute`:

```python
texts, important_ids, importance_scores, _, _ = attr.attribute(question, [context], answer)
attr.visualize_results(texts, question, answer, important_ids, importance_scores, width=60)
```

<p align = 'center'>
  <img alt="Example" src='assets/example.png' width='90%'/>
</p>
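If you want the results as plain data rather than a visualization, a small helper along the following lines may be useful. It is a sketch under an assumption the README does not confirm: that `importance_scores[j]` is the score of `texts[important_ids[j]]`. The helper name `top_k_texts` and the dummy values are hypothetical.

```python
def top_k_texts(texts, important_ids, importance_scores, k=3):
    """Return up to k (score, text) pairs, highest score first."""
    # Assumption: importance_scores[j] belongs to texts[important_ids[j]].
    pairs = [(score, texts[idx]) for idx, score in zip(important_ids, importance_scores)]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:k]


# Dummy values shaped like the assumed outputs of attr.attribute
demo_texts = ["sentence A", "sentence B", "sentence C"]
demo_ids = [2, 0]
demo_scores = [0.1, 0.7]
for score, text in top_k_texts(demo_texts, demo_ids, demo_scores, k=2):
    print(f"{score:.2f}  {text}")
```

Check the actual return values in [quick_start.ipynb](quick_start.ipynb) before relying on this layout.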


### 🔬 Experiments

Execute the scripts below to replicate our experimental findings for LongBench (with prompt injection attacks), PoisonedRAG, and NeedleInHaystack.

- [script_prompt_injection.py](scripts/script_prompt_injection.py): tracing malicious instructions injected into long contexts from LongBench.
- [script_PoisonedRAG.py](scripts/script_PoisonedRAG.py): finding corrupted knowledge from retrieved texts of a RAG system.
- [script_needle_in_haystack.py](scripts/script_needle_in_haystack.py): tracing needles in a haystack.

For example, to run the prompt injection experiment on [LongBench](https://github.com/THUDM/LongBench) under the default setting of `TracLLM`, execute:

```bash
python scripts/script_prompt_injection.py
```

To speed up the computation, you can set `sh_N` (the number of permutations for shapley/denoised_shapley) to 5:

```bash
python main.py --dataset_name musique --model_name llama3.1-8b --prompt_injection_attack default --inject_times 5 --sh_N 5
```
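The reason a smaller number of permutations speeds things up can be seen in a generic Monte Carlo Shapley estimator, sketched below. This is the textbook permutation-sampling scheme, not the code used in this repository; `permutation_shapley` and `toy_value` are hypothetical names, and `toy_value` stands in for an LLM query while counting how often it is called.

```python
import random


def permutation_shapley(n_items, value_fn, n_permutations=5, seed=0):
    """Estimate Shapley values by averaging marginal gains over random orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_items
    for _ in range(n_permutations):
        order = list(range(n_items))
        rng.shuffle(order)
        included = set()
        prev_value = value_fn(included)
        for i in order:
            included.add(i)
            value = value_fn(included)
            totals[i] += value - prev_value  # marginal gain of adding item i
            prev_value = value
    return [total / n_permutations for total in totals]


calls = 0


def toy_value(included):
    # Hypothetical stand-in for an LLM query: item 1 alone decides the output.
    global calls
    calls += 1
    return 1.0 if 1 in included else 0.0


estimates = permutation_shapley(n_items=4, value_fn=toy_value, n_permutations=5)
print(estimates)  # [0.0, 1.0, 0.0, 0.0]
print(calls)      # 25: 5 permutations, each with 4 items plus the empty set
```

In this scheme the number of value-function calls grows linearly with the number of permutations, so fewer permutations trade estimate quality for speed.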

### Acknowledgement

* This project incorporates code from [PoisonedRAG](https://github.com/sleeepeer/PoisonedRAG) and [corpus-poisoning](https://github.com/princeton-nlp/corpus-poisoning).
* This project incorporates datasets from [LongBench](https://github.com/THUDM/LongBench) and [Needle In A Haystack](https://github.com/gkamradt/LLMTest_NeedleInAHaystack).
* This project draws inspiration from [ContextCite](https://github.com/MadryLab/context-cite) and [AgentPoison](https://github.com/BillChan226/AgentPoison).
* The model component of this project is based on [Open-Prompt-Injection](https://github.com/liu00222/Open-Prompt-Injection).
* This project utilizes [contriever](https://github.com/facebookresearch/contriever) for retrieval augmented generation (RAG).

### Citation

```bib
@article{wang2024tracllm,
    title={Searching for Needles in a Haystack: Context Tracing for Unraveling Outputs of Long Context LLMs},
    author={Wang, Yanting and Zou, Wei and Geng, Runpeng and Jia, Jinyuan},
    year={2024}
}
```
