Metadata-Version: 2.4
Name: bibfixer
Version: 0.3.0
Summary: Fixes and standardizes BibTeX using LLM + web search
Author: Takashi Ishida
License: MIT License
        
        Copyright (c) 2025 Takashi Ishida
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openai<2,>=1.107.0
Requires-Dist: bibtexparser<2,>=1.4.1
Dynamic: license-file

<div align="center">
<img src="logo.png" alt="" width="450">

[![PyPI version](https://badge.fury.io/py/bibfixer.svg?update=20251204)](https://pypi.org/project/bibfixer/)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
![PRs Welcome](https://img.shields.io/badge/PRs-welcome-purple.svg)
[![Changelog](https://img.shields.io/github/v/release/takashiishida/bibfixer?label=changelog)](https://github.com/takashiishida/bibfixer/releases)
[![Downloads](https://static.pepy.tech/badge/bibfixer)](https://pepy.tech/project/bibfixer)
</div>

A Python tool that fixes and standardizes your BibTeX. It not only completes entries with accurate metadata via LLM + web search capabilities, but also enforces a consistent style based on your preferences (e.g., venue naming, title casing, author format, page ranges). This removes the tedious manual work of hunting down sources and cleaning messy entries (like those copied from Google Scholar), producing a clean, uniform bib file. A consistent style improves readability and leaves a stronger impression on readers and reviewers.

> [!WARNING]
> bibfixer is experimental and uses LLM + web search, so it may occasionally produce incomplete or incorrect metadata/formatting. Always review the final `.bib` before submission. For known limitations and ongoing issues (and to report new ones), please see the GitHub Issues.



## Examples

Example (1) Original bib entry from Google Scholar. Additional authors are omitted and indicated by "and others", and "ai" is not capitalized.
```bib
@article{bai2022constitutional,
 author = {Bai, Yuntao and Kadavath, Saurav and Kundu, Sandipan and Askell, Amanda and Kernion, Jackson and Jones, Andy and Chen, Anna and Goldie, Anna and Mirhoseini, Azalia and McKinnon, Cameron and others},
 journal = {arXiv preprint arXiv:2212.08073},
 title = {Constitutional ai: Harmlessness from ai feedback},
 year = {2022}
}
```

With bibfixer, the missing authors are added and the title is capitalized properly:
```bib
@article{bai2022constitutional,
  author = {Bai, Yuntao and Kadavath, Saurav and Kundu, Sandipan and Askell, Amanda and Kernion, Jackson and Jones, Andy and Chen, Anna and Goldie, Anna and Mirhoseini, Azalia and McKinnon, Cameron and Chen, Carol and Olsson, Catherine and Olah, Christopher and Hernandez, Danny and Drain, Dawn and Ganguli, Deep and Li, Dustin and Tran-Johnson, Eli and Perez, Ethan and Kerr, Jamie and Mueller, Jared and Ladish, Jeffrey and Landau, Joshua and Ndousse, Kamal and Lukosuite, Kamile and Lovitt, Liane and Sellitto, Michael and Elhage, Nelson and Schiefer, Nicholas and Mercado, Noemi and DasSarma, Nova and Lasenby, Robert and Larson, Robin and Ringer, Sam and Johnston, Scott and Kravec, Shauna and El Showk, Sheer and Fort, Stanislav and Lanham, Tamera and Telleen-Lawton, Timothy and Conerly, Tom and Henighan, Tom and Hume, Tristan and Bowman, Samuel R. and Hatfield-Dodds, Zac and Mann, Ben and Amodei, Dario and Joseph, Nicholas and McCandlish, Sam and Brown, Tom and Kaplan, Jared},
  title = {Constitutional {AI}: {H}armlessness from {AI} Feedback},
  journal = {arXiv preprint arXiv:2212.08073},
  year = {2022}
}
```

Example (2) Original bib entry from Google Scholar. This shows the arXiv version, but the paper was published at ICML. "llms" needs to be capitalized.
```bib
@article{khan2024debating,
 author = {Khan, Akbir and Hughes, John and Valentine, Dan and Ruis, Laura and Sachan, Kshitij and Radhakrishnan, Ansh and Grefenstette, Edward and Bowman, Samuel R and Rockt{\"a}schel, Tim and Perez, Ethan},
 journal = {arXiv preprint arXiv:2402.06782},
 title = {Debating with more persuasive llms leads to more truthful answers},
 year = {2024}
}
```

With bibfixer, the arXiv reference is replaced with the conference information and the title is capitalized appropriately:
```bib
@inproceedings{khan2024debating,
  author = {Khan, Akbir and Hughes, John and Valentine, Dan and Ruis, Laura and Sachan, Kshitij and Radhakrishnan, Ansh and Grefenstette, Edward and Bowman, Samuel R. and Rockt{\"a}schel, Tim and Perez, Ethan},
  title = {Debating with More Persuasive {LLMs} Leads to More Truthful Answers},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  year = {2024},
  volume = {235},
  pages = {23662--23733}
}
```

Example (3) Original bib entry from Google Scholar. The last author is missing due to a system issue of the distributor Penguin Random House. The subtitle and publisher need to be capitalized appropriately.
```bib
@book{sugiyama2022machine,
 title = {Machine learning from weak supervision: An empirical risk minimization approach},
 author = {Sugiyama, Masashi and Bao, Han and Ishida, Takashi and Lu, Nan and Sakai, Tomoya},
 year = {2022},
 publisher = {MIT Press}
}
```

With bibfixer, we have all authors and appropriate capitalization:
```bib
@book{sugiyama2022machine,
  author = {Sugiyama, Masashi and Bao, Han and Ishida, Takashi and Lu, Nan and Sakai, Tomoya and Niu, Gang},
  title = {Machine Learning from Weak Supervision: {A}n Empirical Risk Minimization Approach},
  publisher = {{MIT} Press},
  year = {2022},
  pages = {320}
}
```
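
To get a rough idea of which entries have a truncated author list like Example (1) before running bibfixer, a small pre-check can help. The following is only a sketch and is not part of bibfixer: the function name `truncated_author_keys` is hypothetical, and it assumes each field sits on its own line, as in the Google Scholar exports above.

```python
import re

# Start of an entry, e.g. "@article{bai2022constitutional,"
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# A line that begins an author field
AUTHOR_FIELD = re.compile(r"^\s*author\s*=", re.IGNORECASE)
# The placeholder Google Scholar uses for omitted authors
AND_OTHERS = re.compile(r"\band\s+others\b", re.IGNORECASE)


def truncated_author_keys(bib_text):
    """Return citation keys whose author field contains "and others".

    Hypothetical helper; assumes one field per line.
    """
    keys = []
    current_key = None
    for line in bib_text.splitlines():
        start = ENTRY_START.match(line.strip())
        if start:
            current_key = start.group(2)
        if current_key and AUTHOR_FIELD.match(line) and AND_OTHERS.search(line):
            keys.append(current_key)
    return keys
```

Entries with complete author lists are not reported, so an empty result does not mean the file is clean; it only means no "and others" placeholder was found.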

## Installation

1. Install (from PyPI):
```bash
pip install bibfixer
```

2. Set up your API key:
```bash
# For OpenAI (default provider):
export OPENAI_API_KEY='your-api-key-here'

# For OpenRouter:
export OPENROUTER_API_KEY='your-api-key-here'
```

## Usage

Basic usage (input is required via `-i/--input`):
```bash
bibfixer -i sample_input.bib
```

With output file:
```bash
bibfixer -i sample_input.bib -o corrected.bib
```

With additional formatting preferences (`-p`):
```bash
bibfixer -i sample_input.bib -p "Use NeurIPS instead of NIPS"
```

Use a custom prompt file (defaults to bundled `prompts/default.md`):
```bash
bibfixer -i sample_input.bib --prompt-file prompts/default.md
```

The complete revision instructions are in `prompts/default.md`. You can edit this file to match your style or point to another file using `--prompt-file`.

To use other models, switch the provider to OpenRouter. With OpenRouter, web search is provided by Exa.ai via the `:online` model suffix.

```bash
# Use OpenRouter with the default model:
bibfixer -i sample_input.bib --provider openrouter

# Use a specific model:
bibfixer -i sample_input.bib --provider openrouter --model google/gemini-2.5-flash
```
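
If you want to call bibfixer from a script (for example, to process several files), one option is to wrap the command line shown above. The sketch below is not an official Python API: `build_command` and `run_bibfixer` are hypothetical helpers, the provider name `openai` is assumed only for looking up the API key variable, and combining the flags in one call is assumed to work the same way as using them individually.

```python
import os
import shutil
import subprocess

# Environment variable each provider expects (see Installation).
# The key "openai" is an assumed name for the default provider.
API_KEY_VARS = {"openai": "OPENAI_API_KEY", "openrouter": "OPENROUTER_API_KEY"}


def build_command(input_path, output_path=None, preferences=None,
                  prompt_file=None, provider=None, model=None):
    """Assemble a bibfixer command line using only the flags documented above."""
    cmd = ["bibfixer", "-i", input_path]
    if output_path:
        cmd += ["-o", output_path]
    if preferences:
        cmd += ["-p", preferences]
    if prompt_file:
        cmd += ["--prompt-file", prompt_file]
    if provider:
        cmd += ["--provider", provider]
    if model:
        cmd += ["--model", model]
    return cmd


def run_bibfixer(input_path, **options):
    """Run bibfixer as a subprocess after checking the API key and the executable."""
    provider = options.get("provider") or "openai"
    key_var = API_KEY_VARS.get(provider)
    if key_var and not os.environ.get(key_var):
        raise RuntimeError(f"{key_var} is not set")
    if shutil.which("bibfixer") is None:
        raise RuntimeError("bibfixer is not on PATH; try `pip install bibfixer`")
    # check=True raises CalledProcessError if bibfixer exits with an error.
    return subprocess.run(build_command(input_path, **options), check=True)
```

For example, `run_bibfixer("sample_input.bib", output_path="corrected.bib", provider="openrouter")` would run the equivalent of `bibfixer -i sample_input.bib -o corrected.bib --provider openrouter`.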


### Review
Since bibfixer is experimental (see warning above), it's a good idea to diff the results. To quickly compare input and output, you can run:
```bash
diff -y --suppress-common-lines sample_input.bib corrected.bib | less -R
```
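
If `diff` is not available (e.g., on Windows), a similar comparison can be done with Python's standard `difflib` module. This is a minimal sketch, not something bibfixer ships; the function name `changed_lines` is hypothetical.

```python
import difflib


def changed_lines(before_text, after_text):
    """Return only the removed ("- ") and added ("+ ") lines between two BibTeX strings."""
    diff = difflib.ndiff(before_text.splitlines(), after_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Usage with files (paths are placeholders for your own input and output):
#   from pathlib import Path
#   before = Path("sample_input.bib").read_text(encoding="utf-8")
#   after = Path("corrected.bib").read_text(encoding="utf-8")
#   print("\n".join(changed_lines(before, after)))
```

Because bibfixer may reorder and re-indent fields (as in the examples above), a line-based comparison like this will also report lines whose only change is position or whitespace.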

## Streamlit app

In addition to the dependencies in `pyproject.toml`, install `streamlit>=1.30.0`.

From the repo root, run:

```bash
streamlit run app.py
```
