Metadata-Version: 2.4
Name: bibsync
Version: 0.3.2
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Text Processing :: Markup :: LaTeX
License-File: LICENSE
Summary: Synchronize BibTeX files from citation keys in LaTeX sources
Keywords: bibtex,latex,arxiv,ads,inspirehep
Home-Page: https://github.com/isaac-cf-wong/bibsync
Author: Isaac C. F. Wong
License-Expression: BSD-3-Clause
Requires-Python: >=3.12
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Documentation, https://isaac-cf-wong.github.io/bibsync
Project-URL: Homepage, https://github.com/isaac-cf-wong/bibsync
Project-URL: Issues, https://github.com/isaac-cf-wong/bibsync/issues
Project-URL: Repository, https://github.com/isaac-cf-wong/bibsync

# bibsync

[![CI](https://github.com/isaac-cf-wong/bibsync/actions/workflows/ci.yml/badge.svg)](https://github.com/isaac-cf-wong/bibsync/actions/workflows/ci.yml)
[![Documentation Status](https://github.com/isaac-cf-wong/bibsync/actions/workflows/documentation.yml/badge.svg)](https://isaac-cf-wong.github.io/bibsync/)
[![Crates.io](https://img.shields.io/crates/v/bibsync)](https://crates.io/crates/bibsync)
[![Docs.rs](https://docs.rs/bibsync/badge.svg)](https://docs.rs/bibsync)
[![PyPI Version](https://img.shields.io/pypi/v/bibsync)](https://pypi.org/project/bibsync/)
[![Python Versions](https://img.shields.io/pypi/pyversions/bibsync)](https://pypi.org/project/bibsync/)
[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](LICENSE)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![DOI](https://zenodo.org/badge/1251520994.svg)](https://doi.org/10.5281/zenodo.20422622)
[![SPEC 0 — Minimum Supported Dependencies](https://img.shields.io/badge/SPEC-0-green?labelColor=%23004811&color=%235CA038)](https://scientific-python.org/specs/spec-0000/)

`bibsync` synchronizes BibTeX files from citation keys in LaTeX sources. It is
inspired by `adstex`, with provider support for both NASA ADS and InspireHEP.

The primary workflow is to cite papers by identifier, especially arXiv ID:

```tex
\citep{2404.14498}
\citet{arXiv:2312.00752}
```

Then check the bibliography:

```shell
bibsync main.tex -o references.bib
```

To update the file, add `--fix`:

```shell
bibsync --fix main.tex -o references.bib
```

`bibsync` scans the TeX file, resolves missing identifier-like citekeys through
NASA ADS and/or InspireHEP, and rewrites each provider BibTeX entry so that its
key matches the citekey used in TeX. Without `--fix`, it only reports whether the
output `.bib` file is current; with `--fix`, it writes the merged bibliography.

When a citekey cannot be resolved, the command prints the key together with a
reason and the likely fix. For example, an unsupported citekey is reported as an
identifier-format problem, while an arXiv ID or DOI that the provider cannot find
is reported as a provider miss.
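
To make the distinction between an identifier-format problem and a provider miss
concrete, here is a minimal Python sketch of how citekeys might be pulled out of
TeX and classified. It is not the Rust implementation: the regular expressions,
the function names `extract_citekeys` and `classify`, and the label strings are
all assumptions made for illustration, and the patterns cover only new-style
arXiv IDs and DOIs (not ADS bibcodes or old-style arXiv IDs).

```python
import re

# Matches \cite, \citep, \citet, ... with optional [..] arguments (illustrative).
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")

# New-style arXiv IDs such as 2404.14498, optionally prefixed with "arXiv:".
ARXIV_RE = re.compile(r"^(?:arXiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)
# DOIs: "10.", a registrant code, a slash, and a suffix.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")


def extract_citekeys(tex: str) -> list[str]:
    """Return citekeys in order of first appearance, without duplicates."""
    keys: list[str] = []
    for match in CITE_RE.finditer(tex):
        for key in match.group(1).split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def classify(key: str) -> str:
    """Label a citekey as 'arxiv', 'doi', or 'unsupported' (hypothetical labels)."""
    if ARXIV_RE.match(key):
        return "arxiv"
    if DOI_RE.match(key):
        return "doi"
    return "unsupported"
```

In this sketch, a key labelled `unsupported` corresponds to the
identifier-format case, while a key labelled `arxiv` or `doi` can still end up as
a provider miss if the lookup finds nothing.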

## Installation

Install the Rust CLI from crates.io:

```shell
cargo install bibsync
```

Install the Python package from PyPI:

```shell
pip install bibsync
```

The PyPI package includes Python bindings and installs the `bibsync` command.
Python 3.12 or newer is required.

You can also download a pre-built binary from the
[GitHub releases page](https://github.com/isaac-cf-wong/bibsync/releases). Pick
the archive for your platform, extract it, and place the `bibsync` executable
somewhere on your `PATH`.

To compile from source:

```shell
git clone https://github.com/isaac-cf-wong/bibsync.git
cd bibsync
cargo build --release
```

The compiled binary is written to `target/release/bibsync` on Unix-like systems
or `target\release\bibsync.exe` on Windows.

## Citing

If `bibsync` contributes to a scientific publication, please cite it using the
Zenodo record:

[https://doi.org/10.5281/zenodo.20422622](https://doi.org/10.5281/zenodo.20422622)

Citation formats including BibTeX can be exported directly from the Zenodo page.
For convenience, you can also use:

```bibtex
@software{wong2026bibsync,
  author  = {Wong, Isaac C. F.},
  title   = {bibsync: A Rust package to automatically resolve, synchronize, and validate LaTeX citations across BibTeX databases},
  version = {v0.3.2},
  year    = {2026},
  month   = may,
  doi     = {10.5281/zenodo.20422622},
  url     = {https://doi.org/10.5281/zenodo.20422622}
}
```

## Providers

By default `bibsync` tries NASA ADS first and InspireHEP second:

```shell
bibsync --fix main.tex -o references.bib --provider auto
```

NASA ADS requires an API token:

```shell
export ADS_API_TOKEN="..."
```

You can choose a single provider:

```shell
bibsync --fix main.tex -o references.bib --provider ads
bibsync --fix main.tex -o references.bib --provider inspire
```

InspireHEP supports arXiv IDs and DOIs. NASA ADS supports arXiv IDs, DOIs, and
ADS bibcodes.
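
The provider order and the identifier support listed above can be summarized as a
small lookup. This is only a sketch of the documented behavior, not the actual
implementation; the names `SUPPORTED`, `ORDER`, and `candidate_providers`, and
the kind labels `arxiv`, `doi`, and `bibcode`, are hypothetical.

```python
# Identifier kinds each provider accepts, as listed in this section.
SUPPORTED = {
    "ads": {"arxiv", "doi", "bibcode"},
    "inspire": {"arxiv", "doi"},
}

# "auto" tries NASA ADS first and InspireHEP second.
ORDER = {
    "auto": ["ads", "inspire"],
    "ads": ["ads"],
    "inspire": ["inspire"],
}


def candidate_providers(provider: str, kind: str) -> list[str]:
    """Return the providers worth querying for an identifier kind, in order."""
    return [name for name in ORDER[provider] if kind in SUPPORTED[name]]
```

Under these assumptions, an ADS bibcode combined with `--provider inspire` has no
candidate provider at all, which is one way a citekey can fail to resolve.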

## Python API

`bibsync` can also be installed from PyPI:

```shell
pip install bibsync
```

The PyPI package provides Python bindings backed by the Rust implementation:

```python
import bibsync

report = bibsync.sync_files(
    ["main.tex"],
    output="references.bib",
    provider="inspire",
    check=True,
)
```

It also installs the `bibsync` command. The command delegates to the same Rust
CLI implementation as the Cargo-installed binary, so command-line behavior is
kept in one place.
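
Because the installed command is the same CLI, a script can also drive it through
`subprocess`. The sketch below uses only flags shown in this README. The helper
names `build_command` and `bibliography_is_current` are hypothetical, and
treating a zero exit status as "bibliography is current" is an assumption based
on the check mode failing when the file is out of date.

```python
import subprocess


def build_command(
    tex: str, output: str, provider: str = "auto", fix: bool = False
) -> list[str]:
    """Assemble a bibsync command line from flags documented in this README."""
    cmd = ["bibsync"]
    if fix:
        cmd.append("--fix")
    cmd += [tex, "-o", output, "--provider", provider]
    return cmd


def bibliography_is_current(tex: str, output: str, provider: str = "auto") -> bool:
    """Run bibsync in check mode; assumes exit status 0 means 'current'."""
    result = subprocess.run(build_command(tex, output, provider), check=False)
    return result.returncode == 0
```

Keeping command construction separate from execution makes the argument list easy
to inspect or log before anything is run.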

## Existing Bibliographies

If the TeX source contains `\bibliography{references}`, `bibsync` can discover
`references.bib` automatically:

```shell
bibsync --fix main.tex
```

Additional read-only bibliographies can be used to avoid duplicating entries:

```shell
bibsync --fix main.tex -o references.bib -r shared.bib software.bib
```

Use `--merge-other` to copy matching entries from those read-only files into the
main output file.

To update a bibliography in place, pass a single `.bib` file:

```shell
bibsync --fix references.bib --force-regenerate
```

Existing input files are validated before resolution. A missing single `.bib`
input, `--other` bibliography, or `--ignore-file` is reported as an error with
the path that could not be read. Existing bibliography files are also parsed
strictly, so malformed BibTeX reports the file and the approximate failing
entry instead of being treated as an empty or partial bibliography.

### Update Behavior

By default `bibsync` leaves published entries untouched. Only entries that look
like unpublished preprints — those with an `archivePrefix` or `eprinttype` field
but no `journal` field — are re-queried to check whether they have been
published. If so, the entry is updated; otherwise it is preserved.

| Flag                 | Behavior                                      |
| -------------------- | --------------------------------------------- |
| _(default)_          | Re-check preprints; skip published entries    |
| `--no-update`        | Skip all existing entries                     |
| `--update-all`       | Re-resolve all existing entries               |
| `--force-regenerate` | Re-resolve and overwrite all existing entries |
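
The preprint rule and the flags in the table can be read as a small decision
function. The following Python sketch mirrors the description above rather than
the Rust code: `looks_like_preprint`, `should_requery`, and the mode strings are
hypothetical names, and entries are modelled as plain dictionaries of BibTeX
fields.

```python
def looks_like_preprint(fields: dict[str, str]) -> bool:
    """True when an entry has an eprint marker field but no journal field."""
    # BibTeX field names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in fields}
    has_marker = "archiveprefix" in names or "eprinttype" in names
    return has_marker and "journal" not in names


def should_requery(fields: dict[str, str], mode: str = "default") -> bool:
    """Decide whether an existing entry is sent back to a provider."""
    if mode == "no-update":
        return False  # skip all existing entries
    if mode in ("update-all", "force-regenerate"):
        return True  # re-resolve all existing entries
    return looks_like_preprint(fields)  # default: re-check preprints only
```

The sketch does not model the difference between `--update-all` and
`--force-regenerate`, which concerns what happens to the entry after it has been
re-resolved.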

### Ignoring Entries

To exclude specific entries from all resolution — for example, books or theses
you have curated by hand — list their citekeys in a `.bibsyncignore` file:

```text
# .bibsyncignore
knuth1997art
smith2024thesis
```

Pass the file with `--ignore-file`:

```shell
bibsync --fix main.tex -o references.bib --ignore-file .bibsyncignore
```
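
For tooling that needs to read the same file, a minimal Python sketch of parsing
the format shown above follows. It assumes one citekey per line, that lines
starting with `#` are comments, and that blank lines are skipped; inline comments
are not handled, and `parse_ignore_file` is a hypothetical helper rather than
part of the `bibsync` API.

```python
def parse_ignore_file(text: str) -> set[str]:
    """Collect citekeys from ignore-file text, skipping comments and blanks."""
    keys: set[str] = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        keys.add(line)
    return keys
```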

## Cache

Use `--cache` to avoid repeated provider API calls:

```shell
bibsync --cache main.tex -o references.bib
bibsync --fix --cache main.tex -o references.bib
```

The cache stores provider records and mappings from arXiv IDs or DOIs to the
provider's canonical record ID. Preprint entries that are re-checked for
publication always bypass the cache and fetch a fresh result, then write it back.
Use `--refresh-cache` to force a fresh fetch for all entries:

```shell
bibsync --fix --refresh-cache main.tex -o references.bib
```

Override the cache location with `--cache-dir DIR`.
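
The two-level layout described above (identifier to canonical record ID, then
record ID to record) and the bypass behavior can be sketched in Python as
follows. This is an in-memory illustration under assumed names: `ProviderCache`,
`resolve`, the `fetch` callback, and the `bypass` parameter are hypothetical and
say nothing about the on-disk format `bibsync` actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProviderCache:
    # arXiv ID or DOI -> provider's canonical record ID
    id_map: dict[str, str] = field(default_factory=dict)
    # canonical record ID -> cached provider record (here, a BibTeX string)
    records: dict[str, str] = field(default_factory=dict)

    def get(self, identifier: str) -> str | None:
        record_id = self.id_map.get(identifier)
        if record_id is None:
            return None
        return self.records.get(record_id)

    def put(self, identifier: str, record_id: str, record: str) -> None:
        self.id_map[identifier] = record_id
        self.records[record_id] = record


def resolve(
    cache: ProviderCache,
    identifier: str,
    fetch: Callable[[str], tuple[str, str]],
    *,
    bypass: bool = False,
) -> str:
    """Return a record, fetching on a miss or when bypass is set."""
    if not bypass:
        cached = cache.get(identifier)
        if cached is not None:
            return cached
    # Fresh fetch; the result is written back so later runs can reuse it.
    record_id, record = fetch(identifier)
    cache.put(identifier, record_id, record)
    return record
```

In this model, a preprint re-check or `--refresh-cache` corresponds to calling
`resolve` with `bypass=True`: the cache is skipped on read but still updated on
write.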

If a cache file is corrupt, `bibsync` reports the exact cache path and asks you
to refresh or remove the bad cache entry. Error messages for failed provider
requests name the provider and the citekey or batch being resolved.

## Pre-commit

The repository includes `.pre-commit-hooks.yaml`, so other projects can use
`bibsync` as a pre-commit hook.

Use the pre-built binary hook for faster installs:

```yaml
repos:
    - repo: https://github.com/isaac-cf-wong/bibsync
      rev: v0.3.2
      hooks:
          - id: bibsync-bin
            args: [--cache, --provider, inspire, --output, references.bib]
```

The binary hook downloads a platform-specific archive from the GitHub release
matching `rev` and caches it under pre-commit's cache directory. It is available
for Linux x86_64/aarch64, macOS x86_64/aarch64, and Windows x86_64. The source
hook is still available, but it compiles the Rust crate during hook
installation:

```yaml
repos:
    - repo: https://github.com/isaac-cf-wong/bibsync
      rev: v0.3.2
      hooks:
          - id: bibsync
            args: [--provider, inspire, --output, references.bib]
```

By default, the hook checks whether the bibliography is current and fails without
writing changes. To let the hook update files, add `--fix` to the hook args:

```yaml
repos:
    - repo: https://github.com/isaac-cf-wong/bibsync
      rev: v0.3.2
      hooks:
          - id: bibsync-bin
            args: [--fix, --cache, --provider, inspire, --output, references.bib]
```

To skip manually curated entries, add `--ignore-file`:

```yaml
repos:
    - repo: https://github.com/isaac-cf-wong/bibsync
      rev: v0.3.2
      hooks:
          - id: bibsync-bin
            args:
                [
                    --fix,
                    --cache,
                    --provider,
                    inspire,
                    --output,
                    references.bib,
                    --ignore-file,
                    .bibsyncignore,
                ]
```

For a project-local hook while developing `bibsync` itself:

```yaml
repos:
    - repo: local
      hooks:
          - id: bibsync
            name: bibsync
            entry: cargo run -- --fix --provider inspire --output references.bib
            language: system
            files: \.tex$
```

