Metadata-Version: 2.4
Name: axm-bib
Version: 0.2.1
Summary: AXM bibliographic tools — DOI resolution, BibTeX retrieval, paper search & PDF download
Project-URL: Homepage, https://github.com/axm-protocols/axm-bib
Project-URL: Repository, https://github.com/axm-protocols/axm-bib
Project-URL: Issues, https://github.com/axm-protocols/axm-bib/issues
Project-URL: Documentation, https://axm-protocols.github.io/axm-bib/
Author-email: Gabriel Jarry <gabriel@axm-protocols.io>
License: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: axm
Requires-Dist: bibtexparser>=1.4
Requires-Dist: cyclopts>=4.5.1
Requires-Dist: habanero>=1.2
Requires-Dist: httpx>=0.27
Requires-Dist: pydantic>=2.0
Requires-Dist: pymupdf4llm>=0.0.17
Requires-Dist: pymupdf>=1.25
Description-Content-Type: text/markdown

<p align="center">
  <img src="https://raw.githubusercontent.com/axm-protocols/axm-init/main/assets/logo.png" alt="AXM Logo" width="180" />
</p>

<p align="center">
  <strong>axm-bib — Bibliographic tools: search papers, resolve DOIs, download & extract PDFs</strong>
</p>


<p align="center">
  <a href="https://github.com/axm-protocols/axm-bib/actions/workflows/ci.yml"><img src="https://github.com/axm-protocols/axm-bib/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://axm-protocols.github.io/axm-init/explanation/check-grades/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-bib/gh-pages/badges/axm-init.json" alt="axm-init"></a>
  <a href="https://axm-protocols.github.io/axm-audit/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-bib/gh-pages/badges/axm-audit.json" alt="axm-audit"></a>
  <a href="https://coveralls.io/github/axm-protocols/axm-bib?branch=main"><img src="https://coveralls.io/repos/github/axm-protocols/axm-bib/badge.svg?branch=main" alt="Coverage"></a>
  <a href="https://pypi.org/project/axm-bib/"><img src="https://img.shields.io/pypi/v/axm-bib" alt="PyPI"></a>
  <img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="Python 3.12+">
  <a href="https://axm-protocols.github.io/axm-bib/"><img src="https://img.shields.io/badge/docs-live-brightgreen" alt="Docs"></a>
</p>

---

## Features

- 🔍 **Search** — Find papers by title/keywords (Semantic Scholar + CrossRef fallback)
- 📖 **DOI → BibTeX** — Resolve any DOI to a BibTeX entry (CrossRef)
- 📥 **PDF Pipeline** — Download, extract, and organize papers in one command
- 📄 **Content Extraction** — PDF → Markdown + figure PNGs (PyMuPDF)
- 🪃 **arXiv Fallback** — Direct download from arXiv when Unpaywall has no URL
- 🤖 **MCP Integration** — Auto-discovered tools via `axm-mcp`

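The search feature above lists Semantic Scholar with CrossRef as a fallback. As an illustration of that fallback pattern only (this is not axm-bib's actual code), the sketch below takes the backends as plain callables. `search_with_fallback`, `flaky_backend`, and `stub_backend` are hypothetical names, and no real API is called.

```python
from collections.abc import Callable, Sequence

# A backend takes (query, limit) and returns a list of result dicts.
Backend = Callable[[str, int], list[dict]]


def search_with_fallback(
    query: str, limit: int, backends: Sequence[Backend]
) -> list[dict]:
    """Try each backend in order and return the first non-empty result list."""
    for backend in backends:
        try:
            results = backend(query, limit)
        except Exception:
            # A failing backend (rate limit, timeout, ...) moves us to the next one.
            continue
        if results:
            return results[:limit]
    return []


def flaky_backend(query: str, limit: int) -> list[dict]:
    # Hypothetical stand-in for a primary API that is unavailable.
    raise TimeoutError("primary unavailable")


def stub_backend(query: str, limit: int) -> list[dict]:
    # Hypothetical stand-in for a secondary API that answers.
    return [{"title": f"{query} #{i}"} for i in range(limit)]


hits = search_with_fallback("attention", 2, [flaky_backend, stub_backend])
```

Passing the backends in as arguments keeps the ordering explicit and makes the fallback easy to exercise without network access.
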
## Installation

```bash
uv add axm-bib
```

## Quick Start

```bash
# Search papers
axm-bib search "attention is all you need" -n 5

# Resolve a DOI to BibTeX
axm-bib doi 10.1145/363235.363259

# Download, extract & organize a paper (full pipeline)
axm-bib pdf 10.48550/arXiv.1706.03762
```

### Pipeline Output

`axm-bib pdf` creates a complete paper folder:

```
~/axm/papers/vaswani2017attention/
├── vaswani2017attention.pdf   # downloaded PDF
├── paper.bib                  # BibTeX entry
├── content.md                 # extracted Markdown
└── figures/
    ├── fig_001.png
    └── ...
```

The command then prints a summary along these lines:

```
Downloaded: ~/axm/papers/vaswani2017attention
  PDF: vaswani2017attention.pdf (1,234,567 bytes)
  BibTeX: paper.bib
  Markdown: content.md (8,432 words, 45,123 chars, 12 pages)
  Figures: 8
```
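
If you post-process these folders from your own scripts, a small check against the layout shown above can be handy. The sketch below assumes only the file names from the example tree (`<folder>.pdf`, `paper.bib`, `content.md`, `figures/*.png`). `summarize_paper_dir` is a hypothetical helper, not part of axm-bib.

```python
from pathlib import Path


def summarize_paper_dir(paper_dir: Path) -> dict:
    """Report which artifacts from the example layout exist in paper_dir."""
    key = paper_dir.name  # in the example, the folder name is also the PDF stem
    figures_dir = paper_dir / "figures"
    figure_count = len(list(figures_dir.glob("*.png"))) if figures_dir.is_dir() else 0
    return {
        "key": key,
        "has_pdf": (paper_dir / f"{key}.pdf").is_file(),
        "has_bib": (paper_dir / "paper.bib").is_file(),
        "has_markdown": (paper_dir / "content.md").is_file(),
        "figure_count": figure_count,
    }
```

For instance, `summarize_paper_dir(Path.home() / "axm" / "papers" / "vaswani2017attention")` returns a dict you can inspect before feeding `content.md` to another tool.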

## CLI Commands

### `axm-bib search`

| Option | Default | Description |
|---|---|---|
| `QUERY` | *required* | Search query (title, keywords) |
| `--limit`, `-n` | 5 | Max results (1–100) |
| `--abstract/--no-abstract` | `True` | Show paper abstracts |
| `--abstract-len` | 0 (full) | Truncate abstracts to N chars |

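To make the `--abstract-len` semantics concrete (0 keeps the full abstract, a positive N cuts it to N characters), here is a minimal sketch. `truncate_abstract` is a hypothetical name, and whether axm-bib appends an ellipsis or trims on word boundaries is not stated here, so this version does neither.

```python
def truncate_abstract(abstract: str, max_len: int = 0) -> str:
    """Return the abstract unchanged when max_len is 0, else its first max_len characters."""
    if max_len < 0:
        raise ValueError("max_len must be >= 0")
    if max_len == 0 or len(abstract) <= max_len:
        return abstract
    return abstract[:max_len]
```
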
### `axm-bib doi`

| Option | Default | Description |
|---|---|---|
| `DOI` | *required* | Digital Object Identifier to resolve |

### `axm-bib pdf`

| Option | Default | Description |
|---|---|---|
| `DOI` | *required* | DOI of the paper |
| `--output`, `-o` | `~/axm/papers/` | Output directory |

Downloads the PDF, extracts Markdown + figures, and writes `paper.bib` — all in one step.
Supports arXiv papers even when Unpaywall has no URL (direct arXiv fallback).

### `axm-bib extract`

| Option | Default | Description |
|---|---|---|
| `PDF_PATH` | *required* | Path to a local PDF file |
| `--output-dir`, `-o` | auto | Output directory |
| `--figures/--no-figures` | `True` | Extract figures as PNG |

Standalone extraction for PDFs you already have.

## MCP Integration

`axm-bib` tools are automatically discovered by `axm-mcp`:

| Tool | Description |
|---|---|
| `bib_search` | Search papers by keywords |
| `bib_doi` | Resolve DOI to BibTeX |
| `bib_pdf` | Full pipeline: download + extract + BibTeX |
| `bib_extract` | Extract a local PDF to Markdown + figures |

## Configuration

| Variable | Purpose |
|---|---|
| `UNPAYWALL_EMAIL` | Email for Unpaywall API (prompted on first use) |
| `S2_API_KEY` | Optional Semantic Scholar API key for higher rate limits |

Config file: `~/.config/axm-bib/config.toml`

## Development

```bash
git clone https://github.com/axm-protocols/axm-bib.git
cd axm-bib
uv sync --all-groups
uv run pytest           # 173 tests
uv run ruff check src/  # lint
```

## License

Apache License 2.0
