Metadata-Version: 2.4
Name: arxiv-dl
Version: 1.3.2
Summary: Command-line Papers Downloader. Citation extraction and PDF naming automation.
Project-URL: Homepage, https://github.com/MarkHershey/arxiv-dl
Project-URL: Issues, https://github.com/MarkHershey/arxiv-dl/issues
Author-email: Mark He Huang <dev@markhh.com>
License-Expression: MIT
License-File: LICENSE
Keywords: CVF,CVPR,ECCV,ICCV,WACV,arxiv,downloader,paper
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Requires-Dist: beautifulsoup4>=4.13.4
Requires-Dist: pydantic>=2.11.7
Requires-Dist: pymupdf>=1.26.1
Requires-Dist: requests>=2.32.4
Requires-Dist: rich>=14.0.0
Provides-Extra: dev
Requires-Dist: black; extra == 'dev'
Requires-Dist: check-manifest; extra == 'dev'
Requires-Dist: isort; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: tox; extra == 'dev'
Description-Content-Type: text/markdown

# arXiv-dl

Command-line research paper downloader for papers hosted on [arXiv](https://arxiv.org/), [Hugging Face Papers](https://huggingface.co/papers), [NeurIPS](https://proceedings.neurips.cc/), [CVF Open Access](https://openaccess.thecvf.com/menu) (CVPR, ICCV, WACV), and [ECVA](https://www.ecva.net/papers.php) (ECCV).

[![](https://img.shields.io/pypi/v/arxiv-dl)](https://pypi.org/project/arxiv-dl/)
[![](https://img.shields.io/pypi/dm/Arxiv-dl)](https://pypistats.org/packages/arxiv-dl)
[![](https://img.shields.io/badge/code%20style-black-black)](https://github.com/psf/black)
[![](https://img.shields.io/badge/license-MIT-black)](https://github.com/MarkHershey/arxiv-dl/blob/master/LICENSE)

_Disclaimer: This is an opinionated command-line tool for downloading papers. It prioritizes ease of use for researchers and is not an official arXiv project._

![](imgs/demo_v1.2.0.png)

## What does it do?

- Downloads papers from [arXiv](https://arxiv.org/), [Hugging Face Papers](https://huggingface.co/papers), [NeurIPS](https://proceedings.neurips.cc/), [CVPR, ICCV, WACV](https://openaccess.thecvf.com/menu), and [ECCV](https://www.ecva.net/papers.php) with a simple CLI.
- Speeds up downloads with [aria2](https://aria2.github.io/) when available.
- Retrieves paper metadata:
    - Title, abstract, and year
    - Authors
    - Comments and conference acceptance info
    - Repository URLs when available
    - `BibTeX` citation
- Maintains a list of local papers and their metadata in a JSON file.
- Lets you configure the download destination with an environment variable or command-line option.
- Saves downloaded papers with standardized filenames.

## Why?

- Save time downloading and organizing papers.
- Use multiple parallel connections for faster downloads.
- Keep a local paper list for lookup, notes, and citations.

## Installation

For regular command-line use, install with `pipx`. Python 3.9 or later is required.

```bash
pipx install arxiv-dl
```

If `pipx` is not installed:

```bash
# Debian/Ubuntu
sudo apt install pipx
pipx ensurepath

# macOS
brew install pipx
pipx ensurepath
```

> [!NOTE]
> `pipx` installs command-line tools in isolated environments and exposes their commands on your `PATH`. This avoids conflicts with operating-system-managed Python installations, including Debian/Ubuntu environments that block global `pip install` through PEP 668.

To upgrade:

```bash
pipx upgrade arxiv-dl
```

If you prefer `pip`, install inside a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -U arxiv-dl
```

Optionally, install [aria2c](https://aria2.github.io/) for multi-connection downloads.

- macOS: `brew install aria2`
- Linux: `sudo snap install aria2c`
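
To confirm that the optional dependency is visible before relying on multi-connection downloads, a small check like the one below can help. It is a minimal sketch that only looks for an `aria2c` executable on `PATH`. The helper name `aria2_available` is hypothetical and is not part of arxiv-dl.

```python
import shutil


def aria2_available() -> bool:
    """Return True if an `aria2c` executable is found on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which("aria2c") is not None


print("aria2c found" if aria2_available() else "aria2c not found")
```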

## Usage

After installation, use `paper` in your shell to download papers.
The legacy commands `arxiv-dl` and `getpaper` are equivalent to `paper`.

```bash
paper [OPTIONS] TARGET(s)
```

### Shell examples

```bash
# Download a single target
$ paper 1512.03385

# Download multiple targets
$ paper 2103.15538 2304.04415 https://arxiv.org/abs/1512.03385
```

### Supported Targets

<details>
<summary><strong>Click to expand</strong></summary>

✅ Supported, 🚧 Not Yet Supported, ❌ Not Supported

- **[arXiv](https://arxiv.org/)**
    - ✅ arXiv ID: `1512.03385` or `arXiv:1512.03385`
    - ✅ Legacy arXiv ID: `alg-geom/9708001` or `cs/0002001`, etc.
    - ✅ arXiv Abstract Page URL: `https://arxiv.org/abs/1512.03385`
    - ✅ arXiv PDF Page URL: `https://arxiv.org/pdf/1512.03385.pdf`
    - ✅ arXiv HTML Page URL: `https://arxiv.org/html/2506.15442`
- **[Hugging Face Papers](https://huggingface.co/papers)**
    - ✅ Single Paper Page URL: `https://huggingface.co/papers/2605.12357`
    - ✅ Current Daily Papers Page URL: `https://huggingface.co/papers`
    - ✅ Daily Papers Page URL: `https://huggingface.co/papers/date/2026-05-22`
    - ✅ Weekly Papers Page URL: `https://huggingface.co/papers/week/2026-W21`
    - ✅ Monthly Papers Page URL: `https://huggingface.co/papers/month/2026-05`
    - ✅ Trending Papers Page URL: `https://huggingface.co/papers/trending`
    - ✅ User/Organization Papers Page URL: `https://huggingface.co/huggingface/papers`
    - ✅ Collection Page URL: `https://huggingface.co/collections/Testerpce/memory`
- **[CVF Open Access](https://openaccess.thecvf.com/menu) (CVPR, ICCV, WACV)**
    - ✅ CVF Abstract Page URL: `https://openaccess.thecvf.com/content/**/html/**/*.html`
    - ✅ CVF PDF Page URL: `https://openaccess.thecvf.com/content/**/papers/**/*.pdf`
- **[ECVA](https://www.ecva.net/papers.php) (ECCV)**
    - ✅ ECVA Abstract Page URL: `https://www.ecva.net/html/**/*.php`
    - ❌ ECVA PDF Page URL: `https://www.ecva.net/papers/**/*.pdf`
- **[NeurIPS](https://proceedings.neurips.cc/) / [NIPS](https://papers.nips.cc/)**
    - ✅ NeurIPS Abstract Page URL: `https://proceedings.neurips.cc/paper_files/paper/**/hash/**/*.html`
    - ✅ NeurIPS PDF Page URL: `https://proceedings.neurips.cc/paper_files/paper/**/file/**/*.pdf`
    - ✅ NIPS mirror Abstract Page URL: `https://papers.nips.cc/paper_files/paper/**/hash/**/*.html`
    - ✅ NIPS mirror PDF Page URL: `https://papers.nips.cc/paper_files/paper/**/file/**/*.pdf`
- **[OpenReview](https://openreview.net/)**
    - 🚧 TODO

</details>
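
The arXiv target forms listed above all carry the same identifier, so a script that prepares input for `paper` may want to normalize them first. The following is a rough sketch of one way to do that with regular expressions. It is not the parser arxiv-dl uses internally, and the names `extract_arxiv_id`, `NEW_ID`, and `LEGACY_ID` are hypothetical.

```python
import re
from typing import Optional

# New-style IDs: YYMM.NNNN or YYMM.NNNNN, optionally followed by a version (v2).
NEW_ID = r"\d{4}\.\d{4,5}(?:v\d+)?"
# Legacy IDs: archive name (optionally with a subject class), a slash, 7 digits.
LEGACY_ID = r"[a-z][a-z-]*(?:\.[A-Z]{2})?/\d{7}(?:v\d+)?"

ID_RE = re.compile(rf"^(?:{NEW_ID}|{LEGACY_ID})$")
URL_RE = re.compile(
    r"^https?://(?:www\.)?arxiv\.org/(?:abs|pdf|html)/"
    rf"({NEW_ID}|{LEGACY_ID})(?:\.pdf)?/?$"
)


def extract_arxiv_id(target: str) -> Optional[str]:
    """Return the arXiv identifier found in `target`, or None if unrecognized."""
    target = target.strip()
    # Drop the optional "arXiv:" prefix (case-insensitive).
    if target.lower().startswith("arxiv:"):
        target = target[len("arxiv:"):]
    # Bare identifier, new-style or legacy.
    if ID_RE.match(target):
        return target
    # Abstract, PDF, or HTML page URL.
    match = URL_RE.match(target)
    return match.group(1) if match else None
```

Targets from other sites, such as Hugging Face or CVF URLs, return `None` here and would need their own handling.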

### Common Options

- `-v`, `--verbose`: Print full details.
- `-d`, `--download-dir`: Set the download directory for this run. This overrides both the default path and `ARXIV_DOWNLOAD_FOLDER`.
- `-n`, `--n-threads`: Set the number of parallel download connections used by `aria2`.

> [!TIP]
> Run `paper -h` to see all options.
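
When calling the CLI from a script, the options above can be assembled into an argument list. This is a minimal sketch using only the flags documented in this section. The helper `build_paper_command` is hypothetical, and the sketch assumes that options may precede targets, as in the usage line `paper [OPTIONS] TARGET(s)`.

```python
from typing import List, Optional, Sequence


def build_paper_command(
    targets: Sequence[str],
    download_dir: Optional[str] = None,
    n_threads: Optional[int] = None,
    verbose: bool = False,
) -> List[str]:
    """Assemble an argument list for the `paper` command (hypothetical helper)."""
    command = ["paper"]
    if verbose:
        command.append("-v")
    if download_dir is not None:
        command += ["-d", download_dir]
    if n_threads is not None:
        command += ["-n", str(n_threads)]
    # Targets go last, after all options.
    command += list(targets)
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which requires `paper` to be installed and on your `PATH`.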

### Python API

```python
from arxiv_dl import download_paper

download_paper(target="1512.03385", download_dir=".", set_verbose_level="silent")
```
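
For several targets, the call above can be wrapped in a loop. The sketch below is one possible approach, not part of the package. The helper `download_all` is hypothetical, and it assumes that a failed download surfaces as a Python exception, which this README does not confirm. The downloader is passed in as an argument so the helper can be exercised without network access.

```python
from typing import Callable, Dict, Iterable


def download_all(
    targets: Iterable[str],
    download_dir: str,
    downloader: Callable[..., object],
) -> Dict[str, str]:
    """Call `downloader` once per target and collect failures instead of stopping."""
    failures: Dict[str, str] = {}
    for target in targets:
        try:
            # Same keyword arguments as the documented download_paper call.
            downloader(
                target=target,
                download_dir=download_dir,
                set_verbose_level="silent",
            )
        except Exception as exc:  # assumption: failures are raised as exceptions
            failures[target] = str(exc)
    return failures
```

With the package installed, call it as `download_all(["1512.03385", "2103.15538"], ".", download_paper)` after importing `download_paper` from `arxiv_dl`.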

## Configuration

### Default Download Destination

- By default, papers are downloaded to `$HOME/Downloads/ArXiv_Papers`.

### Custom Download Destination

Set `ARXIV_DOWNLOAD_FOLDER` to choose a persistent download destination. Add this to your `.bashrc` or `.zshrc`:

```bash
export ARXIV_DOWNLOAD_FOLDER="YOUR/PATH/TO/ANY/FOLDER"
```

- Download destination priority:
    1.  Command-line option `-d` (highest priority)
    2.  Environment variable `ARXIV_DOWNLOAD_FOLDER`
    3.  Default download destination (lowest priority)
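
The priority order above can be expressed as a short function. This is an illustrative sketch only, not the code arxiv-dl runs. The name `resolve_download_dir` and its parameters are hypothetical, and expanding `~` in the supplied paths is an assumption of the sketch.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_download_dir(
    cli_dir: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the download directory following the documented priority order."""
    env = os.environ if env is None else env
    # 1. Command-line option -d (highest priority).
    if cli_dir:
        return Path(cli_dir).expanduser()
    # 2. Environment variable ARXIV_DOWNLOAD_FOLDER.
    env_dir = env.get("ARXIV_DOWNLOAD_FOLDER")
    if env_dir:
        return Path(env_dir).expanduser()
    # 3. Default destination (lowest priority).
    return Path.home() / "Downloads" / "ArXiv_Papers"
```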

### Custom Command Alias

- You can define aliases to rename the command or add default options:
    ```bash
    alias dp="paper"
    alias dpv='paper -v -d "$HOME/Documents/Papers"'
    ```

## Contributing

Development, testing, build, and publishing notes are in [DEVELOPMENT.md](DEVELOPMENT.md).

## License

This project is licensed under the [MIT License](https://github.com/MarkHershey/arxiv-dl/blob/master/LICENSE).  
&copy; Mark H. Huang. All rights reserved.
