Metadata-Version: 2.4
Name: arxiv2md
Version: 0.1.1
Summary: A command-line tool and Python library to convert arXiv papers to Markdown format.
Project-URL: Homepage, https://github.com/misya11p/arxiv2md
Author-email: komiya <90894519+misya11p@users.noreply.github.com>
License: MIT License Copyright (c) 2025 komiya
        
        Permission is hereby granted, free of
        charge, to any person obtaining a copy of this software and associated
        documentation files (the "Software"), to deal in the Software without
        restriction, including without limitation the rights to use, copy, modify, merge,
        publish, distribute, sublicense, and/or sell copies of the Software, and to
        permit persons to whom the Software is furnished to do so, subject to the
        following conditions:
        
        The above copyright notice and this permission notice
        (including the next paragraph) shall be included in all copies or substantial
        portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
        ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
        MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
        EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
        OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
        FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
        THE SOFTWARE.
License-File: LICENSE
Keywords: arxiv,converter,markdown
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: arxiv>=2.2.0
Requires-Dist: bs4>=0.0.2
Requires-Dist: halo>=0.0.31
Requires-Dist: lxml>=6.0.0
Requires-Dist: requests>=2.32.5
Requires-Dist: typer>=0.16.1
Description-Content-Type: text/markdown

# arxiv2md

A command-line tool and Python library for converting arXiv papers to Markdown format.

It downloads the paper's LaTeX source (.tex) for a given arXiv URL and converts it to Markdown.

## Setup

First, install `latexml`:

```bash
# For macOS
brew install latexml

# For Ubuntu
sudo apt update
sudo apt install latexml
```
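
To confirm that the installation succeeded, you can check that a `latexml` executable is on your `PATH`. The following is a minimal sketch using only the standard library. The helper name `has_latexml` is hypothetical and not part of `arxiv2md`, and it assumes the tool is invoked under the name `latexml`:

```python
import shutil


def has_latexml(executable: str = "latexml") -> bool:
    """Return True if the given executable can be found on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if has_latexml():
        print("latexml found")
    else:
        print("latexml not found; install it before running arxiv2md")
```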

Then install `arxiv2md`:

```bash
# Using pip
pip install arxiv2md

# Using uv
uv tool install arxiv2md
```

## Usage

Simply provide the arXiv URL:

```bash
arxiv2md https://arxiv.org/abs/1706.03762
```

For use from Python code:

```python
from arxiv2md import arxiv2md

markdown, metadata = arxiv2md("https://arxiv.org/abs/1706.03762")
with open("output.md", "w", encoding="utf-8") as f:
    f.write(markdown)
```

Example output file: [example.md](example.md)
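
To convert several papers in one run, the library call can be wrapped in a small loop. The sketch below is illustrative only: `convert_many`, its parameters, and the `paper_NNN.md` file naming are hypothetical and not part of `arxiv2md`. The converter is passed in as an argument, and because the exception raised for a failed conversion is not documented here, the sketch catches `Exception` broadly.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def convert_many(
    urls: Iterable[str],
    convert: Callable[[str], tuple],
    out_dir: str = "papers",
) -> Dict[str, Optional[Path]]:
    """Convert each URL and write the resulting Markdown into out_dir.

    Returns a dict mapping each URL to the written path, or to None on failure.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    results: Dict[str, Optional[Path]] = {}
    for index, url in enumerate(urls):
        try:
            # The converter is expected to return (markdown, metadata).
            markdown, _metadata = convert(url)
        except Exception as exc:  # assumption: exact exception type is unknown
            print(f"skipped {url}: {exc}")
            results[url] = None
            continue
        path = target / f"paper_{index:03d}.md"
        path.write_text(markdown, encoding="utf-8")
        results[url] = path
    return results
```

With the real converter, this would be called as `convert_many(["https://arxiv.org/abs/1706.03762"], arxiv2md)` after `from arxiv2md import arxiv2md`.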

## Notes

- The input does not have to be the arXiv abstract page URL. PDF and source page URLs work as well, and in general any string containing an arXiv ID should work.
- Papers for which no LaTeX source is available cannot be converted.
- Figures and tables will be ignored.
- Papers that do not use BibTeX will have their citations rendered incorrectly.
- For long papers, processing by `latexml` may take considerable time.
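
The first note says that any string containing an arXiv ID should work. As an illustration of what that can mean, here is a minimal sketch that extracts an ID with a regular expression. It is not necessarily how `arxiv2md` parses its input. The names `ARXIV_ID` and `extract_arxiv_id` are hypothetical, and the pattern assumes the two public arXiv identifier styles (`YYMM.NNNNN` and the older `archive/YYMMNNN`), each with an optional version suffix.

```python
import re
from typing import Optional

ARXIV_ID = re.compile(
    # New style, e.g. 1706.03762 or 1706.03762v5
    r"(?<![\d.])\d{4}\.\d{4,5}(?:v\d+)?(?!\d)"
    # Old style, e.g. hep-th/9901001 or math.GT/0309136
    r"|[a-z\-]+(?:\.[A-Z]{2})?/\d{7}(?:v\d+)?(?!\d)"
)


def extract_arxiv_id(text: str) -> Optional[str]:
    """Return the first arXiv ID found in text, or None if there is none."""
    match = ARXIV_ID.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for candidate in (
        "https://arxiv.org/abs/1706.03762",
        "https://arxiv.org/pdf/1706.03762v5",
        "https://arxiv.org/abs/hep-th/9901001",
    ):
        print(candidate, "->", extract_arxiv_id(candidate))
```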
