Metadata-Version: 2.4
Name: python-doxx
Version: 0.1.0
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Rust
License-File: LICENSE
Summary: Python bindings for the Rust doxx document converter
Keywords: docx,conversion,markdown,python-binding,doxx
Author: Codex Assistant
Requires-Python: >=3.8
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/oneryalcin/python-doxx
Project-URL: Repository, https://github.com/oneryalcin/python-doxx
Project-URL: Issues, https://github.com/oneryalcin/python-doxx/issues

# python-doxx

Minimal Python bindings for the [`doxx`](https://github.com/bgreenwell/doxx) Rust library. The package exposes only the functions needed for scripted document inspection:

- `convert(path, export_format="markdown", page=None, images=False)` — export `.docx` files as Markdown, plain text, JSON, CSV, or ANSI strings. Optional `page` isolates a single page (1-indexed); set `images=True` to retain image references when available.
- `search(path, query, page=None)` — return match locations for quick content checks, mirroring `doxx --search`.
- `extract_images(path, output_dir)` — dump embedded images, similar to `--extract-images` in the CLI.

## Installation

Once the tagged `v0.1.0` release has propagated to PyPI:

```bash
pip install python-doxx
# or
uv pip install python-doxx
```

To build from source locally (requires a Rust toolchain and `maturin`):

```bash
maturin develop
```

## Usage

```python
import doxx

# Export to Markdown
markdown = doxx.convert("report.docx", export_format="markdown", images=True)

# Pull a single page as CSV
csv_page = doxx.convert("data.docx", export_format="csv", page=2)

# Locate text snippets
hits = doxx.search("contract.docx", "payment")
for hit in hits:
    print(hit.page, hit.text)

# Extract embedded images
saved = doxx.extract_images("slides.docx", "./images")
```
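When a query matches many times, it can help to group the hits by page before reporting them. The following is a minimal sketch, assuming each hit exposes the `page` and `text` attributes used in the example above; `hits_by_page` is a hypothetical helper, not part of the package.

```python
from collections import defaultdict


def hits_by_page(hits):
    """Group hit texts by page number, keeping hit order within each page.

    Assumes every hit has `.page` and `.text` attributes, as in the
    usage example; this helper is illustrative and not part of doxx.
    """
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit.page].append(hit.text)
    return dict(grouped)
```

For example, `hits_by_page(doxx.search("contract.docx", "payment"))` would return a dictionary mapping each page to the matching snippets found on it.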

Supported export formats: `markdown`, `text`, `json`, `csv`, and `ansi`.

Image handling is off by default (`images=False`); pass `images=True` when you need real file paths in the exported Markdown or ANSI render. CSV export raises `ValueError` if the document contains no tables.
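Because CSV export fails on documents without tables, batch scripts may want a fallback. Here is a minimal sketch, assuming the `ValueError` behaviour described above; `tables_or_text` is a hypothetical helper, and the converter is passed in as an argument so the function can be exercised without a real `.docx` file.

```python
def tables_or_text(path, convert):
    """Return (format, content): CSV when tables exist, else plain text.

    `convert` is expected to behave like `doxx.convert`, raising
    ValueError for CSV export of a document with no tables.
    This helper is illustrative and not part of doxx.
    """
    try:
        return "csv", convert(path, export_format="csv")
    except ValueError:
        # No tables in the document: fall back to plain text.
        return "text", convert(path, export_format="text")
```

In a script this would be called as `fmt, content = tables_or_text("data.docx", doxx.convert)`.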

## Releasing

1. Make sure `CHANGELOG.md` (if present) and `pyproject.toml` both carry the new version.
2. Tag the release (`git tag v0.1.0`), then `git push --tags`.
3. The GitHub Actions workflow builds wheels for macOS, Linux, and Windows, uploads them as artifacts, and publishes to PyPI automatically when `PYPI_API_TOKEN` is configured in the repository secrets.

