Metadata-Version: 2.4
Name: aspose-words-foss
Version: 26.5.0
Summary: Lightweight free open-source alternative to Aspose.Words — convert DOCX, DOC, RTF to Markdown, text, and PDF
Project-URL: Homepage, https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python
Project-URL: Repository, https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python
Project-URL: Issues, https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python/issues
Author: Aspose
License-Expression: MIT
License-File: LICENSE
Keywords: aspose,convert,doc,document,docx,markdown,office,openxml,pdf,rtf,word
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Office/Business
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Classifier: Typing :: Typed
Requires-Python: <3.13,>=3.10
Requires-Dist: fpdf2>=2.7.0
Requires-Dist: olefile>=0.46
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: pillow>=10.0.0; extra == 'dev'
Requires-Dist: pytest>=9.0.2; extra == 'dev'
Description-Content-Type: text/markdown

# Aspose.Words FOSS

A lightweight, open-source Python library for converting DOCX, DOC, RTF, TXT, and MD files to DOCX, Markdown, plain text, and PDF without requiring Microsoft Word.

It is a free, lightweight version of [Aspose.Words for Python via .NET](https://github.com/aspose-words/Aspose.Words-for-Python-via-.NET) with a compatible API (`Document`, `SaveFormat`, `SaveOptions`).

[![Python](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- **DOCX Read/Write**: Pure Python reader using only the standard library (`zipfile`, `xml.etree`)
- **DOC Support**: Word 97-2003 binary format reader via `olefile`
- **RTF Support**: Rich Text Format reader via OLE2 delegation
- **Plain Text & Markdown Input**: Read `.txt` and `.md` files
- **Markdown Export**: Rich formatting — headings, bold/italic/strikethrough/underline, ordered and unordered lists (including nested), tables, block quotes, code blocks, and hyperlinks. Encoding and paragraph break sequence are configurable
- **PDF Export**: Generate PDF output via `fpdf2`. Applied `PdfSaveOptions` fields: `compliance`, `image_compression`, `jpeg_quality`, `outline_options`, `export_document_structure`, `export_bookmarks_outline`, `zoom_behavior`, `zoom_factor`, `display_doc_title`
- **Plain Text Export**: Extract document text content
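
To give a feel for what reading DOCX with only `zipfile` and `xml.etree` involves, here is a minimal sketch. It is not this library's implementation: the helper names `extract_paragraphs` and `build_demo_package` are hypothetical, and the reader only collects plain `w:t` text from `word/document.xml`, ignoring styles, tables, and everything else.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(docx_bytes: bytes) -> list[str]:
    """Return the text of each w:p paragraph in a DOCX package (hypothetical helper)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    # Each w:p is a paragraph; its visible text lives in descendant w:t elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        paragraphs.append("".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t")))
    return paragraphs


def build_demo_package(lines: list[str]) -> bytes:
    """Build an in-memory zip holding only word/document.xml, for demonstration.

    This is not a complete DOCX (no [Content_Types].xml or relationships),
    so Word itself would not open it.
    """
    ET.register_namespace("w", W_NS)
    document = ET.Element(f"{{{W_NS}}}document")
    body = ET.SubElement(document, f"{{{W_NS}}}body")
    for line in lines:
        para = ET.SubElement(body, f"{{{W_NS}}}p")
        run = ET.SubElement(para, f"{{{W_NS}}}r")
        ET.SubElement(run, f"{{{W_NS}}}t").text = line
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", ET.tostring(document, encoding="utf-8"))
    return buffer.getvalue()


if __name__ == "__main__":
    print(extract_paragraphs(build_demo_package(["Hello", "World"])))
```

A real reader also has to resolve styles, numbering, relationships, and embedded media, which this sketch deliberately leaves out.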

## Installation

From PyPI:

```bash
pip install aspose-words-foss
```

Nightly (latest from GitHub):

```bash
pip install git+https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python.git
```

## Quick Start

### Convert a document to Markdown

```python
import aspose.words_foss as aw

doc = aw.Document("input.docx")  # or .doc, .rtf, .txt, .md
doc.save("output.md", aw.SaveFormat.MARKDOWN)
```
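
The same two calls can be applied to a whole folder. The following is a sketch under stated assumptions: `is_supported_input`, `markdown_target`, and `convert_all` are hypothetical helper names, and the only library calls used are `aw.Document(...)` and `doc.save(..., aw.SaveFormat.MARKDOWN)` as shown above. The library import sits inside `convert_all` so the path helpers can be used without the package installed.

```python
from pathlib import Path

# Input extensions listed in this README.
INPUT_SUFFIXES = {".docx", ".doc", ".rtf", ".txt", ".md"}


def is_supported_input(path: Path) -> bool:
    """True if the file extension is one of the documented input formats."""
    return path.suffix.lower() in INPUT_SUFFIXES


def markdown_target(source: Path, target_dir: Path) -> Path:
    """Path of the .md file that `source` would be converted to."""
    return target_dir / (source.stem + ".md")


def convert_all(source_dir: Path, target_dir: Path) -> None:
    """Convert every supported file directly inside source_dir to Markdown."""
    import aspose.words_foss as aw  # deferred: only needed for the conversion itself

    target_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(source_dir.iterdir()):
        if source.is_file() and is_supported_input(source):
            doc = aw.Document(str(source))
            doc.save(str(markdown_target(source, target_dir)), aw.SaveFormat.MARKDOWN)
```

Two inputs that share a stem (for example `report.doc` and `report.docx`) map to the same output file, so the later one overwrites the earlier one. Use a separate `target_dir` if the source folder already contains `.md` files.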

### Export to PDF

```python
import aspose.words_foss as aw

doc = aw.Document("input.docx")
doc.save("output.pdf", aw.SaveFormat.PDF)
```

### Export to DOCX

```python
import aspose.words_foss as aw

doc = aw.Document("input.docx")  # or .doc, .rtf
doc.save("output.docx", aw.SaveFormat.DOCX)
```
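
When the output format should follow the output file name, a small lookup can select the `SaveFormat` member. This is a sketch rather than a library feature: `save_format_name` and `convert` are hypothetical helpers, and the table covers only the three members that appear in this README (`MARKDOWN`, `PDF`, `DOCX`). The member name for plain-text output is not shown here, so it is left out rather than guessed.

```python
from pathlib import Path

# Output extension -> SaveFormat member name (only members shown in this README).
SAVE_FORMAT_NAMES = {".md": "MARKDOWN", ".pdf": "PDF", ".docx": "DOCX"}


def save_format_name(output_path: str) -> str:
    """Return the SaveFormat member name matching the output file extension."""
    suffix = Path(output_path).suffix.lower()
    if suffix not in SAVE_FORMAT_NAMES:
        raise ValueError(f"No known SaveFormat for extension {suffix!r}")
    return SAVE_FORMAT_NAMES[suffix]


def convert(input_path: str, output_path: str) -> None:
    """Load input_path and save it in the format implied by output_path."""
    import aspose.words_foss as aw  # deferred so the lookup works without the package

    doc = aw.Document(input_path)
    doc.save(output_path, getattr(aw.SaveFormat, save_format_name(output_path)))
```

For example, `convert("input.rtf", "output.pdf")` would resolve to `aw.SaveFormat.PDF`, while an unlisted extension raises `ValueError` before any document is loaded.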

### Extract plain text

```python
import aspose.words_foss as aw

doc = aw.Document("input.docx")
text = doc.get_text()
```

### Save with options

```python
import aspose.words_foss as aw
from aspose.words_foss.saving import (
    MarkdownSaveOptions,
    OoxmlSaveOptions,
    PdfSaveOptions,
    CompressionLevel,
)

doc = aw.Document("input.docx")

# Markdown: underline, encoding, paragraph break
md_opts = MarkdownSaveOptions()
md_opts.export_underline_formatting = True
md_opts.encoding = "utf-8-sig"        # write a UTF-8 BOM
md_opts.paragraph_break = "\r\n"      # CRLF between paragraphs
doc.save("output.md", md_opts)

# DOCX: compression level
ooxml_opts = OoxmlSaveOptions()
ooxml_opts.compression_level = CompressionLevel.MAXIMUM
doc.save("output.docx", ooxml_opts)

# PDF: default options
pdf_opts = PdfSaveOptions()
doc.save("output.pdf", pdf_opts)
```

## Requirements

- Python 3.10, 3.11, or 3.12 (the package metadata requires `>=3.10,<3.13`)
- olefile >= 0.46
- fpdf2 >= 2.7.0
- pydantic >= 2.0.0
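
The package metadata declares `Requires-Python: <3.13,>=3.10`, so `pip` refuses to install on other interpreters. A script that wants to fail early with a clearer message could run a check like the sketch below; `is_supported_python` is a hypothetical helper, not part of the library.

```python
import sys

MIN_VERSION = (3, 10)  # inclusive lower bound from Requires-Python
MAX_VERSION = (3, 13)  # exclusive upper bound from Requires-Python


def is_supported_python(version=None) -> bool:
    """True if the (major, minor) version falls inside the declared range."""
    if version is None:
        version = sys.version_info
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION


if __name__ == "__main__":
    if not is_supported_python():
        raise SystemExit("aspose-words-foss requires Python >=3.10,<3.13")
```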

## API Examples

Runnable examples demonstrating the `aspose.words_foss` API are in the `ApiExamples` folder.

### Files

| File | What it shows |
|------|---------------|
| `convert_document.py` | Every input format (DOCX, DOC, RTF, TXT, MD) to every output format (Markdown, PDF, TXT) |
| `working_with_markdown_save_options.py` | `MarkdownSaveOptions` — `export_underline_formatting`, `encoding`, `paragraph_break` |
| `working_with_ooxml_save_options.py` | `OoxmlSaveOptions` for DOCX export — `pretty_format`, `compression_level` |
| `working_with_pdf_save_options.py` | PDF export from all input formats |
| `working_with_txt_save_options.py` | Plain-text export and `get_text()` |
| `working_with_images.py` | Image-containing documents to all output formats |

### Running

```bash
# Individual scripts
python ApiExamples/convert_document.py

# All via pytest
python -m pytest ApiExamples/ -v --rootdir=ApiExamples -c ApiExamples/pytest.ini
```

### Input / Output

- **Input**: `tests/data/input/` (shared test fixtures)
- **Output**: `ApiExamples/output/` (git-ignored)

## License

This project is licensed under the MIT License. See the [LICENSE](License/license.txt) file for details.

## Support

- **Issues**: [GitHub Issues](https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python/issues)