Metadata-Version: 2.4
Name: textnormx
Version: 0.2.0
Summary: Normalize/clean text from PDF OCR/extraction (PUA bullets, quotes, dashes, NBSP, control chars)
Project-URL: Homepage, https://example.com
Author-email: You <you@example.com>
License: MIT
Requires-Python: >=3.8
Provides-Extra: html
Requires-Dist: beautifulsoup4>=4.9; extra == 'html'
Description-Content-Type: text/markdown

# textnormx

Cleans text extracted from PDFs or OCR output: normalizes Private Use Area (PUA) bullets
(`\uf0b7`), non-breaking spaces (NBSP), quotation marks, dashes, control characters,
summary lines, and similar artifacts.
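
To make the kinds of cleanup listed above concrete, here is a minimal standard-library sketch. It does not use or reproduce the textnormx API: the function name `normalize_text`, the `REPLACEMENTS` table, and the choice of replacement characters are all assumptions made for illustration, and the package's actual behavior may differ.

```python
import re
import unicodedata

# Hypothetical mapping (not taken from textnormx): characters that often
# appear in PDF/OCR output, mapped to plain equivalents.
REPLACEMENTS = {
    "\uf0b7": "\u2022",  # PUA bullet (Symbol font) -> standard bullet
    "\u00a0": " ",       # non-breaking space -> regular space
    "\u2018": "'",       # left single quotation mark
    "\u2019": "'",       # right single quotation mark
    "\u201c": '"',       # left double quotation mark
    "\u201d": '"',       # right double quotation mark
    "\u2013": "-",       # en dash
    "\u2014": "-",       # em dash
}
_TABLE = str.maketrans(REPLACEMENTS)


def normalize_text(text: str) -> str:
    """Illustrative cleanup pass; an assumption, not the library's function."""
    # 1. Replace single characters via the translation table.
    text = text.translate(_TABLE)
    # 2. Drop control characters (Unicode category Cc), keeping newline and tab.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # 3. Collapse runs of spaces and tabs into a single space.
    return re.sub(r"[ \t]+", " ", text)


raw = "\uf0b7\u00a0\u201cHello\u201d \u2013 world\x00"
print(normalize_text(raw))
```

A translation table handles the one-to-one character swaps in a single pass, and the category check removes stray control bytes without listing each one. Note that this sketch also drops carriage returns, since `\r` falls in category Cc.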

## Install

    pip install textnormx
