Metadata-Version: 2.4
Name: rigour
Version: 2.1.1
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Rust
Requires-Dist: pyyaml>=5.0.0,<7.0.0
Requires-Dist: normality>=3.0.1,<4.0.0
Requires-Dist: orjson>=3.0.0,<4.0.0
Requires-Dist: python-stdnum>=2.0,<3.0.0
Requires-Dist: jinja2>=3.1.0,<4.0.0
Requires-Dist: pip>=10.0.0 ; extra == 'dev'
Requires-Dist: bump2version ; extra == 'dev'
Requires-Dist: wheel>=0.29.0 ; extra == 'dev'
Requires-Dist: ruamel-yaml ; extra == 'dev'
Requires-Dist: black ; extra == 'dev'
Requires-Dist: build ; extra == 'dev'
Requires-Dist: twine ; extra == 'dev'
Requires-Dist: mypy ; extra == 'dev'
Requires-Dist: pytest ; extra == 'dev'
Requires-Dist: pytest-cov ; extra == 'dev'
Requires-Dist: types-pyyaml ; extra == 'dev'
Requires-Dist: types-setuptools ; extra == 'dev'
Requires-Dist: coverage>=4.1 ; extra == 'dev'
Requires-Dist: pillow ; extra == 'docs'
Requires-Dist: cairosvg ; extra == 'docs'
Requires-Dist: mkdocs ; extra == 'docs'
Requires-Dist: mkdocstrings[python] ; extra == 'docs'
Requires-Dist: mkdocs-material ; extra == 'docs'
Provides-Extra: dev
Provides-Extra: docs
License-File: LICENSE
Summary: Financial crime domain data validation and normalization library.
Author-email: OpenSanctions <info@opensanctions.org>
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Documentation, https://rigour.followthemoney.tech/
Project-URL: Issues, https://github.com/opensanctions/rigour/issues
Project-URL: Repository, https://github.com/opensanctions/rigour.git

# rigour

Data cleaning and validation functions for processing various types of text emanating from and describing the business world. This applies to human and company names, language, territory and country codes, corporate and tax identifiers, etc.

The underlying idea is that handling these sorts of descriptors looks easy at first glance, but reveals a dizzying amount of complexity when carried into production. This is why `rigour` consolidates implementations that have already met some edge cases and are well-tested.
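To make this kind of edge case concrete, here is a minimal sketch that uses only the Python standard library. It does not call `rigour` and is not how `rigour` implements name handling; the helper names `naive_equal` and `normalized_equal` are hypothetical and chosen for illustration. It shows two renderings of the same company name that look identical on screen but compare as different strings until they are Unicode-normalized:

```python
import unicodedata

# Two renderings of the same name: precomposed "é" (U+00E9) versus
# a plain "e" followed by a combining acute accent (U+0301).
composed = "Soci\u00e9t\u00e9 G\u00e9n\u00e9rale"
decomposed = "Socie\u0301te\u0301 Ge\u0301ne\u0301rale"


def naive_equal(a: str, b: str) -> bool:
    """Compare two strings code point by code point."""
    return a == b


def normalized_equal(a: str, b: str) -> bool:
    """Compare after NFC normalization and case folding (illustrative only)."""

    def norm(text: str) -> str:
        return unicodedata.normalize("NFC", text).casefold()

    return norm(a) == norm(b)


if __name__ == "__main__":
    print(naive_equal(composed, decomposed))
    print(normalized_equal(composed, decomposed))
```

Real-world names add further wrinkles (transliteration, legal form abbreviations, stray punctuation), which is the sort of accumulated detail a shared, tested library is meant to absorb.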

## Installing `rigour`

You can grab the latest release from PyPI:

```bash
pip install -U rigour
```
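To confirm that the install succeeded, one option is to look up the installed distribution through the standard library's `importlib.metadata`. This is a small sketch; the helper name `installed_version` is hypothetical and not part of `rigour`:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "rigour") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("rigour is not installed in this environment")
    else:
        print(f"rigour {found} is installed")
```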

## Usage & documentation

See: https://rigour.followthemoney.tech/

## Acknowledgements

The address formatting database contained in `rigour/data/addresses/formats.yml` is derived from `worldwide.yml` in the [OpenCageData address-formatting repository](https://github.com/OpenCageData/address-formatting). It is used to format addresses according to the customs of the country being encoded.

`rigour` consolidates a set of older Python libraries into a single codebase: `languagecodes`, `pantomime`, `fingerprints`. The development of these libraries was funded by OCCRP as part of the Aleph software project.

## License

MIT. See `LICENSE`.
