Metadata-Version: 2.4
Name: juditha
Version: 4.4.0
Summary: A super-fast canonical name lookup service
License: AGPLv3+
License-File: LICENSE
License-File: NOTICE
Author: Simon Wörpel
Author-email: simon.woerpel@pm.me
Requires-Python: >=3.11,<3.14
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: ahocorasick-rs (>=1.0.3,<2.0.0)
Requires-Dist: anystore (>=1.2.0,<2.0.0)
Requires-Dist: followthemoney (>=4.8.2,<5.0.0)
Requires-Dist: ftmq[level] (>=4.8.2,<5.0.0)
Requires-Dist: jellyfish (>=1.2.1,<2.0.0)
Requires-Dist: rapidfuzz (>=3.9.0,<4.0.0)
Requires-Dist: rigour (>=2.1.1,<3.0.0)
Requires-Dist: tantivy (>=0.25.1,<0.26.0)
Description-Content-Type: text/markdown

[![juditha on pypi](https://img.shields.io/pypi/v/juditha)](https://pypi.org/project/juditha/)
[![PyPI Downloads](https://static.pepy.tech/badge/juditha/month)](https://pepy.tech/projects/juditha)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/juditha)](https://pypi.org/project/juditha/)
[![Python test and package](https://github.com/dataresearchcenter/juditha/actions/workflows/python.yml/badge.svg)](https://github.com/dataresearchcenter/juditha/actions/workflows/python.yml)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
[![Coverage Status](https://coveralls.io/repos/github/dataresearchcenter/juditha/badge.svg?branch=main)](https://coveralls.io/github/dataresearchcenter/juditha?branch=main)
[![AGPLv3+ License](https://img.shields.io/pypi/l/juditha)](./LICENSE)
[![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)](https://pydantic.dev)

# Juditha

A super-fast in-process lookup service for canonical names, backed by [tantivy](https://github.com/quickwit-oss/tantivy).

`juditha` exists to tame the noise that follows from [Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition): given a huge list of *known names* (company registries, persons of interest, sanctions lists), it tells you whether a span produced by your NER pipeline corresponds to one of them, even when the casing, accents, token order, or spelling differs.

The implementation uses a pre-populated names database and index. The source data is either a set of [FollowTheMoney](https://followthemoney.tech) entities or simply a list of names.

## Documentation

https://docs.investigraph.dev/lib/juditha

## The name

**Juditha Dommer** was the daughter of a coppersmith and raised seven children, while her husband Johann Pachelbel wrote a *canon*.

## Versioning

To signal compatibility with [followthemoney](https://followthemoney.tech), `juditha` follows the same major version, which is currently 4.x.x.

## License and copyright

`juditha`, (C) 2024 investigativedata.io. (C) 2025, 2026 [Data and Research Center – DARC](https://dataresearchcenter.org). Licensed under AGPLv3 or later. See [NOTICE](https://github.com/dataresearchcenter/juditha/blob/main/NOTICE) and [LICENSE](https://github.com/dataresearchcenter/juditha/blob/main/LICENSE).

