Metadata-Version: 2.4
Name: license-normalizer
Version: 0.4
Summary: Comprehensive licence normalisation with a three-level hierarchy.
Author-email: Artur Barseghyan <artur.barseghyan@gmail.com>
Maintainer-email: Artur Barseghyan <artur.barseghyan@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/barseghyanartur/licence-normaliser/
Project-URL: Repository, https://github.com/barseghyanartur/licence-normaliser/
Project-URL: Issues, https://github.com/barseghyanartur/licence-normaliser/issues
Keywords: license,normalisation,spdx,creative commons,open source
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 3.15
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/x-rst
License-File: LICENSE
Provides-Extra: all
Requires-Dist: licence-normaliser[build,dev,docs,test]; extra == "all"
Provides-Extra: dev
Requires-Dist: detect-secrets; extra == "dev"
Requires-Dist: doc8; extra == "dev"
Requires-Dist: ipython; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: uv; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: pytest-codeblock; extra == "test"
Provides-Extra: docs
Requires-Dist: sphinx; extra == "docs"
Requires-Dist: sphinx-autobuild; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.3.0; extra == "docs"
Requires-Dist: sphinx-no-pragma; extra == "docs"
Requires-Dist: sphinx-markdown-builder; extra == "docs"
Requires-Dist: sphinx-llms-txt-link; extra == "docs"
Requires-Dist: sphinx-source-tree; extra == "docs"
Provides-Extra: build
Requires-Dist: build; extra == "build"
Requires-Dist: twine; extra == "build"
Requires-Dist: wheel; extra == "build"
Dynamic: license-file

==================
licence-normaliser
==================

.. image:: https://raw.githubusercontent.com/barseghyanartur/licence-normaliser/main/docs/_static/licence_normaliser_logo.webp
   :alt: licence-normaliser logo
   :align: center

Comprehensive licence normalisation with a three-level hierarchy.

.. image:: https://img.shields.io/pypi/v/licence-normaliser.svg
   :target: https://pypi.python.org/pypi/licence-normaliser
   :alt: PyPI Version

.. image:: https://img.shields.io/pypi/pyversions/licence-normaliser.svg
   :target: https://pypi.python.org/pypi/licence-normaliser/
   :alt: Supported Python versions

.. image:: https://github.com/barseghyanartur/licence-normaliser/actions/workflows/test.yml/badge.svg?branch=main
   :target: https://github.com/barseghyanartur/licence-normaliser/actions
   :alt: Build Status

.. image:: https://readthedocs.org/projects/licence-normaliser/badge/?version=latest
    :target: http://licence-normaliser.readthedocs.io
    :alt: Documentation Status

.. image:: https://img.shields.io/badge/docs-llms.txt-blue
    :target: https://licence-normaliser.readthedocs.io/en/latest/llms.txt
    :alt: llms.txt - documentation for LLMs

.. image:: https://img.shields.io/badge/license-MIT-blue.svg
   :target: https://github.com/barseghyanartur/licence-normaliser/#Licence
   :alt: MIT

.. image:: https://coveralls.io/repos/github/barseghyanartur/licence-normaliser/badge.svg?branch=main&service=github
    :target: https://coveralls.io/github/barseghyanartur/licence-normaliser?branch=main
    :alt: Coverage

``licence-normaliser`` is a comprehensive licence normalisation library that
maps any licence representation (SPDX tokens, URLs, prose descriptions) to a
canonical three-level hierarchy.

Features
========

- **Three-level hierarchy** - LicenceFamily → LicenceName → LicenceVersion.
- **Wide format support** - SPDX tokens, URLs, prose descriptions.
- **Creative Commons support** - Full CC family with versions and IGO variants.
- **Publisher-specific licences** - Springer, Nature, Elsevier, Wiley, ACS,
  and more.
- **File-driven data** - Add aliases, URLs, and patterns by editing JSON files.
  No Python code changes required for new synonyms.
- **Pluggable parsers** - Drop in a new parser class to ingest
  any external licence registry. Parsers implement plugin interfaces
  (``RegistryPlugin``, ``URLPlugin``, etc.).
- **Strict mode** - Raise ``LicenceNotFoundError`` instead of silently
  returning ``"unknown"``.
- **Caching** - LRU caching for performance.
- **CLI** - Command-line interface with ``--strict`` and ``--explain`` support.

Hierarchy
=========

The library uses a three-level hierarchy:

1. **LicenceFamily** - broad bucket: ``"cc"``, ``"osi"``, ``"copyleft"``,
   ``"publisher-tdm"``, ...
2. **LicenceName** - version-free: ``"cc-by"``, ``"cc-by-nc-nd"``, ``"mit"``,
   ``"wiley-tdm"``
3. **LicenceVersion** - fully resolved: ``"cc-by-3.0"``, ``"cc-by-nc-nd-4.0"``

Installation
============

With ``uv``:

.. code-block:: sh

    uv pip install licence-normaliser

Or with ``pip``:

.. code-block:: sh

    pip install licence-normaliser

Quick start
===========

.. code-block:: python
    :name: test_quick_start

    from licence_normaliser import normalise_licence

    v = normalise_licence("CC BY-NC-ND 4.0")
    str(v)                  # "cc-by-nc-nd-4.0"   ← LicenceVersion
    str(v.licence)          # "cc-by-nc-nd"       ← LicenceName
    str(v.licence.family)   # "cc"                ← LicenceFamily

Strict mode
===========

By default, unresolvable inputs return an ``"unknown"`` result.  Pass
``strict=True`` to raise ``LicenceNotFoundError`` instead:

.. code-block:: python
    :name: test_strict_mode

    from licence_normaliser import normalise_licence
    from licence_normaliser.exceptions import LicenceNotFoundError

    # Silent fallback (default)
    v = normalise_licence("some-unknown-string")
    v.family.key  # "unknown"

    # Strict: raises on unresolvable input
    try:
        v = normalise_licence("some-unknown-string", strict=True)
    except LicenceNotFoundError as exc:
        print(exc.raw)      # original input
        print(exc.cleaned)  # cleaned form that failed lookup

Trace / Explain
===============

Set ``ENABLE_LICENCE_NORMALISER_TRACE=1`` or pass ``trace=True`` to get
resolution traces showing how the licence was matched:

.. code-block:: python
    :name: test_trace

    from licence_normaliser import normalise_licence

    # Via function
    v = normalise_licence("cc by-nc-nd 3.0 igo", trace=True)
    print(v.explain())

    # Via class
    from licence_normaliser import LicenceNormaliser
    ln = LicenceNormaliser(trace=True)
    v = ln.normalise_licence("MIT")
    print(v.explain())

The output shows the resolution pipeline (alias → registry → url → prose →
fallback) and which source file and line matched:

.. code-block:: text

    Input: 'cc by-nc-nd 3.0 igo' → 'cc by-nc-nd 3.0 igo'
      [✓] alias: 'cc by-nc-nd 3.0 igo' → 'cc-by-nc-nd-3.0-igo' (line 139 in aliases.json)

    Result:
      version_key: 'cc-by-nc-nd-3.0-igo'
      name_key: 'cc-by-nc-nd'
      family_key: 'cc'

The trace can also be accessed via ``v._trace`` for programmatic use.
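
To make the idea of a traced pipeline concrete, here is a small
standard-library-only sketch. It is an assumption-laden illustration rather
than the library's code: the ``ALIASES`` and ``REGISTRY`` tables, the
``resolve`` function, the cleaning rule, and the trace line format are all
hypothetical, and only two of the pipeline steps are modelled.

.. code-block:: python

    from typing import Callable, Optional

    # Hypothetical lookup tables, standing in for the package's data files.
    ALIASES = {"cc by 4.0": "cc-by-4.0"}
    REGISTRY = {"mit": "mit"}

    Step = Callable[[str], Optional[str]]

    # Steps are tried in order; the first one returning a key wins.
    STEPS: list[tuple[str, Step]] = [
        ("alias", ALIASES.get),
        ("registry", REGISTRY.get),
    ]


    def resolve(raw: str) -> tuple[str, list[str]]:
        """Return (key, trace), falling back to "unknown" if nothing matches."""
        # Assumed cleaning rule: lower-case and collapse whitespace.
        cleaned = " ".join(raw.lower().split())
        trace = [f"Input: {raw!r} -> {cleaned!r}"]
        for step_name, lookup in STEPS:
            key = lookup(cleaned)
            if key is not None:
                trace.append(f"[ok] {step_name}: {cleaned!r} -> {key!r}")
                return key, trace
            trace.append(f"[--] {step_name}: no match")
        trace.append("[ok] fallback: 'unknown'")
        return "unknown", trace


    key, trace = resolve("MIT")
    print("\n".join(trace))

Recording one trace line per attempted step, including the misses, is what
lets a reader see why a later step matched instead of an earlier one.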

Batch normalisation
===================

.. code-block:: python
    :name: test_batch_normalisation

    from licence_normaliser import normalise_licences

    results = normalise_licences(["MIT", "Apache-2.0", "CC BY 4.0"])
    for r in results:
        print(r.key)

    # Strict batch - raises on first unresolvable
    results = normalise_licences(["MIT", "Apache-2.0"], strict=True)

Custom plugins
==============

The ``LicenceNormaliser`` class lets you inject custom plugin classes for
specialised use cases:

.. code-block:: python
    :name: test_custom_plugins

    from licence_normaliser import LicenceNormaliser
    from licence_normaliser.parsers.alias import AliasParser
    from licence_normaliser.parsers.spdx import SPDXParser

    # Use only SPDX + Alias plugins (no CC, no publisher URLs)
    ln = LicenceNormaliser(
        registry=[SPDXParser],
        alias=[AliasParser],
        family=[AliasParser],
        name=[AliasParser],
        cache=True,
        cache_maxsize=8192,
    )

    # MIT resolves via SPDX parser
    assert str(ln.normalise_licence("MIT")) == "mit"

    # CC BY-NC-ND 4.0 resolves via Alias
    assert str(ln.normalise_licence("CC BY-NC-ND 4.0")) == "cc-by-nc-nd-4.0"

.. note::

    Passing plugins explicitly is optional: ``LicenceNormaliser()``
    loads the defaults automatically. Use the pattern above only if you need
    custom plugins or want to reduce the number of plugins loaded.

For caching, ``LicenceNormaliser`` wraps the resolution method with
``lru_cache``. Disable it by passing ``cache=False`` for debugging:

.. code-block:: python
    :name: test_caching

    from licence_normaliser import LicenceNormaliser

    ln = LicenceNormaliser(cache=False)
    result = ln.normalise_licence("MIT")
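
As a rough illustration of what a ``cache`` switch of this kind does, the
sketch below wraps a method with ``functools.lru_cache`` only when caching is
requested. The ``Resolver`` class, its ``calls`` counter, and the trivial
lookup are hypothetical; how ``LicenceNormaliser`` wires its cache internally
is not shown here.

.. code-block:: python

    from functools import lru_cache


    class Resolver:
        """Hypothetical stand-in for a normaliser with optional caching."""

        def __init__(self, cache: bool = True, cache_maxsize: int = 8192) -> None:
            self.calls = 0  # counts how often the real lookup actually runs
            if cache:
                # Wrap the bound method so each instance has its own cache.
                self.resolve = lru_cache(maxsize=cache_maxsize)(self._resolve)
            else:
                self.resolve = self._resolve

        def _resolve(self, raw: str) -> str:
            self.calls += 1
            return raw.strip().lower()


    cached = Resolver(cache=True)
    cached.resolve("MIT")
    cached.resolve("MIT")
    assert cached.calls == 1  # the second call is served from the cache

    uncached = Resolver(cache=False)
    uncached.resolve("MIT")
    uncached.resolve("MIT")
    assert uncached.calls == 2  # every call runs the lookup again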

Update data (CLI)
=================

.. code-block:: sh

    licence-normaliser update-data --force
    # Fetches fresh SPDX, OpenDefinition, OSI, CreativeCommons, and ScanCode JSONs

Integration tests (public API only)
===================================

All integration tests live in
``src/licence_normaliser/tests/test_integration.py``
and only import the public API.

CLI usage
=========

Normalise a single licence:

.. code-block:: sh

    licence-normaliser normalise "MIT"
    # Output: mit

    licence-normaliser normalise --full "CC BY 4.0"
    # Output:
    # Key: cc-by-4.0
    # URL: https://creativecommons.org/licenses/by/4.0/
    # Licence: cc-by
    # Family: cc

    licence-normaliser normalise --strict "totally-unknown"
    # Exits with code 1 and prints an error

Batch normalise:

.. code-block:: sh

    licence-normaliser batch MIT "Apache-2.0" "CC BY 4.0"
    licence-normaliser batch --strict MIT "Apache-2.0"

Exceptions
==========

.. code-block:: python
    :name: test_exceptions

    from licence_normaliser.exceptions import (
        LicenceNormaliserError,   # base class
        LicenceNotFoundError,     # raised by strict mode
    )
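
Because ``LicenceNotFoundError`` derives from the base class, catching the
base class also catches strict-mode failures. The sketch below shows that
relationship with local stand-in classes, so it runs without the package
installed. ``NormaliserError``, ``NotFoundError``, ``KNOWN`` and ``normalise``
are hypothetical names, and the cleaning rule is an assumption; only the
``raw`` and ``cleaned`` attributes mirror what the strict-mode example prints.

.. code-block:: python

    class NormaliserError(Exception):
        """Stand-in for the package's base exception."""


    class NotFoundError(NormaliserError):
        """Stand-in for the strict-mode exception; keeps both input forms."""

        def __init__(self, raw: str, cleaned: str) -> None:
            super().__init__(f"Licence not found: {raw!r} (cleaned: {cleaned!r})")
            self.raw = raw          # original input
            self.cleaned = cleaned  # cleaned form that failed lookup


    KNOWN = {"mit": "mit"}  # hypothetical lookup table


    def normalise(raw: str, strict: bool = False) -> str:
        """Return a key, "unknown", or raise when strict=True."""
        cleaned = raw.strip().lower()  # assumed cleaning rule
        if cleaned in KNOWN:
            return KNOWN[cleaned]
        if strict:
            raise NotFoundError(raw, cleaned)
        return "unknown"


    try:
        normalise(" Totally-Unknown ", strict=True)
    except NormaliserError as exc:  # the base class also catches the subclass
        caught = exc

    assert isinstance(caught, NotFoundError)
    assert caught.cleaned == "totally-unknown"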

Testing
=======

All tests run inside Docker:

.. code-block:: sh

    make test

To test a specific Python version:

.. code-block:: sh

    make test-env ENV=py312

Licence
=======

MIT

Author
======

Artur Barseghyan <artur.barseghyan@gmail.com>
