Metadata-Version: 2.4
Name: lyndon-words
Version: 0.2.0
Summary: Lyndon words, Duval factorization, necklace and bracelet enumeration, and De Bruijn sequences in pure Python.
Project-URL: Homepage, https://github.com/amaar-mc/lyndon-words
Project-URL: Repository, https://github.com/amaar-mc/lyndon-words
Project-URL: Issues, https://github.com/amaar-mc/lyndon-words/issues
Project-URL: Changelog, https://github.com/amaar-mc/lyndon-words/blob/main/CHANGELOG.md
Author: Amaar Chughtai
License: MIT License
        
        Copyright (c) 2026 Amaar Chughtai
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: combinatorics,combinatorics-on-words,de-bruijn,lyndon-words,necklaces,strings
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Typing :: Typed
Requires-Python: >=3.10
Provides-Extra: dev
Requires-Dist: hypothesis>=6; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Description-Content-Type: text/markdown

# lyndon-words

<p align="center">
  <img src="assets/logo.png" alt="lyndon-words logo" width="160">
</p>

[![CI](https://github.com/amaar-mc/lyndon-words/actions/workflows/ci.yml/badge.svg)](https://github.com/amaar-mc/lyndon-words/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](./LICENSE)

Combinatorics on words in pure Python with zero dependencies: Lyndon-word testing, Duval's
Chen-Fox-Lyndon factorization, necklace and bracelet enumeration, and De Bruijn sequences.

## What is this?

A **Lyndon word** is a non-empty word that is strictly smaller than all of its rotations.
Lyndon words are the building blocks of combinatorics on words: every word factors uniquely
into a non-increasing concatenation of Lyndon words (the Chen-Fox-Lyndon theorem), and
Lyndon words index the necklaces (rotation classes) and the De Bruijn sequences over an
alphabet.

This library implements the classic algorithms directly, with no dependencies and strict
typing.

## Install

```sh
pip install lyndon-words
```

> PyPI release pending. Install from source:
> ```sh
> git clone https://github.com/amaar-mc/lyndon-words
> cd lyndon-words
> pip install -e .
> ```

## Quick start

```python
from lyndon_words import is_lyndon, factorize, enumerate_necklaces, enumerate_bracelets, de_bruijn, lyndon_array

# Lyndon-word test (accepts str, list, or tuple)
is_lyndon("aab")     # True
is_lyndon("aba")     # False
is_lyndon((0, 0, 1)) # True

# Duval's Chen-Fox-Lyndon factorization (non-increasing Lyndon factors)
factorize("banana")  # ['b', 'an', 'an', 'a']
factorize("aababab") # ['aabab', 'ab']

# Necklaces: rotation-class representatives over {0, 1} of length 3
list(enumerate_necklaces(alphabet_size=2, length=3))
# [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]

# Bracelets: rotation + reflection classes over {0, 1} of length 4
list(enumerate_bracelets(alphabet_size=2, length=4))
# [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]

# De Bruijn sequence B(2, 3): every length-3 binary word appears once cyclically
de_bruijn(alphabet_size=2, length=3)
# (0, 0, 0, 1, 0, 1, 1, 1)

# Lyndon array: longest Lyndon prefix length at each position
lyndon_array("banana")   # [1, 2, 1, 2, 1, 1]
lyndon_array("abcd")     # [4, 3, 2, 1]   (entire increasing suffix is Lyndon)
lyndon_array("aaaa")     # [1, 1, 1, 1]   (all equal, no run > 1 is Lyndon)
lyndon_array("0010011")  # [7, 2, 1, 4, 3, 1, 1]
```

## Word representation

Words over an integer alphabet are represented as `tuple[int, ...]`, for example
`(0, 0, 1)` for a 3-symbol word over `{0, 1}`. This is the canonical type, and every
enumeration and generation function (`enumerate_necklaces`, `enumerate_bracelets`,
`de_bruijn`) yields or returns tuples of ints.

The word predicates `is_lyndon` and `factorize` are generic over any comparable
`Sequence`, so `str`, `list`, and `tuple` all work. The factors returned by `factorize`
have the same type as the input: pass a `str` and you get a list of `str` factors, pass a
`tuple` and you get a list of `tuple` factors. String examples like `"aab"` behave exactly
like integer tuples because string comparison is lexicographic.

## API

Enumeration and generation functions take keyword-only `alphabet_size` and `length`.

| Function | Description |
|---|---|
| `is_lyndon(word)` | True iff `word` is non-empty and strictly smaller than all proper rotations |
| `factorize(word)` | Duval's O(n) Chen-Fox-Lyndon factorization into non-increasing Lyndon factors |
| `lyndon_array(word)` | For each index `i`, the length of the longest Lyndon prefix of `word[i:]` |
| `enumerate_necklaces(*, alphabet_size, length)` | FKM enumeration of rotation classes, lexicographic order |
| `enumerate_bracelets(*, alphabet_size, length)` | Enumeration of rotation + reflection classes, lexicographic order |
| `de_bruijn(*, alphabet_size, length)` | De Bruijn sequence B(alphabet_size, length) via FKM |

`alphabet_size >= 1` and `length >= 1` are required; otherwise a `ValueError` is raised.

## Definitions

**Lyndon word.** A non-empty word strictly smaller, in lexicographic order, than every one
of its proper rotations (equivalently, every proper suffix). Lyndon words are aperiodic.

**Chen-Fox-Lyndon factorization.** Every non-empty word factors uniquely into Lyndon words
`l_1 >= l_2 >= ... >= l_m`. Duval's algorithm computes this in O(n) time and O(1) extra
space.

**Necklace.** A rotation-equivalence class of length-`n` words. The representative is the
lexicographically least rotation. The count is
`(1/n) * sum over d dividing n of phi(d) * k^(n/d)`.

**Bracelet.** An equivalence class under rotation and reflection (the dihedral group). The
representative is the lexicographically least element of the orbit.

**Lyndon array.** For a word `s` of length `n`, the Lyndon array is the integer array
`L` where `L[i]` is the length of the longest Lyndon word that is a prefix of `s[i:]`.
Every entry satisfies `1 <= L[i] <= n - i`. The first entry `L[0]` equals the length of
the first (leftmost) factor in the Chen-Fox-Lyndon factorization of `s`: that factor is
the unique longest Lyndon prefix of the entire word.

**De Bruijn sequence.** A cyclic sequence `B(k, n)` of length `k^n` in which every
length-`n` word appears exactly once as a contiguous cyclic subword. The FKM construction
concatenates, in lexicographic order, every Lyndon word whose length divides `n`.

## References

- Duval, J.-P. (1983). Factorizing words over an ordered alphabet. Journal of Algorithms.
- Chen, K.-T., Fox, R. H., Lyndon, R. C. (1958). Free differential calculus IV. Annals of Mathematics.
- Fredricksen, H., Maiorana, J. (1978). Necklaces of beads in k colors and k-ary de Bruijn sequences. Discrete Mathematics.
- Fredricksen, H., Kessler, I. J. (1986). An algorithm for generating necklaces of beads in two colors. Discrete Mathematics.
- Ruskey, F. (2003). Combinatorial Generation. University of Victoria.

## License

MIT. Copyright (c) 2026 Amaar Chughtai.
