Metadata-Version: 2.4
Name: tokeydokey
Version: 1.0.1
Summary: Create random identifiers using a fixed number of non-overlapping LLM tokens.
License: MIT License
        
        Copyright (c) 2025 Blixt
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: identifiers,llm,random,tiktoken,tokens
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: tiktoken>=0.5.0
Description-Content-Type: text/markdown

# tokeydokey

Create random identifiers using a fixed number of non-overlapping LLM tokens.

## Quick start

```bash
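# Install dependencies into the project environment
uv sync
# Run an inline script that prints two identifiers
uv run python - <<'PY'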
import tokeydokey

print(tokeydokey.generate())
print(tokeydokey.generate(n=5))
PY
```

## Development

```bash
uv sync --group dev
uv run pytest
```

## Regenerate pools

```bash
# Regenerate using the script's defaults
uv run python scripts/generate_pools.py
# Or pass the encoding and output path explicitly
uv run python scripts/generate_pools.py --encoding cl100k_base --out src/tokeydokey/_pools.py
```

## Example pool math (o200k_base, dot/dash union)

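The first token is drawn from a start pool of N = 3.89×10<sup>4</sup> alphanumeric tokens; each following token is drawn from a next pool of M = 6.38×10<sup>3</sup> tokens of the form ".word" or "-word". An identifier of k tokens therefore has N × M<sup>k-1</sup> combinations.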

| Tokens | Combinations                           | Tokens | Combinations                            | Tokens | Combinations                            |
| -----: | -------------------------------------- | -----: | --------------------------------------- | -----: | --------------------------------------- |
|      1 | 3.89×10<sup>4</sup> (~2<sup>15</sup>)  |      5 | 6.46×10<sup>19</sup> (~2<sup>66</sup>)  |      9 | 1.07×10<sup>35</sup> (~2<sup>116</sup>) |
|      2 | 2.49×10<sup>8</sup> (~2<sup>28</sup>)  |      6 | 4.12×10<sup>23</sup> (~2<sup>78</sup>)  |     10 | 6.83×10<sup>38</sup> (~2<sup>129</sup>) |
|      3 | 1.59×10<sup>12</sup> (~2<sup>41</sup>) |      7 | 2.63×10<sup>27</sup> (~2<sup>91</sup>)  |     11 | 4.36×10<sup>42</sup> (~2<sup>142</sup>) |
|      4 | 1.01×10<sup>16</sup> (~2<sup>53</sup>) |      8 | 1.68×10<sup>31</sup> (~2<sup>104</sup>) |     12 | 2.78×10<sup>46</sup> (~2<sup>154</sup>) |
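
The rows above follow from one draw from the start pool and k-1 draws from the next pool. A minimal sketch that reproduces them from the rounded pool sizes quoted above (the exact pool sizes live in the generated pools, so trailing digits may differ); `combinations` and `bits` are illustrative names, not part of the tokeydokey API:

```python
import math

# Rounded pool sizes from the text above; the real pools may differ slightly.
START_POOL = 3.89e4  # N: alphanumeric start tokens
NEXT_POOL = 6.38e3   # M: ".word" / "-word" continuation tokens


def combinations(tokens: int) -> float:
    """Count identifiers made of one start token plus (tokens - 1) next tokens."""
    if tokens < 1:
        raise ValueError("tokens must be >= 1")
    return START_POOL * NEXT_POOL ** (tokens - 1)


def bits(tokens: int) -> float:
    """Entropy in bits of a uniformly random identifier with `tokens` tokens."""
    return math.log2(combinations(tokens))


# Print one line per table row (1 to 12 tokens).
for k in range(1, 13):
    print(f"{k:>2} tokens: {combinations(k):.2e} combinations (~2^{bits(k):.0f})")
```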

Note: For ~128 bits of entropy, random base64 needs 22 characters (132 bits), which average ~15.2 tokens in o200k_base; the dot/dash union needs 10 tokens (~129 bits). That is roughly 50% more entropy per token than random base64 identifiers (about 12.9 versus 8.7 bits per token).
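
The comparison in this note can be checked with a few lines of arithmetic. This is a sketch under the assumption that the rounded pool sizes and the ~15.2-token base64 average quoted in this README hold; nothing is measured here, and `tokens_needed` and `bits_for` are hypothetical helper names:

```python
import math

# Rounded figures quoted in this README (assumptions, not measured here).
START_POOL = 3.89e4                # N
NEXT_POOL = 6.38e3                 # M
BASE64_BITS = 22 * 6               # 22 base64 characters carry 132 bits
BASE64_TOKENS_FOR_22_CHARS = 15.2  # average token count in o200k_base, per the note


def bits_for(tokens: int) -> float:
    """Entropy in bits of an identifier with `tokens` tokens."""
    return math.log2(START_POOL) + (tokens - 1) * math.log2(NEXT_POOL)


def tokens_needed(target_bits: float) -> int:
    """Smallest token count whose entropy reaches `target_bits`."""
    start_bits = math.log2(START_POOL)
    if target_bits <= start_bits:
        return 1
    return 1 + math.ceil((target_bits - start_bits) / math.log2(NEXT_POOL))


n = tokens_needed(128)
ours_rate = bits_for(n) / n
base64_rate = BASE64_BITS / BASE64_TOKENS_FOR_22_CHARS
print(f"{n} tokens for 128 bits")
print(f"{ours_rate:.1f} vs {base64_rate:.1f} bits per token "
      f"(ratio {ours_rate / base64_rate:.2f})")
```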

## Alternatives considered

- CamelTitle (Titlecase 2-12 chars): pool size 8,482, 100% compatible for concatenation.
- Word/(Word+Number) alternating: union pool size 9,482 (adds 0-999), 100% compatible.
- Dot-only: next pool 4,410, 100% compatible.
- Base62: around 8.6 bits per token in o200k_base; token count varies.
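
One way to read "100% compatible" in the list above is that concatenating tokens from the pools always re-tokenizes into the same number of tokens. That reading is an assumption, not a description of what `scripts/generate_pools.py` does. A minimal sketch of such a check, where `retokenizes_cleanly` is a hypothetical helper and `encode` is any callable that maps a string to a token sequence (for example `tiktoken.get_encoding("o200k_base").encode`):

```python
from collections.abc import Callable, Sequence


def retokenizes_cleanly(
    encode: Callable[[str], Sequence[object]],
    parts: Sequence[str],
) -> bool:
    """Return True if the joined parts encode to exactly one token per part.

    Assumes each part is a single token on its own; a False result means
    the tokenizer merged or split tokens across the part boundaries.
    """
    return len(encode("".join(parts))) == len(parts)


# Stand-in tokenizer for illustration only: it starts a new token at every
# "." or "-". A real check would pass a tiktoken encoding's encode method.
def toy_encode(text: str) -> list[str]:
    return text.replace(".", " .").replace("-", " -").split()


print(retokenizes_cleanly(toy_encode, ["alpha", ".beta", "-gamma"]))
print(retokenizes_cleanly(toy_encode, ["alpha", "beta"]))
```

With the stand-in tokenizer, the first call succeeds because each separator starts a new token, while the second fails because the two bare words merge into one.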

## License

MIT
