Metadata-Version: 2.4
Name: meridian-euid-spec
Version: 0.4.5
Summary: Specification, shared test vectors, and the Python reference implementation for Meridian Enterprise Unique Identifiers (EUIDs).
Author: Meridian contributors
License-Expression: CC-BY-4.0
Project-URL: Homepage, https://github.com/lsmc-bio/meridian-euid
Project-URL: Repository, https://github.com/lsmc-bio/meridian-euid
Project-URL: Documentation, https://github.com/lsmc-bio/meridian-euid/blob/main/README.md
Project-URL: Specification, https://github.com/lsmc-bio/meridian-euid/blob/main/SPEC.md
Project-URL: Issues, https://github.com/lsmc-bio/meridian-euid/issues
Keywords: euid,identifier,specification,test-vectors,crockford-base32,barcode,label
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE-SPEC
Dynamic: license-file

# meridian-euid

[![Python CI](https://github.com/lsmc-bio/meridian-euid/actions/workflows/python-ci.yml/badge.svg)](https://github.com/lsmc-bio/meridian-euid/actions/workflows/python-ci.yml)
[![Tag](https://img.shields.io/badge/tag-0.4.3-blue?style=flat-square)](https://github.com/lsmc-bio/meridian-euid/tags)
[![Spec](https://img.shields.io/badge/spec-0.4.0-blue?style=flat-square)](SPEC.md)
[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](LICENSE-SPEC)

Meridian defines a compact **Enterprise Unique Identifier (EUID)** format suitable for printed labels and barcodes.

For v0.4.0, Meridian is intentionally strict:

- **Canonical**: one shape only, `DOMAIN-PREFIX-BODYCHECKSUM`
- **Lookup-first**: the string identifies an object but does not explain it
- **Sequence-backed**: BODY encodes a positive integer bounded by PostgreSQL `BIGINT` (`1..9223372036854775807`)
- **Governed**: `(DOMAIN, PREFIX)` ownership is tracked in checked-in JSON registries
- **Python-first**: Python is the only supported reference implementation in this repo

## Quick Start

### Structure / Namespaces (Domains & Prefixes) / Encoding Compression

### Meridian BODY length example table for fixed label `Z-TU-<body><checksum>`

Assumptions:

- alphabet size = 32
- first BODY character cannot be `0`
- BODY encodes an integer in `1..9223372036854775807`
- checksum is always exactly 1 character
- full visible label length = `len(body) + 6`

Cap:

- max BODY integer: `9223372036854775807`
- max BODY encoding: `7ZZZZZZZZZZZZ`
- max BODY length: `13`
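The cap values above can be checked with a minimal sketch of the BODY encoding. It assumes the Crockford Base32 alphabet listed under EUID Format; `encode_body` is a hypothetical helper name for illustration, not necessarily the reference implementation's API.

```python
# Crockford Base32 alphabet used by Meridian (no I, L, O, U).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
MAX_BODY = 9223372036854775807  # PostgreSQL BIGINT upper bound, 2**63 - 1


def encode_body(value: int) -> str:
    """Encode a positive integer as Crockford Base32 with no leading zero.

    Hypothetical helper for illustration only.
    """
    if not 1 <= value <= MAX_BODY:
        raise ValueError(f"BODY integer out of range: {value}")
    chars = []
    while value:
        value, digit = divmod(value, 32)
        chars.append(ALPHABET[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


print(encode_body(123))            # 3V
print(encode_body(MAX_BODY))       # 7ZZZZZZZZZZZZ
print(len(encode_body(MAX_BODY)))  # 13
```

Because `value` is at least 1, the most significant digit is never `0`, which is why the first BODY character cannot be `0`.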

| BODY length | Exact valid BODY encodings at this length | Cumulative valid BODY encodings up to this length | Full label length `Z-TU-...` |
|---:|---:|---:|---:|
| 1  | 31 | 31 | 7 |
| 2  | 992 | 1,023 | 8 |
| 3  | 31,744 | 32,767 | 9 |
| 4  | 1,015,808 | 1,048,575 | 10 |
| 5  | 32,505,856 | 33,554,431 | 11 |
| 6  | 1,040,187,392 | 1,073,741,823 | 12 |
| 7  | 33,285,996,544 | 34,359,738,367 | 13 |
| 8  | 1,065,151,889,408 | 1,099,511,627,775 | 14 |
| 9  | 34,084,860,461,056 | 35,184,372,088,831 | 15 |
| 10 | 1,090,715,534,753,792 | 1,125,899,906,842,623 | 16 |
| 11 | 34,902,897,112,121,344 | 36,028,797,018,963,967 | 17 |
| 12 | 1,116,892,707,587,883,008 | 1,152,921,504,606,846,975 | 18 |
| 13 | 8,070,450,532,247,928,832 | 9,223,372,036,854,775,807 | 19 |

### Notes

Note the significant compression that Crockford Base32 offers over integer-only identifiers. If BODY were limited to decimal digits, a 1-character BODY would allow 9 values (excluding `0`) versus 31 Crockford Base32 encodings. At BODY length 6 there are 900,000 decimal possibilities versus 1,040,187,392 Crockford-encoded possibilities (cumulatively, about 1 million versus about 1 billion).

For BODY lengths 1 through 12:

- exact count = `31 × 32^(n-1)`

For BODY length 13:

- the raw syntactic count would be `31 × 32^12`
- but the integer cap truncates that to exactly:
  - `7 × 32^12 = 8,070,450,532,247,928,832`

The cumulative total through length 13 equals the full allowed BODY integer space:

- `9,223,372,036,854,775,807`
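As a cross-check, the counts in the table can be recomputed directly from the integer range. This is a small sketch under the assumptions listed above (alphabet size 32, no leading `0`, BODY integer capped at `9223372036854775807`); `exact_count` is an illustrative name.

```python
MAX_BODY = 9223372036854775807  # 2**63 - 1


def exact_count(n: int) -> int:
    """Number of valid BODY encodings that are exactly n characters long."""
    low = 32 ** (n - 1)                 # smallest n-character value ("1" then zeros)
    high = min(32 ** n - 1, MAX_BODY)   # largest n-character value, capped at MAX_BODY
    return max(0, high - low + 1)


cumulative = 0
for n in range(1, 14):
    cumulative += exact_count(n)
    # Columns: BODY length, exact count, cumulative count, full label length.
    print(f"{n:>2}  {exact_count(n):>27,}  {cumulative:>27,}  {n + 6:>2}")
```

For lengths 1 through 12 the cap does not bind, so `exact_count(n)` reduces to `31 × 32^(n-1)`; at length 13 it yields `7 × 32^12`.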

### Python

```bash
cd reference-implementations/python
pip install -e ".[dev]"

# Validate a canonical EUID
meridian-euid validate A1-Z9-3V9

# Encode an integer as a canonical EUID
meridian-euid encode 123 Z9 --domain-code A1

# Parse an EUID into its structural components
meridian-euid parse MAX-MAX-7ZZZZZZZZZZZZB

# Compute the check character for DOMAIN + PREFIX + BODY
meridian-euid compute-check A1Z93V
```

## Documentation

| Document | Description |
|----------|-------------|
| [SPEC.md](SPEC.md) | Normative specification |
| [meridian_clinical_lab_profile.md](meridian_clinical_lab_profile.md) | Minimal barcode and rendering profile for clinical lab use |
| [checksum_elaboration.md](checksum_elaboration.md) | Why Meridian keeps the checksum inside the same safe alphabet |
| [domain_code_registry.json](domain_code_registry.json) | Checked-in manual registry of valid domain codes |
| [prefix_ownership_registry.json](prefix_ownership_registry.json) | Checked-in manual registry of `(domain, prefix)` ownership |
| [test-vectors/README.md](test-vectors/README.md) | Shared machine-readable vectors and contract notes |
| [reference-implementations/python/README.md](reference-implementations/python/README.md) | Python reference implementation and CLI |

## Repository Structure

```text
meridian-euid/
├── SPEC.md
├── CHANGELOG.md
├── domain_code_registry.json
├── prefix_ownership_registry.json
├── meridian_clinical_lab_profile.md
├── checksum_elaboration.md
├── test-vectors/
│   ├── README.md
│   └── v3.json
└── reference-implementations/
    └── python/
```

## EUID Format

Canonical format:

`DOMAIN-PREFIX-BODYCHECKSUM`

Example:

`A1-Z9-3V9`

Rules:

- `DOMAIN`: 1-4 uppercase Crockford Base32 characters from `0123456789ABCDEFGHJKMNPQRSTVWXYZ`
- `PREFIX`: 1-4 uppercase Crockford Base32 characters from the same alphabet
- `BODY`: Crockford Base32 encoding of an integer in `1..9223372036854775807`, with no leading `0`
- `CHECKSUM`: 1 Crockford Base32 character
- Forbidden everywhere: `I`, `L`, `O`, `U`
- Lowercase input MUST be rejected
- Checksum payload is `DOMAIN + PREFIX + BODY`; hyphens are syntax only
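The structural rules above can be sketched as a regular expression plus a range check. This is an illustrative sketch only: `parse_structure` and `EUID_RE` are hypothetical names, and the check character is split out but not verified, because the checksum algorithm is defined in SPEC.md rather than in this README.

```python
import re

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
MAX_BODY = 9223372036854775807

# Character class for the alphabet above: digits plus uppercase letters
# without I, L, O, U. Lowercase never matches.
_C = "[0-9A-HJKMNP-TV-Z]"
# DOMAIN and PREFIX: 1-4 chars; BODY: 1-13 chars, no leading 0; CHECKSUM: 1 char.
EUID_RE = re.compile(
    rf"({_C}{{1,4}})-({_C}{{1,4}})-([1-9A-HJKMNP-TV-Z]{_C}{{0,12}})({_C})"
)


def parse_structure(euid: str) -> dict:
    """Split an EUID into components and range-check BODY (no checksum check)."""
    match = EUID_RE.fullmatch(euid)
    if match is None:
        raise ValueError(f"not a canonical EUID shape: {euid!r}")
    domain, prefix, body, checksum = match.groups()
    value = 0
    for ch in body:
        value = value * 32 + ALPHABET.index(ch)
    if value > MAX_BODY:
        raise ValueError(f"BODY exceeds the integer cap: {body!r}")
    return {
        "domain": domain,
        "prefix": prefix,
        "body": body,
        "value": value,
        "checksum": checksum,
    }


print(parse_structure("A1-Z9-3V9"))
# {'domain': 'A1', 'prefix': 'Z9', 'body': '3V', 'value': 123, 'checksum': '9'}
```

A 13-character BODY can match the pattern yet still exceed the cap (for example, one starting with `8`), which is why the integer check is separate from the regular expression.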

## Governance

The repo carries a lightweight manual governance model:

- `domain_code_registry.json` declares valid domains.
- `prefix_ownership_registry.json` maps each `(DOMAIN, PREFIX)` pair to exactly one `issuer_app_code`.

`issuer_app_code` is a lowercase ASCII governance token matching:

`^[a-z0-9]+(?:[._-][a-z0-9]+)*$`

Claims are explicit JSON objects, for example:

```json
{
  "Z9": {
    "issuer_app_code": "meridian.lims"
  }
}
```

Canonical validation does not require a governance lookup. Issuance and resolver routing typically do.
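A minimal sketch of checking `issuer_app_code` values against the pattern above. The `claims` dict mirrors only the JSON fragment shown here; the full layout of `prefix_ownership_registry.json` is not described in this README, so the helper names and the way claims are passed in are assumptions.

```python
import re

# Pattern copied from the governance rule above.
ISSUER_APP_CODE_RE = re.compile(r"^[a-z0-9]+(?:[._-][a-z0-9]+)*$")


def is_valid_issuer_app_code(code: str) -> bool:
    """True if code is a lowercase ASCII governance token."""
    return ISSUER_APP_CODE_RE.fullmatch(code) is not None


def invalid_claims(claims: dict) -> list[str]:
    """Return the keys whose issuer_app_code is missing or malformed.

    Assumes the fragment shape shown above: {PREFIX: {"issuer_app_code": ...}}.
    """
    bad = []
    for prefix, claim in claims.items():
        code = claim.get("issuer_app_code")
        if not isinstance(code, str) or not is_valid_issuer_app_code(code):
            bad.append(prefix)
    return bad


claims = {"Z9": {"issuer_app_code": "meridian.lims"}}
print(invalid_claims(claims))  # []
```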

## Breaking Note

v0.4.0 removes the older split shapes and compatibility surfaces:

- no absent-domain form
- no `:` separator
- no sandbox/production mode split
- no legacy aliases
- no supported TypeScript reference implementation

## Contributing

Before proposing changes:

1. Run the Python test suite under [reference-implementations/python](reference-implementations/python).
2. Keep the registries human-auditable.
3. Update the shared vectors and docs in the same change when syntax or governance expectations move.
4. Treat syntax, alphabet, checksum, and integer-range changes as major-version work.
 
 
