Metadata-Version: 2.4
Name: eidr
Version: 0.1.0rc1
Summary: Official Python SDK for the Entertainment Identifier Registry (EIDR)
Project-URL: Homepage, https://eidr.org
Project-URL: Documentation, https://eidr-id.github.io/eidr-python-sdk/
Project-URL: Source, https://github.com/EIDR-ID/eidr-python-sdk
Project-URL: Issues, https://github.com/EIDR-ID/eidr-python-sdk/issues
Project-URL: Changelog, https://github.com/EIDR-ID/eidr-python-sdk/blob/main/CHANGELOG.md
Author-email: "Entertainment Identifier Registry (EIDR)" <support@eidr.org>
License: Apache-2.0
License-File: LICENSE
Keywords: doi,eidr,entertainment,identifier,mec,media,metadata,movielabs
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Multimedia :: Video
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: lxml>=5.0
Requires-Dist: rfc8785>=0.1
Provides-Extra: aws
Requires-Dist: boto3>=1.34; extra == 'aws'
Provides-Extra: client
Requires-Dist: httpx>=0.27; extra == 'client'
Provides-Extra: dev
Requires-Dist: hypothesis>=6.100; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pydantic>=2; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: python-stdnum>=1.20; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Requires-Dist: types-boto3>=1.0.2; extra == 'dev'
Requires-Dist: xsdata-pydantic>=24; extra == 'dev'
Requires-Dist: xsdata[cli]>=24; extra == 'dev'
Provides-Extra: digital
Requires-Dist: pydantic>=2; extra == 'digital'
Requires-Dist: xsdata-pydantic>=24; extra == 'digital'
Provides-Extra: docs
Requires-Dist: furo>=2024.1; extra == 'docs'
Requires-Dist: myst-parser>=3.0; extra == 'docs'
Requires-Dist: sphinx-autodoc2>=0.5; extra == 'docs'
Requires-Dist: sphinx>=7.0; extra == 'docs'
Requires-Dist: sphinxcontrib-openapi>=0.8; extra == 'docs'
Description-Content-Type: text/markdown

# eidr — Official Python SDK for the Entertainment Identifier Registry

[![PyPI](https://img.shields.io/pypi/v/eidr.svg)](https://pypi.org/project/eidr/)
[![Python](https://img.shields.io/pypi/pyversions/eidr.svg)](https://pypi.org/project/eidr/)
[![License](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)

`eidr` is the official Python SDK for the
[Entertainment Identifier Registry](https://eidr.org). It provides
typed, ergonomic access to EIDR's Content, Party, and Video Service
ID registries over the native XML REST API.

> **Status:** `0.1.0rc1` — first release candidate. The public API
> contract documented in [STABILITY.md](STABILITY.md) takes effect
> at this release; remaining changes before `1.0.0` are limited to
> release-blocking defect corrections. See
> [CHANGELOG.md](CHANGELOG.md) for what's new and the
> [STABILITY.md](STABILITY.md) "Breaking changes by version"
> section if upgrading from a beta release.

## What's implemented today

- **Codec layer** — XML ↔ intermediate dict ↔ JSON conversion
  following the MovieLabs MDDF JSON Encoding Best Practice. Supports
  both infoset round-trip and canonical output (W3C Canonical XML 2.0
  for XML; RFC 8785 JCS for JSON, via the `rfc8785` dependency).
- **Typed records** — `ContentRecord`, `PartyRecord`, `ServiceRecord`
  wrappers over the codec dict, with structured property accessors,
  pre-submission validation per the EIDR submission profile, and a
  root-element-driven `eidr.parse()` dispatcher.
- **EIDR IDs** — `EIDRID` value type with ISO 7064 Mod 37,36
  check-character validation and lossless conversions between
  canonical DOI form, URN form, and bare suffix.
- **Credentials** — five sources, all unified under
  `Credentials.load()`: EIDR XML config file, JSON file, AWS Secrets
  Manager (with the `[aws]` extra), environment variables, and direct
  construction.
- **Sync HTTP client** — `eidr.Client`, gated on the `[client]`
  extra. Wraps transport, authentication, retry policy, and response
  parsing. Operations:
  - **Content reads:** `resolve`, `query`, `graph_traversal`,
    `status_lookup`.
  - **Content writes:** `register`, `match`, `modify`, `delete`,
    `promote`, `alias`, `add_relationship`, `remove_relationship`,
    `replace_relationship`. Each supports `immediate=`, async tokens,
    and optional polling with `wait_timeout=`.
  - **Video Service:** `service_query` (read), `create_service`,
    `modify_service`, `delete_service`, `alias_service`,
    `set_service_parent`, `service_children`.
  - **Party:** `party_query` (read), `create_party`, `modify_party`,
    `delete_party`, `alias_party`, `activate_party`,
    `deactivate_party`, `change_party_password`.
  - **Virtual fields:** `virtual_fields(asset_id)` returning a
    `VirtualFields` value object with `full`/`self_defined`/`alias`
    serialized views.
- **Async HTTP client** — `eidr.AsyncClient`, same operations as
  `Client` but with `async def` methods backed by
  `httpx.AsyncClient`. Use with `async with`. `AsyncToken` mirrors
  `Token` with awaitable `poll`/`wait`/`operation_result`. The
  sync-under-load fallback case (registry deferring an
  `immediate=True` Create/Modify to async under load, per REST API
  §2.1.1) is handled transparently; `Token.to_async(async_client)`
  bridges a sync-fallback token into an async workflow when needed.
- **Tracing** — built-in support-diagnostics facility. `TraceSink`
  protocol with `LoggerSink`, `FileSink`, and `ListSink`
  implementations. Sensitive headers (Authorization, Cookie, etc.)
  redacted at capture time. Body size limits and "safe support
  bundle" body-redaction modes available via `TransportConfig`.
- **Typed Digital sub-model** — `eidr.models.digital`, gated on
  the `[digital]` extra. Auto-generated from the bundled XSD by
  `xsdata-pydantic` at SDK-development time and shipped pre-built;
  end users never run the codegen. Provides typed access to a
  Manifestation's `<Digital>` sub-block (audio / video / subtitle /
  interactive tracks; container-level packaging metadata) via
  `manifestation.digital_typed()` / `set_digital()` accessors. The
  raw codec dict on `manifestation.digital` is unchanged for
  pass-through and dedup workflows that compare whole-block
  equality. The codec dict layer remains the source of truth — the
  typed view is transient (each call materializes a fresh
  pydantic instance from the dict).
- **Type-hinted** throughout (PEP 561 `py.typed`); `mypy --strict`
  clean across all 48 source files.
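
As a rough illustration of the check-character validation mentioned under **EIDR IDs** above, the sketch below computes an ISO 7064 Mod 37,36 check character in plain Python. It is not the `EIDRID` implementation: the function names are hypothetical, and it assumes the check character covers only the 20 hexadecimal digits of a content ID's suffix.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
HEX_DIGITS = set("0123456789ABCDEF")


def mod_37_36_check_char(payload: str) -> str:
    """Return the ISO 7064 Mod 37,36 check character for `payload`."""
    modulus = len(ALPHABET)  # 36
    p = modulus
    for ch in payload.upper():
        # Add the character value mod 36, treating 0 as 36 ...
        s = (p + ALPHABET.index(ch)) % modulus or modulus
        # ... then double mod 37.
        p = (s * 2) % (modulus + 1)
    # Choose the check value c so that (p + c) % 36 == 1.
    return ALPHABET[(modulus + 1 - p) % modulus]


def has_valid_check_char(eidr_id: str) -> bool:
    """Validate a content ID such as 10.5240/XXXX-XXXX-XXXX-XXXX-XXXX-C."""
    suffix = eidr_id.split("/", 1)[-1]
    body, _, check = suffix.rpartition("-")
    digits = body.replace("-", "").upper()
    if len(digits) != 20 or not set(digits) <= HEX_DIGITS:
        return False
    return mod_37_36_check_char(digits) == check.upper()


print(has_valid_check_char("10.5240/7791-8534-2C23-9030-8610-5"))
```

Both content IDs used elsewhere in this README pass this check, while changing any single character of the suffix makes it fail.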

## Implementation status

The SDK has reached **functional and ergonomic parity with the Java
SDK on the core public registry-operation surface** as of `0.1.0b3`
(M13). Java-only surfaces excluded by design — batch operations,
user/ACL admin, UserOverride/impersonation — are listed under "Out
of scope" below.

**Read paths (all fully typed):**

- `resolve()` / `resolve_party()` / `resolve_service()` — single-record
  lookups. Each returns a typed `ContentRecord` / `PartyRecord` /
  `ServiceRecord`.
- `query()` — content asset query (XPath-style filter strings).
  Page-by-page access to a typed `QueryResults`.
- `iter_query()` / `iter_query_ids()` — auto-paging iterators over
  asset-query results, yielding `ContentRecord` and ID strings
  respectively.
- `find_parties_by_name()`, `find_parties_from_catalog()`,
  `find_services_by_name()`, `find_services_from_catalog()` —
  typed query helpers returning `PartyQueryResults` /
  `ServiceQueryResults`.
- `party_query()` / `service_query()` — escape hatch for advanced
  query bodies; takes raw XML and returns `ParsedResponse`.
- `graph_traversal()` / `service_parent()` /
  `service_children()` — content-graph and service-hierarchy reads.
- `modification_base()` — fetch a record body suitable for use as
  a `modify()` starting point. Requires `creation_type`. Use
  `modification_base_auto(asset_id)` to resolve-then-infer when
  the type isn't known statically.
- `iter_status_by_user()` / `iter_status_by_registrant()` /
  `iter_status_superparty()` / `iter_status_by_token()` —
  auto-paging iterators yielding `OperationStatusEntry`.
- `virtual_fields()` — search-index virtual fields (full /
  self-defined / alias views).
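
To make the auto-paging idea behind `iter_query()` concrete, here is a minimal sketch of a generator that walks pages lazily. It is an illustration only, not the SDK's code: `Page`, `iter_all`, and `fake_fetch` are hypothetical names, and only the `records` and `has_more_pages` attributes are borrowed from the `QueryResults` usage shown in this README.

```python
from collections.abc import Callable, Iterator
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for one page of query results."""

    records: list[str] = field(default_factory=list)
    has_more_pages: bool = False


def iter_all(fetch_page: Callable[[int], Page]) -> Iterator[str]:
    """Yield every record, requesting pages lazily starting at page 1."""
    page_number = 1
    while True:
        page = fetch_page(page_number)
        yield from page.records
        if not page.has_more_pages:
            return
        page_number += 1


# Fake backend for the demo: five IDs served two per page.
DATA = ["id-1", "id-2", "id-3", "id-4", "id-5"]


def fake_fetch(page_number: int, page_size: int = 2) -> Page:
    start = (page_number - 1) * page_size
    chunk = DATA[start:start + page_size]
    return Page(records=chunk, has_more_pages=start + page_size < len(DATA))


print(list(iter_all(fake_fetch)))
```

Because the generator fetches a page only when the previous one is exhausted, a caller that stops iterating early never pays for the remaining pages.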

**Write paths:**

- `register()` / `modify()` / `delete()` / `alias()` / `promote()`
  for content; `create_party()` / `modify_party()` /
  `change_party_password()` for parties; `create_service()` /
  `modify_service()` / `delete_service()` for services. Each is
  single-operation; batch is out of scope.
- `add_relationship()` / `replace_relationship()` /
  `remove_relationship()` for asset relationships.

**Async surface:** every method has an exact async mirror on
`AsyncClient`. The auto-paging iterators return `AsyncIterator`
for use with `async for`.

## Out of scope

The following surfaces are intentionally not exposed:

- **Batch operations** (`registerBatchFromXML`, etc.) — the
  single-operation API is the public surface; callers wanting
  batches drop to the codec layer.
- **User admin and ACL admin** (`/user/*`, `AdminAcl.*`) — internal
  Operations machinery, not for the public SDK.
- **Impersonation / UserOverride** (per-operation user tokens with
  forced dedup flags) — Superparty-only feature, deferred pending
  a clean public surface design. Targeted post-1.0.

## Roadmap to 1.0

Items planned but not blocking 1.0:

- **Schema validation expansion** (`SchemaSource` to a full lxml
  URI resolver) — targeted for 1.1.

See the M9 cover letter for the full Python↔Java SDK variance
table (kept current through M14). Subsequent cover letters
narrate what changed in each milestone and link back to the
variance table.

## Installation

```bash
pip install eidr               # codec + records + IDs (no network)
pip install 'eidr[client]'     # adds the HTTP client (httpx)
pip install 'eidr[aws]'        # adds AWS Secrets Manager support
pip install 'eidr[digital]'    # adds the typed Digital sub-model (pydantic)
pip install 'eidr[client,aws]' # HTTP client + AWS Secrets Manager
```

## Quick start

### Resolve

```python
from eidr import Client, Credentials, registries

with Client(
    registries.SANDBOX2,
    Credentials.load(),  # or from_eidr_xml, from_json, from_aws_secret, etc.
) as client:
    record = client.resolve("10.5240/0000-02ED-1DCE-6AAF-99F7-M")
    print(record.id, record.resource_name)
```

### Register a new record (synchronous)

```python
from eidr.models.content import ContentRecord

record = ContentRecord.from_xml(my_record_bytes)
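# Usually a ContentRecord; under load the registry may defer and
# return a Token instead (see "Sync-under-load fallback" below).
created = client.register(record, immediate=True)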
print("Assigned ID:", created.id)
```

### Register with deferred polling (sync Client, registry-async write)

```python
token = client.register(record, immediate=False)
# Persist token.value if you need to resume later.
result = token.operation_result(timeout=120)
if result.status.name == "SUCCESS":
    print("Registered:", result.id)
elif result.status.name == "PENDING":
    print("Still pending; sub-tokens:", result.sub_tokens)
```
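
Conceptually, waiting on a token is a poll-until-done loop with a deadline. The sketch below shows that pattern in isolation; it is not how `operation_result` is implemented. `wait_until_done` and its `poll` callable are hypothetical, and the status strings are taken from the example above.

```python
import time
from collections.abc import Callable


def wait_until_done(
    poll: Callable[[], str],
    timeout: float,
    interval: float = 1.0,
) -> str:
    """Call `poll` until it stops returning "PENDING" or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = poll()
        if status != "PENDING" or time.monotonic() >= deadline:
            return status
        # Never sleep past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))


# Demo with a scripted sequence of statuses.
statuses = iter(["PENDING", "PENDING", "SUCCESS"])
print(wait_until_done(lambda: next(statuses), timeout=5, interval=0))
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted while polling.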

### Async workflow with AsyncClient

For programs using `asyncio`, `AsyncClient` mirrors `Client`'s API
with `async def` methods. Every operation — `resolve`, `register`,
`match`, `modify`, `delete`, `promote`, `alias`, relationship ops,
`query`, `graph_traversal`, `party_query`, `service_query`,
`status_lookup` — has an `async` counterpart with the same
signature. Returned `Token`s become `AsyncToken`s whose
`poll`/`wait`/`operation_result` methods are awaitable.

```python
import asyncio
from eidr import AsyncClient, Credentials, registries

async def main():
    async with AsyncClient(
        registries.SANDBOX2,
        Credentials.load(),
    ) as client:
        # Read
        record = await client.resolve("10.5240/0000-02ED-1DCE-6AAF-99F7-M")

        # Registry-async write (AsyncToken returned; awaitable poll).
        # `new_record` is a ContentRecord built elsewhere, e.g. with
        # ContentRecord.from_xml(...).
        token = await client.register(new_record, immediate=False)
        result = await token.operation_result(timeout=120)
        print(f"{result.status.name}: {result.id}")

asyncio.run(main())
```

**Sync-under-load fallback.** Per EIDR REST API §2.1.1, an
`immediate=True` Create or Modify can be deflected to async by the
registry when dedupe can't complete within the response window.
Both `Client.register` and `AsyncClient.register` handle this:
`immediate=True` will *usually* return a `ContentRecord` directly,
but may return a `Token`/`AsyncToken` if the registry deferred.
Deflection does not apply to `match()`, which always resolves
inline.

Production callers should always handle both return types. The
recommended idiom:

```python
from eidr import Client, Credentials, Token, registries

# `record` is the ContentRecord to register, as in the examples above.
with Client(registries.SANDBOX2, Credentials.load()) as client:
    result = client.register(record, immediate=True)

    if isinstance(result, Token):
        # Registry deferred under load. Persist token.value if you
        # want to survive a process restart, then poll.
        op = result.operation_result(timeout=120)
        registered = op.record  # ContentRecord, or None on failure
    else:
        # Registry handled it inline.
        registered = result  # ContentRecord

    if registered is not None:
        print("Registered:", registered.id)
```

Equivalent for `AsyncClient`:

```python
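from eidr import AsyncClient, AsyncToken, Credentials, registries

# `register_immediate` is an illustrative wrapper, not an SDK function;
# `async with` must run inside a coroutine.
async def register_immediate(record):
    async with AsyncClient(registries.SANDBOX2, Credentials.load()) as client:
        result = await client.register(record, immediate=True)

        if isinstance(result, AsyncToken):
            # Registry deferred under load; await the polled outcome.
            op = await result.operation_result(timeout=120)
            return op.record  # ContentRecord, or None on failure
        # Registry handled it inline.
        return result  # ContentRecord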
```

The SDK does not auto-wrap this. Auto-wrapping would hide the
fallback case from callers who legitimately want to know whether
their immediate registration completed inline (faster, no extra
round-trips) versus deferred (caller may want to release the worker
slot, queue the polling, etc.). The `isinstance` check is two lines
of boilerplate for a meaningful semantic distinction.

If you already have a sync `Token` (e.g., from a legacy sync call
site) and want to `await` its completion from an async workflow,
`Token.to_async(async_client)` converts it into an `AsyncToken`
without re-issuing the write. The reverse conversion is not
offered — running `asyncio.run()` from within sync code is almost
always a sign of something wrong elsewhere.

### Query

```python
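expression = "/FullMetadata/BaseObjectData/ReferentType IS Movie"
page_size = 50

results = client.query(expression, page_number=1, page_size=page_size)
for record in results.records:
    print(record.id, record.resource_name)

# Ceiling division: number of pages needed to hold every match.
total_pages = (results.total_matches + page_size - 1) // page_size
print(f"Page 1 of ~{total_pages}")
if results.has_more_pages:
    next_page = client.query(expression, page_number=2, page_size=page_size)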
```

### Graph traversal

```python
from eidr import GraphTraversalType

descendants = client.graph_traversal(
    GraphTraversalType.FIND_DESCENDANTS,
    series_id,
    referent_type_filter="TV",
)
```

### Tracing for support diagnostics

```python
from eidr import Client, FileSink

with Client(..., tracing=FileSink("/tmp/eidr-trace.log")) as client:
    client.resolve("10.5240/...")
# Trace file now contains every HTTP request/response with sensitive
# headers redacted. For sharing with third parties, also enable
# trace_redact_bodies=True via TransportConfig.
```
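
Redaction at capture time means the secret never reaches the sink at all. A minimal sketch of that idea follows, assuming a case-insensitive match on header names. `redact_headers` and the exact set of names are assumptions for illustration and may not match what the SDK actually redacts.

```python
from collections.abc import Mapping

# Assumed set of sensitive header names; the SDK's real list may differ.
SENSITIVE_HEADERS = frozenset(
    {"authorization", "cookie", "set-cookie", "proxy-authorization"}
)


def redact_headers(headers: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of `headers` with sensitive values masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


captured = redact_headers({"Authorization": "secret-value", "Accept": "text/xml"})
print(captured)
```

The function returns a new dict, so the live request headers are left untouched and only the traced copy is masked.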

### Video Service writes

```python
from eidr.models.service import ServiceRecord

with Client(registry, creds) as client:
    # Create a new service. The registry assigns the ID.
    new_svc = ServiceRecord.from_xml(b"""<?xml version="1.0"?>
        <Service xmlns="http://www.eidr.org/schema">
          <ServiceName>
            <DisplayName>My Streaming Service</DisplayName>
            <SortName>My Streaming Service</SortName>
          </ServiceName>
          <Active>true</Active>
        </Service>""")
    created = client.create_service(new_svc)
    # created is a ServiceRecord with the registry-assigned ID.

    # Modify (full record body required — the registry replaces all content)
    updated = client.modify_service(updated_svc)

    # Simple ops (return None on success)
    client.alias_service("10.5239/AAAA-BBBB", target_id="10.5239/CCCC-DDDD")
    client.set_service_parent("10.5239/AAAA-BBBB", parent_id="10.5239/PPPP-QQQQ")
    client.delete_service("10.5239/AAAA-BBBB")

    # service_children returns the parsed envelope
    response = client.service_children("10.5239/AAAA-BBBB")
```

### Party writes

> **The Superparty gate.** Party-administration operations
> (`create_party`, `modify_party`, `delete_party`, `alias_party`,
> `activate_party`, `deactivate_party`, `change_party_password`)
> are restricted by the EIDR registry to a single hard-coded
> Party — the "Superparty" with ID `10.5237/superparty`. Unlike
> Service writes, there is no Role mechanism that lets the
> registry delegate this authority. The SDK enforces this
> client-side via a strict-by-default gate: invoking any of the
> seven Party-administration methods with a non-Superparty
> `party_id` raises `EIDRSDKPolicyError` *before* any HTTP traffic.
>
> Configurable kwargs on `Client` / `AsyncClient`:
>
> - `superparty_id: str = "10.5237/superparty"` — the ID the gate
>   requires. The default is correct for all current EIDR
>   registries (production, sandbox1, sandbox2, sandbox2-mirror).
> - `enforce_superparty_gate: bool = True` — set to `False` to
>   bypass the gate. Useful only for testing the SDK itself or
>   for the unusual case where the caller is the Superparty under
>   a non-default ID.
>
> **Read operations** (`resolve_party`, `party_query`) are
> **never gated** — they're open to any caller.
>
> If you bypass the gate (or override the Superparty ID
> incorrectly), the registry will reject your request server-side,
> and the SDK raises `EIDRAuthorizationError`. The gate exists to
> make the failure earlier and clearer; it is a usability aid, not
> a security boundary.

```python
from eidr.models.party import PartyRecord

# The Superparty has its own credentials; ordinary clients will
# trip the gate immediately. Only Superparty-credentialed callers
# should use these operations in production.
with Client(registry, superparty_creds) as client:
    new_party = PartyRecord.from_xml(b"""<?xml version="1.0"?>
        <Party xmlns="http://www.eidr.org/schema">
          <PartyName>
            <DisplayName>Example Org</DisplayName>
            <SortName>Example Org</SortName>
          </PartyName>
        </Party>""")
    created = client.create_party(new_party, password="initial-password")

    # Modify a party (no password — use change_party_password for that)
    client.modify_party(updated_party)

    # Activate / deactivate / delete / alias (return None)
    client.activate_party("10.5237/AAAA-BBBB")
    client.deactivate_party("10.5237/AAAA-BBBB")
    client.alias_party("10.5237/AAAA-BBBB", target_id="10.5237/CCCC-DDDD")
    client.delete_party("10.5237/AAAA-BBBB")

    # Change a password (the password traverses the wire in the URL
    # query string — treat the URL as sensitive).
    client.change_party_password("10.5237/AAAA-BBBB", "new-password")
```
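
The gate described above amounts to a pre-flight check that runs before any request is built. The toy class below sketches that behavior under stated assumptions: `GatedClient` and `PolicyError` are hypothetical stand-ins (the SDK's exception is `EIDRSDKPolicyError`), and a counter replaces real HTTP traffic. Only the `superparty_id` and `enforce_superparty_gate` keyword names come from this README.

```python
class PolicyError(Exception):
    """Hypothetical stand-in for the SDK's EIDRSDKPolicyError."""


class GatedClient:
    """Toy client: the gate runs before any (simulated) HTTP request."""

    def __init__(
        self,
        party_id: str,
        superparty_id: str = "10.5237/superparty",
        enforce_superparty_gate: bool = True,
    ) -> None:
        self.party_id = party_id
        self.superparty_id = superparty_id
        self.enforce_superparty_gate = enforce_superparty_gate
        self.requests_sent = 0  # counts simulated HTTP calls

    def _check_gate(self, operation: str) -> None:
        if self.enforce_superparty_gate and self.party_id != self.superparty_id:
            raise PolicyError(
                f"{operation} requires the Superparty ({self.superparty_id}); "
                f"caller is {self.party_id}"
            )

    def delete_party(self, target_id: str) -> None:
        self._check_gate("delete_party")
        self.requests_sent += 1  # stand-in for the real HTTP request


ordinary = GatedClient("10.5237/AAAA-BBBB")
try:
    ordinary.delete_party("10.5237/CCCC-DDDD")
except PolicyError as exc:
    print("blocked before any request:", exc)
```

With the gate disabled, the same call would go out and the rejection would come from the registry instead.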

### Virtual fields retrieval

```python
from eidr.models.content import ContentRecord

vf = client.virtual_fields("10.5240/7791-8534-2C23-9030-8610-5")
# vf.id is always present
# vf.full / vf.self_defined / vf.alias are each str | None,
# carrying serialized record content.
if vf.full is not None:
    full_record = ContentRecord.from_xml(vf.full.encode("utf-8"))
```

### File-based (codec-only) mode

```python
from eidr.codecs import xml, json

# XML bytes in, JSON dict out
json_dict = xml.to_json_dict(xml_bytes)

# JSON dict in, canonical XML bytes out (signing-ready)
xml_bytes = json.to_xml_canonical(json_dict, method="c14n2")
```
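
To see why canonical output matters for signing, the sketch below runs two differently written but equivalent documents through Canonical XML 2.0. It uses the standard library's `xml.etree.ElementTree.canonicalize` (Python 3.8+) rather than `eidr.codecs`, so it shows the general technique only and says nothing about how the SDK produces its canonical form. The element names are made up for the demo.

```python
from xml.etree.ElementTree import canonicalize

# Same infoset, different surface syntax: attribute order, quote
# style, extra whitespace inside the tag, and empty-element form.
doc_a = "<Record b='2'   a='1'><Empty/></Record>"
doc_b = '<Record a="1" b="2"><Empty></Empty></Record>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# Both inputs canonicalize to the same string, so hashes or
# signatures computed over the canonical form will agree.
print(canon_a == canon_b)
print(canon_a)
```

Without canonicalization, a byte-level hash of `doc_a` and `doc_b` would differ even though they carry the same content.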

### Typed Digital sub-model (requires `[digital]` extra)

For programmatic inspection or construction of a Manifestation's
`<Digital>` block — the audio / video / subtitle tracks and
container-level packaging metadata — use the typed pydantic v2
model exposed at `eidr.models.digital`. The codec dict layer
remains the source of truth; the typed view is transient (each
`digital_typed()` call returns a fresh instance):

```python
from eidr import Client, Credentials, registries
from eidr.models.digital import (
    DigitalTracks, Track, DigitalAssetAudio,
)
# (also: DigitalAssetAudioLanguageType, etc., re-exported from the
#  full generated tree via `from eidr.models.digital import *`)

with Client(registries.SANDBOX2, Credentials.load()) as client:
    record = client.resolve("10.5240/...some-manifestation-ID...")

    manifest = record.creation_block  # ManifestationInfo
    typed = manifest.digital_typed()  # → DigitalTracks | None
    if typed is not None:
        for entry in typed.track_or_container:
            if isinstance(entry, Track) and isinstance(
                entry.choice, DigitalAssetAudio
            ):
                print(entry.choice.language.value, entry.choice.type_value)
```

To write back, build a typed `DigitalTracks` and call `set_digital`:

```python
from eidr.models import _digital_generated as gen

new_audio = gen.DigitalAssetAudioDataType(
    type_value="primary",
    language=gen.DigitalAssetAudioLanguageType(value="fr"),
)
new_track = gen.DigitalAssetMetadataType(choice=new_audio)
new_tracks = DigitalTracks(track_or_container=[new_track])

manifest.set_digital(new_tracks)  # codec dict updated under the hood
```

The raw codec dict at `manifest.digital` is unchanged from M5 and
remains the right tool for whole-block equality comparisons (the
common dedup case). The typed view is for programmatic field
access where the schema-driven validation pays off.

**Performance note:** `digital_typed()` round-trips through XML
serialization (codec dict → XML bytes → pydantic via
`xsdata-pydantic`), which adds ~5-15 ms per call depending on
manifestation size. For high-frequency Manifestation processing
(e.g., walking thousands of records to extract track-level
fields), prefer the codec-dict accessors (`record.data`,
`manifest.digital`) which are pure dict access and ~100× faster.
The typed view is a correctness-first ergonomic; the codec-dict
is the performance-first interface. This characteristic is
documented in [`STABILITY.md`](STABILITY.md) and won't change
without a major-version bump.

## Documentation

- API reference under construction; the source modules carry full
  docstrings — `pydoc eidr.client` is a useful starting point.
- The `M6_COVER_LETTER.md` in this drop documents the current
  scope, design findings, and reviewer-attention items in detail.

## Relationship to other EIDR tools

- **EIDR Java SDK**: the long-standing reference SDK. This Python
  library targets the same REST API with Python-native ergonomics.
- **eidr-cli** (planned): command-line tools built on this library.

## Contributing

Issues and pull requests welcome at
[github.com/EIDR-ID/eidr-python-sdk](https://github.com/EIDR-ID/eidr-python-sdk).

### Development setup

The full developer install pulls in every optional extra so all
tests, type checks, and example scripts work out of the box:

```bash
git clone https://github.com/EIDR-ID/eidr-python-sdk.git
cd eidr-python-sdk
pip install -e '.[client,digital,aws,dev,docs]'
```

If you skip an extra you'll see import errors when you run the
parts of the codebase that need it — for example, `[digital]`
brings in pydantic and `xsdata-pydantic`, which the typed Digital
sub-model and a handful of unit tests require. `[client]` brings
in httpx, which most non-codec tests assume. `[dev]` brings in
the test/lint toolchain (pytest, ruff, mypy).

### Test suite

```bash
pytest                       # full unit suite (fast, no network)
pytest -m integration        # live-sandbox tests (requires creds)
pytest -m ""                 # everything, including integration
```

The default `pytest` invocation skips live-registry tests via
`-m "not integration"` in `pyproject.toml`. To run them, set
three environment variables and invoke with `-m integration`:

```bash
export EIDR_TEST_USER_ID="10.5238/yourusername"
export EIDR_TEST_PARTY_ID="10.5237/AAAA-BBBB"
export EIDR_TEST_PASSWORD="your-sandbox-password"
pytest -m integration
```

Integration coverage is intentionally minimal and non-destructive:
anonymous resolve, authenticated resolve, status-lookup with a
synthetic token (verifies error mapping), one Match (the registry
recognizes it as a duplicate of itself, so no state changes), and
one ID-only query. No register/modify/delete in the automatic
lane — those mutate state and require dedicated test-party
infrastructure.

### Pipeline gates

```bash
ruff check src tests         # lint
ruff format --check src tests  # formatting
mypy src/eidr                # strict type-check
pytest                       # unit tests
```

## License

Apache 2.0 — see [LICENSE](LICENSE).

Copyright © 2026 Entertainment Identifier Registry.
