Metadata-Version: 2.4
Name: extended-data
Version: 8.4.1
Summary: Comprehensive Python data utilities for serialization, inputs, logging, and workflows
Project-URL: Documentation, https://extended-data.dev
Project-URL: Issues, https://github.com/jbcom/extended-data/issues
Project-URL: Source, https://github.com/jbcom/extended-data
Project-URL: Changelog, https://github.com/jbcom/extended-data/blob/main/packages/extended-data/CHANGELOG.md
Author-email: Jon Bogaty <jon@jonbogaty.com>
Maintainer-email: Jon Bogaty <jon@jonbogaty.com>
License: MIT
Keywords: configuration,data,files,inputs,logging,serialization,workflows
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: deepmerge>=2.0
Requires-Dist: gitpython>=3.1.0
Requires-Dist: inflection>=0.5.1
Requires-Dist: num2words>=0.5.14
Requires-Dist: orjson>=3.10.7
Requires-Dist: python-hcl2>=4.3.4
Requires-Dist: pyyaml>=6.0.1
Requires-Dist: rich>=13.7.1
Requires-Dist: ruamel-yaml>=0.18.0
Requires-Dist: sortedcontainers>=2.4.0
Requires-Dist: tomlkit>=0.13.2
Requires-Dist: typing-extensions>=4.12.2
Requires-Dist: validators>=0.22.0
Requires-Dist: wrapt>=1.16.0
Provides-Extra: all
Provides-Extra: dev
Requires-Dist: coverage[toml]>=7.6.0; extra == 'dev'
Requires-Dist: hypothesis>=6.100.2; extra == 'dev'
Requires-Dist: mypy>=1.20.1; extra == 'dev'
Requires-Dist: pytest-asyncio>=1.3.0; extra == 'dev'
Requires-Dist: pytest-cov>=7.1.0; extra == 'dev'
Requires-Dist: pytest-mock>=3.15.1; extra == 'dev'
Requires-Dist: pytest-timeout>=2.4.0; extra == 'dev'
Requires-Dist: pytest-xdist>=3.6.1; extra == 'dev'
Requires-Dist: pytest>=9.0.3; extra == 'dev'
Requires-Dist: ruff>=0.8.0; extra == 'dev'
Requires-Dist: sortedcontainers-stubs>=2.4.2; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0.12.20240724; extra == 'dev'
Provides-Extra: docs
Requires-Dist: furo>=2025.12.19; extra == 'docs'
Requires-Dist: myst-parser<6.0.0,>=4.0.1; extra == 'docs'
Requires-Dist: sphinx-autodoc2>=0.5.0; extra == 'docs'
Requires-Dist: sphinx-copybutton>=0.5.2; extra == 'docs'
Requires-Dist: sphinx<10.0.0,>=8.2.3; extra == 'docs'
Provides-Extra: tests
Requires-Dist: coverage[toml]>=7.6.0; extra == 'tests'
Requires-Dist: hypothesis>=6.100.2; extra == 'tests'
Requires-Dist: pytest-asyncio>=1.3.0; extra == 'tests'
Requires-Dist: pytest-cov>=7.1.0; extra == 'tests'
Requires-Dist: pytest-mock>=3.15.1; extra == 'tests'
Requires-Dist: pytest-timeout>=2.4.0; extra == 'tests'
Requires-Dist: pytest-xdist>=3.6.1; extra == 'tests'
Requires-Dist: pytest>=9.0.3; extra == 'tests'
Provides-Extra: typing
Requires-Dist: mypy>=1.20.1; extra == 'typing'
Requires-Dist: sortedcontainers-stubs>=2.4.2; extra == 'typing'
Requires-Dist: types-pyyaml>=6.0.12.20240724; extra == 'typing'
Description-Content-Type: text/markdown

# Extended Data

Comprehensive Python data utilities for serialization, configuration inputs,
structured logging, file processing, and workflow composition.

The public API lives under one `extended_data` namespace with three deliberate
tiers:

- Tier 1: pure functions for codecs, string transforms, redaction, matching,
  type coercion, mapping, sequence, and state utilities.
- Tier 2: `ExtendedData`, `ExtendedString`, `ExtendedDict`, `ExtendedList`,
  `ExtendedTuple`, and `ExtendedSet` containers that expose Tier 1 operations
  as methods. `ExtendedData` is the common root and polymorphic constructor for
  the shape-specific containers.
- Tier 3: data processors that compose the first two tiers for files, inputs,
  logging, export/import boundaries, and workflows.
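
The layering above can be pictured with a small standard-library sketch. It is an illustration of the tiering idea only, not the package's code: `first_non_empty`, `SketchList`, and `resolve` are hypothetical names, and the rule for what counts as "empty" is an assumption.

```python
from collections.abc import Iterable, Mapping, Sized
from typing import Any


def first_non_empty(values: Iterable[Any]) -> Any:
    """Tier 1 style: a pure function with no container dependency."""
    for value in values:
        if value is None:
            continue
        # Assumption: empty strings and empty collections count as "empty".
        if isinstance(value, Sized) and len(value) == 0:
            continue
        return value
    return None


class SketchList(list):
    """Tier 2 style: a list subclass that exposes the Tier 1 function as a method."""

    def first_non_empty(self) -> Any:
        return first_non_empty(self)


def resolve(name: str, *sources: Mapping[str, Any]) -> Any:
    """Tier 3 style: compose the lower tiers at a boundary (layered lookups here)."""
    return SketchList(source.get(name) for source in sources).first_non_empty()


service = resolve("SERVICE_NAME", {"SERVICE_NAME": ""}, {"SERVICE_NAME": "api"})
```

Keeping the logic in the pure function means the container method and the processor stay thin wrappers, which is the property the tiering is meant to protect.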

External API clients and provider-backed Python sync live in the separate
`vendor-fabric` distribution. Agent workflow orchestration lives in the
separate `agentic-fabric` distribution.

Documentation: [extended-data.dev](https://extended-data.dev)

## Install

```bash
pip install extended-data
```

Development and documentation extras are available for contributors:

```bash
pip install "extended-data[dev]"
pip install "extended-data[docs]"
```

## Usage

```python
from extended_data import (
    DataFile,
    DataWorkflow,
    ExtendedData,
    ExtendedDict,
    InputProvider,
    Logging,
    decode_file,
)
from extended_data.primitives import (
    decode_json,
    encode_yaml,
    number_to_words,
    redact_sensitive_text,
)

# Logging and inputs, with console, file, and environment sources switched off.
logger = Logging(logger_name="example", enable_console=False, enable_file=False)
inputs = InputProvider(inputs={"SERVICE_NAME": "api"}, from_environment=False)

# Decode with a Tier 1 primitive, then merge through Tier 2 containers.
data = decode_json('{"service": {"name": "api"}}')
payload = ExtendedDict(data).deep_merge({"replicas": 3})
wrapped = ExtendedData(payload).merge({"owner": "platform"})

# Decode file content and compose a workflow.
decoded_file = decode_file('{"service": {"name": "worker"}}', suffix="json")
artifact = DataFile.decode("service:\n  name: api\n", suffix="yaml")
workflow = DataWorkflow.from_value(wrapped).transform("unhump").result()

logger.logged_statement("prepared workflow", json_data=workflow.as_builtin(), log_level="info")

assert inputs.inputs["SERVICE_NAME"] == "api"
assert wrapped.as_builtin()["owner"] == "platform"
assert decoded_file["service"]["name"].upper_first() == "Worker"
assert artifact.metadata["encoding"] == "yaml"
assert number_to_words(42) == "forty-two"
assert redact_sensitive_text("Authorization: Bearer raw_token") == "Authorization: [REDACTED]"
assert "replicas: 3" in encode_yaml(workflow.as_builtin())
```

The installed `extended-data` CLI exposes the Tier 3 decode, inspect, merge, and transform operations:

```bash
extended-data decode '{"service": {"name": "api"}}' --suffix json
extended-data decode --file config.yaml --output json
extended-data inspect --file config.yaml
extended-data merge config/base.yaml config/dev.yaml --output yaml
extended-data transform --file payload.json --step reconstruct --step unhump
```
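
To make the layered `merge` command concrete, here is a minimal standard-library sketch of a recursive merge in which later inputs override earlier ones. Both the override order and the list handling (replace, not concatenate) are assumptions made for illustration; the package's actual merge strategy may differ, and `deep_merge` below is a hypothetical helper, not the library function.

```python
from collections.abc import Mapping
from typing import Any


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict where nested mappings are merged key by key."""
    merged = dict(base)  # copy the top level so `base` is not mutated
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are mappings: recurse instead of replacing wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the later layer win (assumed behavior).
            merged[key] = value
    return merged


base = {"service": {"name": "api", "replicas": 1}, "owner": "platform"}
dev = {"service": {"replicas": 3}}
merged = deep_merge(base, dev)
```

Here `merged["service"]` keeps `name` from the base layer and takes `replicas` from the dev layer, while `base` itself is left unchanged.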

## Package Shape

```text
extended_data/
  containers/   Tier 2 ExtendedData root plus String/Dict/List/Tuple/Set containers
  inputs/       InputProvider and decorator-based input injection
  io/           Tier 3 file, import, export, and base64 processors
  logging/      structured lifecycle logging
  primitives/   Tier 1 pure functions and codecs
  workflows/    Tier 3 higher-order workflow composition
```

Tier 1 primitive names are explicit in this major version and live under
`extended_data.primitives`, not the package root. Use `bytes_to_string()` for
bytes-like coercion and `string_to_bool()`, `string_to_int()`,
`string_to_float()`, `string_to_path()`, `string_to_date()`,
`string_to_datetime()`, and `string_to_time()` for scalar string conversion.
Use `redact_sensitive_text()` and `redact_sensitive_data()` for diagnostic and
JSON-like payload redaction. Pass `values=[...]` when a caller knows specific
context values, such as resource IDs, emails, paths, or URLs, must be withheld
in addition to common secret fields.

Tier 2 containers inherit from standard Python collection primitives and expose
chainable data operations. `ExtendedData` is the polymorphic constructor for any
incoming value: `ExtendedData({"service": "api"})` is an `ExtendedDict`,
`ExtendedData(["api"])` is an `ExtendedList`, and `ExtendedData("api")` is an
`ExtendedString`; each of them also satisfies `isinstance(value, ExtendedData)`.
For example, `ExtendedString.decode_json()` promotes JSON into extended
containers, `ExtendedDict.reconstruct_special_types()` turns string scalars into
booleans/numbers/dates where safe, and `ExtendedList.first_non_empty()` returns
the first meaningful value without lowering the surrounding data boundary.
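
One way a common root can double as a polymorphic constructor is to dispatch in `__new__`. The sketch below uses only the standard library and hypothetical class names (`Shaped`, `ShapedDict`, `ShapedList`, `ShapedString`); it shows the general Python technique and makes no claim about how `ExtendedData` is implemented. Tuple and set shapes are left out for brevity.

```python
from typing import Any


class Shaped:
    """Hypothetical common root that returns a shape-specific subclass."""

    def __new__(cls, value: Any):
        if cls is Shaped:
            # Called on the root: pick the subclass that matches the value's shape.
            for builtin, target in _SHAPES.items():
                if isinstance(value, builtin):
                    return target.__new__(target, value)
            raise TypeError(f"unsupported shape: {type(value).__name__}")
        # Called on a subclass: defer to the builtin's own __new__.
        return super().__new__(cls, value)


class ShapedDict(Shaped, dict):
    pass


class ShapedList(Shaped, list):
    pass


class ShapedString(Shaped, str):
    pass


_SHAPES = {dict: ShapedDict, list: ShapedList, str: ShapedString}

values = [Shaped({"service": "api"}), Shaped(["api"]), Shaped("api")]
```

Because every result is an instance of a `Shaped` subclass, Python still runs the builtin `__init__` after `__new__` returns, so the dict and list are populated exactly once.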

Tier 3 processors keep structured data moving through explicit boundaries.
`DataFile` reads, decodes, tracks metadata, and exports structured files.
`DataWorkflow` layers reads, merges, transforms, writes, syncs, and provenance
into a single result object. `InputProvider` loads direct inputs and environment
data, and `Logging` provides structured lifecycle logging with stored-message
snapshots returned as extended containers.

The old `extended_data_types`, `directed_inputs_class`, and `lifecyclelogging`
package names are not shimmed. The removed `extended_data.connectors` and
`extended_data.secrets` namespaces are also not preserved. Clean-break import
failures are intentional so stale migrations are visible.
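
Because these imports fail at runtime, it can help to find them before running anything. The following is a minimal sketch using the standard-library `ast` module; `find_stale_imports` is a hypothetical helper, not something shipped with the package, and the module names are the ones listed in the paragraph above.

```python
import ast

# Package and namespace names that are no longer importable, per the text above.
REMOVED = (
    "extended_data_types",
    "directed_inputs_class",
    "lifecyclelogging",
    "extended_data.connectors",
    "extended_data.secrets",
)


def _is_removed(module: str) -> bool:
    return any(module == name or module.startswith(name + ".") for name in REMOVED)


def find_stale_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module) pairs for imports of removed names."""
    hits: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            if _is_removed(node.module):
                modules = [node.module]
            else:
                # Also catch `from extended_data import connectors`.
                modules = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        hits.extend((node.lineno, module) for module in modules if _is_removed(module))
    return sorted(hits)


sample = (
    "import os\n"
    "from extended_data import connectors\n"
    "import lifecyclelogging.handlers\n"
    "from extended_data.primitives import decode_json\n"
)
stale = find_stale_imports(sample)
```

Running the helper over each file in a codebase gives a list of lines to migrate before the import errors surface at runtime.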

## Local Development

```bash
uv sync --all-extras --dev
tox -e lint
tox -e typecheck
tox -e py311,py312,py313,py314
tox -e examples
tox -e docs
tox -e build
```
