Metadata-Version: 2.3
Name: transmog
Version: 1.0.6
Summary: A data transformation library for flattening complex nested structures into tabular formats while preserving hierarchical relationships
License: MIT
Keywords: csv,data-pipeline,data-processing,data-transformation,elt,etl,flattening,json,normalization,parquet,pyarrow
Author: Scott Draper
Author-email: admin@scottdraper.io
Requires-Python: >=3.9
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database :: Database Engines/Servers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Provides-Extra: data
Provides-Extra: dev
Provides-Extra: docs
Provides-Extra: minimal
Provides-Extra: test
Requires-Dist: bandit (>=1.7.5) ; extra == "dev"
Requires-Dist: furo (>=2024.1.29) ; extra == "docs"
Requires-Dist: interrogate (>=1.5) ; extra == "dev"
Requires-Dist: linkify-it-py (>=2) ; extra == "docs"
Requires-Dist: memory-profiler (>=0.60) ; extra == "dev"
Requires-Dist: mypy (>=0.9) ; extra == "dev"
Requires-Dist: myst-parser (>=2) ; extra == "docs"
Requires-Dist: orjson (>=3.8)
Requires-Dist: polars (>=0.20)
Requires-Dist: pre-commit (>=3.5) ; extra == "dev"
Requires-Dist: pyarrow (>=7)
Requires-Dist: pyproject-fmt (>=1.5.1) ; extra == "dev"
Requires-Dist: pyproject-parser (>=0.7) ; extra == "docs"
Requires-Dist: pytest (>=7) ; extra == "dev"
Requires-Dist: pytest (>=7) ; extra == "test"
Requires-Dist: pytest-benchmark (>=4) ; extra == "dev"
Requires-Dist: pytest-benchmark (>=4) ; extra == "test"
Requires-Dist: pytest-cov (>=3) ; extra == "dev"
Requires-Dist: pytest-cov (>=3) ; extra == "test"
Requires-Dist: ruff (>=0.3) ; extra == "dev"
Requires-Dist: safety (>=2.3.5) ; extra == "dev"
Requires-Dist: sphinx (>=7.2) ; extra == "docs"
Requires-Dist: sphinx-autobuild (>=2021.3.14) ; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints (>=1.24) ; extra == "docs"
Requires-Dist: sphinx-copybutton (>=0.5) ; extra == "docs"
Requires-Dist: sphinx-design (>=0.5) ; extra == "docs"
Requires-Dist: sphinx-rtd-theme (>=2) ; extra == "docs"
Requires-Dist: sphinxcontrib-applehelp (>=1) ; extra == "docs"
Requires-Dist: sphinxcontrib-htmlhelp (>=2) ; extra == "docs"
Requires-Dist: sphinxcontrib-jsmath (>=1) ; extra == "docs"
Requires-Dist: sphinxcontrib-mermaid (>=0.8.1) ; extra == "docs"
Requires-Dist: sphinxcontrib-napoleon (>=0.7) ; extra == "docs"
Requires-Dist: sphinxcontrib-qthelp (>=1) ; extra == "docs"
Requires-Dist: sphinxcontrib-serializinghtml (>=1) ; extra == "docs"
Requires-Dist: types-pyyaml ; extra == "dev"
Requires-Dist: types-toml ; extra == "dev"
Requires-Dist: typing-extensions (>=4)
Project-URL: Bug Tracker, https://github.com/scottdraper8/transmog/issues
Project-URL: Documentation, https://scottdraper8.github.io/transmog/
Project-URL: Homepage, https://github.com/scottdraper8/transmog
Description-Content-Type: text/markdown

# Transmog

[![PyPI version](https://img.shields.io/pypi/v/transmog.svg?logo=pypi)](https://pypi.org/project/transmog/)
[![Python versions](https://img.shields.io/badge/python-3.9%2B-blue?logo=python)](https://pypi.org/project/transmog/)
[![License](https://img.shields.io/github/license/scottdraper8/transmog.svg?logo=github)](https://github.com/scottdraper8/transmog/blob/main/LICENSE)

A Python library for transforming complex nested data structures into flat,
tabular formats while preserving hierarchical relationships.

## Features

- **Multiple Input Formats**: JSON, JSONL, CSV
- **Nested Structure Handling**: Flattens deeply nested objects with customizable separators
- **Array Processing**: Extracts arrays as child tables with parent-child relationships maintained
- **Output Options**: Python dictionaries, PyArrow tables, JSON, CSV, Parquet
- **Performance Features**: Chunked processing, streaming output, memory optimization
- **Data Integrity**: Deterministic ID generation, consistent parent-child linking
- **Error Recovery**: Configurable strategies for handling malformed data
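
To make the flattening, child-table, and deterministic-ID ideas above concrete, here is a minimal sketch in plain Python. It is not Transmog's implementation: the `flatten` and `deterministic_id` helpers, the `_id` and `_parent_id` field names, and the use of `uuid5` over the record's JSON are all assumptions chosen for illustration.

```python
import json
import uuid


def deterministic_id(record):
    """Derive a stable ID from the record's content (illustrative only)."""
    payload = json.dumps(record, sort_keys=True, default=str)
    return str(uuid.uuid5(uuid.NAMESPACE_URL, payload))


def flatten(record, table="main", sep="_", parent_id=None, tables=None):
    """Flatten `record` into `tables`, a dict of table name -> list of rows."""
    if tables is None:
        tables = {}
    row = {"_id": deterministic_id(record)}
    if parent_id is not None:
        row["_parent_id"] = parent_id
    tables.setdefault(table, []).append(row)

    def walk(value, path):
        for key, child in value.items():
            name = f"{path}{sep}{key}" if path else key
            if isinstance(child, dict):
                # Nested object: keep extending the column name.
                walk(child, name)
            elif isinstance(child, list):
                # Array: each item becomes a row in a child table
                # that points back at this row through _parent_id.
                child_table = name if table == "main" else f"{table}{sep}{name}"
                for item in child:
                    item = item if isinstance(item, dict) else {"value": item}
                    flatten(item, child_table, sep, row["_id"], tables)
            else:
                row[name] = child

    walk(record, "")
    return tables


data = {
    "user": {
        "id": 1,
        "name": "John Doe",
        "contact": {"email": "john@example.com"},
        "orders": [
            {"id": 101, "amount": 99.99},
            {"id": 102, "amount": 45.50},
        ],
    }
}

tables = flatten(data)
print(sorted(tables))  # ['main', 'user_orders']
```

Because the ID in this sketch is a hash of the record's content, re-running it on the same input yields the same IDs, while two identical records would collide; a real implementation has to decide how to handle that case.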

## Installation

```bash
pip install transmog
```

Optional dependencies:

```bash
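pip install "transmog[dev]"   # Development tools
pip install "transmog[test]"  # Test dependencies
pip install "transmog[docs]"  # Documentation build tools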
```

## Quick Example

```python
import transmog as tm

# Sample nested data
data = {
    "user": {
        "id": 1,
        "name": "John Doe",
        "contact": {
            "email": "john@example.com"
        },
        "orders": [
            {"id": 101, "amount": 99.99},
            {"id": 102, "amount": 45.50}
        ]
    }
}

# Process the data
processor = tm.Processor()
result = processor.process(data)

# Access the data (arrays are extracted into child tables)
tables = result.to_dict()
main_table = tables["main"]
orders = tables["user_orders"]  # child table built from the user.orders array

# Export to different formats
result.write_all_json("output/json")
result.write_all_csv("output/csv")
result.write_all_parquet("output/parquet")
```

## Configuration

```python
# Use pre-configured modes
config = tm.TransmogConfig.memory_optimized()
# or
config = tm.TransmogConfig.performance_optimized()

# Custom configuration
config = (
    tm.TransmogConfig.default()
    .with_naming(separator=".")
    .with_processing(cast_to_string=True)
    .with_metadata(id_field="custom_id")
    .with_error_handling(max_retries=3)
)

processor = tm.Processor(config=config)
```

## Large Dataset Processing

```python
# Memory-optimized processing
processor = tm.Processor.memory_optimized()

# Chunked processing
result = processor.process_chunked(
    "large_data.jsonl",
    entity_name="records",
    chunk_size=1000
)

# Streaming output
processor.stream_process_file(
    "large_data.jsonl",
    entity_name="records",
    output_format="parquet",
    output_destination="output_dir"
)
```
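
The idea behind chunked processing can be sketched without the library: read a JSONL source lazily and hand over a bounded number of records at a time, so memory use depends on the chunk size and not on the file size. The `iter_chunks` helper below is hypothetical and is not part of Transmog's API.

```python
import io
import json
from itertools import islice


def iter_chunks(lines, chunk_size):
    """Yield lists of up to `chunk_size` parsed JSONL records."""
    # Generator expression: lines are parsed only as chunks are requested.
    records = (json.loads(line) for line in lines if line.strip())
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


# In-memory stand-in for a large JSONL file.
source = io.StringIO("\n".join(json.dumps({"n": i}) for i in range(5)))

chunk_sizes = [len(chunk) for chunk in iter_chunks(source, chunk_size=2)]
print(chunk_sizes)  # [2, 2, 1]
```

A smaller chunk size lowers peak memory at the cost of more per-chunk overhead.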

## Error Handling

```python
# Skip and log errors
processor = tm.Processor().with_error_handling(recovery_strategy="skip")

# Partial recovery (preserves valid portions)
processor = tm.Processor.with_partial_recovery()
```
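
As a rough illustration of what a "skip" strategy means, the plain-Python sketch below parses JSONL lines and either re-raises on the first malformed line or logs it and moves on. The `parse_records` helper and the `"strict"` strategy name are assumptions made for this example; they are not taken from Transmog's API.

```python
import json
import logging

logger = logging.getLogger("recovery_sketch")


def parse_records(lines, recovery_strategy="strict"):
    """Parse JSONL lines; with "skip", log and drop lines that fail to parse."""
    records = []
    for number, line in enumerate(lines, start=1):
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            if recovery_strategy != "skip":
                raise  # hypothetical "strict" mode: fail on the first bad line
            logger.warning("Skipping line %d: %s", number, exc)
    return records


lines = ['{"id": 1}', '{"id": 2', '{"id": 3}']  # the second line is truncated
good = parse_records(lines, recovery_strategy="skip")
print([record["id"] for record in good])  # [1, 3]
```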

## Documentation

- [Installation Guide](https://scottdraper8.github.io/transmog/installation.html)
- [Getting Started](https://scottdraper8.github.io/transmog/getting-started.html)
- [Configuration Guide](https://scottdraper8.github.io/transmog/configuration.html)
- [API Reference](https://scottdraper8.github.io/transmog/api/index.html)
- [Examples](https://github.com/scottdraper8/transmog/blob/main/examples/README.md)

## License

MIT License

