Metadata-Version: 2.3
Name: toon-formatter
Version: 1.0.1
Summary: A comprehensive Python library for encoding, decoding, and converting data using the Token-Oriented Object Notation (TOON) format - optimized for LLM contexts and human readability
License: MIT
Keywords: toon,serialization,notation,llm,json,csv,xml,converter
Author: Juan Manuel Panozzo Zenere
Author-email: juanmanuel.panozzozenere@alumnos.uai.edu.ar
Requires-Python: >=3.10
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing
Description-Content-Type: text/markdown

# TOON - Token-Oriented Object Notation

A Python library for working with the TOON (Token-Oriented Object Notation) format, a compact, human-readable serialization format optimized for Large Language Model (LLM) contexts.

## Features

- ✅ **TOON Encoder & Decoder** - Convert Python data structures to/from TOON format
- ✅ **JSON Conversion** - Transform between JSON and TOON
- ✅ **CSV Conversion** - Convert tabular CSV to/from TOON
- ✅ **XML Conversion** - Transform between XML and TOON
- ✅ **Zero External Dependencies** - Everything built from scratch
- ✅ **TDD** - 146 tests with 100% pass rate
- ✅ **Clean Code** - Modular, well-documented code
- ✅ **Type-Safe** - Complete type hints for better developer experience

## Installation

```bash
pip install toon-formatter
```

Or with Poetry:

```bash
poetry add toon-formatter
```

## Quick Start

### Basic Encoder and Decoder

```python
from toon import ToonEncoder, ToonDecoder

# Encode Python data to TOON
encoder = ToonEncoder()
data = {
    "id": 123,
    "name": "Ada",
    "tags": ["admin", "user"],
    "active": True
}
toon_str = encoder.encode(data)
print(toon_str)
# Output:
# id: 123
# name: Ada
# tags[2]: admin,user
# active: true

# Decode TOON back to Python
decoder = ToonDecoder()
decoded_data = decoder.decode(toon_str)
print(decoded_data)
# Output: {'id': 123, 'name': 'Ada', 'tags': ['admin', 'user'], 'active': True}
```

### Tabular Arrays

```python
from toon import ToonEncoder

encoder = ToonEncoder()
data = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5}
]
toon_str = encoder.encode(data)
print(toon_str)
# Output:
# [2]{sku,qty,price}:
#   A1,2,9.99
#   B2,1,14.5
```
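
To make the tabular layout concrete, here is a minimal standard-library sketch of how a uniform list of dicts can be rendered as a header line plus one row per item. `encode_tabular` is a hypothetical helper written for illustration, not the library's implementation; it assumes every row has the same keys and that values are numbers or plain strings needing no quoting.

```python
def encode_tabular(rows: list[dict], indent: int = 2) -> str:
    """Hypothetical helper: render uniform dicts in a TOON-style tabular layout."""
    if not rows:
        raise ValueError("this sketch needs at least one row")
    fields = list(rows[0])
    # The tabular layout only makes sense when every row has the same keys.
    if any(list(row) != fields for row in rows):
        raise ValueError("rows must share the same keys in the same order")
    # Header: item count in brackets, field names in braces.
    header = "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    pad = " " * indent
    # Assumption: values are numbers or plain strings (no quoting or escaping).
    lines = [pad + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


rows = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]
print(encode_tabular(rows))
```

The field names appear once in the header instead of once per row, which is where the format saves space on tabular data.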

### JSON ↔ TOON Conversion

```python
from toon import ToonConverter

converter = ToonConverter()

# JSON to TOON
json_str = '{"id": 1, "name": "Alice", "roles": ["admin", "user"]}'
toon = converter.json_to_toon(json_str)
print(toon)
# Output:
# id: 1
# name: Alice
# roles[2]: admin,user

# TOON to JSON
json_result = converter.toon_to_json(toon, indent=2)
print(json_result)
```

### CSV ↔ TOON Conversion

```python
from toon import ToonConverter

converter = ToonConverter()

# CSV to TOON
csv_data = """id,name,active
1,Ada,true
2,Bob,false"""

toon = converter.csv_to_toon(csv_data)
print(toon)
# Output:
# [2]{id,name,active}:
#   1,Ada,true
#   2,Bob,false

# TOON to CSV
csv_result = converter.toon_to_csv(toon)
print(csv_result)
```
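
The reverse direction can be pictured with the standard `csv` module. The sketch below is an illustration under stated assumptions, not how `toon_to_csv` is implemented: `tabular_toon_to_csv` is a hypothetical name, and it only handles a comma-delimited tabular block whose cells contain no quoted or escaped values.

```python
import csv
import io
import re

# Matches a header such as "[2]{id,name,active}:"
HEADER = re.compile(r"^\[(\d+)\]\{([^}]*)\}:$")


def tabular_toon_to_csv(toon: str) -> str:
    """Hypothetical helper: turn a simple tabular TOON block into CSV text."""
    lines = toon.strip().splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("not a comma-delimited tabular TOON block")
    declared = int(match.group(1))
    fields = match.group(2).split(",")
    # Assumption: cells never contain commas, so a plain split is enough.
    rows = [line.strip().split(",") for line in lines[1:]]
    if len(rows) != declared:
        raise ValueError(f"header declares {declared} rows, found {len(rows)}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(rows)
    return buffer.getvalue()


print(tabular_toon_to_csv("[2]{id,name,active}:\n  1,Ada,true\n  2,Bob,false"))
```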

### XML ↔ TOON Conversion

```python
from toon import ToonConverter

converter = ToonConverter()

# XML to TOON
xml_data = """<users>
    <user>
        <id>1</id>
        <name>Ada</name>
    </user>
</users>"""

toon = converter.xml_to_toon(xml_data)
print(toon)

# TOON to XML
xml_result = converter.toon_to_xml(toon, root_name="users")
print(xml_result)
```

## Configuration Options

### Encoder

```python
from toon import ToonEncoder

# Customize indentation, delimiter, and length marker
encoder = ToonEncoder(
    indent=4,              # Spaces per level (default: 2)
    delimiter="|",         # Delimiter: ",", "\t", or "|" (default: ",")
    length_marker=True     # Include # marker (default: False)
)

data = [1, 2, 3]
print(encoder.encode(data))
# Output: [#3|]: 1|2|3
```

### Decoder

```python
from toon import ToonDecoder

# Strict or permissive mode
decoder = ToonDecoder(
    indent=2,      # Expected spaces per level (default: 2)
    strict=True    # Strict validation mode (default: True)
)
```
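
This README does not list what strict mode checks. One check a strict decoder could plausibly perform is comparing the declared array length with the number of items actually present. The sketch below illustrates that idea only; `parse_inline_array` is a hypothetical function, not part of the `toon` API, and it handles just comma-delimited inline arrays.

```python
import re

# Matches an inline array such as "[3]: 1,2,3"
INLINE = re.compile(r"^\[(\d+)\]:\s*(.*)$")


def parse_inline_array(line: str, strict: bool = True) -> list[str]:
    """Hypothetical helper: split an inline array and optionally verify its length."""
    match = INLINE.match(line)
    if match is None:
        raise ValueError(f"not an inline array: {line!r}")
    declared = int(match.group(1))
    body = match.group(2)
    items = body.split(",") if body else []
    # In strict mode a mismatch between header and content is an error;
    # in permissive mode the items are returned as found.
    if strict and len(items) != declared:
        raise ValueError(f"declared {declared} items, found {len(items)}")
    return items


print(parse_inline_array("[3]: 1,2,3"))
print(parse_inline_array("[3]: 1,2", strict=False))
```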

## TOON Format Features

### Primitives
```python
from toon import ToonEncoder

encoder = ToonEncoder()

# Numbers, strings, booleans, and null
encoder.encode(42)        # → "42"
encoder.encode("hello")   # → "hello"
encoder.encode(True)      # → "true"
encoder.encode(None)      # → "null"
```
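
The mapping from Python primitives to the text shown above can be sketched in a few lines. `format_primitive` is a hypothetical helper for illustration, not the library's code, and it ignores the quoting rules a full encoder needs for strings containing delimiters or special characters.

```python
def format_primitive(value) -> str:
    """Hypothetical helper: map a Python primitive to its TOON-style text."""
    if value is None:
        return "null"
    # bool must be tested before int, because isinstance(True, int) is True.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return value  # Assumption: the string needs no quoting.
    raise TypeError(f"not a primitive: {type(value).__name__}")


for sample in (42, "hello", True, None):
    print(format_primitive(sample))
```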

### Objects
```python
from toon import ToonEncoder

# Simple and nested objects
data = {
    "user": {
        "id": 1,
        "name": "Ada"
    }
}
print(ToonEncoder().encode(data))
# Output:
# user:
#   id: 1
#   name: Ada
```
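
Nesting is expressed through indentation alone, which a short recursive function can reproduce. This is a sketch under stated assumptions rather than the library's encoder: `encode_object` is a hypothetical name, and it only handles nested dicts whose leaf values are integers or plain strings.

```python
def encode_object(obj: dict, indent: int = 2, level: int = 0) -> str:
    """Hypothetical helper: render nested dicts as indented key/value lines."""
    pad = " " * (indent * level)
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            # A nested object gets a bare "key:" line, then its own
            # fields one indentation level deeper.
            lines.append(f"{pad}{key}:")
            if value:
                lines.append(encode_object(value, indent, level + 1))
        else:
            # Assumption: leaf values are ints or plain strings.
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


print(encode_object({"user": {"id": 1, "name": "Ada"}}))
```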

### Primitive Arrays (Inline)
```python
[1, 2, 3, 4, 5]
# → [5]: 1,2,3,4,5
```

### Object Arrays (Tabular)
```python
[
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Bob"}
]
# → [2]{id,name}:
#     1,Ada
#     2,Bob
```

### Mixed Arrays (Expanded)
```python
[1, {"a": 1}, "text"]
# → [3]:
#     - 1
#     - a: 1
#     - text
```
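
The three array layouts above (inline, tabular, expanded) suggest a simple decision rule based on the items' types. The sketch below is one plausible reading of that rule, not the library's actual logic; `array_layout` and `PRIMITIVES` are hypothetical names introduced for illustration.

```python
PRIMITIVES = (str, int, float, bool, type(None))


def array_layout(items: list) -> str:
    """Hypothetical helper: guess which layout fits a list."""
    # Only primitives: everything fits on one line.
    if all(isinstance(item, PRIMITIVES) for item in items):
        return "inline"
    # Only dicts with identical keys and primitive values: one row per item.
    if all(isinstance(item, dict) for item in items):
        keys = set(items[0])
        uniform = all(set(item) == keys for item in items)
        flat = all(
            isinstance(value, PRIMITIVES)
            for item in items
            for value in item.values()
        )
        if uniform and flat:
            return "tabular"
    # Anything else falls back to one "- item" entry per element.
    return "expanded"


print(array_layout([1, 2, 3, 4, 5]))
print(array_layout([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]))
print(array_layout([1, {"a": 1}, "text"]))
```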

### Alternative Delimiters

```python
from toon import ToonEncoder

# Tab-delimited
encoder = ToonEncoder(delimiter="\t")
encoder.encode([1, 2, 3])
# → [3	]: 1	2	3

# Pipe-delimited
encoder = ToonEncoder(delimiter="|")
encoder.encode([1, 2, 3])
# → [3|]: 1|2|3
```
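
The outputs in this README show a pattern: a non-default delimiter is repeated inside the brackets, and the optional length marker places `#` before the count. The sketch below reproduces that pattern for inline arrays. `encode_inline` is a hypothetical function written for illustration, not the library's encoder, and it assumes values need no quoting.

```python
def encode_inline(values: list, delimiter: str = ",", length_marker: bool = False) -> str:
    """Hypothetical helper: render an inline array with a chosen delimiter."""
    if delimiter not in (",", "\t", "|"):
        raise ValueError("delimiter must be ',', '\\t', or '|'")
    marker = "#" if length_marker else ""
    # The comma is the default, so it is not repeated inside the brackets.
    suffix = "" if delimiter == "," else delimiter
    body = delimiter.join(str(value) for value in values)
    return f"[{marker}{len(values)}{suffix}]: {body}"


print(encode_inline([1, 2, 3]))
print(encode_inline([1, 2, 3], delimiter="|"))
print(encode_inline([1, 2, 3], delimiter="|", length_marker=True))
```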

## Project Structure

```
toon/
├── src/
│   └── toon/
│       ├── __init__.py      # Main exports
│       ├── encoder.py       # ToonEncoder
│       ├── decoder.py       # ToonDecoder, ToonDecodeError
│       └── converter.py     # ToonConverter (JSON/CSV/XML)
├── tests/
│   ├── test_encoder.py      # Encoder tests (57 tests)
│   ├── test_decoder.py      # Decoder tests (56 tests)
│   └── test_converter.py    # Converter tests (33 tests)
├── examples.py              # Usage examples
├── pyproject.toml
├── README.md
└── SPEC.md                  # Complete TOON specification
```

## Testing

```bash
# Run all tests
poetry run pytest

# Tests with coverage
poetry run pytest --cov=toon tests/

# Specific tests
poetry run pytest tests/test_encoder.py
poetry run pytest tests/test_decoder.py -v
```

## Running Examples

```bash
# Run the examples file to see TOON in action
poetry run python examples.py
```

## Command-Line Interface (CLI)

The package also provides a `toon` command-line tool for converting, validating, and inspecting files:

```bash
# Convert between formats
poetry run toon convert data.json data.toon --from json --to toon
poetry run toon convert inventory.csv inventory.toon --from csv --to toon

# Validate TOON files
poetry run toon validate data.toon

# Format/pretty-print TOON files
poetry run toon format data.toon

# Optimize for minimum token count
poetry run toon minify data.toon

# Show file information
poetry run toon info data.toon --verbose

# Compare TOON files
poetry run toon diff file1.toon file2.toon --semantic
```

**Available Commands:**
- `convert` - Convert between JSON/CSV/XML/TOON formats
- `validate` - Validate TOON file syntax
- `format` - Format/pretty-print TOON files
- `minify` - Optimize for minimum token count
- `info` - Show detailed file information
- `diff` - Compare two TOON files

See [CLI.md](CLI.md) for complete CLI documentation and examples.

## Benefits of TOON

1. **Token Reduction**: 30-60% fewer tokens than JSON for tabular data
2. **Readability**: Indentation-based layout with minimal punctuation
3. **Deterministic**: Consistent and predictable encoding
4. **Strict Validation**: Strict mode to ensure data integrity
5. **Interoperability**: Easy conversion between JSON, CSV, and XML
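
The size claim can be checked roughly with the standard library alone. The sketch below compares character counts of compact JSON against a hand-rolled tabular rendering. Character count is only a coarse proxy for token count, which depends on the target model's tokenizer, and `cell`, `tabular_toon`, and `size_ratio` are hypothetical helpers rather than library functions.

```python
import json


def cell(value) -> str:
    # Strings are written bare; json.dumps gives "true"/"false"/"null" and numbers.
    return value if isinstance(value, str) else json.dumps(value)


def tabular_toon(rows: list[dict]) -> str:
    """Hypothetical helper: tabular rendering of uniform rows (no quoting)."""
    fields = list(rows[0])
    header = "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = ["  " + ",".join(cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


def size_ratio(rows: list[dict]) -> float:
    """Characters in the tabular form divided by characters in compact JSON."""
    as_json = json.dumps(rows, separators=(",", ":"))
    return len(tabular_toon(rows)) / len(as_json)


rows = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(1, 101)]
print(f"TOON/JSON character ratio: {size_ratio(rows):.2f}")
```

A ratio below 1 means the tabular form is shorter. For a real measurement, replace `len` with a token count from the tokenizer of the model you target.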

## Use Cases

- 📊 Efficient serialization of tabular data for LLMs
- 🔄 Data format transformation (JSON ↔ CSV ↔ XML ↔ TOON)
- 💾 Compact storage of configurations
- 📡 Transmission of structured data
- 🤖 Prompt contexts for language models

## License

MIT

## Specification

For complete details about the TOON format, see [SPEC.md](SPEC.md).

## Development

Developed following **TDD** (Test-Driven Development) and **Clean Code** principles:

- ✅ 146 tests passing
- ✅ Complete edge case coverage
- ✅ Type hints throughout the codebase
- ✅ Comprehensive documentation
- ✅ Zero external dependencies
- ✅ Modular and maintainable code

## Author

**Juan Manuel Panozzo Zenere**  
Email: juanmanuel.panozzozenere@alumnos.uai.edu.ar

## Acknowledgments

TOON specification by Johann Schopplich ([@johannschopplich](https://github.com/johannschopplich))


