Metadata-Version: 2.4
Name: toon-cli
Version: 0.1.2
Summary: A production-grade CLI for converting JSON to TOON and back.
Author: Areeb
License: MIT
License-File: LICENSE
Keywords: cli,json,llm,serialization,token-optimization
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.11
Requires-Dist: pydantic<3.0,>=2.7
Requires-Dist: rich<14.0,>=13.7
Requires-Dist: tiktoken<1.0,>=0.7
Requires-Dist: typer<1.0,>=0.12
Provides-Extra: dev
Requires-Dist: pytest-cov<6.0,>=5.0; extra == 'dev'
Requires-Dist: pytest<9.0,>=8.2; extra == 'dev'
Requires-Dist: ruff<1.0,>=0.6; extra == 'dev'
Description-Content-Type: text/markdown

# toon-cli

`toon-cli` is a production-grade Python CLI for converting JSON into TOON (Token-Oriented Object Notation) and back. TOON is designed to stay human-readable while reducing tokenizer cost for LLM workflows through indentation-based structure and repeated-key compression.

## Why TOON

TOON aims to preserve JSON semantics while cutting token overhead. The format and the CLI together offer:

- Lossless round-trips between JSON and TOON
- Compact table notation for arrays of similarly shaped objects
- Token-aware benchmarking using `tiktoken`
- Strategy comparison against minified JSON baselines
- Path-level token attribution to show where payload weight lives
- Optional lossy transforms for prompt-oriented payload trimming
- Readable, diff-friendly text for prompt and context pipelines

Example JSON:

```json
{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "role": "admin"
    },
    {
      "id": 2,
      "name": "Bob",
      "role": "user"
    }
  ]
}
```

Example TOON:

```text
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
```
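To make the mapping between the two examples concrete, here is a minimal Python sketch of the table notation. The `encode_table` helper is hypothetical and is not part of the `toon-cli` API; it assumes every row has the same keys in the same order and that values are simple scalars containing no commas.

```python
import json


def encode_table(key, rows):
    """Render a list of same-shaped dicts as a TOON-style table (illustrative only)."""
    if not rows:
        raise ValueError("need at least one row")
    fields = list(rows[0])
    for row in rows:
        if list(row) != fields:
            raise ValueError("rows must share the same keys in the same order")
    # Header: key, row count in brackets, then the shared field names in braces.
    field_list = ",".join(fields)
    lines = [f"{key}[{len(rows)}]{{{field_list}}}:"]
    # Each row carries only values, so the keys are written once instead of per object.
    for row in rows:
        lines.append("  " + ",".join(str(row[name]) for name in fields))
    return "\n".join(lines)


data = json.loads(
    '{"users": [{"id": 1, "name": "Alice", "role": "admin"},'
    ' {"id": 2, "name": "Bob", "role": "user"}]}'
)
print(encode_table("users", data["users"]))
```

Run against the JSON example above, the sketch prints the same three lines as the TOON example. Quoting, booleans, nulls, and nested values are deliberately left out.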

## Installation

From PyPI:

```bash
pip install toon-cli
```

After installation, confirm the command is available:

```bash
toon --help
```

## CLI

```bash
toon encode input.json
toon decode input.toon
toon benchmark input.json
toon optimize input.json
toon validate input.toon
toon stats input.json
```

Common options:

- `--output / -o` write to a file instead of stdout
- `--pretty / --minified` choose human-readable or compact output
- `--model` select a tokenizer model for token counting
- `--alias-keys` replace repeated keys with short aliases in TOON output
- `--sparse-tables` compress arrays of scalar-valued objects with irregular keys into table form
- `--drop-nulls`, `--drop-empty`, `--drop-keys`, `--max-string-length`, and `--top-k` apply optional lossy transforms during optimization

## Examples

Encode JSON into TOON:

```bash
toon encode samples/users.json --output samples/users.toon
```

Decode TOON back into JSON:

```bash
toon decode samples/users.toon --pretty
```

Benchmark savings:

```bash
toon benchmark samples/users.json
```
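For intuition about what a comparison against a minified JSON baseline measures, the following Python sketch computes a savings percentage. It is not the CLI's implementation: `savings` and `rough_token_count` are hypothetical names, and the default counter is a crude regex stand-in so the example runs without a tokenizer installed.

```python
import json
import re


def rough_token_count(text):
    # Assumption: a stand-in for a real tokenizer. It counts word runs and
    # individual punctuation characters, which only approximates BPE behavior.
    return len(re.findall(r"\w+|[^\w\s]", text))


def savings(json_text, toon_text, count=rough_token_count):
    """Compare a TOON document against the minified form of the same JSON."""
    baseline = json.dumps(json.loads(json_text), separators=(",", ":"))
    base_tokens = count(baseline)
    toon_tokens = count(toon_text)
    saved_pct = round(100 * (base_tokens - toon_tokens) / base_tokens, 1)
    return {"json_tokens": base_tokens, "toon_tokens": toon_tokens, "saved_pct": saved_pct}


json_text = (
    '{"users": [{"id": 1, "name": "Alice", "role": "admin"},'
    ' {"id": 2, "name": "Bob", "role": "user"}]}'
)
toon_text = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"
print(savings(json_text, toon_text))
```

With `tiktoken` installed, a real counter can be passed instead, for example `count=lambda s: len(tiktoken.get_encoding("cl100k_base").encode(s))`. The numbers then depend on the chosen encoding.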

Pick the lowest-token strategy and emit it:

```bash
toon optimize samples/catalog.json --alias-keys --drop-nulls --drop-empty
```

Validate a TOON file:

```bash
toon validate samples/users.toon
```

Inspect stats:

```bash
toon stats samples/catalog.json
```

## Safety Notes

`toon-cli` is a local file-processing tool. In the current codebase:

- It reads local JSON or TOON files that you choose
- It writes output only to the path you pass or to predictable default filenames beside the input
- It does not send your file contents over the network
- It does not execute code from the input files it reads
- It rejects malformed TOON syntax instead of guessing
- It refuses to overwrite the input file when `toon optimize --output` points to that same file

Use the lossy optimization flags carefully, because they intentionally modify data:

- `--drop-nulls`
- `--drop-empty`
- `--drop-keys`
- `--top-k`
- `--max-string-length`

These are useful for prompt compression, but not for archival or exact data preservation.
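The following Python sketch shows why these flags are lossy. The `trim` function is hypothetical and only approximates what such transforms might do; it assumes that "empty" means an empty string, list, or object, and it leaves out `--top-k` because its exact semantics are not described here.

```python
def trim(value, *, drop_nulls=False, drop_empty=False,
         drop_keys=frozenset(), max_string_length=None):
    """Recursively apply lossy transforms to parsed JSON (illustrative only)."""
    opts = dict(drop_nulls=drop_nulls, drop_empty=drop_empty,
                drop_keys=drop_keys, max_string_length=max_string_length)
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if key in drop_keys:
                continue
            item = trim(item, **opts)
            if drop_nulls and item is None:
                continue
            # Assumption: empty strings, lists, and objects all count as "empty".
            if drop_empty and isinstance(item, (str, list, dict)) and len(item) == 0:
                continue
            out[key] = item
        return out
    if isinstance(value, list):
        # List items are transformed but never removed in this sketch.
        return [trim(item, **opts) for item in value]
    if isinstance(value, str) and max_string_length is not None:
        return value[:max_string_length]
    return value


record = {"id": 7, "note": None, "tags": [], "bio": "x" * 50,
          "secret": "abc", "nested": {"a": None, "b": 0}}
print(trim(record, drop_nulls=True, drop_empty=True,
           drop_keys={"secret"}, max_string_length=10))
```

The result no longer contains `note`, `tags`, or `secret`, and `bio` is cut to ten characters, so the original record cannot be reconstructed from it.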

## TOON Syntax Overview

TOON supports all JSON types:

- Strings
- Numbers
- Booleans
- Null
- Arrays
- Objects

Core syntax:

```text
name: Alice
active: true
count: 3
meta:
  created_by: system
tags[3]:
  - alpha
  - beta
  - gamma
```

Table notation for arrays of same-shaped objects, which writes the repeated keys once:

```text
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
```

Keys and strings are quoted only when needed, and quoted values follow JSON escaping rules.
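A quoting rule of this kind can be sketched in Python with `json.dumps`, which produces a JSON-escaped, double-quoted string. The conditions for when a string needs quotes are assumptions made for illustration, and `format_string` is a hypothetical helper; the actual rules used by `toon-cli` may differ.

```python
import json
import re

# Assumption: a bare string starts with a letter or underscore and contains
# only letters, digits, underscores, spaces, dots, and hyphens.
_PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_ .-]*")
# Assumption: these words would otherwise be read back as non-string values.
_RESERVED = {"true", "false", "null"}


def format_string(text):
    """Return text bare when unambiguous, otherwise as a JSON-escaped string."""
    needs_quotes = (
        _PLAIN.fullmatch(text) is None   # delimiters, digits first, or empty
        or text != text.strip()          # leading or trailing whitespace
        or text in _RESERVED             # would collide with a literal
    )
    return json.dumps(text) if needs_quotes else text


for sample in ["Alice", "a,b", "true", "42", 'say "hi"']:
    print(format_string(sample))
```

Because the quoted form is plain JSON, `json.loads` recovers the original string from it, which is what keeps a round-trip lossless for the quoted cases.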

## Project Layout

```text
toon/
|- benchmark/
|- cli/
|- parser/
|- serializer/
|- tokenizer/
`- utils/
```

## Developer Commands

```bash
pytest
ruff check .
```

## Packaging

The project is ready for PyPI publishing through standard `build` or `hatch` workflows:

```bash
python -m build
```

## Design Notes

- Encoder and decoder are fully typed
- Parser rejects malformed indentation, headers, and row widths
- Schema inference discovers table-friendly arrays of objects
- Streaming parsing is supported from line iterables
- Token-aware optimization can choose compact table layouts when they improve tokenizer efficiency
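As an illustration of row-width validation and streaming from a line iterable, the Python sketch below reads one table block lazily. The `iter_table_rows` generator is hypothetical and much simpler than a real parser: it assumes a single table with two-space indentation, keeps every cell as a string, and does not handle quoted cells.

```python
import re
from typing import Iterable, Iterator

# Matches headers such as users[2]{id,name,role}:
_HEADER = re.compile(r"(\w+)\[(\d+)\]\{([^}]*)\}:")


def iter_table_rows(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per table row, validating widths as lines arrive."""
    it = iter(lines)
    header = next(it, None)
    match = _HEADER.fullmatch(header.rstrip("\n")) if header is not None else None
    if match is None:
        raise ValueError("expected a header like name[N]{a,b}:")
    declared = int(match.group(2))
    fields = match.group(3).split(",")
    seen = 0
    for number, line in enumerate(it, start=2):
        if not line.strip():
            continue  # ignore blank lines
        if not line.startswith("  "):
            raise ValueError(f"line {number}: row must be indented by two spaces")
        cells = line.strip().split(",")
        if len(cells) != len(fields):
            raise ValueError(
                f"line {number}: expected {len(fields)} cells, got {len(cells)}"
            )
        seen += 1
        yield dict(zip(fields, cells))
    if seen != declared:
        raise ValueError(f"header declared {declared} rows, found {seen}")


source = ["users[2]{id,name,role}:", "  1,Alice,admin", "  2,Bob,user"]
for row in iter_table_rows(source):
    print(row)
```

Because the function is a generator over any iterable of lines, it can be fed an open file object directly and never needs the whole document in memory. Errors surface when the offending line is reached, not up front.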
