Metadata-Version: 2.4
Name: slotloss
Version: 0.2.0
Summary: Per-grammar-role loss decomposition for fine-tuned structured JSON output
Author-email: Breck Baldwin <breckbaldwin@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/breckbaldwin/slotloss
Project-URL: Paper, https://arxiv.org/abs/XXXX.XXXXX
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0
Requires-Dist: transformers>=4.40
Provides-Extra: peft
Requires-Dist: peft>=0.10; extra == "peft"

# slotloss

Per-grammar-role loss analysis for structured JSON output from language models.

**Fine-tuning your LLM for JSON? Your aggregate metrics might be hiding per-field regressions.**

```bash
pip install slotloss
```

## What it does

`slotloss` is an observability layer for structured JSON output. Each mode starts from the artifacts you already have, and only the last one needs a model:

| You have | Command | Needs model? |
|----------|---------|:------------:|
| Messy text with JSON | `slotloss --extract --data output.txt` | No |
| JSON + schema | `slotloss --validate --schema s.json --data output.jsonl` | No |
| Broken JSON + schema | `slotloss --fix --schema s.json --data output.jsonl` | No |
| JSON + schema | `slotloss --anatomy --schema s.json --data output.jsonl` | No |
| Two output sets | `slotloss --diff --schema s.json --data a.jsonl --data2 b.jsonl` | No |
| Model + checkpoint | `slotloss --checkpoint lora/ --schema s.json --data test.jsonl` | Yes |
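
For intuition about the first row, here is a minimal sketch of pulling JSON objects out of messy text with the standard library. It is not `slotloss`'s implementation, and `extract_json_objects` is a hypothetical name used only for illustration.

```python
import json


def extract_json_objects(text):
    """Return every top-level JSON object that can be decoded from text.

    Illustrative only: scans for '{' and lets the stdlib decoder decide
    whether a valid object starts there.
    """
    decoder = json.JSONDecoder()
    found = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index just past its end.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; keep scanning
            continue
        found.append(obj)
        pos = end
    return found


messy = 'Sure! Here is the result: {"city": "Rome", "open": true} Hope that helps {oops'
objects = extract_json_objects(messy)
print(objects)
```

The trailing `{oops` fragment is skipped because the decoder rejects it, so only the well-formed object is returned.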

## The problem

Standard fine-tuning plus grammar-constrained decoding produces valid JSON, and aggregate loss improves. A per-role breakdown can still show a regression:

```
STRUCTURAL         5.33 -> 0.00     -100%   OK
KEY                0.47 -> 0.00     -100%   OK
BOOLEAN            0.46 -> 1.05     +130%   !! REGRESSION
TOTAL              0.55 -> 0.17      -69%
```

Aggregate loss improved by 69%, while loss on boolean values rose by 130%. `slotloss` reports the per-role breakdown that exposes this.
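
The arithmetic behind the percentage column can be sketched as below, using the rounded losses from the table. The rule that any positive change in a role counts as a regression is an assumption made for this sketch; the threshold `slotloss` actually applies is not stated here.

```python
def percent_change(before, after):
    """Relative change in loss, in percent (negative means improvement)."""
    return (after - before) / before * 100.0


# (before, after) losses copied from the table above
losses = {
    "STRUCTURAL": (5.33, 0.00),
    "KEY": (0.47, 0.00),
    "BOOLEAN": (0.46, 1.05),
    "TOTAL": (0.55, 0.17),
}

changes = {role: percent_change(b, a) for role, (b, a) in losses.items()}

# Assumption for this sketch: a role regresses if its loss increased at all.
regressed = [
    role for role, pct in changes.items() if role != "TOTAL" and pct > 0
]

for role, pct in changes.items():
    flag = "!! REGRESSION" if role in regressed else "OK"
    print(f"{role:<12} {pct:+7.0f}%   {flag}")
```

From the rounded inputs the boolean change comes out near +128% rather than +130%; the table's figure was presumably computed from unrounded losses.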

## Quick start

See [QUICK_START.md](QUICK_START.md) for a hands-on walkthrough from messy output to full analysis.

## Python API

```python
from slotloss import analyze

report = analyze(
    model_name="Qwen/Qwen2.5-0.5B-Instruct",
    checkpoint="my_lora/",
    schema="schema.json",
    data="test.jsonl",
)
print(report)

if report.regressions:
    print(f"REGRESSIONS: {[r.role for r in report.regressions]}")
```

The `slotloss` command exits with code 1 when regressions are detected, so it can gate a CI/CD pipeline.
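
When calling the Python API from your own CI script, the same gate can be reproduced by hand. The sketch below relies only on the `regressions` and `role` attributes shown in the example above; `exit_code_for` and `describe` are hypothetical helpers, and the stand-in report object exists only so the snippet runs without a model.

```python
from types import SimpleNamespace


def exit_code_for(report):
    """Return 1 if the report lists any regressions, otherwise 0."""
    return 1 if report.regressions else 0


def describe(report):
    """One-line summary suitable for a CI log."""
    roles = [r.role for r in report.regressions]
    if not roles:
        return "no regressions"
    return "REGRESSIONS: " + ", ".join(roles)


# Stand-in for the object returned by analyze(); for illustration only.
fake_report = SimpleNamespace(regressions=[SimpleNamespace(role="BOOLEAN")])
clean_report = SimpleNamespace(regressions=[])

print(describe(fake_report), "-> exit", exit_code_for(fake_report))
print(describe(clean_report), "-> exit", exit_code_for(clean_report))

# In a real CI script you would end with something like:
#     raise SystemExit(exit_code_for(report))
```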

## Grammar Roles

| Role | Description | Examples |
|------|-------------|----------|
| STRUCTURAL | JSON syntax | `{` `}` `[` `]` `:` `,` |
| QUOTE | String delimiters | `"` |
| KEY | Object key characters | `city`, `cuisine` |
| ENUM_VALUE | Categorical values | `Italian`, `Economy` |
| BOOLEAN | Boolean strings | `True`, `False` |
| NUMBER | Numeric characters | `42`, `3.14` |
| FREE_TEXT | Non-categorical content | names, addresses |
| WHITESPACE | Formatting | spaces, newlines |
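
To make the roles concrete, here is a minimal character-level tagger for a JSON string. It is a sketch under several assumptions, not how `slotloss` assigns roles: enum values are supplied as a plain set rather than read from a schema, booleans are the JSON literals `true` and `false`, and `null` is rejected because the table above defines no role for it. `tag_roles` is a hypothetical name.

```python
import re

# One alternative per token kind; the group name doubles as the role.
TOKEN_RE = re.compile(
    r'(?P<STRING>"(?:\\.|[^"\\])*")'
    r'|(?P<NUMBER>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)'
    r'|(?P<BOOLEAN>true|false)'
    r'|(?P<STRUCTURAL>[{}\[\]:,])'
    r'|(?P<WHITESPACE>\s+)'
)


def tag_roles(text, enum_values=frozenset()):
    """Split JSON text into (role, substring) pairs, in order."""
    spans = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unsupported input at offset {pos}")
        kind, tok = m.lastgroup, m.group()
        if kind == "STRING":
            inner = tok[1:-1]
            # A string followed by optional whitespace and ':' is a key.
            if text[m.end():].lstrip().startswith(":"):
                role = "KEY"
            elif inner in enum_values:
                role = "ENUM_VALUE"
            else:
                role = "FREE_TEXT"
            spans.append(("QUOTE", '"'))
            if inner:
                spans.append((role, inner))
            spans.append(("QUOTE", '"'))
        else:
            spans.append((kind, tok))
        pos = m.end()
    return spans


sample = '{"cuisine": "Italian", "name": "Da Enzo", "open": true, "seats": 42}'
spans = tag_roles(sample, enum_values={"Italian"})
for role, tok in spans:
    print(f"{role:<11} {tok!r}")
```

Concatenating the substrings reproduces the input exactly, so a per-token loss could in principle be summed per role without dropping or double-counting characters.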

## Links

- [Quick Start Tutorial](QUICK_START.md)
- [Fixing JSON Fine-Tuning](https://breckbaldwin.github.io/slotloss/) — mitigations, services
- [Paper](https://arxiv.org/abs/XXXX.XXXXX) — "Valid JSON, Wrong Answer" (2026)

## License

MIT
