Metadata-Version: 2.4
Name: jsonl-contract-check
Version: 0.2.0
Summary: Validate JSONL records against lightweight field contracts.
Author: Waheed Sattar
License-Expression: MIT
Project-URL: Homepage, https://github.com/iwaheedsattar/jsonl-contract-check
Project-URL: Repository, https://github.com/iwaheedsattar/jsonl-contract-check
Project-URL: Issues, https://github.com/iwaheedsattar/jsonl-contract-check/issues
Keywords: jsonl,validation,cli,schema,evals
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# jsonl-contract-check

Validate JSONL records against lightweight field contracts from the command line or from Python.

## Problem

JSONL is a convenient format for prompt runs, eval traces, data cleanup jobs, and batch API results, but small shape problems can hide until a downstream script fails. This tool gives you a fast local check for required fields, basic types, nested paths, and JSON strings embedded inside a record.

## Install

```bash
pip install jsonl-contract-check
```

For local development:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

## Contract Format

Create a contract JSON file with a `fields` array:

```json
{
  "fields": [
    { "path": "id", "type": "str" },
    { "path": "metrics.score", "type": "number" },
    { "path": "choices[0].message.content", "type": "str" },
    { "path": "raw_response", "type": "object", "parse_json": true },
    { "path": "notes", "type": "str", "required": false }
  ]
}
```

Supported types are `any`, `array`, `bool`, `float`, `int`, `null`, `number`, `object`, and `str`.
Use dotted paths for nested objects and `[index]` notation for array items, such as `choices[0].message.content`.

## CLI Usage

```bash
jsonl-contract-check --contract contract.json results.jsonl
```

Validate multiple files:

```bash
jsonl-contract-check -c contract.json runs/*.jsonl
```

Emit a machine-readable report:

```bash
jsonl-contract-check -c contract.json --format json results.jsonl
```

The command exits with:

- `0` when every checked row passes
- `1` when one or more records fail
- `2` when the contract or input files cannot be read

## Python Usage

```python
from jsonl_contract_check import audit_paths, load_contract

rules = load_contract("contract.json")
result = audit_paths(["results.jsonl"], rules)

if not result.ok:
    for issue in result.issues:
        print(issue)
```

## Example Output

```text
Status: FAIL
Files: 1
Records: 2
Embedded JSON parsed: 1

Field coverage:
  - id: 2/2 (str, required)
  - metrics.score: 1/2 (number, required)

Issues:
  - results.jsonl:2 [missing-field] Missing required field: metrics.score.
```

## Development

Run tests:

```bash
PYTHONPATH=src python -m unittest discover -s tests
```

Build the package:

```bash
python -m build
```

## Contributing

Issues and pull requests are welcome. Keep changes focused, include tests for behavior changes, and update the README when command behavior changes.
