Metadata-Version: 2.4
Name: mod-trace
Version: 0.2.0
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Environment :: Console
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
License-File: LICENSE
Summary: Rust CLI for inspecting ML model artifacts without loading the framework
Keywords: cli,ml,onnx,catboost,model,inspector,ci,diff
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM

# mod-trace

Inspect ML model artifacts without loading the framework.

mod-trace is a small Rust CLI for answering a practical question:

```text
What is inside this model file?
```

It inspects real artifacts such as CatBoost `.cbm` files, LightGBM `.txt`/`.lgb` text models, and ONNX `.onnx` graphs, then reports structure, size, parameters, operator mix, rough inference cost, and changes between versions. All three formats are read natively, with no Python, framework, or runtime needed. The one optional exception is CatBoost `--deep`, which requires Python CatBoost.

The secondary tensor lab keeps the original `EXPLAIN ANALYZE` idea for tiny neural-network plans:

```sql
EXPLAIN ANALYZE SELECT ...
```

becomes:

```sh
mod-trace trace examples/tiny_attention_plan.json
```

## Core Commands

```sh
cargo run -- doctor
cargo run -- doctor --json
cargo run -- inspect path/to/model.cbm
cargo run -- inspect --json path/to/model.cbm
cargo run -- inspect --deep path/to/model.cbm
cargo run -- inspect path/to/model.onnx
cargo run -- inspect --json path/to/model.onnx
cargo run -- explain path/to/model.onnx
cargo run -- diff path/to/old_model.cbm path/to/new_model.cbm
cargo run -- diff --json path/to/old_model.cbm path/to/new_model.cbm
cargo run -- diff --deep path/to/old_model.cbm path/to/new_model.cbm
cargo run -- check --max-size-growth 20% --fail-on-feature-change path/to/old_model.cbm path/to/new_model.cbm
cargo run -- diff path/to/old_model.onnx path/to/new_model.onnx
cargo run -- diff --json path/to/old_model.onnx path/to/new_model.onnx
cargo run -- check --max-ops-growth 25% --fail-on-new-op path/to/old_model.onnx path/to/new_model.onnx
```

The same commands with an installed binary:

```sh
mod-trace doctor
mod-trace doctor --json
mod-trace inspect model.cbm
mod-trace inspect --json model.cbm
mod-trace inspect --deep model.cbm
mod-trace inspect model.onnx
mod-trace inspect --json model.onnx
mod-trace explain model.onnx
mod-trace diff old_model.cbm new_model.cbm
mod-trace diff --json old_model.cbm new_model.cbm
mod-trace diff --deep old_model.cbm new_model.cbm
mod-trace check --max-size-growth 20% --fail-on-feature-change old_model.cbm new_model.cbm
```

## Why This Exists

Data and ML engineers often inherit model artifacts:

- `fraud_model.cbm`
- `ranking_model.onnx`
- `model_v17.cbm`
- `candidate_model.onnx`

Before running them, it is useful to know:

- what type of model it is
- how large it is
- how many trees, parameters, nodes, or operators it contains
- what the rough per-row or per-forward-pass cost looks like
- what changed between two versions

mod-trace is not a model runtime. It is an artifact inspector.

## Doctor

Check which inspectors and optional helpers are available:

```sh
cargo run -- doctor
cargo run -- doctor --json
```

Example output:

```text
mod-trace Doctor
----------------

Built-in inspectors:
  CatBoost metadata: ok
  ONNX static graph: ok
  JSON tensor plans: ok

Optional Python helpers:
  Python: ok (/path/to/python)
  catboost: ok (1.2.8)

Available commands:
  inspect .cbm/.onnx/.json: available
  diff .cbm/.onnx: available
  inspect --deep .cbm: available
  diff --deep .cbm: available
```

Use `--json` when a setup script or CI job needs to check optional helper availability.

## JSON Output

Use JSON when mod-trace is part of CI, release checks, or model registry automation:

```sh
cargo run -- inspect --json path/to/model.cbm
cargo run -- inspect --json path/to/model.onnx
cargo run -- diff --json path/to/old_model.cbm path/to/new_model.cbm
cargo run -- diff --json path/to/old_model.onnx path/to/new_model.onnx
```

The JSON diff is designed for checks such as:

- fail if file size, parameter memory, or estimated ops grows too much
- fail if CatBoost feature names, training config, or learned-state fingerprint changes unexpectedly
- fail if ONNX operator counts or initializer tensors change
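
The first kind of check in the list above reduces to a relative-growth comparison. The sketch below is illustrative only and is not mod-trace's implementation: the helper names `growth_percent` and `exceeds_limit` are hypothetical, and the functions take plain numbers because the JSON field names are not documented here. The sample values come from the CatBoost diff example later in this README.

```python
def growth_percent(old: float, new: float) -> float:
    """Relative growth of `new` over `old`, in percent."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return (new - old) / old * 100.0


def exceeds_limit(old: float, new: float, limit: str) -> bool:
    """True when growth is above a limit written like '20%'."""
    max_growth = float(limit.rstrip("%"))
    return growth_percent(old, new) > max_growth


# File sizes from the CatBoost diff example: 4.6 MiB -> 6.2 MiB.
old_size = 4.6 * 1024 * 1024
new_size = 6.2 * 1024 * 1024
print(f"size growth: {growth_percent(old_size, new_size):.1f}%")
print("over 20% limit:", exceeds_limit(old_size, new_size, "20%"))
```

With these inputs the growth is roughly 34.8%, so a 20% limit would be exceeded.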

`--deep` CatBoost reports are text-only for now because they are diagnostic dumps from CatBoost's native Python parser.

## CI Checks

Use `check` when a model artifact should fail promotion if it changes too much:

```sh
cargo run -- check path/to/old_model.cbm path/to/new_model.cbm \
  --max-size-growth 20% \
  --fail-on-feature-change \
  --fail-on-training-config-change

cargo run -- check path/to/old_model.onnx path/to/new_model.onnx \
  --max-size-growth 20% \
  --max-ops-growth 25% \
  --fail-on-new-op
```

`check` prints a short PASS/FAIL report and exits nonzero when a rule fails.
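
Because `check` signals failure through its exit status, a promotion script can gate on the return code. The following is a minimal sketch, assuming `mod-trace` is on `PATH` and that a passing check exits with status 0; the helper names `build_check_command` and `model_may_be_promoted` are hypothetical. Only flags shown in this README are used.

```python
import subprocess
from typing import List, Optional


def build_check_command(
    old_model: str,
    new_model: str,
    max_size_growth: Optional[str] = None,
    fail_on_feature_change: bool = False,
    binary: str = "mod-trace",
) -> List[str]:
    """Assemble argv for `mod-trace check` from flags documented above."""
    cmd = [binary, "check", old_model, new_model]
    if max_size_growth is not None:
        cmd += ["--max-size-growth", max_size_growth]
    if fail_on_feature_change:
        cmd.append("--fail-on-feature-change")
    return cmd


def model_may_be_promoted(cmd: List[str]) -> bool:
    """Run the check and echo its report; assumes exit status 0 means PASS."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout, end="")
    return result.returncode == 0


cmd = build_check_command(
    "old_model.cbm",
    "new_model.cbm",
    max_size_growth="20%",
    fail_on_feature_change=True,
)
print(" ".join(cmd))
```

Calling `model_may_be_promoted(cmd)` in a release job would then return `False` whenever a rule fails.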

## CatBoost

Inspect a CatBoost binary model:

```sh
cargo run -- inspect path/to/model.cbm
cargo run -- inspect --json path/to/model.cbm
cargo run -- inspect --deep path/to/model.cbm
cargo run -- explain path/to/model.cbm
cargo run -- catboost --deep --limit 10 path/to/model.cbm
```

`--deep` is optional and requires Python CatBoost. For a single artifact, it adds exact float/categorical feature typing and float border counts decoded through CatBoost's native parser.

Example output:

```text
CatBoost Model Summary
----------------------
Model: model.cbm
Format: CatBoost binary model (CBM1)
File size: 4.6 MiB

Execution Plan:
  Input row
   |
   v
  Quantize numeric/categorical features
   |
   v
  Traverse symmetric tree ensemble
   |
   v
  Sum leaf values

Estimated Cost:
  Trees / row: 500
  Configured/max split checks / row: 3500
  Max leaf slots: 64000
  Why: 500 trees * depth 7 = 3500 split checks / row
```
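
The numbers in the `Estimated Cost` block follow from simple arithmetic on tree count and depth. Here is a small sketch of that arithmetic, assuming every tree is symmetric and uses the full configured depth, so the results are upper bounds. The function name `symmetric_tree_cost` is hypothetical and not part of mod-trace.

```python
def symmetric_tree_cost(trees: int, depth: int) -> dict:
    """Upper-bound per-row cost for an ensemble of symmetric trees."""
    return {
        "trees_per_row": trees,
        # One split check per level in every tree.
        "max_split_checks_per_row": trees * depth,
        # A symmetric tree of depth d has 2**d leaves.
        "max_leaf_slots": trees * 2 ** depth,
    }


print(symmetric_tree_cost(500, 7))
```

For 500 trees of depth 7 this gives 3500 split checks per row and 64000 leaf slots, matching the summary above.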

Diff two CatBoost artifacts:

```sh
cargo run -- diff path/to/old_model.cbm path/to/new_model.cbm
cargo run -- diff --json path/to/old_model.cbm path/to/new_model.cbm
cargo run -- diff --deep path/to/old_model.cbm path/to/new_model.cbm
```

The normal diff is fast and reads embedded metadata plus artifact fingerprints. `--deep` is optional and requires Python CatBoost. It fully decodes the `.cbm` files through CatBoost's native parser and compares split changes, leaf value changes, leaf weights, feature typing, scale/bias, split type mix, and float-feature border changes keyed by CatBoost's flat/original feature index.

Example output:

```text
Model Diff
----------
Type: CatBoost

Structure:
File size:
  4.6 MiB -> 6.2 MiB (+1677721)
Trees:
  500 -> 650 (+150)
Depth:
  7 -> 8 (+1)
Configured/max split checks / row:
  3500 -> 5200 (+1700)

Parameter-like Internals:
Full artifact fingerprint:
  0x2d2b00dd2375ee48 -> 0xb22f43a2cf612a37 (changed)
Metadata fingerprint:
  0x0e9a08d227179262 -> 0x54a9bd32b5196b8f (changed)
Learned-state fingerprint:
  0xf57e3eeca557d48a -> 0x1390d581269e9dca (changed)
  Note: CatBoost does not expose PyTorch-style parameter tensors here.

Training Config:
Loss:
  RMSE -> RMSE (same)
Learning rate:
  0.100000 -> 0.100000 (same)

Metrics:
Best learn RMSE:
  105.479737 -> 118.574709 (+13.094972)
  Note: learn RMSE increased by 12.41%. Lower is usually better if this metric is comparable.

Features:
Recovered feature names:
  82 -> 83 (+1)

Interpretation:
  Structure changed: inference cost or ensemble shape may differ.
  Training config changed in embedded metadata.
  Learned-state fingerprint changed, so internal CatBoost parameters likely changed even if tree count/depth did not.

CatBoost Deep Diff
------------------
Decoded Structure:
  Trees with split changes: 400 / 400
  Split positions changed: 2800

Leaf Values:
  Trees with leaf value changes: 400 / 400

Feature Processing:
  Float features: 13 -> 15
  Categorical features: 4 -> 2
  Total float borders: 787 -> 874
  Float feature list changes:
    Added:
      13: new_numeric_feature
      14: another_numeric_feature
  Categorical feature list changes:
    Removed:
      2: old_category_feature
      3: another_category_feature
  Feature type changes:
    9: month_feature: categorical -> float
  Float features with changed borders:
    0: numeric_feature_a: borders 59 -> 62, changed positions 59
    1: numeric_feature_b: borders 95 -> 93, changed positions 93
```

Create a safe synthetic CatBoost model for local testing:

```sh
python3 -m pip install catboost
python3 examples/make_sample_catboost.py
cargo run -- inspect examples/sample_catboost.cbm
```

The generated `.cbm` uses synthetic data only and is ignored by git.

Explain a single CatBoost artifact:

```sh
cargo run -- explain path/to/model.cbm
```

This describes the artifact as a tree ensemble, shows evidence from CBM metadata, estimates tree traversal cost, prints training metadata, and explains how to use `diff --deep` for exact version-to-version changes.

When Python CatBoost is available, `explain` also prints feature processing:

```text
Feature Processing:
  Float features: 15
  Categorical features: 2
  Total float borders: 874
  Float feature list:
    0: numeric_feature_a (62 borders)
  Categorical feature list:
    1: category_feature_a
```

## ONNX

Inspect an ONNX graph:

```sh
cargo run -- inspect path/to/model.onnx
cargo run -- inspect --json path/to/model.onnx
cargo run -- onnx --limit 10 path/to/model.onnx
cargo run -- onnx --json path/to/model.onnx
```

Example output:

```text
ONNX Model Summary
------------------
Model: model.onnx
Format: ONNX ModelProto
File size: 197.1 KiB
  IR version: 10
  Producer: pytorch
  Graph: main_graph
  Opsets: ai.onnx=18

Graph:
  Nodes: 89
  Initializers: 33
  Value info entries: 120
  Parameters: 58646 values / 231.2 KiB

Operator Mix:
  Add                      22
  MatMul                   16
  Reshape                  12
  Transpose                10
  LayerNormalization       5

Estimated Cost:
  Estimated ops: 3226
  Most expensive:
    node_MatMul_98           MatMul       256 ops
```

Diff two ONNX graphs:

```sh
cargo run -- diff path/to/old_model.onnx path/to/new_model.onnx
cargo run -- diff --json path/to/old_model.onnx path/to/new_model.onnx
```

ONNX diff includes a `Parameter Tensors` section. It compares initializer tensor count, names, shapes, dtypes, and raw-data fingerprints when raw tensor bytes are stored in the ONNX file.
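
To make the comparison concrete, here is an illustrative sketch of diffing two sets of initializer descriptors. It is not mod-trace's code: the `TensorInfo` type, the `diff_initializers` helper, and the use of BLAKE2b as the fingerprint function are all assumptions made for the example, and the tensor names are invented.

```python
import hashlib
from typing import Dict, NamedTuple, Tuple


class TensorInfo(NamedTuple):
    shape: Tuple[int, ...]
    dtype: str
    fingerprint: str


def fingerprint(raw: bytes) -> str:
    """64-bit hex digest of raw tensor bytes (BLAKE2b is a stand-in choice)."""
    return "0x" + hashlib.blake2b(raw, digest_size=8).hexdigest()


def diff_initializers(old: Dict[str, TensorInfo], new: Dict[str, TensorInfo]) -> dict:
    """Classify initializer names as added, removed, or changed."""
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # Changed means any of shape, dtype, or raw-data fingerprint differs.
        "changed": sorted(name for name in shared if old[name] != new[name]),
    }


old = {
    "encoder.weight": TensorInfo((4, 4), "float32", fingerprint(b"\x00" * 64)),
    "encoder.bias": TensorInfo((4,), "float32", fingerprint(b"\x00" * 16)),
}
new = {
    "encoder.weight": TensorInfo((4, 4), "float32", fingerprint(b"\x01" * 64)),
    "head.weight": TensorInfo((4, 2), "float32", fingerprint(b"\x00" * 32)),
}
print(diff_initializers(old, new))
```

Comparing fingerprints rather than full tensors keeps the diff cheap while still flagging weights whose shape and dtype are unchanged but whose bytes differ.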

Explain likely ONNX architecture:

```sh
cargo run -- explain path/to/model.onnx
```

Example output:

```text
ONNX Architecture Explanation
-----------------------------
Model: model.onnx

This model appears to be a transformer.

Evidence:
  - 12 Softmax nodes
  - 24 MatMul nodes
  - 24 LayerNormalization operators
  - initializer names mention embeddings

Estimated Architecture:
  Encoder/attention layers: ~12
  Hidden size: ~768
  Parameters: 109482240 values
  Estimated ops: 12345678

Why:
  Transformers repeatedly use MatMul for projections and attention scores.
  Softmax usually appears where attention scores become probabilities.
  LayerNormalization is common around transformer attention/MLP blocks.
```
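
The evidence listed above suggests a heuristic driven by operator counts. The sketch below shows one way such a rule could look. It is a hypothetical illustration, not the rule mod-trace applies: the function name and thresholds are assumptions, based on the observation that an unfused attention block needs at least two MatMul nodes per Softmax (scores and value mixing).

```python
from collections import Counter
from typing import Iterable


def looks_like_transformer(op_types: Iterable[str]) -> bool:
    """Rough heuristic: attention-style operators appear together."""
    counts = Counter(op_types)
    return (
        counts["Softmax"] >= 1
        and counts["LayerNormalization"] >= 1
        # Q @ K^T and attention @ V give at least two MatMuls per Softmax.
        and counts["MatMul"] >= 2 * counts["Softmax"]
    )


# Operator counts taken from the example evidence above.
ops = ["Softmax"] * 12 + ["MatMul"] * 24 + ["LayerNormalization"] * 24
print(looks_like_transformer(ops))
print(looks_like_transformer(["Conv", "Relu", "Softmax"]))
```

A heuristic like this can only say what a graph resembles, which is why the report is phrased as "appears to be" and lists its evidence.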

mod-trace performs static ONNX graph inspection. It does not execute ONNX models and does not require ONNX Runtime.

## Exporting A Small ONNX Model

If you have a Hugging Face model locally, export it with fixed input shapes for better cost estimates:

```sh
python3 -m pip install torch transformers onnx onnxscript
```

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_dir = "models/tiny-distilbert-base-cased"
onnx_path = f"{model_dir}/model_fixed.onnx"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModel.from_pretrained(model_dir)
model.eval()

inputs = tokenizer(
    "hello mod-trace",
    return_tensors="pt",
    padding="max_length",
    max_length=8,
    truncation=True,
)

torch.onnx.export(
    model,
    (inputs["input_ids"], inputs["attention_mask"]),
    onnx_path,
    input_names=["input_ids", "attention_mask"],
    # DistilBERT's base model returns only last_hidden_state (it has no pooler).
    output_names=["last_hidden_state"],
    opset_version=18,
    dynamo=True,
)

print(onnx_path)
```

Then inspect it:

```sh
cargo run -- inspect models/tiny-distilbert-base-cased/model_fixed.onnx
```

Fixed shapes such as `[1, 8]` produce better numeric estimates than symbolic shapes such as `[batch, sequence]`.
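
A short sketch of why this matters: a numeric cost needs every dimension to be a number. The example assumes a multiply-add counts as two ops, which is an assumption about the counting convention. The `matmul_ops` helper and the hidden width of 16 are hypothetical.

```python
from typing import Optional, Sequence, Union

Dim = Union[int, str]  # an int is a fixed size, a str is a symbolic name


def matmul_ops(a: Sequence[Dim], b: Sequence[Dim]) -> Optional[int]:
    """Ops for a 2-D [m, k] @ [k, n] product, or None if any dim is symbolic."""
    m, k = a
    k2, n = b
    if not all(isinstance(d, int) for d in (m, k, k2, n)):
        return None  # symbolic dimension: no numeric estimate is possible
    if k != k2:
        raise ValueError("inner dimensions do not match")
    return 2 * m * k * n  # one multiply and one add per term


print(matmul_ops([8, 16], [16, 16]))           # fixed sequence length of 8
print(matmul_ops(["sequence", 16], [16, 16]))  # symbolic sequence length
```

The fixed-shape call yields a concrete count, while the symbolic one can only report that the cost is unknown.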

## Tensor Lab

The original tensor-analysis MVP still exists as a lab for small handcrafted plans:

```sh
cargo run -- trace examples/tiny_attention_plan.json
cargo run -- compare examples/tiny_attention_plan.json
cargo run -- why examples/tiny_attention_plan.json
cargo run -- validate examples/broken_shape.json
```

Example plan:

```json
{
  "layers": [
    {
      "type": "self_attention",
      "tokens": 3,
      "head_dim": 4,
      "value_dim": 4
    },
    {
      "type": "linear",
      "in": 4,
      "out": 2
    },
    {
      "type": "softmax"
    }
  ]
}
```

`why` explains the attention cost:

```text
Why is attention expensive?
---------------------------
Attention layer: single_head_attention
  432 ops

Breakdown:
  Q projection             96 ops
  K projection             96 ops
  V projection             96 ops
  Q @ K^T                  72 ops
  attention @ V            72 ops

Explanation:
  Every token must compare itself against every other token.
  The score and value-mixing terms grow roughly with tokens^2.
```

This is useful for demos and for explaining transformer internals, but it is no longer the primary product surface.
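
The breakdown printed by `why` can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions chosen so the numbers match the example plan: the input width equals `head_dim`, and each multiply-add counts as two ops. The function name `attention_ops` is hypothetical.

```python
def attention_ops(tokens: int, head_dim: int, value_dim: int) -> dict:
    """Per-term op counts for single-head self-attention."""
    in_dim = head_dim  # assumption: input width equals head_dim
    breakdown = {
        "Q projection": 2 * tokens * in_dim * head_dim,
        "K projection": 2 * tokens * in_dim * head_dim,
        "V projection": 2 * tokens * in_dim * value_dim,
        # The next two terms grow with tokens squared.
        "Q @ K^T": 2 * tokens * tokens * head_dim,
        "attention @ V": 2 * tokens * tokens * value_dim,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


print(attention_ops(tokens=3, head_dim=4, value_dim=4))
print(attention_ops(tokens=6, head_dim=4, value_dim=4)["total"])
```

Doubling the token count from 3 to 6 doubles each projection term but quadruples the score and value-mixing terms, which is the `tokens^2` growth described above.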

## What It Does Not Do

mod-trace does not:

- run inference
- train models
- load PyTorch directly
- require CatBoost, PyTorch, or ONNX Runtime for inspection (only the optional CatBoost `--deep` mode needs Python CatBoost)
- provide GPU kernels
- replace framework-native debugging tools

## Privacy

Do not commit real model weights or private business artifacts. The repository ignores:

- `models/`
- `examples/*.onnx`
- `examples/*.cbm`

Use `examples/make_sample_catboost.py` when you need a shareable `.cbm` demo.

## Architecture

mod-trace has three small inspection paths:

- CatBoost `.cbm` metadata scanner
- ONNX protobuf graph scanner
- JSON tensor-plan analyzer

See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for details.

