Metadata-Version: 2.4
Name: causal-certificate
Version: 0.1.0
Summary: A numeric strict-causality (and cross-example independence) certificate for PyTorch sequence models, via vector-Jacobian products.
Author-email: Akhilesh Gogikar <gogikar.akhilesh@gmail.com>
License: MIT License
        
        Copyright (c) 2026 Akhilesh Gogikar
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/Akhilesh-Gogikar/causal-certificate
Project-URL: Source, https://github.com/Akhilesh-Gogikar/causal-certificate
Project-URL: Issues, https://github.com/Akhilesh-Gogikar/causal-certificate/issues
Keywords: pytorch,causality,autoregressive,leakage,testing,sequence-models,certificate,vjp
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=1.13
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Dynamic: license-file

# causal-certificate

A small, numeric **strict-causality certificate** for PyTorch sequence models.
One forward pass plus `T−1` vector-Jacobian products tell you whether any output at
position `t` depends on an input at position `s > t`: the silent bug that
manufactures phantom autoregressive results.

```python
from causal_certificate import certify, assert_strictly_causal

report = certify(my_mixer, x)         # x: (B, T, D) float; my_mixer: x -> y
print(report.summary())
# CausalCertificate(T=128, exhaustive cuts)
#   temporal   : leak=0.000e+00  frac=0.000e+00  -> STRICTLY CAUSAL
#   cross-batch: leak=0.000e+00  frac=0.000e+00  -> BATCH-INDEPENDENT

assert_strictly_causal(my_mixer, x)   # drop into a pytest
```

## Why

A "blockwise causal" Walsh–Hadamard token mixer once produced a **7.21× lower BPB**
than a matched transformer — a number that reached a provisional patent application
before it was found to be a within-block future-token leak. Standard sanity checks
passed because they probed *across-block* causality; the violation lived *inside*
blocks. This tool is the check that would have caught it in CI. (Case study &
full write-up: `papers/causality_leaks/`.)

## What it catches (three leak classes)

| Class | Example | Caught by |
|---|---|---|
| Temporal (position) leak | block-WHT / FFT / butterfly mixing; off-by-one causal masks; KV/RoPE drift | exhaustive-cut temporal certificate |
| Pooled/block readout | a block statistic broadcast back to every position | temporal certificate |
| Batch/sequence-statistic coupling | batchnorm-style couplings across the batch | **cross-batch** certificate (per-example probes are blind to it) |
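
To make the temporal rows concrete, here is a minimal sketch of an exhaustive-cut probe in plain PyTorch. It is not the package's implementation: `temporal_leak`, `causal`, and `pooled` are hypothetical names introduced for illustration, and the only library calls assumed are standard `torch.autograd` ones. For each cut `c` it places a random cotangent on outputs at positions `< c` and measures the gradient that reaches inputs at positions `>= c`.

```python
import torch

def temporal_leak(fn, x):
    """Max |gradient| flowing from outputs before a cut to inputs at/after it.

    Hypothetical helper for illustration; not part of causal_certificate.
    """
    x = x.detach().clone().requires_grad_(True)
    y = fn(x)                                  # single forward pass, y: (B, T, D)
    T = x.shape[1]
    worst = 0.0
    for c in range(1, T):                      # the T-1 cuts
        v = torch.zeros_like(y)
        v[:, :c] = torch.randn_like(v[:, :c])  # random cotangent on outputs 0..c-1 only
        (g,) = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True)
        # Inputs c..T-1 lie in the future of every probed output.
        worst = max(worst, g[:, c:].abs().max().item())
    return worst

torch.manual_seed(0)
x = torch.randn(2, 16, 4)                      # fp32, shape (B, T, D)

causal = lambda x: torch.cumsum(x, dim=1)              # y[t] uses x[0..t] only
pooled = lambda x: x + x.mean(dim=1, keepdim=True)     # sequence mean broadcast to every position

causal_leak = temporal_leak(causal, x)
pooled_leak = temporal_leak(pooled, x)
print(causal_leak, pooled_leak)
```

The pooled readout is flagged because every output receives the sequence mean, which includes future positions; the cumulative sum is not, because its backward pass only ever adds zeros at future positions.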

Genuinely causal ops certify at **exactly 0.0** (structural autograd zeros) in
both fp32 and fp64, so there is no per-model threshold tuning. Validated on
external attention/conv models the tool was not developed against: causal MHA and
causal conv certify at 0.0; an injected off-by-one mask and a batchnorm coupling
are flagged.
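
The cross-batch idea can be sketched the same way. The snippet below is an assumption-laden illustration, not the package's code: `cross_batch_leak`, `per_example`, and `batch_centered` are hypothetical names, and the batch coupling is a hand-written mean-centering across the batch dimension rather than a real `BatchNorm` layer. It backpropagates a random cotangent from one example's outputs and checks whether any gradient lands on the other examples' inputs.

```python
import torch

def cross_batch_leak(fn, x):
    """Max |gradient| from example b's outputs to the inputs of other examples.

    Hypothetical helper for illustration; not part of causal_certificate.
    """
    x = x.detach().clone().requires_grad_(True)
    y = fn(x)                                  # y: (B, T, D)
    B = x.shape[0]
    worst = 0.0
    for b in range(B):
        v = torch.zeros_like(y)
        v[b] = torch.randn_like(v[b])          # cotangent on example b only
        (g,) = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True)
        others = torch.arange(B, device=x.device) != b
        worst = max(worst, g[others].abs().max().item())
    return worst

torch.manual_seed(0)
x = torch.randn(4, 16, 8)                      # fp32, shape (B, T, D)

per_example = lambda x: torch.tanh(torch.cumsum(x, dim=1))    # no mixing across the batch
batch_centered = lambda x: x - x.mean(dim=0, keepdim=True)    # batchnorm-style coupling

independent_leak = cross_batch_leak(per_example, x)
coupled_leak = cross_batch_leak(batch_centered, x)
print(independent_leak, coupled_leak)
```

Note that `batch_centered` only mixes values at the same position `t`, so a purely temporal probe would see nothing wrong with it; the coupling shows up only when gradients are compared across examples.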

## Modes

- `certify(fn, x, cuts="all")` — the **certificate** (exhaustive cuts, complete).
- `certify(fn, x, cuts="rand", K=8)` — a cheap **always-on training monitor**;
  a single-pair leak is caught with probability at least `1 − (1 − 1/(T−1))^K`
  (the bound is tight when only one of the `T−1` cuts separates the leaking pair).
- `batch_check=True` (default when `B>1`) — adds the cross-batch certificate.
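
To get a feel for the random-cut detection probability, the formula can be evaluated directly. This sketch assumes the `K` cuts are drawn independently and uniformly from the `T−1` possible positions and that exactly one cut exposes the leak; `detection_probability` and `cuts_for` are hypothetical helper names, not part of the package.

```python
def detection_probability(T: int, K: int) -> float:
    """P(at least one of K uniform random cuts is the single cut exposing the leak)."""
    if T < 2:
        raise ValueError("need T >= 2 so that at least one cut exists")
    miss_one = 1.0 - 1.0 / (T - 1)             # a single random cut misses the leak
    return 1.0 - miss_one ** K

def cuts_for(T: int, target: float) -> int:
    """Smallest K with detection_probability(T, K) >= target (linear search)."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    K = 0
    while detection_probability(T, K) < target:
        K += 1
    return K

p_step = detection_probability(128, 8)         # one monitored step, T=128, K=8
p_100_steps = detection_probability(128, 8 * 100)
k_95 = cuts_for(128, 0.95)
print(p_step, p_100_steps, k_95)
```

With `T=128` and `K=8` a single step catches such a leak only about 6% of the time, but if fresh cuts are drawn at every training step the chance of at least one detection exceeds 99% within 100 steps. That is the sense in which the random mode works as a monitor while `cuts="all"` remains the certificate.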

## Scope & honest attribution

This is a *numeric* certificate on a given architecture/config (generic inputs and
random cotangents), not a symbolic proof; detection is almost-sure, not worst-case
adversarial; it assumes equal input/output sequence length. The **method is not
novel** — it packages known probes: Karpathy's 2019 backprop-from-`t` temporal
check, the per-cut VJP gradient energy of *Effective Context in Neural Speech
Models* (arXiv:2505.22487), and Krokotsch's 2020 batch-independence unit test. The
contribution is the **packaging**: exhaustive-cut completeness + the cross-batch
extension, as a single drop-in certificate for sequence-model CI.

## Install & test

```bash
pip install -e .            # editable install (src layout)
pytest                      # or: python tests/test_external_models.py
```

The generalization test certifies external causal attention / conv at exactly 0.0,
and fires on an injected off-by-one mask and a batch-statistic coupling.
