Metadata-Version: 2.4
Name: cipherexplain
Version: 0.6.0
Summary: CipherExplain SDK — register models and get encrypted SHAP explanations via the CipherExplain API
Author-email: VaultBytes <b@vaultbytes.com>
License-Expression: AGPL-3.0-or-later
Project-URL: Homepage, https://vaultbytes.com/cipherexplain
Project-URL: Documentation, https://cipherexplain.vaultbytes.com
Project-URL: Source, https://github.com/VaultBytes/CipherExplain
Project-URL: Issues, https://github.com/VaultBytes/CipherExplain/issues
Keywords: shap,explainability,fhe,homomorphic-encryption,privacy,machine-learning,xai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Security :: Cryptography
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.27
Requires-Dist: numpy>=1.24
Requires-Dist: scikit-learn>=1.3
Requires-Dist: py_ecc>=6.0
Provides-Extra: fhe
Requires-Dist: openfhe>=1.2; extra == "fhe"
Provides-Extra: lattice
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-httpx>=0.30; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

<!--
Copyright (C) 2026 Bader Issaei / VaultBytes Innovations Ltd
SPDX-License-Identifier: AGPL-3.0-or-later
Patent pending: PCT/IB2026/053378, PCT/IB2026/053405
See NOTICE and LICENSE files for details.
-->

# CipherExplain Python SDK

Register your own models and get encrypted SHAP explanations via the [CipherExplain API](https://vaultbytes.com/cipherexplain).

[![PyPI](https://img.shields.io/pypi/v/cipherexplain.svg)](https://pypi.org/project/cipherexplain/)
[![Python](https://img.shields.io/pypi/pyversions/cipherexplain.svg)](https://pypi.org/project/cipherexplain/)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![Downloads](https://img.shields.io/pypi/dm/cipherexplain.svg)](https://pypi.org/project/cipherexplain/)

## Install

```bash
pip install cipherexplain
```

Dependencies installed automatically: `httpx`, `numpy`, `scikit-learn`, `py_ecc`.

### Optional: local FHE mode

For client-side CKKS encryption (`fhe_mode='ckks'`) install the `fhe` extra:

```bash
pip install 'cipherexplain[fhe]'
```

This adds [OpenFHE](https://openfhe.org) so the SDK can encrypt inputs locally and decrypt results without the server ever seeing plaintext.

## Quick start

```python
from cipherexplain_sdk import CipherExplainClient, extract_spec

client = CipherExplainClient(api_key="vb_...")

# 1. Train locally — nothing leaves your machine
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
model  = LogisticRegression().fit(scaler.fit_transform(X_train), y_train)

# 2. Extract weights only (no pickle, no training data)
spec = extract_spec(model, "my_model", feature_names=["age", "income", "score"],
                    scaler=scaler)

# 3. Register with the API
client.register(spec)

# 4. Get SHAP explanations — pass raw values, scaler is applied server-side
result = client.explain_raw("my_model", x_raw)
print(result["shap_values"])    # per-feature attributions
print(result["prediction"])     # 0.0–1.0 probability
```
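To make step 4 concrete: for a binary logistic regression, "scaler is applied server-side" amounts to standardising the raw values with the stored mean and scale before evaluating the linear model. The sketch below is plain Python for illustration only, not the SDK or server code; `predict_from_raw` and every number in it are hypothetical.

```python
import math

def predict_from_raw(x_raw, coef, intercept, scaler_mean, scaler_scale):
    """Standardise raw features, then evaluate a binary logistic regression."""
    # Same transform StandardScaler applies: z = (x - mean) / scale
    z = [(x - m) / s for x, m, s in zip(x_raw, scaler_mean, scaler_scale)]
    logit = sum(w * zi for w, zi in zip(coef, z)) + intercept
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical weights and scaler statistics for "age", "income", "score"
coef = [0.8, 1.2, -0.5]
intercept = -0.1
scaler_mean = [40.0, 50_000.0, 600.0]
scaler_scale = [10.0, 20_000.0, 50.0]

# A raw input sitting exactly at the training mean standardises to all zeros,
# so the prediction depends on the intercept alone.
p = predict_from_raw([40.0, 50_000.0, 600.0], coef, intercept,
                     scaler_mean, scaler_scale)
print(round(p, 4))
```

With `client.explain` instead of `client.explain_raw`, you would pass the standardised vector `z` yourself.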

## Counterfactuals + ECOA Reg-B reason codes (v0.5.1)

For credit-denial / adverse-action workflows: after `/explain`, ask the
server for a SHAP-guided minimal-L2 counterfactual + Form C-1 reason
codes. The server constructs `Enc(x') = Enc(x) ⊕ Enc(δ*)` from its own
session-stored `Enc(x)`; the client never gets to submit `Enc(x')`
directly, so the forged-CF attack is closed at the schema layer.

```python
exp = client.explain_raw("my_lr", x_denied)
cf  = client.counterfactual("my_lr", x_denied, exp)

print(cf["x_prime"])              # x + δ* — decision-flipping recourse
print(cf["decision_flipped"])     # True
for c in cf["reason_codes"]:
    print(c["form_c1_code"], "—", c["form_c1_text"])

# Optional: protect class / sticky features
from cipherexplain_sdk import FeatureManifest
manifest = FeatureManifest(immutable=[True, False, ..., False])  # age fixed
cf = client.counterfactual("my_lr", x_denied, exp,
                            feature_manifest=manifest)
```
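For intuition about "minimal-L2", here is the textbook closed form for a linear decision boundary with some features frozen. It is a simplified stand-in, not the server's SHAP-guided procedure; `minimal_l2_counterfactual`, the `margin` parameter, and all numbers are assumptions made for this sketch.

```python
def minimal_l2_counterfactual(x, coef, intercept, immutable, margin=1e-3):
    """Smallest L2 change to the mutable features that flips a linear decision."""
    logit = sum(w * xi for w, xi in zip(coef, x)) + intercept
    # Frozen features contribute nothing to the search direction.
    w_mut = [0.0 if fixed else w for w, fixed in zip(coef, immutable)]
    norm_sq = sum(w * w for w in w_mut)
    if norm_sq == 0.0:
        raise ValueError("no mutable feature has a non-zero weight")
    # Aim just past the boundary, on the opposite side of the current decision.
    target = margin if logit < 0 else -margin
    step = (target - logit) / norm_sq
    delta = [step * w for w in w_mut]
    return [xi + d for xi, d in zip(x, delta)], delta

# Hypothetical denied applicant in standardised feature space
coef = [0.8, 1.2, -0.5]
intercept = -0.5
x_denied = [-1.0, 0.5, 0.2]
immutable = [True, False, False]   # first feature (e.g. age) is frozen

x_prime, delta = minimal_l2_counterfactual(x_denied, coef, intercept, immutable)
print(x_prime, delta)
```

Frozen features end up with a zero entry in `delta`, which mirrors the role of the `immutable` mask in `FeatureManifest`.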

`cf_attestation_mode` is `"UNATTESTED"` by default. The Pedersen
homomorphism, the BLS12-381 G1 Σ-IPA proof π_CF, the CRDC audit chain, and
the composition β are all real (verified on prod CX22 in 0.5 ms / 0.8 ms
p50). For the lattice-arm attestation upgrade (`"ATTESTED_BGV_ZK"`), see
the next section.
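The additive homomorphism that a Pedersen commitment provides can be seen with toy parameters: multiplying two commitments yields a commitment to the sum of the messages. The sketch below works in a small multiplicative group modulo a prime rather than in BLS12-381 G1, and its constants (`P`, `Q`, `G`, `H`) and the `commit` helper are illustrative assumptions with no security value.

```python
P = 2039        # toy safe prime, P = 2*Q + 1
Q = 1019        # prime order of the subgroup of squares mod P
G, H = 4, 9     # two squares mod P; a real setup needs log_G(H) to be unknown

def commit(m, r):
    """Pedersen-style commitment G^m * H^r in the order-Q subgroup."""
    return (pow(G, m % Q, P) * pow(H, r % Q, P)) % P

# Commit to an (integer-encoded) feature value and to a perturbation.
c_x = commit(123, 77)
c_delta = commit(45, 501)

# Combining the commitments needs neither message nor randomness.
c_x_prime = (c_x * c_delta) % P

# Whoever knows the openings can check that it commits to the sum.
print(c_x_prime == commit(123 + 45, 77 + 501))
```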

## v1-B lattice attestation: `bgv_zk=True` (v0.6.0)

`counterfactual(bgv_zk=True)` attaches a del Pino-Lyubashevsky-Seiler (PKC
2019) lattice binding proof to the request. The SDK generates a fresh
BGV keypair per call, encrypts the integer-quantised attribution vector
`δ_int` into `ct_BGV` (at n'=512, q≈2^60), and runs the Σ-protocol locally;
the server verifies the lattice-arm equation in ~50 ms and upgrades
`cf_attestation_mode` from `"UNATTESTED"` to `"ATTESTED_BGV_ZK"`.

```python
cf = client.counterfactual("my_lr", x_denied, exp, bgv_zk=True)
assert cf["cf_attestation_mode"] == "ATTESTED_BGV_ZK"
```
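"Integer-quantised" can be pictured as fixed-point rounding that keeps every entry well inside the plaintext range. The sketch below is a generic illustration: the scale `2**16`, the helper names, and the range check against `2**60` are assumptions, not the SDK's actual encoding.

```python
Q_MODULUS = 2**60   # order of magnitude quoted above (q ≈ 2^60)
SCALE = 2**16       # hypothetical fixed-point scale, not the SDK's value

def quantise(delta, scale=SCALE, modulus=Q_MODULUS):
    """Round each entry to an integer multiple of 1/scale."""
    delta_int = [round(d * scale) for d in delta]
    # Signed values must fit in (-modulus/2, modulus/2) to survive a round trip.
    if any(abs(v) >= modulus // 2 for v in delta_int):
        raise OverflowError("quantised value does not fit the modulus")
    return delta_int

def dequantise(delta_int, scale=SCALE):
    """Map integers back to floats."""
    return [v / scale for v in delta_int]

delta = [0.125, -0.5, 0.3]
delta_int = quantise(delta)
print(delta_int, dequantise(delta_int))
```

The round-trip error per entry is at most `0.5 / SCALE`, so a larger scale trades plaintext headroom for precision.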

**Honest-client assumption (important)**: `bgv_zk=True` requires that
`ct_CKKS` (used by the SHAP computation) and `ct_BGV` (used by the proof)
encode the same `δ_int`. This is cryptographically unenforceable per
scheme-switching hardness (eprint 2023/988). The threat model is
**adversarial-server output, honest client** — appropriate for B2B
internal attestation where the client and the protected party are the
same entity (a bank attesting its own SHAP to an internal auditor or
third party). For consumer-facing Reg-B / ECOA adverse-action use cases,
stay on the default `bgv_zk=False` — the v1-A G1 Σ-IPA path is the right
fit there.

Server-side, the `bgv_zk` payload is consumed only when the deployment
sets `CE_CF_USE_BGV_ZK=1`. When the flag is off (default during rollout),
the SDK can still send `bgv_zk=True` payloads — the server accepts but
ignores them and the response stays on the v1-A path. No client changes
are needed when the server flips the flag.
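Because the server may accept a `bgv_zk=True` payload and still answer on the v1-A path, client code that runs during rollout may prefer to branch on the reported mode instead of asserting it. A minimal sketch, using stand-in dictionaries in place of real responses; `lattice_attested` is a hypothetical helper, not part of the SDK.

```python
def lattice_attested(cf):
    """True only if the server reports the lattice-arm attestation."""
    return cf.get("cf_attestation_mode", "UNATTESTED") == "ATTESTED_BGV_ZK"

# Stand-ins for counterfactual responses with the server flag off and on
flag_off = {"cf_attestation_mode": "UNATTESTED", "decision_flipped": True}
flag_on = {"cf_attestation_mode": "ATTESTED_BGV_ZK", "decision_flipped": True}

for cf in (flag_off, flag_on):
    if lattice_attested(cf):
        print("lattice attestation present")
    else:
        print("v1-A path only; no lattice attestation on this response")
```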

Cost profile: ~16.5 KB proof on the wire (well under the 50 KB budget),
~100 ms prover latency on Mac M1 (pure-Python reference; LaZer C
extension on Linux AMD64 + AVX-512 reduces this 50-100× when bundled).

## Supported model types

### Tree ensembles (new)

`RandomForestClassifier`, `GradientBoostingClassifier`, `DecisionTreeClassifier` — and their regressor variants — are now fully supported. Use `serialize_model` from `cipherexplain_eval` to build the registration payload:

```python
from sklearn.ensemble import RandomForestClassifier
import cipherexplain_eval as ce

# Train locally
rf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

# Serialize to JSON (no pickle, no training data)
payload = ce.serialize_model(
    rf,
    feature_names=["age", "income", "score"],
    model_id="my_rf",
)

# Register with the API (use the raw API or SDK client)
client.register(payload)

# Explain: fhe_mode is ignored for tree models; plaintext KernelSHAP runs instead
result = client.explain("my_rf", x)
print(result["shap_values"])
```

`GradientBoostingClassifier` works the same way — `serialize_model` captures the learning rate and initial estimator needed for correct probability reconstruction.
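Why the learning rate and the initial estimator matter: in a binary gradient-boosted classifier trained with log-loss, the probability is a sigmoid over the initial log-odds plus the learning-rate-scaled sum of the per-tree outputs. The sketch below shows that arithmetic in plain Python; it does not reflect the JSON layout `serialize_model` produces, and `gb_probability` and the numbers are hypothetical.

```python
import math

def gb_probability(init_raw, learning_rate, tree_outputs):
    """Positive-class probability from staged additive raw scores."""
    raw = init_raw + learning_rate * sum(tree_outputs)
    return 1.0 / (1.0 + math.exp(-raw))

# The initial raw score is the log-odds of the class prior.
prior = 0.31
init_raw = math.log(prior / (1.0 - prior))

# With no trees the model falls back to the prior.
print(gb_probability(init_raw, 0.1, []))

# Leaf values a given sample might reach in three trees (illustrative only).
print(gb_probability(init_raw, 0.1, [0.8, 0.5, 1.1]))
```

Dropping either term shifts every prediction, and with it the baseline the SHAP values are measured against.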

### Linear classifiers

Weights are serialised to JSON, so the model must be expressible as a coefficient matrix and an intercept vector.

| Framework | How |
|---|---|
| **sklearn** `LogisticRegression` | `extract_spec(model, ...)` |
| **sklearn** `LinearSVC` | `extract_spec(model, ...)` |
| Any sklearn-compatible object with `coef_` / `intercept_` | `extract_spec(model, ...)` |
| **PyTorch** `nn.Linear` | `extract_spec(layer, ...)` |
| **PyTorch** pure-linear `nn.Sequential` | `extract_spec(seq, ...)` |
| **TensorFlow / Keras** `Dense` (no activation) | `from_weights(layer.get_weights()[0].T, layer.get_weights()[1], ...)` |
| **JAX** / **statsmodels** / **R** / anything else | `from_weights(coef, intercept, ...)` |

### All supported model types

Supported: logistic regression, linear SVM, RandomForest, GradientBoosting,
DecisionTree, and MLP (ReLU, 2-3 hidden layers). Any framework via weight
export.

| `model_type` | Class | SHAP method |
|---|---|---|
| `logistic_regression` | `sklearn.linear_model.LogisticRegression` | KernelSHAP (FHE-simulated) |
| `linear_svc` | `sklearn.svm.LinearSVC` | KernelSHAP (FHE-simulated) |
| `decision_tree` | `sklearn.tree.DecisionTreeClassifier/Regressor` | KernelSHAP (plaintext) |
| `random_forest` | `sklearn.ensemble.RandomForestClassifier/Regressor` | KernelSHAP (plaintext) |
| `gradient_boosting` | `sklearn.ensemble.GradientBoostingClassifier/Regressor` | KernelSHAP (plaintext) |
| `mlp` (PANCE) | `sklearn.neural_network.MLPClassifier` (ReLU, 2-3 hidden layers) | KernelSHAP via LP-optimal polynomial activation |

### TensorFlow / Keras example

```python
from cipherexplain_sdk import from_weights

# Keras model whose last layer is a single Dense layer with no activation
w, b = keras_model.layers[-1].get_weights()  # w shape: (n_features, n_classes)
spec = from_weights(
    coef=w.T,          # transpose to (n_classes, n_features)
    intercept=b,
    model_id="keras_model",
    feature_names=[...],
    classes=[0, 1],
)
client.register(spec)
```
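The transpose in the example above exists because a Keras `Dense` kernel is stored as `(n_features, n_units)`, while `from_weights` is shown taking `coef` as `(n_classes, n_features)`. A dependency-free sketch of the same reshaping, using a hypothetical three-feature kernel and a hypothetical `transpose` helper:

```python
def transpose(kernel):
    """Swap rows and columns of a nested-list matrix."""
    return [list(row) for row in zip(*kernel)]

# Kernel as Keras would store it: one row per feature, one column per unit
kernel = [[0.42], [-0.17], [0.93]]

coef = transpose(kernel)   # one row per output unit, one column per feature
print(coef)

# A cheap guard against a forgotten transpose before registering
feature_names = ["f1", "f2", "f3"]
if any(len(row) != len(feature_names) for row in coef):
    raise ValueError("coef rows must have one entry per feature")
```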

### JAX / statsmodels / R example

```python
from cipherexplain_sdk import from_weights

# Pass coefficient array and intercept directly
spec = from_weights(
    coef=[[0.42, -0.17, 0.93]],   # shape (1, n_features) for binary
    intercept=[-0.31],
    model_id="my_model",
    feature_names=["f1", "f2", "f3"],
    classes=[0, 1],
    scaler_mean=[0.0, 0.0, 0.0],  # optional: embed scaler for /explain_raw
    scaler_scale=[1.0, 1.0, 1.0],
)
client.register(spec)
```

## Client reference

```python
client = CipherExplainClient(api_key="vb_...", base_url="https://cipherexplain.vaultbytes.com")
```

### Model management

| Method | Description |
|---|---|
| `client.register(spec)` | Register a model (spec from `extract_spec` or `from_weights`) |
| `client.list_models()` | List all models available to your key |
| `client.delete(model_id)` | Delete a registered model and free its slot |

**Slot limits:** free = 1 model · developer = 10 · enterprise = unlimited

### Explanations

| Method | Description |
|---|---|
| `client.explain(model_id, features)` | SHAP explanation — features must be pre-scaled |
| `client.explain_raw(model_id, features)` | SHAP explanation — raw values, auto-scaled server-side |

Both return:
```python
{
    "prediction": 0.74,          # model probability
    "base_rate": 0.31,           # baseline (training set average)
    "shap_values": [...],        # per-feature attribution (sum ≈ prediction - base_rate)
    "feature_names": [...],
    "metadata": {...}
}
```
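The `sum ≈ prediction - base_rate` relationship noted in the comment makes a convenient sanity check on any response. The sketch below applies it to a hand-written dictionary with the documented keys; `efficiency_gap` is a hypothetical helper and the numbers are illustrative.

```python
def efficiency_gap(result):
    """Absolute difference between sum(shap_values) and prediction - base_rate."""
    expected = result["prediction"] - result["base_rate"]
    return abs(sum(result["shap_values"]) - expected)

# Stand-in for a response from explain() or explain_raw()
example = {
    "prediction": 0.74,
    "base_rate": 0.31,
    "shap_values": [0.25, 0.10, 0.08],
    "feature_names": ["age", "income", "score"],
}

gap = efficiency_gap(example)
print(f"efficiency gap: {gap:.2e}")
```

Expect a visibly larger gap when `apply_dp=True` is set, since the added noise preserves the sum only in expectation.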

#### MLP fast lane — `linear_surrogate=True`

For MLP models only, opt in per-request to a rank-1 Jacobian linear
surrogate evaluated as a single FHE matmul against the registered
baseline:

```python
result = client.explain("my_mlp", x, fhe_mode="ckks", linear_surrogate=True)
print(result["metadata"]["method"])  # "openfhe_ckks_mlp_jacobian_surrogate"
```

| Path | Wall (prod CX22, 2 vCPU) | SHAP L∞ vs exact |
|---|---|---|
| Default (diagonal-encoded coalition-packed, d=50, K=390) | ~73 s | 0 (exact path) |
| `linear_surrogate=True` | **~7 s** | 0.062 measured · `error_bound=0.15` reported |

The reported `error_bound` of 0.15 is the conservative empirical floor
we sign customer SLAs against. The flag is a no-op for non-MLP models.
If your accuracy budget is tighter than 0.15, omit the flag and accept
the ~73 s wall time.
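One way to read "Jacobian linear surrogate" is a first-order Taylor expansion of the MLP around the registered baseline, f(x) ≈ f(b) + ∇f(b)·(x − b), whose per-feature terms serve as attributions. The sketch below illustrates that reading in plain Python with finite-difference gradients. It is an assumption about the idea, not the server's FHE implementation, and the tiny network, `relu_mlp`, `gradient`, and `surrogate_attributions` are all hypothetical.

```python
def relu_mlp(x, layers):
    """Scalar-output MLP: ReLU after every layer except the last."""
    h = list(x)
    for i, (W, b) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]
    return h[0]

def gradient(f, x0, eps=1e-6):
    """Central finite-difference gradient of f at x0."""
    grad = []
    for i in range(len(x0)):
        up, down = list(x0), list(x0)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def surrogate_attributions(f, x, baseline):
    """Per-feature terms of the linearisation of f around the baseline."""
    g = gradient(f, baseline)
    return [gi * (xi - bi) for gi, xi, bi in zip(g, x, baseline)]

# Tiny network: 2 inputs, 2 hidden units, 1 output (weights are illustrative)
layers = [
    ([[1.0, 0.5], [-0.5, 1.0]], [1.0, 1.0]),
    ([[0.5, -0.25]], [0.0]),
]
f = lambda v: relu_mlp(v, layers)

baseline, x = [0.0, 0.0], [0.4, 0.2]
phi = surrogate_attributions(f, x, baseline)
print(phi, f(x) - f(baseline))
```

Here every hidden unit stays active between baseline and input, so the surrogate is exact. When ReLU units switch on or off along the way, the surrogate and the exact path diverge, which is consistent with the fast lane reporting a non-zero `error_bound`.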

### Account

| Method | Description |
|---|---|
| `client.usage()` | Monthly quota: used / remaining |
| `client.rotate_key()` | Issue new key, deactivate current. Models migrate automatically. |
| `client.health()` | API health check — no key required |
| `client.load_demo_model()` | Load the demo credit model (`credit_model`) |

### Key rotation

```python
result = client.rotate_key()
print(result["new_key"])         # vb_... — save this immediately
print(result["models_migrated"]) # all your models move automatically

# Your old key is now inactive — create a new client
client = CipherExplainClient(api_key=result["new_key"])
```

## Error handling

All methods raise `httpx.HTTPStatusError` on API errors. Check `exc.response.status_code`:

```python
import httpx

try:
    client.register(spec)
except httpx.HTTPStatusError as exc:
    if exc.response.status_code == 401:
        print("Missing X-API-Key")
    elif exc.response.status_code == 403:
        print("Invalid key or tier restriction (e.g. PDF reports need Developer+)")
    elif exc.response.status_code == 404:
        print("Model not found")
    elif exc.response.status_code == 409:
        print("Model ID already registered — delete it first or use a different ID")
    elif exc.response.status_code == 422:
        print("Validation error:", exc.response.json()["detail"])
    elif exc.response.status_code == 429:
        print("Quota exceeded — model slot limit or monthly call limit reached")
    else:
        raise
```

## Differential Privacy for Published SHAP Values

Use DP only when you publish SHAP values externally (dashboards, reports, partners). For internal use inside a trust boundary it is unnecessary.

### Via the API — `apply_dp=True`

```python
# Enable DP protection before sharing SHAP values externally
result = client.explain_raw(
    "my_model",
    x_raw,
    apply_dp=True,
    dp_epsilon=10.0,
    dp_delta=1e-5,
    dp_clip_C=0.2,
)
print(result["shap_values"])        # DP-protected values
print(result["dp_sigma"])           # noise std used
print(result["dp_epsilon_used"])    # epsilon applied

# WARNING: only enable apply_dp=True when sharing SHAP values
# with external parties. If using SHAP values internally only,
# FHE already protects the input — DP is not needed.
```

### Client-side — bring your own `phi`

```python
from cipherexplain_sdk import clip_and_noise, PrivacyAccountant

# phi_hat is the decrypted post-regression SHAP vector (length d).
phi_noisy, sigma = clip_and_noise(phi_hat, C=0.2, epsilon=10.0, delta=1e-5)

acct = PrivacyAccountant(epsilon_per_query=10.0, delta=1e-5)
acct.record_query()
print(acct.total_epsilon(), acct.total_delta())
```

Recommended parameters: `C=0.2`, `epsilon` between 10 and 20, `delta=1e-5`.

Warnings:
- Do not re-normalize `phi_noisy` (e.g. by dividing so the values sum to `y_hat - baseline`). Re-normalization multiplies noise by an input-dependent factor and breaks DP. The efficiency axiom holds in expectation via zero-mean noise.
- Noise is calibrated against the *post-regression* SHAP sensitivity, which is `Theta(1)` in K (Paper 4 column-cancellation lemma), not the pre-regression `Theta(sqrt(K)/n)`.
- Compose budget across queries via `PrivacyAccountant` (zCDP accounting).
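The warnings above can be made concrete with a small Gaussian-mechanism sketch under zCDP accounting. It uses two standard facts: Gaussian noise with standard deviation σ on a query of L2 sensitivity Δ satisfies ρ-zCDP with ρ = Δ²/(2σ²), and ρ-zCDP implies (ρ + 2√(ρ ln(1/δ)), δ)-DP. This is not the code behind `clip_and_noise` or `PrivacyAccountant`; the function names and the explicit `sensitivity` argument are assumptions, and the SDK's calibration may differ.

```python
import math
import random

def clip_l2(phi, C):
    """Scale phi down so that its L2 norm is at most C."""
    norm = math.sqrt(sum(v * v for v in phi))
    factor = min(1.0, C / norm) if norm > 0 else 1.0
    return [v * factor for v in phi]

def rho_from_eps_delta(epsilon, delta):
    """Largest rho whose (epsilon, delta) conversion stays within epsilon."""
    log_term = math.log(1.0 / delta)
    return (math.sqrt(log_term + epsilon) - math.sqrt(log_term)) ** 2

def composed_epsilon(rho_per_query, n_queries, delta):
    """zCDP budgets add in rho; convert the total back to epsilon."""
    rho = rho_per_query * n_queries
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

def noisy_release(phi, C, epsilon, delta, sensitivity, rng=random):
    """Clip, then add zero-mean Gaussian noise. Do not re-normalise the result."""
    rho = rho_from_eps_delta(epsilon, delta)
    sigma = sensitivity / math.sqrt(2.0 * rho)
    return [v + rng.gauss(0.0, sigma) for v in clip_l2(phi, C)], sigma

# Sensitivity 2*C is an assumption for this sketch only.
phi_noisy, sigma = noisy_release([0.3, 0.4], C=0.2, epsilon=10.0, delta=1e-5,
                                 sensitivity=0.4, rng=random.Random(0))
rho = rho_from_eps_delta(10.0, 1e-5)
print(sigma, composed_epsilon(rho, 4, 1e-5))
```

Because budgets add in ρ rather than in ε, four queries at ε = 10 each compose to less than ε = 40 at the same δ.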

## Tier limits

| | Free | Developer | Enterprise |
|---|---|---|---|
| SHAP API calls / month | 50 | 5,000 | 50,000 |
| Oracle runs / month | 3 | 500 | 5,000 |
| Model slots | 1 | 10 | Unlimited |
| PDF audit reports | — | Yes | Yes |

## Get an API key

[vaultbytes.com/cipherexplain](https://vaultbytes.com/cipherexplain)
