Metadata-Version: 2.4
Name: trustlens
Version: 0.4.0
Summary: Audit ML models beyond accuracy — calibration, fairness, latent health, and deployment verdicts.
Author-email: Shahid Ul Islam <shahid9664@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/Khanz9664/TrustLens
Project-URL: Source, https://github.com/Khanz9664/TrustLens
Project-URL: Portfolio, https://khanz9664.github.io/portfolio/
Project-URL: LinkedIn, https://www.linkedin.com/in/shahid-ul-islam-13650998/
Project-URL: Instagram, https://instagram.com/shaddy9664
Keywords: machine learning,model evaluation,model reliability,calibration,model trust,fairness,bias detection
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23
Requires-Dist: scikit-learn>=1.2
Requires-Dist: matplotlib>=3.6
Requires-Dist: scipy>=1.9
Provides-Extra: xgboost
Requires-Dist: xgboost>=1.7; extra == "xgboost"
Provides-Extra: torch
Requires-Dist: torch>=2.0; extra == "torch"
Requires-Dist: torchvision>=0.15; extra == "torch"
Provides-Extra: full
Requires-Dist: xgboost>=1.7; extra == "full"
Requires-Dist: torch>=2.0; extra == "full"
Requires-Dist: torchvision>=0.15; extra == "full"
Requires-Dist: shap>=0.42; extra == "full"
Requires-Dist: captum>=0.6; extra == "full"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: pre-commit>=3.0; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Requires-Dist: pip-audit>=2.7; extra == "dev"
Requires-Dist: pytest-xdist>=3.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=6.0; extra == "docs"
Requires-Dist: furo>=2024.1.1; extra == "docs"
Requires-Dist: myst-parser>=2.0; extra == "docs"
Requires-Dist: nbsphinx>=0.9; extra == "docs"
Dynamic: license-file

<div align="center">

# TrustLens

### Audit ML models beyond accuracy — calibration, fairness, latent health, and deployment verdicts.

<br/>

[![PyPI](https://badge.fury.io/py/trustlens.svg)](https://pypi.org/project/trustlens/)
[![Downloads](https://img.shields.io/pypi/dm/trustlens)](https://pypi.org/project/trustlens)
[![CI](https://github.com/Khanz9664/TrustLens/actions/workflows/ci.yml/badge.svg)](https://github.com/Khanz9664/TrustLens/actions)
[![Coverage](https://img.shields.io/badge/coverage-75%25-brightgreen)](https://github.com/Khanz9664/TrustLens)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Tests](https://img.shields.io/badge/tests-237%20passing-success)](https://github.com/Khanz9664/TrustLens/actions)

<br/>

[**Quickstart**](#quickstart) · [**How It Works**](#how-it-works) · [**Demo Video**](https://youtu.be/lZo7aWg6efI) · [**Docs**](docs/index.md) · [**Project Showcase**](https://khanz9664.github.io/portfolio/projects/trustlens.html)

</div>

---

> **Your model has 92% accuracy. It's still not safe for deployment.**
>
> Accuracy measures what went right. TrustLens measures what can go wrong — in production, on subgroups, and at high confidence.

---

## Why TrustLens

Standard evaluation stops at accuracy. Silent failures happen when:

- A model is **overconfident** — "90% sure" but right only 60% of the time
- Performance **collapses on subgroups** — gender, age, or region hidden inside a good aggregate score
- The model is **confidently wrong** — high-confidence errors that indicate systemic risk
- **Latent representations overlap** — classes bleed together where the model can't tell them apart

TrustLens surfaces all four with a single audit, and outputs a machine-readable deployment verdict.

---

## Supported Frameworks

TrustLens uses a **Prediction Resolver Architecture** to automatically handle different ML frameworks:

- **scikit-learn** — Full support for all `ClassifierMixin` estimators.
- **XGBoost** — Native support for `XGBClassifier` and raw `Booster` objects.
- **Planned** — LightGBM, CatBoost, PyTorch, TensorFlow/Keras.

TrustLens **automatically detects** your model's framework. You don't need to change your code when switching from sklearn to XGBoost.

---

## Quickstart

```bash
pip install trustlens
# Extended visualization support
pip install trustlens[full]
```

Run a one-line audit to see why 94% accuracy isn't the full story:

```python
from trustlens import quick_analyze

quick_analyze(dataset="breast_cancer")
```

```
TRUST SCORE: 68/100 [D]
Assessment : Low Trust — Blocked by high diagnostic risk

  Base Score        : 76
  Penalties Applied : -7.7 (Failure Risk)
  Final Score       : 68

→ Model shows high failure risk and is NOT ready for deployment.
```

---

## How It Works

TrustLens runs four diagnostic modules and combines them into a single **Trust Score (0–100)** with a CI/CD-ready deployment verdict.

| Module | What It Catches |
|---|---|
|  **Calibration** | Confidence vs. correctness mismatch, overconfidence, ECE |
|  **Fairness** | Subgroup performance gaps, equalized-odds violations |
|  **Representation** | Latent space health, class separation, overlap detection |
|  **Decision Engine** | Composite Trust Score + `Ready` / `Blocked` verdict |

---

## Full Audit

### Automatic Detection (Sklearn / XGBoost)

```python
from trustlens import analyze
from xgboost import XGBClassifier

model = XGBClassifier().fit(X_train, y_train)

# TrustLens automatically detects XGBoost and resolves predictions
report = analyze(
    model=model,
    X=X_test,
    y_true=y_test,
    sensitive_features={"gender": gender_test}
)

report.show()
```

### Manual Prediction Override

For external inference systems or unsupported frameworks, you can pass predictions directly:

```python
report = analyze(
    model=None, # optional when passing y_pred/y_prob
    X=X_test,
    y_true=y_test,
    y_pred=external_preds,
    y_prob=external_probs
)
```

### Audit Metadata & Provenance

Every report tracks its own backend provenance for auditability:

```python
print(report.metadata["framework"])  # "xgboost"
print(report.metadata["backend"])    # {'resolver': 'xgboost', 'framework_version': '2.0.3', ...}
```

### Save & Export

```python
# Save as a unified JSON artifact (best for experiment trackers)
report.save("report.json")

# Save as a full directory bundle (best for human review)
report.save("trust_report/")
```

### Output artifacts (Directory Bundle)

```
trust_report/
├── trust_score.json    ← deployment verdict & composite score
├── report.json         ← raw diagnostic metrics
├── metadata.json       ← environment, version, backend provenance
├── report.txt          ← human-readable summary
└── visuals/            ← per-module diagnostic plots (PNG)
```

### CI/CD gating

Gate model promotion on `trust_score.json` — no custom scripting needed:

```json
{
  "score": 68,
  "grade": "D",
  "verdict": "Low Trust — Blocked by high failure risk",
  "is_blocked": true
}
```

---

## Diagnostics in Practice

<div align="center">
<table width="100%">
  <tr>
    <td width="50%" align="center">
      <b>Calibration</b><br/>
      <img src="assets/calibration_plot.png" width="90%" /><br/>
      <sub>Does confidence align with correctness?</sub>
    </td>
    <td width="50%" align="center">
      <b>Fairness & Bias</b><br/>
      <img src="assets/fairness_gap_region.png" width="90%" /><br/>
      <sub>Are subgroups treated equally?</sub>
    </td>
  </tr>
  <tr>
    <td width="50%" align="center">
      <b>Latent Space Health</b><br/>
      <img src="assets/representation_embedding_2d.png" width="90%" /><br/>
      <sub>Is class separation clean?</sub>
    </td>
    <td width="50%" align="center">
      <b>Deployment Verdict</b><br/>
      <img src="assets/summary_dashboard.png" width="90%" /><br/>
      <sub>Is this model safe to ship?</sub>
    </td>
  </tr>
</table>
</div>

---

## Demo

[![Watch the demo](https://img.youtube.com/vi/lZo7aWg6efI/mqdefault.jpg)](https://youtu.be/lZo7aWg6efI)

15-minute walkthrough: diagnostics, trust scoring, fairness analysis, and visual dashboards.

Want a deeper look at the architecture and design decisions? → [**Interactive Project Showcase**](https://khanz9664.github.io/portfolio/projects/trustlens.html)

---

## Run the Full Demo

```bash
python demo.py
```

Generates multi-model comparisons, fairness deep-dives, latent space projections, JSON audits, and visual dashboards across all modules.

---

## Contributing

All contributions welcome — new metrics, diagnostic plugins, and visualizations.

→ [**Contributing Guide**](CONTRIBUTING.md) · [**Open an Issue**](https://github.com/Khanz9664/TrustLens/issues) · [**Docs**](docs/index.md)

<br/>

<a href="https://github.com/Khanz9664"><img src="https://github.com/Khanz9664.png" width="36" style="border-radius:50%" /></a>
<a href="https://github.com/jayssSmm"><img src="https://github.com/jayssSmm.png" width="36" style="border-radius:50%" /></a>
<a href="https://github.com/WeiGuang-2099"><img src="https://github.com/WeiGuang-2099.png" width="36" style="border-radius:50%" /></a>
<a href="https://github.com/CrepuscularIRIS"><img src="https://github.com/CrepuscularIRIS.png" width="36" style="border-radius:50%" /></a>
<a href="https://github.com/komoike-oss28-ui"><img src="https://github.com/komoike-oss28-ui.png" width="36" style="border-radius:50%" /></a>
<a href="https://github.com/sidharth-vijayan"><img src="https://github.com/sidharth-vijayan.png" width="36" style="border-radius:50%" /></a>
<a href="https://github.com/MustansirNisar"><img src="https://github.com/MustansirNisar.png" width="36" style="border-radius:50%" /></a>

---

## Citation

```bibtex
@software{trustlens2026,
  author = {Shahid Ul Islam},
  title  = {TrustLens: Audit ML models beyond accuracy},
  year   = {2026},
  url    = {https://github.com/Khanz9664/TrustLens}
}
```

<div align="center">
<br/>
Built by <a href="https://github.com/Khanz9664"><b>Shahid Ul Islam</b></a> &nbsp;·&nbsp;
<a href="https://khanz9664.github.io/portfolio/">Portfolio</a> &nbsp;·&nbsp;
<a href="https://www.linkedin.com/in/shahid-ul-islam-13650998/">LinkedIn</a>
</div>
