Metadata-Version: 2.4
Name: ai-critic
Version: 0.2.5
Summary: Fast AI evaluator for scikit-learn models
Author-email: Luiz Seabra <filipedemarco@yahoo.com>
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: scikit-learn

# ai-critic 🧠: The Quality Gate for Machine Learning Models

**ai-critic** is a specialized **decision-making** tool designed to audit the reliability and readiness for deployment of scikit-learn compatible Machine Learning models.

Instead of just measuring performance (accuracy, F1 score), **ai-critic** acts as a "Quality Gate," probing the model for hidden risks that can lead to production failures, such as data leakage, structural overfitting, and vulnerability to noise.

---

## 🚀 1. Getting Started (The Basics)

This section is ideal for beginners who need a quick verdict on the health of their model.

### 1.1. Installation

Install the library directly from PyPI:

```bash
pip install ai-critic
```

### 1.2. The Quick Verdict

With just a few lines, you can get an executive evaluation and a deployment recommendation.

```python
from ai_critic import AICritic
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# 1. Prepare your data and model
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
model = RandomForestClassifier(max_depth=5, random_state=42)

# 2. Initialize the critic
# AICritic performs all audits internally
critic = AICritic(model, X, y)

# 3. Obtain the Executive Summary
report = critic.evaluate(view="executive")

print(f"Verdict: {report['verdict']}")
print(f"Risk: {report['risk_level']}")
print(f"Main Reason: {report['main_reason']}")

# Expected Output:
# Verdict: ✅ Acceptable
# Risk: Low
# Main Reason: No critical risks detected.
```

---

## 💡 2. Understanding the Critique (The Intermediate Level)

This section is for data scientists who need to understand *why* the model received its verdict and what to do next.

### 2.1. The Four Pillars of the Audit

**ai-critic** evaluates your model across four critical dimensions.

| Category | Main Risk | Code Module |
| :--- | :--- | :--- |
| 📈 **Validation** | Suspicious CV Scores | `ai_critic.performance` |
| 🧪 **Robustness** | Noise Vulnerability | `ai_critic.robustness` |

### 2.2. Visual and Technical Analysis

The `evaluate` method allows you to view the results and access the complete technical report.

```python
# Continuing the previous example...

# 1. Generate the full report and visualizations
# plot=True generates Correlation, Learning Curve, and Robustness graphs
full_report = critic.evaluate(view="all", plot=True)

# 2. Access the Technical Summary for Recommendations
technical_summary = full_report["technical"]

print("\n--- Technical Recommendations ---")
for i, risk in enumerate(technical_summary["key_risks"]):
    print(f"Risk {i+1}: {risk}")
    print(f"Recommendation: {technical_summary['recommendations'][i]}")

# Example of Risk (if there were one):
# Risk 1: The depth of the tree may be too high for the size of the dataset.
# Recommendation: Reduce model complexity or adjust hyperparameters.
```

### 2.3. Robustness Test

A robust model should maintain its performance even with small disturbances in the data. The `ai-critic` test assesses this by injecting noise into the input data.

```python
# Accessing the specific result of the Robustness module
robustness_result = full_report["details"]["robustness"]

print("\n--- Robustness Test ---")
print(f"Original CV Score: {robustness_result['cv_score_original']:.4f}")
print(f"CV Score with Noise: {robustness_result['cv_score_noisy']:.4f}")
print(f"Performance Drop: {robustness_result['performance_drop']:.4f}")
print(f"Robustness Verdict: {robustness_result['verdict']}")

# Possible Verdicts:
# - Stable: Acceptable drop.
# - Fragile: Significant drop (risk).
# - Misleading: Original performance inflated by leakage.
```
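The idea behind the test can be sketched with plain scikit-learn. This is a minimal illustration, not ai-critic's internal implementation: the Gaussian noise model, the `noise_level` parameter, and the `noise_robustness` helper are assumptions made for the example. Only the result keys mirror the ones shown above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def noise_robustness(model, X, y, noise_level=0.1, cv=3, seed=0):
    """Compare mean CV scores on clean and noise-perturbed inputs.

    Hypothetical helper: the noise is Gaussian, scaled per feature by
    noise_level times that feature's standard deviation.
    """
    rng = np.random.default_rng(seed)
    X_noisy = X + rng.normal(0.0, noise_level * X.std(axis=0), size=X.shape)

    # cross_val_score clones the model, so the original estimator is untouched
    score_original = cross_val_score(model, X, y, cv=cv).mean()
    score_noisy = cross_val_score(model, X_noisy, y, cv=cv).mean()
    return {
        "cv_score_original": score_original,
        "cv_score_noisy": score_noisy,
        "performance_drop": score_original - score_noisy,
    }


X, y = make_classification(n_samples=300, n_features=10, random_state=42)
model = RandomForestClassifier(n_estimators=20, max_depth=5, random_state=42)
result = noise_robustness(model, X, y)
print(f"Performance Drop: {result['performance_drop']:.4f}")
```

Note that this sketch perturbs the whole dataset, so each fold both trains and validates on noisy inputs. A stricter variant would add noise only to the validation folds.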

---

## ⚙️ 3. Integration and Governance (The Advanced)

This section is for MLOps engineers and architects looking to integrate **ai-critic** into automated pipelines and create custom deployment logic.

### 3.1. The Deployment Gate (`deploy_decision`)

The `deploy_decision()` method is the final control point. It returns a structured object that classifies problems into *Hard Blockers* (prevent deployment) and *Soft Blockers* (require attention, but can be accepted with reservations).

```python
# Example of use in a CI/CD pipeline
decision = critic.deploy_decision()

if decision["deploy"]:
    print("✅ Deployment Approved. Risk Level: Low.")
else:
    print(f"❌ Deployment Blocked. Risk Level: {decision['risk_level'].upper()}")
    print("Blocking Issues:")
    for issue in decision["blocking_issues"]:
        print(f"- {issue}")

# The decision object also includes a heuristic confidence score (0.0 to 1.0)
print(f"Heuristic Confidence in Model: {decision['confidence']:.2f}")
```
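In an automated pipeline, the decision usually has to become a process exit code so that the CI job fails when deployment is blocked. The sketch below shows one way to do that. The `gate_exit_code` helper and the sample dictionary are hypothetical and only assume the keys used above (`deploy`, `risk_level`, `blocking_issues`, `confidence`).

```python
def gate_exit_code(decision: dict) -> int:
    """Translate a deployment decision into a process exit code (0 = pass)."""
    if decision["deploy"]:
        print("Deployment approved.")
        return 0
    print(f"Deployment blocked. Risk level: {decision['risk_level'].upper()}")
    for issue in decision["blocking_issues"]:
        print(f"- {issue}")
    return 1


# Stand-in for critic.deploy_decision(); the values are illustrative only.
sample_decision = {
    "deploy": False,
    "risk_level": "high",
    "blocking_issues": ["Suspected data leakage"],
    "confidence": 0.35,
}
exit_code = gate_exit_code(sample_decision)
# In a pipeline script, finish with: import sys; sys.exit(exit_code)
```

Returning the code instead of exiting inside the helper keeps the gate logic easy to unit test.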

### 3.2. Accessing the Raw Details

For custom *governance* rules or logic, you can access the raw data of each module through the `"details"` view.

```python
# Accessing Data Leakage Details
data_details = critic.evaluate(view="details")["data"]

if data_details["data_leakage"]["suspected"]:
    print("\n--- Data Leak Alert ---")
    for detail in data_details["data_leakage"]["details"]:
        print(f"Feature {detail['feature_index']} with correlation of {detail['correlation']:.4f}")

# Accessing Structural Overfitting Details
config_details = critic.evaluate(view="details")["config"]

if config_details["structural_warnings"]:
    print("\n--- Structural Alert ---")
    for warning in config_details["structural_warnings"]:
        print(f"Warning: {warning['message']} (Max Depth: {warning['max_depth']}, Recommended: {warning['recommended_max_depth']})")
```
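The leakage details above pair a feature index with a correlation value. A correlation-based check of that kind can be sketched with NumPy alone, as below. This is an assumption about the general technique, not ai-critic's actual implementation: the `find_suspicious_features` helper and the `0.9` threshold are hypothetical.

```python
import numpy as np


def find_suspicious_features(X, y, threshold=0.9):
    """Return features whose absolute Pearson correlation with y exceeds threshold."""
    suspects = []
    for index in range(X.shape[1]):
        correlation = np.corrcoef(X[:, index], y)[0, 1]
        if abs(correlation) > threshold:
            suspects.append({"feature_index": index, "correlation": float(correlation)})
    return suspects


rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 3))
X[:, 1] = y + rng.normal(scale=0.01, size=200)  # feature 1 leaks the target

for detail in find_suspicious_features(X, y):
    print(f"Feature {detail['feature_index']} with correlation of {detail['correlation']:.4f}")
```

A near-perfect correlation between a single feature and the target is a common symptom of leakage, although it can also reflect a genuinely strong predictor, so a flagged feature calls for manual review.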

### 3.3. Best Practices and Use Cases

| Use | Recommended Action |
| :--- | :--- |
| **CI/CD** | Use `deploy_decision()` as an automated quality gate. |
| **Tuning** | Use the technical view to guide hyperparameter optimization. |
| **Governance** | Log the details view for auditing and compliance. |
| **Communication** | Use the executive view to report risks to non-technical stakeholders. |
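
For the governance use case, one option is to append each audit to a JSON Lines file. The sketch below is only an illustration: the `append_audit_record` helper, the file name, and the sample dictionary are hypothetical, and it assumes the details view is (or can be coerced into) JSON-serializable data.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def append_audit_record(path, model_name, details):
    """Append one timestamped audit record to a JSON Lines file."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "details": details,
    }
    with open(path, "a", encoding="utf-8") as handle:
        # default=str is a fallback for values json cannot encode natively
        handle.write(json.dumps(record, default=str) + "\n")
    return record


# Stand-in for critic.evaluate(view="details"); values are illustrative only.
sample_details = {"data": {"data_leakage": {"suspected": False, "details": []}}}
log_path = os.path.join(tempfile.mkdtemp(), "ai_critic_audit.jsonl")
append_audit_record(log_path, "random_forest_v1", sample_details)
```

An append-only log keeps one line per evaluation, which makes it straightforward to diff audits across model versions.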

---

## 📄 License

Distributed under the **MIT License**.

---

## 🧠 Final Note

> **ai-critic** is not a benchmarking tool. It's a decision-making tool.

If a model fails here, it doesn't mean it's "bad," but rather that it **shouldn't be trusted yet**. The goal is to inject the necessary skepticism to build truly robust AI systems.
