Metadata-Version: 2.4
Name: ml-modelguard
Version: 0.2.0
Summary: A drop-in seat-belt library for machine-learning model files that prevents hidden malware and verifies provenance
Author-email: Kartik Khosa <kartik.khosa@gmail.com>
Project-URL: Homepage, https://github.com/kk25081998/Modelguard
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: pydantic>=2.0.0
Requires-Dist: typer>=0.9.0
Requires-Dist: sigstore>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0.0
Provides-Extra: torch
Requires-Dist: torch>=1.9.0; extra == "torch"
Provides-Extra: tensorflow
Requires-Dist: tensorflow>=2.8.0; extra == "tensorflow"
Provides-Extra: sklearn
Requires-Dist: scikit-learn>=1.0.0; extra == "sklearn"
Provides-Extra: onnx
Requires-Dist: onnx>=1.12.0; extra == "onnx"
Requires-Dist: onnxruntime>=1.12.0; extra == "onnx"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"

# ModelGuard 🛡️

[![PyPI version](https://badge.fury.io/py/ml-modelguard.svg)](https://badge.fury.io/py/ml-modelguard)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Tests](https://img.shields.io/badge/tests-54%2F54%20passing-brightgreen.svg)](https://github.com/kk25081998/Modelguard)

A drop-in "seat-belt" library for machine learning model files that **prevents hidden malware**, **verifies provenance**, and works seamlessly across PyTorch, TensorFlow, scikit-learn, and ONNX.

## 🚨 The Problem

Machine learning models are increasingly shared and downloaded from public repositories, and that convenience carries serious security risks:

- **Arbitrary Code Execution**: ML model formats based on Pickle can execute malicious code the moment a file is loaded
- **Supply Chain Attacks**: Models from untrusted sources can contain hidden malware
- **No Provenance Verification**: There is typically no way to verify who created a model or whether it has been tampered with
- **Framework Fragmentation**: Different security approaches for each ML framework
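
The Pickle risk in particular is easy to demonstrate: any object's `__reduce__` method can smuggle a callable that the unpickler invokes at load time. The sketch below uses a harmless `print` call so it is safe to run; a real attack would substitute something like `os.system`.

```python
import pickle

class Payload:
    """Illustration of the Pickle risk: __reduce__ lets an object
    dictate a callable that runs during unpickling."""

    def __reduce__(self):
        # A real attack would return (os.system, ("malicious command",));
        # we use a benign builtin so this demo is safe to execute.
        return (print, ("code executed during unpickling!",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # the print call fires here -- no model data needed
```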

## ✨ The Solution

ModelGuard provides comprehensive ML model security with:

🔒 **Safe Loading** - Blocks malicious Pickle opcodes with a restricted unpickler  
🔐 **Signature Verification** - Guarantees model provenance via Sigstore signatures  
⚡ **Zero Friction** - Drop-in replacement requiring minimal code changes  
🌐 **Multi-Framework** - Unified security across PyTorch, TensorFlow, scikit-learn, and ONNX  
🚀 **Production Ready** - Extensively tested with 54/54 tests passing

## 🚀 Quick Start

### Installation

```bash
pip install ml-modelguard
```

### Basic Usage

**Option 1: Direct Replacement**

```python
# Before: Unsafe loading
import torch
model = torch.load('model.pth')

# After: Safe loading
import modelguard.torch as torch
model = torch.safe_load('model.pth')
```

**Option 2: Context Manager (Recommended)**

```python
import modelguard
import torch

with modelguard.patched():
    model = torch.load('model.pth')  # Automatically secured
```

**Option 3: CLI Scanning**

```bash
# Scan a model file
modelguard scan model.pth

# Scan entire directory
modelguard scan ./models/ --recursive

# Get JSON output
modelguard scan model.pth --format json
```

## 🔧 Framework Support

### PyTorch

```python
import modelguard.torch as torch
model = torch.safe_load('model.pth')
```

### TensorFlow/Keras

```python
import modelguard.tensorflow as tf
model = tf.safe_load('model.h5')
```

### scikit-learn

```python
import modelguard.sklearn as sklearn
model = sklearn.safe_load('model.pkl')
```

### ONNX

```python
import modelguard.onnx as onnx
model = onnx.safe_load('model.onnx')
```

## 🛡️ Security Features

### Malicious Code Detection

ModelGuard analyzes Pickle opcodes to detect dangerous patterns:

- **GLOBAL opcodes** that import arbitrary modules and functions
- **REDUCE opcodes** that invoke those callables with attacker-controlled arguments
- **BUILD opcodes** that construct objects with malicious state
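
You can inspect these opcodes yourself with the standard-library `pickletools` module. A crude version of this kind of static check might look like the following (an illustrative sketch, not ModelGuard's actual scanner):

```python
import pickle
import pickletools

# Opcode names that can pull in or invoke arbitrary callables.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "BUILD", "INST", "OBJ"}

def flag_opcodes(blob: bytes) -> list:
    """Return the names of potentially dangerous opcodes in a pickle stream."""
    return [op.name for op, _arg, _pos in pickletools.genops(blob)
            if op.name in SUSPICIOUS]

class Evil:
    def __reduce__(self):
        return (print, ("pwned",))

hits = flag_opcodes(pickle.dumps(Evil()))
assert "REDUCE" in hits  # the payload pickle trips the check
```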

### Signature Verification

Verify model authenticity using Sigstore:

```bash
# Sign a model
modelguard sign model.pth

# Verify signature
modelguard verify model.pth
```

### Policy Enforcement

Configure security policies via environment variables or YAML:

```yaml
# modelguard.yaml
enforce: true
require_signatures: true
trusted_signers:
  - "alice@company.com"
  - "bob@company.com"
max_file_size_mb: 1000
```
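
A policy file like the one above maps naturally onto a small typed object. The sketch below uses a stdlib `dataclass` purely for illustration; the field names mirror the YAML keys above, but ModelGuard's real policy model (built on pydantic) may differ.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Illustrative stand-in for a parsed modelguard.yaml policy."""
    enforce: bool = False
    require_signatures: bool = False
    trusted_signers: list = field(default_factory=list)
    max_file_size_mb: int = 1000

    def is_trusted(self, signer: str) -> bool:
        # With no allow-list configured, treat every signer as trusted.
        return not self.trusted_signers or signer in self.trusted_signers

# e.g. the dict produced by yaml.safe_load() on the file above
raw = {
    "enforce": True,
    "require_signatures": True,
    "trusted_signers": ["alice@company.com", "bob@company.com"],
    "max_file_size_mb": 1000,
}
policy = Policy(**raw)
assert policy.is_trusted("alice@company.com")
assert not policy.is_trusted("mallory@evil.example")
```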

## 📊 Performance

ModelGuard is designed for production use with excellent performance:

- **Fast Scanning**: < 150 ms to scan a 100 MB model (twice as fast as the project's target)
- **Memory Efficient**: Stable memory usage with no leaks during repeated loads
- **Concurrent Safe**: Thread-safe operations whose throughput scales linearly with worker count
- **Low Overhead**: A modest load-time cost in exchange for comprehensive protection

## 🔧 Configuration

### Environment Variables

```bash
export MODELGUARD_ENFORCE=true
export MODELGUARD_REQUIRE_SIGNATURES=true
export MODELGUARD_TRUSTED_SIGNERS="alice@company.com,bob@company.com"
```
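
Environment values arrive as strings, so a loader has to coerce booleans and split the comma-separated signer list. A minimal sketch of that coercion (the variable names come from above; the parsing logic itself is illustrative, not ModelGuard's):

```python
import os

def policy_from_env(environ=os.environ) -> dict:
    """Illustrative parser for the MODELGUARD_* variables shown above."""
    def as_bool(value: str) -> bool:
        return value.strip().lower() in {"1", "true", "yes", "on"}

    signers = environ.get("MODELGUARD_TRUSTED_SIGNERS", "")
    return {
        "enforce": as_bool(environ.get("MODELGUARD_ENFORCE", "false")),
        "require_signatures": as_bool(
            environ.get("MODELGUARD_REQUIRE_SIGNATURES", "false")),
        "trusted_signers": [s.strip() for s in signers.split(",") if s.strip()],
    }

cfg = policy_from_env({
    "MODELGUARD_ENFORCE": "true",
    "MODELGUARD_TRUSTED_SIGNERS": "alice@company.com,bob@company.com",
})
```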

### Policy File

Create `modelguard.yaml` in your project root:

```yaml
enforce: true
require_signatures: false
scan_on_load: true
max_file_size_mb: 1000
timeout_seconds: 30
```

## 📚 Examples

### Enterprise Security Setup

```python
import modelguard
import os

# Configure strict security policy
os.environ['MODELGUARD_ENFORCE'] = 'true'
os.environ['MODELGUARD_REQUIRE_SIGNATURES'] = 'true'
os.environ['MODELGUARD_TRUSTED_SIGNERS'] = 'security@company.com'

# All model loading is now secured
with modelguard.patched():
    import torch
    import tensorflow as tf

    # Both calls are automatically secured
    pytorch_model = torch.load('model.pth')
    tf_model = tf.keras.models.load_model('model.h5')
```

### Development Workflow

```python
import modelguard
import modelguard.torch as torch

# Safe loading with detailed feedback
try:
    model = torch.safe_load('untrusted_model.pth')
    print("✅ Model loaded safely")
except modelguard.MaliciousModelError as e:
    print(f"🚨 Malicious content detected: {e}")
except modelguard.SignatureError as e:
    print(f"🔐 Signature verification failed: {e}")
```

## 🧪 Testing

ModelGuard has comprehensive test coverage:

```bash
# Run all tests
pytest tests/

# Run specific test categories
pytest tests/test_policy.py      # Policy engine tests
pytest tests/test_scanner.py     # Malware detection tests
pytest tests/test_loaders.py     # Framework loader tests
pytest tests/test_performance.py # Performance benchmarks
```

**Test Results**: 54/54 tests passing ✅

## 🤝 Contributing

We welcome contributions! Here's how to get started:

### Development Setup

1. **Fork and Clone**

   ```bash
   git clone https://github.com/YOUR_USERNAME/Modelguard.git
   cd Modelguard
   ```

2. **Install Development Dependencies**

   ```bash
   pip install -e ".[dev]"
   ```

3. **Run Tests**

   ```bash
   pytest tests/
   ```

4. **Code Quality Checks**

   ```bash
   ruff check src/ tests/
   mypy src/
   ```

### What We Need Help With

- 🐛 **Bug Reports**: Found a problem? Open an issue with details
- 🚀 **New Features**: Ideas for improving ML security
- 📚 **Documentation**: Help improve our docs and examples
- 🧪 **Testing**: More test cases and edge case coverage
- 🔧 **Framework Support**: Additional ML framework integrations

See our [Contributing Guide](CONTRIBUTING.md) for detailed guidelines.

## 📄 License

ModelGuard is licensed under the [Apache License 2.0](LICENSE).

## 🔗 Links

- **PyPI**: https://pypi.org/project/ml-modelguard/
- **Documentation**: https://github.com/kk25081998/Modelguard
- **Issues**: https://github.com/kk25081998/Modelguard/issues

## 🙏 Acknowledgments

- **Sigstore** for signature verification infrastructure
- **Python Security Team** for security best practices
- **ML Community** for feedback and testing

---

**Made with ❤️ for the ML community's security**
