Metadata-Version: 2.4
Name: weighted_mcc
Version: 0.1.0
Summary: A robust measure of multiclass classifier performance for observations with individual weights.
Author-email: kuslavicek <kuslavicek@gmail.com>
License: MIT License
        
        Copyright (c) 2026 kuslavicek
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/kuslavicek/weighted_mcc
Project-URL: Bug Tracker, https://github.com/kuslavicek/weighted_mcc/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Dynamic: license-file

# Weighted MCC: Robust Multiclass Metrics

**Weighted MCC** is a Python package that implements robust performance metrics for binary and multiclass classification tasks where individual observations have different importance weights.

Based on the paper introducing the **Weighted Matthews Correlation Coefficient (MCC)** (see References), this package provides a principled way to evaluate classifiers in high-stakes domains such as medical image segmentation and autonomous driving, where some errors are costlier than others.

## Features

- **Weighted Binary MCC**: Calculate MCC for binary tasks with per-sample weights.
- **Multiclass Extensions**:
  - **ECC (Extended Correlation Coefficient)**: A robust multiclass generalization of MCC.
  - **MPC (Multivariate Pearson Correlation)**: Variants (MPC1, MPC2) derived from covariance matrix theory.
- **Robustness Analysis**:
  - Compute theoretical upper bounds on how much a metric can change when the weights are perturbed by at most $\epsilon$.
  - The proven stability means the metrics are not brittle: small changes in the weights cannot cause large swings in the score.
- **Efficient Implementation**: Vectorized operations using NumPy for high performance on large datasets.

## Installation

```bash
pip install weighted_mcc
```

## Usage

### Binary Classification

```python
import numpy as np
from weighted_mcc import weighted_mcc

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
weights = np.array([2.0, 1.0, 5.0, 1.0, 1.0]) # 3rd sample is critical

# Calculate Weighted MCC
score = weighted_mcc(y_true, y_pred, weights)
print(f"Weighted MCC: {score:.4f}")
```
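
To make the role of the weights concrete, here is a minimal sketch of one way a weighted binary MCC can be computed by hand: each confusion-matrix cell accumulates sample weights instead of counts, and the usual MCC formula is applied to those weighted cells. The name `weighted_mcc_sketch` is hypothetical, and the weighted-confusion-matrix definition is an assumption made for illustration. It is not necessarily how the package's `weighted_mcc` is implemented, so consult the paper for the exact formulation.

```python
import numpy as np

def weighted_mcc_sketch(y_true, y_pred, weights):
    """Illustrative weighted binary MCC (assumed definition, not the package's code)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    w = np.asarray(weights, dtype=float)

    # Each cell sums the weights of the samples that fall into it.
    tp = w[y_true & y_pred].sum()
    tn = w[~y_true & ~y_pred].sum()
    fp = w[~y_true & y_pred].sum()
    fn = w[y_true & ~y_pred].sum()

    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention chosen for this sketch: return 0 when a marginal is empty.
        return 0.0
    return float((tp * tn - fp * fn) / denom)

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
weights = np.array([2.0, 1.0, 5.0, 1.0, 1.0])

sketch_score = weighted_mcc_sketch(y_true, y_pred, weights)
uniform_score = weighted_mcc_sketch(y_true, y_pred, np.ones(5))
print(f"weighted: {sketch_score:.4f}, uniform weights: {uniform_score:.4f}")
```

Under this assumed definition, the heavily weighted miss on the third sample pulls the score from roughly 0.17 with uniform weights down to roughly -0.10.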

### Multiclass Classification

The multiclass functions generally expect one-hot encoded inputs, with one row per sample and one column per class, as in the example below. Check the package documentation for helper utilities that accept other label formats, if any are provided.

```python
import numpy as np
from weighted_mcc import extended_corr_coef, mpc_trace_ratio

# Example: 3 classes, 4 samples
y_true = np.array([[1,0,0], [0,1,0], [0,0,1], [1,0,0]])
y_pred = np.array([[1,0,0], [0,0,1], [0,0,1], [0,1,0]])
weights = np.array([1.0, 1.0, 2.0, 1.0])

# Extended Correlation Coefficient
ecc = extended_corr_coef(y_true, y_pred, weights)
print(f"ECC: {ecc:.4f}")

# Multivariate Pearson Correlation (Trace Ratio)
mpc1 = mpc_trace_ratio(y_true, y_pred, weights)
print(f"MPC1: {mpc1:.4f}")
```
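
If your labels are stored as integer class indices, the sketch below shows one way to build the one-hot arrays with plain NumPy. The helper name `one_hot` is hypothetical and is not part of the package. The last step forms a weighted confusion matrix, which is a convenient way to see where the weight mass lands; whether the package's functions use this exact matrix internally is an assumption.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Convert integer class indices to a (n_samples, n_classes) one-hot array."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(n_classes, dtype=int)[labels]

# Same data as the example above, written as integer labels.
true_labels = [0, 1, 2, 0]
pred_labels = [0, 2, 2, 1]
weights = np.array([1.0, 1.0, 2.0, 1.0])

y_true = one_hot(true_labels, 3)
y_pred = one_hot(pred_labels, 3)

# Entry (i, j) is the total weight of samples with true class i predicted as class j.
confusion = y_true.T @ (weights[:, None] * y_pred)
print(confusion)
```

The resulting `y_true` and `y_pred` match the arrays in the example above and can be passed to the multiclass functions in the same way.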

### Robustness Check

Check whether your metric score is stable under noise in the weights (for example, when the weights are assigned subjectively).

```python
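import numpy as np
from weighted_mcc import calculate_multiclass_stability_bound

# Same one-hot data as in the multiclass example
y_true = np.array([[1,0,0], [0,1,0], [0,0,1], [1,0,0]])
y_pred = np.array([[1,0,0], [0,0,1], [0,0,1], [0,1,0]])
weights = np.array([1.0, 1.0, 2.0, 1.0])

epsilon = 0.01  # Max potential deviation in weights
bound = calculate_multiclass_stability_bound(y_true, y_pred, weights, epsilon, metric_type='ECC')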

print(f"Score could vary by at most ±{bound:.4f} given epsilon={epsilon}")
```
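
A theoretical bound can be sanity-checked empirically by perturbing the weights at random and recording how far the score moves. The sketch below is illustrative only. It assumes that `epsilon` is an absolute, per-weight additive perturbation, which may differ from the package's definition. The names `empirical_max_deviation` and `weighted_accuracy` are hypothetical, and weighted accuracy is used as a stand-in metric so the example runs without the package. Random sampling only yields a lower estimate of the worst-case deviation, so it complements a proven bound but does not replace it.

```python
import numpy as np

def weighted_accuracy(y_true, y_pred, weights):
    """Stand-in metric for this sketch: fraction of total weight classified correctly."""
    return float(np.sum(weights * (y_true == y_pred)) / np.sum(weights))

def empirical_max_deviation(metric, y_true, y_pred, weights, epsilon,
                            n_trials=200, seed=0):
    """Largest observed score change over random weight perturbations in [-epsilon, epsilon]."""
    rng = np.random.default_rng(seed)
    base = metric(y_true, y_pred, weights)
    worst = 0.0
    for _ in range(n_trials):
        noise = rng.uniform(-epsilon, epsilon, size=weights.shape)
        # Keep perturbed weights non-negative (assumption of this sketch).
        perturbed = np.clip(weights + noise, 0.0, None)
        worst = max(worst, abs(metric(y_true, y_pred, perturbed) - base))
    return worst

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
weights = np.array([2.0, 1.0, 5.0, 1.0, 1.0])

deviation = empirical_max_deviation(weighted_accuracy, y_true, y_pred, weights, epsilon=0.01)
print(f"Largest observed change: {deviation:.5f}")
```

Any callable with the signature `metric(y_true, y_pred, weights)` can be passed in place of `weighted_accuracy`; if the observed deviation ever exceeded a reported bound, that would suggest the perturbation model assumed here differs from the one behind the bound.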

## Hardware Requirements

- **Minimum**: Modern CPU, 8 GB RAM (small datasets).
- **Recommended**: Multi-core CPU with AVX2, 16 GB+ RAM for large image segmentation tasks.


## References

This project incorporates research from the following paper:

- **Weighted MCC: A Robust Measure of Multiclass Classifier Performance for Observations with Individual Weights**
  Rommel Cortez, Bala Krishnamoorthy
  *arXiv:2512.20811*

