Metadata-Version: 2.4
Name: uncalibrated_linearsvc
Version: 0.0.2
Summary: Uncalibrated Linearsvc
Home-page: https://github.com/maximz/uncalibrated-linearsvc
Author: Maxim Zaslavsky
Author-email: maxim@maximz.com
License: MIT license
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: scikit-learn
Requires-Dist: genetools
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# uncalibrated-linearsvc

[![](https://img.shields.io/pypi/v/uncalibrated_linearsvc.svg)](https://pypi.python.org/pypi/uncalibrated_linearsvc)
[![CI](https://github.com/maximz/uncalibrated-linearsvc/actions/workflows/ci.yaml/badge.svg?branch=master)](https://github.com/maximz/uncalibrated-linearsvc/actions/workflows/ci.yaml)
[![](https://img.shields.io/badge/docs-here-blue.svg)](https://uncalibrated-linearsvc.maximz.com)
[![](https://img.shields.io/github/stars/maximz/uncalibrated-linearsvc?style=social)](https://github.com/maximz/uncalibrated-linearsvc)

`uncalibrated-linearsvc` provides a small scikit-learn-compatible wrapper around
`sklearn.svm.LinearSVC` that adds a `predict_proba()` method.

## What It Is

The package exposes `LinearSVCWithUncalibratedProbabilities`, a subclass of
`LinearSVC`. It keeps the normal `LinearSVC` API and behavior, then implements
`predict_proba()` from the classifier's `decision_function()` output.

This is useful when downstream sklearn tools expect probability-shaped output.
One motivating case noted in the source code is multiclass ROC AUC, where
`roc_auc_score()` requires per-class scores that sum to 1 in each row.
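
The multiclass ROC AUC requirement can be illustrated with plain scikit-learn.
The sketch below is not this package's code: it uses a stock `LinearSVC`,
synthetic data from `make_classification`, and `scipy.special.softmax` as a
stand-in normalizer, and it scores on the training data only for brevity.

```python
from scipy.special import softmax
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.svm import LinearSVC

# Synthetic three-class data, for illustration only.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

clf = LinearSVC(dual=False, random_state=0).fit(X, y)
margins = clf.decision_function(X)  # shape (n_samples, 3), unnormalized margins

# Raw margins fail the multiclass check that each row sums to 1.
try:
    roc_auc_score(y, margins, multi_class="ovr")
    raw_margins_accepted = True
except ValueError:
    raw_margins_accepted = False

# Row-normalized scores pass the same check.
scores = softmax(margins, axis=1)
auc = roc_auc_score(y, scores, multi_class="ovr")
```

Normalizing each row is what lets the score array through; it does not make
the values calibrated probabilities.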

## How It Works

`predict_proba(X)` calls `decision_function(X)` and passes the resulting margins
through `genetools.stats.run_sigmoid_if_binary_and_softmax_if_multiclass()`:

- binary classifiers use a sigmoid transform and return shape
  `(n_samples, 2)`;
- multiclass classifiers use a softmax transform and return shape
  `(n_samples, n_classes)`.

Each row sums to 1, matching the output format of sklearn classifiers that
implement `predict_proba()`.
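
The two transforms described above can be sketched with NumPy alone. This is an
illustrative reimplementation, not the `genetools` code: `margins_to_scores` is
a hypothetical helper name, and the column order for the binary case (negative
class first, positive class second) is an assumption based on the usual sklearn
convention.

```python
import numpy as np


def margins_to_scores(margins):
    """Map decision margins to row-normalized scores (hypothetical helper)."""
    margins = np.asarray(margins, dtype=float)
    if margins.ndim == 1:
        # Binary case: one margin per sample, squashed with a sigmoid.
        positive = 1.0 / (1.0 + np.exp(-margins))
        return np.column_stack([1.0 - positive, positive])
    # Multiclass case: softmax, subtracting the row maximum for stability.
    shifted = margins - margins.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Binary margins have shape (n_samples,); the result has shape (n_samples, 2).
binary_scores = margins_to_scores([-2.0, 0.0, 3.0])

# Multiclass margins have shape (n_samples, n_classes); the shape is preserved.
multiclass_scores = margins_to_scores([[1.0, 0.5, -1.0], [0.0, 0.0, 0.0]])
```

A margin of 0 maps to a score of 0.5 in the binary case, and equal margins map
to equal scores in the multiclass case.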

## Important Limitation

The returned values are uncalibrated. They are transformed SVM decision margins,
not calibrated probability estimates. Use them when you need normalized
probability-like scores for sklearn APIs, not when you need well-calibrated
confidence values. For calibrated probabilities, use sklearn's calibration
tools, such as `CalibratedClassifierCV`, instead.
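
For comparison, a calibrated alternative might look like the following sketch.
It uses scikit-learn's `CalibratedClassifierCV` around a stock `LinearSVC`; the
synthetic data, the `sigmoid` method, and `cv=3` are illustrative choices
rather than recommendations from this package.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic three-class data, for illustration only.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the SVM and a sigmoid calibrator using 3-fold cross-validation.
calibrated = CalibratedClassifierCV(
    LinearSVC(dual=False, random_state=0), method="sigmoid", cv=3
)
calibrated.fit(X_train, y_train)

calibrated_probabilities = calibrated.predict_proba(X_test)
```

Calibration fits extra models on held-out folds, so it costs more training time
than transforming the margins directly.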

## Installation

```bash
pip install uncalibrated_linearsvc
```

The package requires Python 3.8 or newer and depends on `numpy`,
`scikit-learn`, and `genetools`.

## Usage

```python
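from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from uncalibrated_linearsvc import LinearSVCWithUncalibratedProbabilities

# Synthetic three-class data, for illustration only.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LinearSVCWithUncalibratedProbabilities(
    dual=False,
    multi_class="ovr",
    random_state=0,
)

clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
# One row per sample, one column per class; each row sums to 1.
probability_like_scores = clf.predict_proba(X_test)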
```

Because the class subclasses `LinearSVC`, constructor arguments such as
`dual`, `multi_class`, and `random_state` are the same as in scikit-learn.

## Development

```bash
pip install -r requirements_dev.txt
pip install -e .
make test
make lint
```

The test suite covers binary and multiclass `predict_proba()` shape handling and
row normalization.


# Changelog

## 0.0.1

* First release on PyPI.
