Metadata-Version: 2.4
Name: type_preserving_scaler
Version: 0.0.2
Summary: Type Preserving Scaler
Home-page: https://github.com/maximz/type-preserving-scaler
Author: Maxim Zaslavsky
Author-email: maxim@maximz.com
License: MIT license
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: scikit-learn
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Type Preserving Scaler

[![](https://img.shields.io/pypi/v/type_preserving_scaler.svg)](https://pypi.python.org/pypi/type_preserving_scaler)
[![CI](https://github.com/maximz/type-preserving-scaler/actions/workflows/ci.yaml/badge.svg?branch=master)](https://github.com/maximz/type-preserving-scaler/actions/workflows/ci.yaml)
[![](https://img.shields.io/badge/docs-here-blue.svg)](https://type-preserving-scaler.maximz.com)
[![](https://img.shields.io/github/stars/maximz/type-preserving-scaler?style=social)](https://github.com/maximz/type-preserving-scaler)

`type_preserving_scaler` provides a small wrapper around scikit-learn's
`StandardScaler` for projects that want predictable output container types.

## What it is

The package exposes one class:

```python
from type_preserving_scaler import StandardScalerThatPreservesInputType
```

`StandardScalerThatPreservesInputType` behaves like
`sklearn.preprocessing.StandardScaler`, with one added convention:

- fit on a pandas `DataFrame`, and `transform()` / `fit_transform()` return a
  pandas `DataFrame`
- fit on a NumPy array, and `transform()` / `fit_transform()` return a NumPy
  `ndarray`

For DataFrame inputs, the tests check that the output keeps the input's shape,
columns, and index.

## Why it exists

By default, plain `StandardScaler` returns a NumPy array from `transform()` even
when it is given a pandas `DataFrame`. Downstream code then has to rebuild the
DataFrame by hand or carry the column and index metadata separately.
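
As an illustration of that manual step, the sketch below uses only plain
scikit-learn and pandas, assuming scikit-learn's default output configuration.
The variable names (`plain`, `rebuilt`) are made up for this example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]},
    index=["x", "y", "z"],
)

# Under the default output configuration, the result is a bare ndarray:
# the column names and the index are gone.
plain = StandardScaler().fit_transform(df)

# The caller has to reattach the metadata by hand.
rebuilt = pd.DataFrame(plain, columns=df.columns, index=df.index)
```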

This package keeps the common "pandas in, pandas out" workflow while retaining
the familiar `StandardScaler` API.

## How it works

The class subclasses `sklearn.preprocessing.StandardScaler`. During `fit()`, it
delegates to the parent scaler, then calls scikit-learn's `set_output()` API:

- `set_output(transform="pandas")` when `fit()` receives a pandas `DataFrame`
- `set_output(transform="default")` otherwise

The output type is therefore determined by the data passed to `fit()`, not by
each later call to `transform()`.

## Installation

```bash
pip install type_preserving_scaler
```

The package requires Python 3.8 or newer and depends on NumPy, pandas, and
scikit-learn.

## Usage

```python
import pandas as pd
from type_preserving_scaler import StandardScalerThatPreservesInputType

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

scaler = StandardScalerThatPreservesInputType()
scaled = scaler.fit_transform(df)

assert isinstance(scaled, pd.DataFrame)
assert scaled.columns.equals(df.columns)
assert scaled.index.equals(df.index)
```

NumPy inputs keep NumPy outputs:

```python
import numpy as np
from type_preserving_scaler import StandardScalerThatPreservesInputType

array = np.array([[1.0, 2.0], [3.0, 4.0]])
scaled = StandardScalerThatPreservesInputType().fit_transform(array)

assert isinstance(scaled, np.ndarray)
```

## Development

Install the development dependencies and the package in editable mode:

```bash
pip install -r requirements_dev.txt
pip install -e .
```

Common local commands:

```bash
make test
make lint
make docs
```

## Limitations

- This package only wraps `StandardScaler`; it is not a general adapter for all
  scikit-learn transformers.
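- The output type is chosen when `fit()` runs and stays fixed until the next
  `fit()`; the type of the data passed to a later `transform()` call does not
  change it.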
- The installed scikit-learn version must provide the `set_output()` API used by
  the implementation (the API was introduced in scikit-learn 1.2).
- The package preserves the output container type. It does not change
  `StandardScaler`'s scaling behavior.

## License

MIT.


# Changelog

## 0.0.1

* First release on PyPI.
