
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/evaluation/plot_metrics.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_evaluation_plot_metrics.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_evaluation_plot_metrics.py:


=======================================
Metrics specific to imbalanced learning
=======================================

Specific metrics have been developed to evaluate classifiers
trained on imbalanced data. :mod:`imblearn` provides two main
metrics that are not implemented in :mod:`sklearn`: (i) the
geometric mean and (ii) the index balanced accuracy.

.. GENERATED FROM PYTHON SOURCE LINES 11-15

.. code-block:: Python


    # Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    # License: MIT








.. GENERATED FROM PYTHON SOURCE LINES 16-20

.. code-block:: Python

    print(__doc__)

    RANDOM_STATE = 42








.. GENERATED FROM PYTHON SOURCE LINES 21-22

First, we will generate an imbalanced dataset.

.. GENERATED FROM PYTHON SOURCE LINES 24-39

.. code-block:: Python

    from sklearn.datasets import make_classification

    X, y = make_classification(
        n_classes=3,
        class_sep=2,
        weights=[0.1, 0.2, 0.7],
        n_informative=10,
        n_redundant=1,
        flip_y=0,
        n_features=20,
        n_clusters_per_class=4,
        n_samples=5000,
        random_state=RANDOM_STATE,
    )








.. GENERATED FROM PYTHON SOURCE LINES 40-41

We will split the data into a training set and a testing set, stratifying
on `y` so that both sets keep the original class proportions.

.. GENERATED FROM PYTHON SOURCE LINES 43-49

.. code-block:: Python

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, random_state=RANDOM_STATE
    )
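
The effect of `stratify` can be checked on a small toy problem. The snippet
below is a minimal sketch, not part of the original example: the names
`X_demo`, `y_demo` and `class_ratios` are hypothetical, and the dataset
parameters are chosen only for illustration.

.. code-block:: Python

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Toy three-class problem with illustrative class weights.
    X_demo, y_demo = make_classification(
        n_classes=3,
        weights=[0.1, 0.2, 0.7],
        n_informative=5,
        n_samples=2000,
        random_state=0,
    )

    # Stratified split: each subset should mirror the class ratios of y_demo.
    _, _, y_tr, y_te = train_test_split(
        X_demo, y_demo, stratify=y_demo, random_state=0
    )


    def class_ratios(labels):
        """Return the fraction of samples belonging to each of the 3 classes."""
        return np.bincount(labels, minlength=3) / len(labels)


    print("full :", class_ratios(y_demo).round(3))
    print("train:", class_ratios(y_tr).round(3))
    print("test :", class_ratios(y_te).round(3))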








.. GENERATED FROM PYTHON SOURCE LINES 50-53

We will create a pipeline made of a
:class:`~sklearn.preprocessing.StandardScaler`, a
:class:`~imblearn.over_sampling.SMOTE` over-sampler, and a
:class:`~sklearn.linear_model.LogisticRegression` classifier.

.. GENERATED FROM PYTHON SOURCE LINES 53-59

.. code-block:: Python


    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    from imblearn.over_sampling import SMOTE








.. GENERATED FROM PYTHON SOURCE LINES 60-68

.. code-block:: Python

    from imblearn.pipeline import make_pipeline

    model = make_pipeline(
        StandardScaler(),
        SMOTE(random_state=RANDOM_STATE),
        LogisticRegression(max_iter=10_000, random_state=RANDOM_STATE),
    )








.. GENERATED FROM PYTHON SOURCE LINES 69-73

Now, we will train the model on the training set and get the predictions
for the testing set. Be aware that resampling happens only when calling
`fit`: the number of samples in `y_pred` is the same as in `y_test`.

.. GENERATED FROM PYTHON SOURCE LINES 75-78

.. code-block:: Python

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
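
To make the fit-only resampling concrete, the following sketch applies the
sampler by hand and compares sample counts. It assumes a small binary toy
dataset; the names `X_demo`, `demo_model` and `demo_pred` are hypothetical
and independent of the variables used above.

.. code-block:: Python

    from collections import Counter

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import make_pipeline

    # Binary toy problem with roughly a 9:1 class ratio (illustrative values).
    X_demo, y_demo = make_classification(
        n_samples=1000, weights=[0.9, 0.1], random_state=0
    )
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_demo, y_demo, stratify=y_demo, random_state=0
    )

    # Applying the sampler directly shows what happens to the training data.
    X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
    print("before resampling:", Counter(y_tr))
    print("after resampling: ", Counter(y_res))

    # Inside a pipeline, the sampler is applied during fit but not at
    # prediction time, so one prediction is returned per test sample.
    demo_model = make_pipeline(
        SMOTE(random_state=0), LogisticRegression(max_iter=1000)
    )
    demo_model.fit(X_tr, y_tr)
    demo_pred = demo_model.predict(X_te)
    print(len(demo_pred) == len(y_te))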








.. GENERATED FROM PYTHON SOURCE LINES 79-82

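The geometric mean corresponds to the square root of the product of the
sensitivity and specificity. For multiclass problems, it is a higher root
of the product of the per-class sensitivities. Because a classifier must
perform well on every class to score high, the metric is not dominated by
the majority class.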
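
For the binary case, the definition can be worked through by hand. The
sketch below uses only the standard library; the labels and the helper
`sensitivity_specificity` are hypothetical and serve only to illustrate the
formula, not to reproduce the implementation in :mod:`imblearn`.

.. code-block:: Python

    import math

    # Hypothetical binary labels: 1 is the minority (positive) class.
    y_true = [1] * 4 + [0] * 12
    y_some_errors = [1, 1, 1, 0] + [0] * 10 + [1, 1]
    y_always_majority = [0] * 16


    def sensitivity_specificity(truth, predicted):
        """Return (true positive rate, true negative rate) for 0/1 labels."""
        tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
        tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
        n_pos = sum(t == 1 for t in truth)
        n_neg = len(truth) - n_pos
        return tp / n_pos, tn / n_neg


    def geometric_mean(truth, predicted):
        """Square root of sensitivity times specificity."""
        sens, spec = sensitivity_specificity(truth, predicted)
        return math.sqrt(sens * spec)


    # Sensitivity 3/4 and specificity 10/12 give sqrt(0.625).
    print(f"{geometric_mean(y_true, y_some_errors):.3f}")  # 0.791

    # Always predicting the majority class is 75% accurate here, yet the
    # geometric mean is 0 because the sensitivity is 0.
    print(f"{geometric_mean(y_true, y_always_majority):.3f}")  # 0.000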

.. GENERATED FROM PYTHON SOURCE LINES 84-88

.. code-block:: Python

    from imblearn.metrics import geometric_mean_score

    print(f"The geometric mean is {geometric_mean_score(y_test, y_pred):.3f}")





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The geometric mean is 0.940




.. GENERATED FROM PYTHON SOURCE LINES 89-91

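The index balanced accuracy (IBA) can adapt any metric to imbalanced
learning problems. It reweights the metric by the dominance, that is, the
difference between sensitivity and specificity, scaled by a factor `alpha`.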
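
As a rough illustration for the binary case, the weighting can be written
as ``(1 + alpha * (sensitivity - specificity)) * M``, where ``M`` is the
metric being corrected. The function below is a hypothetical helper built on
that formula under this assumption; the implementation in :mod:`imblearn`
also handles multiclass averaging and may differ in detail.

.. code-block:: Python

    def index_balanced_accuracy(sensitivity, specificity, alpha, metric_value):
        """Binary-case sketch: weight ``metric_value`` by 1 + alpha * dominance."""
        dominance = sensitivity - specificity
        return (1 + alpha * dominance) * metric_value


    # Two hypothetical classifiers with the same squared geometric mean:
    # one is better on the positive class, the other on the negative class.
    sens, spec = 0.9, 0.6
    g_squared = sens * spec

    for alpha in (0.0, 0.1, 0.5):
        better_on_pos = index_balanced_accuracy(sens, spec, alpha, g_squared)
        better_on_neg = index_balanced_accuracy(spec, sens, alpha, g_squared)
        print(f"alpha={alpha}: {better_on_pos:.3f} vs {better_on_neg:.3f}")

With ``alpha=0`` both classifiers receive the same score. A larger `alpha`
increasingly favours the classifier with the higher sensitivity.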

.. GENERATED FROM PYTHON SOURCE LINES 93-103

.. code-block:: Python

    from imblearn.metrics import make_index_balanced_accuracy

    alpha = 0.1
    geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(geometric_mean_score)

    print(
        f"The IBA using alpha={alpha} and the geometric mean: "
        f"{geo_mean(y_test, y_pred):.3f}"
    )





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The IBA using alpha=0.1 and the geometric mean: 0.884




.. GENERATED FROM PYTHON SOURCE LINES 104-111

.. code-block:: Python

    alpha = 0.5
    geo_mean = make_index_balanced_accuracy(alpha=alpha, squared=True)(geometric_mean_score)

    print(
        f"The IBA using alpha={alpha} and the geometric mean: "
        f"{geo_mean(y_test, y_pred):.3f}"
    )




.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The IBA using alpha=0.5 and the geometric mean: 0.884





.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 26.833 seconds)

**Estimated memory usage:**  300 MB


.. _sphx_glr_download_auto_examples_evaluation_plot_metrics.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_metrics.ipynb <plot_metrics.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_metrics.py <plot_metrics.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_metrics.zip <plot_metrics.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
