
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/data_quality/plot_extreme_outlier.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_data_quality_plot_extreme_outlier.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_data_quality_plot_extreme_outlier.py:


========================
Extreme Outliers Removal
========================

This example removes point outliers using polynomial regression and Studentized residuals. We generate a toy data set
whose underlying polynomial signal is corrupted with Gaussian noise and large point outliers.

The figure below shows that the point outliers are filtered out of the raw data. If desired, the filtered data can then
be passed to a smoother to refine the estimate of the underlying signal.
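
To make the idea of Studentized residuals concrete, the following is a minimal sketch using only NumPy. It is not the
implementation behind ``extreme``: the helper name ``studentized_outlier_mask``, the polynomial degree, the fixed
threshold of 3 and the use of internally Studentized residuals are all assumptions chosen for illustration.

.. code-block:: Python

    import numpy as np


    def studentized_outlier_mask(x, y, degree=3, threshold=3.0):
        """Flag points whose internally Studentized residual exceeds ``threshold``.

        Hypothetical helper for illustration only; not part of any library.
        """
        # Least-squares polynomial fit via the Vandermonde design matrix
        design = np.vander(x, degree + 1)
        coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coeffs

        # Leverage is the diagonal of the hat matrix, obtained from a reduced QR factorization
        q, _ = np.linalg.qr(design)
        leverage = np.sum(q**2, axis=1)

        # Scale each residual by its estimated standard deviation
        dof = len(y) - (degree + 1)
        sigma = np.sqrt(np.sum(residuals**2) / dof)
        studentized = residuals / (sigma * np.sqrt(1.0 - leverage))
        return np.abs(studentized) > threshold


    # Toy data: quadratic signal, Gaussian noise and five injected outliers
    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 1000)
    y = 2 * x**2 - 10 * x + 2 + rng.normal(scale=2, size=x.size)
    outlier_ids = np.array([50, 200, 400, 650, 900])
    y[outlier_ids] += 150

    mask = studentized_outlier_mask(x, y)
    y_filtered = y[~mask]

Dividing each residual by its own estimated standard deviation accounts for the fact that points with high leverage
(typically near the ends of the range) pull the fit toward themselves and therefore have smaller raw residuals.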

.. GENERATED FROM PYTHON SOURCE LINES 13-49



.. image-sg:: /auto_examples/data_quality/images/sphx_glr_plot_extreme_outlier_001.png
   :alt: plot extreme outlier
   :srcset: /auto_examples/data_quality/images/sphx_glr_plot_extreme_outlier_001.png
   :class: sphx-glr-single-img





.. code-block:: Python


    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from indsl.data_quality import extreme


    rng = np.random.default_rng(12345)
    plt.rcParams.update({"font.size": 18})


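    # Create a clean toy data set: a quadratic signal plus Gaussian noise
    nx = 1000
    index = pd.date_range(start="1970", periods=nx, freq="1min")
    x = np.linspace(0, 10, nx)
    signal = 2 * x**2 - 10 * x + 2
    # Draw the noise from the seeded generator so that the example is reproducible
    noise = rng.normal(loc=100, size=nx, scale=2)
    y = noise + signal

    # Add anomalies: overwrite 20 randomly chosen samples with random values in [0, 200)
    anom_num = rng.integers(low=0, high=200, size=20)
    anom_ids = rng.integers(low=0, high=nx, size=20)
    y[anom_ids] = anom_num
    # Boolean mask of the injected anomalies (not used below)
    is_anom = np.isin(np.arange(nx), anom_ids)
    raw_data = pd.Series(y, index=index)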


    # Remove the outliers and plot the results
    res = extreme(raw_data)

    plt.figure(1, figsize=[15, 5])
    raw_data.plot()
    res.plot()

    _ = plt.legend(["Raw Data", "Filtered with Anomaly Detector"])
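
As noted in the introduction, the filtered series can be passed to a smoother. The following is a minimal sketch using
a centered rolling mean from pandas. The ``filtered`` series is a synthetic stand-in for the output of the outlier
filter (a few samples are dropped to mimic removed points), and the window length of 15 samples is an arbitrary choice
for illustration.

.. code-block:: Python

    import numpy as np
    import pandas as pd

    # Synthetic stand-in for a filtered series: noisy quadratic signal with a few samples removed
    rng = np.random.default_rng(0)
    nx = 200
    index = pd.date_range(start="1970", periods=nx, freq="1min")
    x = np.linspace(0, 10, nx)
    clean = pd.Series(2 * x**2 - 10 * x + 2, index=index)
    filtered = clean + rng.normal(scale=2, size=nx)
    filtered = filtered.drop(filtered.index[[20, 75, 140]])

    # Centered rolling mean over 15 samples; min_periods=1 keeps the values at both ends
    smoothed = filtered.rolling(window=15, center=True, min_periods=1).mean()

Note that an integer window counts samples, not time, so the window spans slightly more time wherever points were
removed. Near the ends of the series the window is one-sided, which biases the smoothed values there.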


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 6.984 seconds)


.. _sphx_glr_download_auto_examples_data_quality_plot_extreme_outlier.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_extreme_outlier.ipynb <plot_extreme_outlier.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_extreme_outlier.py <plot_extreme_outlier.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_extreme_outlier.zip <plot_extreme_outlier.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
