
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/statistics/plot_remove_outliers.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_statistics_plot_remove_outliers.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_statistics_plot_remove_outliers.py:


===================================================
Outlier detection with DBSCAN and spline regression
===================================================

Example of outlier detection in time series data using DBSCAN and spline regression.
We use data from a compressor suction pressure sensor. The data is in barg and has been resampled to 1-minute granularity.
The figure shows the raw data together with the data after outlier removal using a time window of 40 minutes.

.. GENERATED FROM PYTHON SOURCE LINES 11-44



.. image-sg:: /auto_examples/statistics/images/sphx_glr_plot_remove_outliers_001.png
   :alt: Remove outliers based on dbscan and csaps regression
   :srcset: /auto_examples/statistics/images/sphx_glr_plot_remove_outliers_001.png
   :class: sphx-glr-single-img
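
As a rough illustration of the DBSCAN part of the approach, the sketch below labels isolated values in a small synthetic signal as noise.
It calls ``sklearn.cluster.DBSCAN`` directly. The signal, the spike positions, and the ``eps`` and ``min_samples`` values are assumptions chosen for this illustration; they are not taken from ``remove_outliers``, whose internals may differ.

.. code-block:: Python

    import numpy as np
    from sklearn.cluster import DBSCAN

    # Hypothetical signal: a nearly flat pressure around 10 with small noise
    rng = np.random.default_rng(0)
    values = 10.0 + 0.05 * rng.standard_normal(200)

    # Inject three isolated spikes at assumed positions
    spike_idx = np.array([20, 90, 150])
    values[spike_idx] = [13.0, 6.0, 15.0]

    # DBSCAN assigns the label -1 to points that belong to no dense cluster
    labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(values.reshape(-1, 1))
    outlier_idx = np.flatnonzero(labels == -1)

    # Keep only the points that were assigned to a cluster
    cleaned = values[labels != -1]

Clustering on the values alone ignores the time axis, so this sketch only catches points that are far from every other value in the series.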





.. code-block:: Python


    import os

    import matplotlib.pyplot as plt
    import pandas as pd

    from indsl.statistics import remove_outliers


    # TODO: Use a better data set to show how the outlier removal works. Suggestion: use a synthetic data set.


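    # Resolve the data path relative to this file unless it is run as a script
    base_path = "" if __name__ == "__main__" else os.path.dirname(__file__)
    data = pd.read_csv(os.path.join(base_path, "../../datasets/data/suct_pressure_barg.csv"), index_col=0)
    # Convert the single-column DataFrame to a Series with a datetime index
    data = data.squeeze()
    data.index = pd.to_datetime(data.index)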

    plt.figure(1, figsize=[9, 7])
    plt.plot(data, ".", markersize=2, color="red", label="RAW")

    # Remove the outliers with a time window of 40min and plot the results
    plt.plot(
        remove_outliers(data, time_window=pd.Timedelta("40min")),
        ".",
        markersize=2,
        color="forestgreen",
        label="Data without outliers \nwin=40min",
    )

    plt.ylabel("Pressure (barg)")
    plt.title("Remove outliers based on dbscan and csaps regression")
    _ = plt.legend(loc=1)
    plt.show()
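
The TODO in the script suggests a synthetic data set. One possible way to build such a series is sketched below, using only ``numpy`` and ``pandas``.
The baseline shape, noise level, number of outliers, and the series name ``pressure_barg`` are all assumptions made for illustration and do not come from the sensor data used above.

.. code-block:: Python

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(42)

    # Eight hours of data at 1-minute granularity (hypothetical start date)
    index = pd.date_range("2023-01-01", periods=480, freq="1min")
    minutes = np.arange(index.size)

    # Slowly oscillating baseline around 3 barg plus small Gaussian noise
    baseline = 3.0 + 0.2 * np.sin(2 * np.pi * minutes / 240)
    synthetic = pd.Series(
        baseline + 0.02 * rng.standard_normal(index.size),
        index=index,
        name="pressure_barg",
    )

    # Shift ten randomly chosen points by +/- 1.5 barg to act as known outliers
    outlier_positions = rng.choice(index.size, size=10, replace=False)
    synthetic.iloc[outlier_positions] += rng.choice([-1.0, 1.0], size=10) * 1.5

A series like this could take the place of ``data`` in the ``remove_outliers(data, time_window=pd.Timedelta("40min"))`` call from the example. Because the outlier positions are known, the result can be compared against them; how well the injected points are separated is not verified here.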


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.215 seconds)


.. _sphx_glr_download_auto_examples_statistics_plot_remove_outliers.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_remove_outliers.ipynb <plot_remove_outliers.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_remove_outliers.py <plot_remove_outliers.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_remove_outliers.zip <plot_remove_outliers.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
