

.. _sphx_glr_auto_examples_data_quality:

Data quality
____________

Examples on how to explore the data quality of time series.



.. raw:: html

    <div class="sphx-glr-thumbnails">

.. thumbnail-parent-div-open

.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Example of point outlier removal with polynomial regression and Studentized residuals. We generate a toy data set with an underlying polynomial signal that has Gaussian noise and large point outliers added to it.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_extreme_outlier_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_extreme_outlier.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Extreme Outliers Removal</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Example of algorithm that indicates decreasing values in time series data. This algorithm is applied on Running Hours time series. It is a specific type of time series that is counting the number of running hours in a pump. Given that we expect the number of running hours to either stay the same (if the pump is not running) or increase with time (if the pump is running), the decrease in running hours value indicates bad data quality.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_value_decrease_check_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_value_decrease_check.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Checking for decreasing values in a timeseries</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Example of visualizing rolling standard deviation of time delta of time series data to identify dispersion in the ingestion of data.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_rolling_stddev_timedelta_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_rolling_stddev_timedelta.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Rolling standard deviation of data points time delta</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="It is important to know how complete a time series is. In this example, the function qualifies a time series on the basis of its completeness score as good, medium, or poor. The completeness score measures how complete measured by how much of the data is missing based on its median sampling frequency.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_completeness_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_completeness.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Completeness score of time series</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Example of visualizing breach of threshold in hour count in a time series representing running hours of a piece of equipment.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_datapoint_diff_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_datapoint_diff.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Threshold breach check for difference between two data points over a period of time</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Detecting density of data points in a time series is important for finding out if the expected number of data points during a certain time window such as per hour or per day have been received.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_low_density_identification_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_low_density_identification.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Identifying low density periods</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Identifying gaps in data is critical when working with time series. Data gaps can be for instance, the result of an unreliable or defective sensor, and that part of the data might need to be excluded. The exact definition of what is considered a gap requires domain knowledge and is therefore hard to automate. However, mathematical tools can help us to identify potential gaps that the domain expert can then evaluate.">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_gaps_identification_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_gaps_identification.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Identifying gaps in time series</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="Introduction ------------ The out_of_range function uses Savitzky-Golay smoothing (SG) - sg to estimate the main trend of the data:">

.. only:: html

  .. image:: /auto_examples/data_quality/images/thumb/sphx_glr_plot_out_of_range_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_data_quality_plot_out_of_range.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Detect out of range outliers in sensor data</div>
    </div>


.. thumbnail-parent-div-close

.. raw:: html

    </div>


.. toctree::
   :hidden:

   /auto_examples/data_quality/plot_extreme_outlier
   /auto_examples/data_quality/plot_value_decrease_check
   /auto_examples/data_quality/plot_rolling_stddev_timedelta
   /auto_examples/data_quality/plot_completeness
   /auto_examples/data_quality/plot_datapoint_diff
   /auto_examples/data_quality/plot_low_density_identification
   /auto_examples/data_quality/plot_gaps_identification
   /auto_examples/data_quality/plot_out_of_range

