pytesmo package

Submodules

pytesmo.df_metrics module

This module contains wrappers for the functions in pytesmo.metrics that accept a pandas.DataFrame instead of individual numpy.arrays. If the DataFrame has more columns than the function has input parameters, the function is applied pairwise to the columns.

Created on Aug 14, 2013

@author: Christoph Paulik Christoph.Paulik@geo.tuwien.ac.at

exception pytesmo.df_metrics.DataFrameDimensionError[source]

Bases: exceptions.Exception

pytesmo.df_metrics.RSS(df)[source]

Residual sum of squares

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’
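
The naming scheme of the returned namedtuple can be illustrated with a minimal sketch. It is not pytesmo's implementation: `pairwise_namedtuple` and the column names `ascat` and `insitu` are hypothetical, and the residual sum of squares is computed inline with NumPy.

```python
from collections import namedtuple
from itertools import combinations

import numpy as np
import pandas as pd


def pairwise_namedtuple(df, func, typename="result"):
    """Apply func to every column pair; name each field '<a>_and_<b>'."""
    pairs = list(combinations(df.columns, 2))
    fields = ["{}_and_{}".format(a, b) for a, b in pairs]
    Result = namedtuple(typename, fields)
    values = [func(df[a].values, df[b].values) for a, b in pairs]
    return Result(*values)


# Hypothetical two-column input; more columns would yield one field per pair.
df = pd.DataFrame({"ascat": [0.1, 0.2, 0.4], "insitu": [0.1, 0.3, 0.2]})
rss = pairwise_namedtuple(df, lambda x, y: np.sum((x - y) ** 2))
print(rss._fields)  # ('ascat_and_insitu',)
```

The value for a pair is then available as an attribute, here `rss.ascat_and_insitu`.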

pytesmo.df_metrics.bias(df)[source]

Bias

Returns:

bias : pandas.DataFrame

of shape (len(df.columns),len(df.columns))

See also

pytesmo.metrics.bias

pytesmo.df_metrics.kendalltau(df)[source]

Wrapper for scipy.stats.kendalltau

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

See also

pytesmo.metrics.kendalltau, scipy.stats.kendalltau

pytesmo.df_metrics.mse(df)[source]

Mean square error (MSE) as a decomposition of the RMSD into individual error components

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

pytesmo.df_metrics.nash_sutcliffe(df)[source]

Nash Sutcliffe model efficiency coefficient

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

pytesmo.df_metrics.nrmsd(df)[source]

Normalized root-mean-square deviation

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

pytesmo.df_metrics.pairwise_apply(df, method, comm=False)[source]

Compute given method pairwise for all columns, excluding NA/null values

Parameters:

df : pandas.DataFrame

input data, method will be applied to each column pair

method : function

method to apply to each column pair. has to take 2 input arguments of type numpy.array and return one value or tuple of values

Returns:

results : pandas.DataFrame
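
A rough sketch of the pairwise idea, assuming that NA handling means dropping every row in which either column of the pair is missing. `pairwise_apply_sketch` is a hypothetical stand-in, not the library function, and it ignores the `comm` argument.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def pairwise_apply_sketch(df, method):
    """Apply method(x, y) to each column pair after dropping rows with NaN."""
    cols = list(df.columns)
    out = pd.DataFrame(np.nan, index=cols, columns=cols)
    for a, b in combinations(cols, 2):
        pair = df[[a, b]].dropna()  # keep rows valid in both columns
        out.loc[a, b] = method(pair[a].values, pair[b].values)
    return out


df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0],
    "b": [1.5, 2.5, 3.5, 4.5],
    "c": [0.0, np.nan, 2.0, 3.0],
})
# Difference of means as a simple two-argument method.
mean_diff = pairwise_apply_sketch(df, lambda x, y: np.mean(x) - np.mean(y))
```

Only the upper triangle of the result is filled in this sketch; the diagonal and lower triangle stay NaN.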

pytesmo.df_metrics.pearsonr(df)[source]

Wrapper for scipy.stats.pearsonr

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

See also

pytesmo.metrics.pearsonr, scipy.stats.pearsonr

pytesmo.df_metrics.rmsd(df)[source]

Root-mean-square deviation

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

pytesmo.df_metrics.spearmanr(df)[source]

Wrapper for scipy.stats.spearmanr

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

See also

pytesmo.metrics.spearmanr, scipy.stats.spearmanr

pytesmo.df_metrics.tcol_error(df)[source]

Triple collocation error estimate. df must have exactly 3 columns, since triple-wise application of a function is not yet implemented and would probably return a complicated structure.

Returns:

result : namedtuple

with column names of df

pytesmo.df_metrics.ubrmsd(df)[source]

Unbiased root-mean-square deviation

Returns:

result : namedtuple

with column names of df for which the calculation was done as name of the element separated by ‘_and_’

pytesmo.metrics module

Created on Apr 17, 2013

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

@author: Sebastian Hahn sebastian.hahn@geo.tuwien.ac.at

@author: Alexander Gruber alexander.gruber@geo.tuwien.ac.at

pytesmo.metrics.RSS(x, y)[source]

Residual sum of squares

Parameters:

x : numpy.array

1D numpy array to calculate the metric

y : numpy.array

1D numpy array to calculate the metric

Returns:

Residual sum of squares
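
As a minimal sketch, the residual sum of squares of two equally long 1D arrays is the sum of their squared element-wise differences. `rss_sketch` is a hypothetical name used for illustration only.

```python
import numpy as np


def rss_sketch(x, y):
    """Sum of squared differences between two equally long 1D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - y) ** 2)


# Differences are 0.0, -0.5 and 1.0, so the squares sum to 1.25.
value = rss_sketch([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```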

pytesmo.metrics.bias(x, y)[source]

Bias

pytesmo.metrics.kendalltau(x, y)[source]

Wrapper for scipy.stats.kendalltau

Parameters:

x : numpy.array

1D numpy array to calculate the metric

y : numpy.array

1D numpy array to calculate the metric

Returns:

Kendall’s tau : float

The tau statistic

p-value : float

The two-sided p-value for a hypothesis test whose null hypothesis is an absence of association, tau = 0.

See also

scipy.stats.kendalltau

pytesmo.metrics.mse(x, y)[source]

Mean square error (MSE) as a decomposition of the RMSD into individual error components
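
The decomposition can be sketched as follows, assuming the commonly used split into a correlation term, a bias term and a variance term. The names `mse_corr`, `mse_bias` and `mse_var` and the function itself are illustrative assumptions, not the confirmed return values of pytesmo.metrics.mse.

```python
import numpy as np


def mse_decomposition_sketch(x, y):
    """Return (mse, mse_corr, mse_bias, mse_var); mse is the sum of the parts."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]  # Pearson correlation
    # Population standard deviations (ddof=0) make the identity exact.
    mse_corr = 2.0 * np.std(x) * np.std(y) * (1.0 - r)
    mse_bias = (np.mean(x) - np.mean(y)) ** 2
    mse_var = (np.std(x) - np.std(y)) ** 2
    return mse_corr + mse_bias + mse_var, mse_corr, mse_bias, mse_var


x = np.array([0.10, 0.25, 0.30, 0.45, 0.50])
y = np.array([0.15, 0.20, 0.40, 0.40, 0.65])
mse, mse_corr, mse_bias, mse_var = mse_decomposition_sketch(x, y)
direct = np.mean((x - y) ** 2)  # agrees with mse up to rounding
```

The three terms separate errors caused by imperfect correlation, by a difference in means and by a difference in variability.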

pytesmo.metrics.nash_sutcliffe(x, y)[source]

Nash Sutcliffe model efficiency coefficient

Parameters:

x : numpy.array

1D numpy array to calculate the metric

y : numpy.array

1D numpy array to calculate the metric

Returns:

Nash Sutcliffe coefficient : float

Nash Sutcliffe model efficiency coefficient
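
A minimal sketch of the coefficient, assuming the first argument holds the observations and the second the model output. The page does not state which of x and y plays which role, so the argument names `obs` and `sim` are assumptions.

```python
import numpy as np


def nash_sutcliffe_sketch(obs, sim):
    """1 - (squared model error) / (variability of obs around its mean)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)


obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nash_sutcliffe_sketch(obs, obs)                        # 1.0
mean_only = nash_sutcliffe_sketch(obs, np.full(4, obs.mean()))  # 0.0
```

A value of 1 indicates a perfect match, while 0 means the model is no better than predicting the mean of the observations.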

pytesmo.metrics.nrmsd(x, y)[source]

Normalized root-mean-square deviation

pytesmo.metrics.pearsonr(x, y)[source]

Wrapper for scipy.stats.pearsonr

Parameters:

x : numpy.array

1D numpy array to calculate the metric

y : numpy.array

1D numpy array to calculate the metric

Returns:

Pearson’s r : float

Pearson’s correlation coefficient

p-value : float

Two-tailed p-value

See also

scipy.stats.pearsonr

pytesmo.metrics.rmsd(x, y)[source]

Root-mean-square deviation

pytesmo.metrics.spearmanr(x, y)[source]

Wrapper for scipy.stats.spearmanr

Parameters:

x : numpy.array

1D numpy array to calculate the metric

y : numpy.array

1D numpy array to calculate the metric

Returns:

rho : float

Spearman correlation coefficient

p-value : float

The two-sided p-value for a hypothesis test whose null hypothesis is that two sets of data are uncorrelated

See also

scipy.stats.spearmanr
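
The three correlation wrappers delegate to scipy.stats, so the difference between them can be shown by calling SciPy directly. This sketch uses made-up data in which y is a monotonic but non-linear function of x.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3  # monotonic but not linear in x

# Each call returns the statistic and its p-value.
r, p_r = stats.pearsonr(x, y)
rho, p_rho = stats.spearmanr(x, y)
tau, p_tau = stats.kendalltau(x, y)
```

The rank-based measures rho and tau equal 1 because the ordering of x and y is identical, whereas Pearson's r stays below 1 because the relationship is not linear.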

pytesmo.metrics.tcol_error(x, y, z)[source]

Triple collocation error estimate

Parameters:

x : numpy.array

1D numpy array to calculate the errors

y : numpy.array

1D numpy array to calculate the errors

z : numpy.array

1D numpy array to calculate the errors

Returns:

triple collocation error for x : float

triple collocation error for y : float

triple collocation error for z : float
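
One common formulation of triple collocation estimates each error from the mean product of differences to the other two datasets. The sketch below assumes that formulation, mutually independent zero-mean errors, and inputs that are already scaled to a common reference; it is not confirmed to match pytesmo's code. The synthetic data and noise levels are invented for the example.

```python
import numpy as np


def tcol_error_sketch(x, y, z):
    """Difference-based triple collocation error estimates, one per dataset."""
    e_x = np.sqrt(np.abs(np.mean((x - y) * (x - z))))
    e_y = np.sqrt(np.abs(np.mean((y - x) * (y - z))))
    e_z = np.sqrt(np.abs(np.mean((z - x) * (z - y))))
    return e_x, e_y, e_z


# Synthetic truth observed by three systems with different noise levels.
rng = np.random.default_rng(42)
truth = rng.normal(0.0, 1.0, 100000)
x = truth + rng.normal(0.0, 0.1, truth.size)
y = truth + rng.normal(0.0, 0.2, truth.size)
z = truth + rng.normal(0.0, 0.3, truth.size)
e_x, e_y, e_z = tcol_error_sketch(x, y, z)
```

With this many samples the estimates should come out close to the noise standard deviations of 0.1, 0.2 and 0.3 used to generate the data.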

pytesmo.metrics.ubrmsd(x, y)[source]

Unbiased root-mean-square deviation
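
The relationship between bias, RMSD and unbiased RMSD can be checked with a short sketch. The function names are hypothetical, and the sign convention of the bias (mean of x minus mean of y) is an assumption.

```python
import numpy as np


def bias_sketch(x, y):
    """Difference of the means."""
    return np.mean(x) - np.mean(y)


def rmsd_sketch(x, y):
    """Root of the mean squared difference."""
    return np.sqrt(np.mean((x - y) ** 2))


def ubrmsd_sketch(x, y):
    """RMSD computed after removing the mean from each dataset."""
    return np.sqrt(np.mean(((x - np.mean(x)) - (y - np.mean(y))) ** 2))


x = np.array([0.2, 0.3, 0.5, 0.4])
y = np.array([0.1, 0.1, 0.3, 0.5])
lhs = rmsd_sketch(x, y) ** 2
rhs = bias_sketch(x, y) ** 2 + ubrmsd_sketch(x, y) ** 2  # equals lhs
```

The squared RMSD is the sum of the squared bias and the squared unbiased RMSD, which is why removing the bias never increases the deviation.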

pytesmo.scaling module

Created on Apr 17, 2013

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

pytesmo.scaling.add_scaled(df, method='linreg', label_in=None, label_scale=None)[source]

Takes a DataFrame and appends a scaled time series to it. If no labels are given, the first column is scaled to the second column of the DataFrame.

Parameters:

df : pandas.DataFrame

input dataframe

method : string

scaling method

label_in : string, optional

the column of the DataFrame that should be scaled to the column given by label_scale; default is the first column

label_scale : string, optional

the column of the DataFrame that the label_in column should be scaled to; default is the second column

Returns:

df : pandas.DataFrame

input dataframe with new column labeled label_in+’_scaled_’+method

pytesmo.scaling.cdf_match(in_data, scale_to)[source]

  1. computes discrete cumulative distribution functions (CDFs) of in_data and scale_to at their respective bin edges;
  2. computes continuous CDFs by 6th-order polynomial fitting;
  3. matches the CDF of in_data to the CDF of scale_to.

Parameters:

in_data : numpy.array

input dataset which will be scaled

scale_to : numpy.array

in_data will be scaled to this dataset

Returns:

CDF matched values : numpy.array

dataset in_data with CDF as scale_to

pytesmo.scaling.lin_cdf_match(in_data, scale_to)[source]

Computes cumulative distribution functions (CDFs) of in_data and scale_to at their respective bin edges by linear interpolation, then matches the CDF of in_data to the CDF of scale_to.

Parameters:

in_data : numpy.array

input dataset which will be scaled

scale_to : numpy.array

in_data will be scaled to this dataset

Returns:

CDF matched values : numpy.array

dataset in_data with CDF as scale_to
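
Linear CDF matching can be sketched with percentiles as bin edges and piecewise-linear interpolation between them. The percentile list below is an assumption chosen for illustration, and `lin_cdf_match_sketch` is a hypothetical name rather than the library function.

```python
import numpy as np


def lin_cdf_match_sketch(in_data, scale_to,
                         percentiles=(0, 5, 10, 30, 50, 70, 90, 95, 100)):
    """Map in_data onto the distribution of scale_to via percentile bin edges."""
    in_data = np.asarray(in_data, dtype=float)
    scale_to = np.asarray(scale_to, dtype=float)
    edges_in = np.percentile(in_data, percentiles)
    edges_ref = np.percentile(scale_to, percentiles)
    # Interpolate each input value from its own bin edges to the reference edges.
    return np.interp(in_data, edges_in, edges_ref)


in_data = np.linspace(0.0, 1.0, 101)
scale_to = 10.0 + 5.0 * in_data  # same shape of distribution, different range
matched = lin_cdf_match_sketch(in_data, scale_to)
```

Because the two datasets here differ only by a linear transformation, the matched values reproduce scale_to; for real data the mapping is piecewise linear between the chosen percentiles.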

pytesmo.scaling.linreg(in_data, scale_to)[source]

Scales the input dataset to the reference dataset using linear regression.

Parameters:

in_data : numpy.array

input dataset which will be scaled

scale_to : numpy.array

in_data will be scaled to this dataset

Returns:

scaled dataset : numpy.array

dataset scaled using linear regression
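
A minimal sketch of regression-based scaling, assuming an ordinary least-squares line fitted from in_data to scale_to. It uses `numpy.polyfit` for the fit, which may differ from what the library does internally.

```python
import numpy as np


def linreg_sketch(in_data, scale_to):
    """Fit scale_to = slope * in_data + intercept and apply it to in_data."""
    in_data = np.asarray(in_data, dtype=float)
    scale_to = np.asarray(scale_to, dtype=float)
    slope, intercept = np.polyfit(in_data, scale_to, 1)
    return slope * in_data + intercept


in_data = np.array([0.0, 1.0, 2.0, 3.0])
scale_to = 2.0 * in_data + 1.0  # exactly linear, so the fit is exact
scaled = linreg_sketch(in_data, scale_to)
```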

pytesmo.scaling.mean_std(in_data, scale_to)[source]

Scales the input dataset so that it has the same mean and standard deviation as the reference dataset.

Parameters:

in_data : numpy.array

input dataset which will be scaled

scale_to : numpy.array

in_data will be scaled to this dataset

Returns:

scaled dataset : numpy.array

dataset in_data with same mean and standard deviation as scale_to
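
The idea can be sketched by standardizing in_data and then rescaling it with the statistics of scale_to. `mean_std_sketch` is a hypothetical name, and the use of the population standard deviation is an assumption.

```python
import numpy as np


def mean_std_sketch(in_data, scale_to):
    """Give in_data the mean and standard deviation of scale_to."""
    in_data = np.asarray(in_data, dtype=float)
    scale_to = np.asarray(scale_to, dtype=float)
    standardized = (in_data - np.mean(in_data)) / np.std(in_data)
    return standardized * np.std(scale_to) + np.mean(scale_to)


in_data = np.array([1.0, 2.0, 3.0, 4.0])
scale_to = np.array([10.0, 30.0, 20.0, 60.0])
scaled = mean_std_sketch(in_data, scale_to)
```

The ordering and relative spacing of the input values are preserved; only the offset and spread change.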

pytesmo.scaling.min_max(in_data, scale_to)[source]

Scales the input dataset so that it has the same minimum and maximum as the reference dataset.

Parameters:

in_data : numpy.array

input dataset which will be scaled

scale_to : numpy.array

in_data will be scaled to this dataset

Returns:

scaled dataset : numpy.array

dataset in_data with same maximum and minimum as scale_to
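
A short sketch of min-max scaling: in_data is first mapped to the unit interval and then stretched to the range of scale_to. The function name is hypothetical.

```python
import numpy as np


def min_max_sketch(in_data, scale_to):
    """Give in_data the minimum and maximum of scale_to."""
    in_data = np.asarray(in_data, dtype=float)
    scale_to = np.asarray(scale_to, dtype=float)
    unit = (in_data - in_data.min()) / (in_data.max() - in_data.min())
    return unit * (scale_to.max() - scale_to.min()) + scale_to.min()


# The input range 2..6 is mapped onto the reference range 0.1..0.5.
scaled = min_max_sketch([2.0, 4.0, 6.0], [0.1, 0.5, 0.3])
```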

pytesmo.scaling.scale(df, method='linreg', reference_index=0)[source]

Takes a pandas.DataFrame and scales all columns to the column specified by reference_index using the chosen method.

Parameters:

df : pandas.DataFrame

containing matched time series that should be scaled

method : string, optional

name of the scaling method; has to be a function in globals() that takes two numpy.arrays as input and returns one numpy.array of the same length

reference_index : int, optional

default 0, column index of reference dataset in dataframe

Returns:

scaled data : pandas.DataFrame

all time series of the input DataFrame scaled to the one specified by reference_index
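
Scaling every column to a reference column might be sketched as below. The `METHODS` dictionary is a hypothetical replacement for the globals() lookup described above, and the local `mean_std` helper is a simplified stand-in for the library function of the same name.

```python
import numpy as np
import pandas as pd


def mean_std(in_data, scale_to):
    """Simplified stand-in: match mean and standard deviation of scale_to."""
    standardized = (in_data - np.mean(in_data)) / np.std(in_data)
    return standardized * np.std(scale_to) + np.mean(scale_to)


# Hypothetical registry used instead of looking the name up in globals().
METHODS = {"mean_std": mean_std}


def scale_sketch(df, method="mean_std", reference_index=0):
    """Scale all columns of df to the column at reference_index."""
    func = METHODS[method]
    reference = df.iloc[:, reference_index].values
    scaled = {col: func(df[col].values, reference) for col in df.columns}
    return pd.DataFrame(scaled, index=df.index)


df = pd.DataFrame({"ref": [1.0, 2.0, 3.0], "other": [10.0, 30.0, 20.0]})
scaled = scale_sketch(df)
```

In this sketch the reference column passes through the scaling function as well, which leaves it unchanged apart from rounding.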

pytesmo.temporal_matching module

Created on Apr 12, 2013

Provides a temporal matching function

@author: Sebastian Hahn Sebastian.Hahn@geo.tuwien.ac.at

pytesmo.temporal_matching.df_match(reference, *args, **kwds)[source]

Finds temporal match between the reference pandas.DataFrame (index has to be datetime) and n other pandas.DataFrame (index has to be datetime).

Parameters:

reference : pandas.DataFrame or pandas.TimeSeries

The index of this dataframe will be the reference.

*args : pandas.DataFrame or pandas.TimeSeries

The indices of these DataFrames will be matched against the reference index.

window : float

Maximum allowed positive/negative time distance in fractions of a day, i.e. window is the half-window size (e.g. window=0.5 searches for matches between -12 and +12 hours) (default: None)

dropna : boolean

Drop rows containing only NaNs (default: False)

dropduplicates : boolean

Drop duplicated temporal matches (default: False)

Returns:

temporal_matched_args : pandas.DataFrame or tuple of pandas.DataFrame

Dataframe with index from matched reference index

pytesmo.temporal_matching.matching(reference, *args, **kwargs)[source]

Finds temporal match between the reference pandas.TimeSeries (index has to be datetime) and n other pandas.TimeSeries (index has to be datetime).

Parameters:

reference : pandas.TimeSeries

The index of this Series will be the reference.

*args : pandas.TimeSeries

The indices of these Series will be matched against the reference index.

window : float

Maximum allowed positive/negative time distance in fractions of a day, i.e. window is the half-window size (e.g. window=0.5 searches for matches between -12 and +12 hours) (default: None)

Returns:

temporal_match : pandas.DataFrame

containing the index of the reference Series and a column for each of the other input Series
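
The effect of the window argument can be illustrated with plain pandas, assuming nearest-neighbour matching on the reference index. `nearest_match_sketch` and the column name `matched_data` are hypothetical; the sketch relies on `Series.reindex` with `method="nearest"` and a `tolerance`, which is not necessarily how pytesmo implements the matching.

```python
import pandas as pd


def nearest_match_sketch(reference, other, window=None):
    """For each reference timestamp take the nearest value of other
    that lies within +/- window days (no limit if window is None)."""
    tolerance = None if window is None else pd.Timedelta(days=window)
    matched = other.sort_index().reindex(
        reference.index, method="nearest", tolerance=tolerance
    )
    return pd.DataFrame({"matched_data": matched})


reference = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(
        ["2013-04-01 00:00", "2013-04-02 00:00", "2013-04-03 00:00"]
    ),
)
other = pd.Series(
    [10.0, 20.0],
    index=pd.to_datetime(["2013-04-01 06:00", "2013-04-02 20:00"]),
)
# window=0.5 allows matches up to 12 hours before or after each timestamp.
result = nearest_match_sketch(reference, other, window=0.5)
```

The first and third reference timestamps find a neighbour 6 and 4 hours away, while the second has no observation within 12 hours and therefore receives NaN.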

Module contents
