math_helper

Summary

The helper module handles recurring math operations that are not directly related to the mechanistic model calculation. These operations include the following:

  • sampling from a distribution (uniform, beta)

  • distribution curve fitting to data with an analytical or a numerical method

  • interpolation function for data tables

  • numerical integration for probability density functions

  • reliability probability calculation

  • statistical calculation to find the mean and standard deviation, ignoring not-a-number (NaN) values

  • figure sub-plotting

math_helper.beta_custom(m, s, a, b, n_sample=100000, plot=False)

Draw samples from a general beta distribution.

The general beta distribution is described by its mean, standard deviation, lower bound, and upper bound.

X ~ General Beta(a, b, loc=c, scale=d)

Z ~ Standard Beta(alpha, beta)

X = c + d * Z

E(X) = c + d * E(Z)

Var(X) = d^2 * Var(Z)

Parameters:
  • m (float) – Mean of the distribution.

  • s (float) – Standard deviation of the distribution.

  • a (float) – Lower bound (not the shape parameter a/alpha).

  • b (float) – Upper bound (not the shape parameter b/beta).

  • n_sample (int) – Number of samples to generate.

  • plot (bool) – If True, plot a histogram of the generated samples. Default is False.

Returns:

Sample array from the distribution.

Return type:

numpy array
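The relations E(X) = c + d * E(Z) and Var(X) = d^2 * Var(Z) can be inverted to obtain the standard beta shape parameters from the mean, standard deviation, and bounds. The following is a minimal sketch of that moment matching, not the module's actual implementation; the function name `beta_sample_sketch` and the `seed` argument are hypothetical.

```python
import numpy as np


def beta_sample_sketch(m, s, a, b, n_sample=100_000, seed=None):
    """Sample a general beta distribution from mean m, std s, and bounds [a, b].

    Hypothetical helper for illustration; loc = a and scale = b - a are assumed.
    """
    if not a < m < b:
        raise ValueError("the mean must lie strictly between the bounds")
    scale = b - a
    # Moments of the standard beta variable Z = (X - a) / (b - a).
    mu_z = (m - a) / scale
    var_z = (s / scale) ** 2
    # For Z ~ Beta(alpha, beta): Var(Z) = mu_z * (1 - mu_z) / (alpha + beta + 1).
    nu = mu_z * (1.0 - mu_z) / var_z - 1.0  # nu = alpha + beta
    if nu <= 0:
        raise ValueError("standard deviation too large for the given bounds")
    alpha = mu_z * nu
    beta = (1.0 - mu_z) * nu
    rng = np.random.default_rng(seed)
    # Shift and scale the standard beta samples back to [a, b].
    return a + scale * rng.beta(alpha, beta, size=n_sample)
```

The sample mean and standard deviation of the returned array should approach m and s as n_sample grows, and all samples stay within [a, b].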

math_helper.dropna(x)

Removes NaN values from the input array.

math_helper.f_solve_poly2(a, b, c)

Find the two roots of the quadratic equation \(ax^2+bx+c=0\)
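As an illustration, the two roots follow from the quadratic formula. This sketch uses the hypothetical name `solve_quadratic_sketch` and returns complex numbers so that a negative discriminant is handled; whether f_solve_poly2 does the same is not stated here.

```python
import cmath


def solve_quadratic_sketch(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 as complex numbers."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    # cmath.sqrt also accepts a negative discriminant.
    sqrt_disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + sqrt_disc) / (2 * a), (-b - sqrt_disc) / (2 * a)
```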

math_helper.find_mean(val, s, confidence_one_tailed=0.95)

Return the mean of an unknown normal distribution, given its standard deviation and a value at a known one-tailed confidence level (default 95%).

Parameters:
  • val (float) – Cut-off value.

  • s – Standard deviation of the distribution.

  • confidence_one_tailed – One-tailed confidence level. Default is 0.95.

Returns:

mean value of the unknown normal distribution

Return type:

float
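One way to read this calculation: if val is the value that is not exceeded with the given one-tailed confidence, i.e., P(X <= val) = confidence_one_tailed, then the mean is val minus z standard deviations, where z is the standard normal quantile. The sketch below rests on that assumption (the module may use the opposite tail), and the name `find_mean_sketch` is hypothetical.

```python
from statistics import NormalDist


def find_mean_sketch(val, s, confidence_one_tailed=0.95):
    """Mean of a normal distribution, assuming P(X <= val) = confidence_one_tailed."""
    # Standard normal quantile at the confidence level (about 1.645 for 0.95).
    z = NormalDist().inv_cdf(confidence_one_tailed)
    return val - z * s
```

For the opposite convention, P(X >= val) = confidence_one_tailed, the sign flips and the mean becomes val + z * s.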

math_helper.find_similar_group(item_list, similar_group_size=2)

Find the most alike values in a list.

Parameters:
  • item_list (list) – A list to choose from.

  • similar_group_size (int, optional) – Number of alike values. Default is 2.

Returns:

A sublist with alike values.

Return type:

list
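The similarity criterion is not specified here. As one possible interpretation, the sketch below treats the "most alike" values as the group of similar_group_size sorted neighbours with the smallest spread (maximum minus minimum). Both the criterion and the name `find_similar_group_sketch` are assumptions.

```python
def find_similar_group_sketch(item_list, similar_group_size=2):
    """Return the k values with the smallest spread (assumed similarity criterion)."""
    k = similar_group_size
    if not 1 <= k <= len(item_list):
        raise ValueError("similar_group_size must be between 1 and len(item_list)")
    ordered = sorted(item_list)
    # After sorting, the tightest group of k values is always a contiguous window.
    best_start = min(
        range(len(ordered) - k + 1),
        key=lambda i: ordered[i + k - 1] - ordered[i],
    )
    return ordered[best_start:best_start + k]
```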

math_helper.fit_distribution(s, fit_type='kernel', plot=False, xlabel='', title='', axn=None)

Fit data to a probability distribution (parametric or numerical) and return either a continuous random variable or a random variable represented by Gaussian kernels.

  • parametric: normal

  • numerical: Gaussian kernels

Parameters:
  • s (array-like) – Sample data.

  • fit_type (str, optional) – Fit type (‘kernel’ or ‘normal’), by default ‘kernel’.

  • plot (bool, optional) – When True, create a plot with histogram and fitted PDF curve.

  • xlabel (str, optional) – Label for the x-axis of the plot, by default “”.

  • title (str, optional) – Title of the plot, by default “”.

  • axn (Any, optional) – Axes object for the plot, by default None.

Returns:

Continuous random variable (stats.norm) if parametric normal is used, Gaussian kernel random variable (stats.gaussian_kde) if kernel is used.

Return type:

instance of random variable
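The two fit types map onto SciPy objects named in the return description. A minimal sketch of that dispatch, without the plotting options, might look as follows; the name `fit_distribution_sketch` is hypothetical and the real function may preprocess the data differently.

```python
import numpy as np
from scipy import stats


def fit_distribution_sketch(s, fit_type="kernel"):
    """Fit sample data with a normal distribution or a Gaussian kernel estimate."""
    s = np.asarray(s, dtype=float)
    if fit_type == "normal":
        # Maximum-likelihood estimates of mean and standard deviation.
        loc, scale = stats.norm.fit(s)
        return stats.norm(loc=loc, scale=scale)  # frozen continuous random variable
    if fit_type == "kernel":
        return stats.gaussian_kde(s)  # non-parametric density estimate
    raise ValueError("fit_type must be 'kernel' or 'normal'")
```

Note that the two return types expose different interfaces: the frozen stats.norm object offers pdf, cdf, and rvs, while stats.gaussian_kde is called directly to evaluate the density and uses resample to draw samples.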

math_helper.get_mean(x)

Calculate the mean of the input array, ignoring NaN values.

math_helper.get_std(x)

Calculate the standard deviation of the input array, ignoring NaN values.
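The NaN handling described for dropna, get_mean, and get_std can be reproduced with plain NumPy calls. The snippet below is only an illustration of the behaviour; it does not show how the helpers are implemented, and it assumes the population standard deviation (ddof=0), which is NumPy's default.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0, 5.0])

# Drop NaN entries with a boolean mask.
cleaned = x[~np.isnan(x)]

# NaN-aware statistics skip the missing entries instead of returning NaN.
mean_ignoring_nan = np.nanmean(x)
std_ignoring_nan = np.nanstd(x)  # ddof=0 by default
```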

math_helper.hist_custom(S)

Plot a histogram with N_SAMPLE//100 bins, ignoring NaN values.

math_helper.interp_extrap_f(x, y, x_find, plot=False)

Interpolate or extrapolate value from an array with a fitted 2nd-degree or 3rd-degree polynomial.

Parameters:
  • x (array-like) – Independent variable.

  • y (array-like) – Function values.

  • x_find (int or float or array-like) – Lookup x.

  • plot (bool) – If True, plot curve fit and data points. Default is False.

Returns:

Interpolated or extrapolated value(s). Raises a warning when extrapolation is used.

Return type:

int or float or array-like
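A polynomial fit with an extrapolation warning can be sketched with numpy.polyfit. The function name `interp_extrap_sketch` and the explicit `degree` argument are hypothetical; how interp_extrap_f chooses between the 2nd and 3rd degree is not documented here.

```python
import warnings

import numpy as np


def interp_extrap_sketch(x, y, x_find, degree=2):
    """Evaluate a least-squares polynomial fit of (x, y) at x_find."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_find = np.asarray(x_find, dtype=float)
    # Warn when any lookup point lies outside the range of the data.
    if np.any((x_find < x.min()) | (x_find > x.max())):
        warnings.warn("x_find is outside the data range; extrapolating", RuntimeWarning)
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x_find)
```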

math_helper.normal_custom(m, s, n_sample=100000, non_negative=False, plot=False)

Sample from a normal distribution.

Parameters:
  • m (int or float) – Mean of the distribution.

  • s (int or float) – Standard deviation of the distribution.

  • n_sample (int) – Number of samples to generate. Default is a global variable N_SAMPLE.

  • non_negative (bool) – If True, return a truncated distribution with no negative values. Default is False.

  • plot (bool) – If True, plot a histogram of the generated samples. Default is False.

Returns:

Sample array from the normal distribution.

Return type:

numpy array
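The non_negative option can be illustrated with a normal distribution truncated at zero. This sketch uses scipy.stats.truncnorm, which is an assumption; the module may truncate in another way, for example by discarding negative samples. The name `normal_sample_sketch` and the `seed` argument are hypothetical.

```python
import numpy as np
from scipy import stats


def normal_sample_sketch(m, s, n_sample=100_000, non_negative=False, seed=None):
    """Sample a normal distribution, optionally truncated at zero."""
    rng = np.random.default_rng(seed)
    if not non_negative:
        return rng.normal(loc=m, scale=s, size=n_sample)
    # truncnorm expects its bounds in standard-deviation units relative to loc.
    lower = (0.0 - m) / s
    return stats.truncnorm.rvs(
        lower, np.inf, loc=m, scale=s, size=n_sample, random_state=rng
    )
```

Truncation shifts the sample mean above m and reduces the standard deviation below s, most noticeably when m is within a few standard deviations of zero.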

math_helper.pf_RS(R_info, S, R_distrib_type='normal', plot=False)

Calculate the probability of failure Pf = P(R-S<0), given R (resistance) and S (load), with three methods. The result of method 3 is used if it is checked as “OK” against the other two.

  1. crude Monte Carlo

  2. numerical integral of the kernel fit of g (g = R - S)

  3. R S integral: \(\int\limits_{-\infty}^{\infty} F_R(x)f_S(x)dx\)

The reliability index (beta factor) is calculated with a simple first-order estimate, g.mean()/g.std().

Parameters:
  • R_info (tuple, numpy array) –

    Distribution of resistance, e.g., cover thickness, critical chloride content, tensile strength. Can be an array or distribution parameters.

    R_distrib_type=‘normal’ -> tuple(m, s) for a normal distribution (m: mean, s: standard deviation)

    R_distrib_type=‘beta’ -> tuple(m, s, a, b) for a (general) beta distribution (m: mean, s: standard deviation, a, b: lower and upper bound)

    R_distrib_type=‘array’ -> array for an undetermined distribution, which is treated numerically (the R S integral is not applied)

  • S (numpy array) – Distribution of load, e.g., carbonation depth, chloride content, tensile stress. The distribution type of S is usually not known in advance and can vary a lot between cases, so S is fitted with a kernel.

  • R_distrib_type (str, optional) – ‘normal’, ‘beta’, ‘array’, by default ‘normal’

  • plot (bool, optional) – Plot distribution, by default False

Returns:

(probability of failure, reliability index)

Return type:

tuple

Note

For R given as an array, the R S integral is not applied. The R S integration method is \(P_f = P(R-S\le 0)=\int\limits_{-\infty}^{\infty}f_S(y) \int\limits_{-\infty}^{y}f_R(x)dxdy\). The double numerical integration seems too computationally expensive, so consider fitting R to an analytical distribution in future versions [TODO].
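To make the comparison concrete, the sketch below computes the crude Monte Carlo estimate and the R S integral for a normally distributed R and a kernel-fitted S, along with the first-order reliability index. It is a simplified illustration under stated assumptions, not the module's code: the name `pf_rs_sketch`, the `seed` argument, the integration limits, and the omission of method 2 and of the consistency check are all choices made for brevity.

```python
import numpy as np
from scipy import integrate, stats


def pf_rs_sketch(R_params, S, seed=0):
    """Return (pf_monte_carlo, pf_integral, beta) for R ~ Normal(m, s) and sampled S."""
    m, s = R_params
    R_distrib = stats.norm(loc=m, scale=s)
    S = np.asarray(S, dtype=float)
    S_kde = stats.gaussian_kde(S)  # kernel fit of the load samples

    # Method 1: crude Monte Carlo on the limit state g = R - S.
    rng = np.random.default_rng(seed)
    g = R_distrib.rvs(size=S.size, random_state=rng) - S
    pf_mc = np.mean(g < 0)

    # Method 3: R S integral of F_R(x) * f_S(x), truncated to where f_S is non-negligible.
    lo = S.min() - 3 * S.std()
    hi = S.max() + 3 * S.std()
    pf_int, _ = integrate.quad(
        lambda x: R_distrib.cdf(x) * S_kde(x)[0], lo, hi, limit=200
    )

    # First-order reliability index.
    beta = g.mean() / g.std()
    return pf_mc, pf_int, beta
```

For normally distributed R and S the exact result is Pf = Φ(-β) with β = (m_R - m_S) / sqrt(s_R^2 + s_S^2), which gives a convenient check on both estimates.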

math_helper.plot_RS(model, ax=None, t_offset=0, amplify=1)

Plot the R and S distributions vertically at a given time on an axis.

Parameters:
  • model.R_distrib (scipy.stats._continuous_distns, normal or beta) – Calculated in pf_RS() through model.postproc().

  • model.S_kde_fit (stats.gaussian_kde) – Calculated in pf_RS() through model.postproc(). Distribution of load, e.g., carbonation depth, chloride content, tensile stress. The distribution type of S is usually not known in advance and can vary a lot between cases, so it is fitted with a kernel.

  • model.S (numpy array) – Load, e.g., carbonation depth, chloride content, tensile stress.

  • ax (axis) –

  • t_offset – Time offset to move the plot along the t-axis. Default is zero.

  • amplify – Scale factor for the height of the PDF plot.

math_helper.sample_integral(Y, x)

Integrate Y over x, where every Y data point is represented by a set of distribution samples.

Parameters:
  • Y (numpy array) –

    2D array.

    Column: y data point.

    Row: samples for each y data point.

  • x (numpy array) – 1D array.

Returns:

int_y_x : integral of y over x for all sampled data.

Return type:

numpy array

Examples

[[y0_sample1, y0_sample2],
 [y1_sample1, y1_sample2]]
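With the layout above (one row per y data point along x, one column per sample), the integration can be sketched with the trapezoidal rule applied to every sample column at once. The trapezoidal rule and the name `sample_integral_sketch` are assumptions; the module's integration scheme is not stated here.

```python
import numpy as np


def sample_integral_sketch(Y, x):
    """Integrate each sample column of Y over x with the trapezoidal rule."""
    Y = np.asarray(Y, dtype=float)  # shape (len(x), n_samples)
    x = np.asarray(x, dtype=float)  # shape (len(x),)
    dx = np.diff(x)[:, None]        # interval widths, broadcast over samples
    # Average adjacent rows, weight by the interval width, and sum along x.
    return np.sum(0.5 * (Y[1:] + Y[:-1]) * dx, axis=0)
```

The result has one integral per sample, so the returned array can itself be treated as a sampled distribution.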