module: plots

plots.plot_rank_histogram(df)

Plots a rank histogram colored by observation type.

All histogram bars are initialized as hidden and can be toggled visible in the plot’s legend.

plots.calculate_rank(df)

Calculate the rank of observations within an ensemble.

This function takes a DataFrame containing ensemble predictions and observed values, adds sampling noise to the ensemble predictions, and calculates the rank of the observed value within the perturbed ensemble for each observation. The rank indicates the position of the observed value within the sorted ensemble values, with 1 being the lowest. If the observed value is larger than the largest ensemble member, its rank is set to the ensemble size plus one.

Parameters:
  • df (pd.DataFrame) – A DataFrame with columns for mean, standard deviation, observed values, ensemble size, and observation type. The DataFrame should have one row per observation.

Returns:

A tuple containing the rank array, ensemble size, and a result DataFrame. The result DataFrame contains columns for ‘rank’ and ‘obstype’.

Return type:

tuple
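The ranking step described above can be sketched for a single observation as follows. This is a minimal illustration, not the module's implementation: the function name rank_of_observation, the obs_err_var argument, and the choice of Gaussian noise are assumptions made for the example.

```python
import numpy as np

def rank_of_observation(ensemble, observation, obs_err_var, rng):
    """Return the 1-based rank of one observation in a noise-perturbed ensemble."""
    ensemble = np.asarray(ensemble, dtype=float)
    # Assumption: sampling noise is Gaussian with variance obs_err_var.
    noise = rng.normal(0.0, np.sqrt(obs_err_var), size=ensemble.size)
    perturbed = np.sort(ensemble + noise)
    # Count the members below the observation; the rank is that count plus one,
    # so an observation above every member gets rank ensemble_size + 1.
    return int(np.searchsorted(perturbed, observation)) + 1

rng = np.random.default_rng(seed=0)
members = [271.2, 272.8, 270.5, 273.1]  # hypothetical ensemble values
rank = rank_of_observation(members, observation=272.0, obs_err_var=0.25, rng=rng)
```

With four members the rank always falls between 1 and 5, which is why a rank histogram for an ensemble of size N has N + 1 bins.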

plots.plot_profile(df, levels)

Plots RMSE and Bias profiles for different observation types across specified pressure levels.

This function takes a DataFrame containing observational data and model predictions, categorizes the data into specified pressure levels, and calculates the RMSE and Bias for each level and observation type. It then plots two line charts: one for RMSE and another for Bias, both as functions of pressure level. The pressure levels are plotted on the y-axis in reversed order to represent the vertical profile in the atmosphere correctly.

Parameters:
  • df (pd.DataFrame) – The input DataFrame containing at least the ‘vertical’ column for pressure levels, and other columns required by the rmse_bias function for calculating RMSE and Bias.

  • levels (array-like) – The bin edges for categorizing the ‘vertical’ column values into pressure levels.

Returns:

A tuple containing the DataFrame with RMSE and Bias calculations, the RMSE plot figure, and the Bias plot figure. The DataFrame includes a ‘plevels’ column representing the categorized pressure levels and ‘hPa’ column representing the midpoint of each pressure level bin.

Return type:

tuple

Raises:

ValueError – If there are missing values in the ‘vertical’ column of the input DataFrame.

Note

  • The function modifies the input DataFrame by adding ‘plevels’ and ‘hPa’ columns.

  • The ‘hPa’ values are calculated as half the midpoint of each pressure level bin, which may need adjustment based on the specific requirements for pressure level representation.

  • The plots are generated using Plotly Express and are displayed inline. The y-axis of the plots is reversed to align with standard atmospheric pressure level representation.
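The binning and per-level statistics can be sketched with pandas as shown here. The column names observation and prior_mean, the sign convention for bias (model minus observation), and the bin_mid column are assumptions made for illustration; only the ‘vertical’ and ‘plevels’ columns come from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'vertical' holds pressure, the other columns are made up.
df = pd.DataFrame({
    "vertical": [95000.0, 90000.0, 60000.0, 55000.0],
    "observation": [280.0, 279.0, 260.0, 258.0],
    "prior_mean": [281.0, 277.0, 260.5, 259.0],
})
levels = [50000.0, 70000.0, 100000.0]  # bin edges

# Refuse to bin rows whose vertical coordinate is missing.
if df["vertical"].isnull().any():
    raise ValueError("Missing values found in 'vertical' column.")

# Assign each row to a pressure bin and record the bin midpoint.
df["plevels"] = pd.cut(df["vertical"], levels)
df["bin_mid"] = df["plevels"].astype(object).map(lambda interval: interval.mid)

# Assumed convention: error = model minus observation.
df["error"] = df["prior_mean"] - df["observation"]
df["sq_error"] = df["error"] ** 2

profile = (
    df.groupby("bin_mid")
    .agg(rmse=("sq_error", lambda s: np.sqrt(s.mean())), bias=("error", "mean"))
    .reset_index()
)
```

When plotting such a profile with Plotly, calling fig.update_yaxes(autorange="reversed") is one way to put the highest pressure at the bottom of the axis.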

plots.mean_then_sqrt(x)

Calculates the mean of an array-like object and then takes the square root of the result.

Parameters:

x (array-like) – An array-like object (such as a list or a pandas Series). The elements should be numeric.

Returns:

The square root of the mean of the input array.

Return type:

float

Raises:

TypeError – If the input is not an array-like object containing numeric values.
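The order of operations matters here: the mean is taken first and the square root second, so applying the function to squared errors yields an RMSE. A minimal sketch of that behavior, using the hypothetical name mean_then_sqrt_sketch and NumPy as an assumed backend:

```python
import numpy as np

def mean_then_sqrt_sketch(x):
    """Return the square root of the mean of a numeric array-like."""
    return float(np.sqrt(np.mean(x)))

squared_errors = [1.0, 4.0, 4.0]  # hypothetical squared errors
rmse = mean_then_sqrt_sketch(squared_errors)  # sqrt(9 / 3)
```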

plots.rmse_bias_by_obs_type(df, obs_type)

Calculate the RMSE and bias for a given observation type.

Parameters:
  • df (DataFrame) – A pandas DataFrame.

  • obs_type (str) – The observation type for which to calculate the RMSE and bias.

Returns:

A DataFrame containing the RMSE and bias for the given observation type.

Return type:

DataFrame

Raises:

ValueError – If the observation type is not present in the DataFrame.
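A rough sketch of the filter-then-summarize logic described above. The column names type, observation, and prior_mean, the function name rmse_bias_for_type, and the bias sign convention (model minus observation) are assumptions for the example and may differ from the module's actual schema.

```python
import numpy as np
import pandas as pd

def rmse_bias_for_type(df, obs_type):
    """Return a one-row DataFrame with RMSE and bias for one observation type."""
    subset = df[df["type"] == obs_type]
    if subset.empty:
        raise ValueError(f"Observation type '{obs_type}' not found in DataFrame.")
    # Assumed convention: error = model minus observation.
    error = subset["prior_mean"] - subset["observation"]
    return pd.DataFrame({
        "type": [obs_type],
        "rmse": [float(np.sqrt((error ** 2).mean()))],
        "bias": [float(error.mean())],
    })

# Hypothetical observations of two types.
obs = pd.DataFrame({
    "type": ["T", "T", "U"],
    "observation": [1.0, 2.0, 3.0],
    "prior_mean": [2.0, 4.0, 3.0],
})
result = rmse_bias_for_type(obs, "T")
```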