Module src.PyOghma_ML.Output
Comprehensive Report Generation and Visualization System for PyOghma_ML
This module provides sophisticated functionality for generating detailed reports and visualizations that combine experimental data analysis with machine learning predictions. It creates publication-ready documents with statistical analysis, performance metrics, and scientific visualizations.
Core Functionality:
Output Class: Central report generation engine that orchestrates the creation of comprehensive analysis documents combining experimental data, ML predictions, and statistical comparisons.
Report Components:
Experimental Analysis:
- Characteristic curve plotting (JV, IV, TPV, CE, CELIV)
- Parameter extraction and calculation (Jsc, Voc, FF, PCE)
- Statistical analysis of device populations
- Batch processing for multiple devices
- Error analysis and uncertainty quantification
Machine Learning Results:
- Model performance metrics (MAPE, RMSE, R²)
- Prediction vs experimental comparisons
- Confusion matrices for classification tasks
- Feature importance analysis
- Model validation statistics
Visualization Features:
- Scientific plotting with proper scaling and units
- Multi-panel figures with subfigures
- Color-coded performance indicators
- Statistical distribution plots
- Correlation analysis visualizations
Document Generation:
- LaTeX-based professional reports
- Automatic figure and table integration
- Custom styling and branding
- PDF compilation with proper formatting
- CSV data export for further analysis
Advanced Features:
Statistical Analysis:
- Population statistics for device batches
- Error propagation calculations
- Performance distribution analysis
- Outlier detection and handling
- Confidence interval calculations
Performance Metrics:
- Mean Absolute Percentage Error (MAPE)
- Root Mean Square Error (RMSE)
- Coefficient of determination (R²)
- Bias and variance decomposition
- Cross-validation metrics
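The first three metrics in this list have standard definitions, sketched below in plain Python. This is an illustration of the formulas only, not the code PyOghma_ML uses internally; the function names and the sample values are assumptions made for the example.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, chosen only to exercise the three functions.
actual = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.0]

print(f"MAPE = {mape(actual, predicted):.3f} %")
print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"R^2  = {r_squared(actual, predicted):.4f}")
```

MAPE is undefined when any actual value is zero, so a parameter that can cross zero is better judged by RMSE or R².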
Comparative Analysis:
- Experimental vs predicted comparisons
- Model performance benchmarking
- Parameter correlation analysis
- Sensitivity analysis visualization
- Error distribution analysis
Integration Capabilities:
- Seamless integration with Networks module for ML results
- Direct interfacing with Input module for experimental data
- Automatic figure generation via Figures module
- LaTeX document creation through Latex module
- Label management integration for proper units and formatting
Output Formats:
- PDF reports with complete analysis
- Individual PNG/SVG figures for presentations
- CSV data tables for spreadsheet analysis
- LaTeX source files for custom modification
- HTML reports for web deployment
Customization Options:
- Configurable report sections
- Custom figure styling and layouts
- Adjustable statistical thresholds
- Flexible table formatting
- Branded document templates
Example Usage:
>>> # Generate comprehensive analysis report
>>> output = Output(trained_networks, experimental_data)
>>> # build_report() calls experimental_results() and
>>> # machine_learning_results() in turn
>>> output.build_report()
>>> # Write the LaTeX source, compile the PDF and export the CSV tables
>>> output.save_report('device_report')
Quality Assurance:
- Automatic data validation and error checking
- Consistent units and formatting throughout
- Professional scientific presentation standards
- Reproducible analysis workflows
- Version tracking for report generation
This module is essential for translating raw experimental data and ML predictions into actionable insights and publication-ready scientific documentation.
Classes
class Output(networks: Networks, inputs: Any, abs_dir: str | None = None)
class Output: """ A class for generating reports and visualizations for machine learning results. This class provides methods for creating LaTeX-based reports, generating plots, calculating experimental parameters, and saving results in various formats. """ def __init__(self, networks: Networks, inputs: Any, abs_dir: Optional[str] = None) -> None: """ Initialize an Output instance. Args: networks (Networks): The trained networks to generate reports for. inputs: The experimental inputs used for predictions. abs_dir (str, optional): Directory containing absolute data. """ pd.set_option('styler.format.precision', 3) self.networks = networks self.inputs = inputs self.figures = Figures() if abs_dir != None: self.abs_dir = abs_dir self.number_of_inputed_devices = 0 self.number_of_networks_trained = len(self.networks.networks_configured) self.temp_dir = os.path.join(os.getcwd(), 'temp') if not os.path.isdir(self.temp_dir): os.mkdir(self.temp_dir) def build_report(self) -> None: """ Build a comprehensive report in PDF format. This method generates a LaTeX-based report that includes experimental results, machine learning predictions, and analysis. The report is automatically formatted with headers, footers, and proper styling. The report includes: - Experimental data analysis and visualization - Machine learning model predictions and performance metrics - Comparison between experimental and predicted results - Statistical analysis and error metrics Note: The report is saved to the temporary directory and can be compiled to PDF using LaTeX. Custom styling includes Oghma_ML branding. 
""" self.pdf = Document(document_class='article', document_properties=['a4paper'], packages=['graphicx', 'geometry', 'booktabs', 'caption', 'subcaption', 'float', 'xcolor','colortbl', 'fancyhdr']) self.pdf.geometry(left='1cm', right='1cm', top='1.5cm', bottom='2cm') self.pdf.write(self.pdf.command('pagestyle', 'fancy')) self.pdf.write(self.pdf.command('lhead', 'Device Report')) self.pdf.write(self.pdf.command('rhead', '\\thepage')) self.pdf.write(self.pdf.command('cfoot','Report Produced by Oghma\_ML, developed by Cai Williams')) self.pdf.begin_document() self.experimental_results() self.machine_learning_results() def experimental_results(self) -> None: """ Add experimental results to the report. This method generates comprehensive experimental data analysis including: - Characteristic curves (JV, IV, TPV, etc.) with proper scaling and labeling - Experimental parameter tables with calculated values - Device performance metrics and statistics - Batch analysis for multiple devices when applicable The visualization automatically adapts to different characterization types: - JV/IV: Current density vs voltage plots - batch_JV: Multiple device overlay plots - TPV: Transient photovoltage analysis - Other types: Appropriate axis labels and scaling Note: Figures are saved to the temporary directory and included in the LaTeX report. Parameter calculations are performed and displayed in tabular format. 
""" self.pdf.section('Experimental Results') cap = 'Experimental Characteristics' characteristic = Figures() match self.inputs.characterisation_type: case 'JV'|'IV'|'JV_I4'|'SPO JV': characteristic.initialise_figure(figsize=(6, 6)) characteristic.plot(self.inputs.x, self.inputs.y) characteristic.set_x_limits(left=np.min(self.inputs.x), right=np.max(self.inputs.x)) characteristic.set_y_limits(top=np.max(self.inputs.y), bottom=np.min(self.inputs.y)*1.5) characteristic.set_x_label('Voltage (V)') characteristic.set_y_label('Current Density ($Am^{-2}$)') case 'batch_JV': characteristic.initialise_figure(figsize=(6, 6)) for idx in self.inputs.x: characteristic.plot(self.inputs.x, self.inputs.y) characteristic.set_x_limits(left=np.min(self.inputs.x), right=np.max(self.inputs.x)) characteristic.set_y_limits(top=np.max(self.inputs.y), bottom=np.min(self.inputs.y) * 1.5) characteristic.set_x_label('Voltage (V)') characteristic.set_y_label('Current Density ($Am^{-2}$)') case '2d_JV': pass case _: pass characteristic_path = os.path.join(self.temp_dir, 'characteristic.png') characteristic.save_to_disk(characteristic_path, dpi=300) experimental_paramaters = self.calcluate_experimental_paramaters() cap = 'Experimental Input Characteristic' self.pdf.figure(characteristic_path, centering=True, width='0.6\\textwidth', caption=cap) if len(experimental_paramaters) != 0: cap = 'Parameters calculable from input characteristic' self.pdf.table(experimental_paramaters, centering=True, caption=cap) self.pdf.newpage() def calcluate_experimental_paramaters(self) -> pd.DataFrame: """ Calculate experimental parameters from input characteristics. Returns: pandas.DataFrame: A DataFrame containing calculated parameters. 
""" match self.inputs.characterisation_type: case 'JV' | 'JV_I4'|'SPO JV': V = self.inputs.x J = self.inputs.y Jsc = -J[np.argmin(np.abs(V-0))] Voc = V[np.argmin(np.abs(J-0))] P_max = np.abs(np.min(V*J)) FF = P_max / (Voc * Jsc) PCE = Jsc * Voc * FF / 10 experimental_parameters = {} experimental_parameters['Jsc'] = {} self.set_dictionary(experimental_parameters['Jsc'], Jsc, '$Am^{-2}$') experimental_parameters['Voc'] = {} self.set_dictionary(experimental_parameters['Voc'], Voc, '$V$') experimental_parameters['FF'] = {} self.set_dictionary(experimental_parameters['FF'], FF, '$a.u$') experimental_parameters['Pmax'] = {} self.set_dictionary(experimental_parameters['Pmax'], P_max, '$Wm^{-2}$') experimental_parameters['PCE'] = {} self.set_dictionary(experimental_parameters['PCE'], PCE, '$Percent$') experimental_parameters = pd.DataFrame(data=experimental_parameters).T experimental_parameters = experimental_parameters.reset_index(names='Parameter') return experimental_parameters #experimental_parameters.astype(float).round(3) case 'IV': V = self.inputs.x J = self.inputs.y Jsc = -J[np.argmin(np.abs(V - 0))] Voc = V[np.argmin(np.abs(J - 0))] P_max = np.abs(np.min(V * J)) FF = P_max / (Voc * Jsc) PCE = Jsc * Voc * FF / 10 experimental_parameters = {} experimental_parameters['Jsc'] = {} self.set_dictionary(experimental_parameters['Jsc'], Jsc, '$Am^{-2}$') experimental_parameters['Voc'] = {} self.set_dictionary(experimental_parameters['Voc'], Voc, '$V$') experimental_parameters['FF'] = {} self.set_dictionary(experimental_parameters['FF'], FF, '$a.u$') experimental_parameters['Pmax'] = {} self.set_dictionary(experimental_parameters['Pmax'], P_max, '$Wm^{-2}$') experimental_parameters['PCE'] = {} self.set_dictionary(experimental_parameters['PCE'], PCE, '$Percent$') experimental_parameters = pd.DataFrame(data=experimental_parameters).T experimental_parameters = experimental_parameters.reset_index(names='Parameter') return experimental_parameters case _: return {} def 
machine_learning_results(self) -> None: """ Add machine learning results to the report. This method includes predictions, confusion matrices, and other relevant metrics based on the network type. """ self.pdf.section('Machine Learning Parameters') match self.networks.__class__.__name__: case 'Point': self.confusion_matrices('Point') self.single_results() case 'Difference': self.confusion_matrices('Difference') self.difference_results() case 'Residual': self.confusion_matrices('Residual') self.difference_results() case 'Ensemble': self.confusion_matrices('Ensemble') self.ensemble_results() case _: raise ValueError('Network Type Not Recognised!') def confusion_matrices(self, network_type: str) -> None: """ Generate confusion matrices for the specified network type. Args: network_type (str): The type of network (e.g., 'Point', 'Difference'). """ A = Networks.initialise(self.networks.networks_dir, network_type=network_type) self.MAPE = A.confusion_matrix(self.abs_dir) def clean_parameter(self, parameter: str) -> str: """ Clean a parameter string by removing special characters. Args: parameter (str): The parameter string to clean. Returns: str: The cleaned parameter string. """ parameter = re.sub(r"[-()\"#/@;:<>{}=~|.?,_]", " ", parameter) return parameter def multipoint_row(self, parameter: str, mean: float, std: float, mape: float) -> None: """ Add a row to the prediction dictionary for multipoint predictions. Args: parameter (str): The parameter name. mean (float): The mean prediction value. std (float): The standard deviation of predictions. mape (float): The Mean Absolute Percentage Error (MAPE). 
""" parameter = self.clean_parameter(parameter) self.prediction_dictionary[parameter] = {} dictionary = self.prediction_dictionary[parameter] if mean > 1e3 or mean < 1e-3: mean = '{:.2e}'.format(mean) dictionary['Mean'] = mean dictionary['Standard Deviation'] = std dictionary['Units'] = 'NA' dictionary['MAPE (\%)'] = mape return def multipoint_row_single(self, parameter: str, mean: float, mape: Optional[float] = None) -> None: """ Add a row to the prediction dictionary for single-point predictions. Args: parameter (str): The parameter name. mean (float): The mean prediction value. mape (float, optional): The Mean Absolute Percentage Error (MAPE). """ keys = list(self.prediction_dictionary.keys()) # parameter = Label(parameter) # if parameter.english == '': # parameter.english = parameter.token # if parameter.english in keys: # number = len(np.where(parameter.english in keys)) # parameter.english = parameter.english + ' (' + str(number) + ')' parameter = self.clean_parameter(parameter) self.prediction_dictionary[parameter] = {} dictionary = self.prediction_dictionary[parameter] # if mean > 1e3 or mean < 1e-3: # mean = '{:.2e}'.format(mean) dictionary['Mean'] = mean dictionary['Units'] = 'NA' if mape == None: dictionary['MAPE (\%)'] = 'N/A' else: dictionary['MAPE (\%)'] = mape return def point_row(self, parameter: str, point: float) -> None: """ Add a row to the prediction dictionary for a single prediction. Args: parameter (str): The parameter name. point (float): The prediction value. """ self.prediction_dictionary[parameter] = {} dictionary = self.prediction_dictionary[parameter] dictionary['Value'] = point dictionary['Units'] = 'to implement' return def prediction_table(self) -> pd.DataFrame: """ Generate a table of predictions for multipoint methods. Returns: pandas.DataFrame: A DataFrame containing predictions and metrics. 
""" networks = self.networks.networks_configured inputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['inputs'] for working_network in range(len(networks))] outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs'] for working_network in range(len(networks))] inputs = np.asarray(inputs) outputs = np.asarray(outputs, dtype=object) #outputs = outputs[outputs != 0] self.prediction_dictionary = {} self.prediction_all_dict = {} rows = np.zeros(len(networks) * len(outputs.ravel()),dtype=object) for idx in range(len(networks)): for jdx in range(len(outputs[idx])): parameter = outputs[idx][0] if len(self.networks.predictions) > 0: predictions_mean = self.networks.mean[idx] predictions_std = self.networks.std[idx] try: predictions_mape = self.MAPE[idx] except: predictions_mape = 0 self.multipoint_row(parameter, predictions_mean, predictions_std, predictions_mape) else: predictions = predictions[0] self.point_row(parameter, predictions) predictions = pd.DataFrame(self.prediction_dictionary).T predictions = predictions.reset_index(names='Parameters') self.prediction_table = predictions #predictions.astype(float).round(3) return predictions def prediction_table_single(self) -> pd.DataFrame: """ Generate a table of predictions for single-point methods. Returns: pandas.DataFrame: A DataFrame containing predictions and metrics. 
""" networks = self.networks.networks_configured inputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['inputs'] for working_network in range(len(networks))] outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs'] for working_network in range(len(networks))] inputs = np.asarray(inputs) outputs = np.asarray(outputs, dtype=object) #outputs = outputs[outputs != 0] self.prediction_dictionary = {} self.prediction_all_dict = {} rows = np.zeros(len(networks) * len(outputs.ravel()),dtype=object) for idx in range(len(networks)): for jdx in range(len(outputs[idx])): if len(outputs[idx]) == 1: parameter = outputs[idx][0] parameter = self.clean_parameter(parameter) predictions_mean = self.networks.mean[idx][0] #predictions_mape = self.MAPE[idx][0] self.multipoint_row_single(parameter, predictions_mean) else: parameter = outputs[idx][jdx] predictions_mean = self.networks.mean[idx][jdx] #predictions_mape = self.MAPE[idx][jdx] parameter = self.clean_parameter(parameter) self.multipoint_row_single(parameter, predictions_mean) predictions = pd.DataFrame(self.prediction_dictionary).T predictions = predictions.reset_index(names='Parameters') self.prediction_table = predictions #predictions.astype(float).round(3) return predictions def distributions(self) -> List[str]: """ Retrieve file paths for distribution plots. Returns: list: A list of file paths for distribution plots. """ figdir = os.path.join(os.getcwd(), 'temp') files = os.listdir(figdir) files = [os.path.join(figdir,x) for x in files if 'tempDF' in x] files = [x for x in files if x.endswith('.png')] files = natsorted(files) return files def confusions_values(self) -> List[str]: """ Retrieve file paths for confusion matrix plots. Returns: list: A list of file paths for confusion matrix plots. 
""" figdir = os.path.join(os.getcwd(), 'temp') files = os.listdir(figdir) files = [os.path.join(figdir,x) for x in files if 'tempCF' in x] files = [x for x in files if x.endswith('.csv')] files = natsorted(files) return files def distributions_values(self) -> List[str]: """ Retrieve file paths for distribution plots. Returns: list: A list of file paths for distribution plots. """ figdir = os.path.join(os.getcwd(), 'temp') print(figdir) files = os.listdir(figdir) print(files) files = [os.path.join(figdir,x) for x in files if 'tempDF' in x] files = [os.path.join(figdir,x) for x in files if '_norm' not in x] print(files) files = [x for x in files if x.endswith('.csv')] files = natsorted(files) return files def distributions_values_nomr(self) -> List[str]: """ Retrieve file paths for distribution plots. Returns: list: A list of file paths for distribution plots. """ figdir = os.path.join(os.getcwd(), 'temp') files = os.listdir(figdir) files = [os.path.join(figdir,x) for x in files if '_norm' in x] files = [x for x in files if x.endswith('.csv')] files = natsorted(files) return files def confustions(self) -> List[str]: """ Retrieve file paths for confusion matrix plots. Returns: list: A list of file paths for confusion matrix plots. """ figdir = os.path.join(os.getcwd(), 'temp') files = os.listdir(figdir) files = [os.path.join(figdir,x) for x in files if 'tempCF' in x] files = [x for x in files if x.endswith('.png')] files = natsorted(files) return files def insert_plots(self, df_files: List[str], cf_files: List[str]) -> None: """ Insert distribution and confusion matrix plots into the report. Args: df_files (list): File paths for distribution plots. cf_files (list): File paths for confusion matrix plots. 
""" outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs'] for working_network in range(len(self.networks.networks_configured))] if len(df_files) == len(cf_files): for idx in range(len(df_files)): try: self.pdf.subsection(Label(outputs[idx][0]).english) except: self.pdf.subsection('Error') figs = [df_files[idx], cf_files[idx]] self.pdf.subfigure(*figs, width=0.455) elif len(df_files) == 0: raise ValueError("No distributions were found") elif len(cf_files) == 0: for idx in range(len(df_files)): try: self.pdf.subsection(Label(outputs[idx][0]).english) except: self.pdf.subsection('Error') figs = [df_files[idx]] self.pdf.subfigure(*figs, width=0.455) else: raise ValueError("The length of distributions does not match that of confusion matrices") def insert_plot(self, cf_files: List[str]) -> None: """ Insert confusion matrix plots into the report. Args: cf_files (list): File paths for confusion matrix plots. """ outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs'] for working_network in range(len(self.networks.networks_configured))] for idx in range(len(cf_files)): try: self.pdf.subsection(Label(outputs[idx][0]).english) except: self.pdf.subsection('Error') figs = [cf_files[idx]] self.pdf.subfigure(*figs, width=0.455) #TODO Ouput: Table + Confusion matrix def single_results(self) -> None: """ Add results for single-point methods to the report. This method includes a table of predictions and confusion matrix plots. 
""" predictions = self.prediction_table_single() cap = 'Machine learning predictions by the single method' self.pdf.table(predictions, centering=True, caption=cap) cfFiles = self.confustions() self.insert_plot(cfFiles) cfFiles = self.confusions_values() self.cfv = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in cfFiles])) #TODO Output: Table + Distributions + Confusion matrix def difference_results(self) -> None: """ Add results for difference-based methods to the report. This method includes a table of predictions, distribution plots, and confusion matrix plots. """ predictions = self.prediction_table() cap = 'Machine learning predictions by the difference method' self.pdf.table(predictions, centering=True, caption=cap) dfFiles = self.distributions() cfFiles = self.confustions() self.insert_plots(dfFiles, cfFiles) dfvFiles = self.distributions_values() print(dfvFiles) self.dfv = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in dfvFiles])) dfv_norm_files = self.distributions_values_nomr() self.dfv_norm = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in dfv_norm_files])) cfFiles = self.confusions_values() self.cfv = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in cfFiles])) #TODO Output: Table + Confusion matrix def ensemble_results(self) -> None: """ Add results for ensemble-based methods to the report. This method includes a table of predictions and confusion matrix plots. """ predictions = self.prediction_table() cap = 'Machine learning predictions by the ensemble method' self.pdf.table(predictions, centering=True, caption=cap) def set_dictionary(self, dict: Dict[str, Any], value: float, unit: str) -> Dict[str, Any]: """ Set values in a dictionary for a specific parameter. Args: dict (dict): The dictionary to update. value (float): The value to set. unit (str): The unit of the value. Returns: dict: The updated dictionary. 
""" dict['Value'] = value dict['Unit'] = unit return dict def clean_up(self) -> None: """ Clean up temporary files generated during report creation. This method removes temporary files and renames the final PDF report. """ files = os.listdir(os.getcwd()) files = [f for f in files if 'tempfile' in f] files = [f for f in files if f.endswith('.pdf') == False] files = [os.path.join(os.getcwd(), f) for f in files if f.endswith('.csv') == False] for f in files: os.remove(f) os.rename('tempfile.pdf', str(self.name)+'.pdf') shutil.rmtree(os.path.join(os.getcwd(), 'temp')) def save_report(self, name: str) -> None: """ Save the report as a PDF and CSV file. Args: name (str): The name of the report file (without extension). """ self.name = name self.pdf.end_document() self.pdf.save_tex() self.pdf.compile() self.pdf.compile() try: self.dfv.to_csv(self.name + '_distributions.csv', index=False) self.dfv_norm.to_csv(self.name + '_distributions_norm.csv', index=False) except: print('No distributions to save') try: self.cfv.to_csv(self.name + '_confusions.csv', index=False) except: print('No confusions to save') self.clean_up() self.prediction_table.to_csv(self.name + '.csv') def save_report_csv(self, name: str) -> None: """ Save the predictions as a CSV file. Args: name (str): The name of the CSV file (without extension). """ self.name = name self.prediction_table.to_csv(self.name + '.csv')
A class for generating reports and visualizations for machine learning results.
This class provides methods for creating LaTeX-based reports, generating plots, calculating experimental parameters, and saving results in various formats.
Initialize an Output instance.
Args
networks : Networks - The trained networks to generate reports for.
inputs - The experimental inputs used for predictions.
abs_dir : str, optional - Directory containing absolute data.
Methods
def build_report(self) ‑> None
def build_report(self) -> None:
    """
    Build a comprehensive report as a LaTeX document.

    This method generates a LaTeX-based report that includes experimental
    results, machine learning predictions, and analysis. The report is
    automatically formatted with headers, footers, and proper styling.

    The report includes:
    - Experimental data analysis and visualization
    - Machine learning model predictions and performance metrics
    - Comparison between experimental and predicted results
    - Statistical analysis and error metrics

    Note:
        This method only assembles the document. Call save_report() to
        write the LaTeX source and compile it to PDF. Custom styling
        includes Oghma_ML branding.
    """
    self.pdf = Document(
        document_class='article',
        document_properties=['a4paper'],
        packages=['graphicx', 'geometry', 'booktabs', 'caption', 'subcaption',
                  'float', 'xcolor', 'colortbl', 'fancyhdr'],
    )
    self.pdf.geometry(left='1cm', right='1cm', top='1.5cm', bottom='2cm')
    # Header and footer are handled by the fancyhdr package.
    self.pdf.write(self.pdf.command('pagestyle', 'fancy'))
    self.pdf.write(self.pdf.command('lhead', 'Device Report'))
    self.pdf.write(self.pdf.command('rhead', '\\thepage'))
    # '\\_' avoids the invalid '\_' escape sequence in a normal string.
    self.pdf.write(self.pdf.command('cfoot', 'Report Produced by Oghma\\_ML, developed by Cai Williams'))
    self.pdf.begin_document()
    self.experimental_results()
    self.machine_learning_results()
Build a comprehensive report in PDF format.
This method generates a LaTeX-based report that includes experimental results, machine learning predictions, and analysis. The report is automatically formatted with headers, footers, and proper styling.
The report includes:
- Experimental data analysis and visualization
- Machine learning model predictions and performance metrics
- Comparison between experimental and predicted results
- Statistical analysis and error metrics
Note
build_report() only assembles the document; figures are written to the temporary directory, and save_report() writes the LaTeX source and compiles it to PDF. Custom styling includes Oghma_ML branding.
def calcluate_experimental_paramaters(self) ‑> pandas.core.frame.DataFrame
def calcluate_experimental_paramaters(self) -> pd.DataFrame:
    """
    Calculate experimental parameters from input characteristics.

    Returns:
        pandas.DataFrame: A DataFrame containing calculated parameters
        (empty for unsupported characterisation types).
    """
    match self.inputs.characterisation_type:
        # The JV-type and IV branches were identical, so they share one case.
        case 'JV' | 'JV_I4' | 'SPO JV' | 'IV':
            V = self.inputs.x
            J = self.inputs.y
            # Nearest-sample estimates: J where V is closest to 0,
            # and V where J is closest to 0.
            Jsc = -J[np.argmin(np.abs(V))]
            Voc = V[np.argmin(np.abs(J))]
            P_max = np.abs(np.min(V * J))
            FF = P_max / (Voc * Jsc)
            PCE = Jsc * Voc * FF / 10
            values = {
                'Jsc': (Jsc, '$Am^{-2}$'),
                'Voc': (Voc, '$V$'),
                'FF': (FF, '$a.u$'),
                'Pmax': (P_max, '$Wm^{-2}$'),
                'PCE': (PCE, '$Percent$'),
            }
            experimental_parameters = {}
            for key, (value, unit) in values.items():
                experimental_parameters[key] = self.set_dictionary({}, value, unit)
            experimental_parameters = pd.DataFrame(data=experimental_parameters).T
            experimental_parameters = experimental_parameters.reset_index(names='Parameter')
            return experimental_parameters
        case _:
            # Return an empty DataFrame so the return type matches the annotation.
            return pd.DataFrame()
Calculate experimental parameters from input characteristics.
Returns
pandas.DataFrame
- A DataFrame containing calculated parameters.
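The arithmetic in this method can be tried on a synthetic curve. The sketch below reuses the same nearest-sample formulas in a standalone function; the function name `jv_parameters` and the straight-line test curve are assumptions made for illustration. Dividing by 10 to get a percentage is consistent with an assumed illumination of 1000 W m^-2, since P_max / 1000 * 100 = P_max / 10.

```python
import numpy as np


def jv_parameters(V, J):
    """Nearest-sample estimates of Jsc, Voc, Pmax, FF and PCE from a JV curve.

    V in volts, J in A m^-2, with photocurrent negative (as in the source).
    """
    V = np.asarray(V, dtype=float)
    J = np.asarray(J, dtype=float)
    Jsc = -J[np.argmin(np.abs(V))]   # current density at the sample nearest V = 0
    Voc = V[np.argmin(np.abs(J))]    # voltage at the sample nearest J = 0
    P_max = np.abs(np.min(V * J))    # most negative V*J is the maximum power point
    FF = P_max / (Voc * Jsc)
    PCE = Jsc * Voc * FF / 10        # percent, assuming 1000 W m^-2 illumination
    return {'Jsc': Jsc, 'Voc': Voc, 'Pmax': P_max, 'FF': FF, 'PCE': PCE}


# Hypothetical straight-line "device": J = -200 + 250 V on a 10 mV grid.
V = np.linspace(0.0, 1.0, 101)
J = -200.0 + 250.0 * V
params = jv_parameters(V, J)

for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

Because the estimates pick the nearest sample instead of interpolating, their accuracy is limited by the voltage step of the measurement.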
def clean_parameter(self, parameter: str) ‑> str
def clean_parameter(self, parameter: str) -> str:
    """
    Clean a parameter string by replacing special characters with spaces.

    Args:
        parameter (str): The parameter string to clean.

    Returns:
        str: The cleaned parameter string.
    """
    # Each matched character becomes a single space; nothing is deleted.
    parameter = re.sub(r"[-()\"#/@;:<>{}=~|.?,_]", " ", parameter)
    return parameter
Clean a parameter string by replacing special characters with spaces.
Args
parameter : str - The parameter string to clean.
Returns
str
- The cleaned parameter string.
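A standalone sketch of the same substitution shows what the cleaning does to a name. The input `mue_z(active)` is a made-up parameter token, not one taken from PyOghma_ML, and the module-level constant is an illustrative choice.

```python
import re

# Same character class as in the method above, compiled once for reuse.
SPECIAL_CHARACTERS = re.compile(r"[-()\"#/@;:<>{}=~|.?,_]")


def clean_parameter(parameter: str) -> str:
    """Replace every special character with a single space."""
    return SPECIAL_CHARACTERS.sub(" ", parameter)


# Hypothetical parameter token used only for this example.
cleaned = clean_parameter("mue_z(active)")
print(repr(cleaned))
```

Each matched character is replaced one for one, so the closing parenthesis leaves a trailing space; calling `.strip()` on the result would remove it if that matters for table layout.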
def clean_up(self) ‑> None
def clean_up(self) -> None:
    """
    Clean up temporary files generated during report creation.

    This method removes temporary files and renames the final PDF report.
    """
    cwd = os.getcwd()
    # Remove LaTeX by-products named 'tempfile*', keeping .pdf and .csv files.
    files = [f for f in os.listdir(cwd) if 'tempfile' in f]
    files = [f for f in files if not f.endswith('.pdf')]
    files = [os.path.join(cwd, f) for f in files if not f.endswith('.csv')]
    for f in files:
        os.remove(f)
    os.rename('tempfile.pdf', str(self.name) + '.pdf')
    # Delete the temporary figure directory.
    shutil.rmtree(os.path.join(cwd, 'temp'))
Clean up temporary files generated during report creation.
This method removes temporary files and renames the final PDF report.
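The filtering rule can be exercised safely in a scratch directory. The sketch below is a standalone rewrite using `pathlib`, not the method itself: the `workdir` argument, the file names and the report name are assumptions for the example, and the removal of the `temp` figure directory is left out.

```python
import tempfile
from pathlib import Path


def clean_up(workdir: Path, name: str) -> Path:
    """Delete 'tempfile*' by-products (except .pdf/.csv) and rename the PDF."""
    for path in workdir.iterdir():
        is_byproduct = path.is_file() and 'tempfile' in path.name
        if is_byproduct and path.suffix not in {'.pdf', '.csv'}:
            path.unlink()
    final_pdf = workdir / f'{name}.pdf'
    (workdir / 'tempfile.pdf').rename(final_pdf)
    return final_pdf


# Build a throwaway directory with placeholder files, then clean it.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    for filename in ('tempfile.tex', 'tempfile.aux', 'tempfile.log',
                     'tempfile.pdf', 'notes.txt'):
        (workdir / filename).write_text('placeholder')
    final_pdf = clean_up(workdir, 'device_report')
    remaining = sorted(p.name for p in workdir.iterdir())

print(remaining)
```

Passing the working directory explicitly, instead of relying on `os.getcwd()`, makes the cleanup testable and avoids deleting files if the process was started from an unexpected location.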
def confusion_matrices(self, network_type: str) ‑> None
def confusion_matrices(self, network_type: str) -> None:
    """
    Generate confusion matrices for the specified network type.

    Args:
        network_type (str): The type of network (e.g., 'Point', 'Difference').
    """
    A = Networks.initialise(self.networks.networks_dir, network_type=network_type)
    # The returned values are stored as self.MAPE for the prediction tables.
    self.MAPE = A.confusion_matrix(self.abs_dir)
Generate confusion matrices for the specified network type.
Args
network_type : str - The type of network (e.g., 'Point', 'Difference').
def confusions_values(self) ‑> List[str]
def confusions_values(self) -> List[str]:
    """
    Retrieve file paths for confusion matrix data (CSV).

    Returns:
        list: A list of file paths for confusion matrix CSV files.
    """
    figdir = os.path.join(os.getcwd(), 'temp')
    files = os.listdir(figdir)
    files = [os.path.join(figdir, x) for x in files if 'tempCF' in x]
    files = [x for x in files if x.endswith('.csv')]
    files = natsorted(files)
    return files
Retrieve file paths for confusion matrix data (CSV).
Returns
list
- A list of file paths for confusion matrix CSV files.
def confustions(self) ‑> List[str]
def confustions(self) -> List[str]:
    """
    Retrieve file paths for confusion matrix plots.

    Returns:
        list: A list of file paths for confusion matrix plots.
    """
    figdir = os.path.join(os.getcwd(), 'temp')
    files = os.listdir(figdir)
    files = [os.path.join(figdir, x) for x in files if 'tempCF' in x]
    files = [x for x in files if x.endswith('.png')]
    files = natsorted(files)
    return files
Retrieve file paths for confusion matrix plots.
Returns
list
- A list of file paths for confusion matrix plots.
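The file-retrieval helpers all follow one pattern: filter by a marker substring, filter by extension, then sort naturally so that a file numbered 10 comes after a file numbered 2. The sketch below reproduces that pattern with the standard library only; `natural_key` is a simple stand-in for `natsorted`, not its implementation, and the file names are invented for the example.

```python
import re


def natural_key(name: str):
    """Sort key that compares digit runs numerically ('2' before '10')."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r'(\d+)', name)]


def select_files(filenames, marker, extension):
    """Keep names containing `marker` and ending in `extension`, naturally sorted."""
    matches = [f for f in filenames if marker in f and f.endswith(extension)]
    return sorted(matches, key=natural_key)


# Hypothetical directory listing.
filenames = ['tempCF10.png', 'tempCF2.png', 'tempCF1.csv',
             'tempDF1.png', 'tempCF1.png']
confusion_plots = select_files(filenames, 'tempCF', '.png')
print(confusion_plots)
```

A plain `sorted()` would place `tempCF10.png` before `tempCF2.png`, which would pair plots with the wrong network outputs when they are inserted by index.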
def difference_results(self) ‑> None
def difference_results(self) -> None:
    """
    Add results for difference-based methods to the report.

    This method includes a table of predictions, distribution plots, and
    confusion matrix plots.
    """
    predictions = self.prediction_table()
    cap = 'Machine learning predictions by the difference method'
    self.pdf.table(predictions, centering=True, caption=cap)

    # Insert the distribution and confusion matrix figures.
    dfFiles = self.distributions()
    cfFiles = self.confustions()
    self.insert_plots(dfFiles, cfFiles)

    # Collect the underlying CSV data side by side for export in save_report().
    dfvFiles = self.distributions_values()
    self.dfv = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in dfvFiles]))
    dfv_norm_files = self.distributions_values_nomr()
    self.dfv_norm = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in dfv_norm_files]))
    cfFiles = self.confusions_values()
    self.cfv = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in cfFiles]))
Add results for difference-based methods to the report.
This method includes a table of predictions, distribution plots, and confusion matrix plots.
def distributions(self) ‑> List[str]
-
def distributions(self) -> List[str]:
    """
    Retrieve file paths for distribution plots.

    Returns:
        list: A list of file paths for distribution plots.
    """
    figdir = os.path.join(os.getcwd(), 'temp')
    files = os.listdir(figdir)
    files = [os.path.join(figdir, x) for x in files if 'tempDF' in x]
    files = [x for x in files if x.endswith('.png')]
    files = natsorted(files)
    return files
Retrieve file paths for distribution plots.
Returns
list
- A list of file paths for distribution plots.
def distributions_values(self) ‑> List[str]
-
def distributions_values(self) -> List[str]:
    """
    Retrieve file paths for the CSV files that hold distribution values,
    excluding the normalised ('_norm') files.

    Returns:
        list: A list of file paths for distribution CSV files.
    """
    figdir = os.path.join(os.getcwd(), 'temp')
    print(figdir)
    files = os.listdir(figdir)
    print(files)
    files = [os.path.join(figdir, x) for x in files if 'tempDF' in x]
    # Paths are already joined to figdir above, so only filter here.
    files = [x for x in files if '_norm' not in x]
    print(files)
    files = [x for x in files if x.endswith('.csv')]
    files = natsorted(files)
    return files
Retrieve file paths for the CSV files that hold distribution values, excluding the normalised ('_norm') files.
Returns
list
- A list of file paths for distribution CSV files.
def distributions_values_nomr(self) ‑> List[str]
-
def distributions_values_nomr(self) -> List[str]:
    """
    Retrieve file paths for the CSV files that hold normalised distribution values.

    Returns:
        list: A list of file paths for normalised distribution CSV files.
    """
    figdir = os.path.join(os.getcwd(), 'temp')
    files = os.listdir(figdir)
    # Only the '_norm' marker is checked here; 'tempDF' is not required.
    files = [os.path.join(figdir, x) for x in files if '_norm' in x]
    files = [x for x in files if x.endswith('.csv')]
    files = natsorted(files)
    return files
Retrieve file paths for the CSV files that hold normalised distribution values.
Returns
list
- A list of file paths for normalised distribution CSV files.
def ensemble_results(self) ‑> None
-
def ensemble_results(self) -> None:
    """
    Add results for ensemble-based methods to the report.

    This method adds a table of predictions. Unlike single_results and
    difference_results, it does not insert confusion matrix plots.
    """
    predictions = self.prediction_table()
    cap = 'Machine learning predictions by the ensemble method'
    self.pdf.table(predictions, centering=True, caption=cap)
Add results for ensemble-based methods to the report.
This method adds a table of predictions. Unlike single_results and difference_results, it does not insert confusion matrix plots.
def experimental_results(self) ‑> None
-
def experimental_results(self) -> None:
    """
    Add experimental results to the report.

    This method adds the following to the report:
    - The characteristic curve, with axis limits and labels set for
      supported characterisation types
    - A table of parameters calculated from the input characteristic,
      when any are available

    The plot depends on the characterisation type:
    - JV, IV, JV_I4, SPO JV: current density vs voltage
    - batch_JV: current density vs voltage, plotted once per entry in
      the input x data
    - 2d_JV and any other type: no curve is drawn

    Note:
        The figure is saved to the temporary directory and included in the
        LaTeX report. Calculated parameters are displayed in tabular format.
    """
    self.pdf.section('Experimental Results')
    cap = 'Experimental Characteristics'
    characteristic = Figures()
    match self.inputs.characterisation_type:
        case 'JV' | 'IV' | 'JV_I4' | 'SPO JV':
            characteristic.initialise_figure(figsize=(6, 6))
            characteristic.plot(self.inputs.x, self.inputs.y)
            characteristic.set_x_limits(left=np.min(self.inputs.x), right=np.max(self.inputs.x))
            characteristic.set_y_limits(top=np.max(self.inputs.y), bottom=np.min(self.inputs.y)*1.5)
            characteristic.set_x_label('Voltage (V)')
            characteristic.set_y_label('Current Density ($Am^{-2}$)')
        case 'batch_JV':
            characteristic.initialise_figure(figsize=(6, 6))
            # NOTE: idx is unused, so each iteration plots the full x and y arrays.
            for idx in self.inputs.x:
                characteristic.plot(self.inputs.x, self.inputs.y)
            characteristic.set_x_limits(left=np.min(self.inputs.x), right=np.max(self.inputs.x))
            characteristic.set_y_limits(top=np.max(self.inputs.y), bottom=np.min(self.inputs.y) * 1.5)
            characteristic.set_x_label('Voltage (V)')
            characteristic.set_y_label('Current Density ($Am^{-2}$)')
        case '2d_JV':
            pass
        case _:
            pass

    characteristic_path = os.path.join(self.temp_dir, 'characteristic.png')
    characteristic.save_to_disk(characteristic_path, dpi=300)

    experimental_paramaters = self.calcluate_experimental_paramaters()

    cap = 'Experimental Input Characteristic'
    self.pdf.figure(characteristic_path, centering=True, width='0.6\\textwidth', caption=cap)
    if len(experimental_paramaters) != 0:
        cap = 'Parameters calculable from input characteristic'
        self.pdf.table(experimental_paramaters, centering=True, caption=cap)
    self.pdf.newpage()
Add experimental results to the report.
This method adds the following to the report:
- The characteristic curve, with axis limits and labels set for supported characterisation types
- A table of parameters calculated from the input characteristic, when any are available
The plot depends on the characterisation type:
- JV, IV, JV_I4, SPO JV: current density vs voltage
- batch_JV: current density vs voltage, plotted once per entry in the input x data
- 2d_JV and any other type: no curve is drawn in the source shown
Note
The figure is saved to the temporary directory and included in the LaTeX report. Calculated parameters are displayed in tabular format.
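The axis limits used for the JV-type plots can be reproduced without a plotting backend. The sketch below is illustrative only: `jv_axis_limits` is a hypothetical helper, and the factor of 1.5 on the lower y limit is taken from the source shown above.

```python
from typing import Sequence, Tuple

Limits = Tuple[float, float]


def jv_axis_limits(x: Sequence[float], y: Sequence[float],
                   bottom_scale: float = 1.5) -> Tuple[Limits, Limits]:
    """Return ((x_left, x_right), (y_bottom, y_top)) for a JV-style plot."""
    x_limits = (min(x), max(x))
    # Scaling the minimum adds headroom only when the minimum is negative.
    y_limits = (min(y) * bottom_scale, max(y))
    return x_limits, y_limits


voltage = [-0.2, 0.0, 0.4, 0.8]
current_density = [-200.0, -190.0, -120.0, 50.0]
x_limits, y_limits = jv_axis_limits(voltage, current_density)
```

Because the lower limit is the minimum multiplied by 1.5, a curve whose minimum current density is positive would have its lowest points fall below the visible range.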
def insert_plot(self, cf_files: List[str]) ‑> None
-
def insert_plot(self, cf_files: List[str]) -> None:
    """
    Insert confusion matrix plots into the report.

    Args:
        cf_files (list): File paths for confusion matrix plots.
    """
    outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs']
               for working_network in range(len(self.networks.networks_configured))]
    for idx in range(len(cf_files)):
        # One subsection per plot, titled with the first output of the matching network.
        try:
            self.pdf.subsection(Label(outputs[idx][0]).english)
        except:
            self.pdf.subsection('Error')
        figs = [cf_files[idx]]
        self.pdf.subfigure(*figs, width=0.455)
Insert confusion matrix plots into the report.
Args
cf_files
:list
- File paths for confusion matrix plots.
def insert_plots(self, df_files: List[str], cf_files: List[str]) ‑> None
-
def insert_plots(self, df_files: List[str], cf_files: List[str]) -> None:
    """
    Insert distribution and confusion matrix plots into the report.

    Args:
        df_files (list): File paths for distribution plots.
        cf_files (list): File paths for confusion matrix plots.
    """
    outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs']
               for working_network in range(len(self.networks.networks_configured))]
    if len(df_files) == len(cf_files):
        # Matching counts: place each distribution next to its confusion matrix.
        for idx in range(len(df_files)):
            try:
                self.pdf.subsection(Label(outputs[idx][0]).english)
            except:
                self.pdf.subsection('Error')
            figs = [df_files[idx], cf_files[idx]]
            self.pdf.subfigure(*figs, width=0.455)
    elif len(df_files) == 0:
        raise ValueError("No distributions were found")
    elif len(cf_files) == 0:
        # No confusion matrices: insert the distributions on their own.
        for idx in range(len(df_files)):
            try:
                self.pdf.subsection(Label(outputs[idx][0]).english)
            except:
                self.pdf.subsection('Error')
            figs = [df_files[idx]]
            self.pdf.subfigure(*figs, width=0.455)
    else:
        raise ValueError("The length of distributions does not match that of confusion matrices")
Insert distribution and confusion matrix plots into the report.
Args
df_files
:list
- File paths for distribution plots.
cf_files
:list
- File paths for confusion matrix plots.
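The pairing rules in `insert_plots` can be isolated from the PDF calls as a pure function. This is a minimal sketch under that assumption: `group_plots` is a hypothetical name, and it returns the figure groups that the source shown would pass to `self.pdf.subfigure`.

```python
from typing import List


def group_plots(df_files: List[str], cf_files: List[str]) -> List[List[str]]:
    """Group plot files per subsection, following the branch order of insert_plots."""
    if len(df_files) == len(cf_files):
        # Equal counts (including two empty lists): pair files by position.
        return [[df, cf] for df, cf in zip(df_files, cf_files)]
    if len(df_files) == 0:
        raise ValueError("No distributions were found")
    if len(cf_files) == 0:
        # Distributions only: one figure per subsection.
        return [[df] for df in df_files]
    raise ValueError("The length of distributions does not match that of confusion matrices")


paired = group_plots(['df1.png', 'df2.png'], ['cf1.png', 'cf2.png'])
singles = group_plots(['df1.png', 'df2.png'], [])
```

Note that two empty lists satisfy the first branch, so nothing is inserted and no error is raised in that case.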
def machine_learning_results(self) ‑> None
-
def machine_learning_results(self) -> None:
    """
    Add machine learning results to the report.

    This method includes predictions, confusion matrices, and other relevant
    metrics based on the network type.
    """
    self.pdf.section('Machine Learning Parameters')
    # Dispatch on the class name of the configured networks object.
    match self.networks.__class__.__name__:
        case 'Point':
            self.confusion_matrices('Point')
            self.single_results()
        case 'Difference':
            self.confusion_matrices('Difference')
            self.difference_results()
        case 'Residual':
            self.confusion_matrices('Residual')
            self.difference_results()
        case 'Ensemble':
            self.confusion_matrices('Ensemble')
            self.ensemble_results()
        case _:
            raise ValueError('Network Type Not Recognised!')
Add machine learning results to the report.
This method includes predictions, confusion matrices, and other relevant metrics based on the network type.
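The dispatch on the network class name can also be written as a lookup table, which runs on Python versions that predate `match`. The sketch below is an illustration, not the library's implementation: `RESULT_METHODS`, `results_method_for`, and the empty stand-in classes are hypothetical, while the name-to-method mapping is copied from the source shown above.

```python
from typing import Dict

# Network class name -> name of the Output method that renders its results.
RESULT_METHODS: Dict[str, str] = {
    'Point': 'single_results',
    'Difference': 'difference_results',
    'Residual': 'difference_results',
    'Ensemble': 'ensemble_results',
}


def results_method_for(networks: object) -> str:
    """Return the results method name for a networks object, by class name."""
    name = type(networks).__name__
    if name not in RESULT_METHODS:
        raise ValueError('Network Type Not Recognised!')
    return RESULT_METHODS[name]


class Residual:
    """Stand-in for a network class; only its name matters here."""


class Unknown:
    """Stand-in for an unsupported network class."""


method_name = results_method_for(Residual())
```

Either form keys on the class name alone, so a subclass with a different name would not be recognised.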
def multipoint_row(self, parameter: str, mean: float, std: float, mape: float) ‑> None
-
def multipoint_row(self, parameter: str, mean: float, std: float, mape: float) -> None:
    """
    Add a row to the prediction dictionary for multipoint predictions.

    Args:
        parameter (str): The parameter name.
        mean (float): The mean prediction value.
        std (float): The standard deviation of predictions.
        mape (float): The Mean Absolute Percentage Error (MAPE).
    """
    parameter = self.clean_parameter(parameter)
    self.prediction_dictionary[parameter] = {}
    dictionary = self.prediction_dictionary[parameter]

    # Very large or very small means are stored as scientific-notation strings.
    if mean > 1e3 or mean < 1e-3:
        mean = '{:.2e}'.format(mean)

    dictionary['Mean'] = mean
    dictionary['Standard Deviation'] = std
    dictionary['Units'] = 'NA'
    # '\\%' is the same string as the original '\%', without the invalid-escape warning.
    dictionary['MAPE (\\%)'] = mape
    return
Add a row to the prediction dictionary for multipoint predictions.
Args
parameter
:str
- The parameter name.
mean
:float
- The mean prediction value.
std
:float
- The standard deviation of predictions.
mape
:float
- The Mean Absolute Percentage Error (MAPE).
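The formatting rule applied to `mean` is easiest to see in isolation. This is a minimal sketch with a hypothetical helper name, `format_mean`; the thresholds and format string are those in the source shown above.

```python
from typing import Union


def format_mean(mean: float) -> Union[float, str]:
    """Return mean as a '1.23e+04'-style string outside [1e-3, 1e3], else unchanged."""
    if mean > 1e3 or mean < 1e-3:
        return '{:.2e}'.format(mean)
    return mean


large = format_mean(12345.0)   # '1.23e+04'
typical = format_mean(0.5)     # 0.5 (unchanged float)
negative = format_mean(-2.0)   # '-2.00e+00'
```

The comparison is on the signed value, so zero and every negative mean are also converted to strings, and the resulting 'Mean' column can mix floats and strings.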
def multipoint_row_single(self, parameter: str, mean: float, mape: float | None = None) ‑> None
-
def multipoint_row_single(self, parameter: str, mean: float, mape: Optional[float] = None) -> None:
    """
    Add a row to the prediction dictionary for single-point predictions.

    Args:
        parameter (str): The parameter name.
        mean (float): The mean prediction value.
        mape (float, optional): The Mean Absolute Percentage Error (MAPE).
    """
    keys = list(self.prediction_dictionary.keys())
    # parameter = Label(parameter)
    # if parameter.english == '':
    #     parameter.english = parameter.token
    # if parameter.english in keys:
    #     number = len(np.where(parameter.english in keys))
    #     parameter.english = parameter.english + ' (' + str(number) + ')'
    parameter = self.clean_parameter(parameter)
    self.prediction_dictionary[parameter] = {}
    dictionary = self.prediction_dictionary[parameter]

    # if mean > 1e3 or mean < 1e-3:
    #     mean = '{:.2e}'.format(mean)

    dictionary['Mean'] = mean
    dictionary['Units'] = 'NA'
    # Compare with None by identity; '\\%' equals the original '\%' string.
    if mape is None:
        dictionary['MAPE (\\%)'] = 'N/A'
    else:
        dictionary['MAPE (\\%)'] = mape
    return
Add a row to the prediction dictionary for single-point predictions.
Args
parameter
:str
- The parameter name.
mean
:float
- The mean prediction value.
mape
:float, optional
- The Mean Absolute Percentage Error (MAPE).
def point_row(self, parameter: str, point: float) ‑> None
-
def point_row(self, parameter: str, point: float) -> None:
    """
    Add a row to the prediction dictionary for a single prediction.

    Args:
        parameter (str): The parameter name.
        point (float): The prediction value.
    """
    self.prediction_dictionary[parameter] = {}
    dictionary = self.prediction_dictionary[parameter]
    dictionary['Value'] = point
    dictionary['Units'] = 'to implement'
    return
Add a row to the prediction dictionary for a single prediction.
Args
parameter
:str
- The parameter name.
point
:float
- The prediction value.
def prediction_table(self) ‑> pandas.core.frame.DataFrame
-
def prediction_table(self) -> pd.DataFrame:
    """
    Generate a table of predictions for multipoint methods.

    Returns:
        pandas.DataFrame: A DataFrame containing predictions and metrics.
    """
    networks = self.networks.networks_configured
    inputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['inputs']
              for working_network in range(len(networks))]
    outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs']
               for working_network in range(len(networks))]
    inputs = np.asarray(inputs)
    outputs = np.asarray(outputs, dtype=object)
    #outputs = outputs[outputs != 0]

    self.prediction_dictionary = {}
    self.prediction_all_dict = {}
    rows = np.zeros(len(networks) * len(outputs.ravel()), dtype=object)
    for idx in range(len(networks)):
        for jdx in range(len(outputs[idx])):
            parameter = outputs[idx][0]
            if len(self.networks.predictions) > 0:
                predictions_mean = self.networks.mean[idx]
                predictions_std = self.networks.std[idx]
                try:
                    predictions_mape = self.MAPE[idx]
                except:
                    predictions_mape = 0
                self.multipoint_row(parameter, predictions_mean, predictions_std, predictions_mape)
            else:
                # NOTE: 'predictions' is not assigned before this line, so this
                # branch raises UnboundLocalError if it is ever reached.
                predictions = predictions[0]
                self.point_row(parameter, predictions)

    predictions = pd.DataFrame(self.prediction_dictionary).T
    predictions = predictions.reset_index(names='Parameters')
    # NOTE: this assignment replaces the bound method on the instance.
    self.prediction_table = predictions #predictions.astype(float).round(3)
    return predictions
Generate a table of predictions for multipoint methods. The result is also stored on the instance as self.prediction_table, which shadows this method after the first call.
Returns
pandas.DataFrame
- A DataFrame containing predictions and metrics.
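Because `prediction_table` assigns its result to `self.prediction_table`, the instance attribute hides the method once it has run. The sketch below shows the same effect with a hypothetical `Report` class and a plain list in place of the DataFrame.

```python
from typing import List


class Report:
    """Toy class whose method stores its result under the method's own name."""

    def table(self) -> List[str]:
        rows = ['row']
        self.table = rows  # instance attribute now hides the method
        return rows


report = Report()
first = report.table()          # works: resolves to the method
stored = report.table           # now the stored list, not a method

try:
    report.table()              # a list is not callable
    second_call_failed = False
except TypeError:
    second_call_failed = True
```

This is why `save_report` and `save_report_csv` can call `self.prediction_table.to_csv(...)`, and also why the table must be built before either of them is called.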
def prediction_table_single(self) ‑> pandas.core.frame.DataFrame
-
def prediction_table_single(self) -> pd.DataFrame:
    """
    Generate a table of predictions for single-point methods.

    Returns:
        pandas.DataFrame: A DataFrame containing predictions and metrics.
    """
    networks = self.networks.networks_configured
    inputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['inputs']
              for working_network in range(len(networks))]
    outputs = [self.networks.oghma_network_config['sims'][self.networks.networks_configured[working_network]]['outputs']
               for working_network in range(len(networks))]
    inputs = np.asarray(inputs)
    outputs = np.asarray(outputs, dtype=object)
    #outputs = outputs[outputs != 0]

    self.prediction_dictionary = {}
    self.prediction_all_dict = {}
    rows = np.zeros(len(networks) * len(outputs.ravel()), dtype=object)
    for idx in range(len(networks)):
        for jdx in range(len(outputs[idx])):
            if len(outputs[idx]) == 1:
                parameter = outputs[idx][0]
                parameter = self.clean_parameter(parameter)
                predictions_mean = self.networks.mean[idx][0]
                #predictions_mape = self.MAPE[idx][0]
                self.multipoint_row_single(parameter, predictions_mean)
            else:
                parameter = outputs[idx][jdx]
                predictions_mean = self.networks.mean[idx][jdx]
                #predictions_mape = self.MAPE[idx][jdx]
                parameter = self.clean_parameter(parameter)
                self.multipoint_row_single(parameter, predictions_mean)

    predictions = pd.DataFrame(self.prediction_dictionary).T
    predictions = predictions.reset_index(names='Parameters')
    self.prediction_table = predictions #predictions.astype(float).round(3)
    return predictions
Generate a table of predictions for single-point methods.
Returns
pandas.DataFrame
- A DataFrame containing predictions and metrics.
def save_report(self, name: str) ‑> None
-
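def save_report(self, name: str) -> None:
    """
    Compile the report to PDF and export the results as CSV files.

    Args:
        name (str): The name of the report file (without extension).
    """
    self.name = name
    self.pdf.end_document()
    self.pdf.save_tex()
    # The LaTeX source is compiled twice.
    self.pdf.compile()
    self.pdf.compile()
    # Distribution and confusion values exist only for some network types.
    try:
        self.dfv.to_csv(self.name + '_distributions.csv', index=False)
        self.dfv_norm.to_csv(self.name + '_distributions_norm.csv', index=False)
    except:
        print('No distributions to save')
    try:
        self.cfv.to_csv(self.name + '_confusions.csv', index=False)
    except:
        print('No confusions to save')
    self.clean_up()
    self.prediction_table.to_csv(self.name + '.csv')
Compile the report to PDF and export the results as CSV files. The LaTeX source is compiled twice. Distribution and confusion matrix values are written to name_distributions.csv, name_distributions_norm.csv, and name_confusions.csv when they are available, and the prediction table is written to name.csv.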
Args
name
:str
- The name of the report file (without extension).
def save_report_csv(self, name: str) ‑> None
-
def save_report_csv(self, name: str) -> None:
    """
    Save the predictions as a CSV file.

    Args:
        name (str): The name of the CSV file (without extension).
    """
    self.name = name
    self.prediction_table.to_csv(self.name + '.csv')
Save the predictions as a CSV file.
Args
name
:str
- The name of the CSV file (without extension).
def set_dictionary(self, dict: Dict[str, Any], value: float, unit: str) ‑> Dict[str, Any]
-
def set_dictionary(self, dict: Dict[str, Any], value: float, unit: str) -> Dict[str, Any]:
    """
    Set values in a dictionary for a specific parameter.

    Args:
        dict (dict): The dictionary to update.
        value (float): The value to set.
        unit (str): The unit of the value.

    Returns:
        dict: The updated dictionary.
    """
    dict['Value'] = value
    dict['Unit'] = unit
    return dict
Set values in a dictionary for a specific parameter.
Args
dict
:dict
- The dictionary to update.
value
:float
- The value to set.
unit
:str
- The unit of the value.
Returns
dict
- The updated dictionary.
def single_results(self) ‑> None
-
def single_results(self) -> None:
    """
    Add results for single-point methods to the report.

    This method includes a table of predictions and confusion matrix plots.
    """
    predictions = self.prediction_table_single()
    cap = 'Machine learning predictions by the single method'
    self.pdf.table(predictions, centering=True, caption=cap)

    cfFiles = self.confustions()
    self.insert_plot(cfFiles)

    # Stack the confusion matrix CSV values side by side for later export.
    cfFiles = self.confusions_values()
    self.cfv = pd.DataFrame(np.hstack([pd.read_csv(f).values for f in cfFiles]))
Add results for single-point methods to the report.
This method includes a table of predictions and confusion matrix plots.