pynir package

Submodules

pynir.Calibration module

Created on Wed Sep 28 11:00:35 2022

@author: chinn

pynir.Calibration.binaryClassificationReport(ytrue, ypred)[source]

Generate a binary classification report.

This function computes the confusion matrix for a two-class problem and derives various performance metrics from it.

Parameters

ytruenumpy.ndarray

The true target variable vector.

yprednumpy.ndarray

The predicted target variable vector.

Returns

reportdict

A dictionary containing various performance metrics.
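
As a rough illustration of what such a report can contain, the sketch below derives a few common metrics from the confusion-matrix counts using NumPy only. The function name `binary_report_sketch`, the `positive` argument, and the dictionary keys are assumptions made for this example; the keys returned by `binaryClassificationReport` itself may differ.

```python
import numpy as np

def binary_report_sketch(ytrue, ypred, positive=1):
    """Confusion-matrix counts and common metrics for a two-class problem."""
    ytrue = np.asarray(ytrue)
    ypred = np.asarray(ypred)
    tp = int(np.sum((ytrue == positive) & (ypred == positive)))
    tn = int(np.sum((ytrue != positive) & (ypred != positive)))
    fp = int(np.sum((ytrue != positive) & (ypred == positive)))
    fn = int(np.sum((ytrue == positive) & (ypred != positive)))

    def ratio(num, den):
        # Return NaN for an empty class instead of dividing by zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }

# Hypothetical labels: three positives and two negatives.
report = binary_report_sketch([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```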

class pynir.Calibration.lsvc(penalty='l2', loss='squared_hinge', *, dual=True, tol=0.0001, C=1.0, multi_class='ovr', fit_intercept=True, intercept_scaling=1, class_weight=None, verbose=0, random_state=None, max_iter=1000)[source]

Bases: LinearSVC

Linear Support Vector Classification (Linear SVC) model.

This class extends the scikit-learn LinearSVC class to include methods for finding the optimal hyperparameters and computing the confusion matrix.

Methods

get_optParams(X, y, Params=None, nfold=10, n_jobs=None)

Find the optimal hyperparameters for the model using cross-validation.

get_confusion_matrix(X, y)

Compute the confusion matrix for the model.

get_confusion_matrix(X, y)[source]

Compute the confusion matrix for the model.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Returns

cmnumpy.ndarray

The confusion matrix.

get_optParams(X, y, Params=None, nfold=10, n_jobs=None)[source]

Find the optimal hyperparameters for the model using cross-validation.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Paramsdict, optional (default = None)

The hyperparameters to search over. If None, a default set of hyperparameters is used.

nfoldint, optional (default = 10)

The number of folds to use in cross-validation.

n_jobsint, optional (default = None)

The number of parallel jobs to run. If None, all CPUs are used.

Returns

best_paramsdict

The optimal hyperparameters for the model.

class pynir.Calibration.multiClass_to_binaryMatrix[source]

Bases: object

Multi-class to binary matrix conversion.

This class is used to convert a multi-class target variable into a binary matrix suitable for training a multi-label classifier.

Methods

fit(x)

Fit the transformer to the data.

transform(x)

Transform the data into a binary matrix.

reTransform(xnew)

Convert the binary matrix back into the original target variable.

fit(x)[source]

Fit the transformer to the data.

Parameters

xnumpy.ndarray

The target variable vector.

Returns

selfobject

Returns self.

reTransform(xnew)[source]

Convert the binary matrix back into the original target variable.

Parameters

xnewnumpy.ndarray

The binary matrix.

Returns

xnumpy.ndarray

The original target variable vector.

transform(x)[source]

Transform the data into a binary matrix.

Parameters

xnumpy.ndarray

The target variable vector.

Returns

Xnewnumpy.ndarray

The binary matrix.
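
The fit, transform and reTransform steps can be pictured as one-hot encoding and decoding. The snippet below is a minimal NumPy sketch of that idea, assuming one column per class in sorted order; it does not use the class itself, and the column order used by pynir is not confirmed here.

```python
import numpy as np

labels = np.array(["low", "high", "medium", "high"])

# "fit": remember the sorted unique classes.
classes = np.unique(labels)

# "transform": one column per class, 1 where the sample belongs to it.
binary = (labels[:, None] == classes[None, :]).astype(int)

# "reTransform": pick the class with the largest entry in each row.
recovered = classes[np.argmax(binary, axis=1)]
```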

pynir.Calibration.multiClassificationReport(ytrue, ypred)[source]

Generate a classification report for a multi-class classification problem.

This function generates a classification report for a multi-class classification problem by computing binary classification reports for each class.

Parameters

ytruenumpy.ndarray

The true target variable vector.

yprednumpy.ndarray

The predicted target variable vector.

Returns

reportdict

A dictionary containing binary classification reports for each class.

pynir.Calibration.plot_confusion_matrix(cm, target_names, title='Confusion matrix', cmap=None, normalize=True)[source]

Plot a confusion matrix (cm) computed with scikit-learn.

Arguments

cm: confusion matrix from sklearn.metrics.confusion_matrix

target_names: the class labels, such as [0, 1, 2], or the class names, for example ['high', 'medium', 'low']

title: the text to display at the top of the matrix

cmap: the color gradient for the displayed values, taken from matplotlib.pyplot.cm, for example plt.get_cmap('jet') or plt.cm.Blues; see http://matplotlib.org/examples/color/colormaps_reference.html

normalize: if False, plot the raw numbers; if True, plot the proportions

Usage

plot_confusion_matrix(cm=cm,                       # confusion matrix created by sklearn.metrics.confusion_matrix
                      normalize=True,              # show proportions
                      target_names=y_labels_vals,  # list of names of the classes
                      title=best_estimator_name)   # title of graph

Citation

http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
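
To make the normalize option concrete, the sketch below converts raw counts to proportions. It assumes row-wise normalization (each row, that is each true class, sums to 1), as in the scikit-learn example cited above; this was not checked against the pynir source.

```python
import numpy as np

# Made-up 2 x 2 confusion matrix: rows are true classes, columns are predictions.
cm = np.array([[8, 2],
               [1, 9]])

# Divide each row by its total so entries become per-class proportions.
row_totals = cm.sum(axis=1, keepdims=True)
cm_normalized = cm / row_totals
```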

class pynir.Calibration.pls(n_components=2)[source]

Bases: object

Partial Least Squares (PLS) regression model.

Parameters

n_componentsint, optional (default=2)

The number of PLS components to use.

Attributes

modeldict

The PLS model, containing the following keys:

- 'x_scores': the X scores
- 'x_loadings': the X loadings
- 'y_loadings': the Y loadings
- 'x_weights': the X weights
- 'B': the regression coefficients

optLVint

The optimal number of PLS components, determined by cross-validation.

Methods

fit(X, y)

Fit the PLS model to the training data.

predict(Xnew, n_components=None)

Predict the response variable for new data.

crossValidation_predict(nfold=10)

Perform cross-validation and return the predicted response variable.

get_optLV(nfold=10)

Determine the optimal number of PLS components using cross-validation.

transform(Xnew)

Transform new data into the PLS space.

get_vip()

Compute the variable importance in projection (VIP) scores.

plot_prediction(y, yhat, xlabel="Reference", ylabel="Prediction", title="", ax=None)

Plot the predicted response variable against the reference variable.

crossValidation_predict(nfold=10)[source]

Perform cross-validation and return the predicted response variable.

Parameters

nfoldint, optional (default=10)

The number of folds to use in cross-validation.

Returns

yhatnumpy.ndarray

The predicted response variable.

fit(X, y)[source]

Fit the PLS model to the training data.

Parameters

Xnumpy.ndarray

The independent variable matrix.

ynumpy.ndarray

The dependent variable vector.

Returns

selfpls

The fitted PLS model.

get_optLV(nfold=10)[source]

Determine the optimal number of PLS components using cross-validation.

Parameters

nfoldint, optional (default=10)

The number of folds to use in cross-validation.

Returns

optLVint

The optimal number of PLS components.
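
One common way to pick the number of components is to take the one with the lowest cross-validated RMSE. The sketch below shows that rule, assuming the cross-validated predictions are arranged with one column per component count. Both the layout and the selection rule are assumptions for illustration and may not match what `get_optLV` does internally; `optimal_components_sketch` is a hypothetical helper.

```python
import numpy as np

def optimal_components_sketch(y, yhat_cv):
    """Return the 1-based component count with the lowest RMSECV."""
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    yhat_cv = np.asarray(yhat_cv, dtype=float)
    # One RMSECV value per column, i.e. per number of components.
    rmsecv = np.sqrt(np.mean((yhat_cv - y) ** 2, axis=0))
    return int(np.argmin(rmsecv)) + 1, rmsecv

# Made-up predictions for models with 1, 2 and 3 components.
y = np.array([1.0, 2.0, 3.0, 4.0])
yhat_cv = np.column_stack([y + 0.5, y + 0.1, y - 0.3])
opt_lv, rmsecv = optimal_components_sketch(y, yhat_cv)
```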

get_vip()[source]

Compute the variable importance in projection (VIP) scores.

Returns

vipScorenumpy.ndarray

The VIP scores.

References

https://www.sciencedirect.com/topics/engineering/variable-importance-in-projection
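
For orientation, the textbook VIP formula weights each variable's squared normalized weight by the response variance explained by each component. The sketch below implements that formula from the scores, weights and y loadings; `vip_sketch` and its argument layout are assumptions for illustration, not the signature of `get_vip`.

```python
import numpy as np

def vip_sketch(T, W, q):
    """VIP scores from X scores T (n x A), X weights W (p x A), y loadings q (A,)."""
    T = np.asarray(T, dtype=float)
    W = np.asarray(W, dtype=float)
    q = np.asarray(q, dtype=float).ravel()
    p = W.shape[0]
    # Response variance explained by each component.
    ss = (q ** 2) * np.sum(T ** 2, axis=0)
    # Normalize each weight vector to unit length.
    w_norm = W / np.linalg.norm(W, axis=0, keepdims=True)
    return np.sqrt(p * ((w_norm ** 2) @ ss) / ss.sum())

# Random stand-ins for a fitted model with 6 variables and 2 components.
rng = np.random.default_rng(0)
vip = vip_sketch(rng.normal(size=(20, 2)), rng.normal(size=(6, 2)), rng.normal(size=2))
```

By construction the squared VIP scores average to 1, which is why a threshold near 1 is often used to flag influential variables.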

plot_prediction(y, yhat, xlabel='Reference', ylabel='Prediction', title='', ax=None)[source]

Plot the predicted response variable against the reference variable.

Parameters

ynumpy.ndarray

The reference variable.

yhatnumpy.ndarray

The predicted response variable.

xlabelstr, optional (default="Reference")

The label for the x-axis.

ylabelstr, optional (default="Prediction")

The label for the y-axis.

titlestr, optional (default="")

The title for the plot.

axmatplotlib.axes.Axes, optional

The axes on which to plot the figure (default is None).

Returns

axmatplotlib.axes.Axes

The axes object containing the plotted figure.

predict(Xnew, n_components=None)[source]

Predict the response variable for new data.

Parameters

Xnewnumpy.ndarray

The new independent variable matrix.

n_componentsint, optional

The number of PLS components to use (default is None, which uses all components).

Returns

ynew_hatnumpy.ndarray

The predicted response variable.

transform(Xnew)[source]

Transform new data into the PLS space.

Parameters

Xnewnumpy.ndarray

The new independent variable matrix.

Returns

Tnewnumpy.ndarray

The transformed data.

class pynir.Calibration.plsda(n_components=2, scale=True, **kwargs)[source]

Bases: PLSRegression

Partial Least Squares Discriminant Analysis (PLS-DA) model.

This class extends the scikit-learn PLSRegression class to include Linear Discriminant Analysis (LDA) for classification.

Parameters

n_componentsint, optional (default = 2)

Number of components to keep in the model.

scalebool, optional (default = True)

Whether to scale the data before fitting the model.

**kwargsdict, optional

Additional keyword arguments to pass to the PLSRegression constructor.

Attributes

ldaLinearDiscriminantAnalysis

The LDA model used for classification.

Methods

fit(X, y)

Fit the PLS-DA model to the training data.

predict(X)

Predict the class labels for new data.

predict_log_proba(X)

Predict the log probabilities of the class labels for new data.

predict_proba(X)

Predict the probabilities of the class labels for new data.

crossValidation_predict(nfold=10)

Perform cross-validation to predict the class labels for the training data.

get_optLV(nfold=10)

Find the optimal number of components using cross-validation.

get_confusion_matrix(X, y)

Compute the confusion matrix for the model.

get_vip()

Compute the Variable Importance in Projection (VIP) scores for the model.

permutation_test(X, y, n_repeats=100, n_jobs=None)

Perform a permutation test to assess the significance of the model.

crossValidation_predict(nfold=10)[source]

Perform cross-validation to predict the class labels for the training data.

Parameters

nfoldint, optional (default = 10)

The number of folds to use in cross-validation.

Returns

yhatnumpy.ndarray

The predicted class labels for each fold and each number of components.

fit(X, y)[source]

Fit the PLS-DA model to the training data.

Parameters

Xnumpy.ndarray

The training data matrix.

ynumpy.ndarray

The target variable vector.

Returns

selfplsda

The fitted PLS-DA model.

get_confusion_matrix(X, y)[source]

Compute the confusion matrix for the model.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Returns

cmnumpy.ndarray

The confusion matrix.

get_optLV(nfold=10)[source]

Find the optimal number of components using cross-validation.

Parameters

nfoldint, optional (default = 10)

The number of folds to use in cross-validation.

Returns

optLVint

The optimal number of components.

get_vip()[source]

Compute the Variable Importance in Projection (VIP) scores for the model.

Returns

vipScorenumpy.ndarray

The VIP scores.

permutation_test(X, y, n_repeats=100, n_jobs=None)[source]

Perform a permutation test to assess the significance of the model.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

n_repeatsint, optional (default = 100)

The number of permutations to perform.

n_jobsint, optional (default = None)

The number of parallel jobs to run. If None, all CPUs are used.

Returns

q2numpy.ndarray

The Q2 values for each permutation.

r2numpy.ndarray

The R2 values for each permutation.

permutation_rationumpy.ndarray

The ratio of permuted target variable values to total target variable values for each permutation.

predict(X)[source]

Predict the class labels for new data.

Parameters

Xnumpy.ndarray

The new data matrix.

Returns

y_prednumpy.ndarray

The predicted class labels.

predict_log_proba(X)[source]

Predict the log probabilities of the class labels for new data.

Parameters

Xnumpy.ndarray

The new data matrix.

Returns

log_probanumpy.ndarray

The log probabilities of the class labels.

predict_proba(X)[source]

Predict the probabilities of the class labels for new data.

Parameters

Xnumpy.ndarray

The new data matrix.

Returns

probanumpy.ndarray

The probabilities of the class labels.

pynir.Calibration.regressionReport(ytrue, ypred)[source]

Generate a regression report.

This function generates a regression report for a regression problem by computing the root mean squared error (RMSE) and the R-squared (R2) score.

Parameters

ytruenumpy.ndarray

The true target variable vector.

yprednumpy.ndarray

The predicted target variable vector.

Returns

reportdict

A dictionary containing the RMSE and R2 score.
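
The two metrics can be written out directly. The sketch below computes them with NumPy; the function name and the dictionary keys "rmse" and "r2" are assumptions for this example and may not match the keys used by `regressionReport`.

```python
import numpy as np

def regression_report_sketch(ytrue, ypred):
    """RMSE and coefficient of determination for one response."""
    ytrue = np.asarray(ytrue, dtype=float).ravel()
    ypred = np.asarray(ypred, dtype=float).ravel()
    residual = ytrue - ypred
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    # R2 = 1 - SS_res / SS_tot
    r2 = float(1.0 - np.sum(residual ** 2) / np.sum((ytrue - ytrue.mean()) ** 2))
    return {"rmse": rmse, "r2": r2}

report = regression_report_sketch([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```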

class pynir.Calibration.rf(n_estimators=100, *, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='sqrt', max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None, ccp_alpha=0.0, max_samples=None)[source]

Bases: RandomForestClassifier

Random Forest Classification (RF) model.

This class extends the scikit-learn RandomForestClassifier class to include methods for finding the optimal hyperparameters and computing the confusion matrix.

Methods

get_optParams(X, y, Params=None, nfold=10, n_jobs=None)

Find the optimal hyperparameters for the model using cross-validation.

get_confusion_matrix(X, y)

Compute the confusion matrix for the model.

get_confusion_matrix(X, y)[source]

Compute the confusion matrix for the model.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Returns

cmnumpy.ndarray

The confusion matrix.

get_optParams(X, y, Params=None, nfold=10, n_jobs=None)[source]

Find the optimal hyperparameters for the model using cross-validation.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Paramsdict, optional (default = None)

The hyperparameters to search over. If None, a default set of hyperparameters is used.

nfoldint, optional (default = 10)

The number of folds to use in cross-validation.

n_jobsint, optional (default = None)

The number of parallel jobs to run. If None, all CPUs are used.

Returns

best_paramsdict

The optimal hyperparameters for the model.

pynir.Calibration.sampleSplit_KS(X, test_size=0.25, metric='euclidean', *args, **kwargs)[source]

Split a dataset into training and testing sets using the Kennard-Stone (KS) algorithm.

This function splits a dataset into training and testing sets using the Kennard-Stone (KS) algorithm, which repeatedly selects the point whose minimum distance to the previously selected points is largest.

Parameters

Xnumpy.ndarray

The dataset to split.

test_sizefloat, optional

The proportion of the dataset to include in the test split.

metricstr, optional

The distance metric to use for computing distances between points.

*argstuple

Additional arguments to pass to the distance metric function.

**kwargsdict

Additional keyword arguments to pass to the distance metric function.

Returns

trainIdxnumpy.ndarray

The indices of the training set.

testIdxnumpy.ndarray

The indices of the testing set.

Notes

This implementation is based on the algorithm described in: R. W. Kennard and L. A. Stone, “Computer aided design of experiments,” Technometrics, vol. 11, no. 1, pp. 137-148, 1969.
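
The max-min selection rule described above can be sketched in a few lines of NumPy. This assumes that "KS" refers to Kennard-Stone selection with Euclidean distances and that the first two picks are the two most distant samples; `kennard_stone_sketch` is a hypothetical helper, not the library function.

```python
import numpy as np

def kennard_stone_sketch(X, n_train):
    """Return (train_idx, test_idx) chosen by the max-min distance rule."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two samples that are farthest apart.
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(X.shape[0]) if i not in selected]
    for _ in range(n_train - len(selected)):
        # Distance from each candidate to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Take the candidate whose nearest selected neighbour is farthest away.
        selected.append(remaining.pop(int(np.argmax(nearest))))
    return np.array(selected), np.array(remaining)

# Five one-dimensional samples; 0.0 and 10.0 are the extremes.
X = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
train_idx, test_idx = kennard_stone_sketch(X, n_train=3)
```

The full distance matrix keeps the sketch short but needs memory quadratic in the number of samples.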

pynir.Calibration.sampleSplit_random(X, test_size=0.25, random_state=1, shuffle=False)[source]

Randomly split a dataset into training and testing sets.

This function randomly splits a dataset into training and testing sets using the train_test_split function from scikit-learn.

Parameters

Xnumpy.ndarray

The dataset to split.

test_sizefloat, optional

The proportion of the dataset to include in the test split.

random_stateint, optional

The random seed to use for reproducibility.

shufflebool, optional

Whether or not to shuffle the dataset before splitting.

Returns

trainIdxnumpy.ndarray

The indices of the training set.

testIdxnumpy.ndarray

The indices of the testing set.

pynir.Calibration.simpls(X, y, n_components)[source]

Perform SIMPLS (Partial Least Squares) regression.

This function performs SIMPLS regression, which is a variant of PLS regression that uses a sequential algorithm to compute the PLS components.

Parameters

Xnumpy.ndarray

The independent variable matrix.

ynumpy.ndarray

The dependent variable vector.

n_componentsint

The number of PLS components to compute.

Returns

resultsdict

A dictionary containing the PLS components and loadings.

Notes

This implementation is based on the algorithm described in: Wold, S., Ruhe, A., Wold, H., & Dunn III, W. J. (1984). The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing, 5(3), 735-743.
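
For readers who want to see the sequential computation, here is a compact single-response sketch of the SIMPLS recursion as it is usually presented (de Jong, 1993). It is an illustration under that assumption, not the pynir source; `simpls_sketch` and its return values are hypothetical and differ from the `results` dictionary described above.

```python
import numpy as np

def simpls_sketch(X, y, n_components):
    """Single-response SIMPLS; returns (coef, intercept) for raw X and y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_features = X.shape[1]
    R = np.zeros((n_features, n_components))  # X weights
    V = np.zeros((n_features, n_components))  # orthonormal basis of the X loadings
    q = np.zeros(n_components)                # y loadings
    s = Xc.T @ yc                             # cross-product vector
    for a in range(n_components):
        r = s.copy()
        t = Xc @ r
        norm_t = np.linalg.norm(t)
        t, r = t / norm_t, r / norm_t         # scale so the score has unit length
        p = Xc.T @ t                          # X loading
        q[a] = yc @ t
        v = p - V[:, :a] @ (V[:, :a].T @ p)   # orthogonalize against earlier loadings
        v /= np.linalg.norm(v)
        s = s - v * (v @ s)                   # deflate the cross-product vector
        R[:, a], V[:, a] = r, v
    coef = R @ q
    intercept = y.mean() - X.mean(axis=0) @ coef
    return coef, intercept

# Synthetic data: three predictors, a known linear response, small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=30)
coef, intercept = simpls_sketch(X, y, n_components=3)
```

With as many components as predictors, the result coincides with ordinary least squares, which is a convenient sanity check.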

class pynir.Calibration.svc(*, C=1.0, kernel='rbf', degree=3, gamma='scale', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', break_ties=False, random_state=None)[source]

Bases: SVC

Support Vector Classification (SVC) model.

This class extends the scikit-learn SVC class to include methods for finding the optimal hyperparameters and computing the confusion matrix.

Methods

get_optParams(X, y, Params=None, nfold=10, n_jobs=None)

Find the optimal hyperparameters for the model using cross-validation.

get_confusion_matrix(X, y)

Compute the confusion matrix for the model.

get_confusion_matrix(X, y)[source]

Compute the confusion matrix for the model.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Returns

cmnumpy.ndarray

The confusion matrix.

get_optParams(X, y, Params=None, nfold=10, n_jobs=None)[source]

Find the optimal hyperparameters for the model using cross-validation.

Parameters

Xnumpy.ndarray

The data matrix.

ynumpy.ndarray

The target variable vector.

Paramsdict, optional (default = None)

The hyperparameters to search over. If None, a default set of hyperparameters is used.

nfoldint, optional (default = 10)

The number of folds to use in cross-validation.

n_jobsint, optional (default = None)

The number of parallel jobs to run. If None, all CPUs are used.

Returns

best_paramsdict

The optimal hyperparameters for the model.

pynir.CalibrationTransfer module

Created on Wed Sep 28 11:00:35 2022

@author: chinn

class pynir.CalibrationTransfer.BS[source]

Bases: object

Implementation of the Osborne and Fearn Back-Shift (BS) method for spectral calibration.

This class implements the BS algorithm for spectral calibration, which is a method for transferring calibration models between instruments with different spectral characteristics.

Notes

Osborne, B. G., & Fearn, T. (1983). Collaborative evaluation of universal calibrations for the measurement of protein and moisture in flour by near infrared reflectance. International Journal of Food Science & Technology, 18(4), 453-460.

fit(y1, y2)[source]

Fit the BS model to the training data.

Parameters

y1numpy.ndarray

The predictions of standard spectra from the master instrument.

y2numpy.ndarray

The predictions of standard spectra from the slave instrument.

Returns

selfBS

The fitted BS model.

transform(y2)[source]

Apply the BS model to new predictions for spectra measured on the slave instrument.

Parameters

y2numpy.ndarray

The predictions of spectra measured on the slave instrument.

Returns

y1numpy.ndarray

The corrected predictions of the reference values.
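
A correction of this kind is often a simple straight-line fit between the two sets of predictions. The sketch below assumes exactly that, a univariate slope and bias correction, which was not verified against the pynir source. All numbers are made up.

```python
import numpy as np

# Predictions for the same standard samples on both instruments (made-up numbers).
y1 = np.array([10.0, 12.0, 14.0, 16.0])   # master
y2 = np.array([10.8, 12.9, 15.0, 17.1])   # slave

# "fit": least-squares line mapping slave predictions onto master predictions.
slope, bias = np.polyfit(y2, y1, deg=1)

# "transform": correct new slave predictions with the fitted line.
y2_new = np.array([11.85, 13.95])
y1_corrected = slope * y2_new + bias
```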

class pynir.CalibrationTransfer.FS_PFCE(thres=0.98, constrType=1)[source]

Bases: object

Full-Supervised Parameter-Free Calibration Enhancement (FS-PFCE) framework for spectral calibration enhancement.

This class implements the FS-PFCE algorithm for spectral calibration enhancement, which is a method for transferring calibration models between instruments with different spectral characteristics.

Parameters

thresfloat, optional

The threshold for the constraint.

constrTypeint, optional

The type of constraint to use for optimization.

Attributes

b2numpy.ndarray

The coefficients learned from the training data.

Notes

This implementation is based on the algorithm described in: [1] Zhang J., Zhou X, Li B. Y.*, PFCE2: A versatile parameter-free calibration enhancement framework for near-infrared spectroscopy, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2023, 301: 122978. [2] Zhang J., Li B. Y.*, Hu Y., Zhou L. X., Wang G. Z., Guo G., Zhang Q. H., Lei S. C.*, Zhang A. H., A Parameter-Free Framework for Calibration Enhancement of Near-Infrared Spectroscopy Based on Correlation Constraint, Anal. Chim. Acta, 2021, 1142: 169-178.

fit(X1, X2, y, b1)[source]

Fit the FS-PFCE model to the training data.

Parameters

X1numpy.ndarray

The standard spectra of the master instrument.

X2numpy.ndarray

The standard spectra of the slave instrument.

ynumpy.ndarray

The reference values of the slave instrument.

b1numpy.ndarray

The coefficients learned from the master instrument.

Returns

selfFS_PFCE

The fitted FS-PFCE model for the slave instrument.

transform(X)[source]

Apply the FS-PFCE model to spectra measured on slave instruments.

Parameters

Xnumpy.ndarray

The spectra to calibrate.

Returns

ynumpy.ndarray

The predictions for the slave-instrument spectra, made with the FS-PFCE enhanced model.

class pynir.CalibrationTransfer.MT_PFCE(thres=0.98, constrType=1)[source]

Bases: object

Multi-Task Parameter-Free Calibration Enhancement (MT-PFCE) framework for spectral calibration enhancement.

This class implements the MT-PFCE algorithm for spectral calibration enhancement, which is a method for transferring calibration models between instruments with different spectral characteristics.

Parameters

thresfloat, optional

The threshold for the constraint.

constrTypeint, optional

The type of constraint to use for optimization.

Attributes

Bnumpy.ndarray

The coefficients learned from the training data.

ntaskint

The number of tasks.

Notes

This implementation is based on the algorithm described in: [1] Zhang J., Zhou X, Li B. Y.*, PFCE2: A versatile parameter-free calibration enhancement framework for near-infrared spectroscopy, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2023, 301: 122978. [2] Zhang J., Li B. Y.*, Hu Y., Zhou L. X., Wang G. Z., Guo G., Zhang Q. H., Lei S. C.*, Zhang A. H., A Parameter-Free Framework for Calibration Enhancement of Near-Infrared Spectroscopy Based on Correlation Constraint, Anal. Chim. Acta, 2021, 1142: 169-178.

fit(X, y, b1)[source]

Fit the MT-PFCE model to the training data.

Parameters

Xlist of numpy.ndarray

The standard spectra of the master instrument for each task.

ylist of numpy.ndarray

The reference values of the slave instrument for each task.

b1numpy.ndarray

The coefficients learned from the master instrument.

Returns

selfMT_PFCE

The fitted MT-PFCE model for the slave instrument.

transform(X, itask)[source]

Apply the MT-PFCE model to spectra measured on slave instruments.

Parameters

Xnumpy.ndarray

The spectra to calibrate.

itaskint

The index of the task to apply the model to.

Returns

ynumpy.ndarray

The predictions for spectra from the task with index itask, made with the MT-PFCE enhanced model.

class pynir.CalibrationTransfer.NS_PFCE(thres=0.98, constrType=1)[source]

Bases: object

Non-Supervised Parameter-Free Calibration Enhancement (NS-PFCE) framework for spectral calibration enhancement.

This class implements the NS-PFCE algorithm for spectral calibration enhancement, which is a method for transferring calibration models between instruments with different spectral characteristics.

Parameters

thresfloat, optional

The threshold for the constraint.

constrTypeint, optional

The type of constraint to use for optimization.

Attributes

b2numpy.ndarray

The coefficients learned from the training data.

Notes

This implementation is based on the algorithm described in: [1] Zhang J., Zhou X, Li B. Y.*, PFCE2: A versatile parameter-free calibration enhancement framework for near-infrared spectroscopy, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2023, 301: 122978. [2] Zhang J., Li B. Y.*, Hu Y., Zhou L. X., Wang G. Z., Guo G., Zhang Q. H., Lei S. C.*, Zhang A. H., A Parameter-Free Framework for Calibration Enhancement of Near-Infrared Spectroscopy Based on Correlation Constraint, Anal. Chim. Acta, 2021, 1142: 169-178.

fit(X1, X2, b1)[source]

Fit the NS-PFCE model to the training data.

Parameters

X1numpy.ndarray

The standard spectra of the master instrument.

X2numpy.ndarray

The standard spectra of the slave instrument.

b1numpy.ndarray

The coefficients learned from the master instrument.

Returns

selfNS_PFCE

The fitted NS-PFCE model for the slave instrument.

transform(X)[source]

Apply the NS-PFCE model to spectra measured on slave instruments.

Parameters

Xnumpy.ndarray

The spectra to calibrate.

Returns

ynumpy.ndarray

The predictions for the slave-instrument spectra, made with the NS-PFCE enhanced model.
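
The sketch below illustrates the kind of objective and constraint that thres refers to, based on a reading of the PFCE papers cited in the Notes: the slave coefficients are chosen so that slave predictions match master predictions on the standard spectra, while the correlation between the two coefficient vectors stays at or above the threshold. The function names, argument order, and the omission of an intercept term are simplifying assumptions, not the library's `cost_NS_PFCE` or `pfce_constr1`.

```python
import numpy as np

def cost_ns_sketch(b2, b1, X1, X2):
    """Squared difference between master and slave predictions."""
    return float(np.sum((X1 @ b1 - X2 @ b2) ** 2))

def corr_constraint_sketch(b2, b1, thres):
    """Non-negative when corr(b1, b2) is at least thres."""
    return float(np.corrcoef(b1, b2)[0, 1] - thres)

# Synthetic case: the slave reads twice the master signal,
# so halving the master coefficients reproduces its predictions.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(5, 3))
X2 = 2.0 * X1
b1 = np.array([1.0, 2.0, 3.0])
b2 = b1 / 2.0
cost = cost_ns_sketch(b2, b1, X1, X2)
margin = corr_constraint_sketch(b2, b1, thres=0.98)
```

A constrained optimizer would minimize the cost over b2 while keeping the margin non-negative.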

class pynir.CalibrationTransfer.PDS(halfWindowSize=7, regType='mlr', **kwargs)[source]

Bases: object

Piecewise Direct Standardization (PDS) for spectral calibration.

This class implements the PDS algorithm for spectral calibration, which is a method for transferring calibration models between instruments with different spectral characteristics.

Parameters

halfWindowSizeint, optional

The half window size for selecting the spectral bands.

regTypestr, optional

The regression type to use for modeling the calibration transfer function.

**kwargsdict, optional

Additional keyword arguments to pass to the regression model.

Attributes

Modelslist

A list of regression models for each spectral band.

Notes

This implementation is based on the algorithm described in: Wang, Y., Veltkamp, D. J., & Kowalski, B. R. (1991). Multivariate instrument standardization. Analytical Chemistry, 63(23), 2750-2756.

fit(X1, X2)[source]

Fit the PDS model to the training data.

Parameters

X1numpy.ndarray

The standard spectra of the master instrument.

X2numpy.ndarray

The standard spectra of the slave instrument.

Returns

selfPDS

The fitted PDS model.

get_windowRange(nFeatrues, i)[source]

transform(X)[source]

Apply the PDS model to new data.

Parameters

Xnumpy.ndarray

The spectra to calibrate.

Returns

Xnewnumpy.ndarray

The calibrated spectra.
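
The windowed idea behind PDS can be sketched as one small regression per spectral channel: each master channel is modeled from the slave channels inside a window centered on it. The code below uses ordinary least squares with an intercept, loosely corresponding to regType='mlr'. The helper names and the edge handling of the window are assumptions, not the behavior of `get_windowRange`.

```python
import numpy as np

def pds_fit_sketch(X1, X2, half_window=1):
    """One local least-squares model (with intercept) per master channel."""
    n_samples, n_features = X1.shape
    models = []
    for i in range(n_features):
        # Clip the window at the spectrum edges.
        lo = max(0, i - half_window)
        hi = min(n_features, i + half_window + 1)
        A = np.column_stack([np.ones(n_samples), X2[:, lo:hi]])
        coef, *_ = np.linalg.lstsq(A, X1[:, i], rcond=None)
        models.append((lo, hi, coef))
    return models

def pds_transform_sketch(X, models):
    """Map slave spectra onto the master instrument channel by channel."""
    Xnew = np.empty((X.shape[0], len(models)))
    for i, (lo, hi, coef) in enumerate(models):
        Xnew[:, i] = coef[0] + X[:, lo:hi] @ coef[1:]
    return Xnew

rng = np.random.default_rng(0)
X2 = rng.normal(size=(15, 6))   # slave spectra (synthetic)
X1 = 2.0 * X2 + 0.1             # master spectra: scaled and offset copy
models = pds_fit_sketch(X1, X2, half_window=1)
X2_as_master = pds_transform_sketch(X2, models)
```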

class pynir.CalibrationTransfer.SST(n_components=2)[source]

Bases: object

Spectral Space Transformation (SST) for spectral calibration.

This class implements the SST algorithm for spectral calibration, which is a method for transferring calibration models between instruments with different spectral characteristics.

Parameters

n_componentsint, optional

The number of components to use for the truncated SVD.

Attributes

Fnumpy.ndarray

The transformation matrix learned from the training data.

Notes

This implementation is based on the algorithm described in: Du W, Chen Z P, Zhong L J, et al. Maintaining the predictive abilities of multivariate calibration models by spectral space transformation[J]. Analytica Chimica Acta, 2011, 690(1): 64-70.

fit(X1, X2)[source]

Parameters

X1: 2D array

Standard spectra of the first instrument.

X2: 2D array

Standard spectra of the second instrument.
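
Because fit and transform are undocumented here, the following sketch shows how a spectral space transformation matrix is commonly built. It takes the leading right singular vectors of the side-by-side matrix [X1, X2], splits them into master and slave loadings, and forms a matrix F so that X2 @ F approximates X1. This reflects a reading of the method in the cited paper rather than the pynir source; the helper name and the lack of mean-centering are assumptions.

```python
import numpy as np

def sst_fit_sketch(X1, X2, n_components):
    """Transformation matrix F such that X2 @ F approximates X1."""
    n_features = X1.shape[1]
    # Principal directions of the side-by-side matrix [X1, X2].
    _, _, Vt = np.linalg.svd(np.hstack([X1, X2]), full_matrices=False)
    P1t = Vt[:n_components, :n_features]   # loadings for the master block
    P2t = Vt[:n_components, n_features:]   # loadings for the slave block
    return np.eye(n_features) + np.linalg.pinv(P2t) @ (P1t - P2t)

# Synthetic rank-2 spectra: the same samples measured on two instruments.
rng = np.random.default_rng(0)
scores = rng.normal(size=(12, 2))
X1 = scores @ rng.normal(size=(2, 5))   # master
X2 = scores @ rng.normal(size=(2, 5))   # slave
F = sst_fit_sketch(X1, X2, n_components=2)
X2_as_master = X2 @ F
```

On exactly low-rank data such as this, the transformed slave spectra reproduce the master spectra; on real spectra the match is only approximate.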

transform(X)[source]

class pynir.CalibrationTransfer.SS_PFCE(thres=0.98, constrType=1)[source]

Bases: object

Semi-Supervised Parameter-Free Calibration Enhancement (SS-PFCE) framework for spectral calibration enhancement.

This class implements the SS-PFCE algorithm for spectral calibration enhancement, which is a method for transferring calibration models between instruments with different spectral characteristics.

Parameters

thresfloat, optional

The threshold for the constraint.

constrTypeint, optional

The type of constraint to use for optimization.

Attributes

b2numpy.ndarray

The coefficients learned from the training data.

Notes

This implementation is based on the algorithm described in: [1] Zhang J., Zhou X, Li B. Y.*, PFCE2: A versatile parameter-free calibration enhancement framework for near-infrared spectroscopy, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 2023, 301: 122978. [2] Zhang J., Li B. Y.*, Hu Y., Zhou L. X., Wang G. Z., Guo G., Zhang Q. H., Lei S. C.*, Zhang A. H., A Parameter-Free Framework for Calibration Enhancement of Near-Infrared Spectroscopy Based on Correlation Constraint, Anal. Chim. Acta, 2021, 1142: 169-178.

fit(X2, y, b1)[source]

Fit the SS-PFCE model to the training data.

Parameters

X2numpy.ndarray

The standard spectra of the slave instrument.

ynumpy.ndarray

The reference values of the slave instrument.

b1numpy.ndarray

The coefficients learned from the master instrument.

Returns

selfSS_PFCE

The fitted SS-PFCE model for the slave instrument.

transform(X)[source]

Apply the SS-PFCE model to spectra measured on slave instruments.

Parameters

Xnumpy.ndarray

The spectra to calibrate.

Returns

ynumpy.ndarray

The predictions for the slave-instrument spectra, made with the SS-PFCE enhanced model.

pynir.CalibrationTransfer.cost_FS_PFCE(b1, b2, X1, X2, y)[source]

pynir.CalibrationTransfer.cost_MT_PFCE(B, X, y)[source]

pynir.CalibrationTransfer.cost_NS_PFCE(b1, b2, X1, X2)[source]

pynir.CalibrationTransfer.cost_SS_PFCE(b2, X2, y)[source]

pynir.CalibrationTransfer.pfce_constr1(B, thres)[source]

pynir.CalibrationTransfer.pfce_constr2(B, thres)[source]

pynir.CalibrationTransfer.pfce_constr3(B, thres)[source]

pynir.FeatureSelection module

pynir.OutlierDection module

pynir.Preprocessing module

pynir.utils module

Created on Wed Sep 28 11:02:36 2022

@author: Jin Zhang (zhangjin@mail.nankai.edu.cn)

pynir.utils.simulateNIR(nSample=100, n_components=3, refType=1, noise=0.0, seeds=1)[source]

Simulate NIR spectra.

Parameters

nSampleint, optional

The number of samples to simulate. The default is 100.

n_componentsint, optional

The number of components for spectral simulation. The default is 3.

refTypeint, optional

The type of reference value to generate:

- None for no reference value output
- 1 for continuous values as reference value output
- 2 or a larger integer for binary or class output

The default is 1.

noisefloat, optional

The amount of noise to add to the simulated spectra. The default is 0.0.

seedsint, optional

The random seed for generating spectra and reference values. The default is 1.

Returns

Xnumpy.ndarray

The simulated NIR spectra matrix.

ynumpy.ndarray

The concentration or class of all samples.

wvnumpy.ndarray

The wavelengths of the spectra.
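
To give a feel for what a simulator like this produces, the sketch below mixes a few Gaussian bands with random concentrations and adds optional noise. The wavelength range, band shapes, and the choice of the first component as the reference value are all assumptions for illustration; `simulate_spectra_sketch` is not the pynir function and will not reproduce its output.

```python
import numpy as np

def simulate_spectra_sketch(n_samples=100, n_components=3, noise=0.0, seed=1):
    """Mix Gaussian bands with random concentrations (illustrative only)."""
    rng = np.random.default_rng(seed)
    wv = np.linspace(1000.0, 2500.0, 200)   # wavelength axis in nm (assumed range)
    centers = rng.uniform(1100.0, 2400.0, n_components)
    widths = rng.uniform(30.0, 120.0, n_components)
    # One Gaussian band per pure component, shape (n_components, n_wavelengths).
    pure = np.exp(-0.5 * ((wv[None, :] - centers[:, None]) / widths[:, None]) ** 2)
    conc = rng.uniform(0.0, 1.0, (n_samples, n_components))
    X = conc @ pure + noise * rng.normal(size=(n_samples, wv.size))
    y = conc[:, 0]   # first component used as the reference value
    return X, y, wv

X, y, wv = simulate_spectra_sketch(n_samples=10, n_components=3, noise=0.01, seed=1)
```

Fixing the seed makes repeated calls return identical arrays, which mirrors the purpose of the seeds parameter.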

pynir.utils.simulateNIR_calibrationTransfer(nSample=100, n_components=3, shifts=0.01, refType=1, noise=0.0, seeds=1)[source]

Simulate NIR spectra for calibration transfer.

Parameters

nSampleint, optional

The number of samples to simulate. The default is 100.

n_componentsint, optional

The number of components for spectral simulation. The default is 3.

shifts: float, optional

The shift level of base peaks for simulating secondary NIR spectra data.

refTypeint, optional

The type of reference value to generate:

- None for no reference value output
- 1 for continuous values as reference value output
- 2 or a larger integer for binary or class output

The default is 1.

noisefloat, optional

The amount of noise to add to the simulated spectra. The default is 0.0.

seedsint, optional

The random seed for generating spectra and reference values. The default is 1.

Returns

X1numpy.ndarray

The simulated NIR spectra matrix for the first set of spectra.

X2numpy.ndarray

The simulated NIR spectra matrix for the second set of spectra.

ynumpy.ndarray

The concentration or class of all samples.

wvnumpy.ndarray

The wavelengths of the spectra.

Module contents