pydiodon.pca

pydiodon.pca(A, pretreatment='standard', k=-1, meth='svd')

Principal Component Analysis

Parameters
A : a numpy array

the array to be analyzed

k : integer

number of axes to be computed

meth : string

method for numerical calculation (see notes)

pretreatment : string

which pretreatment to apply; accepted values are: standard, bicentering, col_centering, row_centering, scaling (see notes for details)

Notes

The method runs as follows:

  • first, it applies the required pretreatment

  • second, it runs the function pca_core on the transformed matrix

  • third, it returns the eigenvalues, the principal axes, the principal components and, if required, the correlation matrix
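The three steps above can be sketched with NumPy alone. This is a minimal illustration, not pydiodon's implementation: the function name pca_sketch is hypothetical, only the standard pretreatment and a plain SVD are shown, and the normalisation of the eigenvalues (squared singular values, not divided by the number of rows) is an assumption.

```python
import numpy as np

def pca_sketch(A, k=-1):
    """Hypothetical stand-in for the three steps; not pydiodon's code."""
    # Step 1: pretreatment (here the 'standard' one: centre and scale columns)
    X = (A - A.mean(axis=0)) / A.std(axis=0)
    # Step 2: core computation, here a plain SVD of the transformed matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    L = s**2      # eigenvalues (assumed unnormalised)
    V = Vt.T      # principal axes, one per column
    Y = X @ V     # principal components
    # Step 3: keep the first k axes if k is positive, then return
    if k > 0:
        L, V, Y = L[:k], V[:, :k], Y[:, :k]
    return L, V, Y

rng = np.random.default_rng(0)
A = rng.random((200, 50))
L, V, Y = pca_sketch(A, k=10)
print(L.shape, V.shape, Y.shape)
```

In this sketch the principal components are mutually orthogonal and their squared norms are the eigenvalues, which is a convenient sanity check on any PCA output.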

methods for PCA: the argument meth specifies which method is selected for the core of the PCA. The default value is svd. Let A be the matrix to analyse.

  • if meth=svd, an SVD of A is run

  • if meth=grp, the SVD is run with Gaussian Random Projection

  • if meth=evd, the eigenvalues and eigenvectors of A are computed
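The three numerical routes listed above can be compared on a small example. The sketch below uses NumPy only and does not call pydiodon. It assumes that the eigendecomposition is taken on the Gram matrix A.T @ A (for a rectangular A), and that Gaussian Random Projection means projecting A onto a random Gaussian subspace before a smaller SVD. The oversampling of 10 and all variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 50))
A = A - A.mean(axis=0)              # column-centre so the result is a PCA

# Route 1: SVD, A = U diag(s) Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)
L_svd = s**2                        # eigenvalues of A.T @ A
Y_svd = U * s                       # principal components, equal to A @ Vt.T

# Route 2: eigendecomposition (assumed here to act on A.T @ A)
w, V = np.linalg.eigh(A.T @ A)      # eigh returns ascending eigenvalues
order = np.argsort(w)[::-1]         # reorder to descending
L_evd, V_evd = w[order], V[:, order]

# Route 3: Gaussian random projection followed by a smaller SVD
k = 10
Omega = rng.standard_normal((A.shape[1], k + 10))   # k plus oversampling
Q, _ = np.linalg.qr(A @ Omega)      # orthonormal basis of the sampled range
Ub, sb, Vtb = np.linalg.svd(Q.T @ A, full_matrices=False)
L_grp = sb[:k]**2                   # approximate leading eigenvalues

print(np.allclose(L_svd, L_evd))
```

Routes 1 and 2 give the same eigenvalues up to rounding. Route 3 is an approximation whose values never exceed the exact ones; it is most useful when the spectrum decays quickly, which is not the case for a uniform random matrix.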

pretreatments: here are the accepted pretreatments:

  • standard: the matrix is centered and scaled columnwise

  • bicentering: the matrix is centered rowwise and columnwise; this pretreatment, known as “double averaging”, is a useful alternative to CoA
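The two pretreatments described above can be written in a few lines of NumPy. This is a sketch under assumptions, not pydiodon's code: the function names are hypothetical, scaling is assumed to divide by the population standard deviation of each column, and bicentering is assumed to be the usual double centering (subtract row and column means, add back the grand mean).

```python
import numpy as np

def standard(A):
    # centre each column, then divide by its standard deviation
    return (A - A.mean(axis=0)) / A.std(axis=0)

def bicentering(A):
    # remove row means and column means, then add back the grand mean
    row_means = A.mean(axis=1, keepdims=True)
    col_means = A.mean(axis=0, keepdims=True)
    return A - row_means - col_means + A.mean()

rng = np.random.default_rng(0)
A = rng.random((200, 50))
S = standard(A)       # columns have mean 0 and standard deviation 1
B = bicentering(A)    # rows and columns both have mean 0
print(np.abs(S.mean(axis=0)).max(), np.abs(B.mean(axis=1)).max())
```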

Examples

This is an example of a standard PCA of a random matrix with \(m\) rows and \(n\) columns, whose elements are realisations of a uniform law between 0 and 1.

First build the random matrix

>>> import pydiodon as dio
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> m = 200 ; n = 50
>>> A = np.random.random((m,n))

Second, run the PCA

>>> L, V, Y = dio.pca(A)

Third, plot some results (eigenvalues and point cloud)

>>> plt.plot(L) ; plt.show()
>>> plt.scatter(Y[:,0],Y[:,1]) ; plt.show()

The above program runs a centered and scaled PCA, using the default option pretreatment="standard". For PCA with neither centering nor scaling, the command is

>>> L, V, Y, C = dio.pca(A, standard=False)

For PCA with column centering but without scaling, the command is

>>> L, V, Y, C = dio.pca(A, standard=False, col_centering=True) 

(In this case, the argument standard must be set to False; otherwise the array will be scaled as well.) Scaling without centering is quite unusual.

For bicentering, the command is

>>> L, V, Y, C = dio.pca(A, bicenter=True)

These are the most useful options for pretreatment.

A prescribed rank is requested (with standard pretreatment) by

>>> rank = 10
>>> L, V, Y = dio.pca(A, k=rank)

to obtain only the first 10 components and axes.

revised 21.03.03 - 21.04.20