Correspondence Analysis with Diodon - a tutorial

This is an elementary tutorial for Correspondence Analysis (CoA) with Diodon.

Authors

Olivier Coulaud
Alain Franc
Jean-Marc Frigerio
Rémy Peressoni
Florent Pruvost

Contact and maintainer

Alain Franc, alain.franc@inrae.fr

Version

started: October 29th, 2022
version: 22.10.29

Overview

The tutorial program for running CoA is very short and is given here. It is explained step by step throughout this notebook.

# importing library
import pydiodon as dio 
# loading dataset
infile  = "../data4tests/CoA_LMF82.txt"  # not used below; the example is loaded by name
A, rownames, colnames = dio.load_dataset("example_coa")
# running CoA
L, Y_r, Y_c = dio.coa(A)

It is followed by a few functions for plotting the results:

# plotting the results
dio.plot_coa(Y_r, Y_c, rownames=rownames, colnames=colnames)

Importing the Python version of Diodon

In [1]:
import pydiodon as dio
loading pydiodon - version 22.10.29

Loading dataset

To load the data, use the function dio.load_dataset().

The name of the example dataset for CoA is "example_coa".

In [2]:
A, rownames, colnames = dio.load_dataset("example_coa")
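
Before running CoA, it can be worth checking that the loaded table looks like a contingency table. The helper below is a minimal sketch, not part of pydiodon: the name check_contingency_table is hypothetical, it assumes that A can be converted to a 2D NumPy array and that rownames and colnames are plain sequences of labels, and it is shown on a toy table rather than on the tutorial data.

```python
import numpy as np

def check_contingency_table(A, rownames, colnames):
    """Hypothetical helper (not a pydiodon function): sanity checks before CoA."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2:
        raise ValueError("expected a 2D table")
    n_rows, n_cols = A.shape
    # one label per row and per column
    if len(rownames) != n_rows or len(colnames) != n_cols:
        raise ValueError("labels do not match the table shape")
    # CoA works on counts (or other non-negative quantities)
    if (A < 0).any():
        raise ValueError("CoA expects non-negative entries")
    # a row or column summing to zero has no defined profile
    if (A.sum(axis=1) == 0).any() or (A.sum(axis=0) == 0).any():
        raise ValueError("empty row or column")
    return n_rows, n_cols

# toy table standing in for the tutorial data (an assumption for illustration)
A_toy = np.array([[10, 5, 0],
                  [2, 8, 4]])
shape = check_contingency_table(A_toy, ["r1", "r2"], ["c1", "c2", "c3"])
print(shape)
```

With the tutorial data, the same call would be check_contingency_table(A, rownames, colnames), provided the assumptions above hold for what dio.load_dataset() returns.
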

Analyzing dataset

In [3]:
L, Y_r, Y_c = dio.coa(A, k=3)
Y_r =
[[ 0.13968116 -0.17571191  0.2411423 ]
 [ 0.03584189  0.384053    0.34891427]
 [-0.20892723 -0.18443583  0.05542058]
 [-0.32498476 -0.0213525  -0.09968779]
 [ 0.10172804 -0.05901549 -0.07291212]
 [ 0.06525407 -0.12729234 -0.01156088]
 [ 0.23654214  0.00725767 -0.00846451]
 [ 0.03834502  0.20089456  0.2400401 ]
 [-0.00843841  0.4740163  -0.1293684 ]
 [-0.20652232  0.11679963  0.12245704]]

Y_c =
[[-0.32908994 -0.17447121  0.08634824]
 [ 0.0776667  -0.13837874 -0.0592031 ]
 [-0.40695846  0.13197546 -0.26248142]
 [ 0.0634421   0.14297687  0.03874988]
 [-0.11249182  0.00842469  0.11603206]
 [ 0.30175635 -0.1391285  -0.08589898]
 [ 0.21303375 -0.16318396 -0.01532873]
 [ 0.08340193 -0.11258196  0.03133089]]
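
To make the three outputs more concrete, here is a minimal NumPy sketch of the textbook formulation of CoA through a singular value decomposition of the standardized residuals. It is an illustration under stated assumptions, not pydiodon's implementation: the function name coa_sketch and the toy table are hypothetical, and the signs and scaling of the coordinates returned by dio.coa() may differ from those produced here.

```python
import numpy as np

def coa_sketch(A, k=None):
    """Textbook CoA via SVD (illustrative sketch, not pydiodon's code)."""
    A = np.asarray(A, dtype=float)
    P = A / A.sum()                 # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    # standardized residuals with respect to the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    if k is None:
        k = min(A.shape) - 1        # at most min(n, p) - 1 non-trivial axes
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]
    L = s ** 2                      # eigenvalues (principal inertias)
    Y_r = (U * s) / np.sqrt(r)[:, None]     # row principal coordinates
    Y_c = (Vt.T * s) / np.sqrt(c)[:, None]  # column principal coordinates
    return L, Y_r, Y_c

# toy 3 x 3 contingency table (an assumption for illustration)
A_toy = np.array([[20, 5, 3],
                  [4, 15, 6],
                  [2, 7, 18]])
L_toy, Yr_toy, Yc_toy = coa_sketch(A_toy)
print(L_toy)
print(Yr_toy.shape, Yc_toy.shape)
```

In this formulation each row of Y_r holds the coordinates of one row of the table and each row of Y_c those of one column, which matches the shapes printed above: 10 rows and 8 columns, each with k=3 coordinates.
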
In [4]:
dio.plot_coa(Y_r, Y_c, rownames=rownames, colnames=colnames)
plotting axis 1 and 2 ...

Eigenvalues

The eigenvalues are returned in $L$:

In [5]:
print(L)
[0.04425764 0.02018083 0.00880238]
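
A common way to read these eigenvalues is as the share of inertia carried by each axis. The sketch below copies the three values printed above and normalizes them. Note the assumption: since only k=3 axes were computed, the shares are relative to these three axes and not to the total inertia of the table, so they overstate the true proportions.

```python
import numpy as np

# eigenvalues copied from the output of print(L) above
L = np.array([0.04425764, 0.02018083, 0.00880238])

# share of each axis among the three retained axes (not of the total inertia)
share = L / L.sum()
cumulative = np.cumsum(share)

for axis, (s, c) in enumerate(zip(share, cumulative), start=1):
    print(f"axis {axis}: {100 * s:.1f}% (cumulative {100 * c:.1f}%)")
```

Obtaining proportions of the total inertia would require all the non-trivial eigenvalues of the table, not only the first three.
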

References

For CoA:

  • L. Lebart, A. Morineau, and N. Tabard. Techniques de la description statistique. Bordas - Dunod, 1977.
  • L. Lebart, A. Morineau, and J.-P. Fénelon. Traitement des données statistiques. Dunod, Paris, 1982.
  • L. Lebart, A. Morineau, and M. Piron. Statistique exploratoire multidimensionnelle. Dunod, Paris, 2000.
  • M. Greenacre. Theory and Applications of Correspondence Analysis. Academic Press, 1984.
  • O. Nenadić and M. Greenacre. Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package. Journal of Statistical Software, 20(3):1–12, 2007.

For dataset:

Table 8, p. 306, in L. Lebart, A. Morineau, and J.-P. Fénelon (1982).

For software:

  • the Python library is called pydiodon and can be downloaded from XXXX
  • the Cpp library is called Cppdiodon and can be downloaded from XXX
  • the "binding" of Cppdiodon with pydiodon is called Cppydiodon and can be downloaded from XXX

Further documentation:

The methods used in those libraries and their pseudocodes are explained in

A. Franc. Linear Dimensionality Reduction. Inria-Inrae Research report N° 9488; 2022 ; arXiv:2209.13597 https://doi.org/10.48550/arXiv.2209.13597
