Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B

Variable types

NUM: 5

Reproduction

Analysis started: 2020-03-12 06:05:00.839849
Analysis finished: 2020-03-12 06:05:06.158345
Version: pandas-profiling v2.5.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Variables

a
Real number (ℝ≥0)

UNIQUE
Distinct count: 100
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5217951022108731
Minimum: 0.014636384701227967
Maximum: 0.9982151573421486
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.0146363847
5-th percentile: 0.07577299662
Q1: 0.3273466411
Median: 0.5130468168
Q3: 0.7195448121
95-th percentile: 0.9625653707
Maximum: 0.9982151573
Range: 0.9835787726
Interquartile range (IQR): 0.392198171

Descriptive statistics

Standard deviation: 0.268149702
Coefficient of variation (CV): 0.5138984649
Kurtosis: -0.9704911598
Mean: 0.5217951022
Median Absolute Deviation (MAD): 0.2229684747
Skewness: -0.06856949839
Sum: 52.17951022
Variance: 0.07190426268
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.01463638 0.99821516], "bayesian blocks" binning strategy used)
Value              Count  Frequency (%)
0.1651171697       1      1.0%
0.1480252679       1      1.0%
0.5940523526       1      1.0%
0.5595833432       1      1.0%
0.6808084366       1      1.0%
0.5472695527       1      1.0%
0.8730897958       1      1.0%
0.1140745295       1      1.0%
0.9624279049       1      1.0%
0.4948157306       1      1.0%
Other values (90)  90     90.0%

Value              Count  Frequency (%)
0.0146363847       1      1.0%
0.04021255319      1      1.0%
0.05132556595      1      1.0%
0.05783541219      1      1.0%
0.06253453379      1      1.0%

Value              Count  Frequency (%)
0.9982151573       1      1.0%
0.9831905776       1      1.0%
0.9780570719       1      1.0%
0.9687640232       1      1.0%
0.965177221        1      1.0%

b
Real number (ℝ≥0)

UNIQUE
Distinct count: 100
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5365352442023252
Minimum: 0.01416011603044065
Maximum: 0.9659268412154978
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.01416011603
5-th percentile: 0.05136493031
Q1: 0.3077798149
Median: 0.5405829856
Q3: 0.7880516894
95-th percentile: 0.9482418331
Maximum: 0.9659268412
Range: 0.9517667252
Interquartile range (IQR): 0.4802718744

Descriptive statistics

Standard deviation: 0.2838002408
Coefficient of variation (CV): 0.5289498573
Kurtosis: -1.100279928
Mean: 0.5365352442
Median Absolute Deviation (MAD): 0.241870727
Skewness: -0.06310923135
Sum: 53.65352442
Variance: 0.0805425767
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.01416012 0.90568316 0.96592684], "bayesian blocks" binning strategy used)
Value              Count  Frequency (%)
0.9595152223       1      1.0%
0.2857310937       1      1.0%
0.9154759747       1      1.0%
0.01416011603      1      1.0%
0.7396548084       1      1.0%
0.4224950534       1      1.0%
0.8341035428       1      1.0%
0.6108854553       1      1.0%
0.4995651186       1      1.0%
0.243781658        1      1.0%
Other values (90)  90     90.0%

Value              Count  Frequency (%)
0.01416011603      1      1.0%
0.01863789784      1      1.0%
0.01872780253      1      1.0%
0.02721137234      1      1.0%
0.03444677325      1      1.0%

Value              Count  Frequency (%)
0.9659268412       1      1.0%
0.9621690043       1      1.0%
0.9595152223       1      1.0%
0.955758438        1      1.0%
0.9525463492       1      1.0%

c
Real number (ℝ≥0)

UNIQUE
Distinct count: 100
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.47793724653487835
Minimum: 0.012540225290272433
Maximum: 0.9918251007879255
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.01254022529
5-th percentile: 0.03652304945
Q1: 0.2310268368
Median: 0.4725896425
Q3: 0.7273476852
95-th percentile: 0.9170778933
Maximum: 0.9918251008
Range: 0.9792848755
Interquartile range (IQR): 0.4963208484

Descriptive statistics

Standard deviation: 0.2865663632
Coefficient of variation (CV): 0.5995899362
Kurtosis: -1.168478967
Mean: 0.4779372465
Median Absolute Deviation (MAD): 0.2464973509
Skewness: 0.08584619993
Sum: 47.79372465
Variance: 0.0821202805
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.01254023 0.9918251 ], "bayesian blocks" binning strategy used)
Value              Count  Frequency (%)
0.3277253874       1      1.0%
0.7233180123       1      1.0%
0.2280619618       1      1.0%
0.2014470165       1      1.0%
0.09211328288      1      1.0%
0.606460476        1      1.0%
0.6184725859       1      1.0%
0.05900069075      1      1.0%
0.5687168232       1      1.0%
0.6700752913       1      1.0%
Other values (90)  90     90.0%

Value              Count  Frequency (%)
0.01254022529      1      1.0%
0.02244242453      1      1.0%
0.02421501622      1      1.0%
0.02525429794      1      1.0%
0.03624962806      1      1.0%

Value              Count  Frequency (%)
0.9918251008       1      1.0%
0.9836072115       1      1.0%
0.9829415195       1      1.0%
0.9508000727       1      1.0%
0.9295051978       1      1.0%

d
Real number (ℝ≥0)

UNIQUE
Distinct count: 100
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4735091751153356
Minimum: 0.01005692358975685
Maximum: 0.9703035898477324
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.01005692359
5-th percentile: 0.08863658133
Q1: 0.2284389804
Median: 0.477830546
Q3: 0.6501625767
95-th percentile: 0.8837619216
Maximum: 0.9703035898
Range: 0.9602466663
Interquartile range (IQR): 0.4217235963

Descriptive statistics

Standard deviation: 0.2460331541
Coefficient of variation (CV): 0.5195953258
Kurtosis: -0.8098688778
Mean: 0.4735091751
Median Absolute Deviation (MAD): 0.1988813546
Skewness: -0.0005005975408
Sum: 47.35091751
Variance: 0.06053231292
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.01005692 0.97030359], "bayesian blocks" binning strategy used)
Value              Count  Frequency (%)
0.4776485534       1      1.0%
0.5331229471       1      1.0%
0.4096778008       1      1.0%
0.7295297924       1      1.0%
0.2763906535       1      1.0%
0.3314203812       1      1.0%
0.9703035898       1      1.0%
0.5873387461       1      1.0%
0.5759141427       1      1.0%
0.4875775633       1      1.0%
Other values (90)  90     90.0%

Value              Count  Frequency (%)
0.01005692359      1      1.0%
0.01116948476      1      1.0%
0.06907844988      1      1.0%
0.07544364778      1      1.0%
0.08839065659      1      1.0%

Value              Count  Frequency (%)
0.9703035898       1      1.0%
0.9624569582       1      1.0%
0.9497318499       1      1.0%
0.9222317551       1      1.0%
0.900087811        1      1.0%

e
Real number (ℝ≥0)

UNIQUE
Distinct count: 100
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5288583631413339
Minimum: 0.018348094296209205
Maximum: 0.9747194521275142
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.0183480943
5-th percentile: 0.03994839296
Q1: 0.2905421171
Median: 0.5350562462
Q3: 0.7825261465
95-th percentile: 0.9666249084
Maximum: 0.9747194521
Range: 0.9563713578
Interquartile range (IQR): 0.4919840294

Descriptive statistics

Standard deviation: 0.2954455543
Coefficient of variation (CV): 0.5586477871
Kurtosis: -1.194133776
Mean: 0.5288583631
Median Absolute Deviation (MAD): 0.2546875694
Skewness: -0.1145953326
Sum: 52.88583631
Variance: 0.08728807553
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.01834809 0.96769561 0.97471945], "bayesian blocks" binning strategy used)
Value              Count  Frequency (%)
0.4311678549       1      1.0%
0.4231599667       1      1.0%
0.07479239725      1      1.0%
0.0183480943       1      1.0%
0.8207956025       1      1.0%
0.7404067816       1      1.0%
0.8911488286       1      1.0%
0.8369862828       1      1.0%
0.3987414924       1      1.0%
0.173272151        1      1.0%
Other values (90)  90     90.0%

Value              Count  Frequency (%)
0.0183480943       1      1.0%
0.02818261854      1      1.0%
0.03006211426      1      1.0%
0.03368911438      1      1.0%
0.03599039722      1      1.0%

Value              Count  Frequency (%)
0.9747194521       1      1.0%
0.9726062249       1      1.0%
0.97230883         1      1.0%
0.9694933881       1      1.0%
0.9688852726       1      1.0%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
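
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling itself runs; the function name `pearson_r` is hypothetical, and the sample values are the first five rows of columns a and b from the Sample section of this report.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of
    their standard deviations (hypothetical helper for illustration)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors of the covariance and of the two
    # standard deviations cancel, so plain sums are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# First five rows of columns a and b from the sample in this report.
a = [0.551362, 0.697229, 0.353078, 0.493304, 0.416997]
b = [0.389897, 0.663928, 0.200953, 0.473321, 0.250560]
print(pearson_r(a, b))
```

Shifting or positively rescaling either input (for example `[2 * v + 3 for v in a]`) leaves the result unchanged, which is the invariance property mentioned above.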

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
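
As a rough sketch of that recipe, the snippet below replaces each value by its rank and then applies the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values the average of their ranks is an assumption (a common convention, not something this report specifies).

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks
    (assumed tie-handling convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# A nonlinear but strictly increasing relationship gives rho = 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))
```

Because only the ordering of the values matters, any strictly increasing transformation of either variable leaves ρ unchanged.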

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
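
The pair-counting definition above can be illustrated with a short, quadratic-time sketch. The function name `kendall_tau` is hypothetical, and this is the plain variant exactly as described (often called tau-a); statistical libraries commonly use a tie-corrected variant instead, which agrees with this one when there are no ties, as is the case for the all-unique columns in this report.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau as (concordant - discordant) / total number of pairs.
    Tied pairs count as neither concordant nor discordant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether x and y move in the
        # same direction between observations i and j.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

For the three observations in the example there are three pairs in total, two concordant and one discordant, so τ = (2 - 1) / 3.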

Missing values

Sample

First rows

    a         b         c         d         e
0   0.551362  0.389897  0.133736  0.075444  0.225405
1   0.697229  0.663928  0.044779  0.130970  0.628501
2   0.353078  0.200953  0.568717  0.221570  0.230711
3   0.493304  0.473321  0.787721  0.761571  0.447126
4   0.416997  0.250560  0.826592  0.361298  0.074792
5   0.669116  0.243782  0.586026  0.465262  0.398741
6   0.366836  0.955758  0.407857  0.164965  0.772414
7   0.823817  0.762897  0.375249  0.461134  0.820796
8   0.796029  0.836707  0.059001  0.882903  0.324417
9   0.508174  0.381737  0.162696  0.604456  0.868987

Last rows

    a         b         c         d         e
90  0.637922  0.590236  0.898878  0.219186  0.235753
91  0.204678  0.948015  0.036537  0.754006  0.752359
92  0.162262  0.287105  0.201447  0.462256  0.463122
93  0.423546  0.945319  0.638775  0.749892  0.076021
94  0.820633  0.499565  0.670075  0.775760  0.133355
95  0.114075  0.580827  0.188434  0.336427  0.256029
96  0.449047  0.421929  0.991825  0.720248  0.789838
97  0.998215  0.248163  0.630692  0.731906  0.173272
98  0.464312  0.078101  0.415281  0.617187  0.541404
99  0.820212  0.296863  0.279386  0.414434  0.018348