JACCARD SIMILARITY
Jaccard similarity is the size of the intersection of two sets divided by the size of their union: J(A, B) = |A ∩ B| / |A ∪ B|.
False
Column Name | Jaccard Similarity | Result |
---|---|---|
fund_symbol | 1 | PASSED |
price_date | 0.980198 | FAILED |
nav_per_share | 0 | FAILED |
positive_change | 1 | PASSED |
The Jaccard similarity between the expected and tested dataframes is below 1 for at least one column (price_date and nav_per_share). This means that the sets of values in those columns differ between the expected and tested dataframes.
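The definition above can be illustrated with a minimal sketch. The function name `jaccard_similarity` and the sample values are hypothetical, and treating two empty columns as identical is an assumption; how the report itself handles that case is not stated.

```python
def jaccard_similarity(expected, tested):
    """Jaccard similarity of the distinct values in two columns."""
    a, b = set(expected), set(tested)
    union = a | b
    if not union:
        # Assumption: two empty columns count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical sample data, not taken from the report.
expected = ["2020-01-01", "2020-01-02", "2020-01-03"]
tested = ["2020-01-01", "2020-01-02", "2020-01-04"]
print(jaccard_similarity(expected, tested))  # 2 shared / 4 distinct = 0.5
```

Because the comparison works on sets, it ignores duplicates and row order. A boolean column that contains both True and False in each dataframe therefore scores 1 even if individual rows differ.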
COMPARISON FOR STRING COLUMNS
The string comparisons are done using the Levenshtein distance. The Levenshtein distance is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
True
Column Name | Total Levenshtein Distance | Result |
---|---|---|
fund_symbol | 0 | PASSED |
The Levenshtein distance between the expected and tested dataframes is 0 for all columns. This means that the expected and tested dataframes have the same values for the same column(s).
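As a rough illustration of the edit-distance definition above, the sketch below uses the standard dynamic-programming approach. The names `levenshtein` and `total_levenshtein` are hypothetical, and summing the row-by-row distances to obtain a column total is an assumption about what "Total Levenshtein Distance" means.

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn s into t."""
    if len(s) < len(t):
        s, t = t, s  # keep the shorter string in the inner loop
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def total_levenshtein(expected, tested):
    """Assumed aggregation: sum of the row-by-row distances."""
    return sum(levenshtein(e, t) for e, t in zip(expected, tested))


print(levenshtein("kitten", "sitting"))  # 3
```

Under that assumption, a column total of 0 is possible only when every pair of strings is identical.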
COMPARISON FOR NUMERIC COLUMNS
The numeric comparisons are done using the Euclidean distance, a measure of the straight-line distance between two points in a space. Given two points P and Q with coordinates (p1, p2, ..., pn) and (q1, q2, ..., qn) respectively, the Euclidean distance d between P and Q is:
d(P, Q) = sqrt((q1 - p1)² + (q2 - p2)² + ... + (qn - pn)²)
For the normalized Euclidean distance, columns are scaled to unit norm (a magnitude, or length, of 1) before the distance is calculated.
False
Column Name | Euclidean Distance | Result |
---|---|---|
nav_per_share | 10 | FAILED |
The Euclidean distance between the expected and tested dataframes is nonzero for at least one column (nav_per_share). This means that the expected and tested dataframes have different numeric values in the same column(s).
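The distance formula and the unit-norm scaling described above can be sketched as follows. The helper names and sample values are hypothetical, and scaling each column independently is an assumption; the report does not say exactly how the scaling is applied.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two equal-length sequences."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def unit_normalize(values):
    """Scale a column so that its Euclidean norm is 1.
    An all-zero column is returned unchanged (assumption)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return list(values)
    return [v / norm for v in values]


# Hypothetical sample data, not taken from the report.
expected = [3.0, 4.0]
tested = [4.0, 3.0]

raw = euclidean_distance(expected, tested)
normalized = euclidean_distance(unit_normalize(expected), unit_normalize(tested))
print(raw, normalized)  # roughly 1.414 and 0.283
```

One consequence of scaling each column independently is that columns differing only by a constant factor, such as [1, 2] and [2, 4], end up at a normalized distance of 0, so the raw distance may be worth checking alongside the normalized one.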
COMPARISON FOR BOOLEAN COLUMNS
The boolean comparison applies an XNOR operation between the expected and tested dataframes. XNOR returns True when both inputs are equal, so if it returns True for every pair of values, the expected and tested dataframes have the same boolean values in the same column(s).
False
Column Name | Total XNOR | Result |
---|---|---|
positive_change | 1 | FAILED |
The XNOR operation between the expected and tested dataframes does not return True for every column (positive_change fails). This means that the expected and tested dataframes have different boolean values for the same column(s).
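A minimal sketch of the XNOR check described above, assuming the comparison is made row by row and a column passes only when every row matches. The function names and sample values are hypothetical, and the sketch does not try to reproduce how the "Total XNOR" figure is computed.

```python
def column_xnor(expected, tested):
    """Element-wise XNOR: True where both values agree."""
    return [not (bool(e) ^ bool(t)) for e, t in zip(expected, tested)]


def column_passes(expected, tested):
    """Assumed pass rule: every row-wise XNOR result is True."""
    return all(column_xnor(expected, tested))


# Hypothetical sample data, not taken from the report.
expected = [True, False, True]
tested = [True, True, True]
print(column_xnor(expected, tested))    # [True, False, True]
print(column_passes(expected, tested))  # False
```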