{{ text_output }}
Analyses Included in Comparison
{{ result.name }}
{{ result.themes|length }} themes
-
{% for t in result.themes %}
- {{ t.name }} {% endfor %}
Theme Network (UMAP Projection)
2-D UMAP projection of theme embeddings from each analysis, shown in different colours. Each point represents a theme; proximity reflects semantic similarity in the original embedding space.
Pairwise Comparisons
Select a pair to view detailed comparison metrics.
{{ comp.a.name }} vs {{ comp.b.name }}
Angular Similarity
Angular distance is the angle between two embedding vectors (the arccos of their cosine similarity), normalised to [0,1]. Unlike cosine distance, it satisfies the triangle inequality, making it a proper metric for averaging and comparison.
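As a rough illustration of the description above, the sketch below computes an angular similarity for two vectors. It assumes that similarity is taken as 1 minus the normalised angular distance and that the angle is divided by π; the function name `angular_similarity` is hypothetical and not taken from the report's code.

```python
import math

def angular_similarity(u, v):
    """Return 1 - angular distance, with distance = arccos(cosine) / pi.

    Assumption: similarity is defined as one minus the normalised distance.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp so rounding error cannot push the value outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    angular_distance = math.acos(cosine) / math.pi  # lies in [0, 1]
    return 1.0 - angular_distance

# Orthogonal vectors sit halfway between identical and opposite.
print(angular_similarity([1.0, 0.0], [0.0, 1.0]))
```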
Summary Statistics
Thematic analysis has no ground truth, so traditional precision and recall don't apply. Instead, we measure coverage (did themes find matches?) and fidelity (how close are the best matches?). All figures are based on {{ comp.stats.similarity_metric }} similarity.
Proportion of themes with at least one match above threshold ({{ comparison.config.threshold }})
- Hit Rate A: {{ "%.1f"|format(comp.stats.hit_rate_a * 100) }}%
- Hit Rate B: {{ "%.1f"|format(comp.stats.hit_rate_b * 100) }}%
- Pair Match Rate: {{ "%.3f"|format(comp.stats.jaccard) }} (pairs above threshold / total pairs)
High hit rates indicate both analyses found similar conceptual territory. Pair match rate shows the density of above-threshold pairs across all possible theme combinations.
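The coverage figures above can be sketched from a similarity matrix as follows. This is a minimal illustration, assuming `sim[i][j]` holds the similarity between theme i of A and theme j of B, and that "above threshold" includes values equal to the threshold; the function name `coverage_stats` is hypothetical.

```python
def coverage_stats(sim, threshold):
    """Hit rates and pair match rate for a rectangular similarity matrix."""
    n_a, n_b = len(sim), len(sim[0])
    # A theme is a "hit" if at least one partner clears the threshold.
    # Assumption: the comparison is inclusive (>=).
    hits_a = sum(1 for row in sim if max(row) >= threshold)
    hits_b = sum(
        1 for j in range(n_b) if max(row[j] for row in sim) >= threshold
    )
    pairs_above = sum(1 for row in sim for s in row if s >= threshold)
    return {
        "hit_rate_a": hits_a / n_a,
        "hit_rate_b": hits_b / n_b,
        "pair_match_rate": pairs_above / (n_a * n_b),
    }

example = [[0.9, 0.2, 0.1],
           [0.3, 0.4, 0.2]]
print(coverage_stats(example, threshold=0.5))
```

In this toy matrix only one of the six pairs clears the threshold, so one A theme and one B theme count as hits.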
How close are the best matches? (Mean of each theme's best match similarity)
- A→B: {{ "%.3f"|format(comp.stats.mean_max_sim_a_to_b) }}
- B→A: {{ "%.3f"|format(comp.stats.mean_max_sim_b_to_a) }}
- Fidelity: {{ "%.3f"|format(comp.stats.fidelity) }}
Fidelity is the harmonic mean of the two directional scores (A→B and B→A). Higher = tighter semantic alignment.
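A short sketch of how the directional scores and fidelity might be derived from a similarity matrix, assuming rows index A's themes and columns index B's; the function name `fidelity_scores` is hypothetical.

```python
def fidelity_scores(sim):
    """Return (a_to_b, b_to_a, fidelity) for a similarity matrix."""
    n_a, n_b = len(sim), len(sim[0])
    # Mean of each A theme's best match in B, and vice versa.
    a_to_b = sum(max(row) for row in sim) / n_a
    b_to_a = sum(max(row[j] for row in sim) for j in range(n_b)) / n_b
    if a_to_b + b_to_a == 0:
        return a_to_b, b_to_a, 0.0
    # Harmonic mean: dominated by the weaker of the two directions.
    fidelity = 2 * a_to_b * b_to_a / (a_to_b + b_to_a)
    return a_to_b, b_to_a, fidelity

# One A theme matches one of two B themes well; the other B theme is orphaned.
print(fidelity_scores([[0.9, 0.1]]))
```

Because the harmonic mean is pulled toward the smaller value, a set that covers the other well but is poorly covered in return still receives a modest fidelity.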
{{ comp.stats.similarity_matrix }}
Best Matches (1:1)
The Hungarian algorithm finds the optimal one-to-one pairing that maximizes total similarity. Each theme maps to at most one theme in the other set -- no reuse allowed.
What this enables: Hungarian matching removes ambiguity by assigning each theme to at most one partner. Coverage metrics show what proportion of each set found a good match.
Limitation: This penalises legitimate theme refinement (splitting one theme into two leaves one of the resulting themes unmatched). Use optimal transport (OT) if you want to allow many-to-many alignment.
{{ "%.3f"|format(comp.stats.hungarian.soft_metrics.soft_precision) }}
Average similarity of optimal pairs
Interpretation: "How good are the best one-to-one correspondences?" Higher = tighter semantic alignment between the two theme sets.
{% if comp.stats.hungarian.distribution.n_pairs > 0 %}Distribution of {{ comp.stats.hungarian.distribution.n_pairs }} optimal pairs:
- Median: {{ "%.3f"|format(comp.stats.hungarian.distribution.median) }} (Q1: {{ "%.3f"|format(comp.stats.hungarian.distribution.q1) }}, Q3: {{ "%.3f"|format(comp.stats.hungarian.distribution.q3) }})
- Range: {{ "%.3f"|format(comp.stats.hungarian.distribution.min) }} -- {{ "%.3f"|format(comp.stats.hungarian.distribution.max) }}
Based on {{ comp.stats.hungarian.distribution.n_pairs }} matched pairs above threshold ({{ comparison.config.threshold }})
{% set cov_a = comp.stats.hungarian.thresholded_metrics.recall %} {% set cov_b = comp.stats.hungarian.thresholded_metrics.precision %} {% set mean_cov = (cov_a + cov_b) / 2 %}{{ "%.0f"|format(cov_a * 100) }}%
Coverage A
(A themes matched)
{{ "%.0f"|format(cov_b * 100) }}%
Coverage B
(B themes matched)
{{ "%.3f"|format(comp.stats.hungarian.thresholded_metrics.true_jaccard) }}
Jaccard Index
(set overlap)
Jaccard Index = matched / (|A| + |B| - matched). Measures overlap between theme sets after 1:1 assignment. Higher = more themes found good partners.
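The coverage and Jaccard figures above can be illustrated with a small sketch. To stay dependency-free it finds the optimal pairing by exhaustive search, which is only a stand-in for the Hungarian algorithm and is feasible for tiny inputs only (SciPy's `linear_sum_assignment` with `maximize=True` solves the same assignment problem efficiently). The function names and the inclusive threshold are assumptions.

```python
from itertools import permutations

def best_one_to_one(sim):
    """Exhaustively find the 1:1 pairing with maximum total similarity.

    Stand-in for the Hungarian algorithm. Assumes len(sim) <= len(sim[0]);
    transpose the matrix first if A has more themes than B.
    """
    n_a, n_b = len(sim), len(sim[0])
    best_pairs, best_total = [], float("-inf")
    for cols in permutations(range(n_b), n_a):
        total = sum(sim[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_pairs

def thresholded_overlap(sim, threshold):
    """Coverage of each set and Jaccard index after 1:1 assignment."""
    n_a, n_b = len(sim), len(sim[0])
    pairs = best_one_to_one(sim)
    # Assumption: a pair counts as matched when similarity >= threshold.
    matched = sum(1 for i, j in pairs if sim[i][j] >= threshold)
    return {
        "coverage_a": matched / n_a,
        "coverage_b": matched / n_b,
        "jaccard": matched / (n_a + n_b - matched),
    }

example = [[0.90, 0.80, 0.10],
           [0.85, 0.20, 0.30]]
print(best_one_to_one(example))
print(thresholded_overlap(example, threshold=0.5))
```

In this example a greedy choice would pair the first A theme with the first B theme (0.90) and leave the second A theme with a weak partner; the optimal assignment gives up that single best pair to obtain a higher total.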
Hungarian algorithm finds the optimal one-to-one assignment.
| Theme in {{ comp.a.name }} | Theme in {{ comp.b.name }} | Angular Similarity |
|---|---|---|
|
{{ theme_a.theme_name }} {{ theme_a.embedded_string }} |
{{ theme_b.theme_name }} {{ theme_b.embedded_string }} |
{{ "%.3f"|format(similarity) }} |
No optimal pairs found.
{% endif %}Unbalanced Optimal Transport (Many-to-Many Alignment)
Unbalanced Optimal Transport allows themes to remain unmatched, representing genuinely novel or missing concepts. Unlike balanced OT (which forces all mass to transport), unbalanced OT permits themes to be left out when no good match exists. The reg_m (K) parameter controls the penalty for leaving mass unmatched.
⚠ Curve may not have plateaued -- elbow estimates may be less reliable{% endif %}
These plots show how shared mass and alignment change as K varies. Baseline curves show the paraphrase ceiling (green, best case) and word-salad floor (red, random baseline) -- both also vary with K because the OT mass penalty affects all comparisons equally.
How much thematic content is matched. Higher = more themes matched. Elbow markers: ◆ chord, ▲ diminishing returns.
Quality of matches (1 - cost). Higher = better semantic similarity between matched themes.
Average targets per theme. 1.0 = perfect 1:1 matching. Higher = more many-to-many relationships.
| K | Shared Mass | Alignment | Mass % ceiling | Align % ceiling | Splits/Joins |
|---|---|---|---|---|---|
| {{ "%.2f"|format(k_val) }}{% if k_val == comp.stats.default_k %} ■{% endif %}{% if k_val == comp.stats.chord_k %} ◆{% endif %}{% if k_val == comp.stats.diminishing_k %} ▲{% endif %} | {{ "%.1f"|format(ot_k.ot.shared_mass * 100) }}% | {% if ot_k.ot.alignment_observed is defined %}{{ "%.2f"|format(ot_k.ot.alignment_observed) }}{% else %}{{ "%.2f"|format(1 - ot_k.ot.avg_cost) }}{% endif %} | {% if ot_k.ot.shared_mass_pct_of_ceiling is defined %}{{ "%.0f"|format(ot_k.ot.shared_mass_pct_of_ceiling * 100) }}%{% elif ot_k.ot.shared_mass_relative is defined %}{{ "%.0f"|format(ot_k.ot.shared_mass_relative * 100) }}%{% else %}-{% endif %} | {% if ot_k.ot.alignment_pct_of_ceiling is defined %}{{ "%.0f"|format(ot_k.ot.alignment_pct_of_ceiling * 100) }}%{% elif ot_k.ot.avg_cost_relative is defined %}{{ "%.0f"|format(ot_k.ot.avg_cost_relative * 100) }}%{% else %}-{% endif %} | {{ "%.1f"|format((ot_k.split_join_stats.splits_from_a.mean + ot_k.split_join_stats.joins_to_b.mean) / 2) }} |
■ = Default K, ◆ = Chord elbow, ▲ = Diminishing-returns elbow. % of ceiling = how close the observed value is to the paraphrase baseline (100% = as good as identical meaning). Higher K forces more transport; lower K allows more unmatched themes.
LLM-generated paraphrases of each theme establish a realistic upper bound for alignment. Paraphrases capture the same meaning in different words -- this represents the best achievable similarity between semantically equivalent analyses.
Baseline Statistics
- Mean self-similarity: {{ "%.3f"|format(comp.stats.paraphrase_baseline.paraphrase_similarity_mean) }}
- Std dev: {{ "%.3f"|format(comp.stats.paraphrase_baseline.paraphrase_similarity_std) }}
- Model: {{ comp.stats.paraphrase_baseline.metadata.model }}
- Paraphrases per theme: {{ comp.stats.paraphrase_baseline.metadata.n_paraphrases }}
Interpretation
The paraphrase ceiling represents the best realistic case -- comparing themes to their own LLM paraphrases (same meaning, different words). If observed alignment reaches this level, the analyses are semantically equivalent.
{{ comp.a.name }} -- Sample Themes with Paraphrases
{{ comp.b.name }} -- Sample Themes with Paraphrases
Word salad is generated by randomly shuffling words from themes, destroying semantic meaning. This represents what you'd expect from random text with similar vocabulary -- a floor below which alignment cannot meaningfully fall.
Generation Method
- Samples generated: {{ comp.stats.word_salad_samples|length }}
- Themes per sample: {{ comp.stats.word_salad_samples[0]|length }}
- Method: Words randomly shuffled while preserving theme length
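One way the generation method above could work is sketched below. It assumes words are pooled across all themes before shuffling and then dealt back out so each scrambled theme keeps its original word count; the actual pipeline may instead shuffle within each theme. The function name `word_salad` and the example themes are hypothetical.

```python
import random

def word_salad(themes, seed=None):
    """Shuffle words across themes while preserving each theme's length."""
    rng = random.Random(seed)
    lengths = [len(theme.split()) for theme in themes]
    # Assumption: one shared pool of words for the whole theme set.
    pool = [word for theme in themes for word in theme.split()]
    rng.shuffle(pool)
    salads, start = [], 0
    for n in lengths:
        salads.append(" ".join(pool[start:start + n]))
        start += n
    return salads

themes = ["trust in local institutions", "rising cost of living"]
print(word_salad(themes, seed=0))
```

Calling the function repeatedly with different seeds would give one scrambled theme set per sample.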
Interpretation
If observed alignment is close to the word-salad floor, the themes may not share meaningful semantic content. The further above this baseline, the more genuine the semantic similarity.
All {{ comp.stats.word_salad_samples|length }} Word Salad Samples
Each sample contains {{ comp.stats.word_salad_samples[0]|length }} scrambled "themes" (matching B's theme count). Words from B's themes are randomly shuffled while preserving length.
{{ "%.0f"|format(ot_k.ot.shared_mass_pct_of_ceiling * 100) }}%
of paraphrase ceiling
best-case: identical meaning, different words
- Observed: {{ "%.1f"|format(ot_k.ot.shared_mass * 100) }}%
- Paraphrase ceiling: {{ "%.1f"|format(ot_k.ot.paraphrase_upper_bound * 100) }}%
- Word-salad floor: {{ "%.1f"|format(ot_k.ot.null_shared_mass_mean * 100) }}%
- vs word-salad: {{ "%.0f"|format(ot_k.ot.shared_mass_improvement_vs_null * 100) }}% of possible improvement
- Effect size: {{ "%.1f"|format(ot_k.ot.shared_mass_effect) }} MADs
{{ "%.1f"|format(ot_k.ot.shared_mass * 100) }}%
shared mass
- Word-salad floor: {{ "%.1f"|format(ot_k.ot.null_shared_mass_mean * 100) }}%
- vs word-salad: +{{ "%.1f"|format(ot_k.ot.shared_mass_excess * 100) }}pp
- Effect size: {{ "%.1f"|format(ot_k.ot.shared_mass_effect) }} MADs
Paraphrase baseline not available -- showing raw metrics.
{% else %}{{ "%.1f"|format(ot_k.ot.shared_mass * 100) }}%
Shared Mass
{{ "%.0f"|format(ot_k.ot.alignment_pct_of_ceiling * 100) }}%
of paraphrase ceiling
quality of theme-to-theme matches
- Observed: {{ "%.2f"|format(ot_k.ot.alignment_observed) }}
- Paraphrase ceiling: {{ "%.2f"|format(ot_k.ot.alignment_paraphrase_ceiling) }}
- Word-salad floor: {{ "%.2f"|format(ot_k.ot.alignment_null_floor) }}
- vs word-salad: {{ "%.0f"|format(ot_k.ot.alignment_improvement_vs_null * 100) }}% of possible improvement
{{ "%.1f"|format(fallback_alignment * 100) }}%
alignment (1 - cost)
- Word-salad floor: {{ "%.1f"|format(fallback_floor * 100) }}%
- vs word-salad: +{{ "%.1f"|format((fallback_alignment - fallback_floor) * 100) }}pp better
Paraphrase baseline not available -- showing raw alignment vs word-salad floor.
{% else %}{{ "%.3f"|format(ot_k.ot.avg_cost) }}
Average Cost
Semantic alignment measures the quality of theme-to-theme matches (computed as 1 - transport cost). Higher alignment = better semantic similarity between matched themes.
% of paraphrase ceiling (headline) = observed / ceiling. Shows what fraction of the best-case alignment was achieved. 100% would mean matches as semantically close as paraphrases.
% of possible improvement (vs word-salad) = (observed - floor) / (ceiling - floor). Shows progress from random baseline toward the ceiling.
Paraphrase ceiling = alignment when comparing themes to their own paraphrases. Word-salad floor = alignment when comparing to randomly shuffled words.
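The two ratios defined above are simple to compute. The sketch below uses hypothetical function names and illustrative numbers, not values from any real comparison.

```python
def pct_of_ceiling(observed, ceiling):
    """Fraction of the paraphrase ceiling that was achieved."""
    return observed / ceiling

def improvement_vs_null(observed, floor, ceiling):
    """Progress from the word-salad floor toward the paraphrase ceiling."""
    if ceiling == floor:
        return None  # undefined when the two baselines coincide
    return (observed - floor) / (ceiling - floor)

# Illustrative values only.
observed, floor, ceiling = 0.62, 0.40, 0.80
print(pct_of_ceiling(observed, ceiling))
print(improvement_vs_null(observed, floor, ceiling))
```

The second ratio is the stricter of the two whenever the floor is above zero, because it discounts the similarity that even scrambled text achieves.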
{% else %}Semantic alignment measures how well matched themes relate to each other. Higher = better semantic similarity.
{% endif %}Themes in A flowing to multiple themes in B
- Mean: {{ "%.2f"|format(ot_k.split_join_stats.splits_from_a.mean) }}
- Median: {{ "%.1f"|format(ot_k.split_join_stats.splits_from_a.median) }}
- Mode: {{ ot_k.split_join_stats.splits_from_a.mode }}
- Max: {{ ot_k.split_join_stats.splits_from_a.max }}
- Themes with >1 target: {{ ot_k.split_join_stats.splits_from_a.n_multiple }}/{{ ot_k.split_join_stats.splits_from_a.total }} ({{ "%.0f"|format(ot_k.split_join_stats.splits_from_a.pct_multiple * 100) }}%)
Distribution (# targets → # themes)
Themes in B receiving from multiple themes in A
- Mean: {{ "%.2f"|format(ot_k.split_join_stats.joins_to_b.mean) }}
- Median: {{ "%.1f"|format(ot_k.split_join_stats.joins_to_b.median) }}
- Mode: {{ ot_k.split_join_stats.joins_to_b.mode }}
- Max: {{ ot_k.split_join_stats.joins_to_b.max }}
- Themes with >1 source: {{ ot_k.split_join_stats.joins_to_b.n_multiple }}/{{ ot_k.split_join_stats.joins_to_b.total }} ({{ "%.0f"|format(ot_k.split_join_stats.joins_to_b.pct_multiple * 100) }}%)
Distribution (# sources → # themes)
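The split and join statistics above could be derived from a transport plan along these lines. This is a sketch under assumptions: `plan[i][j]` is the mass moved from theme i of A to theme j of B, and a link counts only if its mass exceeds a small cutoff (`min_mass`, a hypothetical parameter).

```python
from statistics import mean

def split_join_counts(plan, min_mass=1e-6):
    """Count targets per A theme (splits) and sources per B theme (joins)."""
    n_b = len(plan[0])
    splits = [sum(1 for cell in row if cell > min_mass) for row in plan]
    joins = [
        sum(1 for row in plan if row[j] > min_mass) for j in range(n_b)
    ]
    return splits, joins

plan = [[0.3, 0.2, 0.0],
        [0.0, 0.0, 0.5]]
splits, joins = split_join_counts(plan)
n_multiple = sum(1 for s in splits if s > 1)
print(splits, joins)
print(mean(splits), n_multiple, len(splits))
```

Here the first A theme splits across two B themes while every B theme has a single source, so the plan contains splits but no joins.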
How we calibrate alignment: To interpret the observed alignment, we compare against two reference points. The floor is a null baseline -- random "word-salad" sentences constructed from words in the themes, representing what we'd see by chance. The ceiling is a best-case baseline -- LLM-generated paraphrases that retain the original meaning but use different wording, representing the maximum similarity we'd expect between genuinely equivalent themes.
Shared Mass
{{ "%.0f"|format(ot_k.ot.shared_mass_pct_of_ceiling * 100) }}% of paraphrase ceiling
The observed shared mass ({{ "%.1f"|format(ot_k.ot.shared_mass * 100) }}%) is {{ "%.0f"|format(ot_k.ot.shared_mass_pct_of_ceiling * 100) }}% of the paraphrase ceiling ({{ "%.1f"|format(ot_k.ot.paraphrase_upper_bound * 100) }}%).
vs word-salad ({{ "%.1f"|format(ot_k.ot.null_shared_mass_mean * 100) }}%): {{ "%.0f"|format(ot_k.ot.shared_mass_improvement_vs_null * 100) }}% of possible improvement over word-salad.
Alignment Quality
{% set alignment_obs = ot_k.ot.alignment_observed if ot_k.ot.alignment_observed is defined else (1 - ot_k.ot.avg_cost) %} {% set alignment_ceiling = ot_k.ot.alignment_paraphrase_ceiling if ot_k.ot.alignment_paraphrase_ceiling is defined else none %} {% set alignment_floor = ot_k.ot.alignment_null_floor if ot_k.ot.alignment_null_floor is defined else (1 - ot_k.ot.null_avg_cost_mean) %} {% if ot_k.ot.alignment_pct_of_ceiling is defined %}{{ "%.0f"|format(ot_k.ot.alignment_pct_of_ceiling * 100) }}% of paraphrase ceiling
The observed alignment ({{ "%.1f"|format(alignment_obs * 100) }}%) is {{ "%.0f"|format(ot_k.ot.alignment_pct_of_ceiling * 100) }}% of the paraphrase ceiling ({{ "%.1f"|format(alignment_ceiling * 100) }}%).
{% if ot_k.ot.alignment_improvement_vs_null is defined %}vs word-salad ({{ "%.1f"|format(alignment_floor * 100) }}%): {{ "%.0f"|format(ot_k.ot.alignment_improvement_vs_null * 100) }}% of possible improvement over word-salad.
{% endif %} {% else %}{{ "%.1f"|format(alignment_obs * 100) }}% observed
Alignment data not available for this K value.
{% endif %}Raw Values (K={{ "%.2f"|format(k_val) }})
| Baseline | Shared Mass | Alignment |
|---|---|---|
| Paraphrase ceiling (best realistic) | {{ "%.1f"|format(ot_k.ot.paraphrase_upper_bound * 100) }}% | {% if ot_k.ot.alignment_paraphrase_ceiling is defined %}{{ "%.2f"|format(ot_k.ot.alignment_paraphrase_ceiling) }}{% else %}-{% endif %} |
| Observed (A ↔ B) | {{ "%.1f"|format(ot_k.ot.shared_mass * 100) }}% | {% if ot_k.ot.alignment_observed is defined %}{{ "%.2f"|format(ot_k.ot.alignment_observed) }}{% else %}{{ "%.2f"|format(1 - ot_k.ot.avg_cost) }}{% endif %} |
| Word-salad floor (random) | {{ "%.1f"|format(ot_k.ot.null_shared_mass_mean * 100) }}% | {% if ot_k.ot.alignment_null_floor is defined %}{{ "%.2f"|format(ot_k.ot.alignment_null_floor) }}{% else %}{{ "%.2f"|format(1 - ot_k.ot.null_avg_cost_mean) }}{% endif %} |
| % of ceiling (observed/ceiling) | {% if ot_k.ot.shared_mass_pct_of_ceiling is defined %} {{ "%.0f"|format(ot_k.ot.shared_mass_pct_of_ceiling * 100) }}% {% else %}-{% endif %} | {% if ot_k.ot.alignment_pct_of_ceiling is defined %} {{ "%.0f"|format(ot_k.ot.alignment_pct_of_ceiling * 100) }}% {% else %}-{% endif %} |
| % of possible improvement (from word-salad floor) | {% if ot_k.ot.shared_mass_improvement_vs_null is defined %} {{ "%.0f"|format(ot_k.ot.shared_mass_improvement_vs_null * 100) }}% {% else %}-{% endif %} | {% if ot_k.ot.alignment_improvement_vs_null is defined %} {{ "%.0f"|format(ot_k.ot.alignment_improvement_vs_null * 100) }}% {% else %}-{% endif %} |
Shared Mass: How much thematic content could be matched between the two analyses. Higher = more overlap.
Alignment: Quality of theme-to-theme matches (1 - transport cost). Higher = better semantic similarity between matched themes.
Paraphrase ceiling: The best realistic case -- comparing themes to their own LLM paraphrases (same meaning, different words).
Word-salad floor: Random baseline -- comparing to shuffled words with no semantic meaning.
{% if comp.stats.word_salad_samples %}Each sample contains {{ comp.stats.word_salad_samples[0]|length }} scrambled "themes" (matching B's theme count). Words from B's themes are randomly shuffled while preserving length.
{% for sample_idx, sample in enumerate(comp.stats.word_salad_samples) %}Paraphrase Upper Bound
LLM-generated paraphrases of each theme establish a realistic upper bound for alignment. Paraphrases capture the same meaning in different words -- this represents the best achievable similarity between semantically equivalent analyses.
- Mean self-similarity: {{ "%.3f"|format(comp.stats.paraphrase_baseline.paraphrase_similarity_mean) }} (paraphrase ceiling)
- Std dev: {{ "%.3f"|format(comp.stats.paraphrase_baseline.paraphrase_similarity_std) }}
- Model: {{ comp.stats.paraphrase_baseline.metadata.model }}
- Paraphrases per theme: {{ comp.stats.paraphrase_baseline.metadata.n_paraphrases }}
Sample themes with their LLM-generated paraphrases. Self-similarity shown for each theme (similarity between original and paraphrases).
{{ comp.a.name }}:
{% for sample in comp.stats.paraphrase_baseline.samples_a[:3] %}{{ comp.b.name }}:
{% for sample in comp.stats.paraphrase_baseline.samples_b[:3] %}Effect Sizes
How far is the observed value from the baselines? Measured in MADs (median absolute deviations).
| Metric | MADs above floor | MADs below ceiling |
|---|---|---|
| Shared Mass | +{{ "%.1f"|format(ot_k.ot.shared_mass_effect) }} | {% if ot_k.ot.paraphrase_upper_bound is defined and comp.stats.paraphrase_baseline and comp.stats.paraphrase_baseline.paraphrase_similarity_std > 0 %} -{{ "%.1f"|format((ot_k.ot.paraphrase_upper_bound - ot_k.ot.shared_mass) / comp.stats.paraphrase_baseline.paraphrase_similarity_std) }} {% else %}-{% endif %} |
| Alignment | {% if ot_k.ot.avg_cost_effect is defined %}+{{ "%.1f"|format(ot_k.ot.avg_cost_effect) }}{% else %}-{% endif %} | {% if ot_k.ot.paraphrase_cost_lower_bound is defined and comp.stats.paraphrase_baseline and comp.stats.paraphrase_baseline.paraphrase_similarity_std > 0 %} {% set observed_alignment = 1 - ot_k.ot.avg_cost %} {% set ceiling_alignment = 1 - ot_k.ot.paraphrase_cost_lower_bound %} -{{ "%.1f"|format((ceiling_alignment - observed_alignment) / comp.stats.paraphrase_baseline.paraphrase_similarity_std) }} {% else %}-{% endif %} |
MADs above floor: How many MADs above word-salad baseline (higher = more distinct from random).
MADs below ceiling: How many MADs below paraphrase ceiling (lower = closer to ideal).
Note: MAD (median absolute deviation) is a robust measure of spread, less sensitive to outliers than standard deviation. Do not compare effect sizes across analyses with different embedding lengths.
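A MAD-based effect size might be computed as in the sketch below. It assumes the observed value is centred on the median of the null (word-salad) samples and divided by their MAD, with no scaling constant; the report's pipeline may centre or scale differently. Function names and sample values are hypothetical.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])

def effect_size_in_mads(observed, null_values):
    """Distance of the observed value from the null median, in MADs."""
    spread = mad(null_values)
    if spread == 0:
        return None  # undefined when the null samples have no spread
    return (observed - median(null_values)) / spread

# Illustrative null samples with one high outlier (0.40).
null_samples = [0.30, 0.32, 0.35, 0.31, 0.40]
print(effect_size_in_mads(0.50, null_samples))
```

The outlier barely moves the MAD, which is the robustness property the note above refers to.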
Embedding Metadata
- Mean embedding length A: {{ "%.1f"|format(comp.stats.mean_embedding_words_a) }} words
- Mean embedding length B: {{ "%.1f"|format(comp.stats.mean_embedding_words_b) }} words
Transport Flow (Sankey)
Width of links shows amount of mass transported between themes. Colour indicates alignment quality (green = high similarity, red = low similarity). Hover over links for details.
▶ About the colour scale
Colour scale calibration. Link colours represent the cosine similarity between connected themes, mapped to a green-amber-red gradient. To ensure comparability across K values, all plots use a shared colour scale derived from the default K={{ "%.2f"|format(comp.stats.default_k) }} transport plan.
The scale endpoints are set to the minimum and maximum similarity values observed among links in the default K solution (similarity range: {{ "%.2f"|format(comp.stats.color_sim_min) }}--{{ "%.2f"|format(comp.stats.color_sim_max) }}). Green indicates the highest-similarity matches; red indicates the lowest-similarity matches within this analysis.
This calibration means that as K increases and additional lower-quality matches are transported, these appear as progressively redder links. At low K values, only the best matches (greenest links) are transported; higher K values force the algorithm to include weaker alignments. The consistent scale across K values allows direct visual comparison of match quality.
Note: Because the scale is normalised to each comparison's observed range, colours are not directly comparable across different pairwise comparisons. Within a single comparison, however, the colour scale provides an intuitive representation of relative alignment quality across the full range of K values examined.
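The shared colour scale described above can be sketched as a clamped linear mapping. The RGB endpoints and the function name `link_colour` are hypothetical; the clamp reflects the assumption that links at other K values may fall outside the range observed at the default K.

```python
def link_colour(similarity, sim_min, sim_max):
    """Map a similarity to an (r, g, b) tuple on a red-amber-green gradient.

    sim_min and sim_max are the endpoints taken from the default-K plan.
    """
    if sim_max == sim_min:
        t = 1.0
    else:
        t = (similarity - sim_min) / (sim_max - sim_min)
    t = max(0.0, min(1.0, t))  # clamp values outside the default-K range
    # Hypothetical palette; the report's actual colours may differ.
    red, amber, green = (220, 50, 47), (255, 191, 0), (46, 160, 67)
    if t < 0.5:
        lo, hi, u = red, amber, t / 0.5
    else:
        lo, hi, u = amber, green, (t - 0.5) / 0.5
    return tuple(round(a + (b - a) * u) for a, b in zip(lo, hi))

print(link_colour(0.5, 0.0, 1.0))   # midpoint of the scale
print(link_colour(0.1, 0.2, 0.8))   # below the range, clamped to red
```

Fixing the endpoints once, rather than rescaling per plot, is what keeps a given colour tied to the same similarity at every K.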
Transport Plan Heatmap
Each cell shows the percentage of transported mass flowing from a theme in A to a theme in B. Values sum to 100%.
For each theme, how much of its mass was transported? Low coverage = theme is conceptually distinct from the other set.
| Theme | Coverage |
|---|---|
| {{ theme.theme_name }} | {{ "%.2f"|format(ot_k.ot.coverage_a[i]) }} |
| Theme | Coverage |
|---|---|
| {{ theme.theme_name }} | {{ "%.2f"|format(ot_k.ot.coverage_b[i]) }} |
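The heatmap percentages and per-theme coverage shown above could be derived from a transport plan roughly as follows. The sketch assumes coverage is the mass a theme sends or receives divided by that theme's original mass; `mass_a`, `mass_b`, and both function names are hypothetical.

```python
def heatmap_percentages(plan):
    """Express each cell as a percentage of all transported mass."""
    total = sum(sum(row) for row in plan)
    return [[100.0 * cell / total for cell in row] for row in plan]

def theme_coverage(plan, mass_a, mass_b):
    """Fraction of each theme's original mass that was transported."""
    cov_a = [sum(row) / m for row, m in zip(plan, mass_a)]
    cov_b = [
        sum(row[j] for row in plan) / m for j, m in enumerate(mass_b)
    ]
    return cov_a, cov_b

# Two themes per set, each starting with half of its set's mass.
plan = [[0.4, 0.0],
        [0.1, 0.2]]
print(heatmap_percentages(plan))
print(theme_coverage(plan, mass_a=[0.5, 0.5], mass_b=[0.5, 0.5]))
```

In this toy plan the second B theme receives well under half of its mass, which is the pattern the coverage tables flag as conceptually distinct.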
Best Matches (many:many)
Shows the best match for each theme, allowing multiple themes to match the same target. The OT columns show mass flow from optimal transport at the default K (K={{ "%.2f"|format(comp.stats.default_k) }}).
For each theme in {{ comp.a.name }}, the most similar theme in {{ comp.b.name }}
| Theme in {{ comp.a.name }} | Best Match in {{ comp.b.name }} | Sim | % Mass Transferred | Coverage |
|---|---|---|---|---|
|
{{ theme_a.theme_name }} {{ theme_a.embedded_string }} |
{{ theme_b.theme_name }} {{ theme_b.embedded_string }} |
{{ "%.2f"|format(match.similarity) }} | {{ "%.0f"|format(match.mass_pct) }}% | {{ "%.1f"|format(match.mass_total * 100) }}% |
For each theme in {{ comp.b.name }}, the most similar theme in {{ comp.a.name }}
| Theme in {{ comp.b.name }} | Best Match in {{ comp.a.name }} | Sim | % Mass Transferred | Coverage |
|---|---|---|---|---|
|
{{ theme_b.theme_name }} {{ theme_b.embedded_string }} |
{{ theme_a.theme_name }} {{ theme_a.embedded_string }} |
{{ "%.2f"|format(match.similarity) }} | {{ "%.0f"|format(match.mass_pct) }}% | {{ "%.1f"|format(match.mass_total * 100) }}% |
Additional Distance Metrics
Alternative distance functions for specialised analyses.
Shepard Similarity (k={{ comp.stats.shepard_k_value }})
Applies exponential decay to angular distance, so similarity falls off as exp(-k × distance). A cognitively realistic similarity function.
Within-set baseline: Mean = {{ "%.3f"|format(comp.stats.within_set_stats.mean) }}, SD = {{ "%.3f"|format(comp.stats.within_set_stats.std) }}
Percentile-Normalized
Cross-set similarity relative to within-set distribution. 0.80 = more similar than 80% of within-set pairs.
Z-Score Normalized
Standard deviations above/below typical within-set similarity. Useful for identifying outliers.
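The three alternative metrics above might be computed as in the following sketch. It assumes the Shepard form exp(-k × distance), a strict "less than" comparison for the percentile, and a population standard deviation for the z-score; all function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def shepard_similarity(angular_distance, k):
    """Exponential decay of similarity with distance."""
    return math.exp(-k * angular_distance)

def percentile_normalised(cross_sim, within_sims):
    """Share of within-set pairs that are less similar than this pair."""
    below = sum(1 for s in within_sims if s < cross_sim)
    return below / len(within_sims)

def z_score_normalised(cross_sim, within_sims):
    """Standard deviations above or below the within-set mean."""
    sd = pstdev(within_sims)  # assumption: population SD
    if sd == 0:
        return None
    return (cross_sim - mean(within_sims)) / sd

# Illustrative within-set similarities.
within = [0.2, 0.3, 0.4, 0.5, 0.6]
print(shepard_similarity(0.5, k=2))
print(percentile_normalised(0.55, within))
print(z_score_normalised(0.55, within))
```

With these values the cross-set pair is more similar than four of the five within-set pairs, matching the 0.80 reading described above.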
Comparison Configuration
{{ comparison.config | tojson(indent=2) }}
Additional Data
Download raw data files for further analysis.