This visualization shows the similarity of {{context_type}}s in the {{#corpus_link}}{{corpus_name}}{{/corpus_link}} {{^corpus_link}}{{corpus_name}}{{/corpus_link}}.
This visualization shows the similarity of {{context_type}}s in the {{#corpus_link}}{{corpus_name}}{{/corpus_link}} {{^corpus_link}}{{corpus_name}}{{/corpus_link}} to .
Use the search box below to select an entry, or click the button to populate the box with a random {{context_type}}.
The color bands within each article's row show the topic distribution within that article, and the relative size of each band indicates the weight of that topic in the article. The total width of each row indicates similarity to the focal topic or document, measured by the quantity sim(doc) = 1 – JSD(doc, ), where JSD is the Jensen-Shannon distance between the word probability distributions of the two items. Each topic's label is arbitrarily assigned, while the color corresponds to the topic's oscillation across documents. Both label and color are consistent across articles in the browser.
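As a rough illustration of the similarity measure described above, the following Python sketch computes 1 minus the Jensen-Shannon distance between two probability distributions. It is not taken from the InPhO VSM module: the function names (`kl_divergence`, `js_distance`, `similarity`) are hypothetical, and it assumes base-2 logarithms so that the distance, and therefore the similarity, falls between 0 and 1.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms where p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence.

    Assumption: base-2 logs, which bound the result to [0, 1].
    """
    # Midpoint distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def similarity(p, q):
    """Similarity as described in the text: 1 - JSD(p, q)."""
    return 1.0 - js_distance(p, q)
```

Under these assumptions, two identical distributions have similarity 1, and two distributions with no overlapping support have similarity 0.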
Display options include topic normalization, alphabetical sort, and topic sort. Normalizing topics expands the combined width of each bar so that per-document topic weights can be compared. Clicking a topic reorders the documents according to that topic's weight and reorders the topic bars according to the topic weights in the highest-weighted document. When a topic is selected, clicking "Top Documents for [Topic]" opens a new page showing the documents most similar to that topic's word distribution. The original sort order can be restored with the "Reset Topic Sort" button.
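The normalization and topic-sort behavior can be sketched in a few lines of Python. This is only an illustration, not the page's actual code: the data layout (a list of `(doc_id, {topic: weight})` pairs) and the function names `normalize` and `sort_by_topic` are assumptions made for the example.

```python
def normalize(weights):
    """Rescale one document's topic weights so they sum to 1."""
    total = sum(weights.values())
    if total == 0:
        return dict(weights)
    return {topic: w / total for topic, w in weights.items()}


def sort_by_topic(docs, topic):
    """Reorder documents and topics around a selected topic.

    docs: list of (doc_id, {topic: weight}) pairs (hypothetical layout).
    Returns the documents sorted by the selected topic's weight, highest
    first, and the topic order taken from the highest-weighted document.
    """
    ranked = sorted(docs, key=lambda d: d[1].get(topic, 0.0), reverse=True)
    top_weights = ranked[0][1] if ranked else {}
    topic_order = sorted(top_weights, key=top_weights.get, reverse=True)
    return ranked, topic_order
```

For example, calling `sort_by_topic(docs, "t1")` places the document with the largest weight for the hypothetical topic `"t1"` first, and orders the topic bars by that document's weights.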
This data was generated using the InPhO VSM module's LDA implementation. See Wikipedia: Latent Dirichlet Allocation for more on the LDA topic modeling approach or "Probabilistic Topic Models" (Blei, 2012) for a recent review.
Source code and issue tracking for this page are available on GitHub. Contact Jaimie Murdock (jammurdo) with further comments.