Metadata-Version: 2.4
Name: lipe
Version: 0.3.0
Summary: Latent Interview Protocol Engineer (LIPE)
Author: Zachary Lim
License: MIT
Project-URL: Repository, https://github.com/zacthinks/lipe
Keywords: topic-modeling,qualitative,interviews,nlp
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23
Requires-Dist: pandas>=1.5
Requires-Dist: scipy>=1.9
Requires-Dist: scikit-learn>=1.2
Requires-Dist: umap-learn>=0.5
Requires-Dist: hdbscan>=0.8
Requires-Dist: sentence-transformers>=2.2
Requires-Dist: plotly>=5.0
Requires-Dist: networkx>=2.8
Requires-Dist: pyvis>=0.3
Requires-Dist: wordcloud>=1.9
Requires-Dist: spacy>=3.7
Requires-Dist: interactive-topic-model>=0.1.0
Dynamic: license-file

# Latent Interview Protocol Engineer (LIPE)

![PyPI](https://img.shields.io/pypi/v/lipe)
![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)
![Python Version](https://img.shields.io/pypi/pyversions/lipe)


**Latent Interview Protocol Engineer (LIPE)** is a Python utility for mapping the latent structure of interview corpora. It builds on the [Interactive Topic Model](https://github.com/zacthinks/InteractiveTopicModel) to recover recurring protocol questions, support qualitative labeling/merging/splitting, and visualize interview flows. Its primary goal is to help researchers navigate and explore their interview corpus effectively.

- Mapping out the underlying structure of a corpus of interviews
   - The underlying assumption is that interviews on a given topic explore a shared set of key questions. These questions often follow a certain logic (e.g., if the interviewee says X, then ask Y; otherwise ask Z). We refer to this set of key questions and how they connect as the *interview protocol*. In more structured interviews, the protocol might be known beforehand, but interviewers often diverge from it as the line of questioning adapts to the nuances of the phenomenon. Other times, there may be no explicit protocol to start with. The goal of LIPE is to recover this interview protocol as it is latently expressed across a corpus of interviews.
   - Protocol questions are distinct from incidental questions, such as specific follow-up or clarification questions, in that protocol questions characterize the entire corpus of interviews. As such, we can expect protocol questions to occur in a significant portion of the interviews, even if they are worded slightly differently.
   - Whether a question is protocol or incidental also depends on the researcher's conception of which questions are important. This is why researchers need full control over what counts as a protocol question.
   - LIPE provides an interface (building on the [Interactive Topic Model (ITM)](https://github.com/zacthinks/InteractiveTopicModel)) for researchers to qualitatively examine, label, merge, and split questions. Because LIPE is a subclass of ITM, you can refer to the ITM examples and documentation for more information.
- Navigating mapped out interview corpora
   - Once a corpus of interviews has been mapped out, researchers can use it to navigate the otherwise unwieldy corpus.
   - LIPE allows researchers to extract responses to particular protocol questions across the corpus. These responses include responses to any follow-up questions.
   - LIPE also includes basic exploratory tools for the answers to particular questions, including visualizations and topic models of the answers.
- LIPE takes as input a structured dataframe where each row is a line from an interview. Each line needs an interview ID, a line ID, and a speaker ID in addition to the text for that line.
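
As a rough illustration of the input format described above, the sketch below builds a tiny line-level dataframe and runs two sanity checks on it. The column names (`interview_id`, `line_id`, `speaker_id`, `text`) and the sample rows are assumptions made for this example only; LIPE may expect different names, so check the example notebook before relying on them.

```python
import pandas as pd

# Hypothetical column names for illustration; LIPE may expect different ones.
REQUIRED_COLUMNS = ["interview_id", "line_id", "speaker_id", "text"]

# One row per interview line (made-up toy data).
lines = pd.DataFrame(
    {
        "interview_id": [1, 1, 1, 2, 2],
        "line_id": [0, 1, 2, 0, 1],
        "speaker_id": [
            "interviewer",
            "respondent",
            "interviewer",
            "interviewer",
            "respondent",
        ],
        "text": [
            "Can you tell me about your background?",
            "I trained as a nurse and moved into research.",
            "What does a typical day look like?",
            "Could you describe your background?",
            "I have worked in logistics for ten years.",
        ],
    }
)

# Check that every required column is present.
missing = [col for col in REQUIRED_COLUMNS if col not in lines.columns]

# Each (interview_id, line_id) pair should identify exactly one line.
has_duplicates = lines.duplicated(subset=["interview_id", "line_id"]).any()

print("missing columns:", missing)
print("duplicate line keys:", bool(has_duplicates))
```

Checks like these are cheap to run before passing a dataframe to LIPE and tend to catch export problems (dropped columns, repeated line numbers) early.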

## Quick Start

### Installation

```bash
pip install lipe
```

### Basic Usage

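```python
import pandas as pd
from LIPE import LIPE

# Load a line-level table: one row per interview line
lines = pd.read_csv("toy_interviews.csv")

# Build the LIPE model from the interview lines
l = LIPE(lines)

# Inspect information about the questions found in the corpus
l.get_question_info()
```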

## Features

- Recover protocol questions from interviewer lines
- Manual labeling, merging, splitting, and archiving of questions
- Extract answers by question (lines, merged, or sentence-level)
- Transition graph construction and visualization
- Answer-level analytics (TF-IDF, word clouds, LDA, embedding plots)
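
To give a sense of what a transition graph captures, here is a minimal standalone sketch that counts how often one question is followed by another within an interview. It does not use LIPE's API: the `question_sequences` data, the question labels, and the `count_transitions` helper are all hypothetical, and LIPE's own construction may differ.

```python
from collections import Counter

# Hypothetical data: the ordered protocol-question labels seen in each interview.
question_sequences = {
    "interview_1": ["background", "daily_routine", "challenges"],
    "interview_2": ["background", "challenges"],
    "interview_3": ["background", "daily_routine", "challenges"],
}


def count_transitions(sequences):
    """Count (source, target) pairs of consecutive questions across interviews."""
    transitions = Counter()
    for labels in sequences.values():
        # Pair each question with the one asked immediately after it.
        for source, target in zip(labels, labels[1:]):
            transitions[(source, target)] += 1
    return transitions


transitions = count_transitions(question_sequences)

for (source, target), count in sorted(transitions.items()):
    print(f"{source} -> {target}: {count}")
```

The resulting counts can serve as edge weights in a directed graph, for example one built with `networkx`, which is among the package's dependencies.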

## Examples

See [`example.ipynb`](examples/example.ipynb) for a full workflow using a toy dataset.


## Contributing

Contributions are welcome! Please open an issue or pull request.

## Support & Contact

For questions, issues, or feature requests, please open an issue on GitHub or contact the maintainer at <zacthinks@outlook.com>.

## License

MIT License
