Metadata-Version: 2.4
Name: lipe
Version: 0.2.0
Summary: Latent Interview Protocol Engineer (LIPE)
Author: Zachary Lim
License: MIT
Project-URL: Repository, https://github.com/zacthinks/lipe
Keywords: topic-modeling,qualitative,interviews,nlp
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23
Requires-Dist: pandas>=1.5
Requires-Dist: scipy>=1.9
Requires-Dist: scikit-learn>=1.2
Requires-Dist: umap-learn>=0.5
Requires-Dist: hdbscan>=0.8
Requires-Dist: sentence-transformers>=2.2
Requires-Dist: plotly>=5.0
Requires-Dist: networkx>=2.8
Requires-Dist: pyvis>=0.3
Requires-Dist: wordcloud>=1.9
Requires-Dist: spacy>=3.7
Requires-Dist: interactive-topic-model>=0.1.0
Dynamic: license-file

# Latent Interview Protocol Engineer (LIPE)

![PyPI](https://img.shields.io/pypi/v/lipe)
![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)
![Python Version](https://img.shields.io/pypi/pyversions/lipe)


**Latent Interview Protocol Engineer (LIPE)** is a Python utility for mapping the latent structure of interview corpora. It builds on the [Interactive Topic Model](https://github.com/zacthinks/InteractiveTopicModel) to recover recurring protocol questions, support qualitative labeling/merging/splitting, and visualize interview flows. Its primary goal is to help researchers navigate and explore their interview corpus effectively.

- Mapping out the underlying structure of a corpus of interviews
   - The underlying assumption is that interviews on a given topic explore a shared set of key questions. These questions often follow a certain logic (e.g., if the interviewee says X, ask Y; otherwise ask Z). We refer to this set of key questions and how they connect as the *interview protocol*. In more structured interviews, the protocol may be known beforehand, but interviewers often diverge from it as the questioning adapts to the nuances of the phenomenon. Other times, there may be no explicit protocol to start with. The goal of LIPE is to recover this interview protocol as it is latently expressed across a corpus of interviews.
   - Protocol questions are distinct from incidental questions, such as specific follow-up or clarification questions, in that protocol questions characterize the entire corpus of interviews. As such, we can expect protocol questions to occur in a significant portion of the interviews, even if they are worded slightly differently.
   - Whether a question is protocol or incidental also depends on the researcher's conception of which questions are important. This is why researchers need full control over what counts as a protocol question.
   - LIPE provides an interface (building on the [Interactive Topic Model](https://github.com/zacthinks/InteractiveTopicModel)) for researchers to qualitatively examine, label, merge, and split questions.
- Navigating mapped-out interview corpora
   - Once a corpus of interviews has been mapped out, researchers can use the map to navigate the otherwise unwieldy corpus.
   - LIPE allows researchers to extract responses to particular protocol questions across the corpus. These responses include responses to any follow-up questions.
   - LIPE also includes basic exploratory tools for the answers to particular questions, including visualizations and topic models of the answers.
- LIPE takes as input a structured dataframe where each row is a line from an interview. Each row needs an interview ID, a line ID, and a speaker ID in addition to the text of that line.
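
As a rough illustration of the input shape described above, the sketch below builds a small dataframe by hand. The column names (`interview_id`, `line_id`, `speaker_id`, `text`) and the sample lines are assumptions made for this example only; check the docstrings or the example notebook for the names LIPE actually expects.

```python
import pandas as pd

# Column names and contents are illustrative assumptions,
# not LIPE's documented schema.
lines = pd.DataFrame(
    {
        "interview_id": [1, 1, 1, 1, 2, 2],
        "line_id": [0, 1, 2, 3, 0, 1],
        "speaker_id": [
            "interviewer", "participant",
            "interviewer", "participant",
            "interviewer", "participant",
        ],
        "text": [
            "Can you tell me about your background?",
            "I trained as a nurse and moved here in 2015.",
            "What does a typical day look like for you?",
            "I start early and do rounds before lunch.",
            "Could you describe your background?",
            "I have been teaching for about ten years.",
        ],
    }
)

# One row per interview line; the interview ID and line ID together
# should identify each row uniquely.
assert not lines.duplicated(subset=["interview_id", "line_id"]).any()
```

Note that the two opening questions are worded differently but ask the same thing, which is the kind of recurring question LIPE aims to group.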



## Requirements

LIPE requires Python 3.9+ and the following packages:

```
numpy, pandas, scipy, scikit-learn, umap-learn, hdbscan, sentence-transformers, plotly, networkx, pyvis, wordcloud, spacy, nbformat, interactive-topic-model
```

These will be installed automatically with pip, but see [requirements.txt](requirements.txt) for details.

## Quick Start

### Installation

```bash
pip install lipe
```

### Basic Usage

```python
import pandas as pd
from LIPE import LIPE

lines = pd.read_csv("toy_interviews.csv")

l = LIPE(lines)

l.get_question_info()
```


## API Reference

See the [example notebook](examples/example.ipynb) for a full workflow. For detailed API documentation, see the docstrings in [core.py](core.py) or use Python's `help()` function after importing LIPE.

## Features

- Recover protocol questions from interviewer lines
- Manual labeling, merging, splitting, and archiving of questions
- Extract answers by question (lines, merged, or sentence-level)
- Transition graph construction and visualization
- Answer-level analytics (TF-IDF, word clouds, LDA, embedding plots)
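
To give a sense of what a transition graph captures, here is a minimal sketch that counts how often one question label is followed by another within an interview. It is not LIPE's implementation: the labels, the `sequences` dict, and the `count_transitions` helper are hypothetical, and the sketch uses only the standard library.

```python
from collections import Counter


def count_transitions(sequences):
    """Count (source, target) pairs of consecutive question labels."""
    counts = Counter()
    for labels in sequences.values():
        # Pair each label with the one that follows it in the same interview.
        counts.update(zip(labels, labels[1:]))
    return counts


# Hypothetical question labels, in the order asked, for each interview.
sequences = {
    "interview_1": ["background", "daily_routine", "challenges"],
    "interview_2": ["background", "challenges"],
    "interview_3": ["background", "daily_routine", "challenges"],
}

edges = count_transitions(sequences)
for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target}: {weight}")
```

Each counted pair corresponds to a weighted, directed edge, so the result could be loaded into a graph library such as networkx for layout and plotting.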

## Examples

See [`example.ipynb`](examples/example.ipynb) for a full workflow using a toy dataset.


## Contributing

Contributions are welcome! Please open an issue or pull request. See [CONTRIBUTING.md](CONTRIBUTING.md) if available, or contact the maintainer for guidance.

## Support & Contact

For questions, issues, or feature requests, please open an issue on [GitHub](https://github.com/zacthinks/lipe).

## License

MIT License
