Metadata-Version: 2.4
Name: design-research-analysis
Version: 0.1.1
Summary: Utilities for design research analysis workflows
Author: The Design Research Collective
Maintainer-email: "Christopher C. McComb" <ccm@cmu.edu>
License-Expression: MIT
Project-URL: Homepage, https://cmudrc.github.io/design-research-analysis/
Project-URL: Documentation, https://cmudrc.github.io/design-research-analysis/
Project-URL: Repository, https://github.com/cmudrc/design-research-analysis
Project-URL: Issues, https://github.com/cmudrc/design-research-analysis/issues
Keywords: analysis,design-research,hmm,markov,sequence
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: matplotlib<4,>=3.8
Requires-Dist: numpy<3,>=1.26
Provides-Extra: dev
Requires-Dist: build<2,>=1.2; extra == "dev"
Requires-Dist: mypy<2,>=1.10; extra == "dev"
Requires-Dist: pre-commit<5,>=3.7; extra == "dev"
Requires-Dist: pydata-sphinx-theme<1,>=0.16; extra == "dev"
Requires-Dist: pytest<9,>=8.2; extra == "dev"
Requires-Dist: pytest-cov<8,>=7.0; extra == "dev"
Requires-Dist: ruff<1,>=0.6.0; extra == "dev"
Requires-Dist: sphinx<9,>=7.4; extra == "dev"
Requires-Dist: sphinx-copybutton<1,>=0.5; extra == "dev"
Requires-Dist: sphinx-rtd-theme<4,>=2.0; extra == "dev"
Requires-Dist: twine<7,>=5.1; extra == "dev"
Provides-Extra: table
Provides-Extra: data
Requires-Dist: pandas<3,>=2.2; extra == "data"
Provides-Extra: seq
Requires-Dist: hmmlearn<1,>=0.3.2; extra == "seq"
Requires-Dist: networkx<4,>=3.3; extra == "seq"
Requires-Dist: scipy<2,>=1.11; extra == "seq"
Provides-Extra: embeddings
Requires-Dist: sentence-transformers<4,>=3.0; extra == "embeddings"
Provides-Extra: lang
Requires-Dist: scikit-learn<2,>=1.5; extra == "lang"
Provides-Extra: maps
Requires-Dist: pacmap<1,>=0.8; extra == "maps"
Requires-Dist: scikit-learn<2,>=1.5; extra == "maps"
Requires-Dist: trimap<2,>=1.1; extra == "maps"
Requires-Dist: umap-learn<1,>=0.5.7; extra == "maps"
Provides-Extra: dimred
Requires-Dist: pacmap<1,>=0.8; extra == "dimred"
Requires-Dist: scikit-learn<2,>=1.5; extra == "dimred"
Requires-Dist: trimap<2,>=1.1; extra == "dimred"
Requires-Dist: umap-learn<1,>=0.5.7; extra == "dimred"
Provides-Extra: stats
Requires-Dist: scipy<2,>=1.11; extra == "stats"
Requires-Dist: statsmodels<1,>=0.14; extra == "stats"
Requires-Dist: pandas<3,>=2.2; extra == "stats"
Provides-Extra: all
Requires-Dist: hmmlearn<1,>=0.3.2; extra == "all"
Requires-Dist: networkx<4,>=3.3; extra == "all"
Requires-Dist: scipy<2,>=1.11; extra == "all"
Requires-Dist: sentence-transformers<4,>=3.0; extra == "all"
Requires-Dist: pacmap<1,>=0.8; extra == "all"
Requires-Dist: scikit-learn<2,>=1.5; extra == "all"
Requires-Dist: trimap<2,>=1.1; extra == "all"
Requires-Dist: umap-learn<1,>=0.5.7; extra == "all"
Requires-Dist: statsmodels<1,>=0.14; extra == "all"
Requires-Dist: pandas<3,>=2.2; extra == "all"
Dynamic: license-file

# design-research-analysis
[![CI](https://github.com/cmudrc/design-research-analysis/actions/workflows/ci.yml/badge.svg)](https://github.com/cmudrc/design-research-analysis/actions/workflows/ci.yml)
[![Coverage](https://raw.githubusercontent.com/cmudrc/design-research-analysis/main/.github/badges/coverage.svg)](https://github.com/cmudrc/design-research-analysis/actions/workflows/ci.yml)
[![Examples Passing](https://raw.githubusercontent.com/cmudrc/design-research-analysis/main/.github/badges/examples-passing.svg)](https://github.com/cmudrc/design-research-analysis/actions/workflows/examples.yml)
[![Public API In Examples](https://raw.githubusercontent.com/cmudrc/design-research-analysis/main/.github/badges/examples-api-coverage.svg)](https://github.com/cmudrc/design-research-analysis/actions/workflows/examples.yml)
[![Docs](https://github.com/cmudrc/design-research-analysis/actions/workflows/docs-pages.yml/badge.svg)](https://github.com/cmudrc/design-research-analysis/actions/workflows/docs-pages.yml)

<!-- release-callout:start -->
> [!IMPORTANT]
> Current monthly release: [Mellon Metrics - May 2026](https://github.com/cmudrc/design-research-analysis/milestone/2)  
> Due: May 1, 2026  
> Tracks: April 2026 work
<!-- release-callout:end -->

`design-research-analysis` is the unified-table analysis layer in the cmudrc design research ecosystem.

It provides typed, reusable workflows for sequence, language, embedding-map, and statistical analysis over recurring event logs.

## Overview

This package centers on reproducible analysis workflows with a small top-level API:

- Unified-table coercion, validation, and mapper-based derived columns
- Dataset profiling, schema checks, and codebook generation
- Sequence modeling (Markov chains, discrete HMM, Gaussian HMM)
- Language analysis (semantic convergence trajectories, topic modeling, sentiment scoring)
- Embedding maps (PCA, t-SNE, UMAP, PaCMAP, TriMap) with clustering, comparison, and trajectory-plotting helpers
- Statistical wrappers (group comparisons, OLS regression, mixed-effects models, nonparametrics, and power)
- Runtime provenance capture for reproducibility manifests
- A thin CLI for deterministic pipeline runs
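
As a conceptual illustration of the Markov-chain item above, the sketch below estimates first-order transition probabilities from a few state sequences in plain Python. It is not the package's implementation or API: the function name `transition_probabilities` and the state labels are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order transition probabilities from state sequences.

    Hypothetical helper for illustration; not part of design-research-analysis.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each observed (current state, next state) pair.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts so outgoing probabilities sum to 1.
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Made-up event sequences, one list per session.
sessions = [
    ["ideate", "sketch", "evaluate"],
    ["ideate", "sketch", "sketch", "evaluate"],
]
probs = transition_probabilities(sessions)
print(probs["sketch"])
```

States that never appear as a source (here `evaluate`) have no row in the result, which is one of several reasonable conventions for a count-based estimate.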

## Quickstart

Requires Python 3.12+.
Maintainer workflows target Python `3.12` (`.python-version`).

Install from PyPI:

```bash
python -m pip install --upgrade pip
pip install design-research-analysis
```

Common install profiles:

```bash
pip install "design-research-analysis[seq]"
pip install "design-research-analysis[lang,embeddings]"
pip install "design-research-analysis[maps]"
pip install "design-research-analysis[stats,data]"
pip install "design-research-analysis[all]"
```
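
To see which install profiles are already usable in an environment, a small check along the following lines may help. It is a sketch, not a tool shipped with the package: the mapping is transcribed from the extras declared in the package metadata, using each distribution's usual import name (for example `sklearn` for scikit-learn and `umap` for umap-learn), and the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names for the third-party dependencies behind each extra.
# Assumption: these mirror the extras declared in the package metadata.
EXTRA_MODULES = {
    "data": ["pandas"],
    "seq": ["hmmlearn", "networkx", "scipy"],
    "embeddings": ["sentence_transformers"],
    "lang": ["sklearn"],
    "maps": ["pacmap", "sklearn", "trimap", "umap"],
    "stats": ["scipy", "statsmodels", "pandas"],
}


def missing_modules(extra):
    """Return the modules for an extra that cannot be found in this environment."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


for extra in EXTRA_MODULES:
    missing = missing_modules(extra)
    status = "ready" if not missing else "missing: " + ", ".join(missing)
    print(f"{extra}: {status}")
```

The check only looks for importable modules; it does not verify that installed versions fall inside the pinned ranges.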

For contributor workflows:

```bash
python -m venv .venv
source .venv/bin/activate
make dev
make test
```

Run a compact end-to-end example:

```bash
PYTHONPATH=src python examples/basic_usage.py
```

For dependency profiles and release-check guidance, see [Dependencies and Extras](https://cmudrc.github.io/design-research-analysis/dependencies_and_extras.html).

## CLI

The package installs a `design-research-analysis` CLI:

```bash
design-research-analysis validate-table --input data/events.csv --summary-json artifacts/validate.json
design-research-analysis run-sequence --input data/events.csv --summary-json artifacts/sequence.json --mode markov
design-research-analysis run-language --input data/events.csv --summary-json artifacts/language.json --trajectory-csv artifacts/language_trajectory.csv
design-research-analysis run-embedding-maps --input data/events.csv --summary-json artifacts/embedding_maps.json --map-csv artifacts/embedding_maps.csv
design-research-analysis run-stats --input data/events.csv --summary-json artifacts/stats.json --mode regression --x-columns x1,x2 --y-column y
```
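
When the CLI is driven from a Python script, building the argument list explicitly avoids shell-quoting problems. The sketch below reuses only the `validate-table` subcommand and flags shown above; the helper names `validate_table_command` and `run` are hypothetical, and whether the CLI signals a failed validation through a non-zero exit status is an assumption to confirm against the CLI reference.

```python
import shlex
import subprocess


def validate_table_command(input_csv, summary_json):
    """Build the argument list for the validate-table subcommand."""
    return [
        "design-research-analysis",
        "validate-table",
        "--input", str(input_csv),
        "--summary-json", str(summary_json),
    ]


def run(command):
    """Run a command; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(command, check=True)


command = validate_table_command("data/events.csv", "artifacts/validate.json")
# Print the shell-quoted form for logging or copy-paste.
print(shlex.join(command))
# To execute (requires the package to be installed): run(command)
```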

The Python API also accepts file paths at its main ingestion points, for example
`coerce_unified_table("data/events.csv")` and `profile_dataframe("data/events.csv")`.

## Examples

Start with [examples/README.md](https://github.com/cmudrc/design-research-analysis/blob/main/examples/README.md) for runnable scripts across all analysis families.

## Docs

See the [published documentation](https://cmudrc.github.io/design-research-analysis/) for quickstart, workflow guidance, schema details, CLI reference, and API docs.

Build docs locally with:

```bash
make docs
```

## Public API

The supported public surface consists of the names listed in `design_research_analysis.__all__`.

Top-level exports include:

- Package metadata: `__version__`
- Table contracts: `UnifiedTableConfig`, `UnifiedTableValidationReport`, `coerce_unified_table`, `derive_columns`, `validate_unified_table`
- Sequence: `fit_markov_chain_from_table`, `fit_discrete_hmm_from_table`, `fit_text_gaussian_hmm_from_table`, `decode_hmm`, plotting helpers, and result types
- Language: `compute_language_convergence`, `compute_semantic_distance_trajectory`, `fit_topic_model`, `score_sentiment`
- Embedding maps: `embed_records`, `build_embedding_map`, `cluster_embedding_map`, `compare_embedding_maps`, `plot_embedding_map`, `plot_embedding_map_grid`
- Statistics: `compare_groups`, `fit_regression`, `fit_mixed_effects`, `permutation_test`, `bootstrap_ci`, power helpers
- Dataset + runtime: `profile_dataframe`, `validate_dataframe`, `generate_codebook`, `capture_run_context`, `attach_provenance`, `write_run_manifest`
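
One way to inspect the exported names in a given installation is to read `__all__` directly. This is a minimal sketch: `public_names` is a hypothetical helper, and the fallback branch only covers the case where the package cannot be imported.

```python
import importlib


def public_names(module_name):
    """Return the sorted names a module lists in __all__ (empty if undefined)."""
    module = importlib.import_module(module_name)
    return sorted(getattr(module, "__all__", []))


try:
    names = public_names("design_research_analysis")
except ImportError:
    # The package (or one of its required dependencies) is not importable here.
    names = []
    print("design_research_analysis is not installed in this environment")

print(names)
```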

## Contributing

Contribution workflow and validation gates are documented in [CONTRIBUTING.md](https://github.com/cmudrc/design-research-analysis/blob/main/CONTRIBUTING.md).
