Metadata-Version: 2.4
Name: mdsa-tools
Version: 1.2.2
Summary: The Weir Labs H-bond Systems Analyses modules!
Author-email: Luis Perez <lperez@wesleyan.edu>
License-Expression: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: scikit-learn
Requires-Dist: mdtraj<=1.10.3
Requires-Dist: umap-learn
Requires-Dist: python-circos
Requires-Dist: requests
Requires-Dist: MDAnalysisData
Requires-Dist: setuptools>=68
Provides-Extra: dev
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=8; extra == "docs"
Requires-Dist: furo; extra == "docs"
Requires-Dist: myst-parser; extra == "docs"
Requires-Dist: sphinx-copybutton; extra == "docs"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Dynamic: license-file

# __mdsa-tools__ [![Docs](https://img.shields.io/github/actions/workflow/status/zeper-eng/mdsa-tools/docs.yml?branch=main&label=Docs%20Build&logo=github&logoColor=1E3A8A&labelColor=555555&color=f06292&style=flat)](https://mdsa-tools.readthedocs.io/en/latest/)[![CI](https://github.com/zeper-eng/mdsa-tools/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/zeper-eng/mdsa-tools/actions/workflows/ci.yml)


Tools for systems-level analysis of molecular dynamics (MD) simulations.

## Pipeline overview

![Pipeline](https://raw.githubusercontent.com/zeper-eng/mdsa-tools/main/resources/pipeline_11_2_2025.png)

We start from an MD trajectory and generate per-frame interaction networks. We then vectorize each network's adjacency matrix as an edge vector, which holds only the edge weights for each unique pair of nodes. Stacking these per-frame vectors yields a feature matrix suitable for clustering (e.g., k-means) and dimensionality reduction (PCA/UMAP). Results can be visualized as graphs, scatter plots, MDcircos plots (chord diagrams), or replicate maps of frame-level measurements of interest.
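
The vectorization step can be illustrated with a minimal NumPy sketch. This is not the mdsa-tools implementation: the function names `adjacency_to_edge_vector` and `frames_to_feature_matrix` are hypothetical, the toy data is random, and the sketch assumes each frame is a plain symmetric square matrix with no extra index rows or columns.

```python
import numpy as np

def adjacency_to_edge_vector(adjacency):
    """Flatten one symmetric adjacency matrix into its unique-pair edge weights."""
    adjacency = np.asarray(adjacency)
    # Indices of the strict upper triangle: one entry per unordered node pair (i < j).
    rows, cols = np.triu_indices(adjacency.shape[0], k=1)
    return adjacency[rows, cols]

def frames_to_feature_matrix(frames):
    """Stack per-frame edge vectors into a (n_frames, n_edges) feature matrix."""
    return np.stack([adjacency_to_edge_vector(frame) for frame in frames])

# Toy data (hypothetical): 3 frames of a 4-node network with symmetric weights.
rng = np.random.default_rng(0)
raw = rng.integers(0, 3, size=(3, 4, 4))
frames = raw + raw.transpose(0, 2, 1)  # symmetrize each frame

features = frames_to_feature_matrix(frames)
print(features.shape)  # (3, 6): 4 nodes give 4 * 3 / 2 = 6 unique pairs
```

Each row of `features` describes one frame, so the matrix can be passed directly to a clustering or dimensionality reduction routine that expects samples as rows.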

## Install

```bash
pip install mdsa-tools
```

## Systems Problem Area:

![System panel](https://raw.githubusercontent.com/zeper-eng/mdsa-tools/main/resources/Fig_1.png)

In the Weir Group at Wesleyan University, we perform molecular dynamics (MD) simulations of a ribosomal subsystem to study how protein translation is tuned by the CAR interaction surface, a ribosomal interface identified by the lab that interacts with the +1 codon (the codon poised to enter the ribosome A site). Our "computational genetics" research modifies adjacent codon identities at the A-site and +1 positions to model how changes at these sites influence the behavior of the CAR surface and correlate with variations in translation rate.

## Development Note:

Going forward, the most recent work will be available through GitHub releases, while PyPI will carry the most recent confirmed, test-covered working release.

## Quickstart example (see [docs](https://mdsa-tools.readthedocs.io/en/latest/examples.html) for more examples):

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zeper-eng/mdsa-tools/blob/main/notebooks/Quick_Start.ipynb)


```python
from mdsa_tools.Data_gen_hbond import TrajectoryProcessor as tp
import numpy as np

###
### Datagen
###

# paths to the topology and trajectory files for each test system
system_one_topology = '../PDBs/5JUP_N2_CGU_nowat.prmtop'
system_one_trajectory = '../PDBs/CCU_CGU_10frames.mdcrd'

system_two_topology = '../PDBs/5JUP_N2_GCU_nowat.prmtop'
system_two_trajectory = '../PDBs/CCU_GCU_10frames.mdcrd'

test_trajectory_one = tp(trajectory_path=system_one_trajectory, topology_path=system_one_topology)
test_trajectory_two = tp(trajectory_path=system_two_trajectory, topology_path=system_two_topology)

# with the trajectories loaded, build the per-frame system representations
test_system_one_ = test_trajectory_one.create_system_representations()
test_system_two_ = test_trajectory_two.create_system_representations()

np.save('test_system_one', test_system_one_)
np.save('test_system_two', test_system_two_)

###
### Analysis
###

from mdsa_tools.Analysis import systems_analysis

all_systems = [test_system_one_, test_system_two_]
Systems_Analyzer = systems_analysis(all_systems)

# transform adjacency matrices, perform clustering and dimensional reduction
Systems_Analyzer.replicates_to_featurematrix()
optimal_k_silhouette_labels, optimal_k_elbow_labels, centers_silhouette, centers_elbow = Systems_Analyzer.perform_kmeans(outfile_path='./test_', max_clusters=5)
print('clustering successfully completed')
X_pca, weights, explained_variance_ratio_ = Systems_Analyzer.reduce_systems_representations(method='PCA')  # method can be 'PCA' or 'UMAP'
print('reduction successful')

###
### Visualization
###

import matplotlib.cm as cm
from mdsa_tools.Viz import visualize_reduction

# visualize embedding space with original clusters
visualize_reduction(X_pca, color_mappings=optimal_k_silhouette_labels, savepath='./PCA_', cmap=cm.plasma_r)

# map transitions between various cluster assignments
from mdsa_tools.Viz import replicatemap_from_labels

fake_labels = np.arange(0, 18, 1)  # placeholder labels, one per frame (18 in total)
replicatemap_from_labels(cmap=cm.plasma_r, frame_list=[9] * 2, labels=fake_labels, savepath='./Repmap_')  # two replicates of 9 frames each
```
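
The quickstart saves each system representation with `np.save` and no file extension. NumPy appends `.npy` in that case, so the full filename is needed when reloading. A small sketch of the round trip follows; the zero-filled array and its shape are stand-ins chosen for illustration, not the actual output of `create_system_representations()`.

```python
import os
import tempfile
import numpy as np

# Stand-in array (hypothetical shape); the real object comes from
# create_system_representations() in the quickstart.
system = np.zeros((10, 5, 5))

with tempfile.TemporaryDirectory() as tmpdir:
    stem = os.path.join(tmpdir, "test_system_one")
    np.save(stem, system)              # written as test_system_one.npy
    saved_files = os.listdir(tmpdir)
    reloaded = np.load(stem + ".npy")  # the extension is required when loading

print(saved_files)  # ['test_system_one.npy']
```

Reloading saved arrays this way lets the analysis and visualization steps be rerun without repeating the data generation step.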
