Metadata-Version: 2.4
Name: pepkit
Version: 0.0.5
Summary: A modular Python toolkit for peptide modelling, rescoring, and structural bioinformatics
Project-URL: Homepage, https://github.com/Vivi-tran/PepKit
Project-URL: Source, https://github.com/Vivi-tran/PepKit
Project-URL: Issues, https://github.com/Vivi-tran/PepKit/issues
Project-URL: Documentation, https://vivi-tran.github.io/PepKit/
Author-email: Ngoc-Vi Nguyen Tran <vi.tran@sund.ku.dk>
License: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.11
Requires-Dist: dockq>=2.1.3
Requires-Dist: joblib>=1.4.2
Requires-Dist: pandas>=2.2.0
Requires-Dist: rcsb-api>=1.5.0
Requires-Dist: seaborn>=0.13.2
Provides-Extra: all
Requires-Dist: mdanalysis>=2.9.0; extra == 'all'
Requires-Dist: openbabel-wheel>=3.1.1.21; extra == 'all'
Requires-Dist: peptides>=0.3.4; extra == 'all'
Requires-Dist: prolif>=2.0.3; extra == 'all'
Requires-Dist: rdkit>=2025.3.2; extra == 'all'
Requires-Dist: scikit-learn>=1.4.0; extra == 'all'
Requires-Dist: umap-learn>=0.1.1; extra == 'all'
Provides-Extra: docs
Requires-Dist: myst-parser>=2.0; extra == 'docs'
Requires-Dist: sphinx-copybutton; extra == 'docs'
Requires-Dist: sphinx-design>=0.5; extra == 'docs'
Requires-Dist: sphinx-rtd-theme; extra == 'docs'
Requires-Dist: sphinx>=6.0; extra == 'docs'
Requires-Dist: sphinxcontrib-bibtex; extra == 'docs'
Provides-Extra: modelling
Requires-Dist: mdanalysis>=2.9.0; extra == 'modelling'
Requires-Dist: openbabel-wheel>=3.1.1.21; extra == 'modelling'
Requires-Dist: peptides>=0.3.4; extra == 'modelling'
Requires-Dist: prolif>=2.0.3; extra == 'modelling'
Requires-Dist: rdkit>=2025.3.2; extra == 'modelling'
Description-Content-Type: text/markdown

# 🧬 PepKit

**A Python Toolkit for Peptide Modeling, Analysis, and Benchmarking**

![PepKit Logo](https://raw.githubusercontent.com/Vivi-tran/PepKit/main/data/Figure/pepkit.png)

PepKit is a modular, peptide-centric Python toolkit designed to support **end-to-end computational peptide workflows**, from sequence processing and descriptor calculation to structural modeling, confidence assessment, and docking-oriented analysis.

The package is built with **reproducibility, scalability, and interoperability** in mind, making it suitable for:

- Peptide–protein interaction studies  
- Machine-learning–ready dataset construction  
- Structural bioinformatics and docking benchmarks  
- Large-scale peptide screening and analysis pipelines  

---

## ✨ Key Features

### 1️⃣ Sequence I/O & Standardization
- Convert between **FASTA** and **SMILES** formats (`fasta_to_smiles`, `smiles_to_fasta`).
- Validate and standardize peptide sequences (canonical residues, charge models).
- Batch processing for lists and pandas DataFrames.
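
As a rough illustration of what validation and batch standardization can look like, the sketch below checks sequences against the 20 canonical one-letter residue codes. The helpers `standardize` and `standardize_batch` are hypothetical names used for illustration only; they are not PepKit's API, and PepKit's own rules may differ.

```python
# Hypothetical helpers for illustration; not part of PepKit's documented API.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # 20 canonical one-letter codes


def standardize(seq: str) -> str:
    """Remove whitespace, upper-case, and reject non-canonical residues."""
    cleaned = "".join(seq.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    invalid = sorted(set(cleaned) - CANONICAL)
    if invalid:
        raise ValueError(f"non-canonical residues: {''.join(invalid)}")
    return cleaned


def standardize_batch(seqs):
    """Split a batch of raw sequences into (valid, rejected) lists."""
    valid, rejected = [], []
    for raw in seqs:
        try:
            valid.append(standardize(raw))
        except ValueError:
            rejected.append(raw)
    return valid, rejected
```

Returning the rejected inputs alongside the cleaned ones keeps a batch run from stopping at the first malformed sequence.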

### 2️⃣ Physicochemical Descriptors & Clustering
- Compute peptide-level descriptors (molecular weight, charge, hydrophobicity, pI).
- Generate descriptor tables for ML pipelines.
- Cluster peptide libraries based on sequence or chemical similarity.
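
To make the idea of a descriptor table concrete, here is a minimal standard-library sketch that computes two simple descriptors per peptide: the GRAVY hydrophobicity (mean Kyte-Doolittle hydropathy) and a crude charge estimate that only counts K/R against D/E. The function name `describe` and the column names are assumptions for illustration; PepKit's own descriptor functions may use different definitions.

```python
# Kyte-Doolittle hydropathy scale (one value per canonical residue).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def describe(seq):
    """Return one descriptor row for a canonical peptide sequence (hypothetical helper)."""
    gravy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
    # Crude approximation: +1 per K/R, -1 per D/E; ignores His, termini, and pH.
    charge = sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")
    return {
        "sequence": seq,
        "length": len(seq),
        "gravy": round(gravy, 3),
        "approx_charge": charge,
    }


# One row per peptide; a list of dicts like this can be passed to pandas.DataFrame.
table = [describe(s) for s in ["ACDEFGHIK", "KKKK", "LLLL"]]
```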

### 3️⃣ Structural Modeling & Confidence Metrics
- Post-process **AlphaFold / AlphaFold-Multimer** outputs.
- Compute confidence metrics:
  - pLDDT (global, peptide, interface)
  - PAE (interface-aware)
  - pTM / ipTM / composite pTM
  - pDockQ, pDockQ2, MPDockQ
- Optional **DockQ** evaluation against experimental structures.
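
AlphaFold writes per-residue pLDDT into the B-factor column of its PDB output, so a chain-level pLDDT can be sketched with plain text parsing. The function below is an illustrative assumption, not PepKit's implementation: it averages the B-factor of Cα atoms per chain, which gives separate values for the receptor and peptide chains.

```python
from collections import defaultdict


def mean_plddt_per_chain(pdb_text):
    """Average the B-factor (pLDDT in AlphaFold models) of CA atoms per chain.

    Hypothetical helper; relies on the fixed-column PDB ATOM record layout.
    """
    scores = defaultdict(list)
    for line in pdb_text.splitlines():
        is_ca = line.startswith("ATOM") and line[12:16].strip() == "CA"
        if is_ca and len(line) >= 66:
            chain_id = line[21]            # column 22: chain identifier
            plddt = float(line[60:66])     # columns 61-66: B-factor field
            scores[chain_id].append(plddt)
    return {chain: sum(vals) / len(vals) for chain, vals in scores.items()}
```

Restricting the average to Cα atoms counts each residue once; an interface-level pLDDT would additionally need a residue selection based on inter-chain distances.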

### 4️⃣ Peptide–Protein Dataset Construction
- Automated querying of **RCSB PDB** for peptide–protein complexes.
- Heuristic peptide-chain detection and interface extraction.
- CSV + FASTA export for benchmarking and ML.
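
One simple form of heuristic peptide-chain detection is a length cutoff on each chain's sequence. The sketch below assumes a cutoff of 30 residues and the helper names `split_chains` and `to_fasta`; both the threshold and the names are illustrative assumptions, and PepKit's actual heuristic may use other criteria.

```python
def split_chains(chains, max_peptide_len=30):
    """Split {chain_id: sequence} into (peptides, receptors) by sequence length.

    The 30-residue default is an assumed cutoff, chosen only for illustration.
    """
    peptides = {c: s for c, s in chains.items() if len(s) <= max_peptide_len}
    receptors = {c: s for c, s in chains.items() if len(s) > max_peptide_len}
    return peptides, receptors


def to_fasta(entry_id, chains):
    """Format chains as FASTA records with headers of the form >ENTRY_CHAIN."""
    return "".join(f">{entry_id}_{c}\n{s}\n" for c, s in sorted(chains.items()))
```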

### 5️⃣ Docking & Benchmark Pipelines
- Prepare inputs for docking and scoring workflows.
- Integrate predicted and experimental metrics in unified tables.
- Designed to interoperate with Rosetta, AlphaFold, and downstream scoring tools.
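
A unified metrics table can be pictured as an outer join of predicted and experimental metrics on a complex identifier. The following standard-library sketch shows that join and a CSV export; the function names and the `complex_id` column are assumptions for illustration, not PepKit's API.

```python
import csv
import io


def unify(predicted, experimental):
    """Outer-join two {complex_id: {metric: value}} mappings into a list of rows."""
    rows = []
    for cid in sorted(set(predicted) | set(experimental)):
        row = {"complex_id": cid}
        row.update(predicted.get(cid, {}))
        row.update(experimental.get(cid, {}))
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows to CSV text, leaving missing metrics as empty cells."""
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)  # preserve first-seen column order
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keeping complexes that lack an experimental value (rather than dropping them) makes it visible which predictions still have no reference structure to compare against.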

---

## 📦 Installation

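### Install from PyPI
The base install covers the core dependencies. The optional `modelling` extra adds MDAnalysis, Open Babel, peptides, ProLIF, and RDKit; `all` additionally pulls in scikit-learn and umap-learn.

```bash
pip install pepkit
pip install "pepkit[all]"  # alternative: include all optional extras
```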

### Development installation
```bash
git clone https://github.com/Vivi-tran/PepKit.git
cd PepKit
pip install -e .
```

## 🚀 Quickstart

This section shows how to get started with **PepKit** in a few lines of code, starting with sequence conversion.

---

### 🔹 Sequence Conversion

Convert between peptide **FASTA** and **SMILES** representations:

```python
from pepkit.conversion import fasta_to_smiles, smiles_to_fasta


# FASTA → SMILES
seq = "ACDEFGHIK"
smiles = fasta_to_smiles(seq)
print("SMILES:", smiles)

# SMILES → FASTA
back_seq = smiles_to_fasta(smiles)
print("FASTA:", back_seq)
```


## 📚 Documentation

Full documentation is available at:

👉 **https://pepkit.readthedocs.io/en/latest/**


## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.