Metadata-Version: 2.4
Name: peptidegym
Version: 0.1.0
Summary: Gymnasium-compatible RL environments for therapeutic peptide design
Project-URL: Repository, https://github.com/HassDhia/peptidegym
Project-URL: Documentation, https://github.com/HassDhia/peptidegym#readme
Author-email: Hass Dhia <partners@smarttechinvest.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.10
Requires-Dist: gymnasium>=0.29
Requires-Dist: numpy>=1.24
Provides-Extra: all
Requires-Dist: matplotlib>=3.7; extra == 'all'
Requires-Dist: mypy; extra == 'all'
Requires-Dist: pytest-cov; extra == 'all'
Requires-Dist: pytest>=7.0; extra == 'all'
Requires-Dist: ruff; extra == 'all'
Requires-Dist: stable-baselines3>=2.0; extra == 'all'
Requires-Dist: torch>=2.0; extra == 'all'
Provides-Extra: dev
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: train
Requires-Dist: matplotlib>=3.7; extra == 'train'
Requires-Dist: stable-baselines3>=2.0; extra == 'train'
Requires-Dist: torch>=2.0; extra == 'train'
Description-Content-Type: text/markdown

# peptidegym

**Gymnasium-Compatible RL Environments for Therapeutic Peptide Design**

![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
![Tests](https://img.shields.io/badge/tests-125%20passing-brightgreen.svg)
[![PyPI version](https://img.shields.io/pypi/v/peptidegym.svg)](https://pypi.org/project/peptidegym/)

---

PeptideGym provides the first Gymnasium-compatible reinforcement learning environments for therapeutic peptide design. It models peptide construction as a sequential decision process — an RL agent builds peptide sequences residue-by-residue, receiving rewards from pluggable biophysical property predictors. PeptideGym enables researchers to benchmark any Gymnasium-compatible RL algorithm (PPO, DQN, SAC via Stable Baselines3, CleanRL, or RLlib) on peptide design without writing custom training loops.

Three environment families cover distinct therapeutic peptide classes:
- **Antimicrobial peptides (AMPs)** — cationic, amphipathic sequences that disrupt microbial membranes
- **Cyclic peptides** — macrocyclic binders with enhanced stability and oral bioavailability
- **Vaccine epitopes** — short peptides optimized for MHC-I binding and T-cell recognition

## Installation

```bash
pip install peptidegym                # Core (numpy, gymnasium)
pip install "peptidegym[train]"       # + SB3, PyTorch for RL training (quotes needed in zsh)
pip install "peptidegym[all]"         # Everything
```

Development install:

```bash
git clone https://github.com/HassDhia/peptidegym.git
cd peptidegym
pip install -e ".[all]"
```

## Quick Start

```python
import gymnasium as gym
import peptidegym

env = gym.make("PeptideGym/AMP-v0")
obs, info = env.reset(seed=42)
for _ in range(100):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        print(f"Designed peptide: {info['sequence']} (reward: {reward:.3f})")
        obs, info = env.reset()
env.close()
```

### Train a PPO Agent

```python
from stable_baselines3 import PPO
import gymnasium as gym
import peptidegym

env = gym.make("PeptideGym/AMP-v0")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=90_000)

# Evaluate
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
print(f"Designed AMP: {info['sequence']}, Activity: {info.get('activity_score', 'N/A')}")
```

## Environments

| Environment | Task | Action Space | Observation | Difficulty Tiers |
|---|---|---|---|---|
| `PeptideGym/AMP-v0` | Design antimicrobial peptide | `Discrete(21)` — 20 AAs + STOP | Sequence + biophysical properties | Easy, Medium, Hard |
| `PeptideGym/CyclicPeptide-v0` | Design cyclic peptide binder | `Discrete(24)` — 20 AAs + 3 cyclization + linear stop | Sequence + properties + cyclization validity | Easy, Medium, Hard |
| `PeptideGym/Epitope-v0` | Optimize vaccine epitope | `Discrete(21)` — 20 AAs + STOP | Sequence + HLA encoding + binding estimate | Easy, Medium, Hard |

Each environment is available in three difficulty tiers (e.g., `PeptideGym/AMP-Easy-v0`, `PeptideGym/AMP-Hard-v0`) for a total of 9 benchmark configurations.
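Assuming every family follows the `PeptideGym/<Family>-<Tier>-v0` naming pattern shown above for AMP (the table spells out only the base `-v0` IDs), the full benchmark suite can be enumerated programmatically:

```python
# Enumerate the 9 benchmark IDs, assuming each family follows the
# "PeptideGym/<Family>-<Tier>-v0" pattern illustrated for AMP.
families = ["AMP", "CyclicPeptide", "Epitope"]
tiers = ["Easy", "Medium", "Hard"]
env_ids = [f"PeptideGym/{fam}-{tier}-v0" for fam in families for tier in tiers]

print(len(env_ids))  # 9 configurations
print(env_ids[0])    # PeptideGym/AMP-Easy-v0
```

Each resulting ID can then be passed to `gym.make()` exactly as in the Quick Start, which makes it easy to sweep an algorithm across all nine configurations in one loop.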

## Architecture

```
┌─────────────────────────────────────────────────────┐
│                   RL Agent (PPO)                    │
│                via Stable Baselines3                │
└──────────────────────────┬──────────────────────────┘
                           │ action (amino acid or special)
                           ▼
┌─────────────────────────────────────────────────────┐
│                PeptideGym Environment               │
│   ┌───────────┐  ┌──────────────┐  ┌───────────┐    │
│   │  AMP-v0   │  │CyclicPep-v0  │  │ Epitope-v0│    │
│   └─────┬─────┘  └──────┬───────┘  └─────┬─────┘    │
│         └───────────────┼────────────────┘          │
│                         ▼                           │
│              Pluggable RewardBackend                │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────┐  │
│  │  Heuristic   │  │   AMPlify    │  │ NetMHCpan │  │
│  │  (default)   │  │  (optional)  │  │ (optional)│  │
│  └──────────────┘  └──────────────┘  └───────────┘  │
└─────────────────────────────────────────────────────┘
```

All environments share the Gymnasium API (`reset()`, `step()`, `observation_space`, `action_space`). Default heuristic reward backends require no external dependencies. Optional backends (AMPlify, NetMHCpan, MHCflurry) can be swapped in for research-grade reward signals.
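The backend names above come from the diagram; the actual interface lives in the package, so the sketch below only illustrates the pluggable-backend idea. The `score(sequence)` method and the toy net-charge heuristic are both hypothetical stand-ins, not peptidegym's real `RewardBackend` API:

```python
from typing import Protocol


class RewardBackend(Protocol):
    """Hypothetical backend interface: map a finished sequence to a reward."""

    def score(self, sequence: str) -> float: ...


class NetChargeBackend:
    """Toy heuristic: reward cationic sequences (K/R positive, D/E negative),
    normalized by length -- a stand-in for a real AMP activity predictor."""

    def score(self, sequence: str) -> float:
        pos = sum(sequence.count(aa) for aa in "KR")
        neg = sum(sequence.count(aa) for aa in "DE")
        return (pos - neg) / max(len(sequence), 1)


backend: RewardBackend = NetChargeBackend()
print(round(backend.score("KKLFKKILKYL"), 3))  # 0.455 -- strongly cationic
```

Because any object satisfying the protocol can be dropped in, swapping the default heuristic for a learned predictor changes the reward signal without touching the environment code.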

## Paper

The accompanying paper is available at:
- [PDF (GitHub)](https://github.com/HassDhia/peptidegym/blob/main/paper/peptidegym.pdf)

## Citation

If you use peptidegym in your research, please cite:

```bibtex
@software{dhia2026peptidegym,
  author = {Dhia, Hass},
  title = {PeptideGym: Gymnasium-Compatible Reinforcement Learning Environments for Therapeutic Peptide Design},
  year = {2026},
  publisher = {Smart Technology Investments Research Institute},
  url = {https://github.com/HassDhia/peptidegym}
}
```

## License

MIT License. See [LICENSE](LICENSE) for details.

## Contact

Hass Dhia -- Smart Technology Investments Research Institute
- Email: partners@smarttechinvest.com
- Web: [smarttechinvest.com/research](https://smarttechinvest.com/research)
