Metadata-Version: 2.4
Name: path-finder-retrosynthesis
Version: 1.0.3
Summary: Retrosynthesis route finder — AiZynthFinder + Rxn-INSIGHT + Chemistry by Design
Author: Yara Chahda, Corentin Portmann, Ines Ouchen
License: MIT License
        
        Copyright (c) 2026 by Yara Chahda <yara.chahda@epfl.ch>
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
Project-URL: Homepage, https://github.com/YaraChahda/path_finder
Project-URL: Repository, https://github.com/YaraChahda/path_finder
Project-URL: Issues, https://github.com/YaraChahda/path_finder/issues
Keywords: chemistry,retrosynthesis,aizynthfinder,streamlit,cheminformatics,rxn-insight
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Chemistry
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: streamlit>=1.35.0
Requires-Dist: numpy>=1.24
Requires-Dist: matplotlib>=3.7
Requires-Dist: pandas>=2.0
Requires-Dist: Pillow>=10.0
Requires-Dist: requests>=2.31
Requires-Dist: aizynthfinder>=4.2.0
Requires-Dist: click>=8.1
Provides-Extra: predicted
Requires-Dist: rxn-insight>=1.0.0; extra == "predicted"
Dynamic: license-file

# Path Finder

<p align="center">
  <img src="src/path_finder/assets/banner.png" width="400"/>
</p>

**Retrosynthesis route finder** — AiZynthFinder · Rxn-INSIGHT · Chemistry by Design

*Yara Chahda · Corentin Portmann · Inès Ouchen — EPFL 2026*

---

## User installation

### 1. Install RDKit

Install RDKit from conda-forge before installing Path Finder. This guide uses conda for this one step.

```bash
conda install -c conda-forge rdkit
```

### 2. Install Path Finder

```bash
pip install path-finder-retrosynthesis
```

### 3. Run the setup wizard

```bash
path-finder-setup
```

This automatically:
- copies the bundled datasets into `data/`
- downloads the AiZynthFinder model files (~500 MB) via the official AiZynthFinder downloader
- generates `data/config.yml` with the correct paths

> If the automatic download fails, download the model files manually from
> [https://github.com/MolecularAI/aizynthfinder/releases](https://github.com/MolecularAI/aizynthfinder/releases)
> and place them in `data/aizynthfinder/`.
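
For orientation, the sketch below shows what a generated `data/config.yml` might contain. It is not the wizard's actual code: the file names and the `expansion`/`stock` layout are assumptions based on AiZynthFinder's public model download, and `build_config_text` is a hypothetical helper. The point it illustrates is that every path is written in absolute form.

```python
from pathlib import Path

# Assumed file names; check the contents of data/aizynthfinder/ after the download.
MODEL_FILE = "uspto_model.onnx"
TEMPLATE_FILE = "uspto_templates.csv.gz"
STOCK_FILE = "zinc_stock.hdf5"


def build_config_text(model_dir: Path) -> str:
    """Return a minimal AiZynthFinder-style config that uses absolute paths only."""
    base = model_dir.resolve()  # resolve() turns a relative path into an absolute one
    return (
        "expansion:\n"
        "  uspto:\n"
        f"    - {base / MODEL_FILE}\n"
        f"    - {base / TEMPLATE_FILE}\n"
        "stock:\n"
        f"  zinc: {base / STOCK_FILE}\n"
    )


# Build the text for the default model directory; nothing is written to disk here.
config_text = build_config_text(Path("data/aizynthfinder"))
```

Printing `config_text` and comparing it with the real `data/config.yml` is a quick way to see whether the generated file follows the same layout.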

### 4. Download the Rxn-INSIGHT USPTO database

Download `uspto_rxn_insight.gzip` from the [Rxn-INSIGHT Zenodo record](https://zenodo.org/records/10171745).

Save it as `data/uspto_rxn_insight.gzip`.

> This file enables reaction condition prediction for novel routes (predicted routes section).
> Without it, only dataset and validated routes are shown.
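
How the app decides which sections to show is not documented here. As a minimal sketch, assuming it only tests whether the database file is present, the logic could look like this (`available_sections` is a hypothetical name, not a Path Finder function):

```python
import tempfile
from pathlib import Path


def available_sections(data_dir: Path) -> list[str]:
    """Return the route sections that can be shown for a given data directory."""
    sections = ["dataset", "validated"]
    # The predicted section needs the Rxn-INSIGHT USPTO database from step 4.
    if (data_dir / "uspto_rxn_insight.gzip").is_file():
        sections.append("predicted")
    return sections


# Demonstration on a throwaway directory, so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    before = available_sections(data_dir)   # database file missing
    (data_dir / "uspto_rxn_insight.gzip").touch()
    after = available_sections(data_dir)    # database file present
```

Here `before` holds only the dataset and validated sections, and `after` also includes the predicted section.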

### 5. Launch

```bash
path-finder
```

Open [http://localhost:8501](http://localhost:8501) in your browser.

---

## Summary

```bash
conda install -c conda-forge rdkit
pip install path-finder-retrosynthesis
path-finder-setup
# → place uspto_rxn_insight.gzip in data/
path-finder
```

---

## What the app does

Path Finder finds and ranks retrosynthesis routes for a target molecule using three sources:

| Section | Source | Conditions | Yield in scoring |
|---------|--------|------------|-----------------|
| 📚 Dataset | Curated Chemistry by Design routes | Real | Yes |
| ✅ Validated | AiZynthFinder + generic reactions (USPTO) | Real | Yes |
| 🤖 Predicted | AiZynthFinder + Rxn-INSIGHT | Predicted | No |

Routes are scored with a 1/i² weighting scheme over three criteria that the user selects from
steps, yield, atom economy, E-factor, and safety.
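
One way to read the 1/i² scheme is as rank weights: the criterion the user ranks i-th gets a weight proportional to 1/i², and the weights are normalized to sum to 1. The sketch below assumes that reading, and also assumes each criterion value has already been normalized to [0, 1] with higher meaning better. `rank_weights` and `score_route` are hypothetical names, not functions from `route_engine.py`.

```python
def rank_weights(n_criteria: int) -> list[float]:
    """Weights proportional to 1/i**2 for ranks i = 1..n, normalized to sum to 1."""
    raw = [1.0 / (i * i) for i in range(1, n_criteria + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def score_route(normalized: dict[str, float], ranked_criteria: list[str]) -> float:
    """Weighted sum of criterion values (assumed in [0, 1], higher is better)."""
    weights = rank_weights(len(ranked_criteria))
    return sum(w * normalized[name] for w, name in zip(weights, ranked_criteria))


# Hypothetical route with made-up normalized values, for illustration only.
route = {"steps": 0.8, "yield": 0.5, "safety": 0.9}
weights = rank_weights(3)
score = score_route(route, ["steps", "yield", "safety"])
```

Under these assumptions the three weights are 36/49, 9/49, and 4/49 (about 0.73, 0.18, and 0.08), so the top-ranked criterion dominates the score.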

---

## Data files

| File | Bundled | Description |
|------|---------|-------------|
| `reaction_dataset.json` | ✅ | Curated synthesis routes |
| `toxicity_dataset.json` | ✅ | Safety scores for reagents and solvents |
| `generic_reactions.json` | ✅ | 10 000 USPTO reactions for step validation |
| `data/aizynthfinder/` | ❌ | AiZynthFinder model files — downloaded by wizard |
| `data/config.yml` | ❌ | Generated by wizard — do not commit |
| `data/uspto_rxn_insight.gzip` | ❌ | Rxn-INSIGHT USPTO database — download manually |

---

## Troubleshooting

| Problem | Solution |
|---------|----------|
| `config.yml not found` | Run `path-finder-setup` |
| AiZynthFinder crash | Check that all paths in `data/config.yml` are absolute |
| No routes found | Check the installation with a known target such as galanthamine (`OC1C=C[C@@]23c4cc(OC)ccc4CN(C)C[C@@H]2[C@@H]1O3`) |
| Predicted routes disabled | Add `data/uspto_rxn_insight.gzip` (see step 4 above) |
| Slow search (~2 min) | Normal — AiZynthFinder MCTS is computationally intensive |
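
The absolute-path check from the table can be automated. The sketch below walks a config that has already been loaded into Python objects (for example with `yaml.safe_load` from PyYAML) and lists every string that is not an absolute path. It assumes every string in the mapping is a file path, which may not hold for a config with extra settings. The file names and the `relative_paths` helper are illustrative assumptions.

```python
from pathlib import Path


def relative_paths(node) -> list[str]:
    """Collect string leaves of a loaded config that are not absolute paths."""
    if isinstance(node, dict):
        return [p for value in node.values() for p in relative_paths(value)]
    if isinstance(node, list):
        return [p for item in node for p in relative_paths(item)]
    if isinstance(node, str) and not Path(node).is_absolute():
        return [node]
    return []


# Example mapping with assumed file names; one entry is deliberately relative.
abs_dir = Path.cwd() / "data" / "aizynthfinder"
example_config = {
    "expansion": {
        "uspto": [
            str(abs_dir / "uspto_model.onnx"),
            "data/aizynthfinder/uspto_templates.csv.gz",
        ]
    },
    "stock": {"zinc": str(abs_dir / "zinc_stock.hdf5")},
}
offenders = relative_paths(example_config)
```

In this example `offenders` contains only the relative templates path, which is the entry to fix in `data/config.yml`.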

---

## Developer setup

```bash
git clone https://github.com/YaraChahda/path_finder.git
cd path_finder
conda install -c conda-forge rdkit
pip install -e .
path-finder-setup
path-finder
```

### Project structure

```
path_finder/
├── src/path_finder/
│   ├── app_path_finder.py      # Streamlit front-end
│   ├── route_engine.py         # Scoring, AiZynthFinder, Rxn-INSIGHT
│   ├── molecule_rendering.py   # RDKit Cairo rendering
│   ├── localization.py         # EN/FR UI strings
│   ├── report_builder.py       # PDF generation
│   ├── cli.py                  # path-finder and path-finder-setup commands
│   ├── assets/banner.png
│   └── data/                   # bundled datasets + config template
├── data/                       # working data directory (not committed)
│   ├── config.yml              # generated by path-finder-setup
│   ├── aizynthfinder/          # model files downloaded by path-finder-setup
│   └── uspto_rxn_insight.gzip  # download manually
├── tests/
├── pyproject.toml
└── README.md
```

### Publishing a new version

```bash
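# Bump the version: replace X.Y.Z and X.Y.Z+1 with the actual old and new version numbers.
# `sed -i ''` is the BSD/macOS form; with GNU sed on Linux, use `sed -i` without the empty string.
sed -i '' 's/version = "X.Y.Z"/version = "X.Y.Z+1"/' pyproject.toml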
git add pyproject.toml
git commit -m "release: vX.Y.Z+1"
git tag vX.Y.Z+1
git push origin main --tags
# GitHub Actions publishes to PyPI automatically
```

### Running tests

```bash
pytest tests/
```

---

## Citation

- AiZynthFinder: Genheden et al., *J. Cheminf.* 2020 — [doi:10.1186/s13321-020-00472-1](https://doi.org/10.1186/s13321-020-00472-1)
- Rxn-INSIGHT: Dobbelaere et al., *J. Cheminf.* 2024 — [doi:10.1186/s13321-024-00834-z](https://doi.org/10.1186/s13321-024-00834-z)
- Open Reaction Database: Kearnes et al., *JACS* 2021 — [doi:10.1021/jacs.1c09820](https://doi.org/10.1021/jacs.1c09820)
