Metadata-Version: 2.4
Name: como-ocsr
Version: 1.2.0
Summary: COMO: Closed-loop Optical Molecule recOgnition with Minimum Risk Training
Author: Zhuoqi Lyu
License: MIT
Project-URL: Homepage, https://huggingface.co/Keylab/COMO
Project-URL: Repository, https://github.com/netknowledge/COMO
Project-URL: Bug Tracker, https://github.com/netknowledge/COMO/issues
Keywords: cheminformatics,optical-chemical-structure-recognition,ocsr,molecule-recognition,deep-learning,transformer,rdkit
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0
Requires-Dist: torchvision>=0.15
Requires-Dist: rdkit
Requires-Dist: SmilesPE>=0.0.3
Requires-Dist: albumentations>=1.3
Requires-Dist: opencv-python-headless>=4.5
Requires-Dist: Pillow>=9.0
Requires-Dist: numpy>=1.21
Requires-Dist: pandas>=1.5
Requires-Dist: tqdm>=4.60
Dynamic: license-file

# COMO: Optical Chemical Structure Recognition

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue)](https://www.python.org/downloads/)
[![PyPI](https://img.shields.io/pypi/v/como-ocsr)](https://pypi.org/project/como-ocsr/)

**COMO** (Closed-loop Optical Molecule recOgnition) converts images of chemical structure diagrams into machine-readable SMILES strings, atom-level coordinates, and bond matrices.

Compared to image-to-text OCSR models (e.g., MolScribe, SwinOCSR, Image2Mol), COMO predicts an explicit molecular graph (atoms with 2D coordinates, plus bonds) and then reconstructs SMILES through cheminformatics post-processing, which is intended to yield valid, chemically accurate structures.

## 🚀 Quick Start

```python
import como

# 1. Load model
model = como.load_model("COMO_joint.pth", device="cuda")

# 2. Predict a single molecule
smiles = como.predict(model, "molecule.png")         # → "CC(=O)O"
result = como.predict(model, "molecule.png", smiles_mode=None)
# result contains: tokens, atom symbols, coordinates, bond matrix, etc.

# 3. Batch prediction
smiles_list = como.predict_batch(model, ["mol1.png", "mol2.png"])

# 4. Benchmark evaluation
metrics = como.evaluate(model, "benchmark/USPTO/", "benchmark/USPTO.csv")
print(metrics["exact_match_acc"], metrics["avg_tanimoto"])
```

## 📦 Installation

```bash
pip install como-ocsr
```

**Requirements:** Python 3.10+, PyTorch ≥ 2.0, RDKit.

## 🧠 Model Checkpoints

| Checkpoint | Description |
|---|---|
| `COMO_joint.pth` | Full model — MLE + MRT joint training (recommended) |
| `COMO_stage1_synthetic.pth` | Stage 1 only — MLE on synthetic data |

Download from [Hugging Face](https://huggingface.co/Keylab/COMO).
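
As a minimal sketch of fetching a checkpoint from a script, the snippet below builds a direct-download URL and saves the file with the standard library only. It assumes the checkpoints sit at the root of the `Keylab/COMO` repository on the `main` branch and relies on Hugging Face's usual `resolve` URL pattern; `checkpoint_url` and `download_checkpoint` are hypothetical helpers, not part of `como`.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: checkpoints live at the repository root on the "main" branch.
HF_REPO = "Keylab/COMO"


def checkpoint_url(filename: str, repo: str = HF_REPO, revision: str = "main") -> str:
    """Build a direct-download URL using Hugging Face's resolve pattern."""
    return f"https://huggingface.co/{repo}/resolve/{revision}/{filename}"


def download_checkpoint(filename: str, dest_dir: str = ".") -> Path:
    """Download a checkpoint unless it already exists; return its local path."""
    dest = Path(dest_dir) / filename
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(checkpoint_url(filename), str(dest))
    return dest


# Example (performs a network download, so it is left commented out):
# path = download_checkpoint("COMO_joint.pth", dest_dir="checkpoints")
```

If the repository layout differs, downloading the file manually from the page linked above works just as well.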

## 📖 API Reference

### `como.load_model(checkpoint_path, device="cuda", pretrained=True, **kwargs)`

Load a COMO model from a checkpoint.

- **checkpoint_path** (`str`): Path to `.pth` checkpoint file.
- **device** (`str`): `"cuda"` or `"cpu"`.
- **pretrained** (`bool`): Use ImageNet-pretrained backbone weights (default: `True`).
- Returns: `ComoModel` in evaluation mode.
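
Because `device` defaults to `"cuda"`, loading may fail on a machine without a GPU. One possible way to handle this, sketched here with two hypothetical helpers (`pick_device` and `load_on_best_device` are not part of `como`), is to fall back to the CPU when PyTorch reports no CUDA device:

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:  # PyTorch is missing, so only the CPU answer makes sense
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_on_best_device(checkpoint_path: str):
    """Load a COMO checkpoint on the GPU if one is present, else on the CPU."""
    import como  # imported lazily so pick_device() works without como installed

    return como.load_model(checkpoint_path, device=pick_device())


# Example:
# model = load_on_best_device("COMO_joint.pth")
```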

### `como.predict(model, image, *, beam_size=1, max_len=500, smiles_mode="postprocess", device=None)`

Predict SMILES for a single image.

- **image**: File path (`str`), NumPy array (H×W×3 or H×W), PIL `Image`, or preprocessed `torch.Tensor`.
- **beam_size** (`int`): `1` = greedy decoding; larger values enable beam search (e.g., `3`).
- **smiles_mode** (`str` or `None`):
  - `"postprocess"` — cheminformatics-based SMILES reconstruction (recommended, best accuracy)
  - `"graph"` — graph-traversal SMILES
  - `"decoder"` — raw decoder output
  - `None` — returns full result dict (tokens, atoms, bonds, coordinates)
- Returns: SMILES string (`str`), or the full result dict when `smiles_mode=None`.
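
To see how the three string-returning modes differ on a given image, a small comparison loop can help. The sketch below takes the prediction function as an argument so it can be tried with a stub; `compare_smiles_modes` and `modes_agree` are hypothetical helpers, and the agreement check compares raw strings only, not canonicalized structures.

```python
from typing import Any, Callable, Dict

# The three string modes listed above; None (full result dict) is left out.
SMILES_MODES = ("postprocess", "graph", "decoder")


def compare_smiles_modes(
    predict: Callable[..., Any], model: Any, image: Any
) -> Dict[str, Any]:
    """Run one image through every string-returning mode and collect the outputs."""
    return {mode: predict(model, image, smiles_mode=mode) for mode in SMILES_MODES}


def modes_agree(outputs: Dict[str, Any]) -> bool:
    """True when every mode produced exactly the same string."""
    return len(set(outputs.values())) == 1


# Example:
# outputs = compare_smiles_modes(como.predict, model, "molecule.png")
# print(outputs, modes_agree(outputs))
```

Two different strings can still describe the same molecule, so disagreement here is a prompt to inspect the image rather than proof of an error.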

### `como.predict_batch(model, images, *, beam_size=1, max_len=500, smiles_mode="postprocess", device=None)`

Predict SMILES for multiple images (single GPU).

- **images**: List of file paths, NumPy arrays, PIL Images, or Tensors.
- Returns: List of SMILES strings, or of result dicts when `smiles_mode=None`.
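
This README does not say how `predict_batch` splits its input internally, so for very large image lists it may be safer to feed it in fixed-size chunks. The following is a minimal sketch under that assumption; `chunked` and `predict_in_chunks` are hypothetical helpers, and the default chunk size of 64 is an arbitrary choice.

```python
from typing import Any, Callable, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_chunks(
    predict_batch: Callable[..., List[Any]],
    model: Any,
    images: Sequence[Any],
    chunk_size: int = 64,
    **kwargs: Any,
) -> List[Any]:
    """Call `predict_batch` once per chunk and concatenate the results in order."""
    results: List[Any] = []
    for chunk in chunked(images, chunk_size):
        results.extend(predict_batch(model, list(chunk), **kwargs))
    return results


# Example:
# smiles = predict_in_chunks(como.predict_batch, model, paths, chunk_size=32, beam_size=3)
```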

### `como.evaluate(model, benchmark_dir, csv_path, *, beam_size=1, postproc_workers=32, tautomer_standardize=True, gpus="0")`

Evaluate on a benchmark dataset.

- **benchmark_dir**: Directory of `.png` images.
- **csv_path**: CSV with columns `image_id` and `SMILES`.
- **gpus**: Comma-separated GPU IDs (e.g. `"0,1,2,3"`), or `None` for all GPUs.
- Returns: Dict with `exact_match_acc`, `avg_tanimoto`, `tautomer_match_acc`, etc.
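
Before starting a long evaluation run, it can be worth checking that the CSV and the image directory line up. The sketch below uses only the standard library; `find_benchmark_problems` is a hypothetical helper, and it assumes each `image_id` names a file `<image_id>.png` in `benchmark_dir` (an id that already ends in `.png` is used as-is).

```python
import csv
from pathlib import Path
from typing import List

REQUIRED_COLUMNS = {"image_id", "SMILES"}


def find_benchmark_problems(benchmark_dir: str, csv_path: str) -> List[str]:
    """Return human-readable problems found in a benchmark; empty list if none."""
    problems: List[str] = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            return [f"missing column(s): {', '.join(sorted(missing))}"]
        # The header is line 1, so the first data row is line 2.
        for line_no, row in enumerate(reader, start=2):
            image_id = (row["image_id"] or "").strip()
            # Assumption: image_id may or may not include the ".png" extension.
            name = image_id if image_id.lower().endswith(".png") else image_id + ".png"
            if not (Path(benchmark_dir) / name).is_file():
                problems.append(f"line {line_no}: image not found: {name}")
            if not (row["SMILES"] or "").strip():
                problems.append(f"line {line_no}: empty SMILES")
    return problems


# Example:
# for problem in find_benchmark_problems("benchmark/USPTO/", "benchmark/USPTO.csv"):
#     print(problem)
```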

### `como.evaluate_benchmarks(model, benchmarks, *, ...)`

Evaluate on multiple benchmarks at once.

- **benchmarks**: List of `{"name": ..., "benchmark_dir": ..., "csv_path": ...}` dicts.
- Returns: Dict mapping each benchmark `name` to its metrics dict.
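
The `benchmarks` list can be assembled by hand or generated from a directory. As a sketch, assuming the layout used in the Quick Start (a `<name>.csv` file next to a `<name>/` image directory), the hypothetical helper `discover_benchmarks` below builds the list automatically:

```python
from pathlib import Path
from typing import Dict, List


def discover_benchmarks(root: str) -> List[Dict[str, str]]:
    """Pair every `<name>.csv` under `root` with a sibling `<name>/` directory."""
    benchmarks: List[Dict[str, str]] = []
    for csv_file in sorted(Path(root).glob("*.csv")):
        image_dir = csv_file.with_suffix("")  # benchmark/USPTO.csv -> benchmark/USPTO
        if image_dir.is_dir():  # skip CSV files that have no image directory
            benchmarks.append(
                {
                    "name": csv_file.stem,
                    "benchmark_dir": str(image_dir),
                    "csv_path": str(csv_file),
                }
            )
    return benchmarks


# Example:
# results = como.evaluate_benchmarks(model, discover_benchmarks("benchmark"))
# for name, metrics in results.items():
#     print(name, metrics["exact_match_acc"])
```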

## 🧪 Supported Input Formats

- PNG / JPEG / TIFF images
- Hand-drawn or computer-generated chemical structure diagrams
- Arbitrary aspect ratios and sizes (auto-resized internally)

## 📄 License

- **Code**: MIT License (see [LICENSE](LICENSE))
- **Model Weights**: CC BY-NC 4.0

## 📚 Citation

If you use COMO in your research, please cite:

```bibtex
@article{lyu2025como,
  title={Closed-loop Optical Molecule recOgnition with Minimum Risk Training},
  author={Lyu, Zhuoqi and others},
  journal={arXiv},
  year={2025}
}
```
