Metadata-Version: 2.4
Name: fpocket-rs
Version: 0.1.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Chemistry
Requires-Dist: pytest>=7 ; extra == 'dev'
Provides-Extra: dev
License-File: LICENSE
Summary: Rust reimplementation of the fpocket protein binding-pocket detection algorithm — pip-installable, no qhull/conda required.
Keywords: fpocket,binding-site,pocket-detection,druggability,alpha-sphere,voronoi,structural-biology,rust
Author-email: Zehua Zeng <starlitnightly@163.com>
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/omicverse/rust-fpocket
Project-URL: Issues, https://github.com/omicverse/rust-fpocket/issues
Project-URL: Repository, https://github.com/omicverse/rust-fpocket
Project-URL: Upstream C, https://github.com/Discngine/fpocket

# rust-fpocket

A **Rust reimplementation of [fpocket](https://github.com/Discngine/fpocket)** —
the protein binding-pocket detection algorithm — packaged as a pip-installable
Python wheel.

fpocket is a widely used C program (Le Guilloux, Schmidtke & Tuffery,
*BMC Bioinformatics* 2009) for finding druggable pockets in protein structures.
It has **no PyPI wheel**: it must be compiled from C against `qhull`, or
installed via conda/bioconda. `fpocket-rs` ports the algorithm to Rust and ships
as a self-contained wheel with **no system dependencies** — no qhull, no C
compiler, no conda.

```python
import fpocket_rs

result = fpocket_rs.detect_pockets("1hvr.pdb")
print(result["n_pockets"])               # number of pockets found
top = result["pockets"][0]               # ranked by score
print(top["drug_score"], top["volume"])  # druggability, volume
```

## Installation

```bash
pip install fpocket-rs
```

To build from source you need a Rust toolchain and [maturin](https://www.maturin.rs/):

```bash
pip install maturin
maturin build --release
pip install target/wheels/fpocket_rs-*.whl
```

## The algorithm

fpocket detects pockets geometrically, in five stages — each is a module of the
Rust crate:

1. **Voronoi tessellation** (`voronoi.rs`) — a 3D Delaunay triangulation of the
   protein's heavy atoms; each Delaunay tetrahedron yields one Voronoi vertex
   (the circumcentre), a candidate **alpha sphere**.
2. **Alpha-sphere filtering** (`alpha_spheres.rs`) — keep spheres whose radius
   is in `[min_radius, max_radius]`; classify each apolar/polar by the
   electronegativity of its four contacting atoms.
3. **Clustering** (`cluster.rs`) — single-linkage clustering of the alpha
   spheres into pockets at `clust_max_dist`.
4. **Descriptors** (`descriptors.rs`) — per-pocket volume, alpha-sphere
   statistics, hydrophobicity/polarity/charge scores, SASA, etc.
5. **Druggability scoring** (`druggability.rs`) — a logistic-regression model
   over the normalized descriptors gives a druggability score in `[0, 1]`.
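
Stages 1 to 3 can be illustrated in plain Python. This is a minimal sketch, not the crate's code: the function names are made up for illustration, the radius is taken as the distance from the circumcentre to an atom centre, and clustering is assumed to use the distance between sphere centres.

```python
import math


def circumsphere(a, b, c, d):
    """Return (centre, radius) of the sphere through four 3D points."""
    # Row for p in (b, c, d): 2*(p - a) . x = |p|^2 - |a|^2
    rows = [[2.0 * (p[i] - a[i]) for i in range(3)] for p in (b, c, d)]
    rhs = [sum(p[i] ** 2 - a[i] ** 2 for i in range(3)) for p in (b, c, d)]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    det = det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("degenerate tetrahedron: the four points are coplanar")
    centre = []
    for col in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        centre.append(det3(m) / det)
    return tuple(centre), math.dist(centre, a)


def in_radius_window(radius, min_radius=3.4, max_radius=6.2):
    """Stage 2 filter: keep spheres whose radius is in [min_radius, max_radius]."""
    return min_radius <= radius <= max_radius


def single_linkage(centres, max_dist=2.4):
    """Stage 3: group point indices whose centres chain together within max_dist."""
    parent = list(range(len(centres)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            if math.dist(centres[i], centres[j]) <= max_dist:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(centres)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

The pairwise loop is quadratic in the number of spheres, which is fine for a sketch; a spatial grid or k-d tree would be the usual choice at protein scale.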

The Voronoi tessellation is a dependency-free Bowyer-Watson implementation. On
the 1AZ8 heavy-atom set it reproduces the Voronoi-vertex set of **qhull's
`qvoronoi`**: all 10734 vertices match to within 0.05 Å.
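
A tolerance-based comparison of two vertex sets could look like the sketch below. How the original check was performed is not stated here, so the function names and the nearest-neighbour rule are assumptions; only the 0.05 Å tolerance comes from the text.

```python
import math


def unmatched_vertices(reference, candidate, tol=0.05):
    """Return the reference points that have no candidate point within tol."""
    return [
        p for p in reference
        if not any(math.dist(p, q) <= tol for q in candidate)
    ]


def vertex_sets_match(a, b, tol=0.05):
    """True if both sets have equal size and each point has a partner within tol.

    This checks coverage in both directions; it does not enforce a strict
    one-to-one pairing.
    """
    return (
        len(a) == len(b)
        and not unmatched_vertices(a, b, tol)
        and not unmatched_vertices(b, a, tol)
    )
```

The brute-force search is quadratic, so for roughly ten thousand vertices a k-d tree lookup would be a more practical choice.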

## API

```python
def detect_pockets(
    pdb_path_or_str,        # path to a .pdb file, or PDB-format text
    *,
    min_radius=3.4,         # min alpha-sphere radius  (fpocket -m)
    max_radius=6.2,         # max alpha-sphere radius  (fpocket -M)
    clust_max_dist=2.4,     # clustering cutoff        (fpocket -D)
    min_apol_neigh=3,       # apolar-atom threshold    (fpocket -A)
    min_pock_nb_asph=15,    # min alpha spheres/pocket (fpocket -i)
    min_as_density=0.7,     # min alpha-sphere density
    mc_iter=300,            # Monte-Carlo volume iters (fpocket -v)
) -> dict
```

Returns `{"n_pockets": int, "pockets": [...]}`. Each pocket is a dict with:

| key | meaning |
|-----|---------|
| `pocket_id`, `rank` | 1-based, ranked by score |
| `score` | fpocket pocket score |
| `drug_score` | druggability score, `[0, 1]` |
| `volume` | Monte-Carlo pocket volume (Å³) |
| `n_alpha_spheres` | number of alpha spheres |
| `mean_alpha_radius`, `mean_alpha_solvent_access` | alpha-sphere geometry |
| `apolar_alpha_sphere_proportion` | fraction of apolar spheres |
| `hydrophobicity_score`, `volume_score` | amino-acid descriptors |
| `polarity_score`, `charge_score`, `prop_polar_atoms` | amino-acid descriptors |
| `mean_local_hydrophobic_density` | apolar packing |
| `alpha_sphere_density`, `as_max_dist` | spread descriptors |
| `total_sasa`, `polar_sasa`, `apolar_sasa`, `flexibility` | surface / B-factor |
| `residues` | list of `(chain_id, residue_id)` tuples lining the pocket |
| `alpha_spheres` | list of `(x, y, z, radius)` tuples |

The default parameters match fpocket's `fparams.h` (`M_MIN_ASHAPE_SIZE_DEFAULT`
3.4, `M_MAX_ASHAPE_SIZE_DEFAULT` 6.2, `M_CLUST_MAX_DIST` 2.4, etc.).
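
Any of these defaults can be overridden by keyword. The sketch below uses only names from the signature above; the wrapper function and the override values are arbitrary illustrations, not recommended settings.

```python
# Keyword names come from the documented signature; the values are
# arbitrary examples chosen for illustration.
CUSTOM_PARAMS = {
    "min_radius": 3.0,
    "clust_max_dist": 2.0,
    "min_pock_nb_asph": 20,
}


def detect_with_overrides(pdb_path_or_str):
    """Run detect_pockets with the overrides in CUSTOM_PARAMS."""
    import fpocket_rs  # imported lazily so this file loads without the wheel

    return fpocket_rs.detect_pockets(pdb_path_or_str, **CUSTOM_PARAMS)
```

Because the options are keyword-only, passing them positionally would raise a `TypeError`, which is why the overrides are kept in a dict and unpacked with `**`.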

## Parity with the fpocket C binary

Bit-for-bit parity with the qhull-based C binary is not attainable: qhull and
the Bowyer-Watson triangulation resolve cospherical degeneracies differently,
and some filtering tests sit on floating-point boundaries. Agreement was
instead measured against the fpocket C binary (built from the
[Discngine/fpocket](https://github.com/Discngine/fpocket) source) on three
real structures:

| structure | fpocket-C pockets | rust-fpocket pockets | alpha-sphere overlap | mean \|drug-score error\| |
|-----------|-------------------|----------------------|----------------------|---------------------------|
| 1AZ8 (thrombin)        |  6 | 10 | **100 %** (188/188) | 0.027 |
| 1HVR (HIV-1 protease)  |  6 |  7 | **100 %** (232/232) | 0.005 |
| 3EML (A2A receptor)    | 31 | 34 | **100 %** (916/916) | 0.008 |

What this means concretely:

- **Every pocket fpocket-C finds is recovered**, with **100 % alpha-sphere
  membership overlap** — each C pocket's spheres are all present in the matching
  Rust pocket.
- **Druggability scores match** to a mean absolute error of 0.005–0.027.
  Example top pockets: 1HVR C 0.903 / Rust 0.924; 3EML C 0.984 / Rust 0.994;
  1AZ8 C 0.087 / Rust 0.083.
- **Hydrophobicity, polarity and charge scores are exact** when the residue set
  matches (they depend only on residue identity).
- `alpha_sphere_density`, `mean_local_hydrophobic_density`, `as_max_dist` and
  `volume` agree closely.
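
The two headline metrics can be computed along these lines. This is a sketch under assumptions: the function names, the centre-distance matching rule and the 0.05 Å tolerance are illustrative, and `tests/test_parity.py` may define the metrics differently.

```python
import math


def sphere_overlap(c_spheres, rust_spheres, tol=0.05):
    """Fraction of C-pocket spheres with a Rust sphere centre within tol.

    Spheres are (x, y, z, radius) tuples; only the centres are compared.
    """
    if not c_spheres:
        return 0.0
    hits = sum(
        any(math.dist(c[:3], r[:3]) <= tol for r in rust_spheres)
        for c in c_spheres
    )
    return hits / len(c_spheres)


def mean_abs_error(pairs):
    """Mean absolute difference over (c_value, rust_value) pairs."""
    return sum(abs(c - r) for c, r in pairs) / len(pairs)


# The three top-pocket score pairs quoted above, used only as sample input.
# This is not the per-structure mean reported in the table.
top_pairs = [(0.903, 0.924), (0.984, 0.994), (0.087, 0.083)]
print(round(mean_abs_error(top_pairs), 4))
```

Overlap is measured from the C side only, so extra spheres in the Rust pocket do not lower it.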

Known differences:

- rust-fpocket reports a **few extra small pockets** (e.g. 10 vs 6 on 1AZ8). The
  extra clusters are real alpha-sphere groups; fpocket's exact set of retained
  alpha spheres differs by a small number due to qhull-vs-Bowyer-Watson
  degeneracy handling and filter-threshold boundary cases.
- Matched pockets carry **1–2 extra alpha spheres** each for the same reason.
- `total_sasa` can differ for large/elongated pockets, since SASA scales with
  the (slightly larger) sphere set; this mildly affects the absolute pocket
  `score` but not pocket identity or the ranking of the top pocket.

The parity tests in `tests/test_parity.py` reproduce these numbers; they are
skipped automatically if no `fpocket` C binary is on `PATH` (or pointed at by
the `FPOCKET_BIN` environment variable).
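
A lookup of this kind can be written with the standard library alone. The sketch below is an assumption about how such a check might work, not a copy of the test suite: the function name is hypothetical, and giving `FPOCKET_BIN` precedence over `PATH` is a guess.

```python
import os
import shutil


def find_fpocket_binary(env=None):
    """Return a path to an fpocket executable, or None if none is found.

    Assumed precedence: a valid FPOCKET_BIN first, then a PATH search.
    """
    env = os.environ if env is None else env
    override = env.get("FPOCKET_BIN")
    if override and os.path.isfile(override) and os.access(override, os.X_OK):
        return override
    # An empty search path makes shutil.which return None.
    return shutil.which("fpocket", path=env.get("PATH", ""))
```

A test module could then skip its parity cases whenever this returns `None`, for example through `pytest.mark.skipif`.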

## Development

```bash
# build + install into the current environment
maturin develop --release

# Python tests (functional + parity)
pytest tests/

# Rust unit tests
cargo test --manifest-path rust/Cargo.toml
```

The `fpocket-ref/` directory (git-ignored) holds a clone of the fpocket C source
used purely as a parity reference; clone it with
`git clone https://github.com/Discngine/fpocket fpocket-ref`.

## License

MIT — the same license as upstream fpocket. See [LICENSE](LICENSE).

fpocket is © 2012–2018 Vincent Le Guilloux, Peter Schmidtke and Pierre Tuffery.
This is an independent reimplementation, not affiliated with the fpocket authors.

