Metadata-Version: 2.4
Name: anukriti-pgx-core
Version: 0.2.1
Summary: Deterministic pharmacogenomics infrastructure: CPIC-pinned phenotype engine, gene callers, and recommendation lookup.
Author: Anukriti contributors
License: Apache-2.0
Project-URL: Homepage, https://github.com/AnukritiAi-hq/anukriti-pgx-core
Project-URL: Documentation, https://github.com/AnukritiAi-hq/anukriti-pgx-core#readme
Project-URL: Issues, https://github.com/AnukritiAi-hq/anukriti-pgx-core/issues
Keywords: pharmacogenomics,cpic,pharmvar,bioinformatics,precision-medicine
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: service
Requires-Dist: fastapi>=0.104; extra == "service"
Requires-Dist: uvicorn[standard]>=0.24; extra == "service"
Requires-Dist: pydantic>=2.5; extra == "service"
Provides-Extra: dev
Requires-Dist: pytest>=7.4; extra == "dev"
Requires-Dist: pytest-cov>=4.1; extra == "dev"
Dynamic: license-file

# anukriti-pgx-core

### Deterministic pharmacogenomics infrastructure

> Authoritative, CPIC-pinned, no LLMs, no randomness. Same inputs → same outputs.

![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-green)
![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue)
![Status: Alpha](https://img.shields.io/badge/status-alpha-orange)

---

## What this is

The deterministic core extracted from
[`anukriti`](https://github.com/AnukritiAi-hq/anukriti) (PGx product) and
[`anukriti-swarm`](https://github.com/AnukritiAi-hq/anukriti-swarm) (research
platform), now shippable as a standalone library and (optionally) HTTP service.

Three stacked layers:

```
    VCF variants (rsid → ref,alt,gt)
            │
            ▼  Layer 1: calling/       ← VCFCaller.call(gene, variants)
    Star alleles (*1, *17, *2/*17 …)
            │
            ▼  Layer 2: phenotype/     ← PhenotypeEngine.infer(gene, a1, a2)  ← PHASE 1
    Phenotype (PM / IM / NM / RM / UM)
            │
            ▼  Layer 3: recommendations/  ← RecommendationLookup.lookup(...)
    CPIC recommendation
```

Each layer is designed to be independently usable. A consumer that already has
star alleles (e.g. a swarm agent) uses Layer 2 directly. A consumer with VCF
input (e.g. a clinical pipeline) uses Layer 1, which calls into Layer 2
internally. Only Layer 2 ships in this release; Layers 1 and 3 are planned
(see Phase status).

## Why this exists

Before the extraction, the same phenotype logic lived in two places:

- `anukriti-swarm/rules/phenotype_rules.py` — star-allele level, 3 genes
- `anukriti/src/*_caller.py` (16 files) — VCF level, 16 genes, inconsistent
  signatures

That meant two sources of truth for the same CPIC tables, and no way for a
third party to consume just the deterministic core without pulling in a
FastAPI product or a multi-agent research framework.

This package **is the deterministic core**. It has:

- Zero runtime dependencies (Phase 1)
- No LLMs, no randomness, no I/O side effects
- CPIC table versions pinned by filename (e.g. `CYP2C19_named_diplotypes_v2022.1.json`)
- An explicit upgrade rhythm: CPIC updates → new file → new test pin → review → bump
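
To illustrate what pinning by filename makes possible, here is a minimal sketch that splits a table filename into gene, table kind, and version. The `parse_pin` helper, the `TablePin` type, and the regular expression are assumptions inferred from the example filenames in this README; they are not part of the package API.

```python
import re
from typing import NamedTuple

# Hypothetical pattern, inferred from filenames such as
# "CYP2C19_named_diplotypes_v2022.1.json"; the package may parse pins differently.
_PIN_RE = re.compile(
    r"^(?P<gene>[A-Z0-9-]+)_(?P<table>[a-z_]+)_v(?P<version>\d+(?:\.\d+)*)\.json$"
)


class TablePin(NamedTuple):
    gene: str
    table: str
    version: tuple[int, ...]  # numeric parts, so (2019, 10) sorts after (2019, 9)


def parse_pin(filename: str) -> TablePin:
    """Split a pinned table filename into gene, table kind, and version."""
    match = _PIN_RE.match(filename)
    if match is None:
        raise ValueError(f"not a pinned table filename: {filename!r}")
    version = tuple(int(part) for part in match["version"].split("."))
    return TablePin(match["gene"], match["table"], version)


pin = parse_pin("CYP2C19_named_diplotypes_v2022.1.json")
```

Keeping the version as a tuple of integers rather than a string means a regression test can compare pins numerically when a new table file is added.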

## Phase status

| Phase | Scope | Status |
|---|---|---|
| **1** | Layer 2: PhenotypeEngine for CYP2D6 + CYP2C19 | ✅ **this release** |
| 2 | Layer 1: VCFCaller + 16 gene callers (normalized signatures) | planned |
| 3 | Layer 3: RecommendationLookup + optional HTTP service | planned |
| 4 | Swarm expansion to the 13 non-CYP2D6/2C19/HLA-B genes | planned |

## Install

```bash
pip install anukriti-pgx-core==0.2.1

# Or locally while iterating:
pip install -e /path/to/anukriti-pgx-core
```

## Quick use (Layer 2 — what Swarm consumes)

```python
from anukriti_pgx_core import PhenotypeEngine

engine = PhenotypeEngine()

result = engine.infer("CYP2C19", "*1", "*17")
# -> PhenotypeInference(
#        gene="CYP2C19",
#        diplotype="*1/*17",
#        activity_score=2.5,
#        phenotype="Rapid Metabolizer",           # per CPIC 2022 Table 2
#        confidence=1.0,
#        rule_version="cpic_activity_score_v2",
#        source="CPIC named-diplotype table (CYP2C19)",
#        cpic_table_version="CYP2C19_named_diplotypes_v2022.1",
#        ...
#    )

engine.supported_genes()
# -> ["CYP2C19", "CYP2D6"]

engine.cpic_table_version("CYP2C19")
# -> "CYP2C19_named_diplotypes_v2022.1"
```

## Authoritative sources (pinned)

| Gene | Layer | Table file | Citation |
|---|---|---|---|
| CYP2D6 | activity-score → phenotype | `CYP2D6_activity_v2019.10.json` | CPIC 2019 standardization (Caudle et al., PMID:31647186) |
| CYP2C19 | named diplotype → phenotype | `CYP2C19_named_diplotypes_v2022.1.json` | CPIC 2022 clopidogrel guideline Table 2 (Lee et al., PMID:35034351; NCBI NBK84114) |
| CYP2C19 | allele function (activity score) | `CYP2C19_activity_v2022.1.json` | CPIC 2022 allele functionality table |

Updating a pinned table means adding a new file, bumping the pin, and adding or
updating the regression case. Silent in-place edits are not allowed.
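
For readers unfamiliar with the activity-score route used for CYP2D6, the sketch below shows the general shape of the mapping: per-allele activity values are summed and the total is binned into a phenotype. The thresholds follow the CPIC 2019 standardization cited in the table; the allele values are a small illustrative subset, and the names `ALLELE_ACTIVITY`, `phenotype_from_score`, and `cyp2d6_phenotype` are hypothetical rather than the package API.

```python
# Illustrative subset of CYP2D6 allele activity values. The pinned
# CYP2D6_activity_v2019.10.json table is the authoritative source.
ALLELE_ACTIVITY = {
    "*1": 1.0,
    "*2": 1.0,
    "*4": 0.0,
    "*5": 0.0,
    "*10": 0.25,
    "*41": 0.5,
    "*1x2": 2.0,  # duplication of a normal-function allele
}


def phenotype_from_score(score: float) -> str:
    """Bin a CYP2D6 activity score into a phenotype (CPIC 2019 thresholds)."""
    if score < 0:
        raise ValueError(f"activity score cannot be negative: {score}")
    if score == 0:
        return "Poor Metabolizer"
    if score < 1.25:
        return "Intermediate Metabolizer"
    if score <= 2.25:
        return "Normal Metabolizer"
    return "Ultrarapid Metabolizer"


def cyp2d6_phenotype(a1: str, a2: str) -> tuple[float, str]:
    """Sum the two allele values and return (activity score, phenotype)."""
    score = ALLELE_ACTIVITY[a1] + ALLELE_ACTIVITY[a2]
    return score, phenotype_from_score(score)


score, phenotype = cyp2d6_phenotype("*1", "*4")
```

All values in this subset are exact binary fractions, so the float comparisons at 1.25 and 2.25 are safe here; a table with other values might warrant `fractions.Fraction` or `decimal.Decimal` instead.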

## Tests

```bash
cd anukriti-pgx-core
pip install -e ".[dev]"
pytest

# Or standalone:
python -m tests.test_pinned_star_alleles
```

## Integration with existing Anukriti projects

### From `anukriti-swarm`

Swarm's `rules/phenotype_rules.py` is a thin re-export shim that delegates
here. All existing swarm code (`agents/pharmacogene/base.py`,
`core/verification/safety.py`, `core/runtime/runtime.py`) continues to work
unchanged. Migration to direct `anukriti_pgx_core` imports is opt-in per
call site.

### From `anukriti` (FastAPI product)

Phase 2 will migrate the 16 gene callers in `anukriti/src/` into
`anukriti_pgx_core.calling`. Until then, `anukriti` continues to use its
existing caller modules unchanged.

## License

Apache 2.0. See [LICENSE](LICENSE).
