Metadata-Version: 2.4
Name: coreason_prism
Version: 0.4.0
Summary: coreason-prism
License: # The Prosperity Public License 3.0.0
         
         Contributor: CoReason, Inc.
         
         Source Code: https://github.com/CoReason-AI/coreason_prism
         
         ## Purpose
         
         This license allows you to use and share this software for noncommercial purposes for free and to try this software for commercial purposes for thirty days.
         
         ## Agreement
         
         In order to receive this license, you have to agree to its rules.  Those rules are both obligations under that agreement and conditions to your license.  Don't do anything with this software that triggers a rule you can't or won't follow.
         
         ## Notices
         
         Make sure everyone who gets a copy of any part of this software from you, with or without changes, also gets the text of this license and the contributor and source code lines above.
         
         ## Commercial Trial
         
         Limit your use of this software for commercial purposes to a thirty-day trial period.  If you use this software for work, your company gets one trial period for all personnel, not one trial per person.
         
         ## Contributions Back
         
         Developing feedback, changes, or additions that you contribute back to the contributor on the terms of a standardized public software license such as [the Blue Oak Model License 1.0.0](https://blueoakcouncil.org/license/1.0.0), [the Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0.html), [the MIT license](https://spdx.org/licenses/MIT.html), or [the two-clause BSD license](https://spdx.org/licenses/BSD-2-Clause.html) doesn't count as use for a commercial purpose.
         
         ## Personal Uses
         
         Personal use for research, experiment, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, amateur pursuits, or religious observance, without any anticipated commercial application, doesn't count as use for a commercial purpose.
         
         ## Noncommercial Organizations
         
         Use by any charitable organization, educational institution, public research organization, public safety or health organization, environmental protection organization, or government institution doesn't count as use for a commercial purpose regardless of the source of funding or obligations resulting from the funding.
         
         ## Defense
         
         Don't make any legal claim against anyone accusing this software, with or without changes, alone or with other technology, of infringing any patent.
         
         ## Copyright
         
         The contributor licenses you to do everything with this software that would otherwise infringe their copyright in it.
         
         ## Patent
         
         The contributor licenses you to do everything with this software that would otherwise infringe any patents they can license or become able to license.
         
         ## Reliability
         
         The contributor can't revoke this license.
         
         ## Excuse
         
         You're excused for unknowingly breaking [Notices](#notices) if you take all practical steps to comply within thirty days of learning you broke the rule.
         
         ## No Liability
         
         ***As far as the law allows, this software comes as is, without any warranty or condition, and the contributor won't be liable to anyone for any damages related to this software or this license, under any kind of legal claim.***
License-File: LICENSE
License-File: NOTICE
Author: Gowtham A Rao
Author-email: gowtham.rao@coreason.ai
Requires-Python: >=3.11
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Dist: aiofiles (>=23.2.1,<24.0.0)
Requires-Dist: anyio (>=4.4.0,<5.0.0)
Requires-Dist: coreason-identity (>=0.1.0,<0.2.0)
Requires-Dist: datamol (>=0.12.5,<0.13.0)
Requires-Dist: fastapi (>=0.128.0,<0.129.0)
Requires-Dist: httpx (>=0.28.1,<0.29.0)
Requires-Dist: loguru (>=0.7.2,<0.8.0)
Requires-Dist: open-clip-torch (>=3.2.0,<4.0.0)
Requires-Dist: pandas (>=2.3.3,<3.0.0)
Requires-Dist: pillow (>=12.1.0,<13.0.0)
Requires-Dist: pydantic (>=2.12.5,<3.0.0)
Requires-Dist: rdkit (==2024.9.4)
Requires-Dist: selfies (>=2.2.0,<3.0.0)
Requires-Dist: torch (>=2.9.1,<3.0.0)
Requires-Dist: transformers (>=4.57.3,<5.0.0)
Requires-Dist: uvicorn (>=0.40.0,<0.41.0)
Project-URL: Documentation, https://github.com/CoReason-AI/coreason_prism
Project-URL: Homepage, https://github.com/CoReason-AI/coreason_prism
Project-URL: Repository, https://github.com/CoReason-AI/coreason_prism
Description-Content-Type: text/markdown

# coreason-prism

**The Scientific Eye / Multi-Modal Encoder**

[![Organization](https://img.shields.io/badge/org-CoReason--AI-blue)](https://github.com/CoReason-AI)
[![License](https://img.shields.io/badge/license-Prosperity%203.0-blue)](https://prosperitylicense.com/versions/3.0.0)
[![CI](https://github.com/CoReason-AI/coreason_prism/actions/workflows/ci.yml/badge.svg)](https://github.com/CoReason-AI/coreason_prism/actions)
[![Code Quality](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
[![Docs](https://img.shields.io/badge/docs-product%20requirements-green)](docs/product_requirements.md)

**coreason-prism** is the specialized processing engine for scientific data types (Chemistry and Vision) within the CoReason AI ecosystem. It acts as the "Scientific Eye" and "Multi-Modal Encoder", transforming fragile string representations and static images into robust mathematical graphs and vectors.

**Core Philosophy:**
> "A Molecule is a Graph, not a String. A Chart is Data, not an Image."

---

## Features

Derived from the [Product Requirements Document](docs/product_requirements.md):

*   **Cheminformatic Grounding (The Chemist):**
    *   Treats molecules as mathematical graphs, not strings.
    *   Normalizes and sanitizes chemical structures (SMILES/InChI) using `datamol`.
    *   Converts SMILES to **SELFIES**, a representation in which every string decodes to a valid molecule, so generative output is always syntactically valid.
    *   Computes fingerprints (Morgan/ECFP) for structural similarity search.
    *   Calculates key properties: Molecular Weight, LogP, TPSA, Lipinski Violations.

*   **Visual De-Plotting (The Analyst):**
    *   Extracts raw data from scientific figures (e.g., Kaplan-Meier curves) using **DePlot**.
    *   Digitizes charts into linearized tables that can be loaded as DataFrames.
    *   Enables meta-analysis of data locked in PDF images.

*   **Bio-Image Segmentation (The Biologist):**
    *   Segments and classifies medical images (e.g., Histology) using **MedSAM**.
    *   Detects ROIs and computes metrics like cell counts and tumor area.

*   **Multi-Modal Embedding (The Embedder):**
    *   Generates joint embeddings for text, molecules, and images using **BioCLIP**.
    *   Enables multi-modal retrieval (e.g., searching for histology slides via text description).
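
As an illustration of the structural similarity search mentioned under The Chemist, the sketch below compares bit-vector fingerprints with the Tanimoto coefficient, a common choice for Morgan/ECFP fingerprints. It is a minimal pure-Python sketch, not the library's implementation: the function names `tanimoto` and `rank_by_similarity` and the toy 8-bit vectors are hypothetical, and real fingerprints are typically much longer.

```python
from typing import Dict, List, Sequence, Tuple


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient of two equal-length 0/1 bit vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both vectors, and bits set in at least one of them.
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero vectors share no features; treat them as dissimilar.
    return both / either if either else 0.0


def rank_by_similarity(
    query: Sequence[int], library: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy fingerprints (hypothetical values, for illustration only).
query_fp = [1, 1, 0, 1, 0, 0, 1, 0]
library_fps = {
    "mol_a": [1, 0, 0, 1, 0, 1, 1, 0],
    "mol_b": [0, 0, 1, 0, 1, 0, 0, 1],
    "mol_c": [1, 1, 0, 1, 0, 0, 1, 0],
}

for name, score in rank_by_similarity(query_fp, library_fps):
    print(f"{name}: {score:.2f}")
```

A score of 1.0 means the two fingerprints set exactly the same bits, and 0.0 means they share none.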

---

## Installation

```bash
pip install coreason-prism
```

Or install from source:

```bash
git clone https://github.com/CoReason-AI/coreason_prism.git
cd coreason_prism
pip install .
```

## Usage

The snippet below initializes Prism and processes a molecule, a chart image, and a bio-image:

```python
from pathlib import Path
from coreason_prism.interface import Prism, PrismMode

# Initialize Prism (The Facade)
prism = Prism(light_mode=False)  # Set light_mode=True to skip heavy models (DePlot/BioCLIP)

# 1. Process a Molecule (SMILES -> Graph/SELFIES + Properties)
molecule_result = prism.process_molecule("CC(=O)Oc1ccccc1C(=O)O")  # Aspirin
if molecule_result.status == "VALID":
    print(f"Canonical SMILES: {molecule_result.canonical_smiles}")
    print(f"SELFIES: {molecule_result.selfies_string}")
    print(f"LogP: {molecule_result.logp}")
    print(f"Fingerprint (first 10 bits): {molecule_result.fingerprint_vector[:10]}")

# 2. Process a Chart Image (Extract Data)
# Ensure you have an image file at the specified path
chart_path = Path("tests/data/kaplan_meier.png")
if chart_path.exists():
    chart_result = prism.process_image(
        image_path=chart_path,
        source_document_id="doc_123",
        mode=PrismMode.CHART
    )
    print(f"Figure Type: {chart_result.figure_type}")
    print(f"Extracted Data: {chart_result.data_series}")
    if chart_result.metadata:
        print(f"Median Survival: {chart_result.metadata.get('median_survival')}")

# 3. Process a Bio-Image (Segmentation)
bio_path = Path("tests/data/histology_slide.jpg")
if bio_path.exists():
    bio_result = prism.process_image(
        image_path=bio_path,
        source_document_id="doc_456",
        mode=PrismMode.BIO
    )
    if bio_result.metadata:  # Guard against missing metadata, as in the chart example
        print(f"Cell Count: {bio_result.metadata.get('cell_count')}")
```
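
To make the `median_survival` value in the chart example concrete, the sketch below shows how a median survival time is conventionally read off a Kaplan-Meier curve: it is the earliest time at which the survival probability drops to 0.5 or below. This is a minimal sketch under stated assumptions, not the library's method. The helper `median_survival` and the `(time, survival)` pair layout are hypothetical, and the actual structure of `data_series` may differ.

```python
from typing import Optional, Sequence, Tuple


def median_survival(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Earliest time at which survival probability is <= 0.5.

    `curve` is a sequence of (time, survival_probability) points taken
    from a digitized Kaplan-Meier plot (assumed layout). Returns None
    when the curve never reaches 0.5, in which case the median is
    undefined for the observed follow-up period.
    """
    for time, survival in sorted(curve):  # iterate in time order
        if survival <= 0.5:
            return time
    return None


# Hypothetical digitized points: (months, survival probability).
points = [(0, 1.0), (6, 0.8), (12, 0.55), (18, 0.45), (24, 0.3)]
print(f"Median survival: {median_survival(points)} months")
```

Returning `None` rather than the last observed time avoids reporting a median that the data does not support.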

