Metadata-Version: 2.4
Name: phytodata-mcp
Version: 0.1.0
Summary: MCP server providing LLMs with conversational access to plant genomics databases (PhytoMine, KEGG, Ensembl Plants, UniProt)
Keywords: bioinformatics,ensembl,kegg,mcp,phytozome,plant-biology
Requires-Python: >=3.10
Requires-Dist: mcp>=1.0.0
Requires-Dist: python-dotenv
Requires-Dist: requests
Provides-Extra: dev
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Description-Content-Type: text/markdown

# PhytoData-MCP

<div align="center">
  <img src="logo.jpg" alt="PhytoData-MCP Logo" width="200" />
</div>

> Standalone MCP Server for Plant Genomics

PhytoData-MCP addresses the problem of siloed plant genomics databases by exposing PhytoMine, KEGG, Ensembl Plants, and UniProt (upcoming) as Model Context Protocol (MCP) tools. This lets an LLM such as Claude chain queries on its own and build a picture of a single gene's function, pathway membership, cross-species conservation, and protein behavior.

![PyPI version](https://img.shields.io/pypi/v/phytodata-mcp)
![Python](https://img.shields.io/pypi/pyversions/phytodata-mcp)
![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)

## Features

- **PhytoMine (Phytozome):** Query GO terms, Pfam domains, and KO IDs for given genes.
- **KEGG:** Look up pathways from KO IDs or EC numbers.
- **Ensembl Plants:** Find orthologous genes across multiple crop species.
- **UniProt (Upcoming):** Subcellular localization and protein domains.

## Installation

You can install `phytodata-mcp` directly from PyPI:

```bash
pip install phytodata-mcp
```

### Development Installation

To install for development, with testing dependencies:

```bash
git clone https://github.com/zaeyasa/phytodata-mcp.git
cd phytodata-mcp
pip install -e ".[dev]"  # quotes keep shells such as zsh from expanding the brackets
```

## Claude Desktop Configuration

Add the server to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "phytodata": {
      "command": "phytodata-mcp"
    }
  }
}
```
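If the config file already lists other servers, the `phytodata` entry has to be merged in rather than pasted over them. A minimal Python sketch of that merge follows. The helper names `add_phytodata_server` and `update_config_file` are hypothetical and not part of this package, and the path to `claude_desktop_config.json` is left to the caller because it depends on the operating system.

```python
import json
from pathlib import Path


def add_phytodata_server(config: dict) -> dict:
    """Return a copy of `config` with the phytodata entry added.

    Existing entries under "mcpServers" are preserved; the input
    dictionary is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["phytodata"] = {"command": "phytodata-mcp"}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at `path`, add the entry, and write it back.

    A missing or empty file is treated as an empty configuration.
    """
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    updated = add_phytodata_server(config)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling `update_config_file(Path("path/to/claude_desktop_config.json"))` with the real location of the file on your system would apply the change in place.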

## Usage Example

Once connected, you can ask Claude questions like:

> *"Fetch the gene info for AT3G24650 in Arabidopsis, identify its KEGG pathway, and find its orthologs in wheat and tomato."*

The LLM will automatically chain:
1. `phytomine_gene_info` -> Retrieves KO ID
2. `kegg_pathway_lookup` -> Uses KO ID to find pathways
3. `ensembl_plants_orthologs` -> Matches gene across target crops
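
The chain above can be sketched in Python as plain function calls. This is an illustration of the data flow, not the server's implementation: `call_tool` is a hypothetical callable standing in for whatever MCP client invokes the tools, and the `ko_id` key in the gene info result is an assumption, since the response format is not documented here.

```python
from typing import Any, Callable

# Hypothetical signature: (tool name, arguments) -> result dictionary.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def chain_gene_queries(
    call_tool: ToolCaller,
    gene_id: str,
    organism: str,
    target_species: Any,
) -> dict[str, Any]:
    """Run the three-step chain and collect the raw results."""
    # Step 1: gene annotation, expected to include a KO ID.
    gene_info = call_tool(
        "phytomine_gene_info",
        {"gene_id": gene_id, "organism": organism},
    )

    # Step 2: pathway lookup, only possible if a KO ID came back.
    # ASSUMPTION: the result exposes the KO ID under the key "ko_id".
    ko_id = gene_info.get("ko_id")
    pathways = None
    if ko_id is not None:
        pathways = call_tool("kegg_pathway_lookup", {"ko_id": ko_id})

    # Step 3: ortholog search; it depends only on the gene ID.
    orthologs = call_tool(
        "ensembl_plants_orthologs",
        {"gene_id": gene_id, "target_species": target_species},
    )

    return {"gene_info": gene_info, "pathways": pathways, "orthologs": orthologs}
```

In practice the LLM decides the order of calls itself; the sketch only makes explicit that step 2 consumes the output of step 1, while step 3 needs nothing beyond the original gene ID.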

## Tools Reference

- **`phytomine_gene_info`**: Requires `gene_id` and `organism`.
- **`kegg_pathway_lookup`**: Requires `ko_id` or `ec_number`.
- **`ensembl_plants_orthologs`**: Requires `gene_id` and `target_species` (can use `"all_crops"`).
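
For readers scripting against the server directly, the sketch below checks arguments against the list above and builds an MCP `tools/call` JSON-RPC request. The helper `build_tool_call` and the two lookup tables are hypothetical, not part of this package, and reading "`ko_id` or `ec_number`" as "at least one of the two" is an assumption.

```python
from typing import Any

# Tools that need every listed argument.
REQUIRED_ALL = {
    "phytomine_gene_info": ("gene_id", "organism"),
    "ensembl_plants_orthologs": ("gene_id", "target_species"),
}

# Tools that need at least one of the listed arguments (assumed reading).
REQUIRED_ANY = {
    "kegg_pathway_lookup": ("ko_id", "ec_number"),
}


def build_tool_call(
    name: str, arguments: dict[str, Any], request_id: int = 1
) -> dict[str, Any]:
    """Validate `arguments` and return a JSON-RPC 2.0 `tools/call` request."""
    if name in REQUIRED_ALL:
        missing = [key for key in REQUIRED_ALL[name] if key not in arguments]
        if missing:
            raise ValueError(f"{name} is missing arguments: {', '.join(missing)}")
    elif name in REQUIRED_ANY:
        options = REQUIRED_ANY[name]
        if not any(key in arguments for key in options):
            raise ValueError(f"{name} needs one of: {', '.join(options)}")
    else:
        raise ValueError(f"unknown tool: {name}")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
```

For example, `build_tool_call("ensembl_plants_orthologs", {"gene_id": "AT3G24650", "target_species": "all_crops"})` returns a request dictionary ready to be serialized with `json.dumps`, while omitting `target_species` raises a `ValueError` before anything is sent.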

## License

This project is licensed under the MIT License.
