Metadata-Version: 2.4
Name: geo-mcp
Version: 0.1.2
Summary: A Model Context Protocol (MCP) server for accessing GEO (Gene Expression Omnibus) data through NCBI E-Utils API
Project-URL: Homepage, https://github.com/MCPmed/geomcp
Project-URL: Repository, https://github.com/MCPmed/geomcp
Project-URL: Documentation, https://github.com/MCPmed/geomcp#readme
Project-URL: Bug Tracker, https://github.com/MCPmed/geomcp/issues
Author-email: MCPmed Contributors <matthias.flotho@ccb.uni-saarland.de>
License: BSD-3-Clause
License-File: LICENSE
Keywords: bioinformatics,e-utils,gene-expression,geo,mcp,ncbi
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: aiohttp>=3.8.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: mcp[cli]>=1.9.4
Requires-Dist: requests>=2.31.0
Requires-Dist: uvicorn>=0.20.0
Provides-Extra: dev
Requires-Dist: black>=23.0.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: test
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'test'
Requires-Dist: pytest>=7.0.0; extra == 'test'
Description-Content-Type: text/markdown

# Gene Expression Omnibus (GEO) MCP

<div align="center">
  <img src="https://www.ncbi.nlm.nih.gov/geo/img/geo_main.gif" alt="GEO Logo" width="200"/>
  <br/>
  <em>Gene Expression Omnibus (GEO) - A public functional genomics data repository</em>
</div>

[![PyPI version](https://badge.fury.io/py/geo-mcp.svg)](https://pypi.org/project/geo-mcp/)
[![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-BSD--3--Clause-green.svg)](LICENSE)
[![MCP](https://img.shields.io/badge/MCP-Protocol-blue.svg)](https://modelcontextprotocol.io/)
[![GEO](https://img.shields.io/badge/GEO-NCBI-orange.svg)](https://www.ncbi.nlm.nih.gov/geo/)

> **Status:** Beta. The MCP stdio and HTTP servers are both functional and in
> day-to-day use. The public tool surface is stable for the 0.1.x line; internal
> APIs and config defaults may still evolve. Please open an issue if you hit a
> regression.

A Model Context Protocol (MCP) server for accessing [GEO (Gene Expression Omnibus)](https://www.ncbi.nlm.nih.gov/geo/) data through NCBI E-Utils API.
The tool will enable you to search for GEO datasets, series, samples, platforms, and profiles for your LLM.
Tested with Claude Desktop, chatGPT has no out of the box support for this tool yet.
Claude will automatically use the tools if it fits the context.

## Quick Install (pip)

install from pip
```bash
pip install geo-mcp
```
install from source
```bash
git clone https://github.com/MCPmed/GEOmcp
cd GEOmcp
pip install -e .
```

## Configuration

Run init to create a config file
```bash
geo-mcp --init
```

By default this writes to `$XDG_CONFIG_HOME/geomcp/config.json`
(typically `~/.config/geomcp/config.json`). If you already have a legacy
`~/.geo-mcp/config.json` from an older install, that location is still
honored and `--init` will update it in place so nothing gets orphaned.

The file contains:
```json
{
    "base_url": "https://eutils.ncbi.nlm.nih.gov/entrez/eutils",
    "email": "your_email@example.com",
    "api_key": "YOUR_API_KEY",
    "download_dir": "~/.local/share/geomcp/downloads"
}
```

- `email` is required by NCBI.
- `api_key` is optional but recommended for higher rate limits
  ([get one here](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/)).

You can also skip the config file entirely and pass values via env vars
(`GEOMCP_EMAIL`, `GEOMCP_API_KEY`, `GEOMCP_BASE_URL`, `GEOMCP_DOWNLOAD_DIR`)
or CLI flags (`--email`, `--api-key`, `--download-dir`, ...). Precedence,
highest wins: CLI flags → env vars → `config.json` → built-in defaults.

## Running the Server

- **MCP stdio mode:**
  ```bash
  geo-mcp
  ```
- **HTTP mode:**
  ```bash
  geo-mcp --http --port 8001
  ```

## Claude Desktop Integration

### Common Issue: `spawn geo-mcp ENOENT`

This error means Claude Desktop cannot find the `geo-mcp` command. This is usually a PATH issue.

### Solution
1. **Find the full path to the executable:**
   ```bash
   which geo-mcp
   ```
   Example output: `/Users/youruser/miniforge3/bin/geo-mcp`

2. **Update your Claude config:**
   Instead of just `"geo-bio-mcp"`, use the full path:
   ```json
   {
     "mcpServers": {
       "geo-mcp": {
         "command": "/Users/youruser/miniforge3/bin/geo-mcp",
         "env": {
           "CONFIG_PATH": "/Users/youruser/.geo-mcp/config.json"
         }
       }
     }
   }
   ```

3. **(Optional) Use a Conda Environment:**
   - Activate your conda env and run `which geo-mcp` to get the correct path.
   - Use that path in your Claude config as above.

4. **Restart Claude Desktop** after updating the config.

---

## Troubleshooting
- If you see `command not found: geo-mcp`, make sure you installed into the
  Python environment whose `bin` directory is on your `PATH`, or invoke the
  server with its full path.
- If the config file is missing, `geo-mcp --init` will create one. You can
  also run without a config by setting `GEOMCP_EMAIL` in the environment.

---

## HTTP API

With the server running via `geo-mcp --http` (default port `8001`), the
following endpoints are exposed:

- `GET /` — server status
- `GET /health` — health check
- `GET /tools` — list all registered MCP tools
- `POST /tools/call` — execute a tool with arguments
- `GET /events` — server-sent events stream of tool calls
- `GET /docs` — interactive Swagger UI

### Examples

```bash
# List available tools
curl http://localhost:8001/tools

# Search GEO Series
curl -X POST http://localhost:8001/tools/call \
  -H "Content-Type: application/json" \
  -d '{
    "name": "search_geo_series",
    "arguments": {"term": "breast cancer", "retmax": 5}
  }'
```

## Available Tools

This MCP server provides access to all major GEO databases through the following tools:

### Search Tools

- **`search_geo`** - Universal GEO search across GSE/GSM/GPL/GDS records, with
  results bucketed by accession type. Pass `record_types` to filter.
  - *Example searches*: "breast cancer", "GSE12345", "RNA-seq time course"

- **`search_geo_profiles`** - Search gene expression profiles across different biological contexts
  - *Example searches*: "cancer", "breast cancer", "p53", "apoptosis"
  
- **`search_geo_datasets`** - Search curated gene expression datasets
  - *Example searches*: "diabetes", "Alzheimer's disease", "drug response", "tissue specific"
  
- **`search_geo_series`** - Search original submitter-supplied gene expression series
  - *Example searches*: "GSE12345", "microarray", "RNA-seq", "time course"
  
- **`search_geo_samples`** - Search individual gene expression samples
  - *Example searches*: "GSM123456", "human", "mouse", "tumor", "normal"
  
- **`search_geo_platforms`** - Search microarray platform definitions
  - *Example searches*: "Affymetrix", "Illumina", "Agilent", "GPL96"

### Search Parameters

Each tool accepts:
- **`term`** (required): Search term or query string
- **`retmax`** (optional, default: 20): Maximum number of results to return

### Example usage
For now just chat with claude desktop and it will use the tools automatically if it fits the context.

**User:** find a small brain dataset  

**Assistant:** I'll help you find a small brain dataset from the GEO database. Let me search for brain-related datasets with a maximum of one result.  

**Assistant (system):** *search_geo_datasets*  

**Assistant:** I found a brain-related dataset for you:

- **Dataset:** GSE290848  
- **Title:** Maternal Stress Triggers Early-Life Eczema via Fetal Mast Cell Reprogramming [10x Genomics Multiome ATAC + Gene Expression]  
- **Species:** Mouse (*Mus musculus*)  
- **Samples:** 8  
- **Data type:** ATAC-seq & gene expression (10x Genomics Multiome)  
- **Publication date:** 21 Jun 2025  

This small dataset explores how prenatal stress affects neuro-immune circuits and mast-cell development.  

Would you like me to download this dataset or search for others?

---

**User:** download GSE290848  

**Assistant:** I'll download the GSE290848 dataset for you.  

**Assistant (system):** *download_geo_data*  

**Assistant:** Download complete.  

**Downloaded files**

- `GSE290848_family.soft.gz` — series, platform & sample info  
- `GSE290848_metadata.xml` — detailed study metadata  

**Location:** `~/.local/share/geomcp/downloads/gse/GSE290848`  
**Total size:** 0.01 MB  

The dataset is ready for analysis. Let me know if you’d like to explore it further.
