Metadata-Version: 2.4
Name: imas-mcp
Version: 2.1.0
Summary: An IMAS Data Dictionary MCP server
Project-URL: Homepage, https://github.com/iterorganization/imas-mcp
Project-URL: Repository, https://github.com/iterorganization/imas-mcp
Project-URL: Documentation, https://github.com/iterorganization/imas-mcp#readme
Project-URL: Bug Tracker, https://github.com/iterorganization/imas-mcp/issues
Author-email: Simon McIntosh <simon.mcintosh@iter.org>
License: CC BY-ND 4.0
Keywords: Data Dictionary,Fusion,IMAS,ITER,MCP,Model Context Protocol,Plasma Physics
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database :: Database Engines/Servers
Classifier: Topic :: Scientific/Engineering :: Physics
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: <3.13,>=3.12
Requires-Dist: cachetools<6.0.0,>=5.3.0
Requires-Dist: click<9.0.0,>=8.0.0
Requires-Dist: fastmcp<3.0.0,>=2.10.6
Requires-Dist: huggingface-hub[hf-xet]>=0.33.4
Requires-Dist: nest-asyncio<2.0.0,>=1.5.0
Requires-Dist: networkx<4.0,>=3.0
Requires-Dist: numpy>=2.3.1
Requires-Dist: pint<0.25.0,>=0.24.4
Requires-Dist: pydantic<3.0.0,>=2.11.4
Requires-Dist: pyyaml<7.0,>=6.0
Requires-Dist: sentence-transformers>=5.0.0
Provides-Extra: bench
Requires-Dist: asv[virtualenv]<1.0.0,>=0.6.0; extra == 'bench'
Provides-Extra: http
Requires-Dist: fastapi<1.0.0,>=0.100.0; extra == 'http'
Requires-Dist: starlette<1.0.0,>=0.37.0; extra == 'http'
Requires-Dist: uvicorn<1.0.0,>=0.30.0; extra == 'http'
Provides-Extra: test
Requires-Dist: coverage>=7.0.0; extra == 'test'
Requires-Dist: pytest-asyncio<1.0.0,>=0.21.0; extra == 'test'
Requires-Dist: pytest-benchmark<5.0.0,>=4.0.0; extra == 'test'
Requires-Dist: pytest-cov<7.0.0,>=6.1.1; extra == 'test'
Requires-Dist: pytest-timeout<3.0.0,>=2.1.0; extra == 'test'
Requires-Dist: pytest-xdist<4.0.0,>=3.0.0; extra == 'test'
Requires-Dist: pytest<9.0.0,>=8.3.5; extra == 'test'
Description-Content-Type: text/markdown

# IMAS MCP Server

[![pre-commit][pre-commit-badge]][pre-commit-link]
[![Ruff][ruff-badge]][ruff-link]
[![Python versions][python-badge]][python-link]
[![CI/CD status][build-deploy-badge]][build-deploy-link]
[![Coverage status][codecov-badge]][codecov-link]
[![Documentation][docs-badge]][docs-link]
[![ASV][asv-badge]][asv-link]

A Model Context Protocol (MCP) server providing AI assistants with access to IMAS (Integrated Modelling & Analysis Suite) data structures through natural language search and optimized path indexing.

## Quick Start - Connect to Hosted Server

The easiest way to get started is to connect to our hosted IMAS MCP server. No installation required!

### VS Code Setup

#### Option 1: Interactive Setup (Recommended)

1. Open VS Code and press `Ctrl+Shift+P` (or `Cmd+Shift+P` on Mac)
2. Type "MCP: Add Server" and select it
3. Choose "HTTP Server"
4. Enter server name: `imas`
5. Enter server URL: `https://imas-dd.iter.org/mcp/`

#### Option 2: Manual Configuration

Choose one of these file locations:

- **Workspace Settings (Recommended)**: `.vscode/mcp.json` in your workspace (`Ctrl+Shift+P` → "Preferences: Open Workspace Settings (JSON)")
- **User Settings**: VS Code `settings.json` (`Ctrl+Shift+P` → "Preferences: Open User Settings (JSON)")

Then add this configuration:

```json
{
  "servers": {
    "imas": {
      "type": "http",
      "url": "https://imas-dd.iter.org/mcp/"
    }
  }
}
```

_Note: For user settings.json, wrap the above in `"mcp": { ... }`_

### Claude Desktop Setup

Add to your Claude Desktop config file:

**Windows:** `%APPDATA%\Claude\claude_desktop_config.json`  
**macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json`  
**Linux:** `~/.config/claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "imas-mcp-hosted": {
      "command": "npx",
      "args": ["mcp-remote", "https://imas-dd.iter.org/mcp/"]
    }
  }
}
```
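
If the config file already lists other servers, the entry above has to be merged into the existing `mcpServers` object rather than pasted over the whole file. The following is a minimal Python sketch of that merge, assuming the config is a single JSON object; the helper name `add_mcp_server` is hypothetical and not part of `imas-mcp`:

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Merge one server entry into a Claude Desktop style config file.

    Hypothetical helper for illustration; not part of imas-mcp.
    """
    # Start from the existing config so other servers are preserved.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Demo against a temporary file so no real config is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "echo", "args": []}}}))
    merged = add_mcp_server(
        path, "imas-mcp-hosted", "npx", ["mcp-remote", "https://imas-dd.iter.org/mcp/"]
    )

print(sorted(merged["mcpServers"]))
```

To apply it for real, pass the platform-specific path listed above instead of the temporary file.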

## Quick Start - Local Docker Server

If you have Docker available, you can run a local IMAS MCP server:

### Start the Docker Server

```bash
# Pull and run the server
docker run -d \
  --name imas-mcp \
  -p 8000:8000 \
  ghcr.io/iterorganization/imas-mcp:latest-streamable-http

# Verify it's running
docker ps --filter name=imas-mcp --format "table {{.Names}}\t{{.Status}}"
```
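
`docker ps` only shows that the container is up. To check that the endpoint also answers MCP traffic, one option is to send it an `initialize` request. The sketch below uses only the Python standard library and assumes the container follows the MCP streamable HTTP transport (JSON-RPC over `POST`, with an `Accept` header covering both JSON and server-sent events). The function names, the client name, and the protocol version string are illustrative assumptions, not values taken from this project:

```python
import json
import urllib.request


def build_initialize_request(url: str) -> urllib.request.Request:
    """Build (but do not send) an MCP `initialize` request."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: a version the server accepts
            "capabilities": {},
            "clientInfo": {"name": "readme-check", "version": "0.0.0"},  # placeholder
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may reply with a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def check_server(url: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (needs a running server)."""
    with urllib.request.urlopen(build_initialize_request(url), timeout=timeout) as response:
        return response.status
```

With the container running, `check_server("http://localhost:8000/mcp/")` should return `200` if the assumptions above hold; any other outcome means the client configuration that follows is unlikely to work either.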

### Configure Your Client

**VS Code** - Add to `.vscode/mcp.json`:

```json
{
  "servers": {
    "imas-mcp-docker": {
      "type": "http",
      "url": "http://localhost:8000/mcp/"
    }
  }
}
```

**Claude Desktop** - Add to your config file:

```json
{
  "mcpServers": {
    "imas-mcp-docker": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:8000/mcp/"]
    }
  }
}
```

## Quick Start - Local UV Installation

If you have [uv](https://docs.astral.sh/uv/) installed, you can run the server directly:

### Install and Configure

```bash
# Install imas-mcp with uv
uv tool install imas-mcp

# Or add to a project
uv add imas-mcp
```

### UV Client Configuration

**VS Code** - Add to `.vscode/mcp.json`:

```json
{
  "servers": {
    "imas-mcp-uv": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "python", "-m", "imas_mcp.server"]
    }
  }
}
```

**Claude Desktop** - Add to your config file:

```json
{
  "mcpServers": {
    "imas-mcp-uv": {
      "command": "uv",
      "args": ["run", "python", "-m", "imas_mcp.server"]
    }
  }
}
```

## Example IMAS Queries

Once you have the IMAS MCP server configured, you can interact with it using natural language queries. Use the `@imas` prefix to direct queries to the IMAS server.

### Basic Search Examples

```text
@imas Find data paths related to plasma temperature
@imas Search for electron density measurements
@imas What data is available for magnetic field analysis?
@imas Show me core plasma profiles
```

### Physics Concept Exploration

```text
@imas Explain what equilibrium reconstruction means in plasma physics
@imas What is the relationship between pressure and magnetic fields?
@imas How do transport coefficients relate to plasma confinement?
@imas Describe the physics behind current drive mechanisms
```

### Data Structure Analysis

```text
@imas Analyze the structure of the core_profiles IDS
@imas What are the relationships between equilibrium and core_profiles?
@imas Show me identifier schemas for transport data
@imas Export bulk data for equilibrium, core_profiles, and transport IDS
```

### Advanced Queries

```text
@imas Find all paths containing temperature measurements across different IDS
@imas What physics domains are covered in the IMAS data dictionary?
@imas Show me measurement dependencies for fusion power calculations
@imas Explore cross-domain relationships between heating and confinement
```

### Workflow and Integration

```text
@imas How do I access electron temperature profiles from IMAS data?
@imas What's the recommended workflow for equilibrium analysis?
@imas Show me the branching logic for diagnostic identifier schemas
@imas Export physics domain data for comprehensive transport analysis
```

The IMAS MCP server provides 8 specialized tools for different types of queries:

- **Search**: Natural language and structured search across IMAS data paths
- **Explain**: Physics concepts with IMAS context and domain expertise
- **Overview**: General information about IMAS structure and available data
- **Analyze**: Detailed structural analysis of specific IDS
- **Explore**: Relationship discovery between data paths and physics domains
- **Identifiers**: Exploration of enumerated options and branching logic
- **Bulk Export**: Comprehensive export of multiple IDS with relationships
- **Domain Export**: Physics domain-specific data with measurement dependencies

## Development

For local development and customization:

### Setup

```bash
# Clone repository
git clone https://github.com/iterorganization/imas-mcp.git
cd imas-mcp

# Install development dependencies (search index build takes ~8 minutes first time)
uv sync --all-extras
```

### Build Dependencies

This project requires additional dependencies during the build process that are not part of the runtime dependencies. These include:

- **`imas-data-dictionary`** - Required for generating the search index during build
- **`rich`** - Used for enhanced console output during build processes

**For developers:** These build dependencies are included in the `dev` dependency group and can be installed with:

```bash
uv sync --group dev
```

**Location in configuration:**

- **Build-time dependencies**: Listed under `requires` in the `[build-system]` table of `pyproject.toml`
- **Development access**: Also available in the `dev` group under `[dependency-groups]` for local development

**Note:** Regular users installing the package don't need these dependencies - they're only required when building from source or working with the data dictionary directly.

### Development Commands

```bash
# Run tests
uv run pytest

# Run linting and formatting
uv run ruff check .
uv run ruff format .

# Build schema data structures from IMAS data dictionary
uv run build-schemas

# Build document store and semantic search embeddings
uv run build-embeddings

# Run the server locally
uv run python -m imas_mcp --transport streamable-http --port 8000

# Run with stdio transport for MCP development
uv run python -m imas_mcp --transport stdio --auto-build
```

### Build Scripts

The project includes two separate build scripts for creating the required data structures:

**`build-schemas`** - Creates schema data structures from IMAS XML data dictionary:

- Transforms XML data into optimized JSON format
- Creates catalog and relationship files
- Use `--ids-filter "core_profiles equilibrium"` to build specific IDS
- Use `--force` to rebuild even if files exist

**`build-embeddings`** - Creates document store and semantic search embeddings:

- Builds in-memory document store from JSON data
- Generates sentence transformer embeddings for semantic search
- Caches embeddings for fast loading
- Use `--model-name "all-mpnet-base-v2"` for different models
- Use `--force` to rebuild embeddings cache
- Use `--no-normalize` to disable embedding normalization
- Use `--half-precision` to reduce memory usage
- Use `--similarity-threshold 0.1` to set similarity score thresholds

**Note:** The build hook creates JSON data. Build embeddings separately using `build-embeddings` for better control and performance.
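
The normalization and threshold options of `build-embeddings` are related. When embeddings are scaled to unit length, their dot product equals the cosine similarity, so a threshold such as `0.1` means the same thing for every document. The pure-Python sketch below illustrates that idea with toy 2-D vectors standing in for sentence transformer output. The function names and document labels are hypothetical, and this is not the project's implementation:

```python
import math


def normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def search(query: list[float], docs: dict[str, list[float]], threshold: float = 0.1):
    """Rank documents by cosine similarity and drop those below the threshold."""
    q = normalize(query)
    scored = []
    for name, embedding in docs.items():
        # For unit vectors the dot product is the cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(embedding)))
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy embeddings; real ones have hundreds of dimensions.
docs = {
    "doc_a": [3.0, 4.0],   # same direction as the query
    "doc_b": [-4.0, 3.0],  # orthogonal to the query
    "doc_c": [1.0, 0.0],   # partially aligned
}
results = search([6.0, 8.0], docs, threshold=0.1)
print([name for name, _ in results])
```

Here `doc_b` is filtered out because its similarity to the query is zero, while `doc_a` ranks first even though its raw vector is shorter than the query's.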

### Local Development MCP Configuration

**VS Code** - The repository includes a `.vscode/mcp.json` file with pre-configured development server options. Use the `imas-local-stdio` configuration for local development.

**Claude Desktop** - Add to your config file:

```json
{
  "mcpServers": {
    "imas-local-dev": {
      "command": "uv",
      "args": ["run", "python", "-m", "imas_mcp", "--auto-build"],
      "cwd": "/path/to/imas-mcp"
    }
  }
}
```

## How It Works

1. **Installation**: During package installation, the index builds automatically when the module is first imported
2. **Build Process**: The system parses the IMAS data dictionary and creates comprehensive JSON files with structured data
3. **Embedding Generation**: Creates semantic embeddings using sentence transformers for advanced search capabilities
4. **Serialization**: The system stores indexes in organized subdirectories:
   - **JSON data**: `imas_mcp/resources/schemas/` (LLM-optimized structured data)
   - **Embeddings cache**: Pre-computed sentence transformer embeddings for semantic search
5. **Import**: When the module is imported, the pre-built index and embeddings load in about 1 second
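
Steps 4 and 5 amount to a load-or-build pattern: pay the build cost once, serialize the result, and read it back on later imports. The sketch below shows that pattern with a JSON file. The function names, the file name `catalog.json`, and the stand-in build function are hypothetical, and the real build does far more work than this:

```python
import json
import tempfile
from pathlib import Path


def load_or_build(cache_path: Path, build) -> dict:
    """Return cached data if present; otherwise build it once and cache it."""
    if cache_path.exists():
        return json.loads(cache_path.read_text())  # fast path: reuse serialized data
    data = build()  # slow path: only taken when no cache exists
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data))
    return data


calls = []


def expensive_build() -> dict:
    """Stand-in for parsing the data dictionary; counts how often it runs."""
    calls.append(1)
    return {"ids": ["core_profiles", "equilibrium"]}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "schemas" / "catalog.json"
    first = load_or_build(cache, expensive_build)
    second = load_or_build(cache, expensive_build)

print(first == second, len(calls))
```

The second call returns the same data without running the build again, which is why later imports are fast.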

## Optional Dependencies and Runtime Requirements

The IMAS MCP server uses a composable pattern that allows it to work with or without the `imas-data-dictionary` package at runtime:

### Package Installation Options

- **Runtime only**: `uv add imas-mcp` - Uses pre-built indexes, stdio transport only
- **With HTTP support**: `uv add "imas-mcp[http]"` - Adds support for sse/streamable-http transports
- **With build support**: `uv add "imas-mcp[build]"` - Includes `imas-data-dictionary` for index building
- **Full installation**: `uv add "imas-mcp[all]"` - Includes all optional dependencies

### Data Dictionary Access

The system uses multiple fallback strategies to access the IMAS Data Dictionary version and metadata:

1. **Environment Variable**: `IMAS_DD_VERSION` (highest priority)
2. **Metadata File**: JSON metadata stored alongside indexes
3. **Index Name Parsing**: Extracts version from index filename
4. **IMAS Package**: Direct access to `imas-data-dictionary` (if available)

This design ensures the server can:

- **Build indexes** when the IMAS package is available
- **Run with pre-built indexes** without requiring the IMAS package
- **Access version/metadata** through multiple reliable fallback mechanisms
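
The four strategies can be read as a chain in which each source is tried only if the previous one yields nothing. The sketch below illustrates that chain under stated assumptions: the function name `resolve_dd_version`, the `version` key in the metadata file, the filename pattern, and the version numbers are all hypothetical. Only the `IMAS_DD_VERSION` variable and the ordering come from the list above:

```python
import json
import os
import re
from pathlib import Path
from typing import Callable, Mapping, Optional


def resolve_dd_version(
    environ: Mapping[str, str] = os.environ,
    metadata_path: Optional[Path] = None,
    index_name: Optional[str] = None,
    package_version: Optional[Callable[[], str]] = None,
) -> Optional[str]:
    """Return the first version found, trying sources in priority order."""
    # 1. Environment variable (highest priority).
    if environ.get("IMAS_DD_VERSION"):
        return environ["IMAS_DD_VERSION"]
    # 2. JSON metadata stored next to the index ("version" key is an assumption).
    if metadata_path is not None and metadata_path.exists():
        version = json.loads(metadata_path.read_text()).get("version")
        if version:
            return version
    # 3. Version embedded in the index filename (pattern is an assumption).
    if index_name:
        match = re.search(r"\d+\.\d+\.\d+", index_name)
        if match:
            return match.group(0)
    # 4. Ask the optional package, if a loader was supplied and it imports.
    if package_version is not None:
        try:
            return package_version()
        except ImportError:
            pass
    return None


# The environment variable wins when set; otherwise the filename is parsed.
print(resolve_dd_version(environ={"IMAS_DD_VERSION": "3.42.0"}, index_name="index-4.0.0"))
print(resolve_dd_version(environ={}, index_name="index-4.0.0"))
```

Passing the environment and the package loader as arguments keeps the function easy to test and lets the package stay optional at runtime.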

### Index Building vs Runtime

- **Index Building**: Requires `imas-data-dictionary` package to parse XML and create indexes
- **Runtime Search**: Only requires pre-built indexes and metadata, no IMAS package dependency
- **Version Access**: Uses composable accessor pattern with multiple fallback strategies

## Implementation Details

### Search Implementation

The search system is the core component that provides fast, flexible search capabilities over the IMAS Data Dictionary. It combines efficient indexing with IMAS-specific data processing and semantic search to enable different search modes:

#### Search Methods

1. **Semantic Search** (`SearchMode.SEMANTIC`):

   - AI-powered semantic understanding using sentence transformers
   - Natural language queries with physics context awareness
   - Finds conceptually related terms even without exact keyword matches
   - Best for exploratory research and concept discovery

2. **Lexical Search** (`SearchMode.LEXICAL`):

   - Fast text-based search with exact keyword matching
   - Boolean operators (`AND`, `OR`, `NOT`)
   - Wildcards (`*` and `?` patterns)
   - Field-specific searches (e.g., `documentation:plasma ids:core_profiles`)
   - Fastest performance for known terminology

3. **Hybrid Search** (`SearchMode.HYBRID`):

   - Combines semantic and lexical approaches
   - Provides both exact matches and conceptual relevance
   - Balanced performance and comprehensiveness

4. **Auto Search** (`SearchMode.AUTO`):

   - Intelligent search mode selection based on query characteristics
   - Automatically chooses optimal search strategy
   - Adaptive performance optimization
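
How auto mode inspects "query characteristics" is not specified here. One plausible heuristic is to route queries that use lexical syntax to lexical search and everything else to semantic or hybrid search. The sketch below is an assumption for illustration only: the enum values, the `choose_mode` function, and the two-word cutoff are hypothetical, and only the four mode names come from the list above.

```python
import re
from enum import Enum


class SearchMode(Enum):
    SEMANTIC = "semantic"
    LEXICAL = "lexical"
    HYBRID = "hybrid"
    AUTO = "auto"


# Lexical syntax: boolean operators, `*`, `?` inside a word, or field:value filters.
_LEXICAL_SYNTAX = re.compile(r"\b(?:AND|OR|NOT)\b|\*|\w\?\w|\b\w+:\w+")


def choose_mode(query: str) -> SearchMode:
    """Pick a concrete mode for an AUTO query (hypothetical heuristic)."""
    if _LEXICAL_SYNTAX.search(query):
        return SearchMode.LEXICAL
    if len(query.split()) <= 2:
        # Short keyword-like queries: combine exact and conceptual matching.
        return SearchMode.HYBRID
    return SearchMode.SEMANTIC


for query in (
    "documentation:plasma ids:core_profiles",
    "electron density",
    "What data is available for magnetic field analysis?",
):
    print(f"{query!r} -> {choose_mode(query).name}")
```

The `?` wildcard is only recognized between word characters, so a trailing question mark in a natural language question does not force lexical mode.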

#### Key Capabilities

- **Search Mode Selection**: Choose between semantic, lexical, hybrid, or auto modes
- **Performance Caching**: TTL-based caching system with hit rate monitoring
- **Semantic Embeddings**: Pre-computed sentence transformer embeddings for fast semantic search
- **Physics Context**: Domain-aware search with IMAS-specific terminology
- **Advanced Query Parsing**: Supports complex search expressions and field filtering
- **Relevance Ranking**: Results sorted by match quality and physics relevance
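
As a rough illustration of field filtering, the sketch below separates `field:value` filters from free-text terms in a query such as `documentation:plasma ids:core_profiles`. The `ParsedQuery` class and `parse_query` function are hypothetical names, and the sketch deliberately ignores boolean operators and wildcards, which a real parser would also need to handle:

```python
from dataclasses import dataclass, field


@dataclass
class ParsedQuery:
    terms: list[str] = field(default_factory=list)
    filters: dict[str, list[str]] = field(default_factory=dict)


def parse_query(query: str) -> ParsedQuery:
    """Split `field:value` filters from free-text terms (illustrative only)."""
    parsed = ParsedQuery()
    for token in query.split():
        name, sep, value = token.partition(":")
        if sep and name and value:
            # A well-formed filter: collect values per field name.
            parsed.filters.setdefault(name, []).append(value)
        else:
            parsed.terms.append(token)
    return parsed


parsed = parse_query("documentation:plasma ids:core_profiles temperature")
print(parsed.filters)
print(parsed.terms)
```

Keeping filters in a separate mapping lets the free-text terms go to the ranking step while the filters narrow the candidate set first.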

## Future Work

### MCP Resources Implementation (Phase 2 - Planned)

We plan to implement MCP resources to provide efficient access to pre-computed IMAS data:

#### Planned Resource Features

- **Static JSON IDS Data**: Pre-computed IDS catalog and structure data served as MCP resources
- **Physics Measurement Data**: Domain-specific measurement data and relationships
- **Usage Examples**: Code examples and workflow patterns for common analysis tasks
- **Documentation Resources**: Interactive documentation and API references

#### Resource Types

- `ids://catalog` - Complete IDS catalog with metadata
- `ids://structure/{ids_name}` - Detailed structure for specific IDS
- `ids://physics-domains` - Physics domain mappings and relationships
- `examples://search-patterns` - Common search patterns and workflows

### MCP Prompts Implementation (Phase 3 - Planned)

Specialized prompts for physics analysis and workflow automation:

#### Planned Prompt Categories

- **Physics Analysis Prompts**: Specialized prompts for plasma physics analysis tasks
- **Code Generation Prompts**: Generate Python analysis code for IMAS data
- **Workflow Automation Prompts**: Automate complex multi-step analysis workflows
- **Data Validation Prompts**: Create validation approaches for IMAS measurements

#### Prompt Templates

- `physics-explain` - Generate comprehensive physics explanations
- `measurement-workflow` - Create measurement analysis workflows
- `cross-ids-analysis` - Analyze relationships between multiple IDS
- `imas-python-code` - Generate Python code for data analysis

### Performance Optimization (Phase 4 - In Progress)

Continued optimization of search and tool performance:

#### Current Optimizations (Implemented)

- ✅ **Search Mode Selection**: Multiple search modes (semantic, lexical, hybrid, auto)
- ✅ **Search Caching**: TTL-based caching with hit rate monitoring for search operations
- ✅ **Semantic Embeddings**: Pre-computed sentence transformer embeddings
- ✅ **ASV Benchmarking**: Automated performance monitoring and regression detection
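
To make "TTL-based caching with hit rate monitoring" concrete, the sketch below implements a tiny cache whose entries expire after a fixed time and which counts hits and misses. It uses only the standard library and is not the project's code. The package does depend on `cachetools`, which ships a `TTLCache` class, but whether the server uses it this way is an assumption. The class and method names here are hypothetical.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny TTL cache that counts hits and misses (stdlib-only sketch)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.hits = 0
        self.misses = 0
        self._store: dict[Hashable, tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            self.hits += 1
            return entry[1]
        # Missing or expired: recompute and restart the entry's lifetime.
        self.misses += 1
        value = compute()
        self._store[key] = (now, value)
        return value

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Fake clock so expiry can be demonstrated without sleeping.
now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.get_or_compute("electron density", lambda: ["result"])  # miss
cache.get_or_compute("electron density", lambda: ["result"])  # hit
now[0] = 61.0
cache.get_or_compute("electron density", lambda: ["result"])  # expired, miss
print(cache.hits, cache.misses, round(cache.hit_rate, 2))
```

Injecting the clock is a design choice that keeps the expiry logic testable; in production the default `time.monotonic` would be used.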

#### Planned Optimizations

- **Advanced Caching Strategy**: Intelligent cache management for all MCP operations (beyond search)
- **Performance Monitoring**: Enhanced metrics tracking and analysis across all tools
- **Multi-Format Export**: Optimized export formats (raw, structured, enhanced)
- **Selective AI Enhancement**: Conditional AI enhancement based on request context

### Testing and Quality Assurance (Phase 5 - Planned)

Comprehensive testing strategy for all MCP components:

#### Test Implementation Goals

- **MCP Tool Testing**: Complete test coverage using FastMCP 2 testing framework
- **Resource Testing**: Validation of all MCP resources and data integrity
- **Prompt Testing**: Automated testing of prompt templates and responses
- **Performance Testing**: Benchmarking and regression detection for all tools

## Docker Usage

The server is available as a pre-built Docker container with the index already built:

```bash
# Pull and run the latest container
docker run -d -p 8000:8000 ghcr.io/iterorganization/imas-mcp:latest

# Or use Docker Compose
docker-compose up -d
```

See [DOCKER.md](DOCKER.md) for detailed container usage, deployment options, and troubleshooting.

[python-badge]: https://img.shields.io/badge/python-3.12-blue
[python-link]: https://www.python.org/downloads/
[ruff-badge]: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json
[ruff-link]: https://docs.astral.sh/ruff/
[pre-commit-badge]: https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white
[pre-commit-link]: https://github.com/pre-commit/pre-commit
[build-deploy-badge]: https://img.shields.io/github/actions/workflow/status/simon-mcintosh/imas-mcp/test.yml?branch=main&color=brightgreen&label=CI%2FCD
[build-deploy-link]: https://github.com/iterorganization/imas-mcp/actions/workflows/test.yml
[codecov-badge]: https://codecov.io/gh/simon-mcintosh/imas-mcp/graph/badge.svg
[codecov-link]: https://codecov.io/gh/simon-mcintosh/imas-mcp
[docs-badge]: https://img.shields.io/badge/docs-online-brightgreen
[docs-link]: https://simon-mcintosh.github.io/imas-mcp/
[asv-badge]: https://img.shields.io/badge/ASV-Benchmarks-blue?style=flat&logo=speedtest&logoColor=white
[asv-link]: https://simon-mcintosh.github.io/imas-mcp/benchmarks/
