Metadata-Version: 2.4
Name: ollama-almasrv-mcp
Version: 1.0.0
Summary: MCP server for accessing Ollama models on a remote GPU server via Model Context Protocol
Project-URL: Homepage, https://github.com/VanKasins/ollama-almasrv-mcp
Project-URL: Issues, https://github.com/VanKasins/ollama-almasrv-mcp/issues
Author-email: Noesis AI <kasimeris.evagelos@gmail.com>
License: MIT
License-File: LICENSE
Keywords: claude-code,embeddings,llm,mcp,model-context-protocol,ollama
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: httpx>=0.24.0
Requires-Dist: mcp>=1.0.0
Description-Content-Type: text/markdown

# ollama-almasrv-mcp

An MCP (Model Context Protocol) server that gives MCP clients such as Claude Code access to Ollama models running on a remote GPU server.

Designed for [ALMASRV](https://github.com/noesis-ai) (NVIDIA RTX PRO 6000, 96 GB VRAM), but works with any Ollama instance behind a compatible HTTP gateway.

## Features

- **6 MCP tools**: chat, think, embed, similarity, models, health
- **10 models**: 4 local (llama3.3:70b, qwen3:32b, llama3.2:3b, mxbai-embed-large) + 6 cloud
- **1024-dim embeddings** compatible with SQL Server 2025 `VECTOR(1024)`
- **Thinking models** with reasoning traces (qwen3:32b, kimi-k2-thinking, glm-5, kimi-k2.5, minimax-m2.5)
- **Configurable endpoints** via environment variables
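
To illustrate the `VECTOR(1024)` compatibility mentioned above, here is a minimal sketch of preparing an embedding for storage. It assumes the embedding is already available as a Python list of 1024 numbers, and that the target column accepts a JSON array string (which is how SQL Server's vector type is documented to take input). The helper name `to_vector_literal` is hypothetical and not part of this package.

```python
import json

# Dimension stated in this README for the embedding model.
EMBEDDING_DIM = 1024


def to_vector_literal(embedding, dim=EMBEDDING_DIM):
    """Serialize an embedding as a JSON array string.

    Hypothetical helper: checks the length so a mismatched vector fails
    here rather than at insert time.
    """
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} values, got {len(embedding)}")
    return json.dumps([float(x) for x in embedding])
```

The resulting string can then be passed as a query parameter and cast to the vector type on the database side.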

## Quick Install

```bash
pip install ollama-almasrv-mcp
```

## Setup with Claude Code

```bash
# Add to Claude Code (user-level, available everywhere)
claude mcp add --scope user ollama-almasrv -- ollama-almasrv-mcp

# Or with custom server URLs
claude mcp add --scope user \
  -e ALMASRV_GATEWAY_URL=http://your-server:8030 \
  -e ALMASRV_EMBED_URL=http://your-server:8031 \
  ollama-almasrv -- ollama-almasrv-mcp
```

## Available Tools

| Tool | Description |
|------|-------------|
| `ollama_chat` | Chat with any Ollama model (default: qwen3:32b) |
| `ollama_think` | Chat with thinking models that return reasoning traces |
| `ollama_embed` | Generate 1024-dim embedding vectors |
| `ollama_similarity` | Calculate cosine similarity between two texts |
| `ollama_models` | List all available models |
| `ollama_health` | Check gateway and embedding service health |
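
For reference, the score reported by `ollama_similarity` is described as cosine similarity. The sketch below shows that computation locally, assuming two embedding vectors are already available as lists of floats. The function name `cosine_similarity` is illustrative and not part of this package's API; the server may compute the score differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (illustrative helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle to a zero vector is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate vectors pointing in nearly the same direction, 0 indicates orthogonal vectors, and -1 indicates opposite directions.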

## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
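| `ALMASRV_GATEWAY_URL` | `http://192.168.50.78:8030` | Ollama gateway (chat/think/models). The default is a private LAN address; override it for your own server. |
| `ALMASRV_EMBED_URL` | `http://192.168.50.78:8031` | Embedding service (embed/similarity). The default is a private LAN address; override it for your own server. |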
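
The fallback behaviour in the table above can be sketched as follows. This mirrors the documented variable names and defaults only; it is not necessarily how the package reads its configuration. The function name `resolve_urls` and the trailing-slash trimming are assumptions made for illustration.

```python
import os

# Defaults copied from the configuration table above.
DEFAULT_GATEWAY_URL = "http://192.168.50.78:8030"
DEFAULT_EMBED_URL = "http://192.168.50.78:8031"


def resolve_urls(env=None):
    """Return (gateway_url, embed_url), falling back to the documented defaults.

    `env` is any mapping of variable names to values; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Trimming a trailing slash is an assumption, so paths can be appended safely.
    gateway = env.get("ALMASRV_GATEWAY_URL", DEFAULT_GATEWAY_URL).rstrip("/")
    embed = env.get("ALMASRV_EMBED_URL", DEFAULT_EMBED_URL).rstrip("/")
    return gateway, embed
```

Passing an explicit mapping, for example `resolve_urls({"ALMASRV_GATEWAY_URL": "http://your-server:8030"})`, makes the behaviour easy to check without touching the real environment.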

## Requirements

- Python >= 3.10
- A running [Ollama](https://ollama.ai) instance with a compatible HTTP gateway
- Gateway endpoints (served at `ALMASRV_GATEWAY_URL`): `/chat`, `/models`, `/health`
- Embedding service endpoints (served at `ALMASRV_EMBED_URL`): `/embeddings`, `/similarity`, `/health`
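
Before registering the server, it can help to confirm that both services answer on their `/health` endpoints. The sketch below uses only the Python standard library and checks the HTTP status alone, since the response body format is not documented here. The helper names `health_urls` and `is_reachable` are hypothetical.

```python
import urllib.request


def health_urls(gateway_url, embed_url):
    """Build the two /health URLs listed in the requirements above."""
    return {
        "gateway": gateway_url.rstrip("/") + "/health",
        "embedding": embed_url.rstrip("/") + "/health",
    }


def is_reachable(url, timeout=5.0):
    """Return True if a GET on `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers refused connections, DNS failures, timeouts and HTTP error statuses.
        return False
```

For example, looping over `health_urls("http://your-server:8030", "http://your-server:8031").items()` and calling `is_reachable` on each URL shows which of the two services is down.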

## License

MIT
