Metadata-Version: 2.4
Name: mcp-automl
Version: 0.1.4
Summary: MCP server for end-to-end machine learning
Author-email: ke <idea7766@gmail.com>
License-File: LICENSE
Requires-Python: <3.12,>=3.10
Requires-Dist: duckdb>=1.4.3
Requires-Dist: joblib<1.4
Requires-Dist: mcp>=1.21.2
Requires-Dist: pandas<2.2.0
Requires-Dist: pycaret>=3.0.0
Requires-Dist: scikit-learn<1.4
Requires-Dist: tabulate>=0.9.0
Description-Content-Type: text/markdown

# MCP AutoML

MCP AutoML is a server that enables AI agents to perform end-to-end machine learning workflows, including data inspection, data processing, and model training. With MCP AutoML, AI agents can do more than a typical AutoML framework: they can identify the target, set a baseline, and create features by themselves.

MCP AutoML separates tools from workflows, allowing you to create your own workflow.

## Features

- **Data Inspection**: Analyze datasets with comprehensive statistics, data types, and previews
- **SQL-based Data Processing**: Transform and engineer features using DuckDB SQL queries
- **AutoML Training**: Train classification and regression models with automatic model comparison using PyCaret
- **Prediction**: Make predictions using trained models
- **Multi-format Support**: Works with CSV, Parquet, and JSON files

## Usage

### Configure MCP Server

Add the following to your MCP client configuration (e.g., Claude Desktop, Gemini CLI, Cursor, Antigravity):

```json
{
  "mcpServers": {
    "mcp-automl": {
      "command": "uvx",
      "args": ["--python", "3.11", "mcp-automl"]
    }
  }
}
```

**Or using Docker:**

```json
{
  "mcpServers": {
    "mcp-automl": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "${PWD}:/workspace", "-v", "${HOME}/.mcp-automl:/root/.mcp-automl", "idea7766/mcp-automl:latest"]
    }
  }
}
```
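
If you would rather script the change than edit the file by hand, the sketch below merges the `uvx` entry shown above into an existing JSON configuration without touching other servers. The `add_server` helper is hypothetical, and the location of the configuration file depends on your MCP client, so the path is left for you to supply.

```python
import json
import tempfile
from pathlib import Path

# Server entry copied from the uvx configuration above.
SERVER_ENTRY = {"command": "uvx", "args": ["--python", "3.11", "mcp-automl"]}


def add_server(config_path: Path, name: str = "mcp-automl", entry: dict = SERVER_ENTRY) -> dict:
    """Add or overwrite one entry under "mcpServers", keeping all other keys."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Demo against a throwaway file; replace with your client's real config path.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "mcp_config.json"
        print(json.dumps(add_server(path), indent=2))
```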


### Available Tools

| Tool | Description |
|------|-------------|
| `inspect_data` | Get comprehensive statistics and preview of a dataset |
| `query_data` | Execute DuckDB SQL queries on data files |
| `process_data` | Transform data using SQL and save to a new file |
| `train_classifier` | Train a classification model with AutoML |
| `train_regressor` | Train a regression model with AutoML |
| `predict` | Make predictions using a trained model |

## Agent Skill

MCP AutoML includes a **data science workflow skill** that guides AI agents through best practices for machine learning projects. This skill teaches agents to:

- Identify targets and establish baselines
- Perform exploratory data analysis
- Engineer domain-specific features
- Train and evaluate models systematically

### Installing the Skill

**For Gemini CLI:**

```bash
gemini skills install https://github.com/idea7766/mcp-automl --path skill/data-science-workflow
```

**For Claude Code:**

```bash
# Clone the repo and copy the skill
git clone https://github.com/idea7766/mcp-automl.git
cp -r mcp-automl/skill/data-science-workflow ~/.claude/skills/
```

The skill file is located at `skill/data-science-workflow/SKILL.md`.

## Configuration

Models and experiments are saved to `~/.mcp-automl/experiments/` by default.

## Troubleshooting

### macOS: LightGBM OpenMP Error

If you encounter an error like `Library not loaded: @rpath/libomp.dylib`, you need to install OpenMP:

```bash
brew install libomp
```

This is a system-level dependency required by LightGBM on macOS. Linux and Windows users typically don't need this step.

## Dependencies

- [PyCaret](https://pycaret.org/) - AutoML library
- [DuckDB](https://duckdb.org/) - Fast SQL analytics
- [MCP](https://github.com/modelcontextprotocol/python-sdk) - Model Context Protocol SDK
