Metadata-Version: 2.4
Name: key-drivers-mcp
Version: 0.1.1
Summary: MCP server for key driver and feature importance analysis based on rule mining
Project-URL: Homepage, https://github.com/petrmasa/key_drivers_mcp
Project-URL: Issues, https://github.com/petrmasa/key_drivers_mcp/issues
Author-email: Petr Masa <code@cleverminer.org>
License: AGPL-3.0-only
License-File: LICENSE
Keywords: araxai,cleverminer,driver-analysis,feature-importance,key-drivers,mcp,xai
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU Affero General Public License v3
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: araxai>=0.3.0
Requires-Dist: mcp[cli]>=1.0.0
Requires-Dist: numpy>=1.26.0
Requires-Dist: pandas>=2.0.0
Description-Content-Type: text/markdown

# key-drivers-mcp

<!-- mcp-name: io.github.petrmasa/key-drivers-mcp -->

MCP server for key driver and feature importance analysis. Load any CSV dataset and ask what drives an outcome — survival, credit default, diagnosis, income — and get a ranked breakdown with sub-driver analysis showing not just which factors matter, but how they combine to amplify or completely reverse each other.

Powered by [araxai](https://github.com/petrmasa/araxai) (CleverMiner association rule analysis).

## Configuration

Add this to your MCP client config (e.g. Claude Code `.mcp.json`). No installation needed — `uvx` fetches and runs the package automatically.

```json
{
  "mcpServers": {
    "key-drivers": {
      "type": "stdio",
      "command": "uvx",
      "args": ["key-drivers-mcp"]
    }
  }
}
```

> **No `uv`?** Install it with `pip install uv`, or run `pipx install key-drivers-mcp` and set `"command": "key-drivers-mcp"` with `"args": []` instead.

## Tools

| Tool | Purpose |
|------|---------|
| `load_dataset` | Load a CSV file into session memory |
| `list_datasets` | List all loaded datasets |
| `find_drivers` | Find key drivers of a target outcome |
| `explain_segment` | Driver analysis conditioned on a segment variable (CLARA) |
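
MCP clients invoke these tools through JSON-RPC `tools/call` requests. The sketch below builds such a request for `find_drivers` to show the general shape of a call. The argument names `dataset` and `target` are hypothetical placeholders, not confirmed parameter names; only `filters` appears in the examples in this README. Check the tool schema reported by the server for the actual names.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return an MCP `tools/call` JSON-RPC 2.0 request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `dataset` and `target` are assumed argument names used for illustration only.
request = build_tool_call(
    1,
    "find_drivers",
    {
        "dataset": "titanic",
        "target": "survived",
        "filters": {"sex": "female", "pclass": "3"},
    },
)
print(json.dumps(request, indent=2))
```

In practice the MCP client builds and sends this message itself, so the sketch is mainly useful for reading client logs or debugging a tool call.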

## Examples

### Titanic — what drove survival?

> *"What are the key drivers to survive in titanic.csv?"*

**Baseline survival rate: 38.4%**

| Driver | Survival rate | vs Baseline |
|--------|--------------|-------------|
| Sex: female | 74.2% | 1.9× higher |
| Sex: male | 18.9% | 2.0× lower |
| Low fare ≤ £10.50 | 20.9% | 1.8× lower |
| Deck D | 75.8% | 2.0× higher |

Sub-drivers are returned automatically. Within **male passengers**, 1st class men recovered to 36.9% — nearly double the male average. Within **low-fare passengers**, women still survived at 60.8% while men reached only 10.7%.
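
The "vs Baseline" column can be reproduced from the two rates alone. The helper below is a minimal sketch of that arithmetic, not the server's own code; the function name `vs_baseline` is made up for illustration, and it assumes both rates are greater than zero.

```python
def vs_baseline(rate, baseline):
    """Format a segment rate against a baseline as 'N.N× higher' or 'N.N× lower'."""
    if rate >= baseline:
        return f"{rate / baseline:.1f}× higher"
    # For rates below the baseline the ratio is inverted, so the factor stays >= 1.
    return f"{baseline / rate:.1f}× lower"


BASELINE = 38.4  # Titanic baseline survival rate from the table above, in percent

print(vs_baseline(74.2, BASELINE))  # Sex: female -> 1.9× higher
print(vs_baseline(18.9, BASELINE))  # Sex: male   -> 2.0× lower
```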

---

### Titanic — drill-down from global to a specific segment

> *"And within women in 3rd class, what helped survival?"*

The global result shows **sex** as the top driver (women 74.2%, men 18.9%). Sub-drivers within women immediately reveal that **3rd class women dropped to 50%** — a coin flip, far below the female average. That triggers a follow-up with `filters={"sex": "female", "pclass": "3"}`:

**144 women in 3rd class — segment baseline: 50%**

| Driver | Survival rate | vs Segment baseline |
|--------|--------------|---------------------|
| Embarked at Queenstown | 72.7% | 1.5× higher |
| Fare £6.75–£7.77 | 72.4% | 1.4× higher |
| Embarked at Southampton | 37.5% | 1.3× lower |

Queenstown passengers (mostly Irish emigrants boarding late in small groups) survived at nearly twice the rate of Southampton passengers — a pattern completely invisible in the global analysis. Each drill-down level answers a narrower question using the previous result as the starting point.
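
Conceptually, a drill-down restricts the rows to the filtered segment and recomputes the baseline there before looking for drivers. The following is a rough sketch of that first step using only the standard library. The rows are made-up toy data, `segment_baseline` is a hypothetical helper, and comparing values as strings is an assumption about how filter values such as `"3"` are matched.

```python
def segment_baseline(rows, target, positive, filters):
    """Return (segment size, share of matching rows where target == positive)."""
    # Compare as strings so that the filter value "3" matches the integer 3.
    segment = [
        row for row in rows
        if all(str(row[col]) == value for col, value in filters.items())
    ]
    if not segment:
        return 0, None
    hits = sum(1 for row in segment if row[target] == positive)
    return len(segment), hits / len(segment)


# Toy rows invented for illustration; not taken from titanic.csv.
rows = [
    {"sex": "female", "pclass": 3, "survived": 1},
    {"sex": "female", "pclass": 3, "survived": 0},
    {"sex": "female", "pclass": 1, "survived": 1},
    {"sex": "male", "pclass": 3, "survived": 0},
]

size, rate = segment_baseline(rows, "survived", 1, {"sex": "female", "pclass": "3"})
print(size, rate)  # 2 0.5
```

Every "vs Segment baseline" figure in the table is then measured against this recomputed rate rather than the global one.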

---

### German Credit — how factors combine and reverse each other

> *"What are the key drivers for good credit?"*

**Baseline: 70% good credit rating**

An overdrawn checking account drops the good-credit rate to 50.7%, but the sub-driver analysis shows that the outcome depends sharply on what else is true:

| Profile | Good credit rate | vs Baseline |
|---------|-----------------|-------------|
| Overdrawn checking account | 50.7% | 1.4× lower |
| Overdrawn + loan duration > 24 months | 34.4% | **2.0× lower** |
| Overdrawn + critical credit history | 73.1% | back to baseline |
| Long loan duration > 30 months | 52.0% | 1.3× lower |
| Long loan + no property | 38.9% | 1.8× lower |
| Long loan + no checking account | 79.3% | 1.1× higher |

The same risk factor (overdrawn account) leads to very different outcomes depending on credit history. Borrowers with no checking account are actually safer on long loans — likely self-employed or asset-wealthy.

---

### Diabetes — combinations push risk above 80%

> *"What are the key drivers for testing positive for diabetes?"*

**Baseline: 34.9% positive**

| Driver | Probability | vs Baseline |
|--------|-------------|-------------|
| Glucose > 147 mg/dL | 74.3% | 2.1× higher |
| Glucose > 147 + age 27–33 | 88.5% | 2.5× higher |
| Glucose > 147 + BMI 33.7–37.8 | 84.2% | 2.4× higher |
| Glucose > 147 + many pregnancies (>7) | 85.7% | 2.5× higher |
| Glucose ≤ 109 mg/dL | 14.0% | 2.5× lower |
| Age ≤ 23 | 13.3% | 2.6× lower |

High glucose is already a strong signal (74%), but combining it with age 27–33, elevated BMI, or high pregnancy count pushes risk above 84%. The tool surfaces these compound profiles in a single call.

---

### Income — education can completely override marital status

> *"What drives income above $50K for women specifically?"*

Using `filters={"sex": "Female"}` — women's baseline: 10.9% (vs 23.9% overall):

| Profile | >50K rate | vs Women's baseline |
|---------|-----------|---------------------|
| Doctorate | 56.6% | 5.2× higher |
| Prof-school | 47.7% | 4.4× higher |
| Doctorate + married | 88.0% | **8.1× higher** |
| Prof-school + married | 84.2% | 7.7× higher |
| Prof-school + never-married | 35.7% | 3.3× higher |
| Own-child relationship | 1.2% | 9.0× lower |

Never-married women with a professional-school degree still reach 35.7%, more than three times the women's baseline, showing that education can more than offset the marital status penalty. The same inversion appears in the overall dataset: never-married alone → 4.5%, but never-married + Doctorate → 44.3%, almost twice the global baseline.

---

## How it works

araxai uses association rule analysis (CleverMiner) to find statistically significant rules that explain why a target class occurs. Each driver rule reports:

- **probability** — how often the target class occurs in that segment
- **vs_global_baseline** — lift relative to the whole dataset
- **vs_parent_segment** — lift relative to the parent rule (for sub-drivers)
- **strength** — `+`/`-` signs indicating rule reliability
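
To make the two lift fields concrete, the sketch below computes both for the sub-driver "Overdrawn + loan duration > 24 months" from the German Credit example. It assumes lift is the plain ratio of probabilities; the `Driver` class and `lifts` function are hypothetical names, and the exact representation araxai returns may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Driver:
    label: str
    probability: float
    parent_probability: Optional[float] = None  # None for top-level drivers


def lifts(driver, global_baseline):
    """Return (vs_global_baseline, vs_parent_segment) as ratios rounded to 2 places."""
    vs_global = round(driver.probability / global_baseline, 2)
    if driver.parent_probability is None:
        return vs_global, None
    return vs_global, round(driver.probability / driver.parent_probability, 2)


# Rates from the German Credit table: the parent rule is "Overdrawn" at 50.7%.
sub = Driver("Overdrawn + loan duration > 24 months", 0.344, parent_probability=0.507)
print(lifts(sub, global_baseline=0.70))  # (0.49, 0.68)
```

A ratio of 0.49 against the global baseline corresponds to the "2.0× lower" shown in the table, while 0.68 against the parent shows how much of that drop the second condition adds on top of the overdrawn account.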

Numeric columns are automatically binned into quantiles. The server enriches every top-level driver with a sub-analysis, so compound profiles like "overdrawn + long loan" or "high glucose + age 27–33" are returned in a single call.
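
Quantile binning puts roughly the same number of rows into each bin, which is where ranges such as "Glucose ≤ 109" come from. The sketch below illustrates the idea with the standard library. The number of bins, the interpolation method, and the right-closed intervals are assumptions for illustration, not araxai's confirmed settings, and the values are made up.

```python
from bisect import bisect_left
from statistics import quantiles


def quantile_bins(values, n_bins=4):
    """Return (cut points, bin index per value) using equal-frequency cut points."""
    cuts = quantiles(values, n=n_bins, method="inclusive")
    # bisect_left places a value equal to a cut point in the lower bin,
    # giving right-closed intervals such as "<= 98.75".
    return cuts, [bisect_left(cuts, v) for v in values]


# Made-up glucose-like readings, for illustration only.
readings = [80, 95, 100, 110, 120, 135, 150, 170]
cuts, bins = quantile_bins(readings)
print(cuts)  # [98.75, 115.0, 138.75]
print(bins)  # [0, 0, 1, 1, 2, 2, 3, 3]
```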

## Requirements

- Python 3.11+
- `araxai >= 0.3.0`
- `mcp[cli] >= 1.0.0`
- `numpy >= 1.26.0`
- `pandas >= 2.0.0`
