Metadata-Version: 2.4
Name: rostaing-mcp
Version: 0.1.1
Summary: A universal Model Content Protocol (MCP) for Pandas and Polars DataFrames, designed to be used as a tool by Large Language Models (LLMs).
Author-email: Davila Rostaing <rostaingdavila@gmail.com>
Project-URL: Homepage, https://github.com/Rostaing/rostaing-mcp
Project-URL: Bug Tracker, https://github.com/Rostaing/rostaing-mcp/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.0
Requires-Dist: polars>=0.15
Requires-Dist: plotly>=5.0.0
Requires-Dist: kaleido>=0.2.1
Requires-Dist: pingouin>=0.5.0
Requires-Dist: scipy>=1.10.0
Requires-Dist: lifelines>=0.27.0
Requires-Dist: statsmodels>=0.14.0
Dynamic: license-file

# Rostaing's Model Content Protocol (rostaing-mcp)

<p align="center">
  <a href="https://pypi.org/project/rostaing-mcp/"><img src="https://img.shields.io/pypi/v/rostaing-mcp?color=blue&label=PyPI%20version" alt="PyPI version"></a>
  <a href="https://pypi.org/project/rostaing-mcp/"><img src="https://img.shields.io/pypi/pyversions/rostaing-mcp.svg" alt="Python versions"></a>
  <a href="https://github.com/Rostaing/rostaing-mcp/blob/main/LICENSE"><img src="https://img.shields.io/pypi/l/rostaing-mcp.svg" alt="License"></a>
  <a href="https://pepy.tech/project/rostaing-mcp"><img src="https://static.pepy.tech/badge/rostaing-mcp" alt="Downloads"></a>
</p>

A universal Model Content Protocol (MCP) for **Pandas** and **Polars** DataFrames, designed to be used as a powerful tool by Large Language Models (LLMs) like GPT-4, Claude, and Mistral.

This package allows LLMs to interact securely with data, performing advanced tasks ranging from simple filtering to **statistical hypothesis testing** and **data visualization** (returning viewable images).

## Key Features

*   **Library Agnostic:** Seamlessly handles `pandas` and `polars` DataFrames.
*   **Visual Intelligence:** Generates Plotly charts and returns them as **Base64-encoded PNG images**, so an LLM that accepts image input can "see" the charts it requests.
*   **Statistical Suite:** Built-in support for T-tests, ANOVA, Chi-Squared, Normality tests (Shapiro), and Survival Analysis (Logrank).
*   **Smart NLU (Fuzzy Matching):** Automatically resolves approximate or misspelled column names (e.g., understands that "salary" refers to "Salary_USD").
*   **Stateful Analysis:** Filters and modifications persist within the agent's session context.
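
The fuzzy column matching described above can be illustrated with the standard library. This is a minimal sketch, not the package's actual implementation: `resolve_column` and its `cutoff` parameter are hypothetical names, and it relies on `difflib` similarity scores, which may differ from the matching strategy `rostaing-mcp` uses internally.

```python
from difflib import get_close_matches


def resolve_column(requested, columns, cutoff=0.6):
    """Return the real column name closest to `requested`, or None.

    Hypothetical helper for illustration only; not part of rostaing-mcp.
    """
    # Compare case-insensitively, but remember the original spelling.
    lowered = {col.lower(): col for col in columns}
    key = requested.lower()
    if key in lowered:
        return lowered[key]
    # Fall back to the single best fuzzy match above the similarity cutoff.
    matches = get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


columns = ["Employee_Name", "Salary_USD", "Department"]
print(resolve_column("salary", columns))
print(resolve_column("departement", columns))
print(resolve_column("zzz", columns))
```

A lower `cutoff` tolerates larger differences at the cost of more false matches, so a tool exposed to an LLM would probably want to report which column it actually resolved to.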

## Installation

```bash
pip install rostaing-mcp
```
*Note: This also installs the required dependencies, including `plotly`, `kaleido` (for image generation), `pingouin`, `scipy`, `lifelines`, and `statsmodels`. Python 3.10 or later is required.*

## Quick Start

```python
import pandas as pd
from rostaing_mcp import DataFrameAgent, DataFrameToolHandler

# 1. Create a sample DataFrame
data = {
    'employee_name': ['Rostaing', 'Lucrèce', 'Isnard', 'Charline', 'Dacier', 'Nora'],
    'salary': [100000, 70000, 85000, 60000, 95000, 62000],
    'experience_level': ['Expert', 'Senior', 'Manager', 'Junior', 'Principal', 'Mid-Level'],
    'department': ['AI', 'Sales', 'AI', 'Sales', 'AI', 'Sales']
}
df = pd.DataFrame(data)

# 2. Initialize the core agent (Works with Polars too!)
data_agent = DataFrameAgent(df, source_description="Employee salary data")

# 3. Wrap it in the tool handler
df_tool = DataFrameToolHandler(data_agent)

# --- EXAMPLES OF DIRECT USAGE ---

# A. Inspect Data
print(df_tool.get_schema())

# B. Statistical Test (e.g., T-test between groups)
# Note: column names are fuzzy-matched, so 'Department' would also resolve to 'department'
print(df_tool.perform_t_test(a='salary', group='department'))

# C. Visualization (Returns Base64 Image string)
# The LLM can call this to generate a chart
image_data = df_tool.plot_bar_chart(x='employee_name', y='salary', color='department')
print("Chart generated successfully (Base64 data ready).")
```
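
To inspect a chart outside of an LLM conversation, the Base64 string can be decoded back into a PNG file. The sketch below assumes the plotting tools return either a plain Base64 string or a `data:image/png;base64,...` URI; the exact return format is not specified here, and `save_base64_png` is a hypothetical helper, not part of the package.

```python
import base64
from pathlib import Path

# Every valid PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_base64_png(image_data: str, path: str) -> int:
    """Decode a Base64 PNG string, write it to `path`, return the byte count.

    Hypothetical helper for illustration only.
    """
    # Assumption: the string may carry a data-URI prefix; drop it if present.
    if image_data.startswith("data:"):
        image_data = image_data.split(",", 1)[1]
    raw = base64.b64decode(image_data)
    # Fail early if the payload is not a PNG (e.g., an error message string).
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not look like a PNG image")
    Path(path).write_bytes(raw)
    return len(raw)


# Usage with the Quick Start result (uncomment once `image_data` exists):
# save_base64_png(image_data, "salary_by_department.png")
```

Checking the PNG signature before writing catches the case where a tool returned an error message instead of image data.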

## Integration with LLM Agents (e.g., Upsonic, LangChain)

To let an LLM use all available tools, you can pass the methods dynamically or wrap them in a proxy class.

```python
from upsonic import Agent, Task

# Get the list of all callable tools for the LLM
# (df_tool is the DataFrameToolHandler created in the Quick Start)
tools_list = df_tool.get_all_tools()

task = Task(
    description="Analyze the salary distribution and plot a bar chart by department.",
    tools=tools_list
)

agent = Agent(model="openai/gpt-4o", name="Data Analyst")
result = agent.do(task)
print(result)
```
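
If you prefer to pass methods dynamically, for example to expose only a subset of tools, one option is to collect the handler's public bound methods yourself. This is a sketch under assumptions: `collect_tools` and `DemoHandler` are hypothetical stand-ins, and this is not how `get_all_tools()` is implemented.

```python
import inspect


def collect_tools(handler):
    """Return the public bound methods of `handler` as a list of callables.

    Hypothetical helper for illustration only.
    """
    tools = []
    # getmembers returns (name, member) pairs sorted by name.
    for name, member in inspect.getmembers(handler, predicate=inspect.ismethod):
        if not name.startswith("_"):
            tools.append(member)
    return tools


class DemoHandler:
    """Stand-in for a tool handler; not part of rostaing-mcp."""

    def get_schema(self):
        return {"salary": "int64"}

    def plot_bar_chart(self, x, y):
        return "<base64 png>"

    def _private_helper(self):
        return None


tools = collect_tools(DemoHandler())
print([tool.__name__ for tool in tools])
```

Because the collected items are bound methods, each one keeps its name, signature, and docstring, which is what most agent frameworks read to describe a tool to the model.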

## Available Tools

### 📊 Visualization
*   `plot_histogram`, `plot_bar_chart`, `plot_line_chart`
*   `plot_scatter_plot`, `plot_box_plot`, `plot_violin_plot`
*   `plot_heatmap`, `plot_pie_chart`, `plot_3d_scatter`, and more.

### 🧮 Statistics
*   `get_summary_statistics`, `get_correlation_matrix`
*   `perform_normality_test` (Shapiro-Wilk)
*   `perform_t_test` (Student's t-test)
*   `perform_anova` (One-way ANOVA)
*   `perform_chi2_test` (Independence)
*   `perform_logrank_test` (Survival analysis)
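
As a reference point for what a Student's t-test computes, the sketch below derives the pooled-variance t statistic by hand for the Quick Start salaries, grouped by department. It uses only the standard library and is not the package's implementation; `student_t_statistic` is a hypothetical name. Obtaining a p-value additionally requires the t distribution, for example via `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def student_t_statistic(group_a, group_b):
    """Two-sample Student's t statistic, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err


# Salaries from the Quick Start DataFrame, split by department.
ai_salaries = [100000, 85000, 95000]
sales_salaries = [70000, 60000, 62000]

t_stat = student_t_statistic(ai_salaries, sales_salaries)
dof = len(ai_salaries) + len(sales_salaries) - 2
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

With only three observations per group, the result is illustrative at best; the normality test listed above is a sensible check before relying on a t-test.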

### 🛠 Manipulation
*   `filter_rows` (Complex conditions supported)
*   `sort_values`
*   `select_columns`

## Useful Links
- [Author's LinkedIn](https://www.linkedin.com/in/davila-rostaing/)
- [Author's YouTube Channel](https://www.youtube.com/@RostaingAI?sub_confirmation=1)
- [GitHub Repository](https://github.com/Rostaing/rostaing-mcp)
- [PyPI Project Page](https://pypi.org/project/rostaing-mcp/)
