Metadata-Version: 2.4
Name: lakexpress-mcp
Version: 0.1.6
Summary: A Model Context Protocol (MCP) server for LakeXpress, enabling database to Parquet export with sync management and data lake publishing.
Project-URL: Homepage, https://aetperf.github.io/LakeXpress-Documentation/
Project-URL: Repository, https://github.com/arpe-io/lakexpress-mcp
Project-URL: Issues, https://github.com/arpe-io/lakexpress-mcp/issues
Author: Arpe.io
License-Expression: MIT
License-File: LICENSE
Keywords: data-lake,database,etl,lakexpress,mcp,model-context-protocol,parquet
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Requires-Dist: mcp>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
Requires-Dist: pytest-mock>=3.11.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Description-Content-Type: text/markdown

# LakeXpress MCP Server

<!-- mcp-name: io.github.arpe-io/lakexpress-mcp -->

[![PyPI](https://img.shields.io/pypi/v/lakexpress-mcp)](https://pypi.org/project/lakexpress-mcp/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![MCP Registry](https://img.shields.io/badge/MCP-Registry-blue)](https://registry.modelcontextprotocol.io/?q=arpe-io)

A [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) server for [LakeXpress](https://aetperf.github.io/LakeXpress-Documentation/), a database-to-Parquet export tool with sync management and data lake publishing.

## Features

- **14 subcommands**: logdb management, config management, sync execution, status, and cleanup
- **5 source databases**: SQL Server, PostgreSQL, Oracle, MySQL, MariaDB
- **6 log databases**: SQL Server, PostgreSQL, MySQL, MariaDB, SQLite, DuckDB
- **6 storage backends**: Local, S3, S3-compatible, GCS, Azure ADLS Gen2, OneLake
- **7 publish targets**: Snowflake, Databricks, Fabric, BigQuery, MotherDuck, Glue, DuckLake
- Command preview before execution with safety confirmation
- Auth file validation
- Workflow suggestions based on use case

## Installation

```bash
pip install lakexpress-mcp   # from PyPI
pip install -e ".[dev]"      # or, from a source checkout (editable, with dev extras)
```

## Claude Code Configuration

Add the server to your Claude Code MCP settings. When running from a source checkout:

```json
{
  "mcpServers": {
    "lakexpress": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "/path/to/lakexpress-mcp",
      "env": {
        "LAKEXPRESS_PATH": "/path/to/LakeXpress",
        "LAKEXPRESS_TIMEOUT": "3600",
        "LAKEXPRESS_LOG_DIR": "./logs",
        "FASTBCP_DIR_PATH": "/path/to/FastBCP/"
      }
    }
  }
}
```

Or using the installed entry point:

```json
{
  "mcpServers": {
    "lakexpress": {
      "command": "lakexpress-mcp",
      "env": {
        "LAKEXPRESS_PATH": "/path/to/LakeXpress",
        "FASTBCP_DIR_PATH": "/path/to/FastBCP/"
      }
    }
  }
}
```
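
If the settings file already lists other MCP servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch of that idea, not part of this package: `lakexpress_entry` and `add_server` are hypothetical helper names, and the `other` server is a placeholder.

```python
import json


def lakexpress_entry(lakexpress_path: str, fastbcp_dir_path: str) -> dict:
    """Build the entry-point variant of the server definition shown above."""
    return {
        "command": "lakexpress-mcp",
        "env": {
            "LAKEXPRESS_PATH": lakexpress_path,
            "FASTBCP_DIR_PATH": fastbcp_dir_path,
        },
    }


def add_server(settings: dict, entry: dict, name: str = "lakexpress") -> dict:
    """Return a copy of settings with the server added; other servers are kept."""
    servers = {**settings.get("mcpServers", {}), name: entry}
    return {**settings, "mcpServers": servers}


# Placeholder settings with one unrelated server already configured.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
merged = add_server(
    existing, lakexpress_entry("/path/to/LakeXpress", "/path/to/FastBCP/")
)
print(json.dumps(merged, indent=2))
```

The helper returns a new dictionary instead of mutating its input, so the original settings stay available if the result needs to be discarded.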

## Tools

### `preview_command`
Build and preview any LakeXpress CLI command without executing it. Supports all 14 subcommands with full parameter validation.

### `execute_command`
Execute a previously previewed command. Requires `confirmation: true` as a safety mechanism.
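
The preview-then-confirm pattern behind these two tools can be pictured with a small sketch. This is not the server's actual code: `preview` and `execute` are hypothetical function names, and the runner is injected so that nothing is launched here.

```python
import shlex
from typing import Callable, Sequence


def preview(argv: Sequence[str]) -> str:
    """Render the command as a shell-quoted string without running it."""
    return shlex.join(argv)


def execute(
    argv: Sequence[str],
    confirmation: bool,
    runner: Callable[[Sequence[str]], int],
) -> int:
    """Run the command only when the caller explicitly passes True."""
    if confirmation is not True:
        raise PermissionError("refusing to run: confirmation must be true")
    return runner(argv)


# Hypothetical usage: a stub runner stands in for the real binary.
argv = ["LakeXpress", "sync", "--sync_id", "<sync_id>"]
print(preview(argv))
exit_code = execute(argv, confirmation=True, runner=lambda _argv: 0)
```

Keeping the command as an argument list until the last moment avoids shell-quoting mistakes; the quoted string is produced only for display.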

### `validate_auth_file`
Validate that an authentication file exists and contains valid JSON, and optionally check that specific `auth_id` entries are present.
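
As a rough illustration of these checks, here is a minimal sketch. It assumes the auth file is a JSON object whose top-level keys are the `auth_id` values; that layout is an assumption made for the example, and the function below is not the server's implementation.

```python
import json
from pathlib import Path


def validate_auth_file(path: str, required_ids: tuple[str, ...] = ()) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    file = Path(path)
    if not file.is_file():
        return [f"auth file not found: {path}"]
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    # Assumption: the top level is an object keyed by auth_id.
    if not isinstance(data, dict):
        return ["expected a JSON object at the top level"]
    return [f"missing auth_id: {i}" for i in required_ids if i not in data]


# Hypothetical usage with the identifiers from the workflow example.
problems = validate_auth_file("auth.json", ("export_db", "prod_db"))
print(problems or "auth file looks fine")
```

Returning a list of problems rather than raising on the first failure lets a caller report every missing `auth_id` at once.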

### `list_capabilities`
List all supported source databases, log databases, storage backends, publishing targets, compression types, and available commands.

### `suggest_workflow`
Given a use case (source DB type, storage destination, optional publish target), suggest the full sequence of LakeXpress commands with example parameters.

### `get_version`
Report the detected LakeXpress binary version and capabilities.

## Workflow Example

```bash
# 1. Initialize the log database (first-time setup)
LakeXpress logdb init -a auth.json --log_db_auth_id export_db

# 2. Create a sync configuration
LakeXpress config create -a auth.json --log_db_auth_id export_db \
  --source_db_auth_id prod_db --source_schema_name sales \
  --output_dir ./exports --compression_type Zstd

# 3. Execute the sync
LakeXpress sync --sync_id <sync_id>

# 4. Check status
LakeXpress status -a auth.json --log_db_auth_id export_db --sync_id <sync_id>
```
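
The four steps above can also be assembled programmatically, for example to print them for review before running anything. The sketch below uses only the flags that appear in this example; `build_workflow` is a hypothetical helper, and `<sync_id>` remains a placeholder for the identifier of the sync configuration.

```python
import shlex


def build_workflow(
    auth_file: str,
    log_db_auth_id: str,
    source_db_auth_id: str,
    schema: str,
    output_dir: str,
    sync_id: str = "<sync_id>",
) -> list[list[str]]:
    """Return the four commands as argument lists, in execution order."""
    common = ["-a", auth_file, "--log_db_auth_id", log_db_auth_id]
    return [
        ["LakeXpress", "logdb", "init", *common],
        [
            "LakeXpress", "config", "create", *common,
            "--source_db_auth_id", source_db_auth_id,
            "--source_schema_name", schema,
            "--output_dir", output_dir,
            "--compression_type", "Zstd",
        ],
        ["LakeXpress", "sync", "--sync_id", sync_id],
        ["LakeXpress", "status", *common, "--sync_id", sync_id],
    ]


# Print each step as a shell-quoted command line; nothing is executed.
for step in build_workflow("auth.json", "export_db", "prod_db", "sales", "./exports"):
    print(shlex.join(step))
```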

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `LAKEXPRESS_PATH` | `./LakeXpress` | Path to the LakeXpress binary |
| `LAKEXPRESS_TIMEOUT` | `3600` | Command execution timeout in seconds |
| `LAKEXPRESS_LOG_DIR` | `./logs` | Directory for execution logs |
| `FASTBCP_DIR_PATH` | _(empty)_ | Path to the directory containing the FastBCP binary (auto-fills the `fastbcp_dir_path` parameter) |
| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR) |
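
To show how the defaults in the table apply, here is a minimal sketch that reads the variables into a typed settings object. The `Settings` class and `load_settings` function are hypothetical names chosen for the example and may not match how the server loads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    lakexpress_path: str
    timeout_seconds: int
    log_dir: str
    fastbcp_dir_path: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable, falling back to the default listed in the table."""
    return Settings(
        lakexpress_path=env.get("LAKEXPRESS_PATH", "./LakeXpress"),
        timeout_seconds=int(env.get("LAKEXPRESS_TIMEOUT", "3600")),
        log_dir=env.get("LAKEXPRESS_LOG_DIR", "./logs"),
        fastbcp_dir_path=env.get("FASTBCP_DIR_PATH", ""),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Hypothetical usage: override only the timeout, keep every other default.
settings = load_settings({"LAKEXPRESS_TIMEOUT": "600"})
print(settings)
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.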

## Development

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ -v --cov=src --cov-report=term-missing
```

## License

MIT
