Metadata-Version: 2.4
Name: spark-advisor-cli
Version: 0.1.5
Summary: AI-powered Apache Spark job analyzer and configuration advisor
Project-URL: Homepage, https://github.com/pstysz/spark-advisor
Project-URL: Repository, https://github.com/pstysz/spark-advisor
Project-URL: Issues, https://github.com/pstysz/spark-advisor/issues
Project-URL: Documentation, https://github.com/pstysz/spark-advisor/blob/main/docs/architecture.md
Author: Pawel Stysz
License-Expression: Apache-2.0
Keywords: ai,apache-spark,claude,mcp,optimization,performance,spark
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: System :: Monitoring
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: anthropic>=0.52
Requires-Dist: httpx>=0.28
Requires-Dist: orjson>=3.10
Requires-Dist: rich>=14
Requires-Dist: spark-advisor-analyzer
Requires-Dist: spark-advisor-hs-connector
Requires-Dist: spark-advisor-models
Requires-Dist: spark-advisor-rules
Requires-Dist: typer>=0.15
Description-Content-Type: text/markdown

# spark-advisor

AI-powered Apache Spark job analyzer and configuration advisor.

**Stop guessing Spark configs. Let data and AI tell you what's wrong.**

## Install

```bash
pip install spark-advisor-cli
```

## Quick Start

```bash
# Analyze an event log file (rules-only, free)
spark-advisor analyze /path/to/event-log.json.gz --no-ai

# Analyze with AI recommendations
export ANTHROPIC_API_KEY=sk-ant-...
spark-advisor analyze /path/to/event-log.json.gz

# Analyze an application from the History Server
spark-advisor analyze app-20250101120000-0001 -hs http://yarn:18080

# Agent mode (multi-turn AI analysis)
spark-advisor analyze /path/to/event-log.json.gz --agent

# Scan recent jobs
spark-advisor scan -hs http://yarn:18080 --limit 20
```

## What it detects

11 deterministic rules:

- data skew
- disk spill
- GC pressure
- shuffle partitions
- executor idle
- task failures
- small files
- broadcast join threshold
- serializer choice
- dynamic allocation
- memory overhead
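To give a feel for what "deterministic rule" means here, a hypothetical skew check in the style such rules typically take: compare the slowest task in a stage against the median task duration and flag the stage when the ratio crosses a threshold. The function, field names, and thresholds below are illustrative assumptions, not spark-advisor's actual implementation:

```python
from statistics import median

def detect_skew(task_durations_ms, ratio_threshold=5.0, min_tasks=10):
    """Flag a stage as skewed when its slowest task far exceeds the median.

    task_durations_ms: per-task durations (ms) for one stage.
    Returns a finding dict, or None when the stage looks healthy
    or has too few tasks to judge.
    """
    if len(task_durations_ms) < min_tasks:
        return None  # not enough samples for a reliable signal
    med = median(task_durations_ms)
    if med == 0:
        return None  # avoid division by zero on degenerate stages
    ratio = max(task_durations_ms) / med
    if ratio >= ratio_threshold:
        return {"rule": "data_skew", "max_over_median": round(ratio, 1)}
    return None

# A stage where one task takes 10x the median duration:
print(detect_skew([100] * 19 + [1000]))
# prints {'rule': 'data_skew', 'max_over_median': 10.0}
```

Because rules like this are pure functions of the event-log metrics, the `--no-ai` mode can run them offline with no API key.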

## Links

- [Full documentation and architecture](https://github.com/pstysz/spark-advisor)
- [MCP Server setup (Claude Desktop / Cursor)](https://github.com/pstysz/spark-advisor/blob/main/docs/mcp-setup.md)
- [Contributing](https://github.com/pstysz/spark-advisor/blob/main/CONTRIBUTING.md)

## License

Apache License 2.0
