Metadata-Version: 2.4
Name: spark-advisor-cli
Version: 0.1.21
Summary: AI-powered Apache Spark job analyzer and configuration advisor
Project-URL: Homepage, https://github.com/pstysz/spark-advisor
Project-URL: Repository, https://github.com/pstysz/spark-advisor
Project-URL: Issues, https://github.com/pstysz/spark-advisor/issues
Project-URL: Documentation, https://github.com/pstysz/spark-advisor/blob/main/docs/architecture.md
Author: Pawel Stysz
License-Expression: Apache-2.0
Keywords: ai,apache-spark,claude,mcp,optimization,performance,spark
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: System :: Monitoring
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: rich>=14
Requires-Dist: spark-advisor-analyzer==0.1.21
Requires-Dist: spark-advisor-hs-connector==0.1.21
Requires-Dist: spark-advisor-models==0.1.21
Requires-Dist: spark-advisor-parser==0.1.21
Requires-Dist: spark-advisor-rules==0.1.21
Requires-Dist: typer>=0.15
Description-Content-Type: text/markdown

[↩ spark-advisor](../../README.md)

# spark-advisor-cli

AI-powered Apache Spark job analyzer and configuration advisor, packaged as a standalone CLI tool.

**Stop guessing Spark configs. Let data and AI tell you what's wrong.**

## Install

```bash
pip install spark-advisor-cli
```

## Quick Start

```bash
# Analyze from event log file (rules-only, free)
spark-advisor analyze /path/to/event-log.json.gz --no-ai

# Analyze with AI recommendations
export SA_ANALYZER_AI__API_KEY=sk-or-...
spark-advisor analyze /path/to/event-log.json.gz

# Analyze from History Server
spark-advisor analyze app-20250101120000-0001 -hs http://yarn:18080

# Agent mode (multi-turn AI analysis)
spark-advisor analyze /path/to/event-log.json.gz --agent
```

## Commands

### `analyze` — analyze a Spark job

```bash
# From event log file
spark-advisor analyze /path/to/event-log.json.gz

# From History Server
spark-advisor analyze app-20250101120000-0001 -hs http://yarn:18080

# With AI analysis (default if SA_ANALYZER_AI__API_KEY is set)
spark-advisor analyze /path/to/event-log.json.gz

# Without AI (rules only)
spark-advisor analyze /path/to/event-log.json.gz --no-ai

# Agent mode (multi-turn AI with tool use)
spark-advisor analyze /path/to/event-log.json.gz --agent

# Verbose mode (per-stage breakdown)
spark-advisor analyze /path/to/event-log.json.gz --verbose

# JSON output
spark-advisor analyze /path/to/event-log.json.gz --format json

# Save suggested config to file
spark-advisor analyze /path/to/event-log.json.gz -o spark-defaults.conf

# Use specific LLM model
spark-advisor analyze /path/to/event-log.json.gz --model qwen/qwen3-coder:free
```
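The `-o` output is meant to be fed back to Spark. Assuming the saved file follows standard `spark-defaults.conf` key-value syntax (as the example filename suggests), it can be passed to `spark-submit` directly; the jar path and class name below are placeholders:

```shell
# Re-run the job with the advisor's suggested settings.
# --properties-file is a standard spark-submit flag; paths are illustrative.
spark-submit \
  --properties-file spark-defaults.conf \
  --class com.example.MyJob \
  my-job.jar
```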

| Flag               | Short | Default                 | Description                                   |
|--------------------|-------|-------------------------|-----------------------------------------------|
| `source`           |       | required                | App ID (with `-hs`) or path to event log file |
| `--history-server` | `-hs` | `None`                  | Spark History Server URL                      |
| `--no-ai`          |       | `False`                 | Disable AI analysis (rules only)              |
| `--agent`          |       | `False`                 | Use agent mode (multi-turn AI with tool use)  |
| `--model`          | `-m`  | `qwen/qwen3-coder:free` | LLM model for AI analysis                     |
| `--output`         | `-o`  | `None`                  | Write suggested config to file                |
| `--format`         | `-f`  | `text`                  | Output format: `text` or `json`               |
| `--verbose`        | `-v`  | `False`                 | Show per-stage breakdown                      |

### `scan` — list recent jobs from History Server

```bash
spark-advisor scan -hs http://yarn:18080 --limit 20
```

### `version`

```bash
spark-advisor version
# spark-advisor v0.1.21
```

## What it detects

11 deterministic rules: data skew, disk spill, GC pressure, shuffle partitions, executor idle, task failures, small files, broadcast join threshold, serializer choice, dynamic allocation, memory overhead.
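The rules are deterministic heuristics over stage and task metrics. As an illustration only (this is not the package's actual implementation, which lives in `spark-advisor-rules`), a data-skew check typically compares the slowest task in a stage against the median; the 5x threshold here is an assumed default:

```python
from statistics import median

def detect_data_skew(task_durations_ms: list[float], ratio_threshold: float = 5.0) -> bool:
    """Flag a stage as skewed if its slowest task takes far longer than the median task.

    Illustrative sketch: field names and the threshold are assumptions,
    not spark-advisor's real rule signature.
    """
    if len(task_durations_ms) < 2:
        return False
    med = median(task_durations_ms)
    if med == 0:
        return False
    return max(task_durations_ms) / med >= ratio_threshold

# One straggler task dominating a stage trips the rule:
print(detect_data_skew([100, 110, 95, 105, 900]))  # skewed stage
print(detect_data_skew([100, 110, 95, 105, 120]))  # balanced stage
```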

All thresholds are configurable via the `Thresholds` model.
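The real `Thresholds` class ships with the `spark-advisor-models` / `spark-advisor-rules` packages, and its exact fields are documented there. As a shape-only sketch of how such a model is typically used (every field name and default below is hypothetical, not the real API):

```python
from dataclasses import dataclass

# Hypothetical stand-in for spark-advisor's Thresholds model; the real
# class and its field names may differ.
@dataclass(frozen=True)
class ThresholdsSketch:
    skew_ratio: float = 5.0        # max/median task duration before flagging skew
    spill_bytes: int = 0           # any disk spill above this is flagged
    gc_time_fraction: float = 0.1  # GC time / executor run time before flagging GC pressure

# Override a single rule's sensitivity while keeping the other defaults:
relaxed = ThresholdsSketch(skew_ratio=10.0)
print(relaxed.skew_ratio)
```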

## See also

- [Full documentation and architecture](../../README.md)
- [MCP Server setup (Claude Desktop / Cursor)](../../docs/mcp-setup.md)
- [Rules engine](../spark-advisor-rules/README.md)
- [Analyzer](../spark-advisor-analyzer/README.md)
- [Contributing](../../CONTRIBUTING.md)

## License

Apache 2.0
