Metadata-Version: 2.4
Name: spark-logs
Version: 0.0.2
Summary: Download and optimize spark clusters automatically
Project-URL: Homepage, https://github.com/probably-nothing-labs/spark-logs
Project-URL: Bug Tracker, https://github.com/probably-nothing-labs/spark-logs/issues
Author-email: Matt Green <matt@denormalized.io>
License: MIT
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.12
Requires-Dist: google-cloud-dataproc>=5.18.1
Requires-Dist: google-cloud-resource-manager>=1.14.2
Requires-Dist: google-cloud-storage>=3.1.0
Requires-Dist: google-cloud>=0.34.0
Requires-Dist: inquirer>=3.1.3
Description-Content-Type: text/markdown

# Spark Logs

A command-line tool for downloading and managing logs from Google Cloud Dataproc Spark clusters.

## Installation

Requires Python 3.12 or later. Install from a local checkout of the repository:

```bash
pip install .
```

## Usage

The tool is fully interactive by default and will prompt for any missing information:

```bash
# Fully interactive - will prompt for project (with selection list), cluster, and application
spark-logs

# Partially interactive - only prompts for cluster and application
spark-logs --project my-gcp-project

# Partially interactive - only prompts for application
spark-logs --project my-gcp-project --cluster my-cluster

# No prompts - directly downloads logs when all parameters are specified
spark-logs --project my-gcp-project --cluster my-cluster --app-id application_1234567890_0001
```
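
The prompting behavior above amounts to asking only for the values that were not given on the command line. The following is a minimal sketch of that idea, assuming an `argparse`-based parser. The names `build_parser` and `missing_selections` are hypothetical and this is not the tool's actual implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three selection flags shown above."""
    parser = argparse.ArgumentParser(prog="spark-logs")
    parser.add_argument("--project")
    parser.add_argument("--cluster")
    parser.add_argument("--app-id")
    return parser


def missing_selections(args: argparse.Namespace) -> list[str]:
    """Return the selections that would still need an interactive prompt."""
    order = ["project", "cluster", "app_id"]
    return [name for name in order if getattr(args, name) is None]


args = build_parser().parse_args(["--project", "my-gcp-project"])
print(missing_selections(args))  # prints ['cluster', 'app_id']
```

With all three flags supplied the list is empty, which corresponds to the no-prompt case.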

You can also use the tool for listing resources:

```bash
# List all available Google Cloud projects
spark-logs --list-projects

# List all clusters in a project
spark-logs --project my-gcp-project --list-clusters

# List all Spark applications for a specific cluster
spark-logs --project my-gcp-project --cluster my-cluster --list-apps
```
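
For orientation, a cluster listing like the one above can be produced with the `google-cloud-dataproc` client library that the package depends on. The sketch below is an assumption about how such a listing might look, not the tool's actual code. `list_cluster_names` is a hypothetical helper, and its optional `client` parameter exists only so the function can be exercised without network access.

```python
def list_cluster_names(project_id: str, region: str = "us-central1", client=None) -> list[str]:
    """Return the names of all Dataproc clusters in a project and region."""
    if client is None:
        # Imported lazily so this module loads even without google-cloud-dataproc.
        from google.cloud import dataproc_v1

        # Dataproc uses regional endpoints, so the client is bound to the region.
        client = dataproc_v1.ClusterControllerClient(
            client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
        )
    request = {"project_id": project_id, "region": region}
    # list_clusters returns an iterable pager of Cluster messages.
    return [cluster.cluster_name for cluster in client.list_clusters(request=request)]
```

Calling `list_cluster_names("my-gcp-project")` would query the default `us-central1` region. Clusters in another region only show up when that region is passed explicitly.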

## Optional Arguments

- `--region`: GCP region to use (default: `us-central1`)
- `--output-dir`: Directory where downloaded logs are saved (default: `./logs`)
- `--service-account-json`: Path to a service account JSON key file

## Authentication

The tool authenticates with Google Cloud. If no service account key file is given, it falls back to the default credential lookup of the Google Cloud client libraries (for example, credentials set up through `gcloud auth` or supplied through environment variables).

To use a service account, pass the path to its JSON key file:

```bash
spark-logs --service-account-json path/to/service-account.json
```
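
The choice between a key file and default credentials can be sketched with the `google-auth` library, which the Google Cloud client libraries build on. This is an assumption about how the fallback might be implemented, not the tool's actual code, and `load_credentials` is a hypothetical helper name.

```python
from typing import Optional


def load_credentials(service_account_json: Optional[str] = None):
    """Return a (credentials, project_id) pair for Google Cloud clients."""
    if service_account_json:
        # Imported lazily so this module loads even without google-auth.
        from google.oauth2 import service_account

        credentials = service_account.Credentials.from_service_account_file(
            service_account_json
        )
        return credentials, credentials.project_id

    import google.auth

    # Application Default Credentials: google-auth checks the
    # GOOGLE_APPLICATION_CREDENTIALS environment variable, then credentials
    # created by the gcloud CLI, then the service account attached to the
    # environment when running on Google Cloud.
    return google.auth.default()
```

The returned credentials object can then be passed to a client constructor through its `credentials` argument.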