Metadata-Version: 2.4
Name: databricks-job-runner
Version: 0.6
Summary: Reusable CLI for uploading, submitting, validating, fetching logs, and cleaning Databricks job runs
Author: Ryan Knight
Author-email: Ryan Knight <ryan.knight@neo4j.com>
License-Expression: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Dist: databricks-sdk>=0.40
Requires-Dist: pydantic>=2
Requires-Python: >=3.12
Project-URL: Repository, https://github.com/neo4j-partners/databricks-job-runner
Description-Content-Type: text/markdown

# databricks-job-runner

[![PyPI version](https://badge.fury.io/py/databricks-job-runner.svg)](https://badge.fury.io/py/databricks-job-runner)

Reusable CLI for uploading, submitting, validating, fetching logs for, and cleaning up Databricks job runs.

`databricks-job-runner` wraps the
[Databricks Python SDK](https://docs.databricks.com/dev-tools/sdk-python.html)
into a small library that each project configures with a `Runner` instance.
One `Runner` gives you nine CLI subcommands (`upload`, `download`, `submit`,
`validate`, `logs`, `clean`, `catalog`, `schema`, `volume`) without writing any
Databricks API code in your project.

## Install

```bash
uv add databricks-job-runner
```

Or with pip:

```bash
pip install databricks-job-runner
```

## Quick start

`databricks-job-runner` is a library, not a standalone CLI: the package ships
no `__main__` of its own, so each project wires up one `Runner`.

Create a `cli/` package with two files:

**`cli/__init__.py`**

```python
from databricks_job_runner import Runner

runner = Runner(
    run_name_prefix="my_project",
    wheel_package="my_package",  # optional
)
```

**`cli/__main__.py`**

```python
from cli import runner
runner.main()
```

Add a `.env` to your project root with at least:

```
DATABRICKS_PROFILE=my-profile
DATABRICKS_CLUSTER_ID=0123-456789-abcdef
DATABRICKS_WORKSPACE_DIR=/Users/you@example.com/my_project
```
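
Before the first run, it can help to confirm that the `.env` file actually defines those keys. The following is a minimal sketch and not part of `databricks-job-runner`: `parse_env` and `missing_keys` are hypothetical helpers, the parser handles only simple `KEY=value` lines, and treating all three keys as required is an assumption taken from the example above (a serverless setup may not need `DATABRICKS_CLUSTER_ID`).

```python
from pathlib import Path

# Assumption: the three keys from the example above are treated as required.
REQUIRED_KEYS = (
    "DATABRICKS_PROFILE",
    "DATABRICKS_CLUSTER_ID",
    "DATABRICKS_WORKSPACE_DIR",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: tuple[str, ...] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty, in declared order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    if missing:
        raise SystemExit(f"Missing keys in .env: {', '.join(missing)}")
    print("All expected keys are present.")
```

Keeping the parsing separate from the file read makes the check easy to exercise against an in-memory string.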

Then run the core lifecycle from your project root:

```bash
uv run python -m cli upload --all          # upload agent_modules/
uv run python -m cli submit test_hello.py  # submit a job and wait
uv run python -m cli logs                  # stdout/stderr from the last run
uv run python -m cli clean --yes           # tear down
```

```
.env + cli/  ->  upload  ->  submit  ->  (Databricks run)  ->  logs  ->  clean
                   |            |                               |
              workspace/     one-shot                        tail 5MB
              agent_modules  SubmitRun                        stdout/err
```
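
For CI or repeated local runs, the same four commands can be scripted. This is a rough sketch rather than anything the library provides: `lifecycle_commands` and `run_lifecycle` are hypothetical names, only the flags shown above are used, and it assumes each subcommand exits non-zero on failure, which this README does not state.

```python
import subprocess

# The invocation prefix used throughout the quick start.
BASE = ("uv", "run", "python", "-m", "cli")


def lifecycle_commands(script: str) -> list[list[str]]:
    """Build the upload -> submit -> logs -> clean command sequence."""
    return [
        [*BASE, "upload", "--all"],
        [*BASE, "submit", script],
        [*BASE, "logs"],
        [*BASE, "clean", "--yes"],
    ]


def run_lifecycle(script: str, cwd: str = ".") -> None:
    """Run each step in order, stopping at the first failing command."""
    for cmd in lifecycle_commands(script):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_lifecycle("test_hello.py")
```

Because the loop stops at the first failure, a failed `submit` skips `logs` and `clean`. Whether cleanup should always run is a project-level choice.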

## Documentation

| Page | What it covers |
|------|----------------|
| [Getting started](docs/getting-started.md) | Install, project-layout contract, first job end to end, architecture overview. |
| [Configuration](docs/configuration.md) | Every `.env` key, precedence, compute modes, parameter injection, `inject_params`. |
| [Workflows](docs/workflows.md) | Common workflows with diagrams: classic vs serverless, wheels, data, Unity Catalog. |
| [Command reference](docs/commands.md) | Every subcommand, flag, and positional argument. |
| [Bootstrap-from-Volume](docs/bootstrap.md) | Run-startup wheel install, `BootstrapConfig`, per-run isolation. |
| [Preflight hooks](docs/preflight.md) | Fail-fast compute checks before submit/validate, cluster-library helpers. |
| [API reference](docs/api.md) | `Runner`, `RunnerConfig`, `Compute`, `inject_params`, `RunnerError`. |
| [Examples and smoke tests](docs/examples.md) | The two runnable example projects and the serverless test matrix. |
| [Releasing](docs/releasing.md) | PyPI tag-based release flow. |

## Requirements

- Python 3.12+
- Databricks authentication: a
  [Databricks CLI profile](https://docs.databricks.com/dev-tools/cli/index.html),
  env vars (`DATABRICKS_HOST` / `DATABRICKS_TOKEN`), or any other
  [unified-auth](https://docs.databricks.com/dev-tools/auth/) method
- Either a Databricks all-purpose cluster (auto-started if terminated) or
  serverless compute enabled for the workspace
- [uv](https://docs.astral.sh/uv/) (for wheel building only)
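
To check the authentication requirement independently of this library, a short script against the Databricks SDK can help. The sketch below is illustrative only: `auth_kwargs` and `check_auth` are hypothetical helpers, and preferring `DATABRICKS_PROFILE` over `DATABRICKS_HOST` / `DATABRICKS_TOKEN` is an assumption made for the example, not documented `Runner` behaviour.

```python
import os
from collections.abc import Mapping


def auth_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Choose WorkspaceClient keyword arguments from environment values."""
    profile = env.get("DATABRICKS_PROFILE")
    if profile:
        # Assumption for this sketch: a named CLI profile takes priority.
        return {"profile": profile}
    host = env.get("DATABRICKS_HOST")
    token = env.get("DATABRICKS_TOKEN")
    if host and token:
        return {"host": host, "token": token}
    # Nothing explicit: let the SDK's unified auth resolve credentials.
    return {}


def check_auth(env: Mapping[str, str] = os.environ) -> str:
    """Return the authenticated user name; needs databricks-sdk and network access."""
    from databricks.sdk import WorkspaceClient  # imported lazily on purpose

    client = WorkspaceClient(**auth_kwargs(env))
    return client.current_user.me().user_name


if __name__ == "__main__":
    print(f"Authenticated as {check_auth()}")
```

If `check_auth()` raises, the credentials problem lies with the profile or environment variables rather than with the project's `Runner` setup.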
