Metadata-Version: 2.4
Name: metaflow-kubeflow
Version: 0.0.4
Summary: Kubeflow Pipeline extension for Metaflow
Author: Outerbounds
Author-email: help@outerbounds.co
License: Apache Software License
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: kfp>=2.14.6
Requires-Dist: kfp-kubernetes>=2.14.6
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Kubeflow Pipelines extension for Metaflow

Compile and run Metaflow flows on Kubeflow Pipelines (**Argo Workflows** backend).

## Basic Usage

- Ensure you have access to a Kubeflow Pipelines instance and know its API server URL.
- Use the CLI commands below to compile your flow into a Kubeflow Pipeline and deploy it.

## YouTube Screencast

[![metaflow kubeflow demo](https://img.youtube.com/vi/ALg0A9SzRG8/0.jpg)](https://www.youtube.com/watch?v=ALg0A9SzRG8)

## Compiling and Deploying a Pipeline

```bash
python my_flow.py kubeflow-pipelines --url https://my-kubeflow-instance.com create
```

This command will:

- Compile your Metaflow flow into a Kubeflow Pipeline YAML specification
- Upload it to your Kubeflow Pipelines instance
- Create a new version of the pipeline

### Accessing Kubeflow Pipelines for Deployment

Metaflow needs to be able to connect to Kubeflow Pipelines for deployment. If you have connectivity already set up, you don't need to do anything.

If you can't connect to the service directly, you can set up a port forward to the service:
```bash
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8081:80
```

After this, you can specify the service URL as `http://localhost:8081` in one of these ways:

- On the CLI for `kubeflow-pipelines` with the `--url` option
- In the Metaflow config, specify `"METAFLOW_KUBEFLOW_PIPELINES_URL": "http://localhost:8081"`
- Set an environment variable, `METAFLOW_KUBEFLOW_PIPELINES_URL=http://localhost:8081`
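
For example, the environment-variable route can be set up once per shell session before invoking any of the commands below (the URL here assumes the port forward shown above):

```shell
# Persist the Kubeflow Pipelines URL for this shell session so that
# subsequent `kubeflow-pipelines` commands can omit the --url flag.
# (Uses the port-forwarded address from the step above.)
export METAFLOW_KUBEFLOW_PIPELINES_URL=http://localhost:8081
```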

## Available Commands

### 1. **create** - Compile and/or Deploy Pipeline

Compile and deploy a new version of your flow to Kubeflow Pipelines:

```bash
python my_flow.py kubeflow-pipelines \
    --url https://my-kubeflow-instance.com \
    create \
    --version-name v1.0.0 \
    --experiment "My Production Experiment" \
    --alpha 0.5
```

**Recurring Runs**: If your flow is decorated with `@schedule`, this command automatically creates or updates the corresponding Recurring Run in Kubeflow Pipelines.

Options:
- `--experiment`: The experiment to create the recurring run under (if `@schedule` is present). Defaults to "Default".
- `--version-name`: Deploy under a custom version name. Otherwise, a new version named with a UTC timestamp is created.
- `--only-yaml`: Print the YAML specification to stdout and exit without uploading to Kubeflow Pipelines.
- Flow parameters: Any flow parameters (e.g., `--alpha`) passed here are baked into the recurring run configuration (if `@schedule` is present), overriding the defaults defined in your code.

Use `--help` for all available options including `tags`, `namespace`, `max-workers`, and production token management.
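
The `--version-name` fallback described above can be sketched roughly as follows; this is an illustrative stdlib-only sketch, and the exact timestamp format the extension uses is an assumption:

```python
from datetime import datetime, timezone
from typing import Optional

def pick_version_name(explicit: Optional[str] = None) -> str:
    # Use the explicit --version-name when given; otherwise fall back to a
    # name derived from the current UTC time (format here is illustrative).
    if explicit:
        return explicit
    return datetime.now(timezone.utc).strftime("v%Y-%m-%dT%H-%M-%S")

print(pick_version_name("v1.0.0"))  # explicit name wins
print(pick_version_name())          # e.g. v2025-01-01T12-00-00
```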

### 2. **trigger** - Execute Pipeline

Trigger an execution of your deployed pipeline:

```bash
python my_flow.py kubeflow-pipelines \
    --url https://my-kubeflow-instance.com \
    trigger \
    --experiment my-experiment \
    --alpha 0.1 \
    --max-epochs 100
```

Flow parameters can be passed as command-line arguments. Use `--help` for all available options.
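
Passing parameters this way amounts to appending `--name value` pairs to the command line. A tiny hypothetical helper shows the shape; `to_cli_flags` is not part of the extension, and the underscore-to-hyphen convention mirrors the example above but is an assumption:

```python
def to_cli_flags(params: dict) -> list:
    # Render {"alpha": 0.1, "max_epochs": 100} as the flag list
    # ["--alpha", "0.1", "--max-epochs", "100"] for the trigger command.
    flags = []
    for name, value in params.items():
        flags += [f"--{name.replace('_', '-')}", str(value)]
    return flags

print(to_cli_flags({"alpha": 0.1, "max_epochs": 100}))
# -> ['--alpha', '0.1', '--max-epochs', '100']
```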

By default, the latest version of the deployed pipeline is triggered; to run a specific version instead, pass `--version-name`.

### 3. **status** - Check Execution Status

Fetch the status of a running or completed pipeline execution:

```bash
python my_flow.py kubeflow-pipelines \
    --url https://my-kubeflow-instance.com \
    status \
    --kfp-run-id abc-123-def-456
```

Use `--help` for all available options.

### 4. **terminate** - Terminate Execution

Terminate a running pipeline execution:

```bash
python my_flow.py kubeflow-pipelines \
    --url https://my-kubeflow-instance.com \
    terminate \
    --kfp-run-id abc-123-def-456
```

Use `--help` for all available options.

### 5. **delete** - Delete a Deployed Pipeline

Delete the flow definition and all its associated versions from Kubeflow Pipelines.

This command also searches for and deletes any associated Recurring Runs (Schedules), so no orphaned schedule keeps trying to trigger the deleted pipeline.

In essence, this undeploys the pipeline but preserves execution history (runs) and artifacts.

```bash
python my_flow.py kubeflow-pipelines \
    --url https://my-kubeflow-instance.com \
    delete
```

Use `--help` for all available options.

### Fin.
