Metadata-Version: 2.4
Name: metaflow-logs
Version: 0.1.0
Summary: Ships Metaflow task logs to a centralized backend via OpenTelemetry
License: Apache-2.0
License-File: LICENSE
Requires-Python: >=3.9
Requires-Dist: metaflow>=2.9
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.24
Requires-Dist: opentelemetry-sdk>=1.24
Description-Content-Type: text/markdown

# metaflow-logs

[![CI](https://img.shields.io/github/actions/workflow/status/npow/metaflow-logs/ci.yml?branch=main&style=flat-square&label=CI)](https://github.com/npow/metaflow-logs/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/metaflow-logs?style=flat-square)](https://pypi.org/project/metaflow-logs/)
[![Python](https://img.shields.io/pypi/pyversions/metaflow-logs?style=flat-square)](https://pypi.org/project/metaflow-logs/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue?style=flat-square)](LICENSE)

A Metaflow extension that ships task stdout and stderr to any [OpenTelemetry](https://opentelemetry.io/)-compatible backend (Grafana Loki, Datadog, New Relic, Elasticsearch, Honeycomb, …). Each log line is tagged with full task context — flow name, run ID, step name, task ID, attempt — making logs filterable and correlatable from any OTel-aware tool.

## Quick start

```bash
pip install metaflow-logs
```

Add the decorator to any step:

```python
from metaflow import FlowSpec, step
from metaflow_extensions.logs.plugins.decorator import LogsDecorator

class MyFlow(FlowSpec):
    @LogsDecorator()
    @step
    def train(self):
        print("epoch 1 complete")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    MyFlow()
```

Point it at your OTel collector:

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
python flow.py run
```

Each log line arrives in your backend tagged with:

| Attribute | Example |
|-----------|---------|
| `metaflow.flow` | `MyFlow` |
| `metaflow.run_id` | `123` |
| `metaflow.step` | `train` |
| `metaflow.task_id` | `456` |
| `metaflow.attempt` | `0` |
| `metaflow.log_source` | `task` or `runtime` |

## Configuration

All configuration is done through standard OTel environment variables:

| Variable | Default | Purpose |
|----------|---------|---------|
| `OTEL_EXPORTER_OTLP_ENDPOINT` | `http://localhost:4318` | OTel collector endpoint |
| `OTEL_EXPORTER_OTLP_HEADERS` | — | Auth headers (e.g. `Authorization=Bearer ...`) |
| `OTEL_SERVICE_NAME` | `metaflow` | Service name in your backend |

The extension falls back to a console exporter if `opentelemetry-exporter-otlp-proto-http` is not installed.
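As a rough illustration of how the variables in the table resolve, the sketch below reads them with the documented defaults. `resolve_otel_settings` is a hypothetical helper and not part of this package; the real handling is done by the extension and the OpenTelemetry SDK. Percent-decoding the header values is an assumption of this sketch.

```python
import os
from urllib.parse import unquote

# Defaults taken from the configuration table.
DEFAULT_ENDPOINT = "http://localhost:4318"
DEFAULT_SERVICE_NAME = "metaflow"


def resolve_otel_settings(env=None):
    """Hypothetical helper: read the three variables with their defaults."""
    env = os.environ if env is None else env
    headers = {}
    # OTLP headers are a comma-separated list of key=value pairs.
    for pair in env.get("OTEL_EXPORTER_OTLP_HEADERS", "").split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            # Assumption: values may be percent-encoded, so decode them.
            headers[key.strip()] = unquote(value.strip())
    return {
        "endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT", DEFAULT_ENDPOINT),
        "headers": headers,
        "service_name": env.get("OTEL_SERVICE_NAME", DEFAULT_SERVICE_NAME),
    }
```

Passing a plain dict instead of `os.environ` makes it easy to check a configuration before launching a run.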

## How it works

At step completion, the extension reads Metaflow's structured MFLOG files (`MFLOG_STDOUT`, `MFLOG_STDERR`) and emits each line as an OTel `LogRecord`. Runtime logs (Metaflow internals) are emitted at `DEBUG` severity; task logs (your code's stdout/stderr) are emitted at `INFO`.
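The mapping from log source to severity can be sketched as follows. This is a minimal illustration, not the extension's actual code: `to_record` is a hypothetical helper that returns a plain dict rather than an OTel `LogRecord`, and the MFLOG line layout (`[MFLOG|<version>|<timestamp>|<source>|<id>]<message>`) is an assumption that should be checked against your installed Metaflow version.

```python
import re

# Assumed MFLOG line layout: [MFLOG|<version>|<timestamp>|<source>|<id>]<message>
MFLOG_LINE = re.compile(r"^\[MFLOG\|(\d+)\|(.+?)\|(.+?)\|(.+?)\](.*)$")

# Severity mapping described above: runtime -> DEBUG, task -> INFO.
SEVERITY_BY_SOURCE = {"runtime": "DEBUG", "task": "INFO"}


def to_record(line, context):
    """Hypothetical helper: turn one MFLOG line into a plain dict record."""
    match = MFLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None  # not an MFLOG-formatted line
    _version, timestamp, source, _line_id, message = match.groups()
    # Copy the task context so the caller's dict is left untouched.
    attributes = dict(context)
    attributes["metaflow.log_source"] = source
    return {
        "timestamp": timestamp,
        # Assumption: unknown sources default to INFO in this sketch.
        "severity": SEVERITY_BY_SOURCE.get(source, "INFO"),
        "body": message,
        "attributes": attributes,
    }
```

Here `context` would carry the per-task attributes such as `metaflow.flow`, `metaflow.run_id`, and `metaflow.step`, so every record ends up with the same filterable keys.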

## Development

```bash
git clone https://github.com/npow/metaflow-logs
cd metaflow-logs
pip install -e .
pytest tests/ -v
```
