Metadata-Version: 2.4
Name: metrana
Version: 0.3.1
Summary: Inephany client library to use Metrana.
Author-email: Inephany <info@inephany.com>
License: Apache 2.0
Keywords: metrana,mlops,rlops,ml,metrics
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<3.0.0,>=1.24.0
Requires-Dist: loguru<0.8.0,>=0.7.0
Requires-Dist: requests<3.0.0,>=2.28.0
Requires-Dist: pydantic<3.0.0,>=2.5.0
Requires-Dist: urllib3<3.0.0,>=2.0.0
Requires-Dist: PyYAML<7.0.0,>=6.0.0
Requires-Dist: aiohttp<4.0.0,>=3.0.0
Requires-Dist: metrana-protobuf<1.0.0,>=0.0.3
Provides-Extra: rendering
Requires-Dist: av<16.0.0,>=12.0.0; extra == "rendering"
Provides-Extra: dev
Requires-Dist: pytest<10.0.0,>=7.0.0; extra == "dev"
Requires-Dist: pytest-mock<4.0.0,>=3.10.0; extra == "dev"
Requires-Dist: bump-my-version==1.3.0; extra == "dev"
Requires-Dist: black==26.3.1; extra == "dev"
Requires-Dist: isort==8.0.1; extra == "dev"
Requires-Dist: flake8==7.3.0; extra == "dev"
Requires-Dist: pre-commit==4.6.0; extra == "dev"
Requires-Dist: mypy==1.20.2; extra == "dev"
Requires-Dist: types-PyYAML>=6.0.12; extra == "dev"
Requires-Dist: types-redis>=4.5.0; extra == "dev"
Requires-Dist: types-requests>=2.28.0; extra == "dev"
Requires-Dist: types-cachetools>=6.1.0; extra == "dev"
Requires-Dist: typeguard==4.5.1; extra == "dev"
Requires-Dist: pytest-asyncio<2.0.0,>=0.23.0; extra == "dev"
Dynamic: license-file

# Metrana Client Library

Metrana is a metrics tracking client for ML/RL training runs. It provides a simple API to log metrics from training loops to the Metrana ingestion service, with asynchronous batching, configurable backpressure handling, and automatic retry on failure.

## Installation

```bash
pip install metrana
```

The `metrana-protobuf` dependency is pulled in automatically.

To use `metrana.log_rendering()` for logging RL environment video, install the optional `rendering` extra (pulls in PyAV, used for client-side H.264 encoding):

```bash
pip install 'metrana[rendering]'
```

## Quick Start

```python
import metrana

metrana.init(
    api_key="your-api-key",
    workspace_name="my-workspace",
    project_name="my-project",
    run_name="run-001",
)

for step in range(1000):
    loss, accuracy = train_step()  # train_step() stands in for your own training code
    metrana.log("loss", loss)
    metrana.log("accuracy", accuracy)

metrana.close()
```

The API key can also be provided via the `METRANA_API_KEY` environment variable, in which case `api_key` can be omitted from `init()`.
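
As a minimal sketch, a pre-flight check can fail early with a clear message when neither source provides a key. `require_api_key` is a hypothetical helper, not part of Metrana, and preferring the explicit argument over the environment variable is an assumption made for this example rather than documented library behaviour.

```python
import os
from typing import Optional


def require_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the API key to use, or raise if none is configured.

    Hypothetical helper, not part of Metrana. Preferring the explicit
    argument over METRANA_API_KEY is an assumption of this sketch.
    """
    key = explicit_key or os.environ.get("METRANA_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass api_key to init() or set METRANA_API_KEY"
        )
    return key
```

Calling it before `metrana.init()` turns a missing key into an immediate, readable error at start-up.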

## API Reference

### `metrana.init()`

Initialises the logger. Must be called once before `log()` or `close()`.

```python
metrana.init(
    api_key: str,                             # May be omitted when METRANA_API_KEY is set
    workspace_name: str,
    project_name: str,
    run_name: str,
    experiment_name: str | None = None,

    # Behavioural strategies (can also be set via environment variables)
    resume_strategy: str | None = None,       # "Never" | "Allow"
    backpressure_strategy: str | None = None, # "DropNew" | "Block" | "Raise"
    error_strategy: str | None = None,        # "Silent" | "Warn" | "RaiseOnLog" | "RaiseOnClose"
    close_strategy: str | None = None,        # "Immediate" | "CompletePending" | "CompleteAll"
    log_level: str | None = None,             # "Trace" | "Debug" | "Info" | "Success" | "Warn" | "Error" | "Critical" | "Off"

    # Aggregation rules (NOTE: currently disabled on the backend)
    aggregation_rules: list[AggregationRule] | None = None,

    # Run config — logged as queryable run attributes
    config: dict | None = None,

    # Advanced
    num_dispatch_workers: int = 4,
    ingestion_url: str | None = None,         # Overrides the default API endpoint

    # Rendering (see Environment Renderings below)
    rendering_output_dir: str | Path | None = None, # Defaults to ~/.metrana/renderings
    rendering_fps: int = 30,
    rendering_max_concurrent_encoders: int = 1,
    rendering_queue_max_size: int | None = None,    # None / 0 = unbounded
)
```

### `metrana.log()`

Logs a single metric value (or a dict of values). Thread-safe and non-blocking by default.

```python
# Single metric
metrana.log("loss", 0.5)

# Multiple metrics at once
metrana.log({"loss": 0.5, "accuracy": 0.9})
```

Full signature:

```python
metrana.log(
    metric_name: str | dict[str, float | int],
    value: float | int | None = None,     # Omit when metric_name is a dict
    scale: str | None = None,             # See Metric Scales below; defaults to "ML_STEP"
    step: int | None = None,              # Auto-increments per series — do not provide
    labels: dict[str, str] | None = None, # See Labels below
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    timestamp: int | None = None,         # Unix nanoseconds; defaults to now
)
```

**`step`** auto-increments per `(metric_name, scale, labels)` series. Do not provide it: manual step values are only appropriate for unordered series. Upcoming change: steps may be provided as long as they are monotonically increasing.
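
The per-series counting described above can be pictured with a small model. This is an illustrative sketch only, not Metrana's implementation: `StepCounter` is a hypothetical class, and starting each series at step `0` is an assumption.

```python
from collections import defaultdict


class StepCounter:
    """Illustrative model: one auto-incrementing counter per series."""

    def __init__(self):
        self._next_step = defaultdict(int)

    def next_step(self, metric_name, scale="ML_STEP", labels=None):
        # A series is identified by (metric_name, scale, labels). Labels are
        # frozen into a set of pairs so that key order does not matter.
        key = (metric_name, scale, frozenset((labels or {}).items()))
        step = self._next_step[key]
        self._next_step[key] += 1
        return step


counter = StepCounter()
print(counter.next_step("loss"))                                    # 0
print(counter.next_step("loss"))                                    # 1
print(counter.next_step("loss", labels={"environment_id": "a"}))    # 0 (a different series)
```

Changing any part of the key, including a single label value, starts a fresh counter.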

**`scale`** defaults to `ML_STEP`. For RL training, prefer the specialised RL helpers, which log environment-level and episode-level series in the most efficient and scalable form.

### `metrana.close()`

Shuts down the logger. Behaviour depends on the configured `close_strategy`.

```python
metrana.close()
```

---

### RL Helpers

The following functions are convenience wrappers around `log()` that fix the scale and ensure the backend treats them appropriately.

### `metrana.log_rl_step()`

Logs a per-gradient-update metric on the `ML_STEP` scale.

```python
metrana.log_rl_step(
    metric_name: str,
    value: float | int,
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    step: int | None = None,              # Auto-increments per series — do not provide
    labels: dict[str, str] | None = None,
    timestamp: int | None = None,
)
```

### `metrana.log_rl_episode()`

Logs a per-episode metric on the `EPISODE` scale. Automatically attaches `rl_step` and `environment_id` as labels so episode data can be correlated with training progress and individual environments.

```python
metrana.log_rl_episode(
    metric_name: str,
    value: float | int,
    rl_step: int,                         # Current RL training step — required
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    episode: int | None = None,           # Auto-increments per series — do not provide
    env_id: str | None = None,            # Environment identifier
    labels: dict[str, str] | None = None,
    timestamp: int | None = None,
)
```

**`episode`** is used as the step index for this series. It auto-increments — do not provide it unless restoring from a checkpoint.

### `metrana.log_rl_environment_step()`

Logs a per-environment-interaction metric on the `ENVIRONMENT_STEP` scale. Automatically attaches `episode`, `rl_step`, and `environment_id` as labels.

```python
metrana.log_rl_environment_step(
    metric_name: str,
    value: float | int,
    rl_step: int,                         # Current RL training step — required
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    env_step: int | None = None,          # Auto-increments per series — do not provide
    episode: int | None = None,           # Episode index label
    env_id: str | None = None,            # Environment identifier
    labels: dict[str, str] | None = None,
    timestamp: int | None = None,
)
```

**`env_step`** is used as the step index for this series. It auto-increments — do not provide it unless restoring from a checkpoint.

---

### Environment Renderings

`metrana.log_rendering()` accepts a single rendered frame from an RL environment and asynchronously encodes it to a per-episode H.264 `.mp4` file on the local filesystem. Frames sharing the same `(env_id, episode)` are appended to the same file; when either changes, the in-flight encoder for that `env_id` is closed and a new one is opened.

Requires the `rendering` extra (`pip install 'metrana[rendering]'`); calling `log_rendering()` without it raises `ImportError` with a message pointing at the extra. Encoding runs on a dedicated background thread and never blocks the calling thread (subject to the configured `backpressure_strategy` when the rendering queue is full).

```python
metrana.log_rendering(
    frame: np.ndarray,
    rl_step: int,
    episode: int,
    env_id: str | None = None,
)
```

`frame` must be a `uint8` numpy array of shape `(H, W, 3)` for RGB or `(H, W)` / `(H, W, 1)` for grayscale. `H` and `W` must both be even (libx264 yuv420p constraint). Frame size and colour mode are locked at the first frame of an episode and must remain consistent for the rest of that episode.
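
The constraints above can be checked before a frame is handed over. The following is a hedged sketch: `check_frame` is a hypothetical helper, not part of Metrana, and the exceptions it raises are this example's own choice rather than what the library raises. It relies only on numpy-style `.shape` and `.dtype` attributes.

```python
def check_frame(frame) -> None:
    """Raise ValueError if `frame` breaks the documented constraints.

    Hypothetical pre-flight helper; accepts any object exposing numpy-style
    `.shape` and `.dtype` attributes.
    """
    if str(frame.dtype) != "uint8":
        raise ValueError(f"expected a uint8 frame, got {frame.dtype}")

    shape = tuple(frame.shape)
    is_rgb = len(shape) == 3 and shape[2] == 3
    is_gray = len(shape) == 2 or (len(shape) == 3 and shape[2] == 1)
    if not (is_rgb or is_gray):
        raise ValueError(f"expected (H, W, 3), (H, W) or (H, W, 1), got {shape}")

    height, width = shape[0], shape[1]
    if height % 2 or width % 2:
        raise ValueError(f"H and W must both be even, got {height}x{width}")
```

This does not cover the per-episode consistency rule; tracking the first frame's shape per `(env_id, episode)` would be needed for that.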

```python
import numpy as np
import metrana

metrana.init(
    workspace_name="my-workspace",
    project_name="my-project",
    run_name="run-001",
    rendering_fps=30,
)

# `env`, `num_episodes`, `current_rl_step`, and `action` come from your own training code.
for episode in range(num_episodes):
    obs = env.reset()
    done = False
    while not done:
        frame = env.render()                 # uint8 (H, W, 3)
        metrana.log_rendering(
            frame=frame,
            rl_step=current_rl_step,
            episode=episode,
            env_id="env_0",
        )
        obs, _, done, _ = env.step(action(obs))

metrana.close()
```

Output files land at `<rendering_output_dir>/<run_name>/<env_id>_<episode>.mp4` (default base: `~/.metrana/renderings`). Episodes that produced no frames are deleted at close.
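
To locate a file after a run, the path pattern above can be rebuilt with `pathlib`. This is a sketch under assumptions: `expected_rendering_path` is a hypothetical helper, and it assumes `env_id` and `episode` appear in the file name verbatim (no padding or escaping) and that `env_id` was passed explicitly.

```python
from pathlib import Path


def expected_rendering_path(run_name, env_id, episode, rendering_output_dir=None):
    """Build <rendering_output_dir>/<run_name>/<env_id>_<episode>.mp4.

    Hypothetical helper. Falls back to ~/.metrana/renderings when no
    output directory is given, mirroring the documented default.
    """
    if rendering_output_dir is None:
        base = Path.home() / ".metrana" / "renderings"
    else:
        base = Path(rendering_output_dir)
    return base / run_name / f"{env_id}_{episode}.mp4"
```

For example, `expected_rendering_path("run-001", "env_0", 3, "/tmp/out")` yields `/tmp/out/run-001/env_0_3.mp4` on a POSIX system.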

**Concurrent encoders.** `rendering_max_concurrent_encoders` caps the number of encoders open at any one time (default `1`). Frames for additional `env_id`s beyond the cap are dropped, blocked, or rejected with an exception, according to `backpressure_strategy`. Raise this value with care: encoding is CPU-bound, and PyAV/libx264 already use internal threads for the work.

**FPS.** `rendering_fps` is locked for the run. It cannot be changed mid-run.

---

## Labels

Labels are key-value pairs that identify a series. Two calls with different label sets create two independent series. This is intentional for splitting data by environment, agent, or other dimension — but means that labels whose values change on every call will create a new series each time, which is almost never what you want.

Use labels to split data along dimensions you want to filter or aggregate over (e.g. `environment_id`). For indexing within a series, rely on the auto-incrementing step.
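
The cost of a label that changes on every call can be seen by counting distinct series. This is a small illustrative model, not Metrana code: `count_series` is a hypothetical function and the `attempt` label is made up for the example.

```python
def count_series(calls):
    """Count distinct series among (metric_name, scale, labels) log calls."""
    return len(
        {(name, scale, frozenset(labels.items())) for name, scale, labels in calls}
    )


# Stable label values: one series per environment.
stable = [("reward", "EPISODE", {"environment_id": f"env_{i % 2}"}) for i in range(100)]

# A label whose value changes on every call: one series per call.
changing = [("reward", "EPISODE", {"attempt": str(i)}) for i in range(100)]

print(count_series(stable))    # 2
print(count_series(changing))  # 100
```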

## Metric Scales

Scales define the x-axis semantics of a series. The specialised RL helpers fix the scale automatically; only use `scale` on `log()` directly when the helpers do not apply.

| Scale | Use when |
|---|---|
| `ML_STEP` | One entry per gradient update / training step (default) |
| `EPISODE` | One entry per RL episode |
| `ENVIRONMENT_STEP` | One entry per RL environment interaction |

The scale can be passed as a string or via `metrana.StandardMetricScale`:

```python
from metrana import StandardMetricScale
metrana.log("reward", reward, scale=StandardMetricScale.EPISODE)
```

## Aggregation Rules

Aggregation rules tell the ingestion worker how to derive new series from existing ones. They are declared once at run creation and applied automatically as data arrives.

NOTE: Aggregation rules are currently disabled on the backend.

```python
from metrana import AggregationRule, AggregationFn

metrana.init(
    ...,
    aggregation_rules=[
        # Mean and max reward collapsed across environments.
        # aggregate_over_labels=["environment_id"] strips environment_id from
        # the output, merging all per-environment series into one.
        AggregationRule(
            source_scale="EPISODE",
            output_scale="EPISODE",
            fns=[AggregationFn.AGGREGATION_FN_MEAN, AggregationFn.AGGREGATION_FN_MAX],
            aggregate_over_labels=["environment_id"],
            output_name_suffix="/across_envs",
        ),
        # Min and sum of a specific metric per episode
        AggregationRule(
            metric_name="reward",
            source_scale="EPISODE",
            output_scale="EPISODE",
            fns=[AggregationFn.AGGREGATION_FN_MIN, AggregationFn.AGGREGATION_FN_SUM],
            output_metric_name="reward/final",
        ),
    ],
)
```

### Rule fields

| Field | Type | Description |
|---|---|---|
| `metric_name` | `str \| None` | Metric to apply the rule to. If absent, applies to every metric matching `source_scale` and `aggregate_over_labels` |
| `source_scale` | `str` | Scale of the source series (e.g. `"EPISODE"`, `"ENVIRONMENT_STEP"`) |
| `output_scale` | `str` | Scale of the derived output series |
| `fns` | `list[AggregationFn]` | Aggregation functions to apply. Each function produces a separate output series. At least one required. |
| `aggregate_over_labels` | `list[str]` | Labels to aggregate over and strip from the output. Series that share the same values for all *other* labels are merged together, and these labels disappear from the result. Empty list merges all matching series unconditionally. |
| `output_metric_name` | `str \| None` | Output series name. Only valid when `metric_name` is set; defaults to `metric_name` |
| `output_name_suffix` | `str \| None` | Suffix appended to each source metric name when `metric_name` is absent. Ignored when both `metric_name` and `output_metric_name` are set |
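
The effect of `aggregate_over_labels` can be sketched in plain Python. This is an illustrative model only, not the backend's implementation (which is currently disabled): `aggregate` is a hypothetical function, and grouping values by step index is an assumption made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(points, aggregate_over_labels, fns):
    """Illustrative model of `aggregate_over_labels`.

    `points` is an iterable of (step, labels, value). Labels listed in
    `aggregate_over_labels` are stripped; points that agree on the step
    and on all remaining labels are merged into one group.
    """
    groups = defaultdict(list)
    for step, labels, value in points:
        kept = frozenset(
            (k, v) for k, v in labels.items() if k not in aggregate_over_labels
        )
        groups[(step, kept)].append(value)
    # Each function produces its own output per group.
    return {
        key: {name: fn(values) for name, fn in fns.items()}
        for key, values in groups.items()
    }


points = [
    (0, {"environment_id": "env_0", "agent": "a"}, 1.0),
    (0, {"environment_id": "env_1", "agent": "a"}, 3.0),
    (0, {"environment_id": "env_0", "agent": "b"}, 10.0),
]
result = aggregate(points, ["environment_id"], {"mean": mean, "max": max})
print(result[(0, frozenset({("agent", "a")}))])  # {'mean': 2.0, 'max': 3.0}
```

The two `agent="a"` points merge because they differ only in `environment_id`; the `agent="b"` point stays in its own group.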

### Aggregation functions

| Value | Description |
|---|---|
| `AggregationFn.AGGREGATION_FN_MEAN` | Mean of values in the group |
| `AggregationFn.AGGREGATION_FN_MAX` | Maximum value in the group |
| `AggregationFn.AGGREGATION_FN_SUM` | Sum of values in the group |
| `AggregationFn.AGGREGATION_FN_MIN` | Minimum value in the group |
| `AggregationFn.AGGREGATION_FN_STD_DEV` | Standard deviation of values in the group |
| `AggregationFn.AGGREGATION_FN_COUNT` | Count of values in the group |

## Strategies

### Backpressure strategy

Controls what happens when the internal event queue is full.

| Value | Behaviour |
|---|---|
| `DropNew` | Silently discard the incoming event (default) |
| `Block` | Block the calling thread until space is available |
| `Raise` | Raise `MetranaEventQueueFullError` |
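
The three behaviours can be modelled with the standard library's bounded `queue.Queue`. This is a sketch of the semantics in the table, not Metrana's internals: `enqueue` is a hypothetical function and `EventQueueFull` is a stand-in for `MetranaEventQueueFullError`.

```python
import queue


class EventQueueFull(Exception):
    """Stand-in for MetranaEventQueueFullError in this sketch."""


def enqueue(q, event, strategy="DropNew"):
    """Return True if the event was queued, False if it was dropped."""
    if strategy not in {"DropNew", "Block", "Raise"}:
        raise ValueError(f"unknown backpressure strategy: {strategy}")
    if strategy == "Block":
        q.put(event)  # waits until a consumer frees a slot
        return True
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        if strategy == "Raise":
            raise EventQueueFull("event queue is full") from None
        return False  # DropNew: discard the incoming event
```

`DropNew` keeps the training loop fast at the cost of lost points, while `Block` trades throughput for completeness.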

### Error strategy

Controls how API errors are surfaced to the caller.

| Value | Behaviour |
|---|---|
| `Silent` | Ignore errors |
| `Warn` | Log a warning and continue (default) |
| `RaiseOnLog` | Raise on the next `log()` call if errors have occurred |
| `RaiseOnClose` | Raise on `close()` if errors have occurred |

### Resume strategy

Controls what happens when a run with the same name already exists.

| Value | Behaviour |
|---|---|
| `Allow` | Create a new run or resume an existing one (default) |
| `Never` | Always create a new run; raise if it already exists |

### Close strategy

Controls how pending events are handled on shutdown.

| Value | Behaviour |
|---|---|
| `Immediate` | Shut down immediately, discarding pending events |
| `CompletePending` | Complete API requests already in flight, but discard events still queued (default) |
| `CompleteAll` | Wait for all queued events including those not yet dispatched |
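
Because pending events are handled at `close()`, it can help to guarantee the call happens even when training raises. The following is a minimal sketch: `tracked_run` is a hypothetical wrapper, not part of Metrana, and it takes the init and close functions as arguments so that it stays independent of the library.

```python
from contextlib import contextmanager


@contextmanager
def tracked_run(init_fn, close_fn, **init_kwargs):
    """Call init_fn on entry and always call close_fn on exit.

    Hypothetical wrapper. If init_fn itself raises, close_fn is not
    called, since there is nothing to shut down.
    """
    init_fn(**init_kwargs)
    try:
        yield
    finally:
        close_fn()
```

Intended usage would look like `with tracked_run(metrana.init, metrana.close, workspace_name="my-workspace", project_name="my-project", run_name="run-001"): ...`, with the training loop inside the `with` block.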

## Environment Variables

All strategies and several other settings can be configured without code changes:

| Variable | Default | Accepted values |
|---|---|---|
| `METRANA_API_KEY` | — | Your API key |
| `METRANA_BACKPRESSURE_STRATEGY` | `DropNew` | `DropNew`, `Block`, `Raise` |
| `METRANA_ERROR_MODES` | `Warn` | `Silent`, `Warn`, `RaiseOnLog`, `RaiseOnClose` |
| `METRANA_RESUME_STRATEGY` | `Allow` | `Allow`, `Never` |
| `METRANA_CLOSE_STRATEGY` | `CompletePending` | `Immediate`, `CompletePending`, `CompleteAll` |
| `METRANA_LOG_LEVEL` | `Success` | `Trace`, `Debug`, `Info`, `Success`, `Warn`, `Error`, `Critical`, `Off` |
| `METRANA_EVENT_QUEUE_MAX_SIZE` | unbounded | Integer (`0` = unbounded) |
| `METRANA_DISPATCH_QUEUE_MAX_SIZE` | unbounded | Integer (`0` = unbounded) |
| `METRANA_ERROR_QUEUE_MAX_SIZE` | unbounded | Integer (`0` = unbounded) |
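
When the variables are set from Python rather than the shell, a small validation step can catch typos before the run starts. This is a sketch under assumptions: `configure_env` is a hypothetical helper, the accepted values are copied from the table above, and it assumes the variables are set before `metrana.init()` is called (exactly when the library reads them is not stated here).

```python
import os

# Accepted values copied from the environment variable table.
ACCEPTED_VALUES = {
    "METRANA_BACKPRESSURE_STRATEGY": {"DropNew", "Block", "Raise"},
    "METRANA_ERROR_MODES": {"Silent", "Warn", "RaiseOnLog", "RaiseOnClose"},
    "METRANA_RESUME_STRATEGY": {"Allow", "Never"},
    "METRANA_CLOSE_STRATEGY": {"Immediate", "CompletePending", "CompleteAll"},
}


def configure_env(settings, environ=None):
    """Validate strategy settings, then write them into the environment.

    Hypothetical helper. Nothing is written unless every entry is valid.
    """
    target = os.environ if environ is None else environ
    for name, value in settings.items():
        accepted = ACCEPTED_VALUES.get(name)
        if accepted is None:
            raise KeyError(f"unsupported variable: {name}")
        if value not in accepted:
            raise ValueError(f"{name}={value!r} is not one of {sorted(accepted)}")
    target.update(settings)
```

Values are case-sensitive in this sketch, matching the spelling used in the table.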
