Metadata-Version: 2.4
Name: ear-job-visualization
Version: 6.0.0
Summary: High-level support for reading and visualizing job information given by the EAR Library.
Author-email: Oriol Vidal <oriol.vidal@eas4dc.com>, Jalal Lakhlili <jalal.lakhlili@eas4dc.com>
License-Expression: EPL-2.0
Keywords: data,visualization,hpc,analysis,ear,paraver
Classifier: Programming Language :: Python :: 3
Requires-Python: >3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: importlib_resources
Requires-Dist: rich
Requires-Dist: ear_analytics_core<7.0.0,>=6.0.0
Dynamic: license-file

# ear-job-visualizer

**ear-job-visualizer** is a CLI tool written in Python that visualizes runtime data collected by the [EAR](https://gitlab.bsc.es/ear_team/ear/-/wikis/home) software. It reads application and loop signatures produced by EAR and renders them as timeline graphs, making it easy to inspect per-node performance metrics across the duration of a job.

The tool can retrieve data in two ways:
- **From the EAR Database**, by internally calling the [`eacct`](https://gitlab.bsc.es/ear_team/ear/-/wikis/EAR-commands#ear-job-accounting-eacct) command when only a job ID is provided.
- **From signature files**, by reading application and loop signature CSV files directly. These files can be generated by the [EAR CSV report plug-in](https://gitlab.bsc.es/ear_team/ear/-/wikis/Report#csv) at job runtime, or exported from the database using `eacct`.

This tool is compatible with **EAR v6**. The major version of the tool must match the major version of EAR you are collecting data from. See the [compatibility section](#ear-version-compatibility) if you are working with data from an older EAR version.

The tool currently supports two output formats:
1. **Static images** — heatmap-based timeline graphs showing job runtime metrics per node.
2. **Paraver traces** — job data converted to [Paraver Trace Format](https://tools.bsc.es/doc/1370.pdf), for use with Paraver and other BSC Tools.

You can find more information about [running jobs with EAR](https://gitlab.bsc.es/ear_team/ear/-/wikis/User%20guide#running-jobs-with-ear) and about [Paraver](https://tools.bsc.es/paraver) in their respective documentation.

## Features

- Generate static images showing runtime metrics of your job monitored by EARL.
- Generate Paraver traces to visualize runtime metrics within Paraver or any other tool from the BSC Tools team.
- Normalize CSV files from older EAR versions to the current format using `ear-normalize-csv`.

## Requirements

Python package dependencies:

- pandas
- importlib\_resources
- rich
- [ear\_analytics\_core](https://github.com/eas4dc/ear_analytics_core)

These are installed automatically when using pip. The `eacct` command is only needed if you intend to let the tool query the EAR Database directly — it is not required when providing signature files manually. See [Providing signature files](#providing-signature-files) for details.

## Installation

### Recommended: install from PyPI

The simplest way to install the tool is directly from [PyPI](https://pypi.org/project/ear-job-visualization/):

```bash
pip install ear-job-visualization
```

We recommend installing inside a virtual environment:

```bash
python -m venv my_env && source my_env/bin/activate
pip install ear-job-visualization
```

### Install from source

You need the **build** and **setuptools** packages to build and install from source. Clone the repository (or download from the [latest release](https://github.com/eas4dc/ear-job-visualization/releases/latest)), then run:

```bash
python -m venv my_env && source my_env/bin/activate
pip install -U pip && pip install build setuptools wheel
python -m build
pip install .
```

> Tool developers may want to use `pip install -e .` instead of `pip install .` to install in editable mode, avoiding reinstalls on every change.

#### Making the tool available to other users

On shared HPC systems, you can install the tool to a shared location and expose it via a module file. One approach:

1. Export [`PYTHONUSERBASE`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUSERBASE) to set a shared installation prefix.
2. Prepend `<prefix>/lib/python<version>/site-packages` to `PYTHONPATH`.
3. Prepend `<prefix>/bin` to `PATH`.

The script `create_module.py` included in this repository generates an Lmod modulefile automatically. Pass `--prefix` with the installation prefix (e.g., your virtual environment root); the version is read from the installed package metadata by default:

```bash
python create_module.py --prefix /path/to/venv
```

The script writes the modulefile to `ear-job-visualizer/<version>.lua` by default. You can override the output path with `--output`, and the Python version with `--python-version` if needed. The generated file looks like:

```lua
-- -*- lua -*-
-- Lmod modulefile for ear-job-visualizer <version>

whatis("Name:        ear-job-visualizer")
whatis("Version:     <version>")
whatis("Description: Visualisation tool for performance metrics collected by EAR.")

local prefix      = "/path/to/venv"
local python_ver  = "<python-version>"

prepend_path("PATH",       pathJoin(prefix, "bin"))
prepend_path("PYTHONPATH", pathJoin(prefix, "lib", "python" .. python_ver, "site-packages"))
```

Place the generated file in a directory on your module path and load it with `module load ear-job-visualizer`.

## EAR version compatibility

**ear-job-visualizer v6** requires data from **EAR v6**. The major version of the tool must always match the major version of EAR you collected data from.

### Using v5 CSV files

EAR v6 renamed two columns in the application-level CSV:

| EAR v5 column | EAR v6 column |
|---|---|
| `START_TIME` | `JOB_START_TIME` |
| `END_TIME` | `JOB_END_TIME` |
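
If you would rather rewrite the files once instead of carrying a custom configuration, the rename above can be applied directly with pandas. A minimal sketch (the function name and file handling are illustrative, not part of the tool; it assumes comma-separated files, so pass `sep=` to `read_csv`/`to_csv` if your installation emits a different delimiter):

```python
import pandas as pd

# Column renames introduced in EAR v6 (application-level CSV only).
V5_TO_V6 = {"START_TIME": "JOB_START_TIME", "END_TIME": "JOB_END_TIME"}

def normalize_v5_apps(path_in: str, path_out: str) -> None:
    """Rewrite a v5 apps CSV with v6 column names; other columns are untouched."""
    df = pd.read_csv(path_in)
    df.rename(columns=V5_TO_V6).to_csv(path_out, index=False)
```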

To use v5 CSV files without modifying them, export the default configuration and update the `app_info` section to use the old column names:

```bash
ear-job-visualizer --print-config > my_config.json
```

Edit the `app_info` section in `my_config.json`:

```json
"app_info": {
    "job_id": "JOBID",
    "step_id": "STEPID",
    "app_id": "APPID",
    "start_time": "START_TIME",
    "end_time": "END_TIME",
    "node_name": "NODENAME"
}
```

Then pass it with `-c my_config.json` when invoking the tool.

### Using EAR v4.3 CSV files

EAR v4.3 CSVs differ substantially from the current format: they use different application identifiers, are missing the `APPID` column, and have different column names for several metrics. Use the `ear-normalize-csv` command to convert them before visualizing. See the [dedicated section](#ear-normalize-csv) below.

## Usage

Running `ear-job-visualizer` without arguments shows:

```
usage: ear-job-visualizer [-h] [--version] [-c CONFIG_FILE]
                          (--format {runtime,ear2prv} | --print-config | --avail-metrics)
                          [--loops-file LOOPS_FILE] [--apps-file APPS_FILE]
                          [-j JOB_ID] [-s STEP_ID] [-o OUTPUT] [-k] [-t TITLE]
                          [-r] [-m metric [metric ...]]
ear-job-visualizer: error: one of the arguments --format --print-config --avail-metrics is required
```

One of `--format`, `--print-config`, or `--avail-metrics` is always required. `--format` is the one you will use most often, but new users are encouraged to read this document in order.

### `--print-config`

Prints the active [configuration](#configuration) to stdout. Redirect it to a file to use as a starting point for a custom configuration:

```bash
ear-job-visualizer --print-config > my_config.json
```

### `--avail-metrics`

Lists all metric names recognized by the tool. These are read from the configuration file:

```bash
ear-job-visualizer --avail-metrics
```

To check metrics in a custom configuration:

```bash
ear-job-visualizer --avail-metrics -c my_config.json
```

### `--format`

Requests a plotting or conversion operation. The two available formats are [`runtime`](#runtime-format) and [`ear2prv`](#ear2prv-format).

**`--job-id`** is required with both formats to identify which job to process. **`--step-id`** is also required for `runtime`, and optional for `ear2prv` — since `ear2prv` can include multiple steps and applications (e.g., a full workflow) in a single trace.

#### Querying the EAR Database directly

When no signature files are provided, the tool calls `eacct` internally to retrieve the job signatures from the EAR Database. Temporary CSV files are created during the process and removed at the end; use `--keep-csv` to retain them.

> Make sure `eacct` is on your PATH and the `EAR_ETC` environment variable is set correctly. Loading the `ear` module typically handles this. If you encounter issues, contact your system administrator to verify EAR Database access.

```bash
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> -m gflops dc_power
```
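
A quick sanity check of the environment before relying on the database path can be sketched in Python. This is a heuristic only, not the tool's own validation:

```python
import os
import shutil

def eacct_available() -> bool:
    """Heuristic check: eacct on PATH and EAR_ETC set (see the note above)."""
    return shutil.which("eacct") is not None and "EAR_ETC" in os.environ
```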

#### Providing signature files

If you already have signature files — or are working on a system without EAR Database access — you can pass them directly with `--loops-file` and `--apps-file`. **Both options must be provided together.**

Signature files can be obtained in two ways:
- Exported from the database: `eacct -j <jobid>[.stepid] -r -c <loops_file>` and `eacct -j <jobid>[.stepid] -l -c <apps_file>`.
- Generated automatically at runtime by the EARL [CSV report plug-in](https://gitlab.bsc.es/ear_team/ear/-/wikis/Report#csv), by setting `--ear-user-db=<prefix>` when submitting your job.

When using the CSV report plug-in, one file per compute node is generated (e.g., `<prefix>_<nodename>_loops.csv`). For multi-node jobs, you can collect all files into directories and pass those directories to the tool:

```bash
mkdir apps_dir && mv *_apps.csv apps_dir
mkdir loops_dir && mv *_loops.csv loops_dir

ear-job-visualizer --format <format> --job-id <job-id> \
    --loops-file loops_dir --apps-file apps_dir <format-specific-options>
```
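
Since the per-node files share one schema, they can also be concatenated into a single CSV with pandas if you prefer one file per job. This is an illustrative sketch, not something the tool requires:

```python
import glob
import pandas as pd

def merge_node_csvs(pattern: str) -> pd.DataFrame:
    """Concatenate per-node signature CSVs (e.g. '<prefix>_*_loops.csv')."""
    paths = sorted(glob.glob(pattern))
    return pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)
```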

### *runtime* format

Generates a heatmap-based timeline figure for each metric listed in the `--metrics` argument. Each figure shows one row per node, all sharing the same time axis, which makes it straightforward to compare behaviour across the cluster.

> Requires both `--job-id` and `--step-id`, as it only supports a single job step at a time.

```bash
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> -m io_mbs gflops perc_mpi
```

> Use [`--avail-metrics`](#--avail-metrics) to list all supported metric names.

The above command generates the following figures:

*Figure: OpenRadioss GFLOPS across the execution time.*

*Figure: OpenRadioss I/O rate across the execution time.*

*Figure: OpenRadioss %MPI rate across the execution time.*
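
Conceptually, each figure is built from a node-by-time grid: one row per node, one column per time bin. The sketch below shows one way to pivot loop signatures into such a grid with pandas; the column names (`NODENAME`, `TIMESTAMP`) are illustrative and this is not the tool's actual implementation:

```python
import pandas as pd

def timeline_grid(df: pd.DataFrame, metric: str, bins: int) -> pd.DataFrame:
    """Pivot loop signatures into the node-by-time grid a runtime heatmap draws.

    Rows are nodes, columns are equal-width time bins, values are the bin mean.
    """
    t0, t1 = df["TIMESTAMP"].min(), df["TIMESTAMP"].max()
    span = max(t1 - t0, 1)  # avoid division by zero for single-sample inputs
    binned = ((df["TIMESTAMP"] - t0) / span * (bins - 1)).astype(int)
    return df.assign(bin=binned).pivot_table(
        index="NODENAME", columns="bin", values=metric, aggfunc="mean"
    )
```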

#### GPU metrics

When GPU metrics are requested, the graph shows per-GPU data. GPUs with a constant zero value throughout the execution are automatically filtered out.
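
The filtering described above amounts to dropping GPU columns that are zero everywhere. A sketch with pandas (column names are illustrative, and this is not the tool's own code):

```python
import pandas as pd

def drop_idle_gpus(df: pd.DataFrame) -> pd.DataFrame:
    """Drop GPU metric columns whose value is zero for the whole execution."""
    gpu_cols = [c for c in df.columns if c.startswith("GPU")]
    idle = [c for c in gpu_cols if (df[c] == 0).all()]
    return df.drop(columns=idle)
```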

```bash
ear-job-visualizer --format runtime --job-id 69478 --step-id 0 \
    --loops-file /examples/runtime_format/69478_loops.csv \
    --apps-file /examples/runtime_format/69478_apps.csv \
    -m gpu_util gpu_power -o 69478.0.png
```

The above command generates the following figures:

![An example of GPU utilization of a single node application using just one GPU device.](/examples/runtime_format/runtime_gpu_util-69478.0.png)

![An example of GPU power consumption of a single node application using just one GPU device.](/examples/runtime_format/runtime_gpu_power-69478.0.png)

Starting from EAR 6.0, signature files can include extended GPU profiling metrics from the NVIDIA® Data Center GPU Manager ([DCGM](https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/feature-overview.html#profiling-metrics)) and the NVIDIA Management Library ([NVML](https://developer.nvidia.com/management-library-nvml)) GPM interface, including SM activity, tensor/FP64/FP32/FP16 engine activity, memory bandwidth utilization, NVLink and PCIe bandwidth, and more.

> **Note:** DCGM metrics are **not collected by default**. They must be explicitly enabled via EAR environment variables before submitting your job. See the [Extended GPU metrics](https://github.com/eas4dc/EAR/wiki/EAR-environment-variables#extended-gpu-metrics) section of the EAR documentation for details.

These metrics are available in the default tool configuration once collected. Run `--avail-metrics` to see the full list.

#### Colormap range

By default the colormap is scaled to the data range found in the input, across all nodes and GPUs. Pass `--manual-range` to use the fixed ranges defined for each metric in the [configuration](#configuration) file instead.

### *ear2prv* format

Converts EARL signature data to Paraver Trace Format. All metrics present in the input are included in the trace. `--step-id` is optional, so a single trace can cover multiple steps or an entire workflow.

The mapping to [Paraver Trace Format](https://tools.bsc.es/doc/1370.pdf) is:
- Node-level data → Paraver **task** level (Thread 1).
- GPU data → Paraver **thread** level.

Two example [Paraver configuration files](examples) are provided to get started with the output.

### `ear-normalize-csv`

This command converts EAR v4.3 CSV files to the format expected by the current version of the tool. EAR v4.3 CSVs differ substantially from the current format — they use different application identifiers, are missing the `APPID` column, and use different names for several metric columns. The normalizer handles all of these differences automatically.

```bash
ear-normalize-csv --apps-file <apps.csv> --loops-file <loops.csv> [--output-dir <dir>]
```

| Option | Description |
|---|---|
| `-a / --apps-file` | Path to the EAR v4.3 apps CSV file (required) |
| `-l / --loops-file` | Path to the EAR v4.3 loops CSV file (required) |
| `-o / --output-dir` | Directory for the output files (default: current directory) |

The command writes two normalized files — `normalized_loops_<original_name>` and `normalized_apps_<original_name>` — and prints a ready-to-use `ear-job-visualizer` invocation.

**Example:**

```bash
ear-normalize-csv -a job_apps.csv -l job_loops.csv -o normalized/

# Then use the printed command, e.g.:
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> \
    --apps-file normalized/normalized_apps_job_apps.csv \
    --loops-file normalized/normalized_loops_job_loops.csv \
    -m gflops dc_power
```

## Configuration

The tool ships with a default `config.json`. Export it, modify it, and pass it with `-c / --config-file`:

```bash
ear-job-visualizer --print-config > my_config.json
ear-job-visualizer --format runtime ... -c my_config.json
```

The configuration file has four top-level sections:

### `runtime`

Controls static image generation. Contains four subsections:

**`metrics`** — node-level metrics. Each entry key is the short name used with `-m`:

```json
"gflops": {
    "column_name": "^GFLOPS",
    "range": [3, 60],
    "step": 1,
    "display_name": "CPU GFLOPS"
}
```

- `column_name`: exact column name or regular expression matching column(s) in the CSV.
- `range`: `[min, max]` used when `--manual-range` is active.
- `step`: colormap tick spacing for `--manual-range`.
- `display_name` *(optional)*: label shown in the figure; defaults to the key name.
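
Because `column_name` may be an exact name or a regular expression, resolving a metric entry to its CSV column(s) can be pictured like this (a sketch, not the tool's code):

```python
import re
import pandas as pd

def match_metric_columns(df: pd.DataFrame, column_name: str) -> list:
    """Return the CSV columns matched by a config 'column_name' value,
    which may be an exact column name or a regex such as '^GFLOPS'."""
    pattern = re.compile(column_name)
    return [c for c in df.columns if pattern.search(c)]
```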

**`gpu_metrics`** — GPU-level metrics, same structure as `metrics`. The `column_name` field uses a regex with a GPU index capture group (e.g., `GPU(\\d)_POWER_W`). Includes standard NVML metrics as well as the extended DCGM/NVML GPM metrics collected by EAR 6.0.

**`socket_metrics`** — socket-level metrics (currently: CPU socket temperature). Same structure; `column_name` uses a socket index capture group.
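
Both `gpu_metrics` and `socket_metrics` rely on the index capture group to tell devices apart; extracting it is plain `re`. A sketch using the pattern from the example above:

```python
import re
from typing import Optional

GPU_POWER_RE = re.compile(r"GPU(\d)_POWER_W")  # pattern from the config example above

def device_index(column: str, pattern: "re.Pattern") -> Optional[int]:
    """Return the GPU/socket index captured from a column name, or None."""
    m = pattern.fullmatch(column)
    return int(m.group(1)) if m else None
```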

**`app_info`** — maps logical field names to the actual CSV column names. Edit this section to handle CSV files from different EAR versions (see [EAR version compatibility](#ear-version-compatibility)):

```json
"app_info": {
    "job_id": "JOBID",
    "step_id": "STEPID",
    "app_id": "APPID",
    "start_time": "JOB_START_TIME",
    "end_time": "JOB_END_TIME",
    "node_name": "NODENAME"
}
```
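
The point of this indirection is that code can read logical names rather than literal columns, so retargeting to another CSV layout only means editing this section. An illustrative sketch (the function is hypothetical, not part of the tool's API):

```python
# Logical-to-column mapping, mirroring the default app_info section above.
APP_INFO = {
    "job_id": "JOBID", "step_id": "STEPID", "app_id": "APPID",
    "start_time": "JOB_START_TIME", "end_time": "JOB_END_TIME",
    "node_name": "NODENAME",
}

def job_duration(row: dict) -> int:
    """Seconds between job start and end, resolved through the logical mapping."""
    return row[APP_INFO["end_time"]] - row[APP_INFO["start_time"]]
```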

### `ear2prv`

Defines which CSV columns are exported to the Paraver trace and their data types. The `job` subsection covers application-level columns; `loop` covers loop-level columns. Column names support regular expressions.

### `events` and `phases`

Define human-readable labels for EARL internal state values (policy states, optimization accuracy, application phases) that appear in the Paraver trace.

### Adding metrics not in the default configuration

Not all metrics collected by EAR 6.0 are included in the default `config.json`. To add one, export the default config, add an entry under `runtime.metrics` (or `runtime.gpu_metrics` for GPU columns), and pass the file with `-c`. The `column_name` value must match the exact column name or a regex pattern as it appears in the CSV.

For example, to add CPU instructions:

```json
"instructions": {
    "column_name": "INSTRUCTIONS",
    "range": [0, 1e12],
    "step": 1e11,
    "display_name": "Instructions"
}
```

## Contact

For any question or suggestion, contact support@ear.energy or open an issue in this repository.
