{{{icon_motherboard}}}
Server

Key hardware specifications of the server executing this step.

OS family
{{{server_info.os}}}
Allocation {{{icon_info}}}
Indicates whether this server appears to be dedicated to this Metaflow step or shared with other tasks. This assessment is based on comparing server-level resource usage with task-specific resource usage, analyzing patterns in CPU, memory, GPU, and VRAM utilization against both percentage thresholds and absolute values.
{{{server_info.allocation}}}
vCPUs
{{{server_info.vcpus}}}
Memory
{{{server_info.memory_mb_pretty}}} MiB
Storage
{{{server_info.disk_space_total_gb}}} GiB
{{#server_info.gpu_names}}
GPUs
{{{server_info.gpu_count}}} {{server_info.gpu_name}}
VRAM
{{{server_info.gpu_memory_mb_pretty}}} MiB
{{/server_info.gpu_names}}

{{{icon_clouds}}}
Cloud

Network discovery indicates the following cloud environment was utilized for this step.

Cloud Provider
{{{cloud_info.vendor}}}
Region/Datacenter
{{{cloud_info.region}}}
Instance Type
{{{cloud_info.instance_type_html}}}
{{#cloud_info.compute_costs}}
Compute Costs {{{icon_info}}}
The cost of running this cloud server for the duration of this step, not including storage, network traffic, IPv4 address prices, startup time, or any discounts.
${{{cloud_info.compute_costs}}}
{{/cloud_info.compute_costs}}

{{{icon_calculator}}}
Usage Statistics

Current and historical (up to the last five successful runs) averages, peaks, and other summaries of resource usage.

CPU
{{{stats.cpu_usage.mean}}} avg | {{{historical_stats.avg_cpu_mean}}} hist avg | {{{stats.cpu_usage.max}}} peak
Memory
{{{stats.memory_usage.mean_pretty}}} MiB avg | {{{stats.memory_usage.max_pretty}}} MiB peak | {{{historical_stats.max_memory_max_pretty}}} MiB hist peak
Duration
{{{stats.duration}}} sec
{{#server_info.gpu_names}}
GPU
{{{stats.gpu_usage.mean}}} avg | {{{historical_stats.avg_gpu_mean}}} hist avg | {{{stats.gpu_usage.max}}} peak
VRAM
{{{stats.gpu_vram.mean_pretty}}} MiB avg | {{{stats.gpu_vram.max_pretty}}} MiB peak | {{{historical_stats.max_vram_max_pretty}}} MiB hist peak
{{/server_info.gpu_names}}
Disk Space
{{{stats.disk_usage.max_pretty}}} GiB peak
Traffic
{{stats.traffic.inbound_pretty}} GB in | {{stats.traffic.outbound_pretty}} GB out

{{{icon_robot}}}
Recommendations

Based on recent average CPU usage, historical peak memory and GPU utilization.

Recommended Resources for Next Run {{{icon_info}}}
The Metaflow @resources decorator only supports specifying the number of vCPUs, the amount of memory, and the number of GPUs; there is, for example, no way to request a minimum amount of VRAM.
{{{recommended_resources}}}
Automated Tuning of Resources
Cheapest Cloud Server to Run This Step {{{icon_info}}}
Evaluated 2000+ server options across AWS, GCP, Azure, Hetzner, and UpCloud by filtering for the required number of vCPUs, memory, GPUs, and minimum VRAM, then ordering them ascending by on-demand price and selecting the first one. The price per execution is based on the current best on-demand price of the server and the current duration of the step, and does not include storage, network traffic, IPv4 address prices, startup time, or any discounts. If you are interested in more advanced recommendations, please get in touch!
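
The selection described above amounts to a filter followed by picking the lowest on-demand price. A minimal sketch, assuming a hypothetical `Server` record and helper names (`cheapest_server`, `price_per_execution`) that are illustrative only and not the actual implementation:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Server:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    vcpus: int
    memory_mb: int
    gpus: int
    vram_mb: int
    price_per_hour: float  # best on-demand price, USD


def cheapest_server(
    servers: Iterable[Server],
    vcpus: int,
    memory_mb: int,
    gpus: int = 0,
    min_vram_mb: int = 0,
) -> Optional[Server]:
    """Return the lowest-priced server meeting all requirements, or None."""
    candidates = [
        s
        for s in servers
        if s.vcpus >= vcpus
        and s.memory_mb >= memory_mb
        and s.gpus >= gpus
        and s.vram_mb >= min_vram_mb
    ]
    if not candidates:
        return None
    # Equivalent to sorting ascending by price and taking the first entry.
    return min(candidates, key=lambda s: s.price_per_hour)


def price_per_execution(server: Server, duration_sec: float) -> float:
    """Hourly on-demand price prorated to the step duration in seconds."""
    return server.price_per_hour * duration_sec / 3600
```

Storage, network traffic, IPv4 address prices, startup time, and discounts are deliberately left out of this sketch, matching the exclusions listed above.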
{{#cost_savings}}
Potential Cost Savings {{{icon_info}}}
This calculation assumes the current cloud server is dedicated to running this step and that the recommended cloud server would provide comparable performance. Savings are based on the best available on-demand pricing in supported regions and don't account for any existing discounts you may have.
{{{cost_savings.percent}}}% | ${{{cost_savings.amount}}}/execution
{{/cost_savings}}

{{{icon_cpu}}}
CPU Usage

CPU usage for both the server and specific tasks is calculated by summing the user+nice and system CPU times (in clock ticks), then dividing by the elapsed time multiplied by the number of clock ticks per second. Task CPU usage includes all child processes.
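
As a rough illustration of the calculation above, the sketch below turns tick deltas between two samples into a usage value where 1.0 corresponds to one fully used core. The function name and parameters are assumptions made for this example, not the tracker's actual code:

```python
import os
from typing import Optional


def cpu_usage(
    user_nice_ticks: float,
    system_ticks: float,
    elapsed_sec: float,
    ticks_per_sec: Optional[float] = None,
) -> float:
    """CPU usage from tick deltas collected over elapsed_sec seconds."""
    if ticks_per_sec is None:
        # Clock ticks per second as reported by the OS (Unix only).
        ticks_per_sec = os.sysconf("SC_CLK_TCK")
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return (user_nice_ticks + system_ticks) / (elapsed_sec * ticks_per_sec)
```

For example, 150 user+nice ticks and 50 system ticks over one second at 100 ticks per second yields 2.0, that is, two cores' worth of CPU time.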

{{{icon_memory}}}
Memory Usage

On Linux, used server memory is calculated as total - free - buffers - cached; on other systems, the value reported by psutil is used. Task memory usage is measured by summing the PSS (on Linux), USS (on macOS and Windows), or RSS rollups of all subprocesses.
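
The Linux formula above can be sketched as follows, assuming the four values are read from text in the `/proc/meminfo` format. The function name is hypothetical and the parsing is simplified for illustration:

```python
def used_memory_kib(meminfo_text: str) -> int:
    """Compute total - free - buffers - cached from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token after the colon is the numeric value (kB).
            fields[key.strip()] = int(parts[0])
    return (
        fields["MemTotal"]
        - fields["MemFree"]
        - fields["Buffers"]
        - fields["Cached"]
    )
```

On a Linux host this could be called with the contents of `/proc/meminfo`; the result is in KiB, the unit that file uses.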

{{{icon_disk}}}
Disk I/O Usage

Task-specific disk usage tracking is unreliable; therefore, it is recommended to monitor disk usage at the server level, encompassing all mounted disks.

{{{icon_disk}}}
Disk Space Usage

Server-level disk space usage on all mounted disks.

{{{icon_router}}}
Network Usage

Network usage is monitored solely at the server level across all interfaces.

{{#server_info.gpu_names}}

{{{icon_gpu}}}
GPU Usage

Utilization ratios reported by nvidia-smi, scaled to range from 0 to the GPU count, as a proxy for how many GPUs were fully utilized. Note that task-specific GPU usage is less reliable than server-level GPU usage and is limited to at most 4 GPUs.
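
One way to read the scaling above, assuming nvidia-smi reports a utilization percentage per GPU and the values are simply summed and divided by 100 (an assumption for this sketch, with a hypothetical function name):

```python
from typing import Sequence


def gpu_usage(utilization_percents: Sequence[float]) -> float:
    """Scale per-GPU utilization percentages to a 0..GPU-count value."""
    return sum(utilization_percents) / 100
```

With four GPUs at 100%, 50%, 0%, and 0%, this gives 1.5, roughly one and a half GPUs' worth of work.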

{{{icon_gpu}}}
GPUs in Use

Number of GPUs with utilization greater than 0, as reported by nvidia-smi. Note that task-specific GPU usage is less reliable than server-level GPU usage and is limited to at most 4 GPUs.

{{{icon_gpu}}}
VRAM Usage

Total VRAM usage summed across all GPUs, as reported by nvidia-smi. Note that task-specific GPU usage is less reliable than server-level GPU usage and is limited to at most 4 GPUs.

{{/server_info.gpu_names}}