ReportsπŸ”—

Rich Table ReportπŸ”—

Rich Tables are a popular way to display benchmark results in a clear and concise manner. SimpleBench leverages the rich library to generate these tables, providing visually appealing and easy-to-read reports.

Note

Rich Table reports are just one of several reporting options available in SimpleBench. You can also generate CSV reports, graph reports, and JSON reports, each providing different perspectives on your benchmark results.

Refer to the Command-Line Options section for more details on how to generate and customize these reports.

To generate a Rich Table report, you can use an option like --rich-table.ops when running your benchmarks. For example:

Running a benchmark with a Rich Table reportπŸ”—
  python my_benchmark_script.py --rich-table.ops --progress

This command executes the benchmarks in my_benchmark_script.py and generates a Rich Table in the terminal displaying the operations-per-second results. A basic output will look something like this:

Sample Rich Table Output (operations per second)πŸ”—
                                                                   addition_benchmark
                                                                operations per second

                                           A simple addition benchmark of Python's built-in sum function.
 ┏━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┓
 ┃        ┃            ┃        ┃ Elapsed ┃             ┃               ┃            ┃            ┃            ┃             ┃                ┃        ┃
 ┃   N    ┃ Iterations ┃ Rounds ┃ Seconds ┃ mean kOps/s ┃ median kOps/s ┃ min kOps/s ┃ max kOps/s ┃ 5th kOps/s ┃ 95th kOps/s ┃ std dev kOps/s ┃  rsd%  ┃
 ┑━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━┩
 β”‚      1 β”‚    44872   β”‚      1 β”‚  0.32   β”‚    143.00   β”‚     144.00    β”‚      1.07  β”‚    153.00  β”‚    140.00  β”‚    150.00   β”‚        9.28    β”‚  6.51% β”‚
 β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Note

To avoid β€œfalse precision”, statistical results are shown to three significant digits. Due to the inherent variability of performance measurement, any further digits are typically meaningless statistical noise.

This is not an issue with SimpleBench itself, but rather a fundamental aspect of benchmarking and performance measurement in the real world.
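
For reference, the small sketch below (plain Python, not SimpleBench's actual formatting code) shows what rounding a value to three significant digits means in practice.

Illustration: rounding to three significant digitsπŸ”—
  from math import floor, log10

  def round_sig(value: float, sig: int = 3) -> float:
      """Round a value to `sig` significant digits (illustration only)."""
      if value == 0:
          return 0.0
      return round(value, sig - 1 - int(floor(log10(abs(value)))))

  print(round_sig(143212.7))   # 143000.0
  print(round_sig(0.0065321))  # 0.00653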

Note

Interpreting Outliers in Benchmark Results

In the sample output above, you may notice that the min kOps/s value is an extreme outlier, far from the mean and median. This is a realistic reflection of real-world benchmarking. System events like garbage collection, process scheduling, or I/O interrupts can cause individual iterations to be significantly slower than the typical case.

This is precisely why SimpleBench provides a full suite of statistics. Instead of relying solely on the mean, you should also consider:

  • The median, which is resistant to outliers and often gives a better sense of β€œtypical” performance.

  • The 5th and 95th percentiles, which show the range of performance for the vast majority of iterations, excluding the most extreme outliers.

  • The standard deviation (std dev) and RSD%, which quantify the level of inconsistency in the results. A high value indicates significant variability.

By providing these metrics, SimpleBench allows you to get a complete and honest picture of your code’s performance, including its variability.
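
To see this in action, the sketch below uses Python's standard statistics module (not SimpleBench's internal implementation) on made-up per-iteration throughput samples that include one very slow iteration, similar to the outlier in the sample table above.

Illustration: robust statistics in the presence of an outlierπŸ”—
  import statistics

  # Hypothetical per-iteration throughput samples in kOps/s: mostly steady
  # values plus one very slow iteration (1.07).
  samples = [147.0] * 50 + [150.0] * 30 + [143.0] * 19 + [1.07]

  mean = statistics.fmean(samples)
  median = statistics.median(samples)
  stdev = statistics.stdev(samples)      # sample standard deviation
  rsd_pct = stdev / mean * 100           # relative standard deviation (%)

  # quantiles(n=20) returns the 5th, 10th, ..., 95th percentile cut points.
  cuts = statistics.quantiles(samples, n=20)
  p5, p95 = cuts[0], cuts[-1]

  # The median and the 5th-95th percentile range are barely affected by the
  # single outlier, while the standard deviation and rsd% clearly flag the
  # inconsistency it introduces.
  print(f"mean={mean:.1f}  median={median:.1f}  5th={p5:.1f}  95th={p95:.1f}")
  print(f"std dev={stdev:.1f}  rsd%={rsd_pct:.1f}")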

Report Variations and DestinationsπŸ”—

The example above shows an operations-per-second report printed to the console. SimpleBench provides several variations:

  β€’ --rich-table: Generates tables for all result types (ops, timing, and memory).

  β€’ --rich-table.ops: Generates tables only for operations-per-second results.

  β€’ --rich-table.timing: Generates tables only for timing results.

  β€’ --rich-table.memory: Generates tables only for memory usage results.

By default, reports are displayed in the console. You can send a report to other destinations, such as the filesystem, by appending the destination name. For example, to save the report to a file instead of printing it to the terminal:

Saving a Rich Table report to the filesystemπŸ”—
  python my_benchmark_script.py --rich-table.ops filesystem

Advanced FeaturesπŸ”—

Beyond the basic fields shown above, the reports also support advanced features such as:

Parameterized Benchmarks

Including results for benchmarks that take parameters, allowing analysis of performance across different input sizes or configurations.

Custom Complexity Weightings

Including Big-O complexity weight/size annotations to help analyze how performance scales with input size.

These features make these reports a powerful tool for understanding the performance characteristics of your code in a clear and structured manner.

Parameterized BenchmarksπŸ”—

When benchmarks are parameterized, SimpleBench generates additional columns in the report for each parameter variation included in the run.

This allows you to easily compare performance across different configurations. For example, if you have a benchmark that takes an input size parameter, the report can show how performance varies with different input sizes.

See the defining_benchmarks section for more details on defining and using parameterized benchmarks.

Custom Complexity WeightingsπŸ”—

Related to parameterized benchmarks, SimpleBench allows you to specify custom complexity weightings (number/size weightings) for your benchmarks.

These weightings are included in the report as the N column value, helping you analyze how performance scales with input size and parameterization.

For example, you might specify that a benchmark set covers input sizes 1, 20, 100, and 1000; the report will then contain a row for each size, indicated in the N column.

When defining a parameterized benchmark, you can provide complexity weightings that reflect the expected performance characteristics of the benchmarked code and that are matched to the parameters being used. This helps you understand how the performance of the benchmarked code changes as the input size or other parameters vary.
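
As a rough illustration (plain Python with made-up numbers, not SimpleBench output), the sketch below shows how N values can be used to check whether time per operation grows in proportion to input size, as you would expect for roughly O(n) code.

Illustration: using N values to reason about scalingπŸ”—
  # Hypothetical results: complexity weighting N -> mean seconds per operation.
  results = {1: 7.0e-06, 20: 1.39e-04, 100: 7.1e-04, 1000: 6.9e-03}

  base_n, base_t = 1, results[1]
  for n, t in sorted(results.items()):
      # For linear (O(n)) scaling, time should grow in proportion to N,
      # so this ratio should stay close to 1.0.
      ratio = (t / base_t) / (n / base_n)
      print(f"N={n:>5}  mean s/op={t:.2e}  observed/linear ratio={ratio:.2f}")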

These advanced features make these reports a powerful tool for analyzing the performance of parameterized benchmarks and understanding the scalability of your code.

Field DefinitionsπŸ”—

The fields included in each report type are described below.

Common Report FieldsπŸ”—

The following report fields are present in all of the report types below.

N

A complexity weighting used to indicate the input size for a benchmark.

A Big-O (O(n), etc) complexity weighting. This is used to indicate the β€˜size’ of the input to a parameterized benchmark. It defaults to 1 unless overridden by the benchmark. The N value is used to help compare performance across different input sizes (if applicable) and to analyze how the function scales with different input sizes.

Iterations

The number of statistical samples taken for the benchmark.

The total number of iterations executed during the benchmark. An iteration is an execution of the benchmarked function once for statistical reporting purposes. It may be composed of multiple actual rounds to improve accuracy and precision, but is reported as a single count for the purposes of the table.

Rounds

The number of times the benchmarked function is executed within a single iteration.

The number of rounds executed during an iteration. A round is a single execution of the benchmarked function. Multiple rounds are often executed within an iteration to gather more accurate timing and performance data. They are executed in rapid succession, and their results are aggregated to produce the final metrics for an iteration.

Elapsed Seconds

The total CPU time spent executing the benchmarked code.

The total measured elapsed time in seconds for all iterations of the benchmark. This metric provides an overview of how long the benchmark took to complete. This does not include any setup or teardown time, focusing solely on the execution time of the benchmarked code. By default, this measures CPU time, not wall-clock time, to provide a more accurate representation of the code’s performance. It can be overridden to measure wall-clock time instead if so desired.
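
The distinction between CPU time and wall-clock time matters most for code that sleeps or waits on I/O. The snippet below (standard library only, not SimpleBench internals) shows how the two measurements can diverge.

Illustration: CPU time versus wall-clock timeπŸ”—
  import time

  cpu_start = time.process_time()    # CPU time consumed by this process
  wall_start = time.perf_counter()   # wall-clock time

  time.sleep(0.5)                    # sleeping consumes almost no CPU time

  print(f"CPU time elapsed:        {time.process_time() - cpu_start:.4f} s")
  print(f"Wall-clock time elapsed: {time.perf_counter() - wall_start:.4f} s")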

Operations Per SecondπŸ”—

The operations per second report provides a detailed overview of the performance of the benchmarked code in terms of how many operations it can perform per second. This is a common metric used to evaluate the efficiency of code, especially in performance-critical applications.

Output numbers are scaled to appropriate units (Ops/s, kOps/s, MOps/s, etc) for easier readability.
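
The sketch below is an assumption about how such scaling generally works, not SimpleBench's actual formatter; it picks a unit based on magnitude and shows three significant digits.

Illustration: scaling throughput values to readable unitsπŸ”—
  def scale_ops(ops_per_sec: float) -> str:
      """Pick a readable unit and format to three significant digits."""
      for factor, unit in ((1e9, "GOps/s"), (1e6, "MOps/s"), (1e3, "kOps/s")):
          if ops_per_sec >= factor:
              return f"{ops_per_sec / factor:.3g} {unit}"
      return f"{ops_per_sec:.3g} Ops/s"

  print(scale_ops(143_000.0))  # "143 kOps/s"
  print(scale_ops(962.0))      # "962 Ops/s"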

The fields always present in an operations per second report are:

mean Ops/s

The average number of operations per second.

The arithmetic mean average number of operations per second (Ops/s) performed during the benchmark. This metric is calculated by dividing the total number of operations executed by the total elapsed time, then scaling it by an appropriate factor (for example, kOps/s) for easier readability. It provides a quick overview of the benchmark’s performance.
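
As a worked example using approximate numbers from the sample table above:

Illustration: computing mean Ops/s from totalsπŸ”—
  total_operations = 44_872   # Iterations column of the sample table (Rounds = 1)
  elapsed_seconds = 0.32      # Elapsed Seconds column of the sample table

  ops_per_sec = total_operations / elapsed_seconds
  print(f"{ops_per_sec / 1e3:.3g} kOps/s")   # roughly 140 kOps/s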

median Ops/s

The 50th percentile (middle value) of operations per second.

The median (50th percentile) number of operations per second (Ops/s) performed during the benchmark. This metric represents the middle value of the Ops/s measurements collected during the benchmark, providing a robust measure of central tendency that is less affected by outliers compared to the mean.

min Ops/s

The lowest (worst) performance recorded across all iterations.

The minimum number of operations per second (Ops/s) recorded during the benchmark. This metric indicates the lowest performance observed during the benchmark runs, which can be useful for identifying potential bottlenecks or performance issues.

max Ops/s

The highest (best) performance recorded across all iterations.

The maximum number of operations per second (Ops/s) recorded during the benchmark. This metric indicates the highest performance observed during the benchmark runs, showcasing the best-case scenario for the benchmarked code.

5th Ops/s

The 5th percentile of operations per second.

The 5th percentile number of operations per second (Ops/s) recorded during the benchmark. This metric indicates that 5% of the Ops/s measurements were below this value, providing insight into the lower end of the typical performance distribution.

95th Ops/s

The 95th percentile of operations per second.

The 95th percentile number of operations per second (Ops/s) recorded during the benchmark. This metric indicates that 95% of the Ops/s measurements were below this value, providing insight into the upper end of the typical performance distribution.

std dev Ops/s

A measure of the variation or inconsistency in performance.

The standard deviation of the operations per second (Ops/s) measurements collected during the benchmark. This metric quantifies the amount of variation or dispersion in the Ops/s values, providing insight into the consistency of the benchmark’s performance. A lower standard deviation indicates more consistent performance, while a higher standard deviation suggests greater variability in the results.

rsd%

A normalized measure of performance inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean and multiplying by 100. It provides a normalized measure of variability, allowing for easier comparison of consistency across different benchmarks or parameter configurations. A lower RSD% indicates more consistent performance relative to the mean, while a higher RSD% suggests greater variability in the results.

TimingπŸ”—

A timing report focuses on the time taken to execute the benchmarked code, rather than the number of operations per second. It provides insights into the average time per operation and other timing-related statistics.

Output numbers are scaled to appropriate units (seconds, milliseconds, microseconds, etc) for easier readability.

The fields in this report include:

mean s/op

The average time in seconds per operation.

The arithmetic mean average time in seconds per operation (s/op). This metric is calculated by dividing the total elapsed time by the total number of operations. It provides a direct measure of how long a single operation takes on average.
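
Continuing the same worked example with approximate numbers from the earlier sample table, the time per operation is essentially the reciprocal of the throughput figure:

Illustration: computing mean seconds per operationπŸ”—
  total_operations = 44_872
  elapsed_seconds = 0.32

  s_per_op = elapsed_seconds / total_operations
  print(f"{s_per_op * 1e6:.3g} us/op")   # roughly 7.13 microseconds per operation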

median s/op

The 50th percentile (middle value) of seconds per operation.

The median (50th percentile) time in seconds per operation. This metric represents the middle value of the timing measurements, providing a robust measure of central tendency that is less affected by unusually fast or slow iterations (outliers).

min s/op

The lowest (fastest) time per operation recorded across all iterations.

The minimum time in seconds per operation recorded during the benchmark. This metric indicates the best-case performance observed, showcasing the fastest execution time for a single operation.

max s/op

The highest (slowest) time per operation recorded across all iterations.

The maximum time in seconds per operation recorded during the benchmark. This metric indicates the worst-case performance observed, which can be useful for identifying potential bottlenecks or performance stalls.

5th s/op

The 5th percentile of seconds per operation. 5% of iterations were faster than this.

The 5th percentile time in seconds per operation. This metric indicates that 5% of the timing measurements were faster than this value, providing insight into the best-case end of the performance distribution.

95th s/op

The 95th percentile of seconds per operation. 95% of iterations were faster than this.

The 95th percentile time in seconds per operation. This metric indicates that 95% of the timing measurements were faster than this value, providing insight into the typical worst-case performance, excluding extreme outliers.

std dev s/op

A measure of the variation or inconsistency in the time per operation.

The standard deviation of the seconds per operation (s/op) measurements. This metric quantifies the amount of variation in the timing values. A lower standard deviation indicates more consistent, predictable execution times.

rsd%

A normalized measure of timing inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean time. It provides a normalized measure of variability, allowing for easier comparison of timing consistency across different benchmarks.

Memory UsageπŸ”—

A memory usage Rich Table report provides information about the memory consumption of the benchmarked code. It includes statistics on average and peak memory usage during the benchmark runs. Output numbers are scaled to appropriate units (bytes, kB, MB, etc) for easier readability.

For a memory usage Rich Table report, two tables are generated: one for average memory usage and another for peak memory usage. The key fields in these tables include:

mean bytes

The average memory allocated per operation, in bytes.

The arithmetic mean average memory allocated per operation. This metric provides a general overview of the benchmark’s memory footprint under typical execution.

median bytes

The 50th percentile (middle value) of memory allocated per operation.

The median (50th percentile) of memory allocated per operation. This provides a robust measure of the typical memory usage that is less affected by iterations with unusually high or low memory consumption.

min bytes

The minimum memory allocated per operation across all iterations.

The minimum memory allocated per operation recorded during the benchmark. This metric indicates the lowest memory footprint observed, representing the best-case scenario for memory efficiency.

max bytes

The maximum memory allocated per operation across all iterations.

The maximum memory allocated per operation recorded during the benchmark. This metric indicates the highest memory footprint observed, which is crucial for understanding peak memory demand and potential memory-related issues.

5th bytes

The 5th percentile of memory allocated per operation.

The 5th percentile of memory allocated per operation. This metric indicates that 5% of the iterations used less memory than this value, providing insight into the lower end of the memory usage distribution.

95th bytes

The 95th percentile of memory allocated per operation.

The 95th percentile of memory allocated per operation. This metric indicates that 95% of the iterations used less memory than this value, which is useful for understanding the typical upper bound of memory usage, excluding extreme outliers.

std dev bytes

A measure of the variation in memory allocation per operation.

The standard deviation of the memory allocation measurements. This metric quantifies the amount of variation in memory usage across iterations. A lower value indicates more consistent and predictable memory behavior.

rsd%

A normalized measure of memory usage inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean memory usage. It provides a normalized measure of variability, allowing for easier comparison of memory consistency across different benchmarks.

CSV ReportπŸ”—

CSV (Comma-Separated Values) reports are designed for machine readability and are ideal for importing benchmark results into spreadsheet applications like Microsoft Excel or Google Sheets, or for analysis with data processing tools like Pandas.

To generate a CSV report, you can use an option like --csv.ops when running your benchmarks. By default, this will save a CSV file to the output directory (which defaults to .benchmarks).

Here is an example of how to run a benchmark script and generate a CSV report:

Running a benchmark with a CSV reportπŸ”—
  python my_benchmark_script.py --csv.ops

This command executes the benchmarks in my_benchmark_script.py and saves a CSV file containing the operations-per-second results. A basic CSV output file will look something like this:

Sample CSV OutputπŸ”—
  # title: addition_benchmark
  # description: A simple addition benchmark of Python's built-in sum function.
  # unit: Ops/s
  N,Iterations,Rounds,Elapsed Seconds,mean (Ops/s),median (Ops/s),min (Ops/s),max (Ops/s),5th (Ops/s),95th (Ops/s),std dev (Ops/s),rsd (%)
  1,42761,1,0.29235192800000004,148000.0,150000.0,962.0,154000.0,144000.0,151000.0,8220.0,5.55

Which corresponds to the following table:

  N                    1
  Iterations           42761
  Rounds               1
  Elapsed Seconds      0.29235192800000004
  mean (Ops/s)         148000.0
  median (Ops/s)       150000.0
  min (Ops/s)          962.0
  max (Ops/s)          154000.0
  5th (Ops/s)          144000.0
  95th (Ops/s)         151000.0
  std dev (Ops/s)      8220.0
  rsd (%)              5.55

Note

Interpreting Outliers in Benchmark Results

In the sample output, you may notice that the min (Ops/s) value is an extreme outlier. This is a realistic reflection of real-world benchmarking, where system events like garbage collection can cause individual iterations to be significantly slower. This is why SimpleBench provides a full suite of statistics like the median (which is resistant to outliers) and RSD% (which quantifies inconsistency) to help you get a complete and honest picture of your code’s performance.

Note

To avoid β€œfalse precision”, statistical results are output with three significant digits. Due to the inherent variability of performance measurement, any further digits are typically meaningless statistical noise.

This does not apply to the raw timing measurements (e.g., Elapsed Seconds), which are reported in full precision for accuracy.

This is not an issue with SimpleBench itself, but rather a fundamental aspect of benchmarking and performance measurement in the real world.

Report Variations and DestinationsπŸ”—

SimpleBench provides several variations of the CSV report:

  • --csv: Generates a CSV file with all result types (ops, timing, and memory).

  • --csv.ops: Generates a CSV file only for operations-per-second results.

  • --csv.timing: Generates a CSV file only for timing results.

  • --csv.memory: Generates a CSV file only for memory usage results.

By default, CSV reports are saved to the filesystem. You can send a report to other destinations, such as the console, by appending the destination name. For example, to print the CSV content directly to the terminal:

Printing a CSV report to the consoleπŸ”—
  python my_benchmark_script.py --csv.ops console

The generated files are named based on the benchmarked function name and report type. To prevent collisions between identical benchmark names, a numeric prefix is added to ensure uniqueness.

Examples

Output file names for different CSV report typesπŸ”—
  001_addition_benchmark-memory_usage.csv
  001_addition_benchmark-peak_memory_usage.csv
  001_addition_benchmark-operations_per_second.csv
  001_addition_benchmark-timing.csv

Advanced FeaturesπŸ”—

Parameterized BenchmarksπŸ”—

When running parameterized benchmarks, the CSV report includes additional columns for each parameter variation. This makes it easy to sort, filter, and analyze how different input parameters affect performance.

For more information on creating parameterized benchmarks, see the ../advanced_usage documentation.

Custom Complexity WeightingsπŸ”—

The N column in the CSV report represents the complexity weighting of the benchmark. This is particularly useful for analyzing the scalability of your code with different input sizes.

For more details on how to use this feature, see the ../advanced_usage documentation.

CSV Field DefinitionsπŸ”—

Each report contains a set of comments and fields that provide detailed information about the benchmark results.

Metadata Comment LinesπŸ”—

At the top of each report, there are comment lines that provide metadata about the report generation. These comments are prefixed with a # character and include important information for understanding the context of the report.

Many CSV parsers can be configured to skip comment lines, but the comments remain useful for anyone reading the report directly or for tools that process the report files.

Spreadsheet applications will typically import them as ordinary rows, which still provides context about the data contained in the report.
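
For example, the sketch below uses pandas (mentioned above as a typical analysis tool) to skip the metadata comments while loading a report; the file path is the hypothetical example file name used earlier on this page.

Loading a CSV report with pandasπŸ”—
  import pandas as pd

  # comment="#" tells pandas to skip the metadata comment lines at the top.
  df = pd.read_csv(
      ".benchmarks/001_addition_benchmark-operations_per_second.csv",
      comment="#",
  )
  print(df[["N", "Iterations", "mean (Ops/s)", "rsd (%)"]])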

# title:

The title of the report. This is the name of the benchmarked function.

# description:

A brief description of the benchmarked function, if provided. This is generated from the docstring of the benchmark function.

# unit:

The unit of measurement for the report (e.g., Ops/s, seconds, bytes).

Example of Metadata Comment Lines

Example Metadata CommentsπŸ”—
  # title: addition_benchmark
  # description: A simple addition benchmark of Python's built-in sum function.
  # unit: Ops/s

First Line Of DataπŸ”—

The first line of the report data contains the column headers that label each field in the report. These headers provide context for the numerical data that follows.

Example of First Line of Data

Example First Line of Report DataπŸ”—
  N,Iterations,Rounds,Elapsed Seconds,mean (Ops/s),median (Ops/s),min (Ops/s),max (Ops/s),5th (Ops/s),95th (Ops/s),std dev (Ops/s),rsd (%)

Common Report FieldsπŸ”—

The following report fields are present in all of the report types below.

N

A complexity weighting used to indicate the input size for a benchmark.

A Big-O (O(n), etc) complexity weighting. This is used to indicate the β€˜size’ of the input to a parameterized benchmark. It defaults to 1 unless overridden by the benchmark. The N value is used to help compare performance across different input sizes (if applicable) and to analyze how the function scales with different input sizes.

Iterations

The number of statistical samples taken for the benchmark.

The total number of iterations executed during the benchmark. An iteration is an execution of the benchmarked function once for statistical reporting purposes. It may be composed of multiple actual rounds to improve accuracy and precision, but is reported as a single count for the purposes of the table.

Rounds

The number of times the benchmarked function is executed within a single iteration.

The number of rounds executed during an iteration. A round is a single execution of the benchmarked function. Multiple rounds are often executed within an iteration to gather more accurate timing and performance data. They are executed in rapid succession, and their results are aggregated to produce the final metrics for an iteration.

Elapsed Seconds

The total CPU time spent executing the benchmarked code.

The total measured elapsed time in seconds for all iterations of the benchmark. This metric provides an overview of how long the benchmark took to complete. This does not include any setup or teardown time, focusing solely on the execution time of the benchmarked code. By default, this measures CPU time, not wall-clock time, to provide a more accurate representation of the code’s performance. It can be overridden to measure wall-clock time instead if so desired.

Operations Per SecondπŸ”—

The operations per second report provides a detailed overview of the performance of the benchmarked code in terms of how many operations it can perform per second. This is a common metric used to evaluate the efficiency of code, especially in performance-critical applications.

Output numbers are scaled to appropriate units (Ops/s, kOps/s, MOps/s, etc) for easier readability.

The fields always present in an operations per second report are:

mean (Ops/s)

The average number of operations per second.

The arithmetic mean average number of operations per second (Ops/s) performed during the benchmark. This metric is calculated by dividing the total number of operations executed by the total elapsed time, then scaling it by an appropriate factor (for example, kOps/s) for easier readability. It provides a quick overview of the benchmark’s performance.

median (Ops/s)

The 50th percentile (middle value) of operations per second.

The median (50th percentile) number of operations per second (Ops/s) performed during the benchmark. This metric represents the middle value of the Ops/s measurements collected during the benchmark, providing a robust measure of central tendency that is less affected by outliers compared to the mean.

min (Ops/s)

The lowest (worst) performance recorded across all iterations.

The minimum number of operations per second (Ops/s) recorded during the benchmark. This metric indicates the lowest performance observed during the benchmark runs, which can be useful for identifying potential bottlenecks or performance issues.

max (Ops/s)

The highest (best) performance recorded across all iterations.

The maximum number of operations per second (Ops/s) recorded during the benchmark. This metric indicates the highest performance observed during the benchmark runs, showcasing the best-case scenario for the benchmarked code.

5th (Ops/s)

The 5th percentile of operations per second.

The 5th percentile number of operations per second (Ops/s) recorded during the benchmark. This metric indicates that 5% of the Ops/s measurements were below this value, providing insight into the lower end of the typical performance distribution.

95th (Ops/s)

The 95th percentile of operations per second.

The 95th percentile number of operations per second (Ops/s) recorded during the benchmark. This metric indicates that 95% of the Ops/s measurements were below this value, providing insight into the upper end of the typical performance distribution.

std dev (Ops/s)

A measure of the variation or inconsistency in performance.

The standard deviation of the operations per second (Ops/s) measurements collected during the benchmark. This metric quantifies the amount of variation or dispersion in the Ops/s values, providing insight into the consistency of the benchmark’s performance. A lower standard deviation indicates more consistent performance, while a higher standard deviation suggests greater variability in the results.

rsd (%)

A normalized measure of performance inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean and multiplying by 100. It provides a normalized measure of variability, allowing for easier comparison of consistency across different benchmarks or parameter configurations. A lower RSD% indicates more consistent performance relative to the mean, while a higher RSD% suggests greater variability in the results.

TimingπŸ”—

A timing report focuses on the time taken to execute the benchmarked code, rather than the number of operations per second. It provides insights into the average time per operation and other timing-related statistics.

Output numbers are scaled to appropriate units (seconds, milliseconds, microseconds, etc) for easier readability.

The fields in this report include:

mean (s)

The average time in seconds per operation.

The arithmetic mean average time in seconds per operation (s/op). This metric is calculated by dividing the total elapsed time by the total number of operations. It provides a direct measure of how long a single operation takes on average.

median (s)

The 50th percentile (middle value) of seconds per operation.

The median (50th percentile) time in seconds per operation. This metric represents the middle value of the timing measurements, providing a robust measure of central tendency that is less affected by unusually fast or slow iterations (outliers).

min (s)

The lowest (fastest) time per operation recorded across all iterations.

The minimum time in seconds per operation recorded during the benchmark. This metric indicates the best-case performance observed, showcasing the fastest execution time for a single operation.

max (s)

The highest (slowest) time per operation recorded across all iterations.

The maximum time in seconds per operation recorded during the benchmark. This metric indicates the worst-case performance observed, which can be useful for identifying potential bottlenecks or performance stalls.

5th (s)

The 5th percentile of seconds per operation. 5% of iterations were faster than this.

The 5th percentile time in seconds per operation. This metric indicates that 5% of the timing measurements were faster than this value, providing insight into the best-case end of the performance distribution.

95th (s)

The 95th percentile of seconds per operation. 95% of iterations were faster than this.

The 95th percentile time in seconds per operation. This metric indicates that 95% of the timing measurements were faster than this value, providing insight into the typical worst-case performance, excluding extreme outliers.

std dev (s)

A measure of the variation or inconsistency in the time per operation.

The standard deviation of the seconds per operation (s) measurements. This metric quantifies the amount of variation in the timing values. A lower standard deviation indicates more consistent, predictable execution times.

rsd (%)

A normalized measure of timing inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean time. It provides a normalized measure of variability, allowing for easier comparison of timing consistency across different benchmarks.

Memory UsageπŸ”—

A memory usage report provides information about the memory consumption of the benchmarked code. It includes statistics on average and peak memory usage during the benchmark runs. Output numbers are scaled to appropriate units (bytes, kB, MB, etc) for easier readability.

For a memory usage report, two CSV files are generated: one for average memory usage and another for peak memory usage, with file names ending in -memory_usage.csv and -peak_memory_usage.csv respectively.

The fields in these tables are the same, with the distinction being whether they refer to average or peak memory usage.

The fields always present in a memory usage report are:

mean (bytes)

The average memory allocated per operation, in bytes.

The arithmetic mean average memory allocated per operation. This metric provides a general overview of the benchmark’s memory footprint under typical execution.

median (bytes)

The 50th percentile (middle value) of memory allocated per operation.

The median (50th percentile) of memory allocated per operation. This provides a robust measure of the typical memory usage that is less affected by iterations with unusually high or low memory consumption.

min (bytes)

The minimum memory allocated per operation across all iterations.

The minimum memory allocated per operation recorded during the benchmark. This metric indicates the lowest memory footprint observed, representing the best-case scenario for memory efficiency.

max (bytes)

The maximum memory allocated per operation across all iterations.

The maximum memory allocated per operation recorded during the benchmark. This metric indicates the highest memory footprint observed, which is crucial for understanding peak memory demand and potential memory-related issues.

5th (bytes)

The 5th percentile of memory allocated per operation.

The 5th percentile of memory allocated per operation. This metric indicates that 5% of the iterations used less memory than this value, providing insight into the lower end of the memory usage distribution.

95th (bytes)

The 95th percentile of memory allocated per operation.

The 95th percentile of memory allocated per operation. This metric indicates that 95% of the iterations used less memory than this value, which is useful for understanding the typical upper bound of memory usage, excluding extreme outliers.

std dev (bytes)

A measure of the variation in memory allocation per operation.

The standard deviation of the memory allocation measurements. This metric quantifies the amount of variation in memory usage across iterations. A lower value indicates more consistent and predictable memory behavior.

rsd (%)

A normalized measure of memory usage inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean memory usage. It provides a normalized measure of variability, allowing for easier comparison of memory consistency across different benchmarks.