Examples on POWER

In the definitions/power/examples directory of the Microprobe distribution (available if you installed the microprobe_target_power package), you will find several examples showing how to use Microprobe for the POWER architecture. Although the examples are split by architecture, the concepts they introduce are common to all architectures.

We recommend that users go through the code of these examples to understand the specific details of how to use the framework.

isa_power_v206_info.py

The first example is isa_power_v206_info.py. It shows how to search for architecture definitions (e.g. the ISA properties), how to import a definition and how to dump it. Executing:

> ./isa_power_v206_info.py

will generate the following output, which shows all the details of the POWER v2.06 architecture (first and last 20 lines for brevity):

--------------------------------------------------------------------------------
ISA Name: power_v206
ISA Description: power_v206
--------------------------------------------------------------------------------
Register Types:
     GPR: General Register (bit size: 64)
    VSCR: Vector Status and Control Register (bit size: 32)
     FPR: Floating-Point Register (bit size: 64)
     SPR: Special Purpose Register (64 bits) (bit size: 64)
      VR: Vector Register (bit size: 128)
     MSR: Machine State Register (bit size: 64)
   SPR32: Special Purpose Register (32 bits) (bit size: 32)
     VSR: Vector Scalar Register (bit size: 128)
   FPSCR: Floating-Point Status and Control Register (bit size: 32)
      CR: Condition Register (bit size: 4)
--------------------------------------------------------------------------------
Architected registers:
    AESR : AESR Register (Type: SPR)
    AMOR : AMOR Register (Type: SPR)
     AMR : Authority Mask Register (Type: SPR)
...
	access_storage              :	False	(Boolean indicating if the instruction has storage operands                                                          )
	access_storage_with_update  :	False	(Boolean indicating if the instruction accesses to storage and updates the source register with the generated address)
	algebraic                   :	False	(Boolean indicating if operation uses algebraic rules to keep values                                                 )
	branch                      :	False	(Boolean indicating if the instruction is a branch                                                                   )
	branch_conditional          :	False	(Boolean indicating if the instruction is a branch conditional                                                       )
	branch_relative             :	False	(Boolean indicating if the instruction is a relative branch                                                          )
	category                    :	VSX  	(String indicating if the instruction the instruction category                                                       )
	decimal                     :	False	(Boolean indication if the instruction requires inputs in decimal format                                             )
	disable_asm                 :	False	(Boolean indicating if ASM generation is disabled for the instruction. If so, binary codification is used.           )
	hypervisor                  :	False	(Boolean indicating if the instruction need hypervisor mode                                                          )
	privileged                  :	False	(Boolean indicating if the instruction is privileged                                                                 )
	privileged_optional         :	False	(Boolean indicating the instrucion is priviledged or not depending on the input values                               )
	switching                   :	None 	(Input values required to maximize the computational switching                                                       )
	syscall                     :	False	(Boolean indicating if the instruction is a syscall or return from one                                               )
	trap                        :	False	(Boolean indicating if the instruction is a trap                                                                     )


 Instructions defined: 938 
 Variants defined: 964 
--------------------------------------------------------------------------------

The following code is what has been executed:

 1#!/usr/bin/env python
 2# Copyright 2011-2021 IBM Corporation
 3#
 4# Licensed under the Apache License, Version 2.0 (the "License");
 5# you may not use this file except in compliance with the License.
 6# You may obtain a copy of the License at
 7#
 8# http://www.apache.org/licenses/LICENSE-2.0
 9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16isa_power_v206_info.py
17
18Example module to show how to access ISA definitions.
19"""
20
21# Futures
22from __future__ import absolute_import, print_function
23
24# Built-in modules
25import os
26
27# Own modules
28from microprobe.target.isa import find_isa_definitions, import_isa_definition
29
30__author__ = "Ramon Bertran"
31__copyright__ = "Copyright 2011-2021 IBM Corporation"
32__credits__ = []
33__license__ = "IBM (c) 2011-2021 All rights reserved"
34__version__ = "0.5"
35__maintainer__ = "Ramon Bertran"
36__email__ = "rbertra@us.ibm.com"
37__status__ = "Development"  # "Prototype", "Development", or "Production"
38
39# Constants
40ISANAME = "power_v206"
41
42# Functions
43
44# Classes
45
46# Main
47
48# Search and import definition
49ISADEF = import_isa_definition(
50    os.path.dirname(
51        [isa for isa in find_isa_definitions()
52         if isa.name == ISANAME][0].filename
53        )
54    )
55
56# Print definition
57print((ISADEF.full_report()))
58exit(0)

In this simple code, find_isa_definitions and import_isa_definition are first imported from the microprobe.target.isa module (line 28). Then, find_isa_definitions is used to look for architecture definitions; the returned list is filtered, and only the definition named power_v206 is imported using import_isa_definition (lines 49-54). Finally, the full report of the ISADEF object is printed to standard output (line 57).

In this case, the full report is printed, but the user can query any information about the imported ISA through the microprobe.target.isa.ISA API.
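The search-and-filter step can be written a little more defensively than the `[...][0]` idiom in the listing, which raises a bare IndexError when no definition matches. The following is a minimal sketch and not part of the Microprobe distribution: `select_definition` is a hypothetical helper, and `Definition` is a stand-in for the objects returned by find_isa_definitions(), modelling only the `name` and `filename` attributes that the listing uses. The names and paths in the sample data are made up.

```python
import os
from collections import namedtuple

# Stand-in for the objects returned by find_isa_definitions(); only the
# two attributes used in the example above (name, filename) are modelled.
Definition = namedtuple("Definition", ["name", "filename"])


def select_definition(definitions, name):
    """Return the directory holding the definition called `name`.

    Hypothetical helper: raises a descriptive error instead of the bare
    IndexError produced by ``[...][0]`` when nothing matches.
    """
    matches = [d for d in definitions if d.name == name]
    if not matches:
        known = ", ".join(sorted(d.name for d in definitions)) or "none"
        raise LookupError("ISA '%s' not found (available: %s)" % (name, known))
    return os.path.dirname(matches[0].filename)


# Illustrative data only; these names and paths are made up for the sketch.
definitions = [
    Definition("power_v206", "/opt/defs/power_v206/isa.yaml"),
    Definition("example_isa", "/opt/defs/example_isa/isa.yaml"),
]
isa_dir = select_definition(definitions, "power_v206")
print(isa_dir)
```

With Microprobe installed, the list returned by find_isa_definitions() would take the place of the sample data, and the resulting directory would be passed to import_isa_definition as in the listing.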

power_v206_power7_ppc64_linux_gcc_profile.py

The aim of this example is to show how code generation works in Microprobe. In particular, it shows how to generate, for each instruction of the ISA, an endless loop containing that instruction. The size of the loop and the dependency distance between the instructions of the loop can be specified as parameters. Using Microprobe you can generate thousands of microbenchmarks in a few minutes. Let’s start with the command line interface. Executing:

> ./power_v206_power7_ppc64_linux_gcc_profile.py --help

will generate the following output:

power_v206_power7_ppc64_linux_gcc_profile.py: INFO: Processing input arguments...
usage: power_v206_power7_ppc64_linux_gcc_profile.py [-h]
                                                    [-P SEARCH_PATH [SEARCH_PATH ...]]
                                                    [-V] [-v] [-d]
                                                    [-i INSTRUCTION_NAME [INSTRUCTION_NAME ...]]
                                                    [--output_prefix PREFIX]
                                                    [-O PATH] [-p NUM_JOBS]
                                                    [-S BENCHMARK_SIZE]
                                                    [-D DEPENDECY_DISTANCE]

ISA power v206 profile example

optional arguments:
  -h, --help            show this help message and exit
  -P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
                        Default search paths for microprobe target definitions
  -V, --version         Show Microprobe version and exit
  -v, --verbosity       Verbosity level (Values: [0,1,2,3,4]). Each time this
                        argument is specified the verbosity level is
                        increased. By default, no logging messages are shown.
                        These are the four levels available:
                        
                          -v (1): critical messages
                          -v -v (2): critical and error messages
                          -v -v -v (3): critical, error and warning messages
                          -v -v -v -v (4): critical, error, warning and info messages
                        
                        Specifying more than four verbosity flags, will
                        default to the maximum of four. If you need extra
                        information, enable the debug mode (--debug or -d
                        flags).
  -d, --debug           Enable debug mode in Microprobe framework. Lots of
                        output messages will be generated
  -i INSTRUCTION_NAME [INSTRUCTION_NAME ...], --instruction INSTRUCTION_NAME [INSTRUCTION_NAME ...]
                        Instruction names to generate. Default: All
                        instructions
  --output_prefix PREFIX
                        Output prefix of the generated files. Default:
                        POWER_V206_PROFILE
  -O PATH, --output_path PATH
                        Output path. Default: current path
  -p NUM_JOBS, --parallel NUM_JOBS
                        Number of parallel jobs. Default: number of CPUs
                        available (80). Valid values: 1, 2, 3, 4, 5, 6, 7, 8,
                        9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                        23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
                        36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
                        49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
                        62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
                        75, 76, 77, 78, 79, 80
  -S BENCHMARK_SIZE, --size BENCHMARK_SIZE
                        Benchmark size (number of instructions in the endless
                        loop). Default: 64 instructions
  -D DEPENDECY_DISTANCE, --dependency_distance DEPENDECY_DISTANCE
                        Average dependency distance between the instructions.
                        Default: 1000 (no dependencies)

Environment variables:

  MICROPROBETEMPLATES    Default path for microprobe templates
  MICROPROBEDEBUG        If set, enable debug
  MICROPROBEDEBUGPASSES  If set, enable debug during passes
  MICROPROBEASMHEXFMT    Assembly hexadecimal format. Options:
                         'all' -> All immediates in hex format
                         'address' -> Address immediates in hex format (default)
                         'none' -> All immediate in integer format
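The option behaviour described in this help text (a repeatable -v flag capped at four, and integer options with defaults) can be approximated with the standard argparse module. This is only a sketch under stated assumptions: it does not use Microprobe's own CLI class, `bounded_int` and `build_parser` are hypothetical names, and the numeric bounds are the ones passed to the add_option calls in the script's listing.

```python
import argparse


def bounded_int(low, high):
    """Build an argparse type that accepts integers in [low, high]."""
    def parse(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                "%d is outside [%d, %d]" % (value, low, high))
        return value
    return parse


def build_parser():
    """Hypothetical parser mirroring three of the options shown above."""
    parser = argparse.ArgumentParser(description="profile example (sketch)")
    # Each -v raises the level by one; the cap at 4 is applied after parsing.
    parser.add_argument("-v", "--verbosity", action="count", default=0)
    parser.add_argument("-S", "--size", type=bounded_int(1, 2 ** 20),
                        default=64, metavar="BENCHMARK_SIZE")
    parser.add_argument("-D", "--dependency_distance",
                        type=bounded_int(1, 1000), default=1000,
                        metavar="DEPENDENCY_DISTANCE")
    return parser


# Five -v flags are given on purpose to exercise the cap.
args = build_parser().parse_args(["-v", "-v", "-v", "-v", "-v", "-S", "128"])
verbosity = min(args.verbosity, 4)
print(verbosity, args.size, args.dependency_distance)
```

Validating the range inside the type callable lets argparse report out-of-range values as ordinary usage errors, before any benchmark generation starts.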

Let's look at the code to see how this command line tool is implemented. This is the complete code of the script:

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15
 16"""
 17power_v206_power7_ppc64_linux_gcc_profile.py
 18
 19Example module to show how to generate a benchmark for each instruction
 20of the ISA
 21"""
 22
 23# Futures
 24from __future__ import absolute_import
 25
 26# Built-in modules
 27import multiprocessing as mp
 28import os
 29import sys
 30import traceback
 31
 32# Third party modules
 33from six.moves import map, range
 34
 35# Own modules
 36import microprobe.code.ins
 37import microprobe.passes.address
 38import microprobe.passes.branch
 39import microprobe.passes.decimal
 40import microprobe.passes.float
 41import microprobe.passes.ilp
 42import microprobe.passes.initialization
 43import microprobe.passes.instruction
 44import microprobe.passes.memory
 45import microprobe.passes.register
 46import microprobe.passes.structure
 47import microprobe.utils.cmdline
 48from microprobe.exceptions import MicroprobeException
 49from microprobe.target import import_definition
 50from microprobe.utils.cmdline import existing_dir, \
 51    int_type, print_error, print_info, print_warning
 52from microprobe.utils.logger import get_logger
 53
 54__author__ = "Ramon Bertran"
 55__copyright__ = "Copyright 2011-2021 IBM Corporation"
 56__credits__ = []
 57__license__ = "IBM (c) 2011-2021 All rights reserved"
 58__version__ = "0.5"
 59__maintainer__ = "Ramon Bertran"
 60__email__ = "rbertra@us.ibm.com"
 61__status__ = "Development"  # "Prototype", "Development", or "Production"
 62
 63# Constants
 64LOG = get_logger(__name__)  # Get the generic logging interface
 65
 66
 67# Functions
 68def main_setup():
 69    """
 70    Set up the command line interface (CLI) with the arguments required by
 71    this command line tool.
 72    """
 73
 74    args = sys.argv[1:]
 75
 76    # Create the CLI interface object
 77    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
 78                                           config_options=False,
 79                                           target_options=False,
 80                                           debug_options=False)
 81
 82    # Add the different parameters for this particular tool
 83    cmdline.add_option(
 84        "instruction",
 85        "i",
 86        None,
 87        "Instruction names to generate. Default: All instructions",
 88        required=False,
 89        nargs="+",
 90        metavar="INSTRUCTION_NAME")
 91
 92    cmdline.add_option(
 93        "output_prefix",
 94        None,
 95        "POWER_V206_PROFILE",
 96        "Output prefix of the generated files. Default: POWER_V206_PROFILE",
 97        opt_type=str,
 98        required=False,
 99        metavar="PREFIX")
100
101    cmdline.add_option(
102        "output_path",
103        "O",
104        "./",
105        "Output path. Default: current path",
106        opt_type=existing_dir,
107        metavar="PATH")
108
109    cmdline.add_option(
110        "parallel",
111        "p",
112        mp.cpu_count(),
113        "Number of parallel jobs. Default: number of CPUs available (%s)" %
114        mp.cpu_count(),
115        opt_type=int,
116        choices=list(range(
117            1,
118            mp.cpu_count() +
119            1)),
120        metavar="NUM_JOBS")
121
122    cmdline.add_option(
123        "size",
124        "S",
125        64,
126        "Benchmark size (number of instructions in the endless loop). "
127        "Default: 64 instructions",
128        opt_type=int_type(1, 2**20),
129        metavar="BENCHMARK_SIZE")
130
131    cmdline.add_option(
132        "dependency_distance",
133        "D",
134        1000,
135        "Average dependency distance between the instructions. "
136        "Default: 1000 (no dependencies)",
137        opt_type=int_type(1, 1000),
138        metavar="DEPENDECY_DISTANCE")
139
140    # Start the main
141    print_info("Processing input arguments...")
142    cmdline.main(args, _main)
143
144
145def _main(arguments):
146    """
147    Main program. Called after the arguments from the CLI interface have
148    been processed.
149    """
150
151    print_info("Arguments processed!")
152
153    print_info("Importing target definition "
154               "'power_v206-power7-ppc64_linux_gcc'...")
155    target = import_definition("power_v206-power7-ppc64_linux_gcc")
156
157    # Get the arguments
158    instructions = arguments.get("instruction", None)
159    prefix = arguments["output_prefix"]
160    output_path = arguments["output_path"]
161    parallel_jobs = arguments["parallel"]
162    size = arguments["size"]
163    distance = arguments["dependency_distance"]
164
165    # Process the arguments
166    if instructions is not None:
167
168        # If the user has provided some instructions, make sure they
169        # exist and then we call the generation function
170
171        instructions = _validate_instructions(instructions, target)
172
173        if len(instructions) == 0:
174            print_error("No valid instructions defined.")
175            exit(-1)
176
177        # Set more verbose level
178        # set_log_level(10)
179        #
180        list(map(_generate_benchmark,
181                 [(instruction,
182                   prefix,
183                   output_path,
184                   target, size, distance) for instruction in instructions]))
185
186    else:
187
188        # If the user has not provided any instruction, go for all of them
189        # and then call the generation function
190
191        instructions = _generate_instructions(target, output_path, prefix)
192
193        # Since several benchmark will be generated, reduce verbose level
194        # and call the generation function in parallel
195
196        # set_log_level(30)
197
198        if parallel_jobs > 1:
199            pool = mp.Pool(processes=parallel_jobs)
200            pool.map(_generate_benchmark,
201                     [(instruction,
202                       prefix,
203                       output_path,
204                       target,
205                       size,
206                       distance) for instruction in instructions],
207                     1)
208        else:
209            list(map(_generate_benchmark,
210                     [(instruction,
211                       prefix,
212                       output_path,
213                       target,
214                       size,
215                       distance) for instruction in instructions]))
216
217
218def _validate_instructions(instructions, target):
219    """
220    Validate the provided instruction for a given target
221    """
222
223    nins = []
224    for instruction in instructions:
225
226        if instruction not in list(target.isa.instructions.keys()):
227            print_warning(
228                "'%s' not defined in the ISA. Skipping..." %
229                instruction)
230            continue
231        nins.append(instruction)
232    return nins
233
234
235def _generate_instructions(target, path, prefix):
236    """
237    Generate the list of instruction to be generated for a given target
238    """
239
240    instructions = []
241    for name, instr in target.instructions.items():
242
243        if instr.privileged or instr.hypervisor:
244            # Skip priv/hyper instructions
245            continue
246
247        if instr.branch and not instr.branch_relative:
248            # Skip branch absolute due to relocation problems
249            continue
250
251        if instr.category in ['LMA', 'LMV', 'DS', 'EC']:
252            # Skip some instruction categories
253            continue
254
255        if name in ['LSWI_V0', 'LSWX_V0', 'LMW_V0', 'STSWX_V0',
256                    'LD_V1', 'LWZ_V1', 'STW_V1']:
257            # Some instructions are not completely supported yet
258            # String-related instructions and load multiple
259
260            continue
261
262        # Skip if the files already exists
263
264        fname = "%s/%s_%s.c" % (path, prefix, name)
265        ffname = "%s/%s_%s.c.fail" % (path, prefix, name)
266
267        if os.path.isfile(fname):
268            print_warning("Skip %s. '%s' already generated" % (name, fname))
269            continue
270
271        if os.path.isfile(ffname):
272            print_warning("Skip %s. '%s' already generated (failed)"
273                          % (name, ffname))
274            continue
275
276        instructions.append(name)
277
278    return instructions
279
280
281def _generate_benchmark(args):
282    """
283    Actual benchmark generation policy. This is the function that defines
284    how the microbenchmark are going to be generated
285    """
286
287    instr_name, prefix, output_path, target, size, distance = args
288
289    try:
290
291        # Name of the output file
292        fname = "%s/%s_%s" % (output_path, prefix, instr_name)
293
294        # Name of the fail output file (generated in case of exception)
295        ffname = "%s.c.fail" % (fname)
296
297        print_info("Generating %s ..." % (fname))
298
299        instruction = microprobe.code.ins.Instruction()
300        instruction.set_arch_type(target.instructions[instr_name])
301        sequence = [target.instructions[instr_name]]
302
303        # Get the wrapper object. The wrapper object is in charge of
304        # translating the internal representation of the microbenchmark
305        # to the final output format.
306        #
307        # In this case, we obtain the 'CInfGen' wrapper, which embeds
308        # the generated code within an infinite loop using C plus
309        # in-line assembly statements.
310        cwrapper = microprobe.code.get_wrapper("CInfGen")
311
312        # Create the synthesizer object, which is in charge of driving the
313        # generation of the microbenchmark, given a set of passes
314        # (a.k.a. transformations) to apply to an empty internal
315        # representation of the microbenchmark
316        synth = microprobe.code.Synthesizer(target, cwrapper(),
317                                            value=0b01010101)
318
319        # Add the transformation passes
320
321        #######################################################################
322        # Pass 1: Init integer registers to a given value                     #
323        #######################################################################
324        synth.add_pass(
325            microprobe.passes.initialization.InitializeRegistersPass(
326                value=_init_value()))
327        floating = False
328        vector = False
329
330        for operand in instruction.operands():
331            if operand.type.immediate:
332                continue
333
334            if operand.type.float:
335                floating = True
336
337            if operand.type.vector:
338                vector = True
339
340        if vector and floating:
341            ###################################################################
342            # Pass 1.A: if instruction uses vector floats, init vector        #
343            #           registers to float values                             #
344            ###################################################################
345            synth.add_pass(
346                microprobe.passes.initialization.InitializeRegistersPass(
347                    v_value=(
348                        1.000000000000001,
349                        64)))
350        elif vector:
351            ###################################################################
352            # Pass 1.B: if instruction uses vector but not floats, init       #
353            #           vector registers to integer value                     #
354            ###################################################################
355            synth.add_pass(
356                microprobe.passes.initialization.InitializeRegistersPass(
357                    v_value=(
358                        _init_value(),
359                        64)))
360        elif floating:
361            ###################################################################
362            # Pass 1.C: if instruction uses floats, init float                #
363            #           registers to float values                             #
364            ###################################################################
365            synth.add_pass(
366                microprobe.passes.initialization.InitializeRegistersPass(
367                    fp_value=1.000000000000001))
368
369        #######################################################################
370        # Pass 2: Add a building block of size 'size'                         #
371        #######################################################################
372        synth.add_pass(
373            microprobe.passes.structure.SimpleBuildingBlockPass(size))
374
375        #######################################################################
376        # Pass 3: Fill the building block with the instruction sequence       #
377        #######################################################################
378        synth.add_pass(
379            microprobe.passes.instruction.SetInstructionTypeBySequencePass(
380                sequence
381            )
382        )
383
384        #######################################################################
385        # Pass 4: Compute addresses of instructions (this pass is needed to   #
386        #         update the internal representation information so that in   #
387        #         case addresses are required, they are up to date).          #
388        #######################################################################
389        synth.add_pass(
390            microprobe.passes.address.UpdateInstructionAddressesPass())
391
392        #######################################################################
393        # Pass 5: Set target of branches to be the next instruction in the    #
394        #         instruction stream                                          #
395        #######################################################################
396        synth.add_pass(microprobe.passes.branch.BranchNextPass())
397
398        #######################################################################
399        # Pass 6: Set memory-related operands to access 16 storage locations  #
400        #         in a round-robin fashion in stride 256 bytes.               #
401        #         The pattern would be: 0, 256, 512, .... 3840, 0, 256, ...   #
402        #######################################################################
403        synth.add_pass(
404            microprobe.passes.memory.SingleMemoryStreamPass(
405                16,
406                256))
407
408        #######################################################################
409        # Pass 7.A: Initialize the storage locations accessed by floating     #
410        #           point instructions to have a valid floating point value   #
411        #######################################################################
412        synth.add_pass(microprobe.passes.float.InitializeMemoryFloatPass(
413            value=1.000000000000001)
414        )
415
416        #######################################################################
417        # Pass 7.B: Initialize the storage locations accessed by decimal      #
418        #           instructions to have a valid decimal value                #
419        #######################################################################
420        synth.add_pass(
421            microprobe.passes.decimal.InitializeMemoryDecimalPass(
422                value=1))
423
424        #######################################################################
425        # Pass 8: Set the remaining instructions operands (if not set)        #
426        #         (Required to set remaining immediate operands)              #
427        #######################################################################
428        synth.add_pass(
429            microprobe.passes.register.DefaultRegisterAllocationPass(
430                dd=distance))
431
432        # Synthesize the microbenchmark. The synthesizer applies the set of
433        # transformation passes added before and returns an object representing
434        # the microbenchmark
435        bench = synth.synthesize()
436
437        # Save the microbenchmark to the file 'fname'
438        synth.save(fname, bench=bench)
439
440        print_info("%s generated!" % (fname))
441
442        # Remove fail file if exists
443        if os.path.isfile(ffname):
444            os.remove(ffname)
445
446    except MicroprobeException:
447
448        # In case of exception during the generation of the microbenchmark,
449        # print the error, write the fail file and exit
450        print_error(traceback.format_exc())
451        open(ffname, 'a').close()
452        exit(-1)
453
454
455def _init_value():
456    """ Return a init value """
457    return 0b0101010101010101010101010101010101010101010101010101010101010101
458
459
460# Main
461if __name__ == '__main__':
462    # run main if executed from the command line
463    # and the main method exists
464
465    if callable(locals().get('main_setup')):
466        main_setup()
467        exit(0)
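The dispatch logic of _main() and the property filters of _generate_instructions() in the listing above can be isolated in a small, dependency-free sketch. Everything here is an assumption made for illustration: `Instr` is a stand-in that models only the attributes the listing reads, the instruction names are made up, and `multiprocessing.pool.ThreadPool` is swapped in for `mp.Pool` so that the example runs without pickling the worker function. The point it shows is that `pool.map` passes a single argument to the worker, which is why the listing packs all parameters into one tuple per instruction.

```python
from collections import namedtuple
from multiprocessing.pool import ThreadPool

# Stand-in for an instruction definition; only attributes that the
# listing above reads are modelled.
Instr = namedtuple("Instr", ["privileged", "hypervisor", "branch",
                             "branch_relative", "category"])

SKIPPED_CATEGORIES = ("LMA", "LMV", "DS", "EC")


def select_instructions(instructions):
    """Return the names that pass the property filters used in the listing."""
    selected = []
    for name, instr in sorted(instructions.items()):
        if instr.privileged or instr.hypervisor:
            continue  # privileged/hypervisor instructions are skipped
        if instr.branch and not instr.branch_relative:
            continue  # absolute branches are skipped
        if instr.category in SKIPPED_CATEGORIES:
            continue
        selected.append(name)
    return selected


def generate(args):
    """Worker: unpack one work item. A real worker would build a benchmark."""
    name, prefix, size = args
    return "%s_%s (%d instructions)" % (prefix, name, size)


def run(instructions, prefix, size, jobs):
    """Run the worker serially or through a pool, as _main() does."""
    work = [(name, prefix, size) for name in select_instructions(instructions)]
    if jobs > 1:
        with ThreadPool(processes=jobs) as pool:
            return pool.map(generate, work, 1)  # chunk size of 1
    return [generate(item) for item in work]


# Made-up instruction names and properties, for illustration only.
toy_isa = {
    "ADD_X": Instr(False, False, False, False, "B"),
    "PRIV_X": Instr(True, False, False, False, "S"),
    "BRABS_X": Instr(False, False, True, False, "B"),
    "BREL_X": Instr(False, False, True, True, "B"),
}
results = run(toy_isa, "DEMO", 64, jobs=2)
print(results)
```

A chunk size of 1, as in the listing, hands out one instruction at a time, which helps balance the load when some benchmarks take much longer to generate than others.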

The code is self-documented, so you can read it to understand the basic concepts of code generation in Microprobe. To help the reader, we summarize and elaborate on the explanations in the code. The following are the suggested steps to implement a command line tool that generates microbenchmarks using Microprobe:

  1. Define the command line interface and parameters (main_setup() function in the example). This includes:

    1. Create a command line interface object

    2. Define parameters using the add_option interface

    3. Call the actual main with the arguments

  2. Define the function to process the input parameters (_main() function in the example). This includes:

    1. Import target definition

    2. Get processed arguments

    3. Validate and use the arguments to call the actual microbenchmark generation function

  3. Define the function to generate the microbenchmark (_generate_benchmark function in the example). The main elements are the following:

    1. Get the wrapper object. The wrapper object defines the general characteristics of the code being generated (i.e. how the internal representation will be translated to the final file being generated). General characteristics are, for instance, code prologs such as #include <header.h> directives, the main function declaration, epilogs, etc. In this case, the wrapper selected is CInfGen. This wrapper generates C code with an infinite loop of instructions. This results in the following code:

      #include <stdio.h>
      #include <string.h>
      
      // <declaration of variables>
      
      int main(int argc, char** argv, char** envp) {
      
          // <initialization_code>
      
          while(1) {
      
              // <generated_code>
      
          } // end while
      }
      

      The user can subclass or define their own wrappers to fulfill their needs. See microprobe.code.wrapper.Wrapper for more details.

    2. Instantiate the synthesizer. The benchmark synthesizer object is in charge of driving the code generation by applying the set of transformation passes defined by the user.

    3. Define the transformation passes. The transformation passes will fill the <declaration of variables>, <initialization_code> and <generated_code> sections of the previous code block. Depending on the order and the type of passes applied, the code generated will be different. The user has plenty of transformation passes to apply. See microprobe.passes and all its submodules for further details. Also, the user can define their own passes by subclassing the class microprobe.passes.Pass.

    4. Finally, once the generation policy is defined, the user only has to synthesize the benchmark and save it to a file.
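The division of responsibilities among wrapper, passes and synthesizer described in step 3 can be illustrated with a toy model. None of the classes below belong to Microprobe; `Benchmark`, `InfiniteLoopWrapper`, `DeclareCounterPass`, `RepeatStatementPass` and `ToySynthesizer` are hypothetical names, and the sketch only mimics the overall flow: passes fill the sections of an initially empty representation in the order they were added, and the wrapper decides how that representation is rendered.

```python
class Benchmark:
    """Toy internal representation: three code sections as line lists."""
    def __init__(self):
        self.declarations = []
        self.init_code = []
        self.body = []


class InfiniteLoopWrapper:
    """Toy wrapper: decides how the representation becomes C source."""
    def render(self, bench):
        lines = ["#include <stdio.h>", ""]
        lines += bench.declarations
        lines += ["int main(int argc, char** argv, char** envp) {"]
        lines += ["    " + line for line in bench.init_code]
        lines += ["    while(1) {"]
        lines += ["        " + line for line in bench.body]
        lines += ["    } // end while", "}"]
        return "\n".join(lines)


class DeclareCounterPass:
    """Toy pass: adds a variable declaration and its initialization."""
    def __call__(self, bench):
        bench.declarations.append("long counter;")
        bench.init_code.append("counter = 0;")


class RepeatStatementPass:
    """Toy pass: fills the loop body with `size` copies of a statement."""
    def __init__(self, statement, size):
        self.statement = statement
        self.size = size

    def __call__(self, bench):
        bench.body.extend([self.statement] * self.size)


class ToySynthesizer:
    """Toy driver: applies the passes, then delegates rendering."""
    def __init__(self, wrapper):
        self.wrapper = wrapper
        self.passes = []

    def add_pass(self, step):
        self.passes.append(step)

    def synthesize(self):
        bench = Benchmark()
        for step in self.passes:  # passes run in the order they were added
            step(bench)
        return bench

    def render(self, bench):
        return self.wrapper.render(bench)


synth = ToySynthesizer(InfiniteLoopWrapper())
synth.add_pass(DeclareCounterPass())
synth.add_pass(RepeatStatementPass("counter++;", 3))
bench = synth.synthesize()
source = synth.render(bench)
print(source)
```

Because each pass only touches the shared representation, changing the output format means swapping the wrapper, and changing the generation policy means changing the list of passes.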

power_v206_power7_ppc64_linux_gcc_fu_stress.py

The following example shows how to generate microbenchmarks that stress a particular functional unit of the architecture. The code is self-explanatory:

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15
 16"""
 17power_v206_power7_ppc64_linux_gcc_fu_stress.py
 18
 19Example module to show how to generate a benchmark stressing a particular
 20functional unit of the microarchitecture at different rates using the
 21average latency of instructions as well as the average dependency distance
 22between the instructions
 23"""
 24
 25# Futures
 26from __future__ import absolute_import
 27
 28# Built-in modules
 29import os
 30import sys
 31import traceback
 32
 33# Own modules
 34import microprobe.code.ins
 35import microprobe.passes.address
 36import microprobe.passes.branch
 37import microprobe.passes.decimal
 38import microprobe.passes.float
 39import microprobe.passes.ilp
 40import microprobe.passes.initialization
 41import microprobe.passes.instruction
 42import microprobe.passes.memory
 43import microprobe.passes.register
 44import microprobe.passes.structure
 45import microprobe.utils.cmdline
 46from microprobe.exceptions import MicroprobeException, \
 47    MicroprobeTargetDefinitionError
 48from microprobe.target import import_definition
 49from microprobe.utils.cmdline import dict_key, existing_dir, \
 50    float_type, int_type, print_error, print_info
 51from microprobe.utils.logger import get_logger
 52
 53__author__ = "Ramon Bertran"
 54__copyright__ = "Copyright 2011-2021 IBM Corporation"
 55__credits__ = []
 56__license__ = "IBM (c) 2011-2021 All rights reserved"
 57__version__ = "0.5"
 58__maintainer__ = "Ramon Bertran"
 59__email__ = "rbertra@us.ibm.com"
 60__status__ = "Development"  # "Prototype", "Development", or "Production"
 61
 62# Constants
 63LOG = get_logger(__name__)  # Get the generic logging interface
 64
 65
 66# Functions
 67def main_setup():
 68    """
 69    Set up the command line interface (CLI) with the arguments required by
 70    this command line tool.
 71    """
 72
 73    args = sys.argv[1:]
 74
 75    # Get the target definition
 76    try:
 77        target = import_definition("power_v206-power7-ppc64_linux_gcc")
 78    except MicroprobeTargetDefinitionError as exc:
 79        print_error("Unable to import target definition")
 80        print_error("Exception message: %s" % str(exc))
 81        exit(-1)
 82
 83    func_units = {}
 84    valid_units = [elem.name for elem in target.elements.values()]
 85
 86    for instr in target.isa.instructions.values():
 87        if instr.execution_units == "None":
 88            LOG.debug("Execution units for: '%s' not defined", instr.name)
 89            continue
 90
 91        for unit in instr.execution_units:
 92            if unit not in valid_units:
 93                continue
 94
 95            if unit not in func_units:
 96                func_units[unit] = [elem for elem in target.elements.values()
 97                                    if elem.name == unit][0]
 98
 99    # Create the CLI interface object
100    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
101                                           config_options=False,
102                                           target_options=False,
103                                           debug_options=False)
104
105    # Add the different parameters for this particular tool
106    cmdline.add_option(
107        "functional_unit",
108        "f",
109        [func_units['ALU']],
110        "Functional units to stress. Default: ALU",
111        required=False,
112        nargs="+",
113        choices=func_units,
114        opt_type=dict_key(func_units),
115        metavar="FUNCTIONAL_UNIT_NAME")
116
117    cmdline.add_option(
118        "output_prefix",
119        None,
120        "POWER_V206_FU_STRESS",
121        "Output prefix of the generated files. Default: POWER_V206_FU_STRESS",
122        opt_type=str,
123        required=False,
124        metavar="PREFIX")
125
126    cmdline.add_option(
127        "output_path",
128        "O",
129        "./",
130        "Output path. Default: current path",
131        opt_type=existing_dir,
132        metavar="PATH")
133
134    cmdline.add_option(
135        "size",
136        "S",
137        64,
138        "Benchmark size (number of instructions in the endless loop). "
139        "Default: 64 instructions",
140        opt_type=int_type(1, 2**20),
141        metavar="BENCHMARK_SIZE")
142
143    cmdline.add_option(
144        "dependency_distance",
145        "D",
146        1000,
147        "Average dependency distance between the instructions. "
148        "Default: 1000 (no dependencies)",
149        opt_type=int_type(1, 1000),
150        metavar="DEPENDENCY_DISTANCE")
151
152    cmdline.add_option(
153        "average_latency",
154        "L",
155        2,
156        "Average latency of the selected instructions. "
157        "Default: 2 cycles",
158        opt_type=float_type(1, 1000),
159        metavar="AVERAGE_LATENCY")
160
161    # Start the main
162    print_info("Processing input arguments...")
163    cmdline.main(args, _main)
164
165
166def _main(arguments):
167    """
168    Main program. Called after the arguments from the CLI interface have
169    been processed.
170    """
171
172    print_info("Arguments processed!")
173
174    print_info("Importing target definition "
175               "'power_v206-power7-ppc64_linux_gcc'...")
176    target = import_definition("power_v206-power7-ppc64_linux_gcc")
177
178    # Get the arguments
179    functional_units = arguments["functional_unit"]
180    prefix = arguments["output_prefix"]
181    output_path = arguments["output_path"]
182    size = arguments["size"]
183    latency = arguments["average_latency"]
184    distance = arguments["dependency_distance"]
185
186    if functional_units is None:
187        functional_units = ["ALL"]
188
189    _generate_benchmark(target,
190                        "%s/%s_" % (output_path, prefix),
191                        (functional_units, size, latency, distance))
192
193
194def _generate_benchmark(target, output_prefix, args):
195    """
196    Actual benchmark generation policy. This is the function that defines
197    how the microbenchmarks are going to be generated
198    """
199
200    functional_units, size, latency, distance = args
201
202    try:
203
204        # Name of the output file
205        func_unit_names = [unit.name for unit in functional_units]
206        fname = "%s%s" % (output_prefix, "_".join(func_unit_names))
207        fname = "%s_LAT_%s" % (fname, latency)
208        fname = "%s_DEP_%s" % (fname, distance)
209
210        # Name of the fail output file (generated in case of exception)
211        ffname = "%s.c.fail" % (fname)
212
213        print_info("Generating %s ..." % (fname))
214
215        # Get the wrapper object. The wrapper object is in charge of
216        # translating the internal representation of the microbenchmark
217        # to the final output format.
218        #
219        # In this case, we obtain the 'CInfGen' wrapper, which embeds
220        # the generated code within an infinite loop using C plus
221        # in-line assembly statements.
222        cwrapper = microprobe.code.get_wrapper("CInfGen")
223
224        # Create the synthesizer object, which is in charge of driving the
225        # generation of the microbenchmark, given a set of passes
226        # (a.k.a. transformations) to apply to an empty internal
227        # representation of the microbenchmark
228        synth = microprobe.code.Synthesizer(target, cwrapper(),
229                                            value=0b01010101)
230
231        # Add the transformation passes
232
233        #######################################################################
234        # Pass 1: Init integer registers to a given value                     #
235        #######################################################################
236        synth.add_pass(
237            microprobe.passes.initialization.InitializeRegistersPass(
238                value=_init_value()))
239
240        #######################################################################
241        # Pass 2: Add a building block of size 'size'                         #
242        #######################################################################
243        synth.add_pass(
244            microprobe.passes.structure.SimpleBuildingBlockPass(size)
245        )
246
247        #######################################################################
248        # Pass 3: Fill the building block with the instruction sequence       #
249        #######################################################################
250        synth.add_pass(
251            microprobe.passes.instruction.SetInstructionTypeByElementPass(
252                target,
253                functional_units,
254                {}))
255
256        #######################################################################
257        # Pass 4: Compute addresses of instructions (this pass is needed to   #
258        #         update the internal representation information so that in   #
259        #         case addresses are required, they are up to date).          #
260        #######################################################################
261        synth.add_pass(
262            microprobe.passes.address.UpdateInstructionAddressesPass()
263        )
264
265        #######################################################################
266        # Pass 5: Set target of branches to be the next instruction in the    #
267        #         instruction stream                                          #
268        #######################################################################
269        synth.add_pass(microprobe.passes.branch.BranchNextPass())
270
271        #######################################################################
272        # Pass 6: Set memory-related operands to access 16 storage locations  #
273        #         in a round-robin fashion in stride 256 bytes.               #
274        #         The pattern would be: 0, 256, 512, .... 3840, 0, 256, ...   #
275        #######################################################################
276        synth.add_pass(
277            microprobe.passes.memory.SingleMemoryStreamPass(
278                16,
279                256))
280
281        #######################################################################
282        # Pass 7.A: Initialize the storage locations accessed by floating     #
283        #           point instructions to have a valid floating point value   #
284        #######################################################################
285        synth.add_pass(microprobe.passes.float.InitializeMemoryFloatPass(
286            value=1.000000000000001)
287        )
288
289        #######################################################################
290        # Pass 7.B: Initialize the storage locations accessed by decimal      #
291        #           instructions to have a valid decimal value                #
292        #######################################################################
293        synth.add_pass(
294            microprobe.passes.decimal.InitializeMemoryDecimalPass(
295                value=1))
296
297        #######################################################################
298        # Pass 8: Set the remaining instructions operands (if not set)        #
299        #         (Required to set remaining immediate operands)              #
300        #######################################################################
301        synth.add_pass(
302            microprobe.passes.register.DefaultRegisterAllocationPass(
303                dd=distance))
304
305        # Synthesize the microbenchmark. The synthesizer applies the set of
306        # transformation passes added before and returns an object
307        # representing the microbenchmark
308        bench = synth.synthesize()
309
310        # Save the microbenchmark to the file 'fname'
311        synth.save(fname, bench=bench)
312
313        print_info("%s generated!" % (fname))
314
315        # Remove fail file if exists
316        if os.path.isfile(ffname):
317            os.remove(ffname)
318
319    except MicroprobeException:
320
321        # In case of exception during the generation of the microbenchmark,
322        # print the error, write the fail file and exit
323        print_error(traceback.format_exc())
324        open(ffname, 'a').close()
325        exit(-1)
326
327
328def _init_value():
329    """ Return a init value """
330    return 0b0101010101010101010101010101010101010101010101010101010101010101
331
332
333# Main
334if __name__ == '__main__':
335    # run main if executed from the command line
336    # and the main method exists
337
338    if callable(locals().get('main_setup')):
339        main_setup()
340        exit(0)
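
The access pattern described in the comment of Pass 6 (16 storage locations visited round-robin with a stride of 256 bytes) can be reproduced in plain Python. The sketch below only computes the offsets named in that comment; stream_offsets is a hypothetical helper, it does not call SingleMemoryStreamPass, and it assumes the two arguments of that pass are the number of locations and the stride, as the comment suggests.

```python
from itertools import cycle, islice


def stream_offsets(num_locations, stride):
    """Return an endless iterator over the byte offsets of a strided stream."""
    # Offsets of one full sweep: 0, stride, 2 * stride, ...
    one_sweep = [index * stride for index in range(num_locations)]
    # cycle() restarts the sweep once the last location has been visited.
    return cycle(one_sweep)


# Take a little more than one sweep to show the wrap-around.
offsets = list(islice(stream_offsets(16, 256), 18))
print(offsets)
```

With 16 locations and a stride of 256 bytes, the last offset of a sweep is 15 * 256 = 3840, after which the stream returns to offset 0.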

power_v206_power7_ppc64_linux_gcc_memory.py

The following example shows how to create microbenchmarks with different activity (stress levels) on the different levels of the cache hierarchy. Note that it is not necessary to use the built-in command line interface provided by Microprobe, as the example shows.

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15
 16"""
 17power_v206_power7_ppc64_linux_gcc_memory.py
 18
 19Example python script to show how to generate microbenchmarks with particular
 20levels of activity in the memory hierarchy.
 21"""
 22
 23# Futures
 24from __future__ import absolute_import
 25
 26# Built-in modules
 27import multiprocessing as mp
 28import os
 29import random
 30import sys
 31
 32# Third party modules
 33from six.moves import map
 34
 35# Own modules
 36import microprobe.code
 37import microprobe.passes.address
 38import microprobe.passes.ilp
 39import microprobe.passes.initialization
 40import microprobe.passes.instruction
 41import microprobe.passes.memory
 42import microprobe.passes.register
 43import microprobe.passes.structure
 44from microprobe.exceptions import MicroprobeTargetDefinitionError
 45from microprobe.model.memory import EndlessLoopDataMemoryModel
 46from microprobe.target import import_definition
 47from microprobe.utils.cmdline import print_error, print_info
 48
 49__author__ = "Ramon Bertran"
 50__copyright__ = "Copyright 2011-2021 IBM Corporation"
 51__credits__ = []
 52__license__ = "IBM (c) 2011-2021 All rights reserved"
 53__version__ = "0.5"
 54__maintainer__ = "Ramon Bertran"
 55__email__ = "rbertra@us.ibm.com"
 56__status__ = "Development"  # "Prototype", "Development", or "Production"
 57
 58# Get the target definition
 59try:
 60    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
 61except MicroprobeTargetDefinitionError as exc:
 62    print_error("Unable to import target definition")
 63    print_error("Exception message: %s" % str(exc))
 64    exit(-1)
 65
 66BASE_ELEMENT = [element for element in TARGET.elements.values()
 67                if element.name == 'L1D'][0]
 68CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
 69    BASE_ELEMENT)
 70
 71# Benchmark size
 72BENCHMARK_SIZE = 8 * 1024
 73
 74# Fill a list of the models to be generated
 75
 76MEMORY_MODELS = []
 77
 78#
 79# Due to performance issues (long exec. time) this
 80# model is disabled
 81#
 82# MEMORY_MODELS.append(
 83#    (
 84#        "ALL", CACHE_HIERARCHY, [
 85#            25, 25, 25, 25]))
 86
 87MEMORY_MODELS.append(
 88    (
 89        "L1", CACHE_HIERARCHY, [
 90            100, 0, 0, 0]))
 91MEMORY_MODELS.append(
 92    (
 93        "L2",
 94        CACHE_HIERARCHY, [
 95            0, 100, 0, 0]))
 96MEMORY_MODELS.append(
 97    (
 98        "L3",
 99        CACHE_HIERARCHY, [
100            0, 0, 100, 0]))
101MEMORY_MODELS.append(
102    (
103        "L1L3",
104        CACHE_HIERARCHY, [
105            50, 0, 50, 0]))
106MEMORY_MODELS.append(
107    (
108        "L1L2",
109        CACHE_HIERARCHY, [
110            50, 50, 0, 0]))
111MEMORY_MODELS.append(
112    (
113        "L2L3",
114        CACHE_HIERARCHY, [
115            0, 50, 50, 0]))
116MEMORY_MODELS.append(
117    (
118        "CACHES",
119        CACHE_HIERARCHY, [
120            33, 33, 34, 0]))
121MEMORY_MODELS.append(
122    (
123        "MEM", CACHE_HIERARCHY, [
124            0, 0, 0, 100]))
125
126
127# Enable parallel generation
128PARALLEL = False
129
130
131def main():
132    """Main function. """
133    # call the generate method for each model in the memory model list
134
135    if PARALLEL:
136        print_info("Start parallel execution...")
137        pool = mp.Pool(processes=mp.cpu_count())
138        pool.map(generate, MEMORY_MODELS, 1)
139    else:
140        print_info("Start sequential execution...")
141        list(map(generate, MEMORY_MODELS))
142
143    exit(0)
144
145
146def generate(model):
147    """Benchmark generation policy function. """
148
149    print_info("Creating memory model '%s' ..." % model[0])
150    model = EndlessLoopDataMemoryModel(*model)
151
152    modelname = model.name
153
154    print_info("Generating Benchmark mem-%s ..." % (modelname))
155
156    # Get the architecture
157    garch = TARGET
158
159    # For all the supported instructions, get the memory operations,
160    sequence = []
161    for instr_name in sorted(garch.instructions.keys()):
162
163        instr = garch.instructions[instr_name]
164
165        if not instr.access_storage:
166            continue
167        if instr.privileged:  # Skip privileged
168            continue
169        if instr.hypervisor:  # Skip hypervisor
170            continue
171        if instr.trap:  # Skip traps
172            continue
173        if "String" in instr.description:  # Skip unsupported string instr.
174            continue
175        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
176            continue
177        if instr.category in [
178                'LMA',
179                'LMV',
180                'DS',
181                'EC',
182                'WT']:  # Skip unsupported categories
183            continue
184        if instr.access_storage_with_update:  # Not supported by mem. model
185            continue
186        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
187            continue
188        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
189            continue
190        if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1']:
191            continue
192
193        sequence.append(instr)
194
195    # Get the loop wrapper. In this case we take the 'CInfPpc', which
196    # generates an infinite loop in C using PowerPC embedded assembly.
197    cwrapper = microprobe.code.get_wrapper("CInfPpc")
198
199    # Define function to return random numbers (used afterwards)
200    def rnd():
201        """Return a random value. """
202        return random.randrange(0, (1 << 64) - 1)
203
204    # Create the benchmark synthesizer
205    synth = microprobe.code.Synthesizer(garch, cwrapper())
206
207    ##########################################################################
208    # Add the passes we want to apply to synthesize benchmarks               #
209    ##########################################################################
210
211    # --> Init registers to random values
212    synth.add_pass(
213        microprobe.passes.initialization.InitializeRegistersPass(
214            value=rnd))
215
216    # --> Add a single basic block of size 'size'
217    if model.name in ['MEM']:
218        synth.add_pass(
219            microprobe.passes.structure.SimpleBuildingBlockPass(
220                BENCHMARK_SIZE *
221                4))
222    else:
223        synth.add_pass(
224            microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
225        )
226
227    # --> Fill the basic block using the sequence of instructions provided
228    synth.add_pass(
229        microprobe.passes.instruction.SetInstructionTypeBySequencePass(
230            sequence
231        )
232    )
233
234    # --> Set the memory operations parameters to fulfill the given model
235    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(model))
236
237    # --> Set the dependency distance and the default allocation. Sets the
238    # remaining undefined instruction operands (register allocation,...)
239    synth.add_pass(microprobe.passes.register.NoHazardsAllocationPass())
240    synth.add_pass(
241        microprobe.passes.register.DefaultRegisterAllocationPass(
242            dd=0))
243
244    # Generate the benchmark (applies the passes).
245    bench = synth.synthesize()
246
247    print_info("Benchmark mem-%s saving to disk..." % (modelname))
248
249    # Save the benchmark
250    synth.save("%s/mem-%s" % (DIRECTORY, modelname), bench=bench)
251
252    print_info("Benchmark mem-%s generated" % (modelname))
253    return True
254
255
256if __name__ == '__main__':
257    # run main if executed from the command line
258    # and the main method exists
259
260    if len(sys.argv) != 2:
261        print_info("Usage:")
262        print_info("%s output_dir" % (sys.argv[0]))
263        exit(-1)
264
265    DIRECTORY = sys.argv[1]
266
267    if not os.path.isdir(DIRECTORY):
268        print_error("Output directory '%s' does not exist" % (DIRECTORY))
269        exit(-1)
270
271    if callable(locals().get('main')):
272        main()
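
Each entry of MEMORY_MODELS in the listing pairs a name with a list of four weights. Judging from the model names ("L1" is [100, 0, 0, 0], "MEM" is [0, 0, 0, 100]), the weights appear to be the percentage of accesses served by L1, L2, L3 and main memory, in that order. The following sketch relies on that reading, which is an assumption and not a statement about EndlessLoopDataMemoryModel itself. LEVELS and describe are hypothetical names used only for illustration.

```python
# Level names are assumed from the model names used in the listing.
LEVELS = ("L1", "L2", "L3", "MEM")

# A few of the weight lists that appear in the listing.
MODELS = {
    "L1": [100, 0, 0, 0],
    "L1L3": [50, 0, 50, 0],
    "CACHES": [33, 33, 34, 0],
    "MEM": [0, 0, 0, 100],
}


def describe(weights, levels=LEVELS):
    """Map each level to its share of accesses after validating the weights."""
    if len(weights) != len(levels):
        raise ValueError("expected one weight per level")
    if any(weight < 0 for weight in weights) or sum(weights) != 100:
        raise ValueError("weights must be non-negative and sum to 100")
    return dict(zip(levels, weights))


summary = {name: describe(weights) for name, weights in MODELS.items()}
for name, shares in summary.items():
    print(name, shares)
```

Checking the weights before building a model catches mistakes such as a list that sums to 99 early, before the (potentially long) generation starts.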

power_v206_power7_ppc64_linux_gcc_random.py

The following example generates random microbenchmarks:

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15
 16"""
 17power_v206_power7_ppc64_linux_gcc_random.py
 18
 19Example python script to show how to generate random microbenchmarks.
 20"""
 21
 22# Futures
 23from __future__ import absolute_import
 24
 25# Built-in modules
 26import multiprocessing as mp
 27import os
 28import random
 29import sys
 30
 31# Third party modules
 32from six.moves import map, range
 33
 34# Own modules
 35import microprobe.code
 36import microprobe.passes.address
 37import microprobe.passes.branch
 38import microprobe.passes.ilp
 39import microprobe.passes.initialization
 40import microprobe.passes.instruction
 41import microprobe.passes.memory
 42import microprobe.passes.register
 43import microprobe.passes.structure
 44from microprobe.exceptions import MicroprobeError, \
 45    MicroprobeTargetDefinitionError
 46from microprobe.model.memory import EndlessLoopDataMemoryModel
 47from microprobe.target import import_definition
 48from microprobe.utils.cmdline import print_error, print_info
 49
 50__author__ = "Ramon Bertran"
 51__copyright__ = "Copyright 2011-2021 IBM Corporation"
 52__credits__ = []
 53__license__ = "IBM (c) 2011-2021 All rights reserved"
 54__version__ = "0.5"
 55__maintainer__ = "Ramon Bertran"
 56__email__ = "rbertra@us.ibm.com"
 57__status__ = "Development"  # "Prototype", "Development", or "Production"
 58
 59# Benchmark size
 60BENCHMARK_SIZE = 8 * 1024
 61
 62# Get the target definition
 63try:
 64    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
 65except MicroprobeTargetDefinitionError as exc:
 66    print_error("Unable to import target definition")
 67    print_error("Exception message: %s" % str(exc))
 68    exit(-1)
 69
 70BASE_ELEMENT = [element for element in TARGET.elements.values()
 71                if element.name == 'L1D'][0]
 72CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
 73    BASE_ELEMENT)
 74
 75PARALLEL = True
 76
 77
 78def main():
 79    """ Main program. """
 80    if PARALLEL:
 81        pool = mp.Pool(processes=mp.cpu_count())
 82        pool.map(generate, list(range(0, 100)), 1)
 83    else:
 84        list(map(generate, list(range(0, 100))))
 85
 86
 87def generate(name):
 88    """ Benchmark generation policy. """
 89
 90    if os.path.isfile("%s/random-%s.c" % (DIRECTORY, name)):
 91        print_info("Skip %d" % name)
 92        return
 93
 94    print_info("Generating %d..." % name)
 95
 96    # Generate a random memory model (used afterwards)
 97    model = []
 98    total = 100
 99    for mcomp in CACHE_HIERARCHY[0:-1]:
100        weight = random.randint(0, total)
101        model.append(weight)
102        print_info("%s: %d%%" % (mcomp, weight))
103        total = total - weight
104
105    # Fix remaining
106    level = random.randint(0, len(CACHE_HIERARCHY[0:-1]) - 1)
107    model[level] += total
108
109    # Last level always zero
110    model.append(0)
111
112    # Sanity check
113    psum = 0
114    for elem in model:
115        psum += elem
116    assert psum == 100
117
118    modelobj = EndlessLoopDataMemoryModel("random-%s" % name, CACHE_HIERARCHY, model)
119
120    # Get the loop wrapper. In this case we take the 'CInfPpc', which
121    # generates an infinite loop in C using PowerPC embedded assembly.
122    cwrapper = microprobe.code.get_wrapper("CInfPpc")
123
124    # Define function to return random numbers (used afterwards)
125    def rnd():
126        """Return a random value. """
127        return random.randrange(0, (1 << 64) - 1)
128
129    # Create the benchmark synthesizer
130    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
131
132    ##########################################################################
133    # Add the passes we want to apply to synthesize benchmarks               #
134    ##########################################################################
135
136    # --> Init registers to random values
137    synth.add_pass(
138        microprobe.passes.initialization.InitializeRegistersPass(
139            value=rnd))
140
141    # --> Add a single basic block of size BENCHMARK_SIZE
142    synth.add_pass(
143        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
144
145    # --> Fill the basic block with instructions picked randomly from the list
146    #     provided
147
148    instructions = []
149    for instr in TARGET.instructions.values():
150
151        if instr.privileged:  # Skip privileged
152            continue
153        if instr.hypervisor:  # Skip hypervisor
154            continue
155        if instr.trap:  # Skip traps
156            continue
157        if instr.syscall:  # Skip syscall
158            continue
159        if "String" in instr.description:  # Skip unsupported string instr.
160            continue
161        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
162            continue
163        if instr.category in [
164                'LMA',
165                'LMV',
166                'DS',
167                'EC',
168                'WT']:  # Skip unsupported categories
169            continue
170        if instr.access_storage_with_update:  # Not supported by mem. model
171            continue
172        if instr.branch and not instr.branch_relative:  # Skip branches
173            continue
174        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
175            continue
176        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
177            continue
178        if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1', ]:
179            continue
180
181        instructions.append(instr)
182
183    synth.add_pass(
184        microprobe.passes.instruction.SetRandomInstructionTypePass(
185            instructions
186        )
187    )
188
189    # --> Set the memory operations parameters to fulfill the given model
190    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(modelobj))
191
192    # --> Set target of branches to next instruction (first compute addresses)
193    synth.add_pass(microprobe.passes.address.UpdateInstructionAddressesPass())
194    synth.add_pass(microprobe.passes.branch.BranchNextPass())
195
196    # --> Set the dependency distance and the default allocation. Dependency
197    #     distance is randomly picked
198    synth.add_pass(
199        microprobe.passes.register.DefaultRegisterAllocationPass(
200            dd=random.randint(1, 20)
201        )
202    )
203
204    # Generate the benchmark (applies the passes)
205    # Since it is a randomly generated code, the generation might fail
206    # (e.g. not enough accesses to fulfill the requested memory model, etc.)
207    # Because of that, we handle the exception accordingly.
208    try:
209        print_info("Synthesizing %d..." % name)
210        bench = synth.synthesize()
211        print_info("Synthesized %d!" % name)
212        # Save the benchmark
213        synth.save("%s/random-%s" % (DIRECTORY, name), bench=bench)
214    except MicroprobeError:
215        print_info("Synthesizing error in '%s'. This is Ok." % name)
216
217    return True
218
219
220if __name__ == '__main__':
221    # run main if executed from the command line
222    # and the main method exists
223
224    if len(sys.argv) != 2:
225        print_info("Usage:")
226        print_info("%s output_dir" % (sys.argv[0]))
227        exit(-1)
228
229    DIRECTORY = sys.argv[1]
230
231    if not os.path.isdir(DIRECTORY):
232        print_error("Output directory '%s' does not exist" % (DIRECTORY))
233        exit(-1)
234
235    if callable(locals().get('main')):
236        main()
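
The random memory model built at the start of generate() can be isolated into a small standalone function. The sketch below follows the same steps as the listing (draw a weight per cache level from what is left of 100, give the remainder to a randomly chosen level, and force the last level to zero). random_weights is a hypothetical helper written for illustration, and the reading of the last level as main memory is an assumption.

```python
import random


def random_weights(num_levels, rng=random):
    """Return num_levels integer percentages that sum to 100, last one zero."""
    cache_levels = num_levels - 1  # every level except the last one
    weights = []
    remaining = 100
    for _ in range(cache_levels):
        weight = rng.randint(0, remaining)
        weights.append(weight)
        remaining -= weight
    # Give whatever is left to one randomly chosen cache level.
    weights[rng.randint(0, cache_levels - 1)] += remaining
    # The last level (presumably main memory) always gets zero.
    weights.append(0)
    return weights


rng = random.Random(7)  # fixed seed so repeated runs print the same models
samples = [random_weights(4, rng) for _ in range(5)]
for sample in samples:
    print(sample, sum(sample))
```

Because each draw is bounded by what remains, the first level tends to receive the largest share; the redistribution step only guarantees that the total is exactly 100.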

power_v206_power7_ppc64_linux_gcc_custom.py

The following example shows several ways to customize the generation of microbenchmarks:

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15
 16"""
 17power_v206_power7_ppc64_linux_gcc_custom.py
 18
 19Example python script to show how to customize the generation of microbenchmarks.
 20"""
 21
 22# Futures
 23from __future__ import absolute_import
 24
 25# Built-in modules
 26import os
 27import sys
 28
 29# Own modules
 30import microprobe.code
 31import microprobe.passes.initialization
 32import microprobe.passes.instruction
 33import microprobe.passes.memory
 34import microprobe.passes.register
 35import microprobe.passes.structure
 36from microprobe.exceptions import MicroprobeTargetDefinitionError
 37from microprobe.model.memory import EndlessLoopDataMemoryModel
 38from microprobe.target import import_definition
 39from microprobe.utils.cmdline import print_error, print_info
 40from microprobe.utils.misc import RNDINT
 41
 42__author__ = "Ramon Bertran"
 43__copyright__ = "Copyright 2011-2021 IBM Corporation"
 44__credits__ = []
 45__license__ = "IBM (c) 2011-2021 All rights reserved"
 46__version__ = "0.5"
 47__maintainer__ = "Ramon Bertran"
 48__email__ = "rbertra@us.ibm.com"
 49__status__ = "Development"  # "Prototype", "Development", or "Production"
 50
 51# Benchmark size
 52BENCHMARK_SIZE = 8 * 1024
 53
 54if len(sys.argv) != 2:
 55    print_info("Usage:")
 56    print_info("%s output_dir" % (sys.argv[0]))
 57    exit(-1)
 58
 59DIRECTORY = sys.argv[1]
 60
 61if not os.path.isdir(DIRECTORY):
 62    print_info("Output DIRECTORY '%s' does not exists" % (DIRECTORY))
 63    exit(-1)
 64
 65# Get the target definition
 66try:
 67    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
 68except MicroprobeTargetDefinitionError as exc:
 69    print_error("Unable to import target definition")
 70    print_error("Exception message: %s" % str(exc))
 71    exit(-1)
 72
 73
 74###############################################################################
 75# Example 1: loop with instructions accessing storage , hitting the first     #
 76#            level of cache and with dependency distance of 3                 #
 77###############################################################################
 78def example_1():
 79    """ Example 1 """
 80    name = "L1-LOADS"
 81
 82    base_element = [element for element in TARGET.elements.values()
 83                    if element.name == 'L1D'][0]
 84    cache_hierarchy = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
 85        base_element)
 86
 87    model = [0] * len(cache_hierarchy)
 88    model[0] = 100
 89
 90    mmodel = EndlessLoopDataMemoryModel("random-%s", cache_hierarchy, model)
 91
 92    profile = {}
 93    for instr_name in sorted(TARGET.instructions.keys()):
 94        instr = TARGET.instructions[instr_name]
 95        if not instr.access_storage:
 96            continue
 97        if instr.privileged:  # Skip privileged
 98            continue
 99        if instr.hypervisor:  # Skip hypervisor
100            continue
101        if "String" in instr.description:  # Skip unsupported string instr.
102            continue
103        if "ultiple" in instr.description:  # Skip unsupported mult. ld/sts
104            continue
105        if instr.category in [
106                'DS',
107                'LMA',
108                'LMV',
109                'EC']:  # Skip unsupported categories
110            continue
111        if instr.access_storage_with_update:  # Not supported
112            continue
113
114        if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1', ]:
115            continue
116
117        if (any([moper.is_load
118                 for moper in instr.memory_operand_descriptors]) and
119            all(
120                [not moper.is_store
121                 for moper in instr.memory_operand_descriptors])):
122            profile[instr] = 1
123
124    cwrapper = microprobe.code.get_wrapper("CInfPpc")
125    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
126
127    synth.add_pass(
128        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
129    synth.add_pass(
130        microprobe.passes.initialization.InitializeRegistersPass(
131            value=RNDINT))
132    synth.add_pass(
133        microprobe.passes.initialization.InitializeRegisterPass(
134            "GPR1", 0, force=True, reserve=True
135        )
136    )
137    synth.add_pass(
138        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile)
139    )
140    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(mmodel))
141    synth.add_pass(
142        microprobe.passes.register.DefaultRegisterAllocationPass(
143            dd=3))
144
145    print_info("Generating %s..." % name)
146    bench = synth.synthesize()
147    print_info("%s Generated!" % name)
148    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
149
150
151###############################################################################
152# Example 2: loop with instructions using the MUL unit and with dependency    #
153#            distance of 4                                                    #
154###############################################################################
155def example_2():
156    """ Example 2 """
157    name = "FXU-MUL"
158
159    cwrapper = microprobe.code.get_wrapper("CInfPpc")
160    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
161
162    synth.add_pass(
163        microprobe.passes.initialization.InitializeRegistersPass(
164            value=RNDINT))
165    synth.add_pass(
166        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
167    synth.add_pass(
168        microprobe.passes.instruction.SetInstructionTypeByElementPass(
169            TARGET,
170            [TARGET.elements['MUL_FXU0_Core0_SCM_Processor']],
171            {}))
172    synth.add_pass(
173        microprobe.passes.register.DefaultRegisterAllocationPass(
174            dd=4))
175
176    print_info("Generating %s..." % name)
177    bench = synth.synthesize()
178    print_info("%s Generated!" % name)
179    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
180
181
182###############################################################################
183# Example 3: loop with instructions using the ALU unit and with dependency    #
184#            distance of 1                                                    #
185###############################################################################
186def example_3():
187    """ Example 3 """
188    name = "FXU-ALU"
189
190    cwrapper = microprobe.code.get_wrapper("CInfPpc")
191    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
192
193    synth.add_pass(
194        microprobe.passes.initialization.InitializeRegistersPass(
195            value=RNDINT))
196    synth.add_pass(
197        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
198    synth.add_pass(
199        microprobe.passes.instruction.SetInstructionTypeByElementPass(
200            TARGET,
201            [TARGET.elements['ALU_FXU0_Core0_SCM_Processor']],
202            {}))
203    synth.add_pass(
204        microprobe.passes.register.DefaultRegisterAllocationPass(
205            dd=1))
206
207    print_info("Generating %s..." % name)
208    bench = synth.synthesize()
209    print_info("%s Generated!" % name)
210    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
211
212
213###############################################################################
214# Example 4: loop with FMUL* instructions with different weights and with     #
215#            dependency distance 10                                           #
216###############################################################################
217def example_4():
218    """ Example 4 """
219    name = "VSU-FMUL"
220
221    profile = {}
222    profile[TARGET.instructions['FMUL_V0']] = 4
223    profile[TARGET.instructions['FMULS_V0']] = 3
224    profile[TARGET.instructions['FMULx_V0']] = 2
225    profile[TARGET.instructions['FMULSx_V0']] = 1
226
227    cwrapper = microprobe.code.get_wrapper("CInfPpc")
228    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
229
230    synth.add_pass(
231        microprobe.passes.initialization.InitializeRegistersPass(
232            value=RNDINT))
233    synth.add_pass(
234        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
235    synth.add_pass(
236        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
237    synth.add_pass(
238        microprobe.passes.register.DefaultRegisterAllocationPass(
239            dd=10))
240
241    print_info("Generating %s..." % name)
242    bench = synth.synthesize()
243    print_info("%s Generated!" % name)
244    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
245
246
247###############################################################################
248# Example 5: loop with FADD* instructions with different weights and with     #
249#            dependency distance 1                                            #
250###############################################################################
251def example_5():
252    """ Example 5 """
253    name = "VSU-FADD"
254
255    profile = {}
256    profile[TARGET.instructions['FADD_V0']] = 100
257    profile[TARGET.instructions['FADDx_V0']] = 1
258    profile[TARGET.instructions['FADDS_V0']] = 10
259    profile[TARGET.instructions['FADDSx_V0']] = 1
260
261    cwrapper = microprobe.code.get_wrapper("CInfPpc")
262    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
263
264    synth.add_pass(
265        microprobe.passes.initialization.InitializeRegistersPass(
266            value=RNDINT))
267    synth.add_pass(
268        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
269    synth.add_pass(
270        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
271    synth.add_pass(
272        microprobe.passes.register.DefaultRegisterAllocationPass(
273            dd=1))
274
275    print_info("Generating %s..." % name)
276    bench = synth.synthesize()
277    print_info("%s Generated!" % name)
278    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
279
280
281###############################################################################
282# Call the examples                                                           #
283###############################################################################
284example_1()
285example_2()
286example_3()
287example_4()
288example_5()
289exit(0)
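
Examples 1, 4 and 5 describe the instruction mix with a `profile` dictionary that maps instructions to weights. As a rough illustration, and assuming the weights act as relative proportions (an assumption; the listing does not spell out how `SetInstructionTypeByProfilePass` interprets them), the sketch below converts the weights of Example 4 into expected instruction counts for a block of `BENCHMARK_SIZE` instructions. The helper `expected_mix` is hypothetical and uses plain strings in place of Microprobe instruction objects.

```python
BENCHMARK_SIZE = 8 * 1024


def expected_mix(profile, size):
    """Split `size` slots among the keys of `profile` in proportion to weight.

    Weights are expected to be non-negative integers. The largest-remainder
    method is used so that the counts always add up to `size`.
    """
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile needs at least one positive weight")

    # Integer share of each instruction, plus the remainder of the division.
    counts = {}
    remainders = []
    for name, weight in profile.items():
        quotient, remainder = divmod(weight * size, total)
        counts[name] = quotient
        remainders.append((remainder, name))

    # Hand out the slots lost to rounding, largest remainder first.
    leftover = size - sum(counts.values())
    for _, name in sorted(remainders, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Weights taken from Example 4 (names used as plain strings here).
    profile = {"FMUL_V0": 4, "FMULS_V0": 3, "FMULx_V0": 2, "FMULSx_V0": 1}
    mix = expected_mix(profile, BENCHMARK_SIZE)
    for name, count in mix.items():
        share = 100.0 * count / BENCHMARK_SIZE
        print("%-10s %5d (%.1f%%)" % (name, count, share))
```

Under the same assumption, the weights 100, 1, 10 and 1 in Example 5 would produce a loop dominated by `FADD_V0`, with the other three variants appearing only occasionally.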

power_v206_power7_ppc64_linux_gcc_genetic.py

Deprecated since version 0.5: Support for PyEvolve and genetic-algorithm-based searches has been discontinued.

The following example shows how to use the design exploration module and the genetic-algorithm-based searches to look for a solution. In particular, for each functional unit of the architecture and a range of IPC (instructions per cycle) values, the example looks for a microbenchmark that stresses that functional unit at the given IPC. External commands (not included) are needed to evaluate the generated microbenchmarks on the target platform.
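
The external evaluation command is expected to print a score that grows as the measured IPC approaches the target IPC. A minimal Python sketch of such a scoring function follows, using the inverse squared distance suggested in the docstring of the script. The `max_score` cap for an exact match is an assumption added here to avoid a division by zero; it is not part of the original example.

```python
def ipc_score(measured_ipc, target_ipc, max_score=1e6):
    """Return a score that increases as measured_ipc approaches target_ipc.

    The score is (1 / (measured - target)) ** 2, capped at max_score so that
    an exact match does not divide by zero.
    """
    distance = abs(measured_ipc - target_ipc)
    if distance == 0:
        return max_score
    return min(max_score, (1.0 / distance) ** 2)


if __name__ == "__main__":
    target = 1.5
    for measured in (0.5, 1.0, 1.4, 1.5):
        score = ipc_score(measured, target)
        print("measured %.2f -> score %.2f" % (measured, score))
```

If the evaluation command used this formula, the `score < 20` check in the script would correspond to a measured IPC more than about 0.22 away from the target. The example script itself is listed next.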

#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
power_v206_power7_ppc64_linux_gcc_genetic.py

Example python script to show how to generate a set of microbenchmarks
stressing a particular unit but at different IPC ratios using a genetic
search algorithm to play with two knobs: average latency and dependency
distance.

An IPC evaluation and scoring script is required. For instance:

.. code:: bash

   #!/bin/bash
   # ARGS: $1 is the target IPC
   #       $2 is the name of the generated benchmark
   target_ipc=$1
   source_bench=$2

   # Compile the benchmark
   gcc -O0 -mcpu=power7 -mtune=power7 -std=c99 $source_bench.c -o $source_bench

   # Evaluate the ipc
   ipc=< your preferred commands to evaluate the IPC >

   # Compute the score (the closer to the target IPC, the higher the score)
   score=$(echo "(1/($ipc-$target_ipc))^2" | bc -l)

   echo $score

Use the script above as a template for your own GA-based search.
"""

# Futures
from __future__ import absolute_import, division

# Built-in modules
import datetime
import os
import sys
import time as runtime

# Third party modules
from six.moves import range

# Own modules
import microprobe.code
import microprobe.driver.genetic
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.register
import microprobe.passes.structure
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info, print_warning
from microprobe.utils.misc import RNDINT

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 20

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)


def main():
    """Main function."""

    component_list = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]
    ipcs = [float(x) / 10 for x in range(1, 41)]
    ipcs = ipcs[5:] + ipcs[:5]

    for name in component_list:
        for ipc in ipcs:
            generate_genetic(name, ipc)


def generate_genetic(compname, ipc):
    """Generate a microbenchmark stressing compname at the given ipc."""
    comps = []
    bcomps = []
    any_comp = False

    if compname.find("FXU") >= 0:
        comps.append(TARGET.elements["FXU0_Core0_SCM_Processor"])

    if compname.find("VSU") >= 0:
        comps.append(TARGET.elements["VSU0_Core0_SCM_Processor"])

    if len(comps) == 2:
        any_comp = True
    elif compname.find("noLSU") >= 0:
        bcomps.append(TARGET.elements["LSU0_Core0_SCM_Processor"])
    elif compname.find("LSU") >= 0:
        comps.append(TARGET.elements["LSU_Core0_SCM_Processor"])

    if (len(comps) == 1 and ipc > 2) or (len(comps) == 2 and ipc > 4):
        return True

    for elem in os.listdir(DIRECTORY):
        if not elem.endswith(".c"):
            continue
        if elem.startswith("%s:IPC:%.2f:DIST" % (compname, ipc)):
            print_info("Already generated: %s %.2f" % (compname, ipc))
            return True

    print_info("Going for IPC: %f and Element: %s" % (ipc, compname))

    def generate(name, *args):
        """Benchmark generation function.

        First argument is name, second the dependency distance and the
        third is the average instruction latency.
        """
        dist, latency = args

        wrapper = microprobe.code.get_wrapper("CInfPpc")
        synth = microprobe.code.Synthesizer(TARGET, wrapper())
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=RNDINT))
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
        )
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeByElementPass(
                TARGET,
                comps,
                {},
                block=bcomps,
                avelatency=latency,
                any_comp=any_comp))
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                dd=dist))
        bench = synth.synthesize()
        synth.save(name, bench=bench)

    # Set the genetic algorithm parameters
    ga_params = []
    ga_params.append((0, 20, 0.05))  # Average dependency distance design space
    ga_params.append((2, 8, 0.05))  # Average instruction latency design space

    # Set up the search driver
    driver = microprobe.driver.genetic.ExecCmdDriver(
        generate, 20, 30, 30, "'%s' %f " %
        (COMMAND, ipc), ga_params)

    starttime = runtime.time()
    print_info("Start search...")
    driver.run(1)
    print_info("Search end")
    endtime = runtime.time()

    print_info("Genetic time::%s" % (
        datetime.timedelta(seconds=endtime - starttime))
    )

    # Check if we found a solution
    ga_params = driver.solution()
    score = driver.score()

    print_info("IPC found: %f, score: %f" % (ipc, score))

    if score < 20:
        print_warning("Unable to find an optimal solution with IPC: %f:" % ipc)
        print_info("Generating the closest solution...")
        generate(
            "%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f-check" %
            (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
            ga_params[0], ga_params[1]
        )
        print_info("Closest solution generated")
    else:
        print_info(
            "Solution found for %s and IPC %f -> dist: %f , "
            "latency: %f " %
            (compname, ipc, ga_params[0], ga_params[1]))
        print_info("Generating solution...")
        generate("%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f" %
                 (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
                 ga_params[0], ga_params[1]
                 )
        print_info("Solution generated")
    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 3:
        print_info("Usage:")
        print_info("%s output_dir eval_cmd" % (sys.argv[0]))
        print_info("")
        print_info("Output dir: output directory for the generated benchmarks")
        print_info("eval_cmd: command accepting 2 parameters: the target IPC")
        print_info("          and the filename of the generated benchmark. ")
        print_info("          Output: the score used for the GA search. E.g.")
        print_info("          the closer the benchmark IPC is to the target")
        print_info("          IPC, the higher the score the cmd should")
        print_info("          return. ")
        exit(-1)

    DIRECTORY = sys.argv[1]
    COMMAND = sys.argv[2]

    if not os.path.isdir(DIRECTORY):
        print_error("Output directory '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if not os.path.isfile(COMMAND):
        print_error("The command '%s' does not exist" % (COMMAND))
        exit(-1)

    if callable(locals().get('main')):
        main()
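
The search in the script explores two knobs, the dependency distance and the average instruction latency, and keeps the combination with the best score. Since the genetic driver is deprecated, the sketch below swaps in a plain exhaustive grid search to illustrate the same generate, evaluate and select loop. It is not the genetic algorithm used by `microprobe.driver.genetic`. The names `frange`, `grid_search` and the toy `evaluate` function are hypothetical, each tuple in the search space is assumed to mean (minimum, maximum, step), and the coarse step of 0.5 is chosen only to keep the example fast.

```python
import itertools


def frange(start, stop, step):
    """Yield start, start + step, ... up to and including stop."""
    count = int(round((stop - start) / step))
    for index in range(count + 1):
        yield start + index * step


def grid_search(evaluate, params):
    """Return (best_point, best_score) over the grid described by params.

    params is a list of (minimum, maximum, step) tuples, one per knob.
    evaluate takes one value per knob and returns a score (higher is better).
    """
    axes = [list(frange(low, high, step)) for low, high, step in params]
    best_point, best_score = None, float("-inf")
    for point in itertools.product(*axes):
        score = evaluate(*point)
        if score > best_score:
            best_point, best_score = point, score
    return best_point, best_score


def evaluate(dist, latency):
    """Toy stand-in for generating and measuring a benchmark.

    Pretends the best configuration is dist=6, latency=3.5.
    """
    return 1.0 / (1.0 + (dist - 6.0) ** 2 + (latency - 3.5) ** 2)


if __name__ == "__main__":
    # Same ranges as ga_params in the script, with a coarser step.
    search_space = [(0, 20, 0.5), (2, 8, 0.5)]
    (dist, latency), score = grid_search(evaluate, search_space)
    print("best dist=%.1f latency=%.1f score=%.2f" % (dist, latency, score))
```

In a real setup, `evaluate` would generate a benchmark for the given knob values and run the external evaluation command on it. An exhaustive search costs one benchmark build and measurement per grid point, which is the cost a guided search tries to reduce.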