Examples on POWER

In the definitions/power/examples directory of the Microprobe distribution (if you installed the microprobe_target_power package), you will find several examples showing how to use Microprobe for the POWER architecture. Although the examples are split by architecture, the concepts they introduce are common to all architectures.

We recommend that users go through the code of these examples to understand the specific details of how to use the framework.

isa_power_v206_info.py

The first example is isa_power_v206_info.py. It shows how to search for architecture definitions (e.g. the ISA properties), how to import a definition and how to dump it. Executing the following command:

> ./isa_power_v206_info.py

generates the following output, which shows all the details of the POWER v2.06 architecture (first and last 20 lines for brevity):

--------------------------------------------------------------------------------
ISA Name: power_v206
ISA Description: power_v206
--------------------------------------------------------------------------------
Register Types:
     GPR: General Register (bit size: 64)
    VSCR: Vector Status and Control Register (bit size: 32)
     FPR: Floating-Point Register (bit size: 64)
     SPR: Special Purpose Register (64 bits) (bit size: 64)
      VR: Vector Register (bit size: 128)
     MSR: Machine State Register (bit size: 64)
   SPR32: Special Purpose Register (32 bits) (bit size: 32)
     VSR: Vector Scalar Register (bit size: 128)
   FPSCR: Floating-Point Status and Control Register (bit size: 32)
      CR: Condition Register (bit size: 4)
--------------------------------------------------------------------------------
Architected registers:
    AESR : AESR Register (Type: SPR)
    AMOR : AMOR Register (Type: SPR)
     AMR : Authority Mask Register (Type: SPR)
...
	access_storage              :	False	(Boolean indicating if the instruction has storage operands                                                          )
	access_storage_with_update  :	False	(Boolean indicating if the instruction accesses to storage and updates the source register with the generated address)
	algebraic                   :	False	(Boolean indicating if operation uses algebraic rules to keep values                                                 )
	branch                      :	False	(Boolean indicating if the instruction is a branch                                                                   )
	branch_conditional          :	False	(Boolean indicating if the instruction is a branch conditional                                                       )
	branch_relative             :	False	(Boolean indicating if the instruction is a relative branch                                                          )
	category                    :	VSX  	(String indicating if the instruction the instruction category                                                       )
	decimal                     :	False	(Boolean indication if the instruction requires inputs in decimal format                                             )
	disable_asm                 :	False	(Boolean indicating if ASM generation is disabled for the instruction. If so, binary codification is used.           )
	hypervisor                  :	False	(Boolean indicating if the instruction need hypervisor mode                                                          )
	privileged                  :	False	(Boolean indicating if the instruction is privileged                                                                 )
	privileged_optional         :	False	(Boolean indicating the instrucion is priviledged or not depending on the input values                               )
	switching                   :	None 	(Input values required to maximize the computational switching                                                       )
	syscall                     :	False	(Boolean indicating if the instruction is a syscall or return from one                                               )
	trap                        :	False	(Boolean indicating if the instruction is a trap                                                                     )


 Instructions defined: 938 
 Variants defined: 964 
--------------------------------------------------------------------------------

The following code is what has been executed:

 1#!/usr/bin/env python
 2# Copyright 2011-2021 IBM Corporation
 3#
 4# Licensed under the Apache License, Version 2.0 (the "License");
 5# you may not use this file except in compliance with the License.
 6# You may obtain a copy of the License at
 7#
 8# http://www.apache.org/licenses/LICENSE-2.0
 9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15"""
16isa_power_v206_info.py
17
 18Example module to show how to access ISA definitions.
19"""
20
21# Futures
22from __future__ import absolute_import, print_function
23
24# Built-in modules
25import os
26
27# Own modules
28from microprobe.target.isa import find_isa_definitions, import_isa_definition
29
30__author__ = "Ramon Bertran"
31__copyright__ = "Copyright 2011-2021 IBM Corporation"
32__credits__ = []
33__license__ = "IBM (c) 2011-2021 All rights reserved"
34__version__ = "0.5"
35__maintainer__ = "Ramon Bertran"
36__email__ = "rbertra@us.ibm.com"
37__status__ = "Development"  # "Prototype", "Development", or "Production"
38
39# Constants
40ISANAME = "power_v206"
41
42# Functions
43
44# Classes
45
46# Main
47
48# Search and import definition
49ISADEF = import_isa_definition(
50    os.path.dirname([
51        isa for isa in find_isa_definitions() if isa.name == ISANAME
52    ][0].filename))
53
54# Print definition
55print((ISADEF.full_report()))
56exit(0)

In this simple code, the find_isa_definitions and import_isa_definition functions are first imported from the microprobe.target.isa module (line 28). The first one is then used to look for architecture definitions: the returned list is filtered, and only the definition named power_v206 is imported using the second function, import_isa_definition (lines 49-52). Finally, the full report of the ISADEF object is printed to standard output in line 55.

In this case, the full report is printed, but the user can query any information about the imported ISA through the microprobe.target.isa.ISA API.
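As a minimal sketch of such a query, the helper below counts instructions per category while skipping privileged and hypervisor ones. It relies only on attributes that appear in the examples on this page (category, privileged, hypervisor) and assumes the instruction collection is a name-to-definition dictionary. The helper name and the stand-in objects built with SimpleNamespace are hypothetical; they are used so that the snippet runs without Microprobe installed.

```python
from collections import Counter
from types import SimpleNamespace


def user_mode_categories(instructions):
    """Count non-privileged, non-hypervisor instructions per category."""
    counts = Counter()
    for instr in instructions.values():
        if instr.privileged or instr.hypervisor:
            continue
        counts[instr.category] += 1
    return counts


# Hypothetical stand-ins for instruction definitions (not real ISA data).
fake_isa = {
    "INS_A": SimpleNamespace(category="VSX", privileged=False,
                             hypervisor=False),
    "INS_B": SimpleNamespace(category="VSX", privileged=False,
                             hypervisor=False),
    "INS_C": SimpleNamespace(category="B", privileged=True,
                             hypervisor=False),
}

summary = user_mode_categories(fake_isa)
print(dict(summary))
```

With Microprobe available, passing ISADEF.instructions instead of the stand-in dictionary should give the real per-category counts, assuming the instruction definitions expose the same attributes as in the profile example.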

power_v206_power7_ppc64_linux_gcc_profile.py

The aim of this example is to show how code generation works in Microprobe. In particular, this example shows how to generate, for each instruction of the ISA, an endless loop containing that instruction. The size of the loop and the dependency distance between the instructions of the loop can be specified as parameters. Using Microprobe, you can generate thousands of microbenchmarks in a few minutes. Let's start with the command line interface. Executing:

> ./power_v206_power7_ppc64_linux_gcc_profile.py --help

will generate the following output:

power_v206_power7_ppc64_linux_gcc_profile.py: INFO: Processing input arguments...
usage: power_v206_power7_ppc64_linux_gcc_profile.py [-h]
                                                    [-P SEARCH_PATH [SEARCH_PATH ...]]
                                                    [-V] [-v] [-d]
                                                    [-i INSTRUCTION_NAME [INSTRUCTION_NAME ...]]
                                                    [--output_prefix PREFIX]
                                                    [-O PATH] [-p NUM_JOBS]
                                                    [-S BENCHMARK_SIZE]
                                                    [-D DEPENDECY_DISTANCE]

ISA power v206 profile example

optional arguments:
  -h, --help            show this help message and exit
  -P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
                        Default search paths for microprobe target definitions
  -V, --version         Show Microprobe version and exit
  -v, --verbosity       Verbosity level (Values: [0,1,2,3,4]). Each time this
                        argument is specified the verbosity level is
                        increased. By default, no logging messages are shown.
                        These are the four levels available:
                        
                          -v (1): critical messages
                          -v -v (2): critical and error messages
                          -v -v -v (3): critical, error and warning messages
                          -v -v -v -v (4): critical, error, warning and info messages
                        
                        Specifying more than four verbosity flags, will
                        default to the maximum of four. If you need extra
                        information, enable the debug mode (--debug or -d
                        flags).
  -d, --debug           Enable debug mode in Microprobe framework. Lots of
                        output messages will be generated
  -i INSTRUCTION_NAME [INSTRUCTION_NAME ...], --instruction INSTRUCTION_NAME [INSTRUCTION_NAME ...]
                        Instruction names to generate. Default: All
                        instructions
  --output_prefix PREFIX
                        Output prefix of the generated files. Default:
                        POWER_V206_PROFILE
  -O PATH, --output_path PATH
                        Output path. Default: current path
  -p NUM_JOBS, --parallel NUM_JOBS
                        Number of parallel jobs. Default: number of CPUs
                        available (80). Valid values: 1, 2, 3, 4, 5, 6, 7, 8,
                        9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                        23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
                        36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
                        49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
                        62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
                        75, 76, 77, 78, 79, 80
  -S BENCHMARK_SIZE, --size BENCHMARK_SIZE
                        Benchmark size (number of instructions in the endless
                        loop). Default: 64 instructions
  -D DEPENDECY_DISTANCE, --dependency_distance DEPENDECY_DISTANCE
                        Average dependency distance between the instructions.
                        Default: 1000 (no dependencies)

Environment variables:

  MICROPROBETEMPLATES    Default path for microprobe templates
  MICROPROBEDEBUG        If set, enable debug
  MICROPROBEDEBUGPASSES  If set, enable debug during passes
  MICROPROBEASMHEXFMT    Assembly hexadecimal format. Options:
                         'all' -> All immediates in hex format
                         'address' -> Address immediates in hex format (default)
                         'none' -> All immediate in integer format
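
The verbosity counting and the range-checked integer options described in the help text above can be approximated with the standard argparse module. The sketch below is not Microprobe's implementation: bounded_int is a hypothetical stand-in for the int_type helper used in the script, and only three of the options are reproduced.

```python
import argparse


def bounded_int(low, high):
    """Return an argparse type callable accepting integers in [low, high]."""
    def convert(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                "%d not in range [%d, %d]" % (value, low, high))
        return value
    return convert


def build_parser():
    """Build a parser with a subset of the options shown in the help."""
    parser = argparse.ArgumentParser(description="CLI sketch (not Microprobe)")
    # Each -v occurrence increments the counter.
    parser.add_argument("-v", "--verbosity", action="count", default=0)
    parser.add_argument("-S", "--size", type=bounded_int(1, 2**20),
                        default=64)
    parser.add_argument("-D", "--dependency_distance",
                        type=bounded_int(1, 1000), default=1000)
    return parser


def parse(argv):
    """Parse argv and cap the verbosity at the maximum level of four."""
    args = build_parser().parse_args(argv)
    args.verbosity = min(args.verbosity, 4)
    return args


example = parse(["-v", "-v", "-v", "-v", "-v", "-S", "128"])
print(example.verbosity, example.size, example.dependency_distance)
```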

Let's look at the code to see how this command line tool is implemented. This is the complete code of the script:

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15"""
 16power_v206_power7_ppc64_linux_gcc_profile.py
 17
 18Example module to show how to generate a benchmark for each instruction
 19of the ISA
 20"""
 21
 22# Futures
 23from __future__ import absolute_import
 24
 25# Built-in modules
 26import multiprocessing as mp
 27import os
 28import sys
 29import traceback
 30
 31# Third party modules
 32from six.moves import map, range
 33
 34# Own modules
 35import microprobe.code.ins
 36import microprobe.passes.address
 37import microprobe.passes.branch
 38import microprobe.passes.decimal
 39import microprobe.passes.float
 40import microprobe.passes.ilp
 41import microprobe.passes.initialization
 42import microprobe.passes.instruction
 43import microprobe.passes.memory
 44import microprobe.passes.register
 45import microprobe.passes.structure
 46import microprobe.utils.cmdline
 47from microprobe import MICROPROBE_RC
 48from microprobe.exceptions import MicroprobeException
 49from microprobe.target import import_definition
 50from microprobe.utils.cmdline import existing_dir, \
 51    int_type, print_error, print_info, print_warning
 52from microprobe.utils.logger import get_logger
 53
 54__author__ = "Ramon Bertran"
 55__copyright__ = "Copyright 2011-2021 IBM Corporation"
 56__credits__ = []
 57__license__ = "IBM (c) 2011-2021 All rights reserved"
 58__version__ = "0.5"
 59__maintainer__ = "Ramon Bertran"
 60__email__ = "rbertra@us.ibm.com"
 61__status__ = "Development"  # "Prototype", "Development", or "Production"
 62
 63# Constants
 64LOG = get_logger(__name__)  # Get the generic logging interface
 65
 66
 67# Functions
 68def main_setup():
 69    """
 70    Set up the command line interface (CLI) with the arguments required by
 71    this command line tool.
 72    """
 73
 74    args = sys.argv[1:]
 75
 76    # Create the CLI interface object
 77    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
 78                                           config_options=False,
 79                                           target_options=False,
 80                                           debug_options=False)
 81
 82    # Add the different parameters for this particular tool
 83    cmdline.add_option(
 84        "instruction",
 85        "i",
 86        None,
 87        "Instruction names to generate. Default: All instructions",
 88        required=False,
 89        nargs="+",
 90        metavar="INSTRUCTION_NAME")
 91
 92    cmdline.add_option(
 93        "output_prefix",
 94        None,
 95        "POWER_V206_PROFILE",
 96        "Output prefix of the generated files. Default: POWER_V206_PROFILE",
 97        opt_type=str,
 98        required=False,
 99        metavar="PREFIX")
100
101    cmdline.add_option("output_path",
102                       "O",
103                       "./",
104                       "Output path. Default: current path",
105                       opt_type=existing_dir,
106                       metavar="PATH")
107
108    cmdline.add_option(
109        "parallel",
110        "p",
111        MICROPROBE_RC['cpus'],
112        "Number of parallel jobs. Default: number of CPUs available (%s)" %
113        mp.cpu_count(),
114        opt_type=int,
115        choices=list(range(1, MICROPROBE_RC['cpus'] + 1)),
116        metavar="NUM_JOBS")
117
118    cmdline.add_option(
119        "size",
120        "S",
121        64, "Benchmark size (number of instructions in the endless loop). "
122        "Default: 64 instructions",
123        opt_type=int_type(1, 2**20),
124        metavar="BENCHMARK_SIZE")
125
126    cmdline.add_option("dependency_distance",
127                       "D",
128                       1000,
129                       "Average dependency distance between the instructions. "
130                       "Default: 1000 (no dependencies)",
131                       opt_type=int_type(1, 1000),
132                       metavar="DEPENDECY_DISTANCE")
133
134    # Start the main
135    print_info("Processing input arguments...")
136    cmdline.main(args, _main)
137
138
139def _main(arguments):
140    """
141    Main program. Called after the arguments from the CLI interface have
142    been processed.
143    """
144
145    print_info("Arguments processed!")
146
147    print_info("Importing target definition "
148               "'power_v206-power7-ppc64_linux_gcc'...")
149    target = import_definition("power_v206-power7-ppc64_linux_gcc")
150
151    # Get the arguments
152    instructions = arguments.get("instruction", None)
153    prefix = arguments["output_prefix"]
154    output_path = arguments["output_path"]
155    parallel_jobs = arguments["parallel"]
156    size = arguments["size"]
157    distance = arguments["dependency_distance"]
158
159    # Process the arguments
160    if instructions is not None:
161
162        # If the user has provided some instructions, make sure they
163        # exist and then we call the generation function
164
165        instructions = _validate_instructions(instructions, target)
166
167        if len(instructions) == 0:
168            print_error("No valid instructions defined.")
169            exit(-1)
170
171        # Set more verbose level
172        # set_log_level(10)
173        #
174        list(
175            map(_generate_benchmark,
176                [(instruction, prefix, output_path, target, size, distance)
177                 for instruction in instructions]))
178
179    else:
180
181        # If the user has not provided any instruction, go for all of them
182        # and then call he generation function
183
184        instructions = _generate_instructions(target, output_path, prefix)
185
186        # Since several benchmarks will be generated, reduce verbose level
187        # and call the generation function in parallel
188
189        # set_log_level(30)
190
191        if parallel_jobs > 1:
192            pool = mp.Pool(processes=parallel_jobs)
193            pool.map(
194                _generate_benchmark,
195                [(instruction, prefix, output_path, target, size, distance)
196                 for instruction in instructions], 1)
197        else:
198            list(
199                map(_generate_benchmark,
200                    [(instruction, prefix, output_path, target, size, distance)
201                     for instruction in instructions]))
202
203
204def _validate_instructions(instructions, target):
205    """
206    Validate the provided instruction for a given target
207    """
208
209    nins = []
210    for instruction in instructions:
211
212        if instruction not in list(target.isa.instructions.keys()):
213            print_warning("'%s' not defined in the ISA. Skipping..." %
214                          instruction)
215            continue
216        nins.append(instruction)
217    return nins
218
219
220def _generate_instructions(target, path, prefix):
221    """
222    Generate the list of instruction to be generated for a given target
223    """
224
225    instructions = []
226    for name, instr in target.instructions.items():
227
228        if instr.privileged or instr.hypervisor:
229            # Skip priv/hyper instructions
230            continue
231
232        if instr.branch and not instr.branch_relative:
233            # Skip branch absolute due to relocation problems
234            continue
235
236        if instr.category in ['LMA', 'LMV', 'DS', 'EC']:
237            # Skip some instruction categories
238            continue
239
240        if name in [
241                'LSWI_V0', 'LSWX_V0', 'LMW_V0', 'STSWX_V0', 'LD_V1', 'LWZ_V1',
242                'STW_V1'
243        ]:
244            # Some instructions are not completely supported yet
245            # String-related instructions and load multiple
246
247            continue
248
249        # Skip if the files already exists
250
251        fname = "%s/%s_%s.c" % (path, prefix, name)
252        ffname = "%s/%s_%s.c.fail" % (path, prefix, name)
253
254        if os.path.isfile(fname):
255            print_warning("Skip %s. '%s' already generated" % (name, fname))
256            continue
257
258        if os.path.isfile(ffname):
259            print_warning("Skip %s. '%s' already generated (failed)" %
260                          (name, ffname))
261            continue
262
263        instructions.append(name)
264
265    return instructions
266
267
268def _generate_benchmark(args):
269    """
270    Actual benchmark generation policy. This is the function that defines
271    how the microbenchmark are going to be generated
272    """
273
274    instr_name, prefix, output_path, target, size, distance = args
275
276    try:
277
278        # Name of the output file
279        fname = "%s/%s_%s" % (output_path, prefix, instr_name)
280
281        # Name of the fail output file (generated in case of exception)
282        ffname = "%s.c.fail" % (fname)
283
284        print_info("Generating %s ..." % (fname))
285
286        instruction = microprobe.code.ins.Instruction()
287        instruction.set_arch_type(target.instructions[instr_name])
288        sequence = [target.instructions[instr_name]]
289
290        # Get the wrapper object. The wrapper object is in charge of
291        # translating the internal representation of the microbenchmark
292        # to the final output format.
293        #
294        # In this case, we obtain the 'CInfGen' wrapper, which embeds
295        # the generated code within an infinite loop using C plus
296        # in-line assembly statements.
297        cwrapper = microprobe.code.get_wrapper("CInfGen")
298
299        # Create the synthesizer object, which is in charge of driving the
300        # generation of the microbenchmark, given a set of passes
301        # (a.k.a. transformations) to apply to an empty internal
302        # representation of the microbenchmark
303        synth = microprobe.code.Synthesizer(target,
304                                            cwrapper(),
305                                            value=0b01010101)
306
307        # Add the transformation passes
308
309        #######################################################################
310        # Pass 1: Init integer registers to a given value                     #
311        #######################################################################
312        synth.add_pass(
313            microprobe.passes.initialization.InitializeRegistersPass(
314                value=_init_value()))
315        floating = False
316        vector = False
317
318        for operand in instruction.operands():
319            if operand.type.immediate:
320                continue
321
322            if operand.type.float:
323                floating = True
324
325            if operand.type.vector:
326                vector = True
327
328        if vector and floating:
329            ###################################################################
330            # Pass 1.A: if instruction uses vector floats, init vector        #
331            #           registers to float values                             #
332            ###################################################################
333            synth.add_pass(
334                microprobe.passes.initialization.InitializeRegistersPass(
335                    v_value=(1.000000000000001, 64)))
336        elif vector:
337            ###################################################################
338            # Pass 1.B: if instruction uses vector but not floats, init       #
339            #           vector registers to integer value                     #
340            ###################################################################
341            synth.add_pass(
342                microprobe.passes.initialization.InitializeRegistersPass(
343                    v_value=(_init_value(), 64)))
344        elif floating:
345            ###################################################################
346            # Pass 1.C: if instruction uses floats, init float                #
347            #           registers to float values                             #
348            ###################################################################
349            synth.add_pass(
350                microprobe.passes.initialization.InitializeRegistersPass(
351                    fp_value=1.000000000000001))
352
353        #######################################################################
354        # Pass 2: Add a building block of size 'size'                         #
355        #######################################################################
356        synth.add_pass(
357            microprobe.passes.structure.SimpleBuildingBlockPass(size))
358
359        #######################################################################
360        # Pass 3: Fill the building block with the instruction sequence       #
361        #######################################################################
362        synth.add_pass(
363            microprobe.passes.instruction.SetInstructionTypeBySequencePass(
364                sequence))
365
366        #######################################################################
367        # Pass 4: Compute addresses of instructions (this pass is needed to   #
368        #         update the internal representation information so that in   #
369        #         case addresses are required, they are up to date).          #
370        #######################################################################
371        synth.add_pass(
372            microprobe.passes.address.UpdateInstructionAddressesPass())
373
374        #######################################################################
375        # Pass 5: Set target of branches to be the next instruction in the    #
376        #         instruction stream                                          #
377        #######################################################################
378        synth.add_pass(microprobe.passes.branch.BranchNextPass())
379
380        #######################################################################
381        # Pass 6: Set memory-related operands to access 16 storage locations  #
382        #         in a round-robin fashion in stride 256 bytes.               #
383        #         The pattern would be: 0, 256, 512, .... 3840, 0, 256, ...   #
384        #######################################################################
385        synth.add_pass(microprobe.passes.memory.SingleMemoryStreamPass(
386            16, 256))
387
388        #######################################################################
389        # Pass 7.A: Initialize the storage locations accessed by floating     #
390        #           point instructions to have a valid floating point value   #
391        #######################################################################
392        synth.add_pass(
393            microprobe.passes.float.InitializeMemoryFloatPass(
394                value=1.000000000000001))
395
396        #######################################################################
397        # Pass 7.B: Initialize the storage locations accessed by decimal      #
398        #           instructions to have a valid decimal value                #
399        #######################################################################
400        synth.add_pass(
401            microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1))
402
403        #######################################################################
404        # Pass 8: Set the remaining instructions operands (if not set)        #
405        #         (Required to set remaining immediate operands)              #
406        #######################################################################
407        synth.add_pass(
408            microprobe.passes.register.DefaultRegisterAllocationPass(
409                dd=distance))
410
411        # Synthesize the microbenchmark. The synthesizer applies the set of
412        # transformation passes added before and returns an object representing
413        # the microbenchmark
414        bench = synth.synthesize()
415
416        # Save the microbenchmark to the file 'fname'
417        synth.save(fname, bench=bench)
418
419        print_info("%s generated!" % (fname))
420
421        # Remove fail file if exists
422        if os.path.isfile(ffname):
423            os.remove(ffname)
424
425    except MicroprobeException:
426
427        # In case of exception during the generation of the microbenchmark,
428        # print the error, write the fail file and exit
429        print_error(traceback.format_exc())
430        open(ffname, 'a').close()
431        exit(-1)
432
433
434def _init_value():
435    """ Return a init value """
436    return 0b0101010101010101010101010101010101010101010101010101010101010101
437
438
439# Main
440if __name__ == '__main__':
441    # run main if executed from the command line
442    # and the main method exists
443
444    if callable(locals().get('main_setup')):
445        main_setup()
446        exit(0)
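
The access pattern described in the comment of Pass 6 (16 storage locations visited round-robin with a stride of 256 bytes) can be sketched in a few lines. The function below is only an illustration derived from that comment; it is a hypothetical helper and does not reflect how SingleMemoryStreamPass is implemented internally.

```python
def stream_offsets(num_locations, stride, count):
    """Return the first `count` byte offsets of a round-robin stream."""
    return [(i % num_locations) * stride for i in range(count)]


offsets = stream_offsets(16, 256, 18)
print(offsets[:3])   # [0, 256, 512]
print(offsets[-3:])  # [3840, 0, 256]
```

After the sixteenth access (offset 3840) the stream wraps around to offset 0, which matches the pattern 0, 256, 512, ..., 3840, 0, 256, ... given in the comment.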

The code is self-documenting, and reading it is a good way to understand the basic concepts of code generation in Microprobe. To help the reader, we summarize and elaborate on the explanations in the code. The following are the suggested steps for implementing a command line tool that generates microbenchmarks using Microprobe:

  1. Define the command line interface and parameters (main_setup() function in the example). This includes:

    1. Create a command line interface object

    2. Define parameters using the add_option interface

    3. Call the actual main with the arguments

  2. Define the function to process the input parameters (_main() function in the example). This includes:

    1. Import target definition

    2. Get processed arguments

    3. Validate and use the arguments to call the actual microbenchmark generation function

  3. Define the function to generate the microbenchmark (_generate_benchmark function in the example). The main elements are the following:

    1. Get the wrapper object. The wrapper object defines the general characteristics of the code being generated (i.e. how the internal representation will be translated to the final file being generated). General characteristics are, for instance, code prologs such as #include <header.h> directives, the main function declaration, epilogs, etc. In this case, the selected wrapper is CInfGen. This wrapper generates C code with an infinite loop of instructions. This results in the following code:

      #include <stdio.h>
      #include <string.h>
      
      // <declaration of variables>
      
      int main(int argc, char** argv, char** envp) {
      
          // <initialization_code>
      
          while(1) {
      
              // <generated_code>
      
          } // end while
      }
      

      The user can subclass or define their own wrappers to fulfill their needs. See microprobe.code.wrapper.Wrapper for more details.

    2. Instantiate the synthesizer. The benchmark synthesizer object is in charge of driving the code generation by applying the set of transformation passes defined by the user.

    3. Define the transformation passes. The transformation passes fill the <declaration of variables>, <initialization_code> and <generated_code> sections of the previous code block. Depending on the order and the type of passes applied, the generated code will be different. The user has plenty of transformation passes to apply. See microprobe.passes and all its submodules for further details. Also, the user can define their own passes by subclassing the class microprobe.passes.Pass.

    4. Finally, once the generation policy is defined, the user only has to synthesize the benchmark and save it to a file.
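
To illustrate why the order of the passes matters, the following toy model applies a list of passes, in order, to an initially empty representation. It is a sketch only: ToySynthesizer and the three pass functions are hypothetical stand-ins written for this illustration and are not the Microprobe classes, which operate on a much richer internal representation.

```python
class ToySynthesizer:
    """Hypothetical stand-in for a synthesizer: applies passes in order."""

    def __init__(self):
        self._passes = []

    def add_pass(self, synth_pass):
        self._passes.append(synth_pass)

    def synthesize(self):
        # Start from an empty internal representation.
        benchmark = {"init": [], "body": []}
        for synth_pass in self._passes:
            synth_pass(benchmark)
        return benchmark


def init_registers(value):
    """Pass: record a register initialization in the init section."""
    def run(benchmark):
        benchmark["init"].append("set all registers to %#x" % value)
    return run


def building_block(size):
    """Pass: reserve `size` empty instruction slots in the loop body."""
    def run(benchmark):
        benchmark["body"] = [None] * size
    return run


def fill_by_sequence(sequence):
    """Pass: fill the existing slots by repeating `sequence` round-robin."""
    def run(benchmark):
        benchmark["body"] = [sequence[i % len(sequence)]
                             for i in range(len(benchmark["body"]))]
    return run


synth = ToySynthesizer()
synth.add_pass(init_registers(0x55))
synth.add_pass(building_block(4))
synth.add_pass(fill_by_sequence(["add", "mul"]))
bench = synth.synthesize()
print(bench["body"])  # ['add', 'mul', 'add', 'mul']

# Swapping the passes: filling before the block exists leaves empty slots.
swapped = ToySynthesizer()
swapped.add_pass(fill_by_sequence(["add", "mul"]))
swapped.add_pass(building_block(4))
print(swapped.synthesize()["body"])  # [None, None, None, None]
```

In the toy model, as in the example script, the building block has to be created before the pass that fills it with instructions.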

power_v206_power7_ppc64_linux_gcc_fu_stress.py

The following example shows how to generate microbenchmarks that stress a particular functional unit of the architecture. The code is self-explanatory:

#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_fu_stress.py

Example module to show how to generate a benchmark stressing a particular
functional unit of the microarchitecture at different rate using the
average latency of instructions as well as the average dependency distance
between the instructions
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import os
import sys
import traceback

# Own modules
import microprobe.code.ins
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.decimal
import microprobe.passes.float
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
import microprobe.utils.cmdline
from microprobe.exceptions import MicroprobeException, \
    MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import dict_key, existing_dir, \
    float_type, int_type, print_error, print_info
from microprobe.utils.logger import get_logger

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Constants
LOG = get_logger(__name__)  # Get the generic logging interface


# Functions
def main_setup():
    """
    Set up the command line interface (CLI) with the arguments required by
    this command line tool.
    """

    args = sys.argv[1:]

    # Get the target definition
    try:
        target = import_definition("power_v206-power7-ppc64_linux_gcc")
    except MicroprobeTargetDefinitionError as exc:
        print_error("Unable to import target definition")
        print_error("Exception message: %s" % str(exc))
        exit(-1)

    func_units = {}
    valid_units = [elem.name for elem in target.elements.values()]

    for instr in target.isa.instructions.values():
        if instr.execution_units == "None":
            LOG.debug("Execution units for: '%s' not defined", instr.name)
            continue

        for unit in instr.execution_units:
            if unit not in valid_units:
                continue

            if unit not in func_units:
                func_units[unit] = [
                    elem for elem in target.elements.values()
                    if elem.name == unit
                ][0]

    # Create the CLI interface object
    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
                                           config_options=False,
                                           target_options=False,
                                           debug_options=False)

    # Add the different parameters for this particular tool
    cmdline.add_option("functional_unit",
                       "f", [func_units['ALU']],
                       "Functional units to stress. Default: ALU",
                       required=False,
                       nargs="+",
                       choices=func_units,
                       opt_type=dict_key(func_units),
                       metavar="FUNCTIONAL_UNIT_NAME")

    cmdline.add_option(
        "output_prefix",
        None,
        "POWER_V206_FU_STRESS",
        "Output prefix of the generated files. Default: POWER_V206_FU_STRESS",
        opt_type=str,
        required=False,
        metavar="PREFIX")

    cmdline.add_option("output_path",
                       "O",
                       "./",
                       "Output path. Default: current path",
                       opt_type=existing_dir,
                       metavar="PATH")

    cmdline.add_option(
        "size",
        "S",
        64, "Benchmark size (number of instructions in the endless loop). "
        "Default: 64 instructions",
        opt_type=int_type(1, 2**20),
        metavar="BENCHMARK_SIZE")

    cmdline.add_option("dependency_distance",
                       "D",
                       1000,
                       "Average dependency distance between the instructions. "
                       "Default: 1000 (no dependencies)",
                       opt_type=int_type(1, 1000),
                       metavar="DEPENDENCY_DISTANCE")

    cmdline.add_option("average_latency",
                       "L",
                       2, "Average latency of the selected instructions. "
                       "Default: 2 cycles",
                       opt_type=float_type(1, 1000),
                       metavar="AVERAGE_LATENCY")

    # Start the main
    print_info("Processing input arguments...")
    cmdline.main(args, _main)


def _main(arguments):
    """
    Main program. Called after the arguments from the CLI interface have
    been processed.
    """

    print_info("Arguments processed!")

    print_info("Importing target definition "
               "'power_v206-power7-ppc64_linux_gcc'...")
    target = import_definition("power_v206-power7-ppc64_linux_gcc")

    # Get the arguments
    functional_units = arguments["functional_unit"]
    prefix = arguments["output_prefix"]
    output_path = arguments["output_path"]
    size = arguments["size"]
    latency = arguments["average_latency"]
    distance = arguments["dependency_distance"]

    if functional_units is None:
        functional_units = ["ALL"]

    _generate_benchmark(target, "%s/%s_" % (output_path, prefix),
                        (functional_units, size, latency, distance))


def _generate_benchmark(target, output_prefix, args):
    """
    Actual benchmark generation policy. This is the function that defines
    how the microbenchmarks are going to be generated
    """

    functional_units, size, latency, distance = args

    try:

        # Name of the output file
        func_unit_names = [unit.name for unit in functional_units]
        fname = "%s%s" % (output_prefix, "_".join(func_unit_names))
        fname = "%s_LAT_%s" % (fname, latency)
        fname = "%s_DEP_%s" % (fname, distance)

        # Name of the fail output file (generated in case of exception)
        ffname = "%s.c.fail" % (fname)

        print_info("Generating %s ..." % (fname))

        # Get the wrapper object. The wrapper object is in charge of
        # translating the internal representation of the microbenchmark
        # to the final output format.
        #
        # In this case, we obtain the 'CInfGen' wrapper, which embeds
        # the generated code within an infinite loop using C plus
        # in-line assembly statements.
        cwrapper = microprobe.code.get_wrapper("CInfGen")

        # Create the synthesizer object, which is in charge of driving the
        # generation of the microbenchmark, given a set of passes
        # (a.k.a. transformations) to apply to an empty internal
        # representation of the microbenchmark
        synth = microprobe.code.Synthesizer(target,
                                            cwrapper(),
                                            value=0b01010101)

        # Add the transformation passes

        #######################################################################
        # Pass 1: Init integer registers to a given value                     #
        #######################################################################
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=_init_value()))

        #######################################################################
        # Pass 2: Add a building block of size 'size'                         #
        #######################################################################
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(size))

        #######################################################################
        # Pass 3: Fill the building block with the instruction sequence       #
        #######################################################################
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeByElementPass(
                target, functional_units, {}))

        #######################################################################
        # Pass 4: Compute addresses of instructions (this pass is needed to   #
        #         update the internal representation information so that in   #
        #         case addresses are required, they are up to date).          #
        #######################################################################
        synth.add_pass(
            microprobe.passes.address.UpdateInstructionAddressesPass())

        #######################################################################
        # Pass 5: Set target of branches to be the next instruction in the    #
        #         instruction stream                                          #
        #######################################################################
        synth.add_pass(microprobe.passes.branch.BranchNextPass())

        #######################################################################
        # Pass 6: Set memory-related operands to access 16 storage locations  #
        #         in a round-robin fashion in stride 256 bytes.               #
        #         The pattern would be: 0, 256, 512, .... 3840, 0, 256, ...   #
        #######################################################################
        synth.add_pass(microprobe.passes.memory.SingleMemoryStreamPass(
            16, 256))

        #######################################################################
        # Pass 7.A: Initialize the storage locations accessed by floating     #
        #           point instructions to have a valid floating point value   #
        #######################################################################
        synth.add_pass(
            microprobe.passes.float.InitializeMemoryFloatPass(
                value=1.000000000000001))

        #######################################################################
        # Pass 7.B: Initialize the storage locations accessed by decimal      #
        #           instructions to have a valid decimal value                #
        #######################################################################
        synth.add_pass(
            microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1))

        #######################################################################
        # Pass 8: Set the remaining instructions operands (if not set)        #
        #         (Required to set remaining immediate operands)              #
        #######################################################################
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                dd=distance))

        # Synthesize the microbenchmark. The synthesizer applies the set of
        # transformation passes added before and returns an object
        # representing the microbenchmark
        bench = synth.synthesize()

        # Save the microbenchmark to the file 'fname'
        synth.save(fname, bench=bench)

        print_info("%s generated!" % (fname))

        # Remove fail file if exists
        if os.path.isfile(ffname):
            os.remove(ffname)

    except MicroprobeException:

        # In case of exception during the generation of the microbenchmark,
        # print the error, write the fail file and exit
        print_error(traceback.format_exc())
        open(ffname, 'a').close()
        exit(-1)


def _init_value():
    """ Return a init value """
    return 0b0101010101010101010101010101010101010101010101010101010101010101


# Main
if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if callable(locals().get('main_setup')):
        main_setup()
        exit(0)
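
The access pattern named in the comment of Pass 6 (16 storage locations, visited round-robin with a 256-byte stride) can be reproduced with plain arithmetic. The helper below, stride_offsets, is a hypothetical name used only for illustration. It models the offsets listed in the comment and makes no claim about how SingleMemoryStreamPass is implemented internally:

```python
def stride_offsets(num_locations, stride, count):
    """Return the first `count` offsets of a round-robin strided stream."""
    # Location i of the stream sits at (i modulo num_locations) * stride.
    return [(i % num_locations) * stride for i in range(count)]


# 16 locations, 256-byte stride, as in Pass 6; two extra accesses show
# the wrap-around back to offset 0.
offsets = stride_offsets(16, 256, 18)
print(offsets)
```

With these parameters the stream touches 16 distinct offsets spanning 4 KB before it repeats.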

power_v206_power7_ppc64_linux_gcc_memory.py

The following example shows how to create microbenchmarks with different activity (stress levels) on the different levels of the cache hierarchy. Note that it is not necessary to use the built-in command line interface provided by Microprobe, as the example shows.

#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_memory.py

Example python script to show how to generate microbenchmarks with particular
levels of activity in the memory hierarchy.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import random
import sys

# Third party modules
from six.moves import map

# Own modules
import microprobe.code
import microprobe.passes.address
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)

BASE_ELEMENT = [
    element for element in TARGET.elements.values() if element.name == 'L1D'
][0]
CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
    BASE_ELEMENT)

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

# Fill a list of the models to be generated

MEMORY_MODELS = []

#
# Due to performance issues (long exec. time) this
# model is disabled
#
# MEMORY_MODELS.append(
#    (
#        "ALL", CACHE_HIERARCHY, [
#            25, 25, 25, 25]))

MEMORY_MODELS.append(("L1", CACHE_HIERARCHY, [100, 0, 0, 0]))
MEMORY_MODELS.append(("L2", CACHE_HIERARCHY, [0, 100, 0, 0]))
MEMORY_MODELS.append(("L3", CACHE_HIERARCHY, [0, 0, 100, 0]))
MEMORY_MODELS.append(("L1L3", CACHE_HIERARCHY, [50, 0, 50, 0]))
MEMORY_MODELS.append(("L1L2", CACHE_HIERARCHY, [50, 50, 0, 0]))
MEMORY_MODELS.append(("L2L3", CACHE_HIERARCHY, [0, 50, 50, 0]))
MEMORY_MODELS.append(("CACHES", CACHE_HIERARCHY, [33, 33, 34, 0]))
MEMORY_MODELS.append(("MEM", CACHE_HIERARCHY, [0, 0, 0, 100]))

# Enable parallel generation
PARALLEL = False


def main():
    """Main function. """
    # call the generate method for each model in the memory model list

    if PARALLEL:
        print_info("Start parallel execution...")
        pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
        pool.map(generate, MEMORY_MODELS, 1)
    else:
        print_info("Start sequential execution...")
        list(map(generate, MEMORY_MODELS))

    exit(0)


def generate(model):
    """Benchmark generation policy function. """

    print_info("Creating memory model '%s' ..." % model[0])
    model = EndlessLoopDataMemoryModel(*model)

    modelname = model.name

    print_info("Generating Benchmark mem-%s ..." % (modelname))

    # Get the architecture
    garch = TARGET

    # For all the supported instructions, get the memory operations.
    sequence = []
    for instr_name in sorted(garch.instructions.keys()):

        instr = garch.instructions[instr_name]

        if not instr.access_storage:
            continue
        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if instr.trap:  # Skip traps
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in ['LMA', 'LMV', 'DS', 'EC',
                              'WT']:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported by mem. model
            continue
        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1']:
            continue

        sequence.append(instr)

    # Get the loop wrapper. In this case we take the 'CInfPpc', which
    # generates an infinite loop in C using PowerPC embedded assembly.
    cwrapper = microprobe.code.get_wrapper("CInfPpc")

    # Define function to return random numbers (used afterwards)
    def rnd():
        """Return a random value. """
        return random.randrange(0, (1 << 64) - 1)

    # Create the benchmark synthesizer
    synth = microprobe.code.Synthesizer(garch, cwrapper())

    ##########################################################################
    # Add the passes we want to apply to synthesize benchmarks               #
    ##########################################################################

    # --> Init registers to random values
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=rnd))

    # --> Add a single basic block of size BENCHMARK_SIZE (4x larger for
    #     the 'MEM' model)
    if model.name in ['MEM']:
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE * 4))
    else:
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE))

    # --> Fill the basic block using the sequence of instructions provided
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeBySequencePass(
            sequence))

    # --> Set the memory operations parameters to fulfill the given model
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(model))

    # --> Set the dependency distance and the default allocation. Sets the
    # remaining undefined instruction operands (register allocation,...)
    synth.add_pass(microprobe.passes.register.NoHazardsAllocationPass())
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=0))

    # Generate the benchmark (applies the passes).
    bench = synth.synthesize()

    print_info("Benchmark mem-%s saving to disk..." % (modelname))

    # Save the benchmark
    synth.save("%s/mem-%s" % (DIRECTORY, modelname), bench=bench)

    print_info("Benchmark mem-%s generated" % (modelname))
    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 2:
        print_info("Usage:")
        print_info("%s output_dir" % (sys.argv[0]))
        exit(-1)

    DIRECTORY = sys.argv[1]

    if not os.path.isdir(DIRECTORY):
        print_error("Output directory '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if callable(locals().get('main')):
        main()
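
Each entry of MEMORY_MODELS pairs a model name with a list of percentages, one per level of the data hierarchy. The sketch below restates those lists in a standalone form and checks that each one adds up to 100. The level labels in LEVELS are an assumption inferred from the model names (for example, "MEM" is [0, 0, 0, 100]), and describe is a hypothetical helper, not part of Microprobe:

```python
# Assumed order of the hierarchy levels, inferred from the model names.
LEVELS = ("L1", "L2", "L3", "MEM")

# Same weights as the MEMORY_MODELS list, without the hierarchy object.
MODELS = {
    "L1": [100, 0, 0, 0],
    "L2": [0, 100, 0, 0],
    "L3": [0, 0, 100, 0],
    "L1L3": [50, 0, 50, 0],
    "L1L2": [50, 50, 0, 0],
    "L2L3": [0, 50, 50, 0],
    "CACHES": [33, 33, 34, 0],
    "MEM": [0, 0, 0, 100],
}


def describe(weights, levels=LEVELS):
    """Return a readable summary, rejecting malformed weight lists."""
    if len(weights) != len(levels):
        raise ValueError("expected one weight per hierarchy level")
    if sum(weights) != 100:
        raise ValueError("weights must add up to 100")
    # Only list the levels that actually receive accesses.
    return ", ".join(
        "%s: %d%%" % (level, weight)
        for level, weight in zip(levels, weights) if weight)


summaries = {name: describe(weights) for name, weights in MODELS.items()}
for name, text in summaries.items():
    print("%-6s -> %s" % (name, text))
```

A check of this kind is cheap to run before starting a long generation, since a weight list that does not add up to 100 describes an inconsistent access distribution.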

power_v206_power7_ppc64_linux_gcc_random.py

The following example generates random microbenchmarks:

#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_random.py

Example python script to show how to generate random microbenchmarks.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import random
import sys

# Third party modules
from six.moves import map, range

# Own modules
import microprobe.code
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeError, \
    MicroprobeTargetDefinitionError
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)

BASE_ELEMENT = [
    element for element in TARGET.elements.values() if element.name == 'L1D'
][0]
CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
    BASE_ELEMENT)

PARALLEL = True


def main():
    """ Main program. """
    if PARALLEL:
        pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
        pool.map(generate, list(range(0, 100)), 1)
    else:
        list(map(generate, list(range(0, 100))))


def generate(name):
    """ Benchmark generation policy. """

    if os.path.isfile("%s/random-%s.c" % (DIRECTORY, name)):
        print_info("Skip %d" % name)
        return

    print_info("Generating %d..." % name)

    # Seed the randomness
    rand = random.Random()
    rand.seed(64)  # My favorite number ;)

    # Generate a random memory model (used afterwards)
    model = []
    total = 100
    for mcomp in CACHE_HIERARCHY[0:-1]:
        weight = rand.randint(0, total)
        model.append(weight)
        print_info("%s: %d%%" % (mcomp, weight))
        total = total - weight

    # Fix remaining
    level = rand.randint(0, len(CACHE_HIERARCHY[0:-1]) - 1)
    model[level] += total

    # Last level always zero
    model.append(0)

    # Sanity check
    psum = 0
    for elem in model:
        psum += elem
    assert psum == 100

    modelobj = EndlessLoopDataMemoryModel("random-%s" % name,
                                          CACHE_HIERARCHY, model)

    # Get the loop wrapper. In this case we take the 'CInfPpc', which
    # generates an infinite loop in C using PowerPC embedded assembly.
    cwrapper = microprobe.code.get_wrapper("CInfPpc")

    # Define function to return random numbers (used afterwards)
    def rnd():
        """Return a random value. """
        return rand.randrange(0, (1 << 64) - 1)

    # Create the benchmark synthesizer
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    ##########################################################################
    # Add the passes we want to apply to synthesize benchmarks               #
    ##########################################################################

    # --> Init registers to random values
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=rnd))

    # --> Add a single basic block of size BENCHMARK_SIZE
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))

    # --> Fill the basic block with instructions picked randomly from the list
    #     provided

    instructions = []
    for instr in TARGET.instructions.values():

        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if instr.trap:  # Skip traps
            continue
        if instr.syscall:  # Skip syscall
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in ['LMA', 'LMV', 'DS', 'EC',
                              'WT']:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported by mem. model
            continue
        if instr.branch and not instr.branch_relative:  # Skip branches
            continue
        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if instr.name in [
                'LD_V1',
                'LWZ_V1',
                'STW_V1',
        ]:
            continue

        instructions.append(instr)

    synth.add_pass(
        microprobe.passes.instruction.SetRandomInstructionTypePass(
            instructions, rand
        )
    )

    # --> Set the memory operations parameters to fulfill the given model
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(modelobj))

    # --> Set target of branches to next instruction (first compute addresses)
    synth.add_pass(microprobe.passes.address.UpdateInstructionAddressesPass())
    synth.add_pass(microprobe.passes.branch.BranchNextPass())

    # --> Set the dependency distance and the default allocation. Dependency
    #     distance is randomly picked
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(
            dd=rand.randint(1, 20)
        )
    )

    # Generate the benchmark (applies the passes)
    # Since it is a randomly generated code, the generation might fail
    # (e.g. not enough access to fulfill the requested memory model, etc.)
    # Because of that, we handle the exception accordingly.
    try:
        print_info("Synthesizing %d..." % name)
        bench = synth.synthesize()
        print_info("Synthesized %d!" % name)
        # Save the benchmark
        synth.save("%s/random-%s" % (DIRECTORY, name), bench=bench)
    except MicroprobeError:
        print_info("Synthesizing error in '%s'. This is Ok." % name)

    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 2:
        print_info("Usage:")
        print_info("%s output_dir" % (sys.argv[0]))
        exit(-1)

    DIRECTORY = sys.argv[1]

    if not os.path.isdir(DIRECTORY):
        print_error("Output directory '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if callable(locals().get('main')):
        main()
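
The random memory model built inside generate can be studied in isolation. The sketch below reimplements only that weight-splitting step with the standard library. The function name random_weights and the fixed level count of 4 are assumptions made for illustration. It also shows a general property of random.Random: two generators seeded with the same value draw the same sequence, which is worth keeping in mind when every call seeds its generator with the same constant:

```python
import random


def random_weights(rand, num_levels):
    """Split 100% randomly across all levels except the last one."""
    weights = []
    total = 100
    for _ in range(num_levels - 1):
        weight = rand.randint(0, total)
        weights.append(weight)
        total -= weight
    # Assign whatever is left to one randomly chosen level.
    weights[rand.randint(0, num_levels - 2)] += total
    # The last level always gets zero, as in the example script.
    weights.append(0)
    return weights


first = random_weights(random.Random(64), 4)
second = random_weights(random.Random(64), 4)
other = random_weights(random.Random(65), 4)

print("seed 64:", first)
print("seed 64:", second)
print("seed 65:", other)
```

Deriving the seed from the benchmark index (for instance, passing the index to random.Random) would be one way to obtain a different, yet reproducible, model per benchmark; whether that is desirable depends on the experiment.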

power_v206_power7_ppc64_linux_gcc_custom.py

The following script shows several ways to customize the generation of microbenchmarks:

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15"""
 16power_v206_power7_ppc64_linux_gcc_custom.py
 17
 18Example python script to show how to customize microbenchmark generation.
 19"""
 20
 21# Futures
 22from __future__ import absolute_import
 23
 24# Built-in modules
 25import os
 26import sys
 27
 28# Own modules
 29import microprobe.code
 30import microprobe.passes.initialization
 31import microprobe.passes.instruction
 32import microprobe.passes.memory
 33import microprobe.passes.register
 34import microprobe.passes.structure
 35from microprobe.exceptions import MicroprobeTargetDefinitionError
 36from microprobe.model.memory import EndlessLoopDataMemoryModel
 37from microprobe.target import import_definition
 38from microprobe.utils.cmdline import print_error, print_info
 39from microprobe.utils.misc import RNDINT
 40
 41__author__ = "Ramon Bertran"
 42__copyright__ = "Copyright 2011-2021 IBM Corporation"
 43__credits__ = []
 44__license__ = "IBM (c) 2011-2021 All rights reserved"
 45__version__ = "0.5"
 46__maintainer__ = "Ramon Bertran"
 47__email__ = "rbertra@us.ibm.com"
 48__status__ = "Development"  # "Prototype", "Development", or "Production"
 49
 50# Benchmark size
 51BENCHMARK_SIZE = 8 * 1024
 52
 53if len(sys.argv) != 2:
 54    print_info("Usage:")
 55    print_info("%s output_dir" % (sys.argv[0]))
 56    exit(-1)
 57
 58DIRECTORY = sys.argv[1]
 59
 60if not os.path.isdir(DIRECTORY):
 61    print_info("Output directory '%s' does not exist" % (DIRECTORY))
 62    exit(-1)
 63
 64# Get the target definition
 65try:
 66    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
 67except MicroprobeTargetDefinitionError as exc:
 68    print_error("Unable to import target definition")
 69    print_error("Exception message: %s" % str(exc))
 70    exit(-1)
 71
 72
 73###############################################################################
 74# Example 1: loop with instructions accessing storage, hitting the first      #
 75#            level of cache and with dependency distance of 3                 #
 76###############################################################################
 77def example_1():
 78    """ Example 1 """
 79    name = "L1-LOADS"
 80
 81    base_element = [
 82        element for element in TARGET.elements.values()
 83        if element.name == 'L1D'
 84    ][0]
 85    cache_hierarchy = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
 86        base_element)
 87
 88    model = [0] * len(cache_hierarchy)
 89    model[0] = 100
 90
 91    mmodel = EndlessLoopDataMemoryModel("random-%s", cache_hierarchy, model)
 92
 93    profile = {}
 94    for instr_name in sorted(TARGET.instructions.keys()):
 95        instr = TARGET.instructions[instr_name]
 96        if not instr.access_storage:
 97            continue
 98        if instr.privileged:  # Skip privileged
 99            continue
100        if instr.hypervisor:  # Skip hypervisor
101            continue
102        if "String" in instr.description:  # Skip unsupported string instr.
103            continue
104        if "ultiple" in instr.description:  # Skip unsupported mult. ld/sts
105            continue
106        if instr.category in ['DS', 'LMA', 'LMV',
107                              'EC']:  # Skip unsupported categories
108            continue
109        if instr.access_storage_with_update:  # Not supported
110            continue
111
112        if instr.name in [
113                'LD_V1',
114                'LWZ_V1',
115                'STW_V1',
116        ]:
117            continue
118
119        if (any([moper.is_load for moper in instr.memory_operand_descriptors])
120                and all([
121                    not moper.is_store
122                    for moper in instr.memory_operand_descriptors
123                ])):
124            profile[instr] = 1
125
126    cwrapper = microprobe.code.get_wrapper("CInfPpc")
127    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
128
129    synth.add_pass(
130        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
131    synth.add_pass(
132        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
133    synth.add_pass(
134        microprobe.passes.initialization.InitializeRegisterPass("GPR1",
135                                                                0,
136                                                                force=True,
137                                                                reserve=True))
138    synth.add_pass(
139        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
140    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(mmodel))
141    synth.add_pass(
142        microprobe.passes.register.DefaultRegisterAllocationPass(dd=3))
143
144    print_info("Generating %s..." % name)
145    bench = synth.synthesize()
146    print_info("%s Generated!" % name)
147    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
148
149
150###############################################################################
151# Example 2: loop with instructions using the MUL unit and with dependency    #
152#            distance of 4                                                    #
153###############################################################################
154def example_2():
155    """ Example 2 """
156    name = "FXU-MUL"
157
158    cwrapper = microprobe.code.get_wrapper("CInfPpc")
159    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
160
161    synth.add_pass(
162        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
163    synth.add_pass(
164        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
165    synth.add_pass(
166        microprobe.passes.instruction.SetInstructionTypeByElementPass(
167            TARGET, [TARGET.elements['MUL_FXU0_Core0_SCM_Processor']], {}))
168    synth.add_pass(
169        microprobe.passes.register.DefaultRegisterAllocationPass(dd=4))
170
171    print_info("Generating %s..." % name)
172    bench = synth.synthesize()
173    print_info("%s Generated!" % name)
174    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
175
176
177###############################################################################
178# Example 3: loop with instructions using the ALU unit and with dependency    #
179#            distance of 1                                                    #
180###############################################################################
181def example_3():
182    """ Example 3 """
183    name = "FXU-ALU"
184
185    cwrapper = microprobe.code.get_wrapper("CInfPpc")
186    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
187
188    synth.add_pass(
189        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
190    synth.add_pass(
191        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
192    synth.add_pass(
193        microprobe.passes.instruction.SetInstructionTypeByElementPass(
194            TARGET, [TARGET.elements['ALU_FXU0_Core0_SCM_Processor']], {}))
195    synth.add_pass(
196        microprobe.passes.register.DefaultRegisterAllocationPass(dd=1))
197
198    print_info("Generating %s..." % name)
199    bench = synth.synthesize()
200    print_info("%s Generated!" % name)
201    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
202
203
204###############################################################################
205# Example 4: loop with FMUL* instructions with different weights and with     #
206#            dependency distance 10                                           #
207###############################################################################
208def example_4():
209    """ Example 4 """
210    name = "VSU-FMUL"
211
212    profile = {}
213    profile[TARGET.instructions['FMUL_V0']] = 4
214    profile[TARGET.instructions['FMULS_V0']] = 3
215    profile[TARGET.instructions['FMULx_V0']] = 2
216    profile[TARGET.instructions['FMULSx_V0']] = 1
217
218    cwrapper = microprobe.code.get_wrapper("CInfPpc")
219    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
220
221    synth.add_pass(
222        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
223    synth.add_pass(
224        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
225    synth.add_pass(
226        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
227    synth.add_pass(
228        microprobe.passes.register.DefaultRegisterAllocationPass(dd=10))
229
230    print_info("Generating %s..." % name)
231    bench = synth.synthesize()
232    print_info("%s Generated!" % name)
233    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
234
235
236###############################################################################
237# Example 5: loop with FADD* instructions with different weights and with     #
238#            dependency distance 1                                            #
239###############################################################################
240def example_5():
241    """ Example 5 """
242    name = "VSU-FADD"
243
244    profile = {}
245    profile[TARGET.instructions['FADD_V0']] = 100
246    profile[TARGET.instructions['FADDx_V0']] = 1
247    profile[TARGET.instructions['FADDS_V0']] = 10
248    profile[TARGET.instructions['FADDSx_V0']] = 1
249
250    cwrapper = microprobe.code.get_wrapper("CInfPpc")
251    synth = microprobe.code.Synthesizer(TARGET, cwrapper())
252
253    synth.add_pass(
254        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
255    synth.add_pass(
256        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
257    synth.add_pass(
258        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
259    synth.add_pass(
260        microprobe.passes.register.DefaultRegisterAllocationPass(dd=1))
261
262    print_info("Generating %s..." % name)
263    bench = synth.synthesize()
264    print_info("%s Generated!" % name)
265    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark
266
267
268###############################################################################
269# Call the examples                                                           #
270###############################################################################
271example_1()
272example_2()
273example_3()
274example_4()
275example_5()
276exit(0)
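
Examples 4 and 5 above drive instruction selection with a weight profile. The sketch below illustrates what such a profile means for the instruction mix, assuming the weights are interpreted proportionally. It does not use Microprobe and does not claim to reproduce how SetInstructionTypeByProfilePass works internally; expected_shares and sample_mix are hypothetical helpers, and the sample size of 1000 is arbitrary.

```python
import random
from collections import Counter

# Weights copied from example_4 in the listing above.
PROFILE = {"FMUL_V0": 4, "FMULS_V0": 3, "FMULx_V0": 2, "FMULSx_V0": 1}


def expected_shares(profile):
    """Return each instruction's share, assuming proportional weights."""
    total = sum(profile.values())
    return {name: weight / total for name, weight in profile.items()}


def sample_mix(profile, size, seed=0):
    """Draw `size` instruction names at random, weighted by the profile.

    Hypothetical helper for illustration only; not part of Microprobe.
    """
    rng = random.Random(seed)
    names = list(profile)
    weights = [profile[name] for name in names]
    return Counter(rng.choices(names, weights=weights, k=size))


if __name__ == "__main__":
    for name, share in expected_shares(PROFILE).items():
        print("%-10s %5.1f%%" % (name, 100 * share))
    # Arbitrary sample size, chosen only for this illustration.
    print(sample_mix(PROFILE, 1000))
```

Under this proportional reading, the profile of example 5 (100, 1, 10, 1) would give FADD_V0 roughly 89% of the generated instructions.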

power_v206_power7_ppc64_linux_gcc_genetic.py

Deprecated since version 0.5: Support for PyEvolve and genetic-algorithm-based searches has been discontinued.

The following example shows how to use the design exploration module and genetic-algorithm-based searches to look for a solution. In particular, for each functional unit of the architecture and a range of IPCs (instructions per cycle), the example looks for a microbenchmark that stresses that functional unit at the given IPC. External commands (not included) are needed to evaluate the generated microbenchmarks on the target platform.

  1#!/usr/bin/env python
  2# Copyright 2011-2021 IBM Corporation
  3#
  4# Licensed under the Apache License, Version 2.0 (the "License");
  5# you may not use this file except in compliance with the License.
  6# You may obtain a copy of the License at
  7#
  8# http://www.apache.org/licenses/LICENSE-2.0
  9#
 10# Unless required by applicable law or agreed to in writing, software
 11# distributed under the License is distributed on an "AS IS" BASIS,
 12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 13# See the License for the specific language governing permissions and
 14# limitations under the License.
 15"""
 16power_v206_power7_ppc64_linux_gcc_genetic.py
 17
 18Example Python script to show how to generate a set of microbenchmarks
 19stressing a particular unit at different IPC ratios, using a genetic
 20search algorithm to play with two knobs: average latency and dependency
 21distance.
 22
 23An IPC evaluation and scoring script is required. For instance:
 24
 25.. code:: bash
 26
 27   #!/bin/bash
 28   # ARGS: $1 is the target IPC
 29   #       $2 is the name of the generated benchmark
 30   target_ipc=$1
 31   source_bench=$2
 32
 33   # Compile the benchmark
 34   gcc -O0 -mcpu=power7 -mtune=power7 -std=c99 $source_bench.c -o $source_bench
 35
 36   # Evaluate the ipc
 37   ipc=< your preferred commands to evaluate the IPC >
 38
 39   # Compute the score (the closer to the target IPC, the higher the score)
 40   score=$(echo "(1/($ipc-$target_ipc))^2" | bc -l)
 41
 42   echo $score
 43
 44Use the script above as a template for your own GA-based search.
 45"""
 46
 47# Futures
 48from __future__ import absolute_import, division
 49
 50# Built-in modules
 51import datetime
 52import os
 53import sys
 54import time as runtime
 55
 56# Third party modules
 57from six.moves import range
 58
 59# Own modules
 60import microprobe.code
 61import microprobe.driver.genetic
 62import microprobe.passes.ilp
 63import microprobe.passes.initialization
 64import microprobe.passes.instruction
 65import microprobe.passes.register
 66import microprobe.passes.structure
 67from microprobe.exceptions import MicroprobeTargetDefinitionError
 68from microprobe.target import import_definition
 69from microprobe.utils.cmdline import print_error, print_info, print_warning
 70from microprobe.utils.misc import RNDINT
 71
 72__author__ = "Ramon Bertran"
 73__copyright__ = "Copyright 2011-2021 IBM Corporation"
 74__credits__ = []
 75__license__ = "IBM (c) 2011-2021 All rights reserved"
 76__version__ = "0.5"
 77__maintainer__ = "Ramon Bertran"
 78__email__ = "rbertra@us.ibm.com"
 79__status__ = "Development"  # "Prototype", "Development", or "Production"
 80
 81# Benchmark size
 82BENCHMARK_SIZE = 20
 83
 84# Get the target definition
 85try:
 86    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
 87except MicroprobeTargetDefinitionError as exc:
 88    print_error("Unable to import target definition")
 89    print_error("Exception message: %s" % str(exc))
 90    exit(-1)
 91
 92
 93def main():
 94    """Main function."""
 95
 96    component_list = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]
 97    ipcs = [float(x) / 10 for x in range(1, 41)]
 98    ipcs = ipcs[5:] + ipcs[:5]
 99
100    for name in component_list:
101        for ipc in ipcs:
102            generate_genetic(name, ipc)
103
104
105def generate_genetic(compname, ipc):
106    """Generate a microbenchmark stressing compname at the given ipc."""
107    comps = []
108    bcomps = []
109    any_comp = False
110
111    if compname.find("FXU") >= 0:
112        comps.append(TARGET.elements["FXU0_Core0_SCM_Processor"])
113
114    if compname.find("VSU") >= 0:
115        comps.append(TARGET.elements["VSU0_Core0_SCM_Processor"])
116
117    if len(comps) == 2:
118        any_comp = True
119    elif compname.find("noLSU") >= 0:
120        bcomps.append(TARGET.elements["LSU0_Core0_SCM_Processor"])
121    elif compname.find("LSU") >= 0:
122        comps.append(TARGET.elements["LSU_Core0_SCM_Processor"])
123
124    if (len(comps) == 1 and ipc > 2) or (len(comps) == 2 and ipc > 4):
125        return True
126
127    for elem in os.listdir(DIRECTORY):
128        if not elem.endswith(".c"):
129            continue
130        if elem.startswith("%s:IPC:%.2f:DIST" % (compname, ipc)):
131            print_info("Already generated: %s %.2f" % (compname, ipc))
132            return True
133
134    print_info("Going for IPC: %f and Element: %s" % (ipc, compname))
135
136    def generate(name, *args):
137        """Benchmark generation function.
138
139        First argument is name, second the dependency distance and the
140        third is the average instruction latency.
141        """
142        dist, latency = args
143
144        wrapper = microprobe.code.get_wrapper("CInfPpc")
145        synth = microprobe.code.Synthesizer(TARGET, wrapper())
146        synth.add_pass(
147            microprobe.passes.initialization.InitializeRegistersPass(
148                value=RNDINT))
149        synth.add_pass(
150            microprobe.passes.structure.SimpleBuildingBlockPass(
151                BENCHMARK_SIZE))
152        synth.add_pass(
153            microprobe.passes.instruction.SetInstructionTypeByElementPass(
154                TARGET,
155                comps, {},
156                block=bcomps,
157                avelatency=latency,
158                any_comp=any_comp))
159        synth.add_pass(
160            microprobe.passes.register.DefaultRegisterAllocationPass(dd=dist))
161        bench = synth.synthesize()
162        synth.save(name, bench=bench)
163
164    # Set the genetic algorithm parameters
165    ga_params = []
166    ga_params.append((0, 20, 0.05))  # Average dependency distance design space
167    ga_params.append((2, 8, 0.05))  # Average instruction latency design space
168
169    # Set up the search driver
170    driver = microprobe.driver.genetic.ExecCmdDriver(
171        generate, 20, 30, 30, "'%s' %f " % (COMMAND, ipc), ga_params)
172
173    starttime = runtime.time()
174    print_info("Start search...")
175    driver.run(1)
176    print_info("Search end")
177    endtime = runtime.time()
178
179    print_info("Genetic time::%s" %
180               (datetime.timedelta(seconds=endtime - starttime)))
181
182    # Check if we found a solution
183    ga_params = driver.solution()
184    score = driver.score()
185
186    print_info("IPC found: %f, score: %f" % (ipc, score))
187
188    if score < 20:
189        print_warning("Unable to find an optimal solution with IPC: %f:" % ipc)
190        print_info("Generating the closest solution...")
191        generate(
192            "%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f-check" %
193            (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
194            ga_params[0], ga_params[1])
195        print_info("Closest solution generated")
196    else:
197        print_info("Solution found for %s and IPC %f -> dist: %f , "
198                   "latency: %f " %
199                   (compname, ipc, ga_params[0], ga_params[1]))
200        print_info("Generating solution...")
201        generate(
202            "%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f" %
203            (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
204            ga_params[0], ga_params[1])
205        print_info("Solution generated")
206    return True
207
208
209if __name__ == '__main__':
210    # run main if executed from the command line
211    # and the main method exists
212
213    if len(sys.argv) != 3:
214        print_info("Usage:")
215        print_info("%s output_dir eval_cmd" % (sys.argv[0]))
216        print_info("")
217        print_info("Output dir: output directory for the generated benchmarks")
218        print_info("eval_cmd: command accepting 2 parameters: the target IPC")
219        print_info("          and the filename of the generated benchmark.")
220        print_info("          Output: the score used for the GA search. E.g.")
221        print_info("          the closer the IPC of the generated benchmark")
222        print_info("          is to the target IPC, the higher the score the")
223        print_info("          cmd should give.")
224        exit(-1)
225
226    DIRECTORY = sys.argv[1]
227    COMMAND = sys.argv[2]
228
229    if not os.path.isdir(DIRECTORY):
230        print_info("Output directory '%s' does not exist" % (DIRECTORY))
231        exit(-1)
232
233    if not os.path.isfile(COMMAND):
234        print_info("The command '%s' does not exist" % (COMMAND))
235        exit(-1)
236
237    if callable(locals().get('main')):
238        main()
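
The scoring formula of the bash template in the docstring and the IPC ordering built in main() can be sketched in plain Python as follows. This is only an illustration: the score function mirrors the (1/(ipc - target_ipc))^2 formula, the cap argument is an assumption added here to avoid a division by zero on an exact match, and neither helper is part of Microprobe.

```python
def score(measured_ipc, target_ipc, cap=1e6):
    """Score a benchmark: the closer to the target IPC, the higher the score.

    `cap` is an assumption of this sketch; it bounds the score and avoids
    dividing by zero when the measured IPC matches the target exactly.
    """
    diff = measured_ipc - target_ipc
    if diff == 0:
        return cap
    return min(cap, (1.0 / diff) ** 2)


def ipc_sweep():
    """Reproduce the IPC ordering of main(): 0.6 .. 4.0, then 0.1 .. 0.5."""
    ipcs = [x / 10 for x in range(1, 41)]
    return ipcs[5:] + ipcs[:5]


if __name__ == "__main__":
    sweep = ipc_sweep()
    print("first targets:", sweep[:3], "last targets:", sweep[-3:])
    for measured in (1.0, 1.5, 1.9, 2.0):
        print("measured %.1f, target 2.0 -> score %.2f"
              % (measured, score(measured, 2.0)))
```

If the evaluation command uses this formula, the score < 20 check in generate_genetic treats a search as successful only when the measured IPC is within about 0.22 of the target, since (1/0.2236)^2 is approximately 20.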