Examples on POWER
The definitions/power/examples
directory of the Microprobe distribution
(available if you installed the microprobe_target_power package)
contains several examples showing how to use Microprobe
for the POWER architecture. Although the examples are split by
architecture, the concepts they introduce are common to all
architectures.
We recommend that users go through the code of these examples to understand the specific details of how to use the framework.
isa_power_v206_info.py
The first example is isa_power_v206_info.py. It shows how to search for
architecture definitions (e.g. the ISA properties), how to import a
definition and how to dump it.
Executing the following command:
> ./isa_power_v206_info.py
generates the following output, which shows all the details of the POWER v2.06 architecture (only the first and last 20 lines are shown for brevity):
--------------------------------------------------------------------------------
ISA Name: power_v206
ISA Description: power_v206
--------------------------------------------------------------------------------
Register Types:
GPR: General Register (bit size: 64)
VSCR: Vector Status and Control Register (bit size: 32)
FPR: Floating-Point Register (bit size: 64)
SPR: Special Purpose Register (64 bits) (bit size: 64)
VR: Vector Register (bit size: 128)
MSR: Machine State Register (bit size: 64)
SPR32: Special Purpose Register (32 bits) (bit size: 32)
VSR: Vector Scalar Register (bit size: 128)
FPSCR: Floating-Point Status and Control Register (bit size: 32)
CR: Condition Register (bit size: 4)
--------------------------------------------------------------------------------
Architected registers:
AESR : AESR Register (Type: SPR)
AMOR : AMOR Register (Type: SPR)
AMR : Authority Mask Register (Type: SPR)
...
access_storage : False (Boolean indicating if the instruction has storage operands )
access_storage_with_update : False (Boolean indicating if the instruction accesses to storage and updates the source register with the generated address)
algebraic : False (Boolean indicating if operation uses algebraic rules to keep values )
branch : False (Boolean indicating if the instruction is a branch )
branch_conditional : False (Boolean indicating if the instruction is a branch conditional )
branch_relative : False (Boolean indicating if the instruction is a relative branch )
category : VSX (String indicating if the instruction the instruction category )
decimal : False (Boolean indication if the instruction requires inputs in decimal format )
disable_asm : False (Boolean indicating if ASM generation is disabled for the instruction. If so, binary codification is used. )
hypervisor : False (Boolean indicating if the instruction need hypervisor mode )
privileged : False (Boolean indicating if the instruction is privileged )
privileged_optional : False (Boolean indicating the instrucion is priviledged or not depending on the input values )
switching : None (Input values required to maximize the computational switching )
syscall : False (Boolean indicating if the instruction is a syscall or return from one )
trap : False (Boolean indicating if the instruction is a trap )
Instructions defined: 938
Variants defined: 964
--------------------------------------------------------------------------------
The following code is what has been executed:
 1  #!/usr/bin/env python
 2  # Copyright 2011-2021 IBM Corporation
 3  #
 4  # Licensed under the Apache License, Version 2.0 (the "License");
 5  # you may not use this file except in compliance with the License.
 6  # You may obtain a copy of the License at
 7  #
 8  # http://www.apache.org/licenses/LICENSE-2.0
 9  #
10  # Unless required by applicable law or agreed to in writing, software
11  # distributed under the License is distributed on an "AS IS" BASIS,
12  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13  # See the License for the specific language governing permissions and
14  # limitations under the License.
15  """
16  isa_power_v206_info.py
17
18  Example module to show how to access ISA definitions.
19  """
20
21  # Futures
22  from __future__ import absolute_import, print_function
23
24  # Built-in modules
25  import os
26
27  # Own modules
28  from microprobe.target.isa import find_isa_definitions, import_isa_definition
29
30  __author__ = "Ramon Bertran"
31  __copyright__ = "Copyright 2011-2021 IBM Corporation"
32  __credits__ = []
33  __license__ = "IBM (c) 2011-2021 All rights reserved"
34  __version__ = "0.5"
35  __maintainer__ = "Ramon Bertran"
36  __email__ = "rbertra@us.ibm.com"
37  __status__ = "Development"  # "Prototype", "Development", or "Production"
38
39  # Constants
40  ISANAME = "power_v206"
41
42  # Functions
43
44  # Classes
45
46  # Main
47
48  # Search and import definition
49  ISADEF = import_isa_definition(
50      os.path.dirname(
51          [isa for isa in find_isa_definitions()
52           if isa.name == ISANAME][0].filename
53      )
54  )
55
56  # Print definition
57  print(ISADEF.full_report())
58  exit(0)
This simple script first imports find_isa_definitions and
import_isa_definition from the microprobe.target.isa module (line 28).
The first function searches for architecture definitions and returns a list,
which is filtered so that only the definition named power_v206 is passed to
the second function, import_isa_definition (lines 49-54). Finally, the full
report of the ISADEF object is printed to standard output (line 57).
In this case the full report is printed, but the user can query any
information about the imported ISA through the
microprobe.target.isa.ISA
API.
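The lookup in the listing takes the first element of a filtered list, so it fails with a bare IndexError if no definition named power_v206 is installed. The sketch below shows the same find-then-filter pattern with an explicit error message. It does not use Microprobe: Definition is a stand-in record, select_definition is a hypothetical helper, and the file paths are made up. The only assumption carried over from the example is that each entry returned by find_isa_definitions() exposes name and filename attributes.

```python
from collections import namedtuple

# Stand-in for the entries returned by find_isa_definitions(); only the
# two attributes used in the example (name, filename) are modelled.
Definition = namedtuple("Definition", ["name", "filename"])


def select_definition(definitions, name):
    """Return the first definition called `name` or raise LookupError."""
    match = next((d for d in definitions if d.name == name), None)
    if match is None:
        known = ", ".join(sorted(d.name for d in definitions)) or "none"
        raise LookupError(
            "ISA '%s' not found (available: %s)" % (name, known))
    return match


# Made-up entries for illustration only.
found = [
    Definition("other_isa", "/tmp/defs/other_isa/isa.yaml"),
    Definition("power_v206", "/tmp/defs/power_v206/isa.yaml"),
]
selected = select_definition(found, "power_v206")
print(selected.filename)
```

With Microprobe installed, the directory containing selected.filename is what the listing passes to import_isa_definition.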
power_v206_power7_ppc64_linux_gcc_profile.py
The aim of this example is to show how code generation works in Microprobe. In particular, it shows how to generate, for each instruction of the ISA, an endless loop containing that instruction. The size of the loop and the dependency distance between the instructions of the loop can be specified as parameters. Using Microprobe you can generate thousands of microbenchmarks in a few minutes. Let's start with the command line interface. Executing:
> ./power_v206_power7_ppc64_linux_gcc_profile.py --help
will generate the following output:
power_v206_power7_ppc64_linux_gcc_profile.py: INFO: Processing input arguments...
usage: power_v206_power7_ppc64_linux_gcc_profile.py [-h]
[-P SEARCH_PATH [SEARCH_PATH ...]]
[-V] [-v] [-d]
[-i INSTRUCTION_NAME [INSTRUCTION_NAME ...]]
[--output_prefix PREFIX]
[-O PATH] [-p NUM_JOBS]
[-S BENCHMARK_SIZE]
[-D DEPENDECY_DISTANCE]
ISA power v206 profile example
optional arguments:
-h, --help show this help message and exit
-P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
Default search paths for microprobe target definitions
-V, --version Show Microprobe version and exit
-v, --verbosity Verbosity level (Values: [0,1,2,3,4]). Each time this
argument is specified the verbosity level is
increased. By default, no logging messages are shown.
These are the four levels available:
-v (1): critical messages
-v -v (2): critical and error messages
-v -v -v (3): critical, error and warning messages
-v -v -v -v (4): critical, error, warning and info messages
Specifying more than four verbosity flags, will
default to the maximum of four. If you need extra
information, enable the debug mode (--debug or -d
flags).
-d, --debug Enable debug mode in Microprobe framework. Lots of
output messages will be generated
-i INSTRUCTION_NAME [INSTRUCTION_NAME ...], --instruction INSTRUCTION_NAME [INSTRUCTION_NAME ...]
Instruction names to generate. Default: All
instructions
--output_prefix PREFIX
Output prefix of the generated files. Default:
POWER_V206_PROFILE
-O PATH, --output_path PATH
Output path. Default: current path
-p NUM_JOBS, --parallel NUM_JOBS
Number of parallel jobs. Default: number of CPUs
available (80). Valid values: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80
-S BENCHMARK_SIZE, --size BENCHMARK_SIZE
Benchmark size (number of instructions in the endless
loop). Default: 64 instructions
-D DEPENDECY_DISTANCE, --dependency_distance DEPENDECY_DISTANCE
Average dependency distance between the instructions.
Default: 1000 (no dependencies)
Environment variables:
MICROPROBETEMPLATES Default path for microprobe templates
MICROPROBEDEBUG If set, enable debug
MICROPROBEDEBUGPASSES If set, enable debug during passes
MICROPROBEASMHEXFMT Assembly hexadecimal format. Options:
'all' -> All immediates in hex format
'address' -> Address immediates in hex format (default)
'none' -> All immediate in integer format
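The help text above follows the layout produced by Python's argparse module. As a rough illustration of what some of the options mean, here is a plain-argparse sketch of four of them. It is not the Microprobe CLI class: build_parser and bounded_int are hypothetical helpers, ADD_V0 is just an illustrative instruction name, and only the option names, defaults and value ranges are taken from the example.

```python
import argparse


def bounded_int(low, high):
    """Build an argparse type that accepts integers in [low, high]."""
    def convert(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                "%d is outside [%d, %d]" % (value, low, high))
        return value
    return convert


def build_parser():
    parser = argparse.ArgumentParser(
        description="ISA power v206 profile example (argparse sketch)")
    # One or more instruction names; None means "all instructions".
    parser.add_argument("-i", "--instruction", nargs="+", default=None,
                        metavar="INSTRUCTION_NAME")
    # Number of instructions in the endless loop.
    parser.add_argument("-S", "--size", type=bounded_int(1, 2**20),
                        default=64, metavar="BENCHMARK_SIZE")
    # Average dependency distance; 1000 stands for "no dependencies".
    parser.add_argument("-D", "--dependency_distance",
                        type=bounded_int(1, 1000), default=1000,
                        metavar="DEPENDENCY_DISTANCE")
    # Each -v raises the verbosity by one level.
    parser.add_argument("-v", "--verbosity", action="count", default=0)
    return parser


args = build_parser().parse_args(["-i", "ADD_V0", "-S", "128", "-v", "-v"])
print(args.instruction, args.size, args.dependency_distance, args.verbosity)
```

Values outside the declared range (for example -D 5000) are rejected by the parser before any benchmark generation would start.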
Let's look at the code to see how this command line tool is implemented. This is the complete code of the script:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
power_v206_power7_ppc64_linux_gcc_profile.py

Example module to show how to generate a benchmark for each instruction
of the ISA
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import sys
import traceback

# Third party modules
from six.moves import map, range

# Own modules
import microprobe.code.ins
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.decimal
import microprobe.passes.float
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
import microprobe.utils.cmdline
from microprobe.exceptions import MicroprobeException
from microprobe.target import import_definition
from microprobe.utils.cmdline import existing_dir, \
    int_type, print_error, print_info, print_warning
from microprobe.utils.logger import get_logger

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Constants
LOG = get_logger(__name__)  # Get the generic logging interface


# Functions
def main_setup():
    """
    Set up the command line interface (CLI) with the arguments required by
    this command line tool.
    """

    args = sys.argv[1:]

    # Create the CLI interface object
    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
                                           config_options=False,
                                           target_options=False,
                                           debug_options=False)

    # Add the different parameters for this particular tool
    cmdline.add_option(
        "instruction",
        "i",
        None,
        "Instruction names to generate. Default: All instructions",
        required=False,
        nargs="+",
        metavar="INSTRUCTION_NAME")

    cmdline.add_option(
        "output_prefix",
        None,
        "POWER_V206_PROFILE",
        "Output prefix of the generated files. Default: POWER_V206_PROFILE",
        opt_type=str,
        required=False,
        metavar="PREFIX")

    cmdline.add_option(
        "output_path",
        "O",
        "./",
        "Output path. Default: current path",
        opt_type=existing_dir,
        metavar="PATH")

    cmdline.add_option(
        "parallel",
        "p",
        mp.cpu_count(),
        "Number of parallel jobs. Default: number of CPUs available (%s)" %
        mp.cpu_count(),
        opt_type=int,
        choices=list(range(1, mp.cpu_count() + 1)),
        metavar="NUM_JOBS")

    cmdline.add_option(
        "size",
        "S",
        64,
        "Benchmark size (number of instructions in the endless loop). "
        "Default: 64 instructions",
        opt_type=int_type(1, 2**20),
        metavar="BENCHMARK_SIZE")

    cmdline.add_option(
        "dependency_distance",
        "D",
        1000,
        "Average dependency distance between the instructions. "
        "Default: 1000 (no dependencies)",
        opt_type=int_type(1, 1000),
        metavar="DEPENDECY_DISTANCE")

    # Start the main
    print_info("Processing input arguments...")
    cmdline.main(args, _main)


def _main(arguments):
    """
    Main program. Called after the arguments from the CLI interface have
    been processed.
    """

    print_info("Arguments processed!")

    print_info("Importing target definition "
               "'power_v206-power7-ppc64_linux_gcc'...")
    target = import_definition("power_v206-power7-ppc64_linux_gcc")

    # Get the arguments
    instructions = arguments.get("instruction", None)
    prefix = arguments["output_prefix"]
    output_path = arguments["output_path"]
    parallel_jobs = arguments["parallel"]
    size = arguments["size"]
    distance = arguments["dependency_distance"]

    # Process the arguments
    if instructions is not None:

        # If the user has provided some instructions, make sure they
        # exist and then call the generation function

        instructions = _validate_instructions(instructions, target)

        if len(instructions) == 0:
            print_error("No valid instructions defined.")
            exit(-1)

        # Set more verbose level
        # set_log_level(10)
        #
        list(map(_generate_benchmark,
                 [(instruction,
                   prefix,
                   output_path,
                   target, size, distance) for instruction in instructions]))

    else:

        # If the user has not provided any instruction, go for all of them
        # and then call the generation function

        instructions = _generate_instructions(target, output_path, prefix)

        # Since several benchmarks will be generated, reduce verbose level
        # and call the generation function in parallel

        # set_log_level(30)

        if parallel_jobs > 1:
            pool = mp.Pool(processes=parallel_jobs)
            pool.map(_generate_benchmark,
                     [(instruction,
                       prefix,
                       output_path,
                       target,
                       size,
                       distance) for instruction in instructions],
                     1)
        else:
            list(map(_generate_benchmark,
                     [(instruction,
                       prefix,
                       output_path,
                       target,
                       size,
                       distance) for instruction in instructions]))


def _validate_instructions(instructions, target):
    """
    Validate the provided instructions for a given target
    """

    nins = []
    for instruction in instructions:

        if instruction not in list(target.isa.instructions.keys()):
            print_warning(
                "'%s' not defined in the ISA. Skipping..." %
                instruction)
            continue
        nins.append(instruction)
    return nins


def _generate_instructions(target, path, prefix):
    """
    Generate the list of instructions to be generated for a given target
    """

    instructions = []
    for name, instr in target.instructions.items():

        if instr.privileged or instr.hypervisor:
            # Skip priv/hyper instructions
            continue

        if instr.branch and not instr.branch_relative:
            # Skip branch absolute due to relocation problems
            continue

        if instr.category in ['LMA', 'LMV', 'DS', 'EC']:
            # Skip some instruction categories
            continue

        if name in ['LSWI_V0', 'LSWX_V0', 'LMW_V0', 'STSWX_V0',
                    'LD_V1', 'LWZ_V1', 'STW_V1']:
            # Some instructions are not completely supported yet
            # String-related instructions and load multiple

            continue

        # Skip if the file already exists

        fname = "%s/%s_%s.c" % (path, prefix, name)
        ffname = "%s/%s_%s.c.fail" % (path, prefix, name)

        if os.path.isfile(fname):
            print_warning("Skip %s. '%s' already generated" % (name, fname))
            continue

        if os.path.isfile(ffname):
            print_warning("Skip %s. '%s' already generated (failed)"
                          % (name, ffname))
            continue

        instructions.append(name)

    return instructions


def _generate_benchmark(args):
    """
    Actual benchmark generation policy. This is the function that defines
    how the microbenchmarks are going to be generated
    """

    instr_name, prefix, output_path, target, size, distance = args

    try:

        # Name of the output file
        fname = "%s/%s_%s" % (output_path, prefix, instr_name)

        # Name of the fail output file (generated in case of exception)
        ffname = "%s.c.fail" % (fname)

        print_info("Generating %s ..." % (fname))

        instruction = microprobe.code.ins.Instruction()
        instruction.set_arch_type(target.instructions[instr_name])
        sequence = [target.instructions[instr_name]]

        # Get the wrapper object. The wrapper object is in charge of
        # translating the internal representation of the microbenchmark
        # to the final output format.
        #
        # In this case, we obtain the 'CInfGen' wrapper, which embeds
        # the generated code within an infinite loop using C plus
        # in-line assembly statements.
        cwrapper = microprobe.code.get_wrapper("CInfGen")

        # Create the synthesizer object, which is in charge of driving the
        # generation of the microbenchmark, given a set of passes
        # (a.k.a. transformations) to apply to an empty internal
        # representation of the microbenchmark
        synth = microprobe.code.Synthesizer(target, cwrapper(),
                                            value=0b01010101)

        # Add the transformation passes

        #######################################################################
        # Pass 1: Init integer registers to a given value
        #######################################################################
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=_init_value()))
        floating = False
        vector = False

        for operand in instruction.operands():
            if operand.type.immediate:
                continue

            if operand.type.float:
                floating = True

            if operand.type.vector:
                vector = True

        if vector and floating:
            ###################################################################
            # Pass 1.A: if instruction uses vector floats, init vector
            #           registers to float values
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    v_value=(
                        1.000000000000001,
                        64)))
        elif vector:
            ###################################################################
            # Pass 1.B: if instruction uses vector but not floats, init
            #           vector registers to integer value
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    v_value=(
                        _init_value(),
                        64)))
        elif floating:
            ###################################################################
            # Pass 1.C: if instruction uses floats, init float
            #           registers to float values
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    fp_value=1.000000000000001))

        #######################################################################
        # Pass 2: Add a building block of size 'size'
        #######################################################################
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(size))

        #######################################################################
        # Pass 3: Fill the building block with the instruction sequence
        #######################################################################
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeBySequencePass(
                sequence
            )
        )

        #######################################################################
        # Pass 4: Compute addresses of instructions (this pass is needed to
        #         update the internal representation information so that in
        #         case addresses are required, they are up to date).
        #######################################################################
        synth.add_pass(
            microprobe.passes.address.UpdateInstructionAddressesPass())

        #######################################################################
        # Pass 5: Set target of branches to be the next instruction in the
        #         instruction stream
        #######################################################################
        synth.add_pass(microprobe.passes.branch.BranchNextPass())

        #######################################################################
        # Pass 6: Set memory-related operands to access 16 storage locations
        #         in a round-robin fashion in stride 256 bytes.
        #         The pattern would be: 0, 256, 512, .... 3840, 0, 256, ...
        #######################################################################
        synth.add_pass(
            microprobe.passes.memory.SingleMemoryStreamPass(
                16,
                256))

        #######################################################################
        # Pass 7.A: Initialize the storage locations accessed by floating
        #           point instructions to have a valid floating point value
        #######################################################################
        synth.add_pass(microprobe.passes.float.InitializeMemoryFloatPass(
            value=1.000000000000001)
        )

        #######################################################################
        # Pass 7.B: Initialize the storage locations accessed by decimal
        #           instructions to have a valid decimal value
        #######################################################################
        synth.add_pass(
            microprobe.passes.decimal.InitializeMemoryDecimalPass(
                value=1))

        #######################################################################
        # Pass 8: Set the remaining instruction operands (if not set)
        #         (Required to set remaining immediate operands)
        #######################################################################
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                dd=distance))

        # Synthesize the microbenchmark. The synthesizer applies the set of
        # transformation passes added before and returns an object
        # representing the microbenchmark
        bench = synth.synthesize()

        # Save the microbenchmark to the file 'fname'
        synth.save(fname, bench=bench)

        print_info("%s generated!" % (fname))

        # Remove fail file if exists
        if os.path.isfile(ffname):
            os.remove(ffname)

    except MicroprobeException:

        # In case of exception during the generation of the microbenchmark,
        # print the error, write the fail file and exit
        print_error(traceback.format_exc())
        open(ffname, 'a').close()
        exit(-1)


def _init_value():
    """ Return an init value """
    return 0b0101010101010101010101010101010101010101010101010101010101010101


# Main
if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if callable(locals().get('main_setup')):
        main_setup()
        exit(0)
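Pass 6 in the listing describes its access pattern only in a comment. The arithmetic behind that comment can be checked with a few lines of plain Python. stream_offsets is a hypothetical helper written for this illustration: it reproduces the offsets named in the comment and says nothing about how SingleMemoryStreamPass is implemented internally.

```python
from itertools import islice


def stream_offsets(locations, stride):
    """Yield byte offsets that visit `locations` slots round-robin."""
    index = 0
    while True:
        yield index * stride
        index = (index + 1) % locations


# 16 locations with a 256-byte stride, the two values passed in the listing.
offsets = list(islice(stream_offsets(16, 256), 18))
print(offsets[:3], offsets[15], offsets[16:18])
```

The last distinct offset is 15 x 256 = 3840, after which the sequence wraps to 0, so the stream repeats every 16 x 256 = 4096 bytes.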
The code is self-documented, so reading it is a good way to understand the basic concepts of code generation in Microprobe. To help the reader, we summarize and elaborate on the explanations in the code. The following are the suggested steps to implement a command line tool that generates microbenchmarks using Microprobe:

1. Define the command line interface and parameters (main_setup() function in the example). This includes:
   - Create a command line interface object
   - Define parameters using the add_option interface
   - Call the actual main with the arguments
2. Define the function to process the input parameters (_main() function in the example). This includes:
   - Import target definition
   - Get processed arguments
   - Validate and use the arguments to call the actual microbenchmark generation function
3. Define the function to generate the microbenchmark (_generate_benchmark function in the example). The main elements are the following:
   - Get the wrapper object. The wrapper object defines the general characteristics of the code being generated (i.e. how the internal representation will be translated to the final file being generated). General characteristics are, for instance, code prologs such as #include <header.h> directives, the main function declaration, epilogs, etc. In this case, the wrapper selected is CInfGen. This wrapper generates C code with an infinite loop of instructions. This results in the following code:

         #include <stdio.h>
         #include <string.h>

         // <declaration of variables>

         int main(int argc, char** argv, char** envp) {
             // <initialization_code>
             while(1) {
                 // <generated_code>
             } // end while
         }

     The user can subclass or define their own wrappers to fulfill their needs. See microprobe.code.wrapper.Wrapper for more details.
   - Instantiate the synthesizer. The benchmark synthesizer object is in charge of driving the code generation by applying the set of transformation passes defined by the user.
   - Define the transformation passes. The transformation passes will fill the <declaration of variables>, <initialization_code> and <generated_code> sections of the previous code block. Depending on the order and the type of passes applied, the code generated will be different. The user has plenty of transformation passes to apply. See microprobe.passes and all its submodules for further details. Also, the user can define their own passes by subclassing the class microprobe.passes.Pass.
   - Finally, once the generation policy is defined, the user only has to synthesize the benchmark and save it to a file.
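The relationship between wrapper, synthesizer and passes described in the steps above can be sketched without Microprobe. Everything in the block below (ToyWrapper, ToySynthesizer, the two pass functions and the dict-based representation) is a made-up stand-in chosen for illustration; the real classes have much richer interfaces. The point it tries to show is that passes are applied in the order they were added to an initially empty representation, and that the wrapper only decides how the result is laid out.

```python
class ToyWrapper:
    """Turns the internal representation into C-like text."""

    def render(self, rep):
        lines = ["int main(void) {"]
        lines += ["    " + stmt for stmt in rep["init"]]
        lines.append("    while (1) {")
        lines += ["        " + stmt for stmt in rep["body"]]
        lines += ["    }", "}"]
        return "\n".join(lines)


class ToySynthesizer:
    """Applies passes, in insertion order, to an empty representation."""

    def __init__(self, wrapper):
        self.wrapper = wrapper
        self.passes = []

    def add_pass(self, func):
        self.passes.append(func)

    def synthesize(self):
        rep = {"init": [], "body": []}
        for func in self.passes:
            func(rep)
        return self.wrapper.render(rep)


def init_registers(rep):
    """Toy pass: contributes to the initialization section."""
    rep["init"].append("/* set registers to a known value */")


def fill_block(size):
    """Toy pass factory: fills the loop body with `size` placeholders."""
    def apply(rep):
        rep["body"].extend("/* instruction %d */" % i for i in range(size))
    return apply


synth = ToySynthesizer(ToyWrapper())
synth.add_pass(init_registers)
synth.add_pass(fill_block(3))
print(synth.synthesize())
```

Swapping ToyWrapper for a different renderer would change the output format without touching the passes, which is the separation of concerns the steps describe.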
power_v206_power7_ppc64_linux_gcc_fu_stress.py
The following example shows how to generate microbenchmarks that stress a particular functional unit of the architecture. The code is self-explanatory:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15
16"""
17power_v206_power7_ppc64_linux_gcc_fu_stress.py
18
19Example module to show how to generate a benchmark stressing a particular
20functional unit of the microarchitecture at different rate using the
21average latency of instructions as well as the average dependency distance
22between the instructions
23"""
24
25# Futures
26from __future__ import absolute_import
27
28# Built-in modules
29import os
30import sys
31import traceback
32
33# Own modules
34import microprobe.code.ins
35import microprobe.passes.address
36import microprobe.passes.branch
37import microprobe.passes.decimal
38import microprobe.passes.float
39import microprobe.passes.ilp
40import microprobe.passes.initialization
41import microprobe.passes.instruction
42import microprobe.passes.memory
43import microprobe.passes.register
44import microprobe.passes.structure
45import microprobe.utils.cmdline
46from microprobe.exceptions import MicroprobeException, \
47 MicroprobeTargetDefinitionError
48from microprobe.target import import_definition
49from microprobe.utils.cmdline import dict_key, existing_dir, \
50 float_type, int_type, print_error, print_info
51from microprobe.utils.logger import get_logger
52
53__author__ = "Ramon Bertran"
54__copyright__ = "Copyright 2011-2021 IBM Corporation"
55__credits__ = []
56__license__ = "IBM (c) 2011-2021 All rights reserved"
57__version__ = "0.5"
58__maintainer__ = "Ramon Bertran"
59__email__ = "rbertra@us.ibm.com"
60__status__ = "Development" # "Prototype", "Development", or "Production"
61
62# Constants
63LOG = get_logger(__name__) # Get the generic logging interface
64
65
66# Functions
67def main_setup():
68 """
69 Set up the command line interface (CLI) with the arguments required by
70 this command line tool.
71 """
72
73 args = sys.argv[1:]
74
75 # Get the target definition
76 try:
77 target = import_definition("power_v206-power7-ppc64_linux_gcc")
78 except MicroprobeTargetDefinitionError as exc:
79 print_error("Unable to import target definition")
80 print_error("Exception message: %s" % str(exc))
81 exit(-1)
82
83 func_units = {}
84 valid_units = [elem.name for elem in target.elements.values()]
85
86 for instr in target.isa.instructions.values():
87 if instr.execution_units == "None":
88 LOG.debug("Execution units for: '%s' not defined", instr.name)
89 continue
90
91 for unit in instr.execution_units:
92 if unit not in valid_units:
93 continue
94
95 if unit not in func_units:
96 func_units[unit] = [elem for elem in target.elements.values()
97 if elem.name == unit][0]
98
99 # Create the CLI interface object
100 cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
101 config_options=False,
102 target_options=False,
103 debug_options=False)
104
105 # Add the different parameters for this particular tool
106 cmdline.add_option(
107 "functional_unit",
108 "f",
109 [func_units['ALU']],
110 "Functional units to stress. Default: ALU",
111 required=False,
112 nargs="+",
113 choices=func_units,
114 opt_type=dict_key(func_units),
115 metavar="FUNCTIONAL_UNIT_NAME")
116
117 cmdline.add_option(
118 "output_prefix",
119 None,
120 "POWER_V206_FU_STRESS",
121 "Output prefix of the generated files. Default: POWER_V206_FU_STRESS",
122 opt_type=str,
123 required=False,
124 metavar="PREFIX")
125
126 cmdline.add_option(
127 "output_path",
128 "O",
129 "./",
130 "Output path. Default: current path",
131 opt_type=existing_dir,
132 metavar="PATH")
133
134 cmdline.add_option(
135 "size",
136 "S",
137 64,
138 "Benchmark size (number of instructions in the endless loop). "
139 "Default: 64 instructions",
140 opt_type=int_type(1, 2**20),
141 metavar="BENCHMARK_SIZE")
142
143 cmdline.add_option(
144 "dependency_distance",
145 "D",
146 1000,
147 "Average dependency distance between the instructions. "
148 "Default: 1000 (no dependencies)",
149 opt_type=int_type(1, 1000),
150 metavar="DEPENDECY_DISTANCE")
151
152 cmdline.add_option(
153 "average_latency",
154 "L",
155 2,
156 "Average latency of the selected instructins. "
157 "Default: 2 cycles",
158 opt_type=float_type(1, 1000),
159 metavar="AVERAGE_LATENCY")
160
161 # Start the main
162 print_info("Processing input arguments...")
163 cmdline.main(args, _main)
164
165
166def _main(arguments):
167 """
168 Main program. Called after the arguments from the CLI interface have
169 been processed.
170 """
171
172 print_info("Arguments processed!")
173
174 print_info("Importing target definition "
175 "'power_v206-power7-ppc64_linux_gcc'...")
176 target = import_definition("power_v206-power7-ppc64_linux_gcc")
177
178 # Get the arguments
179 functional_units = arguments["functional_unit"]
180 prefix = arguments["output_prefix"]
181 output_path = arguments["output_path"]
182 size = arguments["size"]
183 latency = arguments["average_latency"]
184 distance = arguments["dependency_distance"]
185
186 if functional_units is None:
187 functional_units = ["ALL"]
188
189 _generate_benchmark(target,
190 "%s/%s_" % (output_path, prefix),
191 (functional_units, size, latency, distance))
192
193
194def _generate_benchmark(target, output_prefix, args):
195 """
196 Actual benchmark generation policy. This is the function that defines
197 how the microbenchmarks are going to be generated.
198 """
199
200 functional_units, size, latency, distance = args
201
202 try:
203
204 # Name of the output file
205 func_unit_names = [unit.name for unit in functional_units]
206 fname = "%s%s" % (output_prefix, "_".join(func_unit_names))
207 fname = "%s_LAT_%s" % (fname, latency)
208 fname = "%s_DEP_%s" % (fname, distance)
209
210 # Name of the fail output file (generated in case of exception)
211 ffname = "%s.c.fail" % (fname)
212
213 print_info("Generating %s ..." % (fname))
214
215 # Get the wrapper object. The wrapper object is in charge of
216 # translating the internal representation of the microbenchmark
217 # to the final output format.
218 #
219 # In this case, we obtain the 'CInfGen' wrapper, which embeds
220 # the generated code within an infinite loop using C plus
221 # in-line assembly statements.
222 cwrapper = microprobe.code.get_wrapper("CInfGen")
223
224 # Create the synthesizer object, which is in charge of driving the
225 # generation of the microbenchmark, given a set of passes
226 # (a.k.a. transformations) to apply to an empty internal
227 # representation of the microbenchmark
228 synth = microprobe.code.Synthesizer(target, cwrapper(),
229 value=0b01010101)
230
231 # Add the transformation passes
232
233 #######################################################################
234 # Pass 1: Init integer registers to a given value #
235 #######################################################################
236 synth.add_pass(
237 microprobe.passes.initialization.InitializeRegistersPass(
238 value=_init_value()))
239
240 #######################################################################
241 # Pass 2: Add a building block of size 'size' #
242 #######################################################################
243 synth.add_pass(
244 microprobe.passes.structure.SimpleBuildingBlockPass(size)
245 )
246
247 #######################################################################
248 # Pass 3: Fill the building block with the instruction sequence #
249 #######################################################################
250 synth.add_pass(
251 microprobe.passes.instruction.SetInstructionTypeByElementPass(
252 target,
253 functional_units,
254 {}))
255
256 #######################################################################
257 # Pass 4: Compute addresses of instructions (this pass is needed to #
258 # update the internal representation information so that in #
259 # case addresses are required, they are up to date). #
260 #######################################################################
261 synth.add_pass(
262 microprobe.passes.address.UpdateInstructionAddressesPass()
263 )
264
265 #######################################################################
266 # Pass 5: Set target of branches to be the next instruction in the #
267 # instruction stream #
268 #######################################################################
269 synth.add_pass(microprobe.passes.branch.BranchNextPass())
270
271 #######################################################################
272 # Pass 6: Set memory-related operands to access 16 storage locations #
273 # in a round-robin fashion in stride 256 bytes. #
274 # The pattern would be: 0, 256, 512, .... 3840, 0, 256, ... #
275 #######################################################################
276 synth.add_pass(
277 microprobe.passes.memory.SingleMemoryStreamPass(
278 16,
279 256))
280
281 #######################################################################
282 # Pass 7.A: Initialize the storage locations accessed by floating #
283 # point instructions to have a valid floating point value #
284 #######################################################################
285 synth.add_pass(microprobe.passes.float.InitializeMemoryFloatPass(
286 value=1.000000000000001)
287 )
288
289 #######################################################################
290 # Pass 7.B: Initialize the storage locations accessed by decimal #
291 # instructions to have a valid decimal value #
292 #######################################################################
293 synth.add_pass(
294 microprobe.passes.decimal.InitializeMemoryDecimalPass(
295 value=1))
296
297 #######################################################################
298 # Pass 8: Set the remaining instructions operands (if not set) #
299 # (Required to set remaining immediate operands) #
300 #######################################################################
301 synth.add_pass(
302 microprobe.passes.register.DefaultRegisterAllocationPass(
303 dd=distance))
304
305 # Synthesize the microbenchmark. The synthesizer applies the set of
306 # transformation passes added before and returns an object representing
307 # the microbenchmark.
308 bench = synth.synthesize()
309
310 # Save the microbenchmark to the file 'fname'
311 synth.save(fname, bench=bench)
312
313 print_info("%s generated!" % (fname))
314
315 # Remove fail file if exists
316 if os.path.isfile(ffname):
317 os.remove(ffname)
318
319 except MicroprobeException:
320
321 # In case of exception during the generation of the microbenchmark,
322 # print the error, write the fail file and exit
323 print_error(traceback.format_exc())
324 open(ffname, 'a').close()
325 exit(-1)
326
327
328def _init_value():
329 """ Return an init value """
330 return 0b0101010101010101010101010101010101010101010101010101010101010101
331
332
333# Main
334if __name__ == '__main__':
335 # run main if executed from the command line
336 # and the main method exists
337
338 if callable(locals().get('main_setup')):
339 main_setup()
340 exit(0)
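Pass 6 in the listing above asks SingleMemoryStreamPass for 16 storage locations with a 256-byte stride. The access pattern described in that comment can be reproduced in plain Python as a quick sanity check. This is a minimal sketch that does not use Microprobe; the helper name stream_offsets is hypothetical and only illustrates the round-robin sequence, not how the pass is implemented.

```python
from itertools import cycle, islice


def stream_offsets(num_locations, stride):
    """Return an endless iterator over round-robin byte offsets.

    Hypothetical helper: it only mirrors the pattern documented for
    Pass 6 (0, stride, 2*stride, ..., then back to 0).
    """
    return cycle(range(0, num_locations * stride, stride))


# First 18 accesses for 16 locations and a 256-byte stride
offsets = list(islice(stream_offsets(16, 256), 18))
print(offsets)
```

The sequence wraps back to offset 0 after the sixteenth access, so the footprint of the stream stays at 16 * 256 = 4096 bytes regardless of the benchmark size.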
power_v206_power7_ppc64_linux_gcc_memory.py
The following example shows how to create microbenchmarks with different activity (stress levels) on the different levels of the cache hierarchy. Note that it is not necessary to use the built-in command line interface provided by Microprobe, as the example shows.
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15
16"""
17power_v206_power7_ppc64_linux_gcc_memory.py
18
19Example python script to show how to generate microbenchmarks with particular
20levels of activity in the memory hierarchy.
21"""
22
23# Futures
24from __future__ import absolute_import
25
26# Built-in modules
27import multiprocessing as mp
28import os
29import random
30import sys
31
32# Third party modules
33from six.moves import map
34
35# Own modules
36import microprobe.code
37import microprobe.passes.address
38import microprobe.passes.ilp
39import microprobe.passes.initialization
40import microprobe.passes.instruction
41import microprobe.passes.memory
42import microprobe.passes.register
43import microprobe.passes.structure
44from microprobe.exceptions import MicroprobeTargetDefinitionError
45from microprobe.model.memory import EndlessLoopDataMemoryModel
46from microprobe.target import import_definition
47from microprobe.utils.cmdline import print_error, print_info
48
49__author__ = "Ramon Bertran"
50__copyright__ = "Copyright 2011-2021 IBM Corporation"
51__credits__ = []
52__license__ = "IBM (c) 2011-2021 All rights reserved"
53__version__ = "0.5"
54__maintainer__ = "Ramon Bertran"
55__email__ = "rbertra@us.ibm.com"
56__status__ = "Development" # "Prototype", "Development", or "Production"
57
58# Get the target definition
59try:
60 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
61except MicroprobeTargetDefinitionError as exc:
62 print_error("Unable to import target definition")
63 print_error("Exception message: %s" % str(exc))
64 exit(-1)
65
66BASE_ELEMENT = [element for element in TARGET.elements.values()
67 if element.name == 'L1D'][0]
68CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
69 BASE_ELEMENT)
70
71# Benchmark size
72BENCHMARK_SIZE = 8 * 1024
73
74# Fill a list of the models to be generated
75
76MEMORY_MODELS = []
77
78#
79# Due to performance issues (long exec. time) this
80# model is disabled
81#
82# MEMORY_MODELS.append(
83# (
84# "ALL", CACHE_HIERARCHY, [
85# 25, 25, 25, 25]))
86
87MEMORY_MODELS.append(
88 (
89 "L1", CACHE_HIERARCHY, [
90 100, 0, 0, 0]))
91MEMORY_MODELS.append(
92 (
93 "L2",
94 CACHE_HIERARCHY, [
95 0, 100, 0, 0]))
96MEMORY_MODELS.append(
97 (
98 "L3",
99 CACHE_HIERARCHY, [
100 0, 0, 100, 0]))
101MEMORY_MODELS.append(
102 (
103 "L1L3",
104 CACHE_HIERARCHY, [
105 50, 0, 50, 0]))
106MEMORY_MODELS.append(
107 (
108 "L1L2",
109 CACHE_HIERARCHY, [
110 50, 50, 0, 0]))
111MEMORY_MODELS.append(
112 (
113 "L2L3",
114 CACHE_HIERARCHY, [
115 0, 50, 50, 0]))
116MEMORY_MODELS.append(
117 (
118 "CACHES",
119 CACHE_HIERARCHY, [
120 33, 33, 34, 0]))
121MEMORY_MODELS.append(
122 (
123 "MEM", CACHE_HIERARCHY, [
124 0, 0, 0, 100]))
125
126
127# Enable parallel generation
128PARALLEL = False
129
130
131def main():
132 """Main function. """
133 # call the generate method for each model in the memory model list
134
135 if PARALLEL:
136 print_info("Start parallel execution...")
137 pool = mp.Pool(processes=mp.cpu_count())
138 pool.map(generate, MEMORY_MODELS, 1)
139 else:
140 print_info("Start sequential execution...")
141 list(map(generate, MEMORY_MODELS))
142
143 exit(0)
144
145
146def generate(model):
147 """Benchmark generation policy function. """
148
149 print_info("Creating memory model '%s' ..." % model[0])
150 model = EndlessLoopDataMemoryModel(*model)
151
152 modelname = model.name
153
154 print_info("Generating Benchmark mem-%s ..." % (modelname))
155
156 # Get the architecture
157 garch = TARGET
158
159 # For all the supported instructions, get the memory operations.
160 sequence = []
161 for instr_name in sorted(garch.instructions.keys()):
162
163 instr = garch.instructions[instr_name]
164
165 if not instr.access_storage:
166 continue
167 if instr.privileged: # Skip privileged
168 continue
169 if instr.hypervisor: # Skip hypervisor
170 continue
171 if instr.trap: # Skip traps
172 continue
173 if "String" in instr.description: # Skip unsupported string instr.
174 continue
175 if "Multiple" in instr.description: # Skip unsupported mult. ld/sts
176 continue
177 if instr.category in [
178 'LMA',
179 'LMV',
180 'DS',
181 'EC',
182 'WT']: # Skip unsupported categories
183 continue
184 if instr.access_storage_with_update: # Not supported by mem. model
185 continue
186 if "Reserve Indexed" in instr.description: # Skip (illegal instr.)
187 continue
188 if "Conditional Indexed" in instr.description: # Skip (illegal instr.)
189 continue
190 if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1']:
191 continue
192
193 sequence.append(instr)
194
195 # Get the loop wrapper. In this case we take the 'CInfPpc', which
196 # generates an infinite loop in C using PowerPC embedded assembly.
197 cwrapper = microprobe.code.get_wrapper("CInfPpc")
198
199 # Define function to return random numbers (used afterwards)
200 def rnd():
201 """Return a random value. """
202 return random.randrange(0, (1 << 64) - 1)
203
204 # Create the benchmark synthesizer
205 synth = microprobe.code.Synthesizer(garch, cwrapper())
206
207 ##########################################################################
208 # Add the passes we want to apply to synthesize benchmarks #
209 ##########################################################################
210
211 # --> Init registers to random values
212 synth.add_pass(
213 microprobe.passes.initialization.InitializeRegistersPass(
214 value=rnd))
215
216 # --> Add a single basic block of size BENCHMARK_SIZE (four times larger for the 'MEM' model)
217 if model.name in ['MEM']:
218 synth.add_pass(
219 microprobe.passes.structure.SimpleBuildingBlockPass(
220 BENCHMARK_SIZE *
221 4))
222 else:
223 synth.add_pass(
224 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
225 )
226
227 # --> Fill the basic block using the sequence of instructions provided
228 synth.add_pass(
229 microprobe.passes.instruction.SetInstructionTypeBySequencePass(
230 sequence
231 )
232 )
233
234 # --> Set the memory operations parameters to fulfill the given model
235 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(model))
236
237 # --> Set the dependency distance and the default allocation. Sets the
238 # remaining undefined instruction operands (register allocation,...)
239 synth.add_pass(microprobe.passes.register.NoHazardsAllocationPass())
240 synth.add_pass(
241 microprobe.passes.register.DefaultRegisterAllocationPass(
242 dd=0))
243
244 # Generate the benchmark (applies the passes).
245 bench = synth.synthesize()
246
247 print_info("Benchmark mem-%s saving to disk..." % (modelname))
248
249 # Save the benchmark
250 synth.save("%s/mem-%s" % (DIRECTORY, modelname), bench=bench)
251
252 print_info("Benchmark mem-%s generated" % (modelname))
253 return True
254
255
256if __name__ == '__main__':
257 # run main if executed from the command line
258 # and the main method exists
259
260 if len(sys.argv) != 2:
261 print_info("Usage:")
262 print_info("%s output_dir" % (sys.argv[0]))
263 exit(-1)
264
265 DIRECTORY = sys.argv[1]
266
267 if not os.path.isdir(DIRECTORY):
268 print_error("Output directory '%s' does not exist" % (DIRECTORY))
269 exit(-1)
270
271 if callable(locals().get('main')):
272 main()
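Each entry in MEMORY_MODELS pairs a name with a list of four percentages, one per level of the data hierarchy obtained from the L1D element. The sketch below restates those lists without Microprobe and checks that every model distributes exactly 100% of the accesses. The level labels in LEVELS are an assumption inferred from the model names, since the listing does not print the hierarchy itself.

```python
# Assumed labels for the four hierarchy levels, closest level first
LEVELS = ("L1", "L2", "L3", "MEM")

# Percentages copied from the MEMORY_MODELS list in the listing above
MODELS = {
    "L1": [100, 0, 0, 0],
    "L2": [0, 100, 0, 0],
    "L3": [0, 0, 100, 0],
    "L1L3": [50, 0, 50, 0],
    "L1L2": [50, 50, 0, 0],
    "L2L3": [0, 50, 50, 0],
    "CACHES": [33, 33, 34, 0],
    "MEM": [0, 0, 0, 100],
}


def describe(weights):
    """Return a readable summary such as 'L1=50% L3=50%'."""
    parts = ["%s=%d%%" % (level, weight)
             for level, weight in zip(LEVELS, weights) if weight]
    return " ".join(parts)


for name, weights in MODELS.items():
    # Every model must account for all accesses
    assert sum(weights) == 100, name
    print("%-6s -> %s" % (name, describe(weights)))
```

A check like this is cheap to run before launching the generation, which the listing notes can take a long time for some models.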
power_v206_power7_ppc64_linux_gcc_random.py
The following example generates random microbenchmarks:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15
16"""
17power_v206_power7_ppc64_linux_gcc_random.py
18
19Example python script to show how to generate random microbenchmarks.
20"""
21
22# Futures
23from __future__ import absolute_import
24
25# Built-in modules
26import multiprocessing as mp
27import os
28import random
29import sys
30
31# Third party modules
32from six.moves import map, range
33
34# Own modules
35import microprobe.code
36import microprobe.passes.address
37import microprobe.passes.branch
38import microprobe.passes.ilp
39import microprobe.passes.initialization
40import microprobe.passes.instruction
41import microprobe.passes.memory
42import microprobe.passes.register
43import microprobe.passes.structure
44from microprobe.exceptions import MicroprobeError, \
45 MicroprobeTargetDefinitionError
46from microprobe.model.memory import EndlessLoopDataMemoryModel
47from microprobe.target import import_definition
48from microprobe.utils.cmdline import print_error, print_info
49
50__author__ = "Ramon Bertran"
51__copyright__ = "Copyright 2011-2021 IBM Corporation"
52__credits__ = []
53__license__ = "IBM (c) 2011-2021 All rights reserved"
54__version__ = "0.5"
55__maintainer__ = "Ramon Bertran"
56__email__ = "rbertra@us.ibm.com"
57__status__ = "Development" # "Prototype", "Development", or "Production"
58
59# Benchmark size
60BENCHMARK_SIZE = 8 * 1024
61
62# Get the target definition
63try:
64 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
65except MicroprobeTargetDefinitionError as exc:
66 print_error("Unable to import target definition")
67 print_error("Exception message: %s" % str(exc))
68 exit(-1)
69
70BASE_ELEMENT = [element for element in TARGET.elements.values()
71 if element.name == 'L1D'][0]
72CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
73 BASE_ELEMENT)
74
75PARALLEL = True
76
77
78def main():
79 """ Main program. """
80 if PARALLEL:
81 pool = mp.Pool(processes=mp.cpu_count())
82 pool.map(generate, list(range(0, 100)), 1)
83 else:
84 list(map(generate, list(range(0, 100))))
85
86
87def generate(name):
88 """ Benchmark generation policy. """
89
90 if os.path.isfile("%s/random-%s.c" % (DIRECTORY, name)):
91 print_info("Skip %d" % name)
92 return
93
94 print_info("Generating %d..." % name)
95
96 # Generate a random memory model (used afterwards)
97 model = []
98 total = 100
99 for mcomp in CACHE_HIERARCHY[0:-1]:
100 weight = random.randint(0, total)
101 model.append(weight)
102 print_info("%s: %d%%" % (mcomp, weight))
103 total = total - weight
104
105 # Fix remaining
106 level = random.randint(0, len(CACHE_HIERARCHY[0:-1]) - 1)
107 model[level] += total
108
109 # Last level always zero
110 model.append(0)
111
112 # Sanity check
113 psum = 0
114 for elem in model:
115 psum += elem
116 assert psum == 100
117
118 modelobj = EndlessLoopDataMemoryModel("random-%s" % name, CACHE_HIERARCHY, model)
119
120 # Get the loop wrapper. In this case we take the 'CInfPpc', which
121 # generates an infinite loop in C using PowerPC embedded assembly.
122 cwrapper = microprobe.code.get_wrapper("CInfPpc")
123
124 # Define function to return random numbers (used afterwards)
125 def rnd():
126 """Return a random value. """
127 return random.randrange(0, (1 << 64) - 1)
128
129 # Create the benchmark synthesizer
130 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
131
132 ##########################################################################
133 # Add the passes we want to apply to synthesize benchmarks #
134 ##########################################################################
135
136 # --> Init registers to random values
137 synth.add_pass(
138 microprobe.passes.initialization.InitializeRegistersPass(
139 value=rnd))
140
141 # --> Add a single basic block of size BENCHMARK_SIZE
142 synth.add_pass(
143 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
144
145 # --> Fill the basic block with instructions picked randomly from the list
146 # provided
147
148 instructions = []
149 for instr in TARGET.instructions.values():
150
151 if instr.privileged: # Skip privileged
152 continue
153 if instr.hypervisor: # Skip hypervisor
154 continue
155 if instr.trap: # Skip traps
156 continue
157 if instr.syscall: # Skip syscall
158 continue
159 if "String" in instr.description: # Skip unsupported string instr.
160 continue
161 if "Multiple" in instr.description: # Skip unsupported mult. ld/sts
162 continue
163 if instr.category in [
164 'LMA',
165 'LMV',
166 'DS',
167 'EC',
168 'WT']: # Skip unsupported categories
169 continue
170 if instr.access_storage_with_update: # Not supported by mem. model
171 continue
172 if instr.branch and not instr.branch_relative: # Skip branches
173 continue
174 if "Reserve Indexed" in instr.description: # Skip (illegal instr.)
175 continue
176 if "Conditional Indexed" in instr.description: # Skip (illegal instr.)
177 continue
178 if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1', ]:
179 continue
180
181 instructions.append(instr)
182
183 synth.add_pass(
184 microprobe.passes.instruction.SetRandomInstructionTypePass(
185 instructions
186 )
187 )
188
189 # --> Set the memory operations parameters to fulfill the given model
190 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(modelobj))
191
192 # --> Set target of branches to next instruction (first compute addresses)
193 synth.add_pass(microprobe.passes.address.UpdateInstructionAddressesPass())
194 synth.add_pass(microprobe.passes.branch.BranchNextPass())
195
196 # --> Set the dependency distance and the default allocation. Dependency
197 # distance is randomly picked
198 synth.add_pass(
199 microprobe.passes.register.DefaultRegisterAllocationPass(
200 dd=random.randint(1, 20)
201 )
202 )
203
204 # Generate the benchmark (applies the passes)
205 # Since it is a randomly generated code, the generation might fail
206 # (e.g. not enough accesses to fulfill the requested memory model, etc.)
207 # Because of that, we handle the exception accordingly.
208 try:
209 print_info("Synthesizing %d..." % name)
210 bench = synth.synthesize()
211 print_info("Synthesized %d!" % name)
212 # Save the benchmark
213 synth.save("%s/random-%s" % (DIRECTORY, name), bench=bench)
214 except MicroprobeError:
215 print_info("Synthesizing error in '%s'. This is Ok." % name)
216
217 return True
218
219
220if __name__ == '__main__':
221 # run main if executed from the command line
222 # and the main method exists
223
224 if len(sys.argv) != 2:
225 print_info("Usage:")
226 print_info("%s output_dir" % (sys.argv[0]))
227 exit(-1)
228
229 DIRECTORY = sys.argv[1]
230
231 if not os.path.isdir(DIRECTORY):
232 print_error("Output directory '%s' does not exist" % (DIRECTORY))
233 exit(-1)
234
235 if callable(locals().get('main')):
236 main()
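The random memory model built inside generate() can be isolated into a small function, which makes its invariants easier to check. The following sketch reproduces the same steps (draw a weight for each cache level, hand the remainder to one randomly chosen cache level, keep the last level at zero) without Microprobe. The function name random_memory_model and the num_levels parameter are illustrative assumptions, not part of the example script.

```python
import random


def random_memory_model(num_levels, rng=random):
    """Return num_levels percentages that sum to 100, last one zero."""
    weights = []
    remaining = 100

    # Draw a weight for every level except the last one
    for _ in range(num_levels - 1):
        weight = rng.randint(0, remaining)
        weights.append(weight)
        remaining -= weight

    # Give whatever is left to a randomly chosen cache level
    weights[rng.randint(0, num_levels - 2)] += remaining

    # The last level (main memory in the listing) always gets zero
    weights.append(0)
    return weights


# A seeded generator makes the sketch reproducible
model = random_memory_model(4, random.Random(0))
print(model, sum(model))
```

Because each weight is drawn from whatever is left, the earlier levels tend to receive larger shares than the later ones. The distribution is random but not uniform across levels.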
power_v206_power7_ppc64_linux_gcc_custom.py
The following example contains several functions, each showing a different way to customize the generation of microbenchmarks:
1#!/usr/bin/env python
2# Copyright 2011-2021 IBM Corporation
3#
4# Licensed under the Apache License, Version 2.0 (the "License");
5# you may not use this file except in compliance with the License.
6# You may obtain a copy of the License at
7#
8# http://www.apache.org/licenses/LICENSE-2.0
9#
10# Unless required by applicable law or agreed to in writing, software
11# distributed under the License is distributed on an "AS IS" BASIS,
12# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13# See the License for the specific language governing permissions and
14# limitations under the License.
15
16"""
17power_v206_power7_ppc64_linux_gcc_custom.py
18
19Example python script to show how to customize the generation of microbenchmarks.
20"""
21
22# Futures
23from __future__ import absolute_import
24
25# Built-in modules
26import os
27import sys
28
29# Own modules
30import microprobe.code
31import microprobe.passes.initialization
32import microprobe.passes.instruction
33import microprobe.passes.memory
34import microprobe.passes.register
35import microprobe.passes.structure
36from microprobe.exceptions import MicroprobeTargetDefinitionError
37from microprobe.model.memory import EndlessLoopDataMemoryModel
38from microprobe.target import import_definition
39from microprobe.utils.cmdline import print_error, print_info
40from microprobe.utils.misc import RNDINT
41
42__author__ = "Ramon Bertran"
43__copyright__ = "Copyright 2011-2021 IBM Corporation"
44__credits__ = []
45__license__ = "IBM (c) 2011-2021 All rights reserved"
46__version__ = "0.5"
47__maintainer__ = "Ramon Bertran"
48__email__ = "rbertra@us.ibm.com"
49__status__ = "Development" # "Prototype", "Development", or "Production"
50
51# Benchmark size
52BENCHMARK_SIZE = 8 * 1024
53
54if len(sys.argv) != 2:
55 print_info("Usage:")
56 print_info("%s output_dir" % (sys.argv[0]))
57 exit(-1)
58
59DIRECTORY = sys.argv[1]
60
61if not os.path.isdir(DIRECTORY):
62 print_error("Output directory '%s' does not exist" % (DIRECTORY))
63 exit(-1)
64
65# Get the target definition
66try:
67 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
68except MicroprobeTargetDefinitionError as exc:
69 print_error("Unable to import target definition")
70 print_error("Exception message: %s" % str(exc))
71 exit(-1)
72
73
74###############################################################################
75# Example 1: loop with instructions accessing storage, hitting the first #
76# level of cache and with dependency distance of 3 #
77###############################################################################
78def example_1():
79 """ Example 1 """
80 name = "L1-LOADS"
81
82 base_element = [element for element in TARGET.elements.values()
83 if element.name == 'L1D'][0]
84 cache_hierarchy = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
85 base_element)
86
87 model = [0] * len(cache_hierarchy)
88 model[0] = 100
89
90 mmodel = EndlessLoopDataMemoryModel("random-%s", cache_hierarchy, model)
91
92 profile = {}
93 for instr_name in sorted(TARGET.instructions.keys()):
94 instr = TARGET.instructions[instr_name]
95 if not instr.access_storage:
96 continue
97 if instr.privileged: # Skip privileged
98 continue
99 if instr.hypervisor: # Skip hypervisor
100 continue
101 if "String" in instr.description: # Skip unsupported string instr.
102 continue
103 if "ultiple" in instr.description: # Skip unsupported mult. ld/sts
104 continue
105 if instr.category in [
106 'DS',
107 'LMA',
108 'LMV',
109 'EC']: # Skip unsupported categories
110 continue
111 if instr.access_storage_with_update: # Not supported
112 continue
113
114 if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1', ]:
115 continue
116
117 if (any([moper.is_load
118 for moper in instr.memory_operand_descriptors]) and
119 all(
120 [not moper.is_store
121 for moper in instr.memory_operand_descriptors])):
122 profile[instr] = 1
123
124 cwrapper = microprobe.code.get_wrapper("CInfPpc")
125 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
126
127 synth.add_pass(
128 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
129 synth.add_pass(
130 microprobe.passes.initialization.InitializeRegistersPass(
131 value=RNDINT))
132 synth.add_pass(
133 microprobe.passes.initialization.InitializeRegisterPass(
134 "GPR1", 0, force=True, reserve=True
135 )
136 )
137 synth.add_pass(
138 microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile)
139 )
140 synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(mmodel))
141 synth.add_pass(
142 microprobe.passes.register.DefaultRegisterAllocationPass(
143 dd=3))
144
145 print_info("Generating %s..." % name)
146 bench = synth.synthesize()
147 print_info("%s Generated!" % name)
148 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
149
150
151###############################################################################
152# Example 2: loop with instructions using the MUL unit and with dependency #
153# distance of 4 #
154###############################################################################
155def example_2():
156 """ Example 2 """
157 name = "FXU-MUL"
158
159 cwrapper = microprobe.code.get_wrapper("CInfPpc")
160 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
161
162 synth.add_pass(
163 microprobe.passes.initialization.InitializeRegistersPass(
164 value=RNDINT))
165 synth.add_pass(
166 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
167 synth.add_pass(
168 microprobe.passes.instruction.SetInstructionTypeByElementPass(
169 TARGET,
170 [TARGET.elements['MUL_FXU0_Core0_SCM_Processor']],
171 {}))
172 synth.add_pass(
173 microprobe.passes.register.DefaultRegisterAllocationPass(
174 dd=4))
175
176 print_info("Generating %s..." % name)
177 bench = synth.synthesize()
178 print_info("%s Generated!" % name)
179 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
180
181
182###############################################################################
183# Example 3: loop with instructions using the ALU unit and with dependency #
184# distance of 1 #
185###############################################################################
186def example_3():
187 """ Example 3 """
188 name = "FXU-ALU"
189
190 cwrapper = microprobe.code.get_wrapper("CInfPpc")
191 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
192
193 synth.add_pass(
194 microprobe.passes.initialization.InitializeRegistersPass(
195 value=RNDINT))
196 synth.add_pass(
197 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
198 synth.add_pass(
199 microprobe.passes.instruction.SetInstructionTypeByElementPass(
200 TARGET,
201 [TARGET.elements['ALU_FXU0_Core0_SCM_Processor']],
202 {}))
203 synth.add_pass(
204 microprobe.passes.register.DefaultRegisterAllocationPass(
205 dd=1))
206
207 print_info("Generating %s..." % name)
208 bench = synth.synthesize()
209 print_info("%s Generated!" % name)
210 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
211
212
213###############################################################################
214# Example 4: loop with FMUL* instructions with different weights and with #
215# dependency distance 10 #
216###############################################################################
217def example_4():
218 """ Example 4 """
219 name = "VSU-FMUL"
220
221 profile = {}
222 profile[TARGET.instructions['FMUL_V0']] = 4
223 profile[TARGET.instructions['FMULS_V0']] = 3
224 profile[TARGET.instructions['FMULx_V0']] = 2
225 profile[TARGET.instructions['FMULSx_V0']] = 1
226
227 cwrapper = microprobe.code.get_wrapper("CInfPpc")
228 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
229
230 synth.add_pass(
231 microprobe.passes.initialization.InitializeRegistersPass(
232 value=RNDINT))
233 synth.add_pass(
234 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
235 synth.add_pass(
236 microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
237 synth.add_pass(
238 microprobe.passes.register.DefaultRegisterAllocationPass(
239 dd=10))
240
241 print_info("Generating %s..." % name)
242 bench = synth.synthesize()
243 print_info("%s Generated!" % name)
244 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
245
246
247###############################################################################
248# Example 5: loop with FADD* instructions with different weights and with #
249# dependency distance 1 #
250###############################################################################
251def example_5():
252 """ Example 5 """
253 name = "VSU-FADD"
254
255 profile = {}
256 profile[TARGET.instructions['FADD_V0']] = 100
257 profile[TARGET.instructions['FADDx_V0']] = 1
258 profile[TARGET.instructions['FADDS_V0']] = 10
259 profile[TARGET.instructions['FADDSx_V0']] = 1
260
261 cwrapper = microprobe.code.get_wrapper("CInfPpc")
262 synth = microprobe.code.Synthesizer(TARGET, cwrapper())
263
264 synth.add_pass(
265 microprobe.passes.initialization.InitializeRegistersPass(
266 value=RNDINT))
267 synth.add_pass(
268 microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
269 synth.add_pass(
270 microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
271 synth.add_pass(
272 microprobe.passes.register.DefaultRegisterAllocationPass(
273 dd=1))
274
275 print_info("Generating %s..." % name)
276 bench = synth.synthesize()
277 print_info("%s Generated!" % name)
278 synth.save("%s/%s" % (DIRECTORY, name), bench=bench) # Save the benchmark
279
280
281###############################################################################
282# Call the examples #
283###############################################################################
284example_1()
285example_2()
286example_3()
287example_4()
288example_5()
289exit(0)
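Examples 4 and 5 pass a dictionary of instruction weights to SetInstructionTypeByProfilePass. Assuming the weights are treated as relative proportions (the listing does not state how the pass interprets them), the expected share of each instruction can be estimated as sketched below. Instruction names are used as plain strings here so that the sketch runs without Microprobe; profile_shares is a hypothetical helper.

```python
from fractions import Fraction


def profile_shares(profile):
    """Map each instruction name to weight / total weight."""
    total = sum(profile.values())
    return {name: Fraction(weight, total) for name, weight in profile.items()}


# Weights copied from example_4 and example_5
fmul_profile = {"FMUL_V0": 4, "FMULS_V0": 3, "FMULx_V0": 2, "FMULSx_V0": 1}
fadd_profile = {"FADD_V0": 100, "FADDx_V0": 1, "FADDS_V0": 10, "FADDSx_V0": 1}

for label, profile in (("VSU-FMUL", fmul_profile), ("VSU-FADD", fadd_profile)):
    for name, share in profile_shares(profile).items():
        print("%s %-10s %5.1f%%" % (label, name, 100 * float(share)))
```

Under that assumption, the VSU-FADD profile is dominated by FADD_V0 (100 of 112 weight units), while the VSU-FMUL profile spreads the instructions more evenly.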
power_v206_power7_ppc64_linux_gcc_genetic.py
Deprecated since version 0.5: Support for PyEvolve and genetic-algorithm-based searches has been discontinued.
The following example shows how to use the design exploration module and genetic-algorithm-based searches to look for a solution. In particular, for each functional unit of the architecture and a range of IPC (instructions per cycle) values, the example searches for a microbenchmark that stresses that functional unit at the given IPC. External commands (not included) are needed to evaluate the generated microbenchmarks on the target platform.
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
power_v206_power7_ppc64_linux_gcc_genetic.py

Example python script to show how to generate a set of microbenchmarks
stressing a particular unit but at different IPC ratios using a genetic
search algorithm to play with two knobs: average latency and dependency
distance.

An IPC evaluation and scoring script is required. For instance:

.. code:: bash

    #!/bin/bash
    # ARGS: $1 is the target IPC
    #       $2 is the name of the generated benchmark
    target_ipc=$1
    source_bench=$2

    # Compile the benchmark
    gcc -O0 -mcpu=power7 -mtune=power7 -std=c99 $source_bench.c -o $source_bench

    # Evaluate the ipc
    ipc=< your preferred commands to evaluate the IPC >

    # Compute the score (the closer to the target IPC, the higher the score)
    score=$(echo "(1/($ipc-$target_ipc))^2" | bc -l)

    echo $score

Use the script above as a template for your own GA-based search.
"""

# Futures
from __future__ import absolute_import, division

# Built-in modules
import datetime
import os
import sys
import time as runtime

# Third party modules
from six.moves import range

# Own modules
import microprobe.code
import microprobe.driver.genetic
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.register
import microprobe.passes.structure
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info, print_warning
from microprobe.utils.misc import RNDINT

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 20

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)


def main():
    """Main function."""

    component_list = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]
    # Target IPCs from 0.1 to 4.0 in steps of 0.1, reordered so that the
    # search starts at 0.6 and leaves 0.1-0.5 for the end
    ipcs = [float(x) / 10 for x in range(1, 41)]
    ipcs = ipcs[5:] + ipcs[:5]

    for name in component_list:
        for ipc in ipcs:
            generate_genetic(name, ipc)


def generate_genetic(compname, ipc):
    """Generate a microbenchmark stressing compname at the given ipc."""
    comps = []
    bcomps = []
    any_comp = False

    if compname.find("FXU") >= 0:
        comps.append(TARGET.elements["FXU0_Core0_SCM_Processor"])

    if compname.find("VSU") >= 0:
        comps.append(TARGET.elements["VSU0_Core0_SCM_Processor"])

    if len(comps) == 2:
        any_comp = True
    elif compname.find("noLSU") >= 0:
        bcomps.append(TARGET.elements["LSU0_Core0_SCM_Processor"])
    elif compname.find("LSU") >= 0:
        comps.append(TARGET.elements["LSU_Core0_SCM_Processor"])

    # Skip target IPCs above 2 for one component and above 4 for two
    if (len(comps) == 1 and ipc > 2) or (len(comps) == 2 and ipc > 4):
        return True

    # Skip combinations that already have a generated benchmark
    for elem in os.listdir(DIRECTORY):
        if not elem.endswith(".c"):
            continue
        if elem.startswith("%s:IPC:%.2f:DIST" % (compname, ipc)):
            print_info("Already generated: %s %.2f" % (compname, ipc))
            return True

    print_info("Going for IPC: %f and Element: %s" % (ipc, compname))

    def generate(name, *args):
        """Benchmark generation function.

        First argument is name, second the dependency distance and the
        third is the average instruction latency.
        """
        dist, latency = args

        wrapper = microprobe.code.get_wrapper("CInfPpc")
        synth = microprobe.code.Synthesizer(TARGET, wrapper())
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=RNDINT))
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE)
        )
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeByElementPass(
                TARGET,
                comps,
                {},
                block=bcomps,
                avelatency=latency,
                any_comp=any_comp))
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                dd=dist))
        bench = synth.synthesize()
        synth.save(name, bench=bench)

    # Set the genetic algorithm parameters
    ga_params = []
    ga_params.append((0, 20, 0.05))  # Average dependency distance design space
    ga_params.append((2, 8, 0.05))  # Average instruction latency design space

    # Set up the search driver
    driver = microprobe.driver.genetic.ExecCmdDriver(
        generate, 20, 30, 30, "'%s' %f " %
        (COMMAND, ipc), ga_params)

    starttime = runtime.time()
    print_info("Start search...")
    driver.run(1)
    print_info("Search end")
    endtime = runtime.time()

    print_info("Genetic time::%s" % (
        datetime.timedelta(seconds=endtime - starttime))
    )

    # Check if we found a solution
    ga_params = driver.solution()
    score = driver.score()

    print_info("IPC found: %f, score: %f" % (ipc, score))

    if score < 20:
        print_warning("Unable to find an optimal solution with IPC: %f:" % ipc)
        print_info("Generating the closest solution...")
        generate(
            "%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f-check" %
            (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
            ga_params[0], ga_params[1]
        )
        print_info("Closest solution generated")
    else:
        print_info(
            "Solution found for %s and IPC %f -> dist: %f , "
            "latency: %f " %
            (compname, ipc, ga_params[0], ga_params[1]))
        print_info("Generating solution...")
        generate("%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f" %
                 (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
                 ga_params[0], ga_params[1]
                 )
        print_info("Solution generated")
    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 3:
        print_info("Usage:")
        print_info("%s output_dir eval_cmd" % (sys.argv[0]))
        print_info("")
        print_info("Output dir: output directory for the generated benchmarks")
        print_info("eval_cmd: command accepting 2 parameters: the target IPC")
        print_info("          and the filename of the generated benchmark. ")
        print_info("          Output: the score used for the GA search. E.g.")
        print_info("          the closer the IPC of the generated benchmark to")
        print_info("          the target IPC, the cmd should give a higher ")
        print_info("          score. ")
        exit(-1)

    DIRECTORY = sys.argv[1]
    COMMAND = sys.argv[2]

    if not os.path.isdir(DIRECTORY):
        print_info("Output DIRECTORY '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if not os.path.isfile(COMMAND):
        print_info("The COMMAND '%s' does not exist" % (COMMAND))
        exit(-1)

    if callable(locals().get('main')):
        main()
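The evaluation command expected by the example above is not part of the distribution. As a rough illustration only, the bash template from the docstring could be sketched in Python as follows. It assumes, as that template suggests, that the command receives the target IPC and the benchmark name as its two arguments and prints the score on standard output. The function names (compute_score, compile_benchmark, measure_ipc, main) and the min_diff clamp are hypothetical and not part of Microprobe; the IPC measurement itself is left as a placeholder because it depends on the target platform.

```python
"""Hypothetical eval_cmd sketch: print a score for one generated benchmark."""
import subprocess


def compute_score(measured_ipc, target_ipc, min_diff=1e-3):
    """Return (1 / distance)^2, so a closer IPC yields a higher score."""
    # Assumption: clamp the distance so an exact match does not divide by zero.
    diff = max(abs(measured_ipc - target_ipc), min_diff)
    return (1.0 / diff) ** 2


def compile_benchmark(source_bench):
    """Compile <source_bench>.c with the flags used in the bash template."""
    subprocess.check_call([
        "gcc", "-O0", "-mcpu=power7", "-mtune=power7", "-std=c99",
        source_bench + ".c", "-o", source_bench])


def measure_ipc(binary):
    """Placeholder: replace with your preferred way of measuring the IPC."""
    raise NotImplementedError("plug in a platform-specific IPC measurement")


def main(argv):
    """argv[1] is the target IPC, argv[2] the generated benchmark name."""
    target_ipc = float(argv[1])
    source_bench = argv[2]
    compile_benchmark(source_bench)
    # The only output is the score, one number on standard output.
    print(compute_score(measure_ipc(source_bench), target_ipc))
    return 0
```

To use the sketch as a standalone command, import sys and call sys.exit(main(sys.argv)) under an if __name__ == '__main__' guard once measure_ipc has been filled in. With this scoring formula, the example's score < 20 check corresponds to a measured IPC that is more than roughly 0.22 away from the target.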