Examples on POWER
In the definitions/power/examples directory of the Microprobe distribution
(if you installed the microprobe_target_power package),
you will find several examples showing how to use Microprobe
for the POWER architecture. Although we have split the examples by
architecture, the concepts introduced in these examples are common to all
architectures.
We recommend that users go through the code of these examples to understand the specific details of how to use the framework.
isa_power_v206_info.py
The first example we show is isa_power_v206_info.py. This example
shows how to search for architecture definitions (e.g. the ISA properties),
how to import a definition and then how to dump it.
Executing the following command:
> ./isa_power_v206_info.py
will generate the following output, which shows all the details of the POWER v2.06 architecture (first and last 20 lines for brevity):
--------------------------------------------------------------------------------
ISA Name: power_v206
ISA Description: power_v206
--------------------------------------------------------------------------------
Register Types:
GPR: General Register (bit size: 64)
VSCR: Vector Status and Control Register (bit size: 32)
FPR: Floating-Point Register (bit size: 64)
SPR: Special Purpose Register (64 bits) (bit size: 64)
VR: Vector Register (bit size: 128)
MSR: Machine State Register (bit size: 64)
SPR32: Special Purpose Register (32 bits) (bit size: 32)
VSR: Vector Scalar Register (bit size: 128)
FPSCR: Floating-Point Status and Control Register (bit size: 32)
CR: Condition Register (bit size: 4)
--------------------------------------------------------------------------------
Architected registers:
AESR : AESR Register (Type: SPR)
AMOR : AMOR Register (Type: SPR)
AMR : Authority Mask Register (Type: SPR)
...
access_storage : False (Boolean indicating if the instruction has storage operands )
access_storage_with_update : False (Boolean indicating if the instruction accesses to storage and updates the source register with the generated address)
algebraic : False (Boolean indicating if operation uses algebraic rules to keep values )
branch : False (Boolean indicating if the instruction is a branch )
branch_conditional : False (Boolean indicating if the instruction is a branch conditional )
branch_relative : False (Boolean indicating if the instruction is a relative branch )
category : VSX (String indicating if the instruction the instruction category )
decimal : False (Boolean indication if the instruction requires inputs in decimal format )
disable_asm : False (Boolean indicating if ASM generation is disabled for the instruction. If so, binary codification is used. )
hypervisor : False (Boolean indicating if the instruction need hypervisor mode )
privileged : False (Boolean indicating if the instruction is privileged )
privileged_optional : False (Boolean indicating the instrucion is priviledged or not depending on the input values )
switching : None (Input values required to maximize the computational switching )
syscall : False (Boolean indicating if the instruction is a syscall or return from one )
trap : False (Boolean indicating if the instruction is a trap )
Instructions defined: 938
Variants defined: 964
--------------------------------------------------------------------------------
This is the code that was executed:
 1  #!/usr/bin/env python
 2  # Copyright 2011-2021 IBM Corporation
 3  #
 4  # Licensed under the Apache License, Version 2.0 (the "License");
 5  # you may not use this file except in compliance with the License.
 6  # You may obtain a copy of the License at
 7  #
 8  # http://www.apache.org/licenses/LICENSE-2.0
 9  #
10  # Unless required by applicable law or agreed to in writing, software
11  # distributed under the License is distributed on an "AS IS" BASIS,
12  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13  # See the License for the specific language governing permissions and
14  # limitations under the License.
15  """
16  isa_power_v206_info.py
17
18  Example module to show how to access to isa definitions.
19  """
20
21  # Futures
22  from __future__ import absolute_import, print_function
23
24  # Built-in modules
25  import os
26
27  # Own modules
28  from microprobe.target.isa import find_isa_definitions, import_isa_definition
29
30  __author__ = "Ramon Bertran"
31  __copyright__ = "Copyright 2011-2021 IBM Corporation"
32  __credits__ = []
33  __license__ = "IBM (c) 2011-2021 All rights reserved"
34  __version__ = "0.5"
35  __maintainer__ = "Ramon Bertran"
36  __email__ = "rbertra@us.ibm.com"
37  __status__ = "Development"  # "Prototype", "Development", or "Production"
38
39  # Constants
40  ISANAME = "power_v206"
41
42  # Functions
43
44  # Classes
45
46  # Main
47
48  # Search and import definition
49  ISADEF = import_isa_definition(
50      os.path.dirname([
51          isa for isa in find_isa_definitions() if isa.name == ISANAME
52      ][0].filename))
53
54  # Print definition
55  print((ISADEF.full_report()))
56  exit(0)
In this simple code, the find_isa_definitions and import_isa_definition
functions are first imported from the microprobe.target.isa module (line 28).
Then, find_isa_definitions is used to look for architecture definitions. The
returned list is filtered, and only the definition named power_v206 is
imported using import_isa_definition (lines 49-52). Finally, the full report
of the ISADEF object is printed to standard output (line 55).
In this case, the full report is printed, but the user can query any
information about the imported ISA by using the
microprobe.target.isa.ISA API.
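The selection step in the example indexes the filtered list with [0], which raises a bare IndexError when no definition matches. The following is a minimal sketch of a more defensive selection, not part of Microprobe: Definition is a hypothetical stand-in that models only the name and filename attributes used above, select_definition is an illustrative helper, and the paths are made up.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for the objects returned by find_isa_definitions();
# only the two attributes used in the example are modelled.
Definition = namedtuple("Definition", ["name", "filename"])


def select_definition(definitions, name):
    """Return the first definition called `name`, or raise a clear error."""
    match = next((d for d in definitions if d.name == name), None)
    if match is None:
        known = ", ".join(sorted(d.name for d in definitions))
        raise LookupError("ISA '%s' not found. Known ISAs: %s" % (name, known))
    return match


# Made-up names and paths, for illustration only.
definitions = [
    Definition("power_v206", "/opt/defs/power_v206/isa.yaml"),
    Definition("other_isa", "/opt/defs/other_isa/isa.yaml"),
]

chosen = select_definition(definitions, "power_v206")
# As in the example, the directory containing the definition file is what
# would be handed to the import function.
isa_dir = os.path.dirname(chosen.filename)
print(isa_dir)
```

With a real installation, the list would come from find_isa_definitions() and isa_dir would be passed to import_isa_definition, as in the listing above.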
power_v206_power7_ppc64_linux_gcc_profile.py
The aim of this example is to show how code generation works in Microprobe. In particular, this example shows how to generate, for each instruction of the ISA, an endless loop containing that instruction. The size of the loop and the dependency distance between the instructions of the loop can be specified as parameters. Using Microprobe you can generate thousands of microbenchmarks in a few minutes. Let's start with the command line interface. Executing:
> ./power_v206_power7_ppc64_linux_gcc_profile.py --help
will generate the following output:
power_v206_power7_ppc64_linux_gcc_profile.py: INFO: Processing input arguments...
usage: power_v206_power7_ppc64_linux_gcc_profile.py [-h]
[-P SEARCH_PATH [SEARCH_PATH ...]]
[-V] [-v] [-d]
[-i INSTRUCTION_NAME [INSTRUCTION_NAME ...]]
[--output_prefix PREFIX]
[-O PATH] [-p NUM_JOBS]
[-S BENCHMARK_SIZE]
[-D DEPENDECY_DISTANCE]
ISA power v206 profile example
optional arguments:
-h, --help show this help message and exit
-P SEARCH_PATH [SEARCH_PATH ...], --default_paths SEARCH_PATH [SEARCH_PATH ...]
Default search paths for microprobe target definitions
-V, --version Show Microprobe version and exit
-v, --verbosity Verbosity level (Values: [0,1,2,3,4]). Each time this
argument is specified the verbosity level is
increased. By default, no logging messages are shown.
These are the four levels available:
-v (1): critical messages
-v -v (2): critical and error messages
-v -v -v (3): critical, error and warning messages
-v -v -v -v (4): critical, error, warning and info messages
Specifying more than four verbosity flags, will
default to the maximum of four. If you need extra
information, enable the debug mode (--debug or -d
flags).
-d, --debug Enable debug mode in Microprobe framework. Lots of
output messages will be generated
-i INSTRUCTION_NAME [INSTRUCTION_NAME ...], --instruction INSTRUCTION_NAME [INSTRUCTION_NAME ...]
Instruction names to generate. Default: All
instructions
--output_prefix PREFIX
Output prefix of the generated files. Default:
POWER_V206_PROFILE
-O PATH, --output_path PATH
Output path. Default: current path
-p NUM_JOBS, --parallel NUM_JOBS
Number of parallel jobs. Default: number of CPUs
available (80). Valid values: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80
-S BENCHMARK_SIZE, --size BENCHMARK_SIZE
Benchmark size (number of instructions in the endless
loop). Default: 64 instructions
-D DEPENDECY_DISTANCE, --dependency_distance DEPENDECY_DISTANCE
Average dependency distance between the instructions.
Default: 1000 (no dependencies)
Environment variables:
MICROPROBETEMPLATES Default path for microprobe templates
MICROPROBEDEBUG If set, enable debug
MICROPROBEDEBUGPASSES If set, enable debug during passes
MICROPROBEASMHEXFMT Assembly hexadecimal format. Options:
'all' -> All immediates in hex format
'address' -> Address immediates in hex format (default)
'none' -> All immediate in integer format
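The -S and -D options accept integers within a fixed range; in the script's source the ranges are given as int_type(1, 2**20) and int_type(1, 1000). As a rough illustration of that kind of validation, here is a minimal sketch using plain argparse from the standard library. It does not use Microprobe's CLI class, and bounded_int is a hypothetical analogue of int_type, not its implementation.

```python
import argparse


def bounded_int(low, high):
    """Build an argparse type that accepts integers in [low, high]."""
    def parse(text):
        value = int(text)  # a ValueError here is reported by argparse
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                "%d is not in the range [%d, %d]" % (value, low, high))
        return value
    return parse


parser = argparse.ArgumentParser(description="bounded option sketch")
parser.add_argument("-S", "--size", type=bounded_int(1, 2**20), default=64)
parser.add_argument("-D", "--dependency_distance",
                    type=bounded_int(1, 1000), default=1000)

# Only -S is given, so -D keeps its default value.
args = parser.parse_args(["-S", "128"])
print(args.size, args.dependency_distance)
```

Out-of-range values such as -D 0 make argparse print a usage error and exit, which matches the usual behaviour of a command line tool.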
Let's look at the code to see how this command line tool is implemented. This is the complete code of the script:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_profile.py

Example module to show how to generate a benchmark for each instruction
of the ISA
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import sys
import traceback

# Third party modules
from six.moves import map, range

# Own modules
import microprobe.code.ins
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.decimal
import microprobe.passes.float
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
import microprobe.utils.cmdline
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeException
from microprobe.target import import_definition
from microprobe.utils.cmdline import existing_dir, \
    int_type, print_error, print_info, print_warning
from microprobe.utils.logger import get_logger

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Constants
LOG = get_logger(__name__)  # Get the generic logging interface


# Functions
def main_setup():
    """
    Set up the command line interface (CLI) with the arguments required by
    this command line tool.
    """

    args = sys.argv[1:]

    # Create the CLI interface object
    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
                                           config_options=False,
                                           target_options=False,
                                           debug_options=False)

    # Add the different parameters for this particular tool
    cmdline.add_option(
        "instruction",
        "i",
        None,
        "Instruction names to generate. Default: All instructions",
        required=False,
        nargs="+",
        metavar="INSTRUCTION_NAME")

    cmdline.add_option(
        "output_prefix",
        None,
        "POWER_V206_PROFILE",
        "Output prefix of the generated files. Default: POWER_V206_PROFILE",
        opt_type=str,
        required=False,
        metavar="PREFIX")

    cmdline.add_option("output_path",
                       "O",
                       "./",
                       "Output path. Default: current path",
                       opt_type=existing_dir,
                       metavar="PATH")

    cmdline.add_option(
        "parallel",
        "p",
        MICROPROBE_RC['cpus'],
        "Number of parallel jobs. Default: number of CPUs available (%s)" %
        mp.cpu_count(),
        opt_type=int,
        choices=list(range(1, MICROPROBE_RC['cpus'] + 1)),
        metavar="NUM_JOBS")

    cmdline.add_option(
        "size",
        "S",
        64, "Benchmark size (number of instructions in the endless loop). "
        "Default: 64 instructions",
        opt_type=int_type(1, 2**20),
        metavar="BENCHMARK_SIZE")

    cmdline.add_option("dependency_distance",
                       "D",
                       1000,
                       "Average dependency distance between the instructions. "
                       "Default: 1000 (no dependencies)",
                       opt_type=int_type(1, 1000),
                       metavar="DEPENDECY_DISTANCE")

    # Start the main
    print_info("Processing input arguments...")
    cmdline.main(args, _main)


def _main(arguments):
    """
    Main program. Called after the arguments from the CLI interface have
    been processed.
    """

    print_info("Arguments processed!")

    print_info("Importing target definition "
               "'power_v206-power7-ppc64_linux_gcc'...")
    target = import_definition("power_v206-power7-ppc64_linux_gcc")

    # Get the arguments
    instructions = arguments.get("instruction", None)
    prefix = arguments["output_prefix"]
    output_path = arguments["output_path"]
    parallel_jobs = arguments["parallel"]
    size = arguments["size"]
    distance = arguments["dependency_distance"]

    # Process the arguments
    if instructions is not None:

        # If the user has provided some instructions, make sure they
        # exists and then we call the generation function

        instructions = _validate_instructions(instructions, target)

        if len(instructions) == 0:
            print_error("No valid instructions defined.")
            exit(-1)

        # Set more verbose level
        # set_log_level(10)
        #
        list(
            map(_generate_benchmark,
                [(instruction, prefix, output_path, target, size, distance)
                 for instruction in instructions]))

    else:

        # If the user has not provided any instruction, go for all of them
        # and then call he generation function

        instructions = _generate_instructions(target, output_path, prefix)

        # Since several benchmark will be generated, reduce verbose level
        # and call the generation function in parallel

        # set_log_level(30)

        if parallel_jobs > 1:
            pool = mp.Pool(processes=parallel_jobs)
            pool.map(
                _generate_benchmark,
                [(instruction, prefix, output_path, target, size, distance)
                 for instruction in instructions], 1)
        else:
            list(
                map(_generate_benchmark,
                    [(instruction, prefix, output_path, target, size, distance)
                     for instruction in instructions]))


def _validate_instructions(instructions, target):
    """
    Validate the provided instruction for a given target
    """

    nins = []
    for instruction in instructions:

        if instruction not in list(target.isa.instructions.keys()):
            print_warning("'%s' not defined in the ISA. Skipping..." %
                          instruction)
            continue
        nins.append(instruction)
    return nins


def _generate_instructions(target, path, prefix):
    """
    Generate the list of instruction to be generated for a given target
    """

    instructions = []
    for name, instr in target.instructions.items():

        if instr.privileged or instr.hypervisor:
            # Skip priv/hyper instructions
            continue

        if instr.branch and not instr.branch_relative:
            # Skip branch absolute due to relocation problems
            continue

        if instr.category in ['LMA', 'LMV', 'DS', 'EC']:
            # Skip some instruction categories
            continue

        if name in [
                'LSWI_V0', 'LSWX_V0', 'LMW_V0', 'STSWX_V0', 'LD_V1', 'LWZ_V1',
                'STW_V1'
        ]:
            # Some instructions are not completely supported yet
            # String-related instructions and load multiple

            continue

        # Skip if the files already exists

        fname = "%s/%s_%s.c" % (path, prefix, name)
        ffname = "%s/%s_%s.c.fail" % (path, prefix, name)

        if os.path.isfile(fname):
            print_warning("Skip %s. '%s' already generated" % (name, fname))
            continue

        if os.path.isfile(ffname):
            print_warning("Skip %s. '%s' already generated (failed)" %
                          (name, ffname))
            continue

        instructions.append(name)

    return instructions


def _generate_benchmark(args):
    """
    Actual benchmark generation policy. This is the function that defines
    how the microbenchmark are going to be generated
    """

    instr_name, prefix, output_path, target, size, distance = args

    try:

        # Name of the output file
        fname = "%s/%s_%s" % (output_path, prefix, instr_name)

        # Name of the fail output file (generated in case of exception)
        ffname = "%s.c.fail" % (fname)

        print_info("Generating %s ..." % (fname))

        instruction = microprobe.code.ins.Instruction()
        instruction.set_arch_type(target.instructions[instr_name])
        sequence = [target.instructions[instr_name]]

        # Get the wrapper object. The wrapper object is in charge of
        # translating the internal representation of the microbenchmark
        # to the final output format.
        #
        # In this case, we obtain the 'CInfGen' wrapper, which embeds
        # the generated code within an infinite loop using C plus
        # in-line assembly statements.
        cwrapper = microprobe.code.get_wrapper("CInfGen")

        # Create the synthesizer object, which is in charge of driving the
        # generation of the microbenchmark, given a set of passes
        # (a.k.a. transformations) to apply to the an empty internal
        # representation of the microbenchmark
        synth = microprobe.code.Synthesizer(target,
                                            cwrapper(),
                                            value=0b01010101)

        # Add the transformation passes

        #######################################################################
        # Pass 1: Init integer registers to a given value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=_init_value()))
        floating = False
        vector = False

        for operand in instruction.operands():
            if operand.type.immediate:
                continue

            if operand.type.float:
                floating = True

            if operand.type.vector:
                vector = True

        if vector and floating:
            ###################################################################
            # Pass 1.A: if instruction uses vector floats, init vector #
            # registers to float values #
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    v_value=(1.000000000000001, 64)))
        elif vector:
            ###################################################################
            # Pass 1.B: if instruction uses vector but not floats, init #
            # vector registers to integer value #
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    v_value=(_init_value(), 64)))
        elif floating:
            ###################################################################
            # Pass 1.C: if instruction uses floats, init float #
            # registers to float values #
            ###################################################################
            synth.add_pass(
                microprobe.passes.initialization.InitializeRegistersPass(
                    fp_value=1.000000000000001))

        #######################################################################
        # Pass 2: Add a building block of size 'size' #
        #######################################################################
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(size))

        #######################################################################
        # Pass 3: Fill the building block with the instruction sequence #
        #######################################################################
        synth.add_pass(
            microprobe.passes.instruction.SetInstructionTypeBySequencePass(
                sequence))

        #######################################################################
        # Pass 4: Compute addresses of instructions (this pass is needed to #
        # update the internal representation information so that in #
        # case addresses are required, they are up to date). #
        #######################################################################
        synth.add_pass(
            microprobe.passes.address.UpdateInstructionAddressesPass())

        #######################################################################
        # Pass 5: Set target of branches to be the next instruction in the #
        # instruction stream #
        #######################################################################
        synth.add_pass(microprobe.passes.branch.BranchNextPass())

        #######################################################################
        # Pass 6: Set memory-related operands to access 16 storage locations #
        # in a round-robin fashion in stride 256 bytes. #
        # The pattern would be: 0, 256, 512, .... 3840, 0, 256, ... #
        #######################################################################
        synth.add_pass(microprobe.passes.memory.SingleMemoryStreamPass(
            16, 256))

        #######################################################################
        # Pass 7.A: Initialize the storage locations accessed by floating #
        # point instructions to have a valid floating point value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.float.InitializeMemoryFloatPass(
                value=1.000000000000001))

        #######################################################################
        # Pass 7.B: Initialize the storage locations accessed by decimal #
        # instructions to have a valid decimal value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1))

        #######################################################################
        # Pass 8: Set the remaining instructions operands (if not set) #
        # (Required to set remaining immediate operands) #
        #######################################################################
        synth.add_pass(
            microprobe.passes.register.DefaultRegisterAllocationPass(
                dd=distance))

        # Synthesize the microbenchmark.The synthesize applies the set of
        # transformation passes added before and returns object representing
        # the microbenchmark
        bench = synth.synthesize()

        # Save the microbenchmark to the file 'fname'
        synth.save(fname, bench=bench)

        print_info("%s generated!" % (fname))

        # Remove fail file if exists
        if os.path.isfile(ffname):
            os.remove(ffname)

    except MicroprobeException:

        # In case of exception during the generation of the microbenchmark,
        # print the error, write the fail file and exit
        print_error(traceback.format_exc())
        open(ffname, 'a').close()
        exit(-1)


def _init_value():
    """ Return a init value """
    return 0b0101010101010101010101010101010101010101010101010101010101010101


# Main
if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if callable(locals().get('main_setup')):
        main_setup()
        exit(0)
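The comment for Pass 6 in the listing describes a round-robin access pattern over 16 storage locations with a stride of 256 bytes. The sketch below only reproduces that sequence of offsets in plain Python. It is not how SingleMemoryStreamPass is implemented, and reading its two arguments as (number of locations, stride) is an assumption taken from that comment; single_stream_offsets is a hypothetical helper.

```python
def single_stream_offsets(locations, stride, accesses):
    """Offsets touched by `accesses` consecutive memory accesses.

    Access i goes to location (i mod locations), and consecutive
    locations are `stride` bytes apart.
    """
    return [(i % locations) * stride for i in range(accesses)]


# 18 accesses are enough to see the pattern wrap around once.
offsets = single_stream_offsets(16, 256, 18)
print(offsets[:3], offsets[15:18])
```

The first three offsets are 0, 256 and 512, and the sequence wraps from 3840 back to 0, matching the pattern given in the comment.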
The code is self-documented, and reading it is a good way to understand the basic concepts of code generation in Microprobe. To help readers, we summarize and elaborate on the explanations given in the code. The following are the suggested steps to implement a command line tool that generates microbenchmarks using Microprobe:
1. Define the command line interface and parameters (main_setup() function in the example). This includes:
   - Create a command line interface object
   - Define parameters using the add_option interface
   - Call the actual main with the arguments
2. Define the function to process the input parameters (_main() function in the example). This includes:
   - Import target definition
   - Get processed arguments
   - Validate and use the arguments to call the actual microbenchmark generation function
3. Define the function to generate the microbenchmark (_generate_benchmark function in the example). The main elements are the following:
   - Get the wrapper object. The wrapper object defines the general characteristics of the code being generated (i.e. how the internal representation will be translated to the final file being generated). General characteristics are, for instance, code prologs such as #include <header.h> directives, the main function declaration, epilogs, etc. In this case, the wrapper selected is CInfGen. This wrapper generates C code with an infinite loop of instructions. This results in the following code:

       #include <stdio.h>
       #include <string.h>

       // <declaration of variables>

       int main(int argc, char** argv, char** envp) {
           // <initialization_code>
           while(1) {
               // <generated_code>
           } // end while
       }

     The user can subclass or define their own wrappers to fulfill their needs. See microprobe.code.wrapper.Wrapper for more details.
   - Instantiate the synthesizer. The benchmark synthesizer object is in charge of driving the code generation by applying the set of transformation passes defined by the user.
   - Define the transformation passes. The transformation passes will fill the <declaration of variables>, <initialization_code> and <generated_code> sections of the previous code block. Depending on the order and the type of passes applied, the code generated will be different. The user has plenty of transformation passes to apply. See microprobe.passes and all its submodules for further details. Also, the user can define their own passes by subclassing the class microprobe.passes.Pass.
   - Finally, once the generation policy is defined, the user only has to synthesize the benchmark and save it to a file.
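To make the point about pass ordering concrete, here is a toy sketch of the add_pass / synthesize pattern summarized above. It is written from scratch for illustration: ToySynthesizer, building_block and fill_with are hypothetical names, passes are modelled as plain functions acting on a dict, and none of this reflects how Microprobe's Synthesizer or Pass classes are actually implemented.

```python
class ToySynthesizer(object):
    """Hypothetical stand-in: collects passes and applies them in order."""

    def __init__(self):
        self._passes = []

    def add_pass(self, synth_pass):
        self._passes.append(synth_pass)

    def synthesize(self):
        # Start from an empty internal representation (here: a dict).
        bench = {"init": [], "body": []}
        for synth_pass in self._passes:
            synth_pass(bench)
        return bench


def building_block(size):
    """Pass: reserve `size` empty instruction slots."""
    def apply(bench):
        bench["body"] = [None] * size
    return apply


def fill_with(sequence):
    """Pass: fill the existing slots by repeating `sequence`."""
    def apply(bench):
        bench["body"] = [sequence[i % len(sequence)]
                         for i in range(len(bench["body"]))]
    return apply


synth = ToySynthesizer()
synth.add_pass(building_block(4))
synth.add_pass(fill_with(["add", "mul"]))
bench = synth.synthesize()
print(bench["body"])

# Swapping the two passes leaves the slots empty: the fill pass runs
# before any slots exist, so the order of the passes matters.
swapped = ToySynthesizer()
swapped.add_pass(fill_with(["add", "mul"]))
swapped.add_pass(building_block(4))
print(swapped.synthesize()["body"])
```

In the first pipeline the body ends up as an alternating add/mul sequence, while the swapped pipeline produces four empty slots. This is the same reason the example adds the building block pass before the pass that sets the instruction types.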
power_v206_power7_ppc64_linux_gcc_fu_stress.py
The following example shows how to generate microbenchmarks that stress a particular functional unit of the architecture. The code is self-explanatory:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_fu_stress.py

Example module to show how to generate a benchmark stressing a particular
functional unit of the microarchitecture at different rate using the
average latency of instructions as well as the average dependency distance
between the instructions
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import os
import sys
import traceback

# Own modules
import microprobe.code.ins
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.decimal
import microprobe.passes.float
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
import microprobe.utils.cmdline
from microprobe.exceptions import MicroprobeException, \
    MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import dict_key, existing_dir, \
    float_type, int_type, print_error, print_info
from microprobe.utils.logger import get_logger

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Constants
LOG = get_logger(__name__)  # Get the generic logging interface


# Functions
def main_setup():
    """
    Set up the command line interface (CLI) with the arguments required by
    this command line tool.
    """

    args = sys.argv[1:]

    # Get the target definition
    try:
        target = import_definition("power_v206-power7-ppc64_linux_gcc")
    except MicroprobeTargetDefinitionError as exc:
        print_error("Unable to import target definition")
        print_error("Exception message: %s" % str(exc))
        exit(-1)

    func_units = {}
    valid_units = [elem.name for elem in target.elements.values()]

    for instr in target.isa.instructions.values():
        if instr.execution_units == "None":
            LOG.debug("Execution units for: '%s' not defined", instr.name)
            continue

        for unit in instr.execution_units:
            if unit not in valid_units:
                continue

            if unit not in func_units:
                func_units[unit] = [
                    elem for elem in target.elements.values()
                    if elem.name == unit
                ][0]

    # Create the CLI interface object
    cmdline = microprobe.utils.cmdline.CLI("ISA power v206 profile example",
                                           config_options=False,
                                           target_options=False,
                                           debug_options=False)

    # Add the different parameters for this particular tool
    cmdline.add_option("functional_unit",
                       "f", [func_units['ALU']],
                       "Functional units to stress. Default: ALU",
                       required=False,
                       nargs="+",
                       choices=func_units,
                       opt_type=dict_key(func_units),
                       metavar="FUNCTIONAL_UNIT_NAME")

    cmdline.add_option(
        "output_prefix",
        None,
        "POWER_V206_FU_STRESS",
        "Output prefix of the generated files. Default: POWER_V206_FU_STRESS",
        opt_type=str,
        required=False,
        metavar="PREFIX")

    cmdline.add_option("output_path",
                       "O",
                       "./",
                       "Output path. Default: current path",
                       opt_type=existing_dir,
                       metavar="PATH")

    cmdline.add_option(
        "size",
        "S",
        64, "Benchmark size (number of instructions in the endless loop). "
        "Default: 64 instructions",
        opt_type=int_type(1, 2**20),
        metavar="BENCHMARK_SIZE")

    cmdline.add_option("dependency_distance",
                       "D",
                       1000,
                       "Average dependency distance between the instructions. "
                       "Default: 1000 (no dependencies)",
                       opt_type=int_type(1, 1000),
                       metavar="DEPENDECY_DISTANCE")

    cmdline.add_option("average_latency",
                       "L",
                       2, "Average latency of the selected instructins. "
                       "Default: 2 cycles",
                       opt_type=float_type(1, 1000),
                       metavar="AVERAGE_LATENCY")

    # Start the main
    print_info("Processing input arguments...")
    cmdline.main(args, _main)


def _main(arguments):
    """
    Main program. Called after the arguments from the CLI interface have
    been processed.
    """

    print_info("Arguments processed!")

    print_info("Importing target definition "
               "'power_v206-power7-ppc64_linux_gcc'...")
    target = import_definition("power_v206-power7-ppc64_linux_gcc")

    # Get the arguments
    functional_units = arguments["functional_unit"]
    prefix = arguments["output_prefix"]
    output_path = arguments["output_path"]
    size = arguments["size"]
    latency = arguments["average_latency"]
    distance = arguments["dependency_distance"]

    if functional_units is None:
        functional_units = ["ALL"]

    _generate_benchmark(target, "%s/%s_" % (output_path, prefix),
                        (functional_units, size, latency, distance))


def _generate_benchmark(target, output_prefix, args):
    """
    Actual benchmark generation policy. This is the function that defines
    how the microbenchmark are going to be generated
    """

    functional_units, size, latency, distance = args

    try:

        # Name of the output file
        func_unit_names = [unit.name for unit in functional_units]
        fname = "%s%s" % (output_prefix, "_".join(func_unit_names))
        fname = "%s_LAT_%s" % (fname, latency)
        fname = "%s_DEP_%s" % (fname, distance)

        # Name of the fail output file (generated in case of exception)
        ffname = "%s.c.fail" % (fname)

        print_info("Generating %s ..." % (fname))

        # Get the wrapper object. The wrapper object is in charge of
        # translating the internal representation of the microbenchmark
        # to the final output format.
        #
        # In this case, we obtain the 'CInfGen' wrapper, which embeds
        # the generated code within an infinite loop using C plus
        # in-line assembly statements.
        cwrapper = microprobe.code.get_wrapper("CInfGen")

        # Create the synthesizer object, which is in charge of driving the
        # generation of the microbenchmark, given a set of passes
        # (a.k.a. transformations) to apply to the an empty internal
        # representation of the microbenchmark
        synth = microprobe.code.Synthesizer(target,
                                            cwrapper(),
                                            value=0b01010101)

        # Add the transformation passes

        #######################################################################
        # Pass 1: Init integer registers to a given value #
        #######################################################################
        synth.add_pass(
            microprobe.passes.initialization.InitializeRegistersPass(
                value=_init_value()))

        #######################################################################
        # Pass 2: Add a building block of size 'size' #
        #######################################################################
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(size))

        #######################################################################
        # Pass 3: Fill the building block with the instruction sequence #
242 #######################################################################
243 synth.add_pass(
244 microprobe.passes.instruction.SetInstructionTypeByElementPass(
245 target, functional_units, {}))
246
247 #######################################################################
248 # Pass 4: Compute addresses of instructions (this pass is needed to #
249 # update the internal representation information so that in #
250 # case addresses are required, they are up to date). #
251 #######################################################################
252 synth.add_pass(
253 microprobe.passes.address.UpdateInstructionAddressesPass())
254
255 #######################################################################
256 # Pass 5: Set target of branches to be the next instruction in the #
257 # instruction stream #
258 #######################################################################
259 synth.add_pass(microprobe.passes.branch.BranchNextPass())
260
261 #######################################################################
262 # Pass 6: Set memory-related operands to access 16 storage locations #
263 # in a round-robin fashion in stride 256 bytes. #
264 # The pattern would be: 0, 256, 512, .... 3840, 0, 256, ... #
265 #######################################################################
266 synth.add_pass(microprobe.passes.memory.SingleMemoryStreamPass(
267 16, 256))
268
269 #######################################################################
270 # Pass 7.A: Initialize the storage locations accessed by floating #
271 # point instructions to have a valid floating point value #
272 #######################################################################
273 synth.add_pass(
274 microprobe.passes.float.InitializeMemoryFloatPass(
275 value=1.000000000000001))
276
277 #######################################################################
278 # Pass 7.B: Initialize the storage locations accessed by decimal #
279 # instructions to have a valid decimal value #
280 #######################################################################
281 synth.add_pass(
282 microprobe.passes.decimal.InitializeMemoryDecimalPass(value=1))
283
284 #######################################################################
285 # Pass 8: Set the remaining instructions operands (if not set) #
286 # (Required to set remaining immediate operands) #
287 #######################################################################
288 synth.add_pass(
289 microprobe.passes.register.DefaultRegisterAllocationPass(
290 dd=distance))
291
292 # Synthesize the microbenchmark.The synthesize applies the set of
293 # transformation passes added before and returns object representing
294 # the microbenchmark
295 bench = synth.synthesize()
296
297 # Save the microbenchmark to the file 'fname'
298 synth.save(fname, bench=bench)
299
300 print_info("%s generated!" % (fname))
301
302 # Remove fail file if exists
303 if os.path.isfile(ffname):
304 os.remove(ffname)
305
306 except MicroprobeException:
307
308 # In case of exception during the generation of the microbenchmark,
309 # print the error, write the fail file and exit
310 print_error(traceback.format_exc())
311 open(ffname, 'a').close()
312 exit(-1)
313
314
315def _init_value():
316 """ Return a init value """
317 return 0b0101010101010101010101010101010101010101010101010101010101010101
318
319
320# Main
321if __name__ == '__main__':
322 # run main if executed from the command line
323 # and the main method exists
324
325 if callable(locals().get('main_setup')):
326 main_setup()
327 exit(0)
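Pass 6 in the listing above asks for a single memory stream over 16 storage locations with a stride of 256 bytes. As a rough illustration of the pattern described in that comment (0, 256, 512, ..., 3840, 0, 256, ...), the standalone sketch below computes the offsets in plain Python. It does not use Microprobe at all: the helper name `stream_offsets` is hypothetical and only mirrors the two arguments passed to `SingleMemoryStreamPass`.

```python
from itertools import cycle, islice


def stream_offsets(locations, stride, count):
    """Return the first `count` byte offsets of a round-robin stream.

    Hypothetical helper (not part of Microprobe). It only mimics the
    pattern documented in the comment of Pass 6.
    """
    one_round = [index * stride for index in range(locations)]
    return list(islice(cycle(one_round), count))


# 16 locations, 256-byte stride, as in SingleMemoryStreamPass(16, 256)
offsets = stream_offsets(16, 256, 18)
print(offsets)
```

The stream touches 16 * 256 = 4096 bytes in total before wrapping around to offset 0.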
power_v206_power7_ppc64_linux_gcc_memory.py
The following example shows how to create microbenchmarks that generate different amounts of activity (stress levels) at each level of the cache hierarchy. As the example also shows, using the built-in command line interface provided by Microprobe is optional.
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_memory.py

Example python script to show how to generate microbenchmarks with particular
levels of activity in the memory hierarchy.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import random
import sys

# Third party modules
from six.moves import map

# Own modules
import microprobe.code
import microprobe.passes.address
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)

BASE_ELEMENT = [
    element for element in TARGET.elements.values() if element.name == 'L1D'
][0]
CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
    BASE_ELEMENT)

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

# Fill a list of the models to be generated

MEMORY_MODELS = []

#
# Due to performance issues (long exec. time) this
# model is disabled
#
# MEMORY_MODELS.append(
#    (
#        "ALL", CACHE_HIERARCHY, [
#            25, 25, 25, 25]))

MEMORY_MODELS.append(("L1", CACHE_HIERARCHY, [100, 0, 0, 0]))
MEMORY_MODELS.append(("L2", CACHE_HIERARCHY, [0, 100, 0, 0]))
MEMORY_MODELS.append(("L3", CACHE_HIERARCHY, [0, 0, 100, 0]))
MEMORY_MODELS.append(("L1L3", CACHE_HIERARCHY, [50, 0, 50, 0]))
MEMORY_MODELS.append(("L1L2", CACHE_HIERARCHY, [50, 50, 0, 0]))
MEMORY_MODELS.append(("L2L3", CACHE_HIERARCHY, [0, 50, 50, 0]))
MEMORY_MODELS.append(("CACHES", CACHE_HIERARCHY, [33, 33, 34, 0]))
MEMORY_MODELS.append(("MEM", CACHE_HIERARCHY, [0, 0, 0, 100]))

# Enable parallel generation
PARALLEL = False


def main():
    """Main function. """
    # call the generate method for each model in the memory model list

    if PARALLEL:
        print_info("Start parallel execution...")
        pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
        pool.map(generate, MEMORY_MODELS, 1)
    else:
        print_info("Start sequential execution...")
        list(map(generate, MEMORY_MODELS))

    exit(0)


def generate(model):
    """Benchmark generation policy function. """

    print_info("Creating memory model '%s' ..." % model[0])
    model = EndlessLoopDataMemoryModel(*model)

    modelname = model.name

    print_info("Generating Benchmark mem-%s ..." % (modelname))

    # Get the architecture
    garch = TARGET

    # For all the supported instructions, select the memory operations
    sequence = []
    for instr_name in sorted(garch.instructions.keys()):

        instr = garch.instructions[instr_name]

        if not instr.access_storage:
            continue
        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if instr.trap:  # Skip traps
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in ['LMA', 'LMV', 'DS', 'EC',
                              'WT']:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported by mem. model
            continue
        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if instr.name in ['LD_V1', 'LWZ_V1', 'STW_V1']:
            continue

        sequence.append(instr)

    # Get the loop wrapper. In this case we take the 'CInfPpc', which
    # generates an infinite loop in C using PowerPC embedded assembly.
    cwrapper = microprobe.code.get_wrapper("CInfPpc")

    # Define function to return random numbers (used afterwards)
    def rnd():
        """Return a random value. """
        return random.randrange(0, (1 << 64) - 1)

    # Create the benchmark synthesizer
    synth = microprobe.code.Synthesizer(garch, cwrapper())

    ##########################################################################
    # Add the passes we want to apply to synthesize benchmarks               #
    ##########################################################################

    # --> Init registers to random values
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=rnd))

    # --> Add a single basic block of size 'size'
    if model.name in ['MEM']:
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE * 4))
    else:
        synth.add_pass(
            microprobe.passes.structure.SimpleBuildingBlockPass(
                BENCHMARK_SIZE))

    # --> Fill the basic block using the sequence of instructions provided
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeBySequencePass(
            sequence))

    # --> Set the memory operations parameters to fulfill the given model
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(model))

    # --> Set the dependency distance and the default allocation. Sets the
    # remaining undefined instruction operands (register allocation,...)
    synth.add_pass(microprobe.passes.register.NoHazardsAllocationPass())
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=0))

    # Generate the benchmark (applies the passes).
    bench = synth.synthesize()

    print_info("Benchmark mem-%s saving to disk..." % (modelname))

    # Save the benchmark
    synth.save("%s/mem-%s" % (DIRECTORY, modelname), bench=bench)

    print_info("Benchmark mem-%s generated" % (modelname))
    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 2:
        print_info("Usage:")
        print_info("%s output_dir" % (sys.argv[0]))
        exit(-1)

    DIRECTORY = sys.argv[1]

    if not os.path.isdir(DIRECTORY):
        print_error("Output directory '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if callable(locals().get('main')):
        main()
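Each entry of `MEMORY_MODELS` in the listing above pairs a model name with one percentage per level of the data hierarchy. A simple sanity check, sketched below without Microprobe, is that every list of percentages adds up to 100. The labels in `LEVELS` are an assumption inferred from the model names (three cache levels plus main memory); they are not identifiers taken from the Microprobe API, and `describe` is a hypothetical helper.

```python
# Percentages copied from the MEMORY_MODELS list of the example.
MODELS = {
    "L1": [100, 0, 0, 0],
    "L2": [0, 100, 0, 0],
    "L3": [0, 0, 100, 0],
    "L1L3": [50, 0, 50, 0],
    "L1L2": [50, 50, 0, 0],
    "L2L3": [0, 50, 50, 0],
    "CACHES": [33, 33, 34, 0],
    "MEM": [0, 0, 0, 100],
}

# Assumed labels for the four positions of each list (illustration only).
LEVELS = ["L1", "L2", "L3", "MEM"]


def describe(weights):
    """Return a readable summary of the levels that receive accesses."""
    return ", ".join("%s=%d%%" % (level, weight)
                     for level, weight in zip(LEVELS, weights) if weight)


for name, weights in MODELS.items():
    # One percentage per level, and the percentages must cover all accesses
    assert len(weights) == len(LEVELS)
    assert sum(weights) == 100
    print("%-6s -> %s" % (name, describe(weights)))
```

Running a check like this before calling `generate` catches a mistyped model (for example, percentages adding up to 99) early, before the comparatively slow synthesis step starts.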
power_v206_power7_ppc64_linux_gcc_random.py
The following example generates random microbenchmarks:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_random.py

Example python script to show how to generate random microbenchmarks.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import multiprocessing as mp
import os
import random
import sys

# Third party modules
from six.moves import map, range

# Own modules
import microprobe.code
import microprobe.passes.address
import microprobe.passes.branch
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe import MICROPROBE_RC
from microprobe.exceptions import MicroprobeError, \
    MicroprobeTargetDefinitionError
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)

BASE_ELEMENT = [
    element for element in TARGET.elements.values() if element.name == 'L1D'
][0]
CACHE_HIERARCHY = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
    BASE_ELEMENT)

PARALLEL = True


def main():
    """ Main program. """
    if PARALLEL:
        pool = mp.Pool(processes=MICROPROBE_RC['cpus'])
        pool.map(generate, list(range(0, 100)), 1)
    else:
        list(map(generate, list(range(0, 100))))


def generate(name):
    """ Benchmark generation policy. """

    if os.path.isfile("%s/random-%s.c" % (DIRECTORY, name)):
        print_info("Skip %d" % name)
        return

    print_info("Generating %d..." % name)

    # Seed the randomness
    rand = random.Random()
    rand.seed(64)  # My favorite number ;)

    # Generate a random memory model (used afterwards)
    model = []
    total = 100
    for mcomp in CACHE_HIERARCHY[0:-1]:
        weight = rand.randint(0, total)
        model.append(weight)
        print_info("%s: %d%%" % (mcomp, weight))
        total = total - weight

    # Fix remaining
    level = rand.randint(0, len(CACHE_HIERARCHY[0:-1]) - 1)
    model[level] += total

    # Last level always zero
    model.append(0)

    # Sanity check
    psum = 0
    for elem in model:
        psum += elem
    assert psum == 100

    modelobj = EndlessLoopDataMemoryModel("random-%s" % name, CACHE_HIERARCHY,
                                          model)

    # Get the loop wrapper. In this case we take the 'CInfPpc', which
    # generates an infinite loop in C using PowerPC embedded assembly.
    cwrapper = microprobe.code.get_wrapper("CInfPpc")

    # Define function to return random numbers (used afterwards)
    def rnd():
        """Return a random value. """
        return rand.randrange(0, (1 << 64) - 1)

    # Create the benchmark synthesizer
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    ##########################################################################
    # Add the passes we want to apply to synthesize benchmarks               #
    ##########################################################################

    # --> Init registers to random values
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=rnd))

    # --> Add a single basic block of size BENCHMARK_SIZE
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))

    # --> Fill the basic block with instructions picked randomly from the list
    #     provided

    instructions = []
    for instr in TARGET.instructions.values():

        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if instr.trap:  # Skip traps
            continue
        if instr.syscall:  # Skip syscall
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "Multiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in ['LMA', 'LMV', 'DS', 'EC',
                              'WT']:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported by mem. model
            continue
        if instr.branch and not instr.branch_relative:  # Skip branches
            continue
        if "Reserve Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if "Conditional Indexed" in instr.description:  # Skip (illegal instr.)
            continue
        if instr.name in [
                'LD_V1',
                'LWZ_V1',
                'STW_V1',
        ]:
            continue

        instructions.append(instr)

    synth.add_pass(
        microprobe.passes.instruction.SetRandomInstructionTypePass(
            instructions, rand
        )
    )

    # --> Set the memory operations parameters to fulfill the given model
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(modelobj))

    # --> Set target of branches to next instruction (first compute addresses)
    synth.add_pass(microprobe.passes.address.UpdateInstructionAddressesPass())
    synth.add_pass(microprobe.passes.branch.BranchNextPass())

    # --> Set the dependency distance and the default allocation. Dependency
    # distance is randomly picked
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(
            dd=rand.randint(1, 20)
        )
    )

    # Generate the benchmark (applies the passes)
    # Since it is a randomly generated code, the generation might fail
    # (e.g. not enough accesses to fulfill the requested memory model, etc.)
    # Because of that, we handle the exception accordingly.
    try:
        print_info("Synthesizing %d..." % name)
        bench = synth.synthesize()
        print_info("Synthesized %d!" % name)
        # Save the benchmark
        synth.save("%s/random-%s" % (DIRECTORY, name), bench=bench)
    except MicroprobeError:
        print_info("Synthesizing error in '%s'. This is Ok." % name)

    return True


if __name__ == '__main__':
    # run main if executed from the command line
    # and the main method exists

    if len(sys.argv) != 2:
        print_info("Usage:")
        print_info("%s output_dir" % (sys.argv[0]))
        exit(-1)

    DIRECTORY = sys.argv[1]

    if not os.path.isdir(DIRECTORY):
        print_error("Output directory '%s' does not exist" % (DIRECTORY))
        exit(-1)

    if callable(locals().get('main')):
        main()
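The random memory model built inside `generate` can be studied in isolation. The sketch below repeats the same steps in plain Python: draw a weight for every level except the last, give the remainder to one randomly chosen level, and append a zero for the last level. The function name `random_model` and the assumption of four hierarchy levels are illustrative only; in the example the number of levels comes from `CACHE_HIERARCHY`.

```python
import random


def random_model(levels, rand):
    """Return `levels` integer percentages that sum to 100.

    Hypothetical helper mirroring the steps of the example; the last
    level always gets 0%.
    """
    model = []
    total = 100
    for _ in range(levels - 1):
        weight = rand.randint(0, total)
        model.append(weight)
        total -= weight

    # Give whatever is left to one randomly chosen level (not the last one)
    model[rand.randint(0, levels - 2)] += total

    # The last level never receives accesses
    model.append(0)
    return model


rand = random.Random(64)  # same constant seed as in the example
model = random_model(4, rand)  # 4 levels is an assumption for this sketch
print(model)
```

Note that the example creates a new generator seeded with the constant 64 on every call to `generate`, so the weights drawn for the memory model are the same in every call. If different models per benchmark are wanted, one option (not part of the original script) would be to derive the seed from `name`.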
power_v206_power7_ppc64_linux_gcc_custom.py
The following example shows several ways to customize the generation of microbenchmarks:
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_custom.py

Example python script to show how to customize the generation of
microbenchmarks.
"""

# Futures
from __future__ import absolute_import

# Built-in modules
import os
import sys

# Own modules
import microprobe.code
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.memory
import microprobe.passes.register
import microprobe.passes.structure
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.model.memory import EndlessLoopDataMemoryModel
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info
from microprobe.utils.misc import RNDINT

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
__email__ = "rbertra@us.ibm.com"
__status__ = "Development"  # "Prototype", "Development", or "Production"

# Benchmark size
BENCHMARK_SIZE = 8 * 1024

if len(sys.argv) != 2:
    print_info("Usage:")
    print_info("%s output_dir" % (sys.argv[0]))
    exit(-1)

DIRECTORY = sys.argv[1]

if not os.path.isdir(DIRECTORY):
    print_info("Output directory '%s' does not exist" % (DIRECTORY))
    exit(-1)

# Get the target definition
try:
    TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
except MicroprobeTargetDefinitionError as exc:
    print_error("Unable to import target definition")
    print_error("Exception message: %s" % str(exc))
    exit(-1)


###############################################################################
# Example 1: loop with instructions accessing storage, hitting the first      #
#            level of cache and with dependency distance of 3                 #
###############################################################################
def example_1():
    """ Example 1 """
    name = "L1-LOADS"

    base_element = [
        element for element in TARGET.elements.values()
        if element.name == 'L1D'
    ][0]
    cache_hierarchy = TARGET.cache_hierarchy.get_data_hierarchy_from_element(
        base_element)

    # All the accesses hit the first level of the hierarchy
    model = [0] * len(cache_hierarchy)
    model[0] = 100

    mmodel = EndlessLoopDataMemoryModel(name, cache_hierarchy, model)

    # Build a profile with the load-only instructions, all with weight 1
    profile = {}
    for instr_name in sorted(TARGET.instructions.keys()):
        instr = TARGET.instructions[instr_name]
        if not instr.access_storage:
            continue
        if instr.privileged:  # Skip privileged
            continue
        if instr.hypervisor:  # Skip hypervisor
            continue
        if "String" in instr.description:  # Skip unsupported string instr.
            continue
        if "ultiple" in instr.description:  # Skip unsupported mult. ld/sts
            continue
        if instr.category in ['DS', 'LMA', 'LMV',
                              'EC']:  # Skip unsupported categories
            continue
        if instr.access_storage_with_update:  # Not supported
            continue

        if instr.name in [
                'LD_V1',
                'LWZ_V1',
                'STW_V1',
        ]:
            continue

        if (any([moper.is_load for moper in instr.memory_operand_descriptors])
                and all([
                    not moper.is_store
                    for moper in instr.memory_operand_descriptors
                ])):
            profile[instr] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.initialization.InitializeRegisterPass(
            "GPR1", 0, force=True, reserve=True))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
    synth.add_pass(microprobe.passes.memory.GenericMemoryModelPass(mmodel))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=3))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Example 2: loop with instructions using the MUL unit and with dependency    #
#            distance of 4                                                    #
###############################################################################
def example_2():
    """ Example 2 """
    name = "FXU-MUL"

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByElementPass(
            TARGET, [TARGET.elements['MUL_FXU0_Core0_SCM_Processor']], {}))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=4))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Example 3: loop with instructions using the ALU unit and with dependency    #
#            distance of 1                                                    #
###############################################################################
def example_3():
    """ Example 3 """
    name = "FXU-ALU"

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByElementPass(
            TARGET, [TARGET.elements['ALU_FXU0_Core0_SCM_Processor']], {}))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=1))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Example 4: loop with FMUL* instructions with different weights and with     #
#            dependency distance 10                                           #
###############################################################################
def example_4():
    """ Example 4 """
    name = "VSU-FMUL"

    profile = {}
    profile[TARGET.instructions['FMUL_V0']] = 4
    profile[TARGET.instructions['FMULS_V0']] = 3
    profile[TARGET.instructions['FMULx_V0']] = 2
    profile[TARGET.instructions['FMULSx_V0']] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=10))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Example 5: loop with FADD* instructions with different weights and with     #
#            dependency distance 1                                            #
###############################################################################
def example_5():
    """ Example 5 """
    name = "VSU-FADD"

    profile = {}
    profile[TARGET.instructions['FADD_V0']] = 100
    profile[TARGET.instructions['FADDx_V0']] = 1
    profile[TARGET.instructions['FADDS_V0']] = 10
    profile[TARGET.instructions['FADDSx_V0']] = 1

    cwrapper = microprobe.code.get_wrapper("CInfPpc")
    synth = microprobe.code.Synthesizer(TARGET, cwrapper())

    synth.add_pass(
        microprobe.passes.initialization.InitializeRegistersPass(value=RNDINT))
    synth.add_pass(
        microprobe.passes.structure.SimpleBuildingBlockPass(BENCHMARK_SIZE))
    synth.add_pass(
        microprobe.passes.instruction.SetInstructionTypeByProfilePass(profile))
    synth.add_pass(
        microprobe.passes.register.DefaultRegisterAllocationPass(dd=1))

    print_info("Generating %s..." % name)
    bench = synth.synthesize()
    print_info("%s Generated!" % name)
    synth.save("%s/%s" % (DIRECTORY, name), bench=bench)  # Save the benchmark


###############################################################################
# Call the examples                                                           #
###############################################################################
example_1()
example_2()
example_3()
example_4()
example_5()
exit(0)
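Examples 4 and 5 in the listing above pass a dictionary of instruction weights to `SetInstructionTypeByProfilePass`. Assuming the weights act as relative frequencies (the listing does not show how the pass distributes instructions, so this is an assumption), the expected share of each instruction can be worked out with a few lines of plain Python. The helper `profile_shares` is hypothetical, and instruction names are used as dictionary keys here instead of the instruction objects used in the example.

```python
def profile_shares(profile):
    """Normalize a {name: weight} profile to fractions that sum to 1.

    Hypothetical helper (not part of Microprobe), assuming weights are
    interpreted as relative frequencies.
    """
    total = float(sum(profile.values()))
    return {name: weight / total for name, weight in profile.items()}


# Weights copied from example_4 in the listing
fmul_profile = {"FMUL_V0": 4, "FMULS_V0": 3, "FMULx_V0": 2, "FMULSx_V0": 1}
shares = profile_shares(fmul_profile)

for name in sorted(shares, key=shares.get, reverse=True):
    print("%-10s %5.1f%%" % (name, 100 * shares[name]))
```

Under that assumption, the 4:3:2:1 weights of Example 4 correspond to 40%, 30%, 20% and 10% of the loop body, while the 100:1:10:1 weights of Example 5 make `FADD_V0` account for roughly 89% of it.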
power_v206_power7_ppc64_linux_gcc_genetic.py
Deprecated since version 0.5: Support for PyEvolve and genetic algorithm based searches has been discontinued.
The following example shows how to use the design exploration module and the genetic algorithm based searches to look for a solution. In particular, for each functional unit of the architecture and for a range of IPC (instructions per cycle) values, the example looks for a microbenchmark that stresses that functional unit at the given IPC. External commands (not included) are needed to evaluate the generated microbenchmarks on the target platform.
#!/usr/bin/env python
# Copyright 2011-2021 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
power_v206_power7_ppc64_linux_gcc_genetic.py

Example python script to show how to generate a set of microbenchmarks
stressing a particular unit but at different IPC ratios using a genetic
search algorithm to play with two knobs: average latency and dependency
distance.

An IPC evaluation and scoring script is required. For instance:

.. code:: bash

    #!/bin/bash
    # ARGS: $1 is the target IPC
    #       $2 is the name of the generated benchmark
    target_ipc=$1
    source_bench=$2

    # Compile the benchmark
    gcc -O0 -mcpu=power7 -mtune=power7 -std=c99 $source_bench.c -o $source_bench

    # Evaluate the ipc
    ipc=< your preferred commands to evaluate the IPC >

    # Compute the score (the closer to the target IPC, the higher the score)
    score=$(echo "(1/($ipc-$target_ipc))^2" | bc -l)

    echo $score

Use the script above as a template for your own GA-based search.
"""

# Futures
from __future__ import absolute_import, division

# Built-in modules
import datetime
import os
import sys
import time as runtime

# Third party modules
from six.moves import range

# Own modules
import microprobe.code
import microprobe.driver.genetic
import microprobe.passes.ilp
import microprobe.passes.initialization
import microprobe.passes.instruction
import microprobe.passes.register
import microprobe.passes.structure
from microprobe.exceptions import MicroprobeTargetDefinitionError
from microprobe.target import import_definition
from microprobe.utils.cmdline import print_error, print_info, print_warning
from microprobe.utils.misc import RNDINT

__author__ = "Ramon Bertran"
__copyright__ = "Copyright 2011-2021 IBM Corporation"
__credits__ = []
__license__ = "IBM (c) 2011-2021 All rights reserved"
__version__ = "0.5"
__maintainer__ = "Ramon Bertran"
78__email__ = "rbertra@us.ibm.com"
79__status__ = "Development" # "Prototype", "Development", or "Production"
80
81# Benchmark size
82BENCHMARK_SIZE = 20
83
84# Get the target definition
85try:
86 TARGET = import_definition("power_v206-power7-ppc64_linux_gcc")
87except MicroprobeTargetDefinitionError as exc:
88 print_error("Unable to import target definition")
89 print_error("Exception message: %s" % str(exc))
90 exit(-1)
91
92
93def main():
94 """Main function."""
95
96 component_list = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]
97 ipcs = [float(x) / 10 for x in range(1, 41)]
98 ipcs = ipcs[5:] + ipcs[:5]
99
100 for name in component_list:
101 for ipc in ipcs:
102 generate_genetic(name, ipc)
103
104
105def generate_genetic(compname, ipc):
106 """Generate a microbenchmark stressing compname at the given ipc."""
107 comps = []
108 bcomps = []
109 any_comp = False
110
111 if compname.find("FXU") >= 0:
112 comps.append(TARGET.elements["FXU0_Core0_SCM_Processor"])
113
114 if compname.find("VSU") >= 0:
115 comps.append(TARGET.elements["VSU0_Core0_SCM_Processor"])
116
117 if len(comps) == 2:
118 any_comp = True
119 elif compname.find("noLSU") >= 0:
120 bcomps.append(TARGET.elements["LSU0_Core0_SCM_Processor"])
121 elif compname.find("LSU") >= 0:
122 comps.append(TARGET.elements["LSU_Core0_SCM_Processor"])
123
124 if (len(comps) == 1 and ipc > 2) or (len(comps) == 2 and ipc > 4):
125 return True
126
127 for elem in os.listdir(DIRECTORY):
128 if not elem.endswith(".c"):
129 continue
130 if elem.startswith("%s:IPC:%.2f:DIST" % (compname, ipc)):
131 print_info("Already generated: %s %d" % (compname, ipc))
132 return True
133
134 print_info("Going for IPC: %f and Element: %s" % (ipc, compname))
135
136 def generate(name, *args):
137 """Benchmark generation function.
138
139 First argument is name, second the dependency distance and the
140 third is the average instruction latency.
141 """
142 dist, latency = args
143
144 wrapper = microprobe.code.get_wrapper("CInfPpc")
145 synth = microprobe.code.Synthesizer(TARGET, wrapper())
146 synth.add_pass(
147 microprobe.passes.initialization.InitializeRegistersPass(
148 value=RNDINT))
149 synth.add_pass(
150 microprobe.passes.structure.SimpleBuildingBlockPass(
151 BENCHMARK_SIZE))
152 synth.add_pass(
153 microprobe.passes.instruction.SetInstructionTypeByElementPass(
154 TARGET,
155 comps, {},
156 block=bcomps,
157 avelatency=latency,
158 any_comp=any_comp))
159 synth.add_pass(
160 microprobe.passes.register.DefaultRegisterAllocationPass(dd=dist))
161 bench = synth.synthesize()
162 synth.save(name, bench=bench)
163
164 # Set the genetic algorithm parameters
165 ga_params = []
166 ga_params.append((0, 20, 0.05)) # Average dependency distance design space
167 ga_params.append((2, 8, 0.05)) # Average instruction latency design space
168
169 # Set up the search driver
170 driver = microprobe.driver.genetic.ExecCmdDriver(
171 generate, 20, 30, 30, "'%s' %f " % (COMMAND, ipc), ga_params)
172
173 starttime = runtime.time()
174 print_info("Start search...")
175 driver.run(1)
176 print_info("Search end")
177 endtime = runtime.time()
178
179 print_info("Genetic time::%s" %
180 (datetime.timedelta(seconds=endtime - starttime)))
181
182 # Check if we found a solution
183 ga_params = driver.solution()
184 score = driver.score()
185
186 print_info("IPC found: %f, score: %f" % (ipc, score))
187
188 if score < 20:
189 print_warning("Unable to find an optimal solution with IPC: %f:" % ipc)
190 print_info("Generating the closest solution...")
191 generate(
192 "%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f-check" %
193 (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
194 ga_params[0], ga_params[1])
195 print_info("Closest solution generated")
196 else:
197 print_info("Solution found for %s and IPC %f -> dist: %f , "
198 "latency: %f " %
199 (compname, ipc, ga_params[0], ga_params[1]))
200 print_info("Generating solution...")
201 generate(
202 "%s/%s:IPC:%.2f:DIST:%.2f:LAT:%.2f" %
203 (DIRECTORY, compname, ipc, ga_params[0], ga_params[1]),
204 ga_params[0], ga_params[1])
205 print_info("Solution generated")
206 return True
207
208
209if __name__ == '__main__':
210 # run main if executed from the COMMAND line
211 # and the main method exists
212
213 if len(sys.argv) != 3:
214 print_info("Usage:")
215 print_info("%s output_dir eval_cmd" % (sys.argv[0]))
216 print_info("")
217 print_info("Output dir: output directory for the generated benchmarks")
218 print_info("eval_cmd: command accepting 2 parameters: the target IPC")
219 print_info(" and the filename of the generate benchmark. ")
220 print_info(" Output: the score used for the GA search. E.g.")
221 print_info(" the close the IPC of the generated benchmark to")
222 print_info(" the target IPC, the cmd should give a higher ")
223 print_info(" score. ")
224 exit(-1)
225
226 DIRECTORY = sys.argv[1]
227 COMMAND = sys.argv[2]
228
229 if not os.path.isdir(DIRECTORY):
230 print_info("Output DIRECTORY '%s' does not exists" % (DIRECTORY))
231 exit(-1)
232
233 if not os.path.isfile(COMMAND):
234 print_info("The COMMAND '%s' does not exists" % (COMMAND))
235 exit(-1)
236
237 if callable(locals().get('main')):
238 main()
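To get a feel for how many searches the two loops in main() schedule, the following sketch reproduces the IPC sweep and the early-exit rule of generate_genetic() without importing Microprobe. The helper names (`ipc_sweep`, `count_stressed_units`, `is_searched`) are hypothetical, and the unit counting is a simplified restatement of the script's logic rather than the script's own code.

```python
COMPONENTS = ["FXU", "FXU-noLSU", "FXU-LSU", "VSU", "VSU-FXU"]


def ipc_sweep():
    """Return the target IPCs in the order the script visits them."""
    ipcs = [x / 10 for x in range(1, 41)]  # 0.1, 0.2, ..., 4.0
    return ipcs[5:] + ipcs[:5]  # start at 0.6, finish with 0.1 to 0.5


def count_stressed_units(compname):
    """Count the units a component name asks to stress."""
    units = [unit for unit in ("FXU", "VSU") if unit in compname]
    # The LSU is only added when fewer than two units were selected and
    # the name does not ask to block it ("noLSU").
    if len(units) < 2 and "LSU" in compname and "noLSU" not in compname:
        units.append("LSU")
    return len(units)


def is_searched(compname, ipc):
    """Mirror the early return: one unit up to IPC 2, two units up to IPC 4."""
    units = count_stressed_units(compname)
    return not ((units == 1 and ipc > 2) or (units == 2 and ipc > 4))


if __name__ == "__main__":
    sweep = ipc_sweep()
    for name in COMPONENTS:
        targets = [ipc for ipc in sweep if is_searched(name, ipc)]
        print("%-10s %d target IPCs" % (name, len(targets)))
```

Under these rules, names that stress a single unit are searched for 20 target IPCs and names that stress two units for 40, which adds up to 140 genetic searches for the five component names, each of which invokes the external evaluation command many times.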