Itanium 2 Profiling Tools: Performance monitoring events Pfmon (Open Source) Intel Vtune Analyzer - PowerPoint PPT Presentation


1
Itanium 2 Profiling Tools: Performance Monitoring Events,
Pfmon (Open Source), Intel VTune Analyzer
  • Arthur Raefsky
  • raefsky_at_sgi.com

2
Agenda
  • Performance Monitoring Events
  • Based on a talk by David Levinthal of Intel,
    presented at Fall IDF
  • Pfmon/Profile.pl
  • VTune

3
Overview
  • Profiling Tools
  • The Intel VTune Performance Analyzer
  • Collects, analyzes, and displays performance data
    for Windows and Linux systems
  • Applications (both single- and multi-threaded)
  • System-wide profile
  • No special build required
  • Very low overhead
  • Linux remote sampling

4
Overview
  • Profiling Tools
  • Pfmon Performance Analyzer
  • Collects, analyzes, and displays performance data
    for Linux systems
  • Applications (both user and kernel level)
  • System-wide profile
  • No special build required
  • Very low overhead
  • Pfmon will be shipped with all SNIA systems

5
Optimization Guide
6
Itanium 2 Architecture
7
Performance Monitoring Events
8
Performance Monitoring Events
9
Performance Monitoring Events
10
Performance Monitoring Events
11
Performance Monitoring Events
12
Performance Monitoring Events
13
Performance Monitoring Events
14
Performance Monitoring Events
15
Performance Monitoring Events
16
Performance Monitoring Events
17
Performance Monitoring Events
18
Performance Monitoring Events
19
Performance Monitoring Events
pfmon --smpl-outfile=sample.out \
      --smpl-entries=100000 \
      -u --short-smpl-periods=9958 \
      --smpl-output-format=detailed-itanium2 \
      --events=DATA_EAR_CACHE_LAT8 ./bar
Sample entry in the file sample.out:
entry 1 PID:9133 CPU:1 STAMP:0xb6fd88b06dbf IIP:0x40000000000035c0
    PMD OVFL: 4
    PMD2 : 0x200000000042c448
    PMD3 : 0x0000000000004009, valid Y, latency 9, overflow N
    PMD17: 0x4000000000003608, valid Y, bundle 0, address 0x4000000000003600
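The fields of such a sample entry can be pulled out with a few lines of script. The following is a minimal sketch in Python, not part of pfmon: the function name `parse_entry` is hypothetical, and the patterns rely only on the field names visible in the entry above (`latency`, `bundle`, `address`), since the exact delimiters of real pfmon output may differ.

```python
import re

# Sample text modelled on the slide; delimiters such as ':' are an assumption,
# so the patterns below match only on the field names themselves.
SAMPLE = """\
entry 1 PID:9133 CPU:1 STAMP:0xb6fd88b06dbf IIP:0x40000000000035c0
PMD OVFL: 4
PMD2 : 0x200000000042c448
PMD3 : 0x0000000000004009, valid Y, latency 9, overflow N
PMD17: 0x4000000000003608, valid Y, bundle 0, address 0x4000000000003600
"""

LATENCY_RE = re.compile(r"latency\s+(\d+)")
ADDRESS_RE = re.compile(r"bundle\s+(\d+),\s*address\s+(0x[0-9a-fA-F]+)")


def parse_entry(text):
    """Return (latency_cycles, bundle, instruction_address), or None if a field is missing."""
    latency = LATENCY_RE.search(text)
    address = ADDRESS_RE.search(text)
    if latency is None or address is None:
        return None
    return int(latency.group(1)), int(address.group(1)), int(address.group(2), 16)


if __name__ == "__main__":
    result = parse_entry(SAMPLE)
    if result is not None:
        latency, bundle, address = result
        print(f"latency={latency} cycles, bundle={bundle}, address={address:#x}")
```

Collecting these tuples over all entries in sample.out would give a per-address latency histogram, which is one way to locate the loads behind long cache latencies.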
20
Performance Monitoring Events
21
Performance Monitoring Events
22
Performance Monitoring Events
23
Pfmon / Profile.pl
  • Profile.pl
  • Written by Ray Bryant
  • Profile.pl is a Perl script that provides a
    simple way to do procedure-level profiling of an
    unmodified binary on an SDV or SN2 system.
  • The simplest way to use these scripts is as
    follows:
    profile.pl -c0-3 -x6 test_program
  • In this case, it is assumed that test_program
    uses 4 processes. The 4 processes will be bound to
    processors 0-3 (via dplace) and the program will
    be profiled under control of pfmon. The profile
    event will be CPU_CYCLES and the PMU will be set
    up to generate approximately 1,000 interrupts per
    second.
  • The profile.pl script will create a map file
    (using makemap.pl) for test_program and put it
    into test_program.map.
  • The profile samples themselves will go into
    sample.out. The analyzed profile will
    go into profile.out.
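To make the "approximately 1,000 interrupts per second" figure concrete: when the sampled event is CPU_CYCLES, the sampling period is roughly the clock frequency divided by the desired interrupt rate. The sketch below is only that arithmetic, not the script's actual code; `sampling_period` is a hypothetical helper and the 1 GHz clock is an assumed value.

```python
def sampling_period(clock_hz, samples_per_second=1000):
    """Number of CPU_CYCLES events between two sampling interrupts."""
    if clock_hz <= 0 or samples_per_second <= 0:
        raise ValueError("clock_hz and samples_per_second must be positive")
    return clock_hz // samples_per_second


if __name__ == "__main__":
    # Assumed 1 GHz processor: about one sample every million cycles.
    print(sampling_period(1_000_000_000))
```

A faster clock at the same interrupt rate simply means a proportionally longer period between samples.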

24
Pfmon / Profile.pl example
25
Pfmon / Profile.pl example
26
Pfmon / Profile.pl profile.out
27
Pfmon / Profile.pl profile.out
Understanding _shell_207_par_loop5: the name has the
form _functionName_Line_par_loopXX. Find function
shell and go to line 207.
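Decoding such a symbol can be automated when a profile contains many parallel loops. This is a minimal sketch assuming every name follows the `_functionName_Line_par_loopXX` pattern of the single example above; `parse_par_loop` is a hypothetical helper, and other compiler versions may generate names differently.

```python
import re

# Pattern inferred from the one example on the slide (an assumption).
PAR_LOOP_RE = re.compile(r"^_(?P<function>.+)_(?P<line>\d+)_par_loop(?P<index>\d+)$")


def parse_par_loop(symbol):
    """Return (function_name, source_line, loop_index), or None if the name does not match."""
    match = PAR_LOOP_RE.match(symbol)
    if match is None:
        return None
    return match.group("function"), int(match.group("line")), int(match.group("index"))


if __name__ == "__main__":
    print(parse_par_loop("_shell_207_par_loop5"))
```

The greedy `.+` lets function names that themselves contain underscores still split correctly at the trailing line number.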
28
Pfmon / Profile.pl (OpenMP)
On a SNIA system, if test_program is an OpenMP
program, you need to specify the "-x6" option as
well (to get dplace to ignore the two shepherd
processes that the OpenMP library creates):
    profile.pl -c0-3 -x6 test_program
Program arguments can be supplied as follows:
    profile.pl -c0-3 -x6 test_program arg1 arg2 arg3 etc
To make input or output redirection apply to
test_program only, you need to put quotes around
the program name as follows:
    profile.pl -c0-3 -x6 "test_program <input >output"
Otherwise the redirection applies to profile.pl
instead, which is probably not what you wanted.
29
Pfmon / Profile.pl (MPI)
To use MPI with profile.pl:
    mpirun -np 4 /usr/bin/profile.pl -c0-3 -s1 ./blast_waves < input
30
Pfmon Profile.out (kernel)
31
Pfmon Profile.out (User)
32
Pfmon / Profile.pl (MPI)
To use MPI with profile.pl:
    mpirun -np 4 /usr/bin/profile.pl -K -c0-3 -s1 ./blast_waves < input
-K: keep the separate per-CPU sample files around
and produce a separate profile report for each CPU.
33
Pfmon / Profile.pl (List of commands)
34
Pfmon / Profile.pl example 1
Start the application:
    mpirun -np 4 ./blastwave < input
Run top to get the PIDs of the processes you want
to profile. When the application has reached the
point where you want to start profiling, issue
the command:
    profile.pl -T secs -P program_name -c0-3 -L pidlist
  • -T secs: run a timed profile experiment for the
    given number of seconds.
  • -P program_name: the program name.
  • -L pidlist: pidlist is a comma-separated list of
    PIDs (containing no blanks). This list will be
    passed to the profile analyzer and will restrict
    profiling to these process IDs.
35
Pfmon / Profile.pl example 1
    profile.pl -c0-3 -P blast_waves -T 120 -L 1527,1528,1529,1530
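Because the pidlist must be comma-separated with no blanks, it is easy to get wrong when typed by hand from top output. The following sketch assembles the argument list shown above from a list of PIDs; `profile_command` is a hypothetical helper, and it uses only the options that appear in this example.

```python
def profile_command(pids, program, seconds, cpus="0-3"):
    """Build the profile.pl argument list for a timed profile of running processes."""
    # Comma-separated, no blanks, as the -L option requires.
    pidlist = ",".join(str(int(pid)) for pid in pids)
    return ["profile.pl", f"-c{cpus}", "-P", program, "-T", str(seconds), "-L", pidlist]


if __name__ == "__main__":
    print(" ".join(profile_command([1527, 1528, 1529, 1530], "blast_waves", 120)))
```

The returned list could be handed to `subprocess.run` on a system where profile.pl is installed.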
36
Pfmon / Profile.pl example 2
    ecc -o barkern -O3 -ftz ./barkern44.c
    ecc -o bar -O3 -ftz -mP3OPT_ecg_mm_fp_ld_latency=16 ./barkern44.c
37
Pfmon / Profile.pl example 2
barkern:
     6122528008 BACK_END_BUBBLE_ALL
     6080404237 BE_EXE_BUBBLE_ALL
         642645 BE_FLUSH_BUBBLE_ALL
       41149543 BE_L1D_FPU_BUBBLE_ALL
    12383933977 CPU_CYCLES
    BE_EXE_BUBBLE_ALL / CPU_CYCLES = .49
bar:
     2654275323 BACK_END_BUBBLE_ALL
     2567404906 BE_EXE_BUBBLE_ALL
         615358 BE_FLUSH_BUBBLE_ALL
       85854187 BE_L1D_FPU_BUBBLE_ALL
     9035670097 CPU_CYCLES
    BE_EXE_BUBBLE_ALL / CPU_CYCLES = .28
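The two ratios follow directly from the counts in the listing above. As a quick check, this short Python sketch recomputes them; the dictionary layout and the helper name `exe_bubble_fraction` are illustrative choices, not part of pfmon or profile.pl.

```python
# Event counts copied from the two pfmon runs above.
COUNTS = {
    "barkern": {"BE_EXE_BUBBLE_ALL": 6080404237, "CPU_CYCLES": 12383933977},
    "bar": {"BE_EXE_BUBBLE_ALL": 2567404906, "CPU_CYCLES": 9035670097},
}


def exe_bubble_fraction(counts):
    """Fraction of all cycles counted as execution-stage bubbles."""
    return counts["BE_EXE_BUBBLE_ALL"] / counts["CPU_CYCLES"]


if __name__ == "__main__":
    for name, counts in COUNTS.items():
        print(f"{name}: {exe_bubble_fraction(counts):.2f}")
```

Raising the assumed floating-point load latency thus reduces both the total cycle count and the share of cycles lost to BE_EXE_BUBBLE_ALL.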
38
Pfmon / Profile.pl example 2
39
Pfmon / Profile.pl example 2
// Block 9 lentry lexit ltail collapsed pipelined Pred 9 8 Succ 9 10 -S
// Freq 1.2e05, Prob 0.99
.b3_9: // emit lab 1
.mfi
    (p16) ldfd  f32=[r15],8        //0115 1207
    (p17) fma.d f41=f48,f37,f42    //8118 1215
          nop.i 0
.mfi
    (p16) ldfd  f36=[r14],8        //0115 1208
    (p17) fma.d f45=f33,f51,f46    //8119 1217
          nop.i 0
40
Pfmon / Profile.pl example 2
.mfi
          nop.m 0
    (p16) fma.d f34=f32,f36,f35    //6115 1209
          nop.i 0
.mfi
          nop.m 0
    (p17) fma.d f38=f48,f52,f39    //14130 1230
          nop.i 0
41
Pfmon / Profile.pl example 2
42
Pfmon / Profile.pl example 2
    ecc -o barkern -O3 -ftz ./barkern44.c
    ecc -o bar -O3 -ftz -mP3OPT_ecg_mm_fp_ld_latency=16 ./barkern44.c
43
Guideview
To get the profiling statistics for OpenMP, use
the following compiler options:
    -O3 -openmp -openmp_profile
This will cause the linker to use libguide_stats.a
instead of libguide.a. For example:
    efc -O3 -openmp -openmp_profile -o swim swim.f
To get the profiling data, you simply run the
program. For example:
    export OMP_NUM_THREADS=8
    ./swim < swim.in
Once the program has finished, a file named
swim.gvs will be produced.
44
Guideview
Without Java, the functionality of Guideview is
severely limited, but text output is still
available and is useful. The graphical portions
of Guideview require Java. Java 1.1.6-8 and
Java 1.2.2 are supported. Later versions
seem to work also. To invoke Guideview:
    guideview -jpath=/root/java/j2sdk1.4.1/bin/java -mhz=998 ./swim.gvs
NOTE: a beta version of Guideview is now
available for VTune.
45
Guideview Main panel
46
Guideview Region View
47
Guideview Thread View
48
lipfpm and histx
lipfpm does not work on statically linked
applications. The correct invocation for MPI is:
    mpirun -np N lipfpm <lipfpm args including "-f"> a.out <a.out args>
histx does not work on statically linked
executables. The correct invocation for MPI is:
    mpirun -np N histx <histx args including "-f"> a.out <a.out args>
When using dplace on OpenMP codes, the correct
invocation is:
    dplace <dplace args including "-x13"> histx <histx args, "-f" not required> \
        a.out <a.out args>
49
VTune: Main Panel
50
VTune: Pick "View As Table"
51
VTune: Edit Menu, Pick "Filter"
52
VTune: To Drill Down, Click on the Process ID
53
VTune