Title: cc Compiler Parallelization Options
1. cc Compiler Parallelization Options
- CSE 260 Mini-project
- Fall 2001
- John Kerwin
2. Background
- The Sun Workshop 6.1 cc compiler does not support
OpenMP, but it does contain Multi Processing
options similar to those in OpenMP.
FOR MORE INFO...
Chapter 4 of the C User's Guide at
http://docs.sun.com/htmlcoll/coll.33.7/iso-8859-1/CUG/parallel.html
discusses how the compiler can parallelize Sun ANSI/ISO C code.
Slides from "Application Tuning on Sun Systems" by Ruud van der Pas at
http://www.uni-koeln.de/RRZK/server/sunfire/Koln_Talk_Jun2001_Summary_HO.pdf
contain a lot of useful information about compiler options.
3. Three Ways to Enable Compiler Parallelization
- -xautopar  Automatic parallelization
  - Just compile and run on a multiprocessor.
  - Use a command like "setenv PARALLEL 8" to set the number of
    processors at runtime (see the sketch after this slide).
- -xexplicitpar  Explicit parallelization only
  - Use pragmas similar to those used in OpenMP to guide the compiler.
- -xparallel  Automatic and explicit parallelization
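A minimal compile-and-run sketch of the -xautopar route, assuming the Sun Workshop cc driver; the file name, array size, and loop body are illustrative and not taken from the slides:

  /*
   * saxpy.c (hypothetical file name).  Build and run roughly as follows:
   *
   *   cc -xO3 -xautopar -xloopinfo -o saxpy saxpy.c
   *   setenv PARALLEL 8      (csh syntax; use PARALLEL=8; export PARALLEL in sh)
   *   ./saxpy
   */
  #include <stdio.h>

  #define N 1000000

  static double x[N], y[N];

  int main(void)
  {
      int i;

      /* Independent iterations: a candidate for automatic parallelization. */
      for (i = 0; i < N; i++) {
          y[i] = y[i] + 2.0 * x[i];
      }

      printf("y[0] = %f\n", y[0]);
      return 0;
  }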
4. -xautopar
- Requires -xO3 or higher optimization
- Includes -xdepend
- -xdepend analyzes loops for inter-iteration data dependencies and
  restructures them if possible to allow different iterations of the
  loop to be executed in parallel.
- -xautopar analyzes every loop in the program and generates parallel
  code for parallelizable loops.
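As a small illustration of what the dependence analysis has to decide, the two loops below are hypothetical examples (not from the slides): the first has independent iterations, the second carries a dependence from one iteration to the next.

  #define N 4096

  static double a[N], b[N];

  /* Independent iterations: a[i] is computed from b[i] alone, so
   * -xdepend finds no inter-iteration dependence and -xautopar can
   * split the iterations across threads. */
  void independent(void)
  {
      int i;
      for (i = 0; i < N; i++) {
          a[i] = 2.0 * b[i];
      }
  }

  /* Inter-iteration dependence: iteration i reads a[i-1], which is
   * written by iteration i-1, so the loop cannot be run in parallel
   * as written. */
  void dependent(void)
  {
      int i;
      for (i = 1; i < N; i++) {
          a[i] = a[i - 1] + b[i];
      }
  }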
5. How Automatic Parallelization Works
- At the beginning of the program, the master thread spawns slave
  threads to execute the parallel code.
- The slave threads wait idly until the master thread encounters a
  parallelizable loop that is profitable to execute in parallel.
- If it encounters one, different iterations of the loop are assigned
  to slave threads, and all the threads synchronize at a barrier at
  the end of the loop.
6. How Automatic Parallelization Works (continued)
- The master thread uses an estimate of the granularity of each loop
  (the number of iterations versus the overhead of distributing work
  to threads and synchronizing) to determine whether or not it is
  profitable to execute the loop in parallel.
- If the compiler cannot determine the granularity of a loop at
  compile time, it generates both a serial and a parallel version of
  the loop, and the parallel version is only called at runtime if the
  number of iterations justifies the overhead.
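Conceptually, the two-version code behaves roughly like the sketch below; the threshold value and the helper names are made up for illustration and are not what the compiler actually emits.

  #define GRANULARITY_THRESHOLD 1000      /* hypothetical cutoff */

  /* Serial version of the loop. */
  static void run_loop_serial(double *a, double *b, int n)
  {
      int i;
      for (i = 0; i < n; i++) {
          a[i] = 2.0 * b[i];
      }
  }

  /* Stand-in for the compiler-generated parallel version; in the real
   * generated code the iterations would be handed out to the slave
   * threads. */
  static void run_loop_parallel(double *a, double *b, int n)
  {
      run_loop_serial(a, b, n);           /* placeholder: same result */
  }

  /* Runtime dispatch: only pay the parallel overhead when the trip
   * count is large enough. */
  void scale(double *a, double *b, int n)
  {
      if (n > GRANULARITY_THRESHOLD)
          run_loop_parallel(a, b, n);
      else
          run_loop_serial(a, b, n);
  }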
7. How Effective is -xautopar?
- Success or failure with -xautopar depends on
  - the type of application
  - coding style
  - the quality of the compiler
- The compiler may not be able to automatically parallelize the loops
  in the most efficient manner. This can happen if
  - the data dependency analysis is unable to determine whether or not
    it is safe to parallelize a loop (see the example after this
    slide), or
  - the granularity is not high enough because the compiler lacks the
    information needed to parallelize the loop at the highest possible
    level.
- Check the parallelization messages to see which loops were
  parallelized by using the -xloopinfo option.
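A hypothetical example of the first failure mode: because p and q are ordinary pointers, the compiler cannot prove the arrays never overlap, so the dependence analysis has to be conservative.

  /* Possible aliasing: if p and q overlap, p[i] may be the same
   * location as q[i-1], so the compiler must assume a dependence
   * between iterations and may decline to parallelize the loop. */
  void copy_shift(double *p, double *q, int n)
  {
      int i;
      for (i = 1; i < n; i++) {
          p[i] = q[i - 1];
      }
  }

  /* Compiling with "cc -xO3 -xautopar -xloopinfo -c alias.c"
   * (hypothetical file name) prints a message for each loop saying
   * whether it was parallelized. */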
8. -xexplicitpar
- This is where explicit parallelization through pragmas comes into
  the picture.
- -xexplicitpar allows the programmer to insert pragmas into the code
  to guide the compiler on how to parallelize certain loops.
- The programmer is responsible for ensuring the pragmas are used
  correctly; otherwise the results are undefined.
- Use -xvpara to print compiler warnings about potentially misused
  pragmas.
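A minimal sketch of the explicit route, using the Sun MP taskloop pragma shown on the next slides; the file name and loop body are illustrative.

  /*
   * sum.c (hypothetical file name).  Compile with something like:
   *   cc -xO3 -xexplicitpar -xvpara -o sum sum.c
   */
  #include <stdio.h>

  #define N 100000

  static double a[N], b[N];

  int main(void)
  {
      int i;

  #pragma MP taskloop               /* ask the compiler to parallelize */
      for (i = 0; i < N; i++) {     /* the loop that follows           */
          a[i] = a[i] + b[i];
      }

      printf("a[0] = %f\n", a[0]);
      return 0;
  }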
9. Examples of some Pragmas Similar to OpenMP Pragmas
- Static Scheduling
- All the iterations of the loop are uniformly distributed among all
  the participating processors.

  #pragma MP taskloop schedtype(static)
  for (i = 1; i < N-1; i++)
  {
      ...
  }

- similar to

  #pragma omp for schedule(static)
10. Examples of some Pragmas Similar to OpenMP Pragmas
- Dynamic Scheduling with a Specified chunk_size

  #pragma MP taskloop schedtype(self(120))

- similar to

  #pragma omp for schedule(dynamic, 120)

- Guided Dynamic Scheduling with a Minimum chunk_size

  #pragma MP taskloop schedtype(gss(10))

- similar to

  #pragma omp for schedule(guided, 10)
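As a hedged illustration of where these scheduling choices matter, the loop below does uneven work per iteration (a hypothetical triangular update, not from the slides); this is the kind of loop that tends to benefit from self or gss scheduling over static.

  #define N 2000

  static double m[N][N];

  /* Row i touches N-i elements, so iterations do unequal amounts of
   * work; guided scheduling (gss) hands out shrinking chunks, here
   * with a minimum chunk of 10 iterations. */
  void triangular_update(void)
  {
      int i;

  #pragma MP taskloop schedtype(gss(10))
      for (i = 0; i < N; i++) {
          int j;                        /* local to each iteration */
          for (j = i; j < N; j++) {
              m[i][j] = m[i][j] * 0.5;
          }
      }
  }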
11. Speedup Using Static, Dynamic, and Guided MP Pragmas with 8 Processors
12. Speedup from MPI, Pthreads, and Sun MP Programs with 8 Processors
13. Time Spent Converting Serial Code to Parallel Code
14. Coming Soon: OpenMP
- OpenMP is supported in the Workshop 6.2 C compiler
- #include <omp.h>
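For comparison, a minimal OpenMP sketch of the same kind of loop, assuming a compiler with OpenMP support enabled (the exact compile flag for Workshop 6.2 is not covered in these slides):

  #include <omp.h>
  #include <stdio.h>

  #define N 100000

  static double a[N], b[N];

  int main(void)
  {
      int i;

      /* OpenMP counterpart of the Sun MP taskloop examples above. */
  #pragma omp parallel for schedule(static)
      for (i = 0; i < N; i++) {
          a[i] = a[i] + b[i];
      }

      printf("a[0] = %f, max threads = %d\n", a[0], omp_get_max_threads());
      return 0;
  }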