1
cc Compiler Parallelization Options
  • CSE 260 Mini-project
  • Fall 2001
  • John Kerwin

2
Background
  • The Sun Workshop 6.1 cc compiler does not support
    OpenMP, but it does contain Multi Processing
    options similar to those in OpenMP.

FOR MORE INFO...
Chapter 4 of the C User's Guide at
http://docs.sun.com/htmlcoll/coll.33.7/iso-8859-1/CUG/parallel.html
discusses how the compiler can parallelize Sun ANSI/ISO C code.
Slides from "Application Tuning on Sun Systems" by Ruud van der Pas at
http://www.uni-koeln.de/RRZK/server/sunfire/Koln_Talk_Jun2001_Summary_HO.pdf
contain a lot of useful information about compiler options.

3
Three Ways to Enable Compiler Parallelization
  • -xautopar Automatic parallelization
  • Just compile and run on a multiprocessor (a
    compile-and-run sketch follows this list).
  • Use a command like "setenv PARALLEL 8" to set the
    number of processors at runtime.
  • -xexplicitpar Explicit parallelization only
  • Use pragmas similar to those used in OpenMP to
    guide the compiler.
  • -xparallel Automatic and explicit parallelization
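
A minimal sketch of the -xautopar workflow, assuming csh and an
illustrative file name (the flags and the PARALLEL variable are the
ones named above):

    /* saxpy.c -- compile: cc -xO3 -xautopar -xloopinfo saxpy.c -o saxpy
     * run:     setenv PARALLEL 8      (request 8 threads, csh syntax)
     *          ./saxpy                                               */
    #include <stdio.h>

    #define N 1000000

    static double x[N], y[N];

    int main(void)
    {
        double a = 2.0;
        int i;

        for (i = 0; i < N; i++)      /* iterations are independent, so */
            y[i] = y[i] + a * x[i];  /* -xautopar can split them       */
                                     /* across threads                 */
        printf("y[0] = %f\n", y[0]);
        return 0;
    }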

4
-xautopar
  • Requires -xO3 or higher optimization
  • Includes -xdepend
  • -xdepend analyzes loops for inter-iteration data
    dependencies and restructures them if possible to
    allow different iterations of the loop to be
    executed in parallel.
  • -xautopar analyzes every loop in the program and
    generates parallel code for parallelizable loops.
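
A hedged sketch of the distinction -xdepend's analysis draws (function
and variable names are illustrative):

    /* No inter-iteration dependence: each iteration writes its own
     * a[i] and reads nothing another iteration writes, so the
     * iterations can safely run in parallel. */
    void independent(double *a, const double *b, const double *c, int n)
    {
        int i;
        for (i = 0; i < n; i++)
            a[i] = b[i] + c[i];
    }

    /* Loop-carried dependence: iteration i reads a[i-1], which the
     * previous iteration writes, so the loop must stay serial unless
     * the compiler can restructure it. */
    void carried(double *a, const double *b, int n)
    {
        int i;
        for (i = 1; i < n; i++)
            a[i] = a[i-1] + b[i];
    }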

5
How Automatic Parallelization Works
  • At the beginning of the program, the master
    thread spawns slave threads to execute the
    parallel code.
  • The slave threads wait idly until the master
    thread encounters a parallelizable loop that is
    profitable to execute in parallel.
  • If it encounters one, different iterations of the
    loop are assigned to slave threads, and all the
    threads synchronize at a barrier at the end of
    the loop.

6
How Automatic Parallelization Works (continued)
  • The master thread uses an estimate of the
    granularity of each loop (number of iterations,
    versus the overhead of distributing work to
    threads and synchronizing) to determine whether
    or not it is profitable to execute the loop in
    parallel.  
  • If it cannot determine the granularity of the
    loop at compile time, it will generate both
    serial and parallel versions of the loop, and
    only call the parallel version at runtime if the
    number of iterations justifies the overhead
    (sketched below).
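
A conceptual sketch of the two-version code described above; the
threshold and both loop bodies stand in for code the compiler
generates internally, not anything the programmer writes:

    #define ILLUSTRATIVE_THRESHOLD 1000   /* made-up cutoff; the real
                                             heuristic is internal */
    void scale(double *a, double s, int n)
    {
        int i;
        if (n < ILLUSTRATIVE_THRESHOLD) {
            /* Serial version: too few iterations to repay the cost
             * of distributing work and the end-of-loop barrier. */
            for (i = 0; i < n; i++)
                a[i] *= s;
        } else {
            /* Parallel version: in the generated code these
             * iterations would be divided among the slave threads,
             * which synchronize at a barrier afterwards; shown here
             * as a plain loop for readability. */
            for (i = 0; i < n; i++)
                a[i] *= s;
        }
    }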

7
How Effective is -xautopar?
  • Success or failure with -xautopar depends on
  • Type of application
  • Coding style
  • Quality of the compiler
  • The compiler may not be able to automatically
    parallelize the loops in the most efficient
    manner.
  • This can happen if
  • The data dependency analysis is unable to
    determine whether it is safe to parallelize a
    loop (see the aliasing sketch after this list).
  • The granularity is not high enough because the
    compiler lacks information to parallelize the
    loop at the highest possible level.
  • Use the -xloopinfo option to print
    parallelization messages showing which loops
    were parallelized.
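
An illustrative loop the dependence analysis typically cannot prove
safe (names are made up):

    /* If a and b could overlap in memory (alias), the iterations are
     * not guaranteed independent, so the compiler must assume the
     * worst and leave the loop serial. */
    void add_one(double *a, const double *b, int n)
    {
        int i;
        for (i = 0; i < n; i++)
            a[i] = b[i] + 1.0;   /* parallel-safe only if a and b
                                    never overlap */
    }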

8
-xexplicitpar
  • When automatic parallelization falls short,
    explicit parallelization through pragmas comes
    into the picture.
  • -xexplicitpar allows the programmer to insert
    pragmas into the code to guide the compiler on
    how to parallelize certain loops. 
  • The programmer is responsible for ensuring
    pragmas are used correctly, otherwise results are
    undefined.
  • Use -xvpara to print compiler warnings about
    potentially misused pragmas.
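
A minimal sketch of explicit parallelization (the clause-free form of
the taskloop pragma and the function name are assumptions for
illustration):

    /* Compile with something like: cc -xO3 -xexplicitpar -xvpara add.c
     * The pragma asserts the loop's iterations are independent; if
     * that assertion is wrong, the results are undefined. */
    void add_arrays(double *a, const double *b, const double *c, int n)
    {
        int i;

    #pragma MP taskloop
        for (i = 0; i < n; i++)
            a[i] = b[i] + c[i];
    }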

9
Examples of some Pragmas Similar to OpenMP Pragmas
  • Static Scheduling
  • All the iterations of the loop are uniformly
    distributed among all the participating
    processors. 
  • #pragma MP taskloop schedtype(static)
  • for (i = 1; i < N-1; i++)
  • ...
  • similar to
  • #pragma omp for schedule(static)

10
Examples of some Pragmas Similar to OpenMP Pragmas
  • Dynamic Scheduling with a Specified chunk_size
  • #pragma MP taskloop schedtype(self(120))
  • similar to
  • #pragma omp for schedule(dynamic, 120)
  • Guided Dynamic Scheduling with a Minimum
    chunk_size
  • #pragma MP taskloop schedtype(gss(10))
  • similar to
  • #pragma omp for schedule(guided, 10)
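
A sketch of where dynamic scheduling pays off (the function name and
chunk size are illustrative): the inner trip count grows with i, so a
static split gives the threads handling later iterations far more
work, while self(120) hands out 120-iteration chunks on demand at the
cost of extra dispatch overhead.

    #define N 1000

    void zero_lower_triangle(double a[N][N])
    {
        int i;

    #pragma MP taskloop schedtype(self(120))
        for (i = 0; i < N; i++) {
            int j;                      /* block-scoped, so each
                                           iteration gets its own copy */
            for (j = 0; j <= i; j++)    /* work grows with i */
                a[i][j] = 0.0;
        }
    }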

11
Speedup Using Static, Dynamic, and Guided MP
Pragmas with 8 Processors
12
Speedup from MPI, Pthreads, and Sun MP Programs
with 8 Processors
13
Time Spent Converting Serial Code to Parallel Code
14
Coming Soon: OpenMP
  • OpenMP is supported in the Workshop 6.2 C
    compiler
  • #include <omp.h>
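
A minimal OpenMP sketch for comparison (the -xopenmp compile flag is
an assumption about the newer compiler; the pragma and header are
standard OpenMP):

    /* Compile (assumed): cc -xO3 -xopenmp sum.c -o sum */
    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        int i;
        double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
        for (i = 1; i <= 100; i++)
            sum += i;

        printf("sum = %.0f using up to %d threads\n",
               sum, omp_get_max_threads());
        return 0;
    }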