Overview - PowerPoint PPT Presentation

About This Presentation
Title:

Overview

Description:

UPC is a global address space language for parallel ... Elan MPI: 40ms. MuPC: 63ms. LAM MPI: 37ms. Time. Matrix Multiplication (naïve) shared[P] int a[N][P] ... – PowerPoint PPT presentation

Slides: 11
Provided by: Ste8252
Category:
Tags: elan | overview

Transcript and Presenter's Notes

Title: Overview


1
Overview
Unified Parallel C (UPC) is an extension to ANSI C.
UPC is a global address space language for parallel programming.
UPC extends C by providing shared arrays, data affinity to processors, a parallel loop construct, locks, and split-barrier synchronization primitives.
The first UPC compiler was written for the Cray T3E.
UPC compilers are now available for AlphaServer and SGI platforms.
2
Example UPC Program

shared int a[THREADS];
shared int b;

void main(void)
{
    if (MYTHREAD == 0) {
        a[0] = 4;
        a[1] = 2;
    }
    upc_barrier;
    if (MYTHREAD == 1)
        b = a[0];
}

Memory Layout (initial)
            Thread 0      Thread 1
  Shared    a[0] = 0      a[1] = 0
            b = 0
  Local

Memory Layout (final)
            Thread 0      Thread 1
  Shared    a[0] = 4      a[1] = 2
            b = 4
  Local
3
The Big Picture
UPC Code
  -> EDG UPC to C Translator
  -> UPC Intermediate code in C
  -> C Compiler (with MuPC RTS Object Code and MPI Library)
  -> UPC Executable Code
4
The Run Time System Interface
The run time system interface is divided into six parts:
1. Initialization and finalization.
2. Gets and puts to implement one-sided remote references.
3. Synchronization functions to implement the UPC builtins barrier, notify and wait.
4. Locks to implement upc_lock, upc_unlock and upc_lock_attempt.
5. Dynamic memory allocation functions to implement upc_local_alloc, upc_global_alloc and upc_all_alloc.
6. String functions to implement upc_memcpy, upc_memget, upc_memset and upc_memput.
5
MuPC
MuPC is Michigan Technological University's implementation of Compaq's runtime system interface.
MuPC is open source.
MuPC is available on AlphaServer, Sun Solaris and Linux clusters.
MuPC is a user-level implementation based on Pthreads and MPI.
6
MuPC Design
mupcrun -n 3 a.out
Each of the 3 processes follows the same path:
  pthread_create -> User UPC Pthread + Send/Recv Pthread -> upc_finalize
1 UPC thread = 2 Pthreads = 1 Unix process.
The user UPC Pthread runs the user's code.
The send/recv Pthread uses MPI for interprocess communication.
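The two-Pthread split described on this slide can be sketched in plain C. This is only an illustration under stated assumptions, not MuPC's code: the `mailbox` structure and the functions `comm_thread`, `remote_get` and `sum_via_comm_thread` are hypothetical names, and a local array stands in for memory that a real send/recv Pthread would reach through MPI.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical mailbox shared by the two Pthreads of one process. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int has_request;   /* set by the user thread */
    int has_reply;     /* set by the communication thread */
    int shutdown;      /* asks the communication thread to exit */
    size_t index;      /* element being requested */
    int value;         /* reply payload */
    const int *data;   /* stand-in for memory owned by another process */
};

/* Communication thread: serves one get at a time until told to stop.
   A real send/recv Pthread would call MPI here (assumption: not shown). */
static void *comm_thread(void *arg)
{
    struct mailbox *mb = arg;
    pthread_mutex_lock(&mb->lock);
    for (;;) {
        while (!mb->has_request && !mb->shutdown)
            pthread_cond_wait(&mb->cond, &mb->lock);
        if (!mb->has_request)
            break;                      /* shutdown with nothing pending */
        mb->value = mb->data[mb->index];
        mb->has_request = 0;
        mb->has_reply = 1;
        pthread_cond_broadcast(&mb->cond);
    }
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* User thread side: post a get and block until the reply arrives. */
static int remote_get(struct mailbox *mb, size_t index)
{
    int v;
    pthread_mutex_lock(&mb->lock);
    mb->index = index;
    mb->has_request = 1;
    pthread_cond_broadcast(&mb->cond);
    while (!mb->has_reply)
        pthread_cond_wait(&mb->cond, &mb->lock);
    v = mb->value;
    mb->has_reply = 0;
    pthread_mutex_unlock(&mb->lock);
    return v;
}

/* Start the communication thread, fetch every element through it,
   shut it down, and return the sum (or -1 if the thread cannot start). */
static int sum_via_comm_thread(const int *data, size_t n)
{
    struct mailbox mb;
    pthread_t tid;
    int sum = 0;

    assert(data != NULL);
    pthread_mutex_init(&mb.lock, NULL);
    pthread_cond_init(&mb.cond, NULL);
    mb.has_request = mb.has_reply = mb.shutdown = 0;
    mb.index = 0;
    mb.value = 0;
    mb.data = data;

    if (pthread_create(&tid, NULL, comm_thread, &mb) != 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        sum += remote_get(&mb, i);

    pthread_mutex_lock(&mb.lock);
    mb.shutdown = 1;
    pthread_cond_broadcast(&mb.cond);
    pthread_mutex_unlock(&mb.lock);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&mb.cond);
    pthread_mutex_destroy(&mb.lock);
    return sum;
}
```

Calling `sum_via_comm_thread` on a small array from `main` exercises the whole path: the caller plays the user UPC Pthread, and every element is fetched through the second thread. Build with Pthreads enabled (for example `-pthread` on GCC or Clang).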
7
Ping-Pong Test Performance
Platform                                    Native MPI        MuPC
2GHz Intel processors (Gigabit ethernet)    LAM MPI   37ms    63ms
AlphaServer                                 Elan MPI  40ms    55ms
Sun Enterprise 4500                         Sun MPI    7ms    75ms
(Chart axis: Time)
8
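Matrix Multiplication (naïve)
16x2x2GHz Intel processors, Gigabit ethernet
Total problem size 128x128 integer

shared [P] int a[N][P];
shared int b[P][M];
shared [M] int c[N][M];

upc_forall (i = 0; i < N; i++; &a[i][0])
    for (j = 0; j < M; j++) {
        sum = 0;
        for (k = 0; k < P; k++)
            sum += a[i][k] * b[k][j];
        c[i][j] = sum;
    }

[Chart with x-axis values 1, 2, 4, 8, 16]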
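How the affinity expression `&a[i][0]` distributes the rows can be sketched in plain C. The sketch assumes UPC's block-cyclic rule (element `index` of an array with block size `block` has affinity to thread `(index / block) % THREADS`); the function names `owner_of` and `multiply_rows_of` are hypothetical, and the matrices are ordinary row-major C arrays rather than shared arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Thread with affinity to element `index` of a shared array declared with
   block size `block`, distributed over `threads` threads. */
static size_t owner_of(size_t index, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (index / block) % threads;
}

/* Serial emulation of the naive loop as executed by thread `me`:
   only rows i whose a[i][0] has affinity to `me` are computed.
   a is n x p, b is p x m, c is n x m, all row-major.
   Returns the number of rows this thread handled. */
static size_t multiply_rows_of(size_t me, size_t threads,
                               size_t n, size_t p, size_t m,
                               const int *a, const int *b, int *c)
{
    size_t rows = 0;
    for (size_t i = 0; i < n; i++) {
        /* a has block size p, so a[i][0] is linear element i * p */
        if (owner_of(i * p, p, threads) != me)
            continue;
        for (size_t j = 0; j < m; j++) {
            int sum = 0;
            for (size_t k = 0; k < p; k++)
                sum += a[i * p + k] * b[k * m + j];
            c[i * m + j] = sum;
        }
        rows++;
    }
    return rows;
}
```

With block size P, row i of `a` forms exactly one block, so under this rule row i goes to thread `i % THREADS`. Calling `multiply_rows_of` once for every thread id from 0 to `threads - 1` fills the whole of `c`.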
9
Matrix Multiplication (with prefetching)
16x2x2GHz Intel processors, Gigabit ethernet
Total problem size 128x128 integer

int local_a[P];

upc_forall (j = 0; j < M; j++; &b[0][j])
    for (i = 0; i < N; i++) {
        upc_memget(local_a, a[i], P * sizeof(int));
        sum = 0;
        for (k = 0; k < P; k++)
            sum += local_a[k] * b[k][j];
        c[i][j] = sum;
    }

[Chart with x-axis values 1, 2, 4, 8, 16]
10
Matrix Multiplication (prefetching + local pointer)
16x2x2GHz Intel processors, Gigabit ethernet
Total problem size 128x128 integer

int local_a[P];
int *pb;
int stride = M / THREADS;

upc_forall (j = 0; j < M; j++; &b[0][j])
    for (i = 0; i < N; i++) {
        pb = (int *)&b[0][j];
        upc_memget(local_a, a[i], P * sizeof(int));
        sum = 0;
        for (k = 0, s = 0; k < P; k++, s += stride)
            sum += local_a[k] * pb[s];
        c[i][j] = sum;
    }

[Chart with x-axis values 1, 2, 4, 8, 16]
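Why a stride of M/THREADS walks down one column of `b` through a local pointer can be checked with a small C sketch. It rests on two assumptions that the slide does not state: THREADS divides M evenly, and a thread stores its share of a cyclically distributed array (block size 1) contiguously, so that element `index` sits at local offset `index / THREADS`. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cyclic layout (block size 1): owner thread of linear element `index`. */
static size_t cyclic_owner(size_t index, size_t threads)
{
    assert(threads > 0);
    return index % threads;
}

/* Assumed position of linear element `index` inside its owner's local memory. */
static size_t cyclic_local_offset(size_t index, size_t threads)
{
    assert(threads > 0);
    return index / threads;
}

/* Returns 1 if every b[k][j], k = 0..p-1, of a p x m cyclic array lives on
   the same thread as b[0][j] and its local offset advances by m / threads
   per row; returns 0 otherwise. Requires m to be a multiple of threads. */
static int column_is_local_with_stride(size_t j, size_t p, size_t m,
                                       size_t threads)
{
    assert(threads > 0 && m % threads == 0);
    size_t stride = m / threads;
    size_t owner0 = cyclic_owner(j, threads);
    size_t base = cyclic_local_offset(j, threads);

    for (size_t k = 0; k < p; k++) {
        size_t index = k * m + j;          /* linear index of b[k][j] */
        if (cyclic_owner(index, threads) != owner0)
            return 0;
        if (cyclic_local_offset(index, threads) != base + k * stride)
            return 0;
    }
    return 1;
}
```

Under these assumptions the linear index of `b[k][j]` is `k*M + j`, and since `k*M` is a multiple of THREADS the owner is always `j % THREADS` while the local offset grows by exactly `M/THREADS` per row. That matches the loop's `s += stride` update on `pb`.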