Title: GAMMA: An Efficient Distributed Shared Memory Toolbox for MATLAB


1
GAMMA: An Efficient Distributed Shared Memory Toolbox for MATLAB
  • Rajkiran Panuganti (1), Muthu Baskaran (1), Jarek Nieplocha (2), Ashok Krishnamurthy (3), Atanas Rountev (1), P. Sadayappan (1)
  • (1) The Ohio State University
  • (2) PNNL
  • (3) Ohio Supercomputer Center

2
Overview
  • Motivation
  • GAMMA Programming Model
  • Implementation Overview
  • Experimental Evaluation
  • Conclusions

3
High Productivity Computing
  • Programmer productivity is extremely important
  • C/Fortran: good performance but poor productivity
  • Parallel programming in C/Fortran is even harder
  • MATLAB, Python, etc.: good programmer productivity
  • Poor performance and inability to run large-scale problems (memory limitations)

4
MATLAB and High Productivity
  • Numerous features resulting in High Programmer
    Productivity
  • Array Based Semantics
  • Copy/Value based semantics
  • Debugging and Profiling Support
  • Integrated Development Environment
  • Numerous Domain Specific libraries (Toolboxes)
  • Visualization
  • And many more
  • Need to retain the above features while addressing performance issues

5
Problem
(Figure: example runs annotated "Out-Of-Memory!" (twice) and "Performance!", with reported times of 199 sec and 10.19 s)
6
ParaM - Parallel MATLAB
(Diagram of the ParaM stack; labels: User, DParaM, GAMMA, Specialized Libraries, mexMPI, Library Writers, Compiler, MATLAB, GA, MVAPICH)
7
Overview
  • Motivation
  • GAMMA Programming Model
  • Implementation Overview
  • Experimental Evaluation
  • Conclusions

8
Programming Model
  • Global Shared View of the distributed Array

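(Figure: physical view and logical view of a 1024 x 1024 array spanning (1,1) to (1024,1024), distributed in blocks over processes P0, P1, P2, and P3; the highlighted section runs from (250,75) to (700,610))
A = GA(1024, 1024, distr)   (block distribution)
A(250:700, 75:610)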
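To make the global shared view concrete, the following minimal sketch works out which process owns which part of the section A(250:700, 75:610). It is plain Python, not GAMMA code. It assumes a 2 x 2 process grid with equal 512 x 512 blocks and row-major rank numbering; the slide does not state the actual layout, and the helper names `axis_pieces` and `owners` are hypothetical.

```python
def axis_pieces(lo, hi, n, nblocks):
    """Split the 1-based inclusive range [lo, hi] along one axis of length n.

    The axis is cut into nblocks equal blocks (n is assumed divisible by
    nblocks). Returns a list of (block_index, sub_lo, sub_hi) tuples.
    """
    size = n // nblocks
    pieces = []
    for b in range(nblocks):
        b_lo, b_hi = b * size + 1, (b + 1) * size
        s_lo, s_hi = max(lo, b_lo), min(hi, b_hi)
        if s_lo <= s_hi:  # the requested range overlaps this block
            pieces.append((b, s_lo, s_hi))
    return pieces


def owners(row_range, col_range, n=1024, grid=(2, 2)):
    """Map each owning rank to the (rows, cols) patch it holds.

    Assumption: ranks are numbered row-major over the process grid.
    """
    result = {}
    for rb, r_lo, r_hi in axis_pieces(row_range[0], row_range[1], n, grid[0]):
        for cb, c_lo, c_hi in axis_pieces(col_range[0], col_range[1], n, grid[1]):
            rank = rb * grid[1] + cb
            result[rank] = ((r_lo, r_hi), (c_lo, c_hi))
    return result


patches = owners((250, 700), (75, 610))
for rank, (rows, cols) in sorted(patches.items()):
    print(f"P{rank}: rows {rows[0]}..{rows[1]}, cols {cols[0]}..{cols[1]}")
```

Under these assumptions the section touches all four processes, which is the situation the logical view hides from the programmer: a single indexed access is served by several remote patches.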
9
Programming Model (Contd..)
  • Get-Compute-Put Computation Model

(Figure: Process 0 and Process 1 each perform Get(), Compute, and Put() against the global array)
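The Get-Compute-Put pattern can be sketched in a few lines. This is a toy single-process stand-in written in plain Python, not the GAMMA API: the class `GlobalArray`, its `get`/`put` methods, and the `worker` function are hypothetical names, and the two "processes" run one after another rather than in parallel.

```python
class GlobalArray:
    """Toy stand-in for a distributed array: one list plays global memory."""

    def __init__(self, n):
        self.data = [0.0] * n

    def get(self, lo, hi):
        # Copy elements lo..hi-1 into a private local buffer.
        return self.data[lo:hi]

    def put(self, lo, local):
        # Write a local buffer back, starting at global index lo.
        self.data[lo:lo + len(local)] = local


def worker(ga, rank, nprocs):
    """One process's share of the work: get its chunk, square it, put it back."""
    n = len(ga.data)
    chunk = n // nprocs
    lo = rank * chunk
    hi = n if rank == nprocs - 1 else lo + chunk
    local = ga.get(lo, hi)            # Get
    local = [x * x for x in local]    # Compute (purely local)
    ga.put(lo, local)                 # Put


ga = GlobalArray(8)
ga.put(0, [float(i) for i in range(8)])
for rank in range(2):                 # sequential stand-in for two processes
    worker(ga, rank, 2)
# In a real parallel run, a synchronization point would follow the puts.
print(ga.data)
```

The point of the pattern is that the compute step only ever touches a private local buffer, so existing serial numerical code can be reused unchanged between the get and the put.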
10
Other features in the Programming Model enabling
Efficiency
  • Pass-by-reference semantics for distributed
    arrays
  • Intended for Library writers
  • Management of Data Locality (NUMA)
  • Distribution information can be retrieved by the
    programmer
  • Reference based access to the local data
  • Data replication
  • Support for replicating near-neighbor data
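Near-neighbor replication is often described as keeping "ghost cells": each process stores a copy of the boundary elements owned by its neighbors so that stencil-style computations need no further communication. The sketch below illustrates the idea in plain Python for a 1D array and a 3-point average. It is not GAMMA code; the function names and the dictionary layout are assumptions made for illustration.

```python
def split_with_ghosts(data, nprocs, width=1):
    """Split data into nprocs blocks, each padded with neighbor copies."""
    chunk = len(data) // nprocs
    blocks = []
    for rank in range(nprocs):
        lo = rank * chunk
        hi = len(data) if rank == nprocs - 1 else lo + chunk
        g_lo = max(0, lo - width)            # first replicated index
        g_hi = min(len(data), hi + width)    # one past the last replicated index
        blocks.append({"lo": lo, "hi": hi, "g_lo": g_lo, "cells": data[g_lo:g_hi]})
    return blocks


def local_average(block):
    """3-point average over the owned cells, using only local (ghosted) data."""
    cells = block["cells"]
    first = block["lo"] - block["g_lo"]       # offset of first owned cell
    owned = block["hi"] - block["lo"]
    out = []
    for i in range(first, first + owned):
        if 0 < i < len(cells) - 1:
            out.append((cells[i - 1] + cells[i] + cells[i + 1]) / 3)
        else:
            out.append(cells[i])              # physical boundary: left unchanged
    return out


def serial_average(data):
    """Reference result computed on the whole array at once."""
    out = list(data)
    for i in range(1, len(data) - 1):
        out[i] = (data[i - 1] + data[i] + data[i + 1]) / 3
    return out


data = [float(i * i) for i in range(8)]
result = []
for block in split_with_ghosts(data, nprocs=2):
    result.extend(local_average(block))
print(result == serial_average(data))
```

With a ghost width of one, every owned cell at an internal block edge finds both neighbors locally, so the blockwise result matches the serial reference.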

11
Other features in the Programming Model enabling
Efficiency (Contd..)
  • Asynchronous operations
  • Support for Library Writers
  • Interoperable with Message Passing
  • Message Passing support using mexMPI
  • Interoperable with some other Parallel MATLAB
    projects
  • Interoperable with pMATLAB, Mathworks DCT

12
Illustration by Example (FFT2): 2D FFT
  • [rank, nprocs] = Begin()
  • dims = [N N]; distr = [N N/nprocs]
  • A = GA(dims, distr)
  • tmp = local(A)            % Get
  • tmp = fft(tmp)            % Compute
  • Put(A, tmp)               % Put
  • Sync()
  • ATmp = GA(A)
  • Transpose(A, ATmp)        % collective operation
  • Tmp = local(ATmp)
  • Put(ATmp, fft(Tmp))
  • Sync()
  • Transpose(ATmp, A)
  • GA_End()

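The FFT2 slide relies on the fact that a 2D transform can be computed as column transforms, a transpose, column transforms again, and a transpose back. The following sketch checks that identity in plain Python on a small matrix. It uses a naive O(n^2) DFT from the standard library instead of a real FFT, runs in a single process, and all function names are hypothetical; it illustrates the algorithmic structure only, not the GAMMA calls.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def fft_columns(m):
    """Transform every column independently (what fft does on a MATLAB matrix)."""
    return transpose([dft(col) for col in transpose(m)])


def fft2_by_transpose(a):
    step = fft_columns(a)        # column transforms on the original layout
    step = transpose(step)       # global transpose
    step = fft_columns(step)     # column transforms again = row transforms
    return transpose(step)       # transpose back to the original layout


def dft2_direct(a):
    """Reference: 2D DFT straight from the definition."""
    rows, cols = len(a), len(a[0])
    return [
        [
            sum(
                a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                for r in range(rows)
                for c in range(cols)
            )
            for v in range(cols)
        ]
        for u in range(rows)
    ]


a = [[float(4 * r + c) for c in range(4)] for r in range(4)]
fast = fft2_by_transpose(a)
ref = dft2_direct(a)
max_err = max(abs(fast[u][v] - ref[u][v]) for u in range(4) for v in range(4))
print(max_err < 1e-9)
```

In the distributed version each process would apply the column transform only to the columns it owns, which is why the only communication in the slide's listing is the two collective transposes.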
13
Software Architecture
(Diagram of the software architecture; components: User, MATLAB Front-End, GAMMA, mexMPI, GA, MATLAB Computation Engine, MPI, SCALAPACK)
14
Overview
  • Motivation
  • GAMMA Programming Model
  • Implementation Overview
  • Experimental Evaluation
  • Conclusions

15
Evaluation
  • OSC Pentium 4 Cluster
  • Two 2.4 GHz Intel P4 processors per node, Linux kernel 2.6.6, 4 GB RAM
  • MVAPICH 0.9.4
  • InfiniBand
  • MATLAB Version 7.01
  • Fully distributed environment
  • Evaluation using NAS Benchmarks

16
Programmability
(Figure: source lines of code (SLOC) comparison; three entries annotated "Moderate Increase in SLOC" and one annotated "Slight Increase in SLOC")
17
Performance Analysis

18
Performance Analysis
19
Use of reference-based semantics
20
Speedup on Large Problem Sizes
21
Related Work
  • Early 90s: MPI cluster programming
  • 1995: "Why there isn't a Parallel MATLAB?" (Cleve Moler)
  • Embarrassingly parallel: Paralize (98), Multi (00), PLab (00), Parmatlab (01)
  • Message passing: MultiMatlab (96), PT (96), DPToolbox (99), MATmarks (99), PMI (99), MPITB/PVMTB (00), CMTM (01)
  • Compilation based: Conlab (93), Falcon (95), ParAL (95), Otter (98), Menhir (98), MaJIC (98), MATCH (00), RTExpress (00)
  • Backend support: Matpar (98), DLab (99), Netsolve (01), Paramat (01)

22
Related Work (Currently Active)
  • Star-P (97): MIT
  • MatlabMPI (98), pMATLAB (02): MIT-LL
  • File-based message passing communication
  • MATLAB_D (00): Rice
  • Telescoping compilation, HPF, JIT compilation
  • ParaM (04): OSU, OSC
  • MathWorks (04): MDCE/MDCT

23
Conclusions
  • Discussed an efficient Distributed Shared Memory
    Toolbox for MATLAB
  • Programming Model and Efficiency features of the
    toolbox
  • Demonstrated efficiency using NAS Benchmarks
  • Download available upon request

24
Questions ?
  • Contact
  • panugant_at_cse.ohio-state.edu

25
Backup
  • NAS FT A
  • NAS EP A
  • Implementation Issues

26
Performance Analysis Contd
27
Implementation Issues
  • Different Memory managers
  • Automated Book Keeping
  • Data layout inconsistencies
  • In-Place Operations
  • Data movement between different workspaces
  • Out-of-order and irregular accesses
