Title: Lessons Learned and A Pragmatic Approach to Parallelism
1 Lessons Learned and A Pragmatic Approach to Parallelism
- Saman Amarasinghe
- Computer Science and Artificial Intelligence Laboratory
- Massachusetts Institute of Technology
2 The Raw Multicore Processor
With Anant Agarwal and the MIT Raw Group, 1997 - 2003
- Multiprocessor → Multicore: a paradigm shift
- Bisection Bandwidth
- Build a tightly integrated multicore
- 16 single issue cores
- 4 register-mapped networks
- Huge IO bandwidth
- Raw power
- 16 Flops/ops per cycle
- 16 Memory Accesses per cycle
- 208 Operand Routes per cycle
- 12 IO Operations per cycle
- Able to efficiently execute fine-grained parallelism
- But programming is still hard
3 StreamIt Language and Compiler
With the MIT Commit Group, 2002 - current
- Some programming models are inherently concurrent
- Streaming domain
- A win-win situation
- StreamIt Language
- Conceptually easy to understand and program
- StreamIt Compiler
- Finds the Inherent Parallelism
- Task Parallelism
- Data Parallelism
- Pipeline Parallelism
- Map the parallelism to a given multicore
- Use all available parallelism
- Maximize load-balance
- Minimize communication
- Scalable, stable, architecture-independent
- But what about legacy code?
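The structure a stream compiler can exploit may be easier to see in a small sketch. This is plain Python, not StreamIt syntax, and the helper names (`pipeline`, `scale`, `moving_sum`, `split_join`) are hypothetical. Filters are modeled as functions from one iterator to another. The sketch runs sequentially and only illustrates pipeline composition and a data-parallel split-join over a stateless filter.

```python
def pipeline(*filters):
    """Compose filters so that each one consumes the previous one's output."""
    def run(stream):
        for f in filters:
            stream = f(stream)
        return stream
    return run

def scale(k):
    """Stateless filter: one item in, one item out."""
    def run(stream):
        for x in stream:
            yield k * x
    return run

def moving_sum(n):
    """Sliding-window filter: emits the sum of the last n items."""
    def run(stream):
        window = []
        for x in stream:
            window.append(x)
            if len(window) == n:
                yield sum(window)
                window.pop(0)
    return run

def split_join(worker, ways):
    """Round-robin split, run a stateless one-to-one worker per branch, round-robin join.

    Because the worker keeps no state, each branch could run on its own core.
    """
    def run(stream):
        items = list(stream)
        branches = [list(worker(iter(items[i::ways]))) for i in range(ways)]
        joined = []
        for j in range(max(len(b) for b in branches)):
            for b in branches:
                if j < len(b):
                    joined.append(b[j])
        return iter(joined)
    return run

# Two pipeline stages; the first is replicated three ways (data parallelism).
graph = pipeline(split_join(scale(2), ways=3), moving_sum(3))
print(list(graph(iter(range(6)))))  # prints [6, 12, 18, 24]
```

The point of the sketch is that the graph shape is explicit: stage boundaries and stateless filters are visible to a tool without any dependence analysis.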
4 SUIF Parallelizing Compiler
With Monica Lam and the Stanford SUIF team, 1993 - 1997
- Parallel Programming is hard!
- Billions of LOC written in sequential languages
- Automatically extract parallelism from sequential programs
- Heroic Analysis
- Interprocedural analysis
- Array and scalar data-flow analysis
- Reduction and recurrence recognition
- C to FORTRAN
- Achieved Best SPEC results of the day
- Vector processor (Cray C90): 540
- Uniprocessor (Digital 21164): 508
- SUIF on 8 processors (Digital 8400): 1,016
- But Techniques not robust for general use
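Reduction recognition, one of the analyses listed above, can be illustrated with a minimal sketch. This is not SUIF output; the function names are hypothetical and Python threads are used only to show the transformed structure, not to demonstrate a speedup. The sequential loop carries a dependence through `total`, but because integer addition is associative the loop can be split into independent partial sums that are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor

def sequential_sum(a):
    """Original loop: every iteration depends on the previous value of total."""
    total = 0
    for x in a:
        total += x
    return total

def parallel_sum(a, workers):
    """Transformed loop: one private partial sum per worker, then a final combine."""
    chunk = max(1, (len(a) + workers - 1) // workers)
    pieces = [a[i:i + chunk] for i in range(0, len(a), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sequential_sum, pieces))
    return sequential_sum(partials)

data = list(range(1000))
print(parallel_sum(data, workers=8) == sequential_sum(data))  # prints True
```

With floating-point data the regrouping can change rounding, which is one reason a compiler has to be careful about when it applies this rewrite.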
5 Crossing the Chasm
- Millions of legacy programs exist
- Need to help them cross the chasm to parallelism
- ISVs will not go there if the burden is too heavy
- There is no silver bullet to parallelism
- Needs a holistic approach
- Help programmers change the body, but preserve the soul → Program Reincarnation
- Critical for the success of multicores
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe]
6 Program Reincarnation
- Dynamic analysis
- Managed program execution
- Program invariant inference
- Application knowledge database
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe; Managed Program Execution; Instrumenter and Binary interpreter; Application Knowledge (program representation, invariants); .log]
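Program invariant inference from managed execution can be sketched roughly as follows. This is an illustrative toy, not the tool described on the slide: the `InvariantLog` class, the `legacy_average` function, and the program-point name are all hypothetical. Values observed at an instrumented program point are logged, and simple likely invariants (constant value or value range) are reported. Such invariants only hold for the runs that were observed.

```python
from collections import defaultdict

class InvariantLog:
    """Collects values seen at named program points during managed runs."""

    def __init__(self):
        self.samples = defaultdict(list)

    def observe(self, point, **values):
        for name, value in values.items():
            self.samples[(point, name)].append(value)

    def likely_invariants(self):
        """Summarize each variable as a constant or as an observed range."""
        result = {}
        for (point, name), values in self.samples.items():
            lo, hi = min(values), max(values)
            if lo == hi:
                result[(point, name)] = f"{name} == {lo}"
            else:
                result[(point, name)] = f"{lo} <= {name} <= {hi}"
        return result

LOG = InvariantLog()

def legacy_average(samples, window):
    # Instrumentation inserted at function entry (hypothetical program point).
    LOG.observe("legacy_average:entry", n=len(samples), window=window)
    return sum(samples[-window:]) / window

for data in ([1, 2, 3, 4], [5, 6, 7, 8, 9], [2, 4, 6, 8, 10, 12]):
    legacy_average(data, window=4)

INVARIANTS = LOG.likely_invariants()
for key, text in INVARIANTS.items():
    print(key[0], text)
```

A knowledge database in this spirit would store the inferred facts next to the program representation so that later tools can query them.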
7 Program Reincarnation
- Dynamic analysis
- Managed program execution
- Program invariant inference
- Application knowledge database
- Assisted parallelization
- GUI tool
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe; Managed Program Execution; Instrumenter and Binary interpreter; Application Knowledge (program representation, invariants); .log; Assisted Application Reincarnation Tool; Compiler Instrumenter; Reincarnated .c; .exe]
8 Program Reincarnation
- Dynamic analysis
- Managed program execution
- Program invariant inference
- Application knowledge database
- Assisted parallelization
- GUI tool
- Correctness of the reincarnated program
- Test Generation
- Divergence Analysis
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe; Managed Program Execution; Instrumenter and Binary interpreter; Application Knowledge (program representation, invariants); .log; Test Generation; .log; Divergence Analysis; Managed Program Execution; Assisted Application Reincarnation Tool; Compiler Instrumenter; Reincarnated .c; .exe]
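Test generation plus divergence analysis can be sketched as running the original and the reincarnated program on the same generated inputs and reporting every input where they disagree. This is a minimal illustration under stated assumptions: the three example programs and the helper names (`generate_tests`, `find_divergences`) are hypothetical, and only final outputs are compared rather than full execution logs.

```python
import random

def generate_tests(count, seed=0):
    """Random integer lists of length 0 to 8; seeded so runs are repeatable."""
    rng = random.Random(seed)
    return [[rng.randint(-100, 100) for _ in range(rng.randint(0, 8))]
            for _ in range(count)]

def find_divergences(original, candidate, tests):
    """Return (input, expected, actual) for every test where outputs differ."""
    diverging = []
    for test in tests:
        expected = original(list(test))
        actual = candidate(list(test))
        if expected != actual:
            diverging.append((test, expected, actual))
    return diverging

def original(values):
    """Legacy behavior: running totals, computed strictly left to right."""
    out, total = [], 0
    for v in values:
        total += v
        out.append(total)
    return out

def reincarnated(values):
    """Restructured version: two halves that could be computed independently."""
    mid = len(values) // 2
    left, right = original(values[:mid]), original(values[mid:])
    offset = left[-1] if left else 0
    return left + [offset + r for r in right]

def buggy(values):
    """Faulty restructuring: forgets to carry the left half's total across."""
    mid = len(values) // 2
    return original(values[:mid]) + original(values[mid:])

tests = generate_tests(50) + [[1, 2, 3]]
print(len(find_divergences(original, reincarnated, tests)))  # prints 0
print(find_divergences(original, buggy, [[1, 2, 3]]))
```

The last line reports the input `[1, 2, 3]` with expected `[1, 3, 6]` and actual `[1, 2, 5]`, which is the kind of evidence a programmer would use to locate the faulty transformation.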
9 Program Reincarnation
- Dynamic analysis
- Managed program execution
- Program invariant inference
- Application knowledge database
- Assisted parallelization
- GUI tool
- Correctness of the reincarnated program
- Test Generation
- Divergence Analysis
- Static analysis
- Automatic parallelization
- Information for program understanding
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe; Automatic Parallelization; Managed Program Execution; Instrumenter and Binary interpreter; Application Knowledge (program representation, invariants); Static Analysis; .log; Test Generation; .log; Divergence Analysis; Managed Program Execution; Assisted Application Reincarnation Tool; Compiler Instrumenter; Reincarnated .c; .exe]
10 Program Reincarnation
- Dynamic analysis
- Managed program execution
- Program invariant inference
- Application knowledge database
- Assisted parallelization
- GUI tool
- Correctness of the reincarnated program
- Test Generation
- Divergence Analysis
- Static analysis
- Automatic parallelization
- Information for program understanding
- Learn about the domain
- Flag domain-specific issues
- Generate domain-specific hints
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe; Domain Knowledge Database; Domain Knowledge Extraction; Automatic Parallelization; Managed Program Execution; Instrumenter and Binary interpreter; Application Knowledge (program representation, invariants); Static Analysis; .log; Known Idiom Identification; Domain Hint Generation; Test Generation; .log; Divergence Analysis; Managed Program Execution; Assisted Application Reincarnation Tool; Compiler Instrumenter; Reincarnated .c; .exe]
11 Program Reincarnation
- Dynamic analysis
- Managed program execution
- Program invariant inference
- Application knowledge database
- Assisted parallelization
- GUI tool
- Correctness of the reincarnated program
- Test Generation
- Divergence Analysis
- Static analysis
- Automatic parallelization
- Information for program understanding
- Learn about the domain
- Flag domain-specific issues
- Generate domain-specific hints
[Diagram: Legacy Program Source File; Original Compiler; Original Binary; .exe; Domain Knowledge Database; Domain Knowledge Extraction; Automatic Parallelization; Managed Program Execution; Instrumenter and Binary interpreter; Application Knowledge (program representation, invariants); Static Analysis; .log; Known Idiom Identification; Domain Hint Generation; Test Generation; .log; Divergence Analysis; Refactoring Identification; Managed Program Execution; Assisted Application Reincarnation Tool; Compiler Instrumenter; Reincarnated .c; .exe; Block Diagram Representation]
12 Program Reincarnation Example: Extracting Coarse-Grain Pipeline Parallelism in C Programs
- Streaming applications normally written in C
- Extremely complex program (function pointers etc.)
- Hard to understand by a human programmer
- Impossible by a compiler
- But the access patterns are simple, repetitive and stable
13 Approach
- Programmer Annotations
- Identifies possible pipeline stages
14 Approach
- Programmer Annotations
- Identifies possible pipeline stages
- Dynamic analysis
- Check if parallel
- Identify communication
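The dynamic-analysis step can be sketched as a pass over a recorded memory trace. This is a simplified illustration, not the actual tool: the trace format, the `analyze` function, and the location names are assumptions. Each trace entry is `(stage, op, location)` in execution order, where `stage` is the index of the annotated pipeline stage and `op` is `"r"` or `"w"`. A value flowing from an earlier stage to a later one is communication that must be inserted; a value flowing from a later stage back to an earlier one means the stages cannot simply be pipelined.

```python
def analyze(trace):
    """Classify producer-to-consumer flows between annotated stages.

    Returns (communication, backward), each a set of (writer, reader, location).
    Reads of a location last written by the same stage are stage-local state.
    """
    last_writer = {}
    communication, backward = set(), set()
    for stage, op, location in trace:
        if op == "w":
            last_writer[location] = stage
        elif location in last_writer:
            writer = last_writer[location]
            if writer < stage:
                communication.add((writer, stage, location))
            elif writer > stage:
                backward.add((writer, stage, location))
    return communication, backward

# One observed iteration of a three-stage loop body, repeated twice.
iteration = [
    (0, "w", "frame"),
    (1, "r", "frame"), (1, "r", "history"), (1, "w", "history"), (1, "w", "coeffs"),
    (2, "r", "coeffs"), (2, "w", "out"),
]
communication, backward = analyze(iteration * 2)
print(sorted(communication))  # prints [(0, 1, 'frame'), (1, 2, 'coeffs')]
print(sorted(backward))       # prints []

# A feedback value written by the last stage and read by the first one.
feedback = iteration + [(2, "w", "gain"), (0, "r", "gain")]
print(sorted(analyze(feedback)[1]))  # prints [(2, 0, 'gain')]
```

Because the result depends on the inputs that were traced, the programmer would typically rerun the analysis on more inputs and adjust the annotations until the backward set stays empty.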
15 Approach
- Programmer Annotations
- Identifies possible pipeline stages
- Dynamic analysis
- Check if parallel
- Identify communication
- Iterate until satisfied
16 Approach
- Programmer Annotations
- Identifies possible pipeline stages
- Dynamic analysis
- Check if parallel
- Identify communication
- Iterate until satisfied
- Then the tool automatically
- Generates the parallel code
- Inserts communication
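The shape of the generated parallel code can be sketched as one thread per annotated stage with a bounded queue inserted wherever the analysis found communication. This is a hand-written approximation in Python rather than generated C, and `run_pipeline` and the example stages are hypothetical. It assumes the stages raise no exceptions, since error handling is omitted.

```python
import queue
import threading

_DONE = object()  # end-of-stream marker passed down the pipeline

def run_pipeline(stages, inputs, capacity=4):
    """Run each stage in its own thread, linked by bounded FIFO queues."""
    links = [queue.Queue(maxsize=capacity) for _ in range(len(stages) + 1)]

    def feed():
        # A separate feeder thread avoids blocking the caller on a full queue.
        for item in inputs:
            links[0].put(item)
        links[0].put(_DONE)

    def work(stage, inbox, outbox):
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)
                return
            outbox.put(stage(item))

    threads = [threading.Thread(target=feed, daemon=True)]
    for i, stage in enumerate(stages):
        threads.append(threading.Thread(
            target=work, args=(stage, links[i], links[i + 1]), daemon=True))
    for t in threads:
        t.start()

    # Drain the final queue; FIFO links keep results in input order.
    results = []
    while True:
        item = links[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

def decode(x):
    return x + 1

def transform(x):
    return x * 2

print(run_pipeline([decode, transform], range(5)))  # prints [2, 4, 6, 8, 10]
```

Bounded queues let a fast stage run ahead of a slow one by a few items without unbounded buffering, which is the usual trade-off in coarse-grain pipelining.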
17 Finally.
- Stop training programmers who are oblivious to performance
- Concurrency vs. Parallelism
- We need a lot more innovation! Languages that...
- require no non-intuitive reorganization of data or code
- eliminate hard problems such as race conditions and deadlocks (akin to the elimination of memory bugs in Java)
- inform the programmer if they have done something illegal (akin to a type system or runtime null-pointer checks)
- off-load the parallelism and performance issues to the compiler (akin to ILP compilation to VLIW machines)
- take advantage of domains to reduce the parallelization burden (akin to the StreamIt language for the streaming domain)
- use novel hardware to eliminate problems and help the programmer (akin to cache coherence hardware)