CISC 372 5 October - PowerPoint PPT Presentation

Slides: 26
Provided by: mikej9
1
CISC 372 5 October
  • Goals for today
  • Foster's parallel algorithm design
  • Partitioning
  • Task dependency graph
  • Granularity
  • Concurrency
  • Collective communication

2
Task/Channel Model
3
Parallel Algorithm Design
  • Large problem
    • May be complex
    • May have large data set
    • May be embarrassingly parallel
  • How to solve the problem? (Foster's methodology)
    • Partition problem data
    • Determine communication requirements
    • Agglomerate tasks into a more efficient size
    • Map tasks to processors

4
Foster's Methodology
5
Partitioning
  • Dividing computation and data into pieces
  • Domain decomposition
    • Divide data into pieces
    • Determine how to associate computations with the
      data
  • Functional decomposition
    • Divide computation into pieces
    • Determine how to associate data with the
      computations
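
As an illustration of domain decomposition, the sketch below divides n data items into p nearly equal contiguous blocks, one per task. The function name `block_range` and the sizes n = 10, p = 4 are assumptions made for this example; they are not from the slides.

```python
def block_range(rank, p, n):
    """Return the half-open index range [lo, hi) of the block owned by task `rank`."""
    lo = rank * n // p
    hi = (rank + 1) * n // p
    return lo, hi


n, p = 10, 4  # hypothetical problem size and task count
pieces = [block_range(rank, p, n) for rank in range(p)]
print(pieces)  # [(0, 2), (2, 5), (5, 7), (7, 10)]
```

Each task would then run the same computation over its own index range, which is the "associate computations with the data" step. Block sizes differ by at most one item.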

6
Partitioning Checklist
  • At least 10x more primitive tasks than processors
    in target computer
  • Minimize redundant computations and redundant
    data storage
  • Primitive tasks roughly the same size
  • Number of tasks an increasing function of problem
    size

7
Foster's Methodology
8
Communication
  • Determine values passed among tasks
  • Local communication
    • Task needs values from a small number of other
      tasks
    • Create channels illustrating data flow
  • Global communication
    • Significant number of tasks contribute data to
      perform a computation
    • Don't create channels for them early in design
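
The two communication patterns can be contrasted with a small serial simulation. This is a sketch only: the 1-D chain of tasks, the helper names `neighbors` and `tree_reduce_sum`, and the choice of a sum as the global computation are all assumptions made for illustration.

```python
def neighbors(rank, p):
    """Local communication: the tasks that `rank` exchanges values with in a 1-D chain."""
    return [r for r in (rank - 1, rank + 1) if 0 <= r < p]


def tree_reduce_sum(values):
    """Global communication: combine one value per task by pairwise sums.

    Assumes a non-empty sequence. Returns (total, number_of_rounds).
    """
    vals = list(values)
    rounds = 0
    stride = 1
    while stride < len(vals):
        # In each round, task i receives the partial sum held by task i + stride.
        for i in range(0, len(vals) - stride, 2 * stride):
            vals[i] += vals[i + stride]
        stride *= 2
        rounds += 1
    return vals[0], rounds


print(neighbors(0, 4), neighbors(2, 4))  # [1] [1, 3]
print(tree_reduce_sum(range(1, 9)))      # (36, 3)
```

With 8 tasks the global sum finishes in 3 rounds rather than 7 sequential steps, which is why such patterns are usually left to a collective operation instead of hand-drawn channels.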

9
Communication Checklist
  • Communication operations balanced among tasks
  • Each task communicates with only small group of
    neighbors
  • Tasks can perform communications concurrently
  • Tasks can perform computations concurrently

10
Foster's Methodology
11
Agglomeration
  • Grouping tasks into larger tasks
  • Goals
    • Improve performance
    • Maintain scalability of program
    • Simplify programming
  • In MPI programming, the goal is often to create one
    agglomerated task per processor

12
Agglomeration to Improve Performance
  • Eliminate communication between primitive tasks
    agglomerated into consolidated task
  • Combine groups of sending and receiving tasks
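
To see how agglomeration eliminates communication, the sketch below counts the channels that still cross task boundaries before and after grouping. The setup is assumed for illustration: an n x n grid of primitive tasks with channels between horizontal and vertical neighbors, agglomerated into p row strips. The name `count_cross_channels` is hypothetical.

```python
def count_cross_channels(n, owner):
    """Count neighbor channels in an n x n grid whose two endpoints have different owners."""
    crossing = 0
    for i in range(n):
        for j in range(n):
            if i + 1 < n and owner(i, j) != owner(i + 1, j):
                crossing += 1
            if j + 1 < n and owner(i, j) != owner(i, j + 1):
                crossing += 1
    return crossing


n, p = 8, 4  # hypothetical grid size and number of agglomerated tasks
primitive = count_cross_channels(n, lambda i, j: (i, j))   # every primitive task on its own
strips = count_cross_channels(n, lambda i, j: i * p // n)  # p strips of n/p rows each
print(primitive, strips)  # 112 24
```

Channels inside a strip become ordinary memory accesses; only the channels along strip boundaries remain as messages.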

13
Agglomeration Checklist
  • Locality of parallel algorithm has increased
  • Replicated computations take less time than
    communications they replace
  • Data replication doesn't affect scalability
  • Agglomerated tasks have similar computational and
    communications costs
  • Number of tasks increases with problem size
  • Number of tasks suitable for likely target
    systems
  • Tradeoff between agglomeration and code
    modification costs is reasonable

14
Foster's Methodology
15
Mapping
  • Process of assigning tasks to processors
  • Centralized multiprocessor: mapping done by
    operating system
  • Distributed memory system: mapping done by user
  • Conflicting goals of mapping
    • Maximize processor utilization
    • Minimize interprocessor communication
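
The conflict between the two mapping goals shows up even in a toy case. The sketch below assumes a chain of 12 tasks with unequal costs, where each task communicates with the next one, and compares a block mapping with a cyclic mapping onto 3 processors. The cost values and the helper `evaluate` are assumptions made for this example.

```python
def evaluate(mapping, work, p):
    """Return (heaviest processor load, number of chain channels that cross processors)."""
    loads = [0] * p
    for task, proc in enumerate(mapping):
        loads[proc] += work[task]
    cut = sum(1 for t in range(len(mapping) - 1) if mapping[t] != mapping[t + 1])
    return max(loads), cut


work = list(range(1, 13))  # hypothetical costs: task t takes t + 1 time units
p = 3
block = [t * p // len(work) for t in range(len(work))]  # contiguous groups of tasks
cyclic = [t % p for t in range(len(work))]              # tasks dealt out round-robin

print(evaluate(block, work, p))   # (42, 2)
print(evaluate(cyclic, work, p))  # (30, 11)
```

The block mapping keeps communication low (2 crossing channels) but leaves one processor with 42 units of work. The cyclic mapping lowers the heaviest load to 30 but makes all 11 channels interprocessor.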

16
Mapping Example
17
Optimal Mapping
  • Finding optimal mapping is NP-hard
  • Must rely on heuristics
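
One common heuristic of this kind is the greedy "longest processing time first" rule: sort tasks by decreasing cost and always give the next task to the least-loaded processor. The slides do not name a specific heuristic, so this choice, the function name `lpt_mapping`, and the task costs are assumptions. The sketch also ignores communication costs entirely.

```python
import heapq


def lpt_mapping(costs, p):
    """Greedy longest-processing-time-first mapping.

    Returns (assignment, makespan), where assignment maps task index to processor.
    The result is a heuristic answer and is not guaranteed to be optimal.
    """
    heap = [(0, proc) for proc in range(p)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {}
    for task in sorted(range(len(costs)), key=lambda t: -costs[t]):
        load, proc = heapq.heappop(heap)      # least-loaded processor so far
        assignment[task] = proc
        heapq.heappush(heap, (load + costs[task], proc))
    return assignment, max(load for load, _ in heap)


costs = [7, 5, 4, 4, 3, 3, 2]  # hypothetical task costs
assignment, makespan = lpt_mapping(costs, 3)
print(makespan)  # 10
```

Here the heuristic happens to reach the lower bound of ceil(28 / 3) = 10, but in general it only promises a reasonable mapping in a short time.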

18
Mapping Checklist
  • Considered designs based on one task per
    processor and multiple tasks per processor
  • Evaluated static and dynamic task allocation
  • If dynamic task allocation chosen, task allocator
    is not a bottleneck to performance
  • If static task allocation chosen, ratio of tasks
    to processors is at least 10:1
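
The difference between static and dynamic task allocation can be sketched with a small simulation. It assumes task costs are not known in advance, that a static scheme hands each processor a contiguous block of tasks, and that a dynamic scheme gives the next task to whichever processor becomes idle first. Allocator overhead is ignored, and the function names and costs are hypothetical.

```python
import heapq


def static_block_finish(costs, p):
    """Finish time when each processor gets a fixed contiguous block of tasks."""
    n = len(costs)
    return max(sum(costs[r * n // p:(r + 1) * n // p]) for r in range(p))


def dynamic_finish(costs, p):
    """Finish time when the next task always goes to the first idle processor."""
    idle_at = [0] * p  # time at which each processor becomes idle (a valid min-heap)
    for cost in costs:
        start = heapq.heappop(idle_at)
        heapq.heappush(idle_at, start + cost)
    return max(idle_at)


costs = [9, 1, 1, 1, 8, 1, 1, 1, 1, 1, 1, 1]  # hypothetical, uneven task costs
print(static_block_finish(costs, 3))  # 12
print(dynamic_finish(costs, 3))       # 9
```

Dynamic allocation absorbs the two expensive tasks, but in a real program every hand-off passes through the allocator, which is why the checklist asks that it not become a bottleneck.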

19
Complex Mesh Decomposition
  • Mesh cube with hollow sphere inside

Cross-section of Cube
  • Finite number of tetrahedra
  • Tetrahedra vary in size

20
Tetra-he-who?
  • Tetrahedron: a solid having four triangular faces

Maybe not this.
This is a tetrahedron.
21
Static Task Number, Unstructured Communication, Partitioned with METIS
  • Mesh Cube

22
Edge Detection
  • Finite number of pixels
  • All pixels same size
  • All pixel values constrained (0-255)
  • Stencil computation
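
A stencil computation updates each pixel from a fixed pattern of its neighbors. The slide does not say which edge detector is meant; the sketch below assumes the 3x3 Sobel operator on a grayscale image stored as a list of rows, and clamps the result to the 0-255 range. The function name `sobel_magnitude` is hypothetical.

```python
def sobel_magnitude(img):
    """Apply the 3x3 Sobel stencil to interior pixels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


image = [[0, 0, 255, 255] for _ in range(4)]  # dark left half, bright right half
edges = sobel_magnitude(image)
print(edges[1])  # [0, 255, 255, 0]
```

Every pixel needs the same amount of work and only its eight neighbors, so a block decomposition with a one-pixel border exchange is a natural fit.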

23
Electromagnetic Fields
  • Rough surface sphere
  • Radiation source at center
  • Measure strength on surface
  • Finite number of points
  • Laplace's equation (Jacobi method: iterate until
    the solution converges)
  • Convergence time varies at each point
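
The Jacobi method replaces each interior point with the average of its four neighbors and repeats until the values stop changing. The sketch below assumes a small flat rectangular grid with fixed boundary values rather than the rough sphere on the slide; the tolerance, grid size, and function name `jacobi` are also assumptions.

```python
def jacobi(grid, tol=1e-6, max_iters=10_000):
    """Iterate the Jacobi update for Laplace's equation with fixed boundary values.

    Returns (final_grid, iterations_used).
    """
    h, w = len(grid), len(grid[0])
    for iteration in range(1, max_iters + 1):
        new = [row[:] for row in grid]  # boundary rows and columns are copied unchanged
        delta = 0.0
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
                delta = max(delta, abs(new[i][j] - grid[i][j]))
        grid = new
        if delta < tol:  # largest change this sweep is below the tolerance
            return grid, iteration
    return grid, max_iters


# Hypothetical 5 x 5 grid: top edge held at 100, all other boundary points at 0.
start = [[100.0] * 5] + [[0.0] * 5 for _ in range(4)]
solution, iterations = jacobi(start)
print(iterations, round(solution[2][2], 3))
```

This version stops on a single global test. When convergence time varies from point to point, as the slide notes, some tasks finish their useful work early, which is what makes the mapping decision harder.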

24
Parallel Tree Search
  • Unbalanced tree
  • Nodes may change from active to inactive at any
    time
  • Searching for specific object in set of all
    active nodes

25
Online Sort
  • Receive list in separate pieces at different
    times (constant update)
  • Keep list sorted at all times
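
A serial baseline for this problem is to insert each arriving item at its sorted position, so the list is in order after every update. The sketch below uses the standard `bisect` module; the function name `absorb` and the sample pieces are assumptions made for illustration.

```python
import bisect


def absorb(sorted_list, piece):
    """Insert every item of an arriving piece so that sorted_list stays sorted."""
    for item in piece:
        bisect.insort(sorted_list, item)
    return sorted_list


current = []
for piece in ([5, 1, 9], [4], [7, 2, 8]):  # hypothetical pieces arriving over time
    absorb(current, piece)
    print(current)
```

A parallel design would have to decide which task owns which range of values and how arriving pieces are routed to their owners.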