Transcript and Presenter's Notes

Title: 159.703 Parallel Computing


1
159.703 Parallel Computing
  • Section 2 Principles of Parallel Algorithm
    Design
  • Dr. Ruili Wang
  • IIST, A/H 3.82
  • PN321, Massey
  • Email: r.wang@massey.ac.nz
  • Phone: 2548

2
Information
  • http://www-ist.massey.ac.nz/rwang/teach/PC.htm
  • One assignment (35%)
  • One presentation (15%)

3
Parallel Programming
  • Parallel programming involves three activities
    (sketched below)
    • Decomposing an algorithm or data into parts
    • Distributing the parts as tasks that are worked
      on by multiple processors simultaneously
    • Coordinating the work and communications of
      those processors
  • Parallel programming considerations
    • Type of parallel architecture being used
    • Type of processor communications used
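
A minimal sketch of these three activities, assuming Python's standard
multiprocessing module; the square function, the data, and the worker
count are illustrative, not from the slides:

    import multiprocessing as mp

    def square(x):
        # One unit of work: each worker applies this to its share of the data.
        return x * x

    if __name__ == "__main__":
        data = list(range(16))                # decompose: the data to split up
        with mp.Pool(processes=4) as pool:    # distribute: 4 worker processes
            results = pool.map(square, data)  # coordinate: gather the results
        print(results)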

4
Algorithm Design
  • A sequential algorithm specifies a sequence of
    basic steps for solving a given problem on a
    serial computer
  • A parallel algorithm specifies sets of steps
    that can be executed simultaneously
    (concurrently) for solving a given problem on a
    parallel computer

5
Your Onus
  • Identifying portions of the work that can be
    performed concurrently (Decomposition)
  • Mapping the concurrent pieces of work onto
    multiple processors running in parallel
  • Distributing the input, output and intermediate
    data associated with the program
  • Managing accesses to data shared by multiple
    processors
  • Synchronizing the processors at various stages of
    the parallel execution

6
Design of parallel algorithms
  • Two Key Steps
  • Dividing a computation into smaller computations
  • Assigning them to different processors

7
Terminology
  • Decomposition is the process of dividing a
    computation into smaller parts, some or all of
    which may potentially be executed in parallel.
  • Tasks are programmer-defined units of computation
    into which the main computation is subdivided by
    means of decomposition

8
Dense Matrix-vector Multiplication
  • Consider the multiplication of a dense n × n
    matrix A with a vector b to yield another
    vector y.
  • The i-th element of y is y[i] = Σ_j A[i, j] b[j]
  • The computation of each y[i] can be regarded as
    a task (see the sketch below)
  • Alternatively, the computation can be decomposed
    into fewer tasks, each of which computes several
    elements of y.
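
A sketch of the finest-grained decomposition, in which computing one
y[i] is one task; the 4 × 4 data and the use of a process pool are
illustrative assumptions:

    import multiprocessing as mp

    n = 4
    A = [[i + j for j in range(n)] for i in range(n)]  # dense n x n matrix
    b = list(range(n))                                 # the vector b

    def task(i):
        # Task i computes the single element y[i] = sum_j A[i][j] * b[j].
        return sum(A[i][j] * b[j] for j in range(n))

    if __name__ == "__main__":
        with mp.Pool() as pool:
            y = pool.map(task, range(n))  # n concurrent tasks, one per row
        print(y)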

9
Terminology
  • A task-dependency graph is an abstraction used
    to express dependencies among tasks and their
    relative order of execution.
  • It is a directed acyclic graph in which the
    nodes represent tasks and the directed edges
    indicate dependencies among them.
  • The task corresponding to a node can be executed
    only when all tasks connected to this node by
    incoming edges have completed.
  • See the example (and the sketch below)
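
A minimal sketch of such a graph and of an execution order that honors
its dependencies; the four-task graph is an illustrative assumption:

    from collections import deque

    # Directed acyclic graph: deps[v] lists the tasks v depends on,
    # i.e. the sources of v's incoming edges.
    deps = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}

    def schedule(deps):
        # A task becomes ready only once every task on its incoming
        # edges has completed (Kahn's topological ordering).
        indeg = {t: len(d) for t, d in deps.items()}
        ready = deque(t for t, k in indeg.items() if k == 0)
        order = []
        while ready:
            t = ready.popleft()
            order.append(t)  # "execute" task t
            for v, d in deps.items():
                if t in d:
                    indeg[v] -= 1
                    if indeg[v] == 0:
                        ready.append(v)
        return order

    print(schedule(deps))  # ['A', 'B', 'C', 'D']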

10
Granularity
  • The number and size of the tasks into which a
    problem is decomposed determine the granularity
    of the decomposition (see the sketch below)
    • Fine-grained: many small tasks
    • Coarse-grained: few large tasks
  • The appropriate granularity depends on
    • The computation
    • The number of processors
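
A sketch of how a chunk size controls granularity; the item counts are
illustrative assumptions:

    def decompose(work_items, chunk_size):
        # Smaller chunks -> finer granularity, more tasks;
        # larger chunks -> coarser granularity, fewer tasks.
        return [work_items[i:i + chunk_size]
                for i in range(0, len(work_items), chunk_size)]

    items = list(range(12))
    print(len(decompose(items, 1)))  # 12 fine-grained tasks
    print(len(decompose(items, 6)))  # 2 coarse-grained tasks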

11
Degree of Concurrency (DC)
  • The maximum number of tasks that can be executed
    simultaneously in a parallel program at any given
    time is the maximum degree of concurrency
  • Is the maximum degree of concurrency less than
    or greater than the total number of tasks?
  • It is usually smaller, due to dependencies among
    the tasks

12
Average Degree of Concurrency
  • The average degree of concurrency (ADC) is the
    average number of tasks that can run concurrently
    over the entire duration of execution of the
    program.
  • Do both DC and ADC increase as the granularity
    of tasks becomes finer, or coarser? (Usually, as
    it becomes finer.)
  • The same granularity does not guarantee the same
    degree of concurrency.
  • Again, see the example and the task-dependency
    graph

13
Task-dependency Graph
  • Start nodes: nodes with no incoming edges
  • Finish nodes: nodes with no outgoing edges
  • Critical path: the longest directed path between
    any pair of start and finish nodes
  • Critical path length: the sum of the weights of
    the nodes along this path
  • The ratio of the total amount of work to the
    critical path length is the average degree of
    concurrency (see the sketch below)
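
A sketch that computes both quantities for a small weighted task graph;
the node weights and edges are illustrative assumptions:

    from functools import lru_cache

    # Node weight = amount of work in that task; deps[v] lists the
    # tasks that v depends on.
    weights = {"A": 10, "B": 10, "C": 6, "D": 4}
    deps = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}

    @lru_cache(maxsize=None)
    def longest_path_to(t):
        # Weight of the heaviest path from any start node down to task t.
        return weights[t] + max((longest_path_to(p) for p in deps[t]),
                                default=0)

    total_work = sum(weights.values())
    critical_path_length = max(longest_path_to(t) for t in deps)
    print(critical_path_length)               # 20 (A -> C -> D)
    print(total_work / critical_path_length)  # 1.5, the ADC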

14
Unbounded speedup?
  • There is an inherent bound on how fine-grained a
    decomposition a problem permits,
    i.e., limited granularity and degree of
    concurrency
  • Interactions occur among the tasks running on
    different physical processors,
    as they share input, output, and intermediate
    data and have dependencies, i.e., the output of
    one task is the input of another task.

15
Task interaction graph
  • It indicates the pattern of interaction among
    tasks
  • Again,
    • Nodes represent tasks
    • Edges connect tasks that interact with each
      other
    • Edges are undirected, but directed edges can be
      used to indicate the direction of the flow of
      data, if it is unidirectional
  • The edge set of a task-interaction graph is
    usually a superset of the edge set of the
    task-dependency graph. See the example (and the
    sketch below)
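
A minimal way to represent such a graph; the edges are an illustrative
assumption:

    # Undirected task-interaction graph as adjacency sets: an edge
    # means the two tasks share or exchange data.
    interact = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}

    # Tasks that Task "C" must communicate with:
    print(sorted(interact["C"]))  # ['A', 'B', 'D']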

16
Sparse matrix-vector multiplication
  • In addition to assigning y[i] to Task i, we also
    make it the owner of row A[i, *] of the matrix
    and of the element b[i]
  • The computation of y[i] requires access to many
    elements owned by other tasks, i.e., Task i must
    get these elements from the appropriate locations
    (see the sketch below)
  • In the message-passing paradigm, with the
    ownership of b[i], Task i also inherits the
    responsibility of sending b[i] to all the other
    tasks
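
A sketch of this ownership scheme; the 3 × 3 sparse data are an
illustrative assumption, and a real message-passing version (e.g. MPI)
would replace the dictionary lookup with an actual send and receive:

    # Each Task i owns row i, stored sparsely as (j, value) pairs
    # for the nonzero entries, plus the element b[i].
    rows = {0: [(0, 2.0), (2, 1.0)],
            1: [(1, 3.0)],
            2: [(0, 4.0), (2, 5.0)]}
    b = {0: 1.0, 1: 2.0, 2: 3.0}  # b[i] owned by Task i

    def compute_y(i):
        # Task i needs b[j] from the owning task for every nonzero
        # A[i][j]; here the "message" is just a dictionary lookup.
        return sum(a_ij * b[j] for j, a_ij in rows[i])

    print([compute_y(i) for i in rows])  # [5.0, 6.0, 19.0]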

17
Processes and Mapping
  • Process refers to a processing or computing agent
    that performs tasks
  • It is an abstract entity that uses the code and
    data corresponding to a task to produce the
    output of that task within a finite amount of
    time after the task is activated by the parallel
    program
  • In addition to performing computations, a
    process may synchronise or communicate with other
    processes

18
Processes and Mapping
  • The mechanism by which tasks are assigned to
    processes for execution is called mapping
  • A good mapping should
    • Maximize the use of concurrency by mapping
      independent tasks onto different processes
    • Minimize the total completion time by ensuring
      that processes are available to execute the
      tasks on the critical path as soon as such
      tasks become executable
    • Minimize interaction among processes by mapping
      tasks with a high degree of mutual interaction
      onto the same process (see the sketch below)
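
A toy sketch of the last criterion, co-locating heavily interacting
tasks; the interaction weights and the greedy strategy are illustrative
assumptions:

    # interaction[(t1, t2)] = how heavily tasks t1 and t2 communicate.
    interaction = {("A", "B"): 9, ("C", "D"): 8,
                   ("A", "C"): 1, ("B", "D"): 1}

    def map_tasks(tasks, n_procs):
        # Greedy: place the most heavily interacting pairs on the same
        # process first, so their traffic never crosses processes.
        mapping, next_proc = {}, 0
        for (t1, t2), _ in sorted(interaction.items(),
                                  key=lambda kv: -kv[1]):
            if t1 not in mapping and t2 not in mapping:
                mapping[t1] = mapping[t2] = next_proc % n_procs
                next_proc += 1
        for t in tasks:  # any task not yet placed
            mapping.setdefault(t, next_proc % n_procs)
        return mapping

    print(map_tasks(["A", "B", "C", "D"], 2))
    # {'A': 0, 'B': 0, 'C': 1, 'D': 1}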

19
Processes vs Processors
  • Processes are logical computing agents
  • Processors are the hardware units that physically
    perform computations.
  • A high-level abstraction may be required to
    express a parallel algorithm if it is a complex
    algorithm with multiple stages or with different
    forms of parallelism
  • e.g., a parallel computer in which each node is a
    shared-address-space module

20
Next Session
  • Decomposition Techniques

21
Research 1
  • What is research?
  • What research is required for an MSc and a PhD?
  • Who can perform a research task?
  • How to start?