COMP 308 Parallel Efficient Algorithms - PowerPoint PPT Presentation

Slides: 36
Provided by: IgorPo

1
COMP 308 Parallel Efficient Algorithms
Introduction to Parallel Computation
  • Lecturer: Dr. Igor Potapov
  • Ashton Building, room 3.15
  • E-mail: igor_at_csc.liv.ac.uk
  • COMP 308 web-page:
    http://www.csc.liv.ac.uk/igor/COMP308

2
Course Description and Objectives
  • The aim of the module is to introduce techniques
    for the design of efficient parallel algorithms
    and their implementation.

3
Learning Outcomes
  • At the end of the course you will be
  • familiar with the wide applicability of graph
    theory and tree algorithms as an abstraction for
    the analysis of many practical problems,
  • familiar with efficient parallel algorithms
    from many areas of computer science: expression
    computation, sorting, graph-theoretic problems,
    computational geometry, algorithmics of texts,
    etc.,
  • familiar with the basic issues of implementing
    parallel algorithms.
  • You will also acquire knowledge of those
    problems which have been perceived as intractable
    for parallelization.

4
Teaching method
  • Series of 30 lectures (3 hrs per week)
  • Lecture: Monday 10.00
  • Lecture: Tuesday 10.00
  • Lecture: Friday 10.00

Course Assessment
  • A two-hour examination: 80%
  • Continuous assessment (written class test and
    home assignment): 20%

5
Recommended Course Textbooks
  • Introduction to Algorithms, Cormen et al.
  • Introduction to Parallel Computing: Design and
    Analysis of Algorithms, Vipin Kumar, Ananth Grama,
    Anshul Gupta, and George Karypis, Benjamin
    Cummings, 2nd ed., 2003.
  • Efficient Parallel Algorithms, A. Gibbons,
    W. Rytter, Cambridge University Press, 1988.

Research papers (will be announced later)
6
What is Parallel Computing? (basic idea)
  • Consider the problem of stacking (reshelving) a
    set of library books.
  • A single worker trying to stack all the books in
    their proper places cannot accomplish the task
    faster than a certain rate.
  • We can speed up this process, however, by
    employing more than one worker.

7
Solution 1
  • Assume that books are organized into shelves and
    that the shelves are grouped into bays
  • One simple way to assign the task to the workers
    is to divide the books equally among them.
  • Each worker stacks the books one at a time.
  • This division of work may not be the most
    efficient way to accomplish the task, since the
    workers must walk all over the library to stack
    books.

8
Solution 2
Instance of task partitioning
  • An alternative way to divide the work is to
    assign a fixed and disjoint set of bays to each
    worker.
  • As before, each worker is assigned an equal
    number of books arbitrarily.
  • If a worker finds a book that belongs to a bay
    assigned to him or her, he or she places that
    book in its assigned spot.
  • Otherwise, he or she passes it on to the worker
    responsible for the bay it belongs to.
  • The second approach requires less effort from
    individual workers.

Instance of Communication task
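
The two reshelving strategies can be compared with a small simulation. This is only an illustrative sketch: the function names, the representation of a book as the index of the bay it belongs to, and the two cost measures (bays visited, books passed on) are assumptions made here and are not part of the lecture.

```python
import random

def solution1_bays_visited(books, workers):
    """Solution 1: deal the books out evenly; each worker shelves its own share.

    `books` is a list of bay indices (hypothetical encoding: one entry per book).
    Returns, for each worker, how many distinct bays that worker must walk to.
    """
    shares = [books[w::workers] for w in range(workers)]
    return [len(set(share)) for share in shares]

def solution2_handoffs(books, workers, bays):
    """Solution 2: worker w owns a fixed, disjoint block of bays.

    Books are still dealt out arbitrarily; a book whose bay belongs to another
    worker is passed on. Returns the number of books passed on (messages).
    Assumes `workers` divides `bays` evenly.
    """
    bays_per_worker = bays // workers
    handoffs = 0
    for w in range(workers):
        for bay in books[w::workers]:
            if bay // bays_per_worker != w:   # bay is owned by someone else
                handoffs += 1
    return handoffs

if __name__ == "__main__":
    random.seed(0)
    bays, workers = 12, 4
    books = [random.randrange(bays) for _ in range(200)]
    print("bays visited per worker (Solution 1):",
          solution1_bays_visited(books, workers))
    print("books passed on (Solution 2):",
          solution2_handoffs(books, workers, bays))
```

In this model Solution 1 has no communication but sends every worker to most bays, while Solution 2 confines each worker to its own bays at the price of hand-offs between workers.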
9
Problems are parallelizable to different degrees
  • For some problems, assigning partitions to other
    processors might be more time-consuming than
    performing the processing locally.
  • Other problems may be completely serial.
  • For example, consider the task of digging a post
    hole.
  • Although one person can dig a hole in a certain
    amount of time, employing more people does not
    reduce this time.

10
Power of parallel solutions
  • Pile collection
  • Ants/robots with very limited abilities
  • (each can see only its neighbourhood)
  • Grid environment
  • (sticks and robots)

Move(): move randomly until the robot sees a
stick in its neighbourhood
Collect(): Move(); pick up a stick; Move(); put
it down; Collect()
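
The Move()/Collect() rules can be tried out in a small simulation. The sketch below rests on several assumptions that the slide does not state: the grid wraps around, a robot's "neighbourhood" is reduced to the cell it stands on, robots act in round-robin order, and the recursive Collect() is unrolled into a loop. All names (`simulate`, `pile_count`) are hypothetical.

```python
import random
from collections import Counter

def simulate(size, sticks, robots, steps, seed=0):
    """Simulate stick-collecting robots on a size x size wrap-around grid.

    sticks: Counter mapping (row, col) -> number of sticks in that cell.
    Returns (sticks_on_grid, number_of_sticks_still_carried).
    """
    rng = random.Random(seed)
    sticks = Counter(sticks)                       # work on a copy
    pos = [(rng.randrange(size), rng.randrange(size)) for _ in range(robots)]
    carrying = [False] * robots

    for _ in range(steps):
        for r in range(robots):                    # every robot acts once per round
            # Move(): one random step (assumed: up, down, left or right)
            dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
            pos[r] = ((pos[r][0] + dr) % size, (pos[r][1] + dc) % size)
            here = pos[r]
            if sticks[here] > 0:                   # the robot sees a stick
                if carrying[r]:
                    sticks[here] += 1              # put it down on the pile
                    carrying[r] = False
                else:
                    sticks[here] -= 1              # pick one up
                    carrying[r] = True
    return sticks, sum(carrying)

def pile_count(sticks):
    """Number of grid cells that hold at least one stick."""
    return sum(1 for n in sticks.values() if n > 0)

if __name__ == "__main__":
    rng = random.Random(1)
    start = Counter((rng.randrange(20), rng.randrange(20)) for _ in range(60))
    final, carried = simulate(20, start, robots=10, steps=20000)
    print("piles before:", pile_count(start), "after:", pile_count(final))
```

Because a stick is only ever put down on a cell that already holds one, the number of piles in this model can never grow, which is why very simple robots working in parallel end up gathering the sticks together.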
11
Sorting in nature
6 2 1 3 5 7 4
12
Parallel Processing (several processing elements
working to solve a single problem)
  • Primary consideration: elapsed time
  • NOT throughput, sharing resources, etc.
  • Downside: complexity
  • system, algorithm design
  • Elapsed time = computation time
    + communication time
    + synchronization time

13
Design of efficient algorithms
  • A parallel computer is of little use unless
    efficient parallel algorithms are available.
  • The issues in designing parallel algorithms are
    very different from those in designing their
    sequential counterparts.
  • A significant amount of work is being done to
    develop efficient parallel algorithms for a
    variety of parallel architectures.

14
Processor Trends
  • Moore's Law
  • performance doubles every 18 months
  • Parallelization within processors
  • pipelining
  • multiple pipelines

15
Why Parallel Computing
  • Practical
  • Moore's Law cannot hold forever
  • Problems must be solved immediately
  • Cost-effectiveness
  • Scalability
  • Theoretical
  • challenging problems

16
Some Complex Problems
  • N-body simulation
  • Atmospheric simulation
  • Image generation
  • Oil exploration
  • Financial processing
  • Computational biology

17
Some Complex Problems
  • N-body simulation
  • O(n log n) time
  • galaxy of about $10^{11}$ stars: approx. one
    year per iteration
  • Atmospheric simulation
  • 3D grid, each element interacts with neighbors
  • 1x1x1 mile elements: about $5 \times 10^8$
    elements
  • 10-day simulation requires approx. 100 days

18
Some Complex Problems
  • Image generation
  • animation, special effects
  • several minutes of video: 50 days of rendering
  • Oil exploration
  • large amounts of seismic data to be processed
  • months of sequential exploration

19
Some Complex Problems
  • Financial processing
  • market prediction, investing
  • Cornell Theory Center, Renaissance Tech.
  • Computational biology
  • drug design
  • gene sequencing (Celera)
  • structure prediction (Proteomics)

20
Fundamental Issues
  • Is the problem amenable to parallelization?
  • How to decompose the problem to exploit
    parallelism?
  • What machine architecture should be used?
  • What parallel resources are available?
  • What kind of speedup is desired?

21
Two Kinds of Parallelism
  • Pragmatic
  • goal is to speed up a given computation as much
    as possible
  • problem-specific
  • techniques include
  • overlapping instructions (multiple pipelines)
  • overlapping I/O operations (RAID systems)
  • traditional (asymptotic) parallelism techniques

22
Two Kinds of Parallelism
  • Asymptotic
  • studies
  • architectures for general parallel computation
  • parallel algorithms for fundamental problems
  • limits of parallelization
  • can be subdivided into three main areas

23
Asymptotic Parallelism
  • Models
  • comparing/evaluating different architectures
  • Algorithm Design
  • utilizing a given architecture to solve a given
    problem
  • Computational Complexity
  • classifying problems according to their difficulty

24
Architecture
  • Single processor
  • single instruction stream
  • single data stream
  • von Neumann model
  • Multiple processors
  • Flynn's taxonomy

25
Flynn's Taxonomy

                            1 data stream   Many data streams
Many instruction streams    MISD            MIMD
1 instruction stream        SISD            SIMD
26
(No Transcript)
27
Parallel Architectures
  • Multiple processing elements
  • Memory
  • shared
  • distributed
  • hybrid
  • Control
  • centralized
  • distributed

28
Parallel vs Distributed Computing
  • Parallel
  • several processing elements concurrently solving
    the same single problem
  • Distributed
  • processing elements do not share memory or system
    clock
  • Which is the subset of which?
  • distributed is a subset of parallel

29
Efficient and optimal parallel algorithms
  • A parallel algorithm is efficient iff
  • it is fast (e.g. polynomial time) and
  • the product of the parallel time and the number
    of processors is close to the time of the best
    known sequential algorithm
  • $T_{\text{sequential}} \approx T_{\text{parallel}} \times N_{\text{processors}}$
  • A parallel algorithm is optimal iff this product
    is of the same order as the best known sequential
    time

30
Metrics
A measure of relative performance between a
multiprocessor system and a single-processor
system is the speed-up S(p), defined as follows:

$$S(p) = \frac{\text{execution time using a single-processor system}}{\text{execution time using a multiprocessor with } p \text{ processors}} = \frac{T_1}{T_p}$$

Efficiency: $E_p = \frac{S(p)}{p}$

Cost: $C_p = p \times T_p$
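
The three metrics are simple ratios and products of measured times, so they can be written down directly. A minimal sketch follows; the function names and the timings used in the example are made up for illustration and do not come from the lecture.

```python
def speedup(t1, tp):
    """S(p) = T1 / Tp: single-processor time over p-processor time."""
    return t1 / tp

def efficiency(t1, tp, p):
    """E_p = S(p) / p: fraction of the ideal p-fold speed-up achieved."""
    return speedup(t1, tp) / p

def cost(tp, p):
    """C_p = p * Tp: total processor time spent by the parallel run."""
    return p * tp

if __name__ == "__main__":
    # Hypothetical measurements: 100 s sequentially, 25 s on 8 processors.
    t1, tp, p = 100.0, 25.0, 8
    print("speed-up:  ", speedup(t1, tp))        # 4.0
    print("efficiency:", efficiency(t1, tp, p))  # 0.5
    print("cost:      ", cost(tp, p))            # 200.0
```

In this made-up case the cost (200 s of processor time) is twice the sequential time, which is another way of saying the efficiency is 50%.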
31
Metrics
  • A parallel algorithm is cost-optimal when
  • parallel cost = sequential time
  • $C_p = T_1$
  • $E_p = 100\%$
  • Critical when down-scaling
  • parallel implementation may
  • become slower than sequential
  • $T_1 = n^3$
  • $T_p = n^{2.5}$ when $p = n^2$
  • $C_p = n^{4.5}$
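
One way to read the down-scaling example is sketched below. It assumes the standard simulation argument, which the slide does not spell out: $q < p$ processors emulate the $p$-processor algorithm by each doing the work of $p/q$ of them, so the cost stays at $C_p$. The symbols $q$ and $T_q$ are introduced here for illustration.

```latex
\begin{align*}
C_p &= p \times T_p = n^{2} \times n^{2.5} = n^{4.5} \\
T_q &\approx \frac{p}{q}\, T_p = \frac{C_p}{q} = \frac{n^{4.5}}{q} \\
T_q < T_1 &\iff \frac{n^{4.5}}{q} < n^{3} \iff q > n^{1.5}
\end{align*}
```

Under this assumption, running the algorithm on fewer than about $n^{1.5}$ processors is slower than the $n^3$ sequential algorithm, which is the danger of an algorithm that is not cost-optimal.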

32
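Amdahl's Law
  • $f$ = fraction of the problem that's inherently
    sequential
  • $(1 - f)$ = fraction that's parallel
  • Parallel time: $T_p = f\,T_1 + \frac{(1 - f)\,T_1}{p}$
  • Speedup with p processors:
    $S(p) = \frac{T_1}{T_p} = \frac{1}{f + \frac{1 - f}{p}}$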

33
What kind of speed-up may be achieved?
  • Part f is computed by a single processor
  • Part (1 - f) is computed by p processors, p > 1
  • Basic observation: increasing p cannot speed up
    part f.
34
Amdahl's Law
  • Upper bound on speedup ($p \to \infty$):
    $S \le \frac{1}{f}$
  • Example
  • $f = 2\%$
  • $S = 1 / 0.02 = 50$
35
The main open question
  • The basic parallel complexity class is NC.
  • NC is a class of problems computable in
    poly-logarithmic time ($\log^c n$, for a constant
    $c$) using a polynomial number of processors.
  • P is a class of problems computable sequentially
    in polynomial time.

The main open question in parallel computation
is: NC = P?