1
Parallel Image Processing
  • Programming and Architecture

IST PhD Lunch Seminar
Wouter Caarls
Quantitative Imaging Group
2
Why Parallel?
  • Processing time
  • Smaller timesteps, more scales, faster response
    times
  • Memory
  • Larger images, more dimensions
  • Energy consumption
  • More applications, smaller devices

3
Data parallelism
  • Many image processing operations have locality
    of reference (segmentation, filtering, distance
    transforms, etc.)
  • Data parallelism
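A minimal sketch of this idea in Python with NumPy (the names and the 3x3 mean filter are illustrative, not from the talk): the image is split into horizontal strips, each strip carries a one-pixel halo so the neighbourhood operation stays local to its strip, and the strips are filtered by separate worker processes.

    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def mean3x3(strip):
        # 3x3 mean filter over the interior of a strip that
        # already carries a one-pixel halo on every side
        out = np.zeros(strip[1:-1, 1:-1].shape)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += strip[1 + dy : strip.shape[0] - 1 + dy,
                             1 + dx : strip.shape[1] - 1 + dx]
        return out / 9.0

    def filter_parallel(img, workers=4):
        img = np.pad(img, 1, mode='edge')          # border handling
        rows = img.shape[0] - 2
        bounds = np.linspace(0, rows, workers + 1, dtype=int)
        # overlapping strips: the halo keeps each 3x3 window local
        strips = [img[b : e + 2] for b, e in zip(bounds[:-1], bounds[1:])]
        with ProcessPoolExecutor(workers) as pool:
            return np.vstack(list(pool.map(mean3x3, strips)))

On platforms that spawn worker processes, filter_parallel must be called under an if __name__ == '__main__': guard.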

4
Task farm parallelism
  • An application consists of many different
    operations
  • Some of these operations are independent (scale
    spaces, parameter sweeps, noise realizations,
    etc.)
  • Task farm parallelism
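A sketch of the same pattern in Python, assuming a placeholder per-scale job; the essential property is that the jobs are independent, so a pool of workers may execute them in any order.

    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def job(args):
        img, sigma = args
        # stand-in for any independent operation, e.g. one
        # scale of a scale space (a real filter would go here)
        return sigma, img.mean() * sigma

    if __name__ == '__main__':
        img = np.random.rand(512, 512)
        scales = [1.0, 2.0, 4.0, 8.0]        # parameter sweep
        with ProcessPoolExecutor() as pool:
            # workers pick up jobs as they become free; results
            # may arrive in any order, which is fine because
            # the jobs are independent
            results = dict(pool.map(job, [(img, s) for s in scales]))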

5
Pipeline parallelism
  • An image processing algorithm consists of
    consecutive stages
  • If multiple objects are to be processed, they
    may be in different stages at the same time
  • Pipeline parallelism
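A minimal two-stage pipeline sketch using threads and queues; the stage functions are placeholders for real image operations. While one object is in the second stage, the next object already occupies the first.

    import queue, threading

    def stage(fn, q_in, q_out):
        # each stage runs in its own thread; objects flow
        # through while different objects occupy different stages
        while True:
            item = q_in.get()
            if item is None:            # poison pill: shut down
                q_out.put(None)
                break
            q_out.put(fn(item))

    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    threading.Thread(target=stage, args=(lambda x: x + 1, q1, q2)).start()
    threading.Thread(target=stage, args=(lambda x: x * 2, q2, q3)).start()

    for frame in range(5):              # e.g. frames from a camera
        q1.put(frame)
    q1.put(None)
    print(list(iter(q3.get, None)))     # [2, 4, 6, 8, 10]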

6
Parallel hardware architectures: Fine grained
  • Irregular
  • Superscalar (most modern microprocessors)
  • VLIW (DSPs)
  • Regular
  • Vector (supercomputers, MMX)
  • SIMD (graphics processors)
  • Custom
  • FPGA

7
Parallel hardware architectures: Coarse grained
  • Homogeneous
  • Multi-core, SMP
  • Cluster
  • Heterogeneous
  • Embedded systems
  • Grid

8
Obstacles
  • Programming
  • Synchronization, bookkeeping
  • Different systems, languages, optimization
    strategies
  • Choosing an architecture
  • The program must be analyzed before it is written
  • Additional requirements or unexpected performance
    may require a rewrite

9
Architecture-independent parallel programming
  • Data parallelism
  • Differentiate between synchronization pattern and
    computation
  • Library provides pattern, user provides
    computation
  • Task farm and pipeline parallelism
  • Operations do not work on images, but on streams
  • Sequences of operation calls do not imply an
    execution order, but define a stream graph
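A sketch of the division of labour for the data-parallel case, with illustrative names: the skeleton owns the splitting, scheduling and reassembly once, and the user contributes only the computation.

    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def pixel_skeleton(op, image, workers=4):
        # library side: the synchronization pattern
        # (split, schedule, reassemble) is written once
        chunks = np.array_split(image, workers)
        with ProcessPoolExecutor(workers) as pool:
            parts = list(pool.map(op, chunks))
        return np.concatenate(parts)

    def threshold(chunk):
        # user side: only the computation, no bookkeeping
        return (chunk > 128).astype(np.uint8)

    if __name__ == '__main__':
        image = np.random.randint(0, 256, (512, 512))
        binary = pixel_skeleton(threshold, image)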

10
Algorithmic Skeletons
11
Example skeletons
  • Pixel
  • Neighbourhood
  • Recursive neighbourhood
  • Stack
  • Filter
  • Associative reduction
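As one example of how a skeleton exploits its pattern, an associative reduction can reduce each chunk locally and combine the partial results afterwards; this is only valid because the user promises the operator is associative. A sketch with illustrative names:

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor
    from functools import reduce

    def assoc_reduce(op, image, workers=4):
        # split, reduce each chunk, combine the partials;
        # correctness relies on op being associative
        chunks = np.array_split(image.ravel(), workers)
        with ThreadPoolExecutor(workers) as pool:
            partials = pool.map(lambda c: reduce(op, c), chunks)
        return reduce(op, partials)

    print(assoc_reduce(max, np.random.rand(256, 256)))  # global maximum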

12
Constructing stream graphs
  • By program (dynamic)
  • capture(orig)
  • normalize(orig, norm)
  • dx(orig, x_der, 1.0)
  • dy(orig, y_der, 1.0)
  • direction(x_der, y_der, dir)
  • display(dir)
  • Visually (static)
[Diagram: stream graph in which capture feeds normalize,
dx and dy; dx and dy feed direction, which feeds display]
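A sketch of the dynamic case: each call records a node and its data dependencies instead of computing anything, so the call sequence above defines a graph, not an execution order. The Graph class is illustrative.

    class Graph:
        def __init__(self):
            self.nodes = []      # (operation name, inputs, outputs)

        def op(self, name, ins, outs):
            # record a node; nothing executes yet
            self.nodes.append((name, ins, outs))

    g = Graph()
    g.op('capture',   [],                 ['orig'])
    g.op('normalize', ['orig'],           ['norm'])
    g.op('dx',        ['orig'],           ['x_der'])
    g.op('dy',        ['orig'],           ['y_der'])
    g.op('direction', ['x_der', 'y_der'], ['dir'])
    g.op('display',   ['dir'],            [])
    # a scheduler is now free to run dx and dy in parallel,
    # because the recorded dependencies, not the call order,
    # constrain execution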
13
Mapping stream graphs to processors
14
Dealing with heterogeneous tasks
15
Dealing with interconnect
16
Dealing with dependencies
17
Choosing an architecture automatically
  • An architecture-independent program allows
    automatic analysis after it is written, but
    before an architecture is chosen
  • Based on certain constraints, an architecture
    can be chosen automatically to optimize some
    cost function
  • Tradeoff between cost, power and performance must
    be made by the designer
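In the constrained single-objective form this amounts to, say, minimizing monetary cost subject to a frame-rate constraint. A toy sketch; the candidate names and numbers are invented:

    # hypothetical candidates: (name, cost in EUR, watts, frames/s)
    candidates = [('DSP',      50,   2.0,  20),
                  ('FPGA',    200,   5.0, 120),
                  ('cluster', 900, 400.0, 500)]

    feasible = [c for c in candidates if c[3] >= 25]  # performance constraint
    best = min(feasible, key=lambda c: c[1])          # minimize cost
    print(best[0])                                    # 'FPGA'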

18
Design Space Exploration
[Diagram: design space exploration loop. The Explore
step proposes an Architecture, the Program is Analyzed
on it, and the resulting Metrics feed back into Explore]
19
Search strategy: Constrained single objective
20
Search strategy: Multiobjective tradeoff iteration
21
Search strategy: Strength Pareto
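The Strength Pareto approach (after the SPEA family of algorithms) ranks candidates by domination counts. A toy sketch for two minimized objectives, with invented points:

    def dominates(a, b):
        # a dominates b if it is no worse in every objective
        # and strictly better in at least one (minimization)
        return all(x <= y for x, y in zip(a, b)) and a != b

    points = [(1, 9), (3, 3), (4, 4), (9, 1)]   # e.g. (cost, 1/performance)
    strength = {p: sum(dominates(p, q) for q in points) for p in points}
    # (3, 3) dominates (4, 4); the remaining points are mutually
    # non-dominated, so the Pareto front is everything but (4, 4)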
22
Conclusions
  • Architecture-independent programming allows
  • Parallel programming without bookkeeping
  • Targeting heterogeneous systems
  • Choosing the most appropriate architecture
    automatically
  • http://www.qi.tnw.tudelft.nl/wcaarls/smartcam

23
Overview
  • Parallelism in image processing
  • Parallel hardware architectures
  • Architecture-independent parallel programming
  • Algorithmic skeletons
  • Stream programming
  • Choosing an appropriate architecture
  • Design Space Exploration

24
Exploiting parallelism: Fine grained, irregular
  • Superscalar
  • Dataflow dispatch and instruction reordering
  • Most modern microprocessors
  • Automatic by processor
  • Very Long Instruction Word
  • Multiple instructions per word
  • DSPs, Itanium
  • Automatic by compiler

25
Exploiting parallelism: Fine grained, regular
  • Vector instructions
  • Supercomputers
  • MMX/SSEx
  • Special instructions/datatypes
  • Single Instruction Multiple Data
  • Graphics processors
  • Special languages
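The same regular style surfaces in high-level languages as whole-array operations; NumPy, for example, applies one operation to all pixels per call, and its backend may in turn use SSE/AVX. A sketch contrasting the scalar and vector forms:

    import numpy as np

    img = np.random.rand(480, 640).astype(np.float32)

    # scalar style: one pixel per "instruction"
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = min(img[i, j] * 1.5, 1.0)

    # vector style: one operation over all pixels at once
    out_v = np.minimum(img * 1.5, 1.0)
    assert np.allclose(out, out_v)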

26
Exploiting parallelism: Coarse grained
  • Multiprocessing
  • Multiple processors/cores sharing a memory
  • Shared-memory threading libraries (pthread,
    OpenMP)
  • Clusters
  • Relatively loosely coupled systems connected by a
    network
  • Message-passing libraries (MPI)
  • Heterogeneous systems
  • Exploit differences in algorithmic requirements
  • Multiple paradigms in a single application
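A message-passing sketch of the cluster case using mpi4py (the slide names MPI, not this particular binding, so treat the binding as an assumption). Rank 0 scatters image strips, every rank processes its own strip locally, and the results are gathered back; launch with, e.g., mpiexec -n 4 python script.py.

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # rank 0 owns the image and splits it into strips
    strips = None
    if rank == 0:
        image = np.random.rand(512, 512)
        strips = np.array_split(image, size)

    strip = comm.scatter(strips, root=0)   # one strip per process
    local = strip * 2.0                    # any local operation
    parts = comm.gather(local, root=0)     # collect the results

    if rank == 0:
        result = np.vstack(parts)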