Hardware Acceleration of Parallel Prefix Algorithms

1
Hardware Acceleration of Parallel Prefix Algorithms
  • Peter Scott (Project leader)
  • Avinash Srinivasa
  • Vaibhav Sundriyal

2
What is parallel prefix?
  • Finding parallelism in serial-looking problems.
  • Take an array, like [1, 3, 2, 1]
  • Find partial sums: [1, 1+3, 1+3+2, 1+3+2+1] = [1, 4, 6, 7]
  • We can use any associative operation, not just
    addition.
  • Matrix multiplication is okay
  • Vector dot product doesn't work: it returns a scalar, not a
    vector, so results cannot be chained.
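
As a minimal sequential sketch of the definition on this slide (Python is used only for illustration; the names `prefix_scan` and `matmul2` are hypothetical and do not come from the presentation), a scan can be written once and reused with any associative binary operation:

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

def prefix_scan(xs: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Inclusive scan: out[i] = xs[0] op xs[1] op ... op xs[i]."""
    out: List[T] = []
    for x in xs:
        # The first element is copied; every later one extends the running result.
        out.append(x if not out else op(out[-1], x))
    return out

def matmul2(a, b):
    """2x2 matrix product on nested tuples (associative, not commutative)."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

# Addition on the slide's example array.
sums = prefix_scan([1, 3, 2, 1], lambda a, b: a + b)

# Matrix multiplication as the operation: running products M, M*M, M*M*M.
m = ((1, 1), (1, 0))
powers = prefix_scan([m, m, m], matmul2)
```

Here `sums` is `[1, 4, 6, 7]`. The operands are always combined left to right, so the sketch stays correct for operations such as matrix multiplication where order matters.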

3
Applications
  • DNA sequence alignment
  • Large tree data structure acceleration
  • Incremental regular expression matching
  • Many others, parameterizable by kernel.

4
Parallel version of this
  • Distribute data to several processors.
  • Do redundant computations to get parallelism.

Image taken from Steele & Hillis, 1986.
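
The figure itself is not reproduced in this transcript. Assuming it shows the doubling scheme usually associated with that paper, the idea can be sketched as below. This is a sequential simulation in which each loop iteration stands for one parallel step; the function name `hillis_steele_scan` is an illustrative choice, not something from the slides.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

def hillis_steele_scan(xs: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Inclusive scan in about log2(n) rounds; each round is one parallel step."""
    cur = list(xs)
    offset = 1
    while offset < len(cur):
        # Every position i >= offset combines with the element `offset` places
        # to its left. The new list is built only from the previous round's
        # values, so all positions could be updated at the same time.
        cur = [
            op(cur[i - offset], cur[i]) if i >= offset else cur[i]
            for i in range(len(cur))
        ]
        offset *= 2
    return cur

result = hillis_steele_scan([1, 2, 3, 4, 5, 6, 7, 8], lambda a, b: a + b)
```

For eight elements this takes three rounds instead of seven sequential additions, at the cost of performing more additions in total. That extra work is the redundant computation the slide refers to.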
5
Architecture
  • Several processors, shared multi-channel bus

6
Step          P1     P2     P3     P4
Input         1,2    3,4    5,6    7,8
OPERATE       1,3    3,7    5,11   7,15
COMMUNICATE   1,3    3,7    5,11   7,15
UPDATE        1,3    6,10   5,11   18,26
COMMUNICATE   1,3    6,10   5,11   18,26
UPDATE        1,3    6,10   15,21  28,36
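
The OPERATE / COMMUNICATE / UPDATE trace on this slide can be reproduced with a short software simulation. The communication pattern below is inferred from the numbers on the slide: in each round, the processors in the upper half of a group receive the running total of the last processor in the lower half. It is a sketch under that assumption, not the project's actual hardware logic; `blocked_scan_trace` is a hypothetical name, processors are indexed from 0, and every block is assumed to be non-empty.

```python
from typing import Callable, List, Tuple

def blocked_scan_trace(
    blocks: List[List[int]], op: Callable[[int, int], int]
) -> List[Tuple[str, List[List[int]]]]:
    """Return (step name, per-processor data) snapshots for a blocked scan."""
    # OPERATE: each processor scans its own block locally.
    data = []
    for block in blocks:
        local = []
        for x in block:
            local.append(x if not local else op(local[-1], x))
        data.append(local)
    trace = [("OPERATE", [list(d) for d in data])]

    offset = 1
    while offset < len(data):
        # COMMUNICATE: processors in the upper half of each group of
        # 2*offset receive the total held by the last processor of the
        # lower half. Local data is unchanged during this step.
        received = {}
        for p in range(len(data)):
            if (p // offset) % 2 == 1:
                source = (p // offset) * offset - 1
                received[p] = data[source][-1]
        trace.append(("COMMUNICATE", [list(d) for d in data]))

        # UPDATE: fold the received total into every local element.
        for p, total in received.items():
            data[p] = [op(total, x) for x in data[p]]
        trace.append(("UPDATE", [list(d) for d in data]))
        offset *= 2
    return trace

trace = blocked_scan_trace([[1, 2], [3, 4], [5, 6], [7, 8]], lambda a, b: a + b)
for step, state in trace:
    print(f"{step:<12}", state)
```

With four processors there are two communication rounds: P2 and P4 are updated in the first, then P3 and P4 in the second.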
7
Bus contention
  • There are often more processors than bus
    channels.
  • How to deal with contention?
  • Answer: pre-computed static scheduling.
  • Store the schedule as a sequence of instructions:
  • Write <channel>
  • Load <channel>
  • No_op
  • Comm_step_complete
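
One way such a schedule could be pre-computed is a greedy packing of transfers into bus cycles. The sketch below is illustrative only and rests on assumptions the slides do not state: each channel carries one value per cycle, any number of processors may load from a channel in the same cycle, each processor executes one instruction per cycle, and the schedule is stored as one instruction stream per processor. The function name `schedule_comm_step` and the 0-based processor and channel numbering are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def schedule_comm_step(
    transfers: List[Tuple[int, int]], n_procs: int, n_channels: int
) -> Dict[int, List[str]]:
    """Greedy static schedule for one communication step.

    transfers: (source, destination) processor pairs, source != destination.
    Returns one instruction list per processor, all of equal length.
    """
    if n_channels < 1:
        raise ValueError("at least one bus channel is required")

    # A source that feeds several destinations needs only one bus write.
    readers = defaultdict(set)
    for src, dst in transfers:
        readers[src].add(dst)
    pending = sorted(readers.items())

    program: Dict[int, List[str]] = {p: [] for p in range(n_procs)}
    while pending:
        busy = set()    # processors already given an instruction this cycle
        cycle = {}      # processor -> instruction for this cycle
        deferred = []
        channel = 0
        for src, dsts in pending:
            involved = {src} | dsts
            if channel < n_channels and not (involved & busy):
                cycle[src] = f"Write {channel}"
                for d in dsts:
                    cycle[d] = f"Load {channel}"
                busy |= involved
                channel += 1
            else:
                # No free channel, or a processor is already occupied:
                # push the whole transfer to a later cycle.
                deferred.append((src, dsts))
        for p in range(n_procs):
            program[p].append(cycle.get(p, "No_op"))
        pending = deferred

    for p in range(n_procs):
        program[p].append("Comm_step_complete")
    return program

# First round of a 4-processor scan (P1->P2, P3->P4) on a single channel:
# the two transfers must take turns.
first = schedule_comm_step([(0, 1), (2, 3)], n_procs=4, n_channels=1)

# Same round with two channels: both transfers fit in one cycle.
first_wide = schedule_comm_step([(0, 1), (2, 3)], n_procs=4, n_channels=2)

# Second round: P2 broadcasts to P3 and P4 with one write.
second = schedule_comm_step([(1, 2), (1, 3)], n_procs=4, n_channels=1)
```

Because the communication pattern of a prefix computation is known in advance, a schedule like this only has to be computed once, when the hardware is generated, and no arbitration is needed at run time.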

8
How to use the final product
  • Write VHDL for an associative binary operation,
    like addition or multiplication.
  • Say how many processors you want, how wide your
    data are, how many bus channels, etc.
  • A wizard generates all the VHDL.
  • Just customize it and go.

9
Core generator wizard
10
Core generator wizard
11
Generates processor code
12
and various supporting files
  • Bus program memory: holds bus instructions
  • Prefix accelerator: instantiates processors and
    bus
  • Etc.

13
Related Papers
  • Explanation of parallel prefix and DNA sequence
    alignment (Aluru):
    http://class.ece.iastate.edu/cpre526/basics.pdf
  • Data parallel algorithms (Steele and Hillis):
    http://cva.stanford.edu/classes/cs99s/papers/hillis-steele-data-parallel-algorithms.pdf
  • Prefix sums and their applications (Blelloch):
    http://www.cs.cmu.edu/~guyb/papers/Ble93.pdf
  • Finger trees (Hinze & Paterson):
    http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf