Title: Problem Solving Strategies


1
Problem Solving Strategies
  • Partitioning
  • Divide the problem into disjoint parts
  • Compute each part separately
  • Divide and Conquer
  • Divide phase: Recursively create sub-problems of the same type
  • Base case reached: Execute an algorithm
  • Conquer phase: Merge the results as the recursion unwinds
  • Traditional example: Merge Sort (see the sketch after this list)
  • Where is the work?
  • Partitioning: Creating disjoint parts of the problem
  • Divide and Conquer: Merging the separate results
  • Traditional example: Quick Sort
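A minimal C sketch of the divide-and-conquer pattern, using the slide's traditional
example of Merge Sort; the merge() helper is assumed rather than taken from the slides.

  void merge(int a[], int lo, int mid, int hi);   /* assumed helper: combines two sorted halves */

  void mergeSort(int a[], int lo, int hi)
  {  if (hi - lo <= 1) return;        /* base case reached: a run of length 1 is sorted */
     int mid = lo + (hi - lo) / 2;
     mergeSort(a, lo, mid);           /* divide phase: sub-problems of the same type */
     mergeSort(a, mid, hi);
     merge(a, lo, mid, hi);           /* conquer phase: merge as the recursion unwinds */
  }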

2
Parallel Sorting Considerations
  • Distributed Memory
  • Precision differences between machines in a distributed system can cause unpredictable results
  • Traditional algorithms can require excessive communication
  • Modified algorithms minimize communication requirements
  • Typically, the data is scattered to the P processors
  • Shared Memory
  • Critical sections and mutual-exclusion locks can inhibit performance
  • Modified algorithms eliminate the need for locks
  • Each processor can sort N/P data points, or the processors can work in parallel in a finer-grained manner (no processor communication is needed)

3
Two Related Sorts
  • Bubble Sort
  • Odd-Even Sort (a usage sketch follows the notes below)

  /* Both functions sort an array of N strings; they require <string.h>
     and a LEN constant giving the maximum string length. */
  void bubble(char x[][LEN], int N)
  {  int sorted = 0, i, size = N-1;
     char temp[LEN];
     while (!sorted)
     {  sorted = 1;
        for (i = 0; i < size; i++)
        {  if (strcmp(x[i], x[i+1]) > 0)
           {  strcpy(temp, x[i]);
              strcpy(x[i], x[i+1]);
              strcpy(x[i+1], temp);
              sorted = 0;
           }
        }
        size--;
     }
  }

  void oddEven(char x[][LEN], int N)
  {  int even, sorted = 0, i, size = N-1;
     char temp[LEN];
     while (!sorted)
     {  sorted = 1;
        for (even = 0; even <= 1; even++)       /* one even pass, then one odd pass */
           for (i = even; i < size; i += 2)
           {  if (strcmp(x[i], x[i+1]) > 0)
              {  strcpy(temp, x[i]);
                 strcpy(x[i], x[i+1]);
                 strcpy(x[i+1], temp);
                 sorted = 0;
              }
           }
     }
  }

  1. Sequential version: Odd-Even has no advantages over Bubble Sort
  2. Parallel version: Processors can work independently without data conflicts
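A small usage sketch for the two functions above, assuming a LEN constant for the
maximum string length (LEN is not given on the slide):

  #include <stdio.h>
  #include <string.h>
  #define LEN 16                       /* assumed maximum string length */

  void oddEven(char x[][LEN], int N);  /* defined above */

  int main(void)
  {  char names[5][LEN] = { "pear", "apple", "plum", "fig", "kiwi" };
     oddEven(names, 5);                /* bubble(names, 5) produces the same result */
     for (int i = 0; i < 5; i++) printf("%s\n", names[i]);
     return 0;
  }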

4
Bubble, Odd Even Example
  • Bubble Pass
  • Odd Even Pass

Bubble: Smaller values move left one spot per pass. The largest value moves
immediately to the end, so the loop size can shrink by one each pass.
Odd-Even: Large values move only one position per pass, so the loop size
cannot shrink. However, all interchanges can occur in parallel.
5
One Parallel Iteration
  • Distributed Memory
    Odd processors:  sendRecv(pr data, pr-1 data); mergeHigh(pr data, pr-1 data);
                     if (r < P-2) { sendRecv(pr data, pr+1 data); mergeLow(pr data, pr+1 data); }
    Even processors: sendRecv(pr data, pr+1 data); mergeLow(pr data, pr+1 data);
                     if (r > 1) { sendRecv(pr data, pr-1 data); mergeHigh(pr data, pr-1 data); }
  • Shared Memory
    Odd processors:  mergeLow(pr data, pr-1 data); Barrier;
                     if (r < P-2) mergeHigh(pr data, pr+1 data); Barrier
    Even processors: mergeHigh(pr data, pr+1 data); Barrier;
                     if (r > 1) mergeLow(pr data, pr-1 data); Barrier

Notation: r = processor rank, P = number of processors, pr data = the block of
data belonging to processor r
Note: P/2 iterations are necessary to complete the sort
6
A Distributed Memory Implementation
  • Scatter the data among available processors
  • Locally sort N/P items on each processor
  • Even Passes
  • Even processors, p < P-1, exchange data with processor p+1.
  • Processors p and p+1 perform a partial merge where p extracts the lower half and p+1 extracts the upper half.
  • Odd Passes
  • Even processors, p >= 2, exchange data with processor p-1.
  • Processors p and p-1 perform a partial merge where p extracts the upper half and p-1 extracts the lower half.
  • Exchanging data: MPI_Sendrecv (a sketch follows below)
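A hedged C/MPI sketch of one even-pass exchange, assuming each processor holds n
locally sorted strings in mine, scratch space in other and merged, and the LEN
constant and mergeLow/mergeHigh functions of the next slide; the function and
variable names are illustrative, not from the slides.

  #include <mpi.h>
  #include <string.h>

  void evenPass(char mine[][LEN], char other[][LEN], char merged[][LEN],
                int n, int rank, int P)
  {  if (rank % 2 == 0 && rank < P-1)             /* lower half of the pair (rank, rank+1) */
     {  MPI_Sendrecv(mine,  n*LEN, MPI_CHAR, rank+1, 0,
                     other, n*LEN, MPI_CHAR, rank+1, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        mergeLow(mine, other, merged, n);         /* keep the lower half */
        memcpy(mine, merged, (size_t)n * LEN);
     }
     else if (rank % 2 == 1)                      /* upper half of the pair (rank-1, rank) */
     {  MPI_Sendrecv(mine,  n*LEN, MPI_CHAR, rank-1, 0,
                     other, n*LEN, MPI_CHAR, rank-1, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        mergeHigh(mine, other, merged, n);        /* keep the upper half */
        memcpy(mine, merged, (size_t)n * LEN);
     }
  }

The odd pass is symmetric: even processors with p >= 2 pair with p-1 and keep the
upper half, as described above.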

7
Partial Merge: Lower Keys
Store the lower n keys from arrays a and b into
array c
  void mergeLow(char a[][LEN], char b[][LEN], char c[][LEN], int n)
  {  int countA = 0, countB = 0, countC = 0;
     while (countC < n)
     {  if (strcmp(a[countA], b[countB]) < 0)
           strcpy(c[countC++], a[countA++]);
        else
           strcpy(c[countC++], b[countB++]);
     }
  }

  • To merge upper keys (a sketch of mergeHigh follows below):
  • Initialize the counts to n-1
  • Decrement the counts instead of incrementing
  • Change countC < n to countC >= 0
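Applying those substitutions gives a sketch of mergeHigh (using countC >= 0 as the
loop bound so that all n slots of c are filled):

  void mergeHigh(char a[][LEN], char b[][LEN], char c[][LEN], int n)
  {  int countA = n-1, countB = n-1, countC = n-1;
     while (countC >= 0)
     {  if (strcmp(a[countA], b[countB]) > 0)
           strcpy(c[countC--], a[countA--]);      /* a holds the larger key */
        else
           strcpy(c[countC--], b[countB--]);      /* b holds the larger key */
     }
  }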

8
Bitonic Sequence
  • 10,12,14,20, 95,90,60,40,35,23,18,0, 3,5,8,9  (wraps around the end)
  • 3,5,8,9,10,12,14,20, 95,90,60,40,35,23,18,0

A bitonic sequence increases and then decreases, where the increasing run can
wrap around the end of the sequence.
9
BitonicSort
  • Unsorted: 10,20,5,9,3,8,12,14,90,0,60,40,23,35,95,18
  • Step 1:  10,20 | 9,5 | 3,8 | 14,12 | 0,90 | 60,40 | 23,35 | 95,18
  • Step 2:  9,5,10,20 | 14,12,3,8 | 0,40,60,90 | 95,35,23,18
             5,9,10,20 | 14,12,8,3 | 0,40,60,90 | 95,35,23,18
  • Step 3:  5,9,8,3,14,12,10,20 | 95,40,60,90,0,35,23,18
             5,3,8,9,10,12,14,20 | 95,90,60,40,23,35,0,18
             3,5,8,9,10,12,14,20 | 95,90,60,40,35,23,18,0
  • Step 4:  3,5,8,9,10,12,14,0,95,90,60,40,35,23,18,20
             3,5,8,0,10,12,14,9,35,23,18,20,95,90,60,40
             3,0,8,5,10,9,14,12,18,20,35,23,60,40,95,90
  • Sorted:  0,3,5,8,9,10,12,14,18,20,23,35,40,60,90,95

10
Bitonic Sorting Functions
  void bitonicMerge(int lo, int n, int dir)
  {  if (n > 1)
     {  int m = n/2;
        for (int i = lo; i < lo+m; i++)
           compareExchange(i, i+m, dir);
        bitonicMerge(lo, m, dir);
        bitonicMerge(lo+m, m, dir);
     }
  }

  void bitonicSort(int lo, int n, int dir)
  {  if (n > 1)
     {  int m = n/2;
        bitonicSort(lo, m, UP);
        bitonicSort(lo+m, m, DOWN);
        bitonicMerge(lo, n, dir);
     }
  }

  Notes
  • dir: 0 for DOWN, 1 for UP
  • compareExchange(i, j, dir) moves
  • the low value left if dir == UP
  • the high value left if dir == DOWN (a sketch follows below)
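A sketch of the compareExchange helper the notes describe, assuming the data is in
a global int array a[] and that UP is 1 and DOWN is 0 (names and layout are
illustrative):

  #define DOWN 0
  #define UP   1
  extern int a[];                     /* the array being sorted, defined elsewhere */

  void compareExchange(int i, int j, int dir)
  {  /* swap when the pair is out of order for the requested direction */
     if ((dir == UP && a[i] > a[j]) || (dir == DOWN && a[i] < a[j]))
     {  int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
     }
  }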

11
Bitonic Sort Partners/Direction
Algorithm steps (level / j):
  level:  1 |  2  2 |  3  3  3 |  4  4  4  4
  j:      0 |  0  1 |  0  1  2 |  0  1  2  3

  • rank  0 partners:  1/L |  2/L  1/L |  4/L  2/L  1/L |  8/L  4/L  2/L  1/L
  • rank  1 partners:  0/H |  3/L  0/H |  5/L  3/L  0/H |  9/L  5/L  3/L  0/H
  • rank  2 partners:  3/H |  0/H  3/L |  6/L  0/H  3/L | 10/L  6/L  0/H  3/L
  • rank  3 partners:  2/L |  1/H  2/H |  7/L  1/H  2/H | 11/L  7/L  1/H  2/H
  • rank  4 partners:  5/L |  6/H  5/H |  0/H  6/L  5/L | 12/L  0/H  6/L  5/L
  • rank  5 partners:  4/H |  7/H  4/L |  1/H  7/L  4/H | 13/L  1/H  7/L  4/H
  • rank  6 partners:  7/H |  4/L  7/H |  2/H  4/H  7/L | 14/L  2/H  4/H  7/L
  • rank  7 partners:  6/L |  5/L  6/L |  3/H  5/H  6/H | 15/L  3/H  5/H  6/H
  • rank  8 partners:  9/L | 10/L  9/L | 12/H 10/H  9/H |  0/H 12/L 10/L  9/L
  • rank  9 partners:  8/H | 11/L  8/H | 13/H 11/H  8/L |  1/H 13/L 11/L  8/H
  • rank 10 partners: 11/H |  8/H 11/L | 14/H  8/L 11/H |  2/H 14/L  8/H 11/L
  • rank 11 partners: 10/L |  9/H 10/H | 15/H  9/L 10/L |  3/H 15/L  9/H 10/H
  • rank 12 partners: 13/L | 14/H 13/H |  8/L 14/H 13/H |  4/H  8/H 14/L 13/L
  • rank 13 partners: 12/H | 15/H 12/L |  9/L 15/H 12/L |  5/H  9/H 15/L 12/H
  • rank 14 partners: 15/H | 12/L 15/H | 10/L 12/L 15/H |  6/H 10/H 12/H 15/L
  • rank 15 partners: 14/L | 13/L 14/L | 11/L 13/L 14/L |  7/H 11/H 13/H 14/H

partner = rank ^ (1 << (level-j-1))
direction = ((rank < partner) == ((rank & (1 << level)) == 0)) ? L : H
12
Java Partner/Direction Code
  public static void main(String[] args)
  {  int nproc = 16, partner;
     int levels = (int)(Math.log(nproc)/Math.log(2));
     for (int rank = 0; rank < nproc; rank++)
     {  System.out.printf("rank %2d partners ", rank);
        for (int level = 1; level <= levels; level++)
        {  for (int j = 0; j < level; j++)
           {  partner = rank ^ (1 << (level-j-1));
              String dir = ((rank < partner) == ((rank & (1 << level)) == 0)) ? "L" : "H";
              System.out.printf("%3d/%s ", partner, dir);
           }
           if (level < levels) System.out.print(", ");
        }
        System.out.println();
     }
  }

13
Parallel Bitonic Pseudo code
  IF master processor
     Create or retrieve data to sort
     Scatter it among all processors (including the master)
  ELSE
     Receive portion to sort
  Sort local data using an algorithm of preference
  FOR (level = 1; level <= lg(P); level++)
     FOR (j = 0; j < level; j++)
        partner = rank ^ (1 << (level-j-1))
        Exchange data with partner
        IF ((rank < partner) == ((rank & (1 << level)) == 0))
           extract low values from local and received data (mergeLow)
        ELSE
           extract high values from local and received data (mergeHigh)
  Gather sorted data at the master
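A hedged C/MPI sketch of the exchange loop above, assuming each processor holds n
locally sorted ints in mine plus scratch buffers other and merged, and assuming int
versions of the earlier partial merges (mergeLowInts/mergeHighInts are illustrative
names, not from the slides):

  #include <mpi.h>
  #include <string.h>

  void mergeLowInts(int *a, int *b, int *c, int n);    /* assumed int versions of */
  void mergeHighInts(int *a, int *b, int *c, int n);   /* the earlier partial merges */

  void bitonicExchange(int *mine, int *other, int *merged, int n, int rank, int P)
  {  int levels = 0;
     for (int p = P; p > 1; p >>= 1) levels++;         /* levels = lg(P) */
     for (int level = 1; level <= levels; level++)
        for (int j = 0; j < level; j++)
        {  int partner = rank ^ (1 << (level-j-1));
           MPI_Sendrecv(mine,  n, MPI_INT, partner, 0,
                        other, n, MPI_INT, partner, 0,
                        MPI_COMM_WORLD, MPI_STATUS_IGNORE);
           if ((rank < partner) == ((rank & (1 << level)) == 0))
              mergeLowInts(mine, other, merged, n);    /* keep the low values */
           else
              mergeHighInts(mine, other, merged, n);   /* keep the high values */
           memcpy(mine, merged, n * sizeof(int));
        }
  }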

14
Bucket Sort Partitioning
  • Algorithm
  • Assign a range of values to each processor
  • Each processor sorts the values assigned to it
  • The resulting values are forwarded to the master
  • Steps
  • Scatter N/P numbers to each processor
  • Each processor:
  • Creates a smaller bucket of numbers designated for each of the other processors
  • Sends the designated buckets to the other processors and receives the buckets destined for it
  • Sorts its section
  • Sends its data back to the processor with rank 0 (a bucketing sketch follows below)
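A minimal C sketch of the bucketing step, assuming int keys uniformly distributed
in [0, maxKey) and per-destination buffers bucket[d] with counters bucketCount[d];
the names are illustrative, and the exchange itself could use, for example,
MPI_Alltoallv.

  void fillBuckets(int *local, int nLocal, int maxKey, int P,
                   int **bucket, int *bucketCount)
  {  for (int d = 0; d < P; d++) bucketCount[d] = 0;
     for (int i = 0; i < nLocal; i++)
     {  /* destination d owns the key range [d*maxKey/P, (d+1)*maxKey/P) */
        int d = (int)((long)local[i] * P / maxKey);
        if (d >= P) d = P - 1;                     /* guard the top edge */
        bucket[d][bucketCount[d]++] = local[i];
     }
  }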

15
Bucket Sort Partitioning
[Diagrams: Sequential and Parallel Bucket Sort — unsorted numbers are distributed
into buckets (via processors P1..Pp in the parallel case) and collected sorted]
  • Sequential Bucket Sort
  • Drop sections of the data to sort into buckets
  • Sort each bucket
  • Copy sorted bucket data back into the primary array
  • Complexity: O(b * (n/b) * lg(n/b)) = O(n lg(n/b))

Notes
  1. Bucket Sort works well for uniformly distributed
    data
  2. Recursively finding medians from a data sample (Sample Sort) attempts to equalize bucket sizes

16
Rank (Enumeration) Sort
  • Count the numbers smaller than each number, src[i], plus any duplicates with a smaller index
  • The count is the final array position of src[i] in the destination array

  for (int i = 0; i < N; i++)
  {  int count = 0;
     for (int j = 0; j < N; j++)
        if (src[i] > src[j] || (src[i] == src[j] && j < i))
           count++;
     dest[count] = src[i];
  }

  • Shared Memory parallel implementation (see the sketch below)
  • Assign groups of numbers to each processor
  • Find the positions of N/P numbers in parallel
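A hedged OpenMP sketch of that shared-memory version: the outer loop over the N
positions is divided among the threads/processors, and no locks are needed because
each iteration writes a distinct destination slot.

  #include <omp.h>

  void rankSort(int *src, int *dest, int N)
  {  #pragma omp parallel for                /* each thread ranks a group of numbers */
     for (int i = 0; i < N; i++)
     {  int count = 0;
        for (int j = 0; j < N; j++)
           if (src[i] > src[j] || (src[i] == src[j] && j < i))
              count++;                       /* keys that must precede src[i] */
        dest[count] = src[i];                /* count is src[i]'s final position */
     }
  }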

17
Counting Sort
Works on primitive fixed-point types: int, char, long, etc.
  1. The master scatters the data among the processors
  2. In parallel, each processor counts the total occurrences of each of its N/P data points
  3. Processors perform a collective sum operation
  4. Processors perform an all-to-all collective prefix-sum operation
  5. In parallel, each processor stores its N/P data items in the appropriate positions of the output array
  6. Sorted data is gathered at the master processor (a sketch of steps 2-5 follows the note below)

Note: This logic can be repeated to implement a radix sort
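A hedged C/MPI sketch of steps 2 to 5, assuming int keys in [0, MAXV) and N/P local
items; it computes each local item's global output position, which would then be
gathered at the master as in step 6 (all names are illustrative, not from the slides).

  #include <mpi.h>
  #define MAXV 256                           /* assumed key range */

  void countingPositions(int *local, int nLocal, int *destIndex, MPI_Comm comm)
  {  int localCount[MAXV] = {0}, globalCount[MAXV], before[MAXV], start[MAXV];
     int rank;
     MPI_Comm_rank(comm, &rank);
     for (int i = 0; i < nLocal; i++) localCount[local[i]]++;        /* step 2 */

     /* step 3: total occurrences of each key over all processors */
     MPI_Allreduce(localCount, globalCount, MAXV, MPI_INT, MPI_SUM, comm);
     /* occurrences of each key on lower-ranked processors (undefined on rank 0) */
     MPI_Exscan(localCount, before, MAXV, MPI_INT, MPI_SUM, comm);
     if (rank == 0) for (int v = 0; v < MAXV; v++) before[v] = 0;

     /* step 4: prefix sum gives the first output slot of each key */
     start[0] = 0;
     for (int v = 1; v < MAXV; v++) start[v] = start[v-1] + globalCount[v-1];

     /* step 5: global output position of every local item */
     for (int i = 0; i < nLocal; i++)
     {  int v = local[i];
        destIndex[i] = start[v] + before[v]++;
     }
  }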
18
Merge Sort
  • Scatter N/P items to each processor
  • Sort Phase: Each processor sorts its data with a method of choice
  • Merge Phase: Data is routed and a merge is performed at each level (see the sketch below)

  for (gap = 1; gap < P; gap *= 2)
  {  if ((p/gap) % 2 != 0)
     {  Send data to processor p-gap; break; }
     else
     {  Receive data from processor p+gap; Merge with local data; }
  }
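A hedged C/MPI sketch of that merge phase, assuming each processor starts with n
sorted ints and that merge() is a two-way merge of equal-length sorted runs (names
are illustrative); the function returns the merged run, which on processor 0 ends
up holding all P*n items.

  #include <mpi.h>
  #include <stdlib.h>

  void merge(int *a, int *b, int *out, int len);       /* assumed 2-way merge helper */

  int *treeMerge(int *data, int n, int p, int P)
  {  int size = n;                                     /* current run length */
     for (int gap = 1; gap < P; gap *= 2)
     {  if ((p / gap) % 2 != 0)                        /* send local run and stop */
        {  MPI_Send(data, size, MPI_INT, p - gap, 0, MPI_COMM_WORLD);
           break;
        }
        else                                           /* receive a run and merge it */
        {  int *recvd = malloc(size * sizeof(int));
           int *out   = malloc(2 * size * sizeof(int));
           MPI_Recv(recvd, size, MPI_INT, p + gap, 0,
                    MPI_COMM_WORLD, MPI_STATUS_IGNORE);
           merge(data, recvd, out, size);
           free(recvd);
           data = out;                                 /* run length doubles */
           size *= 2;
        }
     }
     return data;
  }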

19
Quick Sort
  • Slave computers
  • Perform the quick sort algorithm
  • Base case: if the data length < threshold, send the data to the master (rank 0)
  • Recursive step: quick sort partitions the data
  • Request work from the master processor
  • If none, terminate
  • Receive data, sort it, and send it back to the master
  • Master computer
  • Scatter N/P items to each processor
  • When a work request is received: send data to the slave, or a termination message
  • When sorted data is received: place the data correctly in the final data list
  • When all data is sorted: save the data and terminate

Note: Distributed work pools require load balancing. Processors maintain local
work pools. When the local work queue falls below a threshold, processors
request work from their neighbors.