1
Adaptive Sorting
  • A Dynamically Tuned Sorting Library
  • Optimizing Sorting with Genetic Algorithms
  • By Xiaoming Li, Maria Jesus Garzaran, and David
    Padua
  • Presented by Anton Morozov

2
Motivations and Observations
  • Success of ATLAS (linear algebra), FFTW and SPIRAL
    (signal processing)

What can be done for sorting?
3
Why are we interested in sorting algorithms?
Does this reflect the performance of the sorting
algorithms?
4
Which additional factors influence the
performance of the sorting algorithm?
5
Performance vs. Standard Deviation
6
Observation
Quicksort and Merge sort are both comparison-based
sorts, so their performance is independent of the
chosen distribution and its standard deviation.
Performance depends on the degree of sortedness,
i.e. the number of inversions (at most n(n-1)/2).
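
The inversion count mentioned above can be measured directly. The following is a minimal sketch, not taken from the presentation: it counts inversions as a side effect of a merge sort, and the function name `sort_and_count` is an illustrative choice.

```python
def sort_and_count(keys):
    """Return (sorted copy of keys, number of inversions in keys).

    An inversion is a pair (i, j) with i < j and keys[i] > keys[j].
    """
    if len(keys) <= 1:
        return list(keys), 0
    mid = len(keys) // 2
    left, inv_left = sort_and_count(keys[:mid])
    right, inv_right = sort_and_count(keys[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions
```

A sorted input yields 0 inversions and a reversed input of n keys yields the maximum, n(n-1)/2.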
7
(No Transcript)
8
Architectural Model and Empirical Search
  • We saw how programs like BLAS and ATLAS use
    search to establish the parameters of the
    underlying architecture

9
So which sorting algorithm is better?
  • What does the performance of a sorting algorithm
    depend on?
  • How do we choose the best sorting algorithm?

10
Sorting algorithms
  • QuickSort
  • Radix Sort
  • Merge Sort
  • Insertion Sort
  • Sorting Networks
  • Heap Sort

11
Sorting algorithms
  • QuickSort
  • Radix Sort
  • Merge Sort
  • Insertion Sort
  • Sorting Networks
  • Heap Sort

12
Sorting algorithms
  • QuickSort
  • Radix Sort
  • Cache-Conscious Radix sort
  • Merge Sort
  • Multiway Merge Sort
  • Insertion Sort
  • Sorting Networks
  • Heap Sort


Register sorts
13
Quick Sort
  • Description: Pick a pivot and partition the
    records around it. Records smaller than the
    pivot go to the front, larger ones go to the
    back, and the pivot is placed between them.
  • Improvements:
  • Proceed iteratively instead of recursively
  • Choose the pivot among the first, middle and
    last keys
  • Use a fast sort for small partitions
    (insertion sort or sorting networks)
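
The three improvements can be combined as in the sketch below. It is an illustration under assumptions, not the library's code: the explicit stack replaces recursion, the pivot is the median of the first, middle and last keys, and the `threshold` default of 16 is an arbitrary placeholder for a value that would be found by search.

```python
def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] (inclusive bounds) in place."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def quicksort(a, threshold=16):
    """Iterative quicksort; small partitions go to insertion sort."""
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if hi - lo + 1 <= threshold:
            insertion_sort(a, lo, hi)
            continue
        mid = (lo + hi) // 2
        # Median of the first, middle and last keys.
        pivot = sorted((a[lo], a[mid], a[hi]))[1]
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # a[lo..j] <= pivot <= a[i..hi]; sort both sides later.
        stack.append((lo, j))
        stack.append((i, hi))
    return a
```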

14
Cache-Conscious Radix Sort
Given b-bit integer keys and a radix of size $2^r$,
the algorithm first sorts by the lowest r bits,
then by the next r bits, and so on, for b/r phases
in total. The digit width is chosen so that
$r \le \log_2 S_{TLB} - 1$, where $S_{TLB}$ is the
number of entries in the translation lookaside
buffer (TLB).
  • Improvements:
  • Proceed iteratively
  • Compute the histograms of all r-bit digits
    during the first pass over the data
  • Choose r as described above
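
A sketch of the phases described above, assuming non-negative integer keys. It is a plain least-significant-digit radix sort that builds every digit histogram in one initial pass; it does not reproduce the cache-conscious partitioning itself. `choose_radix` is a hypothetical helper that picks the largest r allowed by the TLB bound.

```python
def choose_radix(tlb_entries):
    """Largest r with r <= log2(tlb_entries) - 1 (at least 1)."""
    return max(1, tlb_entries.bit_length() - 2)


def radix_sort(keys, b=32, r=8):
    """LSD radix sort of non-negative b-bit integers, r bits per phase."""
    phases = -(-b // r)  # ceil(b / r)
    mask = (1 << r) - 1
    # One pass over the input fills the histogram of every digit.
    hist = [[0] * (1 << r) for _ in range(phases)]
    for k in keys:
        for p in range(phases):
            hist[p][(k >> (p * r)) & mask] += 1
    src = list(keys)
    for p in range(phases):
        # Exclusive prefix sum: counts become bucket start offsets.
        offsets, total = [], 0
        for count in hist[p]:
            offsets.append(total)
            total += count
        dst = [0] * len(src)
        for k in src:
            d = (k >> (p * r)) & mask
            dst[offsets[d]] = k
            offsets[d] += 1
        src = dst
    return src
```

For example, a TLB with 64 entries gives `choose_radix(64) == 5`.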

15
Multiway merge sort.
It partitions the keys into p subsets, sorts each
subset (in this case with CC-radix sort), and then
merges the sorted subsets using a heap. First the
smallest/largest element of each subset is placed
in a leaf of the heap; then the leaves are
compared and the appropriate one is promoted.
  • The heap contains 2p-1 nodes (p leaves, one per
    subset).
  • Each parent in the heap has A/r children, where
    A is the cache line size and r is the size of a
    node.
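
The partition-sort-merge structure can be sketched with Python's standard `heapq` module. This is a simplification under stated assumptions: `heapq` is a binary heap rather than the cache-line-sized heap of the slide, and the built-in `sorted` stands in for CC-radix sort on each subset.

```python
import heapq


def multiway_merge_sort(keys, p=4):
    """Split keys into at most p runs, sort each, merge with a heap."""
    n = len(keys)
    size = max(1, -(-n // p))  # ceil(n / p) keys per run
    # sorted() is a stand-in for the CC-radix sort used on each subset.
    runs = [sorted(keys[i:i + size]) for i in range(0, n, size)]
    # Heap entries: (key, run index, position inside that run).
    heap = [(run[0], idx, 0) for idx, run in enumerate(runs)]
    heapq.heapify(heap)
    out = []
    while heap:
        key, idx, pos = heapq.heappop(heap)
        out.append(key)
        if pos + 1 < len(runs[idx]):
            # Refill from the run the extracted key came from.
            heapq.heappush(heap, (runs[idx][pos + 1], idx, pos + 1))
    return out
```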

16
Insertion Sort
Used for small data sizes.
Working from left to right, the algorithm takes
each key, scans the keys to its left, and inserts
it in the appropriate place.
Sorting Networks
The algorithm compares pairs of inputs in a fixed
sequence and swaps a pair whenever its elements
are out of order.
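
As an illustration of a sorting network, the sketch below sorts exactly four inputs with a fixed sequence of five compare-and-swap steps. The comparator list is a standard 4-input network; the names `NETWORK_4` and `sort4` are illustrative and not from the presentation.

```python
# Fixed comparator sequence for 4 inputs (5 comparators).
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def sort4(values):
    """Sort a sequence of exactly four items with a sorting network."""
    v = list(values)
    for i, j in NETWORK_4:
        # Compare-and-swap: the same pairs are visited for every input.
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v
```

Because the comparison sequence never depends on the data, such a network can be unrolled into straight-line code that keeps all keys in registers.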
17
Input Data Factors
  • Number of keys
  • Distribution
  • Standard deviation

Approximate the S.D. with an entropy vector, one
entry per digit:
$E = -\sum_i P_i \log_2 P_i$, where $P_i = c_i / N$ and
$c_i$ is the number of keys with value i in that digit
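
The entropy vector can be computed as in the following sketch. It assumes non-negative integer keys split into r-bit digits; the parameter names `b` and `r` follow the radix sort slide, and the function name is illustrative.

```python
import math


def entropy_vector(keys, b=32, r=8):
    """Entropy of each r-bit digit of the keys, lowest digit first."""
    digits = -(-b // r)  # ceil(b / r)
    mask = (1 << r) - 1
    n = len(keys)
    vector = []
    for d in range(digits):
        counts = {}
        for k in keys:
            value = (k >> (d * r)) & mask
            counts[value] = counts.get(value, 0) + 1
        # E = -sum_i P_i * log2(P_i) with P_i = c_i / N
        vector.append(-sum((c / n) * math.log2(c / n)
                           for c in counts.values()))
    return vector
```

A digit that takes all 2^r values equally often has entropy r; a digit that is constant across the keys has entropy 0.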
18
Parameters to search for during installation
  • Merge Sort: size of the heap and the fanout.
    Depends on cache size, cache line size, input
    size and entropy; at run time it needs N and E.
  • Quick Sort: insertion sort or sorting networks,
    and their thresholds. Depends on the number of
    registers and cache size.
  • CC-radix Sort: insertion sort, sorting networks
    or standard radix sort, depending on the size.
    Also depends on the number of registers and
    cache size.
19
Learning procedure
Learn a mapping (N, E) → {CC-radix, Multiway
Merge(N, E), Quicksort}
Winnow algorithm: $\sum_i w_i E_i > T$
Computes the weight vector and threshold from the
entropy vector
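
A minimal sketch of a Winnow-style learner for the weighted-sum test above. Several things here are assumptions rather than details from the presentation: the threshold is passed in as a fixed parameter, the multiplicative update is scaled by the feature value so that real-valued entropies can be used, and `alpha` and `epochs` are illustrative defaults.

```python
def predicts_positive(weights, x, threshold):
    """True when sum_i w_i * x_i exceeds the threshold."""
    return sum(w * xi for w, xi in zip(weights, x)) > threshold


def winnow_train(samples, threshold, alpha=2.0, epochs=100):
    """samples: list of (feature vector, bool label) pairs.

    Weights start at 1 and change only on mistakes: features are
    promoted after a false negative and demoted after a false positive.
    """
    weights = [1.0] * len(samples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, label in samples:
            if predicts_positive(weights, x, threshold) == label:
                continue
            mistakes += 1
            sign = 1.0 if label else -1.0
            # Assumed real-valued variant: w_i *= alpha ** (+/- x_i).
            weights = [w * alpha ** (sign * xi)
                       for w, xi in zip(weights, x)]
        if mistakes == 0:
            break
    return weights
```

Here a positive label would mean "CC-radix was the fastest algorithm for this training input".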
20
Selection at run time
  • Sample the input array (every fourth entry)
  • Compute the entropy vector
  • Compute $S = \sum_i w_i \cdot entropy_i$
  • If S exceeds the threshold T
  • choose CC-radix
  • else
  • choose others based on size of input
  • (either Merge Sort or QuickSort)
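
The run-time selection steps can be sketched as follows. The weights, the threshold and the size cutoff are hypothetical inputs that would come from the installation-time learning phase, and the choice of Merge Sort for large inputs and Quicksort for small ones is an assumption; the slide only says that the choice depends on the input size.

```python
import math


def digit_entropy(sample, shift, mask):
    """Entropy of one digit (selected by shift and mask) of the sample."""
    counts = {}
    for k in sample:
        d = (k >> shift) & mask
        counts[d] = counts.get(d, 0) + 1
    n = len(sample)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_sort(keys, weights, threshold, size_cutoff, r=8):
    """Return the name of the algorithm to run on keys."""
    sample = keys[::4]  # every fourth entry
    mask = (1 << r) - 1
    entropy = [digit_entropy(sample, i * r, mask)
               for i in range(len(weights))]
    s = sum(w * e for w, e in zip(weights, entropy))
    if s > threshold:
        return "cc-radix"
    # Assumed direction of the size test (not stated on the slide).
    return "multiway-merge" if len(keys) >= size_cutoff else "quicksort"
```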

21
Summary
  • Architectural Factors
  • Cache / TLB size
  • Number of Registers
  • Cache Line Size

Empirical Search
  • Runtime Factors
  • Distribution shape of the data
  • Amount of data to Sort
  • Distribution Width

Any, since it doesn't matter
Learn at installation time
22
Performance Results
23
Performance Results
24
Is it possible to do better?
25
Sorting Primitives
New sorting algorithms are built from sorting and
selection primitives
  • Sorting primitive: a pure sorting step, taken
    from the algorithms looked at before
  • Selection primitive: a process executed at run
    time to decide which sorting algorithm to apply

26
Sorting Primitives
  • Divide-by-Value (DV): corresponds to the first
    phase of Quicksort; takes the number of pivots
    as a parameter (np1)
  • - A step in Quicksort
  • - Select one or multiple pivots and partition
    the input array around these pivots
  • Divide-by-Position (DP): corresponds to the
    initial break of Merge Sort; takes the size of
    each partition and the fan-out of the heap as
    parameters
  • - Divide the input into same-size sub-partitions
  • - Use a heap to merge the multiple sorted
    sub-partitions
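
A sketch of the Divide-by-Value step with several pivots, using the standard `bisect` module. How the pivots are chosen is not covered on the slide, so here they are simply passed in; the function name is illustrative.

```python
import bisect


def divide_by_value(keys, pivots):
    """Split keys into len(pivots) + 1 partitions around the pivots.

    Partition i holds the keys k with pivots[i-1] < k <= pivots[i];
    keys inside a partition keep their input order.
    """
    pivots = sorted(pivots)
    partitions = [[] for _ in range(len(pivots) + 1)]
    for k in keys:
        partitions[bisect.bisect_left(pivots, k)].append(k)
    return partitions
```

With pivots 4 and 8, the keys 5, 1, 9, 3, 7 fall into the partitions [1, 3], [5, 7] and [9], each of which can then be handed to another primitive.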

27
Sorting Primitives
  • Divide-by-Radix (DR): corresponds to one step of
    the radix sort algorithm. Takes a radix as a
    parameter.
  • Parameter: radix (r bits)
  • Step 1: Scan the input to build the distribution
    array, which records how many elements fall in
    each of the $2^r$ sub-partitions.
  • Step 2: Compute the cumulative distribution
    array, which is used as the index when copying
    the input to the destination array.
  • Step 3: Copy the input to the $2^r$
    sub-partitions.

Example, with keys bucketed by their last decimal
digit:
  index     0   1   2   3
  counter   1   1   1   1
  accum.    0   1   2   3
  src.     11  23  30  12
  dest.    30  11  12  23
28
Sorting Primitives
  • Divide-by-Radix-assuming-Uniform-distribution:
    same as above, but assumes that each bucket
    contains $n/2^r$ keys
  • - Step 1 and Step 2 in DR are expensive.
  • - If the input elements are distributed nearly
    evenly among the $2^r$ sub-partitions, the input
    can be copied into the destination array
    directly, assuming every partition has the same
    number of elements.
  • - Overhead: partition overflow
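
The idea of skipping the counting steps can be sketched as below. Each bucket gets a fixed region of ceil(n / 2^r) slots and keys are copied straight into it. The overflow policy shown here (a side list per bucket) is an assumption made for illustration; the slide does not say how overflow is handled.

```python
def divide_by_radix_uniform(src, r, shift=0):
    """Distribute keys into 2**r buckets without counting them first."""
    buckets = 1 << r
    mask = buckets - 1
    capacity = -(-len(src) // buckets)  # ceil(n / 2**r) slots each
    dest = [None] * (capacity * buckets)
    fill = [0] * buckets
    overflow = [[] for _ in range(buckets)]
    for k in src:
        d = (k >> shift) & mask
        if fill[d] < capacity:
            dest[d * capacity + fill[d]] = k
            fill[d] += 1
        else:
            # Assumed policy: keys that do not fit go to a side list.
            overflow[d].append(k)
    # Each bucket: its filled slots followed by its overflow keys.
    return [dest[d * capacity:d * capacity + fill[d]] + overflow[d]
            for d in range(buckets)]
```

On evenly spread keys the overflow lists stay empty, so the cost is a single pass over the input; on skewed keys the overflow handling is the overhead the slide refers to.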

29
Sorting Primitives
  • Once the partition is small:
  • Leaf-Divide-by-Value: same as DV, but applied
    recursively to the partitions. When a partition
    is smaller than the threshold, register sorting
    is applied.
  • Leaf-Divide-by-Radix: same as DR, but used on
    all remaining subsets. When a subset is smaller
    than the threshold, register sorting is applied.

30
Selection Primitives
  • Branch-by-Size: selects among different paths
    based on the input size
  • Branch-by-Entropy: uses entropy to select among
    different paths
  • Uses Winnow for learning the weight vector

31
Genetic Algorithm
  • Crossover
  • Propagate good sub-trees
  • Mutation
  • Mutate the structure of the algorithm.
  • Change the parameter values of primitives.

32
Genetic Algorithm
  • Fitness function
  • Average performance by S.D.
  • Uses Rank instead of fitness.
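
Rank-based selection can be sketched as follows. This is a generic linear-ranking scheme, assumed here for illustration, in which an individual's selection weight is its rank rather than its raw fitness; it also assumes that a higher fitness value is better.

```python
import random


def rank_weights(fitnesses):
    """Selection weight per individual: 1 for the worst, n for the best."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    weights = [0] * len(fitnesses)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank
    return weights


def select_parents(population, fitnesses, k, rng=random):
    """Draw k parents with probability proportional to rank."""
    return rng.choices(population, weights=rank_weights(fitnesses), k=k)
```

With fitness values 10, 1000 and 5 the weights become 2, 3 and 1, so a single outlier does not dominate the next generation the way it would under fitness-proportional selection.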

33
Performance Results
34
Performance Results
35
Is it possible to do better?
It was observed empirically that the
Branch-by-Entropy selection primitive was never
used
36
Classifier Sorting
Based on the idea that the performance of an
algorithm in one region of the input space can be
independent of its performance in other regions.
i is an input characteristic string; c is a
condition string over 1, 0 and a don't-care
symbol.
37
  • Example
  • Encode the number of keys into 4 bits.
  • 0000 = 0 to 1M keys, 0001 = 1 to 2M keys
  • Number of keys: 10.5M, encoded as 1100

Classifier table columns: Condition, Action,
Fitness, Accuracy
Actions: (dr 5 (lq 1 16)), (dp 4 2 (lr 5 16)),
(dv 2 (lr 6 16))
Input string: 1100
Condition strings: 01, 1010, 110
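
Matching an input string against classifier conditions can be sketched as below. Everything specific in it is hypothetical: `*` is used as the don't-care symbol, the three conditions and fitness values are made up, the pairing of conditions with the actions from the slide is arbitrary, and picking the matching classifier with the highest fitness is an assumed policy.

```python
def matches(condition, bits, wildcard="*"):
    """True if every non-wildcard position of condition equals bits."""
    return len(condition) == len(bits) and all(
        c == wildcard or c == b for c, b in zip(condition, bits))


def choose_action(classifiers, bits):
    """classifiers: list of (condition, action, fitness) tuples.

    Returns the action of the fittest matching classifier, or None.
    """
    matching = [c for c in classifiers if matches(c[0], bits)]
    if not matching:
        return None
    return max(matching, key=lambda c: c[2])[1]


# Hypothetical rule set; conditions and fitness values are invented.
RULES = [
    ("**01", "(dr 5 (lq 1 16))", 0.9),
    ("1010", "(dp 4 2 (lr 5 16))", 0.8),
    ("110*", "(dv 2 (lr 6 16))", 0.7),
]
```

For the input string 1100 only the third rule matches, so `choose_action(RULES, "1100")` returns its action.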
38
Experimental Results
39
Experimental Results
40
Experimental Results
41
Summary and Future work
  • The work presented shows how sorting can be
    adapted to underlying platforms
  • Potential future work
  • Figure out what went wrong or not wrong with
    those graphs
  • Incorporate the notion of sortedness into sort
    selection
  • Simplify the selection algorithm
  • See if these notions can be used in a
    cache-oblivious way