Title: Improve Run Generation
1. Improve Run Generation
- Overlap input, output, and internal CPU work.
- Reduce the number of runs (equivalently, increase
average run length).
2. Internal Quick Sort
Use the median of 3 as the pivot (6 in the slide's example). Read the first,
middle, and last blocks first. Partition in place.
Read the remaining blocks from the ends toward the
middle. Sort the left and right groups
recursively. Output can begin as soon as the
leftmost block is ready.
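The pivot choice and the in-place partitioning can be sketched in Python as follows. This is a minimal in-memory sketch only: it ignores block input and output entirely, the function name quick_sort is hypothetical, and the two-pointer scan from both ends is one common way to partition in place, not necessarily the exact variant the slides have in mind.

```python
def quick_sort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place using a median-of-3 pivot (illustrative sketch)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Median of the first, middle, and last elements.
    pivot = sorted((a[lo], a[(lo + hi) // 2], a[hi]))[1]
    i, j = lo, hi
    # Scan from both ends toward the middle, swapping misplaced pairs.
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    quick_sort(a, lo, j)   # left group: keys <= pivot
    quick_sort(a, i, hi)   # right group: keys >= pivot


data = [6, 2, 9, 4, 8, 1, 7, 2, 6]
quick_sort(data)
```

Because the left group is sorted first, its smallest keys are final before the right group is touched, which is what allows output to start early.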
3. Alternative Internal Sort Scheme
Partition memory into 3 areas; each may be more than 1
block in size.
4. Steady State Operation
- Synchronization is done when the current internal
sort terminates.
5. New Strategy
- Use 2 input and 2 output buffers.
- Rest of memory is used for a min loser tree.
- Actually, 3 buffers are adequate.
6. Steady State Operation
- Synchronization is done when the active input
buffer gets empty (the active output buffer will
be full at this time).
7. Initialize
[Figure sequence, slides 7 to 12: a min loser tree with 16 external nodes holding the keys 4 3 6 8 1 5 7 3 2 6 9 4 5 2 5 8 is built one match at a time; the overall winner is 1. An input buffer is filled from disk at the same time ("Fill From Disk").]
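The initialization step can be sketched in Python as below. This is a minimal sketch under stated assumptions: the class name LoserTree and its methods are hypothetical, the 16 keys are the ones that appear in the slide's figure (reading them as the tree's external nodes is an interpretation of the extracted figure), and the run numbers that a real run generator would attach to each key are left out.

```python
class LoserTree:
    """Min loser tree over k >= 1 keys (hypothetical helper class).

    tree[0] holds the index of the overall winner; tree[1..k-1] hold the
    index of the loser of the match played at each internal node.
    """

    def __init__(self, keys):
        self.keys = list(keys)
        self.k = k = len(self.keys)
        self.tree = [0] * k
        # winner[p] = index of the key that wins the subtree rooted at
        # position p; external nodes occupy positions k .. 2k-1.
        winner = [0] * (2 * k)
        for i in range(k):
            winner[k + i] = i
        for p in range(k - 1, 0, -1):
            left, right = winner[2 * p], winner[2 * p + 1]
            if self.keys[left] <= self.keys[right]:
                winner[p], self.tree[p] = left, right
            else:
                winner[p], self.tree[p] = right, left
        self.tree[0] = winner[1]

    def winner_key(self):
        return self.keys[self.tree[0]]

    def replace_winner(self, new_key):
        """Overwrite the winner's key and replay matches up to the root."""
        s = self.tree[0]
        self.keys[s] = new_key
        p = (s + self.k) // 2
        while p > 0:
            if self.keys[self.tree[p]] < self.keys[s]:
                # The stored loser beats the climbing key: swap their roles.
                s, self.tree[p] = self.tree[p], s
            p //= 2
        self.tree[0] = s


t = LoserTree([4, 3, 6, 8, 1, 5, 7, 3, 2, 6, 9, 4, 5, 2, 5, 8])
first = t.winner_key()       # smallest key in the tree
t.replace_winner(3)          # next record arrives from the input buffer
second = t.winner_key()
```

Only the matches on the path from the replaced external node to the root are replayed, so each replacement costs O(log k) comparisons.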
13. Generate Run 1
[Figure sequence, slides 13 to 16: run 1 begins. Winners move from the loser tree to the active output buffer ("Fill From Tree") and are replaced by the records 3, 5, 4 from the active input buffer, while the other input buffer is filled from disk ("Fill From Disk"). Once the active input buffer is empty, the buffers interchange roles.]
17. Interchange Role Of Buffers
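[Figure: after the interchange, the full output buffer is written to disk ("Write To Disk"), the other output buffer is filled from the tree ("Fill From Tree"), the new active input buffer holds 1 9 2, and the emptied input buffer is refilled ("Fill From Disk").]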
18. Continue With Run 1
[Figure sequence, slides 18 to 25: run 1 continues with three activities overlapped: "Write To Disk", "Fill From Tree", and "Fill From Disk". The records 1, 9, 2 enter the tree from the active input buffer; the buffers interchange roles again (slides 21 and 22), and the next active input buffer holds 6 1 3.]
26. Run Size
- Let k be the number of external nodes in the loser tree.
- Run size >= k.
- Sorted input => 1 run.
- Reverse of sorted input => n/k runs.
- Average run size is about 2k (for random input).
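The run-size claims above can be checked with a small simulation. The sketch below is an assumption-laden illustration: it uses Python's heapq binary heap in place of the loser tree (the run boundaries are the same, only the selection structure differs), it omits the input and output buffers, and the name generate_runs is hypothetical.

```python
import heapq
from itertools import islice

_DONE = object()  # sentinel marking the end of the input


def generate_runs(records, k):
    """Split records into sorted runs using k records of working memory."""
    it = iter(records)
    # Each heap entry is (run_number, key): entries tagged with a later
    # run lose to every entry of the current run.
    heap = [(1, key) for key in islice(it, k)]
    heapq.heapify(heap)
    runs = []
    while heap:
        run_no, key = heapq.heappop(heap)
        if run_no > len(runs):
            runs.append([])                    # first record of a new run
        runs[-1].append(key)
        nxt = next(it, _DONE)
        if nxt is not _DONE:
            # A key smaller than the one just output cannot extend this run.
            tag = run_no if nxt >= key else run_no + 1
            heapq.heappush(heap, (tag, nxt))
    return runs


sorted_runs = generate_runs(range(96), 8)          # already sorted input
reverse_runs = generate_runs(range(96, 0, -1), 8)  # reverse-sorted input
```

With n = 96 and k = 8, the sorted input yields a single run and the reverse-sorted input yields n/k = 12 runs of exactly k records each.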
27. Comparison
- Memory capacity = m records.
- Run size using the fill memory, sort, and output run
scheme = m.
- Use the loser tree scheme.
- Assume block size is b records.
- Need memory for 4 buffers (4b records).
- Loser tree: k = m - 4b.
- Average run size = 2k = 2(m - 4b).
- 2k >= m when m >= 8b.
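The break-even point follows from 2(m - 4b) >= m, which rearranges to m >= 8b. A tiny Python sketch makes the arithmetic concrete; the function name run_sizes is hypothetical, and memory is measured in records as in the bullets above.

```python
def run_sizes(m, b):
    """Return (fill/sort/output run size, average loser tree run size)."""
    k = m - 4 * b          # records left for the loser tree after 4 buffers
    return m, 2 * k


# Memory sizes expressed in blocks of b = 1 record for readability.
below = run_sizes(6, 1)    # m < 8b: the loser tree scheme gives shorter runs
equal = run_sizes(8, 1)    # m = 8b: break-even
above = run_sizes(10, 1)   # m > 8b: the loser tree scheme gives longer runs
```

For m = 10b, for instance, the average loser tree run is 12b records against 10b for the fill memory, sort, and output run scheme.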
29. Comparison
- Total internal processing time using the fill memory,
sort, and output run scheme = O((n/m) m log m) = O(n log m).
- Total internal processing time using the loser tree
= O(n log k).
- The loser tree scheme generates runs that differ in
their lengths.
30. Merging Runs Of Different Length
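[Figure: two merge trees for the same set of runs. One has internal nodes 7, 13, and 22 (cost 42); the other has internal nodes 7, 15, and 22 (cost 44).]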
Best merge sequence?
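The two costs on this slide can be reproduced with a short Python sketch. Several things here are assumptions: the run lengths 3, 4, 6, 9 are illustrative (the figure's internal nodes only imply runs of 6 and 9 plus two runs totalling 7), the cost of a 2-way merge is taken to be the length of the merged run, and the shortest-first greedy rule (the idea behind Huffman trees) is offered as one well-known candidate answer, not as something this section states.

```python
import heapq


def merge_cost_shortest_first(run_lengths):
    """Total cost when the two shortest runs are always merged next."""
    heap = list(run_lengths)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        cost += merged                 # every record in the result is moved once
        heapq.heappush(heap, merged)
    return cost


def merge_cost_balanced(run_lengths):
    """Total cost when adjacent runs are merged pass by pass."""
    runs, cost = list(run_lengths), 0
    while len(runs) > 1:
        nxt = []
        for i in range(0, len(runs) - 1, 2):
            merged = runs[i] + runs[i + 1]
            cost += merged
            nxt.append(merged)
        if len(runs) % 2:
            nxt.append(runs[-1])       # odd run out waits for the next pass
        runs = nxt
    return cost


lengths = [3, 4, 6, 9]                 # hypothetical run lengths
greedy = merge_cost_shortest_first(lengths)
balanced = merge_cost_balanced(lengths)
```

With these lengths the shortest-first order costs 7 + 13 + 22 = 42 and the balanced order costs 7 + 15 + 22 = 44, matching the two trees in the figure.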