Title: A Survey of External Memory Algorithms and Data Structures
1. A Survey of External Memory Algorithms and Data Structures
- Presented by Reynold Cheng
- Feb 10, 2003
2. Internal vs. External Memory
- Internal memory (RAM) is not sufficient to store the data sets of large applications
- External memory (e.g., disks) is used to store the data for the algorithm
- The I/O between fast internal memory and slow external memory is a performance bottleneck.
3. Virtual Memory and I/O Performance
- Virtual memory in the OS
- Provides one uniform address space
- Principle of locality
- Caching and prefetching improve performance
- Designed to be general-purpose
- Doesn't help if computations are inherently nonlocal
- Results in large amounts of I/O and poor performance
4. EM Algorithms and Data Structures
- Incorporate locality directly into algorithm design
- Bypass the virtual memory system
- EM algorithms and data structures explicitly manage data placement and movement within the memory hierarchy
- I/O communication between internal memory (RAM) and external memory (magnetic disk)
5. Two Problem Categories
- Batched problems
- No preprocessing is done
- Process the entire file of data items
- Stream data through internal memory in one or more passes
- Online problems
- Immediate response to queries
- Only a small portion of the data is examined for each query
- Usually organize the data items in a hierarchical index
- Data can be static or dynamic
6. Talk Outline
- Modeling Parallel Disks
- Design Goal of EM Algorithms
- Striping and Load Balancing
- Batched Problems
- Algorithms for External Sorting
- Online Problems
- Hash tables and B-trees
- Conclusions
7. Parallel Disk Model (PDM)
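The model's parameters, following [JSV2001] (lowercase letters denote the same quantities measured in disk blocks):

```latex
\begin{align*}
N &= \text{problem size, in items} & n &= N/B \\
M &= \text{internal memory size, in items} & m &= M/B \\
Z &= \text{query answer size, in items} & z &= Z/B \\
B &= \text{block size, in items per block} & D &= \text{number of independent disks}
\end{align*}
```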
8. Disk Block Size B
- Rotational latency and seek time are long
- Improve average access time by transferring data in blocks
- Track size: 50-200 KB
- Batched applications: B a little larger than the track size (about 100 KB)
- Online applications (pointer-based indexes): B about 8 KB
- Further improve performance with parallel disks
9. Data Placement and Disk Striping
- We assume the input data are striped across the D disks
- Example: D = 5, B = 2
- Items 12 and 13 are stored in the 2nd block (stripe 1) of disk D1
- N items are read/written in O(N/DB) = O(n/D) I/Os, which is optimal
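As a sketch, the mapping from an item's index to its physical location under striping, assuming 0-based item numbering and round-robin block placement (the function name is ours):

```python
def locate(item: int, D: int, B: int) -> tuple[int, int, int]:
    """Return (disk, stripe, offset) of an item under disk striping."""
    block = item // B     # global block number
    stripe = block // D   # stripe: one logical row across all D disks
    disk = block % D      # disk holding this block within the stripe
    offset = item % B     # position of the item inside the block
    return disk, stripe, offset

# With D = 5 disks and block size B = 2, items 12 and 13 both land in
# stripe 1 on disk 1, matching the slide's example.
print(locate(12, D=5, B=2))
print(locate(13, D=5, B=2))
```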
10. Performance Measures in PDM
- Number of I/O operations
- Disk space used (utilization)
- Ideally, data structures should use linear space, i.e., O(N/B) = O(n) disk blocks
- Internal (sequential/parallel) computation time
- We focus only on the first two measures
- Most of the algorithms described run in optimal CPU time
11. Fundamental I/O Bounds (1)
- I/O performance is often expressed in terms of the bounds for:
- Scan(N): sequentially reading/writing N items
- Sort(N): sorting N items
- Search(N): searching among N sorted items
- Output(Z): outputting Z answers to a query in a blocked, output-sensitive fashion
12. Fundamental I/O Bounds (2)
- Scan(N) and Sort(N) apply to batched problems
- Search(N) and Output(Z) apply to online problems
- Typically combined in the form Search(N) + Output(Z)
13. Optimal I/O Bounds
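The bounds themselves, as given in Vitter's survey [JSV2001], with n = N/B, m = M/B, z = Z/B:

```latex
\begin{align*}
\mathrm{Scan}(N)   &= \Theta\!\left(\frac{n}{D}\right) \\
\mathrm{Sort}(N)   &= \Theta\!\left(\frac{n}{D}\log_m n\right) \\
\mathrm{Search}(N) &= \Theta\!\left(\log_{DB} N\right) \\
\mathrm{Output}(Z) &= \Theta\!\left(\max\!\left\{1,\ \frac{z}{D}\right\}\right)
\end{align*}
```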
14. Comments on I/O Bounds (1)
- Scan(N) = Θ(n/D), a linear number of I/Os
- Nontrivial batched problems like permuting, matrix transposition, and list ranking need a nonlinear number of I/Os, even though they can be solved in linear time in internal memory
15. Comments on I/O Bounds (2)
- Multiple-disk bounds for Scan(N), Sort(N), and Output(Z) are D times smaller than the single-disk bounds
- For Search(N), the speedup is less significant:
- Θ(log_B N) for D = 1 becomes Θ(log_{DB} N) for D > 1
- Speedup = Θ((log_B N) / (log_{DB} N)) = Θ((log DB) / (log B)) = Θ(1 + (log D) / (log B)) < 2
16. Locality and I/O Performance
- Key to efficient I/O: access data with a high degree of locality
- Single disk: each I/O transfers a block of B items; optimal when all B items are needed
- Multiple disks: each I/O transfers D blocks (a stripe); optimal when all DB items are needed
17. Single-Disk Locality
- Many batched problems like sorting require O(N log N) CPU time
- If the data set doesn't fit into RAM, relying on virtual memory may need O(N log n) I/Os!
- Goal: incorporate locality into the algorithm design to achieve O(n log_m n) I/Os
18. Design Goal of EM Algorithms
- For batched problems:
- The N and Z terms in the I/O bounds of the naive algorithms are replaced by n and z
- The base of the logarithm terms goes from 2 to m
- For online problems:
- The base of the logarithm terms goes from 2 to B
- Z becomes z
- The speedup is significant; e.g., for batched problems, (N log n) / (n log_m n) = B log m
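The last claim follows from N = nB and log_m n = (log n)/(log m):

```latex
\[
\frac{N\log n}{n\log_m n}
= \frac{nB\log n}{n\log_m n}
= B\cdot\frac{\log n}{(\log n)/(\log m)}
= B\log m .
\]
```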
19. Multiple-Disk Locality
- Disk striping: I/Os are permitted on entire stripes, one stripe at a time
20. Single-Disk to Multiple-Disk (1)
- Net effect of disk striping: the system behaves as a single disk with logical block size DB
- Apply the disk-striping paradigm to automatically convert:
- an algorithm for a single disk with block size DB
- into an algorithm for D disks with block size B
21. Single-Disk to Multiple-Disk (2)
- Example: a single-disk algorithm for searching requires Θ(log_B N) I/Os
- Using striping, we obtain a multiple-disk algorithm by substituting DB for B: Θ(log_{DB} N) I/Os
- Disk striping is very powerful: it yields optimal multiple-disk algorithms from optimal single-disk algorithms for streaming, online search, and output reporting
22. Disk Striping and Sorting
- Disk striping is not optimal for sorting!
- I/O bound with disk striping: Θ((n/D) log_{m/D} n), since the logical memory size in logical blocks shrinks to m/D
- Optimal I/O bound: Θ((n/D) log_m n)
- Striping exceeds the optimal bound by a factor of Θ((log m) / log(m/D))
23. Sorting with Multiple Disks
- To attain the optimal sorting bound, we need to forsake disk striping
- Control the disks independently
- Average/worst-case cost
- Algorithms are based on the distribution and merge paradigms
- Online load balancing: distribute the data evenly over the D disks for access
24. Distribution Sort with D Disks
(Figure: S = 7 buckets formed using 6 partitioning elements)
- Use S-1 partitioning elements to partition the items into S disjoint buckets
- Items in bucket i are smaller than items in bucket j for i < j
- Sort the buckets recursively
- Concatenate the sorted buckets
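The four steps above can be sketched in memory as follows. The pivot choice by sorting the whole input is purely illustrative (a real EM algorithm samples, as the next slide describes), and S and the recursion cutoff are assumptions of ours:

```python
import bisect

def distribution_sort(items, S=7, cutoff=8):
    if len(items) <= cutoff:                 # small bucket: sort directly
        return sorted(items)
    # Choose S-1 partitioning elements (here: evenly spaced sample values).
    sample = sorted(items)
    step = max(1, len(items) // (S - 1))
    pivots = sample[::step][1:S]
    buckets = [[] for _ in range(len(pivots) + 1)]
    for x in items:                          # bucket i holds smaller items than bucket i+1
        buckets[bisect.bisect_left(pivots, x)].append(x)
    if max(len(b) for b in buckets) == len(items):
        return sorted(items)                 # degenerate split (e.g. all items equal)
    out = []
    for b in buckets:                        # sort recursively, then concatenate
        out.extend(distribution_sort(b, S, cutoff))
    return out
```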
25. Partitioning Elements
- Choose the S-1 partitioning elements so that the bucket sizes are roughly equal, using Θ(n/D) I/Os
- Bucket sizes decrease by a factor of Θ(S), giving O(log_S n) levels of recursion
- The maximum S is Θ(m), so the minimum number of recursion levels is O(log_m n)
- Deterministic methods exist for choosing S ≈ m partitioning elements
26. I/O Bound for Distribution Sort
- If each level of recursion uses Θ(n/D) I/Os, the total number of I/Os is O((n/D) log_m n)
- Each set of items being partitioned is itself a bucket formed at the previous level of recursion
- The blocks of a bucket should be spread evenly across the disks for the next read, so that all D disks are involved when reading from a bucket
27. Randomized Distribution Sort [VS94]
- Goal: with high probability, each bucket is well balanced across the D disks
- If N is so large that the number of blocks per bucket, Θ(n/S), is Ω(D log D), then write the D blocks in independent random order to a disk stripe
28. Classical Occupancy Problem
- b balls are inserted independently and uniformly at random into d bins
- If b/d grows faster than log d, the largest bin contains about b/d balls on average, i.e., the distribution is even
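A quick simulation illustrates the claim, under the assumption of independent uniform throws (function name and parameters are ours):

```python
import random

def max_bin_load(b: int, d: int, seed: int = 42) -> float:
    """Throw b balls into d bins; return max load divided by average load b/d."""
    rng = random.Random(seed)
    bins = [0] * d
    for _ in range(b):
        bins[rng.randrange(d)] += 1
    return max(bins) / (b / d)

# With b = 200_000 balls and d = 10 bins (so b/d far exceeds log d),
# the fullest bin is within a few percent of the average.
print(max_bin_load(200_000, 10))
```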
29. Distribution Sort for the Case Θ(n/S) = O(D log D)
- The previous technique breaks down when Θ(n/S) = O(D log D)
- A typical memoryload contains S log S blocks and is well balanced among the S buckets
- 1st pass: read the file in one pass, one memoryload at a time; permute it and write it to disk
- 2nd pass: extract a part from each of several memoryloads to form a typical memoryload
- Output each memoryload via a round-robin placement of the S buckets on the D disks
30. Merge Sort with D Disks
- Orthogonal to the distribution paradigm
- Run formation: scan the n blocks of data, one memoryload (m blocks) at a time; sort each load and output it on stripes
- This yields N/M = n/m sorted runs
- Merging: scan and merge groups of R = Θ(m) runs
- Number of passes = log_R(n/m) ≈ log_m n - 1
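The run-formation and merging steps above can be sketched in memory as follows; run_len plays the role of M items and R is the merge arity (both names are ours, and a real EM implementation would stripe the runs across the D disks):

```python
import heapq

def external_merge_sort(items, run_len=8, R=4):
    # Run formation: sort one "memoryload" of run_len items at a time.
    runs = [sorted(items[i:i + run_len]) for i in range(0, len(items), run_len)]
    # Merge passes: R-way merge groups of runs until a single run remains.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + R])) for i in range(0, len(runs), R)]
    return runs[0] if runs else []
```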
31. I/O Bound for Merge Sort
- If each pass uses Θ(n/D) I/Os, the total number of I/Os is O((n/D) log_m n)
- Must bring in the next Θ(D) blocks on average at each merge step
- Ensure that the blocks needed for merging reside on different disks
32. Simple Randomized Mergesort [BV99]
- Each run is striped starting at a randomly chosen disk
- At any time, the disk containing the leading block of any run is uniformly random
(Figure: D = 4 disks, R = 8 runs)
33. Cyclic Occupancy Problem
- Conjecture: the expected maximum bin size is at most that of the classical occupancy problem
34. External Hashing for Online Dictionary Search
- Advantage of hashing: the expected number of probes per operation is constant, independent of N
- Goal: develop dynamic EM structures that adapt smoothly to widely varying values of N
- Directory method (extendible hashing): 2 I/Os per access (directory + data)
- Directory-less method (linear hashing): 1 I/O only, but may require overflow lists
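A minimal single-machine sketch of the directory method: a directory of 2^g pointers indexed by the low g bits of the hash, buckets of capacity B (one disk block), and a split that doubles the directory only when the full bucket is at the global depth. The class and its internals are illustrative assumptions, not the survey's exact scheme:

```python
class ExtendibleHash:
    def __init__(self, B=4):
        self.B = B                         # bucket capacity: one disk block
        self.g = 0                         # global depth: directory has 2^g slots
        self.dir = [{"depth": 0, "items": []}]

    def _idx(self, key):
        return hash(key) & ((1 << self.g) - 1)   # low g bits select the slot

    def insert(self, key):
        b = self.dir[self._idx(key)]
        if len(b["items"]) < self.B:
            b["items"].append(key)
            return
        if b["depth"] == self.g:           # bucket at global depth: double directory
            self.dir = self.dir + self.dir
            self.g += 1
        # Split the full bucket: redistribute its items on bit `depth`.
        b0 = {"depth": b["depth"] + 1, "items": []}
        b1 = {"depth": b["depth"] + 1, "items": []}
        for k in b["items"]:
            ((hash(k) >> b["depth"]) & 1 and b1 or b0)["items"].append(k)
        for i, e in enumerate(self.dir):   # repoint all aliases of the old bucket
            if e is b:
                self.dir[i] = b1 if (i >> b["depth"]) & 1 else b0
        self.insert(key)                   # retry; may trigger further splits

    def lookup(self, key):
        return key in self.dir[self._idx(key)]["items"]
```

A lookup touches the directory once and one bucket once, which is the source of the "2 I/Os" figure above.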
35. Multiway Trees
- Items in the tree are sorted, so the tree can be used for 1D range search
- To find the items in [x, y]: search for x, then do an inorder traversal from x to y
- Arise naturally in online settings where updates and queries are processed immediately
- To exploit block transfers, the balanced multiway B-tree was proposed; it is the most widely used nontrivial EM data structure
36. B-Trees and B+-Trees
- In a B+-tree, all items are stored in the leaves
- Internal nodes store only key values and pointers, giving a higher branching factor
- Leaves are linked together sequentially to facilitate range queries or sequential access
- The B+-tree is the most popular variant of the B-tree
37. B*-Trees
- Postpone splitting by sharing a node's data with one of its adjacent siblings
- Split a node only when its sibling is also full, and evenly redistribute the data among the node and its siblings
- Reduces the number of times new nodes are created
- Higher storage utilization, lower search I/O costs
- Average node utilization rises from 69% to 81%
38. The Buffer Tree [ARGE95]
- Each B-tree insertion takes O(log_B N) I/Os
- Building a B-tree by repeated insertion therefore takes O(N log_B N) I/Os!
- We can take advantage of blocking and obtain O(n log_m n) I/Os instead
- The buffer tree [ARGE95]
39. Main Idea of the Buffer Tree
- Logically group nodes together and add buffers
- Balanced tree of degree Θ(m) rather than Θ(B)
- Each node has a buffer for storing Θ(M) items (Θ(m) blocks)
- Insertions are done lazily: items are inserted into buffers
- When a buffer is full, its items are pushed one level down
40. I/O Cost of Buffer Trees
- Buffer emptying takes O(m) I/Os
- Amortize the cost of distributing M items among the Θ(m) children
- Each item incurs an amortized cost of O(m/M) = O(1/B) I/Os per level
- The resulting cost per update/query is only O((1/B) log_m n) I/Os amortized
- Cost of inserting N = nB items: nB · O((1/B) log_m n) = O(n log_m n) I/Os
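Spelled out, with M = mB items distributed per buffer emptying:

```latex
\[
\frac{O(m)\ \text{I/Os}}{M\ \text{items}}
 = O\!\left(\frac{m}{M}\right) = O\!\left(\frac{1}{B}\right)
 \ \text{I/Os per item per level},
\qquad
N \cdot O\!\left(\frac{1}{B}\log_m n\right)
 = nB \cdot O\!\left(\frac{1}{B}\log_m n\right)
 = O\!\left(n \log_m n\right).
\]
```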
41. Conclusions (1)
- Internal-memory algorithms are not readily adapted to external memory
- EM algorithms are developed to handle I/O communication more efficiently
- The most fundamental I/O bounds are scanning, sorting, searching, and outputting
- The goal of EM algorithms is to achieve these optimal bounds
42. Conclusions (2)
- Striping is optimal for scanning, searching, and outputting, but not for sorting
- Randomized distribution sort and merge sort can achieve the optimal I/O bound
- Hash tables are best for online dictionary search
- B-trees are best for 1D online range search
- Buffer trees improve the speed of B-tree construction
43. Other Interesting Topics
- Handling duplicates during sorting
- Permutation, Fast Fourier Transform (FFT), computational geometry, and graphs
- Multi-dimensional data, R-trees, and range queries
- Dynamic data structures, moving objects, strings
- EM algorithm development tools (e.g., TPIE)
44. References
- [JSV2001] J. S. Vitter. External memory algorithms and data structures: Dealing with MASSIVE DATA. ACM Computing Surveys 33(2), June 2001, 209-271.
- [JSV1998] J. S. Vitter. External memory algorithms. Invited tutorial in Proceedings of the 17th Annual ACM Symposium on Principles of Database Systems (PODS '98), Seattle, WA, June 1998.
- [VS94] J. S. Vitter and E. A. M. Shriver. Algorithms for parallel memory I: Two-level memories. Algorithmica 12(2-3), 1994, 110-147.
45. References (continued)
- [BV99] R. D. Barve and J. S. Vitter. A simple and efficient parallel disk mergesort. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures (St. Malo, France, June 1999), 232-241.
- [BY96] R. Baeza-Yates. Bounded disorder: The effect of the index. Theoretical Computer Science 168, 1996, 21-38.
- [ARGE95] L. Arge. The buffer tree: A new technique for optimal I/O-algorithms. In Proceedings of the Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science Vol. 955, Springer-Verlag, 1995, 334-345.