Lower Bound Techniques for Data Structures - PowerPoint PPT Presentation

Title: Lower Bounds Techniques for Data Structures
Last modified by: Mihai Patrascu
Created Date: 8/16/2006
Document presentation format: PowerPoint PPT presentation

Slides: 60
Transcript and Presenter's Notes


1
Lower Bound Techniques for Data Structures
Mihai Pătraşcu
  • Committee
  • Erik Demaine (advisor)
  • Piotr Indyk
  • Mikkel Thorup

2
Data Structures
  • I don't study stacks, queues and binary search
    trees!
  • I do study data structure problems (a.k.a.
    Abstract Data Types)

predecessor search
Preprocess T = {n numbers}
pred(q) = max { y ∈ T | y < q }
partial-sums problem
Maintain an array A[n] under:
update(i, Δ): A[i] = Δ
sum(i): return A[0] + … + A[i]
3
Motivation?
packet forwarding
4
Binary Search Trees Upper Bound
Binary search trees solve predecessor
search ⇒ Complexity of
predecessor: O(lg n)/operation
my work
Augmented binary search trees solve partial
sums ⇒ Complexity of partial sums: O(lg
n)/operation
my work
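The two O(lg n) upper bounds can be illustrated with a minimal sketch. This is not the slide's construction: a Fenwick (binary indexed) tree is swapped in for the augmented binary search tree, update(i, Δ) is read as "set A[i] to Δ", and predecessor search is shown only for a static sorted list via the standard `bisect` module. The names `PartialSums` and `pred` are chosen here for illustration.

```python
from bisect import bisect_left


class PartialSums:
    """Fenwick tree stand-in for an augmented BST; O(lg n) per operation."""

    def __init__(self, n):
        self.values = [0] * n        # current contents of A
        self.tree = [0] * (n + 1)    # 1-based Fenwick array

    def update(self, i, delta):
        """Set A[i] = delta (assumed reading of update(i, Δ))."""
        diff = delta - self.values[i]
        self.values[i] = delta
        j = i + 1
        while j < len(self.tree):
            self.tree[j] += diff
            j += j & -j              # move to the next node covering index i

    def sum(self, i):
        """Return A[0] + ... + A[i]."""
        total, j = 0, i + 1
        while j > 0:
            total += self.tree[j]
            j -= j & -j              # strip the lowest set bit
        return total


def pred(sorted_t, q):
    """Return max{y in T : y < q} for a sorted list T, or None if empty."""
    k = bisect_left(sorted_t, q)
    return sorted_t[k - 1] if k > 0 else None
```

Both operations of `PartialSums` touch O(lg n) array cells, which is the quantity the lower bounds in this talk are about.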
5
What kind of lower bound?
Lower bounds you can trust.™
  • Model of computation: real computers
  • memory words of w > lg n bits (pointers = words)
  • random access to memory
  • any operation on CPU registers (arithmetic,
    bitwise)
  • Just prove lower bound on memory accesses

Array Mem[1..S] of w-bit words
Black box
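A small sketch of how cost is measured in this model: only reads and writes of memory words are counted, and all computation on values already fetched is free. The class `CountingMemory` and the function `binary_search_probes` are hypothetical names introduced for illustration, and addresses are 0-based here rather than 1..S.

```python
class CountingMemory:
    """Array of S words of w bits each; every read or write counts as one probe."""

    def __init__(self, S, w):
        self.words = [0] * S
        self.mask = (1 << w) - 1     # keep stored values within w bits
        self.probes = 0

    def read(self, addr):
        self.probes += 1
        return self.words[addr]

    def write(self, addr, value):
        self.probes += 1
        self.words[addr] = value & self.mask


def binary_search_probes(mem, n, q):
    """Predecessor search over sorted mem[0..n-1].

    Returns (index of the largest element < q or -1, probes used).
    """
    start = mem.probes
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if mem.read(mid) < q:        # the only step that costs anything
            lo = mid + 1
        else:
            hi = mid
    return lo - 1, mem.probes - start
```

A lower bound in this model says that no choice of layout and no amount of register-level cleverness can bring the probe count below the stated value.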
6
Why Data Structures?
I want to understand computation.
  • Other settings
  • streaming: L.B. = many; not very
    computational, mostly storage / info theory
  • space-bounded (P vs L):
    L.B. = a few, Ω(n √lg n); unnatural
    questions
  • algebraic: L.B. = some; cool, but not
    real computing
  • depth 3 circuits with mod-6 gates ??
  • The gospel
  • data structures: L.B. = some;
    understand some nontrivial computational
    phenomena
  • efficient algorithms, circuit L.B.: not
    forthcoming
  • hard optimization, NP-completeness:
    L.B. = one per STOC/FOCS

7
Why Data Structures?
Weak as some of the lower bounds may be, it's the
area that has gotten farthest towards
understanding computation
8
History
  • [Yao, FOCS'78]
  • [Ajtai'88] -- predecessor (static)
  • [Fredman, Saks'89] -- partial sums, union
    find (dynamic)

Omitted: bounds for succinct data structures.
  • Observations
  • huge influence
  • 2nd papers
  • result wrong (better upper bound known)
  • no journal version; many claims without proof
9
History
  • [Yao, FOCS'78]
  • [Ajtai'88] -- predecessor (static)
  • [Bing Xiao, Stanford'92]
  • [Miltersen STOC'94]
  • [Miltersen, Nisan, Safra, Wigderson STOC'95]
  • [Beame, Fich STOC'99]
  • [Sen ICALP'01]
  • (1+ε)-nearest neighbor: [Chakrabarti,
    Chazelle, Gum, Lvov STOC'99]
    [Chakrabarti, Regev FOCS'04]
  • [Fredman, Saks'89] -- partial sums, union
    find (dynamic)
  • [Ben-Amram, Galil FOCS'91]
  • [Miltersen, Subramanian, Vitter, Tamassia'93]
  • [Husfeldt, Rauhe, Skyum'96]
  • [Fredman, Henzinger'98]: planar connectivity
  • [Husfeldt, Rauhe ICALP'98]: nondeterminism
  • [Alstrup, Husfeldt, Rauhe FOCS'98]: marked
    ancestor
  • [Alstrup, Husfeldt, Rauhe SODA'01]: dynamic
    2D NN
  • [Alstrup, Ben-Amram, Rauhe STOC'99]: union-find

Omitted: bounds for succinct data structures.
richness lower bounds:
[Borodin, Ostrovsky, Rabani STOC'99]: p.m.
[Barkol, Rabani STOC'00]: rand. NN
[Jayram, Khot, Kumar, Rabani STOC'03]: p.m.
[Liu'04]: det. ANN
10
Three Main Ideas
1. Epochs
2. Asym. Communication, Rectangles
3. Round Elimination
12
Review Epoch Lower Bounds
[Figure: time axis divided into epochs that end at the query]
update: mark/unmark node, in time t_u
query: any marked ancestors?, in time t_q
# updates:     r^3        r^2        r^1      r^0
bits written:  t_u·w·r^3  t_u·w·r^2  t_u·w·r  t_u·w
  • epoch j: r^j updates
  • epochs 0, …, j−1 write O(t_u·w·r^(j−1)) bits
  • pick r ≫ t_u·w

most updates from epoch j not known outside
epoch j
random query needs to read a cell from epoch j
t_q = Ω(lg n / lg r) = Ω(lg n / lg(t_u·w))
max{t_q, t_u} = Ω(lg n / lglg n)
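The arithmetic behind the epoch argument can be checked numerically. The sketch below assumes epoch 0 is the most recent one and that every update writes at most t_u·w bits; the function name `epoch_profile` and the sample parameters in the usage note are illustrative only.

```python
def epoch_profile(n, t_u, w, r):
    """For each epoch j with r^j <= n, return (j, updates, later_bits).

    updates    = r^j, the number of updates in epoch j
    later_bits = upper bound on bits written by epochs 0..j-1,
                 which run after epoch j
    """
    profile = []
    j = 0
    while r ** j <= n:
        later_bits = sum(t_u * w * r ** i for i in range(j))
        profile.append((j, r ** j, later_bits))
        j += 1
    return profile
```

With n = 2^20, t_u = 4, w = 32 and r = 1024 (so r ≫ t_u·w = 128) there are three epochs, and for every j ≥ 1 the later epochs write far fewer bits than epoch j has updates. The number of epochs is about lg n / lg r, which is where the bound on t_q comes from.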
13
Review Epoch Lower Bounds
See also: [Fredman JACM'81] [Fredman JACM'82]
[Yao SICOMP'85] [Fredman, Saks STOC'89]
[Ben-Amram, Galil FOCS'91] [Hampapuram,
Fredman FOCS'93] [Chazelle STOC'95]
[Husfeldt, Rauhe, Skyum SWAT'96]
[Husfeldt, Rauhe ICALP'98] [Alstrup,
Husfeldt, Rauhe FOCS'98]
  • Big Challenges [Miltersen'99]
  • prove some ω(lg n / lglg n) bound. Candidate: Ω(lg
    n) for the partial sums problem
  • prove ω(lg n) in the bit-probe model

Maintain an array A[n] under:
update(i, Δ): A[i] = Δ
sum(i): return A[0] + … + A[i]
14
Our contribution
  • [P., Demaine SODA'04] Ω(lg n) for partial sums
  • [P., Demaine STOC'04] Ω(lg n) for dynamic trees,
    etc.
  • very simple proof not based on epochs
  • [P., Tarnita ICALP'05] Ω(lg n) via epoch
    argument!!
  • ⇒ Ω(lg² n / lg² lg n) in the bit-probe model

Best Student Paper
15
?(lg n) via Epoch Arguments?
j
  • Old: information about epoch j outside j
  • cells written by epochs 0, …, j−1
  • O(t_u·r^(j−1))

16
?(lg n) via Epoch Arguments?
j
  • New: information about epoch j outside j
  • cells read by epochs 0, …, j−1 from epoch
    j
  • still O(t_u·r^(j−1)) in the worst case

Foil worst-case by randomizing epoch construction!
17
?(lg n) via Epoch Arguments?
cells read by epochs 0, …, j−1 from epoch
j: O((t_u / #epochs) · r^(j−1)) on average ⇒
max{t_u, t_q} = Ω(lg n)
Foil worst-case by randomizing epoch construction!
18
The Very Simple ?(lg n) Proof
19
Maintain an array A[n] under:
update(i, Δ): A[i] = Δ
sum(i): return A[0] + … + A[i]
  • The hard instance
  • π = random permutation
  • for t = 1 to n: query sum(π(t)); Δ_t =
    rand(); update(π(t), Δ_t)
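The hard instance can be written out as a short generator. This is a sketch under a few assumptions: indices are 0-based rather than 1-based, rand() is taken to mean a uniformly random word of `word_bits` bits, and the function name and the tuple encoding of operations are made up for illustration.

```python
import random


def hard_instance(n, word_bits=64, seed=None):
    """Return the operation sequence: sum(pi(t)), then update(pi(t), delta_t)."""
    rng = random.Random(seed)
    pi = list(range(n))
    rng.shuffle(pi)                          # pi = random permutation
    ops = []
    for t in range(n):
        ops.append(("sum", pi[t]))           # query sum(pi(t))
        delta = rng.getrandbits(word_bits)   # delta_t = rand()
        ops.append(("update", pi[t], delta)) # update(pi(t), delta_t)
    return ops
```

Every index is queried once and then overwritten with a fresh random word, so later queries cannot be answered without learning those words.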

20
[Figure: the operations laid out along the time axis]
21

[Figure: updates and sum queries laid out along the time axis, labeled with Δ values]
How much information needs to be transferred?
At least Δ5, Δ5+Δ7, Δ5+Δ7+Δ8 ⇒ i.e. at
least 3 words (random values incompressible)
22
The general principle
  • Lower bound = number of down arrows
  • How many down arrows? (in expectation)
  • (2k−1) · Pr[ ] · Pr[ ]
  • = (2k−1) · ½ · ½ = Ω(k)

[Figure: two adjacent periods of k operations each]
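The count of down arrows can be estimated by simulation. The sketch below relies on an assumed reading of the figure: the 2k indices touched by the two periods are sorted, and a down arrow is an adjacent pair whose left index belongs to the first period and whose right index belongs to the second. Function names and the sampling range are illustrative.

```python
import random


def count_down_arrows(first_period, second_period):
    """Count adjacent pairs (in index order) going from period 1 to period 2."""
    tagged = [(i, 1) for i in first_period] + [(i, 2) for i in second_period]
    tagged.sort()
    return sum(1 for (_, a), (_, b) in zip(tagged, tagged[1:])
               if a == 1 and b == 2)


def average_down_arrows(k, trials=500, seed=0):
    """Average number of down arrows over random index choices."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        idx = rng.sample(range(100 * k), 2 * k)  # 2k distinct random indices
        total += count_down_arrows(idx[:k], idx[k:])
    return total / trials
```

Under this reading each of the 2k−1 adjacent pairs is a down arrow with probability about ¼, so the average should come out close to k/2.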
23
Recap
Communication between periods of k items = Ω(k)
24
Putting it all together
[Figure: binary tree over the time axis; the root is labeled Ω(n/2), its two children Ω(n/4), and the four nodes below them Ω(n/8)]
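Summing the per-node bounds over the whole tree gives the final result. The sketch below assumes n is a power of two, that a node whose two children span k operations each contributes k/2 (the expectation (2k−1)·½·½ rounded), and that each memory read is charged to only one node of the tree. The function name is illustrative.

```python
def total_information(n):
    """Sum k/2 over all nodes of a balanced binary tree on n operations."""
    total = 0.0
    k = n // 2        # each child of the root spans k operations
    nodes = 1         # number of tree nodes at the current level
    while k >= 1:
        total += nodes * (k / 2)
        nodes *= 2
        k //= 2
    return total
```

Every level contributes n/4, and there are lg n levels, so the total is (n/4)·lg n over n operations, i.e. Ω(lg n) per operation.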
25
Q.E.D.
  • Augmented binary search trees are optimal.
  • First Ω(lg n) for any dynamic data structure.

26
Three Main Ideas
1. Epochs
2. Asym. Communication, Rectangles
3. Round Elimination
27
Review Communication Complexity
28
Review Communication Complexity
[Figure: for each memory access, lg S bits are sent one way and w bits are sent back]
database ⇒ space S
query(a,b,c)
Traditional communication complexity: total
# bits communicated ≥ X ⇒ t_q · (lg S + w) ≥ X
⇒ t_q = Ω(X/w). But wait! X ≤ CPU input = O(w)
29
Review Communication Complexity
[Figure: for each memory access, lg S bits are sent one way and w bits are sent back]
database ⇒ space S
query(a,b,c)
Asymmetric communication complexity: either
Alice sends A bits or Bob sends B bits ⇒
either t_q · lg S ≥ A or t_q · w ≥ B ⇒
t_q ≥ min{ A / lg S, B / w }
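The translation between probes and bits can be made concrete. In this sketch Alice plays the CPU holding the query and Bob plays the memory holding the database; both function names and the sample numbers in the usage note are illustrative assumptions.

```python
import math


def protocol_cost(t_q, S, w):
    """Bits exchanged when a t_q-probe query is simulated as a protocol."""
    alice_bits = t_q * math.ceil(math.log2(S))  # one cell address per probe
    bob_bits = t_q * w                          # one cell content per probe
    return alice_bits, bob_bits


def query_time_lower_bound(A, B, S, w):
    """t_q >= min(A / lg S, B / w) if Alice must send A bits or Bob B bits."""
    return min(A / math.log2(S), B / w)
```

For example, with S = 2^20 cells of w = 64 bits, a bound of A = 1000 bits for Alice or B = 10^6 bits for Bob gives t_q ≥ 50: the cheap side for the data structure is Alice's, and each probe only carries 20 bits of her input.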
30
Richness Lower Bounds
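Prove: either Alice sends A bits or Bob sends
B bits. Assume Alice sends o(A), Bob sends o(B)
⇒ big monochromatic rectangle. Show any big
rectangle is bichromatic (standard idea in comm.
complex.)
[Figure: communication matrix with Alice's inputs on one side and Bob's on the other; a rectangle with output = 1 covers a 1/2^o(A) fraction on Alice's side and a 1/2^o(B) fraction on Bob's side]
Example: Alice → q ∈ {0,1}^d; Bob → S = n points
in {0,1}^d. Goal: find argmin_{x∈S} ||x−q||_2
[Barkol, Rabani]: A = Ω(d), B = Ω(n^(1−ε)) ⇒
t_q ≥ min{ d / lg S, n^(1−ε) / w }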
31
Richness Lower Bounds
  • upper bound: either
  • exponential space
  • near-linear query time

What does this really mean? optimal space
lower bound for constant query time
[Figure: plot of t_q against S; t_q axis marked 1, Θ(d/lg n), n^(1−o(1)); S axis marked Θ(n), 2^Θ(d); curve labeled "lower bound: S = 2^Ω(d/t_q)"]
Also optimal lower bound for decision trees
32
Results
  • Partial match -- database of n strings in
    {0,1}^d, query ∈ {0,1,*}^d
    [Borodin, Ostrovsky, Rabani STOC'99]
    [Jayram, Khot, Kumar, Rabani STOC'03]: A = Ω(d / lg n)
  • [P. FOCS'08]: A = Ω(d)
  • Nearest Neighbor on hypercube (ℓ1, ℓ2)
  • deterministic γ-approximate [Liu'04]: A =
    Ω(d / γ²)
  • randomized exact [Barkol, Rabani
    STOC'00]: A = Ω(d)
  • rand. (1+ε)-approx [Andoni, Indyk, P.
    FOCS'06]: A = Ω(ε^(−2) lg n)
  • "Johnson-Lindenstrauss" space is optimal!
  • Approximate Nearest Neighbor in ℓ∞
  • [Andoni, Croitoru, P. FOCS'08]: [Indyk FOCS'98]
    is optimal!

simplify
33
Limits of Communication Approach
[Figure: plot of t_q against S; t_q axis marked 1, Θ(d/lg n), Θ(d/lg d), n^(1−o(1)); S axis marked Θ(n), 2^Θ(d); one region labeled "branching programs"]
Alice must send Ω(A) bits ⇒ t_q = Ω(A / lg S):
no separation between S = O(n) and S = n^O(1)!
⇒ t_q = Ω(A / lg(Sd/n)):
separation of Ω(lg n / lglg n) between S = O(n)
and S = n^O(1)!
34
Richness Gets You More
  • CPU(s) → memory communication
  • one query: lg S
  • k queries: lg (S choose k) = Θ(k lg (S/k))
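The saving from batching k queries can be checked numerically. The sketch assumes the k CPUs only need to name an unordered set of k distinct cells per round; the function names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
import math


def address_bits(S, k):
    """Bits needed to name an unordered set of k distinct cells out of S."""
    return math.log2(math.comb(S, k))


def bits_per_query(S, k):
    """Amortized address bits per query when k queries probe together."""
    return address_bits(S, k) / k
```

With S = 2^20, a single query pays 20 bits per probe, while 1024 simultaneous queries pay between lg(S/k) = 10 and lg(e·S/k) ≈ 11.5 bits each, in line with Θ(k lg(S/k)) in total.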
36
Richness Gets You More
Direct Sum
Any richness lower bound "Alice must send A or
Bob must send B" ⇒ "k·Alice must send k·A
or k·Bob must send k·B"
37
Richness Gets You More
t_q = Ω(A / lg(S/k))
38
Three Main Ideas
1. Epochs
2. Asym. Communication, Rectangles
3. Round Elimination
4. Range Queries
39
Open Hunting Season
  • Nice trick, but Ω(lg n / lglg n) with O(n polylg
    n) space is not an impressive argument for the curse of
    dimensionality
  • But space n^(1+o(1)) is hugely important in data
    structures ⇒ open hunting season for range
    queries etc.

2D range counting
SELECT count(*) FROM employees WHERE salary <
70000 AND startdate < 1998
40
Open Hunting Season
[P. STOC'07]: Ω(lg n / lglg n) with O(n polylg n)
space. N.B. tight! 1st bound beyond the
semigroup model; question from [Fredman
JACM'82] [Chazelle FOCS'86]
2D range counting
SELECT count(*) FROM employees WHERE salary <
70000 AND startdate < 1998
41
The Power of Reductions
2D stabbing
Preprocess S = {n rectangles}
stab(x,y): is (x,y) inside some R ∈ S?
routing ACLs; dispatching in some OO languages
2D range counting
SELECT count(*) FROM employees WHERE salary <
70000 AND startdate < 1998
42
The Power of Reductions
2D stabbing
Preprocess S = {n rectangles}
stab(x,y): is (x,y) inside some R ∈ S?
[Figure: rectangles whose corners are labeled +1 and −1]
2D range counting
SELECT count(*) FROM employees WHERE salary <
70000 AND startdate < 1998
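The ±1 corner labels suggest the standard way to turn stabbing into range counting, sketched below. Several things here are assumptions rather than the slide's statement: rectangles are half-open, [x1, x2) × [y1, y2), the range count is weighted, and the query is answered by a naive linear scan standing in for a real range-counting structure. All function names are illustrative.

```python
def corner_weights(rectangles):
    """Turn rectangles (x1, y1, x2, y2) into weighted corner points."""
    points = []
    for x1, y1, x2, y2 in rectangles:
        points += [(x1, y1, +1), (x2, y1, -1), (x1, y2, -1), (x2, y2, +1)]
    return points


def dominance_sum(points, x, y):
    """Naive weighted range count: total weight with px <= x and py <= y."""
    return sum(wt for px, py, wt in points if px <= x and py <= y)


def stab(points, x, y):
    """Is (x, y) inside some rectangle?"""
    return dominance_sum(points, x, y) > 0
```

A point inside a rectangle dominates only its +1 corner; a point past its right or top edge also dominates a −1 corner, and a point past both dominates all four, so each rectangle contributes 1 exactly when it contains the point. A lower bound for stabbing therefore carries over to range counting.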
43
The Power of Reductions
2D stabbing
Preprocess S = {n rectangles}
stab(x,y): is (x,y) inside some R ∈ S?
reachability oracles in butterfly graph
Preprocess G = subgraph of butterfly
reachable(x,y): is there a path x → y?
45
The Power of Reductions
Lopsided Set Disjointness
Alice: set S; Bob: set T; are S and T disjoint?
Hint: S = one edge out of every node ⇒ n
queries from 1st to last level; T = deleted
edges; S disjoint from T ⇒ all queries "yes"
reachability oracles in butterfly graph
Preprocess G = subgraph of butterfly
reachable(x,y): is there a path x → y?
46
Reachability in Butterfly??
marked ancestor problem
update(node): (un)mark node
query(leaf): any marked ancestor?
47
lopsided set disjointness (LSD)
reachability oracles in the butterfly
partial match
(1+ε)-ANN in ℓ1, ℓ2
NN in ℓ1, ℓ2
dyn. marked ancestor
3-ANN in ℓ∞
2D stabbing
worst-case union-find
dyn. trees, graphs
4D reporting
2D counting
dyn. 1D stabbing
[P. FOCS'08]
partial sums
dyn. 2D reporting
dyn. NN in 2D
48
Three Main Ideas
1. Epochs
2. Asym. Communication, Rectangles
3. Round Elimination
4. Range Queries
49
Packet Forwarding/ Predecessor Search
  • Preprocess n prefixes of w bits
  • → make a hash-table H with all prefixes of
    prefixes
  • → |H| = O(nw), can be reduced to O(n)
  • Given w-bit IP, find longest matching prefix
  • → binary search for longest ℓ such that IP[0 : ℓ]
    ∈ H
  • [van Emde Boas FOCS'75]
  • [Waldvogel, Varghese, Turner, Plattner
    SIGCOMM'97]
  • [Degermark, Brodnik, Carlsson, Pink SIGCOMM'97]
  • [Afek, Bremler-Barr, Har-Peled SIGCOMM'99]

O(lg w)
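The hash-table scheme can be sketched in a few lines. This is a minimal illustration, not any of the cited implementations: prefixes and addresses are bit strings, a Python dict plays the role of H, and storing the best stored prefix alongside each entry is an addition made here so that the binary search can return an answer directly. The function names are illustrative.

```python
def build_table(prefixes):
    """Map every prefix of every stored prefix to its longest stored prefix."""
    stored = set(prefixes)
    table = {}
    for p in stored:
        best = None
        for length in range(len(p) + 1):
            head = p[:length]
            if head in stored:
                best = head          # longest stored prefix of `head` so far
            table[head] = best
    return table


def longest_match(table, ip):
    """Binary search for the longest l with ip[:l] in the table."""
    lo, hi = 0, len(ip)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if ip[:mid] in table:        # table is closed under taking prefixes
            lo = mid
        else:
            hi = mid - 1
    return table.get(ip[:lo])
```

The search is valid because membership in the table is monotone in the length: if ip[:l] is present, so is every shorter prefix. It uses O(lg w) hash lookups for a w-bit address.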
50
Review Round Elimination
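[Figure: the query is split into (hi, lo); the messages exchanged are hash(hi) and a 0/1 reply]
0: continue searching for pred(hi)
1: continue searching for pred(lo)
[Figure: Alices numbered 1, 2, …, k; the other player says "I want to talk to Alice i"; the first message has o(k) bits]
Message has negligible info about the typical
i ⇒ can be eliminated for fixed i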
51
The Lemma
  • Observe: can't work worst-case!
  • Traditional fix: introduce 2-sided error
  • Think outside the
  • easy proof with a different error model
    [P.-Thorup, STOC'06]

52
The Model
  • Alice, Bob receive inputs
  • they may reject inputs
  • if they accept, they start communicating and
    must produce a correct output
  • The point: error probability ½ is
    trivial; reject probability 0.99999 is still
    hard
  • We regret to inform you that your input has not
    been accepted for communication. We receive a
    large number of inputs, many of them of high
    quality, and scheduling constraints unfortunately
    make it impossible to accept all of them.

53
The Proof
  • Trie for Alice's input (x1, …, xk)
  • leaves = message sent
  • node = set of msgs in subtree
  • Say msg size is m = k/2
  • |leaf| = 1, |root| ≤ 2^(k/2) ⇒ (∀) root-to-leaf path,
    ½ of nodes have |node| > ½|parent|
  • averaging over node-child pairs ⇒ (∃) node: ½ of
    its children have |child| > ½|node|
  • thus (∃) msg M: ¼ of children have M ∈ child

[Figure: trie with levels x1, x2, x3; internal nodes labeled with message sets such as {m1,m2,m3} and {m1,m2}, leaves labeled with single messages m1, m2, m3]
fix i, x1, …, x(i−1)
fixed message (eliminate)
reject ¾ of inputs (xi)
54
Predecessor Search Timeline
  • after [van Emde Boas FOCS'75]: "O(lg w) has to
    be tight!"
  • [Beame, Fich STOC'99]: slightly better bound
    with O(n²) space; "must improve the algorithm
    for O(n) space!"
  • [P., Thorup STOC'06]: tight Ω(lg w) for space
    O(n polylg n)!

Idea: consider multiple queries; prove round
elimination under direct sum
55
Predecessor Search Timeline
[Figure: several copies of the round-elimination picture side by side, each with Alices 1, 2, …, k and the message "I want to talk to Alice"]
Idea: consider multiple queries; prove round
elimination under direct sum
56
Round Eliminated!
57
The End
Questions?
  • or Champagne?

59
The Partial Sums Problem
Textbook solution: augmented binary search
trees. Running time: O(lg n) / operation
Maintain an array A[n] under:
update(i, Δ): A[i] = Δ
sum(i): return A[0] + … + A[i]