Constant-Time LCA Retrieval - PowerPoint PPT Presentation

About This Presentation
Title:

Constant-Time LCA Retrieval

Description:

Constant-Time LCA Retrieval Presentation by Danny Hermelin, String Matching Algorithms Seminar, Haifa University. The Lowest Common Ancestor In a rooted tree T, a ... – PowerPoint PPT presentation

Slides: 60
Provided by: DannyHe2

Transcript and Presenter's Notes

Title: Constant-Time LCA Retrieval


1
Constant-Time LCA Retrieval
  • Presentation by Danny Hermelin,
  • String Matching Algorithms Seminar,
  • Haifa University.

2
The Lowest Common Ancestor
  • In a rooted tree T, a node u is an ancestor of a
    node v if u is on the unique path from the root
    to v.
  • In a rooted tree T, the Lowest Common Ancestor
    (LCA) of two nodes u and v is the deepest node in
    T that is the ancestor of both u and v.

3
For example
[Figure: a rooted tree T with nodes numbered 1 through 6.]
  • Node 3 is the LCA of nodes 4 and 6.
  • Node 1 is the LCA of nodes 2 and 5.

4
The LCA Problem
  • The LCA problem is then: given a rooted tree T,
    preprocess it so that the LCA of any two given
    nodes in T can be retrieved in constant time.
  • In this presentation we shall present a
    preprocessing algorithm that requires no more
    than linear time and space.

5
The assumed machine model
  • We make the following two assumptions on our
    computational model.
  • Let n denote the size of our input in unary
    representation
  • All arithmetic, comparative and logical
    operations on numbers whose binary representation
    is of size no more than logn bits can be done in
    constant time.
  • We assume that finding the left-most set bit or
    the right-most set bit of a logn-sized number can
    be done in constant time.

6
  • The first assumption is a very reasonable
    straightforward assumption considering most
    machines on the market today.
  • The second seems less reasonable but can be
    achieved with the help of a constant number of
    tables of size O( n ).
  • These assumptions help our discussion focus on
    the more interesting parts of the algorithm
    solving the LCA problem.
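
The table idea behind the second assumption can be sketched in Python as follows. This is only an illustration under assumed names (build_msb_table and lowest_set_bit_index are not from the slides): one table of size n gives the left-most set bit of any number below n, and the right-most set bit follows by isolating it with x & -x first.

```python
def build_msb_table(n):
    """msb[x] = index (from the right, 0-based) of the left-most set bit of x.

    Built in O(n) time for all 0 <= x < n; msb[0] is -1 by convention.
    """
    msb = [-1] * n
    for x in range(1, n):
        # Dropping the last bit of x moves its left-most set bit down by one.
        msb[x] = msb[x >> 1] + 1
    return msb


def lowest_set_bit_index(x, msb):
    """Index of the right-most set bit of x (x > 0), via one table lookup."""
    # x & -x keeps only the right-most set bit of x.
    return msb[x & -x]
```

After the O(n) build, both lookups take constant time, which is all the later slides rely on.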

7
The Simple Case: Complete Binary Tree
  • Our discussion begins with a particularly simple
    instance of the LCA problem, LCA queries on
    complete binary trees.
  • We will use our knowledge of solving the LCA
    problem on complete binary trees and expand it
    later on, to solve the LCA problem on any
    arbitrary rooted tree T.

8
  • Let B denote a complete binary tree with n nodes.
  • The key here is to encode the unique path from
    the root to a node in the node itself. We assign
    each node a path number, a logn bit number that
    encodes the unique path from the root to the
    node.

9
The Path Number
  • For each node v in B we encode a path number in
    the following way
  • Counting from the left most bit, the ith bit of
    the path number for v corresponds to the ith
    edge on the path from the root to v.
  • A 0 for the ith bit from the left indicates that
    the ith edge on the path goes to a left child,
    and a 1 indicates that it goes to a right child.
  • Let k denote the number of edges on the path
    from the root to v. We set bit k+1 (the height
    bit) of the path number to 1, and the remaining
    logn-k-1 bits (the padded bits) to 0.

10
For example
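[Figure: a path from the root of B with edges labeled 0 (left) and 1 (right), passing through node j and ending at node i.]
  • Node i's path number is 1001.
  • Node j's path number is 1010.

In the original slide the height bit is marked in blue and padded bits are
marked in red.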
11
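[Figure: a complete binary tree with 15 nodes, each labeled with its path number.]
Level 0: 1000
Level 1: 0100 1100
Level 2: 0010 0110 1010 1110
Level 3: 0001 0011 0101 0111 1001 1011 1101 1111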
  • Path numbers can easily be assigned in a simple
    O(n) in-order traversal on B.
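
A small Python sketch of the claim above, assuming a node is identified by the tuple of edge labels on its root path and that d is the number of bits in a path number (both conventions are ours, not the slides'). It encodes the path number directly from the definition and checks that an in-order traversal hands out the same numbers.

```python
def path_number(path_bits, d):
    """Encode a root-to-node path (sequence of 0/1 edge labels) in d bits."""
    value = 0
    for bit in path_bits:
        value = (value << 1) | bit          # path bits, left to right
    value = (value << 1) | 1                # height bit
    return value << (d - len(path_bits) - 1)  # padded bits


def inorder_numbers(d):
    """In-order rank (1-based) of every node of the complete tree with d levels."""
    numbers = {}
    counter = 0

    def visit(path):
        nonlocal counter
        if len(path) < d - 1:
            visit(path + (0,))              # left subtree first
        counter += 1
        numbers[path] = counter
        if len(path) < d - 1:
            visit(path + (1,))              # then right subtree

    visit(())
    return numbers
```

For d = 4 the in-order rank of every node equals its path number, which is why a single in-order pass suffices.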

12
How do we solve LCA queries in B
  • Suppose now that u and v are two nodes in B, and
    that path(u) and path(v) are their appropriate
    path numbers.
  • We denote the lowest common ancestor of u and v
    as lca(u,v).
  • We denote the prefix bits in the path number,
    those that correspond to edges on the path from
    the root, as the path bits of the path number.

13
  • First we calculate path(u) XOR path(v) and find
    the left-most bit which equals 1.
  • If there is no such bit then path(u) = path(v)
    and so u = v, so assume that the kth bit of the
    result is 1.
  • If both the kth bit in path(u) and the kth bit
    in path(v) are path bits, then u and v agree on
    the first k-1 edges of their paths from the
    root, meaning that the (k-1)-bit prefix of each
    node's path number encodes the path from the
    root to lca(u,v).

14
For example
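[Figure: complete binary tree with u = 0010, v = 0111 and lca(u,v) = 0100.]
  • path(u) XOR path(v) = 0010 XOR 0111 = 0101, so k = 2.
  • path(lca(u,v)) = 0100: the prefix bit 0, then the
    height bit 1, then the padded bits 00.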
15
For example
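[Figure: complete binary tree with u = 1001, v = 1011 and lca(u,v) = 1010.]
  • path(u) XOR path(v) = 1001 XOR 1011 = 0010, so k = 3.
  • path(lca(u,v)) = 1010: the prefix bits 10, then the
    height bit 1, then the padded bit 0.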
16
  • We conclude that if we take the first k-1 bits
    of path(u) (equivalently, of path(v), since the
    two agree on them), add 1 as the kth bit, and
    pad with logn-k 0 suffix bits, we get
    path(lca(u,v)).
  • If either the kth bit in path(u) or the kth bit
    in path(v) (or both) is not a path bit, then one
    node is an ancestor of the other, and lca(u,v)
    can easily be retrieved by comparing the height
    bits of path(u) and path(v): the node whose
    height bit lies further to the left is the
    ancestor.
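
The query on the last few slides can be sketched in Python as below. This is a sketch under assumptions: path numbers are d-bit integers, and the names lca_path and height_bit_position are ours. It folds both cases into one rule: the height bit of the LCA sits at the left-most of position k and the two nodes' own height-bit positions.

```python
def height_bit_position(p, d):
    """Position of the height bit of p, counted from the left, 1-based."""
    lowest = (p & -p).bit_length()      # 1-based position from the right
    return d - lowest + 1


def lca_path(pu, pv, d):
    """Path number of lca(u, v) in a complete binary tree with d-bit path numbers."""
    if pu == pv:
        return pu
    # k = left-most position (1-based from the left) where the numbers differ.
    k = d - (pu ^ pv).bit_length() + 1
    pos = min(k, height_bit_position(pu, d), height_bit_position(pv, d))
    padded = d - pos                    # number of padded bits in the answer
    prefix = (pu >> (padded + 1)) << (padded + 1)   # first pos-1 bits of path(u)
    return prefix | (1 << padded)       # add the height bit, pad with zeros
```

With d = 4, lca_path(0b0010, 0b0111, 4) gives 0b0100 and lca_path(0b1001, 0b1011, 4) gives 0b1010, matching the two examples above.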

17
The general LCA algorithm
  • The following are the two stages of the general
    LCA algorithm for any arbitrary tree T
  • First, we reduce the LCA problem to the
    Restricted Range Minima problem. The Restricted
    Range Minima problem is the problem of finding
    the smallest number in an interval of a fixed
    list of numbers, where the difference between two
    successive numbers in the list is exactly one.
  • Second, we solve the Restricted Range Minima
    problem and thus solve the LCA problem.

18
The Reduction
  • Let T denote an arbitrary tree
  • Let lca(u,v) denote the lowest common ancestor of
    nodes u and v in T.
  • First we execute a depth-first traversal of T to
    label the nodes in the depth-first order they are
    encountered.
  • In that same traversal we maintain a list L of
    nodes of T, in the order they were visited. A
    node is recorded every time the traversal
    visits it: when it is first discovered and each
    time the traversal returns to it from a child.
  • The only property of the depth-first numbering we
    need is that the number given to any node is
    smaller than the number given to any of its
    descendants.

19
For example
[Figure: a tree T with 8 nodes, labeled with their depth-first numbers 000 through 111 in binary.]
  • The depth-first traversal creates these
    depth-first numbers and the following list L:

L = 0, 1, 0, 2, 3, 2, 4, 2, 5, 6, 5, 7, 5, 2, 0
20
  • Now if we want to find lca(u,v), we find the
    first occurrence of each of the two nodes in L.
    These two positions define an interval I in L.
  • Suppose u occurs in L before v. Now, I describes
    the part of the traversal from the point we
    first discovered u to the point we first
    discovered v.
  • lca(u,v) can be retrieved by finding the minimum
    number in I.

21
  • This is due to the following two simple facts:
  • If u is an ancestor of v then all the nodes
    visited between u and v are in u's subtree, and
    thus the depth-first number assigned to u is
    minimal in I.
  • If u is not an ancestor of v, then all the
    nodes visited between u and v are in lca(u,v)'s
    subtree, and the traversal must visit lca(u,v).
    Thus the minimum of I is the depth-first number
    assigned to lca(u,v).

22
For example..
[Figure: the same tree T, labeled with depth-first numbers 000 through 111.]
L = 0, 1, 0, 2, 3, 2, 4, 2, 5, 6, 5, 7, 5, 2, 0
  • lca(3,7) = 2
  • lca(0,7) = 0
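
The reduction can be tried out with a short Python sketch. The tree shape below is read off the list L in the example; the node names ("r", "a", ...) and the function names are made up for illustration, and the minimum is found by a plain linear scan rather than by the constant-time structure developed later.

```python
def euler_tour(children, root):
    """Return (number, tour, first) for the tree given as a dict of child lists.

    number: node -> depth-first number
    tour:   the list L of depth-first numbers in visiting order
    first:  depth-first number -> index of its first occurrence in L
    """
    number, tour = {}, []

    def visit(node):
        number[node] = len(number)
        tour.append(number[node])
        for child in children.get(node, []):
            visit(child)
            tour.append(number[node])   # back at node after finishing child

    visit(root)
    first = {}
    for index, num in enumerate(tour):
        first.setdefault(num, index)
    return number, tour, first


def lca_number(tour, first, a, b):
    """Depth-first number of the LCA of the nodes numbered a and b (naive scan)."""
    lo, hi = sorted((first[a], first[b]))
    return min(tour[lo:hi + 1])


# Hypothetical names for the 8-node tree of the example.
children = {"r": ["a", "b"], "b": ["c", "d", "e"], "e": ["f", "g"]}
number, tour, first = euler_tour(children, "r")
```

Here tour reproduces the list L of the example, and lca_number(tour, first, 3, 7) returns 2.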

23
The Restricted Reduction
  • So far we've shown how to reduce the LCA problem
    to the range minima problem. This next step shows
    how to achieve a reduction to the restricted
    range minima problem.
  • Denote level(u) as the number of edges in the
    unique path from the root to node u in T.
  • If L = l_1, l_2, ..., l_z then we build the
    following list:
  • L' = level(l_1), level(l_2), ..., level(l_z).

24
  • We use L' in the same manner we used L in the
    previous reduction scheme.
  • This works because in every interval I = [u,v]
    in L', lca(u,v) is the node of minimum level in
    I, for the same reasons mentioned earlier.
  • The difference between two adjacent elements in
    L' is exactly one, since each step of the
    traversal moves to a child or back to a parent.
  • This completes the reduction to the restricted
    range minima problem.
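
As a rough check of the restricted reduction, the Python sketch below builds the level list for the same example tree and answers a query by scanning for the minimum level. The node names and function names are again hypothetical, and the scan stands in for the constant-time range-minima structure.

```python
def level_list(children, root):
    """Return (nodes, levels): the visiting order and the level of each visit."""
    nodes, levels = [], []

    def visit(node, depth):
        nodes.append(node)
        levels.append(depth)
        for child in children.get(node, []):
            visit(child, depth + 1)
            nodes.append(node)          # return to node from the child
            levels.append(depth)

    visit(root, 0)
    return nodes, levels


def lca_by_level(nodes, levels, u, v):
    """LCA of u and v: the node of minimum level between their first visits."""
    lo, hi = sorted((nodes.index(u), nodes.index(v)))
    best = min(range(lo, hi + 1), key=levels.__getitem__)
    return nodes[best]


children = {"r": ["a", "b"], "b": ["c", "d", "e"], "e": ["f", "g"]}
nodes, levels = level_list(children, "r")
```

Every pair of adjacent entries in levels differs by exactly one, which is the property Procedure II depends on.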

25
The reduction complexity.
  • Denote n as the number of nodes in T.
  • Depth-first traversal can be done in O( n ) space
    and time complexity.
  • L is of size O( n ) and thus its creation and
    initialization can be done in O( n ) space and
    time complexity.
  • To find lca(u,v) we need the first occurrence of
    u and v in L. This could be stored in a table of
    size O( n ). Thus the creation and initialization
    of this table can be done in O( n ) space and
    time complexity.
  • The total space and time complexity of the
    reduction is then O( n ).

26
The Range Minima Problem
  • The Range Minima problem is the problem of
    finding the smallest number in an interval of a
    fixed list of numbers.
  • The Restricted Range Minima problem is an
    instance of the Range Minima problem where the
    difference between two successive numbers is
    exactly one.

27
More Formally
  • The Restricted Range Minima problem is stated
    formally in the following
  • Given a list L = l_1, l_2, ..., l_n of n real
    numbers, where |l_i - l_{i+1}| = 1 for each
    i = 1, ..., n-1, preprocess the list so that for
    any interval l_i, l_{i+1}, ..., l_j with
    1 ≤ i < j ≤ n, the minimum over the interval can
    be retrieved in constant time.

28
Two preprocessing methods for the Range Minima
Problem
  • The algorithm for solving the Range Minima
    problem uses two preprocessing methods
  • Procedure I uses no assumptions regarding the
    difference between adjacent elements, and
    requires O(nlogn) space and time complexity.
  • Procedure II uses the restricted assumption
    regarding adjacent elements, and requires
    exponential space and time complexity.

29
Procedure I
  • Suppose that our list L is of size n, and for
    convenience suppose n is a power of 2. The
    procedure has two main stages:
  • First, build a complete binary tree B of size
    2n-1 with n leaves. Then for i from 1 to n,
    record the ith element of L at leaf i.
  • Second, for each internal node (not a leaf) in B,
    maintain a prefix list and a suffix list
    containing all prefix minima and suffix minima
    with respect to the leaves in its subtree.

30
  • Let |Lv| denote the number of leaves in the
    subtree rooted at node v, where v is an internal
    node of B.
  • A prefix list of an internal node v in B is a
    list of size equal to the number of leaves in
    v's subtree. The kth entry in the list is then
    the smallest number among the numbers
    represented by the first k consecutive leaves in
    v's subtree.
  • Likewise, a suffix list of v has the same size
    and the kth entry in it contains the smallest
    number among the numbers represented by the last
    |Lv| - k + 1 consecutive leaves in v's subtree.

31
For Example
  • Suppose L = 6, 7, 4, 1, 5, 2, 9, 9
  • Then Procedure I builds the following complete
    binary tree for L

[Figure: a complete binary tree whose 8 leaves hold 6, 7, 4, 1, 5, 2, 9, 9 from left to right.]
32
[Figure: the complete binary tree with leaves 6, 7, 4, 1, 5, 2, 9, 9.]
  • The prefix list of the root node is then
    6, 6, 4, 1, 1, 1, 1, 1.
  • In the same manner, its suffix list is
    1, 1, 1, 1, 2, 2, 9, 9.
33
Finding the Range Minima
  • After the preprocessing stages are complete, the
    smallest number in any interval [u,v] can be
    found in constant time as follows:
  • First find the LCA of leaves u and v and call it
    z. Recall, we already know how to answer LCA
    queries in complete binary trees in constant
    time.
  • The minimum is then the smaller of two values:
    the entry for u in the suffix list of z's left
    child, and the entry for v in the prefix list of
    z's right child.

34
For Example
  • Suppose I = 4, 1, 5, 2.
  • The endpoints of I, 4 and 2, are leaves in B
    whose LCA is the root node.
  • Denote the root's left son as left and the
    root's right son as right.
  • Leaf 4 is then the third leaf from the left in
    left's subtree and leaf 2 is the second leaf
    from the left in right's subtree.

35
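[Figure: the tree for L = 6, 7, 4, 1, 5, 2, 9, 9 with the root's children marked left and right, and the interval I = 4, 1, 5, 2 highlighted.]
  • left's suffix list at entry 3 = Min{4, 1} = 1.
  • right's prefix list at entry 2 = Min{5, 2} = 2.
  • The minimum over I is then Min{1, 2} = 1.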

36
  • Procedure I clearly requires O(nlogn) time and
    space complexity. This is a result of these two
    simple facts
  • The total size of all the prefix and suffix lists
    of all the internal nodes of B is O(nlogn).
  • Each entry in these lists requires constant time
    to calculate if we use simple dynamic programming
    techniques.
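
Procedure I can be sketched in Python roughly as follows. This is one possible layout, not necessarily the presenter's: instead of a list per tree node, it keeps one prefix array and one suffix array per tree height h, where the block of size 2^h containing position i plays the role of the leaves under the height-h ancestor of leaf i. The class and attribute names are assumptions.

```python
class ProcedureOne:
    """Prefix and suffix minima for every block of size 2^h, h = 0..log n."""

    def __init__(self, values):
        n = len(values)
        if n == 0 or n & (n - 1):
            raise ValueError("length must be a power of 2")
        self.values = list(values)
        self.prefix = []   # prefix[h][i]: min from the start of i's block to i
        self.suffix = []   # suffix[h][i]: min from i to the end of i's block
        for h in range(n.bit_length()):
            size = 1 << h
            pre, suf = list(values), list(values)
            for start in range(0, n, size):
                for i in range(start + 1, start + size):
                    pre[i] = min(pre[i - 1], pre[i])
                for i in range(start + size - 2, start - 1, -1):
                    suf[i] = min(suf[i + 1], suf[i])
            self.prefix.append(pre)
            self.suffix.append(suf)

    def query(self, i, j):
        """Minimum of values[i..j], inclusive and 0-based, with i <= j."""
        if i == j:
            return self.values[i]
        # The left-most differing bit of i and j locates their LCA; its two
        # children cover blocks of size 2^h, with i in the left one.
        h = (i ^ j).bit_length() - 1
        return min(self.suffix[h][i], self.prefix[h][j])
```

For L = 6, 7, 4, 1, 5, 2, 9, 9 the top-level arrays are the root's prefix and suffix lists from the example, and query(2, 5) returns 1 as on the slide. Each of the log n + 1 levels stores 2n entries, which gives the O(nlogn) bound.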

37
Procedure II
  • Procedure II uses the assumption that the
    difference between any two adjacent elements of L
    is exactly one. We assume without loss of
    generality that the first element of L is zero
    (since, otherwise, we can subtract from every
    element in L the value of the first element, and
    then add it to the range-minima result).

38
  • The procedure runs in two main stages
  • First, a table is built with 2^(n-1) entries in
    it, one for each choice of the n-1 differences
    of +1 or -1. Each entry in this table represents
    a valid instance of L, and is a reference to a
    particular subtable.
  • Second, in each subtable we store the answer to
    each of the n(n-1)/2 possible range queries.

39
  • All the possible instances of L are enumerable,
    and so are all the range-minima queries, thus,
    given an instance of L, any range-minima query on
    this L can be answered in constant time.









[Figure: the main table with 2^(n-1) entries, each referencing a query table of dimensions n by n.]
40
  • It is easy to see then, that Procedure II uses
    O( n^2 · 2^n ) space and time complexity.

We shall now demonstrate how with the use of
Procedure I and Procedure II we achieve linear
time and space preprocessing in order to answer
all range-minima queries on L.
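
A direct Python sketch of Procedure II for lists of length m, under our own encoding assumption: bit b of an integer mask is 1 when element b+1 is one larger than element b, and 0 when it is one smaller. The function names are illustrative only.

```python
def build_procedure_two(m):
    """table[mask][i][j] = min over positions i..j of the list encoded by mask.

    Every encoded list starts at 0; there are 2^(m-1) of them.
    """
    table = []
    for mask in range(1 << (m - 1)):
        seq = [0]
        for b in range(m - 1):
            seq.append(seq[-1] + (1 if (mask >> b) & 1 else -1))
        sub = [[None] * m for _ in range(m)]
        for i in range(m):
            current = seq[i]
            for j in range(i, m):
                current = min(current, seq[j])
                sub[i][j] = current
        table.append(sub)
    return table


def encode(values):
    """Bit mask of the +1/-1 steps of a list with adjacent differences of one."""
    mask = 0
    for b in range(len(values) - 1):
        if values[b + 1] - values[b] == 1:
            mask |= 1 << b
    return mask


def query(table, values, i, j):
    """Minimum of values[i..j]; adds back the first element as the offset."""
    return values[0] + table[encode(values)][i][j]
```

In the full algorithm the mask of each block would be computed once during preprocessing, so that a query is a single table lookup; here it is recomputed per query only to keep the sketch short.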
41
The Restricted Range-Minima preprocessing
algorithm
  • Our algorithm runs in three stages
  • First we partition L into logn sized subsets,
    giving us a total of n/logn subsets of this kind.
    We apply Procedure I to an array of all the
    minimums of these subsets.

42
[Figure: the list L of size n partitioned into subsets of size logn, together with the array of subset minima.]
43
  • Furthermore, each subset of size logn we
    partition into smaller subsets of size loglogn,
    giving us logn/loglogn partitions in each
    subset. Again we apply Procedure I to an array of
    all the minimums of these loglogn partitions.

44
[Figure: each subset of size logn partitioned into subset partitions of size loglogn, with the subset minima and the subset partition minima.]
45
  1. Finally, we run Procedure II to build the table
    required for any array of size loglogn. For each
    subset partition we identify its proper entry in
    our table.

46
[Figure: each subset partition of size loglogn, within a subset of size logn, referencing its entry in the Procedure II table.]
47
  • After these stages are completed any
    range-minima query on L can be answered in
    constant time. Consider a query requesting the
    minimum over [i, j]. Then the range [i, j] can
    easily be presented as the union of the following
    (at most) five ranges:
  • [i, x1], [x1+1, x2], [x2+1, x3], [x3+1, x4],
    [x4+1, j]
48
  • Where
  • [i, x1] and [x4+1, j] each fall within a single
    subset partition of size loglogn; the minimum of
    each is available in its subtable.

49
  • [x1+1, x2] and [x3+1, x4] are unions of subset
    partitions of size loglogn, and each falls
    within a single subset of size logn; the minimum
    of each is available from the application of
    Procedure I on that subset.

50
  • [x2+1, x3] is the union of subsets of size logn
    each; its minimum is available from the first
    application of Procedure I.

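
The split into at most five ranges can be sketched in Python as below. The sketch assumes 0-based inclusive indices and that the small block size divides the big one; the function name decompose and the parameters big and small (standing for logn and loglogn) are ours. Empty ranges are simply dropped, and when i and j share a subset of size logn the middle pieces merge into one.

```python
def decompose(i, j, big, small):
    """Split [i, j] into at most five consecutive ranges aligned to the blocks."""
    if i // small == j // small:
        return [(i, j)]                       # one small block: a table lookup
    x1 = (i // small + 1) * small - 1         # last index of i's small block
    x4 = (j // small) * small - 1             # index just before j's small block
    if i // big == j // big:
        cuts = [x1, x4]                       # both ends inside one big block
    else:
        x2 = (i // big + 1) * big - 1         # last index of i's big block
        x3 = (j // big) * big - 1             # index just before j's big block
        cuts = [x1, x2, x3, x4]
    pieces, lo = [], i
    for x in cuts + [j]:
        if x >= lo:                           # skip ranges that are empty
            pieces.append((lo, x))
            lo = x + 1
    return pieces
```

For example, with big = 8 and small = 2, decompose(3, 29, 8, 2) yields (3, 3), (4, 7), (8, 23), (24, 27), (28, 29), which correspond to the five ranges above in order.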
51
Space and Time Complexity
  • Did we achieve linear space and time complexity,
    as promised? Let's check.
  • Recall our preprocessing algorithm runs in three
    stages. We'll check each stage separately.
  • Denote n as the size of our input list L.
  • We assume n is a power of 2 for convenience.

52
  • The first stage space and time complexity can be
    computed as follows:
  • Partitioning L into n/logn subsets of size logn
    each, and finding each new subset's minimum:
  • Time: O( n ) - one pass through L is enough.
  • Space: O( n/logn ) for storing all subset
    data.
  • Applying Procedure I on an array of n/logn
    minima:
  • Time and space according to Procedure I
    complexity:
  • O( n/logn · log( n/logn ) ) ≤ O( n/logn · logn )
    = O( n ), since n/logn < n.
  • Total space and time complexity: O( n ).

53
  • The second stage space and time complexity can
    be computed as follows:
  • Partitioning each of the n/logn subsets into
    smaller subsets of size loglogn each, and finding
    each new subset's minimum:
  • Time: O( n ) - one pass through L is enough.
  • Space: O( n/loglogn ) for storing all subset
    data.
  • Applying Procedure I on n/logn arrays of
    logn/loglogn minima:
  • Time and space according to Procedure I
    complexity:
  • n/logn · O( logn/loglogn · log( logn/loglogn ) )
    ≤ n/logn · O( logn/loglogn · loglogn ) = O( n ),
    since logn/loglogn < logn.
  • Total space and time complexity: O( n ).

54
  • The third stage simply runs Procedure II on
    inputs of size loglogn. So the space and time
    complexity of the third stage of the algorithm
    can be computed as follows:
  • Time and space according to Procedure II
    complexity:
  • O( 2^(loglogn) · (loglogn)^2 ) = O( logn ·
    (loglogn)^2 ) ≤ O( log^2 n ), since
    (loglogn)^2 < logn for large enough n.
  • Total space and time complexity: O( log^2 n ).

55
Total space and time complexity O (n)
56
Aftermath
  • How much did we really gain by reducing the LCA
    problem to the restricted range-minima problem?
  • Can we be satisfied by just reducing to the
    range-minima problem?
  • If you recall, the restricted range-minima
    reduction allows us to use Procedure II, which
    assumes input of a restricted nature. We used
    Procedure II to answer range queries on subsets
    of size at most loglogn.

57
  • We can instead apply Procedure I to each of these
    loglogn-sized subsets, which would bring the
    space and time complexity of the whole algorithm
    to O( nloglogn ).
  • If we choose to further partition these subsets
    into subsets of size logloglogn, we would reach
    O( nlogloglogn ). We can continue in this
    fashion for as long as we like, improving our
    algorithm's complexity along the way.
  • If k is the number of partition stages our
    algorithm applied, then its space and time
    complexity equals O( n loglog...logn ), where
    the logarithm is applied k times.

58
  • The space and time complexity of our
    preprocessing algorithm for the un-restricted
    range minima problem is then O( nlog*n ), where
    log*n is the iterated logarithm of n!
  • For practical applications the un-restricted
    range minima reduction is enough then,
    considerably simplifying the implementation
    process.
  • The restricted range minima reduction is needed
    mostly for theoretical purposes.
