Succinct Representations of Trees - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Succinct Representations of Trees


1
Succinct Representations of Trees
  • S. Srinivasa Rao
  • IT University of Copenhagen

2
Outline
  • Succinct data structures
    - Introduction
    - Examples
  • Tree representations
    - Heap-like representation
    - Jacobson's representation
    - Parenthesis representation
    - Partitioning method
  • Conclusions

3
Succinct Data Structures
4
Succinct data structures
  • Goal: represent the data in close to optimal
    space, while supporting the operations
    efficiently.
    (optimal = the information-theoretic lower bound)
  • An extension of data compression.
    (Data compression:
    - achieves close to optimal space;
    - queries need not be supported efficiently.)

5
Applications
  • Potential applications are those where
    - memory is limited: small-memory devices like
      PDAs, mobile phones, etc.
    - there are massive amounts of data: DNA sequences,
      geographical/astronomical data, search engines,
      etc.

6
Examples
  • Trees, Graphs
  • Bit vectors, Sets
  • Dynamic arrays
  • Text indexes
    - suffix trees, suffix arrays, etc.
  • Permutations, Functions
  • XML documents, File systems (labeled,
    multi-labeled trees)
  • BDDs

7
Example: Permutations
  • A permutation π of {1,…,n}
  • A simple representation:
    - n lg n bits
    - π(i) in O(1) time
    - π^{-1}(i) in O(n) time
  • Our representation:
    - (1+ε) n lg n bits
    - π(i) in O(1) time
    - π^{-1}(i) in O(1/ε) time (optimal trade-off)
    - π^k(i) in O(1/ε) time (for any positive or
      negative integer k)
  • lg(n!) + o(n) (< n lg n) bits (optimal space),
    with π^k(i) in O(lg n / lg lg n) time

[Figure: an example permutation in which π^2(1) = 3 and π^{-2}(1) = 5]
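
One simple way to obtain a space/time trade-off of this flavour for π^{-1} is to store π as a plain array and add a back pointer at every t-th element of each cycle (with t playing the role of 1/ε). The sketch below illustrates that idea only; it is an assumption that this matches the construction behind the bounds on the slide, and the names `build_shortcuts` and `inverse` are hypothetical. Indices are 0-based.

```python
def build_shortcuts(perm, t):
    """Mark every t-th element of each long cycle with a pointer t steps back."""
    n = len(perm)
    seen = [False] * n
    back = {}
    for start in range(n):
        if seen[start]:
            continue
        cycle = []
        j = start
        while not seen[j]:
            seen[j] = True
            cycle.append(j)
            j = perm[j]
        if len(cycle) > t:  # short cycles can simply be walked in full
            for idx in range(0, len(cycle), t):
                back[cycle[idx]] = cycle[(idx - t) % len(cycle)]
    return back


def inverse(perm, back, i):
    """Return j with perm[j] == i, following O(t) pointers."""
    j = i
    jumped = False
    while perm[j] != i:
        if not jumped and j in back:
            j = back[j]      # jump to a point at most t steps behind i
            jumped = True
        else:
            j = perm[j]      # ordinary forward step along the cycle
    return j


perm = [2, 0, 3, 1, 5, 4]
back = build_shortcuts(perm, 2)
print([inverse(perm, back, i) for i in range(len(perm))])
```

The number of back pointers is on the order of n/t, so the extra space is about (n/t) lg n bits on top of the n lg n bits for π itself.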
8
Example: Functions
  • A function f : {1,…,n} → {1,…,n} can be
    represented
    - using n lg n + O(n) bits
    - f^k(i) in O(1) time
    - f^{-k}(i) in O(1 + |output|) time
    (optimal space and query times).
  • Can also be generalized to arbitrary functions
    (f : {1,…,n} → {1,…,m}).

9
Representing Trees

10
Motivation
  • Trees are used to represent
    - Directories (Unix, all the rest)
    - Search trees (B-trees, binary search trees,
      digital trees or tries)
    - Graph structures (we do a tree-based search)
    - Search indexes for text (including DNA)
    - Suffix trees
    - XML documents

11
Space for trees
  • The space used by the tree structure could be
    the dominating factor in some applications.
  • E.g., more than half of the space used by a
    standard suffix tree representation is used to
    store the tree structure.
  • Standard representations of trees support very
    few operations. To support other useful queries,
    they require a large amount of extra space.

12
Standard representation
  • Binary tree: each node has two pointers, to its
    left and right children.
  • An n-node tree takes 2n pointers, or 2n lg n bits
    (can be easily reduced to n lg n + O(n) bits).
  • Supports finding left child or right child of a
    node (in constant time).
  • For each extra operation (e.g. parent, subtree
    size) we have to pay, roughly, an additional
    n lg n bits.

13
Can we improve the space bound?
  • There are fewer than 2^{2n} distinct binary trees on
    n nodes.
  • 2n bits are enough to distinguish between any two
    different binary trees.
  • Can we represent an n-node binary tree using 2n
    bits?
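
The counting bound can be checked numerically. The number of binary trees on n nodes is the Catalan number C_n = (2n choose n)/(n+1), which is below 4^n = 2^{2n}. The sketch below computes C_n and the number of bits needed to give every tree a distinct code; the function names are illustrative.

```python
from math import comb


def catalan(n):
    """Number of distinct binary trees on n nodes."""
    return comb(2 * n, n) // (n + 1)


def bits_needed(n):
    """Smallest b with 2**b >= catalan(n), i.e. ceil(lg C_n)."""
    return (catalan(n) - 1).bit_length()


for n in (8, 100, 1000):
    # The information-theoretic minimum stays a little below 2n.
    print(n, bits_needed(n), 2 * n)
```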

14
Heap-like notation for a binary tree

[Figure: an 8-node binary tree with external nodes added; internal nodes are labelled 1 and external nodes 0]

  • Add external nodes.
  • Label internal nodes with a 1 and external nodes
    with a 0.
  • Write the labels in level order:
    1 1 1 1 0 1 1 0 1 0 0 1 0 0 0 0 0
  • One can reconstruct the tree from this sequence.
  • An n-node binary tree can be represented in 2n+1
    bits.
  • What about the operations?
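
The labelling step can be sketched as a breadth-first traversal. In the code below a node is a `(left, right)` tuple and `None` stands for an external node; this encoding and the function name are assumptions made for illustration. The tree shape was obtained by decoding the 17-bit sequence on this slide, since the original figure is not available.

```python
from collections import deque

# Internal nodes numbered 1..8 in level order; None marks an external node.
N5 = (None, None)
N7 = (None, None)
N8 = (None, None)
N4 = (None, N7)
N6 = (N8, None)
N2 = (N4, None)
N3 = (N5, N6)
ROOT = (N2, N3)


def heap_like_bits(root):
    """Write 1 for each internal node and 0 for each external node, level by level."""
    bits = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            bits.append(0)
        else:
            bits.append(1)
            queue.extend(node)  # enqueue left child, then right child
    return bits


print(heap_like_bits(ROOT))
```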
15
Heap-like notation for a binary tree

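[Figure: the same tree, with internal nodes numbered 1 to 8 and all nodes (internal and external) numbered 1 to 17, both in level order]

  • left child(x) = [2x]
  • right child(x) = [2x+1]
  • parent(x) = [⌊x/2⌋]
  • Here x on the left is an internal-node number and
    the bracketed value is a position in the bit
    sequence (for parent, the other way round).
  • Position → node number: the number of 1s up to
    that position.
  • Node number x → position: the position of the
    x-th 1.

node number   1  2  3  4     5  6     7        8
bit           1  1  1  1  0  1  1  0  1  0  0  1  0  0  0  0  0
position      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17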
16
Rank/Select on a bit vector
  • Given a bit vector B:
    - rank1(i) = number of 1s up to position i in B
    - select1(i) = position of the i-th 1 in B
    (similarly rank0 and select0)

position   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
B          0  1  1  0  1  0  0  0  1  1  0  1  1  1  1

rank1(5) = 3, select1(4) = 9, rank0(5) = 2, select0(4) = 7

  • Given a bit vector of length n, by storing an
    additional o(n)-bit structure, we can support all
    four operations in constant time.
  • An important substructure in most succinct data
    structures. It has been implemented.
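
To make the definitions concrete, here is a deliberately naive sketch of rank and select using linear scans. It is not the o(n)-bit constant-time structure the slide refers to; it only pins down the semantics with 1-indexed positions, and the function names are illustrative.

```python
B = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1]


def rank(bits, b, i):
    """Number of occurrences of bit b in positions 1..i."""
    return sum(1 for x in bits[:i] if x == b)


def select(bits, b, k):
    """Position (1-indexed) of the k-th occurrence of bit b."""
    for pos, x in enumerate(bits, start=1):
        if x == b:
            k -= 1
            if k == 0:
                return pos
    raise ValueError("bit vector has fewer than k such bits")


# The four examples from the slide.
print(rank(B, 1, 5), select(B, 1, 4), rank(B, 0, 5), select(B, 0, 4))
```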
17
Binary tree representation
  • A binary tree on n nodes can be represented using
    2n+o(n) bits to support
    - parent
    - left child
    - right child
    in constant time.
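
The navigation rules of the heap-like notation can be sketched on the 17-bit example sequence. Nodes are identified by their internal-node number (1 to 8). Rank and select are done by linear scan here, so this shows the logic only, not the constant-time bound; all function names are illustrative.

```python
BITS = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]


def rank1(i):
    """Number of 1s in positions 1..i."""
    return sum(BITS[:i])


def select1(k):
    """Position (1-indexed) of the k-th 1."""
    for pos, bit in enumerate(BITS, start=1):
        k -= bit
        if bit and k == 0:
            return pos
    raise ValueError("no such node")


def left_child(x):
    pos = 2 * x                      # position of the left child
    return rank1(pos) if BITS[pos - 1] else None


def right_child(x):
    pos = 2 * x + 1                  # position of the right child
    return rank1(pos) if BITS[pos - 1] else None


def parent(x):
    pos = select1(x)                 # position of node x
    return pos // 2 if pos > 1 else None


print(left_child(1), right_child(4), parent(7), parent(8))
```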

18
Ordered trees
  • A rooted ordered tree (on n nodes)
  • Navigational operations
    - parent(x) = a
    - first child(x) = b
    - next sibling(x) = c
  • Other useful operations
    - degree(x) = 2
    - subtree size(x) = 4

[Figure: an ordered tree in which node x has parent a, first child b and next sibling c]
19
Ordered trees
  • A binary tree representation taking 2n+o(n) bits
    that supports parent, left child and right child
    operations in constant time.
  • There is a one-to-one correspondence between
    binary trees (on n nodes) and rooted ordered
    trees (on n+1 nodes).
  • Gives an ordered tree representation taking
    2n+o(n) bits that supports first child, next
    sibling (but not parent) operations in constant
    time.
  • We will now consider ordered tree representations
    that support more operations.
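
The correspondence is the usual first-child/next-sibling transformation: drop the root of the ordered tree, then let "left" mean first child and "right" mean next sibling. A minimal sketch follows, assuming an ordered tree is written as a nested list of its children and a binary tree as a `(left, right)` tuple or `None`; these encodings and names are choices made for illustration.

```python
def forest_to_binary(forest):
    """Map an ordered forest to a binary tree: left = first child, right = next sibling."""
    if not forest:
        return None
    return (forest_to_binary(forest[0]), forest_to_binary(forest[1:]))


def binary_to_forest(node):
    """Inverse of forest_to_binary."""
    if node is None:
        return []
    left, right = node
    return [binary_to_forest(left)] + binary_to_forest(right)


def count_ordered(tree):
    return 1 + sum(count_ordered(child) for child in tree)


def count_binary(node):
    return 0 if node is None else 1 + count_binary(node[0]) + count_binary(node[1])


# A 12-node ordered tree: the root is the outer list, its children form the forest.
TREE = [[[], [[]]], [], [[], [[], []], []]]
binary = forest_to_binary(TREE)
print(count_ordered(TREE), count_binary(binary))
```

The ordered tree has one node more than the binary tree because its root has no counterpart.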

20
Level-order degree sequence

[Figure: a 12-node ordered tree with each node labelled by its degree]

  • Write the degree sequence in level order:
    3 2 0 3 0 1 0 2 0 0 0 0
  • But this still requires n lg n bits.
  • Solution: write them in unary:
    1 1 1 0 1 1 0 0 1 1 1 0 0 1 0 0 1 1 0 0 0 0 0
    Takes 2n-1 bits.
  • A tree is uniquely determined by its degree
    sequence.
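
The construction can be sketched as a breadth-first traversal followed by unary coding (degree d becomes d ones and a zero). The tree below is the 12-node example, reconstructed from its level-order degree sequence and stored as child lists with nodes numbered in level order; the dictionary layout and function names are assumptions for illustration.

```python
from collections import deque

CHILDREN = {
    1: [2, 3, 4], 2: [5, 6], 3: [], 4: [7, 8, 9],
    5: [], 6: [10], 7: [], 8: [11, 12],
    9: [], 10: [], 11: [], 12: [],
}


def level_order_degrees(children, root=1):
    """Degrees of the nodes in breadth-first (level) order."""
    degrees = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        degrees.append(len(children[node]))
        queue.extend(children[node])
    return degrees


def unary(degrees):
    """Encode each degree d as d ones followed by a zero."""
    bits = []
    for d in degrees:
        bits.extend([1] * d + [0])
    return bits


degrees = level_order_degrees(CHILDREN)
print(degrees)
print(unary(degrees))
```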
21
Supporting operations

[Figure: the example tree with nodes numbered 1 to 12 in level order]

  • Add a dummy root so that each node has a
    corresponding 1:

bit    1  0  1  1  1  0  1  1  0  0  1  1  1  0  0  1  0  0  1  1  0  0  0  0  0
node   1     2  3  4     5  6        7  8  9       10       11 12

  • Node k corresponds to the k-th 1 in the bit
    sequence.
  • parent(k) = number of 0s up to the k-th 1.
  • The children of k are stored after the k-th 0.
  • Supports parent, i-th child, degree (using rank
    and select).
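
These rules can be sketched directly on the 25-bit sequence with the dummy root. Rank and select are naive linear scans, so the code shows the navigation logic only and not the constant-time bound; the function names are illustrative, and positions are 1-indexed.

```python
BITS = [1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0]


def rank(b, i):
    """Number of bits equal to b in positions 1..i."""
    return sum(1 for x in BITS[:i] if x == b)


def select(b, k):
    """Position of the k-th bit equal to b."""
    for pos, x in enumerate(BITS, start=1):
        if x == b:
            k -= 1
            if k == 0:
                return pos
    raise ValueError("no such bit")


def parent(k):
    p = rank(0, select(1, k))        # number of 0s up to the k-th 1
    return p if p > 0 else None      # 0 would be the dummy root


def degree(k):
    return select(0, k + 1) - select(0, k) - 1


def child(k, i):
    """i-th child of node k, or None if k has fewer than i children."""
    if i < 1 or i > degree(k):
        return None
    return rank(1, select(0, k) + i)


def next_sibling(k):
    pos = select(1, k) + 1           # bit right after node k's own 1
    return k + 1 if pos <= len(BITS) and BITS[pos - 1] == 1 else None


print(parent(11), degree(4), child(4, 2), next_sibling(2))
```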
22
Level-order unary degree sequence
  • Space: 2n+o(n) bits
  • Supports
    - parent
    - i-th child (and hence first child)
    - next sibling
    - degree
    in constant time.
  • Does not support the subtree size operation.

Implementation: Delpratt-Rahman-Raman, WAE-06
23
Another approach

[Figure: the same 12-node tree with each node labelled by its degree]

  • Write the degree sequence in depth-first order:
    3 2 0 1 0 0 3 0 2 0 0 0
  • In unary:
    1 1 1 0 1 1 0 0 1 0 0 0 1 1 1 0 0 1 1 0 0 0 0
    Takes 2n-1 bits.
  • The representation of each subtree is stored
    contiguously.
  • Supports subtree size along with other
    operations. (Apart from rank/select, we need some
    additional operations.)
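
Because each subtree occupies a contiguous run of bits, its size can be read off by scanning until the zeros outnumber the ones by one. The sketch below builds the depth-first unary sequence for the example tree and computes subtree sizes by such a scan. It is linear time, not the constant-time method the slide alludes to; the child lists and function names are assumptions for illustration.

```python
CHILDREN = {
    1: [2, 3, 4], 2: [5, 6], 3: [], 4: [7, 8, 9],
    5: [], 6: [10], 7: [], 8: [11, 12],
    9: [], 10: [], 11: [], 12: [],
}


def dfuds_bits(children, root=1):
    """Unary degree sequence in depth-first (pre-order) order."""
    bits = []
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children[node]
        bits.extend([1] * len(kids) + [0])
        stack.extend(reversed(kids))   # leftmost child is visited first
    return bits


def subtree_size(bits, j):
    """Size of the subtree of the j-th node in depth-first order (1-indexed)."""
    pos = 0
    zeros = 0
    while zeros < j - 1:               # skip the codes of the first j-1 nodes
        if bits[pos] == 0:
            zeros += 1
        pos += 1
    excess = 0                         # ones minus zeros seen so far
    size = 0
    while excess >= 0:
        if bits[pos] == 1:
            excess += 1
        else:
            excess -= 1
            size += 1                  # each zero closes one node's code
        pos += 1
    return size


bits = dfuds_bits(CHILDREN)
print(bits)
print(subtree_size(bits, 7))           # 7th node in depth-first order
```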
24
Depth-first unary degree sequence
  • Space: 2n+o(n) bits
  • Supports
    - parent
    - i-th child (and hence first child)
    - next sibling
    - degree
    - subtree size
    in constant time.

25
Other useful operations

[Figure: the example tree with nodes numbered 1 to 12 in level order]

  • XML based applications: level ancestor(x, l)
    returns the ancestor of x at level l,
    e.g. level ancestor(11, 2) = 4.
  • Suffix tree based applications: LCA(x, y)
    returns the least common ancestor of x and y,
    e.g. LCA(7, 12) = 4.
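
The two operations can be pinned down with a naive pointer-walking sketch on the example tree. The parent table below was reconstructed from the level-order degree sequence, and the root is taken to be at level 1, which is the convention that agrees with the slide's example. This only fixes the semantics; it says nothing about how the succinct structures answer these queries in constant time.

```python
PARENT = {1: None, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 4, 8: 4, 9: 4, 10: 6, 11: 8, 12: 8}


def level(x):
    """Level of node x, with the root at level 1."""
    d = 1
    while PARENT[x] is not None:
        x = PARENT[x]
        d += 1
    return d


def level_ancestor(x, target):
    """Ancestor of x at the given level, or None if there is none."""
    d = level(x)
    if target < 1 or target > d:
        return None
    for _ in range(d - target):
        x = PARENT[x]
    return x


def lca(x, y):
    """Least common ancestor of x and y."""
    dx, dy = level(x), level(y)
    while dx > dy:
        x, dx = PARENT[x], dx - 1
    while dy > dx:
        y, dy = PARENT[y], dy - 1
    while x != y:
        x, y = PARENT[x], PARENT[y]
    return x


print(level_ancestor(11, 2), lca(7, 12))
```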
26
Parenthesis representation

[Figure: the example tree with an open-close parenthesis pair attached to each node]

  • Associate an open-close parenthesis pair with
    each node.
  • Visit the nodes in pre-order, writing the
    parentheses.
  • Length: 2n; space: 2n bits.
  • One can reconstruct the tree from this sequence.
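
A minimal sketch of the construction: write "(" on entering a node, recurse on its children from left to right, and write ")" on leaving. The child lists describe the 12-node example tree (nodes numbered in level order); the dictionary layout and function name are assumptions for illustration.

```python
CHILDREN = {
    1: [2, 3, 4], 2: [5, 6], 3: [], 4: [7, 8, 9],
    5: [], 6: [10], 7: [], 8: [11, 12],
    9: [], 10: [], 11: [], 12: [],
}


def parentheses(children, node=1):
    """Balanced-parenthesis string of the subtree rooted at node."""
    inner = "".join(parentheses(children, c) for c in children[node])
    return "(" + inner + ")"


seq = parentheses(CHILDREN)
print(seq, len(seq))
```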
27
Operations

[Figure: the example tree with nodes numbered 1 to 12 in level order]

  • parent: enclosing parenthesis
  • first child: next parenthesis (if open)
  • next sibling: open parenthesis following the
    matching closing parenthesis (if it exists)
  • subtree size: half the number of parentheses
    between the pair
  • With o(n) extra bits, all these can be supported
    in constant time.

parenthesis  (  (  (  )  (  (  )  )  )  (  )  (  (  )  (  (  )  (  )  )  (  )  )  )
node         1  2  5     6 10           3     4  7     8 11    12        9
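
The operations can be sketched with plain scans over the example string. A node is identified by the 0-based index of its open parenthesis. The scans take linear time, whereas the slide's claim is constant time with o(n) extra bits, so this shows the logic only; the function names are illustrative.

```python
SEQ = "((()(()))()(()(()())()))"


def find_close(i):
    """Index of the parenthesis matching the open parenthesis at i."""
    depth = 0
    for j in range(i, len(SEQ)):
        depth += 1 if SEQ[j] == "(" else -1
        if depth == 0:
            return j
    raise ValueError("unbalanced sequence")


def parent(i):
    """Open parenthesis of the enclosing pair, or None for the root."""
    depth = 0
    for j in range(i - 1, -1, -1):
        if SEQ[j] == ")":
            depth += 1
        elif depth == 0:
            return j
        else:
            depth -= 1
    return None


def first_child(i):
    return i + 1 if SEQ[i + 1] == "(" else None


def next_sibling(i):
    j = find_close(i) + 1
    return j if j < len(SEQ) and SEQ[j] == "(" else None


def subtree_size(i):
    return (find_close(i) - i + 1) // 2


# Node 4 of the figure opens at index 11, node 8 at index 14.
print(parent(14), first_child(11), next_sibling(12), subtree_size(11))
```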
28
Parenthesis representation
  • Space: 2n+o(n) bits
  • Supports, in constant time:
    - parent
    - first child
    - next sibling
    - subtree size
    - degree
    - depth
    - height
    - level ancestor
    - LCA
    - leftmost/rightmost leaf
    - number of leaves in the subtree
    - next node in the level
    - pre/post order number
    - i-th child

Implementation: Geary et al., CPM-04
29
A different approach
  • If we group k nodes into a block, then pointers
    within the block can be stored using only lg k
    bits.
  • For example, if we can partition the tree into
    n/k blocks, each of size k, then we can store it
    using (n/k) lg n + (n/k) · k lg k
    = (n/k) lg n + n lg k bits.

A careful two-level tree covering method
achieves a space bound of 2n+o(n) bits.
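
As a rough numeric check of the formula, the sketch below compares one lg n-bit pointer per node with the blocked layout. The values n = 2^20 and k = 16 are chosen only for illustration, and the function names are hypothetical.

```python
import math


def pointer_bits(n):
    """One lg n-bit pointer per node."""
    return n * math.log2(n)


def blocked_bits(n, k):
    """(n/k) lg n bits for the block pointers plus n lg k bits inside the blocks."""
    return (n / k) * math.log2(n) + n * math.log2(k)


n, k = 2 ** 20, 16
print(pointer_bits(n))      # 20971520.0
print(blocked_bits(n, k))   # 5505024.0
```

Repeating the idea inside each block is what the two-level method exploits to bring the total down to 2n+o(n) bits.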
30
Tree covering method
  • Space: 2n+o(n) bits
  • Supports, in constant time:
    - parent
    - first child
    - next sibling
    - subtree size
    - degree
    - depth
    - height
    - level ancestor
    - LCA
    - leftmost/rightmost leaf
    - number of leaves in the subtree
    - next node in the level
    - pre/post order number
    - i-th child

31
Ordered tree representations
[Table: which operations each representation supports. Columns: LOUDS, DFUDS, Paren., Partition]
Operations compared:
  • DFUDS-order rank, select
  • parent, first child, sibling
  • level-order rank, select
  • post-order rank, select
  • pre-order rank, select
  • next node in the level
  • i-th child, child rank
  • leaf operations
  • level ancestor
  • subtree size
  • depth, LCA
  • height
  • degree
32
Applications
  • Representing
    - suffix trees
    - XML documents (supporting XPath queries)
    - file systems (searching and path queries)
    - BDDs

33
Conclusions
  • Succinct representations improve the space
    complexity without compromising on query times.
  • Trees can be represented in close to optimal
    space, while supporting a wide range of queries
    efficiently.
  • Open problems
  • Supporting updates efficiently.
  • Efficient external memory structures.

34
References
  • Jacobson, FOCS 89
  • Munro-Raman-Rao, FSTTCS 98 (JAlg 01)
  • Benoit et al., WADS 99 (Algorithmica 05)
  • Lu et al., SODA 01
  • Sadakane, ISSAC 01
  • Geary-Raman-Raman, SODA 04
  • Munro-Rao, ICALP 04
  • Jansson-Sadakane, SODA 06
  • Implementation
    - Geary et al., CPM 04
    - Delpratt-Rahman-Raman, WAE 06

35
  • Thank you.

36
Future work
  • Efficient algorithms for XPath queries
  • File system searches
  • Implementation

37
Dynamic binary trees
  • Raman-Rao, ICALP 03
  • A binary tree on n nodes can be represented using
    2n+o(n) bits to support
    - parent, left/right child, subtree size, preorder
      number in O(1) time
    - insert/delete nodes in O(1) amortized time
  • Can associate b = O(lg n)-bit satellite data
    using
    - bn + o(bn) bits to support access in O(1) time
    - bn + o(n) bits to support access in
      O((lg lg n)^{1+ε}) time

38
k-ary trees
  • A k-ary tree is either empty or a node with
    exactly k children, each of which is a k-ary tree.
  • A k-ary tree on n nodes can be represented using
    - n ⌈lg k⌉ + 2n + o(n) bits to support
      parent, i-th child, child labeled j, degree and
      subtree-size queries in O(1) time
      (Benoit-Demaine-Munro-Raman-Raman-Rao,
      Algorithmica)
    - opt + o(n) bits to support all except the
      subtree-size queries in O(1) time
      (Raman-Raman-Rao, SODA-02)

39
Functions
  • A function f : {1,…,n} → {1,…,n} can be represented
    - using n lg n + O(n) bits
    - f^k(i) in O(1) time
    - f^{-k}(i) in O(1 + |output|) time.
  • Can also be generalized to arbitrary functions
    (f : {1,…,n} → {1,…,m}).

40
Summary of results