CSE 326: Data Structures Part Four: Trees
1
CSE 326: Data Structures, Part Four: Trees
  • Henry Kautz
  • Autumn 2002

2
Material
  • Weiss Chapter 4
  • N-ary trees
  • Binary Search Trees
  • AVL Trees
  • Splay Trees

3
Other Applications of Trees?
4
Tree Jargon
  • Length of a path = number of edges
  • Depth of a node N = length of path from root to N
  • Height of node N = length of longest path from N
    to a leaf
  • Depth and height of tree = height of root

(Diagram: a tree on nodes A-F; the root A has depth 0 and height 2, and a
deepest leaf has depth 2 and height 0.)
5
Definition and Tree Trivia
  • Recursive Definition of a Tree
  • A tree is a set of nodes that is either
  • a. an empty set of nodes, or
  • b. one node, called the root, from which
    zero or more trees (subtrees) descend.
  • A tree with N nodes always has ___ edges
  • Two nodes in a tree have at most how many paths
    between them?

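Both trivia answers can be checked mechanically: a tree with N nodes always has N-1 edges, and any two nodes are connected by exactly one path. A minimal Python sketch (the `Node` class and the example tree shape are illustrative assumptions, not the slides' code):

```python
class Node:
    """A tree node: a value plus a list of zero or more subtrees."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def count_nodes(root):
    return 1 + sum(count_nodes(c) for c in root.children)

def count_edges(root):
    # every node except the root hangs off exactly one parent edge
    return len(root.children) + sum(count_edges(c) for c in root.children)

# a six-node example tree (shape chosen for illustration)
tree = Node('A', [Node('B'), Node('C', [Node('E'), Node('F')]), Node('D')])
assert count_edges(tree) == count_nodes(tree) - 1   # N nodes, N-1 edges
```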
6
Implementation of Trees
  • Obvious pointer-based implementation: node with
    value and pointers to children
  • Problem?

(Diagram: the tree on nodes A-F.)
7
1st Child/Next Sibling Representation
  • Each node has 2 pointers: one to its first child
    and one to its next sibling

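A sketch of this representation in Python (class and helper names are illustrative assumptions):

```python
class Node:
    """First-child/next-sibling: exactly two pointers per node,
    no matter how many children it has."""
    def __init__(self, value):
        self.value = value
        self.first_child = None
        self.next_sibling = None

def add_child(parent, child):
    # O(1): prepend to the parent's sibling list
    child.next_sibling = parent.first_child
    parent.first_child = child

def children(node):
    c = node.first_child
    while c is not None:
        yield c
        c = c.next_sibling

a, b, c = Node('A'), Node('B'), Node('C')
add_child(a, c)
add_child(a, b)              # prepended, so B now precedes C
assert [x.value for x in children(a)] == ['B', 'C']
```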
(Diagram: the tree on A-F drawn twice: once with per-child pointers, once
in first-child/next-sibling form.)
8
Nested List Implementation 1
  • Tree ::= ( label Tree* )

(Diagram: an example tree on nodes a, b, c, d.)
9
Nested List Implementation 2
  • Tree ::= label | ( label Tree* )

(Diagram: the same example tree on nodes a, b, c, d.)
10
Application: Arithmetic Expression Trees
Example arithmetic expression: A + (B * (C / D))
Tree for the above expression:

  • Used in most compilers
  • No parentheses needed: precedence is encoded by
    the tree structure
  • Can speed up calculations: e.g. replace the
    / node with C/D if C and D are known
  • Calculate by traversing tree (how?)

(Diagram: expression tree with + at the root over A and *; * is over B
and /; / is over C and D.)
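Evaluation by traversal can be sketched as follows: a postorder walk computes each operand before applying its operator. Python for brevity; the `Node` class and the sample values for A-D are assumptions for the demo:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b, '/': lambda a, b: a / b}

def evaluate(node, env):
    if node.value in OPS:                      # internal node: an operator
        return OPS[node.value](evaluate(node.left, env),
                               evaluate(node.right, env))
    return env[node.value]                     # leaf: an operand

# the tree for A + (B * (C / D))
expr = Node('+', Node('A'),
            Node('*', Node('B'), Node('/', Node('C'), Node('D'))))
assert evaluate(expr, {'A': 1, 'B': 2, 'C': 8, 'D': 4}) == 5
```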
11
Traversing Trees
  • Preorder: Root, then Children
  • + A * B / C D
  • Postorder: Children, then Root
  • A B C D / * +
  • Inorder: Left child, Root, Right child
  • A + B * C / D

(Diagram: the same expression tree, + at the root.)
12
Example Code for Recursive Preorder
void print_preorder( TreeNode T ) {
    TreeNode P;
    if ( T == NULL ) return;
    print_element( T.Element );
    P = T.FirstChild;
    while ( P != NULL ) {
        print_preorder( P );
        P = P.NextSibling;
    }
}

What is the running time for a tree with N nodes?
13
Binary Trees
  • Properties
  • Notation: depth(tree) = MAX depth(leaf) =
    height(root)
  • max # of leaves = 2^depth(tree)
  • max # of nodes = 2^(depth(tree)+1) - 1
  • max depth = n-1
  • average depth for n nodes
  • (over all possible binary trees)
  • Representation

(Diagram: a binary tree on nodes A-J.)
14
Dictionary & Search ADTs
  • Operations
  • create
  • destroy
  • insert
  • find
  • delete
  • Dictionary: Stores values associated with
    user-specified keys
  • keys may be any (homogeneous) comparable type
  • values may be any (homogeneous) type
  • implementation: data field is a struct with two
    parts
  • Search ADT: keys = values
  • kim chi: spicy cabbage
  • kreplach: tasty stuffed dough
  • kiwi: Australian fruit

insert
  • kohlrabi: upscale tuber

find(kreplach)
  • kreplach: tasty stuffed dough

15
Naïve Implementations
                          unsorted array    sorted array    linked list
insert (w/o duplicates)
find
delete
  • Goal fast find like sorted array, dynamic
    inserts/deletes like linked list

16
Naïve Implementations
                          unsorted array    sorted array    linked list
insert (w/o duplicates)   find + O(1)       O(n)            find + O(1)
find                      O(n)              O(log n)        O(n)
delete                    find + O(1)       O(n)            find + O(1)
  • Goal fast find like sorted array, dynamic
    inserts/deletes like linked list

17
Binary Search Tree Dictionary Data Structure
  • Search tree property
  • all keys in left subtree smaller than root's key
  • all keys in right subtree larger than root's key
  • result
  • easy to find any given key
  • inserts/deletes by changing links

(Diagram: a binary search tree rooted at 8 on keys 2, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14.)
18
In Order Listing
  • visit left subtree
  • visit node
  • visit right subtree

(Diagram: a BST rooted at 10 on keys 2, 5, 7, 9, 15, 17, 20, 30.)
In order listing:
19
In Order Listing
  • visit left subtree
  • visit node
  • visit right subtree

(Diagram: the same BST rooted at 10.)
In order listing: 2 → 5 → 7 → 9 → 10 → 15 → 17 → 20 → 30
20
Finding a Node
  • Node find(Comparable x, Node root) {
  •   if (root == NULL)
  •     return root;
  •   else if (x < root.key)
  •     return find(x, root.left);
  •   else if (x > root.key)
  •     return find(x, root.right);
  •   else
  •     return root;
  • }

(Diagram: the BST rooted at 10.)
runtime
21
Insert
  • Concept: proceed down the tree as in Find; if the
    new key is not found, insert a new node at the
    last spot traversed
  • void insert(Comparable x, Node root) {
  •   // Does not work for empty tree, when root is NULL
  •   if (x < root.key) {
  •     if (root.left == NULL)
  •       root.left = new Node(x);
  •     else insert( x, root.left );
  •   } else if (x > root.key) {
  •     if (root.right == NULL)
  •       root.right = new Node(x);
  •     else insert( x, root.right );
  •   }
  • }

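One standard fix for the empty-tree caveat is to have insert return the (possibly new) root of the subtree. A hedged Python sketch (class and helper names are mine, not the slides'):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, x):
    """Insert x and return the new subtree root; works when root is None."""
    if root is None:
        return Node(x)
    if x < root.key:
        root.left = insert(root.left, x)
    elif x > root.key:
        root.right = insert(root.right, x)
    return root                      # duplicates are silently ignored

def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

root = None                          # start from the empty tree
for k in [10, 5, 15, 9, 2, 20]:
    root = insert(root, k)
assert inorder(root) == [2, 5, 9, 10, 15, 20]
```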
22
Time to Build a Tree
  • Suppose a1, a2, ..., an are inserted into an
    initially empty BST
  • a1, a2, ..., an are in increasing order
  • a1, a2, ..., an are in decreasing order
  • a1 is the median of all, a2 is the median of
    elements less than a1, a3 is the median of
    elements greater than a1, etc.
  • data is randomly ordered

23
Analysis of BuildTree
  • Increasing / Decreasing: Θ(n²)
  • 1 + 2 + 3 + ... + n = Θ(n²)
  • Medians: yields perfectly balanced tree
  • Θ(n log n)
  • Average case, assuming all input sequences are
    equally likely, is Θ(n log n)
  • equivalently: average depth of a node is log n;
    total time = sum of depths of nodes

24
Proof that Average Depth of a Node in a BST
Constructed from Random Data is Θ(log n)
  • Method: calculate the sum of all depths, divide by
    the number of nodes
  • D(n) = sum of depths of all nodes in a random BST
    containing n nodes
  • D(n) = D(left subtree) + D(right subtree)
    + adjustment for distance from root to subtrees
    + depth of root
  • D(n) = D(left subtree) + D(right subtree)
    + (number of nodes in left and right subtrees) + 0
  • D(n) = D(L) + D(n-L-1) + (n-1)

25
Random BST, cont.
  • D(n) = D(L) + D(n-L-1) + (n-1)
  • For random data, all subtree sizes L are equally likely

this looks just like the Quicksort average case
equation!
26
(No Transcript)
27
Random Input vs. Random Trees
  • Inputs
  • 1,2,3
  • 3,2,1
  • 1,3,2
  • 3,1,2
  • 2,1,3
  • 2,3,1

For three items, the shallowest tree is twice as
likely as any other; the effect grows as n
increases. For n = 4, the probability of getting a
shallow tree is > 50%.
28
Deletion
(Diagram: a BST rooted at 10 on keys 2, 5, 7, 9, 15, 17, 20, 30.)
Why might deletion be harder than insertion?
29
FindMin/FindMax
(Diagram: the BST rooted at 10; its minimum, 2, is the leftmost node.)
  • Node min(Node root) {
  •   if (root.left == NULL)
  •     return root;
  •   else
  •     return min(root.left);
  • }

How many children can the min of a node have?
30
Successor
  • Find the next larger node
    in this node's subtree
  • not the next larger in the entire tree
  • Node succ(Node root) {
  •   if (root.right == NULL)
  •     return NULL;
  •   else
  •     return min(root.right);
  • }

(Diagram: the BST rooted at 10.)
How many children can the successor of a node
have?
31
Deletion - Leaf Case
Delete(17)
(Diagram: the BST rooted at 10; the leaf 17 is simply removed.)
32
Deletion - One Child Case
Delete(15)
(Diagram: node 15 has one child; it is spliced out and its child is
linked to 15's parent.)
33
Deletion - Two Child Case
Delete(5)
(Diagram: node 5 has two children, 2 and 9.)
replace the node with a value guaranteed to be between
the left and right subtrees: the successor
Could we have used the predecessor instead?
34
Deletion - Two Child Case
Delete(5)
(Diagram: 5's successor, 7, is the minimum of its right subtree.)
always easy to delete the successor: it always has
either 0 or 1 children!
35
Deletion - Two Child Case
Delete(5)
(Diagram: 7 is copied into 5's node, and the old leaf 7 is deleted.)
Finally, copy the data value from the deleted
successor into the original node
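All three deletion cases fit into one routine that, like insert, can return the new subtree root. A hedged Python sketch (class and helper names are mine, not the slides'):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, x):
    if root is None: return Node(x)
    if x < root.key: root.left = insert(root.left, x)
    elif x > root.key: root.right = insert(root.right, x)
    return root

def delete(root, x):
    if root is None:
        return None
    if x < root.key:
        root.left = delete(root.left, x)
    elif x > root.key:
        root.right = delete(root.right, x)
    elif root.left is None:            # 0 or 1 child: splice out
        return root.right
    elif root.right is None:
        return root.left
    else:                              # 2 children: copy the successor's
        s = root.right                 # key, then delete the successor
        while s.left is not None:      # (it has 0 or 1 children)
            s = s.left
        root.key = s.key
        root.right = delete(root.right, s.key)
    return root

def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

root = None
for k in [10, 5, 20, 2, 9, 30, 7]:
    root = insert(root, k)
root = delete(root, 5)         # two-child case: 7 (5's successor) moves up
assert inorder(root) == [2, 7, 9, 10, 20, 30]
```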
36
Lazy Deletion
  • Instead of physically deleting nodes, just mark
    them as deleted
  • + simpler
  • + physical deletions done in batches
  • + some adds just flip the deleted flag
  • - extra memory for the deleted flag
  • - many lazy deletions slow down finds
  • - some operations may have to be modified (e.g.,
    min and max)

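A minimal Python sketch of the idea (the flag and helper names are assumptions, not the slides' code):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.deleted = False           # the extra memory lazy deletion pays for

def locate(root, x):
    # ordinary BST search that ignores the deleted flag
    while root is not None and root.key != x:
        root = root.left if x < root.key else root.right
    return root

def find(root, x):
    n = locate(root, x)
    return None if n is None or n.deleted else n

def lazy_delete(root, x):
    n = locate(root, x)
    if n is not None:
        n.deleted = True

def insert(root, x):
    if root is None:
        return Node(x)
    n = locate(root, x)
    if n is not None:
        n.deleted = False              # re-adding: just flip the flag
        return root
    if x < root.key:                   # otherwise a normal BST insert
        root.left = insert(root.left, x)
    else:
        root.right = insert(root.right, x)
    return root

root = None
for k in [10, 5, 15, 9]:
    root = insert(root, k)
lazy_delete(root, 5)
assert find(root, 5) is None           # 5 looks gone...
root = insert(root, 5)                 # ...but re-insertion just flips the flag
assert find(root, 5) is not None
```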
(Diagram: the BST rooted at 10.)
37
Dictionary Implementations
          unsorted array   sorted array   linked list   BST
insert    find + O(1)      O(n)           find + O(1)   O(Depth)
find      O(n)             O(log n)       O(n)          O(Depth)
delete    find + O(1)      O(n)           find + O(1)   O(Depth)
  • BSTs look good when the tree is shallow, i.e. the
    depth D is small (log n); otherwise they are as
    bad as a linked list!

38
CSE 326: Data Structures, Part 3: Trees,
continued: Balancing Act
  • Henry Kautz
  • Autumn Quarter 2002

39
Beauty is Only Θ(log n) Deep
  • Binary Search Trees are fast if they're shallow
  • e.g. complete
  • Problems occur when one branch is much longer
    than the other
  • How to capture the notion of a "sort of" complete
    tree?

40
Balance
  • balance = height(left subtree) - height(right
    subtree)
  • convention: the height of a null subtree is -1
  • zero everywhere ⇒ perfectly balanced
  • small everywhere ⇒ balanced enough: Θ(log n)
  • Precisely: maximum depth is 1.44 log n

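Height and balance, as defined above, in a short Python sketch (the `Node` class is an assumption for the demo):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    # convention from the slide: a null subtree has height -1
    if n is None:
        return -1
    return 1 + max(height(n.left), height(n.right))

def balance(n):
    return height(n.left) - height(n.right)

# perfectly balanced: balance 0 everywhere
t = Node(5, Node(3, Node(2), Node(4)), Node(8, Node(7), Node(9)))
assert height(t) == 2 and balance(t) == 0

# a stick: very unbalanced
stick = Node(1, None, Node(2, None, Node(3)))
assert balance(stick) == -2
```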
41
AVL Tree Dictionary Data Structure
  • Binary search tree properties
  • Balance of every node is -1 ≤ b ≤ 1
  • Tree re-balances itself after every insert or
    delete

(Diagram: an AVL tree rooted at 8 on keys 2, 4, 5, 6, 7, 9, 10, 11, 12,
13, 14, 15.)
What is the balance of each node in this tree?
42
AVL Tree Data Structure
(Diagram: an AVL tree on keys 2, 5, 9, 10, 12, 15, 17, 20, 30; each node
stores its data, its height, and pointers to its children. The root 10
has height 3.)
43
Not An AVL Tree
(Diagram: the same tree with 18 added below 17; the root's subtrees now
have heights 1 and 3, so the tree is no longer AVL-balanced.)
44
Bad Case 1
  • Insert(small)
  • Insert(middle)
  • Insert(tall)

(Diagram: a chain: S (height 2) with right child M (height 1), which has
right child T (height 0).)
45
Single Rotation
(Diagram: a single rotation makes M the root, with children S and T.)
Basic operation used in AVL trees: a right child
could legally have its parent as its left child.
46
General Case Insert Unbalances
(Diagram: the general case. Before the rotation, the root a (height h+2)
has left subtree X of height h-1 and right child b of height h+1; b's
subtrees are Y (height h-1) and Z (height h, where the insert happened).
A single rotation promotes b: the new root b has left child a (height h,
over X and Y) and right subtree Z, restoring overall height h+1.)
47
Properties of General Insert Single Rotation
  • Restores balance at the lowest point in the tree
    where the imbalance occurs
  • After the rotation, the height of the subtree (in
    the example, h+1) is the same as it was before the
    insert that imbalanced it
  • Thus, no further rotations are needed anywhere in
    the tree!

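The outside-case rotation can be sketched directly; a hedged Python version of the right-right case from Bad Case 1 (class and helper names are mine):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_left(a):
    """Single rotation for the right-right (outside) case:
    a's right child b is promoted; b's left subtree moves under a."""
    b = a.right
    a.right = b.left
    b.left = a
    return b

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

# Bad Case 1: keys inserted in increasing order form a stick
stick = Node(1, None, Node(2, None, Node(3)))
root = rotate_left(stick)
assert (root.left.key, root.key, root.right.key) == (1, 2, 3)
assert inorder(root) == [1, 2, 3]        # BST order is preserved
```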
48
Bad Case 2
  • Insert(small)
  • Insert(tall)
  • Insert(middle)

(Diagram: a zig-zag chain: S (height 2) with right child T (height 1),
whose left child is M (height 0).)
Why won't a single rotation (bringing T up to the
top) fix this?
49
Double Rotation
(Diagram: the double rotation pulls M up two levels, leaving M as the
root with children S and T.)
50
General Double Rotation
(Diagram: general double rotation. Before: root a (height h+3) has right
subtree Z (height h) and left child b (height h+2); b has left subtree W
(height h) and right child c (height h+1) over X and Y. After pulling c
up: root c (height h+2) with left child b (over W and X) and right child
a (over Y and Z).)
  • Initially, the insert into X unbalances the tree
    (root height goes to h+3)
  • Zig-zag to pull up c: restores the root height to
    h+2 and the left subtree height to h+1

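A zig-zag is just two single rotations; a hedged Python sketch of the inside case from Bad Case 2 (names are mine):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(a):
    b = a.left
    a.left, b.right = b.right, a
    return b

def rotate_left(a):
    b = a.right
    a.right, b.left = b.left, a
    return b

def double_rotate_right_left(a):
    """Inside (right-left) case: zig-zag pulls a.right.left up two levels."""
    a.right = rotate_right(a.right)
    return rotate_left(a)

# Bad Case 2: insert 1 (small), 3 (tall), 2 (middle)
a = Node(1, None, Node(3, Node(2), None))
root = double_rotate_right_left(a)
assert (root.left.key, root.key, root.right.key) == (1, 2, 3)
```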
51
Another Double Rotation Case
(Diagram: the symmetric case, with the insert into Y instead of X; again
c is pulled up to the root.)
  • Initially, the insert into Y unbalances the tree
    (root height goes to h+3)
  • Zig-zag to pull up c: restores the root height to
    h+2 and the left subtree height to h+1

52
Insert Algorithm
  • Find spot for the value
  • Hang the new node
  • Search back up looking for imbalance
  • If there is an imbalance:
  • "outside": perform single rotation and exit
  • "inside": perform double rotation and exit

53
AVL Insert Algorithm
  • Node insert(Comparable x, Node root) {
  •   // returns root of revised tree
  •   if ( root == NULL )
  •     return new Node(x);
  •   if (x < root.key) {
  •     root.left = insert( x, root.left );
  •     if (root unbalanced) { rotate... }
  •   } else { // x > root.key
  •     root.right = insert( x, root.right );
  •     if (root unbalanced) { rotate... }
  •   }
  •   root.height = max(root.left.height,
  •                     root.right.height) + 1;
  •   return root;
  • }

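The elided rotation logic can be filled in; a complete, hedged Python sketch of AVL insertion (the rotation helpers and the balance test are my formulation, not the slides' code):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 0

def h(n):                          # height, with the null = -1 convention
    return -1 if n is None else n.height

def update(n):
    n.height = 1 + max(h(n.left), h(n.right))

def rotate_right(a):
    b = a.left
    a.left, b.right = b.right, a
    update(a); update(b)
    return b

def rotate_left(a):
    b = a.right
    a.right, b.left = b.left, a
    update(a); update(b)
    return b

def insert(root, x):
    if root is None:
        return Node(x)
    if x < root.key:
        root.left = insert(root.left, x)
        if h(root.left) - h(root.right) == 2:
            if x < root.left.key:                  # outside: single rotation
                root = rotate_right(root)
            else:                                  # inside: double rotation
                root.left = rotate_left(root.left)
                root = rotate_right(root)
    elif x > root.key:
        root.right = insert(root.right, x)
        if h(root.right) - h(root.left) == 2:
            if x > root.right.key:                 # outside
                root = rotate_left(root)
            else:                                  # inside
                root.right = rotate_right(root.right)
                root = rotate_left(root)
    update(root)
    return root

root = None
for k in range(1, 8):              # worst-case input: sorted order
    root = insert(root, k)
assert root.key == 4 and root.height == 2   # perfectly balanced anyway
```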
54
Deletion (Really Easy Case)
Delete(17)
(Diagram: an AVL tree rooted at 10 on keys 2, 3, 5, 9, 12, 15, 17, 20,
30; removing the leaf 17 leaves it balanced.)
55
Deletion (Pretty Easy Case)
Delete(15)
(Diagram: the same tree; 15 has one child, so it is spliced out.)
56
Deletion (Pretty Easy Case cont.)
(Diagram: after the splice, the heights along the path from 15's old
position to the root are recomputed; the tree remains balanced.)
57
Deletion (Hard Case 1)
Delete(12)
(Diagram: after removing 12, node 17 has a null left subtree and a right
subtree of height 1: balance -2.)
58
Single Rotation on Deletion
(Diagram: a single rotation at the unbalanced node promotes 20: the
subtree becomes 20 with children 17 and 30, and the tree is balanced
again.)
How is deletion different from insertion?
59
Deletion (Hard Case)
Delete(9)
(Diagram: a larger AVL tree rooted at 10. Removing 9 leaves node 5
unbalanced, with an "inside" grandchild, 3.)
60
Double Rotation on Deletion
Not finished!
(Diagram: a double rotation pulls 3 up: the left subtree becomes 3 with
children 2 and 5. But that subtree's height has shrunk, so the imbalance
has moved up: the root 10 is now unbalanced.)
61
Deletion with Propagation
What's different about this case?
(Diagram: the root 10 now has a left subtree of height 1 and a right
subtree of height 3 rooted at 17, whose two children have equal heights.)
We get to choose whether to single or double
rotate!
62
Propagated Single Rotation
(Diagram: a propagated single rotation at the root promotes 17; the
result is a valid AVL tree.)
63
Propagated Double Rotation
(Diagram: alternatively, a propagated double rotation at the root pulls
12 up; this also yields a valid AVL tree.)
64
AVL Deletion Algorithm
  • Recursive
  • 1. If at the node, delete it
  • 2. Otherwise, recurse into the appropriate
    subtree to find it
  • 3. Correct heights
  • a. If imbalance case 1, single rotate
  • b. If imbalance case 2 (or don't care),
    double rotate
  • Iterative
  • 1. Search downward for the node, stacking
    parent nodes
  • 2. Delete the node
  • 3. Unwind the stack, correcting heights
  • a. If imbalance case 1, single rotate
  • b. If imbalance case 2 (or don't care),
    double rotate

65
AVL
  • Automatically Virtually Leveled
  • Architecture for inVisible Leveling
  • Articulating Various Lines
  • Amortizing? Very Lousy!
  • Amazingly Vexing Letters

66
AVL
  • Automatically Virtually Leveled
  • Architecture for inVisible Leveling
  • Articulating Various Lines
  • Amortizing? Very Lousy!
  • Amazingly Vexing Letters

Adelson-Velskii & Landis
67
Pros and Cons of AVL Trees
  • Pro
  • All operations guaranteed O(log N)
  • The height balancing adds no more than a constant
    factor to the speed of insertion
  • Con
  • Space consumed by height field in each node
  • Slower than ordinary BST on random data
  • Can we guarantee O(log N) performance with less
    overhead?

68
Splay Trees
  • CSE 326 Data StructuresPart 3 Trees, continued

69
Today: Splay Trees
  • Fast both in worst-case amortized analysis and in
    practice
  • Are used in the kernel of NT to keep track of
    process information!
  • Invented by Sleator and Tarjan (1985)
  • Details
  • Weiss 4.5 (basic splay trees)
  • 11.5 (amortized analysis)
  • 12.1 (better top-down implementation)

70
Basic Idea
  • Blind rebalancing: no height info kept!
  • Worst-case time per operation is O(n)
  • Worst-case amortized time is O(log n)
  • Insert/find always rotates the node to the root!
  • Good locality
  • Most commonly accessed keys move high in the tree
    and become easier and easier to find

71
Idea
move n to the root by a series of zig-zag and zig-zig
rotations, followed by a final single rotation
(zig) if necessary
You're forced to make a really deep access:
(Diagram: a deep BST on keys 2, 3, 5, 9, 10, 17; the accessed node n = 3
is at the bottom.)
72
Zig-Zag
(Diagram: zig-zag. Node n, with parent p and grandparent g, moves up 2
levels to take g's place, with p and g as its children. Of the subtrees
W, X, Y, Z, some are helped, some unchanged, and some hurt by one level.)
This is just a double rotation
73
Zig-Zig
(Diagram: zig-zig. The chain g, p, n (with subtrees W, X, Y, Z) is
restructured so that n is on top and g at the bottom, reversing the
chain.)
74
Why Splaying Helps
  • Node n and its children are always helped
    (raised)
  • Except for the last step, nodes that are hurt by a
    zig-zag or zig-zig are later helped by a rotation
    higher up the tree!
  • Result:
  • shallow nodes may increase depth by one or two
  • helped nodes decrease depth by a large amount
  • If a node n on the access path is at depth d
    before the splay, it's at about depth d/2 after
    the splay
  • Exceptions are the root, the child of the root,
    and the node splayed

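The whole access operation can be sketched recursively; this is a standard simple recursive splay (not the slides' or Weiss's top-down variant), with illustrative names:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(a):
    b = a.left
    a.left, b.right = b.right, a
    return b

def rotate_left(a):
    b = a.right
    a.right, b.left = b.left, a
    return b

def splay(root, key):
    """Bring the node with `key` (or the last node on its search path)
    to the root via zig-zig / zig-zag / zig steps."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root                          # key absent: stop here
        if key < root.left.key:                  # zig-zig (left-left)
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                # zig-zag (left-right)
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:
        if root.right is None:
            return root
        if key > root.right.key:                 # zig-zig (right-right)
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:               # zig-zag (right-left)
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

# the slides' example: a right chain 1..6, then Find(6)
root = None
for k in [6, 5, 4, 3, 2, 1]:
    root = Node(k, None, root)     # hand-build the chain 1 -> 2 -> ... -> 6
root = splay(root, 6)
assert root.key == 6
assert inorder(root) == [1, 2, 3, 4, 5, 6]
```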
75
Splaying Example
Find(6)
(Diagram: a right chain 1, 2, 3, 4, 5, 6. The first zig-zig rotates 6
above 5 and 4.)
76
Still Splaying 6
(Diagram: a second zig-zig lifts 6 two more levels; only 1 remains above
it.)
77
Almost There, Stay on Target
(Diagram: a final zig makes 6 the root; 1 and the subtree rooted at 3
hang beneath it.)
78
Splay Again
Find(4)
(Diagram: a zig-zag on 4 pulls it up two levels, above 3 and 5.)
79
Example Splayed Out
(Diagram: after a final zig-zag, 4 reaches the root; repeated accesses
have turned the original chain into a bushy tree.)
80
Locality
  • Locality: if an item is accessed, it is likely
    to be accessed again soon
  • Why?
  • Assume m ≥ n accesses in a tree of size n
  • Total worst case time is O(m log n)
  • O(log n) amortized time per access
  • Suppose only k distinct items are accessed in the
    m accesses.
  • Time is O(n log n + m log k)
  • (n log n: getting those k items near the root;
    m log k: those k items are all at the top of the tree)
  • Compare with O( m log n ) for an AVL tree
81
Splay Operations Insert
  • To insert, could do an ordinary BST insert
  • but would not fix up tree
  • A BST insert followed by a find (splay)?
  • Better idea do the splay before the insert!
  • How?

82
Split
  • Split(T, x) creates two BSTs L and R
  • All elements of T are in either L or R
  • All elements in L are ≤ x
  • All elements in R are ≥ x
  • L and R share no elements
  • Then how do we do the insert?

83
Split
  • Split(T, x) creates two BSTs L and R
  • All elements of T are in either L or R
  • All elements in L are ≤ x
  • All elements in R are > x
  • L and R share no elements
  • Then how do we do the insert?
  • Insert as root, with children L and R

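The Split contract, and insert-as-root on top of it, can be sketched in Python. For brevity this uses a plain recursive BST split rather than the splay-based one described on the next slides (all names are mine):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def split(root, x):
    """Partition a BST into (L: keys <= x, R: keys > x).
    A splay tree would instead splay, then cut one root link."""
    if root is None:
        return None, None
    if root.key <= x:
        l, r = split(root.right, x)
        root.right = l
        return root, r
    else:
        l, r = split(root.left, x)
        root.left = r
        return l, root

def insert(root, x):
    left, right = split(root, x)
    return Node(x, left, right)    # the new key becomes the root

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

t = None
for k in [6, 1, 9, 4, 2, 7]:
    t = insert(t, k)
t = insert(t, 5)
assert t.key == 5 and inorder(t) == [1, 2, 4, 5, 6, 7, 9]
```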
84
Splitting in Splay Trees
  • How can we split?
  • We have the splay operation
  • We can find x, or the parent of where x would be
    if we were to insert it, as in an ordinary BST
  • We can splay x or the parent to the root
  • Then break one of the links from the root to a
    child

85
Split
could be x, or what would have been the parent of x
split(x): splay that node to the root of T, then break one root-child
link, yielding L and R:
  • if the root is ≤ x: L = the root plus its left subtree (keys ≤ x),
    R = its right subtree (keys > x)
  • if the root is > x: L = its left subtree (keys < x),
    R = the root plus its right subtree (keys > x)
86
Back to Insert
(Diagram: the new node x becomes the root, with L (keys ≤ x) and R
(keys > x) as its children.)
  • Insert(x)
  • Split on x
  • Join subtrees using x as root

87
Insert Example
Insert(5)
(Diagram: a tree on keys 1, 2, 4, 6, 7, 9. split(5) splays 4, the
would-be parent of 5, to the root and breaks its right link, giving
L = {1, 2, 4} and R = {6, 7, 9}; then 5 becomes the new root with
children 4 and 6.)
88
Splay Operations Delete
(Diagram: to delete x, splay x to the root and remove it, leaving two
subtrees: L (keys < x) and R (keys > x).)
Now what?
89
Join
  • Join(L, R): given two trees such that every key in
    L is less than every key in R, merge them
  • Splay on the maximum element in L, then attach R
    as the new root's right child

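A hedged Python sketch of Join. Instead of a full splay, it brings L's maximum to the root by repeated single rotations (the maximum is the rightmost node, so these are all simple zig steps); names are mine:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_left(a):
    b = a.right
    a.right, b.left = b.left, a
    return b

def join(L, R):
    """Merge two BSTs where every key in L < every key in R."""
    if L is None:
        return R
    while L.right is not None:
        L = rotate_left(L)          # promote the right child
    L.right = R                     # the max of L has a free right slot
    return L

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

L = Node(2, Node(1), Node(6, None, Node(7)))
R = Node(9)
t = join(L, R)
assert t.key == 7 and inorder(t) == [1, 2, 6, 7, 9]
```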
90
Delete Completed
(Diagram: deleting x: splay x to the root, remove it, then Join(L, R)
produces the tree T - x.)
91
Delete Example
Delete(4)
(Diagram: a tree on keys 1, 2, 4, 6, 7, 9. find(4) splays 4 to the root;
removing it leaves L = {1, 2} and R = {6, 7, 9}. The maximum of L, 2, is
splayed to L's root, and R is attached as its right child.)
92
Splay Trees, Summary
  • Splay trees are arguably the most practical kind
    of self-balancing tree
  • If the number of finds is much larger than n, then
    locality is crucial!
  • Example: word counting
  • Also support efficient Split and Join operations,
    useful for other tasks
  • E.g., range queries