Title: Unit 11: Data Structures
1Unit 11 Data Structures & Complexity
- We discuss in this unit:
- Graphs and trees
- Binary search trees
- Hashing functions
- Recursive sorting: quicksort, mergesort
syllabus
basic programming concepts
object oriented programming
topics in computer science
2Graphs and Trees
- Graph: a data representation which includes nodes
and edges, where each edge connects two nodes
- Example: the internet
- Tree: a connected graph with no loops (cycles), and a root
- Example: inheritance tree
3Graphs and Trees
[Figure: three example graphs a), b), c), with labels ROOT, internal vertices, and LEAF NODES]
4Binary Trees
- A rooted tree is called a binary tree if every
internal vertex has no more than 2 children
- The tree is called a full binary tree if every
internal vertex has exactly 2 children
- An ordered rooted tree is a rooted tree where the
children of each internal vertex are ordered; we
call the children of a vertex the left child and
the right child, if they exist
- A rooted binary tree of height H is called
balanced if all its leaves are at levels H or H-1
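The definitions above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` class and helper names that are not part of the slides, and taking the root to be at level 0 so that a single-vertex tree has height 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration; not defined in the slides.
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(t):
    """Height counted in edges: empty tree is -1, a single vertex is 0."""
    if t is None:
        return -1
    return 1 + max(height(t.left), height(t.right))


def is_full(t):
    """True if every internal vertex has exactly 2 children."""
    if t is None:
        return True
    if (t.left is None) != (t.right is None):
        return False  # exactly one child present
    return is_full(t.left) and is_full(t.right)


def leaf_levels(t, level=0):
    """Set of levels at which leaves occur (root is level 0)."""
    if t is None:
        return set()
    if t.left is None and t.right is None:
        return {level}
    return leaf_levels(t.left, level + 1) | leaf_levels(t.right, level + 1)


def is_balanced(t):
    """True if all leaves are at levels H or H-1, where H is the height."""
    h = height(t)
    return leaf_levels(t) <= {h, h - 1}


tree = Node("J", Node("E", Node("A"), Node("H")), Node("T"))
print(height(tree), is_full(tree), is_balanced(tree))
```

Note that, under this definition alone, a tree that is a single chain also counts as balanced, which is why the later results ask for a tree that is both full and balanced.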
5Binary tree example
[Figure: binary tree with nodes Lou, Hal, Max, Ken, Sue, Ed, Joe, Ted]
6Tree Properties
- Theorem: A tree with N vertices has N-1 edges
- Theorem: There are at most 2^H leaves in a binary
tree of height H
- Corollary: If a binary tree with L leaves is
full and balanced, then its height is H = ⌈log2 L⌉
- Theorem: There are at most 2^(H+1) - 1 nodes in a
binary tree of height H
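These bounds are attained by a tree in which every level is completely filled. As a small numerical check (a sketch only, using a hypothetical nested-pair encoding of trees that is not from the slides), one can build such trees and count leaves, nodes, and edges:

```python
import math


def perfect(h):
    """Completely filled binary tree of height h as nested (left, right) pairs.

    A leaf is encoded as (None, None); this encoding is an assumption
    made for illustration.
    """
    if h == 0:
        return (None, None)
    return (perfect(h - 1), perfect(h - 1))


def count_nodes(t):
    return 0 if t is None else 1 + count_nodes(t[0]) + count_nodes(t[1])


def count_leaves(t):
    if t is None:
        return 0
    if t == (None, None):
        return 1
    return count_leaves(t[0]) + count_leaves(t[1])


def count_edges(t):
    """One edge per existing child, plus the edges inside that child's subtree."""
    if t is None:
        return 0
    return sum(1 + count_edges(child) for child in t if child is not None)


for h in range(5):
    t = perfect(h)
    leaves, nodes, edges = count_leaves(t), count_nodes(t), count_edges(t)
    # leaves = 2^h, nodes = 2^(h+1) - 1, edges = nodes - 1, h = ceil(log2(leaves))
    print(h, leaves, nodes, edges, math.ceil(math.log2(leaves)))
```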
7Binary Search Tree (BST)
- A special kind of binary tree in which
- 1. Each vertex contains a distinct key value
- 2. The key values in the tree can be compared
using greater than and less than
- 3. The key value of each vertex in the tree is
greater than every key value in its left subtree,
and less than every key value in its right
subtree
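The ordering condition applies to whole subtrees, not just to a vertex and its two children. One way to test it, sketched here with a hypothetical `Node` class and an `is_bst` helper that are assumptions for illustration, is to pass down the open interval in which every key of a subtree must lie.

```python
class Node:
    # Hypothetical node type for illustration; not defined in the slides.
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def is_bst(t, low=None, high=None):
    """True if every key in t lies strictly between low and high,
    and the same holds recursively with the bounds narrowed at each vertex."""
    if t is None:
        return True
    if low is not None and t.key <= low:
        return False
    if high is not None and t.key >= high:
        return False
    # Left subtree keys must be smaller than t.key, right subtree keys larger.
    return is_bst(t.left, low, t.key) and is_bst(t.right, t.key, high)


good = Node("J", Node("E", Node("A"), Node("H")), Node("T", Node("M"), Node("Y")))
# "K" is larger than its parent "E" (locally fine) but sits in J's left subtree.
bad = Node("J", Node("E", Node("A"), Node("K")), Node("T"))
print(is_bst(good), is_bst(bad))
```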
8Example Binary Search Tree
[Figure: binary search tree with nodes Lou, Hal, Max, Ken, Sue, Ed, Joe, Ted]
9Shape of a BST
- Depends on its key values and their order of
insertion
- Insert the elements J E F T A
in that order.
- The first value to be inserted is put into the
root.
10Inserting E into the BST
- Thereafter, each value to be inserted begins by
comparing itself to the value in the root, moving
left if it is less, or moving right if it is
greater. This continues at each level until it
can be inserted as a new leaf.
11Inserting F into the BST
- Begin by comparing F to the value in the root,
moving left if it is less, or moving right if it is
greater. This continues until it can be inserted
as a leaf.
12Inserting T into the BST
- Begin by comparing T to the value in the root,
moving left if it is less, or moving right if it is
greater. This continues until it can be inserted
as a leaf.
13Inserting A into the BST
- Begin by comparing A to the value in the root,
moving left if it is less, or moving right if it is
greater. This continues until it can be inserted
as a leaf.
14Order of insertion
- What BST is obtained by inserting the elements
A E F J T in that order?
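The effect of insertion order can be tried out directly. Below is a minimal sketch of the insertion procedure described in the preceding slides; the `Node` class and the helper names `insert`, `build`, `height`, and `preorder` are assumptions for illustration.

```python
class Node:
    # Hypothetical node type for illustration; not defined in the slides.
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(t, key):
    """Insert key as a new leaf: go left if smaller, right if greater."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    return t  # equal keys are ignored, since keys are assumed distinct


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def height(t):
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def preorder(t):
    return [] if t is None else [t.key] + preorder(t.left) + preorder(t.right)


bushy = build("JEFTA")  # J at the root, E and T below it
chain = build("AEFJT")  # sorted input: every new key goes to the right
print(preorder(bushy), height(bushy))
print(preorder(chain), height(chain))
```

With the sorted order A E F J T every vertex has only a right child, so the tree degenerates into a list of height 4, whereas J E F T A gives a tree of height 2.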
15Another binary search tree
[Figure: binary search tree with nodes T, E, A, H, M, P, K]
Add nodes containing these values in this
order: D B L Q S V Z
16Task: is F in the tree?
[Figure: binary search tree with nodes J, T, E, A, V, M, H, P]
17Search(x)
- start at the root of the tree, which contains y
- the tree is empty → x is not present
- x = y (the item at the root) → the root is
returned
- x < y → recursively search the left subtree
- x > y → recursively search the right subtree
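The four cases of Search(x) translate almost line for line into a recursive function. This is a sketch under assumptions: the `Node` class is hypothetical, and the example tree is just one BST that can be formed from the keys J, T, E, A, V, M, H, P; it is not necessarily the tree drawn on the slide.

```python
class Node:
    # Hypothetical node type for illustration; not defined in the slides.
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def search(t, x):
    """Return the vertex holding x, or None if x is not present."""
    if t is None:          # the tree is empty: x is not present
        return None
    if x == t.key:         # x equals the item at the root: return the root
        return t
    if x < t.key:          # x is smaller: search the left subtree
        return search(t.left, x)
    return search(t.right, x)  # x is larger: search the right subtree


# One possible BST over the keys J, T, E, A, V, M, H, P (an assumed shape).
tree = Node("J",
            Node("E", Node("A"), Node("H")),
            Node("T", Node("M", None, Node("P")), Node("V")))

print(search(tree, "F") is not None)  # F: compares with J, E, H, then runs out
print(search(tree, "P") is not None)
```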
18Operations
- Search(x)
- Insert(x)
- Delete(x)
19Complexity
- Search(x): O(H), where H is the height of the tree
- Insert(x): O(H)
- Delete(x): O(H)
- worst case: O(n), when the tree is a list
- best case: O(log n), when the tree is full and balanced
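The gap between the two cases shows up even for 15 keys. The sketch below counts how many vertices a search visits; the `Node` class, `build`, and `nodes_visited` are hypothetical names, and the "balanced" insertion order is simply one order that happens to fill every level.

```python
class Node:
    # Hypothetical node type for illustration; not defined in the slides.
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def build(keys):
    """Insert keys one by one (iteratively) and return the root."""
    root = None
    for k in keys:
        if root is None:
            root = Node(k)
            continue
        cur = root
        while True:
            if k < cur.key:
                if cur.left is None:
                    cur.left = Node(k)
                    break
                cur = cur.left
            else:
                if cur.right is None:
                    cur.right = Node(k)
                    break
                cur = cur.right
    return root


def nodes_visited(root, x):
    """Number of vertices examined while searching for x."""
    count, cur = 0, root
    while cur is not None:
        count += 1
        if x == cur.key:
            break
        cur = cur.left if x < cur.key else cur.right
    return count


as_list = build(range(1, 16))  # sorted insertion: tree is a list
balanced = build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15])

print(nodes_visited(as_list, 15))   # visits all 15 vertices
print(nodes_visited(balanced, 15))  # visits 4 vertices (8, 12, 14, 15)
```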
20Traversal Algorithms
- A traversal algorithm is a procedure for
systematically visiting every vertex of an
ordered binary tree
- Tree traversals are defined recursively
- Three traversals are named:
- preorder
- inorder
- postorder
21PREORDER Traversal Algorithm
- Let T be an ordered binary tree with root r
- If T has only r, then r is the preorder traversal
- Otherwise, suppose T1, T2 are the left and right
subtrees at r; the preorder traversal is
- visit r
- traverse T1 in preorder
- traverse T2 in preorder
22Preorder Traversal
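[Figure: binary tree rooted at J, with left subtree E (children A, H) and right subtree T (children M, Y); annotations: visit ROOT first, visit left subtree second, visit right subtree last]
Result: J E A H T M Y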
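A preorder traversal can be sketched in a few lines. The tuple encoding `(key, left, right)` is an assumption for illustration, and the tree shape (J at the root, E over A and H on the left, T over M and Y on the right) is the one implied by the traversal results on these slides.

```python
# Each vertex is (key, left_subtree, right_subtree); None is an empty subtree.
TREE = ("J",
        ("E", ("A", None, None), ("H", None, None)),
        ("T", ("M", None, None), ("Y", None, None)))


def preorder(t):
    """Visit the root, then the left subtree, then the right subtree."""
    if t is None:
        return []
    key, left, right = t
    return [key] + preorder(left) + preorder(right)


print(" ".join(preorder(TREE)))
```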
23INORDER Traversal Algorithm
- Let T be an ordered binary tree with root r
- If T has only r, then r is the inorder traversal
- Otherwise, suppose T1, T2 are the left and right
subtrees at r; then
- traverse T1 in inorder
- visit r
- traverse T2 in inorder
24Inorder Traversal
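[Figure: binary tree rooted at J, with left subtree E (children A, H) and right subtree T (children M, Y); annotations: visit left subtree first, visit ROOT second, visit right subtree last]
Result: A E H J M T Y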
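The inorder result comes out in alphabetical order, which is what one expects when the tree is a binary search tree. A short sketch using a generator, again with an assumed `(key, left, right)` tuple encoding:

```python
# Each vertex is (key, left_subtree, right_subtree); None is an empty subtree.
TREE = ("J",
        ("E", ("A", None, None), ("H", None, None)),
        ("T", ("M", None, None), ("Y", None, None)))


def inorder(t):
    """Yield the left subtree's keys, then the root, then the right subtree's keys."""
    if t is not None:
        key, left, right = t
        yield from inorder(left)
        yield key
        yield from inorder(right)


keys = list(inorder(TREE))
print(" ".join(keys))
print(keys == sorted(keys))  # inorder traversal of a BST yields sorted keys
```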
25POSTORDER Traversal Algorithm
- Let T be an ordered binary tree with root r
- If T has only r, then r is the postorder
traversal
- Otherwise, suppose T1, T2 are the left and right
subtrees at r; then
- traverse T1 in postorder
- traverse T2 in postorder
- visit r
26Postorder Traversal
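[Figure: binary tree rooted at J, with left subtree E (children A, H) and right subtree T (children M, Y); annotations: visit left subtree first, visit right subtree second, visit ROOT last]
Result: A H E M Y T J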
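In postorder every vertex is reported only after both of its subtrees, so the root always comes last. A minimal sketch, assuming the same hypothetical `(key, left, right)` tuple encoding of the example tree:

```python
# Each vertex is (key, left_subtree, right_subtree); None is an empty subtree.
TREE = ("J",
        ("E", ("A", None, None), ("H", None, None)),
        ("T", ("M", None, None), ("Y", None, None)))


def postorder(t):
    """Traverse the left subtree, then the right subtree, then visit the root."""
    if t is None:
        return []
    key, left, right = t
    return postorder(left) + postorder(right) + [key]


print(" ".join(postorder(TREE)))
```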
27A Binary Expression Tree
[Figure: expression tree with ROOT "-" and children 8 and 5]
INORDER TRAVERSAL: 8 - 5 (has value 3)
PREORDER TRAVERSAL: - 8 5
POSTORDER TRAVERSAL: 8 5 -
28Binary Expression Tree
- A special kind of binary tree in which
- 1. Each leaf node contains a single operand
- 2. Each nonleaf node contains a single binary
operator
- 3. The left and right subtrees of an operator
node represent subexpressions that must be
evaluated before applying the operator at the
root of the subtree
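Rule 3 is effectively a recursive evaluation procedure: evaluate both subtrees, then apply the operator at the root. The sketch below assumes a hypothetical encoding in which a leaf is a plain number and an operator node is a tuple `(operator, left, right)`; the second example expression, (4 + 2) * 3, is chosen for illustration.

```python
import operator

# Binary operators supported by this sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(node):
    """Evaluate an expression tree bottom-up."""
    if not isinstance(node, tuple):
        return node  # leaf: a single operand
    op, left, right = node
    # Both subexpressions are evaluated before the operator is applied.
    return OPS[op](evaluate(left), evaluate(right))


print(evaluate(("-", 8, 5)))
print(evaluate(("*", ("+", 4, 2), 3)))
```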
29A Binary Expression Tree
What value does it have? ( 4 + 2 ) * 3 = 18
30A Binary Expression Tree
What infix, prefix, postfix expressions does it
represent?
31A Binary Expression Tree
Infix: ( ( 4 + 2 ) * 3 )
Prefix: * + 4 2 3 (evaluate from right)
Postfix: 4 2 + 3 * (evaluate from left)
32Levels Indicate Precedence
- When a binary expression tree is used to
represent an expression, the levels of the nodes
in the tree indicate their relative precedence of
evaluation
- Operations at higher levels of the tree are
evaluated later than those below them; the
operation at the root is always the last
operation performed
33Example
[Figure: binary expression tree including the nodes -, 8, 5]
What infix, prefix, postfix expressions does it
represent?
34A binary expression tree
Infix: ( ( 8 - 5 ) * ( ( 4 + 2 ) / 3 ) )
Prefix: * - 8 5 / + 4 2 3
Postfix: 8 5 - 4 2 + 3 / * (has operators in order used)
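The three notations are just the three traversals applied to the expression tree, and the postfix form can be evaluated from left to right with a stack. The following sketch takes the example expression to be ((8 - 5) * ((4 + 2) / 3)); the tuple encoding and all function names are assumptions for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# Operator nodes are (operator, left, right); leaves are plain numbers.
TREE = ("*", ("-", 8, 5), ("/", ("+", 4, 2), 3))


def prefix(n):
    """Preorder traversal: operator before its operands."""
    if not isinstance(n, tuple):
        return [str(n)]
    op, left, right = n
    return [op] + prefix(left) + prefix(right)


def postfix(n):
    """Postorder traversal: operator after its operands."""
    if not isinstance(n, tuple):
        return [str(n)]
    op, left, right = n
    return postfix(left) + postfix(right) + [op]


def infix(n):
    """Inorder traversal, fully parenthesized."""
    if not isinstance(n, tuple):
        return str(n)
    op, left, right = n
    return f"( {infix(left)} {op} {infix(right)} )"


def eval_postfix(tokens):
    """Scan left to right; an operator consumes the top two stack entries."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()


print(infix(TREE))
print(" ".join(prefix(TREE)))
print(" ".join(postfix(TREE)))
print(eval_postfix(postfix(TREE)))
```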
35Hash tables
- Goal: access data with complexity O(1), even
when the data needs to be dynamically
administered (mostly by inserting new or deleting
existing data)
- Solution: array
- Problem: access is only O(1) if the index of the
element is known; otherwise O(log n) (for example,
by binary search in a sorted array)
- Solution: each data item to be stored is
associated with a hash value which gives the array
index
- generate an array of size m, where m is prime and
sufficiently large
- assign a unique numerical value N(k) to the key of
element k
- h(k) = N(k) mod m
36Hash tables - cont
- Problem: the hash value is not unique anymore!
- Solution: a collision procedure must determine
the position for the new object
37Operations
- Search(x)
- Insert(x)
- Delete(x)
- Complexity
- worst case O(n) when the hash value is always the
same (and the table is really a list)
- best case O(1) when the hash value is always distinct
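The slides do not say which collision procedure is meant. One common choice is separate chaining, where each array slot holds a small list of the entries that hash to it. The sketch below is an illustration under assumptions: the class name, the method names, and in particular the choice of N(k) (reading the key's bytes as one large integer) are hypothetical and not taken from the slides.

```python
class ChainedHashTable:
    """Hash table with separate chaining; a sketch, not a tuned implementation."""

    def __init__(self, m=10007):
        self.m = m  # table size, ideally a prime
        self.buckets = [[] for _ in range(m)]

    def _code(self, key):
        # Assumed N(k): interpret the key's UTF-8 bytes as a base-256 number.
        return int.from_bytes(key.encode("utf-8"), "big")

    def _index(self, key):
        # h(k) = N(k) mod m
        return self._code(key) % self.m

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # collision or empty slot: append

    def search(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        idx = self._index(key)
        before = len(self.buckets[idx])
        self.buckets[idx] = [(k, v) for k, v in self.buckets[idx] if k != key]
        return len(self.buckets[idx]) < before


table = ChainedHashTable(m=7)
for name in ["washington", "lincoln", "bush"]:
    table.insert(name, len(name))
print(table.search("lincoln"), table.search("adams"))
```

With m = 1 every key lands in the same bucket, and each operation degrades to a linear scan of one list, which is the O(n) worst case named above.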
38Example keys are names
- list of unique identifiers
- washington → 103288042987600
- lincoln → 5201793578
- bush → 151444
- the range currently is all positive integers,
which is not possible to implement
- for m = 10007
- washington → 3249
- lincoln → 4873
- bush → 1339
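The reduced values follow from taking each identifier modulo m = 10007. A quick check in Python, using the identifiers exactly as listed (how they were derived from the names is not stated, so they are simply treated as given; the names `CODES` and `h` are assumptions for illustration):

```python
M = 10007  # table size from the slide (a prime)

# Numeric identifiers as listed on the slide.
CODES = {
    "washington": 103288042987600,
    "lincoln": 5201793578,
    "bush": 151444,
}


def h(code, m=M):
    """Hash value: the identifier reduced modulo the table size."""
    return code % m


for name, code in CODES.items():
    print(name, h(code))
```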
39why modulus prime number?
- underlying reason: the new code numbers should
appear random
- technical reason: arithmetic modulo m should form
a field, which is the case exactly when m is prime
- example:
- original codes are all even numbers
- m is even
- only even buckets in the table will get filled,
half the table will be empty
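The example can be reproduced numerically. In this sketch the particular codes (the even numbers below 2000) and the two table sizes, 100 and 101, are arbitrary choices made for illustration.

```python
def buckets_used(codes, m):
    """Set of table indices that receive at least one code."""
    return {c % m for c in codes}


even_codes = range(0, 2000, 2)  # 1000 codes, all even

used_even_m = buckets_used(even_codes, 100)   # m is even
used_prime_m = buckets_used(even_codes, 101)  # m is prime

print(len(used_even_m), "of 100 buckets used")
print(len(used_prime_m), "of 101 buckets used")
```

With m = 100 only the 50 even-numbered buckets are ever hit, while with the prime m = 101 every bucket is used.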
40Back to sorting...
- Two recursive algorithms
- MergeSort
- QuickSort
41MergeSort
- divide the array into two (nearly) equal parts
- recursively sort left part
- recursively sort right part
- merge the two sorted lists
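The four steps map directly onto a short recursive function. This is a minimal sketch that returns a new list rather than sorting in place; the function names are assumptions for illustration.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])   # at most one of these two tails is non-empty
    out.extend(right[j:])
    return out


def merge_sort(items):
    if len(items) <= 1:
        return list(items)            # nothing to sort
    mid = len(items) // 2             # divide into two (nearly) equal parts
    left = merge_sort(items[:mid])    # recursively sort the left part
    right = merge_sort(items[mid:])   # recursively sort the right part
    return merge(left, right)         # merge the two sorted lists


print(merge_sort([5, 2, 9, 1, 5, 6]))
```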
42QuickSort
- choose a pivot arbitrarily
- divide into a left part (less than the pivot) and a right
part (greater than the pivot)
- sort each part recursively
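A compact way to sketch these steps is with list comprehensions. Two details here go beyond the slide and are assumptions: the pivot is picked at random, and elements equal to the pivot are collected in a separate middle group so that duplicates are handled.

```python
import random


def quick_sort(items):
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)                # choose a pivot arbitrarily
    left = [x for x in items if x < pivot]      # less than the pivot
    equal = [x for x in items if x == pivot]    # the pivot and its duplicates
    right = [x for x in items if x > pivot]     # greater than the pivot
    # Sort each part recursively and concatenate.
    return quick_sort(left) + equal + quick_sort(right)


print(quick_sort([5, 2, 9, 1, 5, 6]))
```

This version builds new lists at every level; textbook quicksort usually partitions the array in place, which this sketch does not attempt.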
43Complexity
- MergeSort
- worst case O(n log n)
- QuickSort
- worst case O(n^2)
- average case O(n log n)
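The quadratic worst case of QuickSort is easy to provoke: if the pivot is always the first element and the input is already sorted, one part is always empty. The sketch below counts element comparisons under that assumed pivot rule and compares them with MergeSort on the same input; the function names and the counting convention (one pivot comparison per remaining element) are assumptions for illustration.

```python
import math


def quick_sort_count(items):
    """Quicksort with the first element as pivot; returns (sorted, comparisons)."""
    if len(items) <= 1:
        return list(items), 0
    pivot, rest = items[0], items[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    left_sorted, lc = quick_sort_count(left)
    right_sorted, rc = quick_sort_count(right)
    # Each remaining element is compared with the pivot once.
    return left_sorted + [pivot] + right_sorted, len(rest) + lc + rc


def merge_sort_count(items):
    """Merge sort; returns (sorted, comparisons made while merging)."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, lc = merge_sort_count(items[:mid])
    right, rc = merge_sort_count(items[mid:])
    out, i, j, comps = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comps += 1
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out += left[i:] + right[j:]
    return out, lc + rc + comps


n = 50
already_sorted = list(range(n))
_, quick_comps = quick_sort_count(already_sorted)
_, merge_comps = merge_sort_count(already_sorted)
print(quick_comps, n * (n - 1) // 2)                # quadratic growth
print(merge_comps, n * math.ceil(math.log2(n)))     # stays below n log n
```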