CS 2 - PowerPoint PPT Presentation

Provided by: facul56
1
CS 2
  • Introduction to Object Oriented Programming
  • Chapter 20: Advanced Data Structures

2
CHAPTER GOALS
  • To learn about Binary Search Trees (BSTs) in
    Java
  • To be able to search, insert and delete data in a
    BST
  • To understand the implementation of hash tables
  • To be able to program hash functions

3
Motivation
  • Linked lists will store a collection of Objects,
    but access to those Objects is slow - O(N).
  • Not helped by ordering the data, because you
    don't really have random access
  • BSTs offer the opportunity for finding data in
    O(log N)
  • Hash Tables offer the opportunity for finding
    data in O(1).
  • Both, of course, have caveats about achieving the
    expected performance

4
Binary Trees
Probably the most famous type of tree in Computer
Science is the Binary Tree. A binary tree is a
tree structure in which each node has at MOST two
immediate child nodes. The ancestral tree (from
CS1321) was a binary tree.
5
By itself, a binary tree has no inherent order.
There is no ordered relationship between the data
in the parent node and the data in the child
nodes.
6
Binary Search Trees
What if we imposed a relationship between the
data? This is the idea behind a Binary Search
Tree. When we consider a node of a binary
search tree, we state the following:
  • All data contained in nodes in the left subtree
    of the current node will be less than the data in
    the current node.
  • All data contained in nodes in the right subtree
    of the current node will be greater than the data
    in the current node.
  • These properties will hold for all nodes in the
    tree.

7
Visually
(Figure: a BST with its LEFT SUBTREE and RIGHT SUBTREE labeled.)
NOTE: This is just an example. BSTs are not
limited to holding numbers.
8
So what does this buy us?
Well, let's look at searching for values in a BT
versus a BST. Let's start with searching in an
ordinary Binary Tree of numbers. We're looking
for the number 7 in the following tree.
10
Let's say that the basic algorithm is the
following:
1) If we found the number, return true
2) If we run out of tree, return false
3) Otherwise, search the left subtree for the
   value
4) If we don't find it there, try the right
   subtree
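The four steps above can be sketched in Java as follows. This is only an illustration: `BTNode`, `BinaryTreeSearch` and the sample values are hypothetical names and data, not taken from the slides. The "ran out of tree" check is done first so that the code never dereferences a null node.

```java
// Hypothetical node type for a plain (unordered) binary tree of ints.
class BTNode {
    int data;
    BTNode left, right;

    BTNode(int data, BTNode left, BTNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

class BinaryTreeSearch {
    // Returns true if target occurs anywhere in the tree rooted at node.
    static boolean contains(BTNode node, int target) {
        if (node == null) return false;        // ran out of tree
        if (node.data == target) return true;  // found the number
        // No ordering to exploit: try the left subtree, then the right.
        return contains(node.left, target) || contains(node.right, target);
    }

    public static void main(String[] args) {
        // Values chosen only for illustration; this is not the slides' tree.
        BTNode tree = new BTNode(30,
                new BTNode(60, new BTNode(12, null, null), null),
                new BTNode(4, null, new BTNode(7, null, null)));
        System.out.println(contains(tree, 7));   // true
        System.out.println(contains(tree, 99));  // false
    }
}
```

In the worst case every node is visited, which is why searching an unordered binary tree is O(N).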
11
So we start off
Nope, 30 doesn't equal 7.
12
Nope.
13
Not even
14
Ran out of stuff to try here, let's try the
right-hand side.
15
Nope. And if we try to recurse left or right, we
find out that we've run out of places to check
in this sub-tree.
16
So we come back up and try the right subtree of
60. With no luck.
17
So we go back to the other subtree of 30, where
we started out
18
This process keeps repeating until we either
find the value or run out of places to try
completely
20
Finally, we find what we were looking for so we
stop and return true.
21
Ok, now a BST
This time we're going to look for the value 100,
starting from the 35 node.
22
As before, we check to see if our current node is
the value we're looking for (100). Again, we
fail. But something's different this time.
23
We know we have a BST. And we know something
about BSTs. All data that is smaller than the
current data is stored in the left subtree. All
the data that's larger than the current data is
in the right subtree.
24
So, do we have to go and explore the left
subtree if we're looking for 100 and we've hit 35?
25
Nope, we just search the right subtree.
26
Again, do we have to search to the left?
27
No, we just go right. And we've found it, so we
stop and return true.
28
So our algorithm seems to be:
  • If we are searching an empty (null) tree at this
    point, return false
  • If the current data is the target, return true
  • Otherwise, if what we're looking for is less than
    the current node's data, recursively search the
    current node's left subtree.
  • Otherwise, recursively search the right subtree

29
BST Class Structure
Tree
  • root
  • Methods: find, insert, remove

Node
  • data
  • left
  • right
  • Methods: findNode, insertNode, remove
30
Code Overview
public class Tree {
    private Node root;
    public Tree() { root = null; }
    public void insert(Comparable obj) { <code> }
    public boolean find(Comparable obj) { <code> }
    public void remove(Comparable obj) { <code> }
    private class Node { <code> }
}
31
Inner Node Class
public class Tree {
    private class Node {
        public Comparable data;
        public Node left;
        public Node right;
        public Node remove(Comparable obj) { <code> }
        private Comparable removeLargest() { <code> }
        public Node findNode(Comparable obj) { <code> }
        public void insertNode(Node newNode) { <code> }
    }
}

32
Find Method
public class Tree {
    public boolean find(Comparable obj) {
        if (root == null) return false;
        else return (root.findNode(obj) != null);
    }
    private class Node {
        public Node findNode(Comparable obj) {
            int ans = obj.compareTo(data);
            if (ans == 0) return this;
            else if (ans < 0) {
                if (left == null) return null;
                else return left.findNode(obj);
            } else {
                if (right == null) return null;
                else return right.findNode(obj);
            }
        }
    }
}
33
What if we wanted to do other things like insert
into and delete from a BST?
You'll find that as long as we're dealing with
BSTs, the same general set of rules will hold.
34
Inserting into a BST
Inserting an item into a BST is very much like
the problem of inserting an item into a sorted
list. We want to maintain the BST property of
our BST when we're finished! So let's ask a
couple of questions.
35
Should we insert an item at the root of a BST?
18
36
Should we insert an item at the root of a BST?
18
We run into problems if we just arbitrarily
insert items at the root of our BST. We'd have
to guarantee that we still have a BST when we're
done. As is shown here, we'd have to really
re-arrange our tree to make this back into
a BST. And that's too much trouble.
37
What about inserting somewhere in the middle?
38
What about inserting somewhere in the middle?
This is even more complex than the first
case. How would we re-arrange our tree?
39
Inserting at leaf node
This is the only easy way to insert items into a
BST. Instead of re-arranging our tree because
of a random insertion, we merely create new leaf
nodes off our branches. (Replace a null with a
new node!)
40
The algorithm
  • If our tree is empty (null), we should insert the
    value here by creating a new node in the tree
  • Otherwise, we need to make sure that we find the
    correct place to insert a new leaf node and
    maintain the property of the BST.
  • Is the new value less than the current node's
    data? If so, look for a null to the left of the
    current node
  • Otherwise (new value is greater), look for a
    null located to the right
  • Repeat that until we find a null location
    to replace

41
Visually
Value to add
Tree to insert into
38
42
Visually
Value to add
Tree to insert into
38
Have we found a null location to park our
new value?
43
Visually
Value to add
Tree to insert into
38
Nope. 38 > 35, so we should look for our null
location to the right of 35.
44
Visually
Value to add
Tree to insert into
38
76 isn't null either. And it's larger than 38.
So we need to go left.
45
Visually
Value to add
Tree to insert into
38
We've hit the same situation: go left again.
46
Visually
Value to add
Tree to insert into
38
There's nothing after 40, which means we've found
the right place to insert our item.
47
Visually
Value to add
Tree to insert into
38
38
48
Insert Method
public class Tree {
    public void insert(Comparable obj) {
        Node newNode = new Node();
        newNode.data = obj;
        newNode.left = null;
        newNode.right = null;
        if (root == null) root = newNode;
        else root.insertNode(newNode);
    }
    private class Node {
        public void insertNode(Node newNode) {
            if (newNode.data.compareTo(data) < 0) {
                if (left == null) left = newNode;
                else left.insertNode(newNode);
            } else { // new data is larger (or equal), go right
                if (right == null) right = newNode;
                else right.insertNode(newNode);
            }
        }
    }
}
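To try the leaf-insertion idea end to end, here is a compact, self-contained sketch. It is a variant, not the slides' class: `SimpleBst` is a hypothetical name, it uses generics instead of raw `Comparable`, and it walks the tree with a loop rather than recursion. The `inOrder` helper is an addition that lists the values in ascending order, which is a quick way to check that the BST property still holds after the inserts.

```java
import java.util.ArrayList;
import java.util.List;

// A generic variant of the Tree class, limited to insert and find.
class SimpleBst<T extends Comparable<T>> {
    private class Node {
        T data;
        Node left, right;
        Node(T data) { this.data = data; }
    }

    private Node root;

    // Walk down from the root and attach the new node at the first null link.
    public void insert(T value) {
        Node newNode = new Node(value);
        if (root == null) { root = newNode; return; }
        Node current = root;
        while (true) {
            if (value.compareTo(current.data) < 0) {
                if (current.left == null) { current.left = newNode; return; }
                current = current.left;
            } else {
                if (current.right == null) { current.right = newNode; return; }
                current = current.right;
            }
        }
    }

    // At each node, only one subtree can contain the value.
    public boolean find(T value) {
        Node current = root;
        while (current != null) {
            int cmp = value.compareTo(current.data);
            if (cmp == 0) return true;
            current = (cmp < 0) ? current.left : current.right;
        }
        return false;
    }

    // In-order traversal: left subtree, node, right subtree.
    public List<T> inOrder() {
        List<T> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private void collect(Node node, List<T> out) {
        if (node == null) return;
        collect(node.left, out);
        out.add(node.data);
        collect(node.right, out);
    }

    public static void main(String[] args) {
        SimpleBst<Integer> tree = new SimpleBst<>();
        // Illustrative values; 38 ends up as the left child of 40.
        for (int v : new int[] {35, 20, 76, 40, 100, 38}) tree.insert(v);
        System.out.println(tree.inOrder());  // [20, 35, 38, 40, 76, 100]
        System.out.println(tree.find(38));   // true
        System.out.println(tree.find(39));   // false
    }
}
```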
49
Deleting from a BST
Just as we strove to maintain the BST properties
of our tree when adding an item to a BST, we also
need to maintain the properties of our BST when
we delete items. This is actually a fairly
complex problem. It is more manageable if we
break the problem down into different cases.
First, let's assume that we're already at the
node that we want to delete from our tree.
50
The case of a leaf
(Figure: the leaf node 35, whose children are both null, is replaced by null.)
51
The case of just one child
(Figure: a tree with the values 76, 35, 40 and 100, shown before and after deleting a node that has just one child.)
52
The other case
This is perhaps the hardest case to solve: the
node has two children. We want to replace 35 with
the most viable candidate in our BST and still
maintain the BST property.
53
The other case
There are two viable candidates in this tree
54
The other case
The largest value on the left side of the tree,
or...
55
The other case
The smallest value on the right side of the
tree. Let's discuss why.
56
The largest and the smallest
No matter how you construct your tree, no matter
how many different values you put into it, the
value that is just larger than the root node's
value will be the smallest (leftmost) value in
the root's right subtree. Also, the value
closest to the root's value, but just less than
it, will be the largest (rightmost) value in the
left subtree of the root node. Using either of
those as a replacement for the root node will
maintain the BST ordering.
57
The algorithm.
The algorithm for finding the largest value of
the left subtree is to go left once and then go
right as often as possible.
58
The algorithm.
The algorithm for finding the smallest value of
the right subtree is to go right once and then go
left as often as possible.
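Both walks can be sketched as short loops. The names `CandNode`, `largestOnLeft` and `smallestOnRight` are hypothetical, and the hand-built tree is only an example. Each method assumes the subtree it steps into is not null.

```java
// Hypothetical minimal node type, used only for this sketch.
class CandNode {
    int data;
    CandNode left, right;
    CandNode(int data) { this.data = data; }
}

class ReplacementCandidates {
    // Go left once, then right as often as possible.
    // Assumes node.left is not null.
    static int largestOnLeft(CandNode node) {
        CandNode current = node.left;
        while (current.right != null) current = current.right;
        return current.data;
    }

    // Go right once, then left as often as possible.
    // Assumes node.right is not null.
    static int smallestOnRight(CandNode node) {
        CandNode current = node.right;
        while (current.left != null) current = current.left;
        return current.data;
    }

    public static void main(String[] args) {
        // Hand-built BST with illustrative values.
        CandNode root = new CandNode(35);
        root.left = new CandNode(20);
        root.left.left = new CandNode(1);
        root.left.right = new CandNode(25);
        root.right = new CandNode(76);
        root.right.left = new CandNode(40);
        root.right.left.right = new CandNode(50);
        root.right.right = new CandNode(100);
        System.out.println(largestOnLeft(root));    // 25
        System.out.println(smallestOnRight(root));  // 40
    }
}
```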
59
Continued
Once we find the correct value, we replace our
original value (35) with the new one (let's say
we opted for the smallest of the right subtree,
40). We copy the value 40 onto our root node.
60
Continued
Now we need to delete the original, but now
duplicate, value of 40 in the tree.
61
Continued
Deleting the original 40 in the tree.
Note that we have to move the 50 up to maintain
the BST!
62
Continued
The 50 has now moved up, and the original 40 has been replaced.
(Figure: the resulting tree, containing the values 40, 76, 20, 25, 1, 50, 100 and 10.)
63
Remove Method
public class Tree {
    public void remove(Comparable obj) {
        if (root != null) root = root.remove(obj);
    }
    public class Node {
        public Node remove(Comparable obj) {
            Node ans = this;
            int sw = obj.compareTo(data);
            if (sw == 0) { // match found
                if (left == null) ans = right; // right leg
                else if (right == null) ans = left; // left leg
                else { // both there - find left largest
                    if (left.right == null) {
                        data = left.data;
                        left = left.left;
                    } else data = left.removeLargest();
                }
            } else if (sw < 0) {
                if (left != null) left = left.remove(obj);
            } else {
                if (right != null) right = right.remove(obj);
            }
            return ans;
        }
    }
}
64
Remove Helper
public class Node {
    private Comparable removeLargest() {
        Comparable ans;
        if (right.right == null) {
            ans = right.data;
            right = right.left;
            return ans;
        } else return right.removeLargest();
    }
}
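The Remove Method code replaces a two-child node with the largest value of its left subtree, while the worked example earlier picks the smallest value of the right subtree. The sketch below follows the worked example instead, so it is a variant rather than the slides' code. `RemoveDemo` and its static helpers are hypothetical names, and the insertion order is an assumption chosen so that the tree holds the same values as the example.

```java
import java.util.ArrayList;
import java.util.List;

class RemoveDemo {
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    static Node insert(Node node, int value) {
        if (node == null) return new Node(value);
        if (value < node.data) node.left = insert(node.left, value);
        else node.right = insert(node.right, value);
        return node;
    }

    // Returns the root of the subtree after removing value from it.
    static Node remove(Node node, int value) {
        if (node == null) return null;             // value not present
        if (value < node.data) {
            node.left = remove(node.left, value);
        } else if (value > node.data) {
            node.right = remove(node.right, value);
        } else if (node.left == null) {
            return node.right;                     // leaf, or only a right child
        } else if (node.right == null) {
            return node.left;                      // only a left child
        } else {
            // Two children: copy up the smallest value of the right subtree,
            // then remove the now-duplicate value from that subtree.
            Node smallest = node.right;
            while (smallest.left != null) smallest = smallest.left;
            node.data = smallest.data;
            node.right = remove(node.right, smallest.data);
        }
        return node;
    }

    static void inOrder(Node node, List<Integer> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.data);
        inOrder(node.right, out);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int v : new int[] {35, 20, 76, 1, 25, 40, 100, 10, 50}) {
            root = insert(root, v);
        }
        root = remove(root, 35);
        List<Integer> values = new ArrayList<>();
        inOrder(root, values);
        System.out.println(root.data);             // 40
        System.out.println(root.right.left.data);  // 50 (moved up)
        System.out.println(values);  // [1, 10, 20, 25, 40, 50, 76, 100]
    }
}
```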
65
BST caveat
  • O(log N) performance is only achieved when the
    tree is
  • Full
  • Balanced
  • Special techniques are needed to keep the tree
    full and balanced when inserting and deleting
    data (beyond the scope of this class), such as
    red/black trees and AVL trees
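A small experiment makes the caveat concrete. In this sketch (hypothetical names `HeightDemo`, `build` and `height`, illustrative values), the same seven numbers produce a tree of height 7 when inserted in sorted order, which is effectively a linked list, and a tree of height 3 when inserted in a balanced order.

```java
class HeightDemo {
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Plain leaf insertion, with no rebalancing.
    static Node insert(Node root, int value) {
        Node newNode = new Node(value);
        if (root == null) return newNode;
        Node current = root;
        while (true) {
            if (value < current.data) {
                if (current.left == null) { current.left = newNode; return root; }
                current = current.left;
            } else {
                if (current.right == null) { current.right = newNode; return root; }
                current = current.right;
            }
        }
    }

    // Number of nodes on the longest root-to-leaf path.
    static int height(Node node) {
        if (node == null) return 0;
        return 1 + Math.max(height(node.left), height(node.right));
    }

    static Node build(int[] values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7};
        int[] mixed = {4, 2, 6, 1, 3, 5, 7};
        System.out.println(height(build(sorted)));  // 7: every node hangs to the right
        System.out.println(height(build(mixed)));   // 3: full and balanced
    }
}
```

Since a search follows one root-to-leaf path, its cost grows with the height, so the sorted insertion order degrades search to O(N).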

66
Questions?
67
Hash Tables
  • Hashing can be used to find elements in a data
    structure quickly without making a linear search
  • A hash function computes an integer value (called
    the hash code) from an object
  • That integer can then be used to directly access
    an array of data
  • To compute the hash code of object x:
    int h = x.hashCode();

68
Sample Strings and Their Hash Codes
69
Simplistic Implementation of a Hash Table
  • To implement
  • Generate hash codes for objects
  • Make an array
  • Insert each object at the location of its hash
    code
  • To test if an object is contained in the array
  • Compute its hash code
  • Check if the array position with that hash code
    is occupied

70
Simplistic Implementation of a Hash Table
71
Problems with Simplistic Implementation
  • It is not usually possible to allocate an array
    that is large enough to hold all possible integer
    index positions
  • It is possible for two different objects to have
    the same hash code. This is called a Collision.

72
Solutions
  • Pick a reasonable array size and reduce the hash
    codes to fall within the indices of the array:
        int h = x.hashCode();
        if (h < 0) h = -h;
        h = h % size; // the hash code mod the array size
  • When elements collide (have the same hash code),
    use a linked list to store multiple objects in
    the same array position
  • These linked lists are called buckets
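One edge case in the negate-then-mod reduction is worth knowing about: negating `Integer.MIN_VALUE` overflows and yields `Integer.MIN_VALUE` again, so the resulting index is still negative. The sketch below compares that approach with `Math.floorMod`, a standard library method (Java 8 and later) that returns a value in the range 0 to size - 1 for any positive size. `BucketIndex` and the table size of 101 are assumptions made for illustration.

```java
class BucketIndex {
    // The approach from the slide: negate negative hash codes, then take the remainder.
    static int indexByNegation(int hashCode, int size) {
        int h = hashCode;
        if (h < 0) h = -h;
        return h % size;
    }

    // Alternative: floorMod yields a value in [0, size) for any size > 0.
    static int indexByFloorMod(int hashCode, int size) {
        return Math.floorMod(hashCode, size);
    }

    public static void main(String[] args) {
        int size = 101;  // illustrative table size
        System.out.println(indexByNegation("eat".hashCode(), size));
        System.out.println(indexByFloorMod("eat".hashCode(), size));
        // -Integer.MIN_VALUE overflows back to Integer.MIN_VALUE.
        System.out.println(indexByNegation(Integer.MIN_VALUE, size) < 0);   // true
        System.out.println(indexByFloorMod(Integer.MIN_VALUE, size) >= 0);  // true
    }
}
```

The two methods can map a negative hash code to different indices, which is fine as long as one table uses the same method everywhere.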

73
Hash Table with Linked Lists
74
Algorithm for Finding an Object x in a Hash Table
  • Get the index h into the hash table
  • Compute the hash code.
  • Reduce it modulo the table size
  • Iterate through the elements of the bucket at
    position h in the array
  • For each element of the bucket, check whether it
    is equal to x
  • If a match is found among the elements of that
    bucket, then x is in the set
  • Otherwise, x is not in the set
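The lookup steps above might look like this in code. This is a minimal sketch under stated assumptions, not the ch20/hashtable code: `BucketSet` and `indexFor` are hypothetical names, the table never resizes, and the index is reduced with `Math.floorMod`.

```java
// A minimal hash set with linked-list buckets.
class BucketSet {
    private static class Link {
        Object data;
        Link next;
        Link(Object data, Link next) { this.data = data; this.next = next; }
    }

    private final Link[] buckets;

    BucketSet(int size) { buckets = new Link[size]; }

    // Compute the hash code and reduce it modulo the table size.
    private int indexFor(Object x) {
        return Math.floorMod(x.hashCode(), buckets.length);
    }

    // Iterate through the bucket at position h, comparing each element to x.
    public boolean contains(Object x) {
        int h = indexFor(x);
        for (Link current = buckets[h]; current != null; current = current.next) {
            if (current.data.equals(x)) return true;
        }
        return false;
    }

    // Adds x at the front of its bucket; returns false if it was already present.
    public boolean add(Object x) {
        if (contains(x)) return false;
        int h = indexFor(x);
        buckets[h] = new Link(x, buckets[h]);
        return true;
    }

    public static void main(String[] args) {
        BucketSet set = new BucketSet(7);  // small prime size, for illustration
        set.add("eat");
        set.add("tea");
        System.out.println(set.contains("tea"));  // true
        System.out.println(set.contains("ate"));  // false
        System.out.println(set.add("eat"));       // false, already in the set
    }
}
```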

75
Hash Tables
  • A hash table can be implemented as an array of
    buckets
  • Buckets are sequences of links that hold elements
    with the same hash code
  • The table size should be a prime number larger
    than the expected number of elements
  • If there are few collisions, then adding,
    locating, and removing hash table elements takes
    constant time
  • Big-O notation: O(1)

76
HashSet Class Structure
HashSet
  • buckets
  • Methods: contains, add, remove, iterator

HashSetIterator
  • Methods: hasNext, next, remove

Link
  • data
  • next

77
Check the Code (ch20/hashtable)
78
Hash Function
  • A hash function computes an integer hash code
    from an object
  • Choose a hash function so that different objects
    are likely to have different hash codes.
  • Bad choice for hash function for a string:
    adding the Unicode values of the characters in
    the string, because permutations ("eat" and
    "tea") would have the same hash code
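The collision is easy to reproduce. In this sketch, `sumOfChars` is a hypothetical helper that implements the additive hash described above; the standard `String.hashCode` is shown next to it for contrast.

```java
class AdditiveHash {
    // The "bad choice": add up the character values, ignoring their order.
    static int sumOfChars(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) h += s.charAt(i);
        return h;
    }

    public static void main(String[] args) {
        System.out.println(sumOfChars("eat"));  // 314
        System.out.println(sumOfChars("tea"));  // 314, same as "eat"
        // The library hash depends on character positions, so these differ.
        System.out.println("eat".hashCode() == "tea".hashCode());  // false
    }
}
```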

79
Computing Hash Codes
  • Hash function for a string s from the standard
    library:
        final int HASH_MULTIPLIER = 31;
        int h = 0;
        for (int i = 0; i < s.length(); i++)
            h = HASH_MULTIPLIER * h + s.charAt(i);
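As a quick check, the loop can be wrapped in a method and compared with `String.hashCode`, whose documented formula is the same multiply-by-31 scheme. `StringHash` and `hash` are hypothetical names used for this sketch.

```java
class StringHash {
    // Multiply the running hash by 31 before adding each character,
    // so the result depends on the position of every character.
    static int hash(String s) {
        final int HASH_MULTIPLIER = 31;
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = HASH_MULTIPLIER * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"eat", "tea", ""}) {
            System.out.println(hash(s) == s.hashCode());  // true each time
        }
    }
}
```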

80
A hashCode method for the Coin class
  • There are two instance fields: String name (the
    coin name) and double value (the coin value)
  • Use String's hashCode method to get a hash code
    for the name
  • To compute a hash code for a floating-point
    number
  • Wrap the number into a Double object
  • Then use Double's hashCode method
  • Combine the two hash codes using a prime number
    as the HASH_MULTIPLIER

81
A hashCode method for the Coin class
    class Coin {
        public int hashCode() {
            int h1 = name.hashCode();
            int h2 = new Double(value).hashCode();
            final int HASH_MULTIPLIER = 29;
            int h = HASH_MULTIPLIER * h1 + h2;
            return h;
        }
        . . .
    }

82
Creating Hash Codes for your classes
  • Use a prime number as the HASH_MULTIPLIER
  • Compute the hash codes of each instance field
  • For an integer instance field just use the field
    value
  • Combine the hash codes
        int h = HASH_MULTIPLIER * h1 + h2;
        h = HASH_MULTIPLIER * h + h3;
        h = HASH_MULTIPLIER * h + h4;
        . . .
        return h;

83
Creating Hash Codes for your classes
  • Your hashCode method must be compatible with the
    equals method
  • That is, make this property hold
  • x.equals(y) implies x.hashCode() == y.hashCode()
  • In general, define either both hashCode and
    equals methods or neither
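A sketch of a Coin class that defines both methods together is shown below. The constructor and the `equals` body are assumptions, since only `hashCode` appears in the slides. `Double.hashCode(value)` is the static form (Java 8 and later) of wrapping the number in a Double and calling `hashCode` on it.

```java
import java.util.HashSet;
import java.util.Set;

class Coin {
    private final String name;
    private final double value;

    Coin(String name, double value) {
        this.name = name;
        this.value = value;
    }

    // Two coins are equal when both fields match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Coin)) return false;
        Coin c = (Coin) other;
        return name.equals(c.name) && Double.compare(value, c.value) == 0;
    }

    // Built from the same two fields that equals compares.
    @Override
    public int hashCode() {
        final int HASH_MULTIPLIER = 29;
        int h1 = name.hashCode();
        int h2 = Double.hashCode(value);
        return HASH_MULTIPLIER * h1 + h2;
    }

    public static void main(String[] args) {
        Coin a = new Coin("dime", 0.10);
        Coin b = new Coin("dime", 0.10);
        Set<Coin> coins = new HashSet<>();
        coins.add(a);
        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true
        System.out.println(coins.contains(b));             // true
    }
}
```

If `equals` were overridden without `hashCode`, the last line would usually print false, because `b` would typically be looked up in a different bucket than the one holding `a`.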

84
Hashing caveats
  • Table size should be a bit bigger than the data
  • Collisions will cause a linear search of the
    local linked list
  • Computing the hash function for any single object
    is constant overhead cost

85
Summary: you should now know
  • About Binary Search Trees (BSTs) in Java
  • How to search, insert and delete data in a BST
  • The implementation of hash tables
  • How to program hash functions

86
Questions?