CS2851 Dr. Mark L. Hornick

Provided by: drmarkh

1
Hashing, HashMaps and HashSets
2
JCF HashMap is a Map
  • Implements the same methods found in TreeMap
  • put(), get()
  • remove()
  • entrySet(), keySet()
  • containsKey(), containsValue()
  • size(), equals(), clear()

3
The JCF HashMap
  • Like the JCF TreeMap, this class implements the
    Map interface
  • Implying a data structure based on key/value
    pairs
  • public class HashMap<K, V>
  • extends AbstractMap<K, V>
  • implements Map<K, V>
  • Example: HashMap<String, Double> students;
    // String holds the student ID; Double holds
    the GPA
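A minimal sketch of the students example above, using only standard Map methods. The class name, student IDs, and GPA values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class StudentGpaDemo {
    // Builds a small map from student ID to GPA (sample data, not from the slides).
    static Map<String, Double> sampleStudents() {
        Map<String, Double> students = new HashMap<>();
        students.put("S1001", 3.5);
        students.put("S1002", 2.9);
        students.put("S1001", 3.7); // same key again: the old value is replaced
        return students;
    }

    public static void main(String[] args) {
        Map<String, Double> students = sampleStudents();
        System.out.println(students.get("S1001"));         // value stored under that key
        System.out.println(students.containsKey("S9999")); // key was never added
        System.out.println(students.size());               // number of distinct keys
    }
}
```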

4
JCF HashMap does not sort either keys or values
  • Implements Map, not SortedMap
  • entrySet(), keySet() generate unsorted Sets
  • Iterating through these Sets results in keys or
    entries in no apparent order
  • So...
  • Why bother with a HashMap at all?
  • What's the point?
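To see the difference in iteration order, the same keys can be put into a TreeMap and a HashMap. This is only a sketch: the class, method names, and sample keys are illustrative, and the HashMap order shown on any given run is not something to rely on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class IterationOrderDemo {
    // Inserts the same sample entries into whichever Map implementation is passed in.
    static void fill(Map<String, Double> map) {
        map.put("delta", 3.1);
        map.put("alpha", 3.9);
        map.put("charlie", 2.5);
        map.put("bravo", 3.3);
    }

    // Copies the keys into a list in the order the map's keySet() iterates them.
    static List<String> keysInIterationOrder(Map<String, Double> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Double> sorted = new TreeMap<>();
        Map<String, Double> hashed = new HashMap<>();
        fill(sorted);
        fill(hashed);
        System.out.println(keysInIterationOrder(sorted)); // keys in ascending order
        System.out.println(keysInIterationOrder(hashed)); // same keys, order unspecified
    }
}
```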

5
Review performance of previously covered data
structures
  • ArrayList
  • get()
  • add()
  • contains()
  • LinkedList
  • get()
  • add()
  • contains()
  • TreeMap/TreeSet
  • get()
  • add() (put)
  • contains()

6
HashMap's advantage is overall performance
  • Constant time performance, on average, for the
    basic operations!
  • put()
  • get()
  • containsKey()
  • How???

7
Hash definition
  • A hash is a transformation of a key into a
    numeric value that maps to the index of an array
    (or table)
  • This is done in two steps
  • First, generate a numeric hashcode from the key
  • Second, transform the hashcode into an array
    index

Key -> hashcode -> index
8
How do you generate a hashcode?
  • In Java, most classes have a built-in hashCode()
    method
  • Classes that don't override hashCode() inherit
    the Object class's hashCode() method
  • Which typically returns a value tied to the
    object's identity (often described as its memory
    address), so it can differ from run to run
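The sketch below contrasts the inherited Object hashCode() with overridden versions. Plain and StudentId are hypothetical classes introduced only for this illustration.

```java
class HashCodeDemo {
    // Hypothetical class that does NOT override hashCode(): it inherits Object's version.
    static class Plain { }

    // Hypothetical value class that overrides equals() and hashCode() consistently,
    // so two objects with the same id produce the same hashcode.
    static final class StudentId {
        private final String id;

        StudentId(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof StudentId && ((StudentId) o).id.equals(id);
        }

        @Override
        public int hashCode() { return id.hashCode(); }
    }

    public static void main(String[] args) {
        // String and Integer compute their hashcodes from their contents.
        System.out.println("abc".hashCode());
        System.out.println(Integer.valueOf(123456789).hashCode());

        // Equal StudentId objects agree on their hashcode.
        System.out.println(new StudentId("S1001").hashCode() == new StudentId("S1001").hashCode());

        // The inherited hashcode is identity-based and may change between runs.
        System.out.println(new Plain().hashCode());
    }
}
```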

9
How do you transform a hashcode into an array
index?
  • First, consider the Integer class
  • Integer's hashCode() method simply returns the
    underlying int
  • The HashMap class has a package-visible hash()
    method
  • static int hash(Object x) {
  •     int h = x.hashCode();
  •     h += ~(h << 9);
  •     h ^= (h >>> 14);
  •     h += (h << 4);
  •     h ^= (h >>> 10);
  •     return h;
  • }
  • This method further scrambles the hashcode; for
    example
  • hash(123456789) // Returns 1272491941
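The scrambling step can be tried outside of HashMap with a standalone copy of the method. This is a sketch that reproduces the shifts listed on the slide, with the operators (+=, ^=, ~) restored so that the result matches the slide's stated value. The class name is an assumption, and more recent JDK releases use a different scrambling function.

```java
class SupplementalHashDemo {
    // Standalone copy of the hash() method from the slide.
    static int hash(Object x) {
        int h = x.hashCode();
        h += ~(h << 9);
        h ^= (h >>> 14);
        h += (h << 4);
        h ^= (h >>> 10);
        return h;
    }

    public static void main(String[] args) {
        // The int literal is boxed to an Integer, whose hashCode() is the int itself.
        System.out.println(hash(123456789)); // prints 1272491941
    }
}
```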

10
How do you transform a hashcode into an array
index?
  • An index in the range 0 to 1023 can be computed
    as follows
  • int index = hash(123456789) % 1024;
  • or
  • int index = hash(123456789) & 1023;
  • The resulting index = 933
  • The second operation is computationally faster

11
How does the & operator work?
  • The & operator performs a bitwise and on its
    operands.
  • For each pair of bits a and b, if a and b are
    both 1 bits, a & b = 1. Otherwise, a & b = 0.
  • For example,
  •   10100001101001
  • & 00000000001111
  • = 00000000001001
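The same example can be checked in Java using binary literals. The helper toBits is a hypothetical convenience for printing a fixed number of bits, not part of any library.

```java
class BitwiseAndDemo {
    // Formats the low-order bits of value as a zero-padded binary string of the given width.
    static String toBits(int value, int width) {
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(value));
        while (sb.length() < width) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int a = 0b10100001101001;
        int b = 0b00000000001111;
        System.out.println(toBits(a, 14));
        System.out.println(toBits(b, 14));
        System.out.println(toBits(a & b, 14)); // prints 00000000001001
    }
}
```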

12
1023 in binary form
  • 00000000000000000000001111111111
  • So (w & 1023)
  • returns the rightmost 10 bits of the operand w
  • In general, this works well as long as the table
    length is a power of 2
  • Why?
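One way to explore the question is to count how many distinct indices the mask can produce. When the length is a power of 2, length - 1 is all 1 bits and every index is reachable; otherwise some bits of the mask are 0 and many indices can never be produced. The class and method names below are illustrative.

```java
class PowerOfTwoMaskDemo {
    // True when n has exactly one 1 bit, i.e. n is a power of 2.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Counts the distinct values of (h & (tableLength - 1)) over a range of h values
    // wide enough to exercise every bit of the mask for the lengths used here.
    static int reachableIndices(int tableLength) {
        boolean[] seen = new boolean[tableLength];
        int count = 0;
        for (int h = 0; h < 4 * tableLength; h++) {
            int index = h & (tableLength - 1);
            if (!seen[index]) {
                seen[index] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(1023)); // prints 1111111111
        System.out.println(Integer.toBinaryString(999));  // prints 1111100111
        System.out.println(reachableIndices(1024));       // prints 1024
        System.out.println(reachableIndices(1000));       // prints 256
    }
}
```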

13
Exercise
What are the index values xxx, yyy, and zzz?
14
More hashing examples (for a table 1024 in length)
  • 123456789 indexes to 933
  • 428671256 indexes to 500
  • 884739816 indexes to 234

15
Hashing can result in collisions
  • 123456789 indexes to 933
  • 428671256 indexes to 500
  • 884739816 indexes to 234
  • 403578063 indexes to 933
  • When two different keys yield the same index,
    that is called a collision.
  • Keys that yield the same index are called
    synonyms.
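The collision between 123456789 and 403578063 can be reproduced with a short sketch. It assumes the scrambling method whose shifts are listed on the earlier slide (with +=, ^=, and ~ as the operators) and a table of length 1024. The class and helper names are made up.

```java
class CollisionDemo {
    // Scrambling step, copied from the earlier slide.
    static int hash(Object x) {
        int h = x.hashCode();
        h += ~(h << 9);
        h ^= (h >>> 14);
        h += (h << 4);
        h ^= (h >>> 10);
        return h;
    }

    // Table index for a key; tableLength is assumed to be a power of 2.
    static int indexFor(Object key, int tableLength) {
        return hash(key) & (tableLength - 1);
    }

    // Two different keys are synonyms when they land on the same index.
    static boolean areSynonyms(Object a, Object b, int tableLength) {
        return !a.equals(b) && indexFor(a, tableLength) == indexFor(b, tableLength);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(123456789, 1024));               // prints 933
        System.out.println(indexFor(403578063, 1024));               // prints 933
        System.out.println(areSynonyms(123456789, 403578063, 1024)); // prints true
    }
}
```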

16
Hashing is inefficient when there are a lot of
collisions
  • Ideally, we want the hashing algorithm to
    generate indices sprinkled randomly throughout
    the underlying table
  • The Uniform Hashing Assumption states that
  • Each key is equally likely to hash to any one of
    the table addresses, independently of where the
    other keys have hashed

17
Even if this assumption is true, collisions still
occur
  • This is due to the finite set of indices in a
    table
  • When there are more possible keys than indices,
    the keys cannot all be mapped to distinct indices
  • So collision handlers have to be implemented

18
The JCF HashMap collision handling mechanism
  • At index i in the table, store the linked list of
    all elements whose keys hash to i
  • This is called chaining
  • Each chain is a simple singly-linked list
  • Note: The table length must be a power of 2.
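Chaining can be illustrated with a much-reduced table. The sketch below is not the JCF source: ChainedTable and its members are hypothetical, null keys and resizing are left out, and the index is taken directly from hashCode() without any scrambling, to keep the example short.

```java
class ChainedTable<K, V> {
    // One link in a chain: a key/value pair plus a reference to the next link.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    ChainedTable(int powerOfTwoLength) {
        if (powerOfTwoLength <= 0 || (powerOfTwoLength & (powerOfTwoLength - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of 2");
        }
        table = (Node<K, V>[]) new Node[powerOfTwoLength];
    }

    private int indexFor(Object key) {
        return key.hashCode() & (table.length - 1);
    }

    // Replaces the value if the key is already in the chain; otherwise prepends a new node.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, value, table[i]);
        size++;
        return null;
    }

    // Walks the chain at the key's index until the key is found or the chain ends.
    V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    int chainLength(int index) {
        int length = 0;
        for (Node<K, V> n = table[index]; n != null; n = n.next) {
            length++;
        }
        return length;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ChainedTable<Integer, String> t = new ChainedTable<>(16);
        t.put(5, "five");
        t.put(21, "twenty-one");              // 21 & 15 == 5, so it joins the chain at index 5
        System.out.println(t.chainLength(5)); // prints 2
        System.out.println(t.get(21));        // prints twenty-one
    }
}
```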

20
As chains get long, performance degrades to O(m)
  • Once the table becomes 75% full, it is resized
  • All indices are recalculated
  • Chains are eliminated or shortened
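A small sketch of the arithmetic behind resizing, assuming "75% full" means the number of entries relative to the table length and that the table doubles in length when resized. The class and method names are illustrative.

```java
class ResizeDemo {
    // Number of entries at which a table of the given length counts as "full enough".
    static int threshold(int tableLength, float loadFactor) {
        return (int) (tableLength * loadFactor);
    }

    // Index of a hash value in a table whose length is a power of 2.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(1024, 0.75f)); // prints 768

        // 5 and 21 share index 5 in a table of length 16...
        System.out.println(indexFor(5, 16) + " " + indexFor(21, 16)); // prints 5 5
        // ...but land on different indices once the length doubles to 32.
        System.out.println(indexFor(5, 32) + " " + indexFor(21, 32)); // prints 5 21
    }
}
```

The second pair of lines shows why recalculating the indices shortens chains: doubling the length adds one more bit to the mask, and synonyms that differ in that bit are separated.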

23
Another collision handler
  • In Open Address hashing, when a collision occurs,
    the next available index is used to store the
    key/value
  • This leads to some interesting practical
    implementation problems (see the text)
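A minimal sketch of open addressing, assuming the simplest probing rule: on a collision, try the next index and wrap around at the end of the table. LinearProbingTable is a hypothetical class for int keys. Removal is deliberately left out, since simply emptying a slot would break later searches, which is one of the practical problems alluded to above.

```java
class LinearProbingTable {
    private final Integer[] keys;   // null marks an empty slot
    private final String[] values;
    private int size;

    LinearProbingTable(int powerOfTwoLength) {
        if (powerOfTwoLength <= 0 || (powerOfTwoLength & (powerOfTwoLength - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of 2");
        }
        keys = new Integer[powerOfTwoLength];
        values = new String[powerOfTwoLength];
    }

    // Stores the pair and returns the index of the slot that was used.
    int put(int key, String value) {
        int mask = keys.length - 1;
        int i = key & mask;
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[i] == null) {      // empty slot: claim it
                keys[i] = key;
                values[i] = value;
                size++;
                return i;
            }
            if (keys[i] == key) {       // same key: overwrite the value
                values[i] = value;
                return i;
            }
            i = (i + 1) & mask;         // collision: move to the next index
        }
        throw new IllegalStateException("table is full");
    }

    // Follows the same probe sequence as put() until the key or an empty slot is found.
    String get(int key) {
        int mask = keys.length - 1;
        int i = key & mask;
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[i] == null) {
                return null;
            }
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & mask;
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(16);
        System.out.println(t.put(5, "five"));        // prints 5
        System.out.println(t.put(21, "twenty-one")); // prints 6 (index 5 is taken)
        System.out.println(t.get(21));               // prints twenty-one
    }
}
```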

24
HashSets
25
A HashSet is an unordered Collection in which the
element is the key
  • The HashSet class has all of the methods in the
    Collection interface
  • add, remove, size, contains, ...
  • plus toString (inherited from AbstractCollection)
  • public class HashSet<E>
  • extends AbstractSet<E>
  • implements Set<E>, Cloneable,
    java.io.Serializable
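A short usage sketch of HashSet, where each element acts as its own key. The class name and the sample IDs are made up.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetDemo {
    // Builds a small set of student IDs; adding a duplicate leaves the set unchanged.
    static Set<String> sampleIds() {
        Set<String> ids = new HashSet<>();
        ids.add("S1001");
        ids.add("S1002");
        ids.add("S1001"); // already present: add() returns false
        return ids;
    }

    public static void main(String[] args) {
        Set<String> ids = sampleIds();
        System.out.println(ids.size());            // prints 2
        System.out.println(ids.contains("S1002")); // prints true
        System.out.println(ids.remove("S1002"));   // prints true
        System.out.println(ids.contains("S1002")); // prints false
    }
}
```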