COMP 171 Data Structures and Algorithms - PowerPoint PPT Presentation

Slides: 16
Provided by: Vincent159

1
COMP 171 Data Structures and Algorithms
  • Tutorial 10
  • Hash Tables

2
Data Dictionary
  • A data structure that supports
  • Insert
  • Search
  • Delete
  • Examples
  • Binary Search Tree
  • Red-Black Tree
  • B-Tree
  • Linked List

3
Hash Table
  • An effective data dictionary
  • Worst case: Θ(n) time
  • Under reasonable assumptions: O(1) average time
  • Generalization of an array
  • Size proportional to the number of keys actually
    stored
  • Array index is computed from the key

4
Direct Address Tables
  • Universe U
  • a set that contains all the possible keys
  • Table has |U| slots.
  • Each key in U is mapped into one unique entry in
    the Table
  • Insert, delete and search take O(1) time
  • Works well when |U| is small
  • If |U| is large, impractical
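
A minimal sketch of a direct-address table, assuming the keys are integers in U = {0, ..., universe_size − 1}. The class and method names are illustrative and do not come from the slides.

```python
class DirectAddressTable:
    """Direct-address table: one slot for every possible key in U."""

    def __init__(self, universe_size):
        # The table has |U| slots; None marks an empty slot.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value      # O(1): the key is the array index

    def search(self, key):
        return self.slots[key]       # O(1); None if the key is absent

    def delete(self, key):
        self.slots[key] = None       # O(1)


table = DirectAddressTable(universe_size=10)
table.insert(3, "three")
print(table.search(3))   # three
table.delete(3)
print(table.search(3))   # None
```

The constructor allocates |U| slots no matter how many keys are stored, which is why the approach becomes impractical for a large universe.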

5
Hash Function
  • Assume Hash Table has m slots
  • Hash function h is used to compute slot from the
    key k
  • h maps U into the slots of a hash table
  • h : U → {0, 1, ..., m−1}
  • If |U| > m, at least 2 keys will have the same hash
    value (a collision)
  • A good hash function can minimize the number of
    collisions

6
  • If the keys are not natural numbers
  • Interpret them as natural numbers using a suitable
    radix notation
  • Example: interpret a character string as a radix-128 integer
  • Division method
  • h(k) = k mod m
  • m is usually a prime
  • Avoid m too close to an exact power of 2
  • Ex 11.3-3
  • Choose m = 2^p − 1, where k is a character string
    interpreted in radix 2^p. Show that if x can be
    derived from y by permuting its characters, then
    h(x) = h(y).
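
The radix interpretation and the division method can be sketched as below. The function names are illustrative, and the strings are assumed to be ASCII so that every character code fits in radix 128. The last two lines show the effect described in Ex 11.3-3 for p = 7.

```python
def string_to_int(s, radix=128):
    """Interpret an ASCII string as a natural number in the given radix."""
    value = 0
    for ch in s:
        value = value * radix + ord(ch)
    return value


def division_hash(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


# "pt" in radix 128: 112 * 128 + 116
print(string_to_int("pt"))                        # 14452

# Ex 11.3-3 with p = 7: m = 2^7 - 1 = 127 and radix 2^7 = 128.
m = 2**7 - 1
print(division_hash(string_to_int("stop"), m))    # 73
print(division_hash(string_to_int("pots"), m))    # 73
```

Because 128 mod 127 = 1, the hash reduces to the sum of the character codes mod 127, so any permutation of the same characters collides.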

7
  • Multiplication method
  • h(k) = ⌊m (k A mod 1)⌋, 0 < A < 1
  • Value m is not critical
  • Usually choose m to be power of 2
  • It works better with some values of A
  • E.g. A = (√5 − 1) / 2 ≈ 0.618
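
A small sketch of the multiplication method using floating-point arithmetic, with A = (√5 − 1) / 2. The function name and the sample key are illustrative. A production version would normally use fixed-point integer arithmetic instead of floats.

```python
import math

A = (math.sqrt(5) - 1) / 2       # about 0.618


def multiplication_hash(k, m, a=A):
    """Multiplication method: h(k) = floor(m * (k * a mod 1))."""
    fractional = (k * a) % 1.0   # fractional part of k * a
    return math.floor(m * fractional)


# m is a power of 2, as the slide suggests.
print(multiplication_hash(123456, 2**14))   # 67
```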

8
Separate Chaining
  • Put all the elements that hash to the same slot
    in a linked list
  • An element is inserted at the head of the linked
    list
  • Worst case insertion O(1)
  • Worst case search O(n)
  • Worst case deletion O(1) (given a pointer to the element in a
    doubly linked list)
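
A minimal sketch of separate chaining with singly linked chains. The class names are illustrative, and the slot is computed with Python's built-in hash plus the division method, which is an assumption rather than something the slides specify. Deletion here is by key, so it has to search the chain first. O(1) deletion needs a pointer to the element in a doubly linked list.

```python
class Node:
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.heads = [None] * m          # one chain head per slot

    def _slot(self, key):
        return hash(key) % self.m        # assumed hash function

    def insert(self, key, value):
        # O(1): push onto the head of the chain (no duplicate check).
        i = self._slot(key)
        self.heads[i] = Node(key, value, self.heads[i])

    def search(self, key):
        # O(chain length); O(n) if every key hashed to this slot.
        node = self.heads[self._slot(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None

    def delete(self, key):
        # Searches the chain first, so this is O(chain length).
        i = self._slot(key)
        prev, node = None, self.heads[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[i] = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False


table = ChainedHashTable(m=4)
for k in (1, 5, 9):                      # all three keys hash to slot 1
    table.insert(k, str(k))
print(table.search(5))                   # 5
table.delete(5)
print(table.search(5))                   # None
```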

9
  • Given a hash table with m slots that stores n
    elements, we define the load factor α
  • α = n/m
  • Simple uniform hashing
  • Element is equally likely to hash into any of the
    m slots, independently of where any other element
    has hashed to
  • Under Simple uniform hashing
  • Average time for search: Θ(1 + α)

10
Open Addressing
  • Each table slot contains either an element or NIL
  • When collision happens, we successively examine,
    or probe, the hash table until we find an empty
    slot to put the key
  • Deletion is done by marking the slot as DELETED
    rather than NIL, so that later searches probe past it
  • Hash function h now takes two arguments
  • The key value and the probe number
  • h(k, i)

11
  • Linear Probing
  • h(k, i) = (h'(k) + i) mod m, where h' is an ordinary hash function
  • The initial probe determines the entire probe
    sequence, so there are only m distinct probe
    sequences
  • Primary clustering
  • Quadratic Probing
  • h(k, i) = (h'(k) + c₁i + c₂i²) mod m
  • There are only m distinct probe sequences
  • Secondary clustering
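
Open addressing with a pluggable probe function can be sketched as follows. All names are illustrative. The auxiliary hash h'(k) = k mod m and the quadratic constants c₁ = c₂ = 1/2 are assumptions chosen for this example. With those constants the quadratic sequence visits every slot when m is a power of 2.

```python
NIL = None
DELETED = object()       # sentinel for a deleted slot, distinct from NIL


def linear_probe(k, i, m):
    return (k % m + i) % m                     # h'(k) = k mod m (assumed)


def quadratic_probe(k, i, m):
    return (k % m + (i + i * i) // 2) % m      # c1 = c2 = 1/2 (assumed)


class OpenAddressTable:
    def __init__(self, m, probe=linear_probe):
        self.m = m
        self.probe = probe
        self.slots = [NIL] * m

    def insert(self, k):
        for i in range(self.m):
            j = self.probe(k, i, self.m)
            if self.slots[j] is NIL or self.slots[j] is DELETED:
                self.slots[j] = k
                return j
        raise OverflowError("hash table overflow")

    def search(self, k):
        for i in range(self.m):
            j = self.probe(k, i, self.m)
            if self.slots[j] is NIL:
                return None          # a never-used slot ends the search
            if self.slots[j] is not DELETED and self.slots[j] == k:
                return j
        return None

    def delete(self, k):
        j = self.search(k)
        if j is not None:
            self.slots[j] = DELETED  # keeps later keys in the sequence reachable
        return j


table = OpenAddressTable(m=7)
for k in (10, 17, 24):               # all three hash to slot 3
    table.insert(k)
table.delete(17)                     # slot 4 becomes DELETED, not NIL
print(table.search(24))              # 5
```

If delete wrote NIL instead of DELETED, the search for 24 would stop at slot 4 and report the key as missing.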

12
  • Double hashing
  • Make use of 2 different hash functions
  • h(k, i) = (h₁(k) + i·h₂(k)) mod m
  • h₂(k) should be co-prime with m
  • Usually take m as a prime number
  • Probe sequence depends on both hash functions, so
    there are Θ(m²) probe sequences
  • Double hashing is better than linear or quadratic
    probing
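
A short sketch of a double-hashing probe sequence. The choices h₁(k) = k mod m and h₂(k) = 1 + (k mod (m − 1)) are assumptions for illustration. With m prime, h₂(k) is always between 1 and m − 1 and therefore co-prime with m.

```python
def double_hash_probe(k, i, m):
    h1 = k % m
    h2 = 1 + (k % (m - 1))       # never 0, always less than m
    return (h1 + i * h2) % m


m = 13
# Keys 14 and 27 share the same first probe but then diverge.
print([double_hash_probe(14, i, m) for i in range(4)])   # [1, 4, 7, 10]
print([double_hash_probe(27, i, m) for i in range(4)])   # [1, 5, 9, 0]
```

Each full sequence of m probes is a permutation of the slots, so an insert finds an empty slot whenever one exists.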

13
Trie
  • Assumption
  • Digital data / radix
  • Tree structure is used
  • Insertion is done by creating a path of nodes
    from the root to the data
  • Deletion is done by removing the pointer that
    points to that element
  • Time complexity: O(L)
  • Max # of keys for given L: 128^(L+1) − 1
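
A minimal trie sketch, assuming ASCII string keys (radix 128). Each node holds one child pointer per character plus one pointer to the stored element. The class and method names are illustrative.

```python
RADIX = 128                              # assumes ASCII keys


class TrieNode:
    def __init__(self):
        self.children = [None] * RADIX   # one pointer per possible character
        self.data = None                 # pointer to the stored element, if any


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, data):
        node = self.root
        for ch in key:                   # create missing nodes along the path
            c = ord(ch)
            if node.children[c] is None:
                node.children[c] = TrieNode()
            node = node.children[c]
        node.data = data

    def _find(self, key):
        node = self.root
        for ch in key:                   # at most L steps for a key of length L
            node = node.children[ord(ch)]
            if node is None:
                return None
        return node

    def search(self, key):
        node = self._find(key)
        return None if node is None else node.data

    def delete(self, key):
        node = self._find(key)
        if node is not None:
            node.data = None             # drop the pointer; path nodes stay


trie = Trie()
trie.insert("tea", 1)
trie.insert("ten", 2)
print(trie.search("ten"))    # 2
print(trie.search("te"))     # None
trie.delete("tea")
print(trie.search("tea"))    # None
```

Every operation walks one node per character, so the cost depends on the key length L and not on the number of keys stored.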

14
  • Memory Usage
  • Node size × number of nodes
  • ((N + 1) × pointer size) × (L × n)
  • N = radix
  • L = maximum length of the keys
  • n = number of keys
  • Improvement 1
  • Put all nodes into an array of nodes
  • Replace pointer by array index
  • Array index needs ⌈lg(L × n)⌉ bits
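
The memory estimate can be checked with a short calculation. The sample values (N = 128, L = 10, n = 1000, 32-bit pointers) are hypothetical and chosen only to show the arithmetic. L × n is treated as an upper bound on the number of nodes.

```python
def index_bits(max_len, num_keys):
    """ceil(lg(L * n)), computed with integers to avoid rounding issues."""
    return (max_len * num_keys - 1).bit_length()


def trie_memory_bits(radix, max_len, num_keys, pointer_bits):
    """((N + 1) * pointer size) * (L * n), in bits."""
    return (radix + 1) * pointer_bits * (max_len * num_keys)


N, L, n = 128, 10, 1000                                   # hypothetical values
print(trie_memory_bits(N, L, n, 32) // 8)                 # 5160000 bytes
print(index_bits(L, n))                                   # 14
print(trie_memory_bits(N, L, n, index_bits(L, n)) // 8)   # 2257500 bytes
```

With these numbers, replacing 32-bit pointers by 14-bit array indices cuts the estimate to less than half.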

15
  • Improvement 2
  • Eliminate nodes with a single child
  • Do Skipping
  • Label each internal node with its position
  • Improvement 3
  • De La Briandais Tree
  • Eliminate null pointer in the internal node
  • Saves memory when arrays are sparsely populated