ITEC 2620M Introduction to Data Structures

1
ITEC 2620M Introduction to Data Structures
  • Instructor: Prof. Z. Yang
  • Course Website: http://people.math.yorku.ca/zyang/itec2620m.htm
  • Office: TEL 3049

2
HASHING
3
Key Points of this Lecture
  • Hash tables
  • Hash functions
  • Collision resolution and clustering
  • Deletions

4
Indices vs. Keys
  • Each key/record is associated with an array slot
  • We could map each key to a slot
  • e.g. last name to apartment number
  • We could then search either the array (unsorted?)
    or a look-up table (sorted?)
  • However, what if the look-up is actually a
    calculated function?
  • then we can eliminate the look-up!

5
Hash Functions
  • A hash function h() converts a key (integer,
    string, float, etc.) into a table index
  • Example
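As an illustration (not the slide's own example), here is a minimal sketch of two simple hash functions. The table size M = 11 and the names hash_int and hash_str are assumptions made for this sketch.

```python
M = 11  # assumed table size (a small prime, chosen only for illustration)


def hash_int(key: int, m: int = M) -> int:
    """Map an integer key to a table index by taking it modulo the table size."""
    return key % m


def hash_str(key: str, m: int = M) -> int:
    """Map a string key to a table index by summing its character codes."""
    return sum(ord(ch) for ch in key) % m


print(hash_int(1234))    # 1234 % 11 = 2
print(hash_str("Yang"))  # (89 + 97 + 110 + 103) % 11 = 3
```

Summing character codes is easy to follow but distributes keys poorly (anagrams always collide), so it is shown here only to make the idea concrete.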

6
Hash Tables
  • Records are stored in slots specified by a hash
    function
  • Look-up/store
  • Convert key into a table index with hash function
    h()
  • h(key) -> index
  • Find record/empty slot starting at index
    h(key) (use resolution policy if necessary)
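The look-up/store steps above can be sketched as follows. This is a minimal sketch, not the course's implementation: it assumes integer keys, a modulus hash function, and linear probing as the resolution policy, and the class and method names are hypothetical.

```python
class HashTable:
    """Fixed-size table; linear probing is the assumed resolution policy."""

    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size  # each slot holds (key, record) or None

    def h(self, key):
        # Assumed hash function: integer key modulo the table size.
        return key % self.size

    def store(self, key, record):
        """Place the record in the first free slot at or after h(key)."""
        home = self.h(key)
        for i in range(self.size):
            index = (home + i) % self.size
            if self.slots[index] is None or self.slots[index][0] == key:
                self.slots[index] = (key, record)
                return index
        raise RuntimeError("table is full")

    def lookup(self, key):
        """Follow the same path as store(); stop at the key or an empty slot."""
        home = self.h(key)
        for i in range(self.size):
            index = (home + i) % self.size
            if self.slots[index] is None:
                return None  # empty slot reached: the key is not in the table
            if self.slots[index][0] == key:
                return self.slots[index][1]
        return None


table = HashTable()
table.store(15, "A")  # h(15) = 4, stored in slot 4
table.store(26, "B")  # h(26) = 4 as well, so it moves on to slot 5
print(table.lookup(26))  # B
print(table.lookup(37))  # None (home slot 4, but slot 6 is empty)
```

The sketch does not handle deletions: simply emptying a slot would cut the path that lookup() follows to records stored after a collision.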

7
Comments
  • Hash function should evenly distribute keys
    across table
  • not easy given unspecified input data
    distribution
  • Hash table should be about half full
  • note time-space tradeoff
  • more space -> less time (and already twice as
    much space as a sorted array)
  • if half full, 50% chance of one collision
  • 25% chance of two collisions
  • etc...
  • 2 accesses on average (approaches n as table
    fills)
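The "2 accesses on average" figure is the sum 1 + 1/2 + 1/4 + ... = 2. A minimal sketch of that sum, assuming the simplified model in which every probe independently lands on an occupied slot with probability equal to the load factor (the function name is hypothetical):

```python
def expected_accesses(load_factor, max_terms=10_000):
    """Sum 1 + a + a^2 + ... for load factor a (simplified collision model)."""
    total = 0.0
    chance_of_reaching_probe = 1.0  # the first access always happens
    for _ in range(max_terms):
        total += chance_of_reaching_probe
        chance_of_reaching_probe *= load_factor
    return total


print(round(expected_accesses(0.5), 6))  # 2.0 for a half-full table
print(round(expected_accesses(0.9), 6))  # 10.0 for a 90% full table
```

Under this model the sum equals 1 / (1 - load factor), which is why the cost climbs steeply as the table fills.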

8
How to do better
  • What to do with collisions?
  • linear probing (classic hashing)
  • if collision, search spaces sequentially
  • To eliminate clustering, we would like each
    remaining slot to have equal probability
  • Can't use true random probing: the probe sequence
    needs to be reproducible
  • Pseudo-random probing (see text)
  • Goal of random probing? -> cause divergence
  • Probe sequences should not all follow same path
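One way to get a reproducible "random" probe order is to fix a permutation of offsets once and reuse it for every key. The following is a minimal sketch under that assumption; the seed value and function names are hypothetical, and the textbook's version may differ in detail.

```python
import random


def make_probe_offsets(table_size, seed=2620):
    """Build a fixed pseudo-random permutation of the offsets 1..table_size-1.

    The seed is fixed, so the same permutation comes back on every call,
    which keeps probe sequences reproducible between store and look-up.
    """
    offsets = list(range(1, table_size))
    random.Random(seed).shuffle(offsets)
    return offsets


def probe_sequence(home, table_size, offsets):
    """Yield the home slot, then home plus each permuted offset."""
    yield home
    for offset in offsets:
        yield (home + offset) % table_size


M = 11  # assumed table size
OFFSETS = make_probe_offsets(M)
seq_a = list(probe_sequence(3, M, OFFSETS))
seq_b = list(probe_sequence(4, M, OFFSETS))
print(seq_a)
print(seq_b)
```

With linear probing, a key whose home is slot 3 steps into slot 4 and then follows exactly the path of a key whose home is slot 4. Here the two sequences visit the slots in different relative orders, so they are less likely to merge into one cluster.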

9
Quadratic Probing
  • Simple divergence method
  • Linear probing: ith probe is i slots away
  • Quadratic probing: ith probe is i^2 slots away
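To compare the two probe functions, here is a minimal sketch that assumes the common definition in which the ith quadratic probe is i^2 slots from the home position. The table size and function names are assumptions made for this sketch.

```python
M = 11  # assumed table size


def linear_probe(home, i, table_size=M):
    """Slot visited on the ith probe with linear probing."""
    return (home + i) % table_size


def quadratic_probe(home, i, table_size=M):
    """Slot visited on the ith probe with quadratic probing."""
    return (home + i * i) % table_size


print([linear_probe(3, i) for i in range(5)])     # [3, 4, 5, 6, 7]
print([quadratic_probe(3, i) for i in range(5)])  # [3, 4, 7, 1, 8]
```

The growing step size is what makes sequences from neighbouring home positions diverge instead of piling into one run of adjacent slots.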

10
Secondary Clustering
  • If multiple keys are hashed to the same
    index/home position, quadratic probing still
    follows the same path each time
  • This is secondary clustering
  • Use second hash function to determine probe
    sequence
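Using a second hash function this way is commonly called double hashing. A minimal sketch, assuming integer keys and a prime table size; h1, h2, and probe are hypothetical names, and this particular h2 is just one simple choice.

```python
M = 11  # assumed prime table size


def h1(key):
    """Home position."""
    return key % M


def h2(key):
    """Assumed second hash: the step size, never 0, so every probe moves."""
    return 1 + (key % (M - 1))


def probe(key, i):
    """Slot visited on the ith probe: home position plus i steps of size h2."""
    return (h1(key) + i * h2(key)) % M


# Keys 15 and 26 share home position 4 but use different step sizes.
print([probe(15, i) for i in range(4)])  # [4, 10, 5, 0]
print([probe(26, i) for i in range(4)])  # [4, 0, 7, 3]
```

Because M is prime and the step size is between 1 and M - 1, each key's sequence eventually visits every slot in the table.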