Title: Hashing: Collision Resolution Schemes
1Hashing Collision Resolution Schemes
- Collision Resolution Techniques
- Introduction to Separate Chaining
- Collision Resolution using Separate Chaining
- Introduction to Collision Resolution using Open Addressing
2Collision Resolution Techniques
- There are three broad ways of collision resolution:
- 1. Separate Chaining: A linked list-based implementation.
- 2. Open Addressing: Array-based implementation.
- (i) Linear probing (linear search)
- (ii) Quadratic probing (nonlinear search)
- (iii) Random increments/decrements
- (iv) Rehashing (double hashing)
- 3. Bucket methods: Usually a combination of (1) and (2).
3Introduction to Separate Chaining
- The hash table is implemented as an array of linked lists.
- Inserting an item, r, at index i is simply insertion into the linked list at position i.
- Synonyms are chained in the same linked list.
- Retrieval of an item, r, with hash address, i, is simply retrieval from the linked list at position i.
- Deletion of an item, r, with hash address, i, is simply deleting r from the linked list at position i.
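The three operations above can be sketched in Java as follows. This is a minimal illustration, not the course's implementation: the class name `ChainedHashTable`, the use of `int` keys, and the hash function `key mod size` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative separate-chaining table for int keys (hypothetical class).
class ChainedHashTable {
    // One linked list (chain) per table index.
    private final List<LinkedList<Integer>> table = new ArrayList<>();

    ChainedHashTable(int size) {
        for (int i = 0; i < size; i++)
            table.add(new LinkedList<>());
    }

    // Assumed hash function: key mod table size.
    private int hash(int key) {
        return Math.floorMod(key, table.size());
    }

    // Insertion: append the key to the chain at its hash address.
    void insert(int key) {
        table.get(hash(key)).add(key);
    }

    // Retrieval: search only the chain at the key's hash address.
    boolean contains(int key) {
        return table.get(hash(key)).contains(key);
    }

    // Deletion: remove the key from its chain; returns false if it is absent.
    boolean delete(int key) {
        return table.get(hash(key)).remove(Integer.valueOf(key));
    }

    public static void main(String[] args) {
        ChainedHashTable t = new ChainedHashTable(13);
        t.insert(14);
        t.insert(27);                             // synonym of 14: both hash to 1
        System.out.println(t.contains(27));       // true
        System.out.println(t.delete(14));         // true
        System.out.println(t.contains(14));       // false
    }
}
```

Deletion passes `Integer.valueOf(key)` so that the list removes the matching element rather than treating the key as a list index.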
4Separate Chaining with String Keys
- Recall that search keys can be numbers, strings or some other object.
- The following Java method hashes a string key by summing its character codes and reducing the sum modulo the table size:
    public static int hash(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            hashVal += key.charAt(i);   // add the code of each character
        }
        return hashVal % tableSize;     // map the sum to 0..tableSize-1
    }
- The following class describes commodity items:
    class CommodityItem {
        String name;      // commodity name
        int quantity;     // commodity quantity needed
        double price;     // commodity price
    }
5Example 1 Separate Chaining
- Devise an appropriate hash function and use it to load the information about the following commodity items into a hash table of size 13 using separate chaining.
- onion 1 10.0
- tomato 1 8.50
- cabbage 3 3.50
- carrot 1 5.50
- okra 1 6.50
- mellon 2 10.0
- potato 2 7.50
- banana 3 4.0
- olive 2 15.0
- salt 2 2.50
- cucumber 3 4.50
- mushroom 3 5.50
- orange 2 3.00
6Example 1 Separate Chaining (cont'd)
- Item       Qty   Price   h(key)
- onion      1     10.0    1
- tomato     1     8.50    10
- cabbage    3     3.50    4
- carrot     1     5.50    1
- okra       1     6.50    0
- mellon     2     10.0    10
- potato     2     7.50    0
- banana     3     4.0     11
- olive      2     15.0    10
- salt       2     2.50    7
- cucumber   3     4.50    9
- mushroom   3     5.50    6
- orange     2     3.00    12
- Resulting chains by table index (items listed in order of insertion; indices 2, 3, 5 and 8 are empty):
- 0: okra, potato
- 1: onion, carrot
- 4: cabbage
- 6: mushroom
- 7: salt
- 9: cucumber
- 10: tomato, mellon, olive
- 11: banana
- 12: orange
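The h(key) column can be checked with a short Java sketch. It assumes the character-sum hash function (sum of character codes modulo the table size) and lowercase item names; the class name `CommodityChains` and the helper `load` are hypothetical, and only the item names are stored to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class CommodityChains {
    // Assumed hash function: sum of character codes modulo the table size.
    static int hash(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++)
            hashVal += key.charAt(i);
        return hashVal % tableSize;
    }

    // Builds the chains by appending each name to the list at its hash address.
    static List<List<String>> load(String[] names, int tableSize) {
        List<List<String>> table = new ArrayList<>();
        for (int i = 0; i < tableSize; i++)
            table.add(new LinkedList<>());
        for (String name : names)
            table.get(hash(name, tableSize)).add(name);
        return table;
    }

    public static void main(String[] args) {
        String[] names = {"onion", "tomato", "cabbage", "carrot", "okra",
                          "mellon", "potato", "banana", "olive", "salt",
                          "cucumber", "mushroom", "orange"};
        List<List<String>> table = load(names, 13);
        // Print one line per table index with the chain stored there.
        for (int i = 0; i < table.size(); i++)
            System.out.println(i + ": " + table.get(i));
    }
}
```

For instance, the character codes of "onion" sum to 547, and 547 mod 13 is 1, which matches the table.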
7Introduction to Open Addressing
- In this method the entries are placed inside the array itself.
- The probe sequence is essentially a sequence of functions
- $h_0, h_1, h_2, \ldots, h_{n-1}$
- where
- $h_i : K \to \{0, 1, \ldots, n-1\}$
- To insert item r, we examine array locations
- $h_0(r), h_1(r), h_2(r), \ldots$
- Similarly, to find item r, we examine the same sequence of locations in the same order.
8Introduction to Open Addressing (cont'd)
- The most common probe sequences are of the form
- $h_i(r) = (h(r) + c(i)) \bmod n, \quad i = 0, 1, \ldots, n-1.$
- The function c(i) is required to have the following two properties:
- Property 1: $c(0) = 0$.
- Property 2: The set of values
- $\{c(0) \bmod n,\ c(1) \bmod n,\ c(2) \bmod n,\ \ldots,\ c(n-1) \bmod n\}$
- must contain every integer between 0 and n-1 inclusive.
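Both properties can be tested mechanically for a candidate c(i). The Java sketch below is only an illustration: the class `ProbeCoverage`, its method names, and the sample functions c(i) = i, c(i) = 2i and c(i) = i^2 are choices made for this example.

```java
import java.util.function.IntUnaryOperator;

class ProbeCoverage {
    // Property 1: c(0) must be 0.
    static boolean satisfiesProperty1(IntUnaryOperator c) {
        return c.applyAsInt(0) == 0;
    }

    // Property 2: {c(0) mod n, ..., c(n-1) mod n} contains every value 0..n-1.
    static boolean satisfiesProperty2(IntUnaryOperator c, int n) {
        boolean[] seen = new boolean[n];
        int distinct = 0;
        for (int i = 0; i < n; i++) {
            int v = Math.floorMod(c.applyAsInt(i), n);
            if (!seen[v]) {
                seen[v] = true;
                distinct++;
            }
        }
        return distinct == n;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesProperty2(i -> i, 8));       // true
        System.out.println(satisfiesProperty2(i -> 2 * i, 8));   // false
        System.out.println(satisfiesProperty2(i -> i * i, 8));   // false
    }
}
```

With n = 8, c(i) = 2i only ever produces 0, 2, 4 and 6, so half of the table would never be probed.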
9Open Addressing Linear Probing
- Linear Probe: Here the function c(i) is a linear function in i:
- $c(i) = ai + b$
- Property 1 requires that $c(0) = 0$. Therefore, b must be zero.
- For $c(i) = ai$ to satisfy Property 2, a and n must be relatively prime.
- The linear probing sequence that is usually used is
- $h_i(r) = (h(r) + i) \bmod n, \quad i = 0, 1, 2, \ldots, n-1$
- Insert the record at the first empty slot; if no empty slot is found, the hash table is full and insertion fails.
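Insertion with the usual linear probe sequence might look like the Java sketch below. The class `LinearProbingTable`, the use of `int` keys, and the home-address function `key mod n` are assumptions made for illustration.

```java
class LinearProbingTable {
    private final Integer[] slots;   // null marks an empty slot

    LinearProbingTable(int n) {
        slots = new Integer[n];
    }

    // Places key in the first empty slot of the sequence (h(key) + i) mod n.
    // Returns the slot index used, or -1 if the table is full.
    int insert(int key) {
        int n = slots.length;
        int home = Math.floorMod(key, n);   // assumed h(key) = key mod n
        for (int i = 0; i < n; i++) {
            int pos = (home + i) % n;
            if (slots[pos] == null) {
                slots[pos] = key;
                return pos;
            }
        }
        return -1;   // all n locations examined, none empty
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(5);
        System.out.println(t.insert(7));    // 2
        System.out.println(t.insert(12));   // 3 (slot 2 is taken)
        System.out.println(t.insert(4));    // 4
        System.out.println(t.insert(9));    // 0 (wraps around from slot 4)
        System.out.println(t.insert(3));    // 1
        System.out.println(t.insert(8));    // -1 (table is full)
    }
}
```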
10Example 2 Linear Probing
- Use the hash function h(r) = r.id % 13 to load the following records into an array of size 13.
- Al-Otaibi, Ziyad 1.73 985926
- Al-Turki, Musab Ahmad Bakeer 1.60 970876
- Al-Saegh, Radha Mahdi 1.58 980962
- Al-Shahrani, Adel Saad 1.80 986074
- Al-Awami, Louai Adnan Muhammad 1.73 970728
- Al-Amer, Yousuf Jauwad 1.66 994593
- Al-Helal, Husain Ali AbdulMohsen 1.70 996321
- Then insert the following records using linear probing to resolve collisions, if any.
- Al-Najjar, Khaled Ziyad 1.69 987615
- Al-Ali, Amr Ali Zaid 1.79 987630
- Al-Ramadi, Husam Yahya 1.58 987602
11Example 2 Introduction to Hashing (cont'd)
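- Resulting table (index: record, with the home address h(r) in parentheses):
- 0: (empty)
- 1: Husain (1)
- 2: Yousuf (2)
- 3: (empty)
- 4: (empty)
- 5: Louai (5)
- 6: Ziyad (6)
- 7: Khaled (5)
- 8: Radha (8)
- 9: Amr (7)
- 10: Musab (10)
- 11: Adel (11)
- 12: Husam (5)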
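The placements in Example 2 can be replayed with a short Java sketch. It assumes h(r) = r.id % 13 and the usual linear probe sequence, and it stores only the record ids; the class name `Example2` and the method `load` are hypothetical.

```java
class Example2 {
    // Loads the ids in order into a table of size n, using h(id) = id % n
    // and linear probing. A value of 0 marks an empty slot (no id is 0).
    static int[] load(int[] ids, int n) {
        int[] table = new int[n];
        for (int id : ids) {
            int home = id % n;
            for (int i = 0; i < n; i++) {
                int pos = (home + i) % n;
                if (table[pos] == 0) {
                    table[pos] = id;
                    break;
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] ids = {985926, 970876, 980962, 986074, 970728, 994593, 996321,
                     987615, 987630, 987602};
        int[] table = load(ids, 13);
        for (int i = 0; i < table.length; i++)
            System.out.println(i + ": " + (table[i] == 0 ? "(empty)" : table[i]));
    }
}
```

The last three ids have home addresses 5, 7 and 5, all of which are already occupied, so they end up in slots 7, 9 and 12 respectively.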
12Linear Probing Some Notes
- Notice from this table that a large cluster has already been formed.
- In general, empty cells following a cluster have a higher chance of being filled, because a key that hashes anywhere inside the cluster is placed in the first empty cell after it.
- The probability of taking longer probe sequences is much higher with clusters.
- This is one disadvantage of linear probing. Other methods attempt to improve on this.
13Introduction to Retrieval and Deletion
- Retrieval: To search for a record we
- Calculate its hash value.
- Check that location of the array for the record.
- If found, return the record.
- If not, keep searching along the probe sequence until you find the record or you reach an empty table location.
- Attempting to retrieve a non-existent record is very expensive, since the search continues until an empty location is reached.
- Deletion:
- In open addressing, where a record is stored is not necessarily its home position.
- We cannot just set the location of a deleted record to empty, since that would cut off searches for records stored beyond it on the same probe sequence.
- A special flag or key value is needed to mark the locations of deleted records.
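One common way to realise the special flag is a reserved marker value, often called a tombstone. The Java sketch below illustrates the idea under several assumptions: keys are distinct non-negative ints, the home address is `key mod n`, and the class `ProbingWithDeletion` and the marker values -1 and -2 are invented for the example.

```java
import java.util.Arrays;

class ProbingWithDeletion {
    private static final int EMPTY = -1;     // slot has never held a record
    private static final int DELETED = -2;   // slot held a record that was deleted
    private final int[] slots;

    ProbingWithDeletion(int n) {
        slots = new int[n];
        Arrays.fill(slots, EMPTY);
    }

    private int home(int key) {
        return key % slots.length;           // assumes key >= 0
    }

    // Inserts into the first EMPTY or DELETED slot on the probe sequence.
    boolean insert(int key) {
        for (int i = 0; i < slots.length; i++) {
            int pos = (home(key) + i) % slots.length;
            if (slots[pos] == EMPTY || slots[pos] == DELETED) {
                slots[pos] = key;
                return true;
            }
        }
        return false;                        // table is full
    }

    // Returns the slot holding key, or -1. The search stops at an EMPTY slot
    // but continues past DELETED slots.
    int find(int key) {
        for (int i = 0; i < slots.length; i++) {
            int pos = (home(key) + i) % slots.length;
            if (slots[pos] == EMPTY) return -1;
            if (slots[pos] == key) return pos;
        }
        return -1;
    }

    // Marks the key's slot as DELETED instead of EMPTY.
    boolean delete(int key) {
        int pos = find(key);
        if (pos < 0) return false;
        slots[pos] = DELETED;
        return true;
    }

    public static void main(String[] args) {
        ProbingWithDeletion t = new ProbingWithDeletion(7);
        t.insert(10);                        // home 3, stored at 3
        t.insert(17);                        // home 3, stored at 4
        t.insert(24);                        // home 3, stored at 5
        t.delete(17);                        // slot 4 becomes DELETED
        System.out.println(t.find(24));      // 5
        System.out.println(t.find(17));      // -1
    }
}
```

Because slot 4 is marked as deleted and not as empty, the search for 24 passes over it and still reaches slot 5.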
14Example 3 Retrieval and Deletion
- Consider the following hash table constructed in Example 2 (index: record):
- 0: (empty)
- 1: Husain
- 2: Yousuf
- 3: (empty)
- 4: (empty)
- 5: Louai
- 6: Ziyad
- 7: Khaled
- 8: Radha
- 9: Amr
- 10: Musab
- 11: Adel
- 12: Husam
- Delete Khaled's record (id 987615), then retrieve the record for Amr and then that of Husam.
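15Example 3 Retrieval and Deletion
- Table after the deletion (index: record; "?" marks the location of the deleted record):
- 0: (empty)
- 1: Husain
- 2: Yousuf
- 3: (empty)
- 4: (empty)
- 5: Louai
- 6: Ziyad
- 7: ?
- 8: Radha
- 9: Amr
- 10: Musab
- 11: Adel
- 12: Husam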
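Example 3 can be traced with a small Java sketch that counts how many locations each search examines. It assumes h(r) = r.id % 13 with linear probing and stores only the ids; the starting array is the table obtained by loading the ten ids of Example 2 in order, and the class `Example3` and the marker values are hypothetical.

```java
class Example3 {
    static final int EMPTY = 0;      // no record id is 0
    static final int DELETED = -1;   // marks the location of a deleted record

    // Returns {slot index or -1 if absent, number of locations examined}.
    static int[] search(int[] table, int id) {
        int n = table.length;
        int home = id % n;
        for (int i = 0; i < n; i++) {
            int pos = (home + i) % n;
            if (table[pos] == id) return new int[] {pos, i + 1};
            if (table[pos] == EMPTY) return new int[] {-1, i + 1};
        }
        return new int[] {-1, n};
    }

    // Replaces the record's id with the DELETED marker.
    static boolean delete(int[] table, int id) {
        int pos = search(table, id)[0];
        if (pos < 0) return false;
        table[pos] = DELETED;
        return true;
    }

    public static void main(String[] args) {
        int[] table = {0, 996321, 994593, 0, 0, 970728, 985926, 987615,
                       980962, 987630, 970876, 986074, 987602};
        delete(table, 987615);                    // Khaled: slot 7 is marked
        int[] amr = search(table, 987630);
        int[] husam = search(table, 987602);
        System.out.println(amr[0] + " after " + amr[1] + " probes");      // 9 after 3 probes
        System.out.println(husam[0] + " after " + husam[1] + " probes");  // 12 after 8 probes
    }
}
```

Had slot 7 been reset to empty instead of marked, both searches would stop at slot 7 and report the records as missing.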
16Exercises
- 1. Given that
- $c(i) = ai$
- for c(i) in linear probing, we discussed that this equation satisfies Property 2 only when a and n are relatively prime. Explain what the requirement of being relatively prime means in simple plain language.
- 2. Consider the general probe sequence
- $h_i(r) = (h(r) + c(i)) \bmod n$.
- Are we sure that if c(i) satisfies Property 2, then $h_i(r)$ will cover all n hash table locations, 0, 1, ..., n-1? Explain.
- 3. Suppose you are given k records to be loaded into a hash table of size n, with k < n, using linear probing. Does the order in which these records are loaded matter for retrieval and insertion? Explain.
- 4. A prime number is always the best choice of a hash table size. Is this statement true or false? Justify your answer either way.