Hash Tables - PowerPoint PPT Presentation


Slides: 34

Provided by: sfu5


Transcript and Presenter's Notes

Title: Hash Tables

1
IAT 355

Hash Tables
Binary Search
Sorting

2
Data Structures

With a collection of data, we often want to do
many things
Organize
Iterate
Add new
Delete old
Search

3
Data Structures

A hash table is built to enable fast searching

What      Linked list   Tree          Hash table
Store     Light         Less light    Medium
Iterate   Simple        Complex       Extra work
Add       O(1)          O(lg N)       O(1)
Delete    O(1)          O(lg N)       O(1)
Search    O(N)          O(lg N)       O(1)

4
Hash Table

An array in which items are not stored consecutively: their place of storage is calculated using the key and a hash function
Hashed key: the result of applying a hash function to a key
Keys and entries are scattered throughout the array

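[Figure: a key is passed through the hash function to give an array index; the table has "key" and "entry" columns, and the values 4, 10, and 123 appear in it]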
5
Hashing

insert: compute location, insert TableNode, O(1)
find: compute location, retrieve entry, O(1)
remove: compute location, set it to null, O(1)

[Figure: table with "key" and "entry" columns; the values 4, 10, and 123 appear in it]
6
Hashing example

10 stock details, 10 table positions
Stock numbers between 0 and 1000
Use hash function: stock no. / 100 (integer division)

position   key   entry
0          85    85, apples
3          323   323, guava
4          462   462, pears
9          912   912, papaya

What if we now insert stock no. 350?
Position 3 is occupied: there is a collision
Collision resolution strategy: insert in the next free position (linear probing)

key   entry
350   350, oranges   (hashes to position 3; stored in the next free position)

Given a stock number, we find stock by using the hash function again, and use the collision resolution strategy if necessary
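
The insert and find steps of this example can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the presenter's code: the class name StockTable and its methods are hypothetical, stock numbers are assumed to lie in 0..999 so that the hash stays inside the 10 positions, probing wraps around at the end of the table, and removal is left out.

```java
/** Hypothetical sketch of the stock example: hash = stockNo / 100, linear probing. */
class StockTable {
    private static final int SIZE = 10;            // 10 table positions
    private final int[] keys = new int[SIZE];
    private final String[] entries = new String[SIZE];
    private final boolean[] used = new boolean[SIZE];

    /** Hash function from the example (integer division); assumes 0 <= stockNo <= 999. */
    static int hash(int stockNo) {
        return stockNo / 100;
    }

    /** Inserts at the hashed position or the next free one; returns the position used. */
    int insert(int stockNo, String entry) {
        int pos = hash(stockNo);
        for (int probes = 0; probes < SIZE; probes++) {
            if (!used[pos] || keys[pos] == stockNo) {
                keys[pos] = stockNo;
                entries[pos] = entry;
                used[pos] = true;
                return pos;
            }
            pos = (pos + 1) % SIZE;                // linear probing; wrap-around is an assumption
        }
        throw new IllegalStateException("table is full");
    }

    /** Applies the hash function again, then probes forward; returns null if absent. */
    String find(int stockNo) {
        int pos = hash(stockNo);
        for (int probes = 0; probes < SIZE && used[pos]; probes++) {
            if (keys[pos] == stockNo) {
                return entries[pos];
            }
            pos = (pos + 1) % SIZE;
        }
        return null;
    }
}
```

If 85, 323, 462, and 912 are inserted first, inserting 350 finds positions 3 and 4 occupied and stores the entry at position 5.
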
7
Hashing performance

The hash function
  Ideally, it should distribute keys and entries evenly throughout the table
  It should minimize collisions, where the position given by the hash function is already occupied
The collision resolution strategy
  Separate chaining: chain together several keys/entries in each position
  Open addressing: store the key/entry in a different position
The size of the table
  Too big will waste memory; too small will increase collisions and may eventually force rehashing (copying into a larger table)
  Should be appropriate for the hash function used; a prime number is best

8
Hash function

Truncation
  Ignore part of the key and use the rest as the array index (converting non-numeric parts)
  A fast technique, but check for an even distribution throughout the table
Folding
  Partition the key into several parts and then combine them in any convenient way
  Unlike truncation, uses information from the whole key
Modular arithmetic (used by truncation, folding, and on its own)
  To keep the calculated table position within the table, divide the position by the size of the table, and take the remainder as the new position

9
Hash Function Examples

Truncation: if students have a 9-digit identification number, take the last 3 digits as the table position
  e.g. 925371622 becomes 622
Folding: split a 9-digit number into three 3-digit numbers, and add them
  e.g. 925371622 becomes 925 + 371 + 622 = 1918
Modular arithmetic: if the table size is 1000, the first example always keeps within the table range, but the second example does not (it should be taken mod 1000)
  e.g. 1918 mod 1000 = 918 (in Java: 1918 % 1000)
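
The three techniques can be sketched in Java for a 9-digit ID as follows. The class and method names are made up for illustration, and the ID is assumed to be a non-negative number of at most 9 digits.

```java
/** Hypothetical helper class illustrating truncation, folding, and modular arithmetic. */
class HashExamples {
    /** Truncation: keep only the last 3 digits of the ID. */
    static int truncate(int id) {
        return id % 1000;
    }

    /** Folding: split the ID into three 3-digit parts and add them. */
    static int fold(int id) {
        int high = id / 1_000_000;       // first 3 digits
        int mid = id / 1000 % 1000;      // middle 3 digits
        int low = id % 1000;             // last 3 digits
        return high + mid + low;
    }

    /** Modular arithmetic: bring a non-negative position into 0..tableSize-1. */
    static int toTableIndex(int position, int tableSize) {
        return position % tableSize;
    }
}
```

For 925371622, truncate gives 622 and fold gives 925 + 371 + 622 = 1918, which toTableIndex(1918, 1000) reduces to 918.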

10
Choosing the table size to minimize collisions

As the number of elements in the table increases,
the likelihood of a collision increases - so make
the table as large as practical
If the table size is 100, and all the hashed keys
are divisible by 10, there will be many
collisions!
Particularly bad if table size is a power of a
small integer such as 2 or 10
More generally, collisions may be more frequent if
greatest common divisor (hashed keys, table size) > 1
Therefore, make the table size a prime number
(gcd = 1)

Collisions may still happen, so we need a
collision resolution strategy
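
The effect of a common divisor can be checked with a small experiment. The sketch below is illustrative only (the class and method names are hypothetical, and key % tableSize is assumed as the hash function): it counts how many distinct table positions a run of evenly spaced keys actually uses.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical demo: how many table positions do evenly spaced keys occupy? */
class TableSizeDemo {
    /** Hashes the keys 0, step, 2*step, ... (count keys in total) with key % tableSize. */
    static int positionsUsed(int step, int count, int tableSize) {
        Set<Integer> positions = new HashSet<>();
        for (int i = 0; i < count; i++) {
            int key = i * step;
            positions.add(key % tableSize);
        }
        return positions.size();
    }
}
```

With 100 keys that are all multiples of 10, a table of size 100 uses only 10 positions, while a table of the prime size 101 uses 100 distinct positions.
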
11
Collision resolution: chaining

Each table position is a linked list
Add the keys and entries anywhere in the list (front easiest)
Advantages over open addressing
  Simpler insertion and removal
  Array size is not a limitation (but should still minimize collisions: make table size roughly equal to expected number of keys and entries)
Disadvantage
  Memory overhead is large if entries are small
No need to change position!
[Figure: table with the values 4, 10, and 123; colliding entries are chained at the same position]
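
A chained table along these lines might look like the following Java sketch. It is a minimal illustration rather than a reference implementation: ChainedTable and Node are hypothetical names, keys are ints, and key mod table size is assumed as the hash function.

```java
import java.util.LinkedList;

/** Hypothetical sketch of separate chaining with int keys and String entries. */
class ChainedTable {
    private static final class Node {
        final int key;
        String entry;
        Node(int key, String entry) { this.key = key; this.entry = entry; }
    }

    private final LinkedList<Node>[] table;     // each table position is a linked list

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        table = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            table[i] = new LinkedList<>();
        }
    }

    /** Assumed hash function: key mod table size (floorMod keeps it non-negative). */
    private LinkedList<Node> chainFor(int key) {
        return table[Math.floorMod(key, table.length)];
    }

    void put(int key, String entry) {
        for (Node n : chainFor(key)) {
            if (n.key == key) { n.entry = entry; return; }   // key already present: update
        }
        chainFor(key).addFirst(new Node(key, entry));        // front is easiest
    }

    String get(int key) {
        for (Node n : chainFor(key)) {
            if (n.key == key) return n.entry;
        }
        return null;
    }

    boolean remove(int key) {
        return chainFor(key).removeIf(n -> n.key == key);    // no other entry has to move
    }
}
```

Removal only unlinks a node from one chain, which is why insertion and removal are simpler here than with open addressing.
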
12
Applications of Hashing

Compilers use hash tables to keep track of declared variables
A hash table can be used for on-line spelling checkers: if misspelling detection (rather than correction) is important, an entire dictionary can be hashed and words checked in constant time
Hash functions can be used to quickly check for inequality: if two elements hash to different values, they must be different
Storing sparse data

13
When to use hashing?

Good if
  Need many searches in a reasonably stable table
Not so good if
  Many insertions and deletions
  Table traversals are needed
  Need things in sorted order
  More data than available memory (use a tree and store leaves on disk)

14
Java

class HashMap
Provides hash table functionality in Java
More overhead, but free implementation
Take care when choosing its parameters
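
As a brief illustration, the stock example could be stored in a java.util.HashMap as sketched below. The class and method names are hypothetical. The two constructor arguments are HashMap's initial capacity and load factor; 16 and 0.75f are its documented defaults and are written out only to show where the tuning parameters go.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo of HashMap holding stock number -> description. */
class StockMapDemo {
    static Map<Integer, String> buildStock() {
        // Initial capacity 16 and load factor 0.75f (the defaults, shown explicitly).
        Map<Integer, String> stock = new HashMap<>(16, 0.75f);
        stock.put(85, "apples");
        stock.put(323, "guava");
        stock.put(462, "pears");
        stock.put(350, "oranges");
        stock.put(912, "papaya");
        return stock;
    }
}
```

Collisions are handled inside HashMap, so get(350) returns "oranges" without the caller doing any probing.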

15
Bucket Sort

For each item to be sorted, compute entryIndex = key / tableSize
Chain entries on collision
Result: each table entry has all the entries in a range of key values
For some problems, this is enough (e.g. collision detection)

[Figure: table with the values 4, 10, and 123]
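
A rough Java sketch of the bucketing step follows. The names are hypothetical, keys are assumed to be non-negative ints no larger than maxKey, and, as an assumption of this sketch, the divisor is a bucket width derived from maxKey so that every index stays inside the table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: distribute keys into range buckets, then optionally sort. */
class BucketDemo {
    static List<List<Integer>> toBuckets(int[] keys, int maxKey, int bucketCount) {
        // Assumption: a width chosen so keys 0..maxKey map to indexes 0..bucketCount-1.
        int bucketWidth = maxKey / bucketCount + 1;
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int key : keys) {
            buckets.get(key / bucketWidth).add(key);   // chain entries on collision
        }
        return buckets;
    }

    /** Full sort: sort each bucket, then concatenate the buckets in index order. */
    static List<Integer> sort(int[] keys, int maxKey, int bucketCount) {
        List<Integer> sorted = new ArrayList<>();
        for (List<Integer> bucket : toBuckets(keys, maxKey, bucketCount)) {
            Collections.sort(bucket);
            sorted.addAll(bucket);
        }
        return sorted;
    }
}
```

When only "which items fall in this range" matters, toBuckets alone is enough and the per-bucket sort can be skipped.
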
16
Bucket Sort

Frequently used in graphics and interactive apps
E.g. one bucket per pixel row
E.g. one bucket for each 64x64 pixel region
Put all data into buckets so that selection (search) can rapidly locate good candidates

17
Search

Frequently wish to organize data to support
search
E.g. search for a single item
E.g. search for all items between 3 and 7

18
Search

Often want to search for an item in a list
In an unsorted list, must search linearly
In a sorted list, can use binary search

19
Binary Search

Start with index pointers at the start and end
Compute the index midway between the two end pointers

20
Binary Search

Compare middle item to search item
If search < mid, move end to mid - 1

21
Binary Search

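int[] Arr = new int[8];
// <populate array in sorted order>
int search = 4;
int start = 0, end = Arr.length - 1, mid;
int found = -1;                      // index of search, or -1 if absent
while (start <= end) {
    mid = (start + end) / 2;         // recompute the midpoint on every pass
    if (search == Arr[mid]) {
        found = mid;                 // SUCCESS
        break;
    }
    if (search < Arr[mid])
        end = mid - 1;
    else
        start = mid + 1;
}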

22
Binary Search

Run Time
O( log(N) )
Every iteration chops list in half
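
The halving can be observed by counting loop iterations. The following Java sketch is illustrative only (the class and method names are hypothetical); it returns both the index found and the number of iterations taken.

```java
/** Hypothetical demo: binary search over a sorted int array, counting iterations. */
class BinarySearchDemo {
    /** Returns {index, iterations}; index is -1 when target is absent. */
    static int[] search(int[] sorted, int target) {
        int start = 0;
        int end = sorted.length - 1;
        int iterations = 0;
        while (start <= end) {
            iterations++;
            int mid = start + (end - start) / 2;   // avoids overflow of start + end
            if (sorted[mid] == target) {
                return new int[] {mid, iterations};
            }
            if (target < sorted[mid]) {
                end = mid - 1;
            } else {
                start = mid + 1;
            }
        }
        return new int[] {-1, iterations};
    }
}
```

For a sorted array of 1024 elements, the loop runs at most 11 times, whether or not the target is present.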

23
Sorting

Need a sorted list to do binary search
Numerous sort algorithms

24
The family of sorting methods
Main sorting themes
  Address-based sorting
    Proxmap Sort
    RadixSort
  Comparison-based sorting
    Transposition sorting: BubbleSort
    Diminishing increment sorting: ShellSort
    Insert and keep sorted: Insertion sort, Tree sort
    Divide and conquer: QuickSort, MergeSort
    Priority queue sorting: Selection sort, Heap sort
25
Bubble sort: transposition sorting

Not a fast sort: O(n^2) comparisons
Code is small

for (int i = arr.length; i > 0; i--)
    for (int j = 1; j < i; j++)
        if (arr[j-1] > arr[j]) {
            int temp = arr[j-1];     // swap the out-of-order neighbours
            arr[j-1] = arr[j];
            arr[j] = temp;
        }

26
Divide and conquer sorting
MergeSort
QuickSort
27
QuickSort: divide and conquer sorting

As its name implies, QuickSort is one of the fastest known sorting algorithms in practice
Its average running time is O(n log n); its worst case is O(n^2)
The idea is as follows
1. If the number of elements to be sorted is 0 or 1, then return
2. Pick any element, v (this is called the pivot)
3. Partition the other elements into two disjoint sets, S1 of elements ≤ v, and S2 of elements > v
4. Return QuickSort(S1) followed by v followed by QuickSort(S2)

28
QuickSort example
5  1  4  2  10  3  9  15  12
Pick the middle element as the pivot, i.e., 10
29
Partitioning example
5  11  4  25  10  3  9  15  12
Pick the middle element as the pivot, i.e., 10
30
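10  4  5  25  11  3  9  15  12    (pivot 10 moved to the front, partitioning in progress)
9  4  5  3  10  25  11  15  12    (pivot 10 moved back to its final position)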
31
Pseudocode for Quicksort

procedure quicksort(array, left, right)
    if right > left
        select a pivot index (e.g. pivotIdx := left)
        pivotIdxNew := partition(array, left, right, pivotIdx)
        quicksort(array, left, pivotIdxNew - 1)
        quicksort(array, pivotIdxNew + 1, right)

32
Pseudo code for partitioning
pivotIdx = middle of array a
swap a[pivotIdx] with a[first]    // Move the pivot out of the way
swapPos = first + 1
for (i = swapPos; i <= last; i++)    // start at swapPos so a[first+1] is examined too
    if (a[i] < a[first])
        swap a[swapPos] with a[i]
        swapPos++
// Now move the pivot back to its rightful place
swap a[first] with a[swapPos-1]
return swapPos-1    // Pivot position
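
Put together, the quicksort and partitioning pseudocode might be written in Java as below. This is a sketch under assumptions, not the presenter's code: the class name is hypothetical, the array holds ints, left/right and first/last are inclusive bounds, and partition picks the middle element as the pivot itself instead of taking a pivot index argument.

```java
/** Hypothetical sketch of in-place quicksort with a middle-element pivot. */
class QuickSortDemo {
    static void quicksort(int[] a, int left, int right) {
        if (right > left) {
            int pivotIdxNew = partition(a, left, right);
            quicksort(a, left, pivotIdxNew - 1);
            quicksort(a, pivotIdxNew + 1, right);
        }
    }

    /** Partitions a[first..last] around its middle element; returns the pivot's final index. */
    static int partition(int[] a, int first, int last) {
        int pivotIdx = first + (last - first) / 2;
        swap(a, pivotIdx, first);                 // move the pivot out of the way
        int swapPos = first + 1;
        for (int i = swapPos; i <= last; i++) {
            if (a[i] < a[first]) {                // smaller than the pivot: move it left
                swap(a, swapPos, i);
                swapPos++;
            }
        }
        swap(a, first, swapPos - 1);              // move the pivot back to its rightful place
        return swapPos - 1;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Calling partition on 5 11 4 25 10 3 9 15 12 (bounds 0 and 8) rearranges it to 9 4 5 3 10 25 11 15 12 and returns 4, the pivot's final index.
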
33
Java

Sort and binary search are provided by the Arrays class
sort(): ints, floats
sort( Object[] a, Comparator c )
  You supply the Comparator object, which contains a function to compare 2 objects
binarySearch(): ints, floats
  Search Objects with a Comparator object
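
For illustration, the java.util.Arrays calls can be used roughly as follows. ArraysDemo, Stock, and BY_NUMBER are hypothetical names introduced for this sketch; Arrays.sort, Arrays.binarySearch, and Comparator.comparingInt are standard library methods.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical demo of Arrays.sort and Arrays.binarySearch. */
class ArraysDemo {
    /** Illustrative object type: a stock number with a name. */
    static final class Stock {
        final int number;
        final String name;
        Stock(int number, String name) { this.number = number; this.name = name; }
    }

    /** The Comparator object: compares two Stock objects by stock number. */
    static final Comparator<Stock> BY_NUMBER = Comparator.comparingInt(s -> s.number);

    /** Sorts the ints in place, then returns the index of target (negative if absent). */
    static int findInt(int[] values, int target) {
        Arrays.sort(values);
        return Arrays.binarySearch(values, target);
    }

    /** Sorts the objects in place by number, then searches with the same Comparator. */
    static int findByNumber(Stock[] items, int number) {
        Arrays.sort(items, BY_NUMBER);
        return Arrays.binarySearch(items, new Stock(number, ""), BY_NUMBER);
    }
}
```

binarySearch assumes the array is already sorted in the same order used for the search, which is why both helpers sort first and pass the same Comparator to both calls.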
