Dynamic Set ADT; Dynamic Set Dictionary

Description:

Put all elements that hash to the same slot in an unsorted doubly-linked list, ... Selecting a hash function: Division method. Multiplication method. Gerda Kamberova ...

Slides: 21
Provided by: csHof
Learn more at: https://cs.hofstra.edu


1
Dynamic Set ADT; Dynamic Set Dictionary
  • Definitions of Dynamic Set and Dictionary
  • Implementation of Dictionary with lists and
    arrays
  • Implementation with a direct-access table (DAT)
  • Implementation with hash table using collision
    resolution by chaining
  • Complexity of the operations under simple uniform
    hashing
  • Selecting a hash function
  • Applications

2
Dynamic Sets ADT
  • Goal: investigate data structures and algorithms
    that support efficient implementation of various
    operations on sets.
  • Dynamic sets may change size over time.
  • Key: the identifier of an element.
  • Operations
    • Search
    • Insert
    • Delete
    • Min
    • Max
    • Predecessor
    • Successor

3
Dictionary: Go look it up!
  • Primary use: store data so that they can be
    located quickly using keys.
  • Examples of dictionaries
    • the set of bank accounts
    • the set of windows opened by a GUI
    • a student database
    • the symbol table used by compilers
  • Dictionary ADT: a dynamic set that supports
    Search, Insert, Delete, and possibly Update.
  • Hash table: a data structure that implements
    Dictionary
    • worst-case time to perform the operations is
      O(n)
    • expected time is O(1)

4
Dictionary ADT
  • Dictionary of records, T.
  • Each record has the form (key, data).
  • Keys are distinct.
  • x is a reference to a dictionary record.
  • Operations
    • insert(T, x): inserts the record pointed to by x
      into T
    • delete(T, x): removes the record pointed to by x
      from T
    • search(T, key): returns a pointer to the record
      with the given key, or null if no record has
      that key
    • update(T, oldrec, newrec): replaces the record
      pointed to by oldrec with the record pointed to
      by newrec
  • Example
    • x = search(T, key)
    • delete(T, x)

[Figure: a record x with the fields key and data]
5
Implementation of Dictionary
  • Input size: n, the number of records in the dictionary
  • Worst-case complexity using lists and arrays

                      Insert T(n)  Search T(n)  Delete T(n)  Update T(n)  Space complexity
Unsorted dbl-linked
Sorted dbl-linked
Sorted array
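
As an illustration of the sorted-array row, here is a minimal Python sketch. It is not taken from the slides: the class name SortedArrayDict and its method signatures are assumptions made for this example. It shows the usual trade-off for a sorted array: binary search finds a key in O(log n) comparisons, while insertion and deletion shift elements and so cost O(n) in the worst case.

```python
import bisect


class SortedArrayDict:
    """Dictionary kept as two parallel lists sorted by key (illustrative only)."""

    def __init__(self):
        self._keys = []
        self._data = []

    def _find(self, key):
        # Binary search for the leftmost position where key could be placed.
        i = bisect.bisect_left(self._keys, key)
        found = i < len(self._keys) and self._keys[i] == key
        return i, found

    def search(self, key):
        # O(log n) comparisons.
        i, found = self._find(key)
        return self._data[i] if found else None

    def insert(self, key, data):
        # Locating the position is O(log n), but list.insert shifts the
        # elements after it, so the whole operation is O(n) in the worst case.
        i, found = self._find(key)
        if found:
            self._data[i] = data  # keys are distinct: overwrite
        else:
            self._keys.insert(i, key)
            self._data.insert(i, data)

    def delete(self, key):
        # Removing from the middle also shifts elements: O(n) worst case.
        i, found = self._find(key)
        if found:
            del self._keys[i]
            del self._data[i]
```

The two parallel lists keep the sketch short; a real implementation might store (key, data) records in a single array instead.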
6
Implementation of Dictionary: DAT
  • Direct-access table (DAT), T
  • The datum or reference corresponding to key k is
    stored in slot k.
  • If T(k) = NULL, there is no record with key k.
  • Example, r = 5

[Figure: direct-access table T with slots 0 to 4, indexed by keys drawn from the universe U]
7
DAT: Complexity of the operations
  • DirectAddress_Search(T, key k)
    • return T[k]
  • DirectAddress_Insert(T, ptr x to element)
    • T[key(x)] = x
  • DirectAddress_Delete(T, ptr x to element)
    • T[key(x)] = NULL
  • Each operation takes O(1) time.
  • Suitable for moderate r, r < 1000.
  • What if the number of keys, n, stored at any
    particular time is much smaller than r?
  • Example: student dictionary, r = 10^9, n = 4000.
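
The three direct-address operations can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the Record class and the method names are assumptions, and r is taken here to be the number of slots (keys are integers in 0..r-1).

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Record:
    key: int   # integer key in the range 0..r-1
    data: Any


class DirectAddressTable:
    def __init__(self, r: int):
        # One slot per possible key; None plays the role of NULL.
        self.slots: List[Optional[Record]] = [None] * r

    def search(self, k: int) -> Optional[Record]:
        return self.slots[k]          # T[k]

    def insert(self, x: Record) -> None:
        self.slots[x.key] = x         # T[key(x)] = x

    def delete(self, x: Record) -> None:
        self.slots[x.key] = None      # T[key(x)] = NULL


table = DirectAddressTable(5)
rec = Record(3, "three")
table.insert(rec)
print(table.search(3) is rec)   # True
print(table.search(2))          # None
```

Each operation is a single array access, which is why it runs in constant time. The cost is space: the table always holds r slots, so with r = 10^9 and n = 4000 almost every slot would stay empty.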

8
Hash Table
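  • A version of DAT where the item with key k is
    stored at slot h(k)
  • The keys do not have to be integers
  • h is a hash function that maps keys to integers,
    h: U → {0, 1, ..., m-1}
  • Key k hashes to slot h(k)

[Figure: hash function h mapping keys from the universe U to slots 0, 1, ..., m-1 of the table T]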
9
Hash Tables
  • Hash tables are typically one of the most
    efficient ways of implementing a Dictionary ADT,
    particularly if we know something about the
    distribution of the key values.
  • Hash tables do not efficiently support operations
    that rely on the relative order of data elements,
    for example finding the min or max, or sorting.
  • Since |U| > m, h hashes multiple keys to the
    same slot.
  • A collision occurs when two keys hash to the
    same slot.
  • Collisions cannot be avoided; they must be
    resolved.

10
Collision resolution by chaining
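  • Put all elements that hash to the same slot in an
    unsorted doubly-linked list, where the hash table
    entry, T(h(k)), is a pointer to the first item
    in the list

[Figure: hash function h mapping keys from U to table slots, each slot pointing to a linked list of the elements that collide there]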
11
Implementation of a Dictionary using hashing with
collision resolution by chaining
  • T is the hash table, x is a ptr to a record
  • key(x) returns the key of the record pointed to by x
  • ChainedHash_Search(T, k)
    • search for key k in the unsorted list T[h(k)]
  • ChainedHash_Insert(T, x)
    • insert the record pointed to by x at the head
      of the list T[h(key(x))]
  • ChainedHash_Delete(T, x)
    • delete x from the doubly-linked list
      T[h(key(x))]
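
The three chained-hash operations might look like this in Python. It is a sketch under stated assumptions: the Node and ChainedHashTable names are made up for this example, and the hash function is simply Python's built-in hash reduced modulo m, which is a placeholder rather than a function from the slides.

```python
class Node:
    """A dictionary record that is also a doubly-linked list node."""

    def __init__(self, key, data):
        self.key = key
        self.data = data
        self.prev = None
        self.next = None


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.table = [None] * m   # table[j] points to the head of the list at slot j

    def _h(self, key):
        # Placeholder hash function (an assumption for this sketch).
        return hash(key) % self.m

    def search(self, k):
        # Walk the unsorted list at slot h(k) until the key is found.
        x = self.table[self._h(k)]
        while x is not None and x.key != k:
            x = x.next
        return x

    def insert(self, x):
        # Insert at the head of the list: no traversal needed.
        j = self._h(x.key)
        head = self.table[j]
        x.prev = None
        x.next = head
        if head is not None:
            head.prev = x
        self.table[j] = x

    def delete(self, x):
        # x is a reference to the node itself, so unlinking needs no search.
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.table[self._h(x.key)] = x.next
        if x.next is not None:
            x.next.prev = x.prev
        x.prev = x.next = None
```

The prev pointers are what allow delete to unlink a node without walking the list; with a singly-linked list, delete would first have to find the predecessor of x.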

12
Time complexity analysis of hashing with
collision resolution by chaining
  • Computing h(k): O(1)
  • Insert: O(1) worst case
  • Search: O(n) worst case, when all keys hash to
    the same slot
  • Delete: O(1)
  • Update: O(1) (including delete and reinsert
    if the key changes)
  • Load factor α = n/m, the average
    number of keys per slot
    • m: size of the hash table
    • n: number of records currently in the dictionary

13
Average-case time complexity using hashing with
collision resolution by chaining
  • Assume simple uniform hashing: any key k is
    equally likely to hash to any of the m slots of
    the table T, independently of the other keys.
  • Search: the expected time to search is the time
    to hash the key, plus the expected length of the
    list at the hashed slot.
  • Let T(j) be the linked list at slot j.
  • Let $n_j$ be the R.V. denoting the length of T(j),
    j = 1, ..., m.
  • What is the distribution of $n_j$? For
    i = 0, 1, 2, ..., n,
    $\Pr(n_j = i) = \binom{n}{i} \left(\frac{1}{m}\right)^{i} \left(1 - \frac{1}{m}\right)^{n-i}$

14
Average-case time complexity using hashing with
collision resolution by chaining
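  • Search (cont.): T(j) is the linked list at slot
    j, and $n_j$ is the R.V. denoting the length of T(j),
    j = 1, ..., m. The distribution of $n_j$ is binomial:
    $\Pr(n_j = i) = \binom{n}{i} \left(\frac{1}{m}\right)^{i} \left(1 - \frac{1}{m}\right)^{n-i}$, i = 0, 1, ..., n
  • The expected length is $E(n_j) = n \cdot \frac{1}{m} = \frac{n}{m}$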
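
The expected list length can also be obtained without the full distribution, by linearity of expectation. The short derivation below is a standard argument rather than something recovered from the slides; the symbols $n_j$ (length of the list T(j)) and $X_t$ (indicator that the t-th stored key hashes to slot j) are notation assumed for this sketch.

```latex
\begin{align*}
n_j &= \sum_{t=1}^{n} X_t,
  \qquad
  X_t = \begin{cases}
    1 & \text{if the $t$-th key hashes to slot } j \\
    0 & \text{otherwise}
  \end{cases} \\
E(X_t) &= \Pr(X_t = 1) = \frac{1}{m}
  \qquad \text{(simple uniform hashing)} \\
E(n_j) &= \sum_{t=1}^{n} E(X_t) = \frac{n}{m}
\end{align*}
```

This is the mean of a binomial random variable with parameters n and 1/m.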

15
Average-case time complexity using hashing with
collision resolution by chaining
  • Assumed simple uniform hashing
  • Search: the expected time to search is the
    time to hash the key,
  • O(1), plus the expected length of T(j), n/m
  • The average-case complexity for search is
    O(1 + α), where α = n/m is the load factor,
  • i.e. one plus the avg number of keys per slot
  • If n = O(m), then α = n/m = O(1) and search
    takes O(1) time on average
  • Insert, Delete, Update: O(1)
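
A small simulation can make the load factor concrete. The sketch below is an illustration only: the function name chain_lengths and the chosen values of n and m are assumptions, and random.Random.randrange stands in for an idealized hash function that satisfies simple uniform hashing.

```python
import random


def chain_lengths(n, m, seed=0):
    """Throw n keys into m slots uniformly at random; return the list lengths."""
    rng = random.Random(seed)       # seeded so that runs are repeatable
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1
    return lengths


n, m = 4000, 1000
lengths = chain_lengths(n, m)
print("load factor n/m:      ", n / m)
print("average list length:  ", sum(lengths) / m)
print("longest list observed:", max(lengths))
```

The average list length always equals n/m, because every key lands in exactly one list. The longest list is typically noticeably longer than the average, which is why the O(1 + α) bound is an expected-case statement and not a worst-case one.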

16
Selecting a hash function (HF)
  • A good HF should not give preference to some
    slots over others
  • HF should distribute keys uniformly, i.e. a key
    is equally likely to hash to any of the slots:
    simple uniform hashing (SUH)
  • If the keys are drawn from U according to
    distribution P, and K is the RV representing the
    key drawn, then for SUH
    $\Pr(h(K) = j) = \frac{1}{m}$, j = 0, 1, ..., m-1    (1)
  • Example
    • U = [0, 1)
    • P is the uniform distribution over [0, 1); an h
      that achieves SUH is given by $h(k) = \lfloor k m \rfloor$
    • For m = 100, h(0.5) = 50, h(0.25) = 25, ...
  • The problem is that we do not know the key
    distribution P, so we cannot check (1). In
    practice, heuristics are used to derive HF.
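
The example for keys in [0, 1) is consistent with the hash function h(k) = floor(k * m), which the following sketch assumes. The function name h_unit and the guard against floating-point rounding are additions for this example.

```python
import math


def h_unit(k, m):
    """Map a key k in [0, 1) to a slot in 0..m-1 using floor(k * m)."""
    if not 0 <= k < 1:
        raise ValueError("key must lie in [0, 1)")
    # min() guards against k * m rounding up to m in floating point.
    return min(math.floor(k * m), m - 1)


print(h_unit(0.5, 100))    # 50
print(h_unit(0.25, 100))   # 25
```

Each slot j receives exactly the keys in the interval [j/m, (j+1)/m), which has length 1/m. Under a uniform distribution on [0, 1), every slot is therefore equally likely.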

17
Selecting a hash function
  • Regularity condition: the hash function should be
    independent of any patterns in the data. Similar
    keys should not hash to similar or close slots.
  • Assume keys are natural numbers; if not, we
    re-map them to the naturals.
  • Division method for selecting HF: $h(k) = k \bmod m$
    • HF should depend on the complete data (all
      bits), so m should not be a power of 2 or 10. Why?
    • Example: with m = 10000, the keys 376218705 and
      593598705 both hash to 8705, since only the
      last four digits are used.
    • Best: select m to be prime, not close to powers
      of 2 or 10
  • Given n,
    • decide what load factor (avg search time) the
      application can tolerate
    • select m to be prime, close to n/(load factor)
      and not close to a power of 2
    • run experiments and test that h achieves SUH
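
The division method and the table-size recipe can be sketched as follows. The helper names h_div, is_prime, and choose_table_size are hypothetical, and choose_table_size only finds the smallest prime at or above n / load_factor; it does not check how close that prime is to a power of 2, which would still need to be inspected by hand.

```python
import math


def h_div(k, m):
    """Division method: the remainder of k divided by m."""
    return k % m


def is_prime(x):
    """Trial division; adequate for table sizes of modest magnitude."""
    if x < 2:
        return False
    i = 2
    while i * i <= x:
        if x % i == 0:
            return False
        i += 1
    return True


def choose_table_size(n, load_factor):
    """Smallest prime m with m >= n / load_factor (illustrative heuristic)."""
    m = max(2, math.ceil(n / load_factor))
    while not is_prime(m):
        m += 1
    return m


# With m = 10000 only the last four decimal digits of the key matter,
# so these two different keys collide.
print(h_div(376218705, 10000), h_div(593598705, 10000))   # 8705 8705

# With m a power of 2, only the low-order bits of the key matter.
print(h_div(0b10110101, 16) == 0b0101)                    # True
```

A prime modulus makes the slot depend on all digits of the key, which is the property the slide asks for.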

18
Selecting a hash function
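  • Multiplication method
    • Define $h(k) = \lfloor m \, (k A \bmod 1) \rfloor$, where
      0 < A < 1 is a constant and k A mod 1 is the
      fractional part of k A
    • Advantage: the value of m is not critical
    • Good choice of A: $A \approx (\sqrt{5} - 1)/2 \approx 0.618$
    • Usually, m is a power of 2 (easy to implement,
      multiplication by a power of 2 is a shift)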
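
A floating-point sketch of the multiplication method is shown below. It assumes the usual textbook form h(k) = floor(m * frac(k * A)) with A = (sqrt(5) - 1) / 2; the function name h_mul is made up for this example, and production code would normally use integer arithmetic with shifts in place of floats.

```python
import math

# A commonly suggested constant, about 0.618 (an assumption for this sketch).
A = (math.sqrt(5) - 1) / 2


def h_mul(k, m, a=A):
    """Multiplication method: scale the fractional part of k * a by m."""
    frac = (k * a) % 1.0          # fractional part of k * a
    return math.floor(m * frac)   # slot in 0..m-1


print(h_mul(123456, 2**14))   # 67
```

Here k * A is about 76300.0041, its fractional part is about 0.0041, and multiplying by m = 16384 gives about 67.4, so the key lands in slot 67.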

19
ADT Dictionary
  • Hashing: the preferred way to implement
    Dictionary
    • Achieves O(1) avg time to search when the load
      factor is close to 1 (> 0.8 as a rule of thumb)
    • O(1) time to insert, delete, update
  • Collision resolution: how to deal with keys that
    hash to the same slot.
    • Collision resolution by chaining maintains
      unsorted doubly linked lists at the slots
      (overhead for maintaining the lists)
  • Selecting a hash function
    • Division method
    • Multiplication method

20
Applications of hashing
  • Compilers use hashing in symbol table
    implementation (to keep track of defined
    variables).
  • Graph problems where nodes are identified by
    names instead of numbers.
  • Game-playing software, to keep a transposition
    table (of already encountered lines of play).
  • On-line spell checkers (without error
    correction). The whole dictionary is pre-hashed
    and words can be checked in constant time.
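
The spell-checker application can be illustrated with Python's built-in set, which CPython implements as a hash table. The word list and the function name misspelled are toy assumptions for this sketch; a real checker would load a full dictionary file and handle punctuation.

```python
# Toy word list (hypothetical); a real checker would load thousands of words.
WORDS = {"hash", "table", "dictionary", "key", "slot"}


def misspelled(text, words=WORDS):
    """Return the words of text that are not in the pre-hashed dictionary."""
    # Each membership test is an expected constant-time hash lookup.
    return [w for w in text.lower().split() if w not in words]


print(misspelled("Hash tabel key"))   # ['tabel']
```

The set is built once, so the cost of hashing the whole dictionary is paid up front and each later lookup takes expected O(1) time.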