Symbol Tables - PowerPoint PPT Presentation

Slides: 23
Provided by: vdou8

Transcript and Presenter's Notes

Title: Symbol Tables


1
Symbol Tables
  • Symbol tables are used by compilers to keep track
    of information about
  • variables
  • functions
  • class names
  • type names
  • temporary variables
  • etc.
  • Typical symbol table operations are Insert,
    Delete and Search
  • It's a dictionary structure!

2
Symbol Tables
  • What kind of information is usually stored in a
    symbol table?
  • type
  • storage class
  • size
  • scope
  • stack frame offset
  • register
  • We also need a way to keep track of reserved
    words.
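
As a rough illustration of the two slides above, a symbol table can be sketched as a dictionary from identifier names to records of attributes. The class and field names here (SymbolInfo, SymbolTable, frame_offset, and so on) are hypothetical and chosen only to mirror the bullet points; a real compiler would store more and handle nested scopes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolInfo:
    """Attributes a compiler might record for one identifier (hypothetical layout)."""
    type_name: str                      # e.g. "int"
    storage_class: str = "auto"         # e.g. "auto", "static", "extern"
    size: int = 0                       # size in bytes
    scope: int = 0                      # nesting depth of the declaring block
    frame_offset: Optional[int] = None  # stack frame offset, if stored on the stack
    register: Optional[str] = None      # register name, if kept in a register


class SymbolTable:
    """Dictionary-style table supporting insert, search, and delete."""

    def __init__(self) -> None:
        self._entries = {}              # maps identifier name -> SymbolInfo

    def insert(self, name: str, info: SymbolInfo) -> None:
        self._entries[name] = info

    def search(self, name: str) -> Optional[SymbolInfo]:
        return self._entries.get(name)  # None if the name is unknown

    def delete(self, name: str) -> None:
        self._entries.pop(name, None)   # deleting a missing name is a no-op


table = SymbolTable()
table.insert("count", SymbolInfo("int", size=4, scope=1, frame_offset=-8))
```

Python's built-in dict is itself a hash table, which matches the "most common implementation" mentioned on the next slide.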

3
Symbol Tables
  • Where is a symbol table stored?
  • array/linked list
  • simple, but linear lookup time
  • However, we may use a sorted array for reserved
    words, since they are generally few and known in
    advance.
  • balanced tree
  • O(lg n) lookup time
  • hash table
  • most common implementation
  • O(1) expected time for dictionary operations
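
The sorted-array option for reserved words can be sketched with a binary search. The word list below is a small illustrative set, not the full keyword list of any particular language.

```python
from bisect import bisect_left

# Illustrative reserved words, kept sorted so binary search applies.
RESERVED_WORDS = sorted(["else", "for", "if", "int", "return", "while"])


def is_reserved(word: str) -> bool:
    """Binary search in the sorted array: O(lg n) comparisons."""
    i = bisect_left(RESERVED_WORDS, word)
    return i < len(RESERVED_WORDS) and RESERVED_WORDS[i] == word
```

Since the set is fixed and known in advance, the array never needs insertions, so the linear insertion cost of a sorted array does not matter here.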

4
Hashing
  • Hash tables
  • use array of size m to store elements
  • given key k (the identifier name), use a function
    h to compute index h(k) for that key
  • collisions are possible
  • two keys hash into the same slot.
  • Hash functions
  • A good hash function
  • is easy to compute
  • avoids collisions (by breaking up patterns in the
    keys and uniformly distributing the hash values)

5
Hashing
  • In the following slides
  • k is a key
  • h(k) is the hash function
  • m is the size of the hash table
  • n is the number of keys in the hash table

6
Hashing
  • What makes a good hash function?
  • It is easy to compute
  • It minimizes collisions.
  • hash = to chop into small pieces
    (Merriam-Webster) = to chop up any patterns in
    the keys so that the results are uniformly
    distributed (cs311)

7
Hashing
  • When the key is a string, we generally use the
    ASCII values of its characters in some way
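  • Examples for $k = c_1 c_2 c_3 \ldots c_x$:
  • $h(k) = (c_1 \cdot 128^{x-1} + c_2 \cdot 128^{x-2} + \cdots + c_x \cdot 128^{0}) \bmod m$
  • $h(k) = (c_1 + c_2 + \cdots + c_x) \bmod m$
  • $h(k) = (h_1(c_1) + h_2(c_2) + \cdots + h_x(c_x)) \bmod m$, where
    each $h_i$ is an independent hash function.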
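
The first two string hashes can be sketched as follows, assuming ASCII keys so that every character code is below 128. The function names are hypothetical. The radix-128 version uses Horner's rule and reduces mod m at each step, which gives the same result as evaluating the full polynomial and reducing once.

```python
def radix128_hash(key: str, m: int) -> int:
    """Treat the key as a base-128 number and reduce it mod m (Horner's rule)."""
    h = 0
    for ch in key:
        h = (h * 128 + ord(ch)) % m   # reducing at each step keeps h small
    return h


def additive_hash(key: str, m: int) -> int:
    """Sum the character codes and reduce mod m."""
    return sum(ord(ch) for ch in key) % m
```

The additive version ignores character order, so anagrams such as "ab" and "ba" always collide; the radix-128 version distinguishes them.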

8
Hash functions
  • Truncation
  • Ignore part of the key and use the remaining part
    directly as the index.
  • Example: if the keys are 8-digit numbers and the
    hash table has 1000 entries, then the first,
    fourth and eighth digits could form the hash
    value.
  • Not a very good method: it does not distribute
    keys uniformly.
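
A minimal sketch of the truncation example, assuming keys are non-negative integers of at most 8 digits (shorter keys are zero-padded). The function name is hypothetical.

```python
def truncation_hash(key: int) -> int:
    """Use the 1st, 4th and 8th digits of an 8-digit key as a 3-digit index."""
    digits = f"{key:08d}"               # pad so every key has exactly 8 digits
    return int(digits[0] + digits[3] + digits[7])
```

Keys that agree in those three positions collide no matter how the other five digits differ: 62538194 and 62938174 both map to 634.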

9
Hash functions
  • Folding
  • Break up the key in parts and combine them in
    some way.
  • Example: if the keys are 9-digit numbers, break
    up a key into three 3-digit numbers and add them
    up.
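
The folding example might look like this. The final reduction mod m is an added assumption, since the sum of three 3-digit numbers can reach 2997 and would not fit a table of 1000 slots; the function name is hypothetical.

```python
def folding_hash(key: int, m: int = 1000) -> int:
    """Split a 9-digit key into three 3-digit parts, add them, reduce mod m."""
    digits = f"{key:09d}"                                   # zero-pad to 9 digits
    parts = [int(digits[i:i + 3]) for i in range(0, 9, 3)]  # e.g. [123, 456, 789]
    return sum(parts) % m
```

For the key 123456789 the parts are 123, 456 and 789, which sum to 1368, so the index is 368.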

10
Hash functions
  • Middle square
  • Compute $k^2$ and pick some digits from the
    resulting number.
  • Example: given a 9-digit key k and a hash table
    of size 1000, pick three digits from the middle of
    the number $k^2$.
  • Works fairly well in practice if the keys do not
    have many leading or trailing zeroes.
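
A sketch of the middle-square example for 9-digit keys. Which three digits count as "the middle" is a choice; this version zero-pads the square to 18 digits and takes the 8th through 10th, which is one reasonable assumption.

```python
def middle_square_hash(key: int) -> int:
    """Square a 9-digit key and take three digits from the middle of the result."""
    squared = f"{key * key:018d}"   # a 9-digit key squared has at most 18 digits
    return int(squared[7:10])       # three digits near the middle
```

For k = 123456789 the square is 15241578750190521 and the selected digits are 787. A key with many trailing zeroes, such as 100000000, produces a square whose middle digits are all zero, which illustrates the caveat on the slide.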

11
Hash functions
  • Division
  • $h(k) = k \bmod m$
  • Fast
  • Not all values of m are suitable for this. For
    example, powers of 2 should be avoided because
    then k mod m is just the least significant bits
    of k.
  • Good values for m are prime numbers.
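
The effect of the table size can be seen with a few keys that differ only in their high-order bits. The keys and table sizes below are illustrative.

```python
def division_hash(key: int, m: int) -> int:
    """Division method: the remainder of the key divided by the table size."""
    return key % m


# Three keys with identical low 4 bits (0100) and different high bits.
keys = [0b1011_0100, 0b0111_0100, 0b1111_0100]   # 180, 116, 244

power_of_two = [division_hash(k, 16) for k in keys]  # only the low 4 bits survive
prime = [division_hash(k, 13) for k in keys]         # all bits influence the result
```

With m = 16 all three keys land in slot 4, while with the prime m = 13 they land in slots 11, 12 and 10.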

12
Hash functions
  • Multiplication
  • $h(k) = \lfloor m \, (k \cdot c - \lfloor k \cdot c \rfloor) \rfloor$, $0 < c < 1$
  • In English
  • Multiply the key k by a constant c, $0 < c < 1$
  • Take the fractional part of $k \cdot c$
  • Multiply that by m
  • Take the floor of the result
  • The value of m does not make a difference
  • Some values of c work better than others
  • A good value is $c = (\sqrt{5} - 1)/2 \approx 0.618$

13
Hash functions
  • Multiplication
  • Example
  • Suppose the size of the table, m, is 1301.
  • For k = 1234, h(k) = 850
  • For k = 1235, h(k) = 353
  • For k = 1236, h(k) = 1157
  • For k = 1237, h(k) = 660
  • For k = 1238, h(k) = 164
  • For k = 1239, h(k) = 968
  • For k = 1240, h(k) = 471

nice distribution!
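
The multiplication method can be sketched as below. The constant c is assumed to be (sqrt(5) - 1)/2, about 0.618, a commonly suggested value; the function name is hypothetical, and floating-point arithmetic is used for simplicity.

```python
import math

C = (math.sqrt(5) - 1) / 2   # assumed constant, roughly 0.618


def multiplication_hash(key: int, m: int, c: float = C) -> int:
    """Multiply by c, keep the fractional part, scale by m, take the floor."""
    frac = (key * c) % 1.0           # fractional part of k * c
    return math.floor(m * frac)


# Hash the consecutive keys from the example with a table of size 1301.
for k in range(1234, 1241):
    print(k, multiplication_hash(k, 1301))
```

With m = 1301 this gives 850 for k = 1234 and 353 for k = 1235: consecutive keys are scattered across the table rather than landing in adjacent slots.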
14
Hash functions
  • Universal Hashing
  • Worst-case scenario: the chosen keys all hash to
    the same slot. This can be avoided if the hash
    function is not fixed:
  • Start with a collection of hash functions
  • Select one at random and use that.
  • Good performance on average: the probability that
    the randomly chosen hash function exhibits the
    worst-case behavior is very low.
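
One way to sketch the idea is with the family h(k) = ((a*k + b) mod p) mod m, where p is a prime larger than any key and a, b are chosen at random. The slides do not say which collection of functions is meant, so this family, the prime P, and the function name are assumptions made for illustration.

```python
import random

P = 2_147_483_647   # the prime 2**31 - 1; keys are assumed to be smaller than P


def random_hash_function(m: int, rng: random.Random):
    """Pick h(k) = ((a*k + b) mod P) mod m with random a in [1, P-1], b in [0, P-1]."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)

    def h(key: int) -> int:
        return ((a * key + b) % P) % m

    return h


# Select one function at random when the table is created, then keep using it.
h = random_hash_function(m=1000, rng=random.Random())
```

Because a and b are drawn when the table is created, no fixed set of keys can be bad for every run.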

15
Load factor
  • Given a hash table of size m, and n elements
    stored in it, we define the load factor of the
    table as $\alpha = n/m$
  • The load factor gives us an indication of how
    full the table is.
  • The possible values of the load factor depend on
    the method we use for resolving collisions.

16
Resolving collisions: Chaining
  • Chaining
  • Put all the elements that collide in a chain
    (list) attached to the slot.
  • The hash table is an array of linked lists
  • The load factor indicates the average number of
    elements stored in a chain. It could be less
    than, equal to, or larger than 1.

a.k.a. closed addressing
17
Resolving collisions: Chaining
  • Insert/Delete/Lookup in expected O(1) time
  • Keep the list doubly-linked to facilitate
    deletions
  • The expected O(1) bound assumes that the chains
    are kept short; the worst-case lookup time is
    linear in the number of keys.
  • If the chains start becoming too long, the table
    must be enlarged and all the keys rehashed.
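
A minimal sketch of chaining with rehashing follows. Python lists stand in for the linked chains, so deletion here costs time proportional to the chain length rather than O(1); the max_load threshold and the doubling policy are assumptions, not something the slides specify.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value) pairs."""

    def __init__(self, m: int = 8, max_load: float = 2.0) -> None:
        self.m = m                        # number of slots
        self.n = 0                        # number of stored keys
        self.max_load = max_load          # hypothetical threshold for enlarging
        self.slots = [[] for _ in range(m)]

    def _index(self, key) -> int:
        return hash(key) % self.m

    def load_factor(self) -> float:
        return self.n / self.m            # average chain length

    def search(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

    def insert(self, key, value) -> None:
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)   # key already present: overwrite
                return
        chain.append((key, value))
        self.n += 1
        if self.load_factor() > self.max_load:
            self._rehash(2 * self.m)      # chains too long: enlarge the table

    def delete(self, key) -> None:
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return

    def _rehash(self, new_m: int) -> None:
        old_items = [item for chain in self.slots for item in chain]
        self.m = new_m
        self.slots = [[] for _ in range(new_m)]
        for k, v in old_items:            # every key gets a new slot index
            self.slots[self._index(k)].append((k, v))
```

Doubling the table whenever the load factor exceeds the threshold keeps the average chain length bounded by a constant.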

18
Resolving collisions: Chaining
  • Assumption: simple uniform hashing
  • any given key is equally likely to hash into any
    of the m slots
  • Analysis of unsuccessful search
  • average time to search unsuccessfully for key k
    = the average time to search to the end of a chain.
  • The average length of a chain is $\alpha$.
  • Total (average) time required: $\Theta(1 + \alpha)$

19
Resolving collisions: Chaining
  • Analysis of successful search
  • Expected number e of elements examined during a
    successful search for key k = one more than
    the expected number of elements examined when k
    was inserted.
  • It makes no difference whether we insert at the
    beginning or the end of the list.
  • Take the average, over the n items in the table,
    of 1 plus the expected length of the chain to
    which the i-th element was added.
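
The averaging step can be written out as follows, assuming simple uniform hashing, so that the expected length of the chain receiving the i-th element is (i - 1)/m, and writing the load factor as n/m. This is a sketch of the standard derivation, not necessarily the exact form used on the original slide.

$$
e = \frac{1}{n} \sum_{i=1}^{n} \left( 1 + \frac{i-1}{m} \right)
  = 1 + \frac{1}{nm} \sum_{i=1}^{n} (i-1)
  = 1 + \frac{n-1}{2m}
  = 1 + \frac{\alpha}{2} - \frac{1}{2m}, \qquad \alpha = \frac{n}{m}
$$

Adding the constant time needed to compute the hash value, the total is $\Theta(2 + \alpha/2 - 1/(2m)) = \Theta(1 + \alpha)$.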

20
Resolving collisions: Chaining
Total time: $\Theta(1 + \alpha)$
21
Resolving collisions: Chaining
  • Both types of search take $\Theta(1 + \alpha)$ time on
    average.
  • If $n = O(m)$, then $\alpha = O(1)$ and the total time for
    Search is O(1) on average
  • Insert: O(1) in the worst case
  • Delete: O(1) in the worst case (given a pointer
    to the element in a doubly-linked chain)

22
Resolving collisions: Chaining
  • Storage for the elements may be allocated and
    deallocated within the hash table itself by
    linking all unused slots into a free list.
  • Insert
  • If key k hashes into an empty slot h(k), put it
    there and set a flag to indicate that this is the
    actual position to which the element hashed.
  • If h(k) is not empty, and the element k1 it
    contains has its flag set, then use a slot off
    the free list to store k and link it into the
    chain that starts at h(k). Its flag should be
    unset.
  • If h(k) is not empty, and the element k1 it
    contains has its flag unset, then move k1 to
    another slot (updating the chain that contains
    k1) and store k in h(k) with its flag set.
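
The three insert cases can be sketched as below for integer keys with h(k) = k mod m. The class and attribute names are hypothetical. For brevity the free list is a plain Python list, so removing a specific slot from it takes linear time, and duplicate keys and deletion are not handled.

```python
class InTableChainedHash:
    """Chaining with every element stored inside the table itself."""

    def __init__(self, m: int) -> None:
        self.m = m
        self.key = [None] * m         # stored key, or None if the slot is unused
        self.next = [-1] * m          # index of the next slot in the chain, -1 = end
        self.home = [False] * m       # flag: True if the key in this slot hashed here
        self.free = list(range(m))    # free list of unused slots

    def _h(self, key: int) -> int:
        return key % self.m

    def _take_free(self) -> int:
        if not self.free:
            raise OverflowError("hash table is full")
        return self.free.pop()

    def insert(self, key: int) -> None:
        i = self._h(key)
        if self.key[i] is None:                    # case 1: h(k) is empty
            self.free.remove(i)
            self.key[i], self.next[i], self.home[i] = key, -1, True
        elif self.home[i]:                         # case 2: occupant hashed here
            j = self._take_free()
            self.key[j], self.next[j], self.home[j] = key, self.next[i], False
            self.next[i] = j                       # splice k in after the chain head
        else:                                      # case 3: occupant belongs elsewhere
            j = self._take_free()
            self.key[j], self.next[j], self.home[j] = self.key[i], self.next[i], False
            p = self._h(self.key[i])               # head of the occupant's own chain
            while self.next[p] != i:               # find the occupant's predecessor
                p = self.next[p]
            self.next[p] = j                       # re-link that chain to the new slot
            self.key[i], self.next[i], self.home[i] = key, -1, True

    def search(self, key: int) -> bool:
        i = self._h(key)
        if self.key[i] is None or not self.home[i]:
            return False                           # no chain starts at this slot
        while i != -1:
            if self.key[i] == key:
                return True
            i = self.next[i]
        return False
```

The flag lets a search reject a key immediately when slot h(k) holds an element from another chain, and it tells insert when the occupant has to be moved out of the way.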