Multiple Choice Hash Tables with Moves on Deletes and Inserts


Title: Multiple Choice Hash Tables with Moves on Deletes and Inserts


1
Multiple Choice Hash Tables with Moves on Deletes
and Inserts
  • Adam Kirsch
  • Michael Mitzenmacher

2
Hashing: Modern Perspective
  • For many situations (e.g., hardware for routers)
    multiple choice hash tables are state-of-the-art.
  • Each item gets d possible hash locations, placed
    in one.
  • Moving items among choices (e.g., cuckoo hashing)
    greatly improves space utilization.
  • Only cost: an insert may take many moves.
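
To make the cost of moves concrete, here is a minimal sketch of a cuckoo-style insert with two tables and one slot per bucket. It is only an illustration of "moving items among choices", not the scheme studied in this talk; the names `cuckoo_insert`, `tables`, `bucket`, and `max_moves` are hypothetical.

```python
def cuckoo_insert(tables, bucket, key, max_moves=32):
    """Insert key into two tables, evicting occupants as needed.

    tables: list of two lists (None marks an empty slot).
    bucket: function (table_index, key) -> slot index (assumed hash functions).
    Returns (moves, None) on success, or (None, homeless_key) if the
    move budget runs out and one key is left without a slot.
    """
    t = 0
    for moves in range(max_moves + 1):
        b = bucket(t, key)
        if tables[t][b] is None:
            tables[t][b] = key
            return moves, None
        if moves == max_moves:
            break
        # Evict the occupant and try to re-place it in the other table.
        key, tables[t][b] = tables[t][b], key
        t = 1 - t
    return None, key
```

The returned move count is the quantity the last bullet worries about: a single insert can trigger a long chain of evictions, which is why the schemes below cap it at one.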

3
Previously
  • Schemes that move at most 1 item per insertion.
      • Limit cost of cuckoo hashing.
  • Schemes that batch move operations in a queue.
      • Amortize cost of cuckoo hashing.
  • Using content addressable memories (CAMs) to
    reduce chance of overflow.
      • Small CAMs yield big gains.

4
Contributions
  • Consider potential of moving items on deletions.
  • Focus on one move per deletion/insertion.
  • Examine alternative approach using weaker hashing
    from [KTC], Peacock Hashing.
  • Analyze limits of performance.

5
Multilevel Hash Table [BK90]
  • Use a multilevel hash table (MHT).
  • Can store n elements with d = log log n + O(1)
    levels in O(n) space with high probability.
  • Example with d = 4 hash functions.

[Figure: example MHT, levels 1 to 4, with item x]
Skew: more elements placed by early hash
functions (double exponential decay)

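A minimal sketch of a standard MHT may help here. It assumes one item per bucket, levels that halve in size (borrowed from the simulation setup later in the talk), and salted calls to Python's built-in `hash` standing in for d independent hash functions. The class and method names are hypothetical.

```python
import random


class MultilevelHashTable:
    """Sketch of an MHT: d levels, one candidate bucket per level."""

    def __init__(self, top_size, d=4, seed=0):
        rng = random.Random(seed)
        # Assumption: each level is half the size of the one above it.
        self.sizes = [max(1, top_size >> i) for i in range(d)]
        self.levels = [[None] * s for s in self.sizes]
        # One salt per level stands in for d independent hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(d)]

    def _bucket(self, level, key):
        return hash((self.salts[level], key)) % self.sizes[level]

    def insert(self, key):
        """Place key at the first level whose bucket is free.

        Returns the level index, or None on overflow."""
        for i in range(len(self.levels)):
            b = self._bucket(i, key)
            if self.levels[i][b] is None:
                self.levels[i][b] = key
                return i
        return None

    def lookup(self, key):
        """Return the level holding key, or None if it is absent."""
        for i in range(len(self.levels)):
            if self.levels[i][self._bucket(i, key)] == key:
                return i
        return None
```

Because every insert starts at the top, most items land in the first level and far fewer reach each later one, which is the skew noted on the slide.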
6
Second Chance (SC) Scheme
  • Standard MHT fills from top down
  • elements cascade from table to table.
  • We try to slow cascade at every step.

[Figure: standard MHT insertion of item x]

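The following sketch shows one plausible reading of "slow the cascade": when the new item collides at level i and would also collide at level i+1, try to push the current occupant down one level instead. The exact rule used by the authors may differ; `second_chance_insert`, `levels`, and `bucket` are hypothetical names, and at most one item is moved per insertion.

```python
def second_chance_insert(levels, bucket, key):
    """Insert key with at most one move of an existing item.

    levels: list of lists, one per level (None marks an empty bucket).
    bucket: function (level, key) -> bucket index (assumed hash functions).
    Returns (level_of_key, moved_item_or_None), or (None, None) on overflow.
    """
    d = len(levels)
    for i in range(d):
        b = bucket(i, key)
        if levels[i][b] is None:
            levels[i][b] = key
            return i, None
        if i + 1 < d:
            if levels[i + 1][bucket(i + 1, key)] is None:
                continue  # key fits at the next level without any move
            occupant = levels[i][b]
            nb = bucket(i + 1, occupant)
            if levels[i + 1][nb] is None:
                # Second chance: occupant drops one level, key stays here.
                levels[i + 1][nb] = occupant
                levels[i][b] = key
                return i, occupant
    return None, None
```

In the two-level example used to check this sketch, the third insert would overflow under standard MHT insertion, but succeeds here because the level-1 occupant has a free bucket at level 2.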
9
CAMs
  • Last few collisions hard to stop.
  • Can waste lots of space on few items.
  • Solution: content addressable memory.
  • CAMs are fully associative.
  • Hold small numbers of items.

10
Moves on Deletions
  • Harder to manage.
  • What item to move up?

[Figure: MHT levels 1 to 4, with item x]

11
Hint-Based Approach
  • Each cell stores a hint for the location of an
    item to move up on delete.
  • Hints can be kept fairly small.
  • About log n bits.
  • Various hint approaches possible.
  • We found "replace hint on any collision" works
    well.
  • May depend on item lifetime distribution, etc.
  • One-move and recursive-move variations.
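
The hint idea can be sketched as follows, under stated assumptions: a hint is a (level, bucket) pair pointing at a lower-level cell, every cell an insert collides with gets its hint overwritten with the inserted item's final location, and a delete performs a single move (the one-move variation). The class `HintedMHT` and its methods are hypothetical, and the authors' exact hint policy may differ.

```python
class HintedMHT:
    """Sketch of an MHT with per-cell hints used for one move on delete."""

    def __init__(self, sizes, bucket):
        self.sizes = sizes
        self.bucket = bucket  # assumed hash: (level, key) -> bucket index
        self.items = [[None] * s for s in sizes]
        self.hints = [[None] * s for s in sizes]

    def insert(self, key):
        collided = []
        for i in range(len(self.sizes)):
            b = self.bucket(i, key)
            if self.items[i][b] is None:
                self.items[i][b] = key
                # Replace the hint at every cell this insert collided with.
                for ci, cb in collided:
                    self.hints[ci][cb] = (i, b)
                return i
            collided.append((i, b))
        return None  # overflow

    def delete(self, key):
        for i in range(len(self.sizes)):
            b = self.bucket(i, key)
            if self.items[i][b] == key:
                self.items[i][b] = None
                self._pull_up(i, b)
                return True
        return False

    def _pull_up(self, i, b):
        hint, self.hints[i][b] = self.hints[i][b], None
        if hint is None:
            return
        j, c = hint
        candidate = self.items[j][c]
        # Hints can go stale, so verify the candidate really hashes here.
        if candidate is not None and self.bucket(i, candidate) == b:
            self.items[j][c] = None
            self.items[i][b] = candidate
```

A hint here only needs to name one cell in the table, which is consistent with the "about log n bits" estimate above. The recursive variation would call `_pull_up` again on the cell that was just vacated.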

12
Simulation Data
  • No current method of analysis for hints.
  • Use simulations. 10,000 trials per data point.
  • MHT levels decreasing in size by factor of 2.
    Plus small CAM.
  • With n items, top level has size n.
  • Space usage just above 50%.
  • Load table to n elements, alternate
    inserts/deletes for 2^18 steps.
  • Exponentially distributed lifetimes.
  • Goal: how many hash functions are needed?
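
The sizing arithmetic and workload above can be sketched as follows. The slides do not state n or the CAM size, so the values in the usage note are hypothetical. The workload generator also assumes that one step is a delete followed by an insert, and models exponentially distributed lifetimes by deleting a uniformly random live item (justified by memorylessness, but still an assumption about the authors' setup).

```python
import random


def level_sizes(n, d):
    """Top level has n buckets; each later level is half the previous one."""
    return [n >> i for i in range(d)]


def load_factor(n, d, cam_size=0):
    """Items stored divided by total cells (all levels plus the CAM)."""
    return n / (sum(level_sizes(n, d)) + cam_size)


def workload(n, steps, seed=0):
    """Yield ('insert', key) / ('delete', key) operations.

    First loads n items, then alternates one delete and one insert
    per step, always keeping n items live."""
    rng = random.Random(seed)
    live = list(range(n))
    for key in live:
        yield ("insert", key)
    next_key = n
    for _ in range(steps):
        idx = rng.randrange(len(live))
        yield ("delete", live[idx])
        live[idx] = next_key
        yield ("insert", next_key)
        next_key += 1
```

For example, with a hypothetical n = 1024 and d = 4 the levels have 1024, 512, 256 and 128 buckets, so `load_factor(1024, 4)` is 1024/1920, roughly 0.53. Since the levels always sum to less than 2n, the load stays above 50% as long as the CAM is small.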

13
Simulation Results
14
Lessons from Simulations
  • No moves very weak.
  • Second Chance (move on insert) more powerful than
    hint-based move on delete.
  • But the two combine well.
  • Four hash functions: better than 50% load, with a
    small CAM.

15
Alternative Weak Hashes
  • To avoid hints, overflow at each bucket splits to
    two buckets at next level.
  • Each bucket receives from four buckets.
  • Less spreading of items, but know where to look
    on deletes.
  • Conjecture: loss of randomness implies weak
    performance.
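
The fan-out and fan-in counts above follow from the level sizes: if a level has m buckets and the next has m/2, then 2m outgoing links land on m/2 buckets, four per bucket. The sketch below checks this for one hypothetical fixed mapping (`children` is an illustrative choice, not the mapping from the talk).

```python
from collections import Counter


def children(j, m):
    """Two next-level buckets for bucket j in a level of m buckets.

    Assumption: the next level has m // 2 buckets, with m // 2 >= 2."""
    half = m // 2
    return (j % half, (j + 1) % half)


def fan_in(m):
    """Count how many buckets of an m-bucket level feed each next-level bucket."""
    counts = Counter()
    for j in range(m):
        for c in children(j, m):
            counts[c] += 1
    return counts
```

With m = 16, every one of the 8 next-level buckets receives overflow from exactly four buckets. On a delete, only those four fixed parents need to be checked, which is the "know where to look" benefit traded against less spreading of items.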

16
Picturing Weak Hashes
17
Two Idealized Schemes
  • Each bucket holds random item, splits rest.
  • Each bucket counts items passed to bucket A and
    bucket B at next level, greedily holds item from
    bucket with larger count.
  • Assume invariants kept over insertions/deletions
    at all times.
  • Can be analyzed recursively level by level.
  • Get distribution of bucket loads at each level.
  • Obtain average case performance.

18
Results
19
Conclusions
  • Weak hashes, based on buckets, much less
    effective than hints.
  • Even under optimistic assumptions.
  • One-move approaches are effective.
  • Move on insert/delete complement each other.
  • Need methods for analysis.
  • Challenging: dependencies make it hard to get
    exact numbers.