Hashtables for Realtime and Embedded Systems - PowerPoint PPT Presentation

1
Hashtables for Real-time and Embedded Systems
Anand Krishnan
April 17, 2003
Advisor: Dr. Ron K. Cytron
Guest Advisor: Dr. Douglas Niehaus
Center for Distributed Object Computing
Department of Computer Science, Washington University
Sponsored by DARPA under contract F33615-00-C-1697
2
Overview
  • Motivation
  • Hashtable Organization
  • Real-time Hashtable model
  • Behavioral Analysis
  • Experiments and results
  • Conclusions and Future Work

3
Real-time and Embedded Systems (RTES)
  • Real-time Systems
  • Timing and predictability
  • Embedded Systems
  • Typically part of larger system with real-time
    requirements
  • Space Constraints

[Figure: layered view of an RTES; labels: Multimedia, Avionics, Application, Libraries, Programming Language, Operating System, Hardware]
4
Collection Objects
  • Abstract data type to store information
  • lists, sets, trees, hashtables, etc.
  • Part of the library of a language specification
  • Java Collection Library
  • The Standard Template Library (STL)
  • Building blocks of software modules and
    applications
  • Emphasis on average case performance
  • Necessity to migrate towards real-time
  • Focus of work on Hashtables
  • popular
  • excellent average case performance
  • interesting case for real-time systems

5
Overview
  • Motivation
  • Hashtable Organization
  • Real-time Hashtable model
  • Behavioral Analysis
  • Experiments and results
  • Conclusions and Future Work

6
Hashtable Organization
  • "HASH: There is no definition for this word; nobody
    knows what hash is."
    (The Devil's Dictionary, Ambrose Bierce)
  • Hashtable: provides an implementation of the
    dictionary interface
  • Insertion, access and removal of entries
  • Each entry is a (key, value) pair
  • Operations on a given hashtable, HT
  • GET (key)
  • Returns value, where (key, value) ∈ HT
  • PUT (key, value)
  • HT ← (HT − {(k, v) | k = key}) ∪ {(key, value)}
  • REMOVE (key)
  • HT ← (HT − {(k, v) | k = key})
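
The three operations can be read as plain set manipulations. The sketch below models HT literally as a collection of (key, value) pairs, with no hashing at all, only to make the GET/PUT/REMOVE semantics concrete. The class name `PairSetDictionary` and the `size()` helper are illustrative assumptions, not part of the presentation.

```java
import java.util.ArrayList;
import java.util.List;

/** Dictionary modelled as a set of (key, value) pairs; every operation is O(n). */
class PairSetDictionary<K, V> {
    private static final class Entry<K, V> {
        final K key;
        final V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<Entry<K, V>> pairs = new ArrayList<>();

    /** GET: the value v such that (key, v) is in HT, or null if there is none. */
    V get(K key) {
        for (Entry<K, V> e : pairs) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    /** REMOVE: HT becomes HT minus every pair whose key matches. */
    void remove(K key) {
        pairs.removeIf(e -> e.key.equals(key));
    }

    /** PUT: drop any existing pair for the key, then add (key, value). */
    void put(K key, V value) {
        remove(key);
        pairs.add(new Entry<>(key, value));
    }

    int size() { return pairs.size(); }
}
```

A second `put` with the same key replaces the earlier pair instead of adding another one, which is what the set difference in the PUT definition expresses.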

7
Hashtable Organization
  • Hash Function
  • h: Keys → Zn
  • Zn is the set of integers modulo n
  • n is the number of slots in the hashtable
  • key hashes into index i if h(key) = i, i ∈ Zn
  • Collision
  • For two distinct keys, x and y, h(x) = h(y)
  • Collision Resolution
  • Open Addressing
  • Successive probing for empty slots
  • Concern: search time for an entry (element) can
    be O(n)
  • Chaining
  • Colliding entries are placed on a linked list
  • Concern: length of linked list or bucket
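
A minimal sketch of chaining, assuming string keys and the simplest possible hash function, h(key) = hashCode() mod n. The class `ChainedSet` and its method names are hypothetical; it stores keys only, so that the length of a chain is easy to inspect.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Set of strings with n slots; colliding keys share a linked list. */
class ChainedSet {
    private final List<LinkedList<String>> slots = new ArrayList<>();

    ChainedSet(int n) {
        for (int i = 0; i < n; i++) slots.add(new LinkedList<>());
    }

    /** h: Keys -> Zn. floorMod keeps the index in 0..n-1 for negative hash codes. */
    int h(String key) {
        return Math.floorMod(key.hashCode(), slots.size());
    }

    boolean add(String key) {
        LinkedList<String> chain = slots.get(h(key));
        if (chain.contains(key)) return false;  // already present
        chain.add(key);
        return true;
    }

    /** Search cost is proportional to the length of the chain at h(key). */
    boolean contains(String key) {
        return slots.get(h(key)).contains(key);
    }

    int chainLength(int slot) {
        return slots.get(slot).size();
    }
}
```

The strings "Aa" and "BB" have the same `hashCode()` in Java, so they collide for every n and end up on the same chain; the worst-case cost of an operation is governed by the longest such chain.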

8
Hashtable Organization
  • Load (factor): ratio of number of entries to
    number of buckets
  • This reflects average performance, usually the
    measure of interest, but not for us.
  • Rehashing
  • grow hashtable by increasing the number of slots
  • Typically, if current load exceeds a threshold
    load factor
  • As part of a hashtable operation
  • New hash function reassigns elements over new
    space
  • Desirable property of a hash function
  • to delay rehash by even distribution of elements
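
To make the cost of a conventional rehash visible, here is a small sketch under assumed parameters: a threshold load factor of 0.75, a table that doubles when the threshold is exceeded, and integer keys hashed by modulus. None of these values come from the presentation. The `lastMoved` field records how many elements the most recent `add` had to relocate.

```java
import java.util.ArrayList;
import java.util.List;

/** Set of distinct int keys that rehashes everything at once when the load is too high. */
class GrowingSet {
    private static final double THRESHOLD = 0.75;  // assumed threshold load factor
    private List<List<Integer>> buckets = newBuckets(4);
    private int size = 0;
    int lastMoved = 0;  // elements relocated by the most recent add

    private static List<List<Integer>> newBuckets(int n) {
        List<List<Integer>> b = new ArrayList<>();
        for (int i = 0; i < n; i++) b.add(new ArrayList<>());
        return b;
    }

    private static List<Integer> bucketFor(List<List<Integer>> table, int key) {
        return table.get(Math.floorMod(key, table.size()));
    }

    double load() { return (double) size / buckets.size(); }

    int bucketCount() { return buckets.size(); }

    /** Keys are assumed distinct; duplicates are not checked in this sketch. */
    void add(int key) {
        lastMoved = 0;
        bucketFor(buckets, key).add(key);
        size++;
        if (load() > THRESHOLD) rehash(buckets.size() * 2);
    }

    /** Reassigns every element to the larger table inside a single operation. */
    private void rehash(int newCount) {
        List<List<Integer>> bigger = newBuckets(newCount);
        for (List<Integer> bucket : buckets) {
            for (int key : bucket) {
                bucketFor(bigger, key).add(key);
                lastMoved++;
            }
        }
        buckets = bigger;
    }
}
```

Most calls to `add` move nothing, but the one that crosses the threshold moves every element in the table. That single expensive call is the unbounded operation a real-time task cannot budget for.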

9
Overview
  • Motivation
  • Hashtable Organization
  • Real-time Hashtable model
  • Behavioral Analysis
  • Experiments and results
  • Conclusions and Future Work

10
RTES Concerns for Hashtables
  • Real-time issues
  • Task provisioning
  • Unbounded hashtable operation on rehash
  • Over-provisioning (worst-case operation time)
  • Under-provisioning (average-case operation time)
  • Rehash triggered by average length of bucket
  • worst case?
  • Embedded system constraints
  • Hashtable expansion
  • allocate new table
  • rehash extant elements into new table
  • deallocate old table
  • Problems
  • Storage Blip
  • Holes in runtime storage heap, leading to
    excessive defragmentation

11
Amortized Rehashing
  • When to Rehash?
  • Trigger based on length of any given bucket
  • Rehash Triggering Length (RTL)
  • Space concerns
  • maintain two hash functions, H and H'
  • H maps to space 1..B, H' maps to space 1..B'
  • allocate new buckets on demand
  • when H' hashes an element to some bucket in range
    B+1..B', i.e., beyond the old table
  • Issue of Garbage Collection?
  • Amortization during rehashing
  • rehash (clean) extant elements from the B buckets
    incrementally over each hashtable operation
  • Cleaning: remapping all elements in a bucket
    using H'
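
The difference between the two triggers can be sketched in a few lines. The helper names below are hypothetical, and the sketch assumes a rehash is triggered as soon as some bucket's length reaches RTL; the slides do not say whether the comparison is strict.

```java
/** Compares a load-factor trigger with a Rehash Triggering Length (RTL) trigger. */
class TriggerComparison {
    static double averageLength(int[] bucketLengths) {
        int total = 0;
        for (int len : bucketLengths) total += len;
        return (double) total / bucketLengths.length;
    }

    static int longest(int[] bucketLengths) {
        int max = 0;
        for (int len : bucketLengths) max = Math.max(max, len);
        return max;
    }

    /** Conventional trigger: based on the average bucket length (the load). */
    static boolean loadTrigger(int[] bucketLengths, double thresholdLoad) {
        return averageLength(bucketLengths) > thresholdLoad;
    }

    /** RTL trigger: based on the length of any single bucket (assumed: length >= RTL). */
    static boolean lengthTrigger(int[] bucketLengths, int rtl) {
        return longest(bucketLengths) >= rtl;
    }
}
```

With bucket lengths {0, 0, 0, 6, 0, 0, 0, 0} the load is only 0.75, so a load trigger with threshold 1.0 stays quiet, while an RTL of 5 fires because one bucket already holds 6 entries.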

12
Amortized Rehashing
Incremental rehash from old to new
[Figure: buckets 1 to 8; key k is mapped by H to an old bucket and by H' to a new bucket. RTL = 5, B = 6, B' = 8]
13
Amortized Rehashing
Clean the old bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
14
Amortized Rehashing
Clean the old bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
15
Amortized Rehashing
Clean the new bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
16
Amortized Rehashing
Clean the new bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
17
Amortized Rehashing
Clean one gratuitous bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
18
Amortized Rehashing
Clean one gratuitous bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
19
Amortized Rehashing
Perform hash operation at new bucket
[Figure: buckets 1 to 8, with key k hashed by H and H'. RTL = 5, B = 6, B' = 8]
20
Amortized Rehashing
  • Duties of a Hashtable operation
  • Clean old bucket H(k) if necessary
  • Clean or allocate new bucket H'(k) if necessary
  • Perform Hashtable operation on new bucket H'(k)
  • Perform incremental clean by cleaning gratuitous
    bucket
  • Hashtable Modes
  • Stable
  • Rehash
  • Issues resolved
  • Rehash triggered based on worst case length of
    bucket
  • Rehash distributed over multiple hashtable
    operations
  • Storage blip avoided
  • Worst-case operation time ∝ length of longest
    bucket
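
The four duties can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: keys are ints, H and H' are plain modulus functions, the new table size is B' = B + increment, a rehash starts when a bucket's length reaches RTL, the cleaning rate is one gratuitous bucket per operation, and no second rehash is started while one is in progress. The class name `IncrementalHashSet` and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Sketch: a set of int keys whose rehash is spread over many operations. */
class IncrementalHashSet {
    private final int rtl;        // Rehash Triggering Length
    private final int increment;  // assumed policy: B' = B + increment
    private final List<List<Integer>> buckets = new ArrayList<>();
    private int oldB;             // B, the range of H
    private int newB;             // B', the range of H' (equals B in stable mode)
    private boolean[] clean;      // clean[i]: old bucket i already remapped with H'
    private int nextToClean;      // cursor for gratuitous cleaning

    IncrementalHashSet(int b, int rtl, int increment) {
        this.oldB = b;
        this.newB = b;
        this.rtl = rtl;
        this.increment = increment;
    }

    boolean inRehashMode() { return newB != oldB; }

    private int h(int k)      { return Math.floorMod(k, oldB); }
    private int hPrime(int k) { return Math.floorMod(k, newB); }

    /** Buckets are allocated on demand, the first time they are touched. */
    private List<Integer> bucketAt(int i) {
        while (buckets.size() <= i) buckets.add(null);
        if (buckets.get(i) == null) buckets.set(i, new ArrayList<>());
        return buckets.get(i);
    }

    /** Cleaning: remap every element of old bucket i using H'. */
    private void cleanBucket(int i) {
        if (clean[i]) return;
        Iterator<Integer> it = bucketAt(i).iterator();
        while (it.hasNext()) {
            int e = it.next();
            int target = hPrime(e);
            if (target != i) {
                it.remove();
                bucketAt(target).add(e);
            }
        }
        clean[i] = true;
    }

    /** Duties 1 and 2: clean old bucket H(k), then clean or allocate H'(k). */
    private List<Integer> prepare(int k) {
        if (inRehashMode()) {
            cleanBucket(h(k));
            if (hPrime(k) < oldB) cleanBucket(hPrime(k));
        }
        return bucketAt(hPrime(k));
    }

    /** Duty 4: clean one gratuitous bucket; return to stable mode when none remain. */
    private void gratuitousClean() {
        if (!inRehashMode()) return;
        if (nextToClean < oldB) cleanBucket(nextToClean);
        while (nextToClean < oldB && clean[nextToClean]) nextToClean++;
        if (nextToClean == oldB) { oldB = newB; clean = null; }
    }

    boolean contains(int k) {
        boolean found = prepare(k).contains(k);  // duty 3, on bucket H'(k)
        gratuitousClean();
        return found;
    }

    boolean add(int k) {
        List<Integer> bucket = prepare(k);
        boolean added = !bucket.contains(k);
        if (added) {
            bucket.add(k);                       // duty 3, on bucket H'(k)
            if (!inRehashMode() && bucket.size() >= rtl) {
                newB = oldB + increment;         // enter rehash mode
                clean = new boolean[oldB];
                nextToClean = 0;
            }
        }
        gratuitousClean();
        return added;
    }
}
```

For example, with B = 6, RTL = 5 and an increment of 2, adding the keys 0, 6, 12, 18 and 24 fills one bucket to length 5 and switches the table to rehash mode; the following operations each clean one more old bucket until the table is stable again with 8 buckets. No operation ever copies the whole table, and the old buckets are reused in place, so there is no second full-size allocation.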

21
Overview
  • Motivation
  • Hashtable Organization
  • Real-time Hashtable model
  • Behavioral Analysis
  • Stable and Rehash mode analysis
  • Incremental Cleaning Mechanisms
  • Optimization of hashtable operation duties
  • Experiments and results
  • Conclusions and Future Work

22
Analysis
Ridiculously Uniform Hash Assumption (RUHA)
Worst-Case Average-Case
Hashtable With B Buckets
23
Analysis
Ridiculously Uniform Hash Assumption (RUHA)
  • Bucket Increment
  • Rehash successful if
  • Bucket length bound
  • At any instant a bucket's length is strictly
    bounded by 2 x RTL

Citation: Friedman, Leidenfrost, Brodie, and
Cytron. Hashtables for embedded and real-time
systems. In Proceedings of IEEE Workshop on
Real-Time Embedded Systems, 2001.
Citation: Friedman, Krishnan, Leidenfrost,
Brodie, Cytron, and Niehaus. Hashtables for
embedded and real-time systems. Technical
Report WUCS-03-15.
24
Analysis
Hash Table With B Buckets
25
Stable mode model (analytical)
26
Bucket Length Bound
  • Bucket Length Bound
  • Denoted by K
  • Typically, K ≥ (2 x RTL - 1)
  • RTL = 5
  • Various bounds (K)
  • Peaks of Stable mode curve

27
Rehash mode model
28
Rehash mode model (analytical)
  • RTL = 5
  • K = 10
  • Bounded by
  • Stable mode curve

29
Incremental Cleaning Mechanisms
  • Need based Incremental Clean
  • Cleaning Rate: number of gratuitous buckets
    cleaned per operation
  • Guaranteed Cleaning Rate
  • List of unclean buckets
  • Greedy Cleaning
  • choose buckets based on increasing or decreasing
    order of lengths
  • Prioritized Cleaning
  • Schedule buckets observed but not visited if
    longer than bound.
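
As one illustration of greedy cleaning, the sketch below picks the next gratuitous bucket in decreasing order of length, that is, the longest bucket that is still unclean. The slide also allows increasing order; the choice here, the class name `GreedyCleaner` and the method `pickLongest` are assumptions made for the example.

```java
/** Greedy choice of the next bucket to clean: longest unclean bucket first. */
class GreedyCleaner {
    /** Returns the index of the longest unclean bucket, or -1 when all are clean. */
    static int pickLongest(int[] bucketLengths, boolean[] clean) {
        int best = -1;
        for (int i = 0; i < bucketLengths.length; i++) {
            if (clean[i]) continue;  // already remapped, nothing to do
            if (best == -1 || bucketLengths[i] > bucketLengths[best]) best = i;
        }
        return best;
    }
}
```

A linear scan like this costs O(B) per choice, so an implementation with real-time constraints would more likely keep the unclean buckets in a structure ordered by length.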

30
Optimization of operation duties
  • Problem: inefficiencies in performing hashtable
    operation
  • Examine elements that are already clean
  • What if element is found in old bucket?
  • Solution: maintain 2 lists for each bucket
  • In sync list and Out of sync list
  • Cost: extra space

31
Using Bucket with 2 lists
  • Cleaning only examines In sync list of old
    bucket
  • Enables cleaning and operation to be performed
    simultaneously
  • Reduces the average time for an operation
  • Useful for Systems with low or no cache
  • Our implementation did not show a difference in
    timings.

32
Overview
  • Motivation
  • Hashtable Organization
  • Real-time Hashtable model
  • Behavioral Analysis
  • Experiments and results
  • Conclusions and Future Work

33
Real-time Readiness Ratio
  • Worst-case to Average-case operation time

[Figure: RT Hash behavior under Solaris; annotated "Reasonably Bounded"]
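
The ratio is simple to compute from a series of measured operation times. The sketch below assumes the times are given in nanoseconds in an array; the class and method names are hypothetical.

```java
/** Real-time readiness ratio: worst-case over average-case operation time. */
class ReadinessRatio {
    static double ratio(long[] opTimesNanos) {
        long worst = 0;
        long sum = 0;
        for (long t : opTimesNanos) {
            worst = Math.max(worst, t);
            sum += t;
        }
        double average = (double) sum / opTimesNanos.length;
        return worst / average;
    }
}
```

For the measurements {100, 100, 100, 500} the average is 200 and the worst case is 500, giving a ratio of 2.5. A ratio close to 1 means that provisioning for the worst case wastes little compared with the average case.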
34
Experimental setup
  • Keys: strings from the dictionary used by Unix
    spell
  • Hash function
  • Java's hashCode()
  • Secondary hash function
  • Involves Insertion of elements into the hashtable
  • Metrics of interest
  • Measured fraction of buckets exceeding the bound
    at any instant
  • Number of operations examining a bucket of length
    longer than the bound
  • Load factor
  • Space versus time tradeoff

Knuth hash
int kHash(Object key) {
    double A = 0.618033988;
    int h = key.hashCode();
    double fractionPart = (h * A) - Math.floor(h * A);
    return (int) (fractionPart * B);  // B: number of buckets
}
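
A self-contained version of the Knuth-style multiplicative hash on this slide, together with the first metric of interest (the fraction of buckets exceeding the bound), might look like the sketch below. The bucket count is passed as a parameter `b` instead of being read from a field, and `fractionExceeding` is a hypothetical helper; both are assumptions made so the example compiles on its own.

```java
/** Multiplicative hash with the golden-ratio constant, plus a bound-violation metric. */
class KnuthHashDemo {
    static final double A = 0.618033988;

    /** Maps a key to a bucket index in 0..b-1. */
    static int kHash(Object key, int b) {
        int h = key.hashCode();
        // The fractional part of h * A lies in [0, 1), also for negative h.
        double fractionPart = (h * A) - Math.floor(h * A);
        return (int) (fractionPart * b);
    }

    /** Fraction of buckets whose length is strictly greater than the bound. */
    static double fractionExceeding(int[] bucketLengths, int bound) {
        int over = 0;
        for (int len : bucketLengths) {
            if (len > bound) over++;
        }
        return (double) over / bucketLengths.length;
    }

    /** Hashes the keys "key0", "key1", ... into b buckets and returns the bucket lengths. */
    static int[] fill(int keyCount, int b) {
        int[] lengths = new int[b];
        for (int i = 0; i < keyCount; i++) {
            lengths[kHash("key" + i, b)]++;
        }
        return lengths;
    }
}
```

Calling `KnuthHashDemo.fractionExceeding(KnuthHashDemo.fill(1000, 64), 10)` would report how many of 64 buckets hold more than 10 of the 1000 synthetic keys. The synthetic keys stand in for the dictionary words used in the experiments, so the numbers will not match the presentation's plots.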
35
Knuth Hash Function
  • Why we see what we see?
  • Bucket Bound (RUHA)
  • Bucket increment (RUHA)
  • Hash function?

36
Knuth Hash function Various Bounds
  • RTL = 5
  • K = 10
  • Minimum Bucket increment
  • Cleaning Rate = 3

37
Knuth Hash function Various Bounds
  • RTL = 5
  • K = 14
  • Minimum Bucket increment
  • Cleaning Rate = 3

38
Number of Violating Operations
  • Real-time system concern
  • Violating operation rather than hashtable state
  • Operations that examine bucket violating the bound
  • RTL = 5, various bounds
  • Minimum Bucket increment
  • Cleaning Rate = 3 and 1
  • Higher cleaning rate is better
  • Quicker rehash
  • But more rehashes
  • Bad for Average performance

39
Compensation Factor and Bucket Increment
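[Equations not recoverable from the transcript. Surviving terms: the Hash Function Uniformity measure, HFU, written as 1 minus a ratio involving N, RTL and B; and the new bucket count B', written as a product involving B, RTL, (RTL - 1) and a compensation term in HFU and CB.]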
40
Supplemental Hash Function
  • Supplemental hash (Doug Lea)
  • java.util.HashMap in Sun's Java 1.4.1
  • Better distribution of elements
  • What about Prioritized Cleaning?

int sHash(Object key) {
    int h = key.hashCode();
    h += ~(h << 9);
    h ^= (h >>> 14);
    h += (h << 4);
    h ^= (h >>> 10);
    return (h & 0x7fffffff) % B;  // mask the sign bit so the index is never negative
}
41
Violating Operations Varying RTL
  • Various RTL, Bound = 2 x RTL - 1
  • Minimum Bucket increment
  • Cleaning Rate = 1
  • Boundary value, RTL = 2
  • Supplemental Hash lives up to its promise
  • Load Factor
  • 85 for RTL = 10, 20

42
Overview
  • Motivation
  • Hashtable Organization
  • Real-time Hashtable model
  • Behavioral Analysis
  • Experiments and results
  • Conclusions and Future Work

43
Conclusions
  • Developed and analyzed a probabilistic model to
    characterize bounds on Hashtable operations
  • Increasing permissible bound
  • Compensating additional buckets
  • Varying RTL
  • Prioritized Cleaning
  • Optimized the functionality of operations
  • Targeted towards systems with no cache

44
Future Work
  • Migrate towards ACE implementation
  • Explore effect of Hashtable size reduction
  • Sequence of operations known
  • Incremental work pattern
  • Hashtables
  • BSRB trees
  • Incremental garbage collector
  • Explore Lock-free incremental work

45
Thanks
  • Dr. Ron K. Cytron
  • Dr. Doug Niehaus, University of Kansas
  • Thesis Committee
  • Dr. Chris Gill
  • Dr. Chenyang Lu
  • All members of the DOC Group
  • Scott Friedman, Nick Leidenfrost and Ben Brodie
  • Morgan Deters, Sharath Cholleti, Martin
    Linenweber
  • Vignesh Nandakumar and Ravi Pratap

46
Questions
47
Technical Approach: Bounded Resources
Solution: use two-dimensional hash
48
Prioritized Cleaning, K = 9
49
Supplemental Hash Function
50
Supplemental Hash Prioritized Cleaning