Title: Scalable and Lock-Free Concurrent Dictionaries
1. Scalable and Lock-Free Concurrent Dictionaries
- Håkan Sundell
- Philippas Tsigas
2. Outline
- Synchronization Methods
- Dictionaries
- Concurrent Dictionaries
- Previous results
- New Lock-Free Algorithm
- Experiments
- Conclusions
3. Synchronization
- Shared data structures need synchronization
- Synchronization using locks
- Mutually exclusive access to the whole or parts of the data structure
[Figure: processes P1, P2 and P3 accessing the shared data structure]
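As a minimal sketch of lock-based synchronization, the class below guards a whole container with a single mutex, so only one caller at a time can touch it. The names LockedTable, put and get are hypothetical and chosen only for this illustration; they are not taken from the slides.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical wrapper: one mutex guards the whole container, so every
// operation gets mutually exclusive access to the shared data structure.
class LockedTable {
public:
    void put(int key, int value) {
        std::lock_guard<std::mutex> guard(mutex_);  // other callers block here
        table_[key] = value;
    }

    // Returns true and fills value_out if the key is present.
    bool get(int key, int& value_out) const {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = table_.find(key);
        if (it == table_.end()) return false;
        value_out = it->second;
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::map<int, int> table_;
};
```

A thread that is pre-empted while holding the mutex delays every other caller, which is the cost the non-blocking approach tries to avoid.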
4. Blocking Synchronization
- Drawbacks
- Blocking
- Priority inversion
- Risk of deadlock
- Locks: semaphores, spinning, disabling interrupts, etc.
- Reduced efficiency because of reduced parallelism
5. Non-blocking Synchronization
- Lock-Free Synchronization
- Optimistic approach (i.e. assumes no interference)
- The operation is prepared to later take effect (unless interfered with) using hardware atomic primitives
- Possible interference is detected via the atomic primitives, and causes a retry
- Can cause starvation
- Wait-Free Synchronization
- Always finishes in a finite number of its own steps.
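The optimistic prepare, attempt and retry pattern can be sketched with a small C++ helper built on std::atomic. The function name atomic_store_max is hypothetical; it only illustrates the retry loop and is not part of the presented algorithm.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: raise `target` to at least `candidate` without a lock.
// Returns the number of CAS attempts that were made.
int atomic_store_max(std::atomic<int>& target, int candidate) {
    int attempts = 0;
    int seen = target.load();  // optimistic read, assuming no interference
    while (seen < candidate) {
        ++attempts;
        // If another thread changed `target` in the meantime, the CAS fails,
        // `seen` is refreshed with the current value, and the loop retries.
        if (target.compare_exchange_weak(seen, candidate)) break;
    }
    return attempts;
}
```

The loop has no upper bound on retries for an individual thread, which is where starvation can come from: some thread always succeeds, but not necessarily this one.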
6. Dictionaries (Sets)
- Fundamental data structure
- Works on a set of <key, value> pairs
- Three basic operations
- Insert(k, v): adds a new item
- v = FindKey(k): finds the item <k, v>
- v = DeleteKey(k): finds and removes the item <k, v>
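For reference, the sequential meaning of the three operations can be sketched on top of std::map. This is only an assumed single-threaded specification: the class name, the int and string types, the bool return values, and the choice to overwrite the value of an existing key are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Single-threaded reference semantics (illustrative only, not thread-safe).
class SequentialDictionary {
public:
    // Insert(k, v): stores <k, v>. Returns true if k was not present before.
    // Assumption: an existing key gets its value overwritten.
    bool insert(int key, const std::string& value) {
        bool is_new = items_.count(key) == 0;
        items_[key] = value;
        return is_new;
    }

    // v = FindKey(k): returns true and fills value_out if <k, v> exists.
    bool find_key(int key, std::string& value_out) const {
        auto it = items_.find(key);
        if (it == items_.end()) return false;
        value_out = it->second;
        return true;
    }

    // v = DeleteKey(k): like FindKey, but also removes the item.
    bool delete_key(int key, std::string& value_out) {
        auto it = items_.find(key);
        if (it == items_.end()) return false;
        value_out = it->second;
        items_.erase(it);
        return true;
    }

private:
    std::map<int, std::string> items_;
};
```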
7. Previous Non-blocking Dictionaries
- M. Michael, "High Performance Dynamic Lock-Free Hash Tables and List-Based Sets", SPAA 2002
- Based on Singly-Linked List
- Linear time complexity!
- Fast Lock-Free Memory Management
- Causes retries of concurrent search operations!
- Building-block of Hash Tables
- Assumes each branch is of length << 10.
- However, Hash Tables might not be uniformly distributed.
8. Randomized Algorithm: Skip Lists
- William Pugh, "Skip Lists: A Probabilistic Alternative to Balanced Trees", 1990
- Layers of ordered lists with different densities, achieves a tree-like behavior
- Time complexity O(log_2 N), probabilistic!
[Figure: skip list running from Head to Tail over nodes 1 to 7, with annotations 25 and 50]
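A sequential (not concurrent) skip list makes the layered structure concrete. The sketch below is a minimal illustration under assumptions: heights are drawn by repeated coin flips with probability 1/2, the maximum level is capped at 10, and the names SkipList, insert and contains are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <random>
#include <vector>

const int kMaxLevel = 10;  // assumed cap on the number of levels

struct Node {
    int key;
    std::vector<Node*> next;  // next[i] is the successor on level i
    Node(int k, int height) : key(k), next(height, nullptr) {}
};

class SkipList {
public:
    SkipList() : head_(INT_MIN, kMaxLevel), rng_(42) {}
    ~SkipList() {
        Node* n = head_.next[0];
        while (n) { Node* following = n->next[0]; delete n; n = following; }
    }

    bool contains(int key) const {
        const Node* cur = &head_;
        // Start in the sparsest list and drop one level at each overshoot.
        for (int level = kMaxLevel - 1; level >= 0; --level)
            while (cur->next[level] && cur->next[level]->key < key)
                cur = cur->next[level];
        const Node* candidate = cur->next[0];
        return candidate && candidate->key == key;
    }

    bool insert(int key) {
        Node* preds[kMaxLevel];  // last node before `key` on each level
        Node* cur = &head_;
        for (int level = kMaxLevel - 1; level >= 0; --level) {
            while (cur->next[level] && cur->next[level]->key < key)
                cur = cur->next[level];
            preds[level] = cur;
        }
        if (cur->next[0] && cur->next[0]->key == key) return false;
        Node* node = new Node(key, random_height());
        int height = static_cast<int>(node->next.size());
        for (int level = 0; level < height; ++level) {
            node->next[level] = preds[level]->next[level];
            preds[level]->next[level] = node;
        }
        return true;
    }

private:
    int random_height() {
        int height = 1;  // every node is part of the lowest level
        while (height < kMaxLevel && coin_(rng_)) ++height;
        return height;
    }

    Node head_;
    std::mt19937 rng_;
    std::bernoulli_distribution coin_{0.5};
};
```

With a fair coin, about half of the nodes reach level 2, a quarter reach level 3, and so on, which is what gives the tree-like search behavior.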
9. New Lock-Free Concurrent Skip List
- Define node state to depend on the insertion status at the lowest level as well as a deletion flag
- Insert from the lowest level going upwards
- Set deletion flag. Delete from the highest level going downwards
[Figure: skip list nodes 1 to 7, each with a deletion flag (D); node detail showing levels 3, 2, 1 and p]
10. Overlapping operations on shared data
- Example: Insert operation. Which of 2 or 3 gets inserted?
- Solution: Compare-And-Swap (CAS) atomic primitive
  CAS(p: pointer to word, old: word, new: word): boolean
  atomic do
    if *p = old then *p := new; return true
    else return false
[Figure: concurrent Insert 2 and Insert 3 on a list containing nodes 1 and 4]
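The way CAS arbitrates between overlapping inserts can be sketched for a sorted singly linked list. This is a simplified illustration that assumes no concurrent deletions; the names Node and insert_sorted are hypothetical, and std::atomic's compare_exchange_strong stands in for the CAS primitive.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    int key;
    std::atomic<Node*> next;
    Node(int k, Node* n) : key(k), next(n) {}
};

// Insert `key` into the sorted list that starts at the sentinel `head`.
// Assumption: nodes are never deleted while this runs.
bool insert_sorted(Node* head, int key) {
    Node* fresh = new Node(key, nullptr);
    while (true) {
        // Find the window pred -> succ in which the key belongs.
        Node* pred = head;
        Node* succ = pred->next.load();
        while (succ != nullptr && succ->key < key) {
            pred = succ;
            succ = succ->next.load();
        }
        if (succ != nullptr && succ->key == key) {
            delete fresh;  // key already present
            return false;
        }
        fresh->next.store(succ);
        // Succeeds only if pred->next still equals succ, i.e. no other
        // insert slipped into the same window since it was read.
        if (pred->next.compare_exchange_strong(succ, fresh)) return true;
        // Lost the race: search again and retry.
    }
}
```

If Insert 2 and Insert 3 both read the window 1 -> 4, only one CAS on node 1's next pointer succeeds; the loser retries and finds its new window, so both keys end up in the list.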
11. Concurrent Insert vs. Delete operations
- Problem: both nodes are deleted!
- Solution (Harris et al.): use bit 0 of the pointer to mark deletion status
[Figure: list with nodes 1, 2 and 4; a Delete and an Insert of node 3 run concurrently, shown in steps a), b) and c)]
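The mark-bit idea can be sketched by storing the successor address and the deletion mark in one atomic word, so that a single CAS checks both. The helper names below (pack, successor_of, is_marked, mark_deleted, insert_after) are hypothetical, and the sketch assumes nodes are aligned so that bit 0 of their address is always zero.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct Node {
    int key;
    std::atomic<std::uintptr_t> next;  // successor address; bit 0 = deletion mark
    explicit Node(int k) : key(k), next(0) {}
};

inline std::uintptr_t pack(Node* successor, bool marked) {
    return reinterpret_cast<std::uintptr_t>(successor) | (marked ? 1u : 0u);
}
inline Node* successor_of(std::uintptr_t word) {
    return reinterpret_cast<Node*>(word & ~std::uintptr_t(1));
}
inline bool is_marked(std::uintptr_t word) { return (word & 1u) != 0; }

// Logical deletion: set the mark on the node's own next pointer.
// Returns false if the node was already marked by someone else.
bool mark_deleted(Node& node) {
    std::uintptr_t word = node.next.load();
    while (!is_marked(word)) {
        if (node.next.compare_exchange_weak(word, word | 1u)) return true;
    }
    return false;
}

// Link `fresh` between pred and succ. The CAS expects an unmarked pointer
// to succ, so it fails if pred has been marked as deleted in the meantime.
bool insert_after(Node& pred, Node* succ, Node* fresh) {
    fresh->next.store(pack(succ, false));
    std::uintptr_t expected = pack(succ, false);
    return pred.next.compare_exchange_strong(expected, pack(fresh, false));
}
```

Because the mark lives in the same word as the pointer, an insert after a node that is being deleted can no longer succeed silently; it fails and has to retry from a live predecessor.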
12. New Lock-Free Dictionary - Techniques Summary
- Based on Skip Lists
- Treated as layers of ordered lists
- Uses CAS atomic primitive
- Lock-Free memory management
- IBM Freelists
- Reference counting (Valois; Michael and Scott)
- Helping scheme
- Back-Off strategy
- All together proved to be linearizable
13. Experiments
- Experiments with 1-30 threads performed on systems with 2 and 64 CPUs, respectively.
- Each thread performs 20000 operations, of which the first 50-10000 operations in total are Inserts; the remaining ones are distributed uniformly at random over Insert, FindKey and DeleteKey.
- Fixed skip list maximum level of 10.
- Compared with the implementation by Michael, using the same scenarios.
- Averaged execution time of 50 experiments.
14. SGI Origin 2000, 64 CPUs
15. Linux Pentium II, 2 CPUs
16. Conclusions
- Our lock-free implementation also includes the value-oriented operations FindValue and DeleteValue.
- Our lock-free algorithm is suitable for both pre-emptive systems and systems with full concurrency.
- Will be available as part of the NOBLE software library, http://www.noble-library.org
- See the Technical Report for full details, http://www.cs.chalmers.se/phs
17. Questions?
- Contact Information
- Address: Håkan Sundell and Philippas Tsigas, Computing Science, Chalmers University of Technology
- Email: <phs, tsigas> _at_ cs.chalmers.se
- Web: http://www.cs.chalmers.se/phs/warp
18. Dynamic Memory Management
- Problem: System memory allocation functionality is blocking!
- Solution (lock-free): IBM freelists
- Pre-allocate a number of nodes, link them into a dynamic stack structure, and allocate/reclaim using CAS
[Figure: free-list stack with Head, Mem 1, Mem 2, ..., Mem n; Allocate takes a node from the stack, Reclaim returns Used 1]
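A free-list of this kind can be sketched as a stack of pre-allocated blocks whose head is updated with CAS. The names Block, FreeList, allocate and reclaim are hypothetical. As written, the sketch has no protection against the ABA problem, so it should be read as an illustration of the data structure rather than a production allocator.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Block {
    Block* next_free = nullptr;  // link used only while the block is free
    int payload = 0;
};

class FreeList {
public:
    // Pre-allocate `count` blocks and push them all onto the free stack.
    explicit FreeList(std::size_t count) : storage_(count), head_(nullptr) {
        for (Block& b : storage_) reclaim(&b);
    }

    // Pop a block, or return nullptr if none is free.
    Block* allocate() {
        Block* top = head_.load();
        // On failure the CAS reloads `top`, and the loop tries again.
        while (top != nullptr &&
               !head_.compare_exchange_weak(top, top->next_free)) {
        }
        return top;
    }

    // Push a block back onto the free stack.
    void reclaim(Block* block) {
        Block* top = head_.load();
        do {
            block->next_free = top;
        } while (!head_.compare_exchange_weak(top, block));
    }

private:
    std::vector<Block> storage_;  // never resized, so addresses stay valid
    std::atomic<Block*> head_;
};
```

Neither operation calls the system allocator after construction, so a pre-empted thread cannot block the others inside malloc or free.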
19. The ABA problem
- Problem: Because of concurrency (pre-emption in particular), the same pointer value does not always mean the same node (i.e. CAS succeeds)!!!
[Figure: list state at Step 1 and at Step 2]
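The effect can be replayed deterministically in a single thread by writing out one unlucky interleaving by hand. The scenario below is a hypothetical stack example (not the list from the figure): "thread A" reads the head and plans a pop, "thread B" pops two nodes and pushes the first one back, and then A's CAS succeeds even though the stack has changed.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    int id;
    Node* next;
};

// Replays one interleaving by hand. Returns true if the stale CAS succeeded,
// i.e. the change made in between went undetected.
bool replay_aba(std::atomic<Node*>& head, Node& x, Node& y, Node& z) {
    // Initial stack: x -> y -> z
    x.next = &y;
    y.next = &z;
    z.next = nullptr;
    head.store(&x);

    // Thread A starts a pop: reads the head and its successor, then is
    // pre-empted before it can execute its CAS.
    Node* seen_head = head.load();             // &x
    Node* planned_new_head = seen_head->next;  // &y

    // Thread B runs in the meantime: pop x, pop y, push x back.
    head.store(x.next);     // head = y
    head.store(y.next);     // head = z; y now belongs to thread B
    x.next = head.load();   // x -> z
    head.store(&x);         // head = x again, same pointer value as before

    // Thread A resumes. The head holds the same pointer value, so the CAS
    // succeeds and installs y, a node that is no longer part of the stack.
    return head.compare_exchange_strong(seen_head, planned_new_head);
}
```

After the replay the head points at y although y was removed, which is exactly the kind of corruption that a pointer-value comparison alone cannot detect.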
20. The ABA problem
- Solution (Valois et al.): Add reference counting to each node, in order to prevent nodes that are of interest to some thread from being reclaimed until all threads have left the node
[Figure: New Step 2, in which the CAS fails because of the reference counts]
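The counting rule itself can be sketched as follows: a node is handed back for reuse only when the last reference to it is dropped. This is a deliberately simplified illustration with hypothetical names (CountedNode, acquire, release); it leaves out the hard part of a real scheme, namely acquiring a reference safely through a shared pointer that may change concurrently.

```cpp
#include <atomic>
#include <cassert>

struct CountedNode {
    int key;
    std::atomic<int> refs;  // starts at 1 for the link from the list
    bool reclaimed;         // stands in for "returned to the free-list"
    explicit CountedNode(int k) : key(k), refs(1), reclaimed(false) {}
};

// A thread that starts working with the node takes a reference.
void acquire(CountedNode& node) { node.refs.fetch_add(1); }

// Drop one reference. Only the caller that drops the last one reclaims
// the node; returns true in exactly that case.
bool release(CountedNode& node) {
    if (node.refs.fetch_sub(1) == 1) {
        node.reclaimed = true;
        return true;
    }
    return false;
}
```

As long as some thread still holds a reference, the node cannot be recycled, so its address cannot reappear with a different meaning underneath that thread.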
21. Helping Scheme
- Threads need to traverse safely
- Need to remove marked-to-be-deleted nodes while traversing: Help!
- Finds the previous node, finishes the deletion and continues traversing from the previous node
[Figure: list with nodes 1, 2 and 4 during traversal, showing alternative states ("or") with undetermined pointers (?)]
22. Back-Off Strategy
- For pre-emptive systems, helping is necessary for efficiency and lock-freeness
- For really concurrent systems, overlapping CAS operations (caused by helping and others) on the same node can cause heavy contention
- Solution: For every failed CAS attempt, back off (i.e. sleep) for a certain duration, which increases exponentially
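An exponential back-off around a CAS retry loop might look like the sketch below. The class name Backoff, the microsecond unit, and the delay limits of 1 and 1024 are assumptions chosen for illustration; suitable values depend on the machine and the workload.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper that doubles the waiting time after every failure,
// up to a fixed maximum.
class Backoff {
public:
    Backoff(int initial_us, int max_us) : delay_us_(initial_us), max_us_(max_us) {}

    // Returns the delay to use now and doubles the stored delay (capped).
    int next_delay_us() {
        int current = delay_us_;
        delay_us_ = (delay_us_ * 2 > max_us_) ? max_us_ : delay_us_ * 2;
        return current;
    }

    void pause() {
        std::this_thread::sleep_for(std::chrono::microseconds(next_delay_us()));
    }

private:
    int delay_us_;
    int max_us_;
};

// Lock-free increment that backs off after every failed CAS attempt.
int increment_with_backoff(std::atomic<int>& counter) {
    Backoff backoff(1, 1024);  // assumed limits
    int seen = counter.load();
    while (!counter.compare_exchange_strong(seen, seen + 1)) {
        backoff.pause();  // contention detected: wait before retrying
    }
    return seen + 1;
}
```

Without contention the first CAS succeeds and no time is spent sleeping; the delay only grows for threads that keep colliding on the same word.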
23. Non-blocking Synchronization
- Lock-Free Synchronization
- Avoids problems with locks
- Simple algorithms
- Fast when having low contention
- Wait-Free Synchronization
- Always finishes in a finite number of its own steps.
- Complex algorithms
- Memory consuming
- Less efficient on average than lock-free
24. Full SGI
25. Full Linux
26. The algorithm in more detail
- Insert
- Create node with random height
- Search position (Remember drops)
- Insert or update on level 1
- Insert on level 2 to top (unless already deleted)
- If already deleted then HelpDelete(1)
- All of this while keeping track of references, helping deleted nodes, etc.
27. The algorithm in more detail
- DeleteKey
- Search position (Remember drops)
- Mark node at level 1 as deleted, otherwise fail
- Mark next pointers on level 1 to top
- Delete on level top to 1 while detecting helping, indicate success
- Free node
- All of this while keeping track of references, helping deleted nodes, etc.
28. The algorithm in more detail
- HelpDelete(level)
- Mark next pointer at level to top
- Find previous node (info in node)
- Delete on level unless already helped, indicate success
- Return previous node
- All of this while keeping track of references, helping deleted nodes, etc.
29. Correctness
- Linearizability (Herlihy 1991)
- For an implementation to be linearizable, for every concurrent execution there should exist an equivalent sequential execution that respects the partial order of the operations in the concurrent execution
30. Correctness
- Define precise sequential semantics
- Define abstract state and its interpretation
- Show that state is atomically updated
- Define linearizability points
- Show that operations take effect atomically at these points with respect to sequential semantics
- Creates a total order using the linearizability points that respects the partial order
- The algorithm is linearizable
31. Correctness
- Lock-freeness
- At least one operation should always make progress
- There are no cyclic loop dependencies, and all potentially unbounded loops are gate-kept by CAS operations
- The CAS operation guarantees that at least one CAS will always succeed
- The algorithm is lock-free