Scalable and Lock-Free Concurrent Dictionaries

1
Scalable and Lock-Free Concurrent Dictionaries

Håkan Sundell
Philippas Tsigas

2
Outline

Synchronization Methods
Dictionaries
Concurrent Dictionaries
Previous results
New Lock-Free Algorithm
Experiments
Conclusions

3
Synchronization

Shared data structures need synchronization
Synchronization using Locks
Mutually exclusive access to whole or parts of
the data structure

[Figure: processes P1, P2 and P3]

4
Blocking Synchronization

Drawbacks
Blocking
Priority Inversion
Risk of deadlock
Locks: Semaphores, spinning, disabling interrupts, etc.
Reduced efficiency because of reduced parallelism

5
Non-blocking Synchronization

Lock-Free Synchronization
Optimistic approach (i.e. assumes no
interference)
The operation is prepared to later take effect
(unless interfered with) using hardware atomic
primitives
Possible interference is detected via the atomic
primitives, and causes a retry
Can cause starvation
Wait-Free Synchronization
Always finishes in a finite number of its own
steps.
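
The optimistic prepare-then-retry pattern of lock-free synchronization can be sketched in C++ as below. This is a minimal illustration, assuming `std::atomic<int>::compare_exchange_weak` as the atomic primitive; the helper name `lock_free_add` is hypothetical and does not come from the presentation.

```cpp
#include <atomic>

// Hypothetical helper: atomically add `delta` to `counter` without locks.
// Returns the value that was observed just before the successful update.
int lock_free_add(std::atomic<int>& counter, int delta) {
    int observed = counter.load();   // prepare: read the current state
    // Try to publish observed + delta. If another thread interfered, the
    // CAS fails and refreshes `observed` with the current value, so the
    // loop simply retries with up-to-date data.
    while (!counter.compare_exchange_weak(observed, observed + delta)) {
        // interference detected: retry
    }
    return observed;
}
```

Some thread always succeeds, so the system as a whole makes progress, but an individual thread may retry arbitrarily often, which is the starvation risk noted above.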

6
Dictionaries (Sets)

Fundamental data structure
Works on a set of <key,value> pairs
Three basic operations
Insert(k,v): Adds a new item
v = FindKey(k): Finds the item <k,v>
v = DeleteKey(k): Finds and removes the item <k,v>
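
As a point of reference, the sequential meaning of the three operations could be written down as follows. This is only a single-threaded sketch built on `std::map`, not the lock-free structure; the class name `SequentialDictionary` is hypothetical, and the choice that Insert overwrites an existing key is an assumption.

```cpp
#include <map>
#include <optional>

// Sequential reference model of the dictionary operations (not thread-safe).
// Assumption: Insert on an existing key updates its value.
template <typename K, typename V>
class SequentialDictionary {
public:
    void Insert(const K& k, const V& v) { items_.insert_or_assign(k, v); }

    // Returns the value stored under k, or std::nullopt if k is absent.
    std::optional<V> FindKey(const K& k) const {
        auto it = items_.find(k);
        if (it == items_.end()) return std::nullopt;
        return it->second;
    }

    // Removes k and returns its value, or std::nullopt if k is absent.
    std::optional<V> DeleteKey(const K& k) {
        auto it = items_.find(k);
        if (it == items_.end()) return std::nullopt;
        V v = it->second;
        items_.erase(it);
        return v;
    }

private:
    std::map<K, V> items_;
};
```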

7
Previous Non-blocking Dictionaries

M. Michael: "High Performance Dynamic Lock-Free
Hash Tables and List-Based Sets", SPAA 2002
Based on Singly-Linked List
Linear time complexity!
Fast Lock-Free Memory Management
Causes retries of concurrent search operations!
Building-block of Hash Tables
Assumes each branch is of length << 10.
However, Hash Tables might not be uniformly
distributed.

8
Randomized Algorithm: Skip Lists

William Pugh: "Skip Lists: A Probabilistic
Alternative to Balanced Trees", 1990
Layers of ordered lists with different densities,
achieving a tree-like behavior
Time complexity O(log2 N), probabilistic!
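
The different densities per layer come from choosing each node's height at random. A minimal sketch, assuming a fair coin per level (probability 0.5) and a caller-supplied cap; the function name `random_level` and the parameters are illustrative, not taken from the presentation.

```cpp
#include <cassert>
#include <random>

// Draw a node height in [1, max_level]. Below the cap, height h is chosen
// with probability 2^-h, so each layer holds roughly half as many nodes
// as the layer beneath it.
int random_level(std::mt19937& rng, int max_level) {
    assert(max_level >= 1);
    std::bernoulli_distribution coin(0.5);
    int level = 1;
    while (level < max_level && coin(rng)) {
        ++level;
    }
    return level;
}
```

With about half of the nodes skipped on every higher layer, a search that starts at the top behaves like a binary search, which is where the expected logarithmic cost comes from.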

[Figure: skip list with Head and Tail nodes holding keys 1 to 7]

9
New Lock-Free Concurrent Skip List

Define node state to depend on the insertion
status at lowest level as well as a deletion flag
Insert from lowest level going upwards
Set deletion flag. Delete from highest level
going downwards

[Figure: skip list nodes 1 to 7 with deletion flags (D) and level pointers 1 to 3]

10
Overlapping operations on shared data

Example: Insert operation - which of 2 or 3 gets
inserted?
Solution: Compare-And-Swap (CAS) atomic primitive
CAS(p: pointer to word, old: word, new: word): boolean
  atomic do
    if *p = old then
      *p := new
      return true
    else
      return false
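
The two overlapping inserts can be replayed with `std::atomic`, whose `compare_exchange_strong` plays the role of CAS here. This is a sketch under that assumption; `Node` and `try_insert_between` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    int key;
    std::atomic<Node*> next;
    explicit Node(int k, Node* n = nullptr) : key(k), next(n) {}
};

// Try once to link `node` between `pred` and `succ`.
// The CAS succeeds only if pred->next still equals succ, i.e. no other
// thread has inserted after `pred` since the position was searched.
bool try_insert_between(Node* pred, Node* succ, Node* node) {
    assert(pred != nullptr && node != nullptr);
    node->next.store(succ);
    return pred->next.compare_exchange_strong(succ, node);
}
```

If both inserters have read the same position (between 1 and 4), exactly one CAS succeeds; the loser sees `false`, searches again, and retries at its new position.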

[Figure: concurrent "Insert 2" and "Insert 3" on a list with nodes 1 and 4]

11
Concurrent Insert vs. Delete operations

Problem: both nodes are deleted!
Solution (Harris et al.): Use bit 0 of pointer to
mark deletion status
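
A sketch of the mark-bit idea, assuming nodes are aligned so that bit 0 of their address is always zero. The link is stored as a `std::uintptr_t` so the bit can be manipulated portably; all names (`Node`, `kMark`, `mark_deleted`, `try_insert_after`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct Node {
    int key;
    std::atomic<std::uintptr_t> next{0};   // successor address; bit 0 = deletion mark
    explicit Node(int k) : key(k) {}
};
static_assert(alignof(Node) >= 2, "bit 0 of a Node address must be free");

constexpr std::uintptr_t kMark = 1;

inline bool is_marked(std::uintptr_t link) { return (link & kMark) != 0; }
inline Node* to_node(std::uintptr_t link) { return reinterpret_cast<Node*>(link & ~kMark); }
inline std::uintptr_t to_link(Node* n) { return reinterpret_cast<std::uintptr_t>(n); }

// Logically delete `node` by setting the mark in its own next link.
// Returns false if another thread had already marked it.
bool mark_deleted(Node* node) {
    assert(node != nullptr);
    std::uintptr_t link = node->next.load();
    while (!is_marked(link)) {
        if (node->next.compare_exchange_weak(link, link | kMark)) return true;
    }
    return false;
}

// Link `node` after `pred`. The expected value is the unmarked link to
// `succ`, so the CAS fails if `pred` was marked or its successor changed.
bool try_insert_after(Node* pred, Node* succ, Node* node) {
    node->next.store(to_link(succ));
    std::uintptr_t expected = to_link(succ);
    return pred->next.compare_exchange_strong(expected, to_link(node));
}
```

Because the mark lives in the same word as the pointer, a single CAS checks both "successor unchanged" and "predecessor not deleted", so an insert can no longer attach a new node behind a node that is being removed.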

[Figure: concurrent Delete and Insert on a list with nodes 1, 2, 3 and 4; steps a) to c)]

12
New Lock-Free Dictionary - Techniques Summary

Based on Skip Lists
Treated as layers of ordered lists
Uses CAS atomic primitive
Lock-Free memory management
IBM Freelists
Reference counting (Valois, Michael & Scott)
Helping scheme
Back-Off strategy
All together proved to be linearizable

13
Experiments

Experiments with 1-30 threads, performed on systems
with 2 and 64 cpus, respectively.
Each thread performs 20000 operations, of which
the first 50-10000 operations in total are Inserts;
the remaining operations are uniformly randomly
distributed over Insert, FindKey and DeleteKey.
Fixed Skiplist maximum level of 10.
Compared with the implementation by Michael, using
the same scenarios.
Execution time averaged over 50 experiments.

14
SGI Origin 2000, 64 cpus.
15
Linux Pentium II, 2 cpus
16
Conclusions

Our lock-free implementation also includes the
value-oriented operations FindValue and
DeleteValue.
Our lock-free algorithm is suitable for both
pre-emptive systems and systems with full
concurrency.
Will be available as part of the NOBLE software
library, http://www.noble-library.org
See the Technical Report for full details,
http://www.cs.chalmers.se/phs

17
Questions?

Contact Information
Address: Håkan Sundell and Philippas Tsigas,
Computing Science, Chalmers University
of Technology
Email: <phs, tsigas> _at_ cs.chalmers.se
Web: http://www.cs.chalmers.se/phs/warp

18
Dynamic Memory Management

Problem: System memory allocation functionality
is blocking!
Solution (lock-free): IBM freelists
Pre-allocate a number of nodes, link them into a
dynamic stack structure, and allocate/reclaim
using CAS
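
A freelist of this kind could look roughly like the following sketch: a pre-allocated pool linked into a stack whose head is updated with CAS. The names `Block` and `FreeList` are hypothetical, and this plain version deliberately omits any protection against pointer reuse between the read of the head and the CAS.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Block {
    std::atomic<Block*> next{nullptr};
    int payload = 0;
};

class FreeList {
public:
    // Pre-allocate `n` blocks and link them into a stack.
    explicit FreeList(std::size_t n) : storage_(n) {
        for (Block& b : storage_) Reclaim(&b);
    }

    // Pop a block from the stack; returns nullptr when the list is empty.
    Block* Allocate() {
        Block* top = head_.load();
        while (top != nullptr &&
               !head_.compare_exchange_weak(top, top->next.load())) {
            // CAS failed: `top` now holds the current head, so retry
        }
        return top;
    }

    // Push a block back onto the stack.
    void Reclaim(Block* b) {
        assert(b != nullptr);
        Block* top = head_.load();
        do {
            b->next.store(top);
        } while (!head_.compare_exchange_weak(top, b));
    }

private:
    std::vector<Block> storage_;          // owns the pre-allocated memory
    std::atomic<Block*> head_{nullptr};   // top of the stack of free blocks
};
```

Since the pool is never returned to the system allocator, reading `top->next` is always a valid memory access even if another thread has just taken `top`.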

[Figure: freelist with Head, Mem 1, Mem 2, ..., Mem n; Allocate and Reclaim (Used 1) operations]

19
The ABA problem

Problem: Because of concurrency (pre-emption in
particular), the same pointer value does not always
mean the same node state (i.e. CAS succeeds
although it should fail)!
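
The effect can be reproduced deterministically by replaying one possible interleaving in a single thread. The sketch below is an illustration on a three-element stack, not the scenario drawn on the slide; `Node` and `replay_aba` are hypothetical names.

```cpp
#include <atomic>

struct Node {
    Node* next = nullptr;
};

// Replays an ABA interleaving on a stack. Returns true if the stale CAS
// succeeded, i.e. the problem occurred.
bool replay_aba(std::atomic<Node*>& head, Node& a, Node& b, Node& c) {
    // Initial stack: head -> a -> b -> c
    c.next = nullptr;
    b.next = &c;
    a.next = &b;
    head.store(&a);

    // "Thread 1" starts a pop: reads the head and its successor,
    // then is pre-empted before its CAS.
    Node* seen_top = head.load();       // &a
    Node* seen_next = seen_top->next;   // &b

    // "Thread 2" runs meanwhile: pops a, pops b, then pushes a back.
    head.store(&c);
    a.next = &c;
    head.store(&a);                     // stack is now head -> a -> c

    // "Thread 1" resumes. The head equals &a again, so the CAS succeeds
    // and installs b, which is no longer part of the stack.
    return head.compare_exchange_strong(seen_top, seen_next);
}
```

After the call the head points at `b`, a node that another thread believes it owns, which is exactly the corruption the CAS was supposed to rule out.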

[Figure: ABA example, Step 1 and Step 2]

20
The ABA problem

Solution (Valois et al.): Add reference counting
to each node, in order to prevent nodes that are
of interest to some thread from being reclaimed
until all threads have left the node
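
The counting rule itself is small; a heavily simplified sketch follows. It assumes the caller already holds a valid reference when calling `acquire`; a complete scheme also has to make "read a shared pointer and increment the count" safe against concurrent reclamation, which is omitted here. `Node`, `acquire` and `release` are hypothetical names, and the `reclaimed` flag merely stands in for returning the node to the freelist.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    std::atomic<int> refs{1};   // one reference is held by the data structure
    bool reclaimed = false;     // stand-in for "pushed back onto the freelist"
};

// Announce that the calling thread is interested in `n`.
void acquire(Node* n) {
    assert(n != nullptr);
    n->refs.fetch_add(1);
}

// Drop one reference; whoever drops the last one reclaims the node.
void release(Node* n) {
    assert(n != nullptr);
    if (n->refs.fetch_sub(1) == 1) {
        n->reclaimed = true;
    }
}
```

As long as a pre-empted thread still holds its reference, the node cannot be recycled and reappear under the same address, so the stale CAS from the ABA scenario has nothing to succeed on.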

[Figure: New Step 2 with reference counts: the CAS fails!]

21
Helping Scheme

Threads need to traverse safely
Need to remove marked-to-be-deleted nodes while
traversing: Help!
Find the previous node, finish the deletion and
continue traversing from the previous node
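
Helping during traversal can be sketched on a single ordered list with mark bits in the next links. This is an illustration in the style of list-based sets, not the presented skip-list algorithm: it restarts from the head when its CAS is interfered with, and it leaves unlinked nodes unreclaimed. All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct Node {
    int key;
    std::atomic<std::uintptr_t> next{0};   // bit 0 set = node is logically deleted
    explicit Node(int k) : key(k) {}
};

constexpr std::uintptr_t kMark = 1;
inline bool is_marked(std::uintptr_t link) { return (link & kMark) != 0; }
inline Node* to_node(std::uintptr_t link) { return reinterpret_cast<Node*>(link & ~kMark); }
inline std::uintptr_t to_link(Node* n) { return reinterpret_cast<std::uintptr_t>(n); }

// Returns the first unmarked node with key >= `key` (or nullptr), unlinking
// every marked node met on the way. `head` is a sentinel that is never deleted.
Node* find_and_help(Node* head, int key) {
    assert(head != nullptr);
    while (true) {
        Node* pred = head;
        Node* curr = to_node(pred->next.load());
        bool interfered = false;
        while (curr != nullptr && !interfered) {
            std::uintptr_t curr_link = curr->next.load();
            if (is_marked(curr_link)) {
                // Help: finish the deletion by swinging pred->next past curr.
                std::uintptr_t expected = to_link(curr);
                if (pred->next.compare_exchange_strong(expected, curr_link & ~kMark)) {
                    curr = to_node(curr_link);   // go on from the previous node
                } else {
                    interfered = true;           // pred changed or was marked
                }
            } else if (curr->key >= key) {
                return curr;
            } else {
                pred = curr;
                curr = to_node(curr_link);
            }
        }
        if (!interfered) return nullptr;         // reached the end of the list
    }
}
```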

[Figure: helping a pending deletion on a list with nodes 1, 2 and 4]

22
Back-Off Strategy

For pre-emptive systems, helping is necessary for
efficiency and lock-freeness
For really concurrent systems, overlapping CAS
operations (caused by helping and others) on the
same node can cause heavy contention
Solution: For every failed CAS attempt, back off
(i.e. sleep) for a certain duration, which
increases exponentially
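
A minimal sketch of exponential back-off around a CAS retry loop. The starting delay of one microsecond, the doubling factor and the upper bound `max_delay` are assumptions made for the illustration, as is the helper name `increment_with_backoff`.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper: increment `counter` with CAS, sleeping after every
// failed attempt for a duration that doubles each time, up to `max_delay`.
// Returns the number of failed attempts.
int increment_with_backoff(
        std::atomic<int>& counter,
        std::chrono::microseconds max_delay = std::chrono::microseconds(1024)) {
    std::chrono::microseconds delay(1);
    int failures = 0;
    int observed = counter.load();
    while (!counter.compare_exchange_strong(observed, observed + 1)) {
        ++failures;
        std::this_thread::sleep_for(delay);
        delay = std::min(delay * 2, max_delay);   // exponential growth, capped
        // the failed CAS already refreshed `observed`
    }
    return failures;
}
```

Spreading the retries out in time lowers the chance that the same threads collide again on the same node, at the cost of some added latency per failed attempt.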

23
Non-blocking Synchronization

Lock-Free Synchronization
Avoids problems with locks
Simple algorithms
Fast when having low contention
Wait-Free Synchronization
Always finishes in a finite number of its own
steps.
Complex algorithms
Memory consuming
Less efficient on average than lock-free

24
Full SGI
25
Full Linux
26
The algorithm in more detail

Insert
Create node with random height
Search position (Remember drops)
Insert or update on level 1
Insert on level 2 to top (unless already deleted)
If already deleted then HelpDelete(1)
All of this while keeping track of references,
helping deleted nodes, etc.
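
The first steps (random height, search that remembers the drops, insert or update on level 1, then link upwards) can be seen in a plain sequential skip list. The sketch below is single-threaded and leaves out CAS, deletion marks, helping and reference counting entirely; `SkipList`, `Search` and the fixed seed are assumptions for the illustration.

```cpp
#include <array>
#include <climits>
#include <random>

constexpr int kMaxLevel = 10;   // assumed cap on node height

struct Node {
    int key;
    int value;
    int height;                             // number of levels the node is linked into
    std::array<Node*, kMaxLevel> next{};    // next[i] is the successor on level i + 1
};

class SkipList {
public:
    SkipList() = default;
    SkipList(const SkipList&) = delete;
    SkipList& operator=(const SkipList&) = delete;
    ~SkipList() {
        for (Node* n = head_.next[0]; n != nullptr;) {
            Node* after = n->next[0];
            delete n;
            n = after;
        }
    }

    // Insert a new item, or update the value if the key already exists.
    void Insert(int key, int value) {
        std::array<Node*, kMaxLevel> drops{};   // node where each level was left
        Node* found = Search(key, &drops);
        if (found != nullptr) { found->value = value; return; }
        Node* node = new Node{key, value, RandomHeight()};
        for (int lvl = 0; lvl < node->height; ++lvl) {   // level 1 first, then upwards
            node->next[lvl] = drops[lvl]->next[lvl];
            drops[lvl]->next[lvl] = node;
        }
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* FindKey(int key) {
        std::array<Node*, kMaxLevel> drops{};
        Node* found = Search(key, &drops);
        return found != nullptr ? &found->value : nullptr;
    }

private:
    // Walk from the top level down, remembering the drop point on each level.
    Node* Search(int key, std::array<Node*, kMaxLevel>* drops) {
        Node* pos = &head_;
        for (int lvl = kMaxLevel - 1; lvl >= 0; --lvl) {
            while (pos->next[lvl] != nullptr && pos->next[lvl]->key < key) {
                pos = pos->next[lvl];
            }
            (*drops)[lvl] = pos;
        }
        Node* candidate = pos->next[0];
        return (candidate != nullptr && candidate->key == key) ? candidate : nullptr;
    }

    int RandomHeight() {
        int h = 1;
        while (h < kMaxLevel && coin_(rng_)) ++h;
        return h;
    }

    Node head_{INT_MIN, 0, kMaxLevel};        // sentinel spanning all levels
    std::mt19937 rng_{12345};                 // fixed seed keeps the sketch reproducible
    std::bernoulli_distribution coin_{0.5};
};
```

In a concurrent version, each of the two pointer assignments per level would become a CAS on the remembered predecessor, retried after a fresh search when it fails.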

27
The algorithm in more detail

DeleteKey
Search position (Remember drops)
Mark node at level 1 as deleted, otherwise fail
Mark next pointers on level 1 to top
Delete on level top to 1 while detecting helping,
indicate success
Free node
All of this while keeping track of references,
helping deleted nodes, etc.

28
The algorithm in more detail

HelpDelete(level)
Mark next pointer at level to top
Find previous node (info in node)
Delete on level unless already helped, indicate
success
Return previous node
All of this while keeping track of references,
helping deleted nodes, etc.

29
Correctness

Linearizability (Herlihy 1991)
In order for an implementation to be
linearizable, for every concurrent execution,
there should exist an equivalent sequential
execution that respects the partial order of the
operations in the concurrent execution

30
Correctness

Define precise sequential semantics
Define abstract state and its interpretation
Show that state is atomically updated
Define linearizability points
Show that operations take effect atomically at
these points with respect to sequential semantics
Creates a total order using the linearizability
points that respects the partial order
The algorithm is linearizable

31
Correctness

Lock-freeness
At least one operation should always make
progress
There are no cyclic loop dependencies, and all
potentially unbounded loops are gate-kept by
CAS operations
The CAS operation guarantees that at least one
CAS will always succeed
The algorithm is lock-free
