Transactional Memory

Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit


Transcript and Presenter's Notes

1
Transactional Memory
  • Companion slides for
  • The Art of Multiprocessor Programming
  • by Maurice Herlihy & Nir Shavit

2
Traditional Software Scaling
[Figure: speedup of user code on a traditional uniprocessor over time, following Moore's law: 1.8x, 3.6x, 7x]
3
Multicore Software Scaling
[Figure: user code on a multicore]
Unfortunately, not so simple
4
Real-World Multicore Scaling
[Figure: speedup of user code on a multicore: 1.8x, 2x, 2.9x]
Parallelization and synchronization require great care
5
Why?
Amdahl's Law: Speedup = 1 / (ParallelPart/N + SequentialPart)
Pay for N = 8 cores, SequentialPart = 25%
Speedup only 2.9 times!
As the number of cores grows, the effect of the 25% becomes more acute:
2.3/4, 2.9/8, 3.4/16, 3.7/32.
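The arithmetic on this slide can be checked with a short sketch. The class and method names (`AmdahlSketch`, `speedup`) are illustrative assumptions, not part of the slides; the formula is the one stated above.

```java
final class AmdahlSketch {
    // Amdahl's Law: speedup = 1 / (parallelPart / n + sequentialPart),
    // where parallelPart = 1 - sequentialPart.
    static double speedup(double sequentialPart, int cores) {
        double parallelPart = 1.0 - sequentialPart;
        return 1.0 / (parallelPart / cores + sequentialPart);
    }

    public static void main(String[] args) {
        // The slide's scenario: 25% of the work is sequential.
        for (int cores : new int[] {4, 8, 16, 32}) {
            System.out.printf("%2d cores: %.1fx%n", cores, speedup(0.25, cores));
        }
    }
}
```

With a 25% sequential part the speedup can never exceed 4x, however many cores are added, which is why the listed values flatten out.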
6
Shared Data Structures
[Figure: fine-grained and coarse-grained cases, each labeled 25% shared, 75% unshared]
7
A FIFO Queue
[Figure: queue with Head and Tail; Enqueue(d), Dequeue() => a]
8
A Concurrent FIFO Queue
Simple Code, easy to prove correct
Object lock
Contention and sequential bottleneck
9
Fine Grain Locks
Finer Granularity, More Complex Code
[Figure: queue with Head and Tail; P: Dequeue() => a, Q: Enqueue(d)]
Verification nightmare: worry about deadlock, livelock
10
Fine Grain Locks
Complex boundary cases: empty queue, last item
[Figure: queue with Head and Tail; P: Dequeue() => a, Q: Enqueue(b)]
Worry about how to acquire multiple locks
11
Moreover, Locking Relies on Conventions
  • Relation between
  • Lock bit and object bits
  • Exists only in programmer's mind

Actual comment from Linux Kernel (hat tip:
Bradley Kuszmaul)
/* When a locked buffer is visible to the I/O
layer BH_Launder is set. This means before
unlocking we must clear BH_Launder,mb() on
alpha and then clear BH_Lock, so no reader can
see BH_Launder set on an unlocked buffer and
then risk to deadlock. */
12
Lock-Free (JDK 1.5)
Even Finer Granularity, Even More Complex Code
[Figure: queue with Head and Tail; P: Dequeue() => a, Q: Enqueue(d)]
Worry about starvation, subtle bugs, code that is hard to modify
13
Composing Objects
Complex: move data atomically between structures
More than twice the worry
14
Transactional Memory
Great Performance, Simple Code
[Figure: queue with Head and Tail; P: Dequeue() => a, Q: Enqueue(d)]
Don't worry about deadlock, livelock, subtle bugs, etc.
15
Promise of Transactional Memory
Don't worry about which locks need to cover which variables when
[Figure: queue with Head and Tail; P: Dequeue() => a, Q: Enqueue(d)]
TM deals with boundary cases under the hood
16
Composing Objects
Will be easy to modify multiple structures
atomically
Provide Composability
17
The Transactional Manifesto
  • Current practice inadequate
  • to meet the multicore challenge
  • Research Agenda
  • Replace locking with a transactional API
  • Design languages to support this model
  • Implement the run-time to be fast enough

18
Transactions
  • Atomic
  • Commit: takes effect
  • Abort: effects rolled back
  • Usually retried
  • Serializable
  • Appear to happen in one-at-a-time order

19
Atomic Blocks
atomic { x.remove(3); y.add(3); }
atomic { y = null; }
20
Atomic Blocks
atomic { x.remove(3); y.add(3); }
atomic { y = null; }
No data race
21
Designing a FIFO Queue
public void LeftEnq(item x) {
  Qnode q = new Qnode(x);
  q.left = this.left;
  this.left.right = q;
  this.left = q;
}
Write sequential code
22
Designing a FIFO Queue
public void LeftEnq(item x) {
  atomic {
    Qnode q = new Qnode(x);
    q.left = this.left;
    this.left.right = q;
    this.left = q;
  }
}
23
Designing a FIFO Queue
public void LeftEnq(item x) {
  atomic {
    Qnode q = new Qnode(x);
    q.left = this.left;
    this.left.right = q;
    this.left = q;
  }
}
Enclose in atomic block
24
Warning
  • Not always this simple
  • Conditional waits
  • Enhanced concurrency
  • Complex patterns
  • But often it is
  • Works for sadistic homework

25
Composition
public void Transfer(Queue<T> q1, q2) {
  atomic {
    T x = q1.deq();
    q2.enq(x);
  }
}
Trivial or what?
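Java has no `atomic` keyword, so the composition above cannot be run as written. The following is a minimal sketch that stands in for `atomic` with a single global reentrant lock. This is an assumption made purely for illustration: it gives atomicity and composability but none of the concurrency a real TM aims for. All names (`AtomicSketch`, `atomic`, `Queue`, `transfer`) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

final class AtomicSketch {
    // Stand-in for the slides' `atomic` block: one global reentrant lock.
    // Reentrancy lets an atomic body call other atomic methods (nesting).
    private static final ReentrantLock GLOBAL = new ReentrantLock();

    static <T> T atomic(Supplier<T> body) {
        GLOBAL.lock();
        try {
            return body.get();
        } finally {
            GLOBAL.unlock();
        }
    }

    static final class Queue<T> {
        private final ArrayDeque<T> items = new ArrayDeque<>();

        void enq(T x) { atomic(() -> { items.addLast(x); return null; }); }

        T deq() { return atomic(items::pollFirst); } // null if the queue is empty

        int size() { return AtomicSketch.<Integer>atomic(items::size); }
    }

    // Composition: deq and enq are each atomic, and so is the pair.
    static <T> void transfer(Queue<T> q1, Queue<T> q2) {
        atomic(() -> {
            T x = q1.deq();
            if (x != null) {
                q2.enq(x);
            }
            return null;
        });
    }

    public static void main(String[] args) {
        Queue<String> a = new Queue<>();
        Queue<String> b = new Queue<>();
        a.enq("d");
        transfer(a, b);
        System.out.println(a.size() + " " + b.size());
    }
}
```

No other thread using these queues can observe the item missing from both of them, which is the property that is hard to get by combining two independently locked queues.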
26
Roll Back
public T LeftDeq() {
  atomic {
    if (this.left == null)
      retry;
    ...
  }
}
Roll back transaction and restart when something changes
27
OrElse Composition
atomic {
  x = q1.deq();
} orElse {
  x = q2.deq();
}
Run 1st method. If it retries,
run 2nd method. If it retries,
entire statement retries.
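The control flow of `orElse` can be sketched in plain Java by modeling `retry` as an exception. This is only an approximation under stated assumptions: there is no rollback and no waiting, so it is faithful only when an alternative retries before making any changes (as a dequeue on an empty queue does). The names `OrElseSketch`, `Retry`, `orElse`, and `deq` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

final class OrElseSketch {
    /** Models the slides' `retry` statement. */
    static final class Retry extends RuntimeException {}

    // Run the first alternative; if it retries, run the second.
    // If the second retries too, Retry propagates: the whole statement retries.
    static <T> T orElse(Supplier<T> first, Supplier<T> second) {
        try {
            return first.get();
        } catch (Retry r) {
            return second.get();
        }
    }

    // Dequeue that retries on an empty queue, before touching any state.
    static <T> T deq(Deque<T> q) {
        if (q.isEmpty()) {
            throw new Retry();
        }
        return q.removeFirst();
    }

    public static void main(String[] args) {
        Deque<String> q1 = new ArrayDeque<>();
        Deque<String> q2 = new ArrayDeque<>();
        q2.add("b");
        String x = orElse(() -> deq(q1), () -> deq(q2));
        System.out.println(x); // q1 is empty, so the value comes from q2
    }
}
```

A real TM would also undo any partial effects of the first alternative before trying the second, and would block the enclosing transaction until one of the locations it read changes.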
28
Transactional Memory
  • Software transactional memory (STM)
  • Hardware transactional memory (HTM)
  • Hybrid transactional memory (HyTM, try in
    hardware and default to software if unsuccessful)

29
Hardware versus Software
  • Do we need hardware at all?
  • Analogies
  • Virtual memory: yes!
  • Garbage collection: no!
  • Probably do need HW for performance
  • Do we need software?
  • Policy issues don't make sense for hardware

30
Transactional Consistency
  • Memory Transactions are collections of reads and
    writes executed atomically
  • Transactions should maintain internal and external
    consistency
  • External: with respect to the interleavings of
    other transactions.
  • Internal: the transaction itself should operate
    on a consistent state.

31
External Consistency
Invariant: x = 2y
Application memory: x = 4, y = 2
Transaction A: Write x, Write y
Transaction B: Read x, Read y, Compute z = 1/(x-y) = 1/2
32
Simple Lock-Based STM
  • STMs come in different forms
  • Lock-based
  • Lock-free
  • Here we will describe a simple lock-based STM

33
Synchronization
  • Transaction keeps
  • Read set: locations & values read
  • Write set: locations & values to be written
  • Deferred update
  • Changes installed at commit
  • Lazy conflict detection
  • Conflicts detected at commit

34
STM Transactional Locking
[Figure: application memory mapped to an array of versioned write-locks (V)]
35
Reading an Object
[Figure: Mem and Locks, with version V]
  • Put version (V) & value in RS
  • If not already locked

36
To Write an Object
[Figure: Mem and Locks, with version V]
  • Add version (V) and new value to WS

37
To Commit
[Figure: Mem and Locks; locations X and Y go from version V to V+1]
  • Acquire W locks
  • Check versions (V) unchanged
  • In RS & WS
  • Install new values
  • Increment versions (V)
  • Release

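The read, write, and commit steps above can be put together in a small sketch. It is an illustration under stated assumptions, not the implementation from the book: the lock word encoding (2 * version + locked bit), the try-lock that aborts instead of waiting, and all names (`SimpleStmSketch`, `Loc`, `Tx`, `Abort`) are hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

final class SimpleStmSketch {
    /** One memory word plus its versioned write-lock (word = 2 * version + lockedBit). */
    static final class Loc {
        volatile int value;
        final AtomicLong lock = new AtomicLong(0);
        Loc(int value) { this.value = value; }
    }

    static final class Abort extends RuntimeException {}

    static final class Tx {
        private final Map<Loc, Long> seen = new HashMap<>();        // RS and WS: lock word first observed
        private final Map<Loc, Integer> writeSet = new HashMap<>(); // WS: values to install at commit

        private void record(Loc l, long word) {
            if ((word & 1) != 0) throw new Abort();                  // already locked by a committer
            Long first = seen.putIfAbsent(l, word);
            if (first != null && first != word) throw new Abort();   // version moved since first access
        }

        int read(Loc l) {
            Integer pending = writeSet.get(l);
            if (pending != null) return pending;                     // read our own deferred write
            long before = l.lock.get();
            int v = l.value;
            if (l.lock.get() != before) throw new Abort();           // changed while we were reading
            record(l, before);
            return v;
        }

        void write(Loc l, int v) {
            record(l, l.lock.get());
            writeSet.put(l, v);                                      // deferred update
        }

        boolean commit() {
            List<Loc> locked = new ArrayList<>();
            boolean ok = false;
            try {
                for (Loc l : writeSet.keySet()) {                    // acquire write locks (try-lock)
                    long word = seen.get(l);
                    if (!l.lock.compareAndSet(word, word | 1)) return false;
                    locked.add(l);
                }
                for (Map.Entry<Loc, Long> e : seen.entrySet()) {     // check read versions unchanged
                    Loc l = e.getKey();
                    if (!writeSet.containsKey(l) && l.lock.get() != e.getValue()) return false;
                }
                for (Loc l : locked) l.value = writeSet.get(l);      // install new values
                ok = true;
                return true;
            } finally {
                for (Loc l : locked) {                               // release; bump version on success
                    long word = seen.get(l);
                    l.lock.set(ok ? word + 2 : word);
                }
            }
        }
    }

    public static void main(String[] args) {
        Loc x = new Loc(4), y = new Loc(2);
        Tx a = new Tx();
        a.write(x, 8);
        a.write(y, 4);
        Tx b = new Tx();
        System.out.println(b.read(x));   // a's writes are still deferred
        System.out.println(a.commit());  // a installs x and y and bumps their versions
        System.out.println(b.commit());  // b's read of x is stale, so b must abort
    }
}
```

Because conflicts are detected only at commit, a transaction such as `b` can keep running on stale data until then.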
38
Problem: Internal Inconsistency
  • A zombie is a currently active transaction that
    is destined to abort because it saw an
    inconsistent state
  • If zombies see inconsistent states, errors can
    occur, and the fact that the transaction will
    eventually abort does not save us

39
Internal Consistency
Invariant: x = 2y
Application memory: x = 4, y = 2
Transaction B: Read x = 4
Transaction A: Write x (kills B), Write y
Transaction B (zombie): Read y = 4, Compute z = 1/(x-y)
DIV by 0 ERROR
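The interleaving on this slide can be replayed deterministically in a single thread. The sketch below is only a replay of the schedule, not an STM. The values written by A (x = 8, y = 4) are an assumption chosen to preserve the invariant and match the slide's "Read y = 4", and the names are hypothetical.

```java
final class ZombieSketch {
    // Replays the slide's schedule: B reads x, A commits, zombie B reads y.
    static String replay() {
        int x = 4, y = 2;            // shared memory; invariant x == 2 * y
        int bx = x;                  // B reads x = 4
        x = 8;                       // A writes x (B is now doomed) ...
        y = 4;                       // ... and y; the invariant holds again
        int by = y;                  // zombie B reads y = 4 from a different snapshot
        try {
            int z = 1 / (bx - by);   // 1 / (4 - 4)
            return "z = " + z;
        } catch (ArithmeticException e) {
            return "zombie crashed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

Both memory states satisfy the invariant; the failure comes purely from B mixing values from two of them.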
40
Solution: The Global Clock
  • Have one shared global clock
  • Incremented by (small subset of) writing
    transactions
  • Read by all transactions
  • Used to validate that state worked on is always
    consistent

41
Read-Only Transactions
[Figure: Mem and Locks with versions 12, 32, 56, 19, 17 and clock value 100; private read version (RV)]
  • Copy version clock to RV
  • Read lock, V
  • Read mem
  • Check unlocked
  • Recheck V unchanged
  • Check V < RV

Reads form a snapshot of memory. No read set!
42
Regular Transactions
[Figure: Mem and Locks with versions 12, 32, 56, 19, 17 and clock value 100; private read version (RV)]
  • Copy version clock to RV
  • On read/write, check
  • Unlocked
  • V < RV
  • Add to R/W set

43
Regular Transactions
[Figure: Mem and Locks with versions 12, 32, 56, 19, 17; x and y take version 100; shared version clock goes from 100 to 101; private read version (RV)]
  • Acquire locks
  • WV = F&Inc(version clock)
  • Check each V < RV
  • Update memory
  • Set write versions to WV

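The read-only steps and the clock handling at commit can be sketched as follows. This is a minimal illustration under stated assumptions: the lock word encoding (2 * version + locked bit), a clock that starts at 1 so that initial version 0 passes the strict `V < RV` check, and a single-location writer that exists only to advance versions for the demo. All names (`ReadOnlyTxSketch`, `CLOCK`, `Loc`, `ReadOnlyTx`, `commitWrite`, `Abort`) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ReadOnlyTxSketch {
    static final AtomicLong CLOCK = new AtomicLong(1);   // shared global version clock

    static final class Loc {
        volatile int value;
        final AtomicLong lock = new AtomicLong(0);       // 2 * version + lockedBit
        Loc(int value) { this.value = value; }
    }

    static final class Abort extends RuntimeException {}

    static final class ReadOnlyTx {
        private final long rv = CLOCK.get();             // copy version clock to RV

        int read(Loc l) {
            long before = l.lock.get();                  // read lock and version
            int v = l.value;                             // read memory
            long after = l.lock.get();
            boolean locked = (before & 1) != 0;          // check unlocked
            boolean changed = before != after;           // recheck version unchanged
            boolean tooNew = (before >> 1) >= rv;        // check V < RV
            if (locked || changed || tooNew) throw new Abort();
            return v;                                    // no read set needed
        }
    }

    // Demo-only writer for one location, following the commit steps above.
    static void commitWrite(Loc l, int v) {
        long word = l.lock.get();
        if ((word & 1) != 0 || !l.lock.compareAndSet(word, word | 1)) throw new Abort();
        long wv = CLOCK.getAndIncrement();               // WV = F&Inc(version clock)
        l.value = v;                                     // update memory
        l.lock.set(wv << 1);                             // set version to WV and release
    }

    public static void main(String[] args) {
        Loc x = new Loc(4);
        ReadOnlyTx t1 = new ReadOnlyTx();
        System.out.println(t1.read(x));                  // version 0 is older than t1's RV
        commitWrite(x, 8);
        try {
            t1.read(x);
        } catch (Abort e) {
            System.out.println("t1 aborts: x is newer than its snapshot");
        }
        System.out.println(new ReadOnlyTx().read(x));    // a fresh transaction sees the new value
    }
}
```

Every value a transaction is allowed to read is older than its RV, so all of its reads belong to one snapshot and a zombie is stopped at the read itself rather than at commit.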
44
Hardware Transactional Memory
  • Exploit Cache coherence
  • Already almost does it
  • Invalidation
  • Consistency checking
  • Speculative execution
  • Branch prediction = optimistic synch!

45
HW Transactional Memory
[Figure: caches, interconnect, memory; labels: read, active]
46
Transactional Memory
[Figure: caches and memory; labels: read, active, active]
47
Transactional Memory
[Figure: caches and memory; labels: active, committed, active]
48
Transactional Memory
[Figure: caches and memory; labels: write, committed, active]
49
Rewind
[Figure: caches and memory; labels: write, aborted, active, active]
50
Transaction Commit
  • At commit point
  • If no cache conflicts, we win.
  • Mark transactional entries
  • Read-only: valid
  • Modified: dirty (eventually written back)
  • That's all, folks!
  • Except for a few details

51
Not all Skittles and Beer
  • Limits to
  • Transactional cache size
  • Scheduling quantum
  • Transaction cannot commit if it is
  • Too big
  • Too slow
  • Actual limits platform-dependent

52
TM Design Issues
  • Implementation choices
  • Language design issues
  • Semantic issues

53
Granularity
  • Object
  • Managed languages, Java, C#, ...
  • Easy to control interactions between
    transactional & non-trans threads
  • Word
  • C, C++, ...
  • Hard to control interactions between
    transactional & non-trans threads

54
Direct/Deferred Update
  • Deferred
  • Modify private copies & install on commit
  • Commit requires work
  • Consistency easier
  • Direct
  • Modify in place, roll back on abort
  • Makes commit efficient
  • Consistency harder

55
Conflict Detection
  • Eager
  • Detect before conflict arises
  • Contention manager module resolves
  • Lazy
  • Detect on commit/abort
  • Mixed
  • Eager write/write, lazy read/write

56
Conflict Detection
  • Eager detection may abort transaction that could
    have committed.
  • Lazy detection discards more computation.

57
Contention Management & Scheduling
  • How to resolve conflicts?
  • Who moves forward and who rolls back?
  • Lots of empirical work but formal work in infancy

58
Contention Manager Strategies
  • Exponential backoff
  • Priority to
  • Oldest?
  • Most work?
  • Non-waiting?
  • None Dominates
  • But needed anyway

Judgment of Solomon
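Of the strategies listed, exponential backoff is the simplest to sketch: after each conflict the transaction waits a random time, and the upper bound on that wait doubles up to a cap. The class below is a hedged illustration; its name, the millisecond units, and the `nextLimit` helper are assumptions made for this example.

```java
import java.util.concurrent.ThreadLocalRandom;

final class BackoffSketch {
    private final int maxDelayMs;
    private int limit;                        // current upper bound on the random delay

    BackoffSketch(int minDelayMs, int maxDelayMs) {
        this.limit = minDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    // Returns the bound for this round and doubles it (up to the cap) for the next.
    int nextLimit() {
        int current = limit;
        limit = Math.min(maxDelayMs, 2 * limit);
        return current;
    }

    // Called after a conflict: sleep for a random time within the current bound.
    void backoff() throws InterruptedException {
        int bound = nextLimit();
        Thread.sleep(ThreadLocalRandom.current().nextInt(bound + 1));
    }

    public static void main(String[] args) throws InterruptedException {
        BackoffSketch backoff = new BackoffSketch(1, 64);
        int attempts = 0;
        while (++attempts < 4) {              // pretend the first three commit attempts conflict
            backoff.backoff();
        }
        System.out.println("committed after " + attempts + " attempts");
    }
}
```

Randomizing the delay keeps two conflicting transactions from retrying in lockstep; the priority-based policies on the slide would instead decide which of the two is rolled back.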
59
I/O & System Calls?
  • Some I/O revocable
  • Provide transaction-safe libraries
  • Undoable file system/DB calls
  • Some not
  • Opening cash drawer
  • Firing missile

60
I/O & System Calls
  • One solution: make transaction irrevocable
  • If transaction tries I/O, switch to irrevocable
    mode.
  • There can be only one
  • Requires serial execution
  • No explicit aborts
  • In irrevocable transactions

61
Exceptions
int i = 0;
try {
  atomic {
    i++;
    node = new Node();
  }
} catch (Exception e) {
  print(i);
}
62
Exceptions
Throws OutOfMemoryException!
int i = 0;
try {
  atomic {
    i++;
    node = new Node();
  }
} catch (Exception e) {
  print(i);
}
63
Exceptions
Throws OutOfMemoryException!
int i = 0;
try {
  atomic {
    i++;
    node = new Node();
  }
} catch (Exception e) {
  print(i);
}
What is printed?
64
Unhandled Exceptions
  • Aborts transaction
  • Preserves invariants
  • Safer
  • Commits transaction
  • Like locking semantics
  • What if exception object refers to values
    modified in transaction?

65
Nested Transactions
atomic void foo() {
  bar();
}
atomic void bar() {
  ...
}

66
Nested Transactions
  • Needed for modularity
  • Who knew that cosine() contained a transaction?
  • Flat nesting
  • If child aborts, so does parent
  • First-class nesting
  • If child aborts, partial rollback of child only

67
Open Nested Transactions
  • Normally, child commit
  • Visible only to parent
  • In open nested transactions
  • Commit visible to all
  • Escape mechanism
  • Dangerous, but useful
  • What escape mechanisms are needed?

68
Strong vs Weak Isolation
  • How do transactional & non-transactional threads
    synchronize?
  • Similar to memory-model theory?
  • Efficient algorithms?

69
I, for one, Welcome our new Multicore Overlords
  • Multicore forces us to rethink almost everything

70
I, for one, Welcome our new Multicore Overlords
  • Multicore forces us to rethink almost everything
  • Standard approaches too complex

71
I, for one, Welcome our new Multicore Overlords
  • Multicore forces us to rethink almost everything
  • Standard approaches won't scale
  • Transactions might make life simpler

72
I, for one, Welcome our new Multicore Overlords
  • Multicore forces us to rethink almost everything
  • Standard approaches won't scale
  • Transactions might make life simpler
  • Plenty more to do