1. Optimistic Concurrency for Clusters via Speculative Locking
- Michael Factor (IBM IL)
- Assaf Schuster (Technion)
- Konstantin Shagin (IBM IL)
- Tal Zamir (Technion)
2. Motivation
- Locks are commonly used in distributed systems
- Protect access to shared data
- Coarse-grained/fine-grained locking
- Blocking lock acquisition operations are expensive
- Require cooperation with remote machines
- Obtain ownership, memory consistency operations, checkpointing
- On the application's critical path
- Reduce concurrency
- Hence, removing blocking lock acquisitions has the potential to boost performance
3. Speculative locking
- Speculative Locking (SL) suggests that the application should continue its execution without waiting for a lock acquisition to be completed
- Such execution may cause data consistency violations
- A thread may access invalid data
- Consistency violations are detected by the system, which performs a rollback when one occurs
- Thus, speculative locking
- Removes the lock acquisition from the critical path
- Allows concurrent execution of critical sections protected by the same lock
- In a number of cases speculative locking is especially advantageous
- Little data contention
- Threads access different data
- Threads rarely perform write operations
- Little lock contention
- Conservative thread-safe libraries using coarse-grained locks
4. Blocking lock acquisition vs. speculative
[Timeline diagram comparing the two schemes. Blocking: thread A issues acq(L), a request(L) is sent to owner B, and A stays blocked while B performs write(Y) and rel(L); only after ownership(L) arrives does A execute read(X). Speculative: A issues spec. acq(L), sends request(L), and keeps executing in speculative mode, performing read(X) before ownership(L) arrives; when B releases the lock and ownership(L) is received, a conflict check is performed.]
5. Speculative locking for a general-purpose distributed runtime
- We suggest employing SL in a general-purpose distributed runtime system
- Since rollback capability is required, it is best suited for fault-tolerant systems
- We implement and evaluate SL in JavaSplit, a fault-tolerant distributed runtime system for Java
- The protocol consists of three main components
- Acquire operation logic
- Data conflict detection
- Checkpoint management
- Although we demonstrate distributed speculative locking in a specific system, the approach is generic enough to be applied to any other distributed runtime with rollback capabilities.
6. JavaSplit overview
- Executes standard multithreaded Java programs
- Each application thread runs in a separate JVM
- The shared objects are managed by a distributed shared memory protocol
- The memory model is Lazy Release Consistency
- The protocol is object-based
- Can tolerate multiple concurrent node failures
- Thread checkpoints and shared objects are replicated
- Employs bytecode instrumentation to distribute threads, preserve memory consistency, and enable checkpointing (see the illustrative sketch after this list). Hence it is
- Transparent: the programmer is not aware of the system
- Portable: executes on a collection of JVMs
7. Blocking Lock Acquisition in JavaSplit
- The requester sends a lock acquisition request to the current lock owner and waits for a reply
- When the owner releases the lock, it takes a checkpoint of its state and transfers the lock
- A checkpoint is required to support JavaSplit's fault-tolerance scheme
- Along with the lock, write notices are transferred, invalidating objects whose modifications should be observed by the acquirer (to maintain LRC); a rough sketch of this exchange follows below
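A minimal sketch of the blocking protocol as described on this slide; all names (`Node`, `WriteNotice`, `takeCheckpoint`, ...) are illustrative placeholders, not JavaSplit's actual classes:

```java
import java.util.List;

// Hypothetical messaging abstraction between cluster nodes.
interface Node {
    void sendLockRequest(long lockId, Node requester);
    void sendOwnership(long lockId, List<WriteNotice> notices, Node to);
}

// A write notice names an object modified under the lock, so the acquirer can invalidate it.
record WriteNotice(long objectId) {}

class BlockingLockClient {
    private final Node self;
    private boolean ownershipReceived = false;

    BlockingLockClient(Node self) { this.self = self; }

    /** Requester side: ask the current owner and block until ownership arrives. */
    synchronized void acquire(long lockId, Node currentOwner) throws InterruptedException {
        currentOwner.sendLockRequest(lockId, self);
        while (!ownershipReceived) {
            wait();                        // blocked: the acquisition sits on the critical path
        }
        ownershipReceived = false;         // consume the ownership for this acquisition
    }

    /** Message handler: called when ownership(L) and its write notices arrive. */
    synchronized void onOwnershipReceived(long lockId, List<WriteNotice> notices) {
        // invalidate the objects named in the write notices (to maintain LRC)
        ownershipReceived = true;
        notifyAll();
    }

    /** Owner side: on release, checkpoint, then hand over the lock together with write notices. */
    void release(long lockId, Node requester, List<WriteNotice> noticesSinceAcquire) {
        takeCheckpoint();                  // required by the fault-tolerance scheme
        self.sendOwnership(lockId, noticesSinceAcquire, requester);
    }

    private void takeCheckpoint() { /* persist/replicate the thread state */ }
}
```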
8. Speculative Lock Acquisition in JavaSplit
- The requester sends a lock acquisition request to the current lock owner and continues execution
- Until lock ownership is received, the requester is in speculative mode
- While in speculative mode, object read accesses are logged (using Java bytecode instrumentation)
- When lock ownership is received, write notices are examined to detect a data conflict
- Upon data conflict detection, the thread rolls back
- A thread can speculatively acquire a number of locks (see the sketch after this list)
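The corresponding speculative path, again as a hedged sketch rather than the actual JavaSplit implementation (the read log keyed by object ids, `rollbackToLastCheckpoint`, and the other names are assumptions):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SpeculativeLockClient {
    /** Object ids read while in speculative mode, recorded by the instrumented bytecode. */
    private final Set<Long> readLog = new HashSet<>();
    private boolean speculating = false;

    interface RemoteOwner { void sendLockRequest(long lockId, SpeculativeLockClient requester); }
    record WriteNotice(long objectId) {}

    /** Requester side: send the request and keep executing instead of blocking. */
    void speculativeAcquire(long lockId, RemoteOwner currentOwner) {
        currentOwner.sendLockRequest(lockId, this);
        speculating = true;                       // speculative mode until ownership arrives
    }

    /** Hook invoked by the instrumented bytecode on every shared-object read. */
    void onRead(long objectId) {
        if (speculating) readLog.add(objectId);
    }

    /** Called when lock ownership and the accompanying write notices arrive. */
    void onOwnershipReceived(long lockId, List<WriteNotice> writeNotices) {
        boolean conflict = writeNotices.stream()
                .anyMatch(n -> readLog.contains(n.objectId()));
        speculating = false;
        readLog.clear();
        if (conflict) {
            rollbackToLastCheckpoint();           // the speculation read stale data
        }
    }

    private void rollbackToLastCheckpoint() { /* restore the thread's saved state and re-run */ }
}
```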
9. Blocking lock acquisition vs. speculative
[Timeline diagram repeated from slide 4, contrasting the blocking and speculative executions.]
10. Managing consistent checkpoints (theory)
- Requirement 1: must ensure a thread can roll back to a state preceding any potentially conflicting access
- Since a conflicting access may occur only in speculative mode, it is sufficient to guarantee there is a valid checkpoint taken before speculating
- In a fault-tolerant system, each thread has at least one valid checkpoint preceding speculation
- It remains to ensure this checkpoint is not made invalid by checkpoints taken while in speculative mode
- Requirement 2: must prevent the speculating thread from affecting other threads, or monitor such dependencies
- If a speculating thread affects another thread, then the latter must be rolled back along with the former in the case of a data conflict
11. Managing consistent checkpoints (practice)
- In JavaSplit, a thread can roll back only to its most recent checkpoint
- In addition, a thread has to checkpoint (only) before transferring lock ownership to another thread
- Consequently, to ensure a speculating thread can roll back to a state preceding speculation, we prevent threads from transferring lock ownership while in speculative mode
- This satisfies requirement 1
- The transfer of lock ownership is postponed until the thread leaves speculative mode (see the sketch after this list)
- Coincidentally, this also satisfies requirement 2, because in JavaSplit a thread may affect other threads only when transferring lock ownership
- Thus, a speculating thread cannot affect other threads
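An illustrative sketch of this rule, deferring lock transfers for the duration of a speculation (the class and method names are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.Queue;

class LockOwnerState {
    private boolean speculating = false;
    private final Queue<LockRequest> pendingRequests = new ArrayDeque<>();

    record LockRequest(long lockId, String requesterNode) {}

    /** Incoming lock request: serve it only when the thread is not speculating. */
    void onLockRequest(LockRequest req) {
        if (speculating) {
            pendingRequests.add(req);      // postpone: no ownership transfer in speculative mode
        } else {
            transferOwnership(req);        // checkpoint, then send the lock and write notices
        }
    }

    void onSpeculationStarted() { speculating = true; }

    /** Speculation validated or rolled back: the postponed transfers can now be served. */
    void onSpeculationEnded() {
        speculating = false;
        LockRequest req;
        while ((req = pendingRequests.poll()) != null) {
            transferOwnership(req);
        }
    }

    private void transferOwnership(LockRequest req) { /* checkpoint + send ownership message */ }
}
```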
12. Speculative Deadlocks
[Diagram: two threads, A and B, each waiting for lock ownership held by the other]
- Two threads concurrently acquire locks from each other
- The threads enter speculative mode and therefore cannot transfer lock ownership
- No application deadlock
- A number of threads can be involved in such a dependency loop
- Deadlock detection and recovery are required
13. Speculative Deadlocks (2)
- Deadlock Avoidance
- Limiting the number of speculations per session
- Returning lock ownership to passive home nodes on release
- Deadlock Detection
- Timeout-based approach
- Simple but inaccurate
- Message-based approach
- Similar to the Chandy-Misra-Haas edge-chasing algorithm
- Can be used to automatically detect a more accurate timeout value
- Deadlock Recovery
- Thread rolls back to its latest checkpoint
- Before continuing execution, all pending lock requests are served
- Heuristic: the next few lock acquisitions are non-speculative (a sketch of the timeout-based variant follows below)
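A minimal sketch of the timeout-based detection and recovery path (the message-based edge-chasing variant is not shown); the constants and helper names here are assumptions, not the paper's implementation:

```java
class SpeculativeDeadlockGuard {
    private final long timeoutMillis;
    private long speculationStart = -1;       // -1 means "not speculating"
    private int nonSpeculativeBudget = 0;     // heuristic: force blocking acquires after recovery

    SpeculativeDeadlockGuard(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void onSpeculationStarted() { speculationStart = System.currentTimeMillis(); }
    void onSpeculationEnded()   { speculationStart = -1; }

    /** Polled while waiting for lock ownership to arrive. */
    boolean suspectedDeadlock() {
        return speculationStart >= 0
                && System.currentTimeMillis() - speculationStart > timeoutMillis;
    }

    /** Recovery: roll back, serve the queued lock requests, then throttle speculation. */
    void recover(Runnable rollbackToLastCheckpoint, Runnable servePendingLockRequests) {
        rollbackToLastCheckpoint.run();
        servePendingLockRequests.run();
        onSpeculationEnded();
        nonSpeculativeBudget = 3;             // assumed small constant for the heuristic
    }

    /** Consulted on each acquire: should this one be speculative? */
    boolean allowSpeculation() {
        if (nonSpeculativeBudget > 0) {
            nonSpeculativeBudget--;
            return false;                     // next few acquisitions stay non-speculative
        }
        return true;
    }
}
```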
14. Speculation Preclusion
- The protocol allows a lock to be acquired either speculatively or in the old-fashioned blocking way
- Threads can decide at run time whether to apply speculation in each particular instance
- This enables run-time heuristic logic that
- Detects acquire ops. that are likely to result in a rollback, and
- Prevents speculative acquisitions
- For a time period
- For a number of subsequent acquire operations
- The speculation preclusion algorithm should be as lightweight as possible because it is invoked on each lock acquisition
- Static analysis can also be employed to detect cases in which rollback is likely to occur (one possible run-time heuristic is sketched below)
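One possible lightweight preclusion heuristic, given as an assumption rather than the paper's exact policy: after a rollback on a given lock, the next few acquisitions of that lock are performed in the blocking way.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SpeculationPreclusion {
    private static final int PENALTY_ACQUIRES = 4;            // assumed tuning constant
    private final Map<Long, Integer> penalty = new ConcurrentHashMap<>();

    /** Called on every lock acquisition; must stay cheap. */
    boolean shouldSpeculate(long lockId) {
        Integer remaining = penalty.get(lockId);
        if (remaining == null || remaining <= 0) {
            return true;                                       // no recent rollback on this lock
        }
        penalty.put(lockId, remaining - 1);                    // spend one unit of the penalty
        return false;                                          // acquire non-speculatively
    }

    /** Called when a speculation on this lock ended in a data conflict (rollback). */
    void onRollback(long lockId) {
        penalty.put(lockId, PENALTY_ACQUIRES);
    }
}
```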
15. Transactional Memory
- Transactional memory is another form of optimistic concurrency control, akin to speculative locking
- The basic principle is the same
- optimistically execute a critical code segment
- determine whether there have been data conflicts
- roll back in case validation fails
- However, the programming paradigm and the implementation details differ significantly
16. Distributed Transactional Memory
- Non-scalable/somewhat inefficient implementations (drawn from the classic hardware/software transactional memory protocols)
- Broadcast-based
- Centralized
- Require global cooperation
- Less optimistic than distributed speculative locking
- Try to ensure validity of one transaction before executing another, hence
- Fail to overlap communication with computation
- Can be classified into blocking-open and blocking-commit schemes
- Blocking-open schemes often induce a remote blocking open operation when accessing an object for the first time within a transaction
- Blocking-commit schemes require a blocking remote request prior to committing
- Question: Is SL a suitable alternative to TM in a distributed setting?
- It is surely more optimistic, but speculative deadlocks and other factors may hinder this advantage
- We only scratch the surface of this question
18. Performance Evaluation
- Test bed
- Cluster of 15 commodity workstations
- Workstation parameters
- Windows XP SP2
- Intel Pentium Xeon dual-core 1.6 GHz processor
- 2GB RAM
- 1 Gbit Ethernet
- Each station executes two JavaSplit threads
- Standard Sun JRE 1.6
19. Traveling Salesman Problem
- Single lock application
- Lock protects the globally known minimal path
- Non-speculative system has scalability issues due to lock acquisition operations
- Speculative system does not need to wait for lock acquisitions, but has a slight overhead due to speculation failures (data conflicts)
- With a single thread, the read access monitoring overhead results in a slowdown
20. SPECjbb (low acquisition frequency)
- Business server middleware application
- Workers process incoming client requests, each requiring exclusive access to specific stock objects
- Fine-grained locking
- Job processing time is non-negligible (low lock acquisition frequency)
- Non-SL system does not have a scalability issue
- SL removes the lock acquisition overhead
- No speculative deadlocks
- No data conflicts
21. SPECjbb (high acquisition frequency)
- Job processing time minimized to simulate an extremely high lock acquisition frequency
- In the non-SL system, the lock acquisition overhead is high
- SL has to set a quota on the number of speculations per speculative session, to avoid speculative deadlocks
- SL reduces the lock acquisition overhead, but does not eliminate it here
22. HashtableDB
- Naive benchmark demonstrating an optimal scenario for SL
- A single lock protects a hash table object
- Coarse-grained locking
- Processing is performed while holding the lock
- The non-speculative system does not scale: only one thread can perform computation at any given moment
- The SL system exhibits near-linear scalability
23. Checkpointing Frequency
- The underlying fault-tolerant system performs checkpoints at a given frequency
- This graph demonstrates the effect of checkpointing frequency on the performance of SPECjbb
- A high checkpointing frequency results in higher checkpointing overhead, but a smaller speculation-failure recovery penalty
- This tradeoff is balanced by an ideal frequency (per execution)
24. Speculative Deadlocks
- In the presence of speculative deadlocks, setting a quota on the number of speculations per session is required
- The probability of speculative deadlocks is a complex function of multiple factors, also affected by this quota
- The higher the quota, the more time can be spent in speculative mode, increasing deadlock probability
25. Speculative Deadlocks (2)
- However, a higher speculation quota allows the application to continue execution (without blocking) for longer periods of time
- Thus, per execution, an optimal value of the speculation quota exists
26. Questions?
The end