
1
Distributed Systems
  • Session 8: Concurrency Control
  • Christos Kloukinas
  • Dept. of Computing
  • City University London

2
Last session
  • 1 Location Transparency
  • Not a good idea to hard-code location information
    in components -> migration becomes difficult
  • 2 Naming
  • Associating external names with references
  • 3 Trading
  • Looking up servers by the services they offer

3
0.1 Naming
  • 1 Naming Service Examples
  • e.g. NFS, X.500, DNS
  • 2 Common Characteristics
  • External names, hierarchies, contexts,
    persistence of bindings, resolve and bind
    operations.
  • 3 CORBA Naming Service
  • interface NamingContext
  • 4 Limitations: it is not always the case that we
    know the names

4
0.2. Java Example Client Finding Objects
  • ORB orb = ORB.init(args, null);
  • 1. org.omg.CORBA.Object objRef =
    orb.resolve_initial_references("NameService");
  • CosNaming.NamingContext root =
    CosNaming.NamingContextHelper.narrow(objRef);
  • 2. CosNaming.NameComponent[] name = {
  •   new NameComponent("UEFA", "ORG"),
  •   new NameComponent("England", "Country"),
  •   new NameComponent("Premier", "League"),
  •   new NameComponent("Arsenal", "Club") };
  • 3. Team t = TeamHelper.narrow(root.resolve(name));
  • 4. t.print();

Transparently get the naming service
Casting
5
0.3 Trading
  • Characteristics
  • Need a trader (mediator), Quality of service,
    language to express quality of service.
  • Quality of service can be expressed statically
    (e.g. privacy, precision) or dynamically (e.g.
    performance)
  • Service matching and service shopping
  • Example: Video on Demand
  • OMG/CORBA Trading Service

6
Session 8 - Outline
  • 1 Motivation
  • 2 Concurrency Control Techniques
  • 3 CORBA Concurrency Control Service
  • 4 Summary

7
1 Motivation
  • How can multiple components in a distributed
    system use a shared component concurrently
    without violating the integrity of the component?
  • This question is of fundamental importance, as
    there are very few distributed systems in which
    every component is used by only a single other
    component at a time.

8
1 Motivation (ctd.)
  • Resources that are accessed concurrently may be
    hardware components (e.g. a printer), operating
    system resources (e.g. files or sockets),
    databases (e.g. the bank accounts kept by
    different banks) or CORBA objects.
  • For some types of accesses, resources may have to
    be accessed in mutual exclusion
  • It does not make sense to have print jobs of
    different users being printed in an interleaved
    way
  • Only one user should be editing a file at a time,
    otherwise the changes made by other users would
    be overwritten if the last user saves his or her
    file
  • The integrity of databases or CORBA objects may
    be lost through concurrent updates.
  • Hence, the need arises to restrict the concurrent
    access of multiple components to a shared
    resource in a sensible way.

9
1 Motivation (ctd.)
  • Concurrent access and updates of resources which
    maintain state information may lead to
  • lost updates
  • inconsistent analysis
  • Motivating example for lost updates:
  • Cash withdrawal from an ATM and a concurrent
    credit of a cheque
  • Motivating example for inconsistent analysis:
  • Funds transfer between the accounts of one
    customer and a concurrent sum of the account
    balances (report for the Inland Revenue)

10
1 Motivating Examples
  • class Account {
  •   protected float balance;
  •   public float get_balance() { return balance; }
  •   void debit(float amount) {
  •     float new = balance - amount;
  •     balance = new;
  •   }
  •   void credit(float amount) {
  •     float new = balance + amount;
  •     balance = new;
  •   }
  • }

The object stores the balance in the instance
variable balance. The object can return the
current balance through operation
get_balance().
The debit() operation subtracts the amount
passed as a parameter from the balance and the
credit() operation adds the amount passed as a
parameter.
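The race can be reproduced outside CORBA with a small, self-contained Java
sketch (hypothetical code, not part of the slides; the sleep merely widens the
race window so the interleaving on the next slide becomes likely):

    // Minimal, hypothetical demonstration of a lost update: both threads read
    // the initial balance of 75 before either writes its result back.
    public class LostUpdateDemo {
        static class Account {
            float balance = 75;
            void debit(float amount) throws InterruptedException {
                float newBalance = balance - amount;  // read
                Thread.sleep(100);                    // widen the race window
                balance = newBalance;                 // write back a stale result
            }
            void credit(float amount) throws InterruptedException {
                float newBalance = balance + amount;
                Thread.sleep(100);
                balance = newBalance;
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Account anAcc = new Account();
            Thread atm = new Thread(() -> {
                try { anAcc.debit(50); } catch (InterruptedException e) { }
            });
            Thread clerk = new Thread(() -> {
                try { anAcc.credit(50); } catch (InterruptedException e) { }
            });
            atm.start(); clerk.start();
            atm.join(); clerk.join();
            // The correct result would be 75; this typically prints 25 or 125,
            // i.e. one of the two updates is lost.
            System.out.println("final balance = " + anAcc.balance);
        }
    }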
11
1 Lost Updates
Balance of account anAcc at t0 is 75
(both processes are WRITERs; time increases downwards)

Customer_at_ATM                      Clerk_at_Counter
anAcc.debit(50)
  new = 75 - 50 = 25
                                     anAcc.credit(50)
                                       new = 75 + 50 = 125
  balance = 25
                                       balance = 125

Both operations read the initial balance of 75; the
credit's write of 125 overwrites the debit's write of
25, so the withdrawal is lost.
12
1 Inconsistent Analysis
Balances at t0: Acc1 = 7500, Acc2 = 0
(the funds transfer is a WRITER, the report a READER;
time increases downwards)

Funds transfer                       Inland Revenue Report
Acc1.debit(7500)
  Acc1.new = 0
  Acc1.balance = 0
                                     float sum = 0
                                     sum += Acc2.get_bal()  // sum = 0
                                     sum += Acc1.get_bal()  // sum = 0
Acc2.credit(7500)
  Acc2.new = 7500
  Acc2.balance = 7500

The report reads Acc2 before the credit and Acc1 after
the debit, so it reports a total of 0 instead of 7500.
13
2 Concurrency Control Techniques
  • 1 Assessment Criteria
  • 2 Pessimistic Concurrency Control
  • e.g. Two Phase Locking (2PL)
  • 3 Optimistic Concurrency Control
  • 4 Comparison

14
Concurrency Control Techniques
  • Ensures integrity of shared resource amidst
    concurrent access
  • e.g. in a database, prevents users from editing
    the same record at the same time
  • concerned with serialising transactions, ensuring
    safe execution
  • resolving conflicts and deadlocks
  • ensuring fairness among concurrent processes
  • restoring component integrity

15
2.1 Assessment Criteria
  • Serialisability: Concurrent threads are
    serialisable if they can be executed one after
    another with the same effect on shared resources.
    It can be proven that serialisable threads do not
    lead to lost updates or inconsistent analysis.
  • Deadlock freedom: Concurrency control techniques
    that use locking may force threads to wait for
    other threads to release a lock before they can
    access a resource. This may lead to situations
    where the wait-for relationship is cyclic and the
    threads are deadlocked.
  • Fairness refers to whether all threads have the
    same chance of getting access to resources.
  • Complexity: Computing precisely those and only
    those schedules that are serialisable may be very
    complex; we are interested in the complexity of a
    concurrency control scheme in order to estimate
    its performance overhead.
  • Concurrency!!! We are also interested in the
    degree of concurrency that a control scheme
    allows. It is obviously undesirable to forbid
    schedules that do not cause serialisability
    problems.

16
Concurrency Control Technique Families
  • Pessimistic
  • Assumes that collisions are likely to occur;
    locks are used.
  • + Changes are consistent and safe
  • - Not as scalable
  • Optimistic
  • Accepts that collisions occur infrequently;
    instead of trying to prevent them, you simply
    detect them and then resolve each collision when
    it does occur.
  • Uses timestamps, and actions can be rolled back

17
2.2 Two Phase Locking (2PL)
  • The most popular concurrency control technique.
    Used in
  • RDBMSs (Oracle, Ingres, Sybase, DB/2, etc.)
  • ODBMSs (O2, ObjectStore, Versant, etc.)
  • Transaction Monitors (CICS, etc)
  • The principal component that implements 2PL is a
    lock manager from which concurrent processes or
    threads acquire locks on every shared resource
    they access.
  • The lock manager investigates the request and
    compares it with the locks that have already been
    granted on the resource.
  • If the requested lock does not conflict with an
    already granted lock, the lock manager will grant
    the lock and note that the requester is now using
    the resource.

18
Terminology
  • Locks and Locksets
  • Locking
  • Lock Compatibility
  • Locking Conflict
  • Deadlocks
  • Waiting graph
  • Locking granularity
  • Hierarchical Locking
  • Locking transparency

19
2.2 Locks
  • A lock is a token that indicates that a process
    accesses a resource in a particular mode.
  • Minimal lock modes: read and write.
  • Locks are used to indicate to concurrent
    processes or threads the way in which a resource
    is used.
  • The lock manager therefore maintains a set of
    locks for each resource, i.e. it associates a
    lockset with every shared object

20
2.2 Locking
  • Processes acquire locks before they access
    shared resources and release locks afterwards.
  • 2PL: Processes do not acquire locks once they
    have released a lock.
  • Typical 2PL locking profile of a process

[Figure: number of locks held over time - a growing
(acquisition) phase followed by a shrinking (release)
phase]
21
2.2 Locking
  • 2PL is based on the assumption that processes or
    threads always acquire locks before they access a
    shared resource and that they release a lock if
    they do not need the resource anymore.
  • In 2PL, processes do not acquire locks once they
    have released a lock.
  • This means that processes operate in cycles: each
    cycle has a lock acquisition phase followed by a
    lock release phase (see the sketch below).
  • 2PL owes its name to these two phases.

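A minimal sketch of this two-phase discipline, assuming a hypothetical
LockManager interface (illustrative, not an API from the lecture): every lock
is acquired in the growing phase and only released in the shrinking phase.

    import java.util.List;

    class TwoPhaseClient {
        interface LockManager { void lock(Object r); void unlock(Object r); }

        void run(LockManager lm, List<Object> resources) {
            for (Object r : resources) lm.lock(r);       // growing phase: acquire only
            try {
                /* access the shared resources */
            } finally {
                for (Object r : resources) lm.unlock(r); // shrinking phase: release only
            }
        }
    }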
22
2.2 Lock Compatibility
  • The lock manager grants locks to requesting
    processes or threads on the basis of already
    granted locks and their compatibility with the
    requested lock.
  • The very core of any pessimistic concurrency
    control technique that is based on locking is the
    definition of a lock compatibility matrix. It
    defines the different lock modes and the
    compatibility between them.
  • Minimal lock compatibility matrix: two read locks
    are compatible with each other; any combination
    involving a write lock conflicts (see the sketch
    below)

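The minimal matrix can be written down as a tiny Java check (the enum and
method are illustrative, not a CORBA API); two read locks are the only
compatible combination:

    enum LockMode { READ, WRITE }

    final class LockCompatibility {
        // Read/read is compatible; any combination involving a write lock conflicts.
        static boolean compatible(LockMode requested, LockMode granted) {
            return requested == LockMode.READ && granted == LockMode.READ;
        }
    }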
23
2.2 Locking Conflicts
  • Locking conflict: when access cannot be granted
    due to incompatibility between the requested lock
    and a previously granted lock
  • On the occasion of a locking conflict, the
    requester cannot use the resource until the
    conflicting lock has been released.
  • There are two approaches to handle locking
    conflicts.
  • The requesting process can be forced to wait
    until the conflicting locks are released. This
    may, however, be too restrictive since the
    process or thread may well do other computations
    in between.
  • Alert the process or thread that the lock cannot
    be granted. It can then continue with other
    processing until a point in time when it
    definitely needs to get access to the resource.
  • Several 2PL implementations provide two locking
    operations, a blocking and a non-blocking one, so
    the requester can decide (see the sketch below).

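As an illustration of the two styles, a hedged Java sketch using
java.util.concurrent.locks.ReentrantLock as a stand-in for the lock manager
(the slides do not prescribe this API):

    import java.util.concurrent.locks.ReentrantLock;

    class LockingStyles {
        private final ReentrantLock resourceLock = new ReentrantLock();

        void blockingAccess() {
            resourceLock.lock();            // waits until the conflicting lock is released
            try { /* use the resource */ }
            finally { resourceLock.unlock(); }
        }

        void nonBlockingAccess() {
            if (resourceLock.tryLock()) {   // returns immediately if the lock is unavailable
                try { /* use the resource */ }
                finally { resourceLock.unlock(); }
            } else {
                /* continue with other processing and retry later */
            }
        }
    }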
24
2.2 Example (Avoiding Lost Updates)
Balance of account anAcc at t0 is 75
(time increases downwards)

Customer_at_ATM                      Clerk_at_Counter
anAcc.debit(50)
  anAcc.lock(write)
  new = 75 - 50 = 25
  balance = 25
  anAcc.unlock(write)
                                     anAcc.credit(50)
                                       anAcc.lock(write)  -- blocks until the
                                                             write lock above
                                                             is released
                                       new = 25 + 50 = 75
                                       balance = 75
                                       anAcc.unlock(write)
25
2.2 Example (Avoiding Lost Updates)
  • Before the account objects are changed, the debit
    and credit operations request a lock on the
    account object from the lock manager.
  • The lock manager then detects a write/write
    locking conflict and forces the second process to
    wait until the first process has released its
    lock. The second process then reads the
    up-to-date value of the balance of the account
    and modifies it without losing the update of the
    first process (see the sketch below).

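For comparison with the earlier race sketch, a hedged Java version in which
each operation holds a write lock for the whole read-modify-write;
ReentrantLock merely stands in for the slide's anAcc.lock(write)/unlock(write):

    import java.util.concurrent.locks.ReentrantLock;

    class LockedAccount {
        private float balance = 75;
        private final ReentrantLock writeLock = new ReentrantLock();

        void debit(float amount) {
            writeLock.lock();                   // anAcc.lock(write)
            try { balance = balance - amount; } // read and write under the lock
            finally { writeLock.unlock(); }     // anAcc.unlock(write)
        }

        void credit(float amount) {
            writeLock.lock();
            try { balance = balance + amount; }
            finally { writeLock.unlock(); }
        }

        float get_balance() { return balance; }
    }

Whichever operation runs second now reads the value written by the first, so
the final balance is 75, as on the previous slide.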
26
2.2 Deadlocks
  • Recall that lock manager may force processes or
    threads to wait for other processes to release
    locks.
  • This solves problem of lost update and
    inconsistent analysis.
  • Processes may request locks for more than one
    object
  • Situations may arise where two or more processes
    or threads are mutually waiting for each other to
    release their locks.
  • These situations are called deadlocks. They are
    very undesirable as they block threads and
    prevent them from finishing their jobs.
  • Hence 2PL is NOT deadlock-free.

27
Waiting Graph
[Figure: process waiting graph over processes p1-p9]
In this process waiting graph, the four processes
P1, P2, P3 and P7 are in a deadlock (they form a
cycle of wait-for edges).
28
2.2.1 Deadlock Detection and Resolution
  • Deadlocks are resolved by lock managers.
  • Manager maintains up-to-date representation of
    the waiting graph.
  • Manager records every locking conflict by
    inserting a graph edge.
  • Also when a conflict is resolved by releasing a
    conflicting lock the respective edge has to be
    deleted.
  • Manager uses the waiting graph to detect
    deadlocks, i.e. cycles in the graph (a sketch of
    cycle detection follows this list).
  • Resolution: break cycles, i.e. select one process
    or thread that participates in such a cycle and
    abort it.
  • Select a node that has the most incoming or
    outgoing edges to reduce the chance of further
    deadlocks
  • Aborting a process requires undoing all actions
    that the process has performed and releasing all
    locks that the process holds!!!

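A minimal sketch of deadlock detection on the waiting graph via depth-first
search; the graph representation and all names are illustrative assumptions,
not part of the lecture material:

    import java.util.*;

    class WaitingGraph {
        // waitsFor.get(p) = the set of processes that p is waiting for
        private final Map<String, Set<String>> waitsFor = new HashMap<>();

        void addConflict(String waiter, String holder) {
            waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
        }

        void removeConflict(String waiter, String holder) {
            Set<String> out = waitsFor.get(waiter);
            if (out != null) out.remove(holder);
        }

        /** True if some processes are mutually waiting, i.e. the graph has a cycle. */
        boolean hasDeadlock() {
            Set<String> done = new HashSet<>(), onPath = new HashSet<>();
            for (String p : waitsFor.keySet())
                if (dfs(p, done, onPath)) return true;
            return false;
        }

        private boolean dfs(String p, Set<String> done, Set<String> onPath) {
            if (onPath.contains(p)) return true;  // back edge: cycle found
            if (!done.add(p)) return false;       // already fully explored
            onPath.add(p);
            for (String q : waitsFor.getOrDefault(p, Set.of()))
                if (dfs(q, done, onPath)) return true;
            onPath.remove(p);
            return false;
        }
    }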
29
2.2 Locking Granularity
  • Observation: objects that are accessed
    concurrently are often contained in more
    coarse-grained composite objects, e.g.
  • Directories can contain other directories, files
    are contained in directories, files have records
  • Relational databases contain a set of tables,
    which contain a set of tuples, which contain
    attributes or
  • Distributed composite objects may act as
    containers for component objects, which may again
    contain other objects
  • A normal access pattern is to visit all or a
    large subset of the objects that are contained.
  • Concurrency control manager can save effort by
    exploiting containment hierarchies.

30
2.2.1 Locking Granularity
  • Two phase locking is applicable to resources of
    any granularity.
  • It works for CORBA objects as well as for files
    and directories or even complete databases.
  • However, the degree of concurrency that is
    achieved with 2PL depends on the granularity that
    is used for locking.
  • A high degree of concurrency is achieved with
    small locking granules.
  • The disadvantage of choosing a small locking
    granularity is that a huge number of locks have
    to be acquired if bigger granules have to be
    locked.
  • Trade-off: degree of concurrency vs. locking
    overhead.
  • If we decrease the granularity we can serve more
    processes concurrently but have to be prepared to
    pay higher costs for the management of locks.
  • This dilemma can be resolved using an
    optimisation: hierarchical locking.

31
2.2.2 Containment Hierarchy
[Figure: containment hierarchy of account objects - a
Bank contains groups of branches G1..Gn, each group
contains branches B1..Bn, and each branch contains
accounts]
32
2.3 Hierarchical Locking
  • Allows locking of all objects contained in a
    composite object (container).
  • BUT also allows a process to indicate, at
    container level, the sub-resources that it is
    intending to use in a particular mode.
  • The hierarchical locking schemes therefore
    introduce intention locks, such as intention read
    and intention write locks.
  • i.e. intention locks are acquired for a composite
    object before a process requests a real lock for
    an object that is contained in the composite
    object.
  • Intention locks signal to processes that wish to
    lock the entire composite object that some other
    process currently holds locks on objects
    contained in the composite object

33
2.3.1 Hierarchical Locking
  • Intention Read: indicates that some process has
    acquired or is about to acquire read locks on
    objects inside a composite object
  • Intention Write: indicates that some process has
    acquired or is about to acquire write locks on
    objects in the composite object.
  • Processes that want to lock a certain resource
    therefore acquire intention locks on the container
    of that resource and on all its containers (see
    the sketch after this list).
  • The lock compatibility matrix is defined in such a
    way that a locking conflict arises if a container
    object is already locked in either read or write
    mode.

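A hedged sketch of this protocol, again assuming a hypothetical LockManager
(all names are illustrative): intention locks are taken on every enclosing
container before the real lock on the leaf resource.

    import java.util.List;

    enum Mode { INTENTION_READ, INTENTION_WRITE, READ, WRITE }

    class HierarchicalLocker {
        interface LockManager { void lock(Object resource, Mode mode); }

        /** containers are ordered from outermost (e.g. the bank) to innermost (e.g. the branch). */
        void lockLeaf(LockManager lm, List<Object> containers, Object leaf, Mode mode) {
            Mode intention = (mode == Mode.WRITE) ? Mode.INTENTION_WRITE : Mode.INTENTION_READ;
            for (Object container : containers)
                lm.lock(container, intention); // signal the intended use to coarse-grain lockers
            lm.lock(leaf, mode);               // then take the real lock, e.g. on the account
        }
    }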
34
2.3.2 Hierarchical Locking
  • NB: Intention read and intention write locks are
    compatible with each other because they do not
    actually correspond to any real locks.
  • Other modes
  • An IR lock is compatible with an R lock because
    accessing an object for reading does not change
    any values
  • An IR lock is incompatible with a W lock because
    it is not possible to modify every element of the
    composite object while some other process is
    reading the state of an object of the composite
  • etc.
  • Hence the advantage of hierarchical locking is
    that it
  • enables different lock granularities to be used
    at the same time
  • The overhead is that for every individual object,
    intention locks have to be acquired on every
    composite object in which the object is contained
    (an object may be contained in more than one
    container)

35
2.4 Transparency of Locking
  • The last question that we have to discuss is WHO
    is acquiring the locks, i.e. who invokes the lock
    operation for a resource. The options are
  • the concurrency control infrastructure, such as
    the concurrency control manager of a database
    management system
  • the implementation of components or
  • the clients of the components.
  • The first option is very much desirable as then
    concurrency control would be transparent to the
    application programmers of both the component and
    its clients.
  • Unfortunately this is only possible on limited
    occasions (in a database system) because the
    concurrency control manager would have to manage
    all resources and it would have to be informed
    about every single resource access.
  • The last option is very undesirable and is in
    fact always avoidable. Hence distributed
    components should be designed so that concurrency
    control is hidden within their implementation,
    not exposed at their interface, and is
    transparent to the designers of CLIENTS

36
2.3 Optimistic Concurrency Control
  • In general, the complexity of two-phase locking
    is linear in the number of accessed resources.
    With hierarchical locking it is slightly higher,
    as the containers of resources also have to be
    locked in intention mode.
  • This overhead, however, is unreasonable if the
    probability of a locking conflict is very
    limited.
  • Given the motivating examples we discussed
    earlier, it is quite unlikely that you withdraw
    cash from an ATM in that very millisecond when a
    clerk credits a cheque.
  • This is where optimistic concurrency control
    comes in.
  • It follows a laissez-faire approach and works as
    a watchdog that detects conflicts only when they
    really happen.

37
2.3 Optimistic Concurrency Control (ctd.)
  • Every thread or process works on its private
    logical copy of the set of shared resources.
  • While a process or thread accesses resources, the
    concurrency control manager keeps a log of them.
  • Timestamps are required
  • At a certain point in time, the access patterns
    are validated against conflicts with concurrent
    processes or threads.
  • If no conflicts occurred, the changes made can be
    made visible to the global set of resources.
  • If conflicts occurred the process has to discard
    its logical copy and start over again on an
    up-to-date copy of the resources.

38
Phases
  • 1. Read
  • Process/transaction executes, reading values and
    writing to a private copy
  • 2. Validation
  • When the process completes, the manager checks
    whether the process could possibly have conflicted
    with any other concurrent process. If there is a
    possibility, the process aborts and restarts.
  • 3. Write
  • If there is no possibility of conflict, the
    transaction commits.
  • If there are few conflicts,
  • validation can be done efficiently and leads to
    better performance than other concurrency control
    methods. Unfortunately, if there are many
    conflicts, the cost of repeatedly restarting
    operations hurts performance significantly

39
2.3 Validation Prerequisites
  • As a pre-requisite for optimistic concurrency
    control it is required to separate the overall
    sequence of operations a process performs into
    distinguishable units. A validation of the access
    pattern of a unit is then performed during a
    validation phase at the end of each unit.
  • For each unit the following information has to be
    gathered
  • Starting time of the unit st(U).
  • Time stamp for start of validation TS(U).
  • Ending time of unit E(U).
  • Read and write sets RS(U) and WS(U). (set of
    resources U has accessed in read and write mode)
  • Needs precise time information!!!
  • Requires synchronisation of the local clocks!!!
    (of the resources, i.e. the CORBA objects)

40
2.3 Validation Set
  • The validation of a unit has to be done against
    all concurrent units that have already been
    successfully validated. We therefore denote the
    set of those units as the validation set VU(u).
  • VU(u) is formally defined as
  • VU(u) = { x | st(u) < E(x) and x has been validated }
  • i.e. VU(u) contains the units x that were active
    concurrently with u but were validated before it

41
2.3 Conflict Detection
  • During the validation phase, the concurrency
    control manager has to look for two types of
    conflicts: read/write and write/write conflicts.
  • A read/write conflict occurred during the course
    of a unit u iff
  • ∃ u' ∈ VU(u) : WS(u') ∩ RS(u) ≠ ∅
  •   or RS(u') ∩ WS(u) ≠ ∅
  • A write/write conflict occurred during the course
    of a unit u iff
  • ∃ u' ∈ VU(u) : WS(u') ∩ WS(u) ≠ ∅
  • In both cases the unit cannot be completed but
    has to be undone (a sketch of this check follows
    below).

-- u' has written a resource that unit u has read,
   or vice versa
-- u' has modified a resource that unit u has
   modified as well
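A sketch of this validation test in Java; the Unit record simply bundles the
bookkeeping listed on slide 39, and every name here is an illustrative
assumption:

    import java.util.*;

    class OptimisticValidation {
        static class Unit {
            long start, end;                          // st(U) and E(U)
            Set<Integer> readSet  = new HashSet<>();  // RS(U)
            Set<Integer> writeSet = new HashSet<>();  // WS(U)
        }

        /** VU(u): already-validated units x that were still running after u started. */
        static List<Unit> validationSet(Unit u, Collection<Unit> validated) {
            List<Unit> vu = new ArrayList<>();
            for (Unit x : validated) if (u.start < x.end) vu.add(x);
            return vu;
        }

        /** True iff u has a read/write or write/write conflict with some unit in VU(u). */
        static boolean hasConflict(Unit u, Collection<Unit> vu) {
            for (Unit other : vu) {
                if (intersects(other.writeSet, u.readSet)    // other wrote what u read
                 || intersects(other.readSet,  u.writeSet)   // other read what u wrote
                 || intersects(other.writeSet, u.writeSet))  // both wrote the same resource
                    return true;
            }
            return false;
        }

        private static boolean intersects(Set<Integer> a, Set<Integer> b) {
            for (Integer x : a) if (b.contains(x)) return true;
            return false;
        }
    }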
42
Optimistic Conc. Control Example (1/3)
  • Assume that you have the following optimistic
    units:

    Unit   start time   end time   read set    write set
    1      1            5          {1,3,5}     {2,4}
    2      3            7          {2,3,5}     {6,4}
    3      5            9          {2,3,5}     {7,8}
    4      10           15         {7,3,5}     {7,8}

  • What is the validation set (VU) of each one of
    them?
  • Which ones have a conflict (read/write or
    write/write) and where exactly does the conflict
    appear?
  • Which of the transactions in the table above will
    get validated?

43
Optimistic Conc. Control Example (2/3)
  • VU(1) = ∅
  • Why? Because when it finishes, no other unit has
    finished yet.
  • So, unit 1 gets validated immediately.
  • VU(2) = {1}
  • Why? Because the end time of unit 1 (5) is
    greater than the starting time of unit 2 (3) and
    unit 1 has been validated.
  • Unit 2 has a read/write conflict with unit 1 (on
    resource 2) and a write/write conflict with unit 1
    (on resource 4), so it is not validated.

44
Optimistic Conc. Control Example (3/3)
  • VU(3) = ∅
  • Why? Because only unit 2 has an end time greater
    than the starting time of unit 3, but unit 2 has
    not been validated (so it is ignored).
  • Therefore, unit 3 gets validated immediately.
  • VU(4) = ∅
  • Why? Because no unit has an end time greater than
    the starting time of unit 4.
  • Thus, unit 4 will be validated as well.

45
2.4 Comparison
  • Both pessimistic and optimistic techniques
  • guarantee serialisability of processes
  • impose serious complexity in that they need the
    ability to undo the effects of processes and
    threads.
  • Pessimistic techniques cause a
  • considerable concurrency control overhead through
    locking and
  • they are not deadlock-free
  • However, they are sufficiently efficient when
    conflicts are likely.
  • A serious advantage of optimistic techniques is
  • a negligible overhead when conflicts are
    unlikely
  • Furthermore, they are deadlock-free.
  • However, the computation of conflict sets is very
    difficult and complex in a distributed setting.
    Moreover, optimistic techniques assume the
    existence of synchronised clocks, which are
    generally not available in a distributed setting.

46
2.4 Comparison (ctd.)
  • In summary, the disadvantages of optimistic
    concurrency control outweigh the advantages, and
    in most distributed systems concurrency is
    controlled using pessimistic techniques.

47
3 CORBA Concurrency Control Service
[Figure: OMA overview - Application Objects,
CORBAfacilities and CORBAservices connected through
the Object Request Broker; the Concurrency Control
service is one of the CORBAservices]
48
3 Lock Compatibility
  • The Concurrency Control service supports
    hierarchical locking, as many CORBA objects take
    the role of container objects.
  • As a further optimisation the service defines a
    lock type for upgrade locks.
  • Upgrade locks are read locks that are not
    compatible with themselves. Upgrade locks are
    used on occasions when the requester knows that
    it only needs a read lock to start with but will
    later have to acquire a write lock on that
    resource as well.
  • If two processes are in this situation, they
    would run into a deadlock if they used only read
    locks. With upgrade locks the deadlock is
    prevented, as the second process trying to
    acquire the upgrade lock is delayed straight away.

49
3 Lock Compatibility (ctd.)
  • Compatibility matrix (covering the intention read,
    read, upgrade, intention write and write lock
    modes)

50
3 Locksets
  • The central object type defined by the
    Concurrency Control service is the lockset. A
    lockset is associated with a resource.
  • With the Concurrency Control service, concurrency
    control has to be managed by the implementation
    of a shared resource. Hence the implementation of
    a resource would usually have a hidden lockset
    attribute.
  • Operation implementations included in that
    resource acquire locks before they access or
    modify the resource.

51
3 The IDL Interfaces
  • interface LockSetFactory {
  •   LockSet create();
  • };
  • interface LockSet {
  •   void lock(in lock_mode mode);
  •   boolean try_lock(in lock_mode mode);
  •   void unlock(in lock_mode mode);
  •   void change_mode(in lock_mode held,
  •                    in lock_mode new);
  • };
52
3 The IDL Interfaces (ctd.)
  • A LockSetFactory facilitates the creation of new
    locksets. The create operation of that interface
    would usually be executed during the construction
    of an object that implements a shared resource.
  • The LockSet interface provides operations to
    lock, unlock and upgrade locks. The difference
    between lock and try_lock is that the former
    blocks while the latter returns control to the
    caller even when the lock has not been granted.
  • These operations are used internally by the
    servant; clients don't see them (see the sketch
    below).

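A hedged sketch of how a servant might use its hidden lockset internally, in
the spirit of slides 50-52; the Java types below are illustrative stand-ins,
not the actual generated CosConcurrencyControl mapping:

    class AccountServant {
        // Minimal stand-ins for the IDL types on the previous slide.
        enum LockMode { READ, WRITE }
        interface LockSet {
            void lock(LockMode mode);
            boolean try_lock(LockMode mode);
            void unlock(LockMode mode);
        }
        interface LockSetFactory { LockSet create(); }

        private float balance;
        private final LockSet lockSet;          // hidden lockset attribute (slide 50)

        AccountServant(LockSetFactory factory) {
            this.lockSet = factory.create();    // created when the resource is constructed
        }

        void debit(float amount) {
            lockSet.lock(LockMode.WRITE);       // acquire the lock before modifying
            try { balance = balance - amount; }
            finally { lockSet.unlock(LockMode.WRITE); }
        }

        float get_balance() {
            lockSet.lock(LockMode.READ);
            try { return balance; }
            finally { lockSet.unlock(LockMode.READ); }
        }
    }

Clients simply call debit() or get_balance(); the locking stays hidden behind
the servant's interface, as required on slide 35.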
53
4 Summary
  • 1 Motivation
  • 2 Concurrency Control Techniques
  • 3 CORBA Concurrency Control Service

54
4 Summary
  • Lost updates and inconsistent analysis.
  • Pessimistic vs. optimistic concurrency control
  • Pessimistic control
  • higher overhead for locking.
  • works efficiently in cases where conflicts are
    likely
  • Optimistic control
  • small overhead when conflicts are unlikely.
  • distributed computation of conflict sets
    expensive.
  • requires global clock.
  • CORBA uses pessimistic two-phase locking.