Outline - PowerPoint PPT Presentation

Slides: 92
Provided by: xiuwe
Learn more at: http://www.cs.fsu.edu

Transcript and Presenter's Notes


1
Outline
  • Announcement
  • Midterm Review
  • Distributed File Systems continued
  • If we have time

2
Announcements
  • Please turn in your homework 3 at the beginning
    of class
  • The midterm will be on March 20
  • This coming Thursday
  • It will be an open-book, open-note exam

3
Operating System
  • An operating system is a layer of software on a
    bare machine that performs two basic functions
  • Resource management
  • To manage resources so that they are used in an
    efficient and fair manner
  • User friendliness

4
Distributed Systems
  • A distributed system is a collection of
    independent computers that appears to its users
    as a single coherent system
  • Independent computers mean that they do not share
    memory or clock
  • The computers communicate with each other by
    exchanging messages over a communication network

5
Distributed Systems cont.
6
Distributed Systems cont.
  • Advantages
  • The computing power of a group of cheap
    workstations can be enormous
  • Decisive price/performance advantage over
    traditional time-sharing systems
  • Resource sharing
  • Enhanced performance
  • Improved reliability and availability
  • Modular expandability

7
Distributed System Architecture cont.
  • Distributed systems are often classified based on
    the hardware
  • Multiprocessor systems
  • Homogeneous multi-computer systems
  • Heterogeneous multi-computer systems

8
Distributed Operating Systems
  • Hardware for distributed systems is important,
    but the software largely determines what a
    distributed system looks like to a user
  • Distributed operating systems are much like the
    traditional operating systems
  • Resource management
  • User friendliness
  • The key concept is transparency

9
Distributed Operating Systems cont.
  • In a truly distributed operating system, the user
    views the system as a virtual uniprocessor system
    even though physically it consists of multiple
    computers
  • In other words, the use of multiple computers and
    accessing remote data and resources should be
    invisible to the user

10
Overview of Different Kinds of Distributed Systems
11
Multicomputer Operating Systems
  • General structure of a multicomputer operating
    system

12
Network Operating System
13
Middleware and Openness
  • In an open middleware-based distributed system,
    the protocols used by each middleware layer
    should be the same, as well as the interfaces
    they offer to applications.

14
Comparison Between Systems
15
Issues in Distributed Operating Systems
  • Absence of global knowledge
  • In a distributed system, due to the
    unavailability of a global memory and a global
    clock and due to unpredictable message delays, it
    is practically impossible for a computer to
    collect up-to-date information about the global
    state of the distributed system
  • Therefore a fundamental problem is to develop
    efficient techniques to implement decentralized
    system-wide control
  • Another problem is how to order all the events

16
Issues in Distributed Operating Systems cont.
  • Naming
  • Plays an important role in achieving location
    transparency
  • A name service maps a logical name into a
    physical address by making use of a table lookup,
    an algorithm, or a combination of both
  • In distributed systems, the tables may be
    replicated and stored at many places
  • Consider naming in a distributed file system

17
Issues in Distributed Operating Systems cont.
  • Scalability
  • Systems generally grow with time, especially
    distributed systems
  • Scalability requires that the growth should not
    result in system unavailability or degraded
    performance
  • This puts additional constraints on design
    approaches

18
Issues in Distributed Operating Systems cont.
  • Compatibility
  • Refers to the interoperability among the
    resources in a system
  • Three different levels
  • Binary level
  • All processors execute the same binary
    instruction repertoire
  • Virtual binary level
  • Execution level
  • Same source code can be compiled and executed
    properly
  • Protocol level
  • A common set of protocols

19
Issues in Distributed Operating Systems cont.
  • Process synchronization
  • The synchronization of processes in distributed
    systems is difficult because of the
    unavailability of shared memory
  • Processes running on different computers must be
    synchronized when they try to concurrently
    access a shared resource
  • This is the mutual exclusion problem as in
    classical operating systems

20
Issues in Distributed Operating Systems cont.
  • Resource management
  • Resource management needs to make both local and
    remote resources available to users in an
    effective manner
  • Data migration
  • Distributed file system
  • Distributed shared memory
  • Computation migration
  • Remote procedure call
  • Distributed scheduling

21
Issues in Distributed Operating Systems cont.
  • Structuring
  • The distributed operating system requires some
    additional constraints on the structure of the
    underlying operating system
  • The collective kernel structure
  • An operating system is structured as a collection
    of processes that are largely independent of each
    other
  • Object-oriented operating system
  • The operating system's services are implemented
    as objects

22
Clients and Servers
  • General interaction between a client and a server.

23
Layered Protocols
  • Layers, interfaces, and protocols in the OSI
    model.

24
Network Layer
  • The primary task of a network layer is routing
  • The most widely used network protocol is the
    connection-less IP (Internet Protocol)
  • Each IP packet is routed to its destination
    independent of all others
  • A connection-oriented protocol is gaining
    popularity
  • Virtual channel in ATM networks

25
Transport Layer
  • This layer is the last part of a basic network
    protocol stack
  • In other words, this layer can be used by
    application developers
  • An important aspect of this layer is to provide
    end-to-end communication
  • The Internet transport protocol is called TCP
    (Transmission Control Protocol)
  • The Internet protocol suite also supports a
    connectionless transport protocol called UDP
    (User Datagram Protocol)

26
Sockets
  • Socket primitives for TCP/IP.

27
Sockets cont.
  • Connection-oriented communication pattern using
    sockets.
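
The connection-oriented pattern can be sketched with a loopback echo exchange. This is a minimal illustration, not the code from the slides: the function name `echo_once`, the loopback address, and the 1024-byte receive size are assumptions chosen for the example.

```python
import socket
import threading


def echo_once(message: bytes) -> bytes:
    """Send one message to a loopback echo server and return the reply."""
    # Server side: socket -> bind -> listen
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        # accept blocks until a client connects
        conn, _addr = server.accept()
        with conn:
            data = conn.recv(1024)  # receive the request
            conn.sendall(data)      # send the same bytes back

    worker = threading.Thread(target=serve)
    worker.start()

    # Client side: socket -> connect -> send -> receive -> close
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply
```

The server thread handles exactly one connection, which makes it an iterative server in the sense used on the next slide.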

28
Socket Programming
  • Review
  • IP
  • TCP
  • UDP
  • Port
  • Server Design Issues
  • Iterative vs. concurrent server
  • Stateless vs. stateful server
  • Multithreaded server

29
A Multithreaded Server
30
The Message Passing Model
  • The message passing model provides two basic
    communication primitives
  • Send and receive
  • Send has two logical parameters, a message and
    its destination
  • Receive has two logical parameters, the source
    and a buffer for storing the message

31
Semantics of Send and Receive Primitives
  • There are several design issues regarding send
    and receive primitives
  • Buffered or un-buffered
  • Blocking vs. non-blocking primitives
  • With blocking primitives, the send does not
    return control until the message has been sent or
    received and the receive does not return control
    until a message is copied to the buffer
  • With non-blocking primitives, the send returns
    control as the message is copied and the receive
    signals its intention to receive a message and
    provide a buffer for it

32
Semantics of Send and Receive Primitives cont.
  • Synchronous vs. asynchronous primitives
  • With synchronous primitives, a SEND primitive is
    blocked until a corresponding RECEIVE primitive
    is executed
  • With asynchronous primitives, a SEND primitive
    does not block if there is no corresponding
    execution of a RECEIVE primitive
  • The messages are buffered
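
The difference between the two kinds of send can be sketched with an in-process channel. This is an illustrative model only: the class name `Channel` and its methods are hypothetical, and a thread-safe queue stands in for the network and its buffers.

```python
import queue
import threading


class Channel:
    """A buffered channel between two threads (stand-in for a network link)."""

    def __init__(self) -> None:
        self._buffer: queue.Queue = queue.Queue()

    def send_async(self, message) -> None:
        # Asynchronous send: the message is buffered and the sender continues.
        self._buffer.put((message, None))

    def send_sync(self, message) -> None:
        # Synchronous send: block until the matching receive has executed.
        received = threading.Event()
        self._buffer.put((message, received))
        received.wait()

    def receive(self):
        # Blocking receive: wait until a message is available.
        message, received = self._buffer.get()
        if received is not None:
            received.set()  # unblock the synchronous sender
        return message
```

With `send_async` a sender can queue several messages before any receive runs; with `send_sync` the sender makes no progress until the receiver picks the message up.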

33
Remote Procedure Call
  • RPC is designed to hide all the details from
    programmers
  • Overcome the difficulties with message-passing
    model
  • It extends the conventional local procedure calls
    to calling procedures on remote computers

34
Steps of a Remote Procedure Call cont.
35
Remote Procedure Call cont.
  • Design issues
  • Structure
  • Mostly based on stub procedures
  • Binding
  • Through a binding server
  • The client specifies the machine and service
    required
  • Parameter and result passing
  • Representation issues
  • By value and by reference
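
The stub structure can be sketched without a real network. In this hypothetical example the client stub `remote_add` marshals its parameters by value into JSON, a `transport` function stands in for the message exchange, and `server_stub` unmarshals the request and calls the local procedure. All names and the JSON encoding are assumptions made for illustration.

```python
import json


def add(a: int, b: int) -> int:
    """The actual procedure, living on the server."""
    return a + b


# Table the server stub uses to find the requested procedure.
PROCEDURES = {"add": add}


def server_stub(request: bytes) -> bytes:
    # Unmarshal the request, call the local procedure, marshal the result.
    call = json.loads(request.decode("utf-8"))
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transport(request: bytes) -> bytes:
    # Stand-in for sending the request over the network and awaiting the reply.
    return server_stub(request)


def remote_add(a: int, b: int) -> int:
    # Client stub: looks like a local call, but passes parameters by value
    # in a machine-independent representation.
    request = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = transport(request)
    return json.loads(reply.decode("utf-8"))["result"]
```

Passing parameters by reference is harder because a pointer is meaningless in the server's address space, which is why this sketch copies values.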

36
Remote Object Invocation
  • Extend RPC principles to objects
  • The key feature of an object is that it
    encapsulates data (called state) and the
    operations on those data (called methods)
  • Methods are made available through an interface
  • The separation between interfaces and the objects
    implementing these interfaces allows us to place
    an interface at one machine, while the object
    itself resides on another machine

37
Distributed Objects
  • Common organization of a remote object with
    client-side proxy.

38
Inherent Limitations of a Distributed System
  • Absence of a global clock
  • In a centralized system, time is unambiguous
  • In a distributed system, there exists no
    system-wide common clock
  • In other words, the notion of global time does
    not exist
  • Impact of the absence of global time
  • Difficult to reason about temporal order of
    events
  • Makes it harder to collect up-to-date information
    on the state of the entire system

39
Inherent Limitations of a Distributed System
  • Absence of shared memory
  • An up-to-date state of the entire system is not
    available to any individual process
  • This information, however, is necessary for
    reasoning about the system's behavior, debugging,
    and recovering from failures

40
Lamport's Logical Clocks
  • Logical clocks
  • For a wide range of algorithms, what matters is
    the internal consistency of clocks, not whether
    they are close to the real time
  • For these algorithms, the clocks are often called
    logical clocks
  • Lamport proposed a scheme to order events in a
    distributed system using logical clocks

41
Lamport's Logical Clocks cont.
  • Definitions
  • Happened before relation
  • Happened before relation (→) captures the causal
    dependencies between events
  • It is defined as follows
  • a → b, if a and b are events in the same process
    and a occurred before b.
  • a → b, if a is the event of sending a message m
    in a process and b is the event of receipt of the
    same message m by another process
  • If a → b and b → c, then a → c, i.e., → is
    transitive

42
Lamport's Logical Clocks cont.
  • Definitions continued
  • Causally related events
  • Event a causally affects event b if a → b
  • Concurrent events
  • Two distinct events a and b are said to be
    concurrent (denoted by a ∥ b) if a ↛ b and b ↛ a
  • For any two events, either a → b, b → a, or a ∥ b

43
Lamport's Logical Clocks cont.
  • Implementation rules
  • IR1: Clock Ci is incremented between any two
    successive events in process Pi
  • Ci := Ci + d (d > 0)
  • IR2: If event a is the sending of message m by
    process Pi, then message m is assigned a
    timestamp tm = Ci(a). On receiving the same
    message m by process Pj, Cj is set to
  • Cj := max(Cj, tm + d)
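
Rules IR1 and IR2 can be sketched as a small class. The class name `LamportClock` and its method names are illustrative. One assumption is made explicit in the code: a receive is itself an event, so IR1 is applied before the maximum in IR2 is taken, which keeps the local clock strictly increasing.

```python
class LamportClock:
    """Scalar logical clock for one process."""

    def __init__(self, d: int = 1) -> None:
        if d <= 0:
            raise ValueError("d must be positive")
        self.d = d
        self.time = 0

    def tick(self) -> int:
        # IR1: increment the clock between successive local events.
        self.time += self.d
        return self.time

    def send(self) -> int:
        # Sending is an event; its clock value is the message timestamp tm.
        return self.tick()

    def receive(self, tm: int) -> int:
        # Assumption: apply IR1 for the receive event, then IR2,
        # i.e. Cj := max(Cj + d, tm + d).
        self.time = max(self.time + self.d, tm + self.d)
        return self.time
```

A short exchange between two clocks `p` and `q`: after `p.tick()` and `tm = p.send()`, the timestamp is 2, and `q.receive(tm)` moves `q` from 0 to 3.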

44
An Example
45
Total Ordering Using Lamport's Clocks
  • If a is any event at process Pi and b is any
    event at process Pj, then a ⇒ b if and only if
    either
  • Ci(a) < Cj(b), or
  • Ci(a) = Cj(b) and Pi ≺ Pj
  • Where ≺ is any arbitrary relation that
    totally orders the processes to break ties

46
A Limitation of Lamport's Clocks
  • In Lamport's system of logical clocks
  • If a → b, then C(a) < C(b)
  • The reverse is not necessarily true if the events
    have occurred on different processes

47
A Limitation of Lamport's Clocks cont.
48
Vector Clocks
  • Implementation rules
  • IR1: Clock Ci is incremented between any two
    successive events in process Pi
  • Ci[i] := Ci[i] + d (d > 0)
  • IR2: If event a is the sending of message m by
    process Pi, then message m is assigned a
    vector timestamp tm = Ci(a). On receiving the same
    message m by process Pj, Cj is updated as follows
  • For all k, Cj[k] := max(Cj[k], tm[k])
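
The two vector clock rules can be sketched as follows. The class name `VectorClock` and its methods are illustrative, and process ids are assumed to be indices 0..n-1. As an explicit assumption, the receive is treated as an event, so the receiver's own component is incremented (IR1) before the component-wise maximum (IR2) is taken.

```python
from typing import List


class VectorClock:
    """Vector clock Ci for process i in a system of n processes."""

    def __init__(self, i: int, n: int, d: int = 1) -> None:
        self.i = i
        self.d = d
        self.c = [0] * n

    def tick(self) -> List[int]:
        # IR1: Ci[i] := Ci[i] + d
        self.c[self.i] += self.d
        return list(self.c)

    def send(self) -> List[int]:
        # Sending is an event; a copy of the vector is the timestamp tm.
        return self.tick()

    def receive(self, tm: List[int]) -> List[int]:
        # Assumption: count the receive as a local event first (IR1).
        self.c[self.i] += self.d
        # IR2: for all k, Cj[k] := max(Cj[k], tm[k])
        self.c = [max(own, other) for own, other in zip(self.c, tm)]
        return list(self.c)
```

Each process advances only its own component and learns about the others through message timestamps.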

49
Vector Clocks cont.
50
Vector Clocks cont.
  • Assertion
  • At any instant, for all i and j, Ci[i] ≥ Cj[i]
  • Events a and b are causally related if ta < tb or
    tb < ta. Otherwise, these events are concurrent
  • In a system of vector clocks, a → b if and only
    if ta < tb
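
Comparing vector timestamps can be sketched as below. The sketch assumes the standard component-wise definition: ta < tb when every component of ta is less than or equal to the matching component of tb and the two vectors differ. The function names are illustrative.

```python
from typing import Sequence


def less_than(ta: Sequence[int], tb: Sequence[int]) -> bool:
    """ta < tb: no component larger, and at least one strictly smaller."""
    return all(x <= y for x, y in zip(ta, tb)) and tuple(ta) != tuple(tb)


def causally_related(ta: Sequence[int], tb: Sequence[int]) -> bool:
    return less_than(ta, tb) or less_than(tb, ta)


def concurrent(ta: Sequence[int], tb: Sequence[int]) -> bool:
    # Intended for the timestamps of two distinct events.
    return not causally_related(ta, tb)
```

For example, [2, 0] and [0, 1] are concurrent because each vector is larger in one component, whereas scalar Lamport timestamps alone could not reveal this.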

51
Causal Ordering of Messages
  • The causal ordering of messages tries to maintain
    the same causal relationship that holds among
    message send events with the corresponding
    message receive events
  • In other words, if Send(M1) → Send(M2), then
    Receive(M1) → Receive(M2)
  • This is different from causal ordering of events

52
Causal Ordering of Messages cont.
53
Causal Ordering of Messages cont.
  • The basic idea
  • It is very simple
  • Deliver a message only when no causality
    constraints are violated
  • Otherwise, the message is not delivered
    immediately but is buffered until all the
    preceding messages are delivered
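
The buffering idea can be sketched for broadcast messages that carry vector timestamps. This is a simplified illustration in the spirit of vector-clock-based causal broadcast, not a transcription of any protocol from the slides. It assumes that component k of a timestamp counts the broadcasts from process k that causally precede or equal the message; the class name `CausalReceiver` and its fields are hypothetical.

```python
from typing import List, Tuple


class CausalReceiver:
    """Delivers broadcast messages in causal order, buffering early ones."""

    def __init__(self, n: int) -> None:
        self.delivered_count = [0] * n  # messages delivered from each sender
        self.buffer: List[Tuple[int, List[int], str]] = []
        self.delivered: List[str] = []

    def _deliverable(self, sender: int, tm: List[int]) -> bool:
        # Next message expected from this sender, and everything the
        # sender had seen from other processes is already delivered here.
        return tm[sender] == self.delivered_count[sender] + 1 and all(
            tm[k] <= self.delivered_count[k]
            for k in range(len(tm))
            if k != sender
        )

    def on_arrival(self, sender: int, tm: List[int], payload: str) -> None:
        self.buffer.append((sender, tm, payload))
        progress = True
        while progress:  # a delivery may unblock buffered messages
            progress = False
            for item in list(self.buffer):
                s, t, p = item
                if self._deliverable(s, t):
                    self.buffer.remove(item)
                    self.delivered_count[s] = t[s]
                    self.delivered.append(p)
                    progress = True
```

If M2 (sent after its sender had received M1) arrives first, it stays in the buffer until M1 has been delivered.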

54
Birman-Schiper-Stephenson Protocol
55
Schiper-Eggli-Sandoz Protocol
56
Schiper-Eggli-Sandoz Protocol cont.
57
Schiper-Eggli-Sandoz Protocol cont.
58
Local State
  • Local state
  • For a site Si, its local state at a given time is
    defined by the local context of the distributed
    application, denoted by LSi.
  • More notations
  • mij denotes a message sent by Si to Sj
  • send(mij) and rec(mij) denote the corresponding
    sending and receiving event.

59
Definitions cont.
60
Definitions cont.
61
Global State cont.
62
Definitions cont.
  • Strongly consistent global state: A global state
    is strongly consistent if it is consistent and
    transitless
63
Global State cont.
64
Chandy-Lamport's Global State Recording Algorithm
65
Cuts of a Distributed Computation
  • A cut is a graphical representation of a global
    state
  • A consistent cut is a graphical representation of
    a consistent global state
  • Definition
  • A cut of a distributed computation is a set
    C = {c1, c2, ..., cn}, where ci is a cut event at
    site Si in the history of the distributed
    computation

66
Cuts of a Distributed Computation cont.
67
Cuts of a Distributed Computation cont.
68
Cuts of a Distributed Computation cont.
69
Cuts of a Distributed Computation cont.
70
Cuts of a Distributed Computation cont.
71
The Critical Section Problem
  • When processes (centralized or distributed)
    interact through shared resources, the integrity
    of the resources may be violated if the accesses
    are not coordinated
  • The resources may not record all the changes
  • A process may obtain inconsistent values
  • The final state of the shared resource may be
    inconsistent

72
Mutual Exclusion
  • One solution to the problem is that at any time
    at most only one process can access the shared
    resources
  • This solution is known as mutual exclusion
  • A critical section is a code segment in a process
    in which shared resources are accessed
  • A process can have more than one critical section
  • There are problems which involve shared resources
    where mutual exclusion is not the optimal solution

73
The Structure of Processes
  • Structure of process Pi
  • repeat
  • entry section
  • critical section
  • exit section
  • remainder section
  • until false
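
The process structure can be sketched with threads on a single machine, where a lock provides the entry and exit sections. This illustrates the shape of the loop only; in a distributed system the lock would be replaced by one of the message-based algorithms. The function name `run_workers` and the bounded loop (instead of "until false") are assumptions made so the example terminates.

```python
import threading


def run_workers(workers: int = 4, rounds: int = 1000) -> int:
    """Run several processes that increment a shared counter safely."""
    lock = threading.Lock()
    shared = {"counter": 0}

    def process() -> None:
        for _ in range(rounds):  # bounded stand-in for "repeat ... until false"
            lock.acquire()       # entry section
            try:
                shared["counter"] += 1  # critical section
            finally:
                lock.release()   # exit section
            # remainder section: work that does not touch shared resources

    threads = [threading.Thread(target=process) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]
```

Because at most one thread is inside the critical section at a time, no increment is lost.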

74
Requirements of Mutual Exclusion Algorithms
  • Freedom from deadlocks
  • Two or more sites should not endlessly wait for
    messages
  • Freedom from starvation
  • A site should not be forced to wait indefinitely
    to execute its critical section
  • Fairness
  • Requests are executed in the order based on
    logical clocks
  • Fault tolerance
  • It continues to work when some failures occur

75
Performance Measure for Distributed Mutual
Exclusion
  • The number of messages per CS invocation
  • Synchronization delay
  • The time required after a site leaves the CS and
    before the next site enters the CS
  • System throughput = 1/(sd + E), where sd is the
    synchronization delay and E is the average CS
    execution time
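
As a worked example, reading the throughput expression as 1/(sd + E) and assuming illustrative values of sd = 2 ms and E = 8 ms (these numbers are not from the slides):

```latex
\[
\text{throughput} = \frac{1}{sd + E}
  = \frac{1}{2\,\text{ms} + 8\,\text{ms}}
  = \frac{1}{10\,\text{ms}}
  = 100 \text{ CS executions per second}
\]
```

Halving the synchronization delay to 1 ms would raise this only to about 111 per second, since E dominates the sum in this example.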
  • Response time
  • The time interval a request waits for its CS
    execution to be over after its request messages
    have been sent out

76
Performance Measure for Distributed Mutual
Exclusion
77
A Centralized Algorithm
  • It is a simple solution
  • One site, called the control site, is responsible
    for granting permission to the CS execution
  • To request the CS, a site sends a REQUEST message
    to the control site
  • When a site is done with CS execution, it sends a
    RELEASE message to the control site
  • The control site queues up the requests for the
    CS and grants them permission one at a time
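
The control site's behavior can be sketched as a small state machine. Message passing is replaced by direct method calls, and the class name `ControlSite` and its return conventions are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Optional


class ControlSite:
    """Grants permission to enter the CS to one site at a time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None
        self.waiting: Deque[str] = deque()

    def request(self, site: str) -> bool:
        """Handle a REQUEST; return True if permission is granted at once."""
        if self.holder is None:
            self.holder = site
            return True
        self.waiting.append(site)  # queue the request in arrival order
        return False

    def release(self, site: str) -> Optional[str]:
        """Handle a RELEASE; return the next site granted, if any."""
        if site != self.holder:
            raise ValueError("only the current holder may release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

Each CS execution costs three messages in this scheme (request, grant, release), and the control site is a single point of failure.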

78
Distributed Solutions
  • Non-token-based algorithms
  • Use timestamps to order requests and resolve
    conflicts between simultaneous requests
  • Lamport's algorithm and the Ricart-Agrawala algorithm
  • Token-based algorithms
  • A unique token is shared among the sites
  • A site is allowed to enter the CS if it possesses
    the token, and it continues to hold the token until
    its CS execution is over; then it passes the
    token to the next site

79
Lamport's Distributed Mutual Exclusion Algorithm
  • This algorithm is based on the total ordering
    using Lamport's clocks
  • Each process keeps a Lamport logical clock
  • Each process is associated with a unique id that
    can be used to break the ties
  • In the algorithm, each process keeps a queue,
    request_queue_i, which contains mutual exclusion
    requests ordered by their timestamp and
    associated id
  • The request set Ri of each process consists of
    all the processes
  • The communication channel is assumed to be FIFO

80
Lamport's Distributed Mutual Exclusion Algorithm
cont.
81
Lamport's Distributed Mutual Exclusion Algorithm
cont.
82
Ricart-Agrawala Algorithm
83
A Simple Token Ring Algorithm
  • When the ring is initialized, one process is
    given the token
  • The token circulates around the ring
  • It is passed from k to k+1 (modulo the ring size)
  • When a process acquires the token from its
    neighbor, it checks to see if it is waiting to
    enter its critical section
  • If so, it enters its CS
  • When exiting from its CS, it passes the token to
    the next
  • Otherwise, it passes the token to the next
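
One trip of the token around the ring can be simulated as follows. The function name `circulate` and its parameters are illustrative; processes are assumed to be numbered 0 to ring_size - 1, and each critical section is assumed to finish before the token moves on.

```python
from typing import Iterable, List


def circulate(ring_size: int, waiting: Iterable[int], start: int = 0) -> List[int]:
    """Return the order in which waiting processes enter their CS
    during one full trip of the token, starting at process `start`."""
    pending = set(waiting)
    order: List[int] = []
    holder = start
    for _ in range(ring_size):  # one full circulation
        if holder in pending:
            order.append(holder)    # enter and leave the CS
            pending.remove(holder)
        # Pass the token from k to k+1 (modulo the ring size).
        holder = (holder + 1) % ring_size
    return order
```

With five processes, the token starting at process 2, and processes 1 and 3 waiting, process 3 enters first because the token reaches it first.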

84
Suzuki-Kasami's Algorithm
  • Data structures
  • Each site maintains a vector consisting of the
    largest sequence number received so far from
    other sites
  • The token consists of a queue of requesting sites
    and an array of integers, consisting of the
    sequence number of the request that a site
    executed most recently

85
Suzuki-Kasami's Algorithm cont.
86
Distributed Deadlock Detection
  • In distributed systems, the system state can be
    represented by a wait-for graph (WFG)
  • In WFG, nodes are processes and there is a
    directed edge from node P1 to node P2 if P1 is
    blocked and is waiting for P2 to release some
    resource
  • The system is deadlocked if there is a directed
    cycle or knot in its WFG
  • The problem is how to maintain the WFG and detect
    cycle/knot in the graph
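
Cycle detection on a WFG that is already assembled at one site can be sketched with a depth-first search. The function name `has_cycle` and the dictionary representation (process name mapped to the list of processes it waits for) are assumptions. The sketch checks only for cycles, not knots, and leaves aside the harder distributed problem of building the graph.

```python
from typing import Dict, List


def has_cycle(wfg: Dict[str, List[str]]) -> bool:
    """Return True if the wait-for graph contains a directed cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color: Dict[str, int] = {}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in wfg.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:  # edge back into the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(wfg))
```

For instance, P1 waiting for P2, P2 for P3, and P3 for P1 forms a cycle, while removing the last edge leaves a chain with no deadlock.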

87
Distributed Deadlock Detection cont.
  • Centralized detection algorithms
  • Distributed deadlock algorithms
  • Path-pushing
  • Edge-chasing
  • Diffusion computation
  • Global state detection
  • You need to know the basic ideas but not the
    details about those algorithms

88
Agreement Protocols
  • In distributed systems, sites are often required
    to reach mutual agreement
  • In distributed database systems, data managers
    must agree on whether to commit or to abort a
    transaction
  • Reaching an agreement requires the sites have
    knowledge about values at other sites
  • Agreement when the system is free from failures
  • Agreement when the system is prone to failure

89
Agreement Problems
  • There are three well known agreement problems
  • Byzantine agreement problem
  • Consensus problem
  • Interactive consistency problem

90
Lamport-Shostak-Pease Algorithm
91
Lamport-Shostak-Pease Algorithm cont.