Distributed Systems

1
Distributed Systems
  • CSE 380
  • Lecture Note 13
  • Insup Lee

2
Introduction to Distributed Systems
  • Why do we develop distributed systems?
  • Availability of powerful yet cheap
    microprocessors (PCs, workstations) and continuing
    advances in communication technology.
  • What is a distributed system?
  • A distributed system is a collection of
    independent computers that appear to the users of
    the system as a single system.
  • Examples
  • Network of workstations
  • Distributed manufacturing system (e.g., automated
    assembly line)
  • Network of branch office computers

3
Advantages of Distributed Systems over
Centralized Systems
  • Economics: a collection of microprocessors offers
    better price/performance than mainframes. A low
    price/performance ratio is a cost-effective way to
    increase computing power.
  • Speed: a distributed system may have more total
    computing power than a mainframe. Ex.: 10,000 CPU
    chips, each running at 50 MIPS. It is not possible
    to build a 500,000 MIPS single processor, since it
    would require a 0.002 nsec instruction cycle.
    Enhanced performance through load distribution.
  • Inherent distribution: some applications are
    inherently distributed. Ex.: a supermarket chain.
  • Reliability: if one machine crashes, the system
    as a whole can still survive. Higher availability
    and improved reliability.
  • Incremental growth: computing power can be added
    in small increments. Modular expandability.
  • Another driving force: the existence of a large
    number of personal computers and the need for
    people to collaborate and share information.
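
The arithmetic behind the Speed example can be checked with a short sketch. The function names are illustrative only, and the aggregate figure assumes the ideal case in which all CPUs are kept busy in parallel.

```python
# Figures taken from the slide: 10,000 CPU chips at 50 MIPS each.
NUM_CPUS = 10_000
MIPS_PER_CPU = 50


def aggregate_mips(num_cpus, mips_per_cpu):
    """Total instruction rate if every CPU runs in parallel (ideal case)."""
    return num_cpus * mips_per_cpu


def cycle_time_nsec(mips):
    """Time per instruction, in nanoseconds, for one processor rated at `mips`."""
    instructions_per_second = mips * 1e6
    return 1e9 / instructions_per_second


total = aggregate_mips(NUM_CPUS, MIPS_PER_CPU)
print(total, "MIPS in aggregate")
print(cycle_time_nsec(total), "nsec per instruction for an equivalent single CPU")
```

A single processor matching the aggregate rate would need the 0.002 nsec cycle quoted above, which is why the slide calls it infeasible.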

4
Advantages of Distributed Systems over
Independent PCs
  • Data sharing: allow many users to access a
    common database
  • Resource sharing: expensive peripherals like
    color printers
  • Communication: enhance human-to-human
    communication, e.g., email, chat
  • Flexibility: spread the workload over the
    available machines

5
Disadvantages of Distributed Systems
  • Software: difficult to develop software for
    distributed systems
  • Network: saturation, lossy transmissions
  • Security: easy access also applies to secret
    data

6
Hardware Concepts
  • Taxonomy (Fig. 9-4)
  • MIMD (Multiple-Instruction Multiple-Data)
  • Tightly Coupled versus Loosely Coupled
  • Tightly coupled systems (multiprocessors)
  • shared memory
  • intermachine delay short, data rate high
  • Loosely coupled systems (multicomputers)
  • private memory
  • intermachine delay long, data rate low

7
Bus versus Switched MIMD
  • Bus: a single network, backplane, bus, cable or
    other medium that connects all machines. E.g.,
    cable TV
  • Switched: individual wires from machine to
    machine, with many different wiring patterns in
    use.
  • Multiprocessors (shared memory)
  • Bus
  • Switched
  • Multicomputers (private memory)
  • Bus
  • Switched

8
Bus-based Multiprocessors
  • Bus-based multiprocessors (Fig. 9-5)
  • cache memory
  • hit rate
  • cache coherence
  • write-through cache: propagates writes to memory
    immediately
  • snoopy cache: monitors the bus to detect when one
    of its entries becomes obsolete
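
A minimal sketch of how a write-through cache and a snoopy cache interact on a shared bus. The class and method names are hypothetical, and the choice to invalidate (rather than update) a stale entry is an assumption; the slide does not say which policy is used.

```python
class Bus:
    """Shared bus: main memory plus every cache that snoops on it."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def write(self, writer, addr, value):
        # Write-through: the write reaches main memory immediately.
        self.memory[addr] = value
        # Every other cache sees the write on the bus and reacts to it.
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(addr)


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        self.hits = 0
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr):
        if addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)

    def snoop(self, addr):
        # Assumed policy: drop the now-obsolete entry (invalidate).
        self.lines.pop(addr, None)


bus = Bus()
cpu_a, cpu_b = SnoopyCache(bus), SnoopyCache(bus)
cpu_a.write(0x10, 1)
print(cpu_b.read(0x10))   # miss, fetched from memory
cpu_a.write(0x10, 2)      # cpu_b's copy is invalidated by snooping
print(cpu_b.read(0x10))   # miss again, sees the new value
print(cpu_b.hits / (cpu_b.hits + cpu_b.misses))  # hit rate so far
```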

9
Switched Multiprocessors
  • Switched Multiprocessors (Fig. 9-6)
  • for connecting a large number (say over 64) of
    processors
  • crossbar switch: n^2 switch points
  • omega network: 2x2 switches; for n CPUs and n
    memories, log2 n switching stages, each with n/2
    switches,
  • total: (n log2 n)/2 switches
  • delay problem: e.g., n = 1024 gives 10 switching
    stages from CPU to memory, so a total of 20
    switching stages for the round trip. At 100 MIPS
    (10 nsec instruction execution time), this needs
    a 0.5 nsec switching time
  • NUMA (Non-Uniform Memory Access): placement of
    program and data matters
  • building a large, tightly-coupled, shared memory
    multiprocessor is possible, but is difficult and
    expensive
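
The switch counts and the delay estimate above can be reproduced with a few lines. This is a sketch that assumes n is a power of two and that a memory access crosses the omega network twice (request and reply); the function names are illustrative.

```python
def crossbar_switch_points(n):
    """A crossbar connecting n CPUs to n memories needs n^2 switch points."""
    return n * n


def omega_stages(n):
    """Number of switching stages, log2(n), for n a power of two."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1


def omega_switches(n):
    """Each of the log2(n) stages holds n/2 two-by-two switches."""
    return omega_stages(n) * n // 2


def required_switch_time_nsec(n, mips):
    """Per-switch time budget if a round trip must fit in one instruction time."""
    instruction_time_nsec = 1e3 / mips
    round_trip_stages = 2 * omega_stages(n)
    return instruction_time_nsec / round_trip_stages


n = 1024
print(crossbar_switch_points(n))          # switch points in a crossbar
print(omega_stages(n), omega_switches(n)) # stages and total 2x2 switches
print(required_switch_time_nsec(n, 100))  # nsec per switch at 100 MIPS
```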

10
Multicomputers
  • Bus-Based Multicomputers (Fig. 9-7)
  • easy to build
  • communication volume much smaller
  • relatively slow speed LAN (10-100 Mbps, compared
    to 300 Mbps and up for a backplane bus)
  • Switched Multicomputers (Fig. 9-8)
  • interconnection networks: e.g., grid, hypercube
  • hypercube: n-dimensional cube
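
As an illustration of the hypercube topology, the sketch below labels the 2^n nodes with n-bit numbers and links two nodes when their labels differ in exactly one bit. The function names are illustrative.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a dim-dimensional hypercube: flip one bit at a time."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hops(a, b):
    """Minimum number of links between nodes a and b (count of differing bits)."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b000, 3))  # the three neighbors of node 0 in a 3-cube
print(hops(0b000, 0b111))             # opposite corners of the 3-cube
```

Each node has n links, and the longest path between any two nodes is n hops, which is why the wiring grows only logarithmically with the number of machines.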

11
Software Concepts
  • Software matters more to users than hardware
  • Three types
  • Network Operating Systems
  • (True) Distributed Systems
  • Multiprocessor Time Sharing

12
Network Operating Systems
  • loosely-coupled software on loosely-coupled
    hardware
  • A network of workstations connected by LAN
  • each machine has a high degree of autonomy
  • rlogin machine
  • rcp machine1:file1 machine2:file2
  • File servers: client and server model
  • Clients mount directories on file servers
  • Best known network OS
  • Sun's NFS (Network File System) for shared file
    systems (Fig. 9-11)
  • a few system-wide requirements: format and
    meaning of all the messages exchanged

13
NFS
  • NFS Architecture
  • Server exports directories
  • Clients mount exported directories
  • NFS Protocols
  • For handling mounting
  • For read/write: no open/close, stateless
  • NFS Implementation
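
To illustrate what "no open/close, stateless" means for the read/write protocol, here is a toy sketch. It is not the NFS wire protocol: `ReadRequest`, `StatelessFileServer`, and the string file handles are hypothetical names. The point is only that each request carries everything the server needs, so the server keeps no per-client table of open files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRequest:
    file_handle: str  # identifies the file (hypothetical handle format)
    offset: int       # where to start reading
    count: int        # how many bytes to read


class StatelessFileServer:
    """Toy server: no open-file table and no per-client state."""

    def __init__(self, exported_files):
        self._files = exported_files  # handle -> file contents (bytes)

    def read(self, request):
        data = self._files[request.file_handle]
        return data[request.offset:request.offset + request.count]


server = StatelessFileServer({"fh1": b"hello world"})
print(server.read(ReadRequest("fh1", 6, 5)))
# The same request can be retried after a server restart and gives the
# same answer, because nothing about the client was remembered.
print(server.read(ReadRequest("fh1", 6, 5)))
```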

14
(True) Distributed Systems
  • tightly-coupled software on loosely-coupled
    hardware
  • provide a single-system image or a virtual
    uniprocessor
  • a single, global interprocess communication
    mechanism, process management, and file system;
    the same system call interface everywhere
  • Ideal definition
  • A distributed system runs on a collection of
    computers that do not have shared memory, yet
    looks like a single computer to its users.

15
Multiprocessor Operating Systems
  • (Fig. 9-12)
  • Tightly-coupled software on tightly-coupled
    hardware
  • Examples: high-performance servers
  • shared memory
  • single run queue
  • traditional file system as on a single-processor
    system: central block cache
  • Fig. 9-13 for comparisons

16
Design Issues of Distributed Systems
  • Transparency
  • Flexibility
  • Reliability
  • Performance
  • Scalability

17
1. Transparency
  • How to achieve the single-system image, i.e., how
    to make a collection of computers appear as a
    single computer.
  • Hiding all the distribution from the users as
    well as the application programs can be achieved
    at two levels:
  • 1) hide the distribution from users
  • 2) at a lower level, make the system look
    transparent to programs.
  • Both 1) and 2) require uniform interfaces, such
    as for access to files and communication.

18
Types of transparency
  • Location Transparency: users cannot tell where
    hardware and software resources such as CPUs,
    printers, files, and databases are located.
  • Migration Transparency: resources must be free to
    move from one location to another without their
    names changing. E.g., /usr/lee, /central/usr/lee
  • Replication Transparency: the OS can make
    additional copies of files and resources without
    users noticing.
  • Concurrency Transparency: the users are not
    aware of the existence of other users. Need to
    allow multiple users to concurrently access the
    same resource. Lock and unlock for mutual
    exclusion.
  • Parallelism Transparency: automatic use of
    parallelism without having to program explicitly.
    The holy grail for distributed and parallel
    system designers.
  • Users do not always want complete transparency,
    e.g., a fancy printer 1000 miles away.
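
The "lock and unlock for mutual exclusion" idea under Concurrency Transparency can be sketched on a single machine with threads standing in for users. This is only an analogy: a real distributed system would need a distributed locking mechanism, and `SharedCounter` and `run_users` are hypothetical names.

```python
import threading


class SharedCounter:
    """A shared resource whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Lock before touching the resource, unlock when done.
        with self._lock:
            self.value += 1


def run_users(num_users, increments_each):
    """Let several 'users' (threads) update the same resource concurrently."""
    counter = SharedCounter()

    def user():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=user) for _ in range(num_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_users(4, 1000))  # every update is applied exactly once
```

Each user's code is written as if it were the only one accessing the counter; the locking is hidden inside the resource.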

19
2. Flexibility
  • Make it easier to change
  • Monolithic kernel: system calls are trapped and
    executed by the kernel. All system calls are
    served by the kernel, e.g., UNIX.
  • Microkernel: provides minimal services (Fig.
    9-15): 1) IPC, 2) some memory management, 3) some
    low-level process management and scheduling, 4)
    low-level I/O. E.g., Mach can support multiple
    file systems and multiple system interfaces.

20
3. Reliability
  • A distributed system should be more reliable than
    a single system. Example: 3 machines, each with
    0.95 probability of being up, give a probability
    of 1 - 0.05^3 that at least one is up.
  • Availability: fraction of time the system is
    usable. Redundancy improves it.
  • Need to maintain consistency
  • Need to be secure
  • Fault tolerance: need to mask failures, recover
    from errors.
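
The reliability example can be worked through numerically. The sketch assumes machine failures are independent, which the slide does not state; the second function is added only for contrast, for a system that needs every machine to be up.

```python
def prob_at_least_one_up(p_up, machines):
    """Probability that at least one of `machines` independent machines is up."""
    return 1 - (1 - p_up) ** machines


def prob_all_up(p_up, machines):
    """Probability that every machine is up (contrast case: no redundancy)."""
    return p_up ** machines


# Slide example: 3 machines, each up with probability 0.95.
print(f"{prob_at_least_one_up(0.95, 3):.6f}")  # 1 - 0.05^3
print(f"{prob_all_up(0.95, 3):.6f}")           # 0.95^3
```

Redundancy helps only when any one machine can carry on alone; if the system depends on all three machines, it is less available than a single machine.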

21
4. Performance
  • Without a gain here, why bother with distributed
    systems?
  • Performance loss due to communication delays
  • fine-grain parallelism: high degree of
    interaction
  • coarse-grain parallelism
  • Performance loss due to making the system fault
    tolerant.

22
5. Scalability
  • Systems grow with time or become obsolete.
    Techniques whose resource requirements grow
    linearly with the size of the system are not
    scalable. E.g., broadcast-based queries won't
    work for large distributed systems.
  • Examples of bottlenecks
  • Centralized components: a single mail server
  • Centralized tables: a single URL address book
  • Centralized algorithms: routing based on complete
    information
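
A rough message count shows why broadcast-based queries stop working as the system grows. The sketch assumes every machine issues the same number of queries and that each query sends one message to every other machine; both are simplifying assumptions, and the function names are illustrative.

```python
def broadcast_messages_per_query(n):
    """One query reaches every other machine: cost grows linearly with n."""
    return n - 1


def total_broadcast_messages(n, queries_per_machine):
    """System-wide traffic when all n machines issue queries."""
    return n * queries_per_machine * broadcast_messages_per_query(n)


for n in (10, 100, 1000):
    print(n, broadcast_messages_per_query(n), total_broadcast_messages(n, 1))
```

The per-query cost is linear in the number of machines, so the total traffic grows roughly quadratically when every machine participates.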

23
Communication Networks
  • Computers are connected through a communication
    network
  • Wide Area Networks (WAN)
  • connect computers spread over a wide geographic
    area
  • point-to-point or store-and-forward -- data is
    transferred between computers through a series of
    switches
  • switch -- a special purpose computer responsible
    for routing data (to avoid network congestion)
  • data can be lost due to switch crashes,
    communication link failures, limited buffers at
    switches, transmission errors, etc.

24
  • Packet Switching versus Circuit Switching
  • i) circuit switching -- a dedicated path
    between a source and a destination, e.g., a
    telephone connection. Wastes bandwidth
    (bandwidth: amount of data transmitted
    in a given time period).
  • ii) packet switching -- message or data is
    broken into packets; packets are routed
    independently; better network utilization;
    disassembly and reassembly overheads.
  • The ISO OSI Reference Model
  • Local Area Networks (LAN)
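
The packet-switching description above (a message is broken into packets, routed independently, then put back together) can be sketched as follows. `Packet`, `disassemble`, and `reassemble` are hypothetical names, and shuffling the list stands in for packets taking different routes and arriving out of order.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int        # sequence number, needed to restore the original order
    payload: bytes


def disassemble(message, packet_size):
    """Break a message into fixed-size packets (the last one may be shorter)."""
    return [
        Packet(seq, message[start:start + packet_size])
        for seq, start in enumerate(range(0, len(message), packet_size))
    ]


def reassemble(packets):
    """Rebuild the message from packets that may have arrived in any order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


message = b"packets are routed independently"
packets = disassemble(message, 8)
random.shuffle(packets)  # simulate out-of-order arrival
print(reassemble(packets) == message)
```

The sequence numbers and the sorting step are the disassembly and reassembly overhead that circuit switching avoids.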