Distributed Systems - PowerPoint PPT Presentation

About This Presentation

Title:

Distributed Systems

Provided by: cisUpenn5

Learn more at: https://www.cis.upenn.edu

Transcript and Presenter's Notes

1
Distributed Systems

CSE 380
Lecture Note 13
Insup Lee

2
Introduction to Distributed Systems

Why do we develop distributed systems?
Availability of powerful yet cheap microprocessors (PCs, workstations) and continuing advances in communication technology.
What is a distributed system?
A distributed system is a collection of
independent computers that appear to the users of
the system as a single system.
Examples
Network of workstations
Distributed manufacturing system (e.g., automated
assembly line)
Network of branch office computers

3
Advantages of Distributed Systems over
Centralized Systems

Economics: a collection of microprocessors offers better price/performance than mainframes. A low price/performance ratio is a cost-effective way to increase computing power.
Speed: a distributed system may have more total computing power than a mainframe. Ex.: 10,000 CPU chips, each running at 50 MIPS, give 500,000 MIPS in total. It is not possible to build a 500,000 MIPS single processor, since it would require a 0.002 nsec instruction cycle. Performance is also enhanced through load distribution.
Inherent distribution: some applications are inherently distributed. Ex.: a supermarket chain.
Reliability: if one machine crashes, the system as a whole can still survive. Higher availability and improved reliability.
Incremental growth: computing power can be added in small increments. Modular expandability.
Another driving force: the existence of a large number of personal computers and the need for people to collaborate and share information.
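
The speed example on this slide is plain arithmetic, and a short script can check it. This is only a sketch; the variable names are illustrative and not part of the lecture.

```python
# Aggregate throughput of many small CPUs, and the cycle time a single
# processor would need to match it (figures taken from the slide).
NUM_CPUS = 10_000
MIPS_PER_CPU = 50

total_mips = NUM_CPUS * MIPS_PER_CPU              # millions of instructions/sec
instructions_per_sec = total_mips * 1_000_000
cycle_time_ns = 1e9 / instructions_per_sec        # nanoseconds per instruction

print(f"total computing power: {total_mips:,} MIPS")
print(f"required instruction cycle: {cycle_time_ns} nsec")
```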

4
Advantages of Distributed Systems over
Independent PCs

Data sharing: allow many users to access a common database
Resource sharing: expensive peripherals like color printers
Communication: enhance human-to-human communication, e.g., email, chat
Flexibility: spread the workload over the available machines

5
Disadvantages of Distributed Systems

Software: difficult to develop software for distributed systems
Network: saturation, lossy transmissions
Security: easy access also applies to secret data

6
Hardware Concepts

Taxonomy (Fig. 9-4)
MIMD (Multiple-Instruction Multiple-Data)
Tightly Coupled versus Loosely Coupled
Tightly coupled systems (multiprocessors)
shared memory
intermachine delay short, data rate high
Loosely coupled systems (multicomputers)
private memory
intermachine delay long, data rate low

7
Bus versus Switched MIMD

Bus: a single network, backplane, bus, cable, or other medium that connects all machines, e.g., cable TV
Switched: individual wires from machine to machine, with many different wiring patterns in use
Multiprocessors (shared memory)
Bus
Switched
Multicomputers (private memory)
Bus
Switched

8
Bus-based Multiprocessors

Bus-based multiprocessors (Fig. 9-5)
cache memory
hit rate
cache coherence
write-through cache: propagates every write to memory immediately
snoopy cache: monitors the bus to detect when one of its entries becomes obsolete
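
The interplay of write-through and snooping can be sketched as a toy simulation. The `Bus` and `SnoopyCache` classes below are hypothetical names invented for illustration. Invalidating a stale entry (rather than updating it in place) is one possible policy, assumed here for simplicity.

```python
class Bus:
    """Shared bus: main memory plus every cache that snoops on it."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def write(self, source, addr, value):
        self.memory[addr] = value            # write-through: memory updated at once
        for cache in self.caches:
            if cache is not source:
                cache.snoop(addr)            # other caches observe the write


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                      # addr -> cached value
        self.hits = 0
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr):
        if addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)

    def snoop(self, addr):
        self.lines.pop(addr, None)           # entry is obsolete: invalidate it


bus = Bus()
a, b = SnoopyCache(bus), SnoopyCache(bus)
a.write(0x10, 1)
first = b.read(0x10)      # miss: fetched from memory
a.write(0x10, 2)          # b snoops the write and drops its stale copy
second = b.read(0x10)     # miss again, now sees the new value
third = b.read(0x10)      # hit
hit_rate = b.hits / (b.hits + b.misses)
```

Here `hit_rate` is the fraction of reads served from the cache, which is the quantity the slide calls the hit rate.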

9
Switched Multiprocessors

Switched Multiprocessors (Fig. 9-6)
For connecting a large number (say, over 64) of processors
Crossbar switch: n^2 switch points
Omega network: 2x2 switches for n CPUs and n memories; log2 n switching stages, each with n/2 switches, for a total of (n log2 n)/2 switches
Delay problem: e.g., with n = 1024 there are 10 switching stages from CPU to memory, so a round trip passes through 20 switching stages. At 100 MIPS (10 nsec instruction execution time), this requires a 0.5 nsec switching time.
NUMA (Non-Uniform Memory Access): placement of program and data matters
Building a large, tightly-coupled, shared-memory multiprocessor is possible, but difficult and expensive
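
The switch counts and the delay estimate above can be reproduced with a few helper functions. This is a sketch that assumes n is a power of two; the function names are illustrative.

```python
def crossbar_switches(n: int) -> int:
    """Switch points in an n x n crossbar."""
    return n * n


def omega_stages(n: int) -> int:
    """log2(n) switching stages; assumes n is a power of two."""
    return n.bit_length() - 1


def omega_switches(n: int) -> int:
    """Total 2x2 switches: log2(n) stages with n/2 switches each."""
    return omega_stages(n) * (n // 2)


def required_switch_time_ns(n: int, mips: float) -> float:
    """Switching time needed so that one memory round trip fits
    within a single instruction time."""
    instruction_time_ns = 1000.0 / mips        # 100 MIPS -> 10 nsec
    round_trip_stages = 2 * omega_stages(n)    # request out, reply back
    return instruction_time_ns / round_trip_stages


n = 1024
print(crossbar_switches(n), omega_switches(n), required_switch_time_ns(n, 100))
```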

10
Multicomputers

Bus-Based Multicomputers (Fig. 9-7)
Easy to build
Communication volume is much smaller than CPU-to-memory traffic
Relatively slow-speed LAN (10-100 Mbps, compared to 300 Mbps and up for a backplane bus)
Switched Multicomputers (Fig. 9-8)
Interconnection networks: e.g., grid, hypercube
Hypercube: n-dimensional cube
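
One common way to picture a hypercube is to label each of its 2^n nodes with an n-bit number and connect nodes whose labels differ in exactly one bit. Under that labeling (an assumption of this sketch, with illustrative function names), neighbors and hop counts fall out of simple bit operations.

```python
def neighbors(node: int, dim: int) -> list[int]:
    """Nodes directly wired to `node` in a dim-dimensional hypercube:
    flip each of the dim label bits in turn."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hops(src: int, dst: int) -> int:
    """Minimum number of links between two nodes: the number of
    label bits in which they differ (Hamming distance)."""
    return bin(src ^ dst).count("1")


print(neighbors(0b000, 3))
print(hops(0b000, 0b111))
```

Each node has only n links, yet any two of the 2^n nodes are at most n hops apart.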

11
Software Concepts

Software is more important to users than hardware
Three types
Network Operating Systems
(True) Distributed Systems
Multiprocessor Time Sharing

12
Network Operating Systems

loosely-coupled software on loosely-coupled
hardware
A network of workstations connected by a LAN
Each machine has a high degree of autonomy
rlogin machine
rcp machine1:file1 machine2:file2
File servers: client and server model
Clients mount directories on file servers
Best-known network OS: Sun's NFS (Network File System) for shared file systems (Fig. 9-11)
A few system-wide requirements: format and meaning of all the messages exchanged

13
NFS

NFS Architecture
Server exports directories
Clients mount exported directories
NFS Protocols
For handling mounting
For read/write: no open/close, stateless
NFS Implementation
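
The "no open/close, stateless" idea can be illustrated with a toy model. This is not the real NFS protocol or API. `StatelessFileServer` and `Client` are hypothetical classes, and a plain string stands in for a file handle. The point is only that every request carries everything the server needs, so the server remembers nothing between calls.

```python
class StatelessFileServer:
    """Toy server that keeps no per-client state between requests."""

    def __init__(self, files: dict[str, bytes]):
        self.files = files

    def read(self, handle: str, offset: int, count: int) -> bytes:
        # Handle, offset and count all arrive in the request itself.
        return self.files[handle][offset:offset + count]

    def write(self, handle: str, offset: int, data: bytes) -> int:
        old = self.files.get(handle, b"").ljust(offset, b"\0")
        self.files[handle] = old[:offset] + data + old[offset + len(data):]
        return len(data)


class Client:
    """The client, not the server, tracks the current file position."""

    def __init__(self, server: StatelessFileServer, handle: str):
        self.server = server
        self.handle = handle
        self.pos = 0

    def read(self, count: int) -> bytes:
        data = self.server.read(self.handle, self.pos, count)
        self.pos += len(data)
        return data


server = StatelessFileServer({"notes": b"hello world"})
client = Client(server, "notes")
print(client.read(5), client.read(6))
```

Because the server holds no open-file table, a server restart between the two reads would not invalidate the client's position in this model.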

14
(True) Distributed Systems

tightly-coupled software on loosely-coupled
hardware
provide a single-system image or a virtual
uniprocessor
a single, global interprocess communication mechanism, process management, and file system; the same system call interface everywhere
Ideal definition
A distributed system runs on a collection of
computers that do not have shared memory, yet
looks like a single computer to its users.

15
Multiprocessor Operating Systems

(Fig. 9-12)
Tightly-coupled software on tightly-coupled
hardware
Examples: high-performance servers
shared memory
single run queue
traditional file system as on a single-processor system: central block cache
See Fig. 9-13 for comparisons

16
Design Issues of Distributed Systems

Transparency
Flexibility
Reliability
Performance
Scalability

17
1. Transparency

How to achieve the single-system image, i.e., how
to make a collection of computers appear as a
single computer.
Hiding all the distribution from the users as well as the application programs can be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to programs.
Both 1) and 2) require uniform interfaces, such as for access to files and communication.

18
Types of transparency

Location Transparency: users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located.
Migration Transparency: resources must be free to move from one location to another without their names changing. E.g., /usr/lee, /central/usr/lee
Replication Transparency: the OS can make additional copies of files and resources without users noticing.
Concurrency Transparency: the users are not aware of the existence of other users. Need to allow multiple users to access the same resource concurrently, using lock and unlock for mutual exclusion.
Parallelism Transparency: automatic use of parallelism without having to program it explicitly. The holy grail for distributed and parallel system designers.
Users do not always want complete transparency, e.g., a fancy printer 1000 miles away.
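
The lock/unlock idea behind concurrency transparency can be sketched on a single machine with threads standing in for users. This is an illustration only; a real distributed system would need a distributed locking mechanism, and the names below are hypothetical.

```python
import threading


class SharedCounter:
    """A resource shared by several concurrent users."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:         # lock on entry, unlock on exit
            self.value += 1      # critical section: one user at a time


def run_users(num_users: int, updates_each: int) -> int:
    counter = SharedCounter()

    def user():
        for _ in range(updates_each):
            counter.increment()

    threads = [threading.Thread(target=user) for _ in range(num_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_users(num_users=4, updates_each=10_000))
```

Each user calls `increment()` as if it were alone; the mutual exclusion is hidden inside the shared resource.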

19
2. Flexibility

Make it easier to change
Monolithic kernel: system calls are trapped and executed by the kernel. All system calls are served by the kernel, e.g., UNIX.
Microkernel: provides minimal services (Fig. 9-15):
1) IPC
2) some memory management
3) some low-level process management and scheduling
4) low-level I/O
E.g., Mach can support multiple file systems and multiple system interfaces.

20
3. Reliability

A distributed system should be more reliable than a single system. Example: 3 machines, each with a .95 probability of being up, give a 1 - .05^3 = .999875 probability that at least one is up.
Availability: fraction of time the system is usable. Redundancy improves it.
Need to maintain consistency
Need to be secure
Fault tolerance: need to mask failures and recover from errors.
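
The three-machine example can be generalized with a one-line formula. The sketch below assumes machines fail independently, which the slide's arithmetic implicitly relies on; the function name is illustrative.

```python
def prob_at_least_one_up(p_up: float, machines: int) -> float:
    """Probability that at least one of `machines` is up, assuming
    each is up with probability p_up and failures are independent."""
    p_all_down = (1 - p_up) ** machines
    return 1 - p_all_down


for k in (1, 2, 3):
    print(k, round(prob_at_least_one_up(0.95, k), 6))
```

Note that this only measures whether some machine is up. If the service needs all machines up at once, adding machines lowers the probability instead.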

21
4. Performance

Without a performance gain, why bother with distributed systems?
Performance loss due to communication delays
fine-grain parallelism: high degree of interaction
coarse-grain parallelism
Performance loss due to making the system fault tolerant.

22
5. Scalability

Systems grow with time or become obsolete.
Techniques whose resource requirements grow linearly with the size of the system are not scalable, e.g., a broadcast-based query won't work for large distributed systems.
Examples of bottlenecks:
Centralized components: a single mail server
Centralized tables: a single URL address book
Centralized algorithms: routing based on complete information
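
A rough count shows why broadcast-based queries stop working as the system grows. The sketch assumes a simple model in which one query costs one message per other node; the function names and the query rate are illustrative.

```python
def broadcast_messages(num_nodes: int) -> int:
    """Messages for one broadcast query: one per other node."""
    return max(num_nodes - 1, 0)


def total_messages(num_nodes: int, queries_per_node: int) -> int:
    """Network-wide load if every node issues the same number of queries."""
    return num_nodes * queries_per_node * broadcast_messages(num_nodes)


for n in (10, 100, 1000):
    print(n, broadcast_messages(n), total_messages(n, queries_per_node=1))
```

The per-query cost grows linearly with the number of nodes, and the total load grows roughly quadratically once every node is issuing queries.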

23
Communication Networks

Computers are connected through a communication network
Wide Area Networks (WAN)
connect computers spread over a wide geographic area
point-to-point or store-and-forward -- data is transferred between computers through a series of switches
switch -- a special-purpose computer responsible for routing data (to avoid network congestion)
data can be lost due to switch crashes, communication link failures, limited buffers at switches, transmission errors, etc.

Packet Switching versus Circuit Switching
i) circuit switching -- a dedicated path between a source and a destination, e.g., a telephone connection; wastes bandwidth (bandwidth: amount of data transmitted in a given time period)
ii) packet switching -- a message or data is broken into packets; packets are routed independently; better network utilization; disassembly and reassembly overheads
The ISO OSI Reference Model
Local Area Networks (LAN)
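
The packet-switching steps described on this slide (break the message into packets, route them independently, reassemble) can be sketched as follows. The packet format, a sequence number paired with a payload, is an assumption made for illustration, and a random shuffle stands in for packets arriving out of order over different routes.

```python
import random


def disassemble(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence number, payload) packets."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Restore the original order using the sequence numbers."""
    return b"".join(payload for _, payload in sorted(packets))


message = b"packets are routed independently"
packets = disassemble(message, size=5)
random.shuffle(packets)              # independent routes may reorder packets
print(reassemble(packets) == message)
```

The sequence numbers and the sorting step are the disassembly and reassembly overhead that circuit switching avoids.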