Title: Distributed Processing, Chapter 1: Introduction
1Distributed Processing, Chapter 1: Introduction
2Problem
- There are n nodes, each of which has a value. A node wants to know the maximum value among the n nodes.
- Centralized Approach
- A server maintains the values of the n nodes, and each node reports its value to the server.
- Then the query node sends a message asking the server for the maximum value, and the server answers the query.
- Distributed Approach
- Each node communicates with its 6 nearest neighbor nodes to inform them of its value.
- Then the query node eventually finds the maximum value by exchanging information with its neighbor nodes.
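One possible way to realize the distributed approach is synchronous flooding: in every round each node tells its neighbors the largest value it has seen so far, and the exchange stops once nobody learns anything new. The sketch below is only an illustration under stated assumptions. The ring topology built by ring_neighbors, the function names, and the global "nothing changed" stopping test are hypothetical simplifications of a simulation, not something the slides specify.

```python
def distributed_max(values, neighbors):
    """Synchronous flooding: every node repeatedly sends its neighbors the
    largest value it has seen, until no node learns anything new."""
    best = list(values)          # best[i] = largest value node i knows so far
    messages = 0
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        incoming = list(best)    # state after this round's messages arrive
        for node, nbrs in enumerate(neighbors):
            for nbr in nbrs:
                messages += 1    # node sends best[node] to nbr
                if best[node] > incoming[nbr]:
                    incoming[nbr] = best[node]
                    changed = True
        best = incoming
    return best, rounds, messages


def ring_neighbors(n, k=6):
    """Assumed topology: nodes sit on a ring and each one is linked to its
    k nearest nodes (k/2 on each side). Requires n > k."""
    half = k // 2
    return [[(i + d) % n for d in range(-half, half + 1) if d != 0]
            for i in range(n)]


if __name__ == "__main__":
    values = [(i * 7) % 20 for i in range(20)]
    best, rounds, messages = distributed_max(values, ring_neighbors(20))
    print("maximum known at every node:", best[0])
    print("rounds:", rounds, "messages:", messages)
```

In a real system no node can observe the global "nothing changed" condition directly, so a practical version would need its own termination rule, for example a fixed number of rounds.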
3Discussion
- Question 1: Find an algorithm for the distributed approach.
- Question 2: Compare the performance
- In terms of the number of communications
- Question 3: Make a comparison table for the two approaches
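As a starting point for Question 2, the number of communications can be estimated with a simple counting model. The sketch below assumes that the centralized approach costs one report per node plus one query and one answer, and that the distributed approach is a round-based exchange in which every node sends one message to each neighbor per round. Both function names and the round-based model are assumptions made for illustration, not the answer intended by the slides.

```python
def centralized_messages(n):
    """Assumed cost: n reports to the server, plus one query and one answer."""
    return n + 2


def flooding_messages(n, degree, rounds):
    """Assumed cost: in each round every node sends one message to each of
    its `degree` neighbors."""
    return n * degree * rounds


if __name__ == "__main__":
    n, degree, rounds = 100, 6, 5   # hypothetical example values
    print("centralized:", centralized_messages(n))
    print("distributed:", flooding_messages(n, degree, rounds))
```

Under this model the centralized approach sends fewer messages in total, but all of them go through the single server, while the distributed messages are spread over the neighbor links.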
4Definition of a Distributed System
- Distributed system
- 1) A collection of (scalability)
- 2) independent computers that (heterogeneity)
- 3) appears to its users as a single coherent system (transparency)
- Distributed System versus Parallel System
- Separate Operating Systems vs. Single Operating System
- Message Passing vs. Shared Memory
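The difference between shared memory and message passing can be illustrated with a small sketch. Threads inside one Python process only emulate the two styles (a real distributed system would send messages over a network), and the function names are hypothetical. In the first version the workers update one shared variable under a lock; in the second they share nothing and send their partial results as messages.

```python
import queue
import threading


def shared_memory_sum(chunks):
    """Parallel-system style: workers add into one shared variable."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:                 # protect the shared variable
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing_sum(chunks):
    """Distributed-system style: workers send partial sums as messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))    # send one message with the partial result

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # the collector receives exactly one message per worker
    return sum(mailbox.get() for _ in chunks)


if __name__ == "__main__":
    chunks = [list(range(1, 51)), list(range(51, 101))]
    print(shared_memory_sum(chunks), message_passing_sum(chunks))
```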
5Why Distributed System?
- Performance
- Incremental Growth (Scalability)
- 1 single mainframe of price W
- N small machines of price W/N
- Fault Tolerance
- 1 single mainframe: a critical weak point (single point of failure)
- Failure of a machine: replaced by other machines
- Geographical Distribution and Availability
- Flexible configuration
- e.g. 1 Disk server, 3 Computing servers, 1 Graphic server, etc.
- Geographical availability
6Distributed System - Scalability and Heterogeneity
Figure 1.1: A distributed system organized as middleware.
Heterogeneity and Scalability
7Distributed System - Transparency
Different forms of transparency in a distributed
system.
Transparency | Description
Access | Hide differences in data representation and how a resource is accessed
Location | Hide where a resource is located
Migration | Hide that a resource may move to another location
Relocation | Hide that a resource may be moved to another location while in use
Replication | Hide that a resource is replicated
Concurrency | Hide that a resource may be shared by several competitive users
Failure | Hide the failure and recovery of a resource
Persistence | Hide whether a (software) resource is in memory or on disk
8Distributed System Heterogeneity
Application Program or Client
The client has to be provided with a different driver for each server
9Distributed System Heterogeneity and
Object-Oriented Approach
Application Program or Client
Predefined interface
Wrapping with predefined interface
Encapsulation Object-Oriented Approach
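The idea of wrapping heterogeneous servers with a predefined interface can be sketched as follows. All class and method names here (FileService, LegacyDiskServer, KeyValueServer and their calls) are hypothetical stand-ins invented for illustration. The point is only that the client is written once against the predefined interface instead of needing a different driver for each server.

```python
from abc import ABC, abstractmethod


class FileService(ABC):
    """Hypothetical predefined interface that every server is wrapped with."""

    @abstractmethod
    def read(self, name: str) -> str:
        ...


class LegacyDiskServer:
    """Stand-in for a server with its own native call returning bytes."""

    def __init__(self, files):
        self._files = files

    def fetch_blob(self, path):
        return self._files[path].encode()


class KeyValueServer:
    """Stand-in for a different server with a different native call."""

    def __init__(self, table):
        self._table = table

    def lookup(self, key):
        return self._table[key]


class DiskWrapper(FileService):
    """Encapsulates the disk server behind the predefined interface."""

    def __init__(self, server):
        self._server = server

    def read(self, name):
        return self._server.fetch_blob(name).decode()


class KeyValueWrapper(FileService):
    """Encapsulates the key-value server behind the same interface."""

    def __init__(self, server):
        self._server = server

    def read(self, name):
        return self._server.lookup(name)


def client_read_all(services, name):
    """The client only knows FileService, not the servers behind it."""
    return [service.read(name) for service in services]


if __name__ == "__main__":
    services = [
        DiskWrapper(LegacyDiskServer({"a.txt": "hello"})),
        KeyValueWrapper(KeyValueServer({"a.txt": "hello"})),
    ]
    print(client_read_all(services, "a.txt"))
```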
10Hardware Concepts Multiprocessor
Figure 1.6
11Multiprocessors (1)
- A bus-based multiprocessor.
Figure 1.7
12Multiprocessors (2)
- (a) A crossbar switch
- (b) An omega switching network
Figure 1.8
13Homogeneous Multicomputer Systems
14Software Concepts
System | Description | Main Goal
DOS | Tightly-coupled operating system for multi-processors and homogeneous multicomputers | Hide and manage hardware resources
NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients
Middleware | Additional layer atop of NOS implementing general-purpose services | Provide distribution transparency
- An overview of
- DOS (Distributed Operating Systems)
- NOS (Network Operating Systems)
- Middleware
15Issues in System Design
- Transparency
- Flexibility
- Reliability
- Performance
- Scalability
- Interoperability
16Transparency
- Hiding physical details about
- Location
- Migration
- Duplication
- Relocation
- Concurrency
- Parallelism
- Access
17Flexibility
- Should be easy to modify functionality and architecture
- To provide Configurability, Availability and Autonomy
- Micro-Kernel vs. Monolithic Kernel
- Monolithic Kernel: provides all functionalities of the OS. Example: UNIX
- Micro-Kernel
- Provides only the minimal subset of the OS that users want
- Example
- Kernel Watch
18Reliability
- Important Goal of Distributed System
- Reliability
- Security
- Fault-Tolerance
- Failure Probability P
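- Should be P = P1 × P2 × P3 × ... × Pn (the system fails only if all n machines fail)
- But often P = P1 + P2 + P3 + ... + Pn in reality (the system fails if any one machine fails)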
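The two cases for the failure probability P can be compared numerically. The sketch below assumes that machines fail independently, which the slides do not state, and the function names are hypothetical. The product applies when the system fails only if every machine fails. When a single failure brings the system down, the exact value is 1 - (1 - P1)(1 - P2)...(1 - Pn), and the sum P1 + P2 + ... + Pn is an upper bound that is close to it when the individual probabilities are small.

```python
import math


def all_must_fail(probs):
    """System fails only if every machine fails: P1 * P2 * ... * Pn."""
    return math.prod(probs)


def any_failure_exact(probs):
    """System fails if at least one machine fails (independent failures)."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def any_failure_bound(probs):
    """Sum P1 + ... + Pn, an upper bound on the exact value (capped at 1)."""
    return min(1.0, sum(probs))


if __name__ == "__main__":
    probs = [0.01] * 5              # hypothetical per-machine probabilities
    print("all must fail:  ", all_must_fail(probs))
    print("any one failing:", any_failure_exact(probs))
    print("sum bound:      ", any_failure_bound(probs))
```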
19Performance and Scalability
- Improve performance by parallelism
- Throughput T
- Ideally should be T = n × T1, where n is the number of sites and T1 is the throughput of a single site
- In reality T < n × T1
- Due to some Bottleneck
[Figure: throughput versus number of sites]
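The gap between ideal and real throughput can be illustrated with a deliberately simple model. The sketch assumes that every site contributes the same throughput and that all sites share one resource with a fixed capacity. That single shared resource is a hypothetical stand-in for "some bottleneck"; the slides do not say what the bottleneck is, and the function names are made up for illustration.

```python
def ideal_throughput(n, per_site):
    """Linear scaling: n sites deliver n times the single-site throughput."""
    return n * per_site


def bottlenecked_throughput(n, per_site, shared_capacity):
    """Assumed model: a shared resource caps the total throughput."""
    return min(n * per_site, shared_capacity)


if __name__ == "__main__":
    per_site, capacity = 100, 1000  # hypothetical units, e.g. requests/s
    for n in (1, 5, 10, 20):
        print(n, ideal_throughput(n, per_site),
              bottlenecked_throughput(n, per_site, capacity))
```

In this model throughput grows linearly until the shared capacity is reached and stays flat afterwards, so adding more sites no longer helps.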
20Granularity of Parallelism
- Unit of Task
- Fine-Granularity vs. Coarse Granularity
- Fine-Granularity
- Large number of small tasks
- Need a large amount of inter-task communication
- Not good for distributed system (good for parallel system)
- Coarse-Granularity
- Small number of big tasks
- Only small amount of inter-task communication
- Good for distributed system
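Why granularity matters more in a distributed system can be seen with a rough cost model. The sketch assumes the total work is split evenly into tasks, the tasks are shared evenly among the workers, and every task pays one fixed communication cost. The function name, the model, and the example numbers are all hypothetical; it ignores other effects such as load balancing.

```python
def estimated_time(total_work, num_tasks, workers, comm_cost):
    """Assumed model: each worker runs its share of tasks one after another,
    and every task costs its computation plus one communication delay."""
    tasks_per_worker = -(-num_tasks // workers)   # ceiling division
    work_per_task = total_work / num_tasks
    return tasks_per_worker * (work_per_task + comm_cost)


if __name__ == "__main__":
    work, workers = 1000, 10
    for label, comm in (("cheap communication", 0.001),
                        ("expensive communication", 1.0)):
        fine = estimated_time(work, 10000, workers, comm)
        coarse = estimated_time(work, 10, workers, comm)
        print(label, "fine:", round(fine, 3), "coarse:", round(coarse, 3))
```

With cheap communication, as on a tightly coupled parallel machine, the two granularities take about the same time in this model. With expensive communication, as over a network, the fine-grained split is dominated by communication.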
21Interoperability
- Easy to collaborate with other systems at run-time
- Compatibility, Portability
- How to achieve Interoperability
- Well-Defined API set
- Standardization