Title: Introduction to Distributed Systems
1 Introduction to Distributed Systems
- What is distributed computing?
- - Distribute: to divide among several or many, systematically or merely at random.
- - Distributed system: a collection of independent computers that appears to the users of the system as a single computer.
- - Distributed programming techniques allow software to take advantage of resources located on the Internet, on corporate and organization intranets, and on networks.
- - Distributed programming usually involves network programming in one form or another. That is, a program on one computer on a network needs some hardware or software resource that belongs to another computer, either on the same network or on some remote network.
- Examples of distributed systems
- - Network of workstations (NOW): a group of networked personal workstations connected to one or more server machines.
- - The Internet
- - An intranet: a network of computers and workstations within an organization, segregated from the Internet via a protective device (a firewall).
- - Actual example of a large-scale distributed system: eBay
- - Actual example of a small-scale distributed system: a smart home
- Computers in a distributed system
- - Workstations: computers used by end users to perform computing
- - Server machines: computers which provide resources and services
- - Personal Assistance Devices: handheld computers connected to the system via a wireless communication link.
- The network really is the computer.
- - Tim O'Reilly, in an address at the June 2000 JavaOne conference
- By now, it's a truism that the Internet runs on open source. Bind, the Berkeley Internet Name Daemon, is the single most mission-critical program on the Internet, followed closely by Sendmail and Apache, open source servers for two of the Internet's most widely used application protocols, SMTP and HTTP.
- Early killer apps
- - Usenet: distributed bulletin board
- - email
- - talk
- Recent killer apps
- - the web
- - collaborative computing
Centralized vs. Distributed Computing
- Monolithic mainframe applications vs. distributed applications
- The monolithic mainframe application architecture
- - Separate, single-function applications, such as order entry or billing
- - Applications cannot share data or other resources
- - Developers must create multiple instances of the same functionality (service).
- - Proprietary (user) interfaces
- The distributed application architecture
- - Integrated applications
- - Applications can share resources
- - A single instance of functionality (service) can be reused.
- - Common user interfaces
- Evolution of paradigms
- - Client-server: Socket API, remote method invocation
- - Distributed objects
- - Object broker: CORBA
- - Network service: Jini
- - Object space: JavaSpaces
- - Mobile agents
- - Message-oriented middleware (MOM): Java Message Service
- - Collaborative applications
- Cooperative distributed computing projects
- - These projects (also called distributed computing in some literature) parcel out large-scale computing to workstations, often making use of surplus CPU cycles.
- - Example: SETI@home, a project to scan data retrieved by a radio telescope to search for radio signals from another world.
- Why distributed computing?
- - Economics: distributed systems allow the pooling of resources, including CPU cycles, data storage, input/output devices, and services.
- - Reliability: a distributed system allows replication of resources and/or services, thus reducing service outages due to failures.
- - The Internet has become a universal platform for distributed computing.
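The idea of parceling out a large computation into work units can be sketched as follows. This is only an illustration under stated assumptions: local threads stand in for the participating workstations, and the names `split_into_chunks`, `analyze`, and `run_project` are hypothetical, not part of any real project's software.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Divide a list into at most n_chunks contiguous work units."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def analyze(chunk):
    """Stand-in for an expensive computation on one work unit."""
    return sum(x * x for x in chunk)


def run_project(items, n_workers=4):
    """Hand each work unit to a worker and combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    # Threads are an assumption here; a real project ships units over a network.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(analyze, chunks))
    return sum(partials)
```

The coordinator only needs the partial results, so workers never have to communicate with each other; that independence is what makes this kind of workload easy to distribute.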
- The weaknesses and strengths of distributed computing
- In any form of computing, there is always a tradeoff between advantages and disadvantages.
- Some of the reasons for the popularity of distributed computing
- - The affordability of computers and availability of network access
- - Resource sharing
- - Scalability
- - Fault tolerance
- The disadvantages of distributed computing
- - Multiple points of failure: the failure of one or more participating computers, or one or more network links, can spell trouble.
- - Security concerns: in a distributed system, there are more opportunities for unauthorized attack.
- The Architecture of Distributed Applications
- Network standards and protocols
- - On public networks such as the Internet, a common set of rules must be specified for the exchange of data.
- - Such rules, called protocols, specify such matters as the formatting and semantics of data, flow control, and error correction.
- - Software can share data over the network using network software which supports a common set of protocols.
- Protocols
- - In the context of communications, a protocol is a set of rules that must be observed by the participants.
- - In communications involving computers, protocols must be formally defined and precisely implemented. For each protocol, there must be rules that specify the following:
- - How is the exchanged data encoded?
- - How are events (sending, receiving) synchronized so that the participants can send and receive in a coordinated order?
- - The specification of a protocol does not dictate how the rules are to be implemented.
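To make the "how is the data encoded" question concrete, here is a minimal sketch of one possible encoding rule: each message is sent as a 4-byte length followed by UTF-8 text. The rule itself and the names `encode_message` and `decode_messages` are assumptions chosen for illustration; they do not correspond to any standard protocol.

```python
import struct

# Assumed rule: a 4-byte unsigned length in network byte order, then the payload.
HEADER = struct.Struct("!I")


def encode_message(text):
    """Frame one message as <length><payload>."""
    payload = text.encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer):
    """Return (complete messages, leftover bytes still awaiting more data)."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the rest of this message has not arrived yet
        messages.append(buffer[offset + HEADER.size:end].decode("utf-8"))
        offset = end
    return messages, buffer[offset:]
```

Both participants must agree on this rule, but each is free to implement it however it likes, which is the point made above about specification versus implementation.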
- The network architecture
- - Network hardware transfers electronic signals, which represent a bit stream, between two devices.
- - Modern network applications require an application programming interface (API) which masks the underlying complexities of data transmission.
- - A layered network architecture allows the functionalities needed to mask the complexities to be provided incrementally, layer by layer.
- - Actual implementation of the functionalities may not be clearly divided by layer.
- The OSI seven-layer network architecture
- Network architecture
- - The division of the layers is conceptual: the implementation of the functionalities need not be clearly divided as such in the hardware and software that implement the architecture.
- - The conceptual division serves at least two useful purposes:
- - Systematic specification of protocols: it allows protocols to be specified systematically.
- - Conceptual data flow: it allows programs to be written in terms of logical data flow.
- The TCP/IP protocol suite
- - The Transmission Control Protocol/Internet Protocol suite is a set of network protocols which supports a four-layer network architecture.
- - It is currently the protocol suite employed on the Internet.
- The TCP/IP protocol suite - 2
- - The Internet layer implements the Internet Protocol, which provides the functionalities for allowing data to be transmitted between any two hosts on the Internet.
- - The Transport layer delivers the transmitted data to a specific process running on an Internet host.
- - The Application layer supports the programming interface used for building a program.
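The division of labor among these layers can be seen from the application's side in a short sketch. It assumes a loopback connection on 127.0.0.1 and a toy "uppercase" service; the function names `serve_once` and `round_trip` are hypothetical. The IP address selects the host, the port selects the process, and the socket API is the programming interface.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection and reply with the uppercased request."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def round_trip(message):
    """Start a one-shot server on loopback, send it a message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        server.listen(1)
        host, port = server.getsockname()
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply
```

The sketch relies on the message being small enough to arrive in a single `recv` call; a robust program would loop until the whole message has been read.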
- Network resources
- - Network resources are resources available to the participants of a distributed computing community.
- - Network resources include hardware such as computers and equipment, and software such as processes, email mailboxes, files, and web documents.
- - An important class of network resources is network services such as the World Wide Web and file transfer (FTP), which are provided by specific processes running on computers.
- - One of the key challenges in distributed computing is the unique identification of resources available on the network, such as email mailboxes and web documents.
- - Addressing an Internet host
- - Addressing a process running on a host
- - Email addresses
- - Addressing web contents: URL
- The Internet Topology
- - The Internet consists of a hierarchy of networks, interconnected via a network backbone.
- - Each network has a unique network address.
- - Computers, or hosts, are connected to a network. Each host has a unique ID within its network.
- - Each process running on a host is associated with zero or more ports. A port is a logical entity for data transmission.
- The Internet addressing scheme
- - In IP version 4, each address is 32 bits long.
- - The address space accommodates 2^32 (about 4.3 billion) addresses in total.
- - Addresses are divided into 5 classes (A through E).
- The Internet addressing scheme - 2
- Example
- - Suppose the dotted-decimal notation for a particular Internet address is 129.65.24.50. The 32-bit binary expansion of the notation is as follows:
- - 10000001 01000001 00011000 00110010
- - Since the leading bit sequence is 10, the address is a Class B address. Within the class, the network portion is identified by the remaining bits in the first two bytes, that is, 00000101000001, and the host portion is the value in the last two bytes, or 0001100000110010. For convenience, the binary prefix for class identification is often included as part of the network portion of the address, so that we would say that this particular address is at network 129.65 and then at host address 24.50 on that network.
- Another example
- - Given the address 224.0.0.1, one can expand it as follows:
- - 11100000 00000000 00000000 00000001
- - The binary prefix of 1110 signifies that this is a class D, or multicast, address. Data packets sent to this address should therefore be delivered to the multicast group 0000000000000000000000000001.
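The two worked examples can be checked mechanically. The sketch below expands a dotted-decimal address into 32 bits, reads the class from the leading bits, and splits a Class B address into its network and host portions. The function names are hypothetical helpers written for this illustration.

```python
def to_binary(address):
    """Expand a dotted-decimal IPv4 address into a 32-character bit string."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {address!r}")
    return "".join(f"{o:08b}" for o in octets)


def address_class(address):
    """Return the class letter (A through E) implied by the leading bits."""
    bits = to_binary(address)
    for prefix, name in (("0", "A"), ("10", "B"), ("110", "C"), ("1110", "D")):
        if bits.startswith(prefix):
            return name
    return "E"  # the only remaining prefix is 1111


def split_class_b(address):
    """Return (network bits, host bits) of a Class B address, prefix excluded."""
    bits = to_binary(address)
    if not bits.startswith("10"):
        raise ValueError("not a Class B address")
    return bits[2:16], bits[16:]
```

For 129.65.24.50 this yields class B with network bits 00000101000001 and host bits 0001100000110010; for 224.0.0.1 it yields class D, and the 28 bits after the 1110 prefix identify the multicast group.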
- The Internet addressing scheme - 3
- - For human readability, Internet addresses are written in dotted-decimal notation, nnn.nnn.nnn.nnn, where each nnn group is a decimal value in the range 0 through 255.
- - Internet host table (found in the /etc/hosts file):
- - 127.0.0.1 localhost
- - 129.65.242.5 falcon.csc.calpoly.edu falcon loghost
- - 129.65.241.9 falcon-srv.csc.calpoly.edu falcon-srv
- - 129.65.242.4 hornet.csc.calpoly.edu hornet
- - 129.65.241.8 hornet-srv.csc.calpoly.edu hornet-srv
- - 129.65.54.9 onion.csc.calpoly.edu onion
- - 129.65.241.3 hercules.csc.calpoly.edu hercules
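A host table of this kind is simple enough to parse by hand. The following sketch turns hosts-file text into a name-to-address mapping; `parse_hosts` and `SAMPLE` are hypothetical names, and the sample reuses three of the entries listed above.

```python
def parse_hosts(text):
    """Map every host name and alias in hosts-file text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)  # the first entry for a name wins
    return table


SAMPLE = """\
127.0.0.1     localhost
129.65.242.5  falcon.csc.calpoly.edu falcon loghost
129.65.54.9   onion.csc.calpoly.edu onion
"""

hosts = parse_hosts(SAMPLE)
```

A static table like this has to be copied to every machine, which is one reason a distributed lookup service is preferred on large networks.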
- IP version 6 addressing scheme
- There are three types of addresses:
- - Unicast: an identifier for a single interface.
- - Anycast: an identifier for a set of interfaces (typically belonging to different nodes). A packet sent to an anycast address is delivered to one of the interfaces identified by that address.
- - Multicast: an identifier for a set of interfaces (typically belonging to different nodes). A packet sent to a multicast address is delivered to all interfaces identified by that address.
- The Domain Name System (DNS)
- - For user friendliness, each Internet address is mapped to a symbolic name, using the DNS, in the format
- - <computer-name>.<subdomain hierarchy>.<organization>.<sector name>.<country code>
- - e.g., www.csc.calpoly.edu.us
- The Domain Name System
- - For network applications, a domain name must be mapped to its corresponding Internet address.
- - Processes known as domain name system servers provide the mapping service, based on a distributed database of the mapping scheme.
- - The mapping service is offered by thousands of DNS servers on the Internet, each responsible for a portion of the name space, called a zone. The servers that have access to the DNS information (zone file) for a zone are said to have authority for that zone.
- Top-level domain names
- - .com: for commercial entities; anyone in the world can register.
- - .net: originally designated for organizations directly involved in Internet operations. It is increasingly being used by businesses when the desired name under "com" is already registered by another organization. Today anyone can register a name in the .net domain.
- - .org: for miscellaneous organizations, including non-profits.
- - .edu: for four-year accredited institutions of higher learning.
- - .gov: for US federal government entities.
- - .mil: for the US military.
- - Country codes: for individual countries, based on codes defined by the International Organization for Standardization (ISO). For example, ca for Canada and jp for Japan.
- Name lookup and resolution
- - If a domain name is used to address a host, its corresponding IP address must be obtained for the lower-layer network software.
- - The mapping, or name resolution, must be maintained in some registry.
- - For runtime name resolution, a network service is needed, and a protocol must be defined for the naming scheme and for the service.
- Examples
- - The DNS service supports domain name lookup.
- - The Java RMI registry supports RMI object lookup.
- - JNDI (the Java Naming and Directory Interface) is an API for looking up network services by name.
- Addressing a process running on a host: logical ports
- Well-known ports
- - Each Internet host has 2^16 (65,536) logical port numbers. Port 0 is reserved, so each usable port is identified by a number between 1 and 65535 and can be allocated to a particular process.
- - Port numbers between 1 and 1023 are reserved for processes which provide well-known services such as finger, FTP, HTTP, and email.
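The port ranges above can be expressed in a few lines. This sketch checks whether a port number falls in the well-known range and shows one common way to obtain a free port outside it: binding to port 0, which in the sockets API asks the operating system to choose. The helper names are hypothetical.

```python
import socket

WELL_KNOWN_MAX = 1023  # upper bound of the reserved range described above
MAX_PORT = 65535       # largest value a 16-bit port number can hold


def is_well_known(port):
    """True if the port lies in the range reserved for well-known services."""
    if not 0 <= port <= MAX_PORT:
        raise ValueError(f"port out of range: {port}")
    return port <= WELL_KNOWN_MAX


def pick_free_port():
    """Ask the OS for a currently unused port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]
```

The port returned by `pick_free_port` is released as soon as the socket closes, so another process could claim it before it is reused; for a long-lived service it is safer to keep the bound socket open.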
- Choosing a port to run your program
- - For our programming exercises, when a port is needed, choose a random number above the well-known ports: 1,024 to 65,535.
- - If you are providing a network service for the community, then arrange to have a port assigned to and reserved for your service.
- The Uniform Resource Identifier (URI)
- - Resources to be shared on a network need to be uniquely identifiable.
- - On the Internet, a URI is a character string which allows a resource to be located.
- - There are two types of URIs:
- - URL (Uniform Resource Locator): points to a specific resource at a specific location.
- - URN (Uniform Resource Name): points to a specific resource at a nonspecific location.
- A URL has the format
- - protocol://host address[:port]/directory path/file name[#section]
- More on URLs
- - The path in a URL is relative to the document root of the server. On the CSL systems, a user's document root is /www.
- - A URL may appear in a document in a relative form:
- - <a href="another.html">
- - and the actual URL referred to will be another.html preceded by the protocol, hostname, and directory path of the document.
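Both the URL format and the resolution of a relative reference can be tried with Python's standard `urllib.parse` module. The base URL below is a made-up example on `www.example.com`, not an address from the course systems.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical URL of the document that contains the relative link.
BASE = "http://www.example.com:8080/docs/intro/index.html#overview"

# Split the URL into protocol, host, port, path, and section (fragment).
parts = urlsplit(BASE)

# Resolve the relative reference another.html against the document's URL.
resolved = urljoin(BASE, "another.html")
```

Here `parts.scheme` is `http`, `parts.hostname` is `www.example.com`, `parts.port` is `8080`, `parts.path` is `/docs/intro/index.html`, and `parts.fragment` is `overview`. The resolved URL is `http://www.example.com:8080/docs/intro/another.html`: the file name is replaced while the protocol, host, port, and directory path carry over from the containing document.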
- Design issues in distributed systems
- Transparency: the most important issue in truly distributed systems is making a group of machines appear as if it were a single old-style timesharing system.
- Different types of transparency
- - Location transparency: users cannot tell where the resources are located (hardware, software resources, CPU, printers, files, databases, etc.).
- - Migration transparency: resources must be free to move from one machine to another without changing their names, e.g., moving the mount points of remote file systems (/usr/dist on the Sun cluster).
- - Replication transparency: users cannot tell how many copies exist. The system may make multiple copies for reliability (a disk failure) or improved performance (heavily used files). As long as the users don't observe anomalous behavior (coherency), it should not matter.
- - Concurrency transparency: multiple users can share resources automatically. Multiple readers are fine; for multiple writers, the system provides automatic mechanisms to serialize access and maintain correctness.
- - Parallelism transparency: activities may happen in parallel without the users knowing about it. This is hard to achieve. Advanced users may want to exploit the presence of multiple processors explicitly, because the state of the art is not close to achieving this automatically. The end is not in sight!
- Sometimes users don't want total transparency, for example when they want to:
- - Use a special printer
- - Use a special hardware accelerator attached to a particular machine.
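Returning to the concurrency item above: serializing multiple writers can be illustrated on a single machine with a lock. This is only a sketch under that assumption; a distributed system needs a network-wide mechanism, and the class `SharedCounter` and function `run_writers` are hypothetical names.

```python
import threading


class SharedCounter:
    """A shared resource whose updates are serialized by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one writer at a time may perform the read-modify-write step.
        with self._lock:
            self._value += 1

    def read(self):
        with self._lock:
            return self._value


def run_writers(n_writers=4, n_updates=1000):
    """Let several writers update the same counter, then return its value."""
    counter = SharedCounter()

    def writer():
        for _ in range(n_updates):
            counter.increment()

    threads = [threading.Thread(target=writer) for _ in range(n_writers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.read()
```

The callers never touch the lock themselves; the serialization is hidden inside the resource, which is the sense in which the mechanism is transparent.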
- Reliability
- - One machine goes down -> another one performs the computation.
- - The user never sees the difference, except perhaps in the performance level.
- - E.g., 4 file servers that hold duplicate data.
- - Probability of any one failing: 0.05.
- - Probability of all of them failing simultaneously: 0.05^4 ≈ 0.000006, practically negligible. (Logical OR of the individual servers)
- - In practice, distributed systems depend on several pieces all working simultaneously for the system to work. (Logical AND of the components)
- - A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed. (Lamport)
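The logical OR and logical AND cases can be compared numerically. The sketch assumes that failures are independent and uses four replicas with a per-machine failure probability of 0.05, which are the figures behind the 0.000006 value; the function names are hypothetical.

```python
import math


def prob_all_fail(p_fail, n_replicas):
    """Replicated service (logical OR): it is down only if every replica fails."""
    return p_fail ** n_replicas


def prob_all_up(p_fail, n_components):
    """Chained service (logical AND): it works only if every component is up."""
    return (1 - p_fail) ** n_components


replicated_outage = prob_all_fail(0.05, 4)   # 0.05^4 = 0.00000625
chained_uptime = prob_all_up(0.05, 4)        # 0.95^4 = 0.81450625
```

With the same four machines, replication makes an outage very unlikely, while a design that needs all four at once is available only about 81% of the time. The independence assumption matters: machines that share a power supply or a network switch tend to fail together.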
- Reliability has several facets
- Availability
- - Fraction of the time the system is available for use.
- - Keep the number of components that must work simultaneously as small as possible (reduce the logical AND).
- - Allow redundancy (increase the logical OR): replicate key pieces of hardware and software.
- - However, one has to worry about consistency as the degree of redundancy increases. This is a tradeoff.
- Security
- - Also a key issue in reliability
- - Authentication is easier in centralized systems: use a password, and access is OK after that.
- - Distributed systems: messages travel between machines. How do you authenticate? Anybody can put any kind of message on the network.
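One common answer to the authentication question is a message authentication code. The sketch below uses HMAC with SHA-256 from Python's standard library and assumes the two machines already share a secret key; how that key is distributed is outside its scope, and the names `sign` and `verify` are hypothetical. It is one technique among several, not a method prescribed by these notes.

```python
import hashlib
import hmac


def sign(key, message):
    """Compute a tag that only holders of the shared key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Check a received tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"example-shared-secret"  # assumption: agreed on out of band
tag = sign(shared_key, b"transfer 10")
```

A receiver accepts the message only if `verify` returns `True`. An attacker who alters the message or does not know the key cannot produce a matching tag, although this scheme alone does not stop a captured message from being replayed.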
- Fault tolerance
- - How easily and transparently does the system recover from a failure of some kind? E.g., a machine goes down: what happens to the process that was running? Can it be restarted in some other place, exactly at the point where the original process left off?
- - Important in business/banking systems.
- Performance
- - An important aspect of distributed systems
- - Many different metrics can be used:
- - response time/turnaround time
- - system utilization
- - network capacity utilization
- - Performance measurements depend a great deal on the type of situation, e.g., a large number of compute-bound jobs with little or no I/O vs. large database applications.
- Granularity of computation
- - Fine-grained: e.g., simple operations that can be done with a few instructions. Lots of interaction, coordination, I/O, etc. Distributing them would incur too much overhead.
- - Coarse-grained: long computation times; little I/O, coordination, or interaction. Better suited for distribution.
- Scalability
- - Designed for 100s of CPUs. How will it work for 100,000 CPUs?
- - E.g., the French PTT system
- - What principles to use?
- - Avoid centralized components, e.g., single mail servers, single file servers.
- - Avoid centralized tables, databases, etc., e.g., a telephone directory.
- - Avoid centralized algorithms, e.g., an algorithm that first collects information about the whole system before computing an optimal route to send a message.
- Characteristics of decentralized algorithms
- - Lack of complete information about the whole system
- - Make decisions based on local information
- - If one machine is down, the algorithm should still work.
- - No assumption about a global clock
- General discussion
- - Distributed: to divide the computation among several what?
- - Processors/nodes
- - Processor: CPU (including cache, etc.)
- - Node: single processor or multiprocessor, memory, I/O, possibly a network interface
- - Communication: the work has to be divided and distributed, so communication is central.
- - Bus (processor) or network (node)
- - The parameters: CPU, memory, I/O, network, system software
- Many types of systems
- - The differences are difficult to state clearly.
- - Some believe that it is a continuum.
- - Centralized: single system
- - Decentralized: multiple systems, but no coordination
- - Distributed: multiple systems with coordination
- - Homogeneous: all systems are the same or similar
- - Heterogeneous: dissimilar nodes in the system
- - Server: a system providing some services, e.g., file systems; usually more powerful and complex hardware
- - Client: a system with minimal resources; depends on servers to get tasks accomplished.
- Networked system
- - High degree of autonomy of machines
- - Loosely coupled hardware and loosely coupled software
- - Machines run their own OS
- - May have their own local disk
- - Operations carry explicit names of the machines, e.g., rlogin Eagle
- - May have a file server, but the view from different machines is different.
- True distributed system
- - Tightly coupled software on loosely coupled hardware
- - Creates the illusion of a single system image, a virtual uniprocessor
- - Uniform view of file system, uniform protection mechanisms
- - Uniform communication schemes
- Multiprocessor systems
- - Tightly coupled software on tightly coupled hardware
- - Typically a single run queue
- - (Logically) shared memory
- - File system is like that of a centralized system
- Cluster systems
- - Parallel or distributed system
- - Consists of a collection of interconnected whole computers
- - Utilized as a single computing resource
- - Peer relationship between the nodes in a cluster
- - Nodes of a cluster do not maintain their internal anonymity
- Comparison of system types

Property                  Networked       Distributed     Multiprocessor    Cluster
Number of nodes           1000s           1000s           1000s             10s
Performance metric        Response time   Response time   Turnaround time   Turnaround time
Virtual processor view    No              Yes             Yes               Yes
Node individualization    Yes             Yes             No                No
Operating systems         Heterogeneous   Homogeneous     Homogeneous       Homogeneous
Copies of OS              N               N               1                 N
Communication             Shared files    Messages        Shared memory     Messages
Network protocol          Required        Required        Not required      Not required
Run queue                 No              No              Yes               No
Inter-node security       None            Required        None              Required
- Summary
- We discussed the following topics:
- - What is meant by distributed computing
- - Rationale for distributed systems
- - Centralized versus distributed systems
- - Basic concepts in data communication in distributed systems
- - Network architectures: the OSI model and the Internet model
- - Naming schemes for network resources
- - The three-layered architecture of distributed applications: presentation layer, application or business logic, the service layer
- - Design issues in distributed systems