Title: Fundamentals Stream Session 1: Introduction to Distributed Systems
1. Fundamentals Stream Session 1: Introduction to Distributed Systems
Distributed Systems
- CSC 253
- Gordon Blair, François Taïani
2. Overview of the Session
- What is a distributed system?
- Why distributed systems?
- Difficulties in developing distributed systems
- The importance of software platforms
- Focus on middleware
- Goals of middleware
Associated Reading: Tanenbaum and van Steen, pp. 1-42
3. What is a Distributed System?
"A distributed system is a collection of independent computers that appears to its users as a single coherent system" [Tanenbaum and van Steen]
"A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages" [Coulouris, Dollimore and Kindberg]
"A distributed system is one that stops you getting work done when a machine you've never even heard of crashes" [Lamport]
4. What is a Distributed System (continued)?
"A distributed system is a system designed to support the development of applications and services which can exploit a physical architecture consisting of multiple autonomous processing elements that do not share primary memory but cooperate by sending asynchronous messages over a communications network" [Blair and Stefani, 1998]
5. Examples of Distributed Systems
- The Web
  - If you buy a flight online, your computer interacts with a distant machine
  - Two computers interact to deliver you a service
  - Is it two, or is it more?
- Civil Aviation
  - Modern planes have replicated flight control computers (Airbus A330/A340 planes have 5, for instance)
  - Additional processing units are used for specific needs (data logging, cockpit displays, etc.)
- Peer-to-Peer (P2P) file sharing networks
  - Each user contributes to the network
  - Auto-organisation, no centralised entity
6. Why Distributed Systems?
- Because the world is distributed
  - You want to book a hotel in Sydney, but you are in Lancaster
  - You want to be able to withdraw money from any ATM in the world, but your bank is in Preston
  - A railway network is a distributed transport system. It needs a distributed computing system.
  - An airplane has one cockpit, but two wings
- Because problems rarely hit two different places at exactly the same time
  - As a company, having only one database server is a bad idea
  - Having two in the same room is better, but still risky
- Because joining forces increases performance, availability, etc.
  - High Performance Computing, replicated web servers, etc.
7. A Short History of DSs
- Distributed computer systems are largely concerned with
  - Data processing/management/presentation (computing side)
  - Communication/coordination (distributed side)
- Those concerns existed well before computers were invented
  - Ancient empires needed efficient communication systems
    - cf. the postal service of the Persian Empire (6th century BC), the Roman roads (many still visible today), etc.
    - Max message speed: 200 miles/day in the Persian system
    - Assuming a good infrastructure (roads, horses, staging posts)
  - Delays impose distributed organisations
    - Persian and Roman empires extended over 1000s of miles
  - Trust / secrecy / reliability issues
    - Am I sure Governor X is doing what he says he is?
- None of this has really changed! Things have only sped up!
8. A Short History of DSs
- Computers are far more recent than empires
  - The first modern computers appeared just after WWII
  - They were slow, bulky, and incredibly expensive
    - The ENIAC (1945), used by the US Army: 30 tons, 170 m² footprint, 180 kilowatts, 18,000 vacuum tubes, and 5,000 additions/second (5 kHz)
    - Price: $500,000 (in US dollars of the time; roughly $5,000,000 today)
- And distributed computing is even more recent
  - For a long time, there were only very few computers around anyway
  - No practical technology to connect them
- This all changed in the 80s
  - The rise of the micro-computers (PC, Mac, etc.)
  - The launch of the Internet (1982, TCP/IP), after 10 years of development
- The pace of change is accelerating as we enter the 21st century
  - Heterogeneous connectivity and devices
  - Fusion of services (telephony, broadcasting, Internet)
9. But Why is This so Hard?
- A bank asks you to program their new ATM software
  - Central bank computer (server) stores account information
  - Remote ATMs authenticate customers and deliver money
- A first version of the program
  - ATM (ignoring authentication and security issues)
    - Ask customer how much money s/he wants
    - Send message with <customer ID, withdraw, amount> to bank server
    - Wait for bank server answer: <OK> or <refused>
    - If <OK>, give money to customer, else display error message
  - Central Server
    - Wait for messages from ATM: <customer ID, withdraw, amount>
    - If enough money, withdraw money and send <OK>, else send <refused>
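The first version of the program could be sketched roughly as follows. This is only an illustration under simplifying assumptions: the class names `BankServer` and `ATM`, the method names, and the in-memory account table are hypothetical, and a direct method call stands in for the network, so nothing can go wrong yet.

```python
from dataclasses import dataclass, field


@dataclass
class BankServer:
    """Hypothetical central server keeping account balances in memory."""
    accounts: dict = field(default_factory=dict)

    def handle(self, message):
        # message models the tuple <customer ID, withdraw, amount>
        customer_id, operation, amount = message
        if operation != "withdraw":
            return "refused"
        if self.accounts.get(customer_id, 0) >= amount:
            self.accounts[customer_id] -= amount  # enough money: withdraw it
            return "OK"
        return "refused"


class ATM:
    """Hypothetical ATM; authentication and security are ignored."""

    def __init__(self, server):
        self.server = server

    def withdraw(self, customer_id, amount):
        # Send the request and wait for the answer. The direct call is an
        # assumption standing in for a real network round trip.
        answer = self.server.handle((customer_id, "withdraw", amount))
        if answer == "OK":
            return f"dispensed {amount}"
        return "error: withdrawal refused"


server = BankServer({"John": 500})
atm = ATM(server)
print(atm.withdraw("John", 200))  # dispensed 200
print(atm.withdraw("John", 400))  # error: withdrawal refused
```

Because the ATM and the server live in one process here, every request gets exactly one answer. The difficulties only appear once the two sides are separated by a real network.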
10. Why is This so Hard (continued)?
[Figure: ATM and Bank Server, with the account record "John: 500"]
- But ...
  - What if the bank server crashes just after 2 and before 3?
  - What if the <OK> message gets lost? Takes days to arrive?
  - What if the ATM crashes after 1, but before 4?
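One of these failure cases, the lost <OK> message, can be reproduced in a few lines. The sketch below is a toy model, not a real network: `LossyLink` and `run_withdrawal` are hypothetical names, and a reply of `None` is assumed to model the ATM timing out.

```python
class LossyLink:
    """Hypothetical link that may drop the server's reply."""

    def __init__(self, drop_reply):
        self.drop_reply = drop_reply

    def deliver_reply(self, reply):
        # None models a reply that never reaches the ATM.
        return None if self.drop_reply else reply


def run_withdrawal(balance, amount, link):
    """Return (server_balance, cash_dispensed) after one withdrawal attempt."""
    # Server side: debit the account and produce an answer.
    if balance >= amount:
        balance -= amount
        reply = "OK"
    else:
        reply = "refused"
    # The answer travels back over the unreliable link.
    received = link.deliver_reply(reply)
    # ATM side: dispense only if <OK> actually arrived.
    dispensed = amount if received == "OK" else 0
    return balance, dispensed


print(run_withdrawal(500, 200, LossyLink(drop_reply=False)))  # (300, 200)
print(run_withdrawal(500, 200, LossyLink(drop_reply=True)))   # (300, 0)
```

In the second run the account has been debited but no money was delivered: the two machines now disagree about what happened, and neither can tell from its local state alone.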
11. Byzantine Generals: A Further Cautionary Tale
- Assumptions
- Asynchronous and unreliable communication
- No traitors
12. Byzantine Generals (continued)
- To focus the discussion, let us consider reaching agreement between two generals
  - General Ferguson
  - General Lennon
- Let us try to design an appropriate protocol
  - Ferguson sends a message to Lennon and awaits an ack; on receipt, Lennon sends an ack back to Ferguson
  - If Ferguson does not get an ack, he times out and tries again; after n timeouts he assumes failure
  - He decides on attack if an ack is received, otherwise he decides to abort
  - But Lennon needs to know whether Ferguson received the ack in order to safely decide on attack too
  - => need to acknowledge the ack?
  - Etc.
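The message/ack protocol above can be simulated to see where it breaks. The sketch makes one assumption the slide does not state: a naive Lennon commits to attack as soon as the message reaches him. The function name and the `lose` callable (returning True when a message is lost) are hypothetical.

```python
def message_ack_protocol(max_tries, lose):
    """Run the message/ack protocol once.

    Returns (ferguson_decision, lennon_decision).
    """
    ferguson = "abort"  # Ferguson attacks only if an ack reaches him
    lennon = "abort"    # assumption: Lennon attacks once the message arrives
    for _ in range(max_tries):
        if lose():
            continue          # message lost: Ferguson times out and retries
        lennon = "attack"     # message arrived; Lennon sends an ack
        if lose():
            continue          # ack lost: Ferguson times out and retries
        ferguson = "attack"   # ack arrived
        break
    return ferguson, lennon


# No losses: both generals attack.
print(message_ack_protocol(3, lambda: False))  # ('attack', 'attack')

# The message arrives but the only ack is lost: the generals disagree.
outcomes = iter([False, True])
print(message_ack_protocol(1, lambda: next(outcomes)))  # ('abort', 'attack')
```

Making Lennon wait for an acknowledgement of his ack only moves the problem one message further along, which is the regress the last two bullets point at.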
13. Byzantine Generals (continued)
- Lemma
  - Given our assumptions, it is impossible for Generals Ferguson and Lennon to reach agreement on attack or abort
- Outline proof
  - Suppose there is a minimum sequence of messages for Ferguson and Lennon to reach agreement
  - Focus on the final message M, from say Ferguson to Lennon: Ferguson cannot know whether this message arrives
  - Suppose Ferguson has just decided on attack (a parallel argument exists for abort): his decision is unaffected by whether his final message to Lennon arrives, and hence he has no real knowledge of the state of Lennon => another ack needed
  - Hence the original message sequence is not minimal (QED)
14. The Overall Lesson
- To develop good distributed systems
  - Developing good software is difficult
  - Developing good distributed software is even harder
- To do this you need help!
  - Very hard to build such systems on bare bones hardware
  - Strong need for software platforms
    - Network operating systems
    - Distributed operating systems
    - Middleware!
15. Network OSs
- Network OS
  - Distribution not hidden by OS
  - But a number of services offered to support distribution
    - remote login (ssh), remote copying (scp)
    - distributed file systems (Samba, NFS)
- Well adapted to heterogeneous DS
  - I can mount a Samba share (MS technology) on my Mac (Apple)
  - I can use ssh to connect from my PC to a Unix server
[Figure: Distributed Applications running over separate Network OS Services on each machine (Machine A ... Machine C), connected by a Network]
16. Distributed OSs
- Distributed Operating Systems
  - Completely hide the distribution: single system image
  - The user has the illusion of using one single multiprocessor machine (symmetric multi-processing, SMP)
  - Many similarities between DOS and OS with SMP support
  - Essentially used for homogeneous clusters of machines interconnected with high performance networks
[Figure: Distributed Applications running over a single layer of Distributed OS Services spanning all machines (Machine A ... Machine C), connected by a Network]
17. Middleware
- Pros and Cons of DOS and NOS
  - DOS are user-friendly but don't scale well. The reverse holds for NOS
- How to get the best of both worlds?
  - Solution: Middleware!
18. Goals of a Middleware Platform
- Resource sharing
  - The ability to access and share resources in a distributed environment
  - The bread and butter of distributed systems
- Transparency
  - The ability to view a distributed system as if it were a single computer
  - Varying dimensions of transparency incl. location, access, migration, etc.
  - The degree of transparency is a key decision in any systems architecture
- Openness
  - The offering of services according to standard rules (syntax and semantics)
  - Openness provides support for the key properties of portability and interoperability
  - Again, the degree of openness is a key factor in systems design
- Extensibility
  - The ability to introduce new or modified functionality
19. Goals of a Middleware Platform (continued)
- Scalability
  - Scalable with respect to size, e.g. supporting massive growth in the number of users of a service
  - Scalable with respect to geography, e.g. supporting distributed systems that span continents (dealing with latencies, etc.)
  - Scalable with respect to administration, e.g. supporting systems which span many different administrative organisations
- Dependability
  - Security
    - Providing secure and authenticated channels, access control, key management, etc.
  - Fault-tolerance
    - Providing highly available and resilient distributed applications and services
20. Styles of Middleware Platform
- Client-server platforms
  - e.g. DCE (Distributed Computing Environment)
  - One of the first, historical importance
- Distributed object technology
  - e.g. CORBA
  - e.g. Java RMI (as featured in this course)
- Component technologies
  - e.g. Enterprise Java Beans
- Other styles
  - Resource discovery platforms (e.g. Jini), agent platforms (e.g. Aglets), group communication services (e.g. Java Groups), distributed file systems, publish-subscribe systems, distributed transaction services, distributed document-based systems, P2P technology, etc.
21. Overall Syllabus (Revisited)
22. Expected Learning Outcomes
- At the end of this session
  - You should be able to define distributed computer systems
  - You should be able to explain why distribution is needed
  - You should know everyday examples of distributed systems
  - You should understand how they evolved into their current form
  - You should have an intuition of why realising good distributed systems is difficult
  - You should understand the role of middleware in supporting the development of distributed applications and services