Fundamentals Stream Session 1: Introduction to Distributed Systems

1
Fundamentals Stream Session 1: Introduction to
Distributed Systems
  • CSC 253
  • Gordon Blair, François Taïani

2
Overview of the Session
  • What is a distributed system?
  • Why distributed systems?
  • Difficulties in developing distributed systems
  • The importance of software platforms
  • Focus on middleware
  • Goals of middleware

Associated Reading: Tanenbaum and van Steen, pp. 1-42
3
What is a Distributed System?
"A distributed system is a collection of
independent computers that appears to its users
as a single coherent system." (Tanenbaum and van
Steen)
"A distributed system is one in which hardware or
software components located at networked
computers communicate and coordinate their
actions only by passing messages." (Coulouris,
Dollimore and Kindberg)
"A distributed system is one that stops you
getting work done when a machine you've never
even heard of crashes." (Lamport)
4
What is a Distributed System (continued)?
"A distributed system is a system designed to
support the development of applications and
services which can exploit a physical
architecture consisting of multiple autonomous
processing elements that do not share primary
memory but cooperate by sending asynchronous
messages over a communications network." (Blair
and Stefani, 1998)
5
Examples of Distributed Systems
  • The Web
  • If you buy a flight on-line, your computer
    interacts with a distant machine
  • Two computers interact to deliver you a service
  • Is it 2 or is it more?
  • Civil Aviation
  • Modern planes have replicated flight control
    computers (Airbus A330/A340 planes have 5 for
    instance)
  • Additional processing units are used for specific
    needs (data logging, cockpit displays, etc.)
  • Peer-to-Peer (P2P) file sharing networks
  • Each user contributes to the network
  • Auto-organisation, no centralised entity

6
Why Distributed Systems?
  • Because the world is distributed
  • You want to book a hotel in Sydney, but you are
    in Lancaster
  • You want to be able to retrieve money from any
    ATM in the world, but your bank is in Preston
  • A railway network is a distributed transport
    system. It needs a distributed computing system.
  • An airplane has one cockpit, but 2 wings
  • Because problems rarely hit two different places
    at exactly the same time
  • For a company, having only one database server is a
    bad idea
  • Having two in the same room is better, but still
    risky
  • Because joining forces increases performance,
    availability, etc.
  • High Performance Computing, replicated web
    servers, etc.

7
A Short History of DSs
  • Distributed Computer Systems are largely
    concerned with
  • Data processing/management/presentation
    (computing side)
  • Communication/ coordination (distributed side)
  • Those concerns existed well before computers were
    invented
  • Ancient empires needed efficient communication
    systems
  • cf. the postal service of the Persian Empire (6th
    century BC), the Roman roads (many still
    visible today), etc.
  • Maximum message speed: 200 miles/day in the Persian
    system
  • assuming a good infrastructure (roads, horses,
    staging posts)
  • Delays impose distributed organisations
  • Persian and Roman empires extended over 1000s of
    miles
  • Trust / secrecy / reliability issues
  • Am I sure Governor X is doing what he says he is?
  • None of this has really changed! Things have only
    sped up!

8
A Short History of DSs
  • Computers are far more recent than empires
  • The first modern computers appeared just after
    WWII
  • They were slow, bulky, and incredibly expensive
  • The ENIAC (1945), used by the US Army: 30 tons,
    170 m² footprint, 180 kilowatts, 18,000 vacuum
    tubes, and 5,000 additions/second (5 kHz)
  • Price: $500,000 (in US dollars of the time; would be
    roughly $5,000,000 today)
  • And distributed computing is even more recent
  • For a long time, only very few computers around
    anyway
  • No practical technology to connect them
  • This all changed in the 80s
  • The rise of the micro-computers (PC, Mac, etc.)
  • The launch of the Internet (1982, TCP/IP),
    after 10 years of development
  • The pace of change is accelerating as we enter
    the 21st century
  • Heterogeneous connectivity and devices
  • Fusion of services (telephony, broadcasting,
    Internet)

9
But Why is This so Hard?
  • A bank asks you to program their new ATM software
  • Central bank computer (server) stores account
    information
  • Remote ATMs authenticate customers and deliver
    money
  • A first version of the program
  • ATM (ignoring authentication and security
    issues)
  • Ask customer how much money s/he wants
  • Send message with <customer ID, withdraw, amount>
    to bank server
  • Wait for bank server answer: <OK> or <refused>
  • If <OK>, give money to customer, else display
    error message
  • Central Server
  • Wait for messages from ATM: <customer ID,
    withdraw, amount>
  • If enough money, withdraw money and send <OK>, else
    send <refused>
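
The first version above can be sketched in a few lines of Python. This is only a minimal single-process sketch, not the course's implementation: the names `BankServer`, `Atm` and `handle` are hypothetical, and the "network" is an ordinary method call, so no message is ever lost or delayed here.

```python
from dataclasses import dataclass, field


@dataclass
class BankServer:
    """Central server (hypothetical name): stores balances, answers requests."""
    accounts: dict = field(default_factory=dict)

    def handle(self, message):
        # message is the tuple <customer ID, withdraw, amount>
        customer_id, operation, amount = message
        if operation != "withdraw":
            return "refused"
        if self.accounts.get(customer_id, 0) >= amount:
            self.accounts[customer_id] -= amount
            return "OK"
        return "refused"


@dataclass
class Atm:
    """Remote ATM (hypothetical name): forwards the request, acts on the answer."""
    server: BankServer

    def withdraw(self, customer_id, amount):
        # Send <customer ID, withdraw, amount> and wait for the answer.
        answer = self.server.handle((customer_id, "withdraw", amount))
        if answer == "OK":
            return f"dispense {amount}"
        return "error: withdrawal refused"


server = BankServer(accounts={"John": 500})
atm = Atm(server)
print(atm.withdraw("John", 200))   # dispense 200
print(atm.withdraw("John", 400))   # error: withdrawal refused
print(server.accounts["John"])     # 300
```

Everything works here precisely because the call from `Atm` to `BankServer` cannot fail halfway, which is the assumption a real network does not grant.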

10
Why is This so Hard (continued)?
[Figure: message exchange between the ATM and the Bank Server; the server holds the record "John: 500"]
  • But ...
  • What if the bank server crashes just after 2 and
    before 3?
  • What if the <OK> message gets lost? Takes days to
    arrive?
  • What if the ATM crashes after 1, but before 4?
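
One of these failures can be reproduced with a toy simulation. The sketch below assumes a hypothetical `LossyChannel` that can drop replies on their way back to the ATM; it is only meant to show how a lost <OK> leaves the server and the ATM with different views of the same withdrawal.

```python
class LossyChannel:
    """Hypothetical channel that drops the replies whose index is listed."""

    def __init__(self, drop_replies):
        self.drop_replies = set(drop_replies)
        self.reply_count = 0

    def deliver_reply(self, reply):
        index = self.reply_count
        self.reply_count += 1
        # None models a reply that never reaches the ATM.
        return None if index in self.drop_replies else reply


def server_withdraw(accounts, customer_id, amount):
    """Server side: debit the account if there is enough money."""
    if accounts.get(customer_id, 0) >= amount:
        accounts[customer_id] -= amount
        return "OK"
    return "refused"


def atm_withdraw(accounts, channel, customer_id, amount):
    """ATM side: only gives money if the <OK> actually arrives."""
    reply = channel.deliver_reply(server_withdraw(accounts, customer_id, amount))
    if reply is None:
        return "timeout: no cash dispensed"
    return "cash dispensed" if reply == "OK" else "refused"


accounts = {"John": 500}
channel = LossyChannel(drop_replies=[0])   # the first reply is lost
outcome = atm_withdraw(accounts, channel, "John", 200)
print(outcome)            # timeout: no cash dispensed
print(accounts["John"])   # 300: the account was debited anyway
```

Simply retrying from the ATM would not repair this: the server would see a second, independent withdrawal request and debit the account again.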

11
Byzantine Generals: A Further Cautionary Tale
  • Assumptions
  • Asynchronous and unreliable communication
  • No traitors

12
Byzantine Generals (continued)
  • To focus the discussion, let us consider reaching
    agreement between two generals
  • General Ferguson
  • General Lennon
  • Let us try to design an appropriate protocol
  • Ferguson sends a message to Lennon and awaits an
    ack; on receipt, Lennon sends an ack back to
    Ferguson
  • If Ferguson does not get an ack, he times out and
    tries again; after n timeouts he assumes failure
  • He decides on attack if an ack is received, otherwise
    decides to abort
  • But Lennon needs to know if Ferguson received the
    ack to also safely decide on attack
  • => need to acknowledge the ack?
  • Etc.
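
The send/ack/retry protocol above can be simulated with a deterministic loss schedule. This is a sketch under stated assumptions: the function name `run_protocol` and its parameters are hypothetical, and Lennon is assumed to commit to attack as soon as any message from Ferguson reaches him, which the slides leave open.

```python
def run_protocol(lost, max_attempts=3):
    """Simulate Ferguson's send/ack/retry protocol.

    `lost` is a set of (attempt, kind) pairs, with kind "msg" or "ack",
    naming the transmissions that the unreliable channel drops.
    Returns the pair of decisions (ferguson, lennon).
    """
    lennon = "abort"           # Lennon aborts unless he hears from Ferguson
    for attempt in range(max_attempts):
        if (attempt, "msg") in lost:
            continue           # message lost: Ferguson times out and retries
        lennon = "attack"      # assumption: Lennon commits on first receipt
        if (attempt, "ack") in lost:
            continue           # ack lost: Ferguson times out and retries
        return "attack", lennon
    return "abort", lennon     # n timeouts: Ferguson assumes failure


print(run_protocol(lost=set()))                     # ('attack', 'attack')
print(run_protocol(lost={(0, "msg")}))              # ('attack', 'attack')
all_acks_lost = {(0, "ack"), (1, "ack"), (2, "ack")}
print(run_protocol(lost=all_acks_lost))             # ('abort', 'attack')
```

The last run is the problematic one: every message arrived, every ack was lost, and the two generals end up with opposite decisions.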

13
Byzantine Generals (continued)
  • Lemma
  • Given our assumptions, it is impossible for
    Generals Ferguson and Lennon to reach agreement
    on attack or abort
  • Outline proof
  • Suppose there is a minimum sequence of messages
    for Ferguson and Lennon to reach agreement
  • Focus on the final message M from, say, Ferguson
    to Lennon: Ferguson cannot know whether this
    message arrives
  • Suppose Ferguson has just decided on attack (a
    parallel argument exists for abort): his
    decision is unaffected by whether his final
    message to Lennon arrives, and hence he has no
    real knowledge of the state of Lennon => another
    ack needed
  • Hence the original message sequence is not
    minimal (QED)
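
The key step of the proof, that the sender of the final message sees exactly the same thing whether that message arrives or not, can be checked mechanically for small cases. The sketch below is an illustration rather than a proof; `run_chain` and `sender_can_tell` are hypothetical names, and the exchange is assumed to be a strict alternation of k messages started by Ferguson.

```python
def run_chain(k, last_delivered):
    """Exchange k alternating messages, Ferguson sending message 1.

    Returns (received, last_sender), where `received` maps each general
    to the list of message numbers that reached him.
    """
    received = {"Ferguson": [], "Lennon": []}
    for i in range(1, k + 1):
        receiver = "Lennon" if i % 2 == 1 else "Ferguson"
        if i == k and not last_delivered:
            break              # the final message M is lost
        received[receiver].append(i)
    last_sender = "Ferguson" if k % 2 == 1 else "Lennon"
    return received, last_sender


def sender_can_tell(k):
    """True if the sender of the final message sees any difference."""
    delivered, last_sender = run_chain(k, last_delivered=True)
    lost, _ = run_chain(k, last_delivered=False)
    return delivered[last_sender] != lost[last_sender]


for k in range(1, 6):
    print(k, sender_can_tell(k))   # False for every k
```

Whatever the length k of the exchange, the last sender's view is identical in both runs, so his decision cannot depend on the final message.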

14
The Overall Lesson
  • To develop good distributed systems
  • Developing good software is difficult
  • Developing good distributed software is even
    harder
  • To do this you need help!
  • Very hard to build such systems on bare bones
    hardware
  • Strong need for software platforms
  • Network operating systems
  • Distributed operating systems
  • Middleware!

15
Network OSs
  • Network OS
  • Distribution not hidden by OS
  • But a number of services offered to support
    distribution
  • remote login (ssh), remote copying (scp)
  • distributed file systems (Samba, NFS)
  • Well adapted to heterogeneous DS
  • I can mount a Samba share (MS technology) on my
    Mac (Apple)
  • I can use ssh to connect from my PC to a Unix
    server

[Figure: Distributed Applications on top of per-machine Network OS Services (Machine A ... Machine C), connected by a Network]
16
Distributed OSs
  • Distributed Operating Systems
  • Completely hide the distribution: single system
    image
  • The user has the illusion of using a single
    multiprocessor machine (symmetric
    multi-processing, SMP)
  • Many similarities between DOS and OS with SMP
    support
  • Essentially used for homogeneous clusters of
    machines interconnected with high performance
    networks

[Figure: Distributed Applications on top of Distributed OS Services spanning Machine A ... Machine C, connected by a Network]
17
Middleware
  • Pros and Cons of DOS and NOS
  • DOS are user-friendly but don't scale well.
    Reverse for NOS
  • How to get the best of both worlds?
  • Solution: Middleware!

18
Goals of a Middleware Platform
  • Resource sharing
  • The ability to access and share resources in a
    distributed environment
  • The bread and butter of distributed systems
  • Transparency
  • The ability to view a distributed system as if it
    were a single computer
  • Varying dimensions of transparency incl.
    location, access, migration, etc.
  • The degree of transparency is a key decision in
    any systems architecture
  • Openness
  • The offering of services according to standard
    rules (syntax and semantics)
  • Openness provides support for the key properties
    of portability and interoperability
  • Again, the degree of openness is a key factor in
    systems design
  • Extensibility
  • The ability to introduce new or
    modified functionality
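
Access and location transparency, listed above, can be illustrated with a proxy object. This is a minimal sketch, not how any particular middleware platform works: `LocalCounter`, `RemoteProxy`, `fake_network` and the `registry` dictionary are all hypothetical, and the "remote" machine is simulated inside the same process.

```python
class LocalCounter:
    """An ordinary object living in the caller's own process."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value


class RemoteProxy:
    """Stands in for an object on another machine.

    `send` is a hypothetical transport: (method_name, args) -> result.
    """

    def __init__(self, send):
        self._send = send

    def __getattr__(self, name):
        # Called only for attributes not found normally: forward the call.
        def call(*args):
            return self._send(name, args)
        return call


# Simulated remote side: in a real system this would sit behind a network.
remote_counter = LocalCounter()


def fake_network(method_name, args):
    return getattr(remote_counter, method_name)(*args)


registry = {"local": LocalCounter(), "remote": RemoteProxy(fake_network)}

# Client code is identical for both objects, wherever they live.
results = {name: obj.increment() for name, obj in registry.items()}
print(results)   # {'local': 1, 'remote': 1}
```

How much of the difference between the two cases should be hidden (latency, partial failure) is the "degree of transparency" decision mentioned above.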

19
Goals of a Middleware Platform (continued)
  • Scalability
  • Scalable with respect to size, e.g. support
    massive growth in the number of users of a
    service
  • Scalable with respect to geography, e.g.
    supporting distributed systems that span
    continents (dealing with latencies, etc.)
  • Scalable with respect to administration, e.g.
    supporting systems which span many different
    administrative organisations
  • Dependability
  • Security
  • Providing secure and authenticated channels,
    access control, key management, etc.
  • Fault-tolerance
  • Providing highly available and resilient
    distributed applications and services

20
Styles of Middleware Platform
  • Client-server platforms
  • e.g. DCE (Distributed Computing Environment)
  • One of the first, historical importance
  • Distributed object technology
  • e.g. CORBA
  • e.g. Java RMI (as featured in this course)
  • Component technologies
  • e.g. Enterprise Java Beans
  • Other styles
  • Resource discovery platforms (e.g. Jini), agent
    platforms (e.g. Aglets), group communication
    services (e.g. Java Groups), distributed file
    systems, publish-subscribe systems, distributed
    transaction services, distributed document-based
    systems, P2P technology, etc.

21
Overall Syllabus (Revisited)
22
Expected Learning Outcomes
  • At the end of this session
  • You should be able to define distributed computer
    systems
  • You should be able to explain why distribution is
    needed
  • You should know everyday examples of distributed
    systems
  • You should understand how they evolved into their
    current form
  • You should have an intuition of why realising
    good distributed systems is difficult
  • You should understand the role of middleware
    in supporting the development of distributed
    applications and services