Overview Part 2


Slides: 61

Provided by: gerdlie

1
Distributed Systems

Lecture 2
Overview (Part 2)
23. April, 2002

2
Schedule of Today

Some other international DS People
German Researchers
Two Approaches towards DS
Goals and Challenges of DS

3
DS People
Notion & History

Jean Bacon (Cambridge, UK)
Opera group → Systems group in CA, MSSA,
IMP (interactive presentations support),
Active Systems: Asynchronous middleware is
capable of rapid response to events and may be
used for many application areas, including
- detection of mobile users or computers,
- response to faults in telecommunications networks,
- detection of illegal entry by surveillance equipment,
- detection of suspicious patterns of use of bank, credit or phone cards.
The approach is to use middleware IDL
(Interface Definition Language) to specify the
events that a given service is able to detect and
notify.

4
Ranking of US CS Departments
5
Cornell People

Ken Birman
Secure, reliable, scalable DS
ISIS (Toolkit → commercial), Horus, Ensemble,
Spinglass
Emin Gün Sirer
Spin, Kimera, MagnetOS, CliqueNet

6
Cornell people

Fred B. Schneider
Language Based Security
Containment and Integrity for Mobile Code
Cornell Online Certification Authority (COCA)
Andrew Myers
THOR, a distributed OO database
Jif, an extended version of Java protecting
privacy, ...

7
Illinois People

Roy Campbell
Active Spaces, 2K: component-based OS
Mobile Security, Cherubim, Seraphim,
..., many more
Klara Nahrstedt
Ad hoc networks, QoS networking,
...

8
Active Spaces
9
2K
10
Illinois People

M. Dennis Mickunas
Mobile security, security architecture, active
spaces, network-centric operating system.
Daniel Reed
Smart Environments
Performance instrumentation and analysis
techniques for large scale parallel systems and
resource management policies.

11
Smart Environments

Intelligent Information Spaces
Test bed to explore and evaluate intelligent
devices and augmented realities
Proposed work in ubiquitous information spaces
spans three basic areas:
- interoperable component architectures for device coordination,
- seamless object communication for user quality of service (QoS), and
- adaptive user context and modality management.

12
UCLA People

Leonard Kleinrock
Inventor of the Internet, ANDS,
SSN, Travler, WAMIS, SESAME
...
Gerald F. Popek
Panda, Ficus, Truffles, Travler

13
Advanced Networking and Distributed Systems
(ANDS)
14
UCLA People

Peter Reiher
Distributed Operating & File Systems
Majid Sarrafzadeh
Embedded System Design
Low-Power Computing
Reconfigurable Computing
VLSI CAD
e-commerce

15
Yale People

Arvind Krishnamurthy
Power aware File Systems for mobiles,
Probabilistic Packet Scheduling
Yang Richard Yang
Network congestion, mobile wireless networks
Network security
Edmund Yeh
Queuing theory, wireless systems,
Data networks

16
Washington People

Thomas Anderson
Detour: Towards a Virtual Internet
Portolano: Invisible Computing
Access Communication Computation
for WAN and Systems Research
WebOS: OS support for wide-area (WA) applications
NoW (Network of Workstations)
Steven Gribble
Ninja, DDS, TACC, Denali, Piazza

17
WebOS

WebOS provides basic operating systems services
needed to build applications that are
geographically distributed, highly available,
incrementally scalable, and dynamically
reconfiguring. Our initial implementation is
split into the following pieces:
- WebFS: A global file system layer
- Active Names: A mechanism for logically moving service functionality
- Secure Remote Execution
- ...

18
Washington People

Ed Lazowska
Quantitative System Performance,
Parallel & Distributed Systems
Hank M. Levy
SMT: Simultaneous Multithreading,
Web Analysis, Piazza
Porcupine, Opal, Etch, ...
GMS (Global Memory System)

19
Texas Austin

Lorenzo Alvisi
Lightweight Fault-Tolerance
Cache Consistency in WANs
WAFT: Support for Fault-Tolerance
in WA-OO-Systems
Byzantine Fault-Tolerance for DSS
Mike Dahlin
Peer-to-peer study group
Lab. for Advanced Systems Research (LASR)
OS Support for a Program-Enabled Web
C0PE: Consistent 0-Administrator Personal
Environment

20
Texas Austin

Mohamed G. Gouda
Programming Methodology,
Concurrent and Distr. Computing,
Fault-tolerant Computing, Secure Computing,
Network Protocols,
Formal Methods
Jayadev Misra
Parallel and distributed computing
Proving distributed algorithms

21
Wisconsin People

Andrea C. Arpaci-Dusseau
Gray Box System, WIND, NoW-Sort,
Implicit Coscheduling
Remzi H. Arpaci-Dusseau
Storage Systems and I/O (WIND),
Empirical Analysis,
Storage Management

22
Wisconsin People

Lawrence H. Landweber
TheoryNet, CSNET,
Mentor of Internet,
First OSI protocol implementation, ...
Marvin Solomon
OO-database systems,
Software development tools,
Distributed OS,
Computer networks,

23
European Scene

DS groups all over the continent
UK, F, Ne, No, I, S, ...
Have a look for yourself

24
German DS People

Sebastian Abeck
Application Management
Teleteaching
Frank Bellosa (Erlangen)
Power Management
Components for distributed real-time systems

List is not yet complete
25
German DS People

Alejandro Buchmann (Darmstadt)
TRUSTED (Testbed for Reliable, Ubiquitous,
Secure, Transactional, Event-driven and
Distributed Systems)
Peter Druschel (Rice, Houston)
Pastry/PAST: Peer-to-peer systems
ScalaServer: System support for scalable network
servers

26
German DS People

Claudia Eckert (Darmstadt)
Mobile Computing
Security in DS
GSFS (Group aware cryptographic
file system)
Kurt Geihs (TU Berlin)
Middleware for future application scenarios
QoS distributed Systems

27
German DS People

Hermann Härtig (TU Dresden)
DROPS, L4Linux, (Verified)Fiasco,
COMQUAD (COMponents with QUantitative properties
and ADaptivity) together with
Alexander Schill, Heinrich Hußmann, Klaus
Meissner (TU Hohenheim), Klaus Meyer-Wegener,
Andreas Pfitzmann
µSINA (Secure Internet
Networking Architecture),

related to our research group
28
German DS People

Gerd Hegering (TU München)
LEONET (nature-inspired learning and
optimization methods for networked systems)
Service Oriented Accounting Management
Hans-Ulrich Heiss (TU Berlin)
Cluster Computing
Security
Resource Management in DS
DISCOURSE (4 Berlin Unis)

29
German DS People

Gernot Heiser
OS, Embedded Systems, DS, SASOS, e.g. Mungi,
IA 64-bit Linux, L4-ports on MIPS and Alpha,
Gelato,
Contributions to HW developments:
- U4600, a 64-bit computer based on the MIPS
R4600 processor, used as a research and
teaching platform, and
- PLEB, a computer about 10 x 7 x 1.5 cm in size,
based on the StrongARM SA-1100 processor and
used for a number of projects including remote
data capture, robotics and research into
ubiquitous computing.

related to our research group
30
German DS People

Winfried Lamersdorf (Hamburg)
Distributed Applications
Open Distributed Software Architectures
Friedemann Mattern (ETH Zürich)
Ubiquitous Computing
Middleware (MICO)
Security & Privacy

31
German DS People

Michael Merz (Hamburg)
OSM
COSMOS (Common Open
Software Market for SMEs)
Max Mühlhäuser (Darmstadt)
uBiZ (ubiquitous business, information, and
zest)
uLearn

32
German DS People

Jürgen Nehmer (Kaiserslautern)
Ara, GeneSys, B10, Squirrel
Mosquito, Panda, ....

33
German DS People

Hasso Plattner (Walldorf)
R/3
Arno Puder (AT&T Labs)
Mico (open-source CORBA
implementation)
COST (open-source testbed)

34
German DS People

Kurt Rothermel (Stuttgart)
Mobile Computing
Distributed Multimedia
Groupware & workflow
Communication protocols
Trust & security
Alexander Schill (Dresden)
Standards
QoS
Mobile Computing

35
German DS People

Johann Schlichter (TU München)
Community Online Services
Agent based information management
Wolfgang Schröder-Preikschat
(Magdeburg → Erlangen)
Pure (Portable universal
runtime executive)
PEACE (Process execution and
communication environment)

36
German DS People

Peter Sturm (Trier)
WWW-Caching
Customized Software for large Systems
Distributed Applications
Martina Zitterbart
Multicast Group Communication
Mobile Communication
Distributed persistent objects, ....

37
Impetus
Notion & History

"The first driving force behind the trend towards
distributed systems is economics."
(A. Tanenbaum)
Two different starting points for distributed
systems:
- Distribute
- Connect

38
Notion & History
Distribution Problem
Suppose you have an expensive mainframe with an
OS and applications.
1. How to distribute these applications onto
cheaper PCs or WSs?
2. How to distribute services of the centralized
OS amongst the nodes?
39
Notion & History
Connection Problem
Suppose you have n specialized PCs and/or WSs
with different OSes, or hosts spread all over the
world.
1. How to connect these systems to get an
appropriate remote service?
2. How to support this heterogeneity and how to
handle platform-dependent formats?
Hope: To end the tyranny of geography.
40
Real Impetus
Notion & History

PCs have been the driving force behind the development of DS.
Typical requirements of interconnected PCs (at
that time):
- Data sharing (e.g., distributed file systems, Web)
- Device sharing (expensive peripherals, color printer)
- Flexibility (workloads can be shifted to less
loaded machines, e.g. rlogin)
- Communication (email, etc.)

41
Potential Goals
Goals & Challenges
Potential benefits of a distributed system over a
centralized mainframe?
42
Goals & Challenges
Potential Goals
Potential benefits of a distributed system over
an isolated PC?
43
Goals & Challenges
Major Disadvantages?
44
Problems and Challenges of DS
Goals & Challenges

Transparency
Flexibility
Reliability
Performance
Scalability

45
Transparency
Goals & Challenges

A user wants a single-image view when
working with a DS,
i.e. he/she does not have to be aware of where
objects are located,
or at what time it is better to request a certain
service so that it is done in time, etc.

Transparency forms
Location: Resources can be established where they are needed
Migration: Resources can migrate without name change
Replication: Resources can be replicated as often as needed
Failure: Users are unaware of failures of individual components
Concurrency: Users are unaware of sharing resources with others
Parallelism: Users are unaware of parallel execution of activities
46
More on Transparency in DS
Goals & Challenges

Access transparency (access to a remote object
is similar to access to a local one)
Location transparency
- location of processes (tasks)
- location of CPUs and other resources
- location of files
- mobile users

47
More on Transparency in DS
Goals & Challenges

Parallelism transparency (user writes a serial
program; compiler and OS handle the rest, i.e.
how many threads on which nodes at what time, etc.)
Scalability transparency (extending the system
with new nodes at runtime)
Fault transparency (losses of nodes are not
noticeable)
Performance transparency
(no node collapses due to an unbalanced load)
Concurrency transparency
(user should not care about other users)

48
Replication
Goals & Challenges

Objectives
- Improve availability / fault tolerance
- Improve performance of queries
(either throughput or latency, or better both)
- Lower cost

49
Flexibility
Goals & Challenges

Flexibility is a natural characteristic of any
DS, but you have to support it, e.g. it should be
easy to add new system components or to update
old ones.
Furthermore, it may be attractive to install
different versions of one system component
(debuggers, compilers etc.).
→ Client-server architectures are commonly used for DS

50
Client Server Model
Goals & Challenges
[Figure: a client and a server, each running on its own kernel; the two kernels are linked by the interconnection medium]
Remark: Though communication is between client
and server, the kernels and communication layers
of both nodes are involved.
51
Layout of a Request/Reply-Message
Goals & Challenges
struct message {
    nodeid_t source;          /* system supplied */
    nodeid_t dest;            /* receiver identity */
    int      opcode;          /* desired operation */
    int      count;           /* data size */
    int      xyz;             /* specific */
    int      result;          /* for reply */
    char     object[N_Name];  /* name of target object */
    char     data[BUF_SIZE];  /* data to be transferred */
};
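As an illustration only, the request/reply layout on this slide could be filled in by a client as sketched below. The slide does not define nodeid_t, N_Name or BUF_SIZE, so the typedef and the two sizes here are assumptions, and OP_WRITE and make_request are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <string.h>

/* Assumed definitions: the slide leaves these open. */
typedef int nodeid_t;
#define N_Name   32
#define BUF_SIZE 256

struct message {
    nodeid_t source;          /* system supplied */
    nodeid_t dest;            /* receiver identity */
    int      opcode;          /* desired operation */
    int      count;           /* data size */
    int      xyz;             /* specific */
    int      result;          /* for reply */
    char     object[N_Name];  /* name of target object */
    char     data[BUF_SIZE];  /* data to be transferred */
};

/* Hypothetical operation code, not taken from the slides. */
enum { OP_WRITE = 2 };

/* Fill in a request message.
 * Returns 0 on success, -1 if the object name or payload does not fit. */
int make_request(struct message *m, nodeid_t src, nodeid_t dst, int opcode,
                 const char *object, const void *data, int count)
{
    assert(m != NULL && object != NULL);
    if (count < 0 || count > BUF_SIZE)
        return -1;
    if (count > 0 && data == NULL)
        return -1;
    if (strlen(object) >= N_Name)
        return -1;

    memset(m, 0, sizeof *m);          /* result and xyz start out as 0 */
    m->source = src;
    m->dest   = dst;
    m->opcode = opcode;
    m->count  = count;
    strcpy(m->object, object);        /* length was checked above */
    if (count > 0)
        memcpy(m->data, data, (size_t)count);
    return 0;
}
```

A caller would write, for example, make_request(&m, 1, 2, OP_WRITE, "file.txt", "hi", 2) and then hand m to whatever send primitive the kernel offers. The fixed-size buffers keep the message a flat block of bytes, which is convenient for copying between the kernels involved.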
52
Naming Problem
Goals & Challenges

Before a client can use any service, he/she has to
locate that service. How?
Machine.server_process is sufficient to route a
request to the target, but ...
- Server ID may have changed due to a restart
- Server has migrated to another machine
Global process IDs, but how to allocate unique
global process IDs?
Establishing a central service conflicts with
scalability.
Use broadcasting to locate the target server,
but broadcasting eats up resources and is
not a scalable solution; we'll get sequences of
messages of the type
where → here → request → reply
Introduce a specific name server, resulting
in the following sequence:
lookup(service) → reply(server) →
request(server) → reply(result)
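The name-server sequence above can be sketched with an in-memory table. This is a minimal illustration under assumptions, not the lecture's implementation: nodeid_t, the table sizes and the functions ns_register and ns_lookup are hypothetical, and real lookups would travel as messages rather than function calls.

```c
#include <assert.h>
#include <string.h>

/* Assumed types and limits, chosen only for this sketch. */
typedef int nodeid_t;
#define MAX_SERVICES 8
#define NAME_LEN     32

struct binding {
    char     service[NAME_LEN];  /* stable service name known to clients */
    nodeid_t server;             /* current location of the server */
};

struct name_server {
    struct binding table[MAX_SERVICES];
    int            used;         /* number of valid entries in table */
};

/* Bind a service name to a server; re-registering an existing name
 * updates its location (e.g. after a restart or a migration).
 * Returns 0 on success, -1 if the name is too long or the table is full. */
int ns_register(struct name_server *ns, const char *service, nodeid_t server)
{
    assert(ns != NULL && service != NULL);
    if (strlen(service) >= NAME_LEN)
        return -1;
    for (int i = 0; i < ns->used; i++) {
        if (strcmp(ns->table[i].service, service) == 0) {
            ns->table[i].server = server;
            return 0;
        }
    }
    if (ns->used == MAX_SERVICES)
        return -1;
    strcpy(ns->table[ns->used].service, service);
    ns->table[ns->used].server = server;
    ns->used++;
    return 0;
}

/* lookup(service) -> reply(server); returns -1 if the service is unknown. */
nodeid_t ns_lookup(const struct name_server *ns, const char *service)
{
    assert(ns != NULL && service != NULL);
    for (int i = 0; i < ns->used; i++) {
        if (strcmp(ns->table[i].service, service) == 0)
            return ns->table[i].server;
    }
    return -1;
}
```

A client first calls ns_lookup(&ns, "print") and then sends its request to the node returned. Because clients only hold the service name, a server that restarts or migrates just registers again, which addresses the two problems listed for Machine.server_process. The name server itself remains a central component, so the scalability concern raised above still applies to it.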

53
Reliability
Goals & Challenges
DS provide higher availability through replication,
but in general more distributed components are
needed to perform a complex service.
→ Potentially additional points of failure →
the system architect must increase reliability by
extra measures!

Reliability requires
- Consistency
- Security
- Fault Tolerance

54
Performance
Goals & Challenges
It is hard to achieve the utmost performance in a DS
because other requirements may conflict with
performance, in particular
- Transparency
- Migration
- Reliability
Remark: A DS should offer high performance per
unit cost.
55
Performance Measures
Goals & Challenges

Bandwidth (throughput) = bits transmitted / time
(e.g. bandwidth of Ethernet = 10 Mbps
→ transmitting 1 bit takes 0.1 µs)
Latency (delay) = time needed to transfer
a packet from source to target
(latency(transcontinental network) ≈ 10 ms,
i.e. from NY to LA)
Round-trip time (RTT) = time needed to transfer
a packet and have it acknowledged
Latency = Propagation + Transmit + Queue, whereby
Propagation ≥ Distance / Speed_of_Light
Transmit = Size / Bandwidth

SoL(vacuum) = 3 × 10^8 m/s
SoL(cable) ≈ 2.3 × 10^8 m/s
SoL(fiber) ≈ 2.0 × 10^8 m/s
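The latency decomposition on this slide can be written down directly. The sketch below assumes SI units throughout (metres, seconds, bits, bits per second) and uses the slide's approximate signal speeds; the function names are hypothetical, and the queueing delay is simply passed in because the slide gives no model for it.

```c
#include <assert.h>

/* Approximate signal speeds from the slide, in m/s. */
#define SOL_VACUUM 3.0e8
#define SOL_CABLE  2.3e8
#define SOL_FIBER  2.0e8

/* Propagation = Distance / Speed (a lower bound on the real delay). */
double propagation_s(double distance_m, double speed_mps)
{
    assert(speed_mps > 0.0);
    return distance_m / speed_mps;
}

/* Transmit = Size / Bandwidth. */
double transmit_s(double size_bits, double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return size_bits / bandwidth_bps;
}

/* Latency = Propagation + Transmit + Queue. */
double latency_s(double distance_m, double speed_mps,
                 double size_bits, double bandwidth_bps, double queue_s)
{
    return propagation_s(distance_m, speed_mps)
         + transmit_s(size_bits, bandwidth_bps)
         + queue_s;
}
```

For instance, transmit_s(1.0, 10.0e6) reproduces the 0.1 µs per bit quoted for 10 Mbps Ethernet, and propagation_s(2.0e6, SOL_FIBER) gives 10 ms for an assumed 2000 km fiber link.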
56
Application dominates Performance
Goals & Challenges
Though bandwidth and latency define the performance
of a link, their relative importance depends on
the application.
Example 1: A client sends a 1-byte message to a server
and receives a 1-byte answer. This application is
heavily latency bound, supposing no serious
computing is involved in producing the answer.
Medium 1: Transcontinental channel with a 100 ms RTT
Medium 2: SAN channel with a 1 ms RTT
Whether the channel has a bandwidth of 1 Mbps or 100
Mbps is irrelevant, because the time to transmit one
byte is either 8 µs or 0.08 µs.
57
Application dominates Performance
Goals & Challenges
Though bandwidth and latency define the performance
of a link, their relative importance depends on
the application.
Example 2: A digital library is asked to fetch a
25 MB image. In this case the bandwidth of the channel
is very important, whereas it is negligible
whether it is medium 1 or medium 2, i.e. a 100 ms
RTT or a 1 ms RTT. The time to transmit this
image is about 20 s (25 MB = 200 Mbit, e.g. at 10 Mbps)
→ it is irrelevant whether it takes 20.001 s or 20.1 s.
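Both examples can be checked with a small calculation. The sketch below uses a deliberately simplified model, total time ≈ RTT + Size / Bandwidth, which is an assumption of this illustration rather than a formula from the slides; the function names are hypothetical, and the 10 Mbps figure for Example 2 is inferred from the quoted 20 s, not stated in the lecture.

```c
#include <assert.h>

/* Simplified model: one round trip plus the time to push the payload
 * onto the link. Units: seconds, bits, bits per second. */
double transfer_time_s(double rtt_s, double size_bits, double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return rtt_s + size_bits / bandwidth_bps;
}

/* Returns 1 if the round trip dominates (latency bound),
 * 0 if the transmit time dominates (bandwidth bound). */
int is_latency_bound(double rtt_s, double size_bits, double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return rtt_s > size_bits / bandwidth_bps;
}
```

With Example 1 (8 bits, 1 Mbps, RTT of 100 ms or 1 ms) is_latency_bound returns 1, since 8 µs is tiny next to either RTT. With Example 2 (2.0e8 bits at an assumed 10 Mbps) transfer_time_s yields 20.1 s for medium 1 and 20.001 s for medium 2, and is_latency_bound returns 0 for both.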
58
Scalability
Goals & Challenges
Performance must not degrade with a growing DS
→ consequence for the system architect