Software Engineering of Distributed Systems

Provided by: ruthda

1
Software Engineering of Distributed Systems

University of Colorado
Boulder
ECEN5053

2
Course Logistics

Introductions
http://ece.colorado.edu/swengctf
http://ece.colorado.edu/swengctf/distributed
Format
Calendar
Exams -- final exam only
Homework -- in teams of 2 to 3
Phone number for late arrival
Contact information
Text web site: www.cdk3.net -- see key pts.

3
Outline for this session

Definition of distributed systems
Purposes
Demands/challenges
Hardware concepts
Software concepts
An example model

4
Definition of a Distributed System

"A distributed system is a collection of
independent computers that appears to its users
as a single coherent system." -- Andrew Tanenbaum
"A distributed system is one in which components
located at networked computers communicate and
coordinate their actions only by passing
messages." -- Coulouris et al. (your text)
Consequences:
concurrency of components
lack of a global clock
independent failures of components

5
Alternative definition of a distributed system

"You know you have one when the crash of a
computer you've never heard of stops you from
getting any work done." -- Leslie Lamport

6
If true, implied characteristics?

Computer heterogeneity is hidden from the user
Communication paths are hidden from the user's perspective
User interaction with system from various
locations
User interaction with applications
Scalability
Availability
Addition or temporary removal of certain
components

7
Examples?

internet --
Not quite there -- some internet applications
more so than others
Some applications, user must be very aware of
which computer is being accessed
and what else?

8
Timeline of what had to happen first

high speed networks

9
Necessary Developments

Take a historical view
1945 - 1985
Computers were large and expensive
Most organizations had only a few
They lacked a way to connect them
They operated independently from one another
By mid-80s ... powerful microprocessors with the
power of a then-contemporary mainframe
High speed networks!
Result: Easy to combine large numbers of
computers via a high-speed network.

10
Purposes -- what problems are solved?

Easily connect users to remote resources
Share resources with remote users in a controlled
way
Hide the fact that the resources are physically
distributed over a network -- transparency
Should be an open system
Offers services by standard rules that describe
the syntax and semantics of those services
Should be scalable
size, geography, and administration

11
Purpose 1: Access and sharing remotely

Why share?
economics
ease of collaboration -- virtual organizations
ease of info exchange
commerce
Connectivity and sharing lead to security issues
Currently, inadequate protection

12
Purpose 2: Transparency

Transparency   Description -- Hide ...
Access         differences in data representation and how a resource is accessed
Location       where a resource is located
Migration      that a resource may move to another location
Relocation     that a resource may be moved while in use
Replication    that a resource is replicated
Concurrency    that a resource may be shared by competitors
Failure        failure and recovery of a resource
Persistence    whether a software resource is in memory or on disk

13
Degree of Transparency

Hiding all distribution aspects is not always a
good idea
Sometimes desirable to remain fixed
Messages between processes that are thousands of
miles apart can take tens to hundreds of milliseconds
Trade-off between high degree of transparency and
performance -- why?
The desirable degree of transparency should be
considered in context with other issues such as
performance and cost
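
A rough back-of-the-envelope check on the latency point, as a sketch only. It assumes signals travel through optical fiber at about 200,000 km/s (roughly two-thirds of the speed of light in vacuum); the distances and the function name are illustrative, not from the slides.

```python
# Lower bound on one-way message delay imposed by signal propagation alone.
# Assumption: ~200,000 km/s in optical fiber (about 2/3 of c in vacuum).
FIBER_SPEED_KM_PER_S = 200_000


def propagation_delay_ms(distance_km: float) -> float:
    """Return the minimum one-way delay in milliseconds for a distance."""
    return distance_km / FIBER_SPEED_KM_PER_S * 1000


# Illustrative distances: LAN scale, regional, intercontinental.
for km in (1, 1_000, 8_000):
    print(f"{km:>6} km -> at least {propagation_delay_ms(km):.3f} ms one way")
```

Routing, queuing, and protocol round trips come on top of this floor, so a wide-area call cannot be made to look as fast as a local one.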

14
Purpose 3: Openness

Offers services according to standard rules
describing syntax and semantics of the services.
Rules are formalized in protocols
Services generally specified through interfaces
using an Interface Definition Language (IDL)
IDLs specify syntax only
natural language used to describe semantics
allows an arbitrary process that needs an interface
to talk to another process that provides it
proper interfaces are complete and neutral
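
The split between syntax (the interface) and semantics (natural language) can be sketched as follows. This is not a real IDL; a Python abstract base class stands in for one, and the `FileService` interface and its implementation are hypothetical names made up for illustration.

```python
from abc import ABC, abstractmethod


class FileService(ABC):
    """Hypothetical service interface, standing in for an IDL definition."""

    @abstractmethod
    def read(self, name: str, offset: int, length: int) -> bytes:
        """Semantics, stated in natural language: return up to `length`
        bytes of file `name` starting at `offset`; return b"" past the end."""


class InMemoryFileService(FileService):
    """One of many possible implementations behind the same interface."""

    def __init__(self, files: dict) -> None:
        self._files = files

    def read(self, name: str, offset: int, length: int) -> bytes:
        return self._files[name][offset:offset + length]


def first_bytes(service: FileService, name: str) -> bytes:
    # The client depends only on the interface, not on the implementation.
    return service.read(name, 0, 4)


svc = InMemoryFileService({"notes.txt": b"distributed"})
print(first_bytes(svc, "notes.txt"))
```

The signature fixes only names and types; what `read` must do at end of file is stated in prose, which mirrors the point that the interface definition covers syntax only.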

15
Goals of Openness

Interoperability and portability
completeness and neutrality are prerequisites
Flexible
easy to configure the system out of different
components from different developers
easy to add new components without impact
easy to replace existing ones without impact
i.e. extensible
easier said than done

16
Purpose 4: Flexibility -- Policy and Mechanism

System must be organized as a collection of
relatively small and easily replaceable or
adaptable components
Need for change: a component does not provide the
optimal policy for a specific user or app
Example: differing caching policies
Need to be able to separate policy from mechanism
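
A minimal sketch of the caching example, assuming a tiny in-memory cache: the `Cache` class is the mechanism (storage, lookup, eviction on overflow) and the eviction policy is a function passed in from outside. All names here are illustrative.

```python
from collections import OrderedDict
from typing import Callable, Hashable

# A policy decides WHICH key to evict; the cache supplies the mechanism.
EvictionPolicy = Callable[[OrderedDict], Hashable]


def evict_oldest(entries: OrderedDict) -> Hashable:
    """Policy 1: evict the entry that was inserted first."""
    return next(iter(entries))


def evict_newest(entries: OrderedDict) -> Hashable:
    """Policy 2: evict the most recently inserted entry."""
    return next(reversed(entries))


class Cache:
    """Mechanism: bounded storage, lookup, and eviction on overflow."""

    def __init__(self, capacity: int, policy: EvictionPolicy) -> None:
        self.capacity = capacity
        self.policy = policy
        self.entries: OrderedDict = OrderedDict()

    def put(self, key, value) -> None:
        if key not in self.entries and len(self.entries) >= self.capacity:
            del self.entries[self.policy(self.entries)]
        self.entries[key] = value

    def get(self, key):
        return self.entries.get(key)


fifo = Cache(2, evict_oldest)
for k in "abc":
    fifo.put(k, k.upper())
print(list(fifo.entries))
```

Swapping `evict_oldest` for `evict_newest` changes which entries survive without touching the `Cache` code, which is the separation the slide asks for.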

17
Purpose 5: Scalability Challenges -- Size

Size
Limitations of centralized services, data, and
algorithms -- become bottleneck
Unlimited processing power and storage cannot
overcome communication limitations
Decentralization introduces some kinds of
uncertainty

18
Purpose 6: Scalability Challenges -- Geography

Existing distributed systems designed for LANs
are based on synchronous communication
Communication in WANs is inherently unreliable
and almost always point-to-point
LANs provide reliable comm based on broadcasting
-- WAN needs special location services
Centralized components prevent geographic scale

19
Purpose 7: Scalability Challenges -- Administration

How to scale across multiple independent
administrative domains
Conflicting policies
usage (payment)
management
security
protect against malice from the new domains
protect against malice from the distributed
system -- e.g. downloaded programs

20
Scaling Techniques

Scalability problems appear as performance ones
hide communication latencies
avoid waiting for responses as much as possible
i.e. construct the requestor to use asynchronous
comm as much as possible
reduce overall communication
distribution -- spreading component parts across
the system, e.g. DNS (see next slide)
replication across the distributed system
increases availability (helps hide latency)
helps balance the load between components
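
Hiding communication latency with asynchronous requests can be sketched with Python's `asyncio`. The `fetch` coroutine is a stand-in for a remote call, and the 0.1 s delay and server names are made-up values used only to model network latency.

```python
import asyncio
import time


async def fetch(name: str, latency_s: float) -> str:
    """Stand-in for a remote request; the sleep models network latency."""
    await asyncio.sleep(latency_s)
    return f"reply from {name}"


async def sequential(names):
    # Synchronous style: wait for each reply before sending the next request.
    return [await fetch(n, 0.1) for n in names]


async def concurrent(names):
    # Asynchronous style: issue all requests, then collect the replies.
    return await asyncio.gather(*(fetch(n, 0.1) for n in names))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


servers = ["s1", "s2", "s3"]
seq_replies, seq_time = timed(sequential(servers))
con_replies, con_time = timed(concurrent(servers))
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same replies; the concurrent one overlaps the waiting periods instead of paying for each of them in turn.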

21
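Example: Dividing the DNS name space into zones

[Figure: DNS name tree. Top-level domains are grouped as Generic (int, com, edu, gov, mil, org, ...) and Countries. Zones Z1, Z2, and Z3 are marked along the branch containing edu, colorado, and cs, ece, ...]
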
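
Distribution by zones can be sketched as a lookup that walks from one zone to the next, so that no single server holds the whole name space. This is a toy model and not the DNS protocol; the zone names, table layout, and host values are invented for illustration.

```python
# Each zone is handled by its own (simulated) name server. An entry either
# delegates to a child zone or holds a final record. All data is invented.
ZONES = {
    "root": {"edu": ("delegate", "edu-zone")},
    "edu-zone": {"colorado": ("delegate", "colorado-zone")},
    "colorado-zone": {
        "ece": ("record", "host-ece"),
        "cs": ("record", "host-cs"),
    },
}


def resolve(name: str):
    """Resolve a dotted name right to left, one zone at a time.

    Returns the record found and the list of zones that were consulted.
    """
    zone, visited = "root", ["root"]
    for label in reversed(name.split(".")):
        kind, value = ZONES[zone][label]
        if kind == "record":
            return value, visited
        zone = value
        visited.append(zone)
    raise KeyError(f"{name} names a zone, not a host")


print(resolve("ece.colorado.edu"))
```

Each step touches only the zone responsible for one part of the name, so load and data are spread across servers rather than concentrated in one place.
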
22
Outline

Definition
Purposes
Demands/challenges
Hardware concepts
Software concepts
An example model

23
Hardware Concepts

Introduction to how distributed systems can be
organized
how they are interconnected
how they communicate

Interconnection   Memory: shared         Memory: private
Bus-based         Shared bus-based       Private bus-based
Switch-based      Shared switch-based    Private switch-based

24
Shared Memory vs. Private Memory

Multiprocessors (not multicomputers)
Single physical address space shared by all CPUs
CPU A writes 37 to address 1000
CPU B then reads from address 1000 and gets 37
e.g., multiple processors on a board with shared
memory
Multicomputers
Every machine has its own private memory
CPU A writes 37 to its address 1000
CPU B reads from its address 1000 and gets
whatever happens to be there; it is not affected
by the other write
For example, PCs connected by a network
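
The address-1000 example can be mimicked with a toy model in which an address space is a Python dict. The `CPU` class is purely illustrative; `None` stands for "whatever happens to be there".

```python
class CPU:
    """A toy CPU that reads and writes an address space (a dict here)."""

    def __init__(self, memory: dict) -> None:
        self.memory = memory

    def write(self, address: int, value: int) -> None:
        self.memory[address] = value

    def read(self, address: int):
        # None models "whatever happens to be there" for an unwritten cell.
        return self.memory.get(address)


# Multiprocessor: both CPUs are attached to the SAME address space.
shared = {}
mp_a, mp_b = CPU(shared), CPU(shared)
mp_a.write(1000, 37)
print("multiprocessor, B reads:", mp_b.read(1000))

# Multicomputer: each CPU has its own private address space.
mc_a, mc_b = CPU({}), CPU({})
mc_a.write(1000, 37)
print("multicomputer, B reads:", mc_b.read(1000))
```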

25
Bus-based vs. Switch-based

Bus architecture of the interconnection network
single network, backplane, bus, cable or other
medium that connects all the machines
For example, cable television
Switched architecture
Individual wires from machine to machine with
many different wiring patterns in use
Msgs move along wires with an explicit switching
decision made at each step to route the message
along one of the outgoing wires.
e.g., worldwide public telephone system

26
Divide and conquer -- select and explain

Performance Impacts
bus, shared memory
switched, shared memory
not quite shared memory
homogeneous multicomputers
private memory, bus-based network
private memory, switch-based network
heterogeneous multicomputer systems

27
Performance Impacts--bus, shared memory

Bus-based multiprocessor, shared memory
Coherent memory
Bus contention
If cache memory for each CPU has a high hit rate,
bus traffic drops dramatically
but introduces serious problem -- what is it?
Caching and memory coherence is an issue for
distributed systems
Limited scalability
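
One possible answer to the question above, sketched as a toy model: per-CPU caches cut bus traffic, but without a coherence protocol a CPU can keep reading a stale copy. The `CachedCPU` class and its write-through behavior are assumptions chosen for illustration, not a description of any particular machine.

```python
class CachedCPU:
    """Toy CPU with a private write-through cache and no coherence protocol."""

    def __init__(self, memory: dict) -> None:
        self.memory = memory
        self.cache = {}

    def read(self, address: int) -> int:
        if address not in self.cache:
            # Cache miss: one access over the bus to shared memory.
            self.cache[address] = self.memory[address]
        # Cache hit: served locally, no bus traffic.
        return self.cache[address]

    def write(self, address: int, value: int) -> None:
        self.cache[address] = value
        self.memory[address] = value  # write-through to shared memory


memory = {1000: 0}
cpu_a, cpu_b = CachedCPU(memory), CachedCPU(memory)
cpu_b.read(1000)        # B caches the old value 0
cpu_a.write(1000, 37)   # A updates memory; B's cached copy is not invalidated
print("memory:", memory[1000], "| B sees:", cpu_b.read(1000))
```

Real bus-based multiprocessors avoid this by having caches observe writes on the bus and invalidate or update their copies, which adds back some of the traffic the caches removed.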

28
Performance impacts -- switched, shared memory

1. Divide memory into modules and connect them to
CPUs with a matrix of switches called a crossbar
switch
Allows multiple CPUs to access shared memory
simultaneously
One still has to wait if both want to access same
module
2. Network of switches to route any input to any
output
May be several switching stages in-between
Need extremely fast switching to reduce latency
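
The cost difference between the two approaches can be put in numbers. A crossbar connecting n CPUs to n memory modules needs n * n crosspoint switches. For the multistage case, the sketch assumes an omega network built from 2x2 switches (the slide does not name a specific network), which has log2(n) stages of n/2 switches each.

```python
def crossbar_switches(n: int) -> int:
    """Crosspoints needed to connect n CPUs to n memory modules."""
    return n * n


def omega_stages(n: int) -> int:
    """Switching stages in an omega network; n must be a power of two."""
    return n.bit_length() - 1


def omega_switches(n: int) -> int:
    """2x2 switches in an omega network: log2(n) stages of n/2 switches."""
    return (n // 2) * omega_stages(n)


for n in (8, 64, 1024):
    print(f"n={n}: crossbar={crossbar_switches(n)}, "
          f"omega={omega_switches(n)} in {omega_stages(n)} stages")
```

The multistage network needs far fewer switches, but every memory access now crosses several stages, which is why the switching has to be extremely fast.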

29
Performance impacts--not quite shared memory

Reduce cost of switching with hierarchical system
SOME memory associated with each CPU (not shared)
Access to own local memory is quick
Accessing anybody else's memory is possible but
slower
NUMA - Non-Uniform Memory Access
better average access times than switched networks
what's the problem?
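
The "average access time" claim can be made concrete with a weighted average. The latencies below (100 ns local, 400 ns remote) are hypothetical numbers chosen only to show the shape of the trade-off.

```python
def average_access_ns(local_fraction: float, local_ns: float,
                      remote_ns: float) -> float:
    """Expected access time when a fraction of accesses hit local memory."""
    return local_fraction * local_ns + (1 - local_fraction) * remote_ns


# Hypothetical latencies: 100 ns local, 400 ns remote.
for fraction in (1.0, 0.9, 0.5):
    avg = average_access_ns(fraction, 100, 400)
    print(f"{fraction:.0%} local -> {avg:.0f} ns on average")
```

The average is only good while most accesses stay local, so where programs and data are placed starts to matter a great deal.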

30
Performance impacts-- homogeneous multicomputers
(SANs)

System of individual computers. Therefore...
Each CPU has direct connection to its own local
memory
Challenges surround communication between the
CPUs
Traffic volume will be orders of magnitude lower
than when interconnection network is also used
for CPU-to-memory traffic

31
Performance impacts - private memory, bus-based
network (SANs)

Processors connected thru shared multiaccess
network such as Fast Ethernet
Limited scalability -- performance degrades with
25-100 nodes depending on amt of communication

32
Performance impacts - private memory,
switch-based network (SANs)

Messages are routed through an interconnection
network instead of broadcast as in bus-based
Interconnection networks vary
Grid -- suitable to 2-dimensional problems
Hypercube -- n-dimensional cube
MPPs - massively parallel processors (1000s)
high-performance proprietary interconnection
network designed for low latency, high bandwidth
COWs - clusters of workstations
Standard workstations connected by off-the-shelf
communication components; no special measures for
high bandwidth or reliability --> ??
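
How routing distance differs between the two topologies can be sketched with a few helper functions. In a hypercube, nodes are numbered so that neighbors differ in exactly one address bit; the grid here is assumed to have no wraparound links. Function names are illustrative.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list:
    """Neighbors of a node: flip each of the `dimensions` address bits once."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hypercube_hops(src: int, dst: int) -> int:
    """Minimum hops equal the number of differing address bits."""
    return bin(src ^ dst).count("1")


def grid_hops(src: tuple, dst: tuple) -> int:
    """Minimum hops in a 2-D grid without wraparound: Manhattan distance."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


print(hypercube_neighbors(0b000, 3))   # neighbors of node 0 in a 3-cube
print(hypercube_hops(0b000, 0b111))    # opposite corners of a 3-cube
print(grid_hops((0, 0), (3, 3)))       # opposite corners of a 4x4 grid
```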

33
Performance impacts - heterogeneous
multicomputer systems

Most distributed systems are these
Computers are heterogeneous w.r.t. processor
type, memory size, I/O bandwidth, etc.
Interconnection networks can be heterogeneous,
too
Many large-scale heterogeneous multicomputers
lack a global system view
cannot assume same performance or services are
available everywhere
THEREFORE: sophisticated software is needed to
shield application developers from what is going
on at hardware level (transparency)

34
Software Concepts

Distributed systems software
acts as resource manager(s) for the underlying
hardware
Hide intricacies and heterogeneity of underlying
hardware
The issues that this software faces are the core
of distributed systems principles we will study
this semester

35
When is a distributed system not a distributed
system?

Distributed operating system
Not intended to handle a collection of
independent computers
Network operating system
Does not provide a view of a single coherent
system
True distributed system
Goal: scalability and openness of network o.s.
and transparency and ease of use of distributed
o.s.
Additional layer called middleware

36
Various middleware models (paradigms)

A particular paradigm is a set of decisions about
how to describe distribution and communication
Distributed file systems
Remote procedure calls
Distributed objects
Distributed documents
See table

37
Sample Paradigms

Paradigm                  Distribution / Communication
Distributed file system   Distribution transparency supported for traditional files
Remote procedure calls    Network transparency; allows a process to call a procedure on a remote machine
Distributed objects       Method invocation; the interface implementation on the process's machine translates the invocation into a message sent to the remote object; reply message --> return value
Distributed documents     Information organized as documents; each document is somewhere in the world

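
The remote procedure call and distributed object rows describe the same basic flow: an invocation is turned into a message, and the reply message becomes the return value. A minimal sketch, with the network replaced by a direct function call; the message format, `send`, and `remote_add` are hypothetical and do not follow any real RPC library.

```python
import json


# --- Server side: the real procedure and a dispatcher ----------------------
def add(x: int, y: int) -> int:
    return x + y


PROCEDURES = {"add": add}


def server_handle(request_bytes: bytes) -> bytes:
    """Unpack a request message, call the procedure, pack the reply."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()


# --- Client side: a stub that looks like a local procedure -----------------
def send(message: bytes) -> bytes:
    """Stand-in for the network; a real system would use sockets here."""
    return server_handle(message)


def remote_add(x: int, y: int) -> int:
    """Client stub: invocation -> request message -> reply -> return value."""
    request = json.dumps({"proc": "add", "args": [x, y]}).encode()
    reply = send(request)
    return json.loads(reply)["result"]


print(remote_add(2, 3))
```

The caller of `remote_add` sees an ordinary function call; everything about messages and transport is hidden inside the stub.
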
38
Each paradigm must address these issues

Communication
Processes and their synchronization
Processes and their interaction
Naming
Consistency and replication
Fault tolerance
Security

39
Software Engineering of Distributed Systems

Requirements specification of these issues in
distributed systems -- how to recognize, analyze,
specify, trace, and manage
Design -- how to choose, represent, and verify
Implementation -- tools, language support
Testing -- static and dynamic
