Designing Distributed Systems
Transcript and Presenter's Notes
1
Designing Distributed Systems
  • Design for Performance, Reliability, Flexibility
    and Security

2
Goal
  • The previous sessions tackled performance,
    reliability and other issues that were inherent
    in the distributed technologies we've been
    discussing. The goal of this session is to look
    at how these distributed technologies are used,
    thereby adding an ARCHITECTURE layer on top of
    the pure distribution technology.

With this layer in place, we can start with
so-called disruptive technologies (web services,
peer-to-peer) in the next session.
3
Overview
  • Design principles
  • The importance of architecture
  • Example 1: large-scale project, architectural
    validation
  • Example 2: portal project, caching, replication
    and asynchronous requests
  • Example 3: system management, change management
  • Example 4: data replication, performance and
    maintenance

4
Design Principles of Distributed Systems
  • Locality: carefully co-locate components that
    interact heavily.
  • Sharing: do not perform the same work more than
    once.
  • Parallelize: design your system in a way that
    lets you do things concurrently. Avoid
    unnecessary serialization.
  • Consistency: carefully evaluate the level of
    consistency that is needed with respect to
    caching and replication.

These principles are described in an extra
chapter of Darrel Ince, Developing Distributed
and e-commerce Applications. See resources.
5
Locality
host X
host A
Order EJB
Order EJB
host C
Facade
Item EJB
Item EJB
host B
Customer EJB
Customer EJB
Enterprise JavaBeans introduced local interfaces
in Release 2.0 to respect the principle of
locality, which suggests concentrating heavily
interacting objects in one place, even though
there is NO functional difference between an item
with a remote and a local interface.
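As a sketch of what this means in code (names are
illustrative, not from the slides): in EJB 2.0 the remote
and local views of the same bean differ only in their
plumbing, not in their functionality.

    // Remote view (EJB 2.0): every call may cross the network and
    // must declare RemoteException. In practice each interface would
    // live in its own source file.
    import java.rmi.RemoteException;
    import javax.ejb.EJBLocalObject;
    import javax.ejb.EJBObject;

    public interface Item extends EJBObject {
        double getPrice() throws RemoteException;
    }

    // Local view: functionally identical, but only callable by
    // co-located beans in the same VM; no marshalling, no network.
    interface ItemLocal extends EJBLocalObject {
        double getPrice();
    }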
6
Sharing (1)
XML parser pool
SAXp.
Object Pool
SAXp.
Client
getParser()
SAXp
SAXp.
returnParser()
SAXp.
SAXp.
Pooling is useful in almost every case, even
locally. But see what happens if you run XML over
HTTP and create a new parser for every request,
when there may be MANY requests per second.
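A minimal pool sketch along the lines of the diagram
(getParser/returnParser); the pool size and the blocking
behaviour on an empty pool are illustrative assumptions.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;

    public class ParserPool {
        private final BlockingQueue<SAXParser> pool;

        public ParserPool(int size) throws Exception {
            pool = new ArrayBlockingQueue<>(size);
            SAXParserFactory factory = SAXParserFactory.newInstance();
            for (int i = 0; i < size; i++) {
                pool.add(factory.newSAXParser()); // pay creation cost once
            }
        }

        public SAXParser getParser() throws InterruptedException {
            return pool.take();     // block instead of creating a new parser
        }

        public void returnParser(SAXParser parser) {
            parser.reset();         // clear state before reuse
            pool.offer(parser);
        }
    }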
7
Sharing (2)
object cached by service
Client A
Object X
Service
getX()
X
X
Y
Client B
Y
Object X
getX()
Distributed applications without caching do not
work. Try to minimize backend requests while
still keeping application logic sane.
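A minimal sketch of such a service-side cache, assuming the
backend call can be modelled as a function; invalidation is
deliberately left out.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.function.Function;

    public class CachingService<K, V> {
        private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
        private final Function<K, V> backend; // the expensive backend request

        public CachingService(Function<K, V> backend) {
            this.backend = backend;
        }

        // The first getX() per key hits the backend; later calls are
        // served from memory. computeIfAbsent also collapses concurrent
        // misses for the same key into a single backend call.
        public V getX(K key) {
            return cache.computeIfAbsent(key, backend);
        }
    }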
8
Parallelize
load balancer
Client A
host A
Dispatcher
request()
Host B
Client B
request()
A design that respects parallel processing scales
much better: here every request can be handled by
any thread running on any host. Avoid
synchronization (wait) points, e.g. in servlet
engines or database connections.
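A minimal dispatcher sketch (the pool size is an arbitrary
choice): because the handlers share no mutable state, any
idle worker thread can take any request and no wait points
are introduced.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class Dispatcher {
        // Any idle worker may take any request: no ordering guarantees,
        // no synchronization points.
        private final ExecutorService workers = Executors.newFixedThreadPool(16);

        public void request(Runnable handler) {
            workers.submit(handler); // handlers must share no mutable state
        }
    }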
9
Caching and Replication
host
Web Server
Client A
request()
replicas
Host
Client B
request()
With caching, the caching components bear the
responsibility for data validity. With
replication, the data source is responsible for
keeping the replicas consistent and up-to-date.
10
Reasons for Replication
  • Better latency through distribution of content
    (Akamai)
  • Better latency through pre-aggregation of content
  • Better throughput through several data sources
  • Better availability through several sources

Both services and/or data can and in many cases
should be replicated for availability reasons.
Some applications only work because of reduced
latency through replication. But the price for
replication is paid in hardware and consistency
protocols.
11
Synchronous Replication
Replicas lock data
Host
Client A
locked
Change request()
locked
2PC
locked
Synchronous replication provides strong
consistency between replicas and source via the
2PC protocol. Expensive.
12
Asynchronous Replication
Client A
Replicas lock data
Read request()
Update
Stale data (dirty read)
Host
locked
Client B
Update
Change request()
Asynchronous replication decouples the change at
the source from the update of the replicas. It is
much cheaper than 2PC, but a client reading a
replica may see stale data (a dirty read) until
the update propagates.
13
Aggregating Replication
Aggregator
Client A
Synchronous part
Read Request
Asynchronous part
Aggregating replication can be achieved via push
(update calls from backends). If done via pull
(an aggregator daemon process polls data sources),
it is more like aggressive, look-ahead caching.
14
Uni-directional, async. Replication
Replicas
Client B
update
Read request()
Master
update
The read-only QOS makes this topology rather
attractive with respect to throughput and
latency. The only possible inconsistency is a
dirty read.
15
Bi-directional, async. Replication
Replicas
Client B
notify
Change request()
Master
update
With bi-directional asynchronous replication,
change requests may also be made against a
replica, which notifies the master; the master
then updates the remaining replicas. This opens
the door to update conflicts (see conflict
resolution below).
16
Peer-to-Peer async. Replication
Client B
notify
Change request()
notify
Asynchronous P2P replication could be used e.g.
to share session state across web containers. A
special form based on multicast provides better
consistency and is called virtual synchrony (see
Group Communication in Birman). Croquet uses p2p
synchronous replication with 2PC, but does it
scale?
17
Peer-to-Peer sync. Replication
notify
change
notify
A special form based on multicast provides better
consistency and is called virtual synchrony (see
Group Communication in Birman). Croquet uses p2p
synchronous replication with 2PC, but does it
scale?
18
Conflict Resolution via Master
Client B
Loser
Change request()
Update
Lost update
Master
locked
Client B
Winner
Change request()
The request against the master always wins and
overwrites the changes made against a replica. In
p2p systems, timestamps can be used as well
(ordering problems, clock skew etc.). For conflict
resolution in group communication see Ken Birman.
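A last-writer-wins sketch for the timestamp variant
mentioned above, assuming loosely synchronized clocks; as
the slide notes, ordering and clock problems make this
approach fragile.

    public class ReplicatedValue {
        private String value;
        private long timestamp;  // wall-clock time of the accepted write

        // Returns true for the winner; an equal or newer stored
        // timestamp means the incoming update loses.
        public synchronized boolean applyUpdate(String newValue,
                                                long newTimestamp) {
            if (newTimestamp <= timestamp) {
                return false;        // loser: a newer write already won
            }
            value = newValue;        // winner overwrites
            timestamp = newTimestamp;
            return true;
        }

        public synchronized String get() {
            return value;
        }
    }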
19
TCP Performance
  • Multi-step connection initiation (SYN / SYN-ACK)
  • Congestion control (50% rate decrease on every
    lost packet; a problem for wireless)
  • Connection splitting with proxies (reduced RTT)
  • Flow control (sliding window size)
  • Connection caching (re-use an existing TCP
    connection, e.g. HTTP 1.1 to retrieve images
    from a web server)

HTTP uses TCP as its transport protocol and seems
to do well even with high connection
initialization costs. But different physical
carriers may influence congestion control
protocols heavily, routing in personal area
networks may need to respect energy conservation
policies, etc.
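A connection-caching sketch using java.net.http (Java 11+,
so far newer than this presentation, and an assumption, not
the slides' API): both requests to the same host reuse one
pooled HTTP/1.1 connection, so the second request skips the
SYN/SYN-ACK handshake entirely.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class KeepAliveDemo {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            HttpRequest page = HttpRequest.newBuilder(
                    URI.create("http://example.com/index.html")).build();
            HttpRequest image = HttpRequest.newBuilder(
                    URI.create("http://example.com/logo.png")).build();
            client.send(page, HttpResponse.BodyHandlers.ofString());
            // The second request rides on the cached TCP connection
            // from the first: no new handshake, lower latency.
            client.send(image, HttpResponse.BodyHandlers.discarding());
        }
    }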
20
TCP Splitting or do Proxies slow things?
client
server
t0
t1
RTT
client
proxy
server
t0
t1
RTT/2
A proxy can increase throughput considerably as
it will cut down on round-trip time and make the
sender send packets more frequently. (L.Reith,
see resources)
21
Distributed Technologies in a Portal
LDAP or DCE
Quotes
Distributed Cache
Directory
WebService
JMS
JNDI
Application Server Web- Tier
Application Server EJB Tier
Web Server
JDBC
RMI
XML-RPC
CORBA
News
Part of a Portal running on a Web Cluster. How do
we turn all this distributed stuff into an
application architecture?
E-bank
22
Architecture is key
  • Sooner or later a distributed computing project
    will need to define the following artefacts:
  • Information Architecture
  • Distribution Architecture
  • System Architecture
  • Physical Architecture
  • Architectural Validation
  • This is a question of "pay me now or pay me
    later", as you will see in the following sections.

23
Example 1: the large distributed project
The German state police wants a new IT system,
Inpol-neu. It looks like it will be too expensive
and too slow, and will probably be cancelled after
a lot of money has been spent (see resources for
the c't article). The system should store all
kinds of crime-related data and allow fancy
queries in realtime. The owners are the federal
states of the BRD.
Large projects have their own rules and problems,
besides the fact that most of them are somehow
distributed designs. If you want to survive such
projects, read Death March by E. Yourdon.
24
The large distributed project gone foul
Oracle DB
runtime filtering per instance (row)!!
Inpol-neu
corba
corba
users
Agil Access-Server
LDAP
Meta DB
meta data (xml)
thin client
fat client
ultrathin client
A fairly standard design. The bottleneck was
supposedly the handling of access rights very
late in the DB using stored SQL. Every user used
the same table schema, but runtime DB access
control filtered out things that a user should
not see. User rights were additionally
complicated by the multi-party character of the
project: every state in the BRD defined rules and
variables in its own special way.
25
Key ingredients for disaster
  • Planned costs >100 million (attracts many
    "experts", highly political)
  • High daily burn-rate (forces rash decisions)
  • Overloaded with features
  • New technology driven to its limit (XML
    meta-data approach to cover serious differences
    between customers)
  • High scalability requirements: lots of data to
    be handled, high performance, security etc.
    Especially bad when coupled with the latest
    (immature) technology.
  • Multi-party support means too many different
    requirements and tedious project handling.
  • Lots of paper pushing while the core assumptions
    of the whole system stay unchallenged (in this
    case the very late and complex access control in
    the DB)

Some of these cannot be prevented easily. At
least an extreme testing approach could be
followed to discover architectural flaws early.
But it seems that it is the structure of those
large-scale projects that makes these simple and
effective countermeasures impossible. Common
sense and sanity seem to be missing in those
projects, and you can feel it. Use early
architectural validation to avoid those problems.
26
Architectural Validation
  • How does the architecture handle change? In the
    Inpol case, deleting data required changes to
    base algorithms on all levels because a
    re-arrangement/split of data was necessary.
  • Where are the main bottlenecks in the system?
  • Is horizontal scaling possible? (If not, specify
    an approach for failover.) Remember the facade
    (component) pattern, along whose lines machines
    can be split.
  • Is only vertical scalability possible? How far?

Architectural validation is the phase in a
project where these questions are answered. Don't
start a big project before you have those
answers. (Of course, if a big team has already
been assembled, you may not get the time to do
the basic validation; the results are well known.)
27
Use of extreme testing to reduce architectural
risk
Source: Ted Osborne, Empirix
28
Example 2 portal project
  • An enterprise portal combines different
    applications and data sources within an
    enterprise or across enterprises into a
    consistent and convenient view to clients.
  • Portals use many different kinds of
    infrastructure, protocols and services and are
    therefore distributed applications.
  • Portal projects are also notorious for their
    performance and reliability problems.

29
Dynamic and personalized homepage (mock-up)
Common functions: customize, filter, contact etc.
Common banner
Greeting: "Welcome Mrs. Rich, we would like to
point you to our new instrument X that fits
nicely to your current investment strategy."
Portfolio: Siemens, Swisskom, Esso, ...
Messages: 3 new; from foo: "hi Mrs. Rich"
News: IBM invests in company Y
Quotes: UBS 500, ARBA 200
Links: myweather.com, UBS glossary etc.
Research: Asian equity update
Charts: Sony
30
Portal Problem Analysis
  • Reliability
  • Performance, Caching and Architecture
  • GUI design
  • Implementation
  • Infrastructure
  • Maintenance
  • Management

31
System Architecture Diagram
  • The system architecture captures the main objects
    and their interactions. It describes the
    processing that happens within the system.

Important Try to capture the essence of the
system architecture in one diagram. It serves as
a communication tool between developers and to
other groups (management) as well. It also makes
system inherent problems more visible.
32
PortalPage Request Flow and Assembly
2
Profile
Synchronous HandlerGroup
3
Portal DB
Start()
1
Homepage Handler
5
Cache
Start()
Marketdata
Cache prefetch
Wait(timeout)
Cache fetch
Research
4
Image Handler
Telebanking
Asynchronous HandlerGroup
Quotes
6
News
Telebanking
Servlet Thread
Threadpool Thread
33
Reliability Problems
  • Java VM blows up in case of stalled backend
    requests
  • No service access layer to control availability
    of backend systems
  • Side-effects of internal threading

34
Java VM memory consumption during complex
homepage request
Memory Resources
CPU activity
Request start
completion
35
The need for a Service Access Layer
Market data Cache
SAL
Portal DB
Requests
Market Data service
The service access layer tracks backend system
connections and prevents requests from blocking
on dead connections. It also monitors connection
quality and provides an interface where an
asynchronous loader component can plug in.
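A minimal Service Access Layer sketch with illustrative
names (fetchQuote, callBackend, the pool size and the
disable flag are all assumptions): a timeout bounds every
backend call, and a failed call disables the connection
until a polling component re-enables it.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    public class ServiceAccessLayer {
        private final ExecutorService workers = Executors.newFixedThreadPool(8);
        private volatile boolean backendAlive = true; // re-set by a poller

        public String fetchQuote(String symbol, long timeoutMillis)
                throws Exception {
            if (!backendAlive) {
                throw new IllegalStateException("backend disabled"); // fail fast
            }
            Future<String> result = workers.submit(() -> callBackend(symbol));
            try {
                return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                result.cancel(true);  // never block on a dead connection
                backendAlive = false; // disabled until the poller re-enables it
                throw e;
            }
        }

        private String callBackend(String symbol) {
            return "42.0";            // placeholder for the real remote call
        }
    }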
36
Data Aggregation: What, Where and How?
Distribution Architecture
Service Access Layer
determines
  • Handle interface changes
  • Disable broken connections
  • Add new sources
  • Poll and re-enable sources
  • Keep statistics on sources
  • Sources, Protocols, Schemata
  • Data rates
  • Response times (average, over day, downtimes)
  • QOS (e.g. Realtime quotes)
  • Push/Pull
  • Security (encryption etc.)

determines
The SAL shields the portal from external
data/application sources
Reliability/ Performance
Problem analysis
37
Distribution Architecture
Getting this information requires tracking
backend services and writing test programs. The
results determine what can be combined on a
personalized homepage.
38
The need for an asynchronous loader
cache
DB
request
pre-load cache asynchronously
async. loader
The async. loader decouples synchronous request
time from asynchronous retrieve time. There is a
tight limit on what can be done in a distributed
system while a user is waiting.
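A sketch of such a loader, with illustrative names and an
assumed refresh interval: a background task refreshes the
cache on its own schedule, so the synchronous request path
only ever reads pre-loaded data.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class AsyncLoader {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        public void start() {
            // Pre-load the cache asynchronously; interval is an assumption.
            scheduler.scheduleAtFixedRate(this::refresh, 0, 20, TimeUnit.SECONDS);
        }

        private void refresh() {
            // Slow backend access happens here, off the request path.
            cache.put("news", loadFromBackend("news"));
        }

        public String get(String key) {
            return cache.get(key);    // fast: never blocks on the backend
        }

        private String loadFromBackend(String key) {
            return "...";             // placeholder for the DB/backend call
        }
    }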
39
Performance
  • Caching
  • Pooling
  • Threading
  • Synchronization
  • Synchronous vs. asynchronous requests

40
Performance, Caching and Architecture
  • No Information Architecture existed: information
    was not qualified with respect to aging and QOS.
  • Caching possibilities not used (http) or
    underestimated (20 secs are static!)
  • No compression or web accelerators used.
  • Architecture not fit to support caching (the
    "where and what" analysis was missing)
  • A large-scale portal needs a fragment
    architecture.
  • Tactical mistakes: no automatic service time
    control, no automatic DB connection hold control,
    internal threading introduced too early

41
Caching: Why, What, Where and How Much?
Information Architecture
System Architecture
  • Result Objects/Value Objects
  • Invalidation mechanism
  • Addressing of fragments
  • Cache Subsystem QOS (e.g. automatic re-load)

determine
  • Lifecycle
  • Fragmentation
  • QOS (e.g. Realtime quotes)

Caching possibilities
The DB is usually THE bottleneck in a large-scale
portal
Throughput/ Performance
Problem analysis
42
Information Architecture Lifecycle Aspects
For every bit of information you must know how
long it is valid and what invalidates it.
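One way to make this explicit is to tag every cached
fragment with its lifecycle data; a minimal sketch, where
the TTL value would come from the information architecture
(class and field names are assumptions).

    public class Fragment {
        private final String content;
        private final long loadedAt;   // set when the fragment was loaded
        private final long ttlMillis;  // from the information architecture

        public Fragment(String content, long ttlMillis) {
            this.content = content;
            this.loadedAt = System.currentTimeMillis();
            this.ttlMillis = ttlMillis;
        }

        // The cache can decide validity without guessing.
        public boolean isValid() {
            return System.currentTimeMillis() - loadedAt < ttlMillis;
        }

        public String content() {
            return content;
        }
    }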
43
How Information- and Distribution Architecture
drive the Portal

IA defines pieces of information to aggregate or
integrate
Profile server
Service Access
Aggregation
Integration / Interpretation
Ext. Service
Request
DA tells portal how to map/locate IA defined
fragments (separation of concerns)
Ext. Service
Portal DB
Back-ends
44
Cache fragments, locations and dependencies
(without client and proxy side caches)
Market data Cache
Domain Object Cache (Charts, News, Market Data,
User etc.)
SAL
Research Result Bean cache
Handlers
Full-Page Cache Per user
Portal DB
Quotes Result Bean cache
Controller Servlet
JSPs
Market Data service
News Result Bean cache
Fully processed Page
Page parts, processed
Distributed cache, raw data
Service Access Layer
45
Fragment Based Information Architecture
Channel Access Layer
Normalized Request Object
AL Fragment Cache
Aggregation layer
Profile Info, Personalization Rule Engine,
Authorization
invalidates
Fragment Description Instance
Integration layer
IL Fragment Cache
invalidates
Fragment Request
Object Dependency Graph
Service Access layer
notifies
Datacache 1
Datacache 2
Storage manager
Storage manager
Goal: minimize backend access through fragment
assembly (extension of IBM Watson research)
46
Physical Architecture
The physical architecture deals with reliability
issues (replication, high availability etc.) and
horizontal and/or vertical scalability. A
project's physical architecture needs to define
the scalability methods FROM THE BEGINNING
because of their influence on the overall system
architecture (e.g. distributed caching). A
horizontally scalable application can be
replicated on more hosts; it avoids a single
point of failure. If an application scales only
vertically, this means that one can only install
more CPUs or RAM in the single instance of the
application's host. This type of application has
limited scalability and availability (a so-called
HA application).
47
Physical Portal Architecture Web Cluster
Host (user data)
Auth Service
App. Server Clone
Web Proxy
E-BANK App.
Load Balancer
Web Server
Market Data
Internet Client
Clone
Market Data
App. Server Clone
Web Proxy
Web Server
Portal DB
Clone
F
F
F
F
App. Server Clone
Issues: load handling, SSL, failover, vertical
and horizontal scalability, firewalls and
authentication through SSO
Intranet Client
Clone
Web Proxy
Web Server
48
Physical Architecture Alternatives
several copies of the application on one or more
(smaller) hosts
one (big) application instance only
host1..n
large host
App. Server 1 JVM
App. Server 2 JVM
App. Server JVM
Portal DB
Portal DB
49
Session Failover
host1..n
App. Server 1 JVM
App. Server 2 JVM
Portal DB session state
Your application may hold session state. If it is
a requirement that a client session must not
break in case of a crashed application, then the
session state needs to be made persistent. A
different application server can then continue
the session. But watch out: this requirement has
serious consequences with respect to performance
and hardware costs.
50
Several Hosts Without Distributed Cache
Internet
Load balancer
host1
host2
App. Server 1 JVM
App. Server 2 JVM
App. Server 3 JVM
App. Server 4 JVM
cache1
cache2
cache3
cache4
X
X
X
X
BTW: ALL external sources suffer from multiple
access! Item X is loaded several times: a
performance AND consistency problem!
Portal DB
X
51
Several Hosts With Distributed Cache
Internet
Load balancer
host1
host2
App. Server 1 JVM
App. Server 2 JVM
App. Server 3 JVM
App. Server 4 JVM
Dist.Cache
X
Portal DB
X
Ext. service
52
(No Transcript)
53
Advanced VM clustering in SAP
Client
Processes dynamically load JVMs and also session
state for a single user. A JVM runs only one user
at a time. See "Unbreakable Java" (resources).
ICM/Dispatcher
Process
Process
Process
Process
Process
JVM
JVM
JVM
JVM
JVM
Session User1
Session User2
Session User3
Session User4
54
Clustering 101
  • Virtual IP: an architecture which hides
    individual hosts behind a common address.
    Consider putting all hosts into the same subnet
    to avoid layer-3 routing; this allows direct
    responses to clients.
  • Load balancing: always dynamic, or are fixed
    object references handed out (making the dynamic
    load-balancing algorithms useless)? Do you use
    hot-standby or shared-load approaches?
  • Timed tasks: how do you prevent them from
    running on all machines in the cluster? (e.g.
    synchronization through the DB; see the sketch
    after this slide's notes)
  • Session state: server affinity? Size? Network
    storage? Did you partition your cluster into
    service groups (e.g. 2 machines form one service
    group each) to avoid session data replication
    across all machines?
  • Files need to be available from all machines
    (e.g. config files). They can be a bottleneck if
    written concurrently.
  • Naming services: a single point of failure?
  • EJB component caching/replication: transactional
    safety? Are stubs replication-aware?
  • Primary key generation: cluster-safe?

Topics taken from D. Purcell, Moving to a Cluster
(see resources). How do the current J2EE
application servers solve these problems?
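For the timed-tasks item above, a common DB-synchronization
sketch (the task_lock table and all names are assumptions,
not from the slides): each node tries to insert a unique
row for the current run, and only the node whose insert
succeeds executes the task.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class ClusterTask {
        // Assumed table: CREATE TABLE task_lock (run_key VARCHAR PRIMARY KEY)
        public boolean tryAcquire(Connection db, String taskName, String runId) {
            String sql = "INSERT INTO task_lock (run_key) VALUES (?)";
            try (PreparedStatement stmt = db.prepareStatement(sql)) {
                stmt.setString(1, taskName + ":" + runId); // e.g. "cleanup:2024-01-01"
                stmt.executeUpdate();
                return true;   // insert succeeded: this node runs the task
            } catch (SQLException duplicate) {
                return false;  // unique-key violation: another node was first
            }
        }
    }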
55
Example 3 System Management
  • As soon as an application runs on several hosts
    or in several clones on one host we have the
    problem of application configuration and change
    management.
  • The problem gets worse when firewalls and
    demilitarized zones prevent direct access to the
    application servers.
  • How are applications made aware of configuration
    changes RELIABLY?

How do the current J2EE application servers solve
this problem?
56
Pull Model Update Problem
Internet
Load balancer
host2
App. Server 3 JVM
App. Server 4 JVM
System Management console changes X! How is this
change propagated to the individual clones or
hosts?
cache3
cache4
X
X
BTW: the application needs an update (push)
mechanism as well, e.g. if a user's rights change!
Portal DB
X
57
Push Model With Update Notifications
Internet
Load balancer
host1
host2
App. Server 1 JVM
App. Server 2 JVM
App. Server 3 JVM
App. Server 4 JVM
JMS Publish/ Subscribe Infrastructure
cache1
cache2
cache3
cache4
X
X
X
X
Because of the IA/DA definitions individual
fragments can be invalidated
The service access layers listen for changed
sources!
Portal DB
X
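A sketch of the subscribing side of this push model,
assuming a JMS invalidation topic whose messages carry the
key of the changed fragment; each clone evicts that entry
from its local cache, so the next request reloads it.

    import java.util.Map;
    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.jms.Topic;

    public class InvalidationListener {
        public void subscribe(ConnectionFactory factory, Topic topic,
                              Map<String, Object> localCache) throws Exception {
            Connection connection = factory.createConnection();
            Session session =
                    connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(topic);
            consumer.setMessageListener(message -> {
                try {
                    // The message body is the key of the changed fragment.
                    String key = ((TextMessage) message).getText();
                    localCache.remove(key); // evict; next request re-loads
                } catch (javax.jms.JMSException e) {
                    e.printStackTrace();    // real code would log and recover
                }
            });
            connection.start();             // begin receiving notifications
        }
    }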
58
Example 4 Data Replication Problems
Frequently, companies use application towers to
provide functionality for their customers. These
applications store e.g. personalization data.
Now business wants to hide these towers behind a
portal WHILE STILL ALLOWING clients to access the
individual applications. The business requirement
is as follows: a client accessing a functionality
through the portal will always see the latest
personalized information, without the need to
explicitly synchronize portal and application.
59
Solution 1
application
change settings
store settings
retrieve and compare settings
portal
login to portal
store settings
The business requirements are fulfilled: settings
changed by the application are effective in the
portal as well. The downside: the portal needs to
retrieve and compare all settings every time a
user logs in.
60
Solution 2
application
change settings
store settings
retrieve and store settings on demand
portal
login to portal
store settings
In this case the portal will initially retrieve
the application settings and store them in the
portal DB. From then on, the portal will always
use those settings during a log-on. The portal
will offer a synchronize button where users can
request a synchronization on demand. The
downside: the handling of settings is no longer
transparent to clients.
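A sketch of Solution 2's logic, with illustrative types
(SettingsStore, Settings are assumptions): settings are
copied into the portal DB on first login and reused
thereafter; the synchronize button maps to an explicit
re-copy.

    public class PortalSettings {
        interface SettingsStore {
            Settings load(String user);
            void store(String user, Settings settings);
        }
        static class Settings { /* personalization data */ }

        private final SettingsStore portalDb;    // the portal's own copy
        private final SettingsStore application; // the application tower

        public PortalSettings(SettingsStore portalDb, SettingsStore application) {
            this.portalDb = portalDb;
            this.application = application;
        }

        public Settings onLogin(String user) {
            Settings local = portalDb.load(user);
            if (local == null) {
                local = synchronize(user); // first login: initial copy
            }
            return local;                  // no backend call otherwise
        }

        // Bound to the "synchronize" button in the portal GUI.
        public Settings synchronize(String user) {
            Settings fresh = application.load(user);
            portalDb.store(user, fresh);
            return fresh;
        }
    }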
61
Data Replication Problems Maintenance
Besides the effects on performance, replication
of data always introduces a maintenance problem
as well: where is the master copy of the data?
Who is allowed to change the master, and where?
You will also need two GUIs to maintain the
replicas.
62
Resources (1)
  • Desaster Inpol-neu, Christiane Schulzki-Haddouti,
    c't 24/2001, pp. 108ff. A typical case of a
    large-scale, top-notch IT project gone foul.
  • Darrel Ince, Developing Distributed and
    E-commerce Applications. A very good introduction
    to all the topics necessary for building
    real-world apps, yet rather thin. The content
    and style come close to what is covered in this
    lecture.
  • David Purcell, Moving to a Cluster...
    www.sys-con.com/story/print.cfm?storyid=47354
  • IBM WebSphere clustering redpaper on
    www.redbooks.ibm.com
  • Luiz Andre Barroso et al., Web Search for a
    Planet: The Google Cluster Architecture.
    Describes an architecture optimized for
    read/search access rather than typical
    transaction processing. Compare the machine types
    and numbers with a large web shop.
  • Sing Li, High-impact Web tier clustering, Parts 1
    and 2: Scaling Web services using JavaGroups
    etc. (www.ibm.com/developerworks)
  • Thomas Smits, Unbreakable Java: A Java server
    that never goes down. Describes SAP's approach to
    creating reliable Java VM environments by
    separating session state from the VM using shared
    memory technology. Processes are also separated
    from VMs.

63
Resources (2)
  • L. Reith, Concept of data distribution for
    worldwide distributed services in
    service-oriented architectures,
    HDM/DaimlerChrysler 2007.
  • Tangosol distributed cache:
    http://www.infoq.com/presentations/distributed-caching-lessons
  • http://jroller.com/page/rolsen?entry=building_a_dsl_in_ruby1