Fault Tolerance in Distributed Systems

1
Fault Tolerance in Distributed Systems

Gökay Burak AKKUS
Cmpe516 Fault Tolerant Computing

2
Distributed Systems

Main focus on service-based systems
Web Services
Grid Computing...

3
Service Orientation

Services written in diverse programming languages
running on diverse platforms
spanning organisational boundaries
Service Oriented Architectures (SOA)
Web Services
Grid Computing
SOA is an architectural model that emphasises
properties of interoperability and location
transparency
Collection of services
each service can be considered as a resource that
is either provided or consumed

4
Dependability

Dependability is a collective term that
encompasses
Reliability
Performance
Maintainability
Security
Reliability is the part of dependability
concerned with the probability that a given
system will behave according to its requirements

5
SOAs

SOAs enable the development and integration of
complex systems by representing software
functionality as discoverable services on a network.
A traditional way to increase the dependability
of distributed systems is through the use of
fault tolerance techniques

The approach of design diversity
Multi-Version design (MVD)
availability of multiple functionally-equivalent
services

7
Comparison

Single-version system
Traditional MVD system
Provenance-aware MVD system

8
CMF

Common-mode failure (CMF)
If one of the shared services fails, the failure
may propagate back to the calling services.
Occurs when independent or non-independent faults
lead to similar errors across versions of an MVD
system.

Such failures are a worst-case scenario in a
fault-tolerant system, as they may pass through
the system undetected
It is often safer to return no result and alert an
operator and/or place the system in a safe state
than to allow an undetected error to occur.

11
CMF by failure of a shared service

reduces the confidence that can be placed in the
results of design diversity-based fault tolerance
schemes
Provenance is introduced as a solution to this
problem

12
Provenance

The provenance of a piece of data is the
documentation of the process that led to that data.
Provenance can be used for
verifying a process,
reproducing a process,
and providing context for a piece of result data

13
Provenance in the context of SOAs

interaction provenance
for some data, interaction provenance is the
documentation of interactions between actors that
led to the data
actor provenance
For some data, actor provenance is documentation
that can only be provided by a particular actor
pertaining to the process that led to the data
In a workflow-based SOA, interaction provenance
provides a record of the invocations of all the
services that are used in a given workflow,
including the input and output data of the
various invoked services.

14
Usage of provenance

Through an analysis of interaction provenance,
patterns in workflow execution can be detected
Knowing whether a common service was invoked by
several other services in a workflow lets a fault
tolerance algorithm determine whether faults in
the workflow stem from the misbehaviour of a
single service.
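
As a minimal sketch of this kind of analysis, the Python below groups a list of provenance records by invoked service and reports the services that more than one channel depends on. The record shape (a channel name paired with a service name) and the channel names ID1 to ID5 are assumptions made for illustration; they are not the format used by any particular provenance store.

```python
from collections import defaultdict


def shared_services(invocations):
    """Map each service to the set of channels whose workflow invoked it,
    keeping only services that are used by more than one channel."""
    callers = defaultdict(set)
    for channel, service in invocations:
        callers[service].add(channel)
    return {s: chans for s, chans in callers.items() if len(chans) > 1}


# Hypothetical provenance extract: (MVD channel, service invoked in its workflow).
records = [
    ("ID1", "ER1"),
    ("ID2", "ER3"),
    ("ID3", "ER3"),
    ("ID4", "ER4"),
    ("ID5", "ER2"),
]

shared = shared_services(records)
```

Here ER3 is the only service reached by two channels, so a fault in ER3 could make ID2 and ID3 fail in the same way.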

Provenance provides a picture of a system's
current and past operational state, which can be
used to isolate and detect faults
A scheme that performs voting on the results of
functionally-equivalent services in order to mask
faults of the fault model (next slide) is proposed

16
(No Transcript)
17
(No Transcript)
18
PReServ

Provenance Recording for Services
a Java-based Web Services implementation of the
Provenance Recording Protocol
Makes an SOA provenance-aware using three components
A provenance store that stores, and allows for
queries of provenance
A client side library for communicating with the
provenance store
A handler for the Apache Axis Web Service
container that automatically records interaction
provenance for Axis based services and clients by
recording incoming and outgoing SOAP messages in
a specified provenance store.
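
The idea of a handler that records every incoming and outgoing message can be sketched in Python as a wrapper around a service call. This is not the PReServ or Apache Axis API: `ProvenanceStore`, `recorded`, and the service names are hypothetical stand-ins, and the store is a plain in-memory list.

```python
import functools


class ProvenanceStore:
    """In-memory stand-in for a provenance store (hypothetical)."""

    def __init__(self):
        self.records = []

    def record(self, caller, service, request, response):
        self.records.append(
            {"caller": caller, "service": service,
             "request": request, "response": response}
        )

    def query(self, service):
        """Return every recorded interaction with the given service."""
        return [r for r in self.records if r["service"] == service]


def recorded(store, caller, service):
    """Wrap a service call so its input and output are logged in the store."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            response = func(*args, **kwargs)
            store.record(caller, service, {"args": args, "kwargs": kwargs}, response)
            return response
        return wrapper
    return decorator


store = ProvenanceStore()


@recorded(store, caller="ImportDuty1", service="ExchangeRate1")
def exchange_rate(currency):
    # Hypothetical fixed rates, used only to have something to record.
    return {"USD": 1.0, "EUR": 0.9}[currency]


rate = exchange_rate("EUR")
```

The service code itself stays unaware of provenance; recording happens entirely in the wrapper, which mirrors the role the slide gives to the container handler.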

19
MVD system

A service i invokes k services in its workflow
A counter Ck stores the number of times a service
k is invoked by MVD channel workflows in the
system.
If i produces a result that agrees with the
consensus result, then Sk is increased by one for
every service k in that service's workflow;
otherwise Sk is set to 0.
The weighting of each service k is then calculated
from Ck and Sk.
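
The bookkeeping described above can be sketched as follows. The counter updates follow the slide text; the `weight` method is an assumption, since the slide's own formula is not reproduced in the transcript, and the simple ratio Sk / Ck is used here only as a placeholder. The class and variable names are hypothetical.

```python
from collections import defaultdict


class ServiceHistory:
    """Tracks C_k and S_k for every service k seen in channel workflows."""

    def __init__(self):
        self.invoked = defaultdict(int)  # C_k: times k was invoked
        self.agreed = defaultdict(int)   # S_k: agreement counter

    def update(self, workflow, agrees_with_consensus):
        """Apply one channel result to every service in that channel's workflow."""
        for k in workflow:
            self.invoked[k] += 1
            if agrees_with_consensus:
                self.agreed[k] += 1
            else:
                self.agreed[k] = 0  # reset on disagreement, as the slide states

    def weight(self, k):
        # ASSUMPTION: S_k / C_k is a placeholder ratio, not the slide's formula.
        count = self.invoked.get(k, 0)
        return self.agreed.get(k, 0) / count if count else 0.0


history = ServiceHistory()
history.update(["ER1"], True)
history.update(["ER1"], True)
history.update(["ER3"], True)
history.update(["ER3"], False)
```

With this placeholder ratio, a service that always agrees keeps a weighting of 1.0, while a single disagreement drops the weighting to 0 until agreement builds up again.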

20
Voting

The FT Grid system is used for voting
Results are eliminated based on their weightings
User-defined values are also applied in the voting
process

If a service k1 has a degree of 1, then only one
MVD channel invokes that service
If k1 has a degree of 2, then two MVD channels
invoke it
The weightings Sk are then biased based on
user-defined settings
Example
A user specifies a bias of 0.95 for a service with
a degree of 2
Then the final weighting Wi of a service i with a
degree of 2 is
Wi = Si × 0.95
If any service within a given channel falls below
a user-defined minimum weighting, then that
channel is discarded from the voting process.
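
A small sketch of the biasing and discard step, under stated assumptions: the 0.95 bias for degree 2 comes from the slide's example, while the function names, the channel layout, the raw weightings of 0.98, and the minimum weighting of 0.95 are invented for illustration.

```python
def biased_weight(weight, degree, bias_by_degree):
    """Scale a service weighting by the user-defined bias for its degree.
    Degrees without a configured bias are left unchanged."""
    return weight * bias_by_degree.get(degree, 1.0)


def surviving_channels(channels, weights, degrees, bias_by_degree, min_weight):
    """Keep only channels in which every service meets the minimum weighting."""
    kept = []
    for name, workflow in channels.items():
        final = [biased_weight(weights[k], degrees[k], bias_by_degree)
                 for k in workflow]
        if all(w >= min_weight for w in final):
            kept.append(name)
    return kept


# Hypothetical setup: channels B and C both depend on ER3 (degree 2).
channels = {"A": ["ER1"], "B": ["ER3"], "C": ["ER3"]}
weights = {"ER1": 0.98, "ER3": 0.98}
degrees = {"ER1": 1, "ER3": 2}

kept = surviving_channels(channels, weights, degrees,
                          bias_by_degree={2: 0.95}, min_weight=0.95)
```

ER3's weighting drops to about 0.93 after the bias, so both channels that share it are excluded from voting and only channel A remains.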

22
Experiments

a total of 12 web services developed and spread
across 5 machines
using Apache Tomcat/Axis as a hosting environment
each with provenance functionality, and each
registered with a UDDI server.
5 Import Duty services developed
4 Exchange Rate services developed
3 Tax Lookup services developed

24

simulate a design defect and/or malicious attack
by perturbing code in two of the exchange rate
services ER3 and ER4
probability of failure (in this case, returning
an incorrect value) of 0.33 and 0.5 respectively.
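
One way to picture this kind of fault injection is a wrapper that returns a perturbed value with a fixed probability. The sketch below is an assumption about how such a perturbation could look, not the experiment's actual code; `faulty`, `exchange_rate`, and the conversion factor are hypothetical, and only the probabilities 0.33 and 0.5 come from the slide.

```python
import random


def faulty(service, p_fail, rng):
    """Wrap a numeric service so it returns a wrong value with probability p_fail."""
    def wrapper(*args):
        correct = service(*args)
        if rng.random() < p_fail:
            return correct + 1.0  # deliberately incorrect result
        return correct
    return wrapper


def exchange_rate(amount):
    # Hypothetical fault-free service with a made-up conversion factor.
    return amount * 1.25


rng = random.Random(42)  # seeded so runs are repeatable
er3 = faulty(exchange_rate, 0.33, rng)
er4 = faulty(exchange_rate, 0.5, rng)


def failure_rate(service, trials=10_000):
    """Fraction of calls whose result differs from the fault-free service."""
    wrong = sum(service(100.0) != exchange_rate(100.0) for _ in range(trials))
    return wrong / trials
```

Calling `failure_rate(er3)` and `failure_rate(er4)` should give values close to the configured probabilities, which is a quick check that the injection behaves as intended.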

25
Applied Experiments

Experiment 1
Executes a single-version client-side application
that invokes a random import duty service,
passing it a randomly generated set of
parameters.
The application then compares the result it
receives against the fault-free local import duty
service, and logs whether or not a correct answer
has been returned.

Experiment 2
Executes a client-side MVD application with no
provenance capability.
The application invokes all 5 import duty services
and waits for the first three results to be
returned.
It discards the results of any import duty service
whose weighting falls below a user-defined value,
and performs consensus voting on the remaining
results.
If no consensus is reached, or fewer than three
channels remain to vote on, the client waits for
an additional MVD channel to return results,
checks that channel's weighting to see whether it
should be discarded, and then votes accordingly.
This continues until either consensus is reached
or all 5 channels have been invoked.
The results are then compared against the
fault-free local import duty service.

Experiment 3
Executes an MVD client-side application with
provenance capability.
The client invokes all 5 import duty services and
waits for the first three results to be returned.
It analyzes the provenance records of these
channels, and discards the results of any channel
that includes a service that falls below a
minimum, user-defined weighting.
If no consensus is reached, or fewer than three
channels remain to vote on, the MVD application
waits for an additional channel to return results,
checks whether this channel should be discarded,
and then votes accordingly.
This continues until either consensus is reached
or all 5 channels have been invoked.
Results from the voter are then compared against
the local fault-free import duty service.
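
The voting loop shared by the two MVD experiments can be sketched as below. Two points are assumptions: "consensus" is modelled as a strict majority of the accepted results, which the slides do not define, and the trust check is passed in as a function so the same loop covers both the plain and the provenance-aware case. Channel names and result values are hypothetical.

```python
from collections import Counter


def consensus(results):
    """Return the value held by a strict majority of results, or None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


def incremental_vote(arrivals, is_trusted, quorum=3):
    """Vote as results arrive; arrivals is an ordered list of (channel, result).

    Untrusted channels are discarded. Voting starts once `quorum` results are
    accepted and continues with each new arrival until consensus is reached
    or no channels remain.
    """
    accepted = []
    for channel, result in arrivals:
        if not is_trusted(channel):
            continue
        accepted.append(result)
        if len(accepted) >= quorum:
            winner = consensus(accepted)
            if winner is not None:
                return winner
    return None


# Hypothetical run: ID2 and ID3 share a faulty service and return the same wrong value.
arrivals = [("ID1", 10.0), ("ID2", 12.0), ("ID3", 12.0), ("ID4", 10.0), ("ID5", 10.0)]

without_provenance = incremental_vote(arrivals, lambda channel: True)
with_provenance = incremental_vote(arrivals, lambda channel: channel not in {"ID2", "ID3"})
```

In this made-up run the plain voter accepts the wrong value 12.0 after three results, which is a common-mode failure, while the voter that discards the two channels sharing a suspect service returns the correct 10.0.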

28
Experimental Results

Each experiment iterates 1000 times
Each experiment is repeated three times.
test system
Apache Tomcat 5.0.28
Web Services implemented using Apache Axis 1.1
5 dual 3 GHz Xeon processor machines
Fedora Core Linux 2

29
Generation of Weightings

A history-based weighting scheme is used
A client application similar to the
provenance-aware MVD scheme is run
History weightings are based on the consensus
results of 1000 invocations of all five import
duty services
No logging or verification of results

32

the weightings of ER3 and ER4 show significant
deviations
This is due to the faults that are injected into
ER3 and ER4
Based on the results
minimum acceptable weightings are set

34
Experiment 1: Single-version system with no
provenance capability

1000 tests on a random import duty service
164 incorrect results
16.4% undetected incorrect results
Time for UDDI query of import duty service:
279.72 ms
Total time until a result: 3895 ms

36

Common-mode failures are frequent
Each channel has approximately the same weighting
value, as there is no provenance data
So unreliable channels are not discarded from
voting
Total time for a result: 4842 ms
About 1 s longer than the single-version system
(3895 ms)

38
MVD system with provenance capability

Not a single common-mode failure occurs
Timing is approximately the same as in
Experiment 2

40
Conclusion

Solutions for the provision of dependability in
service-oriented architectures are needed
Approach: extend the concept of
design-diversity-based fault tolerance schemes
(such as multi-version design) to the
service-oriented paradigm
Leverage the benefits of SOAs in order to produce
cheaper MVD systems than has traditionally been
the case
Problem: without knowledge of the workflow of
the services that form channels within the MVD
system, the potential arises for multiple
channels to depend on the same service
This leads to an increased incidence of
common-mode failure

41
Conclusion

The technique of using provenance to analyze a
service's workflow is proposed
An initial scheme that uses provenance to
calculate weightings of channels within an MVD
system based on their workflow is detailed
A system is implemented to demonstrate the
effectiveness of the scheme
Three different client applications are used to
test the approach
Single-version system: fails on 16.4% of test
iterations
Traditional MVD fault tolerance: fails on 7.6% of
test iterations
Provenance-aware MVD scheme: failure rate of 0.6%
More dependable: no common-mode failures occur,
with negligible performance overhead
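
To put the three failure rates side by side, the short calculation below expresses each as a multiple of the provenance-aware rate. It assumes the figures quoted on the slide are percentages of test iterations; the variable names are hypothetical.

```python
# Failure rates from the slide, read as percentages of test iterations.
rates = {
    "single-version": 16.4,
    "traditional MVD": 7.6,
    "provenance-aware MVD": 0.6,
}

baseline = rates["provenance-aware MVD"]

# How many times more often each scheme fails than the provenance-aware one.
ratios = {name: rate / baseline for name, rate in rates.items()}
```

On these figures the single-version system fails roughly 27 times as often as the provenance-aware scheme, and the traditional MVD system roughly 13 times as often.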

42
Finally

This paper
Details the potential for provenance data to be
used during the voting process of an MVD scheme
Implements an initial proof-of-concept for the
approach
Future work will include
investigating how to obtain QoS indicators from
the metadata of each service in an MVD channel's
workflow (facilitated through actor provenance)
and applying these to the weighting algorithm
investigating the relationship between shared
components and common-mode failure in more detail
(to tune the voting scheme more finely)
