François Taïani, Lancaster University (formerly LAAS-CNRS)

1
A Multi-Level Meta-Object Protocol for
Fault-Tolerance in Complex Architectures
  • François Taïani, Lancaster University (formerly
    LAAS-CNRS)

2
Motivation
  • Increasingly complex computer systems (COTS /
    layers) are used for increasingly critical
    applications.
  • Most COTS have not been built with dependability
    in mind.
  • Dependability is a system-wide issue.

⇒ How can fault tolerance be added to complex
multi-layered software systems in a transparent
and disciplined way?
3
The Vision
4
Outline
  • Reflection & Fault-Tolerance
  • Limitation: Motivating Example
  • A New Multi-Level MOP: Concepts & Design
  • Practical Application: CORBA & Linux

5
What is Reflection?
  • "the ability of a system to think and act about
    itself"

[Figure: the meta-level observes and controls the base level (the original system) through meta-interfaces; the meta-model acts as a "generic connector"]
  • separating fault-tolerance from functional
    concerns

6
What are Meta-Object Protocols?
  • A particular way of organising a reflective system

[Figure: meta-objects attached to base objects through the MOP]
7
Former Work
  • F. Kon, F. Costa, G. Blair, R. H. Campbell, The
    Case for Reflective Middleware, Communications
    of the ACM, 45(6), 2002, pp. 33-38
  • J.-P. Fassino, J.-B. Stefani, J. Lawall, G.
    Muller, THINK: A Software Framework for
    Component-based Operating System Kernels, Usenix
    Annual Technical Conference, Monterey (USA),
    2002, pp. 73-86
  • B. Garbinato, R. Guerraoui, K. R. Mazouni,
    Implementation of the GARF Replicated Objects
    Platform, Distributed Systems Engineering
    Journal, 2(1), 1995, pp. 14-27
  • G. Agha, S. Frolund, R. Panwar, D. Sturman, A
    Linguistic Framework for Dynamic Composition of
    Dependability Protocols, DCCA-3, Palermo
    (Sicily), Italy, 1993, pp. 197-207

8
Reflection & Fault Tolerance
  • Reflection has been used to add fault tolerance
    (FT) to complex systems, but
  • only one level of abstraction at a time has been
    considered so far.

Single-level reflection ⇒ limited fault tolerance
9
Motivating Example: Replication & Multithreading
  • Goal: transparent replication of a CORBA server
      • multi-layer: POSIX (OS) + CORBA (middleware)
      • multithreaded: concurrent processing of requests
      • thread pool: upper limit on concurrency

[Figure: client and server, each on a CORBA / OS stack, with a replication layer]
10
Motivating Example: Replication & Multithreading
  • Problem 1: state capture / restoration
      • application state
      • middleware & OS state

[Figure: replication layer on top of CORBA and the OS]
11
Motivating Example: Replication & Multithreading
  • Problem 2: control of non-determinism
      • assumption: multi-threading is the only source of
        non-determinism
      • how to replicate non-deterministic mutex
        decisions?

[Figure: replication layer on top of CORBA and the OS]
12
Enforcing Determinism: OS Only
  • The same lock allocation can be enforced on all
    replicas.
  • All replicas reach the same state.
  • Only a small subset of the lock allocations
    impacts determinism.

[Figure: two replicas (application / middleware / OS) connected through the network]
Up to 203 synchronisation operations per request in
the middleware (ORBacus); TAO: 52, omniORB: 64
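
As a rough illustration of "enforcing the same lock allocation on all replicas" (not taken from the slides), the sketch below replays a recorded sequence of lock grants on a follower replica. The names ReplayLock, replay_lock_init, replay_lock_acquire and replay_lock_release, and the use of small integers as logical thread ids, are assumptions made for this example only.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: grants a lock in a previously recorded order. */
typedef struct {
    pthread_mutex_t guard;    /* protects the fields below */
    pthread_cond_t  changed;  /* signalled whenever the lock is released */
    const int      *order;    /* recorded grant order (logical thread ids) */
    size_t          length;   /* number of recorded grants */
    size_t          next;     /* index of the next grant to replay */
    int             held;     /* non-zero while some thread owns the lock */
} ReplayLock;

void replay_lock_init(ReplayLock *l, const int *order, size_t length) {
    pthread_mutex_init(&l->guard, NULL);
    pthread_cond_init(&l->changed, NULL);
    l->order = order;
    l->length = length;
    l->next = 0;
    l->held = 0;
}

/* Blocks until it is thread_id's turn in the recorded order.
 * Returns 0 on success, -1 once the recorded order is exhausted. */
int replay_lock_acquire(ReplayLock *l, int thread_id) {
    int rc = -1;
    pthread_mutex_lock(&l->guard);
    while (l->next < l->length &&
           (l->held || l->order[l->next] != thread_id))
        pthread_cond_wait(&l->changed, &l->guard);
    if (l->next < l->length) {
        l->held = 1;
        l->next++;
        rc = 0;
    }
    pthread_mutex_unlock(&l->guard);
    return rc;
}

/* Releases the lock and wakes up threads waiting for their turn. */
void replay_lock_release(ReplayLock *l) {
    pthread_mutex_lock(&l->guard);
    l->held = 0;
    pthread_cond_broadcast(&l->changed);
    pthread_mutex_unlock(&l->guard);
}
```

Replaying every one of the synchronisation operations counted above in this way would be costly, which is why only the subset that impacts determinism should go through such a mechanism.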
13
Smart Multi-Level Reflection
  • With middleware and application semantics:
      • OS-level actions can be given higher-level
        semantics.
      • These semantics allow optimal use of OS-level
        reflection.

[Figure: two application / middleware / OS stacks; caption: reification of application & middleware activity]
14
Rationale behind Multi-Level Reflection
  • Complex systems contain heterogeneous abstraction
    levels ⇒ the available (meta-)information is
    heterogeneous.
  • Higher levels:
      • rich semantics,
      • but they lack information / control.
  • Lower levels:
      • complete information / control,
      • but they lack semantics.
  • Complementary roles: lower-level information &
    control need to be enriched with higher-level
    semantics.

15
Implementing Multi-Level Reflection
  • Goal: to provide a multi-reflective framework for
    the fault tolerance of complex, non-reflective
    industrial platforms
  • Challenges:
      • Requirements: what kind of information is needed
        by fault-tolerance mechanisms? Where should this
        information be found?
      • Design: how to design a multi-level meta-object
        protocol that supports multi-level reflection?
      • Instrumentation: how to instrument an industrial,
        non-reflective platform in a non-invasive,
        transparent way?

16
Requirements
  • Example: CORBA middleware determinism

17
Requirements
  • Multi-level nature of the requirements

18
Requirements
  • The corresponding meta-interface

interface MetaRequestLifecycle {
  /* Communication */
  requestHasBeenReceived (RequestID)
  replyHasBeenSent (RequestID)
  /* Control Path */
  requestBeforeApplication (RequestID)
  requestAfterApplication (RequestID)
  /* Synchronisation */
  requestBeforeContentionPoint (RequestID, RequestContentionPoint)
  requestAfterContentionPoint (RequestID, RequestContentionPoint)
}
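
One possible way to render this meta-interface in C is a table of callbacks that a meta-object fills in. The sketch below is only an illustration: the slides do not give the types, so the definitions of RequestID and RequestContentionPoint, the extra state argument, and the ContentionCounter meta-object are all assumptions.

```c
#include <stddef.h>

typedef unsigned long RequestID;           /* assumed representation */
typedef struct RequestContentionPoint {    /* assumed; not detailed in the slides */
    const char *name;
} RequestContentionPoint;

/* Callback table mirroring the three facets of the meta-interface. */
typedef struct MetaRequestLifecycle {
    void *state;  /* private data of the meta-object (assumption) */
    /* Communication */
    void (*requestHasBeenReceived)(void *state, RequestID id);
    void (*replyHasBeenSent)(void *state, RequestID id);
    /* Control path */
    void (*requestBeforeApplication)(void *state, RequestID id);
    void (*requestAfterApplication)(void *state, RequestID id);
    /* Synchronisation */
    void (*requestBeforeContentionPoint)(void *state, RequestID id,
                                         const RequestContentionPoint *cp);
    void (*requestAfterContentionPoint)(void *state, RequestID id,
                                        const RequestContentionPoint *cp);
} MetaRequestLifecycle;

/* Hypothetical meta-object: counts reified contention points. */
typedef struct {
    unsigned before;
    unsigned after;
} ContentionCounter;

static void count_before(void *state, RequestID id,
                         const RequestContentionPoint *cp) {
    (void)id; (void)cp;
    ((ContentionCounter *)state)->before++;
}

static void count_after(void *state, RequestID id,
                        const RequestContentionPoint *cp) {
    (void)id; (void)cp;
    ((ContentionCounter *)state)->after++;
}

/* Builds a meta-object that implements only the synchronisation facet;
 * the other callbacks stay NULL and must be checked before being called. */
MetaRequestLifecycle make_contention_counter(ContentionCounter *c) {
    MetaRequestLifecycle m = {0};
    m.state = c;
    m.requestBeforeContentionPoint = count_before;
    m.requestAfterContentionPoint = count_after;
    return m;
}
```

A replication mechanism would plug its own callbacks into the same table, which keeps its algorithmic core independent of how the platform is instrumented.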
20
Semantics and Architecture
  • Example: middleware non-determinism
      • Some mutex operations must be intercepted at
        OS level,
      • but not all of them (otherwise highly
        inefficient).
      • Question: how to distinguish between mutexes that
        are relevant and those that are not?
  • Proposal: use of semantic context
      • We need to understand the purpose of OS-level
        mutex operations in the more general context of
        the whole system activity.
      • Approach: backtracking the computation
        that results in a low-level OS operation being
        called.
      • Simplest backtracking approach: stack inspection.
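
To make stack inspection concrete, here is a minimal sketch using glibc's backtrace() and backtrace_symbols() from <execinfo.h>. It is an assumption that such a facility is available on the target platform; the helper names current_stack_depth and stack_mentions are hypothetical, and this is not presented as the mechanism used in the prototype.

```c
#include <execinfo.h>  /* backtrace(), backtrace_symbols(): glibc extension */
#include <stdlib.h>
#include <string.h>

#define MAX_FRAMES 64

/* Returns the number of return addresses captured from the calling
 * thread's stack (at most MAX_FRAMES). */
int current_stack_depth(void) {
    void *frames[MAX_FRAMES];
    return backtrace(frames, MAX_FRAMES);
}

/* Returns 1 if the symbol string of any frame on the current stack
 * contains `needle`, else 0. Function names are only resolved when the
 * binary exports its symbols (for example when linked with -rdynamic). */
int stack_mentions(const char *needle) {
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);
    char **symbols = backtrace_symbols(frames, depth);
    int found = 0;
    if (symbols == NULL)
        return 0;
    for (int i = 0; i < depth && !found; i++)
        found = strstr(symbols[i], needle) != NULL;
    free(symbols);  /* only the array itself must be freed */
    return found;
}
```

An intercepted mutex creation could, for instance, call stack_mentions() with the name of a middleware function to decide whether the new mutex matters. The cost of walking and symbolising the stack on every call is one reason to look for a cheaper way of carrying context between levels.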

21
Semantic Joint Points
[Figure: call path from the middleware, through its internal threading library, down to pthread_mutex_init() (mutex creation) at the OS level]
22
Meta-markers
  • To leverage the notion of semantic context, a
    mechanism is needed to transport information
    between different abstraction levels.
  • A mechanism found in plants: in periods of
    drought, the root system communicates with the
    foliage using dedicated chemical substances called
    phytohormones.
  • Phytohormones travel through the sap.
  • The design is based on this metaphor:
      • Sap = threads
      • Phytohormones = meta-markers
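
The metaphor can be sketched in C with thread-local storage: a marker is attached to the current thread at a higher level and read back at a lower level, without changing any function signature in between. The MetaMarker structure and the meta_marker_* functions below are hypothetical names chosen for this illustration.

```c
#include <stddef.h>

/* Hypothetical marker: a label carried by the current thread. */
typedef struct MetaMarker {
    const char        *purpose;  /* e.g. "request_queue" */
    struct MetaMarker *outer;    /* marker that was active before this one */
} MetaMarker;

/* One marker chain per thread (C11 thread-local storage): the marker
 * travels with the thread, like a phytohormone travels with the sap. */
static _Thread_local MetaMarker *active_marker = NULL;

/* Higher level: attach a marker before calling into lower levels. */
void meta_marker_push(MetaMarker *m, const char *purpose) {
    m->purpose = purpose;
    m->outer = active_marker;
    active_marker = m;
}

/* Higher level: detach the innermost marker once the call returns. */
void meta_marker_pop(void) {
    if (active_marker != NULL)
        active_marker = active_marker->outer;
}

/* Lower level: purpose carried by the current thread, or NULL if none. */
const char *meta_marker_current(void) {
    return active_marker != NULL ? active_marker->purpose : NULL;
}
```

Because the chain is thread-local, markers set by one thread are never visible to another, so concurrent requests do not interfere with each other's context.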

23
Inter-Level Communication with Meta-Markers
[Figure: a thread execution path crossing from a higher level to a lower level, shown at both the base level and the meta-level]
24
Using Meta-Markers for MOP Design
  • Meta-markers can be used to design a multi-level
    MOP.
  • Example: synchronisation facet for middleware
    determinism

interface MetaRequestLifecycle {
  ...
  /* Synchronisation */
  requestBeforeContentionPoint (RequestID, RequestContentionPoint)
  requestAfterContentionPoint (RequestID, RequestContentionPoint)
}
  • Two issues to be solved by meta-markers:
      • P1: the global semantic context of mutex creation
        must be captured by meta-markers.
      • P2: meta-markers must ensure a correct
        instrumentation of the selected mutexes.

25
Capturing Semantics
  • Problem P1 is solved by source code annotation of
    semantic joint points.

init_and_run_middleware(..)
    init_request_queue(..)          <- mutexes created here are relevant for determinism
    init_some_refcount_object(..)   <- mutexes created here are not
    ...
    run_ORB()
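
The annotation idea can be sketched in C roughly as follows: the middleware sets a per-thread marker around the code that creates relevant mutexes, and a mutex factory consults that marker to decide, once and for all, whether the new mutex is instrumented. The MetaMutex layout and the functions mark_relevant and meta_mutex_* are assumptions made for this example; they are not the prototype's actual API.

```c
#include <pthread.h>

/* Hypothetical per-thread marker, set by the annotated middleware code. */
static _Thread_local int relevant_for_determinism = 0;

/* Hypothetical wrapper around a POSIX mutex. */
typedef struct {
    pthread_mutex_t mutex;
    int             instrumented;   /* decided once, at creation */
    unsigned long   reified_locks;  /* lock operations seen by the meta-level */
} MetaMutex;

/* Annotation placed at a semantic joint point in the middleware. */
void mark_relevant(int on) {
    relevant_for_determinism = on;
}

/* Factory: the marker active at creation time decides instrumentation. */
int meta_mutex_init(MetaMutex *m) {
    m->instrumented = relevant_for_determinism;
    m->reified_locks = 0;
    return pthread_mutex_init(&m->mutex, NULL);
}

/* Only instrumented mutexes report their lock operations; the counter
 * stands in for a call to the meta-level and is updated while the
 * mutex is held. */
int meta_mutex_lock(MetaMutex *m) {
    int rc = pthread_mutex_lock(&m->mutex);
    if (rc == 0 && m->instrumented)
        m->reified_locks++;
    return rc;
}

int meta_mutex_unlock(MetaMutex *m) {
    return pthread_mutex_unlock(&m->mutex);
}
```

With this arrangement, a mutex created while the marker is set (the request queue case) is reified on every lock, while one created outside it (the reference-count case) costs a single integer test.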
26
Meta-Markers as Meta-Mutex Factories
[Figure: thread execution path from the middleware down to the OS, shown across the base level and the meta-level]
27
Back to the Meta-Interface
interface MetaRequestLifecycle {
  /* Communication */
  requestHasBeenReceived (RequestID)
  replyHasBeenSent (RequestID)
  /* Control Path */
  requestBeforeApplication (RequestID)
  requestAfterApplication (RequestID)
  /* Synchronisation */
  requestBeforeContentionPoint (RequestID, RequestContentionPoint)
  requestAfterContentionPoint (RequestID, RequestContentionPoint)
}
28
Implementation
  • Multi-level interception framework to control
    non-determinism: 8000 LoC of C, based on CORBA and
    POSIX only, hence platform independent.

29
Case Study: ORBacus
  • Behavioural analysis: a reverse-engineering tool
    dedicated to complex multi-layer systems (20 000
    LoC in C & Java)

[Figure legend: object creation, thread creation, class, method call]
30
Related Publications
  • F. Taïani, J.-C. Fabre, M.-O. Killijian, A
    Multi-Level Meta-Object Protocol for
    Fault-Tolerance in Complex Architectures, The
    IEEE/IFIP Int. Conf. on Dependable Sys. and
    Networks (DSN-05), 2005
  • F. Taïani, J.-C. Fabre, M.-O. Killijian, Towards
    Implementing Multi-Level Reflection for
    Fault-Tolerance, The IEEE/IFIP Int. Conf. on
    Dependable Sys. and Networks (DSN-03), 2003
  • F. Taïani, J.-C. Fabre, M.-O. Killijian,
    Principles of Multi-Level Reflection
    for Fault-Tolerant Architectures, The IFIP 2002
    Pacific Rim Int. Symp. on Dependable Computing
    (PRDC'2002), 2002
  • F. Taïani, CosmOpen: A Reverse-Engineering Tool
    for Complex Open-Source Architectures,
    Student Forum of the IEEE/IFIP Int. Conf. on
    Dependable Sys. and Networks, 2003

31
Conclusion
  • There is a tension between comprehensive and
    adaptable fault tolerance and the multi-component,
    multi-layered nature of modern complex software
    systems.
  • Our proposal to resolve this tension: Multi-Level
    Reflection (MLR)
      • Combines reflective capabilities found in lower
        and higher levels in a global system overview.
  • MLR is supported by a multi-level MOP based on:
      • semantic joint points
      • meta-markers
  • Outlook: Aspect Orientation
      • Deep Aspects
      • Make aspects aware of software thickness

32
The End.
33
The Resulting Approach
34
OS Level Only
  • The same thread scheduling can be enforced on all
    replicas.
  • All replicas reach the same state.
  • But this over-constrains the replicas' execution:
      • It is impossible to relate OS-level activities to
        request processing.
      • All lock operations must be replicated.

[Figure: on replica 1 and replica 2, threads T1 and T2 process requests R1 and R2 and access a shared variable X]
Replication of every lock decision ⇒ highly
inefficient
35
Smart Scheduling Replication
  • With CORBA and application semantics:
      • OS-level actions can be given a meaning.
      • These semantics allow optimal use of OS-level
        reflection.
      • Which thread executes which request does not
        matter.

[Figure: on replica 1 and replica 2, threads T1 and T2 process requests R1 and R2 and access a shared variable X; annotation: no need to replicate this scheduling decision]
In practice, only 1.5% of the scheduling has to be
replicated!
36
Capturing Fault-Tolerance Needs
  • My proposal: Reflective Footprints
      • They explicitly capture the reflexive capacities
        that are needed by a family of mechanisms.
      • They uncouple the algorithmic core from the
        concrete instrumentation.
      • They are architecture neutral.
  • Example: replication

37
Instrumentation
  • The CORBA-POSIX mapping is generic.
  • Instrumentation on GNU/Linux & ORBacus:
      • The concrete architecture must be bound to the
        generic mapping.
      • Complex reverse-engineering: ORBacus > 110 000
        LoC
      • Important abstraction effort (dedicated tool:
        CosmOpen)
  • Interface-centered approach: roots /
    foliage metaphor

38
The Multi-Level Meta-Model...
  • Meta-model centered on the lifecycle of a CORBA
    request
  • aggregates OS-level synchronization and request
    lifecycle

[Figure: request lifecycle. Phases: request reception, request pre-processing, request in application, request post-processing, sending of reply. Events: ReceptionStart, ReceptionEnd, RequestBeforeApplication, RequestAfterApplication, ReplyStart, ReplyEnd, and RequestContentionPoint (OS-level synchronization)]
39
Experimental Apparatus
  • CosmOpen: semi-automatic reverse-engineering
    suite
      • Dedicated to the abstraction effort needed for my
        work.
      • Graph manipulation operators; relies on dot (AT&T
        tool).
      • Structural & behavioral analysis.
      • Very useful to handle very large graphs.
      • A trace of ORBacus: 2066 invocations ⇒ 2066
        nodes
      • Free software: http://www.laas.fr/ftaiani/7-software
  • Model extraction
      • Structural extraction: 4280 lines of C (with
        Doxygen)
      • Behavioral extraction: 1660 lines of C (with
        gdb)
      • Graph manipulation: 17010 lines of Java
      • CosmOpen in total: 22950 LoC

40
Instrumentation
  • Behavioral middleware model
      • obtained with CosmOpen
      • relates OS-level actions to application-level
        operations
      • identifies the instrumentation points of the
        meta-model

41
Instrumentation
  • Generic shared library (C) for OS interception
      • 6590 lines of C
      • meta-classes to intercept locks and mutexes
        individually: MetaMutex, MetaSocket
      • supports "transcendence" by piggybacking on
        threads: MetaThreadInfo, ThreadMetaMutex,
        ThreadMetaSocket
  • Generic shared library (C) for multi-level
    interception
      • 1460 lines of C
      • uses OS interception to implement its meta-model:
        RequestContentionPoint, MetaRequestLifeCycle
  • Instrumenting ORBacus' original code
      • Very low intrusion: 35 new lines
      • 0.02% of the original code

42
Lessons Learnt
  • The resulting meta-interface is consistent &
    homogeneous.
      • Supports non-determinism and checkpointing.
      • My prototype implements the part on
        non-determinism.
  • Efficient: for instance, when replicating
    synchronization during the processing of one
    request in ORBacus:
      • 203 synch operations are observed (pthread_...).
      • My prototype only needs to intercept 3 (gain ×
        67).
      • My previous analysis guarantees that these 3
        interceptions are sufficient to maintain the ORB
        consistency.
  • Very low intrusion: 0.02% of the original code was
    modified.
  • Reusable tools: CosmOpen, generic interception
    libraries
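
As a quick check of the efficiency figures, using only the two counts quoted on this slide (203 observed synchronisation operations, 3 intercepted):

```latex
\[
\frac{203}{3} \approx 67.7
\qquad\text{and}\qquad
\frac{3}{203} \approx 0.0148 \approx 1.5\,\%
\]
```

The first ratio is the quoted gain of about 67; the second is the fraction of synchronisation operations that still has to be intercepted.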

43
Outlook
  • Components and OSS are not enough!
  • ⇒ Reflective component model

[Figure: a component exposing a standard interface and a meta-interface]
  • To probe further: http://www.laas.fr/ftaiani/7-software/

44
Multi-Level Reflective Architecture
  • Vision: to provide a holistic and consistent
    meta-model of the system to enable transparent
    fault tolerance.
  • Problem: how to design reflective mechanisms that
    are powerful enough to realise a multi-level
    reflective platform?
45
Multi-Level Reflective Architecture
  • Goal: to provide a holistic and consistent
    meta-model of the system to enable transparent
    fault tolerance.
46
Which information is needed?
  • One instrumentation can be reused ⇒ better
    quality, lower costs.
  • Fault tolerance can be changed during system
    development.
  • Lays the path for dynamic adaptation.
  • My proposal: Reflective Footprints
      • They explicitly capture the reflexive needs of a
        family of FT mechanisms.

[Figure: a family of FT mechanisms: active, semi-active and passive replication]
47
Replication's Footprint
Reflexive Facets
49
Attic
50
Handling Non-Determinism in CORBA
51
Meta-Markers as Meta-Object Factories
52
Meta-markers
53
Semantic Joint Points
  • A and B: code locations in the middleware where
    the purpose of the mutex invocation becomes
    explicit.
  • We call A and B Semantic Joint Points.