Title: The Jgroup/ARM Dependable Computing Toolkit
1 The Jgroup/ARM Dependable Computing Toolkit
Hein Meling, Stavanger University College, Norway, Department of Electrical and Computer Engineering
Alberto Montresor, University of Bologna, Italy, Department of Computer Science
2 Context
- (Distributed) systems that require
- Reliable and highly available operation
- Fault tolerance
- (Load balancing)
- Based on cheap hardware and software
- Commercial off the shelf, and not custom hardware
- Heterogeneous software (OS) architectures
- Middleware architectures for distributed computing
- Middleware between the application and OS
3 Types of Failures
- Processor failures
- Crash failures
- Value failures (very expensive)
- Network failures
- Operating System hangs
- Memory leaks
- Software design errors (beyond state-of-the-art)
4 Overview
- Jgroup
- A toolkit aimed at supporting the development of reliable and highly available applications.
- Autonomous Replication Management (ARM)
- A framework for server replica deployment and recovery without user intervention.
- History
- Formal specification (1996-97)
- Algorithm description and Jgroup implementation
- Integration with existing technologies (Java RMI / Jini)
- The ARM framework (2000-03)
- Development of Jgroup-based applications
5 Summary
- Introduction
- Object Group Communication
- The ARM framework
- Integration with Java RMI / Jini
- Conclusions
6 The Problem
- Some environments supporting distributed computing
- CORBA (OMG)
- DCOM / .NET (Microsoft)
- Java RMI / Jini / EJB (Sun)
- Characteristics
- Object-oriented
- Based on client-server remote method invocations
- Promote modularity, reusability, interoperability, portability
7 Java Remote Method Invocations
- Java RMI protocol
- Enables objects residing in different JVMs to communicate through remote method invocations
[Figure: JVM1 and JVM2]
8 Java Remote Method Invocations
[Figure: remote method invocation between JVM1 and JVM2, labeled method() and return x]
9 The Problem
- Distributed computing environments did not provide adequate support for developing reliable and highly available applications
- Lack of reliable one-to-many interaction primitives
- From the client's point of view: non-transparent access to replicated servers
- From the server's point of view: no support for maintaining consistency
10 The Solution: The Object Group Paradigm
- Object group
- A dynamic collection of server objects that cooperate in order to deliver some service and maintain shared state
- Group method invocations
- The act of invoking a method on an object group
- The method is executed by a certain number of servers in the object group, depending on the invocation semantics
[Figure: ObjectGroup]
11 The Solution: The Object Group Paradigm
- From the client's point of view
- Groups must be transparent, like standard remote objects
- Clients need not be aware that they are interacting with an object group instead of a single server
- From the server's point of view
- Server implementation should be as transparent as possible
- Servers forming a group
- must cooperate to maintain shared state and
- to appear as a single object
12 Group Communication
- Group communication has been shown to be a powerful paradigm for supporting the development of dependable applications in distributed systems
- Management of dynamic groups (join/leave operations)
- Failure monitoring (crashes / partitionings)
- One-to-many communication
- Ordering of events (FIFO, Causal, Atomic)
- State synchronization tools
13 Other Object Group Systems
- CORBA
- Electra (Cornell, Zurich)
- Object Group Service (OGS) (EPFL, Lausanne)
- Eternal (UC Santa Barbara, Eternal Systems)
- Newtop (Newcastle, UK)
- Java RMI
- Filterfresh (Bell Labs)
- JavaGroups (Cornell)
- Aroma (UC Santa Barbara)
- DCOM
- Quintet (Cornell)
14 Jgroup: Yet Another Object Group Service?
- Support for partition-awareness
- Modern wide-area communication networks are often characterized as highly partitionable
- Jgroup supports the development of reliable and highly available applications in partitionable systems
- Moreover
- Extends modern technologies like Java RMI and Jini
- Is completely written in Java (portability)
- Supports complex merging service
- Extensible deployment, recovery and upgrade facilities
15 Autonomous Replication Management
- Support for transparent replica deployment
- Placing server replicas on machines in the network
- Selecting machines so that each application can tolerate both network and machine failures
- Support for replica recovery
- Jgroup detects and reports failures
- ARM replaces any crashed server replica with a new instance
16 Summary
- Introduction
- Object Group Communication
- The ARM framework
- Integration with Java RMI / Jini
- Conclusions
17 Group Membership
- Group membership service tracks both voluntary and involuntary changes in the group's membership
- Variations are reported to group members through the installation of views
- Installed views
- Consist of a collection of members
- Correspond to the group's current membership as perceived by the members included in the view
18 Group Membership: A Simple Scenario
[Figure: view]
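The view mechanism described above can be sketched in a few lines of Java. This is a minimal single-process illustration, not Jgroup's actual API: the names `View`, `ViewListener`, `viewChange` and `MembershipService` are assumptions made for this example, and no networking or failure detection is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical immutable view: an identifier plus the members it contains. */
final class View {
    final int id;
    final Set<String> members;

    View(int id, Set<String> members) {
        this.id = id;
        // Copy so that later membership changes do not alter an installed view.
        this.members = Collections.unmodifiableSet(new TreeSet<>(members));
    }

    @Override public String toString() { return "view " + id + " " + members; }
}

/** Hypothetical callback through which membership changes are reported. */
interface ViewListener {
    void viewChange(View view);
}

/** Toy membership service: every change in membership installs a new view. */
final class MembershipService {
    private final Set<String> current = new TreeSet<>();
    private final List<ViewListener> listeners = new ArrayList<>();
    private int nextViewId = 1;

    void addListener(ViewListener listener) { listeners.add(listener); }

    void join(String member)  { if (current.add(member)) install(); }    // voluntary
    void leave(String member) { if (current.remove(member)) install(); } // voluntary
    void crash(String member) { if (current.remove(member)) install(); } // involuntary

    private void install() {
        View view = new View(nextViewId++, current);
        for (ViewListener listener : listeners) listener.viewChange(view);
    }
}
```

A listener registered with `addListener` receives one `View` per membership change, so three joins followed by one crash produce four views.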
19 Partition-awareness
- What kind of behavior can we expect from fault-tolerant applications in the presence of network partitioning?
- The primary-partition approach
20 Support for partition-awareness
- Jgroup supports dependability in partitionable systems
- Development of applications aware of the existence of partitions (on the server-side)
- Partition-aware applications take advantage of their semantics in order to be more available
- Computations continue in all partitions of the system
21 Group Membership: A Partitioning Scenario
22 Example: Task Execution Service
[Figure panels: Primary Partition; Partition-aware]
23 Comparison
- Primary-partition approach
- Easy to maintain a single, coherent shared state (strong consistency)
- Servers in non-primary partitions unable to serve requests (low availability)
- Partition-aware approach
- Servers in multiple partitions may be able to serve requests (high availability)
- Partitions evolve independently, possibly leading to inconsistent states (loose consistency)
24 Comparison (Cont.)
- Primary-partition approach
- Development of fault-tolerant applications is simpler (active replication of existing non-fault-tolerant servers)
- Developers cannot exploit application semantics in order to provide a more available service
- Partition-aware approach
- Applications adapt their behavior and remain available in many partitions (perhaps by reducing their quality of service)
- Development of fault-tolerant applications is more complex (case-by-case design is needed)
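The difference between the two approaches can be illustrated with a small decision function. This is only a sketch under stated assumptions: the slides do not define how the primary partition is chosen, so a strict majority of the full group is assumed here, and the names `ServiceMode` and `PartitionPolicy` are hypothetical.

```java
import java.util.Set;

/** What a replica may do in its current view (illustrative names). */
enum ServiceMode { FULL, REDUCED, BLOCKED }

final class PartitionPolicy {
    /** Assumption: the primary partition holds a strict majority of the full group. */
    static boolean isPrimary(Set<String> view, int groupSize) {
        return 2 * view.size() > groupSize;
    }

    /** Primary-partition approach: only the primary partition serves requests. */
    static ServiceMode primaryPartition(Set<String> view, int groupSize) {
        return isPrimary(view, groupSize) ? ServiceMode.FULL : ServiceMode.BLOCKED;
    }

    /** Partition-aware approach: every partition serves, possibly at reduced quality. */
    static ServiceMode partitionAware(Set<String> view, int groupSize) {
        return view.size() == groupSize ? ServiceMode.FULL : ServiceMode.REDUCED;
    }
}
```

With a group of three servers split into one and two, the primary-partition policy blocks the single server, while the partition-aware policy keeps both sides serving at reduced quality.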
25 The State Merging Problem
- During partitioning, the state of servers belonging to distinct partitions may become inconsistent
- When the partition disappears, an application-specific state merging protocol may be needed
- Servers participating in the protocol try to define a new shared state that reconciles (when possible) the divergences
27 The State Merging Problem
- State merging protocols are based on the exchange of information among servers that have been partitioned
- Jgroup provides a state merging service (SMS) that simplifies the development of state merging protocols
- NOTE
- Determining
- what information needs to be exchanged
- how to use it to construct a new consistent shared state
- is an application-dependent problem
28 General Schema for State Merging Protocols
- In each of the merging partitions, a coordinator is selected
- SMS interrogates each coordinator to obtain information about its current state
- State information from a coordinator is passed to servers that used to be partitioned from it
- Each of the servers merges information from coordinators with its own state
[Figure: servers S1, S2, S3, S4 with getState() and putState() calls]
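The general schema can be sketched as follows. This is a simplified, in-process illustration rather than the SMS implementation: the `Mergeable` interface only mirrors the getState()/putState() labels from the figure, the first server of each partition is assumed to be its coordinator, and set union is assumed as the application-specific merge rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical callbacks mirroring the getState()/putState() labels. */
interface Mergeable {
    Set<String> getState();
    void putState(Set<String> remoteState);
}

/** Replica whose state is a set of completed task names; merging is set union. */
final class TaskServer implements Mergeable {
    final String name;
    private final Set<String> done = new TreeSet<>();

    TaskServer(String name, String... tasks) {
        this.name = name;
        done.addAll(Arrays.asList(tasks));
    }

    @Override public Set<String> getState() { return new TreeSet<>(done); }

    @Override public void putState(Set<String> remoteState) { done.addAll(remoteState); }
}

final class StateMerger {
    /** Each inner list is one formerly separate partition; element 0 acts as coordinator. */
    static void merge(List<List<TaskServer>> partitions) {
        // Step 1: interrogate one coordinator per partition.
        List<Set<String>> states = new ArrayList<>();
        for (List<TaskServer> partition : partitions) {
            states.add(partition.get(0).getState());
        }
        // Step 2: pass each coordinator's state to the servers of the other partitions,
        // which merge it with their own state.
        for (int from = 0; from < partitions.size(); from++) {
            for (int to = 0; to < partitions.size(); to++) {
                if (from == to) continue;
                for (TaskServer server : partitions.get(to)) {
                    server.putState(states.get(from));
                }
            }
        }
    }
}
```

The sketch relies on servers within one partition already agreeing on their state, which is why only the coordinator is asked. What the state contains and how it is reconciled remains application-dependent, as noted above.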
30 Full Object-Orientation
- Existing object group systems fail to provide a completely object-oriented environment for software developers
[Figure labels: Remote method invocations; Message multicasting]
31 View Synchrony
- View synchrony (1)
- If a correct server S executes an invocation during a view, then
- all servers within the view will also execute the invocation,
- or S will install a new view
- View synchrony does not admit executions like this
[Figure: servers S1, S2, S3, S4]
32 View Synchrony
- View synchrony (2)
- All servers that survive from one view to the same next view execute the same set of invocations in the original view
- View synchrony does not admit executions like this
[Figure: servers S1, S2, S3, S4]
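Property (2) can be expressed as a check over a recorded execution. The sketch below is a hypothetical test helper, not part of Jgroup: it takes, for one original view, the set of invocations each server executed and the identifier of the view each surviving server installed next, and reports whether servers that moved to the same next view executed the same set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class ViewSynchronyCheck {
    /**
     * @param executed  per server, the invocations it executed in the original view
     * @param nextView  per surviving server, the id of the view it installed next
     * @return true if all servers sharing a next view executed the same invocations
     */
    static boolean agreement(Map<String, Set<String>> executed, Map<String, Integer> nextView) {
        Map<Integer, Set<String>> expected = new HashMap<>();
        for (Map.Entry<String, Integer> entry : nextView.entrySet()) {
            Set<String> mine = executed.getOrDefault(entry.getKey(), Collections.<String>emptySet());
            Set<String> first = expected.putIfAbsent(entry.getValue(), mine);
            if (first != null && !first.equals(mine)) {
                return false; // two survivors of the same transition disagree
            }
        }
        return true;
    }
}
```

Servers that end up in different next views (for example after a partition) are allowed to differ, which is why the comparison is grouped by next view.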
33 Internal Group Method Invocations
- Synchronous invocations
- The method invocation terminates by returning a vector of return values, one from each server at which the method was executed
- Asynchronous invocations
- The method invocation terminates immediately; replies (if any) are returned to a callback object
- Can be used to simulate message multicasting through void methods (one-way)
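The two invocation styles can be contrasted with a small in-process sketch. It is not Jgroup's API: `InternalGroup`, `invokeSync` and `invokeAsync` are hypothetical names, the group members are plain local objects, and `CompletableFuture` stands in for whatever mechanism delivers replies to the callback object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

final class InternalGroup<S> {
    private final List<S> members;

    InternalGroup(List<S> members) { this.members = new ArrayList<>(members); }

    /** Synchronous: returns only after every member has executed; one result per member. */
    <R> List<R> invokeSync(Function<S, R> method) {
        List<R> results = new ArrayList<>();
        for (S member : members) results.add(method.apply(member));
        return results;
    }

    /** Asynchronous: returns at once; each reply is handed to the callback when ready. */
    <R> CompletableFuture<Void> invokeAsync(Function<S, R> method, Consumer<R> callback) {
        return CompletableFuture.runAsync(() -> {
            for (S member : members) callback.accept(method.apply(member));
        });
    }
}
```

The returned future is only a convenience for tests and shutdown; a caller interested in one-way multicast semantics can simply ignore it and pass a callback that does nothing.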
34 Internal Invocations: Example
[Figure: synchronous invocation among servers S1, S2, S3]
35 Internal Invocations: Example
[Figure: servers S1, S2, S3]
36 External Group Method Invocations
- Anycast invocations
- Are executed by at least one server in the object group (unless the client is partitioned from the group)
- Efficiency (same cost as standard RMI interactions)
- Useful for read methods on replicated databases
- Multicast invocations
- Are executed by all servers in a view, following the view synchrony semantics
- More costly (involve several servers)
- Useful for write methods on replicated databases
37 External Invocations: Example
[Figure: clients C1, C2 and servers S1, S2, S3]
- Multicast invocation: registry.bind(name, obj)
- Anycast invocation: registry.lookup(name)
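The registry example can be sketched with a client-side group proxy that routes writes to all replicas and reads to one. This is an in-process illustration only: `RegistryReplica` and `RegistryGroupProxy` are hypothetical classes, round-robin selection is an assumption about how the anycast target is chosen, and view synchrony is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical replica of a naming registry; counts the invocations it executes. */
final class RegistryReplica {
    private final Map<String, String> bindings = new HashMap<>();
    int invocations = 0;

    void bind(String name, String obj) { invocations++; bindings.put(name, obj); }

    String lookup(String name) { invocations++; return bindings.get(name); }
}

/** Hypothetical client-side proxy that hides the group behind a single interface. */
final class RegistryGroupProxy {
    private final List<RegistryReplica> view;
    private int next = 0;

    RegistryGroupProxy(List<RegistryReplica> view) { this.view = new ArrayList<>(view); }

    /** Multicast: a write is executed by all servers in the view. */
    void bind(String name, String obj) {
        for (RegistryReplica replica : view) replica.bind(name, obj);
    }

    /** Anycast: a read is executed by a single server, chosen round-robin here. */
    String lookup(String name) {
        RegistryReplica replica = view.get(next);
        next = (next + 1) % view.size();
        return replica.lookup(name);
    }
}
```

With three replicas, one `bind` costs three executions while one `lookup` costs one, which matches the cost comparison between multicast and anycast invocations.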
38 Summary
- Introduction
- Object Group Communication
- The ARM framework
- Integration with Java RMI / Jini
- Conclusions
39 Replication Management: The Problem
- Object Group Systems support replication transparency
- Membership management
- Reliable multicast
- But do not support full failure transparency
- Application or manual support to distribute replicas
- Application support or manual intervention required to recover from replica failures
- Complicated tasks
- Application implementations prone to contain errors
- These tasks should not be left to the application developer
40 Solution: Autonomous Replication Management
- Support for creating object groups
- By placing individual members on distinct machines
- Each application may specify a replication policy
- For example, redundancy level 3
- Support for failure recovery
- Jgroup detects and reports failures to ARM
- ARM reacts by creating a replacement member for each failed member, perhaps on a different machine
- Each application may specify a recovery policy
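A replication policy with a fixed redundancy level, combined with replacement on failure, can be sketched as below. This is a toy model and not ARM's implementation: `ReplicaManager` and its methods are hypothetical, hosts are plain strings, the host with the fewest replicas is preferred (an assumed placement rule), and a failure report is delivered by calling `hostFailed` directly.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ReplicaManager {
    private final Set<String> liveHosts = new TreeSet<>();
    private final Map<String, Set<String>> placement = new TreeMap<>(); // app -> hosts
    private final int redundancy;

    ReplicaManager(Collection<String> hosts, int redundancy) {
        this.liveHosts.addAll(hosts);
        this.redundancy = redundancy;
    }

    /** Replication policy: place `redundancy` replicas on distinct machines. */
    void deploy(String app) {
        Set<String> replicas = new TreeSet<>();
        placement.put(app, replicas);
        while (replicas.size() < redundancy) {
            String host = pickHost(replicas);
            if (host == null) break; // fewer live hosts than the requested redundancy
            replicas.add(host);
        }
    }

    /** Recovery policy: replace every replica lost on the failed host. */
    void hostFailed(String host) {
        liveHosts.remove(host);
        for (Set<String> replicas : placement.values()) {
            if (replicas.remove(host)) {
                String replacement = pickHost(replicas);
                if (replacement != null) replicas.add(replacement);
            }
        }
    }

    Set<String> replicasOf(String app) {
        Set<String> replicas = placement.get(app);
        return replicas == null ? Collections.<String>emptySet()
                                : Collections.unmodifiableSet(replicas);
    }

    /** Least-loaded live host that does not already run a replica of this group. */
    private String pickHost(Set<String> exclude) {
        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (String host : liveHosts) {
            if (exclude.contains(host)) continue;
            int load = 0;
            for (Set<String> replicas : placement.values()) {
                if (replicas.contains(host)) load++;
            }
            if (load < bestLoad) { best = host; bestLoad = load; }
        }
        return best;
    }
}
```

Excluding hosts that already hold a replica of the same group is what keeps members on distinct machines, so a single machine failure removes at most one member.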
41 ARM: Replica Distribution
[Figure: Management Client, Router, and sites item.ntnu.no and ux.his.no]
43 ARM: Recovery from Crash Failure
[Figure: Group Leader, Management Client, Router, and sites item.ntnu.no and ux.his.no]
45 Summary
- Introduction
- Object Group Communication
- The ARM framework
- Integration with Java RMI / Jini
- Conclusions
46 Introduction to Jini
- Jini is an API built on top of the Java 2 platform
- enables spontaneous networks of devices/software services to assemble into federations of objects
- addresses the distribution problems in these federations through a set of simple interfaces and protocols
[Figure: Jini Network]
47 Jini Architecture
- The components of the Jini architecture may be divided into three categories
- Infrastructure, i.e. the components that enable building a federated Jini system
- Model that supports and encourages the production of reliable distributed services
- Services that can be made part of a federated Jini system and which offer functionality to any other member of the federation
- JavaSpaces
48 Jini Infrastructure
- The infrastructure is composed of
- Java RMI protocol: enables objects residing in different JVMs to communicate through remote method invocations
[Figure: JVM1 and JVM2]
49 Jini Infrastructure
- The infrastructure is composed of
- Lookup Service: defines how services may become part of a Jini system and how clients retrieve services by their types and attributes.
50 The Jini Programming Model
- The programming model is based on three distinct paradigms for distributed computing
- Leases: extend the Java programming model by adding time to the notion of holding a reference to a resource
- Transactions: allow a set of operations on one or more remote participants to be grouped in such a way that either all succeed or all fail
- Events: enable objects to register interest in changes of the abstract state of remote objects
51 Jini and Fault Tolerance
- Jini fault tolerance is based on leases and transactions
- leases enable the detection of service failures
- transactions provide consistency by guaranteeing all-or-nothing semantics
- Unfortunately, no support for high availability is present in Jini
- No support for replication
- Failure of transaction manager → clients and participants must wait for the recovery of the manager before serving further requests
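How leases enable failure detection can be shown with a small lease table. This sketch does not use the Jini lease API; `LeaseTable` and its methods are hypothetical, and time is passed in explicitly as milliseconds so that the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class LeaseTable {
    private final Map<String, Long> expiry = new HashMap<>(); // service -> expiration time

    /** Grants a lease that is valid until now + duration. */
    void grant(String service, long now, long duration) {
        expiry.put(service, now + duration);
    }

    /** Renews a lease that is still valid; returns false if it is unknown or has expired. */
    boolean renew(String service, long now, long duration) {
        Long end = expiry.get(service);
        if (end == null || end <= now) return false;
        expiry.put(service, now + duration);
        return true;
    }

    /** Removes and returns the services whose leases were not renewed in time. */
    List<String> reapExpired(long now) {
        List<String> failed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = expiry.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= now) {
                failed.add(entry.getKey());
                it.remove();
            }
        }
        return failed;
    }
}
```

A crashed service stops renewing, so it shows up in `reapExpired` after at most one lease duration. The lease reveals the failure but does nothing to mask it, which is the gap that replication is meant to fill.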
52 Enhancing Jini with Fault-Tolerance
- Extending Jini with the Object Group Paradigm
- Infrastructure
- Extending Java RMI for Group Method Invocations
- Extending the Lookup Service for dealing with Group Proxies
- Programming Model
- Object Group Paradigm as alternative programming model
- Integration between transactional and object group model
- Services
- Replicated JavaSpaces
53 Extending Java RMI
- RMI group at Javasoft designed Java RMI in order to be extensible
- The RemoteRef interface enables programmers to write their own references to remote objects on the client-side
- Unfortunately, RemoteRefs are not sufficient
- There is no possibility to modify the behavior of RMI on the server side
[Figure: RemoteRef]
54 The Jgroup Approach (Current Version)
[Figure labels: Statically or dynamically generated, implements the remote interface; Method dispatchers; Fixed stub for server proxy]
55 Designing a New Java RMI API
- We have cooperated with Sun Microsystems to design a new RMI API
- Fully customizable, on both the client-side and the server-side
- Based on Dynamic Proxy Classes (JDK 1.3) (no need for static stub generators)
- Two different versions
- One-to-one (remote method invocations)
- Voted down in JSR-078
- Being included in the "Davis" release of Jini
- One-to-many (group method invocations)
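The role of dynamic proxy classes can be illustrated with the standard `java.lang.reflect.Proxy` API. The sketch below is not the API designed with Sun: `Counter`, `LocalCounter` and `GroupInvocationHandler` are hypothetical names, the replicas are local objects rather than remote servers, and returning the last replica's result is an arbitrary choice made for brevity.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical service interface; any interface works with the proxy below. */
interface Counter {
    int increment();
}

final class LocalCounter implements Counter {
    private int value;
    @Override public int increment() { return ++value; }
}

/** Forwards every call made on the proxy to all replicas (one-to-many). */
final class GroupInvocationHandler implements InvocationHandler {
    private final List<Object> replicas;

    private GroupInvocationHandler(List<?> replicas) {
        this.replicas = new ArrayList<Object>(replicas);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Object result = null;
        for (Object replica : replicas) {
            try {
                result = method.invoke(replica, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the replica itself threw
            }
        }
        return result; // assumption: the last replica's reply stands for the group
    }

    /** Builds the proxy at run time, so no stub class has to be generated beforehand. */
    static <T> T newProxy(Class<T> iface, List<? extends T> replicas) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                new GroupInvocationHandler(replicas));
        return iface.cast(proxy);
    }
}
```

The client holds an ordinary `Counter` reference, which is what makes the group transparent. The handler is the single place where invocation semantics such as anycast or multicast could be selected.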
56 Jgroup with 1-to-1 Customizable RMI
[Figure labels: Statically or dynamically generated, implements the remote interface; Method dispatchers]
57 Jgroup with 1-to-Many Customizable RMI
[Figure labels: Customizable objects; Multicast RMI]
58 Extending the Lookup Service
- Jini enables the registration of customized proxies for services
- this feature can be used to register group proxies using any implementation of the lookup service
- Group proxies, however, differ from standard proxies as their contents may be dynamic
- server registration → server reference added to group proxy
- server removal, lease expired → server reference removed from group proxy
- We have developed an alternative implementation of the lookup specification capable of dealing with group proxies
59 The Jgroup Lookup Service
[Figure labels: Lookup; Invocation]
60 Extending the Jini Programming Model
- Jgroup/Jini programming model for fault-tolerance
- Leases and transactions
- Object group communication
- Problem
- transactions and group communication considered as separate aspects of fault-tolerance
- their composition does not result in any meaningful combination of their respective strengths
- We need the possibility of using replication in transactions
- Transaction managers
- Participants
- Clients
61 Summary
- Introduction
- Object Group Communication
- The ARM framework
- Integration with Java RMI / Jini
- Conclusions
62 Applications (Research)
- Jgroup/ARM is being used for
- A distributed auction system
- Partitionable auctions
- Panzieri, Amoroso et al., University of Bologna, 2002
- An online-upgrade service for active replication
- Solarski, GMD Fokus
- A replication management framework
- Application-specific replication and recovery strategies
- Meling, HiS
- Dependable naming service
- Support for extensible group proxies (JERI)
- Meling et al., HiS
63 Applications (Education)
- Jgroup is being used at the
- Stavanger University College in the Advanced Programming course
- University of Bologna in the Distributed System course
- Norwegian University of Science and Technology in the Dependable Systems course
- Source for several projects and theses
- Low-level communication protocols (Bologna)
- Replication services (Bologna)
- Wide-area distributed services (Padova)
- Management and deployment issues (HiS)
64 Thank You!
- http://jgroup.sourceforge.net/