1
Séminaire Ingénierie des Systèmes Complexes

May 15, 2006
SLA-based routing for middleware: a step towards
self-optimizing BPM
Yves Caseau
Bouygues Telecom

How can we build an integration infrastructure
that adapts to business priorities and
Quality of Service commitments?
2
Problem Statement

Given (1) a set of components that execute
processes

[Diagram labels: Help, PFS, Customer Base, Provisioning, CRM, adapter, Bus, Processflow Engine]

(2) a service contract, and (3) disruptions

Activity peaks
Failures
Other processes

20 customers per hour, each in under 2 minutes

Question: can process control be automated?

3
Glossary of Acronyms

EAI: Enterprise Application Integration
SOA: Service-Oriented Approach
BPM: Business Process Management
QoS: Quality of Service
SLA: Service Level Agreement
And also: XML, UML, UDDI, WSDL, BPEL, ETL, BAM

4
Outline

I Bouygues Telecom's IT Architecture (EAI, BPM)
II Optimization of Application Integration
III Self-Adaptive Middleware and Passing
Strategies
IV Control Strategies and Rules
V Conclusion

The problem
10% of a solution
5
Bouygues Telecom's IT history

95-99 (exponential growth): IT built around the BSCS
integrated package (Billing, CRM, Provisioning,
Customer database, ...)
Why change?
Capacity and performance problems
Too much ad hoc development (costs)
Time-to-market increase, decrease in flexibility
99-2000 strategy
Take ownership of IT: business objects, processes,
integration
Performance and scalability: component
architecture
Flexibility: BPM architecture, flexible
components (meta-data)
Quality of Service: redundant, secure
infrastructure, SLA monitoring

I Enterprise Architecture and Re-engineering
6
A Focus on Business Processes

[Diagram labels: IT Systems (C. Management, CRM, DWH, Billing, Provisioning); Business Processes (P1, P2) and their Tasks; Task distribution; EAI Infrastructure (Process Management, Transport, Business Objects Management, Technical Processes, Directories)]
Each transition is defined through business object updates
7
Three dimensions of Enterprise Architecture
8
Fractal Enterprise Architecture

Two recursive patterns
Recursive decomposition supports local vs. global
perspectives
Scale
Constraints (performance, ...)
Technology
Deployment
Common features
Object model
Gateway: Web service technology

[Diagram labels: Gateway (double proxy), internal bus, gateway]
9
Our Enterprise Architecture

Demand Management
(process consistency)
BPM
Many instances
standards
EAI 2-level processflow (dynamic object
distribution)
Customer repository
Directory/reference
Synchro/resync
Data consistency

10
Business Objects

The cornerstone of our IT is our business object
model, organized into a hierarchy of models
UML model -> XML schema -> automated data
transformation
Business objects are distributed across many
components (a "keep the data where it is"
philosophy)

[Figure: model hierarchy]
11
Data Architecture

Timeless problems to be solved
Copy synchronization
Manage synchronization flows
Maintain the coherence of snapshots
The general case is impossible (too costly)
OK if coherence is restricted to a set of
observations structured around business
processes
Interactions
Activities interact through (1)
messages/services and (2) shared resources (objects)
Coherence => signaling / exclusion /
serialization
12
The Truth about BPM?

Agility is a matter of design, not technology
Modularity, flexibility of functional analysis
Upward compatibility of XML exchange formats
Agile testing: easier said than done
Synchronization of distributed heterogeneous data
sources is an old and hard problem
Coherence of synchronization and process control
flows
Need for re-synchronization (recovery)
Shared resources mean that process executions
are not independent (serialization and
transaction mechanisms are needed)
Business process operations: the hard part
Monitoring is more difficult because the system
is more robust?
Incidents must be resolved on active systems
A new culture
13
Part II

II Optimization of Application Integration
14
Motivations for OAI

Quality of Service is the foremost IT objective
for a mobile operator
IT is re-engineered around business processes
(BPM)
QoS is defined through SLAs (Service Level
Agreements)
Throughput: a flow of 3000 new subscriptions per
day
Latency: end-to-end processing time for a new
subscription
Example: less than 4 hours
Availability: % of 24/7 time when the subscription
service is available
The challenge is to optimize the QoS of a chain
from the specification of its links
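To make the "chain from its links" idea concrete, here is a minimal sketch that composes the three SLA dimensions for a strictly sequential chain of independent links. The `LinkSpec` type, the `chain_qos` function and all numbers are illustrative assumptions, not figures from the talk, and the composition rules ignore queueing, so they are only a first-order approximation for an asynchronous infrastructure.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class LinkSpec:
    """QoS specification of one link (system or middleware hop); hypothetical type."""
    name: str
    throughput_per_day: float  # messages the link can process per day
    latency_hours: float       # typical processing time per message
    availability: float        # fraction of time the link is up, in [0, 1]


def chain_qos(links: list[LinkSpec]) -> LinkSpec:
    """First-order QoS of a sequential chain, assuming independent links."""
    return LinkSpec(
        name="chain",
        # The slowest link bounds the flow of the whole chain.
        throughput_per_day=min(link.throughput_per_day for link in links),
        # Processing times add up (queueing delays are ignored here).
        latency_hours=sum(link.latency_hours for link in links),
        # Every link must be up at the same time.
        availability=prod(link.availability for link in links),
    )


# Illustrative values only.
links = [
    LinkSpec("CRM", 5000, 0.5, 0.99),
    LinkSpec("Provisioning", 4000, 1.0, 0.99),
    LinkSpec("Billing", 3500, 1.5, 0.98),
]
chain = chain_qos(links)
print(chain)
```

With these made-up values the chain stays above 3000 messages per day and under 4 hours, but its availability drops to roughly 96%, lower than that of any single link.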
15
Business Process and Priorities

i-mode launch example
i-mode subscription is one of many business
processes
Others include billing, account management, ...
SLA goals seemed straightforward

[Diagram labels: Customer Base, CRM, Service Platform, Provisioning, Order Management, Fraud, Help, Accounts, Network; layers: Processes, Systems, Infrastructure]
III OAI and Processes
16
OAI: Optimization of Application Integration
Goals: (1) sizing rules, (2) monitoring
strategy, (3) operations: incident
protocols, (4) design: routing / sorting rules

[Diagram labels: IT Systems (throughput, latency, availability, message protocol); Goals (SLA) for each process (availability, latency, throughput); Middleware (throughput, latency, availability, message routing); Processes]
17
The challenge of OAI

Why is OAI hard?
Asynchronous availability is hard to compute
Sizing (multi-commodity flow)
Stochastic (irregular flows, bursts)
Non-linear behavior (message protocol)
Monitoring is difficult (for explanations)
Functional dependencies between processes
(QoS/QoD)
Culture problem
Batch, client/server and 3/3 architectures have been
around for a while -> incident-solving know-how
Distributed, asynchronous systems that exchange
messages are far less common
A business process culture takes a long time to grow (cf. next slide)
18
Business Process Monitoring
First step: taking ownership of business processes
Operations, 24/7 (alerts)
Client (Excel)
IT experts (scorecards)

BPM architecture is process-oriented => better
monitoring
BAM monitoring tools are more and more
relevant
BUT
Double cycle of maturity
True complexity

[Diagram labels: Business Maturity, Technical Maturity; Processes, SLA, Applications, Errors, Systems, Incident]
19
Quality of Service and Quality of Data

References
Sterling: "Data synchronization: What is Bad
Data Costing Your Company"
DWHI: "Data Quality and the bottom line:
achieving business success through a commitment
to high quality data"
Error rates ranging from a few % up to a few tens
of %!
Direct impact: loss of revenue
Bouygues Telecom's experience: coupled
degradation
QoS => QoD
De-synchronization => functional errors
QoS degradation => process exception handling =>
errors (input coherence)
QoD => QoS
Data mapping inconsistencies => errors with
adaptors and gateways => pending customer
requests
More exception handling => longer processing time
=> SLA not met

20
Part III

III Self-Adaptive Middleware
21
SLAs, Priorities and Adaptive Strategies

Each process has an SLA (throughput, latency,
availability)
Business processes have different priorities
An adaptive strategy should balance the load
according to priorities and SLAs
Self-adaptive: tolerance to bursts
Self-healing: tolerance to short failures
(fail-over)
Two approaches
Message handling rules: modify the order in
which messages are handled (higher priority
first)
Control rules: slow down lower-priority flows
22
Simulation Model

5 processes (a simplified version of the real problem)
P1 is a high-priority subscription process
(high latency)
P2 is a medium-priority automated barring process
P3 is a lower-priority (3) barring process
P4 is a high-priority de-barring process (low
latency)
P5 is a medium-priority query process
Finite-event model
Scenarios to evaluate graceful degradation
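The kind of event-driven model described above can be sketched with a time-ordered agenda. This is a minimal illustration and not the simulation engine used in the study: it reads "finite-event" as a classic discrete-event loop (an assumption), reuses only the event names StartProcess, StartTask, EndTask and EndProcess, and deliberately ignores mailboxes and resource contention. The `simulate` function, its arguments and the durations are hypothetical.

```python
import heapq
from itertools import count


def simulate(processes, task_duration):
    """Run each process as a sequence of tasks; return the ordered event log.

    processes maps a process name to (start_time, [system, ...]).
    task_duration maps a system name to its (fixed) processing time.
    """
    tie = count()   # tie-breaker so heapq never has to compare payloads
    agenda = []     # pending events: (time, tie, kind, process, step)
    log = []
    for name, (start, _) in processes.items():
        heapq.heappush(agenda, (start, next(tie), "StartProcess", name, 0))
    while agenda:
        time, _, kind, name, step = heapq.heappop(agenda)
        systems = processes[name][1]
        log.append((time, kind, name))
        if kind == "StartProcess":
            heapq.heappush(agenda, (time, next(tie), "StartTask", name, 0))
        elif kind == "StartTask":
            end = time + task_duration[systems[step]]
            heapq.heappush(agenda, (end, next(tie), "EndTask", name, step))
        elif kind == "EndTask":
            # Either hand over to the next system or close the process.
            nxt = "StartTask" if step + 1 < len(systems) else "EndProcess"
            heapq.heappush(agenda, (time, next(tie), nxt, name, step + 1))
    return log


# Illustrative processes and durations (not taken from the talk).
log = simulate(
    {"P1": (0.0, ["CRM", "Provisioning"]), "P4": (1.0, ["Provisioning"])},
    {"CRM": 2.0, "Provisioning": 3.0},
)
for event in log:
    print(event)
```

The end-to-end latency of a process is the time of its EndProcess event minus its start time, which is the quantity an SLA check would compare against the latency target.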

[Diagram labels: Infrastructure, Processflow Engine, System, Monitor; events: StartTask, EndTask, ReceivedTask, TimeOutAlert, StartProcess, EndProcess, SetStatus, Failure]
23
Routing Strategies

FCFS (FIFO)
Default method for most middleware: respects
temporal constraints
However, temporal ordering is not preserved by
load distribution
LCFS (FILO)
Good strategy for handling backlogs
SLA routing
Prediction of processing time based on SLA
Combination with priorities
Process high-priority messages first
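The strategies listed above can be read as different sort keys on a system's mailbox. The sketch below is one possible reading, not the algorithms from the study: in particular, "SLA routing" is approximated here as earliest-SLA-deadline-first, which is an assumption, and the `Message` fields and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A message waiting in a system's mailbox (hypothetical structure)."""
    msg_id: str
    arrival: float        # time the message entered the mailbox
    process_start: float  # time the owning process instance started
    sla_latency: float    # end-to-end latency target of the process
    priority: int         # 1 = highest priority


def fcfs(mailbox):
    """First come, first served: oldest message first."""
    return sorted(mailbox, key=lambda m: m.arrival)


def lcfs(mailbox):
    """Last come, first served: newest message first."""
    return sorted(mailbox, key=lambda m: m.arrival, reverse=True)


def sla_routing(mailbox):
    """Assumed SLA projection: earliest SLA deadline first."""
    return sorted(mailbox, key=lambda m: m.process_start + m.sla_latency)


def priority_sla(mailbox):
    """Priority first, then earliest SLA deadline within a priority level."""
    return sorted(mailbox, key=lambda m: (m.priority, m.process_start + m.sla_latency))


# Illustrative mailbox content.
mailbox = [
    Message("a", arrival=10, process_start=0, sla_latency=240, priority=2),
    Message("b", arrival=12, process_start=11, sla_latency=5, priority=3),
    Message("c", arrival=15, process_start=14, sla_latency=60, priority=1),
]
for strategy in (fcfs, lcfs, sla_routing, priority_sla):
    print(strategy.__name__, [m.msg_id for m in strategy(mailbox)])
```

Expressing each policy as a sort key keeps the mailbox itself unchanged, so a system can switch policy at run time without moving messages.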

III Message Passing
24
Scenarios

3 types of scenarios
Reference: static (with overload)
Burst
Component failure
Different event distributions (uniform, Poisson,
...)
Performance evaluation
Multiple runs
Average and standard deviation of SLA achievement
The goal is to observe graceful degradation
(lower-priority processes degrade first)
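The evaluation protocol (several runs, then average and standard deviation of SLA achievement) can be illustrated on a deliberately tiny model: one FCFS server with Poisson arrivals and a fixed service time. Everything here is an assumption made for illustration (the function names, the single-server model, the rates and the latency target); it is not the scenario set used in the study.

```python
import random
import statistics


def sla_achievement(arrival_rate, service_time, sla_latency, n_messages, rng):
    """Fraction of messages handled within sla_latency by one FCFS server."""
    clock = 0.0        # arrival time of the current message
    server_free = 0.0  # time at which the server becomes idle
    on_time = 0
    for _ in range(n_messages):
        clock += rng.expovariate(arrival_rate)  # Poisson arrival process
        start = max(clock, server_free)         # wait if the server is busy
        server_free = start + service_time
        if server_free - clock <= sla_latency:
            on_time += 1
    return on_time / n_messages


def evaluate(arrival_rate, runs=20, seed=0):
    """Mean and standard deviation of SLA achievement over several runs."""
    rng = random.Random(seed)
    scores = [
        sla_achievement(arrival_rate, service_time=1.0, sla_latency=5.0,
                        n_messages=2000, rng=rng)
        for _ in range(runs)
    ]
    return statistics.mean(scores), statistics.stdev(scores)


# A lightly loaded reference case versus a sustained overload.
for label, rate in [("reference", 0.5), ("overload", 1.2)]:
    mean, std = evaluate(rate)
    print(f"{label}: SLA achievement {mean:.2%} (std {std:.2%})")
```

Reporting the standard deviation alongside the mean matters because burst and failure scenarios can produce very different outcomes from one run to the next.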
25
Results

Priority routing works. The algorithms that use
process priority as part of the sorting strategy
are able to maintain the SLA of high-priority
processes much longer.
The second lesson is that FCFS is not a good
default algorithm. LCFS does better as soon as
the event flow becomes tight.
The combination of priority and SLA sorting is
the best approach.
26
Part IV

IV Control Strategies
27
Flow Rules

The first intuition at Bouygues Telecom was to
implement flow-control mechanisms (emergency
mode)
Before actually implementing them in the EAI
adapter, we used the simulation engine to evaluate
two strategies
RS1: When the QoS of a system X falls below
90% of its SLA level (cf. Section 3), we reduce
the flow of systems that are providers of X and
whose priority is lower than that of X. A dual rule
restores the default setting once the QoS of X
reaches 90% again.
RS2: This is a similar rule, but the triggering
condition is based on processes. When the QoS of
a process P falls below 90%, we reduce the flow
of all systems that have a lower priority than P
and that are providers of a system that supports
P.
Flow control is more complex to operate, but it is
not necessarily part of the middleware
infrastructure
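Rule RS1 and its dual can be sketched as a single function that is re-evaluated each time the QoS of a system is measured. The `System` type, the `throttled` flag and the convention that 1 is the highest priority are assumptions made for this illustration; how a throttled system actually reduces its flow is left out.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """A system connected to the EAI bus (hypothetical model)."""
    name: str
    priority: int                                  # 1 = highest priority
    providers: list = field(default_factory=list)  # systems that send it messages
    throttled: bool = False                        # True while its flow is reduced


def apply_rs1(target: System, qos_ratio: float, threshold: float = 0.90) -> None:
    """RS1: throttle lower-priority providers of `target` while its QoS is low.

    qos_ratio is the measured QoS divided by the SLA level (1.0 = SLA met).
    """
    degraded = qos_ratio < threshold
    for provider in target.providers:
        if provider.priority > target.priority:  # larger number = lower priority
            # Dual rule: the flag is cleared once the QoS is back at the threshold.
            provider.throttled = degraded


# Illustrative topology.
crm = System("CRM", priority=1)
batch = System("Batch", priority=3)
provisioning = System("Provisioning", priority=2, providers=[crm, batch])

apply_rs1(provisioning, qos_ratio=0.80)
print(batch.throttled, crm.throttled)  # Batch is slowed down, CRM is not
apply_rs1(provisioning, qos_ratio=0.95)
print(batch.throttled)                 # default setting restored
```

A single boolean per provider is a simplification: if several degraded systems share a provider, a real implementation would need to track which targets requested the throttling before restoring it.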
28
Routing Rules

We implemented rules that dynamically change the
message handling strategy (using a status:
FAST means use PRL to process a backlog)
RS3: When the QoS of a system X drops below 95%,
the system is switched to FAST status. The system
resumes normal status once the QoS returns above
95%.
RS4: When the QoS of a process P drops below 95%,
all systems that support this process are
switched to FAST status.
RS5: A system is switched to FAST status whenever
its mailbox size grows over 100. Obviously, the
triggering size is a constant that depends on the
volume that is processed by the EAI and the
number of connected systems.
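Rules RS3 and RS5 amount to toggling a status flag that selects the handling order of a mailbox. The sketch below illustrates that mechanism only. PRL is not defined in this transcript, so FAST mode is approximated by plain newest-first (LCFS) handling, which is an assumption; the `Mailbox` class and function names are also hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

NORMAL, FAST = "NORMAL", "FAST"


@dataclass
class Mailbox:
    """Mailbox of one system with a switchable handling status (hypothetical)."""
    status: str = NORMAL
    messages: deque = field(default_factory=deque)

    def put(self, msg):
        self.messages.append(msg)

    def get(self):
        # NORMAL: oldest first (FCFS). FAST: newest first, to work through a backlog.
        return self.messages.pop() if self.status == FAST else self.messages.popleft()


def apply_rs3(box: Mailbox, qos_ratio: float, threshold: float = 0.95) -> None:
    """RS3: FAST while the system's QoS is below the threshold, NORMAL otherwise."""
    box.status = FAST if qos_ratio < threshold else NORMAL


def apply_rs5(box: Mailbox, max_size: int = 100) -> None:
    """RS5: switch to FAST once the mailbox holds more than max_size messages."""
    if len(box.messages) > max_size:
        box.status = FAST


# Illustrative backlog of 101 messages.
box = Mailbox()
for i in range(101):
    box.put(i)
apply_rs5(box)
print(box.status, box.get())
```

RS4 would apply the same status switch to every system that supports the degraded process, so it needs a process-to-systems mapping in addition to what is shown here.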
29
Results
Does not provide any stable improvement

Small improvement
Simpler is better

30
Conclusions

A first step towards autonomic BPM
Self-optimization
Priority handling works: it is possible and
fairly simple to take process priority into
account when routing messages, and the results show
a real improvement.
The routing (mailbox sorting) algorithm matters: the
more sophisticated SLA projection technique
showed a real improvement over an FCFS policy.
Control rules are interesting, but they are
secondary to the routing policy: it is more
efficient to deal with congestion problems with a
distributed routing strategy than with a
global rule schema.
Self-healing: some form of self-healing is
demonstrated, but true self-healing requires
collaboration with hardware
Self-configuration: the goal is to make
configuration declarative (e.g., SLAs) rather than
defining time and resource configurations (e.g.,
schedules)

V Conclusions
31
Next Steps for Bouygues Telecom

Promote SLAs in BPM standards (BPEL <- WSLA, QML,
...)
Priorities in BPM engines (lobbying)
Organic operations
From a mechanical towards a biological vision of
fault tolerance?
Incidents do occur: handling them is part of
business know-how, and often relies on a deep
understanding of business logic.
Incident recovery strategies and tools are
first-class citizens of the IT infrastructure.

[Diagram labels: ST1, ST2, ST3, ST4, ST1 backup, ST3 backup; System-based monitoring / recovery; Process monitoring / recovery]
32
References

General problem
Urbanisation et BPM, 2nd edition, Dunod, March
2006.
OAI experiments
Self-Adaptive and Self-Healing Message Passing
Strategies for Process-Oriented Integration
Infrastructures. ECBS 2004: 506-512
Self-adaptive middleware: Supporting business
process priorities and service level agreements.
Advanced Engineering Informatics 19(3): 199-211
(2005)
Complex systems: organizational architecture
How can information flows in a company be modeled
(starting from its processes) as a function of
its organization?
http://organisationarchitecture.blogspot.com/
Book in preparation for January 2006