1
Séminaire Ingénierie des Systèmes Complexes (Complex Systems Engineering Seminar)
  • May 15, 2006
  • SLA-based routing for middleware: a step towards
    self-optimizing BPM
  • Yves Caseau
  • Bouygues Telecom

How can we build an integration infrastructure that
adapts to business priorities and Quality of
Service commitments?
2
Problem Statement
  • Given (1) a set of components that execute
    processes

[Diagram labels: Help, PFS, Customer Base, Provisioning, CRM, adapter, Bus, Processflow Engine]
  • (2) a service contract and (3) hazards:
  • Activity peaks
  • Failures
  • Other processes

20 customers per hour, in less than 2 minutes
  • Question: can process control be automated?

3
Glossary of Acronyms
  • EAI: Enterprise Application Integration
  • SOA: Service-Oriented Approach
  • BPM: Business Process Management
  • QoS: Quality of Service
  • SLA: Service Level Agreement
  • And also: XML, UML, UDDI, WSDL, BPEL, ETL, BAM

4
Outline
  • I Bouygues Telecom's IT Architecture (EAI, BPM)
  • II Optimization of Application Integration
  • III Self-Adaptive Middleware and Passing
    Strategies
  • IV Control Strategies and Rules
  • V Conclusion

"The problem"
"10% of a solution"
5
Bouygues Telecom's IT history
  • 1995-99 (exponential growth): IT built around the
    BSCS integrated package (billing, CRM,
    provisioning, customer database, ...)
  • Why change?
  • Capacity and performance problems
  • Too much ad-hoc development (costs)
  • Time-to-market increase, decrease in flexibility
  • 1999-2000 strategy:
  • Take ownership of IT: business objects,
    processes, integration
  • Performance and scalability: component
    architecture
  • Flexibility: BPM architecture, flexible
    components (meta-data)
  • Quality of Service: redundant, secure
    infrastructure, SLA monitoring

I Enterprise Architecture and Re-engineering
6
A Focus on Business Processes
[Diagram. IT systems: C. Management, CRM, DWH, Billing, Provisioning. Business processes P1 and P2 are made of tasks. EAI infrastructure: process management, transport, business objects management, technical processes, directories, task distribution.]
Each transition is defined through business object updates.
7
Three dimensions of Enterprise Architecture
8
Fractal Enterprise Architecture
  • Two recursive patterns
  • Recursive decomposition: supports local vs.
    global perspectives
  • Scale
  • Constraints
  • Performance
  • Technology
  • Deployment
  • Common features
  • Object model
  • Gateway: Web service technology

[Diagram labels: gateway (double proxy), internal bus, gateway]
9
Our Enterprise Architecture
  • Demand Management
  • (process consistency)
  • BPM
  • Many instances
  • standards
  • EAI: 2-level processflow (dynamic object
    distribution)
  • Customer repository
  • Directory/reference
  • Synchro/resync
  • Data consistency

10
Business Objects
  • The cornerstone of our IT is our business object
    model, organized into a hierarchy of models
  • UML model -> XML schema -> automated data
    transformation
  • Business objects are distributed across many
    components ("keep the data where it is"
    philosophy)

[Figure: model hierarchy]
11
Data Architecture
  • Timeless problems to be solved:
  • Copy synchronization
  • Manage synchronization flows
  • Maintain "snapshot" coherence
  • The general case is impossible (too costly)
  • OK if coherence is restricted to a set of
    observations that is structured around business
    processes
  • Interactions:
  • Activities interact through (1)
    messages/services and (2) shared resources
    (objects)
  • Coherence -> signalization / exclusion /
    serialization

12
The Truth about BPM?
  • Agility is a matter of design, not technology
  • Modularity, flexibility of functional analysis
  • Upward compatibility of XML exchange formats
  • Agile testing: easier said than done
  • Synchronization of distributed heterogeneous data
    sources is an old and hard problem
  • Coherence of synchronization and process control
    flows
  • Need for re-synchronization (recovery)
  • Shared resources mean that process executions
    are not independent (serialization and
    transaction mechanisms are needed)
  • Business process operations: the hard part
  • Monitoring is more difficult because the system
    is more robust?
  • Incidents must be resolved on active systems
  • A new culture

13
Part II
  • I Enterprise Architecture and Re-engineering
  • II Optimization of Application Integration
  • III Self-Adaptive Middleware and Passing
    Strategies
  • IV Control Strategies and Rules
  • V Conclusion

II Optimization of Application Integration
14
Motivations for OAI
  • Quality of Service is the foremost IT objective
    for a mobile operator
  • IT is re-engineered around business processes
    (BPM)
  • QoS is defined through SLAs (Service Level
    Agreements)
  • Throughput: a flow of 3,000 new subscriptions per
    day
  • Latency: end-to-end processing time for a new
    subscription
  • Example: less than 4 hours
  • Availability: share of 24/7 time during which the
    subscription service is available
  • The challenge is to optimize the QoS of a chain
    from the specification of its links
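
To make "the QoS of a chain from the specification of its links" concrete, here is a minimal sketch under strong simplifying assumptions: the links are traversed in series, calls are synchronous, and failures are independent. Under those assumptions throughput is bounded by the slowest link, latencies add up, and availabilities multiply. The class and function names, the 0.99 availability target and all link figures are hypothetical; only the 3,000 subscriptions per day and the 4-hour latency come from the slide. Real asynchronous chains do not compose this simply.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class QoS:
    """Service levels for one link or for a whole chain (hypothetical names)."""
    throughput_per_day: float  # messages the link can process per day
    latency_hours: float       # processing time contributed by the link
    availability: float        # fraction of 24/7 time the link is up, in [0, 1]


def chain_qos(links: list[QoS]) -> QoS:
    """Compose link QoS for a serial, synchronous chain with independent failures."""
    return QoS(
        throughput_per_day=min(link.throughput_per_day for link in links),
        latency_hours=sum(link.latency_hours for link in links),
        availability=prod(link.availability for link in links),
    )


def meets_sla(actual: QoS, sla: QoS) -> bool:
    """True when the chain is at least as good as the SLA on all three axes."""
    return (actual.throughput_per_day >= sla.throughput_per_day
            and actual.latency_hours < sla.latency_hours
            and actual.availability >= sla.availability)


# Throughput and latency targets come from the slide; availability is invented.
subscription_sla = QoS(throughput_per_day=3000, latency_hours=4.0, availability=0.99)

# Invented link specifications, for illustration only.
links = [
    QoS(5000, 0.5, 0.999),  # e.g. CRM
    QoS(4000, 1.0, 0.995),  # e.g. middleware
    QoS(3500, 2.0, 0.998),  # e.g. provisioning
]
chain = chain_qos(links)
print(chain, meets_sla(chain, subscription_sla))
```

Even in this toy case the chain's availability is lower than that of its weakest link, which is one reason end-to-end SLAs are harder to meet than per-system targets.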

15
Business Process and Priorities
  • i-mode launch example
  • i-mode subscription is one of many business
    processes
  • Others include billing, account management, ...
  • SLA goals seemed straightforward

[Diagram with three layers (Processes, Systems, Infrastructure); labels: Customer Base, CRM, Service Platform, Provisioning, Order Management, Fraud, Help, Accounts, Network]
III OAI and Processes
16
OAI: Optimization of Application Integration
Goals: (1) sizing rules, (2) monitoring
strategy, (3) operations: incident
protocols, (4) design: routing / sorting rules
  • IT systems: throughput, latency, availability,
    message protocol
  • Goals (SLA) for each process: availability,
    latency, throughput
  • Middleware: throughput, latency, availability,
    message routing

[Diagram label: Processes]
17
The challenge of OAI
  • Why is OAI hard?
  • Asynchronous availability is hard to compute
  • Sizing (multi-commodity flow)
  • Stochastic (irregular flows, bursts)
  • Non-linear behavior (message protocol)
  • Monitoring is difficult (for explanations)
  • Functional dependencies between processes
    (QoS/QoD)
  • Culture problem
  • Batch, client/server and 3-tier architectures
    have been around for a while -> incident-solving
    know-how
  • Distributed, asynchronous systems that exchange
    messages are far less common
  • BP culture takes a long time to grow (cf. next
    slide)

18
Business Process Monitoring
First step: taking ownership of business processes
Operations, 24/7 (alerts)
Client (Excel)
IT experts (score cards)
  • BPM architecture is process-oriented -> better
    monitoring
  • BAM monitoring tools are more and more
    relevant
  • BUT
  • Double cycle of maturity
  • True complexity

[Diagram: business maturity vs. technical maturity; labels: Processes, SLA, Applications, Errors, Systems, Incident]
19
Quality of service and Quality of Data
  • References
  • Sterling: "Data synchronization: What is Bad
    Data Costing Your Company"
  • DWHI: "Data Quality and the bottom line:
    achieving business success through a commitment
    to high quality data"
  • Error rates ranging from a few % up to a few
    tens of %!
  • Direct impact: loss of revenue
  • Bouygues Telecom's experience: coupled
    degradation
  • QoS -> QoD
  • De-synchronization -> functional errors
  • QoS degradation -> process exception handling ->
    errors (input coherence)
  • QoD -> QoS
  • Data mapping inconsistencies -> errors with
    adaptors and gateways -> pending customer
    requests
  • More exception handling -> longer processing
    time -> non-respect of SLA

20
Part III
  • I Enterprise Architecture and Re-engineering
  • II Optimization of Application Integration
  • III Self-Adaptive Middleware and Passing
    Strategies
  • IV Control Strategies and Rules
  • V Conclusion

III Self-Adaptive Middleware
21
SLAs, Priorities and Adaptive Strategies
  • Each process has an SLA (throughput, latency,
    availability)
  • Business processes have different priorities
  • An adaptive strategy should balance the load
    according to priorities and SLAs
  • Self-adaptive: tolerance to bursts
  • Self-healing: tolerance to short failures
    (fail-over)
  • Two approaches
  • Message handling rules: modify the order in
    which messages are handled (higher priority
    first)
  • Control rules: slow down lower-priority flows

22
Simulation Model
  • 5 Processes (simplified real problem)
  • P1 is a high-priority subscription process
    (high latency)
  • P2 is a medium-priority automated barring
    process
  • P3 is a lower-priority (3) barring process
  • P4 is a high-priority de-barring process (low
    latency)
  • P5 is a query process of medium priority
  • Finite-event model
  • Scenarios to evaluate graceful degradation

[Diagram labels: Infrastructure, Processflow Engine, System, Monitor, StartTask, EndTask, ReceivedTask, TimeOutAlert, StartProcess, EndProcess, SetStatus, Failure]
23
Routing Strategies
  • FCFS (FIFO)
  • Default method for most middleware; respects
    temporal constraints
  • However, temporal ordering is not preserved by
    load distribution
  • LCFS (FILO)
  • Good strategy for handling backlogs
  • SLA routing
  • Prediction of processing time based on SLA
  • Combination with priorities
  • Process high priority messages first
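
The strategies above can be read as different sort keys over a system's mailbox. The sketch below is illustrative only: the `Message` fields and function names are hypothetical, and "SLA routing" is approximated by sorting on the slack left before an SLA-derived deadline, which is an assumption and not necessarily the prediction technique used in the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A pending message in a system mailbox (hypothetical field names)."""
    name: str
    arrival: float   # time the message entered the mailbox
    priority: int    # 1 = highest business priority
    deadline: float  # time by which the SLA says the process should finish


def fcfs(mailbox):
    """First come, first served: oldest message first."""
    return sorted(mailbox, key=lambda m: m.arrival)


def lcfs(mailbox):
    """Last come, first served: newest message first."""
    return sorted(mailbox, key=lambda m: m.arrival, reverse=True)


def priority_then_sla(mailbox, now):
    """Highest priority first; within a priority, least remaining slack first."""
    return sorted(mailbox, key=lambda m: (m.priority, m.deadline - now))


# Invented mailbox content, for illustration only.
mailbox = [
    Message("query", arrival=0.0, priority=2, deadline=50.0),
    Message("subscription", arrival=1.0, priority=1, deadline=240.0),
    Message("barring", arrival=2.0, priority=3, deadline=30.0),
    Message("de-barring", arrival=3.0, priority=1, deadline=10.0),
]

print([m.name for m in fcfs(mailbox)])
# ['query', 'subscription', 'barring', 'de-barring']
print([m.name for m in lcfs(mailbox)])
# ['de-barring', 'barring', 'subscription', 'query']
print([m.name for m in priority_then_sla(mailbox, now=4.0)])
# ['de-barring', 'subscription', 'query', 'barring']
```

Because each policy is just a sort key, a policy can be swapped per system without touching the rest of the infrastructure.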

III Message Passing
24
Scenarios
  • 3 types of scenarios
  • Reference: static (with overload)
  • Burst
  • Component failure
  • Different event distributions (uniform, Poisson,
    ...)
  • Performance evaluation
  • Multiple runs
  • Average and standard deviation of SLA achievement
  • Goal is to observe graceful degradation
    (lower priority processes degrade first)
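
The evaluation protocol (several runs, then average and standard deviation of SLA achievement) can be sketched as follows. This is a toy model and not the simulation engine used in the study: it assumes a single FCFS server with a fixed service time and Poisson arrivals, and every parameter value and function name is hypothetical.

```python
import random
import statistics


def sla_achievement(rate, service_time, max_latency, n_messages, seed):
    """Fraction of messages finished within max_latency on one FCFS server.

    Arrivals follow a Poisson process (exponential inter-arrival times).
    """
    rng = random.Random(seed)
    clock = 0.0        # arrival time of the current message
    server_free = 0.0  # time at which the server finishes its current work
    on_time = 0
    for _ in range(n_messages):
        clock += rng.expovariate(rate)
        start = max(clock, server_free)
        server_free = start + service_time
        if server_free - clock <= max_latency:
            on_time += 1
    return on_time / n_messages


def evaluate(rate, runs=20, **kwargs):
    """Mean and standard deviation of SLA achievement over several seeded runs."""
    scores = [sla_achievement(rate, seed=s, **kwargs) for s in range(runs)]
    return statistics.mean(scores), statistics.stdev(scores)


# Invented parameters: one time unit of service, latency target of five units.
params = dict(service_time=1.0, max_latency=5.0, n_messages=2000)
light = evaluate(rate=0.5, **params)     # about 50% load
overload = evaluate(rate=1.2, **params)  # about 120% load: the backlog keeps growing
print(f"light load: mean={light[0]:.3f} stdev={light[1]:.3f}")
print(f"overload:   mean={overload[0]:.3f} stdev={overload[1]:.3f}")
```

Fixing the seeds makes runs repeatable, so two routing policies can be compared on exactly the same arrival sequences.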

25
Results
  • Priority routing works. The algorithms that use
    process priority as part of the sorting strategy
    are able to maintain the SLA of high priority
    processes much longer.
  • The second lesson is that FCFS is not a good
    default algorithm. LCFS does better as soon as
    the event flow becomes tight.
  • The combination of priority and SLA sorting is
    the best approach.

26
Part IV
  • I Enterprise Architecture and Re-engineering
  • II Optimization of Application Integration
  • III Self-Adaptive Middleware and Passing
    Strategies
  • IV Control Strategies and Rules
  • V Conclusion

IV Control Strategies
27
Flow Rules
  • The first intuition at Bouygues Telecom was to
    implement control flow mechanisms (emergency
    mode)
  • Before actually implementing it in the EAI
    adapter, we used the simulation engine to
    evaluate two strategies
  • RS1: When the QoS of a system X falls below
    90% of its SLA level (cf. Section 3), we reduce
    the flow of systems that are providers of X
    and whose priority is lower than that of X. A
    dual rule restores the default setting once the
    QoS of X reaches 90%.
  • RS2: This is a similar rule, but the triggering
    condition is based on processes. When the QoS of
    a process P falls below 90%, we reduce the flow
    of all systems that have a lower priority than P
    and that are providers of a system that supports
    P.
  • Control flow is more complex to operate but it is
    not necessarily part of the middleware
    infrastructure
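
Rule RS1 and its dual can be sketched as a single pass over the monitored systems. This is a hedged illustration: the `System` class, its fields and the example topology are hypothetical, priorities are assumed to be integers with 1 as the highest, and "reducing the flow" is modelled as a simple boolean flag.

```python
from dataclasses import dataclass, field

THRESHOLD = 0.90  # trigger level quoted in RS1: 90% of the SLA level


@dataclass
class System:
    """A connected system as seen by the rule (all names are hypothetical)."""
    name: str
    priority: int            # 1 = highest priority
    qos_ratio: float = 1.0   # measured QoS divided by the SLA level
    providers: list = field(default_factory=list)  # systems feeding this one
    throttled: bool = False  # True while the system's outgoing flow is reduced


def apply_rs1(systems):
    """Throttle lower-priority providers of degraded systems; restore all others."""
    to_throttle = set()
    for x in systems:
        if x.qos_ratio < THRESHOLD:
            for provider in x.providers:
                if provider.priority > x.priority:  # larger number = lower priority
                    to_throttle.add(provider.name)
    for s in systems:
        # Dual rule: a system that no longer needs throttling gets its default flow back.
        s.throttled = s.name in to_throttle
    return {s.name: s.throttled for s in systems}


# Invented topology: CRM and a batch system both feed provisioning.
crm = System("crm", priority=1)
batch = System("batch", priority=3)
provisioning = System("provisioning", priority=2, providers=[crm, batch])
systems = [crm, batch, provisioning]

provisioning.qos_ratio = 0.85
degraded = apply_rs1(systems)   # batch is throttled, crm is not
provisioning.qos_ratio = 0.95
recovered = apply_rs1(systems)  # every flow is back to its default
print(degraded, recovered)
```

RS2 would use the same loop with the trigger evaluated on the QoS of a process and the candidates taken from the providers of every system that supports it.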

28
Routing Rules
  • We implemented rules that dynamically change the
    message handling strategy (using a status:
    FAST means use PRL to process a backlog)
  • RS3: When the QoS of a system X drops below 95%,
    the system is switched to FAST status. The system
    resumes normal status once the QoS returns above
    95%.
  • RS4: When the QoS of a process P drops below 95%,
    all systems that support this process are
    switched to FAST status.
  • RS5: A system is switched to FAST status whenever
    its mailbox size grows over 100. Obviously, the
    triggering size is a constant that depends on the
    volume that is processed by the EAI and the
    number of connected systems.
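
A minimal sketch of the status switch behind RS3 and RS5 is shown below. The thresholds are the ones quoted on the slide; the `SystemState` class and its fields are hypothetical. The slide does not say when an RS5-triggered system leaves FAST status, so the sketch assumes it does so once the mailbox is back under the threshold and the QoS is above 95%.

```python
from dataclasses import dataclass

QOS_THRESHOLD = 0.95     # RS3 trigger quoted on the slide
MAILBOX_THRESHOLD = 100  # RS5 trigger quoted on the slide


@dataclass
class SystemState:
    """Monitoring data for one system (hypothetical field names)."""
    qos_ratio: float   # measured QoS divided by the SLA level
    mailbox_size: int  # number of pending messages
    status: str = "NORMAL"


def update_status(system):
    """Switch to FAST when RS3 or RS5 fires, back to NORMAL once QoS is above 95%."""
    if system.qos_ratio < QOS_THRESHOLD or system.mailbox_size > MAILBOX_THRESHOLD:
        system.status = "FAST"
    elif system.qos_ratio > QOS_THRESHOLD:
        # Assumed exit condition for RS5 as well: no backlog and QoS above threshold.
        system.status = "NORMAL"
    return system.status


# Invented sequence of (QoS ratio, mailbox size) observations.
observations = [(0.97, 40), (0.93, 40), (0.97, 150), (0.97, 40)]
system = SystemState(qos_ratio=1.0, mailbox_size=0)
history = []
for qos_ratio, mailbox_size in observations:
    system.qos_ratio, system.mailbox_size = qos_ratio, mailbox_size
    history.append(update_status(system))
print(history)  # ['NORMAL', 'FAST', 'FAST', 'NORMAL']
```

A system in FAST status would then pick its backlog-oriented mailbox ordering instead of its default one.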

29
Results
Does not provide any stable improvement
  • Small improvement
  • Simpler is better

30
Conclusions
  • A first step towards autonomic BPM
  • Self-optimization
  • Priority handling works: it is possible and
    fairly simple to take process priority into
    account when routing messages, and the results
    show a real improvement.
  • The routing (mailbox sorting) algorithm matters:
    the more sophisticated SLA projection technique
    showed a real improvement over an FCFS policy.
  • Control rules are interesting, but they are
    secondary to the routing policy: it is more
    efficient to deal with congestion problems with a
    distributed routing strategy than with a
    global rule schema.
  • Self-healing: some form of self-healing is
    demonstrated, but true self-healing requires
    collaboration with HW
  • Self-configuration: the goal is to make
    configuration declarative (e.g., SLA) vs.
    defining time resource configuration (e.g.,
    schedules)

V Conclusions
31
Next Steps for Bouygues Telecom
  • Promote SLA in BPM standards (BPEL <- WSLA, QML,
    ...)
  • Priorities in BPM engines (lobbying)
  • Organic operations
  • From a mechanical toward a biological vision of
    fault tolerance?
  • "Incidents do occur": handling is part of
    business know-how, and often relies on a deep
    understanding of business logic.
  • Incident recovery strategies and tools are
    first-class citizens of the IT infrastructure.

[Diagram: systems ST1 to ST4, with backup ("secours") instances of ST1 and ST3; captions: system-based monitoring / recovery, process monitoring / recovery]
32
References
  • General problem
  • Urbanisation et BPM, 2nd edition, Dunod, March
    2006.
  • OAI experiments
  • Self-Adaptive and Self-Healing Message Passing
    Strategies for Process-Oriented Integration
    Infrastructures. ECBS 2004: 506-512
  • Self-adaptive middleware: Supporting business
    process priorities and service level agreements.
    Advanced Engineering Informatics 19(3): 199-211
    (2005)
  • Complex systems: organizational architecture
  • How can the information flows in a company be
    modeled (starting from its processes) as a
    function of its organization?
  • http://organisationarchitecture.blogspot.com/
  • Book in preparation for January 2006