Title: ECCS 2005
1. ECCS 2005
- November 18th, 2005
- A Few Steps Towards Managing the Complexity of IT
- Yves Caseau
- Bouygues Telecom
2. Outline
- I Bouygues Telecom's IT Complexity
- II Enterprise Architecture
- III Business Process Management
- IV QoS Delivery and Optimization of Application Integration
- V Next Steps
3. Which complexity?
- Size
- Over 1 million function points, 50 M TPMC (1000 servers), 700 TB
- 60 makes a tightly integrated global system (SIC)
- Impact: testing makes up a larger and larger part of software project costs
- Time Dependency
- Production planning
- Project planning
- Quality of Service
- Customer-facing IT
- Level of expectation is constantly increasing
I Bouygues Telecom's IT
4. Bouygues Telecom's IT history
- 95-99 (exponential growth): IT built around the BSCS integrated package (Billing, CRM, Provisioning, Customer database, ...)
- Why change?
- Capacity and performance problems
- Too much ad-hoc development (costs)
- Time-to-market increase, decrease in flexibility
- 99-2000 strategy
- Take ownership of IT: business objects, processes, integration
- Performance and scalability: component architecture
- Flexibility: BPM architecture, flexible components (meta-data)
- Quality of Service: redundant, secure infrastructure, SLA monitoring
5. Enterprise architecture EAI
- Principles of Enterprise Architecture
- Component-based Integration Infrastructure
- Model-based approach
- Business objects
- Business processes
- Business Logic separated from services
- XML Interface standardization
- Expected Benefits (complexity perspective)
- Modularity (sub-system design)
- Upward compatibility (fewer impacts)
- Sharing, re-use (Service Oriented Architecture)
- 2001 to 2005 Back-office re-engineering
- Integration Infrastructure 2001-2002
- Mediation 2001-2003
- Siebel CRM 2001 to 2004
- Provisioning 1Q04
- Rating 2002 (data), 2004 (voice)
- Roaming 1Q04 (rating engine reuse)
- Billing 2005 Geneva (package)
II Enterprise Architecture and Re-engineering
6. A Focus on Business Processes
[Figure: IT systems (C. Management, CRM, DWH, Billing, Provisioning); business processes P1 and P2 with their tasks; task distribution; EAI infrastructure (Process Management, Transport, Business Objects Management, Technical Processes, Directories).]
- Each transition is defined through business object updates
III Business Process Management
7. BPM Maturity model
[Figure: BPM maturity chart. Phases: Analyze (process measure, process analysis, value analysis), Implementation, Improve. Axes, each scaled from 0 to 100: management automation, part of IT within business process, value-to-process assignment, formal vs. informal description of processes, automated performance measurement, economic performance model, optimization lifecycle reactivity, IT "process-orientation". Scale labels include ad hoc / planned / continuous / real time, manual / automated, slow / agile / automated / continuous, and ad hoc / model / solvable equations.]
- BPM is a quiet revolution, which spans many years.
- Two major themes: TQM and value analysis
8. The Truth about BPM?
- Agility is not a matter of technology but of design
- Modularity, flexibility of functional analysis
- Upward-compatibility of XML exchange formats
- Agile testing: easier said than done
- Synchronization of distributed heterogeneous data sources is an old and hard problem
- Coherence of synchronization and process control flows
- Need for re-synchronization (recovery)
- Shared resources mean that process executions are not independent (serialization and transaction mechanisms are needed)
- Business process operations: the hard part
- Monitoring is more difficult because the system is more robust?
- Incidents must be resolved on active systems
- A new culture
9. OAI Problem Definition
- Context: (1) business processes which run over a shared set of components
[Figure: Help, PFS, Customer Base, Provisioning and CRM systems, adapter, bus, processflow engine. Callout: 20 clients per hour in less than 2 minutes.]
- (2) Service Level Agreements
- (3) Random events
- Activity bursts
- Failures
- Interaction with other processes
- Question: can process management (load balancing) be automated to maximize business priority satisfaction?
IV Optimization of Application Integration
10. OAI: Optimization of Application Integration
- Goals: (1) sizing rules, (2) monitoring strategy, (3) operations: incident protocols, (4) design: routing / sorting rules
- i-mode launch example
- i-mode subscription is one of many business processes
- Others include billing / Account management /
- SLA goals seemed straightforward
- Middleware
- Throughput
- Latency
- Availability
- Message routing
[Figure: systems Customer Base, Service Platform, CRM, Provisioning, Order Management, Fraud, Help, Accounts, Network.]
Processes - SLA - Priorities
- Goals (SLA)
- Availability
- Latency
- Throughput
- For each process
- IT Systems
- Throughput
- Latency
- Availability
- Message protocol
11. The challenge of OAI
- Why is OAI hard?
- Asynchronous availability is hard to compute
- Sizing (multi-commodity flow)
- Stochastic (irregular flows, bursts)
- Non-linear behavior (message protocol)
- Monitoring is difficult (for explanations)
- Functional dependencies between processes (QoS/QoD)
- Culture problem
- Batch, client/server and 3-tier architectures have been around for a while -> incident-solving know-how
- Distributed, asynchronous systems that exchange messages are far less common
- BP culture takes a long time to grow (global perspective)
12. SLAs, Priorities and Adaptive Strategies
- Business processes have different priorities
- An adaptive strategy should balance the load according to priorities and SLAs
- Self-adaptive: tolerance to bursts
- Self-healing: tolerance to short failures (fail-over)
- Two approaches
- Control flow mechanisms
- We used the simulation engine to evaluate two strategies
- RS1: When the QoS of a system X falls below 90% of its SLA level (cf. Section 3), we reduce the flow of the systems that are providers of X and whose priority is lower than that of X. A dual rule restores the default setting once the QoS of X reaches 90%.
- RS2: This is a similar rule, but the triggering condition is based on processes. When the QoS of a process P falls below 90%, we reduce the flow of all systems that have a lower priority than P and that are providers of a system that supports P.
- Control flow is more complex to operate, but it is not necessarily part of the middleware infrastructure
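The RS1 rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the simulation engine used for the talk: the `System` class, the `providers` mapping, the 0.5 throttle factor, the example priorities and QoS values, and the convention that a larger number means a higher priority are all hypothetical.

```python
from dataclasses import dataclass

SLA_THRESHOLD = 0.9    # the 90% trigger level quoted in RS1
THROTTLE_FACTOR = 0.5  # assumed reduction applied to a throttled provider


@dataclass
class System:
    name: str
    priority: int             # assumption: larger value = higher priority
    qos: float                # measured QoS, same unit as sla_level
    sla_level: float          # QoS level promised by the SLA
    flow_factor: float = 1.0  # 1.0 = default (unthrottled) output flow


def apply_rs1(systems, providers):
    """Evaluate RS1 and its dual rule once.

    providers[x] lists the names of the systems that send messages to x.
    Returns the set of system names that are currently throttled.
    """
    by_name = {s.name: s for s in systems}
    throttled = set()
    for x in systems:
        # Trigger: QoS of X is below 90% of its SLA level.
        if x.qos < SLA_THRESHOLD * x.sla_level:
            for p_name in providers.get(x.name, []):
                # Only providers with a lower priority than X are slowed down.
                if by_name[p_name].priority < x.priority:
                    throttled.add(p_name)
    # Dual rule: any system that is no longer throttled returns to its default.
    for s in systems:
        s.flow_factor = THROTTLE_FACTOR if s.name in throttled else 1.0
    return throttled


# Hypothetical example: Provisioning is degraded, CRM and Fraud feed it.
crm = System("CRM", priority=1, qos=1.0, sla_level=1.0)
fraud = System("Fraud", priority=3, qos=1.0, sla_level=1.0)
prov = System("Provisioning", priority=2, qos=0.8, sla_level=1.0)
providers = {"Provisioning": ["CRM", "Fraud"]}

throttled = apply_rs1([crm, fraud, prov], providers)
print(throttled)  # only CRM has a lower priority than Provisioning
```

RS2 would follow the same pattern with the trigger evaluated per process instead of per system.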
- Message handling rules: modify the order in which messages are handled
- FCFS (FIFO)
- Default method for most middleware: respects temporal constraints
- However, temporal ordering is not preserved by load distribution
- LCFS (FILO)
- Good strategy for handling backlogs
- SLA routing
- Prediction of processing time based on SLA
- Combination with priorities
- Process high-priority messages first
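The message handling rules above amount to different sort orders on a mailbox. The Python sketch below illustrates them under stated assumptions: the `Message` fields, the sample values and the function names are hypothetical. The slides do not spell out the SLA projection technique, so SLA routing is replaced here by a plain earliest-deadline-first ordering.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str
    arrival: float       # time the message entered the mailbox (seconds)
    priority: int        # assumption: larger value = higher priority
    sla_deadline: float  # hypothetical: latest completion time allowed by the SLA


def fcfs(mailbox):
    """First come, first served: oldest message first."""
    return sorted(mailbox, key=lambda m: m.arrival)


def lcfs(mailbox):
    """Last come, first served: newest message first."""
    return sorted(mailbox, key=lambda m: m.arrival, reverse=True)


def priority_first(mailbox):
    """Highest priority first, FCFS among messages of equal priority."""
    return sorted(mailbox, key=lambda m: (-m.priority, m.arrival))


def earliest_deadline_first(mailbox):
    """Stand-in for SLA routing: the message closest to its deadline goes first."""
    return sorted(mailbox, key=lambda m: m.sla_deadline)


def ids(messages):
    """Helper for display: keep only the message identifiers."""
    return [m.msg_id for m in messages]


mailbox = [
    Message("a", arrival=0.0, priority=1, sla_deadline=120.0),
    Message("b", arrival=1.0, priority=3, sla_deadline=200.0),
    Message("c", arrival=2.0, priority=2, sla_deadline=60.0),
]

print(ids(fcfs(mailbox)))                     # ['a', 'b', 'c']
print(ids(lcfs(mailbox)))                     # ['c', 'b', 'a']
print(ids(priority_first(mailbox)))           # ['b', 'c', 'a']
print(ids(earliest_deadline_first(mailbox)))  # ['c', 'a', 'b']
```

Because each policy is a pure function over the mailbox content, it can be applied locally at each adapter, in line with the distributed routing strategy that the deck favors over global control rules.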
13. Conclusions for Finite-Event Simulation
- A first step towards autonomic BPM
- Self-optimization
- Priority handling works: it is possible and fairly simple to take process priority into account for routing messages, and the results show a real improvement.
- Routing (mailbox sorting) algorithm matters: the more sophisticated SLA projection technique showed a real improvement over an FCFS policy.
- Control rules are interesting, but they are secondary to the routing policy: it is more efficient to deal with congestion problems with a distributed routing strategy than with a global rule schema.
- Self-healing: some form of self-healing is demonstrated, but true self-healing requires collaboration with HW
- Self-configuration: the goal is to make configuration declarative (e.g., SLA) vs. defining time resource configuration (e.g., schedules)
V Conclusions
14. Next Steps for Bouygues Telecom (I)
- Promote SLA in BPM standards (BPEL <- WSLA, QML, ...)
- Priorities in BPM engines
- Organic operations
- From a mechanical toward a biological vision of fault-tolerance?
- Incidents do occur: handling them is part of business know-how, and often relies on a deep understanding of business logic.
- Incident recovery strategies and tools are first-class citizens of the IT infrastructure.
[Figure: systems ST1 to ST4 with backup instances (ST1 backup, ST3 backup), contrasting system-based monitoring / recovery with process monitoring / recovery.]
15. Next Steps for Bouygues Telecom (II)
- Complexity management is crucial to meet new challenges
- Key component is human resources
- Change management
- Training (hands-on experiments)
- Systems thinking (Fifth Discipline?)
- BPM
- Self-apply CMMI (and ITIL)
- Dual maturity chains (technical / business)
- Enterprise Architecture
- Corporate-scale SOA
- Promote re-use and sharing
- Distributed Data Architecture
- Long running transactions
- Fractal architecture