Title: Visible Ops: A Statistical Approach To Prioritizing ITIL Processes And Controls (A Five Year Study)
1 Visible Ops: A Statistical Approach To Prioritizing ITIL Processes And Controls (A Five Year Study)
- Gene Kim, CTO, Tripwire, Inc.
- Tampa Bay itSMF LIG
- December 6, 2005
2 Thought Experiment
Meaningful Metrics
Phase 2
- Which is more desirable?
  - 1000 servers, configured identically, but configured insecurely
  - 1000 servers, configured randomly, but 50% configured in a secure manner
3 The Highest Performing IT Organizations
- Best in class Ops and Security organizations have
  - Highest ratio of staff for pre-production processes
  - Lowest amount of unplanned work
  - Highest change success rate
  - Best posture of compliance
  - Lowest cost of compliance
4 Common Traits Of The Highest Performers
- Culture of change management
  - Integration of IT operations and security processes via problem management and change management processes
  - Processes that serve both organizational needs, as well as business objectives
  - Highest rate of effective change (approved changes, change success rate)
- Culture of causality
  - Highest service levels (MTTR, MTBF)
  - Highest first fix rate (unneeded rework)
- Culture of compliance and continual reduction of operational variance
  - Production configurations
  - Highest level of pre-production staffing
  - Effective pre-production controls
  - Effective pairing of preventive and detective controls
5 Causal Factors of IT Downtime
- Operator Error: 60%
- Application Failure: 20%
- System Outages: 20%
  - Security Related: 5%
  - Non-Security Related: 15%
Source: IDC, 2004
6 Common Process Areas Of High Performers
- All the high performers had self-derived the same way of working
  - Culture of change management
  - Culture of causality
  - Culture of compliance and desire to continually reduce variance
7 Capability Levels
Organization controls the changes
- 4 - Continuously Improving
  - <5% of time spent on unplanned work
Changes control the organization
- 3 - Closed-Loop Process
  - 15-35% of time spent on unplanned work
- 2 - Using Honor System
  - 35-50% of time spent on unplanned work
- 1 - Reactive
  - Over 50% of time spent on unplanned work
[Figure: Effectiveness axis with the levels Reactive, Using The Honor System, Closed-Loop Change Mgt, Continuously Improving]
Based on the IT Process Institute's Visible Ops Framework
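As a rough illustration, the bands above can be written as a lookup from the percentage of time spent on unplanned work to a capability level. This is only a sketch: the function name `capability_level` is hypothetical, the treatment of the shared boundary at 35% is an assumption, and the slide does not assign a level to the 5-15% range, so the sketch returns `None` there rather than guessing.

```python
from typing import Optional


def capability_level(unplanned_pct: float) -> Optional[int]:
    """Map % of time spent on unplanned work to a capability level (1-4).

    Bands follow the slide. Assumptions made here, not stated on the slide:
    exactly 35% counts as level 3, and 5% up to (but excluding) 15% has no
    level, so None is returned for that range.
    """
    if unplanned_pct < 0:
        raise ValueError("unplanned_pct cannot be negative")
    if unplanned_pct < 5:
        return 4  # Continuously Improving
    if 15 <= unplanned_pct <= 35:
        return 3  # Closed-Loop Process
    if 35 < unplanned_pct <= 50:
        return 2  # Using Honor System
    if unplanned_pct > 50:
        return 1  # Reactive (values above 100% imply overtime)
    return None  # 5-15%: not covered by the bands on the slide


if __name__ == "__main__":
    for pct in (3, 10, 20, 40, 60):
        print(pct, capability_level(pct))
```

Returning `None` for the uncovered range keeps the gap visible instead of silently assigning those organizations to a neighboring level.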
8 Visible Ops: Four Steps To Build An Effective Change Management Process
- Each of the four Visible Ops steps is
  - A finite project: not an ISO 9001 initiative or a vague 5-year vision
  - Catalytic: returns more resources to the organization than it consumes, fueling the next steps
  - Sustaining: process stays in place, even when the initial force behind it disappears
  - Auditable: supports factual reporting and attestation to process adherence and consistency
  - Ordered: must be done in the specified order to achieve the above
- Model based on five years studying high-performing IT Ops and Security organizations
- Visible Ops has been donated to the ITPI
9 Visible Ops: Four Steps To Build An Effective Change Management Process
- Phase 1: Electrify Fence, Modify First Response
  - Tripwire enforces the change process. Tripwire rules out change as early as possible in the repair cycle.
- Phase 2: Catch and Release, Find Fragile Artifacts
  - Tripwire protects fragile artifacts. Tripwire enforces change freeze and prevents configuration drift.
- Phase 3: Establish Repeatable Build Library
  - Tripwire captures known good state in preproduction. Tripwire captures production changes that need to be baked into the build.
- Phase 4: Continually Improve
  - Tripwire detects change, which all process areas hinge upon.
10 Which Metric Do You Want To Improve?
Phase 4
- Release
  - Time to provision known good build
  - # of turns to a known good build
  - Shelf life of build
  - % of systems that match known good build
  - % of builds that have security sign-off
  - # of fast-tracked builds
  - Ratio of release engineers to sysadmins
- Controls
  - # of changes authorized per week
  - # of actual changes made per week
  - # of unauthorized changes
  - Change failure rate
  - # of emergency changes
  - # of service-affecting outages
  - # of special changes
  - # of business as usual changes
  - Change management overhead
  - Configuration variance
- Resolution
  - MTTR, MTBF
  - % of time spent on unplanned work
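To make a few of these metrics concrete, the sketch below computes MTTR, MTBF, and change failure rate from simple records. The `Outage` record, the fixed observation window, and the definitions used (MTTR as mean outage duration, MTBF as total uptime divided by the number of outages) are assumptions chosen for illustration; the slides do not define how the metrics are calculated.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Outage:
    """Hypothetical outage record; hours are measured from the window start."""
    start_hour: float
    end_hour: float

    @property
    def duration(self) -> float:
        return self.end_hour - self.start_hour


def mttr(outages: Sequence[Outage]) -> float:
    """Mean time to repair: average outage duration in hours."""
    if not outages:
        raise ValueError("MTTR is undefined without outages")
    return sum(o.duration for o in outages) / len(outages)


def mtbf(outages: Sequence[Outage], window_hours: float) -> float:
    """Mean time between failures: total uptime divided by outage count."""
    if not outages:
        raise ValueError("MTBF is undefined without outages")
    downtime = sum(o.duration for o in outages)
    return (window_hours - downtime) / len(outages)


def change_failure_rate(total_changes: int, failed_changes: int) -> float:
    """Fraction of changes that failed (0.0 to 1.0)."""
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    return failed_changes / total_changes


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    month = [Outage(10, 12), Outage(100, 101), Outage(500, 503)]
    print("MTTR (h):", mttr(month))
    print("MTBF (h):", mtbf(month, window_hours=720))
    print("Change failure rate:", change_failure_rate(200, 14))
```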
11 Visible Ops Phase 1: Ungoverned Change (Unplanned work > 100%)
Our prediction is that as failed changes and unauthorized changes increase, unplanned work increases at a growing rate, to the point where overtime or additional staff are required. Note that the total number of changes does not have to increase for this to occur.
[Figure: unplanned work, failed changes or number of unauthorized changes, and change rate, plotted over time]
12 Visible Ops Phase 1: Stabilized Patient
The better alternative is to reduce or eliminate all unauthorized or failed change. One purpose of the survey is to find which controls are best at achieving this objective. If you can reduce unauthorized change, we predict unplanned work will start to decrease.
[Figure: unplanned work, failed changes or number of unauthorized changes, and change rate, plotted over time]
13 Visible Ops Phase 1: Increasing Auditability
We predict that improved control over change will also reduce compliance costs and effort, as well as increasing auditors' perception of effective IT controls.
[Figure: auditors' perception of assurance, control over change, time spent on audit prep and liaising, and % of time spent on compliance activities, plotted over time]
14 Visible Ops Phase 2: Drifting Configurations
Once change management is under control, we then predict that as IT configurations diverge from their desired state, unplanned work will increase at a growing rate because changes will not be consistently successful.
[Figure: unplanned work, change success rate, mastery of each configuration, and # of unique configurations, plotted over time]
15 Visible Ops Phase 2: Find Fragile Artifacts
If you can reduce the configuration variance, unplanned work will decrease as the change success rate increases. This VEESC study also investigates which controls are most effective at achieving this objective.
[Figure: change success rate, mastery of each configuration, unplanned work, and # of unique configurations, plotted over time]
16 1. How Do You Electrify Fence?
- Must have a report that shows management that all production changes are authorized
  - What changes map to authorized and approved work orders?
  - What changes do not match expected changes?
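The two questions above amount to reconciling detected changes against approved work orders. The following is a minimal sketch of that reconciliation, assuming changes are identified by host and file path and that each work order lists the paths it expects to touch. The `DetectedChange` and `WorkOrder` records and the `reconcile` function are hypothetical names for illustration, not part of any Tripwire product interface.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class DetectedChange:
    """A change observed in production (hypothetical record shape)."""
    host: str
    path: str


@dataclass(frozen=True)
class WorkOrder:
    """An approved work order and the paths it is expected to change."""
    order_id: str
    host: str
    paths: FrozenSet[str]


def reconcile(
    detected: Iterable[DetectedChange],
    orders: Iterable[WorkOrder],
) -> Tuple[List[Tuple[DetectedChange, str]], List[DetectedChange]]:
    """Split detected changes into authorized and unauthorized.

    Authorized changes are returned with the matching order ID;
    everything else lands in the unauthorized list.
    """
    expected: Dict[Tuple[str, str], str] = {}
    for order in orders:
        for path in order.paths:
            expected[(order.host, path)] = order.order_id

    authorized: List[Tuple[DetectedChange, str]] = []
    unauthorized: List[DetectedChange] = []
    for change in detected:
        order_id = expected.get((change.host, change.path))
        if order_id is not None:
            authorized.append((change, order_id))
        else:
            unauthorized.append(change)
    return authorized, unauthorized


if __name__ == "__main__":
    orders = [WorkOrder("CHG-1", "web01", frozenset({"/etc/httpd.conf"}))]
    detected = [
        DetectedChange("web01", "/etc/httpd.conf"),
        DetectedChange("db01", "/etc/my.cnf"),
    ]
    ok, bad = reconcile(detected, orders)
    print("authorized:", ok)
    print("unauthorized:", bad)
```

A management report would then summarize the two lists, with the goal that the unauthorized list is empty.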
17 2. What Happens When You Touch The Fence?
- All the high-performing IT organizations had some common processes for handling unauthorized change
  - Making engineering team own the controls: "We just detected an unauthorized change; you have four hours to retroactively document your cowboy change, otherwise we mobilize security."
  - Deterrent and cultural controls: e.g., wall of shame, two strikes and you're out
- Auditors love it when management owns the controls
  - Preventive policies
  - Detective controls showing policies are being enforced
  - Documentation of corrective actions, showing deterrent controls
18 Biggest Mistakes That IT Executives Make
- Not locking down change
  - "We can't; we won't be able to get anything done."
  - "The business doesn't pay us to not make changes."
- Not electrifying fence
  - "We don't need to; we trust our own people."
  - "Our people are professionals and don't need constant micromanagement."
- Not tackling culture issues
  - Technology or process whiteboarding is easier to justify and implement than tackling people and culture issues
19 Thought Experiment
- Which is more desirable?
  - 1000 servers, configured identically, but configured insecurely
  - 1000 servers, configured randomly, but 50% configured in a secure manner
Most high-performing organizations would choose the first. Why? Ability to systematically change all configurations, ability to defeat entropy, ability to maintain any desired state.
20 Problem statement
- There are many best-practice frameworks for IT. Each framework offers a substantial number of recommendations for IT controls and processes. For example:
  - Information Technology Infrastructure Library (ITIL)
  - Control Objectives for Information and Related Technology (COBIT)
- Organizations spent considerably more than planned on compliance with Section 404 of the Sarbanes-Oxley Act (FEI, 2005).
- As a result, IT executives are asking:
  - Which IT controls are most important?
  - What methods are most cost effective?
  - Are IT controls just an added cost, or do they actually improve IT operations and security?
21 Descriptive statistics
- Sample
  - 95 organizations
- Mean number of IT employees: 483
  - Minimum: 3
  - Maximum: 7000
  - Standard deviation: 1249
- Mean number of IT components: 906
  - Minimum: 1
  - Maximum: 40000
  - Standard deviation: 4203
- Average IT expenditure: 114 million
  - Minimum: 5 million
  - Maximum: 1050 million
22 Descriptive Statistics - Industry
23 % of IT Staff In App Dev
[Diagram: End-user satisfaction, Unplanned work, Security Satisfaction, Change success rate, Security Integration; three of the nodes are annotated "What controls affect this?"]
24 Unplanned work and end-user satisfaction
[Diagram: End-user satisfaction, Unplanned work]
- The yearly average percentage of unplanned work is negatively correlated with the IT department's perception of end-user satisfaction.
- Correlation
  - r = -0.46
- Significance
  - p = 0.002 (Highly significant)
25 Unplanned work and end-user satisfaction
As the percentage of unplanned work decreases, end-user satisfaction increases.
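For readers who want to see how a correlation coefficient like this is obtained, here is a minimal sketch of the Pearson r calculation. The function name `pearson_r` and the sample values are hypothetical; the numbers are made up purely to show a negative relationship and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical survey answers: % unplanned work and a 1-5 satisfaction score.
    unplanned_pct = [10, 20, 30, 40, 50]
    satisfaction = [5, 4, 4, 2, 1]
    print(f"r = {pearson_r(unplanned_pct, satisfaction):.2f}")
```

A negative r means that higher values of one variable tend to go with lower values of the other, which is the direction the slide describes.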
26 What Makes Unplanned Work Go Up?
When IT personnel consider the change management process bureaucratic, unplanned work is 10% higher.
When developers have change access, unplanned work is 8% lower.
27 What Makes Unplanned Work Go Down?
28 What Makes Unplanned Work Go Down?
Can service level management activities succeed without change? Many practitioners believe not.
Change management processes must be enforced.
29 What Makes Chg Succ Rate Go Down?
When IT personnel consider the change management process bureaucratic, change success rate is 15% lower.
When developers have change access, change success rate is 15% lower.
30 What Makes Chg Succ Rate Go Up?
31 End-user satisfaction and security satisfaction
[Diagram: End-user satisfaction, Security Satisfaction]
- Security satisfaction is positively correlated with the IT department's perception of end-user satisfaction.
- Correlation
  - r = 0.45
- Significance
  - p = 0.000394 (Highly significant)
32 End-user satisfaction and security satisfaction (n = 58)
As the satisfaction with IT security increases, so too does the satisfaction with IT.
33 Security satisfaction and security integration
[Diagram: Security Satisfaction, Security Integration]
- Security integration is positively correlated with the IT department's perception of security satisfaction.
- Correlation
  - r = 0.75
- Significance
  - p < 0.000001 (Highly significant; reported as 0.000000)
34 Security satisfaction and security integration (n = 58)
As the level of security integration increases, so too does satisfaction with IT security.
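The significance values reported for the two n = 58 correlations can be sanity-checked from r and n alone. The sketch below assumes the p-values come from the usual two-sided t-test for a Pearson correlation, with t = r * sqrt((n - 2) / (1 - r^2)) and n - 2 degrees of freedom; the slides do not say which test was used. The function names are hypothetical, and the tail probability is obtained by simple numerical integration so that only the standard library is needed.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def _t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(r: float, n: int, steps: int = 2000) -> float:
    """Approximate two-sided p-value via Simpson's rule (steps must be even)."""
    df = n - 2
    t = abs(t_statistic(r, n))
    h = t / steps
    total = _t_pdf(0.0, df) + _t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= t)
    # Very small p-values are below the precision of this simple
    # integration, so the result is clamped at zero.
    return max(0.0, 1.0 - 2.0 * central)


if __name__ == "__main__":
    for r in (0.45, 0.75):
        print(f"r = {r}, n = 58: t = {t_statistic(r, 58):.3f}, "
              f"p = {two_sided_p(r, 58):.6f}")
```

Because r is rounded to two decimals on the slides, a recomputed p-value should only be expected to land close to the reported figure, not to match it digit for digit.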
35 What Makes Security Integration Go Down?
When IT personnel consider the change management process bureaucratic, security integration goes down.
When developers have change access, security integration goes down.
36 What Makes Security Integration Go Up?
37 Key VEESC Learnings
- Everyone knows that developers should not have production change access, and yet
  - 75% of respondents allow this to happen, and
  - They have higher amounts of unplanned work, lower change success rate, poorer IT satisfaction, poorer IT security satisfaction and integration
- Everyone knows that you should have a change management process, and yet
  - 33% of respondents don't have one, and
  - They have lower change success rates, poorer IT satisfaction, poorer IT security satisfaction and integration
- The only thing worse than having no change management process is
  - having a bureaucratic change management process
  - or not enforcing the change management process you have
38 Visible Ops: Making Change/Config Achievable
- Learn from the high performers
  - Visible Ops comes from years of studying high-performing IT operations and security organizations in conjunction with the IT Process Institute
  - There is no silver bullet: it's about people, process, and technology working together
- Specific steps, not just theory
  - Illustrates how to replicate the processes of these high-performing organizations in just four achievable steps
  - Specific section on preparing for audits
- The only acceptable metric for unauthorized change is zero!
"Visible Ops is a very valuable resource for anyone just getting started with IT change management processes. This resource would have saved me many hours of research had it been available when I was putting together the change management plan for our department. This book is well written, easy to follow, with good examples. It has everything you need from beginning development through the measuring the results."
-- Jackie Shaffer, Florida Department of Education
39 Key controls that reduce unplanned work
40 Key controls that reduce unplanned work
41 Key controls that reduce unplanned work
42 What controls affect Change Success Rate?
43 What controls affect Security Integration?
44 - Avg X = 3.689655 (n = 58); Avg Y = 2.120690 (n = 58); r = 0.45, p = 0.000394
45 - Avg X = 2.120690 (n = 58); Avg Y = 2.103448 (n = 58); r = 0.75, p = 0.000000
46 - Avg X = 2.083333 (n = 36); Avg Y = 11.847250 (n = 36); r = -0.40, p = 0.02
47 - Avg X = 2.200000 (n = 40); Avg Y = 71.575000 (n = 40); r = 0.44, p = 0.004651
48 - Avg X = 2.024390 (n = 41); Avg Y = 41.902439 (n = 41); r = 0.60, p = 0.000028
49 - Avg X = 2.024390 (n = 41); Avg Y = 41.902439 (n = 41); r = 0.60, p = 0.000028