1. The Changing Face of Business Continuity in the 21st Century
James R. Hackett, Lakeview Technology, Inc., Area Manager, North Central
2. Agenda
- What is HA and DR?
- Defining RPO and RTO
- Business Drivers and DR Trends
- Approaches and Methodologies
- HA and ROI: mutually exclusive?
- Solutions from IBM and ISVs
3. Business Continuity (BC)
24x7x365
- Capability to withstand outages
- Planned or Unplanned
- Uninterrupted Mission Critical Service
- Predefined Service Level Agreements
- Primary Solution Sets
- DISASTER RECOVERY (DR)
  - Resources, plans, services, and procedures used to recover mission critical applications and to resume normal operations for these applications at a remote site
  - Stated disaster recovery goals
  - Certain levels of degradation are acceptable
- HIGH AVAILABILITY (HA)
  - Provide continuous processing for mission critical applications during planned and unplanned outage events
  - Regular, permanent switching of roles between backup and primary resources, remaining so for an extended period of time (weeks versus hours)
  - More demanding RTO and RPO (as compared to a DR solution)
  - Capable of fully automated failover
4. High Availability and DR Definitions
- High Availability (HA) Solution
  - Capable of providing application data resiliency with minimal impact to the application user (assuming that the underlying infrastructure for data, power, communications, etc. is resilient)
  - Can provide a recovery point objective (RPO) of the last application commitment boundary
  - Can provide a recovery time objective (RTO) of less than 10% of the time for a single-system RTO (IBM) when best practices are used
  - Planned and unplanned outages can be fully automated and transparent to the production environment
  - A solution environment where, at IT operations' discretion, primary and backup resources exchange roles regularly and for an extended period of time
5. High Availability and DR Definitions
- Disaster Recovery (DR) Solution
  - A solution primarily focused on data and application recovery, where recovery may involve many hours versus minutes
  - A solution where the RPO may be hours (e.g., tape recovery)
  - A solution that typically requires remote location recovery
  - A solution that is normally not used to mask disruptive maintenance procedures from the end application user
  - A solution topology where the roles of the primary and backup servers are static (regular role switching is not a requirement)
6. RPO and RTO - Definitions
- RPO is the targeted tolerance threshold for lost business transactions. It defines how much data and how many objects can be lost, or how far back in time you can go to resume application functionality. Do you need to be as near to now as possible?
- RTO is the targeted amount of time required to resume and recover application functionality. Focus should be on how fast the application can be restarted and brought online.
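The two definitions above can be turned into a simple check. The following is a minimal sketch, not part of the original deck: the class name, function name, timestamps, and objective values are all hypothetical and chosen only to illustrate how a measured data-loss window and downtime compare against stated RPO and RTO targets.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Hypothetical container for the two targets defined on the slide."""
    rpo: timedelta  # tolerated window of lost business transactions
    rto: timedelta  # tolerated time to restart the application


def evaluate_recovery(objectives, outage_start, last_recoverable_point, service_restored):
    """Compare one outage against the stated objectives."""
    # Data loss window: everything written after the last recoverable point is lost.
    data_loss_window = outage_start - last_recoverable_point
    # Downtime: elapsed time until the application is back online.
    downtime = service_restored - outage_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= objectives.rpo,
        "rto_met": downtime <= objectives.rto,
    }


# Hypothetical scenario: nightly tape save at 02:00, failure at 14:30,
# application back online at 22:30, with a 24-hour RPO and a 4-hour RTO.
nightly_tape = evaluate_recovery(
    RecoveryObjectives(rpo=timedelta(hours=24), rto=timedelta(hours=4)),
    outage_start=datetime(2005, 5, 2, 14, 30),
    last_recoverable_point=datetime(2005, 5, 2, 2, 0),
    service_restored=datetime(2005, 5, 2, 22, 30),
)
print(nightly_tape)
```

In this made-up scenario the 12.5-hour data loss window fits inside the RPO, but the 8-hour restart misses the RTO, which is the kind of gap that separates a DR solution from an HA solution.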
7. IT Priorities
- 79% of IT managers say Business Continuity is the Top Priority
Q3a: % of 4/5 responses (important / extremely important)
Source: IBM IT Trends, 1H 04
8. Business Profile
Spending for HA is growing faster than spending on the server market as a whole (Source: Sept 2003, IDC 3008)
- Drivers
  - Internet: competition is just a click away, but so is the hacker with a malicious virus or worm
  - Just-in-time and real-time operations, SLAs
  - Asset utilization: 24x7 operations
  - Consolidation (data centers, servers, operations)
  - Global operations, or just across time zones
  - Corporate governance, e.g., SOX, HIPAA
9. Business Profile
Five Top Consequences of a Disaster (Source: Database Trends, May 2005)
- Decreased employee productivity (62%)
- Data loss (43%)
- Reduction in profits (40%)
- Damage to customer relationships (38%)
- Reduction in revenue (27%)
10. Business Profile
51 percent of companies surveyed actually had to execute their Disaster Recovery Plan in recent years. Another 45 percent admit they do not know how long it would take to get back up and running. (Source: Database Trends, May 2005)
- Most Common Incidents
  - Computer system failure (37%)
  - External computer threats (35%)
  - Natural disasters and fires (14%)
  - Accidental and malicious employee behavior (13%)
- Average Projected Recovery Time
  - Over three days (3.23)
11. Disaster Recovery: A Grudge Buy or Providing ROI?
Emerging Business Continuity (BC) Value Based
- "The fact that most organizations are unlikely to ever use the full extent of the services they have paid for has in the past made disaster recovery something of a grudge buy and not something that most companies are eager to spend money on."
- From ITWEB, 25 September 2001
12. Disaster Recovery: Bottom Line or Bottomless Pit?
Emerging BC Value Based
- "Recovery services don't add anything to the bottom line, but the consequences of not having a plan in place can be disastrous."
- Dave Linacre, Managing Director, IBM Business Continuity and Recovery Services
- From ITWEB, 25 September 2001
13. Is Your BC Solution an Expense?
- Can You Predict Your Next UNPLANNED System Outage?
- Can You Predict the Duration and Amount of Losses?
- Can You Place an ROI on Your BC Solution?
- Does Your BC Solution Provide Benefits or Alternate ROI Without A Disaster?
14. Or, Is Your BC Solution Also an Investment?
- Can You Predict Your Next PLANNED System Outage?
- Can You Predict the Duration and Amount of Losses?
- Can You Place an ROI on Your BC Solution?
- Does Your BC Solution Provide Any Benefits or ROI Without A Disaster? Yes, by eliminating the PLANNED Downtime
15. Emerging BC Value Based
- Backup Elimination
  - Perform backups without downtime
  - Single or Dual System Solutions
  - Single or Dual Dataset Solutions
- Parallel Processing / Parallel Dataset
  - Simultaneously run the application on multiple systems / locations
  - The data is maintained on all systems simultaneously
- Storage and Storage Virtualization
  - Combined with Capacity On Demand, maximizes IT infrastructure utilization and lowers costs
16. Traditional iSeries Availability Market
[Chart: availability solutions plotted by RPO against RTO, each axis running from "Recovery in Days to Hours" to "Recovery in Hours to Minutes"]
- Solutions shown: TAPE BACKUPS, IBM BCRS / QUICK SHIP, HIGH AVAILABILITY, CLUSTER
- Enterprise and SMB Markets
- SMB Space: Large Gap, Costly jump to HA
17. Current iSeries Availability Market
[Chart: availability solutions plotted by RPO (data loss) against RTO (application downtime)]
- RPO (data loss) axis: 0, minutes, 1-4 hours, 12-24 hours
- RTO (application downtime) axis: 0, minutes, 4-10 hours, 2-3 days, 1-2 weeks
- Solutions shown: TAPE BACKUPS, BCRS / SUNGARD QUICK SHIP, DISASTER RECOVERY / PRACTICAL AVAILABILITY SOLUTIONS, HIGH AVAILABILITY, CLUSTERING
- Markets shown: SMB Market, Enterprise Market
18. Prices Making High Availability Ever More Affordable!
i870 Config Before 04-20-04
- System Specifications: 8/16 way (8 active), 11,500 CPW, 24 GB Memory, 8.4 TB Disk, 3481-L23 Tape (9,086)
- HW List Price: 2,543,934
- Price Per CPW: 221
- SW Tier: P40
- SWMA: 44,403
- List MMMC: 7,230
i870 Config After 04-20-04
- System Specifications: 8/16 way (8 active), 11,500 CPW, 24 GB Memory, 8.4 TB Disk, 3481-L23 Tape (9,086)
- HW List Price: 1,920,774
- Price Per CPW: 167
- SW Tier: P40
- SWMA: 44,403
- List MMMC: 7,230
i5 570 Config on 06-11-04
- System Specifications: 3/4 way (4 active), 11,700 CPW, 32 GB Memory, 8.4 TB Disk, 3581-F28 Tape (11,420)
- HW List Price: 1,477,168
- Price Per CPW: 126 (-43% from April 20th)
- SW Tier: P30
- SWMA: 27,372
- List MMMC: 4,007
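The Price Per CPW figures on this slide follow directly from dividing each hardware list price by the CPW rating. The sketch below reproduces that arithmetic using only the numbers on the slide; the variable and function names are illustrative, and the baseline for the percentage change is assumed to be the i870 configuration before 04-20-04.

```python
# (configuration label, HW list price, CPW rating) as given on the slide
configs = [
    ("i870 before 04-20-04", 2_543_934, 11_500),
    ("i870 after 04-20-04", 1_920_774, 11_500),
    ("i5 570 on 06-11-04", 1_477_168, 11_700),
]


def price_per_cpw(list_price, cpw):
    """Hardware list price divided by the CPW performance rating."""
    return list_price / cpw


per_cpw = {name: price_per_cpw(price, cpw) for name, price, cpw in configs}

# Assumption: the slide's "-43%" compares against the pre-April-20 i870 price.
baseline = per_cpw["i870 before 04-20-04"]
change_pct = {name: (value - baseline) / baseline * 100 for name, value in per_cpw.items()}

for name in per_cpw:
    print(f"{name}: {per_cpw[name]:.0f} per CPW ({change_pct[name]:+.0f}% vs. baseline)")
```

Rounded to whole units, the three results are 221, 167, and 126 per CPW, and the i5 570 comes out about 43 percent below the baseline, matching the slide.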
19. How HA Solves Business Needs
- Get High Availability with Disaster Recovery
- Replicate locally between two servers for Continuous Operations during
  - Planned maintenance
  - OS and hardware upgrades
  - Disk failure
- Replicate to a remote location for added Disaster Recovery protection against unplanned downtime
  - Site loss
Production iSeries server
Backup iSeries server
Tape Backup, Maintenance, BI/Analytics, Development Without Downtime
20.
- Planned Outages
  - Estimated to be 95% of all i5, iSeries and AS/400e outages
  - Types of Planned Downtime
    - Daily/Weekly/Monthly Saves
    - Software installation/upgrade (OS, Application, Middleware, etc.)
    - PTF installs
    - Operating System Upgrades
    - Hardware Upgrades
- Unplanned Outages
  - Estimated to be <5% of all iSeries outages
  - Unscheduled/Unplanned Downtime due to
    - Power outage
    - Human error or program failure
    - Unprotected DASD or multiple DASD failure
    - Other hardware failure
- Disasters
  - Estimated to be <1% of all iSeries outages
  - Site-wide problem, weather, virus/worm, terrorist related
21. LPAR Replication/Intra-Server
- Eliminate Most Planned Outages (most of 95%)
  - Daily/Weekly/Monthly Saves
  - Software installation/upgrade (OS, Application, Middleware, etc.)
  - PTF installs
  - Operating System Upgrades
Production iSeries with Logical Partitions (LPARs)
Intra-server replication eliminates planned downtime with no additional hardware investment
22. Local Replication Between Two iSeries
- Eliminate ALL Planned Outages (95%)
  - Daily/Weekly/Monthly Saves
  - Software installation/upgrade
  - PTF installs
  - Operating System Upgrades
  - Hardware Upgrades
- Eliminate Most Unplanned Outages (Most of 4%)
  - Power outage (if different power source)
  - Human error or program failure
  - Unprotected DASD or multiple DASD failure
  - Other hardware failure
Production Apps Keep Running
Maintenance work performed on the Backup Server, then role swap with the Production server
Maintenance Without Downtime
23. Local Replication and Remote Replication
- Eliminate ALL Planned/Unplanned Outages AND Disaster Recovery - 100% of Downtime!
  - Daily/Weekly/Monthly Saves
  - Software installation/upgrade
  - PTF installs
  - Operating System Upgrades
  - Hardware Upgrades
  - Power outage
  - Human error or program failure
  - DASD failure
  - Other hardware failure
  - Site-wide problem
Datacenter 1: Production, Backup
Datacenter 2: Remote Backup
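The three replication topologies can be compared by adding up the outage categories each one addresses. The sketch below is illustrative only: it uses the deck's estimates (95 percent planned, about 4 percent unplanned, the remainder disasters), and because the slides say some categories are only "mostly" eliminated, each total is an upper bound rather than a measured result. The dictionary and function names are hypothetical.

```python
# Share of all outages by category, in percent, as estimated in the deck.
# Assumption: unplanned is taken as 4 and disasters as 1 so the total is 100.
OUTAGE_SHARE = {"planned": 95, "unplanned": 4, "disaster": 1}

# Categories each topology addresses. "most" marks a category the slides
# describe as only mostly eliminated.
TOPOLOGY_COVERAGE = {
    "LPAR intra-server replication": {"planned": "most"},
    "Local replication between two servers": {"planned": "all", "unplanned": "most"},
    "Local plus remote replication": {"planned": "all", "unplanned": "all", "disaster": "all"},
}


def max_outage_share_addressed(topology):
    """Upper bound on the percentage of outages a topology can eliminate."""
    return sum(OUTAGE_SHARE[category] for category in TOPOLOGY_COVERAGE[topology])


for name in TOPOLOGY_COVERAGE:
    print(f"{name}: up to {max_outage_share_addressed(name)}% of outages")
```

The step from a single partitioned server to two local servers adds only a few percentage points of coverage, while the remote site closes the final gap; whether that last step is justified depends on the cost of a site loss rather than on its frequency.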
25. Solution Topology for DR: The More Common Deployment
Requirements and Feasibility of Approach Dictate Deployment Approach
High Availability or CBU Server
Primary Server
Disaster Recovery Center
Data Center
Remote recovery outage support topology for relatively static environments: temporary role-swapping done periodically, save window reduction rarely done, workload distribution rarely done.
26. Solution Topology for HA: The Best Practice
Capacity BackUp Server
HSL Loop
High Availability Server
Primary Server
Disaster Recovery Center
Data Center
Data Center: regular, permanent role swaps, save window reduction, workload distribution
Disaster Center: relatively static and dedicated for disaster recovery scenarios
27. Implementing HA is a Collaborative Process within an Organization
28. Business Resiliency Requirements
- BUSINESS REQUIREMENTS
- Business Impact Analysis
- Establish Existing Planned Outage Profile
- Asset utilization (planned outage unavailability)
- Customer Impacts
- Develop Worst Case Outage Profiles
- Financial and Customer Impacts
- What-If Scenarios
- Business Benefit Analysis
- Establish a Financial cost/benefit assessment
- HA can pay for itself!
- Risk Assessment
- Probability Versus Risk Aversion
- IT REQUIREMENTS
- DR Solution, HA Solution, Combination
- Planned Outage Transparency
- Backups and Maintenance Procedures
- Role Swap Objectives (for an HA Solution)
- Workload Balancing Objectives
- RPO, RTO
- Recovery Point Objective (last transaction or greater)
- Recovery Time Objective (minutes, hours, ...)
- Data Center, Disaster Center, Combination
- Solution topology will be driven by requirements and cost
- Consider Offering i5 Check
- IT Resiliency Profile
29. Company Drivers: As Business Objectives Expand, IT Spending Increases.
Relation to DR and HA Technologies
- Business Objectives
- 24X7 Operations
- Best of Breed
- Acquisitions/Expansion
- Cost and Expense Cutting
- Status Quo
Enterprise HA/DR
SMB HA Solutions
Tape Backup
- IT Capabilities
- Stage 1 Website
- Legacy Systems
- Stage 2 or 3 Website (Transaction processing)
- Vendor Data Exchanges
- BI Analytics
30. Customer Sample: Preliminary Availability Justification
- Questions
  - Company Annual Revenue: 750M
  - Company Annual Profit: 50M
  - Planned Annual Downtime: 32 hrs
  - Prior Year Unplanned Downtime: 76 hrs
  - Average concurrent users supported? 400
  - Average User Annual Compensation (25/hr): 50,000
  - Estimated employee DT productivity (%)? 50
  - Calculated profit loss during DT (per hour): 5700
  - Hours of Operation per day? 24
  - Days of Operation per year? 365
  - Hours of work per employee per week? 40
  - Weeks of work per employee per year? 50
  - Prior Year UPDT Additional Costs: 150,000
    - Overtime (15K/hr), Penalties, External Support, Emergency Plan Activation
- Downtime (DT) Factors and Costs
  - Annual Revenue Per Hour: 86K
  - Annual Profit Per Hour: 5,700
  - Planned Downtime Cost (Profit): 182,000
  - Prior Year Unplanned Downtime Cost (Profit): 433,000
  - Outage Costs Based On Revenue: 9.3M
  - Unplanned DT Salary Costs (400 x 12.5 x 76): 380,000
  - Agreed Upon HA Costs - Unplanned (150K + 433K + 380K): 963,000
  - Additional Justification (Business Initiatives): 600,000
    - Business Intelligence Project (400K)
    - Developer Server Partition (200K)
  - TOTAL JUSTIFICATION: 1.6M
How Far Can You Stretch Your Service Levels Without Planned Availability!
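The justification figures on this slide can be reproduced step by step from the questionnaire answers. The sketch below is an interpretation, not the presenter's worksheet: the variable names are illustrative, the pairing of each answer with its question is inferred from the arithmetic, and the per-hour figures are rounded the way the slide appears to round them (profit to the nearest 100, revenue to the nearest 1,000).

```python
# Questionnaire answers as read from the slide
annual_revenue = 750_000_000
annual_profit = 50_000_000
planned_dt_hours = 32
unplanned_dt_hours = 76
concurrent_users = 400
annual_compensation = 50_000
dt_productivity_loss = 0.50          # "Estimated employee DT productivity": 50
hours_per_day, days_per_year = 24, 365
work_hours_per_week, work_weeks_per_year = 40, 50
additional_unplanned_costs = 150_000  # overtime, penalties, external support, ...
business_initiatives = 400_000 + 200_000  # BI project + developer partition

# Per-hour figures, rounded as the slide appears to round them
operating_hours = hours_per_day * days_per_year
revenue_per_hour = round(annual_revenue / operating_hours, -3)  # slide: 86K
profit_per_hour = round(annual_profit / operating_hours, -2)    # slide: 5,700

# Downtime costs
planned_dt_cost = planned_dt_hours * profit_per_hour        # slide: 182,000
unplanned_dt_cost = unplanned_dt_hours * profit_per_hour    # slide: 433,000
# Assumption: the revenue-based figure covers planned plus unplanned hours.
outage_cost_on_revenue = (planned_dt_hours + unplanned_dt_hours) * revenue_per_hour

# Salary cost of idle users: hourly wage x productivity loss x users x hours
hourly_wage = annual_compensation / (work_hours_per_week * work_weeks_per_year)
salary_cost = concurrent_users * hourly_wage * dt_productivity_loss * unplanned_dt_hours

ha_unplanned_justification = additional_unplanned_costs + unplanned_dt_cost + salary_cost
total_justification = ha_unplanned_justification + business_initiatives

print(f"Unplanned justification: {ha_unplanned_justification:,.0f}")
print(f"Total justification:     {total_justification:,.0f}")
```

Under these assumptions the unplanned justification comes to 963,200 and the total to 1,563,200, which the slide rounds to 963,000 and 1.6M. Note that the total counts only unplanned downtime and business initiatives; the planned downtime cost is listed separately.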
31. The Solution Choices
- The Availability Building Blocks
- Basics: RAID 5 and Mirroring
- Static DR Implementations
- Dynamic HA Implementations
- The Solution Choices For Data Resiliency
32. The Availability Building Blocks
Continuous Availability
Application Resiliency
Data Resiliency: IASPs, HA Replication
Disk Replication: IASPs, FlashCopy, PPRC
Transaction Logging: Journaling
Disk, HW, Data Protection: Mirroring
Disk Protection: RAID 5
Backup, Recovery, and Tape Automation
33. Disk Mirroring Solution Package for Integrated Disks: Provide Protection against Single Points of Failure
34. iSeries Multi-System Data Resiliency Solutions
Remote Journaling
Logical Replication
Switched Disk IASPs
Tape
ESS with IASPs
XSM (Cross Site Mirroring)
35. iSeries Business Continuity Offerings
- IBM HA Express Portfolio on iSeries
  - Models 520, 810, and 825
  - Support mission critical 24x7 environments
  - Minimize planned and unplanned downtime
  - Minimize save window
  - Lower complexity and cost
- iSeries for High Availability Offering
  - Models 570, 870, and 890
  - Support mission critical 24x7 environments
  - Role swapping and workload distribution
  - Minimize planned and unplanned downtime
  - Support Heterogeneous Environments
- iSeries for Capacity BackUp Offering
  - Designed for disaster recovery scenarios
  - Not intended for 24x7 HA solutions
  - Not intended for workload distribution
  - No option to permanently activate standby processors
36. Logical Replication Solution Vendors For HA
Logical Replication
37. Logical Replication Solution Vendors For DR
Logical Replication
38. Heterogeneous OS and Mixed Technology Solutions
Replication, Switchable Resources, Multiple OS Environments
39. ISV Solution Coverage
IASPs
Windows
i5/OS
Linux
AIX
DS
Data Mirror
iTera
Lakeview Technology
NoMax
Traders
Vision Solutions
40. Data Center
Hybrid Cluster Data Center SW Disk with
Replication Off-Site for DR
Production Server
Provides Overall Replication
Management and Control
Replicate Backup eServer i5
DR Site
Backup Server
41. Summary
- HA is a Cooperative Implementation
- Business Units and Standards Drive IT Capabilities
- BC must add value to achieve investment in the short term
- BC must integrate with the business units' goals
- The IBM solutions portfolio features a rich set of business continuity and high availability solutions
- ISV Solutions, IBM Hardware Solutions
- HA and CBU Servers
42. Additional Resources
- iSeries HA External Website
  - Several HA articles available, such as HA 101
  - www.ibm.com/eserver/iseries/ha
- iSeries HA Sales Kit
  - http://w3-1.ibm.com/sales/systems/portal/_s.155/254?navIDf220s240geoIDAllprodIDiSeriesdocIDihask.skitdocTypeSalesKitskCatDocumentType
- iSeries for HA/CBU
  - http://www-1.ibm.com/servers/eserver/iseries/hardware/is4ha
  - http://www-1.ibm.com/servers/eserver/iseries/hardware/is4cbu
- IBM Redbooks
  - i5/OS High Availability Clusters and Data Resilience Solutions
  - http://www.redbooks.ibm.com
43. Questions? Thank you!
44. Backup Charts
45. Cross Site Mirror (XSM) for Remote HA/DR
iSeries Cluster Device Domain
Open TCP/IP Network
WAN
TCP/IP Network
TCP/IP Network
Sync to memory
Optional Sync to disk
Concurrent i5/OS writes
iASP-A
iASP-A
46. Data and Disaster Center Availability with ESS-PPRC (with PPRC-Toolkit)
SPCN
Prod. iSeries-A or LPAR-A iASP-A varied on
Switchable Tower(s) A
ProdESS
iASP-A
HSL
Fiber(s)
HA iSeries-B or LPAR-B iASP-A varied off
SysBas
TCP/IP Network
FC / ESCON
Backup iSeries or LPAR
iASP-A
Tape Backup
47. Protect Your AIX and Linux Applications
Switched Disk Solution Using ESS
Backup AIX/Linux Partition(s)
Production AIX/Linux Partition(s)
48. AIX Data and Application Protection with HACMP and HACMP/XD
Lakeview offers the skills and service capability to deploy HACMP and HACMP/XD
Backup AIX Partition
Production AIX Partition
HACMP for Control/Switching
HACMP/XD for Replication
Direct I/O
Direct I/O
49. Protect Your AIX and Linux Applications and Data
Replication-based Solution Using Virtual I/O
EchoStream FS
Backup AIX/Linux Partition(s)
Production AIX/Linux Partition(s)
50. Product Offerings
51. iSeries Solutions for Business Continuity: Small and Medium Business
IBM HA Express Portfolio on iSeries Models 520, 810, 825
- Support mission critical 24X7 environments
- Role swapping and workload distribution
- Minimize planned downtime
- Minimize unplanned downtime
- Save window reduction
52. iSeries Solutions for Business Continuity: Large and Medium Business
iSeries For High Availability Models 570, 870, 890
- Support mission critical 24X7 environments
- Heterogeneous OS Environments
- Role swapping and workload distribution
- Minimize planned downtime
- Minimize unplanned downtime
- Save window reduction
53. iSeries Solutions for Disaster Recovery: Large and Medium Business
iSeries For Capacity BackUp
Models 570, 825, 870, 890
Support For Disaster Recovery Environments
54. IT Resiliency Profile Analysis
- Achievability of current SLA
- Availability Coverage
- Single Points of failure
- Interdependencies
- Ownership and Process
- Skills and Staffing
- Develop a Resiliency Baseline
- Resiliency Gap Analysis
- Plan of Action
CTC (HA consulting): http://www-1.ibm.com/servers/eserver/services/havail.html
55. 1H05 Events, Offerings and Promotions (AG)
- Rochester HA CIO Fly-ins
- Quarterly Business Continuity CIO Fly-Outs
- Regional Cities
- Quarterly ISV Webcasts
- Availability Assessments On Demand
- Buy One Get an HA Server Free
56. iSeries Technical Sales Support, Americas