Title: Disk Based Disaster Recovery
1Disk Based Disaster Recovery: Data Replication Solutions
- Gavin Cole
- Storage Consultant SEE
2Agenda
- Planning for a Disaster
- Using Local Copies for Protection
- Using Remote Mirroring for Protection
- Conclusion
4Disaster
"Any interruption in the normal access to a valid set of data used by applications and end users to execute mission-critical business processes for an unacceptable period of time."
Jon William Toigo, Chairman, Data Management Institute, 2005
5Source: Ontrack Data Report 2007
6Source: Ontrack Data Report 2007
7The threat of data loss can't be ignored
- Risks
- High cost of data loss and downtime
- Insufficient data recovery plans and procedures
- 70% of businesses fail after major data loss
- Recovery overextends limited staff and financial resources
- Needs
- High availability and fast data recovery
- Seamless integration with existing IT infrastructures
- Interoperability with current and future storage and computing systems
8The Cost of Downtime
- Forrester Consulting interviewed 138 companies
- Estimated cost of 1 hour of downtime
- Less than 10,000 / hr: 25%
- 10,000 to 100,000 / hr: 33%
- 100,000 to 500,000 / hr: 25%
- 500,000 to 1 million / hr: 13%
- Greater than 1 million / hr: 4%
- 67% could not estimate the financial cost of downtime
9Key definitions
Timeline: Last safe backup -> Disaster -> Resumption of normal business
Recovery Point Objective (RPO): The time between the last safe backup and the point in time of the disaster
Recovery Time Objective (RTO): The time elapsed from when the disaster occurred to the resumption of normal business activities
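The two definitions above reduce to timestamp arithmetic. A minimal sketch in Python, assuming a hypothetical incident timeline (the dates and function names are illustrative and do not come from the presentation):

```python
from datetime import datetime, timedelta


def recovery_point(last_safe_backup: datetime, disaster: datetime) -> timedelta:
    """Time between the last safe backup and the disaster (data at risk)."""
    return disaster - last_safe_backup


def recovery_time(disaster: datetime, resumption: datetime) -> timedelta:
    """Time from the disaster to the resumption of normal business."""
    return resumption - disaster


# Hypothetical incident timeline, for illustration only.
last_safe_backup = datetime(2008, 3, 1, 2, 0)   # nightly backup finished
disaster = datetime(2008, 3, 1, 14, 30)         # outage begins
resumption = datetime(2008, 3, 2, 9, 0)         # business back to normal

rpo_actual = recovery_point(last_safe_backup, disaster)
rto_actual = recovery_time(disaster, resumption)
print(f"Data loss window: {rpo_actual}")
print(f"Downtime: {rto_actual}")
```

Comparing these measured values with the targets agreed in the business impact assessment shows whether a given protection method is sufficient.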
10Business Continuity
- Planning to never go down
- Always have access to information
- Needs more than a good data recovery strategy
- A disaster can be something as simple as a deleted file
- Use disk duplication strategies
- Mirroring
- Snapshot
- Remote replication
11Complete data protection requires personalized and practical solutions
Diagram labels:
- Risk: Hardware failure, Operator error, Local Disaster, Regional Disaster
- Disruption: Seconds, Minutes, Hours, Days
- Scale: Application, Department, Data Center, Enterprise
- Business goals, Threats, Budget realities, Existing assets
12Seven Key Planning Steps
- Business impact assessment
- How long can I live without data?
- Discovery
- What data do I need first?
- Budget
- What is my data worth?
- Role-based teams
- Who are the key people?
- Data protection
- How do I protect what I need?
- Logistics
- What are the physical requirements?
- Testing
- Will my plan work?
13Agenda
- Planning for a Disaster
- Using Local Copies for Protection
- Using Remote Mirroring for Protection
- Conclusion
14Volume Copy Terms
- Complete point-in-time replication of one volume (source) to another (target) within a storage subsystem
- Copy Pair
- Source: Volume that accepts host I/O and stores application data
- Target (or Copy or Clone): Volume that maintains a copy of the data from the source
15How Volume Copy Works
16Snapshot Terms
- A point-in-time (PiT) image of a volume
- Logical equivalent of a physical copy
- Base Volume: the volume from which the Snapshot will be created
Diagram labels: Storage System, Physical Disk Space
17Snapshot Flow Chart
18Using Volume Copy and Snapshot together
- Copying the Snapshot creates a full PiT clone copy while I/O continues to the base volume (LUN)
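The combination of Snapshot and Volume Copy can be sketched in a few lines of Python. This is a toy model, not the storage system's implementation: it assumes a copy-on-write snapshot (the slides do not state the mechanism), and every class and function name here is hypothetical.

```python
class BaseVolume:
    """Toy volume: a list of blocks that keeps accepting host writes."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def write(self, index, data):
        # Copy-on-write (assumed mechanism): save the old block for any
        # snapshot that has not preserved it yet, then overwrite it.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """Point-in-time image that stores only blocks changed since creation."""

    def __init__(self, base):
        self.base = base
        self.saved = {}            # block index -> original data
        base.snapshots.append(self)

    def preserve(self, index, old_data):
        # Keep only the first (point-in-time) version of each block.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Changed blocks come from the saved copies, the rest from the base.
        return self.saved.get(index, self.base.blocks[index])


def volume_copy(snapshot, size):
    """Full physical copy (clone) of the snapshot's point-in-time image."""
    return [snapshot.read(i) for i in range(size)]


base = BaseVolume(["a", "b", "c", "d"])
snap = Snapshot(base)
base.write(1, "B2")                 # host I/O continues on the base volume
clone = volume_copy(snap, 4)        # clone still reflects the snapshot time
base.write(3, "D2")
print(clone)
print(base.blocks)
```

The clone holds the data as it was when the snapshot was taken, while the base volume carries the later writes.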
19Agenda
- Planning for a Disaster
- Using Local Copies for Protection
- Using Remote Mirroring for Protection
- Conclusion
20Remote Volume Mirroring
- Ongoing, real-time replication of a volume from
one storage system to another
21Remote Volume Mirroring Components
- Primary volume: accepts read and write host I/O
- Secondary volume: accepts read host I/O and remote writes of data from the controller owner of the primary volume
22Remote Volume Mirroring Components
- Mirror Repository volume: stores mirroring data, such as information about remote writes that have not completed
- Mirror Pairs: V1 -> V1M, V2 -> V2M, V3 -> V3M
Diagram labels (shown for each storage system): Mirror Repositories, Primary, Secondary
23Synchronous Replication
- The primary disk system acknowledges a host write only when the data has been successfully mirrored
- Primary benefit
- Ensures remote data is an exact replica of the local data
- Note: only effective for campus-area replication
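The campus-area limit follows from latency: every synchronous write waits for a round trip to the remote system. A rough Python estimate, assuming about 5 microseconds per kilometre one way in optical fibre and a hypothetical 500 microsecond local write time; equipment and protocol overhead are ignored, and real protocols may need more than one round trip per write:

```python
# Assumption: light travels through optical fibre at roughly 200,000 km/s,
# i.e. about 5 microseconds per kilometre in one direction.
FIBRE_DELAY_US_PER_KM = 5.0


def sync_write_latency_us(distance_km, local_write_us=500.0):
    """Host-visible latency of one synchronous mirrored write, in microseconds.

    The host is acknowledged only after the data reaches the remote
    system and the remote acknowledgement returns (one round trip).
    local_write_us is a hypothetical local service time.
    """
    round_trip_us = 2 * distance_km * FIBRE_DELAY_US_PER_KM
    return local_write_us + round_trip_us


for km in (1, 10, 100, 1000):
    print(f"{km:>5} km: {sync_write_latency_us(km):>8.0f} us per write")
```

Under these assumptions the distance penalty is small at campus distances and dominates the write time over hundreds of kilometres.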
24Asynchronous Write Mode
- Allows the primary disk system to acknowledge a
host write request before the data has been
successfully mirrored
- Primary benefits
- Reduces impact of latency when replicating over longer distances
- Provides performance improvement compared to synchronous for primary site I/O (disk system and application)
- Enables effective replication over longer distances (WAN)
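The difference from synchronous mode is the point at which the host gets its acknowledgement. A minimal Python sketch, assuming a simple in-memory queue of pending remote writes (the class and method names are hypothetical, not a product API):

```python
from collections import deque


class AsyncMirror:
    """Toy asynchronous mirror: acknowledge first, replicate later."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()     # writes not yet applied remotely

    def host_write(self, block, data):
        self.primary[block] = data
        self.pending.append((block, data))
        return "ack"               # host is acknowledged before mirroring

    def drain(self, max_writes=None):
        """Send queued writes to the secondary, oldest first."""
        sent = 0
        while self.pending and (max_writes is None or sent < max_writes):
            block, data = self.pending.popleft()
            self.secondary[block] = data
            sent += 1
        return sent


mirror = AsyncMirror()
mirror.host_write(0, "x")
mirror.host_write(1, "y")
mirror.drain(max_writes=1)         # the link has delivered only one write so far
print(mirror.secondary, len(mirror.pending))
```

Writes still in the pending queue would be missing at the remote site if the primary site failed at that moment, which is the trade-off for the lower write latency.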
25Preserved Write Order
- Write operations to the secondary disk system match the I/O completion order on the local disk system
- Also referred to as a consistency group
- Primary benefit
- Maintains data integrity in multi-LUN applications (databases) by eliminating out-of-order updates at the remote side that can cause logical corruption
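One way to picture preserved write order is a secondary that applies remote writes strictly by sequence number and holds back anything that arrives early. The Python sketch below is illustrative only; the sequence numbers and all names are assumptions, not the controller's actual protocol.

```python
class OrderedSecondary:
    """Applies remote writes strictly in the order the primary completed them."""

    def __init__(self):
        self.volumes = {}          # (volume, block) -> data
        self.next_seq = 0
        self.held = {}             # seq -> write that arrived early
        self.applied = []          # sequence numbers in apply order

    def receive(self, seq, volume, block, data):
        self.held[seq] = (volume, block, data)
        # Apply every write that is now next in line.
        while self.next_seq in self.held:
            vol, blk, payload = self.held.pop(self.next_seq)
            self.volumes[(vol, blk)] = payload
            self.applied.append(self.next_seq)
            self.next_seq += 1


secondary = OrderedSecondary()
# A database writes its log (seq 0) before its data file (seq 1),
# but the network delivers the two writes out of order.
secondary.receive(1, "data_lun", 7, "row v2")
print(secondary.applied)           # the data write is held back
secondary.receive(0, "log_lun", 3, "begin update row")
print(secondary.applied)           # both applied, log first
```

Because the data-file update is never visible at the remote site without its preceding log record, the two LUNs stay mutually consistent.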
26Remote Volume Mirroring: Mirror Management
- Role Reversal (from secondary to primary or vice versa) is user-initiated
- If the primary is also the base volume for snapshots, role reversal will cause the associated snapshots to fail
- It is possible to force a role change for the local volume if communication to the remote volume is down
- Used in disaster recovery scenarios
- Can prepare by mapping secondary volumes to hosts using Storage Partitions before they are promoted
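The role-reversal rules above can be modelled as a small state change. The Python sketch below is a hypothetical illustration, not the management software's API; the names `MirroredVolume`, `pair`, and `reverse_roles` are invented for this example.

```python
class MirroredVolume:
    """One side of a mirror pair; role is 'primary' or 'secondary'."""

    def __init__(self, name, role):
        self.name = name
        self.role = role
        self.partner = None


def pair(primary, secondary):
    """Link two volumes as a mirror pair."""
    primary.partner, secondary.partner = secondary, primary


def reverse_roles(volume, link_up, force=False):
    """User-initiated role change for `volume`.

    With the link up, both sides swap roles. With the link down, only
    the local volume changes, and only if the user forces it.
    """
    new_role = "primary" if volume.role == "secondary" else "secondary"
    if link_up:
        volume.partner.role = volume.role
        volume.role = new_role
    elif force:
        volume.role = new_role     # remote side is unreachable and unchanged
    else:
        raise RuntimeError("remote volume unreachable; use force=True")


v1 = MirroredVolume("V1", "primary")
v1m = MirroredVolume("V1M", "secondary")
pair(v1, v1m)

# Disaster at the primary site: promote the surviving secondary.
reverse_roles(v1m, link_up=False, force=True)
print(v1m.role, v1.role)
```

In this sketch a forced promotion leaves the unreachable volume with its old role, so the pair has to be reconciled once communication returns.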
27DR / HA Architecture
Diagram labels: Cluster 1, Cluster 2, Site A, Site B, Volume 1 & 2 replication, Volume 3 replication, M = mirror
28Architecture Description
- Site A and Site B each contain a copy of critical data
- Critical data is copied in real time using the disk controllers
- Minimal impact on server processing power
- OS and key applications are clustered across both sites
- If either site fails, the application transparently fails over to the remote site
- Customers notice minimal disruption
- 2-way Disaster Protection
- Cluster 1 uses volumes V1 and V2: primary business is at Site A, mirrored to Site B for protection
- Cluster 2 uses volume V3: primary business is at Site B, mirrored to Site A for protection
- Sites are connected by a Fibre Channel network for performance
- Could be connected by a long-distance IP network, but there will be a performance impact on replication
29Agenda
- Planning for a Disaster
- Using Local Copies for Protection
- Using Remote Mirroring for Protection
- Conclusion
30Could you survive a disaster?
- 35% of companies have a plan
- 60% of plans are never tested
- Half the companies that suffer losses never recover
- 10 to 50 K per MB to re-create data
31Seven Key Planning Steps
- Business impact assessment
- How long can I live without data?
- Discovery
- What data do I need first?
- Budget
- What is my data worth?
- Role-based teams
- Who are the key people?
- Data protection
- How do I protect what I need?
- Logistics
- What are the physical requirements?
- Testing
- Will my plan work?
32Thank You
Disaster Recovery Planning
- Gavin Cole
- gavin.cole_at_sun.com
- 33 6 70 72 99 53