Title: BACKUP/MASTER: Building a Storage Wide Area Network (WAN) for Enterprise DR
1BACKUP/MASTER: Building a Storage Wide Area Network (WAN) for Enterprise DR
- Dragon Slayer Consulting
- Marc Staimer, President CDS
- marcstaimer_at_earthlink.net
- 22 September 2004
2Dragon Slayer background
- 7 yrs sales
- 7 yrs sales mgt
- 10 yrs mkting & bus dev
- Storage & SANs
- 6 years consulting
- Launched or participated
- 20 products
- Paid Consulting
- > 70 vendors
- Unpaid Consulting
- > 200 end users
- Known Industry Expert
- Speak 5 events/yr
- Write 3 trade articles/yr
3Storage DR WAN level setting
- Storage WANs by definition
- Primarily for DR purposes
- Enterprise DR Characteristics
- Big blocks of data
- Can overwhelm standard IP routers
- Not limited to nights & weekends
- Time is very relevant
- Time windows are getting smaller
4Classifying DR data
- Mission-critical: Crucial
- Organization's vital data
- Primary business processes, primary applications, SLAs
- Data access loss often means organizational death
- Essential: Secret
- Very important to the organization
- Day-to-day business processes
- Instantaneous recovery preferred, not required
- Important: Valuable
- Many day-to-day organization ops & apps
- Low-critical: Nominal value
- Low organizational value
5Prioritizing DR data
- Mission-critical: Crucial
- eCommerce/order-entry/sales transactions
- Essential: Secret
- Customer data/intellectual property/Email
- Important: Valuable
- Employee records/marketing collateral
- Low-critical: Nominal value
- Resumes/market data/competitive data
6Recovery Point Objectives (RPO)
- RPO = point in time to which systems & data
- Must be recovered within the DR facility
7Recovery Time Objectives (RTO)
- RTO = total time within which systems & apps
- Must be recovered AFTER an outage
8Picking WAN DR options
- Mission critical
- Split mirror disk-to-disk
- Synch mirroring
- Hot standby at remote locations
- Servers and storage
- Continuous snapshots
- Essential
- Asynch remote mirroring
- Snapshot
- Distributed backup
- Volume copy
- Important
- Wide area backup
- Distributed directory journaling
- Low critical
- Electronic Tape vaulting
9Wide Area Network (WAN) options
- ATM
- SONET
- TCP/IP
- WAN can be > 50% of DR OpEx costs
10ATM: Asynchronous Transfer Mode
- Pros
- High performance
- OC3 to OC48
- 155Mbps to 2.5Gbps
- Shared network (cell based)
- IP over ATM
- Excellent QoS
- Available from most telcos
- High bandwidth utilization
- Cons
- Bandwidth overhead
- Niche technology
- Out-of-favor
- Disappearing
- Appears to be Dead-End
- High cost
11SONET/SDH: Synchronous Optical Network/Synchronous Digital Hierarchy
- Pros
- High performance
- OC3 to OC192
- 155Mbps to 10Gbps
- Preferred by most telcos
- Can be shared
- TCP/IP switch/routers
- CWDM technology
- DWDM technology
- POS (IP packet over SONET)
- Very high bandwidth utilization
- Cons
- Expensive
- Although declining
- Not shareable natively
- Not a LAN technology
- Separate mgt from LAN
12TCP/IP: Transmission Control Protocol/Internet Protocol
- Pros
- Ubiquitous
- Available everywhere
- Well known mgt
- Large knowledge pool
- Shared network
- Std network for most orgs.
- Can piggyback on IP WAN
- DR WAN perceived as free
- Cons
- Designed for packet loss
- Typical 1%
- Packet loss = retransmissions
- Congestion
- Bit Error Rates
- Jitter
- Latency
- Router buffer overruns
- Packet loss = low throughput
13Calculating storage WAN bandwidth
- How much data between application sites?
- And the DR site
- Over what period of time to move the data?
- Will the bandwidth be shared?
- If so, how much bandwidth is available?
- What type of WAN?
- Native ATM
- Native SONET
- TCP/IP
14Assumptions
- ATM
- > 80% bandwidth utilization
- IP over ATM
- And native ATM end-to-end
- SONET/SDH
- a.k.a. clear channel
- > 90% bandwidth utilization
- Primarily POS (IP over SONET)
- Or FCBB (FC over SONET)
- TCP/IP
- End-to-end
- Packet loss avg 1%
- < 30% bandwidth utilization
- Worsens w/distance
- Worsens w/greater packet loss
- For calculations: 30%
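The questions on the bandwidth-calculation slide and the utilization assumptions above can be combined into a rough sizing calculation. The following is a minimal Python sketch, not a method from the presentation. It assumes decimal units (1 GB = 8,000 megabits) and treats the rule-of-thumb utilization figures as fixed fractions (80% for ATM, 90% for SONET, 30% for TCP/IP). The names `UTILIZATION`, `required_link_mbps`, and `shared_fraction` are illustrative.

```python
# Rule-of-thumb utilization fractions from the assumptions slide.
# Treating "> 80%" and "> 90%" as exactly 0.80 and 0.90 is a conservative assumption.
UTILIZATION = {"atm": 0.80, "sonet": 0.90, "tcpip": 0.30}


def required_link_mbps(data_gb, window_hours, wan_type, shared_fraction=1.0):
    """Return the raw link rate (Mbps) needed to move data_gb within window_hours.

    shared_fraction is the share of the link available to DR traffic
    (1.0 = dedicated link); it is a hypothetical parameter for this sketch.
    """
    if window_hours <= 0 or not 0 < shared_fraction <= 1:
        raise ValueError("window must be positive and shared_fraction in (0, 1]")
    megabits = data_gb * 8000            # decimal GB -> megabits
    seconds = window_hours * 3600
    payload_mbps = megabits / seconds    # throughput the application needs
    return payload_mbps / (UTILIZATION[wan_type] * shared_fraction)


if __name__ == "__main__":
    # Example: 500 GB to move in an 8-hour window on a dedicated link.
    for wan in ("atm", "sonet", "tcpip"):
        print(f"{wan:6s} {required_link_mbps(500, 8, wan):8.1f} Mbps")
```

Moving 500 GB in an 8-hour window needs a payload rate of about 139 Mbps. Under these assumptions that translates to roughly 174 Mbps of ATM capacity, 154 Mbps of SONET, or 463 Mbps of unenhanced TCP/IP.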
15Market trends
- In a poll of over 200 end users
- From SMB, SME, Enterprise
- 61% DR over TCP WANs
- 3% DR over SONET
- 1% DR over ATM
- 24% No DR over WAN
- 11% Both TCP & SONET or ATM
16What WAN do you use today for data protection?
- TCP/IP
- SONET
- ATM
- TCP & SONET or ATM
- None of the above
17Why TCP/IP WANs are so prevalent with DR
- Perception that the bandwidth is free
- Or at least very inexpensive
- Piggyback on IP WAN networks
- Evenings and Weekends
18Most DR apps primarily USE IP
- Asynch mirroring
- Snapshot
- Volume replication
- Distributed backup
- Incremental
- Replication or Backup
- Tape Vaulting
- Continuous replication
- Continuous snapshot
- Fibre Channel over WAN
- FCIP
- iFCP
- Although there is FCBB
- FC over SONET
19Reality check: Why TCP/IP WAN throughput is dismal
- TCP/IP
- Byte-streaming protocol moving data in small packets
- Retransmits the data from the last point of the error
- Immediately reduces the rate
- Backs down to slow start mode
- Additional ramp-up packet loss causes further rate reduction
- During periods of lossy conditions
- Application performance never has a chance to recover
- Why packet loss is so detrimental to TCP throughput
20TCP 1985: Designed for LANs
- TCP Slow Start
- Packet rate 2X
- Per successful R-T
- Per Loss Event
- Sending rate cut 1/2
- TCP Congestion Control
- Sending rate increased by 1
- Per successful R-T
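The three rules above (double per successful round trip, halve per loss event, add one per successful round trip) can be traced with a small simulation. This is a simplified Python sketch, not a full TCP implementation: the window is counted in packets, a loss always halves the window rather than resetting it to one, and `simulate_cwnd`, `loss_rounds`, and `ssthresh` are illustrative names chosen for this example.

```python
def simulate_cwnd(round_trips, loss_rounds, ssthresh=64):
    """Return the congestion window (in packets) at the start of each round trip.

    loss_rounds: set of round-trip indexes in which a loss event occurs.
    ssthresh: initial slow-start threshold in packets (assumed value).
    """
    cwnd = 1
    history = []
    for rt in range(round_trips):
        history.append(cwnd)
        if rt in loss_rounds:
            # Loss event: sending rate cut in half (never below 1 packet).
            cwnd = max(cwnd // 2, 1)
            ssthresh = cwnd
        elif cwnd < ssthresh:
            # Slow start: rate doubles per successful round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion control: rate grows by 1 per successful round trip.
            cwnd += 1
    return history


if __name__ == "__main__":
    print(simulate_cwnd(10, {5}, ssthresh=16))
```

With a single loss in round trip 5, `simulate_cwnd(10, {5}, ssthresh=16)` returns `[1, 2, 4, 8, 16, 17, 8, 9, 10, 11]`. The window is halved in one round trip and then regains only one packet per round trip, which is why recovery is slow when each round trip is long.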
21TCP 2004: LAN protocol over the WAN
- Same internal logic
- Since 1985!
- High BW: loss events
- Large packet losses
- High Latency
- Slower recovery
- During congestion control
- Infrequent feedback
- Changing route conditions
- Based on packet loss events
22TCP resource contention on shared links reduces data protection throughput
- Sporadic packet loss
- Short & long distance sessions
- Contend for same resources
- Router queues change dynamically
- From traffic bursts
23TCP: What really happens to long distance sessions?
- Packet loss events
- Frequent for shared nets
- Loss events
- Router buffer overruns
- Affect other sessions
- Lots of lost packets
- LD sessions beat down
- By SD sessions
- Results
- Low throughput
- Random delays
24The DR TCP WAN disconnect
- As distance increases, performance decreases
- Worse with higher bandwidth
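One way to put numbers on this is the widely cited Mathis et al. approximation for steady-state TCP throughput, rate ≈ (MSS / RTT) × (C / √p), where p is the packet loss rate. The formula is not part of the original slides; it is brought in here only as an illustration. The Python sketch below assumes a 1,460-byte MSS, C ≈ 1.22, and a round-trip time estimated from distance alone at roughly 200 km per millisecond in fiber, ignoring queuing and equipment delay. All names are illustrative.

```python
import math

MSS_BYTES = 1460            # typical Ethernet-sized TCP payload (assumption)
C = math.sqrt(3 / 2)        # constant from the Mathis et al. model, ~1.22
FIBER_KM_PER_MS = 200       # approx. one-way propagation speed in fiber


def rtt_ms(distance_km):
    """Round-trip propagation delay only; real paths add queuing and equipment delay."""
    return 2 * distance_km / FIBER_KM_PER_MS


def tcp_ceiling_mbps(distance_km, loss_rate):
    """Approximate per-session TCP throughput ceiling in Mbps."""
    rtt_s = rtt_ms(distance_km) / 1000
    bytes_per_s = (MSS_BYTES / rtt_s) * (C / math.sqrt(loss_rate))
    return bytes_per_s * 8 / 1e6


if __name__ == "__main__":
    # 1% loss matches the average packet loss assumed earlier in the deck.
    for km in (100, 1000, 4000):
        ceiling = tcp_ceiling_mbps(km, 0.01)
        print(f"{km:5d} km: ~{ceiling:6.1f} Mbps per session, "
              f"{ceiling / 155 * 100:5.1f}% of OC3, "
              f"{ceiling / 2488 * 100:5.1f}% of OC48")
```

At 1% loss the estimated ceiling is about 14 Mbps per session at 1,000 km, whatever the link size. The ceiling does not grow with the link, so the same session fills a smaller fraction of an OC48 than of an OC3, which is consistent with the point that the gap is worse at higher bandwidth.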
25DR TCP/IP conclusion
- Perception & reality do not match
- Must be taken into account
- When building a Storage WAN for Enterprise DR
- Increasing the bandwidth doesn't solve the problem
26Now what?
- What happens when
- Throughput is much less than usable bandwidth?
- Time windows can't be met?
- The IP WAN is insufficient?
- Throwing more bandwidth at it fails to resolve the problem?
27There are 2 choices
- Go ATM end-to-end
- Not very palatable to most end users
- OR TCP enhancers
- Proxies
- Compressors
- Caching/spoofing
- Accelerators
28Different clever technologies
- TCP/IP performance enhancing proxy
- Eliminates TCP packet loss & latency issues
- Compression
- Increases payloads per packet
- Compression increases from 2X to 400X
- Caching (a.k.a. spoofing)
- Acknowledges packets locally
- Accelerators
- Resequencing, QoS, concatenation, duplication elimination
- Chatty protocol elimination
29TCP/IP network shielding
- Shields TCP/IP network
- Bit error rates
- Congestion
- Jitter
- Latency
- Buffer overflows
- Much greater BW utilization!
- Before compression
[Figure: Data protection packets in a TCP/IP network. Labels: bit error rates, network jitter, TCP/IP latency, router buffer overflows, network congestion]
30> DS3 TCP/IP performance enhancements
- NetEx - HyperIP
- Orbital Data - IP Express
31< DS3 TCP/IP performance enhancements
- Expand - IP Accelerators
- 1800/4800/6800/9000 series
- Peribit
- SR20/50/55/80
- Net Celera
- T Series
- Riverbed - Steelhead
- 500/1K/2K/3K/5K
- Orbital Data
- IP Express LC
32Caching appliances
- Riverbed - Steelhead 500/1K/2K/3K/5K
- CIFS & MAPI (NFS coming)
- Tacit - Ishared Server
- CIFS & NFS
- Kashya - KBX4000
- Includes volume replication, snapshot, mirroring
- File & block replication
33Storage WAN for enterprise DR: TCP enhancement caveats
- Needs TCP enhancement
- Packet loss is an issue
- Long distance
- Big bandwidth
- Large amounts of data
- Data migration
- Volume replication
- Snapshots
- High IOPS
- Bulk data transfers
- May not need it
- Incremental data
- Only changed data
- Short time for net new data
- Asynchronous mirroring
- Short distance
- Small bandwidth
34Other issues to weigh
- Shared: cost & mgt.
- VLANs important
- Dedicated: cost & mgt.
- More flexibility
35Summary and conclusions
- Build your DR foundation 1st
- Calculate DR throughput requirements
- Pick WAN technology of choice
- If TCP, determine need for enhancement
- Implement
- Reassess quarterly