BID211: High Availability - PowerPoint PPT Presentation

1 / 75
About This Presentation
Title:

BID211: High Availability

Description:

BID211: High Availability & Disaster Recovery With Sybase IQ. Sid Sipes ... Without Raid 1 or Raid 5, the loss of a single disk platter will corrupt your IQ Database. ... – PowerPoint PPT presentation

Number of Views:327
Avg rating:3.0/5.0
Slides: 76
Provided by: Kar9252
Category:

less

Transcript and Presenter's Notes

Title: BID211: High Availability


1
BID211 High Availability Disaster Recovery
With Sybase IQ
Sid Sipes Principal System Consultantsipes_at_sybase
.com / 972-687-6477August 7, 2003
2
For This Presentation
  • Lets Keep It Interactive
  • I will entertain questions on a subject
  • Would be happy to speak to you offline, if
    desired
  • No Question is too Basic
  • Chances are others may have the same question
  • Open Question Forum at the end
  • Time Permitting

3
Target Audience
  • Data Warehouse Management
  • Database Administrators
  • All Levels
  • System Administrators
  • All Levels

4
Disaster Recovery
  • Agenda
  • We will be discussing
  • Why Do I Need to Backup A Data Warehouse?
  • How do you Backup Terabytes of data?
  • Are there any 3rd party tools?
  • Alternative Backup/Restore Techniques
  • We will also be discussing Disaster Recovery
    Planning and Implementation

5
Disaster Recovery
  • Why do I need to Backup My Database?
  • Why do I need a Disaster Recovery Plan?
  • There seems to be an attitude that Data
    Warehouses do not need the same type of
    safeguards that OLTP systems require!!!!

6
Disaster Recovery
  • Why Bother with Backups?
  • A Data Warehouse is Not Mission Critical
  • Who cares if it goes down for a while?
  • Only a few analysts will miss it!!!
  • Very few on-line users
  • I can always rebuild it
  • A Data Warehouse only contains historical data.
  • The source data is around, somewhere in the OLTP
    systems

7
Disaster Recovery
  • Heres Why
  • A Data Warehouse is Not Mission Critical
  • Not True Anymore!!!!
  • Many Data Warehouse systems are 24x7
  • Decisions made on very recent data
  • Many users are demanding historical data combined
    with Near-RealTime data.
  • Potential for many thousands of on-line users via
    web interface

8
Disaster Recovery
  • I can always rebuild it !
  • How?
  • Where is the Data?
  • Do you have access to source systems?
  • What impact will massive extracts have on source
    systems?
  • How efficient is your ETL process?
  • Do you have the network bandwidth to move that
    much data?

9
Disaster Recovery
  • I can always rebuild it !
  • Chances are great that you can not rebuild your
    Data Warehouse.
  • Not Enough Time
  • Data doesnt exist anymore (purged from OLTP
    systems)
  • Many, Many, Many, Many More reasons why!!!
  • Things only get worse with users breathing down
    your neck.

10
Disaster Recovery
  • It will Happen to YOU!!!!!!!
  • Backup / Recovery and Disaster Planning are
    Critical Components of any Data Warehouse
    Operational Plan.
  • Sooner or later, you will loose your database
  • Hardware Problem
  • Data Corruption
  • Gremlins
  • Plan for It

11
Disaster Recovery
  • Ok, How do I backup Terabytes of Data?
  • Just How Do I perform Backups Given
  • Huge Amounts of Data!!
  • Slow Backup Devices!!
  • Not Enough Time
  • What are the Alternatives?

12
Disaster Recovery
  • Ok, How do I backup Terabytes of Data?
  • How Much Time Does a Backup Take?
  • How Do I handle Terabytes of Data?
  • Question Recently Asked on IQUG!
  • Case Size Disk/ File
    Time Time Rate
  • (MB) Tape Size
    (Secs) (HHMM) (MB/Sec)

13
Disaster Recovery
  • Backup Rates
  • Best Rate Reported
  • 55.6 MByte/Sec
  • 5 Hours / TeraByte!!!
  • Slowest Rate Reported
  • 3.7 MByte/Sec
  • 75 Hours / TeraByte!!!!

14
Disaster Recovery
  • Another Interesting Case
  • Sybase ASIQ Client Recently Ran Tests on VLDB
  • Anticipating 10TB IQ Database
  • Tested Backup to Conventional Tape Arrays
  • 27 Tape Array (20/30 Gig Tapes)
  • Requires 400 Tapes to Backup Database
  • High Probability of Tape Failure
  • Requires 27 Hours to Backup Database

15
Disaster Recovery
  • Too Slow!!!!
  • What are the Backup Alternatives?
  • Sybase ASIQ Built-In Backup
  • ASIQ Backup Alternatives
  • Fast Extract
  • System Level Backup
  • Third Party Backup Utilities
  • Instant Disk Level Imaging

16
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Backup Utility Built Into Server
  • Adaptive Server IQ provides three types of
    backups
  • FULL
  • INCREMENTAL
  • INCREMENTAL SINCE FULL

17
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • FULL Backup
  • Causes a full backup of both the Catalog Store
    and the IQ Store.
  • FULL is the default action.

18
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • INCREMENTAL Backup
  • Makes a full backup of the Catalog Store,
  • Then backs up all changes to the IQ Store since
    the last IQ backup of any type.

19
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • INCREMENTAL SINCE FULL
  • Makes a full backup of the Catalog Store,
  • Then backs up all changes to the IQ store since
    the last full IQ backup.

20
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • All three backup types fully back up the Catalog
    Store.
  • In most cases, the Catalog Store is much smaller
    than the IQ Store.
  • Should be very fast
  • Temporary Store data is not backed up.
  • However, the meta data and any other information
    needed to recreate the Temporary Store structure
    is backed up.

21
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • BACKUP backs up committed data only.
  • Adaptive Server IQ backs up only those database
    blocks actually in use at the time of backup.
  • Free blocks are not backed up
  • It does not back up the transaction log file.
  • It does not use the transaction log to restore
    the database.

22
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Backups begin with an automatic checkpoint.
  • At this point, the backup program determines what
    data will be backed up.
  • It backs up the current snapshot version of your
    database as of the time of this checkpoint.
  • Any data that is not yet committed when this
    checkpoint occurs is not included in the backup

23
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • BACKUP always makes a full backup of the Catalog
    Store on the first archive device, and then backs
    up the data from the IQ Store in parallel across
    all of the devices you specify.
  • Blocks are not distributed evenly across archive
    media.
  • You may have more on one device than others,
    depending on the processing speed of individual
    threads

24
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • You can back up any IQ database onto either disk
    or magnetic tape.
  • AS IQ supports backup and restore using multiple
    tape drives at near device speeds, or to multiple
    disks.

25
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • When Backing Up / Restoring from Multiple
    Devices
  • Beware of Exceeding I/O Bandwidth to SAN/Disks
    with Too Many Tape Drives
  • Each Drive operates at about 20MB / Sec
  • Typical I/O Bandwidth is 1 GigaBit (200 MB/Sec)
  • 27 Drives 20 MB/Sec 540 MB/Sec I/O Requests
  • Youve just blown your bandwidth for a single
    Fiber Channel

26
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Adaptive Server IQ allows Stacker drives with
    multiple tapes
  • Adaptive Server IQ BACKUP does not support
  • jukeboxes
  • robotic loaders.
  • Quarter Inch Cartridge (QIC) drives.

27
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Is ASIQ Backup Fast Enough?
  • Probably not for Terabytes of Data
  • Should Consider Other methods
  • Tuning for Faster ASIQ Backups is Possible
  • May provide enough performance
  • Well explore other options in a few slides

28
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Increasing Performance
  • Increasing the number of archive devices
  • BACKUP and RESTORE write your IQ data in parallel
    to or from all of the archive devices you
    specify.
  • The Catalog Store is written serially to the
    first device.
  • Faster backups and restores result from greater
    parallelism.
  • To achieve greater performance when backing up or
    restoring a large database, specify more archive
    devices.

29
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Increasing Performance
  • Eliminating data verification
  • You can also improve the speed of backup and
    restore operations by setting
  • CRC OFF in the BACKUP command. This setting
    deactivates cyclical redundancy checking.
  • Default is CRC ON
  • WARNING Introduces some Risk

30
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Increasing Performance
  • Spooling backup data
  • You may find that it is faster and more efficient
    to create backups on disk, and then spool them
    onto tape for archival storage.
  • If you choose this approach, you need to unspool
    the data onto disk before restoring it.

31
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Increasing Performance
  • Increasing memory used during backup
  • The amount of memory used for buffers during
    backup directly affects backup speed, primarily
    for tape backups.
  • The BLOCK FACTOR parameter of the BACKUP command
    controls the amount of memory used.
  • If your backups are slow, you may want to
    increase the value of BLOCK FACTOR for faster
    backups.

32
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • Increasing Performance
  • Balancing system load
  • Adaptive Server IQ allows you to perform backups
    concurrently with all other read/write
    operations, except those that affect the
    structure of the database.
  • It is still a good idea to schedule backups
    during times of low system use, however, to make
    the best possible use of system resourcesdisk,
    memory, and CPU cycles.

33
Disaster Recovery
  • Sybase ASIQ Built-in Backup
  • OK, ASIQ Backup/Restore May Not Be Fast Enough.
  • What do I do?
  • Alternatives
  • Fast Extract
  • System Level Backups
  • Array Vendor Instant Backup
  • Shadow-Image / BCV
  • True-Copy
  • SRDF

34
Disaster Recovery
  • ASIQ Fast Extract.
  • Use ASIQ Built-In Fast-Extract option
  • SQL Query specifies what to Backup
  • Data extracted to ASCII or BINARY files
  • Single or Multiple Tables
  • Useful for Backing Up/Restoring Huge Tables
  • Data Restored using IQ SQL Load Syntax
  • Very Fast Operation

35
Disaster Recovery
  • ASIQ Fast Extract.
  • Downside Does not save/restore any metadata or
    system information.
  • Little or no error checking

36
Disaster Recovery
  • System Level (OS) Backups
  • You can use system-level backups for an IQ
    database.
  • For Example, DD raw devices.
  • Must be done carefully, and with certain
    safe-guards in place
  • If you attempt to restore your IQ database files
    from a system-level backup without these
    safeguards in place,
  • You will cause data loss or corruption, either
    from activity in the database while the
    system-level backup occurred, or from missing
    files.

37
Disaster Recovery
  • System Level (OS) Backups
  • Shutting down the database (non-multiplex)
  • Your IQ database MUST NOT be running during a
    system-level backup.
  • You must shut down your IQ database before
    starting the system-level backup.
  • You must also ensure that no one starts the IQ
    database until the system-level backup is complete

38
Disaster Recovery
  • System Level (OS) Backups
  • Shutting down the database (multiplex)
  • Your multiplex WRITE SERVER MUST NOT be running
    during a system-level backup.
  • You must shut down your write server before
    starting the system-level backup.
  • You must also ensure that no one starts the write
    server until the system-level backup is complete.

39
Disaster Recovery
  • System Level (OS) Backups
  • You must Backup the following Files
  • SYSTEM dbspace file,
  • typically named dbname.db
  • The transaction log file, which is required for
    system recovery,
  • typically named dbname.log
  • The IQ_SYSTEM_MAIN dbspace file,
  • typically named dbname.iq
  • Or the corresponding raw device(s).
  • Files for any additional dbspaces that have been
    added to IQ_SYSTEM_MAIN

40
Disaster Recovery
  • System Level (OS) Backups
  • You only need to save the lengths of the
    following files
  • The IQ_SYSTEM_TEMP dbspace file,
  • typically named dbname.iqtmp
  • Or, the corresponding raw device(s) size.
  • Files / Devices for any additional dbspaces that
    have been added to IQ_SYSTEM_TEMP

41
Disaster Recovery
  • System Level (OS) Backups
  • System Level Backup
  • It is not required that you back up the temporary
    dbspaces.
  • IQ can reconstruct any temporary dbspace provided
    that it sees a file of the correct length at the
    time the database starts.
  • Therefore, you may simply keep records of the
    sizes of the files or raw devices used to hold
    the temporary dbspaces.

42
Disaster Recovery
  • System Level (OS) Backups
  • Optional Files
  • The ASCII message files such as dbname.iqmsg
  • The ASLOGDIR/.svrlog
  • The ASLOGDIR/.stderr files.
  • Not Required, but contain useful information
  • ASIQ Engineering may need these files, should
    something go catastrophically wrong during the
    restore.

43
Disaster Recovery
  • Third Party Backup Utilities
  • Veritas Net Backup
  • Now GA
  • Certified (by Veritas and Sybase) for Sybase ASIQ
  • Requires ASIQ 12.4.3 EBF 4 (or Newer)
  • Similar in Implementation to Net Backup Agent for
    Sybase ASE
  • Will Fit within Existing Veritas Installations

44
Disaster Recovery
  • Third Party Backup Utilities
  • Veritas Net Backup
  • Fits within Veritas Enterprise Architecture
  • Works with
  • Veritas Master Node (Scheduler)
  • Veritas Media Server
  • Veritas Client (Net Backup Agent

45
Disaster Recovery
  • Third Party Backup Utilities
  • Veritas Net Backup
  • Currently Single-Threaded.
  • Can only write to a single archive device at a
    time.
  • Sybase and Veritas working to add parallelism

46
Disaster Recovery
  • Third Party Backup Utilities
  • Sun Solaris HA Suite
  • Provides ability to take Snap-Shot of Disk
    Devices
  • Full Snap-Shot
  • Incremental Snap-Shots
  • Might be able to provide Point-In-Time recovery.
  • MUST OBSERVE Same SAFEGARDS as OS/System Level
    Backups
  • Simplex Server must be shutdown
  • Multiplex Write Server must be shutdown

47
Disaster Recovery
  • What About Disk Array Replication?
  • Array Vendors (EMC, Hitachi, Sun) offer the
    ability to remotely replicate all of your IQ data
    base devices
  • EMC SRDF
  • Hitachi True Copy?
  • Replication is Transparent to IQ
  • Entire Database is Replicated
  • Catalog
  • IQ Store
  • Transaction Log

48
Disaster Recovery
  • Disk Array Replication
  • ASIQ Is Certified with the Vendors
  • EMCC
  • Hitachi
  • Hitachi

49
Disaster Recovery
  • Disk Array Replication
  • Instant Recoverability
  • Little Down Time
  • Both Local and Remote copies of data
  • True Disaster Prevention
  • Some of you are already backing up to disk
  • This will automate the process
  • You can still perform tape backups on remote copy
    of database

50
Disaster Recovery
  • Disk Array Replication
  • How Does it Work?
  • As changes are made to disk blocks, the disk
    array controller
  • captures the changes
  • Replicates in near-real-time the changed block to
    a mirroring set of disks.
  • Similar to Disk Mirroring
  • Can replicate to other Array Frames across LAN
  • Can replicate to remote frames, across WAN.

51
Disaster Recovery
  • Disk Array Replication
  • How Does it Work?
  • Capture and replication of changed data blocks
    are transparent to IQ.
  • Everything happens at the Array level.
  • Maintains a complete (near-real-time) copy of
    your entire database
  • Catalog Store
  • IQ devices (temp and main)
  • Any other File Systems or Raw Devices that are
    Critical to You.

52
Disaster Recovery
  • Disk Array Replication
  • How Does it Work?
  • At some point in time (Weekly?) you can
  • Verify the Integrity of your Database
  • Shut down IQ
  • Simplex Shutdown entire Server
  • Multiplex Shutdown Write Server
  • Break the Synchronization to the Replicated Copy
  • You now have a valid/verified Backup of IQ

53
Disaster Recovery
  • Disk Array Replication
  • How Does it Work?
  • If Primary IQ System Fails, You can start another
    Server on Replicated Copy of Database.
  • While Synchronization is Broken, you can copy
    replicated database to tapes.
  • After backup is complete, you can turn
    Replication back on.
  • Replicate Copy will be re-synched with Master

54
Disaster Recovery
  • Disk Array Replication
  • Case Study by ASIQ Client
  • One Primary IQ Database
  • Two Replicated Copies.
  • First Copy is Synchronized with the Master Daily.
  • Synch is broken before new loads
  • If loads fail/corrupt database, Client can switch
    to first Replicated Copy.

55
Disaster Recovery
  • Disk Array Replication
  • Case Study by ASIQ Client
  • Two Replicated Copies.
  • Second Copy is Synchronized with the Daily Copy
    at end of Each Week.
  • This copy will be 1-7 Days out of currency
  • This copy will be backed up to tapes
  • In Worst Case, Primary has Failed, and Daily Copy
    is corrupt.
  • Client can switch to weekly copy
  • If Weekly Copy fails,
  • The Client has a tape library to restore from

56
Disaster Recovery
  • Disk Array Replication
  • Advantages
  • Quickly and easily replicate/copy huge IQ
    Databases
  • Can provide more than one Disk Based Copy
  • Can Provide Remote copy over WAN
  • Almost Instant Recoverability after
    Database/Device Failure.

57
Disaster Recovery
  • Disk Array Replication
  • Disadvantages
  • Expensive
  • You need 2-3 times disk space
  • Not an Issue because of IQ Compression
  • Right????????
  • Some of you are already backing up to disk
  • Have to purchase replication software from array
    vendors

58
Disaster Recovery
  • Disk Array Replication
  • However
  • Even though it may be more expensive. Array
    replication may be the only way to backup huge,
    multi-terabyte databases.
  • Cost of disk is coming down all the time

59
Disaster RecoveryDisk Array Replication, Final
Thoughts
  • Volume Manager
  • Not a necessary or recommended component for IQ-M
  • Additional overhead and software that has no
    value add to an IQ-M installation
  • Acceptable for an IQ-M single node operation
    where hardware does not have the ability to apply
    RAID to the disk devices
  • RAID Level
  • Recommend using RAID 5 as a blend between
    performance, protection, and cost
  • Recommend using raw devices (a must for
    multi-node IQ-M)
  • For details on drive array specifications and
    configurations see the Sun Reference Architecture
    Whitepapers and EMC Whitepapers on array
    configurations with IQ-M

60
Disaster Recovery
  • Method Summarization
  • Thats It, In Summary
  • Several Backup Alternatives
  • ASIQ Backup
  • Fast Extract
  • System Level Backups
  • OS Level HA Features
  • Automated Remote Database Device Replication

61
Disaster Recovery
  • Wait, Havent We forgotten Something?
  • What about a Plan?
  • You Will eventually have a situation from which
    you need to recover!!!
  • Hardware Problem?
  • ASIQ Corruption? (Surely Not!!!)
  • OS Level Corruption?
  • Entry-Level /Newbee DBAs and SAs

62
Disaster Recovery
  • Disaster Recovery Plan
  • Start at the Lowest Level First (Hardware).
  • You need to have some form of RAID, even if you
    have a small system.
  • Without Raid 1 or Raid 5, the loss of a single
    disk platter will corrupt your IQ Database.
  • With Raid 1 or Raid 5, the loss of a platter will
    go un-noticed by ASIQ and the Users.
  • Your SAs will have opportunity and time to
    replace platter(s)

63
Disaster Recovery
  • Disaster Recovery Plan
  • Move Up a Little ( Still Looking at Hardware).
  • You should have multiple I/O paths to Disk/Array
    Devices.
  • Loss of a Single Path wont bring IQ down.

64
Disaster Recovery
  • Disaster Recovery Plan
  • Move Up a Little More
  • Think about completely documenting the Hardware
    and Operating System configuration
  • What Disk Devices you have
  • Device Sizes
  • Paths / Logical Paths to Devices
  • OS Kernal Parameters
  • User Accounts / Privileges
  • Any Scripts / SQL you use to perform you work.

65
Disaster Recovery
  • Disaster Recovery Plan
  • Move Up a Little More
  • Document your ASIQ Configuration
  • Device Usage and Sizes
  • System Option Settings
  • Users / Groups / Privileges
  • Extract Schema
  • Any Scripts / SQL you use to perform your daily
    work.

66
Disaster Recovery
  • Disaster Recovery Plan
  • Now, How are you going to backup your database?
  • Smaller Databases ( lt 1TB) IQ Backup is a good
    alternative
  • Larger Databases Look at
  • OS Level Backup
  • Disk Array Replication.

67
Disaster Recovery
  • Disaster Recovery Plan
  • If you choose an OS Level Backup
  • Do you have enough Down-Time for IQ
  • Remember, Simplex must be completely down.
  • Multiplex Write Server must be completely down.
  • Where are you going to backup data too?
  • Disks?
  • Tapes?

68
Disaster Recovery
  • Disaster Recovery Plan
  • If you choose an Array Replication Method
  • Do you have the budget for 2-3 times the disk
    space?
  • Do you have the budget for the replication
    software.
  • Do you have a Storage / SAN expert on Staff?

69
Disaster Recovery
  • Disaster Recovery Plan
  • Finally, Document a Plan
  • What methodologies you have chosen.
  • Whom is responsible for each component of the
    plan.
  • There should be a checklist
  • Step-by-Step Actions
  • Responsible person for each action
  • Backup person for each action.

70
Disaster Recovery
  • Disaster Recovery Plan
  • Is That It?
  • NOT QUITE You need to rehearse a recovery!!!!
  • You wont know if your recovery methodology will
    work, unless you test it.
  • This will also provide a base time-line for a
    recovery process
  • Any surprises can be uncovered during the
    rehearsal.
  • You will be setting realistic expectations for
    everybody.

71
Disaster Recovery
  • Now We are Really Done!!
  • Hopefully, you now understand the need for
    Disaster Recovery Planning, and understand the
    basics of
  • How ASIQ Backup / Restore Works
  • Alternatives to ASIQ Backup / Restore
  • Understanding of Instant Remote Backup

72
Disaster Recovery
  • Remember
  • You will suffer a catastrophe at some point
  • You need to plan for this event
  • You need to rehearse for this event.
  • When It happens, and you successfully, and
    smoothly recover,
  • You will be the HERO!!!

73
Disaster Recovery
  • Thats it
  • Go Forth and Safely Compute
  • See Ya Next Year
  • Sid

74
Disaster Recovery
  • Questions?

75
BID211 High Availability Disaster Recovery
With Sybase IQ
Sid Sipes Principal System Consultantsipes_at_sybase
.com / 972-687-6477August 7, 2003
Write a Comment
User Comments (0)
About PowerShow.com