Experiences with D/R Procedures - PowerPoint PPT Presentation

1
Experiences with D/R Procedures
  • Of ADABAS Data on Mainframes
  • Natural Conference Boston
  • Dieter W. Storr
  • May 2004
  • info_at_storrconsulting.com

2
(No Transcript)
3
Different Disaster Different Action
  • Unplanned downtime
  • Machine outages
  • Network outages
  • Software failures
  • Disaster
  • Site / data center loss
  • Catastrophic failure

4
Leading Causes of Downtime
Source: DRJ, Summer 2002, Volume 15, Number 3
  • Power Outage
  • Storm Damage
  • Flood
  • Terrorism
  • Sabotage
5
Other Causes of Downtime
  • Fire
  • Earthquake
  • Computer Crime

6
LA Times Downtime
  • Flood damage, 21 April 2002
  • Water flooded the Orange County facility after a
    14-inch pipe that supplies the fire-sprinkler
    system burst, leaving half the facility standing
    in more than a foot of muddy water
  • Affected areas: editorial, ad ops, IT, HR; ADABAS
    was not affected

7
LA Times Downtime
  • Bomb Alarm 14 June 2002
  • A bomb was believed to have been left in the Bank
    of America branch that's set into the Times
    Building
  • Security swept the building,
  • DBAs observed the system from home

8
LA Times Downtime
  • Bomb Alarm 29 July 2002
  • An intruder claimed to have a bomb, darted into
    the garage
  • Security swept the building,
  • OP stopped CA7, so PLOGCOPY couldn't start
    automatically; two PLOGs got full and ADABAS was
    locked; DBAs later started the PLCOPY jobs
    manually

9
LA Times Downtime
  • Power Outage - 29 August 2002 (3:43 P.M.)
  • City (DWP) power grid: flood water leaked into a
    DWP transformer
  • There were actually 2 spikes/outages, the first
    started the UPS switchover, which was interrupted
    by the second, which took the UPS down.

10
LA Times Downtime
  • Power Outage - cont
  • The network was back in service after a short
    delay.
  • Our Unix-based servers were restarted, and
    checked. There was no evidence of damage to the
    Sybase Adaptive Server Enterprise (ASE, formerly
    Sybase SQL Server) servers.

11
LA Times Downtime
  • Power Outage - cont
  • Mainframe recovery was delayed due to corruption
    to the Hardware Management Console (HMC)
  • OP did a power-on reset, which restored the HMC
  • Operations IPLed, and Technical Support proceeded
    with system checkout procedures.
  • Although Enterprise Storage Server (ESS) had an
    error indicator, it was still up and did not add
    to any outages
  • IBM reset error indicator without impact.

12
LA Times Downtime
  • Power Outages - cont
  • Started ADABAS servers manually: Parm Error 23,
    DIB block remained after an abnormal termination
  • Started all servers with IGNDIB=YES; 18:25
    ADABAS IS ACTIVE, no ADAN58 message

13
LA Times Downtime
  • ADAN58 Message (ADA71 ADAN5A)
  • ADAN58 BUFFER-FLUSH START RECORD DETECTED DURING
    AUTORESTART.
  • THE NUCLEUS WILL T E R M I N A T E AFTER
    AUTORESTART. IN CASE OF POWER FAILURE, THE
    DATABASE MIGHT BE INCONSISTENT BECAUSE OF
    PARTIALLY WRITTEN BLOCKS.
  • O N L Y IN THIS CASE, REPAIR THE DATABASE BY
    RESTORE AND REGENERATE; OTHERWISE RESTART THE
    NUCLEUS.
  • ADAN5A FILES MODIFIED DURING AUTORESTART: files

14
Power Failure During Buffer Flush
(Diagram: block sets A B C D, E F C H, and E F C D on disk; legend: old block, updated block, partially updated block)
15
Nucleus Restart After Power Failure - IGNDIB=YES
<snip>
ADA200 00230 User exit 2 active.
ADA201 00230 PLOG2 closed. ADAP3X2P submitted.
ADAN21 00230 PROTECTION-LOG PLOGR1 STARTED
ADAN02 00230 NUCLEUS-RUN WITH PROTECTION-LOG 00677
ADAL02 00230 2002-08-29 18:25:18 CLOGRS IS ACTIVE
ADAN03 00230 ADABAS COMING UP
ADAN5A 00230 FILES MODIFIED DURING AUTORESTART
ADAN5A 00230 00038 00057 00069 00072 00073 00074
ADAN5A 00230 00075 00076 00104 00138 00139 00148
ADAN5A 00230 00195 00221 00243
ADAN19 00230 RUNNING WITH ASYNCHRONOUS BUFFERFLUSH
ADAN8Y 00230 FILE-LEVEL CACHING INITIALIZED
ADAN80 00230 ADABAS DYNAMIC CACHING ENVIRONMENT ESTABLISHED.
ADAN01 00230 A D A B A S V6.2.2 IS ACTIVE
ADAN01 00230 MODE MULTI I S O L A T E D
ADAN01 00230 RUNNING WITHOUT RECOVERY-LOG
ADA800 00230 User exit 8 active.
<snip>
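After a restart like this one, the files named in the ADAN5A messages are the ones worth checking first. The following is a minimal Python sketch, not part of the original presentation, that pulls those file numbers out of a saved copy of the nucleus log. It assumes each message line has the layout "message id, DBID, file numbers"; the function name `files_modified` and the sample text are illustrative only.

```python
import re

# Sample lines modeled on the ADAN5A messages shown in the log above.
LOG = """\
ADAN03 00230 ADABAS COMING UP
ADAN5A 00230 FILES MODIFIED DURING AUTORESTART
ADAN5A 00230 00038 00057 00069 00072 00073 00074
ADAN5A 00230 00075 00076 00104 00138 00139 00148
ADAN5A 00230 00195 00221 00243
ADAN19 00230 RUNNING WITH ASYNCHRONOUS BUFFERFLUSH
"""


def files_modified(log_text):
    """Return the file numbers listed on ADAN5A continuation lines.

    Assumed layout (hypothetical, inferred from the slide):
    message id, DBID, then only 5-digit file numbers.
    """
    files = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) > 2 and parts[0] == "ADAN5A":
            rest = parts[2:]
            # The header line contains words, not numbers, and is skipped.
            if all(re.fullmatch(r"\d{5}", p) for p in rest):
                files.extend(int(p) for p in rest)
    return files


if __name__ == "__main__":
    print(files_modified(LOG))
```

The resulting list could then drive follow-up checks on exactly those files, which is cheaper than validating every file in the database.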
16
LA Times Downtime
  • Power Outage - cont
  • Switched all PLOGs
  • Checked batch and online
  • There was no evidence of damage to any of the
    ADABAS components.

17
Other LA Times Disasters
  • 1965 Watts riots
  • 1971 Sylmar quake 6.5
  • 1987 Whittier punch 5.9
  • 1992 LA riots
  • 1994 Northridge quake 6.7
  • 6 Feb 1998 El Nino, flooding in B-1 computer
    room
  • 15 April 1999 Power failure news editing

18
ADABAS Recovery
CLOG
  • Command Log (CLOG) Failure - I/O Error
  • Restore or reallocate/format the CLOG
  • ADABAS will come up through Autorestart normally
  • No data loss if CLOG is not used

19
ADABAS Recovery
PLOG
PLOG
  • Protection Log (PLOG) Failure - I/O Error
  • Restore or reallocate/format the PLOG
  • Take a full back-up of the database
  • ADABAS will come up through Autorestart normally
  • Restart batch jobs
  • Restartable batch jobs OK
  • Non-restartable batch jobs check

20
ADABAS Recovery
TEMP
SORT
  • TEMP and SORT Failure - I/O Error
  • Restore or reallocate/format the TEMP/SORT
    dataset
  • Different actions for the utilities
  • See the ADABAS Utilities manuals

21
ADABAS Recovery
DSIM
  • DSIM Failure - I/O Error
  • Restore or reallocate/format a DSIM dataset
  • Different actions for the utilities
  • See the ADABAS Utilities manuals

22
ADABAS Recovery
RLOGM
RLOGR
  • Recovery Aid Dataset Failure - I/O Error
  • Restore or reallocate/format a RLOG dataset
  • Prepare the RLOG dataset
  • ADARAI PREPARE RLOGSIZE / RLOGDEV.
  • Different actions for the utilities
  • See the ADABAS Utilities manuals
  • Take a full back-up of the database
  • This will start the first generation of the RLOG
    dataset

23
ADABAS Recovery
ASSO
ASSO
DATA
DATA
  • ASSO/DATA Failure - I/O Error
  • Copy PLOG twice - ADARES PLCOPY
  • Restore or reallocate/format DATA dataset(s)
  • Instead of reallocating/formatting and restoring
    all DATA volumes, system specialists can:
  • Reallocate and format the new volume
  • Restore the VTOC chain
  • Restore and Regenerate only files that were
    located on the failed volume
  • Otherwise, . . .

24
ADABAS Recovery
ASSO
ASSO
DATA
DATA
  • ASSO/DATA Failure - I/O Error
  • Restore entire database
  • ADASAV RESTORE OVERWRITE for GCB
  • ADASAV RESTONL OVERWRITE, include PLOG
  • Start nucleus with UTIONLY=YES
  • Regenerate updates from end of last save (SYN2)
  • ADARES REGENERATE PLOGNUM=xxx
  • ADARES FROMCP=SYN2,FROMBLK=xxx

25
ADABAS Recovery
ASSO
ASSO
DATA
DATA
  • ASSO/DATA Failure - I/O Error
  • Possible utilities need to be rerun (see ADARES)
  • ADALOD LOAD FILE=xxx
  • ADALOD UPDATE FILE=xxx
  • ADALOD UPDATE FILE=xxx,DDISN
  • ADAINV INVERT FILE=xxx,FIELD=xx
  • Lock files to rerun utilities
  • ADADBS OPERCOM LOCKU=xx
  • Unlock utility-only status
  • ADADBS OPERCOM UTIONLY=NO

26
ADABAS Recovery
ASSO
ASSO
DATA
DATA
  • ASSO/DATA Failure - I/O Error
  • Rerun the regenerate function for the relevant
    files
  • Unlock the regenerated files
  • ADADBS OPERCOM UNLOCKU=xx
  • Don't repeat these steps if ADARES points out:
  • ADALOD LOAD FILE=nn
  • ADARES REGENERATE FILE=nn
  • ADADBS REFRESH FILE=nn
  • Nucleus is ready

27
ADABAS Recovery
WORK1
WORK2
WORK3
  • WORK 1 Failure - I/O Error
  • Restore or reallocate/format the WORK dataset
  • Restore and regenerate the entire database to
    avoid inconsistencies (open transactions). See
    ASSO/DATA failure

28
ADABAS Recovery
WORK1
WORK2
WORK3
  • WORK 2/3 Failure - I/O Error
  • End the database normally (ADAEND) to avoid open
    transactions in part 1 of WORK
  • Restore or reallocate/format the WORK dataset
  • Restart the database normally
  • If database abends then restore and regenerate
    the entire database - see ASSO/DATA failure

29
ADABAS Recovery
DATA
DS
DS
  • Failure in Data Storage Blocks
  • //DDSIIN DD DSN=SAVE.SIBA.
  • // DD DSN=PLCOPY.LOG1
  • // DD DSN=PLCOPY.LOG2
  • //DDCARD DD *
  • ADARES REPAIR DSRABN=xxx-yyy
  • ADARES FILE=n1,n2,n3
  • Failure in DSST
  • ADADCK DSCHECK FILE=xxx
  • ADADCK REPAIR

DS
CALL SAG ! !
30
ADABAS Recovery
ASSO
CP
DATA
  • Nucleus Ends With RC 77
  • Not restartable
  • No more space for Checkpoint File (CP)
  • Rename old WORK
  • Allocate/format new WORK with old space
  • Change high-used RABN and high-used ISN
  • Restart nucleus with new WORK and UTIONLY=YES
  • Nucleus is in crippled mode - no user has
    access
  • Expand the database
  • Stop the nucleus normally
  • Rename old WORK and restart the nucleus with old
    WORK (autorestart)

CP
31
ADABAS Recovery
ASSO
User
DATA
  • Nucleus Ends With RC 77
  • Not restartable
  • No more space for user files
  • Rename old WORK
  • Allocate/format new WORK with old space
  • Restart nucleus with new WORK and UTIONLY=YES
  • Nucleus is in crippled mode - no user access
  • Expand database
  • Stop nucleus normally
  • Rename old WORK and restart nucleus with old WORK
    (autorestart)

User
32
ADABAS Recovery
ASSO
DATA
  • Nucleus Abends - Missed DE Values
  • Descriptor is marked in FDT as DE; value doesn't
    exist in ASSO, but in DATA.
  • Check
  • ADAICK ICHECK FILE=xxx,NOOPEN
  • ADAVAL VALIDATE FILE=xxx,DESCRIPTOR=yy
  • Solution 1
  • ADAULD UNLOAD FILE=xxx,UTYPE=EXF
  • ADALOD LOAD FILE=xxx,LWP=yyyyK
  • Solution 2
  • ADADBS RELEASE FILE=xxx,DESCRIPTOR=yy
  • ADAINV INVERT FILE=xxx,FIELD=yy,LWP=...

CALL SAG ! !
33
Back-up Possibilities
  • ADASAV to tape / disk
  • Including Fast Dump Restore, DFDSS
  • Delta Save Facility (DSF)
  • Delta Save QDUMP (Legent)
  • Disk mirroring (hardware level)
  • FlashCopy of Enterprise Storage Server (ESS)
  • Peer-to-Peer Remote Copy Extended Distance
    (PPRC-XD)
  • OC-3 links two EMC disc arrays
  • Replication
  • Stand-by systems
  • Restore and Regenerate
  • Entire Transaction Server

ASSO
DATA
34
ADABAS Disaster Recovery
  • How to back-up
  • Collect recovery data
  • Restore w/o nucleus
  • Start nucleus w/ UTIONLY=YES
  • Regenerate w/ nucleus
  • Switch UTIONLY=NO

35
ADABAS 6.2.2 Back-up at LA Times
(Timeline diagram; time marks: 21:00, 01:00, 02:00, 03:00, 8:00 - 11:00, 12:00, Weekly)
  • ADAP1BKF: Online SAVE
  • ADAP1PLC (FEOFPL)
  • ADAP1PLC: PLOG Switch
  • ASSO / DATA / WORK / etc.
  • BRM/ABARS: Several Jobs
  • DFDSS: Full-Volume Back-up
  • ADAP1BKO: Copy Tapes
  • PDS, GDGs etc.
  • Pick-up by Recall
36
Production Database Back-ups
ADASAV SAVE BUFNO=2,TTSYN=60
Record format . . . VB
Record length . . . 27994
Block size . . . . 27998
BUFNO=30
37
Back-up to SMS Disk Pool
  • Run times are consistently at least 80% lower
    when writing to disk instead of cartridge
  • Run times are consistently around 60% lower when
    copying from disk to cartridge (compared with
    cart to cart)
  • DFSMShsm: automate your storage management
    tasks; SMS Production Storage Pool

DFSMShsm
38
Back-up to Disk Pool
  • No cartridge errors
  • No cartridge drive errors
  • No cartridges get accidentally ejected from the
    silo
  • Smaller back-up window
  • Smaller maintenance windows
  • Less impact to application processes
  • Greater confidence that the data you need will be
    there when you need it

39
IBM Magstar 3494/Virtual Tape Server
  • Linear design
  • 1 - 18 frames
  • Conf. Flexibility
  • SCSI, FC, ESCON, FICON
  • 3590, 3490E, VTS
  • High availability
  • Dual robotics
  • Dual library manager

  • >42 old 3490 carts will fit on 1 new 3494 cart
  • 5 x 3390 volumes fit on one 3494 cart
  • One 3494 cart can be read in 45 seconds into the
    VTS disk cache (RAID-5)
40
Virtual Tape Concept
  • Virtual tape drives
  • Appear as multiple 3490E tape drives
  • 3490E Media 1 and 2 support
  • Shared / partitioned like real tape drives
  • Tape Volume Caching
  • All data access is to cache
  • Improves mount performance
  • LRU Cache management
  • Volume Stacking
  • Fully utilizes physical cart capacity
  • Reduces physical cart requirement
  • Reduces footprint requirement

(Diagram: virtual drives 1 to n at addresses 180, 181, ... 19F; tape volume cache holding virtual volumes 1 to n; logical volumes 1 to n stacked on a Magstar 3590 cartridge, 30/60 GB capacity, assuming 3:1 compression)
41
Performance Tests
42
Collecting Data For Recovery
Block Ranges SYN1 - SYN2 For ADASAV RESTORE
From ADASAV SAVE: PROTECTION LOG PLOGNUM=64, SYN1=4695, SYN2=4698
From ADAREP:
SYN1 06 UTI 2002-09-23 21:00:09 64 4695 DUAL ADAP1BKF
SYNP 06 UTI 2002-09-23 21:00:12 64 4696 DUAL ADAP1BKF
SYN2 06 UTI 2002-09-23 21:01:37 64 4698 DUAL ADAP1BKF
SYNV 0A UTI 2002-09-23 21:01:40 64 4699 DUAL ADAP1BKF
SYNV 0A UTI 2002-09-23 21:01:40 64 4700 DUAL ADAP1BKF
SYNV 28 UTI 2002-09-23 21:02:08 64 4702 DUAL ADAP1PLC
SYNP 28 UTI 2002-09-23 21:02:08 64 4703 DUAL ADAP1PLC
<snip>
EOD  00 ET  2002-09-23 23:30:03 64 4747 DUAL ADAPRREP
SYNS 53 ET  2002-09-23 23:30:25 64 4749 DUAL ADAP1REP
SYNV 28 UTI 2002-09-23 23:30:30 64 4750 DUAL ADAP1PLC
SYNP 28 UTI 2002-09-23 23:30:31 64 4751 DUAL ADAP1PLC
43
Collecting Data For Recovery
Block Ranges SYN2 - End For ADARES REGENERATE
From ADAREP:
SYN1 06 UTI 2002-09-23 21:00:09 64 4695 DUAL ADAP1BKF
SYNP 06 UTI 2002-09-23 21:00:12 64 4696 DUAL ADAP1BKF
SYN2 06 UTI 2002-09-23 21:01:37 64 4698 DUAL ADAP1BKF
SYNV 0A UTI 2002-09-23 21:01:40 64 4699 DUAL ADAP1BKF
SYNV 0A UTI 2002-09-23 21:01:40 64 4700 DUAL ADAP1BKF
SYNV 28 UTI 2002-09-23 21:02:08 64 4702 DUAL ADAP1PLC
SYNP 28 UTI 2002-09-23 21:02:08 64 4703 DUAL ADAP1PLC
<snip>
EOD  00 ET  2002-09-23 23:30:03 64 4747 DUAL ADAPRREP
SYNS 53 ET  2002-09-23 23:30:25 64 4749 DUAL ADAP1REP
SYNV 28 UTI 2002-09-23 23:30:30 64 4750 DUAL ADAP1PLC
SYNP 28 UTI 2002-09-23 23:30:31 64 4751 DUAL ADAP1PLC
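Looking up the SYN1 and SYN2 block numbers by eye is error-prone under D/R pressure. The sketch below, which is not from the original slides, shows one way the lookup might be scripted in Python. It assumes each checkpoint row has nine whitespace-separated fields (name, type, mode, date, time, PLOG number, block, DUAL, job name), as the rows on this slide suggest; `Checkpoint`, `parse`, and `save_range` are hypothetical names.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    name: str      # checkpoint name, e.g. SYN1, SYN2, SYNP
    plognum: int   # protection log (session) number
    block: int     # PLOG block number
    job: str       # job that wrote the checkpoint


# Rows copied from the checkpoint list on the slide (time left unformatted).
REPORT = """\
SYN1 06 UTI 2002-09-23 210009 64 4695 DUAL ADAP1BKF
SYNP 06 UTI 2002-09-23 210012 64 4696 DUAL ADAP1BKF
SYN2 06 UTI 2002-09-23 210137 64 4698 DUAL ADAP1BKF
SYNV 0A UTI 2002-09-23 210140 64 4699 DUAL ADAP1BKF
SYNV 28 UTI 2002-09-23 210208 64 4702 DUAL ADAP1PLC
"""


def parse(report):
    """Parse rows under the assumed nine-field layout; skip anything else."""
    checkpoints = []
    for line in report.splitlines():
        f = line.split()
        if len(f) == 9:
            checkpoints.append(Checkpoint(f[0], int(f[5]), int(f[6]), f[8]))
    return checkpoints


def save_range(checkpoints, job):
    """Return the (SYN1, SYN2) checkpoint pair written by the save job."""
    syn1 = next(c for c in checkpoints if c.name == "SYN1" and c.job == job)
    syn2 = next(c for c in checkpoints
                if c.name == "SYN2" and c.job == job and c.block >= syn1.block)
    return syn1, syn2


if __name__ == "__main__":
    syn1, syn2 = save_range(parse(REPORT), "ADAP1BKF")
    print(f"PLOGNUM={syn1.plognum}, SYN1={syn1.block}, SYN2={syn2.block}")
```

SYN1 to SYN2 is the range needed for the online restore, and SYN2 is the starting point for the regenerate.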
44
Collecting Data For Recovery
Dataset Name From Back-up Job (GDG) For ADASAV RESTORE
ADABAS.PRODOFFD.DB1.BACKUP.FULL.G0842V00 CATALOGED
45
Collecting Data For Recovery
Dataset Names From PLOG Copy Jobs (GDG)
Matching block numbers 4695 - End, for ADASAV RESTORE and ADARES REGENERATE
DDSIAUS1 OUTPUT VOLUME=WRK015, SESSION NR=64
FROMBLK= 1214, FROMTIME=2002-09-23 03:30:24
TOBLK= 4701, TOTIME= 2002-09-23 21:01:42
ADABAS.PROD.DB1.PLOG.COPY.G7170V00
DDSIAUS1 OUTPUT VOLUME=WRK015, SESSION NR=64
FROMBLK= 4702, FROMTIME=2002-09-23 21:02:08
TOBLK= 4748, TOTIME= 2002-09-23 23:30:03
ADABAS.PROD.DB1.PLOG.COPY.G7171V00
DDSIAUS1 OUTPUT VOLUME=WRK004, SESSION NR=64
FROMBLK= 4749, FROMTIME=2002-09-23 23:30:25
TOBLK= 4791, TOTIME= 2002-09-24 03:30:33
ADABAS.PROD.DB1.PLOG.COPY.G7172V00
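Matching block numbers to PLOG copy generations is a simple interval-overlap test. Here is a small Python sketch of it, not taken from the presentation; it assumes all copies belong to the same PLOG session (64 here), and the function name `copies_covering` is hypothetical.

```python
# (dataset name, FROMBLK, TOBLK) as reported by the PLOG copy jobs above.
PLOG_COPIES = [
    ("ADABAS.PROD.DB1.PLOG.COPY.G7170V00", 1214, 4701),
    ("ADABAS.PROD.DB1.PLOG.COPY.G7171V00", 4702, 4748),
    ("ADABAS.PROD.DB1.PLOG.COPY.G7172V00", 4749, 4791),
]


def copies_covering(copies, first_block, last_block=None):
    """Select the PLOG copies that overlap [first_block, last_block].

    last_block=None means "through the end of the available logs".
    Assumes every copy is from one session, so block numbers are comparable.
    """
    selected = []
    for dsn, from_blk, to_blk in copies:
        if to_blk < first_block:
            continue  # copy ends before the range starts
        if last_block is not None and from_blk > last_block:
            continue  # copy starts after the range ends
        selected.append(dsn)
    return selected


if __name__ == "__main__":
    # Online restore needs SYN1..SYN2 (blocks 4695..4698).
    print(copies_covering(PLOG_COPIES, 4695, 4698))
    # Regenerate needs SYN2 (block 4698) through the end.
    print(copies_covering(PLOG_COPIES, 4698))
```

With the values on this slide, the first call selects only generation G7170 and the second selects all three, which matches the datasets used in the restore and regenerate jobs that follow.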
46
Recovery - Part 1 - W/O Nucleus
ADASAV RESTONL
<snip>
//RESTONL EXEC ADASAVRD
//DDREST1 DD DISP=SHR,BUFNO=30,
//           DSN=ADABAS.PRODOFFD.DB1.BACKUP.FULL.G0842V00
//DDPLOG  DD DISP=SHR,BUFNO=30,
//           DSN=ADABAS.PROD.DB1.PLOG.COPY.G7170V00
//DDKARTE DD *
ADASAV RESTONL BUFNO=2,OVERWRITE
//REPORT  EXEC ADAREP
//DDKARTE DD *
ADAREP NOFILE
//
47
Recovery - Part 2
Start the ADABAS nucleus with normal JCL (UTIONLY=YES)
<snip>
ADAN21 00215 PROTECTION-LOG PLOGR1 STARTED
ADAN02 00215 NUCLEUS-RUN WITH PROTECTION-LOG 00064
ADAL02 00215 2002-09-21 21:20:29 CLOGRS IS ACTIVE
ADAN03 00215 ADABAS COMING UP
ADAN19 00215 RUNNING WITH ASYNCHRONOUS BUFFERFLUSH
ADAN8Y 00215 FILE-LEVEL CACHING INITIALIZED
ADAN80 00215 ADABAS DYNAMIC CACHING ENVIRONMENT ESTABLISHED.
ADAN01 00215 A D A B A S V6.2.2 IS ACTIVE
ADAN01 00215 MODE MULTI I S O L A T E D
ADAN01 00215 RUNNING WITHOUT RECOVERY-LOG
ADA800 00215 User exit 8 active.
ADA801 00215 ADAP1PLC submitted.
48
Recovery - Part 2 - With Nucleus
ADARES REGENERATE
<snip>
//REGEN   EXEC ADARES
//DDSIIN  DD DISP=SHR,BUFNO=30,
//           DSN=ADABAS.PROD.DB1.PLOG.COPY.G7170V00
//        DD DISP=SHR,BUFNO=30,
//           DSN=ADABAS.PROD.DB1.PLOG.COPY.G7171V00
//        DD DISP=SHR,BUFNO=30,
//           DSN=ADABAS.PROD.DB1.PLOG.COPY.G7172V00
//DDKARTE DD *
ADARES REGENERATE PLOGDBID=215,PLOGNUM=64
ADARES FROMCP=SYN2,FROMBLK=4698
ADARES TOCP=EOD,TOBLK=00000   (not needed)
<snip>
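Typing control statements by hand during a recovery invites typos, so the statements can be generated from the collected values. The Python sketch below is an illustration added here, not the presenter's tooling. It assumes the `KEYWORD=value` spelling of the ADARES parameters shown on this slide and a 72-column limit for control statements (an assumption worth checking against the utilities manual); `regenerate_cards` is a hypothetical helper.

```python
MAX_COLUMNS = 72  # assumed usable width of a control statement


def regenerate_cards(plog_dbid, plognum, from_cp, from_blk):
    """Build the two ADARES REGENERATE statements used on the slide."""
    cards = [
        f"ADARES REGENERATE PLOGDBID={plog_dbid},PLOGNUM={plognum}",
        f"ADARES FROMCP={from_cp},FROMBLK={from_blk}",
    ]
    for card in cards:
        if len(card) > MAX_COLUMNS:
            raise ValueError(f"statement too long: {card!r}")
    return cards


if __name__ == "__main__":
    # Values collected earlier: DBID 215, PLOG session 64, SYN2 at block 4698.
    for card in regenerate_cards(215, 64, "SYN2", 4698):
        print(card)
```

The TOCP/TOBLK statement is left out because the slide marks it as not needed when regenerating through the end of the logs.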
49
Recovery - Part 3 - With Nucleus
  • Lock files to re-run utilities (see regenerate
    report)
  • ADADBS OPERCOM LOCKU=fnr, or SYSAOS A / I / L / F,
    or modify command /F jobname,LOCKU=fnr
  • Unlock utility-only status for users
  • ADADBS OPERCOM UTIONLY=NO, or SYSAOS A / I / L /
    U, or modify command /F jobname,UTIONLY=NO

50
Recovery - Part 3 - With Nucleus
  • Re-run the utilities - if necessary
  • ADALOD LOAD / UPDATE / DDISN
  • ADAINV INVERT FILE=xxx,FIELD=xx
  • Unlock files
  • ADADBS OPERCOM UNLOCKF=fnr, or SYSAOS A / I / L /
    F / N, or modify command /F jobname,UNLOCKF=fnr

51
Delta Save Facility (DSF)
52
Delta Save Facility
53
Delta Save QDUMP (CCA - now TSI)
http://www.treehouse.com/qdump.shtml
54
Disk Mirroring
ASSO
  • Benefits
  • Asynchronous disk mirroring can provide better
    physical protection by supporting extended
    physical distances.
  • No loss of committed transactions in synchronous
    storage (mirroring/RAID) on a CPU failure

DATA
ASSO
DATA
55
Disk Mirroring
ASSO
  • Limitations
  • No protection from data corruption introduced by
    the hardware / software
  • Secondary site is not guaranteed to be
    transactionally consistent, because data is moved
    at the disk/track/sector or bit level (in the
    case of asynchronous mirroring).
  • Client applications must be restarted after a
    failure and need to be aware of the failure

DATA
ASSO
DATA
56
Disk Mirroring
ASSO
  • Limitations
  • Synchronous mirroring and RAID devices can add
    overhead to application performance.
  • Redundant/specialized high availability
    hardware/software can be expensive and restricted
    to use for backup purposes only.
  • Secondary copy of data is not available for use:
    low hardware utilization.
  • Need to replicate everything on disk, no
    selectivity of data replication

DATA
ASSO
DATA
57
Example For Disk Mirroring
(Diagram: Main Platform (S/390, UNIX, EMC 5700) and Back Up / Hot Site (S/390, UNIX, EMC 5700), 12-15 miles apart, connected by an OC-3 link; SRDF remote mirrored, synchronized)
58
Dedicated line broadband speeds and prices
  • T-1 - 1.544 megabits per second (24 DS0 lines);
    avg. cost $400-$650/mo.
  • T-3 - 43.232 megabits per second (28 T1s); avg.
    cost $6,000-$16,000/mo.
  • OC-3 - 155 megabits per second (100 T1s); avg.
    cost $20,000-$45,000/mo.
  • OC-12 - 622 megabits per second (4 OC3s); no price
  • OC-48 - 2.5 gigabits per second (4 OC12s); no
    price
  • OC-192 - 9.6 gigabits per second (4 OC48s); no
    price
  • Source: http://www.infobahn.com/research-information.htm
  • Prices updated 16 March 2004
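To put these line speeds in perspective, the following Python sketch (an added illustration, not from the slides) estimates how long a bulk copy would take over each link. It uses the 61.4 GB total quoted for the five databases in the D/R test results later in this deck, and it assumes 1 GB = 10^9 bytes, full use of the nominal rate, and no protocol overhead, so real transfers would be slower.

```python
# Nominal line rates in megabits per second, as listed on the slide.
LINKS_MBPS = {
    "T-1": 1.544,
    "T-3": 43.232,
    "OC-3": 155.0,
    "OC-12": 622.0,
}


def transfer_hours(gigabytes, mbps):
    """Idealized transfer time in hours (no overhead, decimal units)."""
    bits = gigabytes * 1e9 * 8
    seconds = bits / (mbps * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    database_gb = 61.4  # size quoted in the D/R test results
    for name, rate in LINKS_MBPS.items():
        print(f"{name:6s} {transfer_hours(database_gb, rate):8.2f} hours")
```

Under these assumptions a T-1 would need several days for a full copy, while an OC-3 would need under an hour, which helps explain why the mirroring example uses an OC-3 link.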

59
Peer-to-Peer Remote Copy Extended Distance (PPRC-XD)
  • PPRC: 60 miles; PPRC-XD: continent
  • FlashCopy
  • ESS Shark - IBM ESS DASD; HDS also supports PPRC
  • Also see TimeFinder from EMC
60
External Back-up Systems
  • Fast Copy of Data
  • Snapshot
  • No data movement
  • A virtual copy by copying pointers
  • Copy Process
  • Physical copy made asynchronously from the logical copy
  • No impact on applications using the original data
  • Specific Hardware Required
  • Software works only with the hardware
  • Work on Volume Level
  • Some snapshot only tools work also on dataset
    level

61
Snapshot Physical Copy
  • IBM
  • Hardware: Enterprise Storage Server
  • Software: FlashCopy
  • http://www.share.org/proceedings/sh98/data/S3087.PDF
  • EMC2
  • Hardware: Symmetrix Remote Data Facility
  • Software: EMC TimeFinder
  • http://www.emc.com/interactive_center/media/timefinder/tf_noRC.html

62
How It Works
(Diagram: within a pre-defined time window the source data is suspended to read only and update requests are queued while the snapshot is taken (snap); after resume the source is read / update again and the physical backup is made from the snapshot. Source: SAG)
63
Replication
  • Benefits
  • Warm standby systems can be configured over a
    Wide Area Network, providing protection from site
    failures.
  • Ability to more quickly swap to the standby
    system in the event of failure, as backup
    database is already on-line.
  • Data corruption is typically not replicated as
    transactions are logically reproduced rather than
    I/O blocks mirrored.

65
Replication
ASSO
  • Benefits
  • Automatic switch over for clients using a
    switching mechanism, no client restart needed.
  • Originating applications are minimally impacted
    as replication takes place asynchronously after
    commit of the originating transaction.
  • The warm standby database is available for
    read-only operations, allowing better utilization
    of backup systems.

DATA
WORK
WORK
DATA
ASSO
66
Replication
ASSO
DATA
  • Benefits
  • Ability to resynchronize and easily switch back
    to primary system when it becomes available
    without loss of data.

WORK
WORK
DATA
ASSO
67
Replication
ASSO
DATA
  • Limitations
  • Warm standby system will be out-of-date by
    transactions committed at the active database
    that have not been applied to the standby.
  • Protection is limited to components supporting
    Warm Standby (e.g. DBMS data sources may be
    protected but file systems may not be supported).

WORK
WORK
DATA
ASSO
68
Entire Transaction Propagator
  • The Entire Transaction Propagator allows for
    asynchronous data replication.
  • Replicated data can be updated and synchronized
    with master data at user specified intervals.

69
  • OS/390 Recovery Procedures
  • Prepared by the Mainframe Recovery Team
  • Recovering
  • The OS/390 platform
  • The ABARS aggregates
  • The ADABAS databases

70
(No Transcript)
71
  • OS/390 D/R Times (SUNGARD)
  • About 2400 tapes
  • Shipping time from storage to the mainframe ?
  • 4 hours ahead for tape staging
  • OS/390 and ABARS aggregates
  • 5 hours planned, 7 hours with problems
  • ADABAS databases
  • Approx. 2-3 hours for tape restore and regenerate
  • Next test Nov 1: approx. 45 minutes from disk pool

72
Experiences From D/R Tests
  • Problems IPLing on an unfamiliar CPU (6 hours
    duration)
  • Initial setup (restore SYS.. Libraries)
  • Pre-IPL procedures (restore Adabas, work, spool
    volumes, etc)
  • Post-IPL procedures (DFHSM in disaster mode,
    etc.)
  • Application restores
  • Tape drive offline problems, Import MVSCAT typo
    errors, etc.
  • Recovered wrong volumes, generation errors
  • Initialize work volumes - conversion to SMS
    (DFSMShsm)
  • TMC recovery problems caused BRM recovery
    problems, too

73
Experiences From D/R Tests
  • Sent wrong cartridges with system dates to
    storage
  • Fewer tape channels at our offsite (2 instead
    of 4): double restore time

74
Experiences From D/R Tests
  • RESTONL abended with SB00, no PLOG restored,
    Recovery Aid flag was on at the saved database.
  • REGENERATE deleted file and pointed out to repeat
    the ADALOD job but the input dataset was not
    saved
  • We did a full volume restore (DFDSS), restored
    the database and forgot to format the dual
    protection logs.
  • Missed protection logs
  • BRM restored wrong aggregates
  • Missing full-volume restores - (Database 2)
  • Missing volumes in Work Storage Pool - (Database
    3)

75
Experiences From D/R Tests
  • BRM: Back-up and Recovery Manager
  • ABARS: Aggregate Back-up and Recovery Support
    (ABARS, not Air conditioning and refrigeration
    industry services <smile>)
  • Recovered (-1) Aggregates instead of (0) (all
    Databases)
  • Recovered only SOME files on Aggregate (0) -
    (Database 1)
  • BRM/ABARS was not properly recovered (wrong
    version of BRM database)
  • Once those problems were resolved (several hours
    later), the ADABAS recovery ran smoothly.
  • 5 Databases (61.4GB) restored and regenerated in
    3.5 hours (tape/cart)

76
How Far is Far Enough? (http://www.drj.com/articles/spr03/1602-02.html)
  • Alternate Facility
  • Offsite Storage Facility
  • Answer: 105 miles
  • according to the survey

77
Lessons Learned (http://www.drj.com/articles/spr02/1502-07.html)
  • Distance is key: streets, bridges, tunnels,
    airports are closed
  • Tape recovery is not effective
  • All applications are critical
  • Inconsistent back-up is no back-up at all
  • People-dependent processes do not suffice
  • Two sites are not enough
  • People are irreplaceable; so is information

78
Lessons Learned (http://www.drj.com/articles/spr02/1502-07.html)
  • Companies that relied on tape or on third-party
    provider found in many cases they had difficulty
    meeting their recovery time objectives

All disasters are possible
79
Helpful Links
  • Software AG - ADABAS Recovery:
    http://www.softwareag.com/adabas/news/vers_7.htm
    http://servline24.softwareag.com/SecuredServices/ <Knowledge Center - ADABAS>
  • ADABAS Restart and Recovery (Operations Manual):
    http://servline24.softwareag.com/SecuredServices/ <Knowledge Center - Product Documentation>
  • University of Arkansas - D/R Plan:
    http://www.uark.edu/staff/drp/
  • Disaster Recovery Journal: http://www.drj.com

80
Helpful Links
  • FlashCopy:
    http://www.share.org/proceedings/sh97/data/S9111.PDF
    http://www.storage.ibm.com/hardsoft/products/ess/pubs/f2ahs05.pdf
  • Shark (ESS):
    http://www.almaden.ibm.com/cs/shark/
    http://www.storage.ibm.com/hardsoft/disk/index.html
  • State of the Art Storage:
    http://www.networkmagazine.com/article/NMG20010104S0002/2
  • EMC TimeFinder:
    http://www.emc.com/products/software/timefinder.jsp
  • Entire Transaction Propagator (SAG):
    http://servline24.softwareag.com/SecuredServices/document/html/etp151/pdf/man.pdf

81
Thank you!
Questions?