Title: Oracle9i Data Guard
1Oracle9i Data Guard
- Darl Kuhn
- Sun Microsystems
- RMOUG Training Days 2003
2Oracle9i Data Guard Agenda
- Business Requirements
- What is Data Guard?
- Data Guard Architecture
- Implementing Data Guard
- Failover when Disaster Strikes
- 8i Data Guard New Features
- 9i Data Guard New Features
3Business Requirements
- Business Goals
- Keep the data 7x24 highly available
- High Availability
- Minimize unscheduled downtime
- Minimize scheduled downtime
- Disaster Recovery protection
- Minimize Cost
4Business Requirements
- Minimize Costs
- Hardware
- Personnel (DBA, SA, etc.)
- Special skills
- Training
- Licensing
- Implementation
- Maintenance
5Business Requirements
- Minimize Complexity
- Number of moving parts
- Ease of implementation
- Ease of maintenance
- More complex usually means more cost
- Higher availability usually means more cost
- Choose a solution that
- Meets business requirements
- Minimizes cost and complexity
6Business Solutions Available
- Backup the database, restore from tape
- Operating System failover
- Remote Mirroring
- Quest Software SharePlex
- Oracle Advanced Replication (OAR)
- Oracle Real Application Clusters (RAC)
- Oracle Data Guard (Standby)
7We Chose Data Guard
- Needed 7x24 Disaster Recovery protection
- Simple to implement
- Requires DBA with BR skills
- Didnt need special SA skills or consultants
- Low maintenance (do more w/less DBAs)
- We dont have to baby-sit it
- No extra licensing (built into Oracle9i)
- Not an add-on, part of Enterprise Edition
- Main cost was extra hardware
8Data Guard History
- Previously known as the Standby Option
- First developed by Oracle Consulting in the late
80s early 90s - Became part of Oracle database in 7.3
- Renamed to Oracle Data Guard in 8.1.7
- Greatly enhanced in Oracle9i
9What is Data Guard?
- Built upon a simple concept
- Take a backup of your database
- Move backup files to standby box
- Mount standby database in recovery mode
- Apply archived redo to standby database as it
becomes available from primary database
10Copy and Apply Archive Redo
- Oracle 7.3
- Copy and apply of archive redo was manual
- Oracle8i
- Automated copy and apply of archived redo
- Uses ARCn and Net8 for copy
- Configured via init.ora parameters
- Oracle9i
- Advanced archive gap detection
- Uses Fetch Archive Log (FAL) process
11Basic Physical Standby Architecture
Network
Primary Database Site (production hardware)
Standby Database Site (standby box)
(archived transactions in an OS file)
Damn Users
Local Archive Redo
Standby Database
Managed Recovery Process
Copied Archive Redo
Primary Database
ARCn
(a background process)
Network
12Basic Physical Standby Architecture
Primary Database Production Site
Standby Database Server
Copied Archive Redo
Copied Archive Redo
Users
Remote File Server (RFS)
Fetch ArchiveLog (FAL)
Oracle Net
Primary Database
LGWR
Managed Recovery Process (MRP)
ARCn
On-line Redo
Online Redo
Standby Database
Local Archive Redo
Local Archive Redo
13Implementing Data Guard in 10 Steps
- Assumptions
- Two different servers
- Identical mount points on both boxes
- Both primary and standby will have same name
- Using ARCn process in this example (not LGWR)
- http//otn.oracle.com/oramag/oracle/02-nov/o62avai
l.html - 1. Ensure primary database is in archive log mode
- 2. Take backup of primary database datafiles
- SQLgt select from vdatafile
- Cold (easiest)
- Hot
- RMAN
- Do not backup controlfiles
- Do not backup on-line redo logs
14Implementation (continued)
- 3. Copy backup datafiles to standby server
- If you use an online backup, ensure that you copy
over any archive redo generated during the backup - 4. Create a standby controlfile
- SQLgt alter database create standby controlfile as
stbycf.ctl - This controlfile will be used by the standby
database - 5. Copy the standby controlfile to the standby
box
15Implementation (continued)
- 6. Configure primary init.ora or spfile
- location of primary archive redo
- log_archive_dest_1LOCATION/ora01/oradata/BRDSTN
/ - location of archive redo on standby
- log_archive_dest_2SERVICEstandby1 optional
- log_archive_dest_state_2enable
- 7. Copy primary database init.ora file to standby
box and make modifications for standby database - Ensure standby database is pointing at standby
controlfile - control_files/ora01/oradata/BRDSTN/stbycf.ctl
- Location of where archive redo logs are being
written in standby environment - standby_archive_dest/ora01/oradata/BRDSTN
- Enable archive gap management
16Implementation (continued)
- 8. Configure Oracle Net
- Need Oracle Net connectivity from Primary to
Standby - and
- Need Oracle Net connectivity from Standby to
Primary - Primary database needs
- Listener
- TNS entry for standby1 service
- Standby database needs
- Listener
- TNS entry for primary1 service
17TNS File
- primary1 (DESCRIPTION(ADDRESS(PROTOCOLtcp)
(PORT1521) (HOSTprimary_host))
(CONNECT_DATA(SERVICE_NAMEBRDSTN))) - standby1 (DESCRIPTION(ADDRESS(PROTOCOLtcp)
(PORT1521) (HOSTstandby_host))
(CONNECT_DATA(SERVICE_NAMEBRDSTN)))
18Implementation (continued)
- 9. Startup and mount standby database
- SQLgt startup nomount
- SQLgt alter database mount standby database
- 10. Enable managed-recovery mode on standby
- SQLgt alter database recover managed standby
database disconnect
19Troubleshooting Implementation
- Continuously view bottom of primary and standby
alert.log files - tail f alert_BRDSTN.log
- Most of problems are with getting Oracle Net
setup correctly (fat fingering a TNS entry)
20When Disaster Strikes
- Ensure you apply all applicable redo to standby
- Make the standby the primary
- On standby
- 8i example
- SQLgt shutdown immediate
- SQLgt alter database mount standby database
- SQLgt alter database activate standby database
- 9i many more options
- SQLgt alter database recover managed standby
database finish
21When Disaster Strikes
- Applications must change TNS to point to new
database - Or change Oracle Names entry
- Or use Oracle Net load balance feature
- brdstn
- (DESCRIPTION
- (LOAD_BALANCEon)
- (ADDRESS(PROTOCOLtcp)(HOSTl3srv1)(PORT152
1)) - (ADDRESS(PROTOCOLtcp)(HOSTl3srv2)(PORT152
1)) - (CONNECT_DATA(SERVICE_NAMEbrdstn))
- )
22When Disaster Strikes
- Java Thin Driver Example
- String connectInfo
- "jdbcoraclethin_at_"
- "(DESCRIPTION "
- "(LOAD_BALANCEon)"
- "(ADDRESS(PROTOCOLtcp)(HOSTl3srv1)(PORT1521))"
- "(ADDRESS(PROTOCOLtcp)(HOSTl3srv2)(PORT1521))"
- "(CONNECT_DATA(SERVICE_NAMEbrdstn))"
- ")"
23Data Guard 8i New Features
- Auto copy of redo to standby
- Net8
- Initialization parameters
- Auto apply of redo on standby
- Enabled in standby database via
- SQLgt alter database recover managed standby
database
24Data Guard 8i New Features
- Read Only Mode
- Standby was mainly a disaster recovery and high
availability solution - Some limited scalability features
- Can now open database in read-only mode
- SQLgt alter database open read only
- And then back to recovery mode
- SQLgt alter database close
- SQLgt alter database recover managed standby
database
25Data Guard 8i New Features
- Read Only mode allows for SELECT statements to
run against standby database - No DDL allowed in read-only mode
- Applying of archive redo and read-only mode
mutually exclusive - Can either be in read-only mode or recovery mode,
but not both
26Data Guard 9i New Features
- Archive gap management
- Advanced data protection modes
- Automated propagation of datafile operations
- User error protection
- Database switchover/switchback
- Cascading standby
- Archive_lag_target parameter
- Logical Standby (version 9.2 only)
- Data Guard Broker
27Archive Gap Management
- What if network is down?
- What if standby is down?
- Primary continues to generate archive redo
- Those unapplied redo logs make up an archive gap
- When failure is resolved how is the standby
caught up?
28Archive Gap Management
- In version 8, DBA has to figure out which archive
redo logs make up the archive gap - Copy over archive redo logs
- Issue recover command on standby
- SQLgt alter database recover standby database
- Then automatic ship and apply can restart
29Archive Gap Management
- Gap detection crucial to high availability
- 9i has new methods for gap resolution
- FAL_CLIENT and FAL_SERVER initialization
parameters initiate Fetch Archive Log (FAL)
processes - If network or standby is down
- FAL process waits until communication is
re-established - Detects gap
- Transmits and applies archive redo log
30Data Protection Modes
- Maximum protection
- Guaranteed no-data-loss
- Uses LGWR to transfer redo
- SQLgt alter system set log_archive_dest_2SE
RVICEstandby1 LGWR SYNC AFFIRM - Standby configured with its own redo logs
- SQLgt alter database add standby logfile
(/ora01/oradata/BRDSTN/sb_redo_01.log) size
10m - Maximum availability
- Uses LGWR
- Allows for temporary unavailability of standby
- Maximum performance
- Default mode
- Uses either ARCn or LGWR for transferring redo
31Maximum Protection Architecture
Standby Database Server
Primary Database Production Site
Users
Synch/Asynch writes
Remote File Server (RFS)
LGWR
Oracle Net
Primary Database
Standby Online Redo
ARCn
Standby Database
Managed Recovery Process
Archive Redo
Archive Redo
Local Archive Redo
Local Archive Redo
ARCn
32Propagation of Datafile Operations
- Physical standbys have always supported all data
types - Supported all insert/update/delete statements
- Supported most DDL
- Create table/view/index/sequence
- Create user
- Create package/procedure/function/trigger
- Create role/synonym
- Grants
- Etc.
33Propagation of Datafile Operations
- One weakness of 8i was that add/drop
tablespace/datafile commands were not
automatically propagated - DBA had to intervene
- Now automated in 9i
- In standby initialization file
- standby_file_management auto
34Propagation of Datafile Operations
- Example
- SQLgt drop tablespace SHOLAY including contents
and datafiles - On standby the tablespace SHOLAY will be
automatically dropped and all datafiles will
automatically be deleted from disk - Note If you want to rename a datafile, and have
those changes applied to standby, DBA must
intervene
35User Error Protection
- In 8i, DBAs often implemented a manual delay in
application of redo - Create a shell script that
- Copy archive redo
- Wait 60 minutes before applying
- Idea being that if user accidentally
dropped/deleted/truncated data, the DBA could
intervene before it was applied to standby
36User Error Protection
- 9i you get more robust methods to manage user
errors - On standby
- SQLgt recover managed standby database delay 60
disconnect - Archive redo copied to standby when generated but
not applied until 60 minutes later - Disable the delay via
- SQLgt recover managed standby database nodelay
37Database Switchover
- Database can be in one of two roles
- Primary or standby
- SQLgt select database_role from vdatabase
- A switchover allows you to reverse roles of
primary and standby
38Database Switchover
- Database switchover is not a database failover
- A failover is permanently making a standby
database the primary database - Database switchovers would be ideal for rolling
OS upgrades - Switchovers cannot be used for database upgrades
39Cascading Redo Logs
- Version 9.2 only
- Standby can receive redo from another standby
- Reduce load on primary
- Can receive redo from a physical or logical
standby - Setup standby redo log files
- Ensure archiving enabled on physical standby
40Archive_lag_target
- Use archive_lag_target init parameter to ensure
online redo is switching often enough - Specifies in seconds how often you want an online
redo log switch to occur - Value in seconds
- Valid values 0, or 60 through 7200
- A value of 0 (default) disables this feature
- 1800 (30 minutes) seems reasonable
41Logical Standby
- Prior to 9.2 you could only have a physical
standby - Primary and physical identical down to the block
level - With 9.2 you can now create a logical standby
with the Data Guard SQL Apply tool - Logical standby is available for SELECT access
while applying transactions from the primary via
SQL statements generated using LogMiner
technology - Ideal for 7x24 reporting databases
- Logical updates one or more log files behind the
primary
42Logical Standby Architecture
Primary Database Production Site
Logical Database Standby Site
Network
User Enters Data
User runs reports
Logical Standby Process (LSP) Transforms
Redo into SQL
Production Primary Database
Logical Database
SQL Apply
LGWR
43Logical Standby
- In event of disaster, logical standby can be
turned into a primary database - Can create indexes and MVs to tune it
- Doesnt support longs and raws
- SQLgt select from dba_logstdby_unsupported
- Many steps (20) involved with setting up logical
standby - See MetaLink note 186150.1
44Data Guard Broker
- Two components
- Command line tool
- Data Guard Manager OEM GUI tool
- With Data Guard Manager, point and click your way
through setting up standby environment - Create physical and logical standby databases
- Perform switchovers and failovers
- Monitor log transport and apply
- Configure event driven reporting/e-mailing/paging
45Data Guard Summary
- Relies on tried-and-true standby technology
- Ideal for disaster recovery and high availability
- Fairly low cost
- DBA needs to have backup and recovery skills to
create and maintain - Many new Data Guard features in Oracle9i
- Now has some scalability features