Title: Turkcell Backup
1 Turkcell Backup and Recovery Strategy
Hüsnü Sensoy, Turkcell Telecommunication Services
VLDB Expert, Oracle ACE Director, Member of Global DWH Leaders, Oracle CAB, Oracle DBA of 2009
2 Agenda
- Backup and Recovery Strategies for Oracle Databases
- Motivation behind those strategies
- Revisiting Incrementally Updated Backup
- Revisiting FRA
- How to bring your database back without a restore
- A sick backup will not work
- Centralized scheduling and monitoring
- 11g Release 2 Backup and Recovery New Features with real Telco data warehouse data
- Brand new compression algorithms
- Summary
3 Turkcell Overview
- Leading GSM operator of Turkey, established in February 1994.
- Third-largest GSM operator in Europe by number of subscribers (36 million).
- First and only Turkish company ever to be listed on the New York Stock Exchange.
- Member of the Board of Directors of GSMA since 2003.
- Ranked 25th on the INFOTECH 100 list.
4 Backup and Recovery Strategies for Oracle Databases
5 Design Considerations
- Define your backup and recovery policies upfront.
- Keep a well-documented strategy that can be used to bring everything back.
- KISS: even a junior DBA should be able to bring your database back.
- Standardize, standardize, standardize.
- Be prepared to justify the cost in terms of the business impact of downtime.
6 Design Considerations
- Proactively validate database and backup integrity:
  - Physical errors
  - Logical inconsistencies
  - Transmission errors
- Do you perform regular full recoveries to a separate host and storage?
7 Design Considerations
- Centralized backup reporting:
  - Is there a single point of access for all my databases' backup logs?
  - What is the average backup duration for database X?
  - How do brand new tape drives affect backup performance?
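As a rough illustration of the "average backup duration for database X" question, the sketch below aggregates backup records that have already been gathered in one place. The record layout, the timestamps, and the function name `average_duration_minutes` are assumptions made for this example; they do not come from the presentation or from any Oracle tool.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, hand-written records: (database, start time, end time).
# A real deployment would load these from a central backup log repository.
BACKUP_LOG = [
    ("APPDB", "2009-02-05 01:00", "2009-02-05 01:20"),
    ("APPDB", "2009-02-06 01:00", "2009-02-06 01:30"),
    ("BSSOSS", "2009-02-05 02:00", "2009-02-05 02:10"),
]


def average_duration_minutes(records):
    """Return {database: mean backup duration in minutes}."""
    fmt = "%Y-%m-%d %H:%M"
    durations = defaultdict(list)
    for database, start, end in records:
        elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
        durations[database].append(elapsed.total_seconds() / 60)
    return {db: sum(vals) / len(vals) for db, vals in durations.items()}


if __name__ == "__main__":
    for db, minutes in sorted(average_duration_minutes(BACKUP_LOG).items()):
        print(f"{db}: {minutes:.1f} min")
```

Once every database reports into the same record set, the tape drive question becomes the same kind of query: compare the averages before and after the hardware change.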
8 What Type of Architecture?
- What's in there?
  - 7 RAC databases
  - More than 20 services
[Diagram labels: APPDB, VASCMT, VASSE, VASNIF, BSSOSS, VASRES, BSSARCH; DATA, ARCHIVE, FRA]
9 How Do We Back Up?
- Incrementally Updated Backup Strategy:
  - Initial image copy backup to FRA.
  - Fast incremental backups thereafter.
  - The image copy is rolled forward with the incremental backup on a regular basis to create a full on-disk backup.
- Full database backup times depend only on the number of blocks changed since the last incremental backup.
- The longest backup time is only 30 minutes, with ZLIB backup compression and logical block checking turned on.
run {
  backup as compressed backupset
    check logical incremental level 1
    for recover of copy with tag DAILY_COPY database
    filesperset 1;
  recover copy of database with tag DAILY_COPY;
}
This is the shortest, cleanest, and most elegant
backup script that I have seen in all my years
at Turkcell.
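The roll-forward idea can be sketched with a toy model. This is not how RMAN is implemented: the dictionary-based "database", the change numbers, and the function names `take_incremental` and `roll_forward` are assumptions chosen only to show why the work depends on the number of changed blocks and not on the database size.

```python
# Toy model: a "database" maps block number -> (change number, contents).


def take_incremental(database, since_scn):
    """Collect only the blocks changed after the image copy's checkpoint."""
    return {blk: val for blk, val in database.items() if val[0] > since_scn}


def roll_forward(image_copy, incremental):
    """Apply the incremental to the image copy; return the new checkpoint."""
    image_copy.update(incremental)
    return max(scn for scn, _ in image_copy.values())


# Initial state: 1000 blocks, all at change number 100.
database = {blk: (100, f"v0-{blk}") for blk in range(1000)}
image_copy = dict(database)  # the initial full image copy
copy_scn = 100

# Three blocks change after the image copy was taken.
for blk, scn in [(7, 101), (42, 102), (999, 103)]:
    database[blk] = (scn, f"v1-{blk}")

# The incremental holds 3 blocks, not 1000.
incremental = take_incremental(database, copy_scn)
copy_scn = roll_forward(image_copy, incremental)

print(len(incremental), image_copy == database, copy_scn)
```

After the roll forward the image copy matches the database again, so it can serve as a full on-disk backup without another full copy being taken.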
10 Setting Up Flash Recovery Area (Oracle Database 11g Release 1)
- Self-managed, organized logical storage area.
- Set up as part of the Universal Installer wizard.
- Redo log copies, control file copies, archived logs, and flashback logs are automatically stored there.
- RMAN automatically uses the FRA for all disk backups.
- Or, just enable it by setting two init.ora parameters:
  - db_recovery_file_dest_size
  - db_recovery_file_dest
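For illustration, a small helper can generate the two statements in a safe order. The helper name `fra_setup_statements`, the `+FRA` destination, and the `500G` limit are assumptions for this sketch, and `SCOPE=BOTH` assumes the instance runs from an spfile. The ordering reflects Oracle's documented requirement that `db_recovery_file_dest_size` be set before `db_recovery_file_dest`.

```python
def fra_setup_statements(dest, size):
    """Build the two ALTER SYSTEM statements that enable the FRA.

    The size limit comes first because Oracle requires
    db_recovery_file_dest_size to be set before db_recovery_file_dest.
    """
    return [
        f"ALTER SYSTEM SET db_recovery_file_dest_size = {size} SCOPE=BOTH;",
        f"ALTER SYSTEM SET db_recovery_file_dest = '{dest}' SCOPE=BOTH;",
    ]


if __name__ == "__main__":
    # '+FRA' and '500G' are placeholder values, not taken from the slides.
    for statement in fra_setup_statements("+FRA", "500G"):
        print(statement)
```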
11 Flash Recovery Area
- ASM is the best infrastructure to use as an FRA destination:
  - Raw device performance.
  - No other solution (except the Sun ZFS file system, with its online FS check capability) will practically let you implement storage pools as large as ASM does.
  - Ease of management.
- ASM allows you to provision the same diskgroup to multiple FRA destinations.
[Diagram: DB1 FRA, DB2 FRA, DB3 FRA, and DB4 FRA on one shared ASM Diskgroup (FRA)]
12 Restore-Free Recovery
13 What Are the Commands?
Step 1:
  SQL> startup pfile=/home/oracle/init.ora nomount
  ORACLE instance started.
Step 2:
  RMAN> switch database to copy;
  using target database control file instead of recovery catalog
  datafile 1 switched to datafile copy "+FRA/disaster/datafile/system.503.678209167"
  datafile 9 switched to datafile copy "+FRA/disaster/datafile/undotbs5.510.678209175"
Step 3:
  RMAN> recover database;
  Starting recover at 07-FEB-09
  using channel ORA_DISK_1
  starting media recovery
  media recovery complete, elapsed time: 00:00:03
  Finished recover at 07-FEB-09
Step 4:
  RMAN> alter database open;
  database opened
14 Backup Validation
- Backups on disk or tape might be damaged due to:
  - Physical problems on media (fabric problems, dust, cosmic rays, etc.)
  - Media library errors (error in checksum computation)
- How can you increase the probability that your backups are healthy?
15 Possible Solutions
16 RMAN Backup Validation
RMAN> backup check logical validate
        datafilecopy all
        filesperset 1;
- This will report:
  - Any inconsistent data, index, or other types of blocks.
  - Number of total and empty blocks examined.
  - Highest change number of each datafile copy.
17 Centralized Scheduling and Monitoring
- Develop standard backup job scheduling and monitoring routines.
- This enables you to:
  - See all backup schedules at once.
  - Check details of previously completed backups (duration, logs, etc.).
  - Easily modify backup scripts and bulk deploy them.
18 Grid Control Backup Jobs
Manage the backups of all databases in the cluster from just one screen.
21 11g Release 2 RMAN Compression
22 11gR2 RMAN Compression
23 Test Setup
Data: Marketing data from the Turkcell data warehouse; 2.2 billion records (140G); no segment compression; PCTFREE 1; 16K block size tablespace
Number of Channels: 8 RMAN channels
Compression Types: NONE, BASIC, LOW, MEDIUM, HIGH
Collected Metrics: compression ratio, duration, I/O throughput, CPU utilization
24 Backup Compression Summary
- In Oracle Database 11g Release 2, RMAN extends its compression capabilities to fit any combination of CPU power and I/O throughput.
- The MEDIUM compression level achieves the same compression ratio as BASIC while backing up 3X faster with 50% less CPU utilization.
- Even if you don't need to reduce backup sizes, the LOW or MEDIUM compression level might be faster than an uncompressed backup, depending on your I/O throughput, because it significantly reduces the amount of data per second written by RMAN.
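Why a compressed backup can beat an uncompressed one is easy to see in a back-of-the-envelope model. The sketch below treats a backup as limited by its slower stage, either compressing on the CPUs or writing the output, and ignores read time. Apart from the 140 GB data size, every number here (write throughput, compression ratio, compression rate) is a made-up assumption, not a measurement from the test.

```python
def backup_seconds(size_gb, io_gb_per_s, ratio=1.0, compress_gb_per_s=None):
    """Estimate backup duration as the slower of the CPU and write stages.

    ratio: input size divided by output size (1.0 means no compression).
    compress_gb_per_s: rate at which the CPUs compress input data;
        None means no compression work is done.
    """
    write_time = (size_gb / ratio) / io_gb_per_s
    cpu_time = 0.0 if compress_gb_per_s is None else size_gb / compress_gb_per_s
    return max(write_time, cpu_time)


SIZE_GB = 140  # data set size from the test setup; all other values are hypothetical

# Slow backup destination: writing dominates, so compression shortens the backup.
slow_plain = backup_seconds(SIZE_GB, io_gb_per_s=0.2)
slow_compressed = backup_seconds(SIZE_GB, io_gb_per_s=0.2, ratio=4.0,
                                 compress_gb_per_s=0.5)

# Fast backup destination: the CPUs dominate, so compression lengthens it.
fast_plain = backup_seconds(SIZE_GB, io_gb_per_s=2.0)
fast_compressed = backup_seconds(SIZE_GB, io_gb_per_s=2.0, ratio=4.0,
                                 compress_gb_per_s=0.5)

print(slow_compressed < slow_plain, fast_compressed > fast_plain)
```

The crossover point depends on the actual hardware, which is why the choice between LOW, MEDIUM, and no compression is best settled by measuring on your own system.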
25 Best Practices Summary
- A well-defined, documented, standard, manageable, and fast backup and recovery strategy is a MUST if you manage tens (or even hundreds) of databases.
- Whatever solution you pick, the indicator of a good backup and recovery strategy is simple:
  - It shouldn't depend on the size of the database.
- FRA over ASM, together with RMAN, satisfies these requirements at zero cost.