Title: Transaction-Oriented Database Recovery
1. Transaction-Oriented Database Recovery
2. [Figure: system architecture and roles]
Components shown: Application; Query Processor; Indexes; Storage Subsystem; Concurrency Control; Recovery; Operating System; Hardware (Processor(s), Disk(s), Memory)
Roles shown: Application Programmer (e.g., business analyst, data architect); Sophisticated Application Programmer (e.g., SAP admin); DBA, Tuner
3. Outline
- Principles of transaction-oriented database recovery
- Recovery tuning
4. Transaction-Oriented Database Recovery
- Transaction properties
  - A: Atomicity
  - C: Consistency
  - I: Isolation
  - D: Durability
- A database is transaction-consistent (logically consistent) iff it contains the results of successful transactions
5. Failures To Recover From
- Transaction failure
  - Self- or system-abort
  - Recovery time: comparable to the time of a normal transaction
  - Frequency: 10-100 times per minute
- System failure
  - OS or DBMS crash
  - Recovery time: about the time required for all interrupted transactions
  - Frequency: a few times per week
- Media failure
  - Disk crash
  - Recovery time: hours
  - Frequency: a few times per year
6. Recovery Actions
- Transaction UNDO: roll back a specific active transaction
- Global UNDO: roll back all active transactions
- Partial REDO: reinstate some committed transactions
- Global REDO: reinstate all committed transactions

Failure Type    Recovery Action
Transaction     Transaction UNDO
System          Global UNDO, Partial REDO
Media           Global REDO
7. Log for UNDO/REDO
- Logical logging: operators and their arguments
  - Requires atomic actions from the physical layer
  - Not always possible/justifiable
- Physical state logging
  - Before and/or after image
- Physical transition logging
  - Uses XOR, which is commutative and associative
  - Log XOR before image = after image (REDO)
  - Log XOR after image = before image (UNDO)
  - Lower space consumption: one entry per change, and a small number of changes yields long strings of 0s that compress well
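A minimal sketch of physical transition logging, assuming fixed-length page images held as Python `bytes`. The helper name `xor_bytes` and the sample page contents are illustrative and not taken from the slides.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two page images of equal length."""
    if len(a) != len(b):
        raise ValueError("page images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical before and after images of the same page.
before = b"balance=0100;owner=alice"
after = b"balance=0250;owner=alice"

# The transition log entry is the XOR of the two images.
log_entry = xor_bytes(before, after)

# REDO: log XOR before image gives the after image.
redone = xor_bytes(log_entry, before)
# UNDO: log XOR after image gives the before image.
undone = xor_bytes(log_entry, after)

# Only the changed bytes are non-zero, so the entry compresses well.
changed_bytes = len(log_entry) - log_entry.count(0)
```

Because one entry serves both directions, a single record per change is enough, which is where the space saving comes from.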
8. System Framework
Source: T. Haerder, A. Reuter
9. Log Timing
- UNDO entries must reach the log file before the changes are written out: the Write-Ahead Logging (WAL) principle
  - To enable roll-back if necessary
- REDO entries must reach the log file before End-Of-Transaction (EOT) is acknowledged
  - To enable reinstatement after failure
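The two timing rules can be sketched with a toy buffer manager. The class `WalSketch`, its fields, and the record layout are assumptions made for illustration. Each update record carries both the old and the new value, so it serves as UNDO and REDO entry at once.

```python
class WalSketch:
    """Toy buffer manager that enforces the two log-timing rules."""

    def __init__(self):
        self.log_file = []   # records that have reached the stable log
        self.log_tail = []   # records still buffered in memory
        self.buffer = {}     # page id -> current (possibly uncommitted) value
        self.disk = {}       # page id -> value in the database on disk
        self.page_lsn = {}   # page id -> LSN of the latest change to the page
        self.next_lsn = 0

    def _append(self, record):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_tail.append((lsn, record))
        return lsn

    def flush_log(self, up_to_lsn):
        # Move buffered records with LSN <= up_to_lsn to the log file.
        while self.log_tail and self.log_tail[0][0] <= up_to_lsn:
            self.log_file.append(self.log_tail.pop(0))

    def update(self, txn, page, new_value):
        old_value = self.buffer.get(page, self.disk.get(page))
        lsn = self._append(("UPDATE", txn, page, old_value, new_value))
        self.buffer[page] = new_value
        self.page_lsn[page] = lsn

    def write_page(self, page):
        # WAL rule: the UNDO information reaches the log file first.
        self.flush_log(self.page_lsn[page])
        self.disk[page] = self.buffer[page]

    def commit(self, txn):
        # REDO information and the EOT record reach the log file
        # before the commit is acknowledged to the caller.
        lsn = self._append(("EOT", txn))
        self.flush_log(lsn)
        return "committed"
```

In this sketch a commit does not write any data page; it only forces the log, which is enough to reinstate the transaction after a failure.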
10. Dependency with Buffer Management
- UNDO
  - STEAL: modified pages may be written out at any time
  - NO-STEAL (¬STEAL): modified pages are kept in the buffer until after the transaction commits
    - Large buffers required
    - No global UNDO
    - Transaction UNDO within memory
    - No logging required for UNDO
- REDO
  - FORCE: all modified pages are written out during EOT
    - No need to log for partial REDO
    - Need logging for global REDO
  - NO-FORCE (¬FORCE): no propagation during EOT
At least one of global UNDO or partial REDO is always required. Why?
11. Checkpointing to Optimize Recovery
- Problem
  - With LRU buffer replacement, frequently used pages remain in the buffer
  - Partial REDO has to go back very far
- Checkpointing limits the amount of partial REDO
- Checkpoint
  - Write BEGIN-CHECKPOINT to the temporary log
  - Write checkpoint data to the log
  - Write END-CHECKPOINT to the temporary log
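The three checkpoint steps can be sketched as follows, with the log modeled as a Python list of tuples. The contents of the checkpoint data (active transactions and dirty pages) and the helper names are assumptions; the slides only say that checkpoint data is written.

```python
def write_checkpoint(log, active_txns, dirty_pages):
    """Append a bracketed checkpoint to the temporary log."""
    log.append(("BEGIN-CHECKPOINT",))
    # Assumed checkpoint payload: active transactions and dirty pages.
    log.append(("CHECKPOINT-DATA", sorted(active_txns), sorted(dirty_pages)))
    log.append(("END-CHECKPOINT",))


def last_complete_checkpoint(log):
    """Index of the BEGIN record of the latest checkpoint that also has
    its END record, or None if no complete checkpoint exists."""
    end_seen = False
    for i in range(len(log) - 1, -1, -1):
        kind = log[i][0]
        if kind == "END-CHECKPOINT":
            end_seen = True
        elif kind == "BEGIN-CHECKPOINT" and end_seen:
            return i
    return None
```

The BEGIN/END bracket lets recovery ignore a checkpoint that was interrupted by a crash and fall back to the previous complete one.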
12. Crash Recovery with Checkpoint
[Figure: timeline of transactions T1-T5 with markers for Oldest Page In Buffer, Checkpoint, and Crash]

Transaction    Recovery Action
T1             Nothing
T2, T3         REDO
T4, T5         UNDO

Recovery process: Analyze, UNDO, REDO
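A minimal sketch of the analysis step, assuming a log of `("BOT", txn)`, `("EOT", txn)`, and `("CHECKPOINT", active_txns)` tuples and a checkpoint that wrote out all modified pages, so that nothing before it needs REDO. The record format and the function name `analyze` are hypothetical.

```python
def analyze(log):
    """Classify transactions after a crash, scanning forward from the
    most recent checkpoint record in the log."""
    cp_index = max(i for i, rec in enumerate(log) if rec[0] == "CHECKPOINT")
    active = set(log[cp_index][1])   # transactions running at the checkpoint
    committed = set()
    for rec in log[cp_index + 1:]:
        if rec[0] == "BOT":
            active.add(rec[1])
        elif rec[0] == "EOT":
            active.discard(rec[1])
            committed.add(rec[1])
    # Committed after the checkpoint: REDO. Still active at crash: UNDO.
    return {"redo": sorted(committed), "undo": sorted(active)}


# A log shaped like the five-transaction timeline.
crash_log = [
    ("BOT", "T1"), ("BOT", "T2"), ("EOT", "T1"), ("BOT", "T4"),
    ("CHECKPOINT", ["T2", "T4"]),
    ("BOT", "T3"), ("EOT", "T2"), ("BOT", "T5"), ("EOT", "T3"),
]
result = analyze(crash_log)
```

T1 finished before the checkpoint and appears in neither list, which matches the "Nothing" entry for it.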
13. Transaction-Oriented Checkpoint (TOC)
- FORCE implies TOC
- EOT serves as (BEGIN-CHECKPOINT, END-CHECKPOINT)
- Frequently used pages need to be written out each time a transaction commits
- Not suitable for large applications
Source: T. Haerder, A. Reuter
14. Transaction-Consistent Checkpoint (TCC)
Source: T. Haerder, A. Reuter
15. Transaction-Consistent Checkpoint (TCC)
- When checkpoint generation is triggered
  - All new update transactions are put on hold
  - All incomplete update transactions are completed
  - Write out all modified pages
- Both REDO and UNDO are bounded
  - REDO starts from the latest checkpoint
  - UNDO goes back to the latest checkpoint
- Drawback
  - Delays new update transactions: not suitable for large multi-user DBMS
  - High checkpointing costs
16. Action-Consistent Checkpoint (ACC)
Source: T. Haerder, A. Reuter
17. Action-Consistent Checkpoint (ACC)
- When checkpoint generation is triggered
  - All new actions are put on hold
  - All incomplete actions are completed
  - Write out all modified pages
- Less disruptive than TCC
- Partial REDO only from the most recent checkpoint
- Global UNDO not bounded
- Still costly when buffers are large
18. Fuzzy ACC
- During checkpointing, the numbers of all dirty pages in the buffer are written to the log
- If a modified page is found in the previous checkpoint and has not been written out since then, write it out now
- Partial REDO from the penultimate checkpoint
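The fuzzy checkpoint rule can be sketched with a small bookkeeping class. `FuzzyCheckpointer` and its fields are illustrative assumptions; pages are identified by their numbers only, and page contents are left out.

```python
class FuzzyCheckpointer:
    """Tracks dirty page numbers and applies the fuzzy checkpoint rule."""

    def __init__(self):
        self.log = []               # checkpoint records
        self.dirty = set()          # pages currently dirty in the buffer
        self.prev_dirty = set()     # dirty pages listed at the previous checkpoint
        self.written_since = set()  # pages written out since that checkpoint
        self.disk_writes = []       # order in which pages were written out

    def modify(self, page):
        self.dirty.add(page)

    def write_out(self, page):
        self.dirty.discard(page)
        self.written_since.add(page)
        self.disk_writes.append(page)

    def checkpoint(self):
        # Pages listed at the previous checkpoint that are still dirty
        # and were never written out in between must be written now.
        stale = (self.prev_dirty & self.dirty) - self.written_since
        for page in sorted(stale):
            self.write_out(page)
        # Only page numbers go to the log, not page contents.
        self.log.append(("CHECKPOINT", sorted(self.dirty)))
        self.prev_dirty = set(self.dirty)
        self.written_since = set()
        return sorted(stale)
```

Every change older than the penultimate checkpoint is on disk once a checkpoint completes, which is why partial REDO can start there.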
19. Archive Recovery
Source: T. Haerder, A. Reuter
Make sure the two paths are independent!
20. Multi-Generation Archive Copies
- Archive copies are accessed very infrequently
- Subject to magnetic decay
- Keep several generations
Source: T. Haerder, A. Reuter
21. Duplicate Archive Logs
Source: T. Haerder, A. Reuter
22. Duplicate Archive Logs
- Archive log must extend back to the oldest archive copy
- Log is susceptible to magnetic decay as well
- Duplicate the archive log
  - Need to synchronize both archive logs with the temporary log at EOT
  - Very expensive!
23. Decouple Archive Logs from EOT
Source: T. Haerder, A. Reuter
24. Decouple Archive Logs from EOT
- Log entries are written only to the temporary log during EOT
- An asynchronous process copies REDO entries to the archive log
- Need to replicate the temporary log
  - Synchronize both temporary logs at EOT
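One pass of the asynchronous copy process might look like the sketch below. The entry layout (dicts with a `kind` field) and the function name `copy_redo_entries` are assumptions; a real system would run this in a background process outside the commit path.

```python
def copy_redo_entries(temp_log, archive_log, next_index):
    """Copy REDO entries from temp_log[next_index:] to the archive log.

    Returns the position to resume from on the next pass."""
    for entry in temp_log[next_index:]:
        if entry["kind"] == "REDO":
            archive_log.append(entry)
    return len(temp_log)
```

EOT only has to wait for the temporary log; the archive log catches up later by calling the function again with the returned position.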
25. Summary

Failure Type    Recovery Action
Transaction     Transaction UNDO
System          Global UNDO, Partial REDO
Media           Global REDO

- Crash recovery
  - TOC: per transaction
  - TCC: transaction boundary
  - ACC: action boundary
- Archive recovery
  - Multi-generation archive copy
  - Duplicate archive logs
  - Decouple archive log from EOT