Title: Crash Recovery
1Crash Recovery
2Review The ACID properties
- Atomicity: All actions in the Xaction happen, or none happen.
- Consistency: If each Xaction is consistent, and the DB starts consistent, it ends up consistent.
- Isolation: Execution of one Xaction is isolated from that of other Xacts.
- Durability: If a Xaction commits, its effects persist.
- CC guarantees Isolation and Atomicity.
- The Recovery Manager guarantees Atomicity and Durability.
3Why is recovery system necessary?
- Transaction failure
  - Logical errors: application errors (e.g. division by 0, segmentation fault)
  - System errors: deadlocks
- System crash: hardware/software failure causes the system to crash.
- Disk failure: head crash or similar disk failure destroys all or part of disk storage.
- The data we lose can be in main memory or on disk.
4Storage Media
- Volatile storage
  - does not survive system crashes
  - examples: main memory, cache memory
- Nonvolatile storage
  - survives system crashes
  - examples: disk, tape, flash memory, non-volatile (battery backed up) RAM
- Stable storage
  - a mythical form of storage that survives all failures
  - approximated by maintaining multiple copies on distinct nonvolatile media
5Recovery and Durability
- To achieve Durability: put data on stable storage
- To approximate stable storage: make two copies of data
- Problem: data transfer failure
6Stable-Storage Implementation
- Solution
  - Write to the first disk
  - Write to the second disk when the first disk completes
  - The process is complete only after the second write completes successfully
- Recovery (from disk failures, etc.)
  - Detect bad blocks with the checksum (e.g. parity)
  - Two good copies, equal blocks: done
  - One good, one bad copy: copy good to bad
  - Two bad copies: ignore write
  - Two good, unequal blocks? Ans: copy the second to the first
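The write and recovery rules above can be sketched in a few lines. This is a minimal illustration, not a real disk driver: the two "disks" are plain dictionaries, CRC32 stands in for the checksum (the slide only says "e.g. parity"), and the names `make_block`, `is_good`, `stable_write`, and `recover_block` are hypothetical.

```python
import zlib

def make_block(data: bytes) -> dict:
    # Store the data together with its checksum (CRC32 is an assumption).
    return {"data": data, "crc": zlib.crc32(data)}

def is_good(block: dict) -> bool:
    # A block is "good" if its stored checksum still matches its data.
    return zlib.crc32(block["data"]) == block["crc"]

def stable_write(disks, n, data):
    # Write the first copy, then the second; the write counts as complete
    # only after the second copy is in place.
    disks[0][n] = make_block(data)
    disks[1][n] = make_block(data)

def recover_block(disks, n):
    first, second = disks[0][n], disks[1][n]
    good1, good2 = is_good(first), is_good(second)
    if good1 and good2:
        if first["data"] != second["data"]:
            # Two good, unequal blocks: copy the second to the first,
            # which discards the incomplete write.
            disks[0][n] = dict(second)
    elif good1:
        disks[1][n] = dict(first)    # one good, one bad: copy good to bad
    elif good2:
        disks[0][n] = dict(second)
    # Two bad copies: nothing to repair from, so the write is ignored.
```

Copying the second block to the first rolls the block back to its last completely written value, which matches the rule that a write is complete only after the second copy succeeds.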
7Recovery and Atomicity
- Durability is achieved by making 2 copies of data
- What about atomicity
- Crash may cause inconsistencies
8Recovery and Atomicity
- Example: transfer 50 from account A to account B
  - The goal is either to perform all database modifications made by Ti or none at all.
  - Requires several inputs (reads) and outputs (writes)
  - Failure after output to account A and before output to B: DB is corrupted!
9Recovery Algorithms
- Recovery algorithms are techniques to ensure database consistency and transaction atomicity and durability despite failures
- Recovery algorithms have two parts
  - Actions taken during normal transaction processing to ensure enough information exists to recover from failures
  - Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency and durability
10Background Data Access
- Physical blocks: blocks on disk.
- Buffer blocks: blocks in main memory.
- Data transfer
  - input(B) transfers the physical block B to main memory.
  - output(B) transfers the buffer block B to the disk, and replaces the appropriate physical block there.
- Each transaction Ti has its private work-area in which local copies of all data items accessed and updated by it are kept.
  - Ti's local copy of a data item X is called xi.
- Assumption: each data item fits in, and is stored inside, a single block.
11Data Access (Cont.)
- A transaction transfers data items between system buffer blocks and its private work-area using the following operations
  - read(X) assigns the value of data item X to the local variable xi.
  - write(X) assigns the value of local variable xi to data item X in the buffer block.
  - Both these commands may necessitate the issue of an input(BX) instruction before the assignment, if the block BX in which X resides is not already in memory.
- Transactions
  - Perform read(X) while accessing X for the first time
  - All subsequent accesses are to the local copy.
  - After the last access, the transaction executes write(X).
- output(BX) need not immediately follow write(X). The system can perform the output operation when it deems fit.
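The four operations can be mimicked with dictionaries to make the data flow concrete. This is only a sketch under simplifying assumptions: a block is a dictionary of items, the caller names the block that holds an item, and the class name `DataAccess` and the block name "BA" are hypothetical.

```python
class DataAccess:
    def __init__(self, disk):
        self.disk = disk      # physical blocks: block name -> {item: value}
        self.buffer = {}      # buffer blocks currently in main memory

    def input(self, b):
        # Transfer physical block b from disk to main memory.
        self.buffer[b] = dict(self.disk[b])

    def output(self, b):
        # Transfer buffer block b to disk, replacing the physical block.
        self.disk[b] = dict(self.buffer[b])

    def read(self, work_area, b, x):
        # Copy item x from its buffer block into the private work area.
        if b not in self.buffer:
            self.input(b)
        work_area[x] = self.buffer[b][x]

    def write(self, work_area, b, x):
        # Copy the local value of x back into the buffer block (not to disk).
        if b not in self.buffer:
            self.input(b)
        self.buffer[b][x] = work_area[x]


disk = {"BA": {"A": 1000}}
da = DataAccess(disk)
work_area_t1 = {}
da.read(work_area_t1, "BA", "A")
work_area_t1["A"] -= 50
da.write(work_area_t1, "BA", "A")   # disk still holds A = 1000 here
da.output("BA")                     # only now does disk hold A = 950
```

The gap between write and output is exactly where a crash can lose an update, which is what the log is for.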
12[Figure: data access between the disk (blocks A and B), the buffer blocks A and B in memory (holding X and Y), and the private work areas of T1 (x1, y1) and T2 (x2), showing input(A), output(B), read(X), and write(Y)]
13Recovery and Atomicity (Cont.)
- To ensure atomicity, first output information about modifications to stable storage without modifying the database itself.
- We study
  - log-based recovery
- Database
  - Files storing the actual data
- Log
  - Another file storing the actions of transactions
14Log-Based Recovery
- Simplifying assumptions
  - Transactions run serially
  - Logs are written directly to stable storage
- Log: a sequence of log records that maintains a record of update activities on the database (Write Ahead Log, W.A.L.)
- Log records for transaction Tj
  - <Tj start>
  - <Tj, X, Vold, Vnew>
  - <Tj commit>
- Two approaches using logs
  - Deferred database modification
  - Immediate database modification
15Log example
Transaction T1: Read(A); A := A - 50; Write(A); Read(B); B := B + 50; Write(B)
Log: [figure not preserved; it ends with the new value of B, 2050]
16Deferred Database Modification
- Write the modifications to a log
- Defer execution of write operations until the transaction partially commits
- Example
  - Ti starts: write a <Ti start> record to the log.
  - Ti executes write(X): write <Ti, X, V> to the log, where V is the new value for X
    - The write is deferred
    - Note: the old value is not needed for this scheme
  - Ti partially commits: write <Ti commit> to the log
  - The DB is then updated by reading the log and executing the deferred writes
17Deferred Database Modification
- How to use the log for recovery after a crash?
  - Redo Ti if both <Ti start> and <Ti commit> are in the log.
  - Ignore Ti otherwise.
- Crashes can occur while
  - the transaction is executing the original updates, or
  - recovery action is being taken
- REDO should be idempotent (i.e., executing it several times should be the same as executing it once)
- Example: transactions T0 and T1 (T0 executes before T1)

  T0: read(A)           T1: read(C)
      A := A - 50           C := C - 100
      write(A)              write(C)
      read(B)
      B := B + 50
      write(B)
18Deferred Database Modification (Cont.)
- Below we show the log as it appears at three instants of time.
  (a) <T0 start>, <T0, A, 950>, <T0, B, 2050>
  (b) the records in (a), then <T0 commit>, <T1 start>, <T1, C, 600>
  (c) the records in (b), then <T1 commit>
- Only new values of an item are recorded in the log (old values can be omitted).
- At partial commit time, after the log records are on stable storage, the items are written to the database.
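The redo-only rule for deferred modification can be sketched as follows. The tuple encoding of log records (`("start", T)`, `("write", T, X, V)`, `("commit", T)`) and the function name `recover_deferred` are assumptions made for illustration; the values follow the T0/T1 example, with A, B, C initially 1000, 2000, 700.

```python
def recover_deferred(log, db):
    # Redo a transaction only if both its start and commit records are logged.
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    redo = started & committed
    # Replay the new values in log order; transactions without a commit
    # record never touched the database, so they are simply ignored.
    for rec in log:
        if rec[0] == "write" and rec[1] in redo:
            _, _, item, new_value = rec
            db[item] = new_value
    return redo


# Crash after T0 committed but before T1 did.
log = [
    ("start", "T0"), ("write", "T0", "A", 950), ("write", "T0", "B", 2050),
    ("commit", "T0"),
    ("start", "T1"), ("write", "T1", "C", 600),
]
db = {"A": 1000, "B": 2000, "C": 700}
recover_deferred(log, db)
recover_deferred(log, db)   # running recovery again changes nothing
# db is now {"A": 950, "B": 2050, "C": 700}
```

Because redo assigns absolute new values rather than applying increments, repeating it after a crash during recovery is harmless, which is the idempotence requirement.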
19Immediate Database Modification
- Allow database modifications to be OUTPUT to the database before the transaction commits.
- Tighter logging rules are needed to ensure transactions are undoable
  - Write records must be of the form <Ti, X, Vold, Vnew>
    - Both old and new values
  - The log record must be written before the database item is written/output
- Output of DB items can occur
  - Before or after commit
  - In any order
  - But the log record must be written prior to output of the item to the database
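The ordering rule (log record first, database item second) can be enforced mechanically. The sketch below assumes, unlike the simplifying assumption made earlier, that log records are first collected in memory, so that the rule has something to enforce; it flushes the whole in-memory log tail before any output. The class `WalBuffer` and its fields are hypothetical names.

```python
class WalBuffer:
    def __init__(self, disk):
        self.disk = disk        # database items on disk
        self.buffer = {}        # modified items still in main memory
        self.log_tail = []      # log records not yet on stable storage
        self.stable_log = []    # log records on stable storage

    def write(self, txn, item, new_value):
        # Log both old and new values, then update the in-memory copy.
        old_value = self.buffer.get(item, self.disk.get(item))
        self.log_tail.append((txn, item, old_value, new_value))
        self.buffer[item] = new_value

    def flush_log(self):
        self.stable_log.extend(self.log_tail)
        self.log_tail.clear()

    def output(self, item):
        # Write-ahead rule: the log record reaches stable storage
        # before the item itself is output to the database.
        self.flush_log()
        self.disk[item] = self.buffer[item]
```

With this ordering, any value found on disk after a crash has a matching log record carrying its old value, so the update can always be undone.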
20Immediate Database Modification Example
  Log                        Database
  <T0 start>
  <T0, A, 1000, 950>
  <T0, B, 2000, 2050>
                             A = 950
                             B = 2050
  <T0 commit>
  <T1 start>
  <T1, C, 700, 600>
                             C = 600
  <T1 commit>
21Immediate Database Modification (Cont.)
- Recovery procedure
  - Undo Ti if <Ti start> is in the log but <Ti commit> is not. Undo restores the value of all data items updated by Ti to their old values, going backwards from the last log record for Ti.
  - Redo Ti if <Ti start> and <Ti commit> are both in the log. Redo sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti.
- Both operations must be idempotent: even if the operation is executed multiple times, the effect is the same as if it is executed once.
- Undo operations are performed first, then redo operations. Why?
22I M Recovery Example
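The log at the time of the crash, in three cases:
  (a) <T0 start>, <T0, A, 1000, 950>, <T0, B, 2000, 2050>
  (b) the records in (a), then <T0 commit>, <T1 start>, <T1, C, 700, 600>
  (c) the records in (b), then <T1 commit>
- Recovery actions in each case above are
  - (a) undo(T0): B is restored to 2000 and A to 1000.
  - (b) undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively.
  - (c) redo(T0) and redo(T1): A and B are set to 950 and 2050 respectively. Then C is set to 600.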
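The undo-then-redo procedure for immediate modification can be sketched as below. The tuple encoding of log records, with updates written as `("update", T, X, old, new)`, and the function name `recover_immediate` are assumptions for illustration; the example reproduces case (b), and the database contents at the crash are made up, since any mix of old and new values is possible there.

```python
def recover_immediate(log, db):
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    to_undo = started - committed
    # Undo first: scan backwards, restoring old values of uncommitted txns.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in to_undo:
            db[rec[2]] = rec[3]
    # Then redo: scan forwards, reapplying new values of committed txns.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            db[rec[2]] = rec[4]
    return to_undo, committed


# Case (b): T0 committed, T1 still active at the crash.
log_b = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),
]
db = {"A": 950, "B": 2000, "C": 600}   # hypothetical on-disk state at crash
recover_immediate(log_b, db)
# db is now {"A": 950, "B": 2050, "C": 700}
```

Both passes assign logged values rather than recomputing them, so running the function a second time leaves the database unchanged.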
23Checkpoints
- Problems in recovery procedure as discussed earlier
  - Searching the entire log is time-consuming
  - We might unnecessarily redo transactions which have already output their updates to the database.
- How to avoid redundant redoes?
  - Put marks in the log indicating that at that point DB and log are consistent. Checkpoint!
24Checkpoints
- At a checkpoint
  - Output all log records currently residing in main memory onto stable storage.
  - Output all modified buffer blocks to the disk.
  - Write a <checkpoint> log record onto stable storage.
25Checkpoints (Cont.)
- Recovering from log with checkpoints
  - Scan backwards from the end of the log to find the most recent <checkpoint> record
  - Continue scanning backwards till a <Ti start> record is found.
  - Need only consider the part of the log following the above start record. Why?
  - After that, recover from the log with the rules that we had before.
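The backward scan can be sketched as a function that returns the part of the log recovery still has to look at. It assumes serial transactions, as in the earlier simplifying assumptions, and a tuple encoding of records in which a checkpoint is `("checkpoint",)`; the function name `log_suffix_for_recovery` is hypothetical.

```python
def log_suffix_for_recovery(log):
    # Find the most recent checkpoint record, scanning from the end.
    checkpoint_index = None
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "checkpoint":
            checkpoint_index = i
            break
    if checkpoint_index is None:
        return log                    # no checkpoint: consider the whole log
    # Keep scanning backwards for the nearest start record.
    for i in range(checkpoint_index - 1, -1, -1):
        if log[i][0] == "start":
            return log[i:]
    return log[checkpoint_index:]     # no transaction started before it


log = [
    ("start", "T1"), ("update", "T1", "A", 1000, 950), ("commit", "T1"),
    ("start", "T2"), ("update", "T2", "B", 2000, 2050),
    ("checkpoint",),
    ("update", "T2", "C", 700, 600), ("commit", "T2"),
    ("start", "T3"),
]
suffix = log_suffix_for_recovery(log)
# suffix begins at ("start", "T2"); T1's records are skipped entirely
```

Under serial execution, every transaction that started before the one found by the scan had already finished by the checkpoint, so its updates are on disk and its log records can be skipped.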
26Example of Checkpoints
[Figure: timeline of transactions T1 to T4 relative to the checkpoint at time Tc and the system failure at time Tf]
- T1 can be ignored (updates already output to disk due to checkpoint)
- T2 and T3 redone.
- T4 undone
27Recovery With Concurrent Transactions
- To permit concurrency
  - All transactions share a single disk buffer and a single log
  - Concurrency control: Strict 2PL, i.e. release eXclusive locks only after commit. Why?
- Logging is done as described earlier.
- The checkpointing technique and actions taken on recovery have to be changed (based on ARIES), since several transactions may be active when a checkpoint is performed.
28Recovery With Concurrent Transactions (Cont.)
- Checkpoints for concurrent transactions
  - The checkpoint record has the form <checkpoint L>, where L is the list of transactions active at the time of the checkpoint
  - We assume no updates are in progress while the checkpoint is carried out
- Recovery for concurrent transactions, 3 phases
  - Analysis: construction of undo- and redo-lists
  - Perform Undo
  - Perform Redo
29Recovery With Concurrent Transactions (Cont.)
- 1. ANALYSIS: construction of undo- and redo-lists
  - a. Initialize undo-list and redo-list to empty
  - b. Scan the log backwards from the end, stopping when the first <checkpoint L> record is found. For each record found during the backward scan
    - if the record is <Ti commit>, add Ti to redo-list
    - if the record is <Ti start>, then if Ti is not in redo-list, add Ti to undo-list
  - c. For every Ti in L, if Ti is not in redo-list, add Ti to undo-list
    - This adds to undo-list the txns that started before the checkpoint but did not commit
30Recovery With Concurrent Transactions
- Perform UNDO
  - Scan log backwards
  - Perform undo(T) for every transaction T in undo-list
  - Stop when <T start> has been reached for every T in undo-list.
- Perform REDO
  - Locate the most recent <checkpoint L> record.
  - Scan log forwards from the <checkpoint L> record till the end of the log.
  - Perform redo for each log record that belongs to a transaction on redo-list
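The three phases can be sketched together as below. The record encoding (`("start", T)`, `("update", T, X, old, new)`, `("commit", T)`, `("checkpoint", L)`) and all function names are assumptions for illustration, and the log in the usage example is hypothetical, made up only to exercise the three phases.

```python
def analysis(log):
    undo_list, redo_list = set(), set()
    checkpoint_index = 0              # redo from the start if no checkpoint
    for i in range(len(log) - 1, -1, -1):
        rec = log[i]
        if rec[0] == "commit":
            redo_list.add(rec[1])
        elif rec[0] == "start" and rec[1] not in redo_list:
            undo_list.add(rec[1])
        elif rec[0] == "checkpoint":
            # Transactions active at the checkpoint that never committed.
            undo_list |= {t for t in rec[1] if t not in redo_list}
            checkpoint_index = i
            break
    return undo_list, redo_list, checkpoint_index

def undo_phase(log, db, undo_list):
    pending = set(undo_list)          # txns whose start record is not yet seen
    for rec in reversed(log):
        if not pending:
            break
        if rec[0] == "update" and rec[1] in undo_list:
            db[rec[2]] = rec[3]       # restore the old value
        elif rec[0] == "start":
            pending.discard(rec[1])

def redo_phase(log, db, redo_list, checkpoint_index):
    for rec in log[checkpoint_index:]:
        if rec[0] == "update" and rec[1] in redo_list:
            db[rec[2]] = rec[4]       # reapply the new value

def recover(log, db):
    undo_list, redo_list, checkpoint_index = analysis(log)
    undo_phase(log, db, undo_list)
    redo_phase(log, db, redo_list, checkpoint_index)
    return undo_list, redo_list


# Hypothetical log, not taken from the slides.
log = [
    ("start", "T0"), ("update", "T0", "A", 0, 10), ("commit", "T0"),
    ("start", "T1"), ("update", "T1", "B", 0, 10),
    ("start", "T2"), ("update", "T2", "C", 0, 10),
    ("update", "T2", "C", 10, 20),
    ("checkpoint", ["T1", "T2"]),
    ("start", "T3"), ("update", "T3", "A", 10, 20),
    ("update", "T3", "D", 0, 10), ("commit", "T3"),
    ("start", "T4"),
]
db = {"A": 20, "B": 10, "C": 20, "D": 10}
undo_list, redo_list = recover(log, db)
# undo_list == {"T1", "T2", "T4"}, redo_list == {"T3"}
# db is now {"A": 20, "B": 0, "C": 0, "D": 10}
```

Undo has to scan past the checkpoint because T1 and T2 logged their updates before it, while redo can start at the checkpoint because everything logged earlier by committed transactions is already on disk.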
31Example of Recovery
- Go over the steps of the recovery algorithm on the following log
  [Log listing not preserved]

Redo-list: T3
Undo-list: T4, T1, T2
Undo: Set C to 10, set C to 0, set B to 0
Redo: Set A to 20, set D to 10

DB            A    B    C    D
Initial       0    0    0    0
At crash     20   10   20   10
After rec.   20    0    0   10
32 Recovery Summary
- Durability
- Duplicate copies
- Ensuring Atomicity
- Deferred Logging --- Only Redo
- Immediate Logging -- Undo and Redo
- Checkpoints to limit log size
- Recovery of concurrent txns with checkpoints
- 3 phases
33What we covered
- Relational model
- SQL
- Formal and commercial query languages
- Functional Dependencies
- Normalization
- Interfacing with Databases: PL/SQL, JDBC
- Physical Design
- File storage
- Indexing
- Query Processing and Optimization
- Txns
- Concurrency Control
- Recovery