Crash Recovery - PowerPoint PPT Presentation

Provided by: ssu72
1
Crash Recovery
2
Review: The ACID properties
  • Atomicity: All actions in the Xaction happen,
    or none happen.
  • Consistency: If each Xaction is consistent, and
    the DB starts consistent, it ends up consistent.
  • Isolation: Execution of one Xaction is
    isolated from that of other Xacts.
  • Durability: If a Xaction commits, its effects
    persist.
  • Concurrency control (CC) guarantees Isolation and Atomicity.
  • The Recovery Manager guarantees Atomicity and
    Durability.

3
Why is a recovery system necessary?
  • Transaction failure
  • Logical errors: application errors (e.g. division by
    0, segmentation fault)
  • System errors: deadlocks
  • System crash: a hardware/software failure causes
    the system to crash.
  • Disk failure: a head crash or similar disk failure
    destroys all or part of disk storage
  • The data we lose can be in main memory or on
    disk

4
Storage Media
  • Volatile storage
  • does not survive system crashes
  • examples: main memory, cache memory
  • Nonvolatile storage
  • survives system crashes
  • examples: disk, tape, flash memory,
    non-volatile (battery-backed) RAM
  • Stable storage
  • a mythical form of storage that survives all
    failures
  • approximated by maintaining multiple copies on
    distinct nonvolatile media

5
Recovery and Durability
  • To achieve Durability: put data on stable
    storage
  • To approximate stable storage: make two copies of
    the data
  • Problem: data transfer failure

6
Stable-Storage Implementation
  • Solution
  • Write to the first disk
  • Write to the second disk when the first write
    completes
  • The process is complete only after the second
    write completes successfully
  • Recovery (from disk failures, etc.)
  • Detect bad blocks with a checksum (e.g. parity)
  • Two good copies, equal blocks: done
  • One good copy, one bad copy: copy the good block over the bad one
  • Two bad copies: ignore the write
  • Two good, unequal blocks?
    Ans: Copy the second to the first
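
The two-copy write and the recovery cases above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Disk` class is a hypothetical in-memory stand-in for a nonvolatile disk, a CRC32 plays the role of the checksum, and the names `stable_write` and `recover_block` are not from the slides.

```python
import zlib


class Disk:
    """Hypothetical in-memory stand-in for one nonvolatile disk."""

    def __init__(self):
        self.blocks = {}  # block id -> (data, checksum)

    def write(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def read(self, block_id):
        """Return the data, or None if the block is missing or fails its checksum."""
        entry = self.blocks.get(block_id)
        if entry is None:
            return None
        data, checksum = entry
        return data if zlib.crc32(data) == checksum else None


def stable_write(first, second, block_id, data):
    # The second write starts only after the first one has completed.
    first.write(block_id, data)
    second.write(block_id, data)


def recover_block(first, second, block_id):
    a, b = first.read(block_id), second.read(block_id)
    if a is None and b is None:
        return "ignored"            # two bad copies: ignore the write
    if a is None:
        first.write(block_id, b)    # one good, one bad: copy good over bad
        return "repaired first"
    if b is None:
        second.write(block_id, a)
        return "repaired second"
    if a == b:
        return "done"               # two good, equal copies
    first.write(block_id, b)        # two good, unequal: copy second to first
    return "copied second to first"
```

A crash between the two writes leaves two good but unequal copies; with the rule on the slide, recovery then falls back to the value held by the second disk.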
7
Recovery and Atomicity
  • Durability is achieved by making 2 copies of data
  • What about atomicity?
  • Crash may cause inconsistencies

8
Recovery and Atomicity
  • Example: transfer 50 from account A to account B
  • The goal is either to perform all database
    modifications made by Ti or none at all.
  • The transfer requires several inputs (reads) and outputs
    (writes)
  • Failure after output to account A and before
    output to B:
  • DB is corrupted!

9
Recovery Algorithms
  • Recovery algorithms are techniques to ensure
    database consistency and transaction atomicity
    and durability despite failures
  • Recovery algorithms have two parts
  • Actions taken during normal transaction
    processing to ensure enough information exists to
    recover from failures
  • Actions taken after a failure to recover the
    database contents to a state that ensures
    atomicity, consistency and durability

10
Background Data Access
  • Physical blocks: blocks on disk.
  • Buffer blocks: blocks in main memory.
  • Data transfer
  • input(B) transfers the physical block B to main
    memory.
  • output(B) transfers the buffer block B to the
    disk, and replaces the appropriate physical block
    there.
  • Each transaction Ti has its private work-area in
    which local copies of all data items accessed and
    updated by it are kept.
  • Ti's local copy of a data item X is called xi.
  • Assumption: each data item fits in, and is stored
    inside, a single block.

11
Data Access (Cont.)
  • Transaction transfers data items between system
    buffer blocks and its private work-area using the
    following operations
  • read(X) assigns the value of data item X to the
    local variable xi.
  • write(X) assigns the value of local variable xi
    to data item X in the buffer block.
  • both these commands may necessitate the issue of
    an input(BX) instruction before the assignment,
    if the block BX in which X resides is not already
    in memory.
  • Transactions
  • Perform read(X) while accessing X for the first
    time
  • All subsequent accesses are to the local copy.
  • After last access, transaction executes write(X).
  • output(BX) need not immediately follow write(X).
    System can perform the output operation when it
    deems fit.
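
As a rough sketch of the four operations, assuming a hypothetical model in which the disk and the buffer are dictionaries mapping a block name to its data items (the class names `BufferManager` and `Transaction` are illustrative, not from the slides):

```python
class BufferManager:
    """Hypothetical model: disk and buffer both map block name -> {item: value}."""

    def __init__(self, disk):
        self.disk = disk
        self.buffer = {}

    def input(self, block):
        # input(B): copy physical block B into main memory.
        self.buffer[block] = dict(self.disk[block])

    def output(self, block):
        # output(B): replace the physical block with the buffer block.
        self.disk[block] = dict(self.buffer[block])


class Transaction:
    def __init__(self, bm, block_of):
        self.bm = bm
        self.block_of = block_of  # item name -> block name (one block per item)
        self.local = {}           # private work area holding the xi copies

    def _ensure_in_memory(self, item):
        block = self.block_of[item]
        if block not in self.bm.buffer:
            self.bm.input(block)  # issue input(BX) if BX is not yet in memory
        return block

    def read(self, item):
        # read(X): assign the buffered value of X to the local copy.
        block = self._ensure_in_memory(item)
        self.local[item] = self.bm.buffer[block][item]

    def write(self, item):
        # write(X): assign the local copy to X in the buffer block, not on disk.
        block = self._ensure_in_memory(item)
        self.bm.buffer[block][item] = self.local[item]
```

In this sketch the disk only changes when `output` is called, which mirrors the point that output(BX) need not immediately follow write(X).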

12
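[Figure: data access. Blocks A and B on disk; buffer blocks A (containing X) and B (containing Y) in memory; input(A) and output(B) move blocks between disk and buffer; read(X) and write(Y) move items between the buffer and the private work areas of T1 (x1, y1) and T2 (x2).]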
13
Recovery and Atomicity (Cont.)
  • To ensure atomicity, first output information
    about modifications to stable storage without
    modifying the database itself.
  • We study
  • log-based recovery
  • Database
  • Files storing the actual data
  • Log
  • Another file storing the actions of transactions

14
Log-Based Recovery
  • Simplifying assumptions
  • Transactions run serially
  • Logs are written directly to stable storage
  • Log: a sequence of log records that maintains a
    record of update activities on the database
    (Write-Ahead Log, WAL)
  • Log records for transaction Tj: <Tj start>,
    <Tj, X, Vold, Vnew>, <Tj commit>
  • Two approaches using logs
  • Deferred database modification
  • Immediate database modification

15
Log example
Transaction T1: Read(A); A := A - 50; Write(A); Read(B); B := B + 50; Write(B)
Log: <T1 start> <T1, A, 1000, 950> <T1, B, 2000, 2050> <T1 commit>
16
Deferred Database Modification
  • Write the modifications to a log
  • And defer execution of write operations until
    the transaction partially commits
  • Example
  • Ti starts: write a <Ti start> record to the log.
  • Ti executes write(X):
  • write <Ti, X, V> to the log; V is the new value for
    X
  • The write is deferred
  • Note: the old value is not needed for this scheme
  • Ti partially commits:
  • Write <Ti commit> to the log
  • The DB is updated by reading and executing the log
17
Deferred Database Modification
  • How to use the log for recovery after a crash?
  • Redo Ti if both <Ti start> and <Ti commit> are
    there in the log.
  • Ignore Ti otherwise.
  • Crashes can occur while
  • the transaction is executing the original
    updates, or
  • while recovery action is being taken
  • REDO should be idempotent (i.e., executing it
    several times should have the same effect as executing it once)
  • Example: transactions T0 and T1 (T0 executes
    before T1)
  • T0: read(A); A := A - 50; write(A); read(B); B := B + 50; write(B)
  • T1: read(C); C := C - 100; write(C)
18
Deferred Database Modification (Cont.)
  • Below we show the log as it appears at three
    instances of time.


[Figure: the log at three instants, labelled (a), (b) and (c)]
  • Only new values of an item are recorded in the
    log (old values can be omitted).
  • At partial commit time, after the log records are on stable
    storage, the items are written to the database.
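
The redo-only recovery rule for deferred modification can be sketched as below. The three log snapshots used in the usage note are assumptions (the original figure is not available); they reuse the values 950, 2050 and 600 that T0 and T1 write, and assume initial values A = 1000, B = 2000, C = 700. The function name `recover_deferred` and the tuple encoding are also illustrative.

```python
def recover_deferred(log, db):
    """Redo every transaction with both a start and a commit record; ignore the rest."""
    started = {r[0] for r in log if r[1:] == ("start",)}
    committed = {r[0] for r in log if r[1:] == ("commit",)}
    redo = started & committed
    for record in log:                  # forward scan, in log order
        if len(record) == 3 and record[0] in redo:
            _, item, new_value = record
            db[item] = new_value        # assigning a value is idempotent
    return db


# Hypothetical snapshots of the log at three instants.
LOG_A = [("T0", "start"), ("T0", "A", 950), ("T0", "B", 2050)]
LOG_B = LOG_A + [("T0", "commit"), ("T1", "start"), ("T1", "C", 600)]
LOG_C = LOG_B + [("T1", "commit")]
```

With these assumed snapshots, recovery from `LOG_A` changes nothing, recovery from `LOG_B` redoes only T0, and recovery from `LOG_C` redoes both transactions. Running the function twice on the same log gives the same database, which is the idempotence requirement.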
19
Immediate Database Modification
  • Allow database modifications to be OUTPUT to the
    database before the transaction commits.
  • Tighter logging rules are needed to ensure
    transactions are undoable
  • Write records must be of the form <Ti, X, Vold, Vnew>
  • Both old and new values are logged
  • The log record must be written before the database item
    is written/output
  • Output of DB items can occur
  • Before or after commit
  • In any order
  • But the log record must be written prior to the output
    of an item to the database
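
One way to picture the "log record before output" rule is a buffer that forces pending log records to stable storage before it outputs an item. This is a sketch under assumptions: `WalBuffer` and its attributes are hypothetical names, and plain Python lists stand in for the in-memory log tail and the log on stable storage.

```python
class WalBuffer:
    """Hypothetical buffer that forces log records to stable storage before output."""

    def __init__(self, db):
        self.db = db           # stands in for the database on disk
        self.buffer = {}       # item -> value currently in the buffer
        self.log_tail = []     # log records still in main memory
        self.stable_log = []   # log records already on stable storage

    def write(self, txn, item, new_value):
        old_value = self.buffer.get(item, self.db[item])
        # Record both old and new values: <Ti, X, Vold, Vnew>.
        self.log_tail.append((txn, item, old_value, new_value))
        self.buffer[item] = new_value

    def flush_log(self):
        self.stable_log.extend(self.log_tail)
        self.log_tail.clear()

    def output(self, item):
        # Write-ahead rule: the item's log record must be stable first.
        if any(rec[1] == item for rec in self.log_tail):
            self.flush_log()
        self.db[item] = self.buffer[item]
```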

20
Immediate Database Modification Example
  • Log / Database (values written to the database as the log grows)
  • A = 950
  • B = 2050
  • C = 600

21
Immediate Database Modification (Cont.)
  • Recovery procedure
  • Undo Ti if <Ti start> is in the log but
    <Ti commit> is not. undo(Ti):
  • restores the value of all data items updated by Ti
    to their old values, going backwards from the
    last log record for Ti
  • Redo Ti if <Ti start> and <Ti commit> are both in the
    log. redo(Ti):
  • sets the value of all data items updated by Ti to
    the new values, going forward from the first log
    record for Ti
  • Both operations must be idempotent: even if the
    operation is executed multiple times, the effect
    is the same as if it is executed once
  • Undo operations are performed first, then redo
    operations. Why?

22
Immediate Modification Recovery Example
[Figure: the log at three instants, labelled (a), (b) and (c)]
  • Recovery actions in each case above are
  • (a) undo(T0): B is restored to 2000 and A to
    1000.
  • (b) undo(T1) and redo(T0): C is restored to
    700, and then A and B are
    set to 950 and 2050 respectively.
  • (c) redo(T0) and redo(T1): A and B are set to
    950 and 2050
    respectively. Then C is set to 600.
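
The three cases can be checked with a small sketch of undo-then-redo recovery. The log snapshots below are assumptions reconstructed from the values quoted in the cases (A: 1000 to 950, B: 2000 to 2050, C: 700 to 600); the function name `recover_immediate` and the tuple encoding `(txn, item, old, new)` are illustrative.

```python
def recover_immediate(log, db):
    started = [r[0] for r in log if r[1:] == ("start",)]
    committed = {r[0] for r in log if r[1:] == ("commit",)}
    undo = {t for t in started if t not in committed}
    redo = {t for t in started if t in committed}

    # Undo first: scan backwards, restoring old values.
    for record in reversed(log):
        if len(record) == 4 and record[0] in undo:
            _, item, old_value, _ = record
            db[item] = old_value
    # Then redo: scan forwards, setting new values.
    for record in log:
        if len(record) == 4 and record[0] in redo:
            _, item, _, new_value = record
            db[item] = new_value
    return db


# Hypothetical log snapshots for cases (a), (b) and (c).
LOG_A = [("T0", "start"), ("T0", "A", 1000, 950), ("T0", "B", 2000, 2050)]
LOG_B = LOG_A + [("T0", "commit"), ("T1", "start"), ("T1", "C", 700, 600)]
LOG_C = LOG_B + [("T1", "commit")]
```

The result does not depend on which outputs had reached the disk before the crash, since undo and redo overwrite each item with a logged value.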

23
Checkpoints
  • Problems in the recovery procedure as discussed
    earlier:
  • searching the entire log is time-consuming
  • we might unnecessarily redo transactions which
    have already output their updates to the
    database.
  • How to avoid redundant redos?
  • Put marks in the log indicating that at that
    point the DB and the log are consistent. Checkpoint!

24
Checkpoints
  • At a checkpoint
  • Output all log records currently residing in main
    memory onto stable storage.
  • Output all modified buffer blocks to the disk.
  • Write a log record <checkpoint> onto stable
    storage.

25
Checkpoints (Cont.)
  • Recovering from a log with checkpoints
  • Scan backwards from the end of the log to find the most
    recent <checkpoint> record
  • Continue scanning backwards till a <Ti start> record is found.
  • Need only consider the part of the log following
    the above start record. Why?
  • After that, recover from the log with the rules that
    we had before.
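
The backward scan can be sketched as follows, assuming serial transactions and a log stored as a list of tuples in which `("checkpoint",)` stands for the checkpoint record. The function name `relevant_suffix` and the record encoding are assumptions for illustration.

```python
def relevant_suffix(log):
    """Return the part of the log that recovery needs, assuming serial transactions."""
    # Find the most recent checkpoint record, scanning backwards.
    i = len(log) - 1
    while i >= 0 and log[i] != ("checkpoint",):
        i -= 1
    if i < 0:
        return log                      # no checkpoint: the whole log is needed
    # Keep scanning backwards until a start record is found.
    while i >= 0 and log[i][1:] != ("start",):
        i -= 1
    # If no start record precedes the checkpoint, fall back to the whole log.
    return log[max(i, 0):]
```

Everything before the returned suffix belongs to transactions that finished before the checkpoint, so their updates were already output when the checkpoint was taken.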

26
Example of Checkpoints
[Figure: timeline of transactions T1, T2, T3 and T4, with a checkpoint at time Tc and a system failure at time Tf]
  • T1 can be ignored (updates already output to disk
    due to checkpoint)
  • T2 and T3 are redone.
  • T4 is undone.

27
Recovery With Concurrent Transactions
  • To permit concurrency
  • All transactions share a single disk buffer and a
    single log
  • Concurrency control: Strict 2PL, i.e. release
    eXclusive locks only after commit. Why?
  • Logging is done as described earlier.
  • The checkpointing technique and actions taken on
    recovery have to be changed (based on ARIES)
  • since several transactions may be active when a
    checkpoint is performed.

28
Recovery With Concurrent Transactions (Cont.)
  • Checkpoints for concurrent transactions
  • The checkpoint record is <checkpoint L>, where L is the list of transactions
    active at the time of the checkpoint
  • We assume no updates are in progress while the
    checkpoint is carried out
  • Recovery for concurrent transactions has 3 phases
  • Analysis: construction of the undo- and redo-lists
  • Perform Undo
  • Perform Redo

29
Recovery With Concurrent Transactions (Cont.)
  • 1. ANALYSIS: construction of the undo- and
    redo-lists
  • a. Initialize undo-list and redo-list to empty
  • b. Scan the log backwards from the end, stopping
    when the first <checkpoint L> record is found.
    For each record found during the backward scan:
  • if the record is <Ti commit>, add Ti to redo-list
  • if the record is <Ti start>, then if Ti is not
    in redo-list, add Ti to undo-list
  • c. For every Ti in L, if Ti is not in
    redo-list, add Ti to undo-list
  • This adds txns to undo-list that started
    prior to the checkpoint but did not commit
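
Steps a to c of the analysis phase can be sketched as below. The tuple encoding of log records, with `("checkpoint", L)` for the checkpoint record, and the function name `analysis` are assumptions made for illustration.

```python
def analysis(log):
    """Build (undo_list, redo_list) from a log whose records are tuples."""
    undo_list, redo_list = [], []       # step a
    active_at_checkpoint = []
    for record in reversed(log):        # step b: scan backwards from the end
        if record[0] == "checkpoint":
            active_at_checkpoint = record[1]   # the list L
            break                       # stop at the first checkpoint found
        txn, kind = record[0], record[1]
        if kind == "commit":
            redo_list.append(txn)
        elif kind == "start" and txn not in redo_list:
            undo_list.append(txn)
    for txn in active_at_checkpoint:    # step c
        if txn not in redo_list:
            undo_list.append(txn)
    return undo_list, redo_list
```

For example, with a hypothetical log in which T1 is active at the checkpoint and never commits, T2 starts and commits after the checkpoint, and T3 starts but does not commit, the function puts T3 and T1 on the undo-list and T2 on the redo-list.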
30
Recovery With Concurrent Transactions
  • Perform UNDO
  • Scan the log backwards
  • Perform undo(T) for every transaction T in
    undo-list
  • Stop when <T start> has been reached for every T in
    undo-list.
  • Perform REDO
  • Locate the most recent <checkpoint L> record.
  • Scan the log forwards from the <checkpoint L> record
    till the end of the log.
  • Perform redo for each log record that belongs to
    a transaction on redo-list
31
Example of Recovery
  • Go over the steps of the recovery algorithm on
    the following log

Redo-list: T3
Undo-list: T4, T1, T2
Undo: Set C to 10, Set C to 0, Set B to 0
Redo: Set A to 20, Set D to 10

DB           A    B    C    D
Initial      0    0    0    0
At crash     20   10   20   10
After rec.   20   0    0    10
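
The undo and redo phases can be traced with a short sketch. The log below is hypothetical: the original log is not reproduced here, so `LOG` is only one log that is consistent with the lists, actions and database values quoted above. The function names and the tuple encoding `(txn, item, old, new)` are likewise assumptions.

```python
# Hypothetical log, chosen to be consistent with the quoted lists and values.
LOG = [
    ("T1", "start"), ("T1", "B", 0, 10),
    ("T2", "start"), ("T2", "C", 0, 10), ("T2", "C", 10, 20),
    ("checkpoint", ["T1", "T2"]),
    ("T3", "start"), ("T3", "A", 0, 20), ("T3", "D", 0, 10), ("T3", "commit"),
    ("T4", "start"),
]


def undo_phase(log, db, undo_list):
    """Scan backwards, restoring old values; return the actions taken."""
    pending, actions = set(undo_list), []
    for record in reversed(log):
        if not pending:
            break                        # every <T start> has been reached
        if record[0] not in pending:     # also skips checkpoint records
            continue
        if record[1] == "start":
            pending.discard(record[0])
        elif len(record) == 4:
            _, item, old_value, _ = record
            db[item] = old_value
            actions.append((item, old_value))
    return actions


def redo_phase(log, db, redo_list):
    """Scan forwards from the most recent checkpoint, setting new values."""
    start, actions = 0, []
    for i, record in enumerate(log):
        if record[0] == "checkpoint":
            start = i                    # remember the most recent checkpoint
    for record in log[start:]:
        if len(record) == 4 and record[0] in redo_list:
            _, item, _, new_value = record
            db[item] = new_value
            actions.append((item, new_value))
    return actions
```

Starting from the at-crash values A = 20, B = 10, C = 20, D = 10, running `undo_phase` with the undo-list and then `redo_phase` with the redo-list reproduces the quoted actions and the after-recovery row.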
32
Recovery Summary
  • Durability
  • Duplicate copies
  • Ensuring Atomicity
  • Deferred Logging --- Only Redo
  • Immediate Logging -- Undo and Redo
  • Checkpoints to limit log size
  • Recovery of concurrent txns with checkpoints
  • 3 phases

33
What we covered
  • Relational model - SQL
  • Formal and commercial query languages
  • Functional Dependencies
  • Normalization
  • Interfacing with Databases: PL/SQL, JDBC
  • Physical Design
  • File storage
  • Indexing
  • Query Processing and Optimization
  • Txns
  • Concurrency Control
  • Recovery
