Chapter 7: Distributed Recovery - PowerPoint PPT Presentation

Slides: 34
Provided by: Mik7253

Transcript and Presenter's Notes


1
Chapter 7 Distributed Recovery
2
Distributed Recovery
  • Introduction
  • Recovering from Aborted Transactions
  • System Failure
  • Media Failures
  • Practical Advice
  • Summary

3
Why Do We Need Recovery?
  • Operating system fails
  • Transaction aborts
  • Media fails

4
Recovery Information
  • Recovery information is stored to help recover
    from aborts.
  • Before images are values of data before an
    update.
  • After images are modified data values.
  • Archive is a complete copy of the database at a
    point in time.

5
Transaction Execution Steps
  • Input transaction
  • Log transaction
  • Fetch DB record(s)
  • Log before image(s)
  • Compute new record value(s)
  • Log after image(s)
  • Log commitment
  • Write new DB record value(s)
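
The step order above can be sketched in a few lines. This is only an illustration under stated assumptions: the database is a plain dict, the log is a list of tuples, and `run_transaction` and its record layout are hypothetical names, not something the slides define.

```python
def run_transaction(txn_id, db, log, key, compute):
    """Apply one single-record update in the step order listed on the slide."""
    log.append((txn_id, "start"))                 # log transaction
    before = db.get(key)                          # fetch DB record
    log.append((txn_id, "before", key, before))   # log before image
    after = compute(before)                       # compute new record value
    log.append((txn_id, "after", key, after))     # log after image
    log.append((txn_id, "commit"))                # log commitment
    db[key] = after                               # write new DB record value
    return after


db = {8: "Smith"}
log = []
run_transaction("A", db, log, 8, lambda old: "Jones")
```

The point of the ordering is that every log record, including the commit, is appended before the database itself is changed.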

6
Transaction Execution Steps, cont.
[Diagram: Input Transaction, Process, and Output Message, with Obtain Record and Write Updated Record; the flow is labeled with step numbers 1 through 8.]
7
Distributed Recovery
  • Introduction
  • Recovering from Aborted Transactions
  • System Failure
  • Media Failures
  • Practical Advice
  • Summary

8
Recovering From Aborted Transactions
  • Two strategies
  • Incremental log with deferred updates
  • Incremental log with immediate updates

9
Incremental Log With Deferred Updates
  • Defer writes until it is assured that all writes
    can be completed successfully.
  • Log structure: <A, Start> <A, item X1, after
    value X1> <A, item X2, after value X2> . . .
    <A, item Xn, after value Xn> <A, Commit>
  • Recovery Operations
  • If commit, then merge log into the database.
  • If abort, then do nothing.

10
Incremental Log With Deferred Updates Example
Transaction:
  Begin transaction A
    Delete Employee where ENumber = 1
    insert into Employee (ENumber, Name) <5, "Lewis">
    update Employee set Name = "Jones" where ENumber = 8
    commit
  end transaction A
Log (after image):
  A  1  null
  A  5  Lewis
  A  8  Jones
On commit: process deferred updates.
On abort: do nothing.
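
A minimal sketch of deferred-update recovery, using the log from this example. The tuple layout, the `recover_deferred` name, and the starting database contents (Ackman and Smith, taken from the before images on the immediate-updates example slide) are assumptions made for illustration.

```python
def recover_deferred(log, db, txn_id):
    """Merge the after images of txn_id into db only if it committed."""
    records = [r for r in log if r[0] == txn_id]
    if not any(r[1] == "commit" for r in records):
        return db  # abort: nothing was written to the database, so do nothing
    for r in records:
        if r[1] == "write":
            _, _, key, after = r
            if after is None:
                db.pop(key, None)   # a null after image means the row was deleted
            else:
                db[key] = after
    return db


# Log records: (transaction, action, ENumber, after image).
log = [
    ("A", "start"),
    ("A", "write", 1, None),      # delete ENumber 1
    ("A", "write", 5, "Lewis"),   # insert ENumber 5
    ("A", "write", 8, "Jones"),   # update ENumber 8
    ("A", "commit"),
]
committed_db = recover_deferred(log, {1: "Ackman", 8: "Smith"}, "A")
aborted_db = recover_deferred(log[:-1], {1: "Ackman", 8: "Smith"}, "A")
```

Dropping the commit record (`log[:-1]`) models an abort: the database is returned unchanged.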
11
Incremental Log With Immediate Updates
  • Write updates to the DB and maintain a log of
    before-and-after values for all updated items
  • Log structure: <A, Start> <A, X1, before value
    X1, after value X1> <A, X2, before value X2,
    after value X2> . . . <A, Xn, before value Xn,
    after value Xn> <A, Commit>

12
Incremental Log With Immediate Updates Example
Transaction:
  Begin transaction A
    Delete Employee where ENumber = 1
    insert into Employee (ENumber, Name) <5, "Lewis">
    update Employee set Name = "Jones" where ENumber = 8
    commit
  end transaction A
Log (before image | after image):
  A  1 Ackman  |  null
  A  null      |  5 Lewis
  A  8 Smith   |  8 Jones
On abort: process the log in reverse order.
(save log)
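
Undoing an aborted transaction under immediate updates could look roughly like the sketch below. It replays the before images from this example newest-first. The `undo` function and the tuple layout are hypothetical, and the database is modeled as a dict keyed by ENumber.

```python
def undo(log, db, txn_id):
    """Restore the before images of txn_id, newest log record first."""
    for rec in reversed(log):
        if rec[0] == txn_id and rec[1] == "update":
            _, _, key, before, _after = rec
            if before is None:
                db.pop(key, None)   # the row did not exist before: remove it
            else:
                db[key] = before    # put the old value back
    return db


# Log records: (transaction, action, ENumber, before image, after image).
log = [
    ("A", "start"),
    ("A", "update", 1, "Ackman", None),    # delete
    ("A", "update", 5, None, "Lewis"),     # insert
    ("A", "update", 8, "Smith", "Jones"),  # update
]
db_after_updates = {5: "Lewis", 8: "Jones"}   # state once the updates were applied
restored = undo(log, db_after_updates, "A")
```

Reverse order matters when one transaction updates the same item more than once: the oldest before image has to be applied last.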
13
Recovery Exercise 1
Database:
  1  Ackman  8000
  2  Brown   7000
  6  Carson  6500
  8  Smith   8500
  9  Wong    7500

Transaction:
  Begin transaction B
    insert into employee values (15, Taylor, 7500)
    update employee set salary = salary * 1.1 where salary > 7000
    delete from employee where salary < 7000
    commit
  end transaction B
Log: (blank, to be filled in)
  • Using incremental log with deferred updates,
    show
  • Contents of the log
  • What to do for commit
  • What to do for abort

14
Recovery Exercise 2
Database:
  1  Ackman  8000
  2  Brown   7000
  6  Carson  6500
  8  Smith   8500
  9  Wong    7500

Transaction:
  Begin transaction B
    insert into employee values (15, Taylor, 7500)
    update employee set salary = salary * 1.1 where salary > 7000
    delete from employee where salary < 7000
    commit
  end transaction B
Log: (blank, to be filled in)
  • Using incremental log with immediate updates,
    show
  • Contents of the log
  • What to do for commit
  • What to do for abort

15
Distributed Recovery
  • Introduction
  • Recovering from Aborted Transactions
  • System Failure
  • Media Failures
  • Practical Advice
  • Summary

16
What is a System Failure?
  • Contents of main storage and I/O buffers are
    lost.
  • Database is safe.
  • Transactions in progress must be aborted.
  • Recovery approaches
  • Search entire log
  • Use a quiet point
  • Use a checkpoint

17
Recovery from System Failure by Searching the
Entire Log
  • UNDO list starts empty.
  • Search the entire log from the beginning.
  • For each BEGIN TRANSACTION, place transaction
    I.D. on the UNDO list.
  • For each COMMIT TRANSACTION, remove the
    transaction I.D. from the UNDO list.
  • Rollback transactions on the UNDO list and
    restart them.
  • What is wrong with this approach?
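
The full-log scan can be sketched as follows. The log contents are made up for illustration and `build_undo_list` is a hypothetical name. Note that the loop visits every record ever written, which is one visible cost of this approach.

```python
def build_undo_list(log):
    """Scan the whole log and return the transactions that never committed."""
    undo = []
    for txn_id, action in log:
        if action == "begin":
            undo.append(txn_id)      # BEGIN TRANSACTION: assume it must be undone
        elif action == "commit":
            undo.remove(txn_id)      # COMMIT TRANSACTION: no longer needs undo
    return undo


# Hypothetical log; T2 and T3 were still in progress at the crash.
log = [("T1", "begin"), ("T2", "begin"), ("T1", "commit"), ("T3", "begin")]
undo_list = build_undo_list(log)
```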

18
Definition: Quiet Point
  • Quiet point
  • Accept no new transactions until all current
    transactions have committed.
  • Write the quiet point to the log.
  • Pointer to the quiet point is recorded in the
    restart file.

19
Recovery from System Failure Using a Quiet Point
  • UNDO list starts empty.
  • Search log beginning with the most recent quiet
    point.
  • For each BEGIN TRANSACTION, place the transaction
    I.D. on the UNDO list.
  • For each COMMIT TRANSACTION, remove the
    transaction I.D. from the UNDO list.
  • Rollback transactions on the UNDO list and
    restart them.
  • What is wrong with this approach?

20
Making a Checkpoint
  • Force log info to log
  • Create CHECKPOINT entry on the log
  • CHECKPOINT entry contains I.D.s of all active
    transactions
  • Pointer to the checkpoint is recorded in the
    restart file
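
As a rough sketch of these checkpoint steps, assume the stable log is a list, the unwritten log records sit in a separate buffer list, and the restart file is a dict. All names here are hypothetical.

```python
def make_checkpoint(log_buffer, stable_log, active_txns, restart_file):
    """Write a checkpoint entry and record its position in the restart file."""
    stable_log.extend(log_buffer)     # force buffered log info to the log
    log_buffer.clear()
    stable_log.append(("checkpoint", sorted(active_txns)))  # I.D.s of active transactions
    restart_file["checkpoint_pos"] = len(stable_log) - 1    # pointer to the checkpoint
    return restart_file["checkpoint_pos"]


stable_log = [("T1", "begin"), ("T1", "commit")]
log_buffer = [("T2", "begin"), ("T3", "begin")]
restart_file = {}
pos = make_checkpoint(log_buffer, stable_log, {"T2", "T3"}, restart_file)
```

After the call, recovery can jump straight to `stable_log[pos]` instead of scanning from the beginning.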

21
Recovery from a System Failure Using a Checkpoint
  • UNDO list = transaction I.D.s from the most
    recent CHECKPOINT entry
  • Search the log beginning with the most recent
    checkpoint.
  • For each BEGIN TRANSACTION, place the transaction
    I.D. on the UNDO list.
  • For each COMMIT TRANSACTION, remove the I.D. from
    the UNDO list.
  • Rollback transactions on the UNDO list and
    restart them.
  • Something is still not right.
  • Due to system delays, updated values of committed
    transactions might not be written to the database
    before the system crashes.

22
Recovery from a System Failure Using a Checkpoint
(Revised)
  • UNDO list = transaction I.D.s in the most recent
    checkpoint entry
  • REDO list starts empty.
  • Search the log beginning with the most recent
    checkpoint record.
  • For each BEGIN TRANSACTION, place the transaction
    I.D. on the UNDO list.
  • For each COMMIT, move the transaction I.D. from
    the UNDO list to the REDO list.
  • For each transaction on the UNDO list, rollback.
  • For each transaction on the REDO list, force log
    info to the database.

23
Incremental Logs With Immediate Updates Example
[Timeline figure: transactions T1 through T5 plotted against time, with the checkpoint and the system crash marked.]
  • Log: <T1, Start> <T2, Start> <T3, Start> <T1,
    Commit> <Checkpoint: T2, T3> <T2, Commit> <T4,
    Start> <T5, Start> <T4, Commit> System Crash
  • Recovery: Redo T2 and T4; undo T3 and T5.
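
The revised checkpoint procedure can be checked against this example with a short sketch. The tuple encoding of the log and the `recover_from_checkpoint` name are assumptions. The checkpoint position would normally come from the restart file and is hard-coded here.

```python
def recover_from_checkpoint(log, checkpoint_pos):
    """Return (redo, undo) lists by scanning forward from the checkpoint."""
    _kind, active = log[checkpoint_pos]
    undo = list(active)          # UNDO starts with the checkpoint's active I.D.s
    redo = []                    # REDO starts empty
    for txn_id, action in log[checkpoint_pos + 1:]:
        if action == "start":
            undo.append(txn_id)
        elif action == "commit":
            undo.remove(txn_id)  # move the I.D. from UNDO to REDO
            redo.append(txn_id)
    return redo, undo


log = [
    ("T1", "start"), ("T2", "start"), ("T3", "start"), ("T1", "commit"),
    ("checkpoint", ["T2", "T3"]),
    ("T2", "commit"), ("T4", "start"), ("T5", "start"), ("T4", "commit"),
]
redo, undo = recover_from_checkpoint(log, 4)
```

T1 committed before the checkpoint, so it appears on neither list. The scan never looks at records older than the checkpoint entry.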

24
Recovery in a Distributed DBMS
  • Use a global checkpoint
  • Set of local checkpoints performed at all sites
  • If a subtransaction of Transaction A is contained
    in a local checkpoint, then all other
    subtransactions of Transaction A are included in
    some local checkpoint.
  • Recovery
  • Determine the most recent local checkpoint at the
    failed site
  • Force all sites to recover from the same
    checkpoint
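
One way to read the global checkpoint condition is as a check over the local checkpoints: a transaction may not be recorded at some of its sites and missing at others. The sketch below encodes that reading. The data layout and the function name are assumptions, and real systems coordinate this with messages between sites rather than a central table.

```python
def is_global_checkpoint_consistent(local_checkpoints, subtransaction_sites):
    """local_checkpoints: site -> set of transaction I.D.s in its local checkpoint.
    subtransaction_sites: transaction I.D. -> set of sites running a subtransaction."""
    for txn, sites in subtransaction_sites.items():
        recorded = {s for s in sites if txn in local_checkpoints.get(s, set())}
        if recorded and recorded != sites:
            return False   # txn is in one local checkpoint but missing from another
    return True


local_checkpoints = {"site1": {"A", "B"}, "site2": {"A"}}
ok = is_global_checkpoint_consistent(
    local_checkpoints, {"A": {"site1", "site2"}, "B": {"site1"}})
broken = is_global_checkpoint_consistent(
    local_checkpoints, {"A": {"site1", "site2"}, "B": {"site1", "site2"}})
```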

25
Recovery Exercise
[Timeline figure: transactions A through H plotted against time, with the checkpoint and the system crash marked.]
  • Log: <A, start> <B, start> <E, commit>
    <A, commit> <F, start> <C, start> <G,
    start> <C, commit> <B, commit> <D,
    start> <H, start> <E, start> <F,
    commit> <Checkpoint, B, D, E> <H, commit>
  • Which transactions should be redone?
  • Which transactions should be undone?

26
Distributed Recovery
  • Introduction
  • Recovering from Aborted Transactions
  • System Failure
  • Media Failures
  • Practical Advice
  • Summary

27
Media Failures: Secondary Memory Is Lost
  • Restore the database from an archive.
  • Using log, redo transactions run since the
    archive was recorded.
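
A minimal sketch of media recovery, assuming the archive is a dict snapshot and the log holds only records written since that archive was taken. The record layout and the `recover_from_media_failure` name are hypothetical.

```python
def recover_from_media_failure(archive, log):
    """Rebuild the database from the archive plus committed work in the log."""
    db = dict(archive)                                   # restore from the archive
    committed = {r[0] for r in log if r[1] == "commit"}
    for r in log:                                        # redo in original log order
        if r[1] == "write" and r[0] in committed:
            _, _, key, after = r
            if after is None:
                db.pop(key, None)                        # null after image: delete
            else:
                db[key] = after
    return db


archive = {1: "Ackman", 8: "Smith"}
log = [
    ("A", "start"), ("A", "write", 8, "Jones"), ("A", "commit"),
    ("B", "start"), ("B", "write", 1, None),             # B never committed
]
recovered = recover_from_media_failure(archive, log)
```

Only transaction A is redone. B has no commit record, so its write is skipped.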

28
Media Failures: Secondary Memory and Log Are Lost
  • Restore the database to the most recent archive.
  • Apply the portion of the log that is undamaged.
  • Look for a new job!

29
Distributed Recovery
  • Introduction
  • Recovering from Aborted Transactions
  • System Failure
  • Media Failures
  • Practical Advice
  • Summary

30
Some Transactions Cannot Be Rolled Back and
Restarted
  • Withdraw funds from a bank
  • Print a paycheck
  • Fill and ship an order
  • Etc.

31
Recovery from Deviant Transactions
  • Obvious approach
  • Undo transactions back to the deviant transaction
  • Undo the deviant transaction
  • Force log info after the deviant transaction into
    the database
  • Will not always work
  • A transaction executed after the deviant
    transaction may have used data written by the
    deviant transaction.
  • Hard-luck approach
  • Carefully examine database
  • Correct errors caused by the deviant transaction
  • Correct errors propagated by other transactions
  • Correct errors ASAP to avoid further database
    contamination

32
Distributed Recovery
  • Introduction
  • Recovering from Aborted Transactions
  • System Failure
  • Media Failures
  • Practical Advice
  • Summary

33
Summary
  • Most systems use incremental logs with immediate
    updates for transaction recovery.
  • Most systems use checkpoints for system recovery.
  • Most systems use archives and transaction logs
    for media recovery.