Chapter 7: Distributed Recovery - PowerPoint PPT Presentation

Provided by: Mik7253

1
Chapter 7: Distributed Recovery
2
Distributed Recovery

Introduction
Recovering from Aborted Transactions
System Failure
Media Failures
Practical Advice
Summary

3
Why Do We Need Recovery?

Operating system fails
Transaction aborts
Media fails

4
Recovery Information

Recovery information is stored to help recover
from aborts.
Before images are values of data before an
update.
After images are modified data values.
Archive is a complete copy of the database at a
point in time.

5
Transaction Execution Steps

Input transaction
Log transaction
Fetch DB record(s)
Log before image(s)
Compute new record value(s)
Log after image(s)
Log commitment
Write new DB record value(s)
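
The ordering of these steps can be sketched in Python. This is a minimal illustration under stated assumptions, not a real DBMS: the function name `run_transaction`, the tuple-shaped log records, and the dict standing in for the database are all hypothetical. It applies the steps item by item and shows that every log record, including the commit record, is appended before any new value reaches the database.

```python
def run_transaction(txn_id, updates, db, log):
    """Run one transaction following the slide's step order.

    `updates` maps an item key to a function that computes the new value
    from the old one (a hypothetical shape chosen for illustration).
    """
    log.append((txn_id, "start"))                      # log transaction
    new_values = {}
    for key, compute in updates.items():
        before = db.get(key)                           # fetch DB record
        log.append((txn_id, "before", key, before))    # log before image
        after = compute(before)                        # compute new record value
        log.append((txn_id, "after", key, after))      # log after image
        new_values[key] = after
    log.append((txn_id, "commit"))                     # log commitment
    db.update(new_values)                              # write new DB record values
```

Because the database is touched only on the last line, a failure at any earlier point leaves the old values intact and the log records what was intended.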

6
Transaction Execution Steps, cont.
[Diagram: flow of the eight execution steps among Input Transaction, Process, Obtain Record, Write Updated Record, and Output Message. The numbered arrows did not survive extraction.]
8
Recovering From Aborted Transactions

Two strategies
Incremental log with deferred updates
Incremental log with immediate updates

9
Incremental Log With Deferred Updates

Defer writes until it is assured that all writes
can be completed successfully.
Log structure:
<A, Start>
<A, item X1, after value X1>
<A, item X2, after value X2>
. . .
<A, item Xn, after value Xn>
<A, Commit>
Recovery Operations
If commit, then merge log into the database.
If abort, then do nothing.

10
Incremental Log With Deferred Updates Example
Transaction:
Begin transaction A
Delete Employee where ENumber = 1
insert into Employee (ENumber, Name) <5, "Lewis">
update Employee set Name = "Jones" where ENumber = 8
commit
end transaction A

Log (after image):
A 1 null
A 5 Lewis
A 8 Jones

commit (process deferred updates)
abort (do nothing)
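
The two recovery operations for deferred updates can be sketched as follows. This is an illustrative sketch only: the function name `apply_deferred`, the tuple layout of the log records, and the dict used as the Employee table are assumptions. The starting names for employees 1 and 8 (Ackman, Smith) are taken from the before images shown in the immediate-updates example.

```python
def apply_deferred(log, db):
    """Merge after images into db for committed transactions only.

    Assumed record shapes: (txn, "start"), (txn, "write", item, after_value),
    (txn, "commit"). An after value of None stands for the slide's "null",
    meaning the row was deleted.
    """
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    for rec in log:
        if rec[1] == "write" and rec[0] in committed:
            _, _, item, after = rec
            if after is None:
                db.pop(item, None)      # delete
            else:
                db[item] = after        # insert or update
    return db


# The slide's log for transaction A, with ENumber as the key.
log_a = [
    ("A", "start"),
    ("A", "write", 1, None),
    ("A", "write", 5, "Lewis"),
    ("A", "write", 8, "Jones"),
    ("A", "commit"),
]
```

If the commit record is missing, no write is applied, which is the "abort: do nothing" case.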
11
Incremental Log With Immediate Updates

Write updates to the DB and maintain a log of
before-and-after values for all updated items
Log structure:
<A, Start>
<A, X1, before value X1, after value X1>
<A, X2, before value X2, after value X2>
. . .
<A, Xn, before value Xn, after value Xn>
<A, Commit>

12
Incremental Log With Immediate Updates Example
Transaction:
Begin transaction A
Delete Employee where ENumber = 1
insert into Employee (ENumber, Name) <5, "Lewis">
update Employee set Name = "Jones" where ENumber = 8
commit
end transaction A

Log (before image, after image):
A 1 Ackman null
A null 5 Lewis
A 8 Smith 8 Jones

abort (process in reverse order)
(save log)
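
Processing the log in reverse order on abort can be sketched as below. The function name `undo_transaction`, the record layout, and the dict used as the Employee table are assumptions made for illustration. Each record carries a before and an after image, and the undo pass restores the before images from the newest record to the oldest.

```python
def undo_transaction(log, db, txn):
    """Roll back `txn` by restoring before images in reverse log order.

    Assumed record shape: (txn, "update", item, before_value, after_value).
    A before value of None stands for "null": the row did not exist.
    """
    for rec in reversed(log):
        if rec[0] == txn and rec[1] == "update":
            _, _, item, before, _after = rec
            if before is None:
                db.pop(item, None)      # undo an insert
            else:
                db[item] = before       # undo a delete or an update
    return db


# The slide's log for transaction A, with ENumber as the key.
log_a = [
    ("A", "start"),
    ("A", "update", 1, "Ackman", None),
    ("A", "update", 5, None, "Lewis"),
    ("A", "update", 8, "Smith", "Jones"),
]
```

Reverse order matters when one transaction updates the same item more than once: the oldest before image must be the last one restored.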
13
Recovery Exercise 1
Database:
1 Ackman 8000
2 Brown 7000
6 Carson 6500
8 Smith 8500
9 Wong 7500

Transaction:
Begin transaction B
insert into employee values (15, Taylor, 7500)
update employee set salary = salary * 1.1 where salary > 7000
delete from employee where salary < 7000
commit
end transaction B

Log:

Using incremental log with deferred updates,
show
Contents of the log
What to do for commit
What to do for abort

14
Recovery Exercise 2
Database:
1 Ackman 8000
2 Brown 7000
6 Carson 6500
8 Smith 8500
9 Wong 7500

Transaction:
Begin transaction B
insert into employee values (15, Taylor, 7500)
update employee set salary = salary * 1.1 where salary > 7000
delete from employee where salary < 7000
commit
end transaction B

Log:

Using incremental log with immediate updates,
show
Contents of the log
What to do for commit
What to do for abort

16
What is a System Failure?

Contents of main storage and I/O buffers are
lost.
Database is safe.
Transactions in progress must be aborted.
Recovery approaches
Search entire log
Use a quiet point
Use a checkpoint

17
Recovery from System Failure by Searching the
Entire Log

UNDO = empty
Search the entire log from the beginning.
For each BEGIN TRANSACTION, place transaction
I.D. on the UNDO list.
For each COMMIT TRANSACTION, remove the
transaction I.D. from the UNDO list.
Rollback transactions on the UNDO list and
restart them.
What is wrong with this approach?

18
Definition: Quiet Point

Quiet point
Accept no new transactions until all current
transactions have committed.
Write the quiet point to the log.
Pointer to the quiet point is recorded in the
restart file.

19
Recovery from System Failure Using a Quiet Point

UNDO = empty
Search log beginning with the most recent quiet
point.
For each BEGIN TRANSACTION, place the transaction
I.D. on the UNDO list.
For each COMMIT TRANSACTION, remove the
transaction I.D. from the UNDO list.
Rollback transactions on the UNDO list and
restart them.
What is wrong with this approach?

20
Making a Checkpoint

Force buffered log info to the log
Create CHECKPOINT entry on the log
CHECKPOINT entry contains I.D.s of all active
transactions
Pointer to the checkpoint is recorded in the
restart file
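
These steps can be sketched in a few lines of Python. The sketch is illustrative only: `make_checkpoint`, the in-memory `log_buffer`, the list standing in for the log, and the dict standing in for the restart file are all assumptions, not part of any particular DBMS.

```python
def make_checkpoint(log, log_buffer, active_txns, restart_file):
    """Write a checkpoint entry and remember where it is.

    `log` is a list standing in for the stable log, `log_buffer` holds
    records not yet forced out, and `restart_file` is a dict standing in
    for the restart file (all hypothetical stand-ins).
    """
    log.extend(log_buffer)                              # force log info to the log
    log_buffer.clear()
    log.append(("checkpoint", sorted(active_txns)))     # entry lists active I.D.s
    restart_file["checkpoint_pos"] = len(log) - 1       # pointer to the checkpoint
```

After a crash, recovery would read `checkpoint_pos` from the restart file and start scanning the log from that position instead of from the beginning.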

21
Recovery from a System Failure Using a Checkpoint

UNDO = transaction I.D.s from the most recent
CHECKPOINT
Search the log beginning with the most recent
checkpoint.
For each BEGIN TRANSACTION, place the transaction
I.D. on the UNDO list.
For each COMMIT TRANSACTION, remove the I.D. from
the UNDO list.
Rollback transactions on the UNDO list and
restart them.
Something is still not right.
Due to system delays, updated values of committed
transactions might not be written to the database
before the system crashes.

22
Recovery from a System Failure Using a Checkpoint
(Revised)

UNDO = transaction I.D.s in the most recent
checkpoint entry
REDO = empty
Search the log beginning with the most recent
checkpoint record.
For each BEGIN TRANSACTION, place the transaction
I.D. on the UNDO list.
For each COMMIT, move the transaction I.D. from
the UNDO list to the REDO list.
For each transaction on the UNDO list, rollback.
For each transaction on the REDO list, force log
info to the database.

23
Incremental Logs With Immediate Updates Example
[Timeline: transactions T1 through T5 plotted against time, with the Checkpoint and the System Crash marked.]

Log:
<T1, Start>
<T2, Start>
<T3, Start>
<T1, Commit>
<Checkpoint, T2, T3>
<T2, Commit>
<T4, Start>
<T5, Start>
<T4, Commit>
System Crash

Recovery: Redo T2 and T4. Undo T3 and T5.
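
The revised checkpoint procedure can be run mechanically against this log. The sketch below is an illustration under assumptions: the function name `classify` and the tuple encoding of the log records are hypothetical, and the log is assumed to contain at least one checkpoint record.

```python
def classify(log):
    """Build the UNDO and REDO lists from the most recent checkpoint onward.

    Assumed record shapes: (txn, "start"), (txn, "commit"), and
    ("checkpoint", [active transaction I.D.s]). Raises ValueError if the
    log holds no checkpoint record.
    """
    start = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
    undo = list(log[start][1])          # UNDO = I.D.s in the checkpoint entry
    redo = []                           # REDO = empty
    for txn, action in log[start + 1:]:
        if action == "start":
            undo.append(txn)
        elif action == "commit" and txn in undo:
            undo.remove(txn)            # move from UNDO to REDO
            redo.append(txn)
    return undo, redo


crash_log = [
    ("T1", "start"), ("T2", "start"), ("T3", "start"), ("T1", "commit"),
    ("checkpoint", ["T2", "T3"]),
    ("T2", "commit"), ("T4", "start"), ("T5", "start"), ("T4", "commit"),
]
```

On `crash_log` the scan leaves T3 and T5 on the UNDO list and T2 and T4 on the REDO list, matching the recovery stated on the slide. T1 appears on neither list because it committed before the checkpoint.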

24
Recovery in a Distributed DBMS

Use a global checkpoint
Set of local checkpoints performed at all sites
If a subtransaction of Transaction A is contained
in a local checkpoint, then all other
subtransactions of Transaction A are included in
some local checkpoint.
Recovery
Determine the most recent local checkpoint at the
failed site
Force all sites to recover from the same
checkpoint
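
The condition on a global checkpoint can be expressed as a small check. This sketch rests on one reading of the slide: if a transaction appears in one site's local checkpoint, it must also appear in the local checkpoint of every other site where it has a subtransaction. The function name and both input dicts are assumptions made for illustration.

```python
def is_global_checkpoint(local_checkpoints, subtxn_sites):
    """Check that a set of local checkpoints forms a global checkpoint.

    local_checkpoints: site -> set of transaction I.D.s in that site's
                       local checkpoint.
    subtxn_sites:      transaction I.D. -> set of sites where it runs a
                       subtransaction.
    Both shapes are hypothetical stand-ins.
    """
    for site, txns in local_checkpoints.items():
        for txn in txns:
            for other in subtxn_sites.get(txn, {site}):
                if txn not in local_checkpoints.get(other, set()):
                    return False        # a subtransaction is missing elsewhere
    return True
```

A set of local checkpoints that fails this check cannot be used to bring all sites back to the same point, since one site would know about a transaction that another site's checkpoint does not record.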

25
Recovery Exercise
[Timeline: transactions A through H plotted against time, with the Checkpoint and the System Crash marked.]

Log:
<A, start>
<B, start>
<E, commit>
<A, commit>
<F, start>
<C, start>
<G, start>
<C, commit>
<B, commit>
<D, start>
<H, start>
<E, start>
<F, commit>
<Checkpoint, B, D, E>
<H, commit>

Which transactions should be redone?
Which transactions should be undone?

27
Media Failures: Secondary Memory Is Lost

Restore the database from an archive.
Using the log, redo the transactions run since the
archive was recorded.
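
Media recovery from an archive plus the log can be sketched as follows. This is a minimal sketch under assumptions: `recover_from_archive`, the dict used as the archive copy, and the record layout are hypothetical, the log is assumed to hold only records written after the archive was taken, and the name "Adams" is made-up sample data.

```python
import copy


def recover_from_archive(archive, log):
    """Restore the archive copy, then redo committed work from the log.

    Assumed record shapes: (txn, "start"), (txn, "update", item, after_value),
    (txn, "commit"). The archive itself is left untouched.
    """
    db = copy.deepcopy(archive)                         # restore from the archive
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    for rec in log:
        if rec[1] == "update" and rec[0] in committed:
            _, _, item, after = rec
            db[item] = after                            # redo the update
    return db


archive_copy = {1: "Ackman", 8: "Smith"}
log_since_archive = [
    ("A", "start"), ("A", "update", 8, "Jones"), ("A", "commit"),
    ("B", "start"), ("B", "update", 1, "Adams"),        # B never committed
]
```

Transaction B has no commit record, so its update is not reapplied.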

28
Media Failures: Secondary Memory and Log Are Lost

Restore the database to the most recent archive.
Apply the portion of the log that is undamaged.
Look for a new job!

30
Some Transactions Cannot Be Rolled Back and
Restarted

Withdraw funds from a bank
Print a paycheck
Fill and ship an order
Etc.

31
Recovery from Deviant Transactions

Obvious approach
Undo transactions back to the deviant transaction
Undo the deviant transaction
Force log info after the deviant transaction into
the database
Will not always work
A transaction executed after the deviant
transaction may have used data written by the
deviant transaction.
Hard-luck approach
Carefully examine database
Correct errors caused by the deviant transaction
Correct errors propagated by other transactions
Correct errors ASAP to avoid further database
contamination
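
The reason the obvious approach can fail is that contamination spreads through reads. A rough sketch of tracing that spread is shown below. The function name `contaminated`, the history format, and the item names are assumptions made for illustration. The trace is conservative: an item stays marked once tainted, even if a clean transaction later overwrites it.

```python
def contaminated(history, deviant):
    """Find transactions and items affected by a deviant transaction.

    `history` lists (txn, items_read, items_written) in execution order
    (a hypothetical format). A transaction is contaminated if it is the
    deviant one or if it read any item already marked as tainted.
    """
    tainted_items = set()
    tainted_txns = set()
    for txn, reads, writes in history:
        if txn == deviant or tainted_items & set(reads):
            tainted_txns.add(txn)
            tainted_items |= set(writes)    # everything it wrote is suspect
    return tainted_txns, tainted_items


history = [
    ("T1", {"x"}, {"x"}),
    ("T2", {"x"}, {"y"}),
    ("T3", {"z"}, {"z"}),
    ("T4", {"y"}, {"w"}),
]
```

With T1 as the deviant transaction, T2 is affected directly and T4 indirectly through T2, while T3 is untouched. The longer the error goes uncorrected, the larger both sets grow.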

33
Summary

Most systems use incremental logs with immediate
updates for transaction recovery.
Most systems use checkpoints for system recovery.
Most systems use archives and transaction logs
for media recovery.
