Title: Checkpointing and Recovery in Distributed Systems
The Main Idea
- Processes take checkpoints to store the work they have done so far.
- A checkpoint of a process contains all the data necessary to restart the process from that point.
- When a process fails and restarts, the system may enter an inconsistent state.
- Recovery involves restoring the system to a consistent state.
  - This may require other processes to restart their execution from earlier checkpoints.
Koo and Toueg's Algorithm
- Assumptions
  - All channels are FIFO.
  - All channels are bidirectional.
  - All channels are reliable.
  - Communication topology need not be a complete graph.
Koo and Toueg's Algorithm (Contd.)
- Ensures that the last checkpoints of any two processes are concurrent.
- Consequently
  - No process has to roll back beyond its last checkpoint.
  - Checkpointing by one process may cause other processes to take a checkpoint as well.
  - Recovery involves rolling back processes to their last checkpoints.
Koo and Toueg's Algorithm (Contd.)
- At any given time, multiple instances of the checkpointing and recovery algorithms may be in progress.
- A process participates in at most one instance (checkpointing or recovery) at any given time.
  - It will refuse to participate in other instances, thereby causing them to abort.
  - Aborted instances are restarted later by their initiators.
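The "at most one instance" rule can be sketched as a small guard object. This is only an illustration: the class name `InstanceGuard`, its methods, and the instance identifiers are hypothetical and do not come from the algorithm's description.

```python
class InstanceGuard:
    """Tracks the single checkpointing/recovery instance a process is part of."""

    def __init__(self):
        self.active = None  # identifier of the instance currently joined, if any

    def try_join(self, instance_id):
        # Accept only when idle or already part of this same instance.
        if self.active is None or self.active == instance_id:
            self.active = instance_id
            return True
        # Refusal: the initiator of instance_id is expected to abort and retry later.
        return False

    def leave(self, instance_id):
        # Called when the instance commits or aborts.
        if self.active == instance_id:
            self.active = None
```

A refused initiator would abort its instance and restart it later, as described above.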
Koo and Toueg's Checkpointing Algorithm
- Consists of two phases.
- First Phase
  - Processes take tentative checkpoints if they can.
  - A process, after taking a tentative checkpoint, cannot send any application messages until the second phase completes.
- Second Phase
  - If all required processes are able to take checkpoints in the first phase, then tentative checkpoints are made permanent.
  - Otherwise, tentative checkpoints are discarded.
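The two phases can be sketched as follows. This is a simplified, single-threaded illustration, not the full protocol: the `Participant` class, the `can_checkpoint` flag (which models a process that refuses), and the integer `state` are assumptions made for the example, and message passing is replaced by direct method calls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    name: str
    can_checkpoint: bool = True        # hypothetical flag: False models a refusal
    state: int = 0                     # stand-in for the application state
    tentative: Optional[int] = None
    permanent: Optional[int] = None
    sending_blocked: bool = False

    def take_tentative(self) -> bool:
        """Phase 1: take a tentative checkpoint if possible."""
        if not self.can_checkpoint:
            return False
        self.tentative = self.state
        self.sending_blocked = True    # no application messages until phase 2 ends
        return True

    def finish(self, commit: bool) -> None:
        """Phase 2: make the tentative checkpoint permanent, or discard it."""
        if commit and self.tentative is not None:
            self.permanent = self.tentative
        self.tentative = None
        self.sending_blocked = False


def run_checkpoint(participants) -> bool:
    # Phase 1: every required process tries to take a tentative checkpoint.
    replies = [p.take_tentative() for p in participants]
    commit = all(replies)
    # Phase 2: commit only if everyone succeeded, otherwise discard.
    for p in participants:
        p.finish(commit)
    return commit
```

With two willing participants the call returns `True` and both `permanent` fields are updated; a single refusal makes every participant discard its tentative checkpoint.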
Koo and Toueg's Checkpointing Algorithm (Contd.)
- Minimizes the number of processes that take checkpoints.
- Each process assigns labels with monotonically increasing values to its messages.
- ⊥ is a special label.
  - It is smaller than any other label value.
- Each process maintains two vectors with one entry for each of its neighbors:
  - last_label_rcvd
  - first_label_sent
Koo and Toueg's Checkpointing Algorithm: Details
- Consider processes X and Y that are neighbors.
- Definition of last_label_rcvd_X[Y]
  - Let m be the last message that X has received from Y since its last permanent/tentative checkpoint.
  - If m exists, then last_label_rcvd_X[Y] is the label of m.
  - Otherwise, last_label_rcvd_X[Y] is ⊥.
Koo and Toueg's Checkpointing Algorithm: Details (Contd.)
- Definition of first_label_sent_X[Y]
  - Let m be the first message that X has sent to Y since its last permanent/tentative checkpoint.
  - If m exists, then first_label_sent_X[Y] is the label of m.
  - Otherwise, first_label_sent_X[Y] is ⊥.
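The bookkeeping behind the two vectors might look like the sketch below. It assumes integer labels starting at 1, with `BOTTOM = 0` standing in for the special smallest label; the class name `LabelTracker` and its method names are hypothetical.

```python
BOTTOM = 0  # assumption: stands in for the special smallest label; real labels start at 1


class LabelTracker:
    """Per-process label bookkeeping for the checkpointing algorithm."""

    def __init__(self, neighbors):
        self.next_label = 1
        self.last_label_rcvd = {n: BOTTOM for n in neighbors}
        self.first_label_sent = {n: BOTTOM for n in neighbors}

    def on_send(self, to):
        # Labels increase monotonically across all messages this process sends.
        label = self.next_label
        self.next_label += 1
        # Record only the first message sent to this neighbor since the last checkpoint.
        if self.first_label_sent[to] == BOTTOM:
            self.first_label_sent[to] = label
        return label

    def on_receive(self, sender, label):
        # FIFO channels: the most recent arrival carries the largest label so far.
        self.last_label_rcvd[sender] = label

    def on_checkpoint(self):
        # Both vectors describe activity since the last checkpoint, so reset them.
        for n in self.last_label_rcvd:
            self.last_label_rcvd[n] = BOTTOM
            self.first_label_sent[n] = BOTTOM
```

Resetting on every checkpoint (tentative or permanent) follows the "since its last permanent/tentative checkpoint" wording of the definitions.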
Koo and Toueg's Checkpointing Algorithm: Details (Contd.)
- Assume that X has taken a (tentative) checkpoint.
- Y does not need to take a checkpoint if last_label_rcvd_X[Y] = ⊥.
- Otherwise, X requests Y to take a checkpoint and sends last_label_rcvd_X[Y] to Y.
- Y takes a (tentative) checkpoint if
  - last_label_rcvd_X[Y] ≥ first_label_sent_Y[X] > ⊥
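The two tests on this slide reduce to a pair of comparisons. A minimal sketch, assuming integer labels with `BOTTOM = 0` as the special smallest label (the function names are hypothetical):

```python
BOTTOM = 0  # assumption: stands in for the special smallest label


def should_request(last_label_rcvd_x_y):
    """X asks Y to checkpoint only if X received something from Y since its last checkpoint."""
    return last_label_rcvd_x_y != BOTTOM


def must_checkpoint(last_label_rcvd_x_y, first_label_sent_y_x):
    """Y checkpoints if X's checkpoint records a message that Y sent after Y's last checkpoint."""
    return last_label_rcvd_x_y >= first_label_sent_y_x > BOTTOM
```

Intuitively, if Y did not checkpoint in that case, X's checkpoint would record the receipt of a message whose send is not recorded in Y's last checkpoint.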
Koo and Toueg's Checkpointing Algorithm: An Illustration
[Figure: communication topology among processes W, X, Y and Z (used in all illustrations).]
Koo and Toueg's Checkpointing Algorithm: An Illustration (Contd.)
[Figure: example execution of processes W, X, Y and Z with labeled messages. Each process is annotated with two vectors: the first is last_label_rcvd and the second is first_label_sent.]
Koo and Toueg's Recovery Algorithm
- Only permanent checkpoints are used in recovery.
- Consists of two phases.
- First Phase
  - Processes agree to roll back if they can.
  - A process, after agreeing to roll back, stops its execution until the second phase completes.
- Second Phase
  - If all required processes agree to roll back in the first phase, then they restart their execution from the last checkpoint.
  - Otherwise, processes resume their execution from their current point.
Koo and Toueg's Recovery Algorithm: Details
- Consider processes X and Y that are neighbors.
- Definition of last_label_sent_X[Y]
  - Let m be the last message that X sent to Y before its last permanent checkpoint.
  - If m exists, then last_label_sent_X[Y] is the label of m.
  - Otherwise, last_label_sent_X[Y] is ⊥.
Koo and Toueg's Recovery Algorithm: Details (Contd.)
- Assume that X has agreed to roll back.
- X requests Y to roll back and sends last_label_sent_X[Y] to Y.
- Y agrees to roll back if
  - last_label_rcvd_Y[X] > last_label_sent_X[Y]
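The rollback test can be applied transitively to find which processes are pulled into a recovery. The sketch below covers only that propagation and leaves out the two-phase agreement and refusals; the function name `rollback_set` and the nested-dictionary layout are assumptions made for the example.

```python
from collections import deque


def rollback_set(initiator, neighbors, last_label_rcvd, last_label_sent):
    """Return the processes that must roll back when `initiator` rolls back.

    last_label_rcvd[y][x]: label of the last message y received from x.
    last_label_sent[x][y]: label of the last message x sent to y before
                           x's last permanent checkpoint.
    """
    agreed = {initiator}
    queue = deque([initiator])
    while queue:
        x = queue.popleft()
        for y in neighbors[x]:
            # y has seen a message that x's rollback will undo.
            if y not in agreed and last_label_rcvd[y][x] > last_label_sent[x][y]:
                agreed.add(y)
                queue.append(y)
    return agreed
```

For example, if X's checkpoint covers labels up to 2 on the channel to Y but Y has already received label 3, Y joins the rollback; Z stays out as long as everything it received from Y was sent before Y's checkpoint.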
Koo and Toueg's Recovery Algorithm: An Illustration
[Figure: recovery example on processes W, X, Y and Z. Each process is annotated with two vectors: the first is last_label_rcvd and the second is last_label_sent.]
Juang and Venkatesan's Algorithm
- Assumptions
  - All channels are FIFO.
  - All channels are bidirectional.
  - All channels are reliable.
  - Communication topology need not be a complete graph.
  - A process changes its state only on receiving a message (except initially).
Juang and Venkatesan's Checkpointing Algorithm
- A process takes a checkpoint every time it executes an event.
- Checkpoints are taken in volatile storage.
- Periodically, checkpoints in volatile storage are flushed to stable storage.
- Checkpoints in volatile storage are lost when a failure occurs.
- A checkpoint consists of
  - the local state just before a message is received, and
  - the message received.
Juang and Venkatesan's Checkpointing Algorithm (Contd.)
- A checkpoint can be used to recover the state just after the message is received, by applying the logged message to the saved local state.
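A per-event log of this kind could be sketched as below. The `EventLog` class, its `transition` function (a deterministic handler mapping a state and a message to the next state), and the method names are all hypothetical; the sketch relies on the assumption that a process changes state only when it receives a message.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class EventLog:
    transition: Callable[[Any, Any], Any]   # hypothetical handler: (state, msg) -> new state
    state: Any
    volatile: List[Tuple[Any, Any]] = field(default_factory=list)
    stable: List[Tuple[Any, Any]] = field(default_factory=list)

    def __post_init__(self):
        self.initial = self.state

    def on_message(self, msg):
        # Checkpoint = (local state just before the receive, the message itself).
        self.volatile.append((self.state, msg))
        self.state = self.transition(self.state, msg)

    def flush(self):
        # Periodically move volatile checkpoints to stable storage.
        self.stable.extend(self.volatile)
        self.volatile.clear()

    def crash_and_restart(self):
        # Volatile checkpoints are lost; rebuild from the last stable one.
        self.volatile.clear()
        if self.stable:
            before, msg = self.stable[-1]
            self.state = self.transition(before, msg)
        else:
            self.state = self.initial
```

Replaying the logged message on the saved state reproduces the state just after that receive, which is why the pair is sufficient as a checkpoint.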
Juang and Venkatesan's Recovery Algorithm
- Each process maintains two vectors with one entry for every neighbor:
  - SENT stores the number of messages sent to each neighbor so far.
  - RCVD stores the number of messages received from each neighbor so far.
Juang and Venkatesan's Recovery Algorithm (Contd.)
- Consider a collection of checkpoints, one from each process P1, P2, ..., PN, given by
  - ckpt1, ckpt2, ..., ckptN
- The checkpoints form a consistent global state if, for each pair of neighbors Pi and Pj, the following holds:
  - SENT(ckpti)[j] ≥ RCVD(ckptj)[i]
- Otherwise, Pj's state is inconsistent with that of Pi: Pj has recorded receiving more messages from Pi than Pi has recorded sending.
  - The inconsistency can be removed by rolling back Pj.
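The pairwise condition is easy to check mechanically. A minimal sketch, assuming the SENT and RCVD vectors of the chosen checkpoints are given as nested dictionaries keyed by process name (the function name `inconsistent_pairs` is hypothetical):

```python
def inconsistent_pairs(sent, rcvd, neighbors):
    """Return the (i, j) pairs where P_j received more from P_i than P_i sent.

    sent[i][j]: messages P_i has sent to P_j according to P_i's checkpoint.
    rcvd[j][i]: messages P_j has received from P_i according to P_j's checkpoint.
    """
    bad = []
    for i in neighbors:
        for j in neighbors[i]:
            if sent[i][j] < rcvd[j][i]:
                bad.append((i, j))  # P_j would have to roll back
    return bad
```

An empty result means the collection of checkpoints forms a consistent global state under this condition.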
Juang and Venkatesan's Recovery Algorithm (Contd.)
- All processes participate in recovery.
- The failed process, on restarting, rolls back to its last stable checkpoint and instructs all processes to start recovery using flooding.
- The recovery algorithm executes in iterations.
  - In each iteration, every process sends to each of its neighbors the number of messages it has sent to that neighbor as per its current state.
  - A process rolls back if its state is inconsistent with that of its neighbor. It rolls back to its latest checkpoint that removes the inconsistency.
- The system is guaranteed to be in a consistent state after N-1 iterations (N is the number of processes).
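The iterations can be simulated in a few lines. This is a simplified, synchronous sketch rather than the distributed algorithm itself: each checkpoint is reduced to its (SENT, RCVD) pair, the oldest checkpoint of every process is assumed to have all-zero RCVD counts, and the function name `recover` and its arguments are hypothetical.

```python
def recover(checkpoints, neighbors, current):
    """Simulate the recovery iterations and return the final checkpoint indices.

    checkpoints[p]: list of (SENT, RCVD) dictionaries for process p, oldest first.
    current[p]:     index of the checkpoint p starts recovery from.
    """
    current = dict(current)
    iterations = len(checkpoints) - 1  # N-1 iterations, as stated above
    for _ in range(iterations):
        # Every process announces its SENT counts as of the start of the iteration.
        announced = {p: checkpoints[p][current[p]][0] for p in checkpoints}
        for p in checkpoints:
            k = current[p]
            # Step back until no neighbor q has sent fewer messages than p has received.
            while any(checkpoints[p][k][1][q] > announced[q][p] for q in neighbors[p]):
                k -= 1
            current[p] = k
    return current
```

On a chain X, Y, Z where X has lost the send of one message to Y, the first iteration rolls Y back and the second rolls Z back, which shows why more than one iteration can be needed.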
Juang and Venkatesan's Recovery Algorithm: An Illustration
[Figure: recovery example on processes W, X, Y and Z with checkpoints cw0 to cw2, cx0 to cx1, cy0 to cy2 and cz0 to cz2, showing the flooding step followed by Iteration 1, Iteration 2 and Iteration 3.]