Title: Self-stabilization
1. Self-stabilization
2. What is Self-stabilization?
- Technique for spontaneous healing after a transient failure or perturbation.
- Non-masking tolerance (forward error recovery).
- Guarantees eventual safety following failures.
- Feasibility demonstrated by Dijkstra in his 1974 Communications of the ACM article.
3. Why Self-stabilizing systems?
- Recover from any initial configuration to a legitimate configuration in a bounded number of steps, as long as the program code itself is not corrupted.
- The ability to spontaneously recover from any initial state implies that no initialization is ever required.
- Such systems can be deployed ad hoc, and are guaranteed to function properly in bounded time.
4. Two properties
- A self-stabilizing system satisfies the following two criteria:
- Convergence. Starting from a bad configuration, every computation leads to a legitimate configuration.
- Closure. Once in a legitimate configuration, the system remains in a legitimate configuration, unless there is another transient failure.
5. Examples of Self-stabilizing systems
- We discussed at least one such system while discussing clock phase synchronization on an array of clocks that tick synchronously.
- We will discuss a couple of others now.
6. Example 1: Stabilizing mutual exclusion (Dijkstra 1974)
[Figure: a unidirectional ring of processes 0, 1, 2, ..., N-1]
Consider a unidirectional ring of processes. In the legal configuration, exactly one token circulates in the network.
7. Stabilizing mutual exclusion on a ring
The state of process j is $x_j \in \{0, 1, 2, \ldots, K-1\}$.
Process 0: do $x_0 = x_{N-1} \rightarrow x_0 := (x_0 + 1) \bmod K$ od
Process j > 0: do $x_j \neq x_{j-1} \rightarrow x_j := x_{j-1}$ od
(TOKEN = ENABLED GUARD)
Hand-execute this first, before reading further. Start the system from an arbitrary initial configuration.
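After hand-executing the program, a small simulation can be used to cross-check the result. The following is a minimal sketch, not part of the original slides: it assumes a central scheduler that fires one randomly chosen enabled process per step, and the names `enabled`, `step`, `tokens`, and `run` as well as the values N = 6 and K = 7 are chosen only for illustration.

```python
import random

def enabled(x, j):
    """True if process j holds a token, i.e. its guard is enabled."""
    if j == 0:
        return x[0] == x[-1]          # x[-1] is x[N-1], the left neighbor of 0
    return x[j] != x[j - 1]

def step(x, j, K):
    """Execute the action of process j in place."""
    if j == 0:
        x[0] = (x[0] + 1) % K
    else:
        x[j] = x[j - 1]

def tokens(x):
    """Processes whose guards are currently enabled."""
    return [j for j in range(len(x)) if enabled(x, j)]

def run(x, K, rng, max_steps=10_000):
    """Fire random enabled processes until exactly one token remains."""
    steps = 0
    while len(tokens(x)) != 1 and steps < max_steps:
        step(x, rng.choice(tokens(x)), K)
        steps += 1
    return steps

rng = random.Random(1)
N, K = 6, 7                            # illustrative sizes with K > N
x = [rng.randrange(K) for _ in range(N)]   # arbitrary initial configuration
print("start:", x, "tokens at:", tokens(x))
print("steps to stabilize:", run(x, K, rng))
print("end:  ", x, "tokens at:", tokens(x))
```

Changing the seed gives a different arbitrary start; in each run the token list should shrink to a single process.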
8. Why does it work?
- Proof of Closure
- As long as $K > N$, there is at least one value $x$ ($0 \le x \le K-1$) that is NOT the initial state of any node.
- Observe the following facts:
- There is no deadlock.
- The number of tokens never increases.
- It means that if the system is in a good configuration, it remains so (unless, of course, a failure occurs).
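The two facts used in the closure argument can be checked by brute force on a small ring. This is only a sketch under stated assumptions: one process moves per step, and the helper names `tokens`, `fire`, and `check_closure_facts` and the sizes N = 4, K = 5 are illustrative, not from the slides.

```python
from itertools import product

def tokens(x):
    """Processes whose guards are enabled in configuration x."""
    return [j for j in range(len(x))
            if (x[0] == x[-1] if j == 0 else x[j] != x[j - 1])]

def fire(x, j, K):
    """Configuration reached when process j executes its action."""
    y = list(x)
    y[j] = (x[0] + 1) % K if j == 0 else x[j - 1]
    return tuple(y)

def check_closure_facts(N, K):
    """Check every configuration: no deadlock, token count never grows."""
    for x in product(range(K), repeat=N):
        t = tokens(x)
        if not t:
            return False                      # deadlock found
        for j in t:
            if len(tokens(fire(x, j, K))) > len(t):
                return False                  # a move created a token
    return True

print(check_closure_facts(4, 5))  # True
```

An exhaustive check over 5^4 configurations is no proof for general N and K, but it is a quick sanity check of the two observations.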
9. Why does it work?
- Proof of Convergence
- Let $x$ be one of the missing states in the system.
- Processes $1, \ldots, N-1$ acquire their states from their left neighbor.
- Eventually process 0 attains the state $x$.
- Thereafter, in $N-1$ steps, all processes attain the state $x$.
- This is a legal configuration (only process 0 has a token).
- Thus the system is guaranteed to recover from a bad configuration to a good configuration.
10. To disprove
- To prove that a given algorithm is not self-stabilizing, it is sufficient to show that either
- (1) there exists a deadlock configuration, or
- (2) there exists a cycle of illegal configurations in the history of the computation.
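As an illustration of this disproof strategy, the sketch below searches the configuration graph of the ring mutual exclusion program for a deadlock or a cycle of illegal configurations. It assumes one move per step and treats "exactly one enabled process" as the legal condition; `find_counterexample` and the chosen sizes are hypothetical names and values used only here.

```python
from itertools import product

def tokens(x):
    return [j for j in range(len(x))
            if (x[0] == x[-1] if j == 0 else x[j] != x[j - 1])]

def successors(x, K):
    """All configurations reachable from x by one process move."""
    out = []
    for j in tokens(x):
        y = list(x)
        y[j] = (x[0] + 1) % K if j == 0 else x[j - 1]
        out.append(tuple(y))
    return out

def find_counterexample(N, K):
    """Return ('deadlock', c), ('cycle', c), or None if neither exists."""
    illegal = {c for c in product(range(K), repeat=N) if len(tokens(c)) != 1}
    for c in illegal:
        if not tokens(c):
            return ("deadlock", c)
    # Iterative depth-first search restricted to illegal configurations;
    # reaching a GREY node again means a cycle of illegal configurations.
    WHITE, GREY, BLACK = 0, 1, 2
    color = dict.fromkeys(illegal, WHITE)
    for root in illegal:
        if color[root] != WHITE:
            continue
        color[root] = GREY
        stack = [(root, iter(successors(root, K)))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in illegal:
                    continue
                if color[nxt] == GREY:
                    return ("cycle", nxt)
                if color[nxt] == WHITE:
                    color[nxt] = GREY
                    stack.append((nxt, iter(successors(nxt, K))))
                    break
            else:
                color[node] = BLACK      # all successors explored
                stack.pop()
    return None

print(find_counterexample(4, 2))   # too few states: a cycle is reported
print(find_counterexample(4, 5))   # K > N: None
```

With N = 4 and K = 2 one such cycle starts at (0, 1, 0, 1) and keeps three tokens alive forever when processes 3, 2, 1, 0 move in turn, so that instance is not self-stabilizing.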
11. Example 2: Stabilizing spanning tree
- Problem description
- Given a connected graph $G = (V, E)$ and a root $r$, design an algorithm for maintaining a spanning tree in the presence of transient failures that may corrupt the local states of processes (and hence the spanning tree).
- Let $n = |V|$.
12. Different scenarios
[Figure: sample graph on nodes 0 to 5 in which P(2) is corrupted]
Each process i has two variables: L(i), the distance from the root via tree edges, and P(i), the parent of process i.
13. Different scenarios
[Figure: sample graph on nodes 0 to 5 in which the distance variable L(3) is corrupted]
14. Definitions
Each process i has two variables: L(i), the distance from the root via tree edges, and P(i), the parent of process i. N(i) denotes the neighbors of i. By definition $L(r) = 0$, and $P(r)$ is undefined. Also, $0 \le L(i) \le n$. In a legal state, $\forall i \in V,\ i \neq r:\ L(i) \neq n$ and $L(i) = L(P(i)) + 1$.
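15. The algorithm
do
$(L(i) \neq n) \wedge (L(i) \neq L(P(i)) + 1) \wedge (L(P(i)) \neq n) \rightarrow L(i) := L(P(i)) + 1$
$(L(i) \neq n) \wedge (L(P(i)) = n) \rightarrow L(i) := n$
$(L(i) = n) \wedge (\exists k \in N(i) : L(k) < n - 1) \rightarrow L(i) := L(k) + 1;\ P(i) := k$
od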
[Figure: sample graph on nodes 0 to 5 in which P(2) is corrupted. The blue labels denote the values of L.]
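The three guarded actions can be exercised on a small example. The sketch below is an assumption-laden illustration: the figure's graph is not recoverable from the text, so the edge list is hypothetical, the scheduler fires one randomly chosen enabled node per step, the root is node 0 and never moves, and the names `enabled_action`, `stabilize`, and `is_legal` are invented for this example.

```python
import random

def enabled_action(i, L, P, nbrs, n):
    """Return the (new_L, new_P) update enabled at non-root node i, or None."""
    p = P[i]
    if L[i] != n and L[p] != n and L[i] != L[p] + 1:
        return (L[p] + 1, p)             # first action: repair the distance
    if L[i] != n and L[p] == n:
        return (n, p)                    # second action: parent has L = n
    if L[i] == n:
        for k in nbrs[i]:
            if L[k] < n - 1:
                return (L[k] + 1, k)     # third action: adopt neighbor k
    return None

def stabilize(L, P, nbrs, root, rng, max_steps=100_000):
    """Fire one random enabled node per step; return the number of steps."""
    n = len(L)
    for steps in range(max_steps):
        moves = []
        for i in range(n):
            if i != root:
                action = enabled_action(i, L, P, nbrs, n)
                if action is not None:
                    moves.append((i, action))
        if not moves:
            return steps
        i, (new_L, new_P) = rng.choice(moves)
        L[i], P[i] = new_L, new_P
    raise RuntimeError("no fixed point within max_steps")

def is_legal(L, P, root):
    n = len(L)
    return L[root] == 0 and all(
        L[i] != n and L[i] == L[P[i]] + 1 for i in range(n) if i != root)

# Hypothetical 6-node graph (not the one in the figure).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 5)]
n = 6
nbrs = {i: [] for i in range(n)}
for a, b in edges:
    nbrs[a].append(b)
    nbrs[b].append(a)

L = [0, 1, 2, 3, 4, 5]
P = [None, 0, 5, 2, 3, 4]                # P(2) corrupted: 5 instead of 1
steps = stabilize(L, P, nbrs, 0, random.Random(0))
print("steps:", steps, "L:", L, "P:", P, "legal:", is_legal(L, P, 0))
```

The recovered tree need not be the one that existed before the failure; the program only guarantees that some legal spanning tree is reached.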
16. Proof of stabilization
Define an edge from i to P(i) to be well-formed when $L(i) \neq n$, $L(P(i)) \neq n$, and $L(i) = L(P(i)) + 1$. In any configuration, the well-formed edges form a spanning forest. Delete all edges that are not well-formed. Each tree T(k) in the forest is identified by k, the lowest value of L in that tree.
17. Example
- In the sample graph shown earlier, the original spanning tree is decomposed into two well-formed trees:
- $T(0) = \{0, 1\}$
- $T(2) = \{2, 3, 4, 5\}$
- Let $F(k)$ denote the number of $T(k)$'s in the forest.
- Define a tuple $F = (F(0), F(1), F(2), \ldots, F(n))$.
- For the sample graph, $F = (1, 0, 1, 0, 0, 0)$ after node 2 has a transient failure.
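The decomposition into well-formed trees can be computed mechanically. The sketch below assumes a hypothetical configuration (a chain 0, 1, ..., 5 in which P(2) has been corrupted to point to node 5), chosen so that it yields the same two trees as the example; `well_formed` and `decompose` are illustrative names, and only the nonzero entries of F are reported.

```python
from collections import Counter

def well_formed(i, L, P, n):
    """True if the edge from i to P(i) is well-formed."""
    return L[i] != n and L[P[i]] != n and L[i] == L[P[i]] + 1

def decompose(L, P, root):
    """Group nodes by the top of the well-formed tree they belong to."""
    n = len(L)
    trees = {}
    for i in range(n):
        top = i
        # L drops by one along each well-formed edge, so this loop ends.
        while top != root and well_formed(top, L, P, n):
            top = P[top]
        trees.setdefault(top, []).append(i)
    return trees

# Hypothetical configuration: node i has L(i) = i, and P(2) is corrupted.
L = [0, 1, 2, 3, 4, 5]
P = [None, 0, 5, 2, 3, 4]

trees = decompose(L, P, root=0)
F = Counter(L[top] for top in trees)     # F(k) = number of trees T(k)
print({L[top]: members for top, members in trees.items()})
print(dict(F))
```

Here the tree containing the root has lowest L value 0 and the detached tree has lowest L value 2, so F(0) = F(2) = 1 and every other entry is 0.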
18. Skeleton of the proof
- Minimum $F = (1, 0, 0, 0, 0, 0)$: the legal configuration.
- Maximum $F = (1, n-1, 0, 0, 0, 0)$ (considering lexicographic order).
- With each action of the algorithm, F decreases lexicographically. Verify the claim!
- This proves that eventually F becomes $(1, 0, 0, 0, 0, 0)$ and the spanning tree stabilizes.
- What is an upper bound on the time complexity of this algorithm?
19. Graph coloring
- Devise a self-stabilizing algorithm for coloring the nodes of a directed acyclic graph of maximum out-degree d with at most (d+1) colors. Let
- $\Gamma$ be the set of colors
- $c(i)$ = color of node i
- $sc(i)$ = set of colors of the successors of node i
- program for node i
- do $\exists j \in succ(i) : c(i) = c(j) \rightarrow c(i) := b : b \in \Gamma \setminus sc(i)$ od
- Why does it work?
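One way to explore the question is to simulate the program on a small directed acyclic graph. This is a minimal sketch under assumptions that are not in the slides: the graph `succ` is hypothetical, one randomly chosen enabled node moves per step, and the replacement color b is taken to be the smallest free color (the program allows any free color).

```python
import random

def stabilize_coloring(succ, color, palette, rng, max_steps=10_000):
    """Run the coloring program until no node conflicts with a successor."""
    for steps in range(max_steps):
        enabled = [i for i in succ
                   if any(color[i] == color[j] for j in succ[i])]
        if not enabled:
            return steps
        i = rng.choice(enabled)
        used = {color[j] for j in succ[i]}     # sc(i)
        color[i] = min(palette - used)         # some b in palette \ sc(i)
    raise RuntimeError("no fixed point within max_steps")

def is_proper(succ, color):
    return all(color[i] != color[j] for i in succ for j in succ[i])

# Hypothetical DAG with maximum out-degree d = 2, hence d + 1 = 3 colors.
succ = {0: [1, 2], 1: [2, 3], 2: [3], 3: []}
d = 2
palette = set(range(d + 1))

color = {i: 0 for i in succ}                   # a bad start: all nodes equal
steps = stabilize_coloring(succ, color, palette, random.Random(0))
print("steps:", steps, "coloring:", color, "proper:", is_proper(succ, color))
```

Two observations from the sketch point toward the argument: a node has at most d successors, so `palette - used` is never empty, and a sink never moves, so stability propagates backward from the sinks.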