Self-Stabilization in Distributed Systems: A Concept of Fault-Tolerance

Self-
Stabilization
Deepan Sathyamoorthy
Krishna Bharadwaj
Sai Sundaram Parthasarathi
2/20/2025
1
CS553 – Self-stabilization
Introduction
It is a concept of fault-tolerance in distributed
systems
Regardless of a system's initial state, the system will
end up in a legitimate state in a finite amount of time
without any external intervention
Example: Making a group of people to build a self
stabilizing circle
2/20/2025
CS553 – Self-stabilization
2
System Model
Consists of a set of n computers or state machines
that communicate with each other
Communication can be either message passing or
shared memory
Configuration for message passing system
o
Processes P
i
 and P
j
 are neighbors. S
i
 is the state of the processor i and Q
i,j
 is
FIFO queue which contains all messages sent by processor P
i
 to its neighbor
P
j
2/20/2025
CS553 – Self-stabilization
3
System Model contd.
Configuration for shared memory model
o
 
s
i
 is the state of P
i
 and r
i
 is the content of the communication register
Read, Write and State transition operations can be
done in a single atomic operation(Composite
atomicity)
2/20/2025
CS553 – Self-stabilization
4
Self-Stabilizing systems – Formal
Definition
A system is self-stabilizing if and only if
o
Starting from any state, it is guaranteed that the
system will eventually reach a correct state
(convergence)
o
Given that the system is in a correct state, it is
guaranteed to stay in a correct state, provided that
no fault happens (closure)
2/20/2025
CS553 – Self-stabilization
5
Issues in the design of self-
stabilizing algorithms
The main issues in the design of self-stabilization algorithm
o
Number of states in each of the machines
o
Uniform and Non-Uniform networks
o
Central and Distributed demon
o
Reducing the number of the states in the token ring
o
Shared Memory models
o
Mutual Exclusion
o
Cost of self-stabilization
2/20/2025
CS553 – Self-stabilization
6
Self-Stabilization in spite of
distributed control
Basis: The relation “The system is in a legitimate
state” is kept invariant
o
Test each individual step to ensure that it doesn't violate the invariant and
delay steps that could possibly cause violation until a more favorable system
state has been reached
Issue: Due to the lack of a shared memory, local
actions taken on account of local information must
accomplish a global objective.
2/20/2025
CS553 – Self-stabilization
7
Problem definition
A system built from N+1 finite state machines
numbered 0 through N. (State space is the Cartesian
product of individual state spaces of N+1 machines)
Machines are arranged in a ring as below:
2/20/2025
http://deptinfo.unice.fr/twiki/pub/Minfo/DistributedAlgo/Cours_Failures
Detectors-Consensus-SelfStabilization.pdf
8
Problem
 
definition contd.
A demon gives each time, one of the machines the
command “to adjust itself” (In fair random order)
As a function of its own state, (and states of its
neighbors) a machine may be “privileged”
The legitimate states are those in which exactly one
machine is privileged
2/20/2025
CS553 Self-stabilization
9
Observations
The problem cannot be solved if we require our
machines to be identical because then, one
privileged machine cannot be chosen
o
We can make all the machines different from one another or,
o
We can make one machine exceptional and all other mutually
equal
The adjustment command when given to a non
privileged machine will have no effect on it. When
it is given to a privileged machine, after
adjustment, it loses its privilege
Thus a legitimate state is one where exactly one
machine is privileged at any given point
2/20/2025
CS553 – Self-stabilization
10
Observations
We must have one machine exceptional and all others
to be mutually equal to one another. Let us suppose
machine 0 is exceptional and rest of them from 1 to N
are mutually equal.
The 'being privileged' function can be defined to affect
this behavior. Machine i is privileged if x[i] ≠ x[i-1].
Due to transitivity nature of equality, we can say that
when every machine except for machine 0 is non
privileged, then x[0] = x[N]. (x[i] is the state of i)
If we define this to be the condition for machine 0 to
be privileged then at least one machine will be
privileged at any point.
2/20/2025
CS553 – Self-stabilization
11
Observations
If the adjustment that we make should cause the
system to lose it privilege then it must be
o
if x[i] ≠ x[i-1] then x[i] := x[i-1]  
//for any non exceptional
machine
o
if x[0] = x[N] then x[0] := (x[0] + 1) mod k 
// for the
exceptional machine
2/20/2025
CS553 – Self-stabilization
12
Circulation of token
2/20/2025
http://deptinfo.unice.fr/twiki/pub/Minfo/DistributedAlgo/Cours_FailuresDetect
ors-Consensus-SelfStabilization.pdf
13
Dijkstra’s Self-stabilization with
four-state machines
Idea: For each machine i (0 ≤ i ≤ N) we introduce two
Booleans called x[i] and up[i] respectively.
Let machine 0 be “bottom machine” and machine N be
“top machine”. In the bottom machine up[0] = true
holds permanently, in the top machine up[N] = false
holds permanently. All other Booleans are variables.
(Two Boolean variables make up for the four states,
hence the title!)
2/20/2025
CS553 – Self-stabilization
14
Dijkstra’s Self-stabilization with
four-state machines contd.
Two kinds of privileges are defined: “privilege from below”
and “privilege from above”.
For all machines except for the bottom machine, privilege
from below is defined as:
x[i] ≠ x[i-1]
For the normal machines, corresponding move is defined
as:
x[i] := !x[i]; up[i] := true
For top machine, it is reduced to:
x[N] := !x[N]
2/20/2025
CS553 – Self-stabilization
15
Dijkstra’s Self-stabilization with
four-state machines contd.
For all machines except for the top machine, privilege from
above is defined as
o
x[i] = x[i+1] and up[i] and !up[i+1]
For the normal machines, corresponding move is defined as
o
up[i] := false
For bottom machine, it is reduced to
o
x[0] := !x[0]
2/20/2025
CS553 – Self-stabilization
16
Dijkstra’s self-stabilizing token ring
system contd.
Third solution
o
When the number of states, k = 3
2/20/2025
CS553 – Self-stabilization
17
Dijkstra’s self-stabilizing token ring
system contd.
The algorithm for the bottom machine:
o
If S = 0 and R = 1, then the state of S is changed to
2
o
If S = 1 and R = 2, then the state of S is changed to
0
o
If S = 2 and R = 0, then the state of S is changed to
1
2/20/2025
CS553 – Self-stabilization
18
Dijkstra’s self-stabilizing token ring
system contd.
The algorithm for the top machine
o
If L = 0, then the state of S is changed to 1
o
If L = 1, then the state of S is changed to 2
o
If L = 2, then the state of S is changed to 0
2/20/2025
19
CS553 – Self-stabilization
Dijkstra’s self-stabilizing token ring
system contd.
The algorithm for all other machines:
o
If s = 0 and L = 1, then s = 1
o
If s = 1 and L = 2, then s = 2
o
If s = 2 and L = 0. then s = 0
2/20/2025
20
CS553 – Self-stabilization
Dijkstra’s self-stabilizing token ring
system contd.
2/20/2025
21
CS553 – Self-stabilization
Ghosh’s solution
Special networks where the number of states required by
each processor is either 0 or 1
A node needs to use information from all its neighbors. Let s[i]
denote the state of  a machine i
Let b denote an arbitrary state and its complementary state
be ƃ
2/20/2025
22
CS553 – Self-stabilization
Ghosh’s solution
2/20/2025
23
CS553 – Self-stabilization
Ghosh’s solution contd.
This algorithm assumes a large atomicity because
each machine must be able to examine the states of
all its neighbors in one atomic step(composite
atomicity)
2/20/2025
24
CS553 – Self-stabilization
Uniform vs. non-uniform networks
In a uniform network, all machine execute the same
algorithm
It is often necessary to have non-uniformity among
machines
Exceptional machines executing different algorithm
thus indicating non-uniformity
2/20/2025
25
CS553 – Self-stabilization
Central and Distributed demons
The presence of central demon is an undesirable
constraint in self-stabilizing algorithms
With distributed demon, each privileged machine
makes a decision unilaterally
Central demons help in verifying the weak correctness
of the algorithm
Burns et al. showed that central demon assumption is
not necessary for three and four state algorithms
2/20/2025
26
CS553 – Self-stabilization
Number of states in the token ring
The objective is to minimize the number of states of a
machine for efficient implementation
Ghosh’s solution uses only two states in a special
network non-ring topology
2/20/2025
CS553 – Self-stabilization
27
Shared Memory Models
No processor has direct access to the state of its
neighbors
Only way state of the neighbors is through shared
registers
The system is represented as a graph in which each
processor is represented as a node and there is a link
between two nodes which denote that there is
communication between two processors
In self-stabilization algorithm, only one process can
change a register at any instance and this happens
after the system is stabilized
2/20/2025
CS553 – Self-stabilization
28
Mutual Exclusion
In mutual exclusion, each process has a critical section of
code, and only one process can enter its critical section at
any time, and every process that wants to enter its critical
section must be able to enter its critical section in finite
time
If the process does not want to enter its critical section, it
simply passes its privilege to its neighbor
Since there is only one privilege in the system, each
process enjoys a privilege an infinite amount of times
Example: Token system, which has processes circulating
tokens. If a process has a token, it enters its critical
section. After a finite time, only one token exists in the
system
2/20/2025
CS553 – Self-stabilization
29
Cost of self-stabilization
Definition of self-stabilization does not put any upper
bound on the number of transitions to reach a safe
state starting from an unsafe state
Two concepts related to cost of self-stabilization
o
Convergence span: Maximum number of transitions that can be executed in a
system starting from an arbitrary state, before it reaches a safe state
o
Response span: The maximum number of transitions that can be executed in
a system to reach a specified target state, starting from some initial state
Aim of the designer is to reduce the convergence span
and response span
2/20/2025
CS553 – Self-stabilization
30
Methodologies for design of self-
stabilizing systems
A malicious adversary(virus or hardware failure) might
disrupt the distributed system. To be called self-stabilizing,
a system must have the capability to recover when exposed
to such attacks
However, if the system is destroyed completely, then the
self stabilization is not possible
Layering and Modularization
o
How it works? Divide the system into smaller components, making each
component self stabilizing and then integrate them to compose the system
o
First step is to build a self-stabilizing “platform” and any program written on that
platform becomes self-stabilizing
o
We require primitives to provide a platform on which algorithms are built
o
Two primitives
Common clock and Topology-based
2/20/2025
CS553 – Self-stabilization
31
Methodologies for design of self-
stabilizing systems contd.
Common clock primitives
o
Maintaining time through the use of local clocks in shared memory systems
o
Two properties
safety and progress
  
For synchronous shared memory system
o
Safety: All clocks have the same value
o
Progress: At each step, each clock is incremented by the same amount
For asynchronous systems with shared memory:
o
Safety: Clocks of 2 neighboring nodes differ by at most 1
o
Progress: A clock is incremented to i+1 when clocks at all neighboring nodes
have i or i+1
2/20/2025
CS553 – Self-stabilization
32
Methodologies for design of self-
stabilizing systems contd.
Topology based primitives
Layer 1
o
Elect a leader and construct a spanning tree
Layer 2
o
Algorithms for mutual exclusion or reset can be easily built on top of layer 1
Example
o
A two-layered self stabilizing algorithm for mutual exclusion
First layer: creates a spanning tree from an arbitrarily connected graph
Second layer: token-based system. When a node receives the
token/privilege, it executes its critical section in a depth first manner
The above two layers are superimposed to obtain a single self-stabilizing
algorithm
2/20/2025
CS553 – Self-stabilization
33
Communication Protocols
Collection of processes that exchange messages over
communication links in a network
Adversely affected for several reasons
1.  Initialization to an illegal state
2.  A change in the mode of operation
3.  Transmission error because loss of message
4.  Process failure and recovery
5.  A local memory crash
2/20/2025
CS553 – Self-stabilization
34
Communication Protocols
Protocol is stabilizing if and only if starting from any
unsafe state, the protocol is guaranteed to converge
to safe state within finite number of state transitions
Gouda and Multari – Properties to be self-stabilizing
1.  It must be non-terminating
2.  There are infinite number of safe states
3.  There are timeout actions in a non-empty subset of
process
2/20/2025
CS553 – Self-stabilization
35
Dolev, Israeli & Morgan Algorithm
Self-stabilizing BFS spanning tree construction
Algorithm for
-  Semi-uniform system
-  Central demon
-  Read/Write atomicity
Every node maintains two variables
-  Pointer to one of its incoming edges
-  distance in hops to the root
2/20/2025
http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf
36
Dolev, Israeli & Morgan Algorithm
Nodes periodically exchange their distance value
Root node always send 0
Each node chooses the neighbor with minimum
distance as its new parent and updates its distance
accordingly
After reading all the neighbor for k times, distance
value of a process is at least k+1
After all the nodes at distance d have computed their
distance and updated register, their register no longer
change
Eventually, entire tree will be stabilized
2/20/2025
37
http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf
Dolev, Israeli & Morgan Algorithm
2/20/2025
38
http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf
Dolev, Israeli & Morgan Algorithm
2/20/2025
39
http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf
Similar Algorithms
Collin & Dolev Algorithm
 -  Semi-uniform
 -  DFS spanning-tree
 -  Central demon
 -  Read/Write atomicity
Herman Algorithm
 -  Semi-uniform
 -  DFS spanning-tree
 -  Central demon
 -  Composite atomicity
2/20/2025
CS553 – Self-stabilization
40
Afek, Kutten & Yung Algorithm
BFS spanning tree in the read/write atomicity
No assumption of a distinguished root process
All nodes have globally unique identifier that can be
totally ordered
Largest identifier will be root of the tree
Periodically, nodes exchange identifier information
If a node has maximum identifier, then it becomes root
If there is larger root identifier, then send “join
request” and receive “grant” from the root
2/20/2025
CS553 – Self-stabilization
41
Arora and Gouda Algorithm
BFS spanning tree in the read/write atomicity with
central demon
No assumption of a distinguished root process
Like Afek et al., all nodes have globally unique
identifier that can be totally ordered and largest
identifier will be root of the tree
Needs a bound N on the number n of nodes in the
network to work correctly
Cycles are detected when distance bound grows to
exceed N
2/20/2025
CS553 – Self-stabilization
42
Afek and Bremler Algorithm
For synchronous and asynchronous networks
Node with smallest identifier is considered as root
Based on new idea called “Power Supply”
Nodes are expected to receive power (a continuous
flow of message, one per round)  from root
Only legal roots may be source of power
Nodes attached to fake roots eventually fail to receive
power and make themselves the root of new tree
When a node receives power from a neighbor with
small identifier, it attaches itself to the tree
2/20/2025
CS553 – Self-stabilization
43
1-Maximal Independent Set
A maximal independent set is a set of nodes such that
every node not in the set is adjacent to a node in the
set
A 1-maximal independent set is a maximal
independent set provided one cannot increase the
cardinality of the independent set by removing one
node and adding more nodes
2/20/2025
http://people.cs.clemson.edu/~goddard/papers/maxIndependent.pdf
44
Shi et al., Algorithm
A connected, undirected graph with node set V and
edge set E
N(i) denotes set of neighbor of node I
Algorithm is presented as a set of rules, each with
Boolean predicate and action.
A node will be privileged if predicate is true
If a node is privileged, it may execute the
corresponding action, called move
A central demon is assumed to exist
Node is state 0 will be in desired set
2/20/2025
http://people.cs.clemson.edu/~goddard/papers/maxIndependent.pdf
45
Shi et al., Algorithm
Rules
1.
If not involved in a transition process, then set state
to the number of neighbors in state 0 or state 0'.
2.
If in state 0 and adjacent to at least two 1s, change to
state 0.
3.
If in state 1 and adjacent to a 0', change state to 1‘.
4.
If in state 0' and adjacent to at least two 1's, change
state to 2‘.
5.
If in state 1' and adjacent to no 0', change state to 0.
6.
If in state 2' and adjacent to no 1', change state to 2.
2/20/2025
http://people.cs.clemson.edu/~goddard/papers/maxIndependent.pdf
46
Shi et al., Algorithm
2/20/2025
CS553 – Self-stabilization
47
Shi et al., Algorithm
2/20/2025
CS553 – Self-stabilization
48
Probabilistic self-stabilizing leader
election algorithm
Dolev et al. algorithm to choose leader among n
station
During a time unit, stations can detect either silence,
success or collision
Silence – no station tried to transmit a message
Success – only one station transmit a message
Collision – at least two stations attempted to transmit
a message
2/20/2025
CS553 – Self-stabilization
49
Probabilistic self-stabilizing leader
election algorithm
2/20/2025
CS553 – Self-stabilization
50
Probabilistic self-stabilizing leader
election algorithm
2/20/2025
CS553 – Self-stabilization
51
Probabilistic self-stabilizing leader
election algorithm
2/20/2025
CS553 – Self-stabilization
52
Self-stabilization as a solution to
fault tolerance
Fault tolerance is the tolerance to transient failures, in
which the state of a component changes spontaneously,
but the component remains correct
Self-stabilization meets a stronger notion of correctness, in
that, regardless of the origin or type of failure, the system
converges to a correct state
Self-stabilization handles:
Inconsistent initialization
Mode of change
Transmission errors
Process failure and recovery
Memory crash
2/20/2025
CS553 – Self-stabilization
53
Factors preventing Self-
stabilization
Symmetry: In a distributed system if the machines
involved are symmetric to each other, then self-
stabilization is not possible
Termination: If a final state is unsafe, then the system will
not stabilize. One such form of termination is deadlock
Isolation: The local state and computation of a process is
consistent with the global state, however the resulting
global state and computation is not safe
Look-alike configurations: If two states of a machine are
identical, then convergence to a safe state cannot be
guaranteed
2/20/2025
CS553 – Self-stabilization
54
Limitations of Self-stabilization
Time: If the system doesn’t reach one of its legal states
quickly enough, then the system might not tolerate it
Exceptional Machine: The need for an exceptional machine
is difficult to achieve, although it is not a bottleneck
Pseudo-stabilization: Every computer only needs to have
some state such that the suffix of the computation
beginning at this state is in the set of legal computations.
Weaker, but cost effective
2/20/2025
CS553 – Self-stabilization
55
References
http://www.cs.utexas.edu/~EWD/ewd03xx/EWD392.
PDF
http://deptinfo.unice.fr/twiki/pub/Minfo/Distributed
Algo/Cours_FailuresDetectors-Consensus-
SelfStabilization.pdf
Chapter 17(Role of Compilers in self stabilization not
included) in A.D. Kshemkalyani, M. Singhal,
Distributed Computing: Principles, Algorithms, and
Systems
2/20/2025
CS553 – Self-stabilization
56
Slide Note
Embed
Share

Self-stabilization is a fault-tolerance concept ensuring a system reaches a legitimate state without external intervention. The system model involves communication among computers or state machines using message passing or shared memory. A self-stabilizing system guarantees convergence and closure from any state, maintaining correctness. Design issues include state count, network types, token ring optimization, mutual exclusion, and cost considerations.

  • Self-Stabilization
  • Distributed Systems
  • Fault-Tolerance
  • System Model
  • Design Issues

Uploaded on Feb 20, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Self- Stabilization Deepan Sathyamoorthy Krishna Bharadwaj Sai Sundaram Parthasarathi CS553 Self-stabilization 2/20/2025 1

  2. Introduction It is a concept of fault-tolerance in distributed systems Regardless of a system's initial state, the system will end up in a legitimate state in a finite amount of time without any external intervention Example: Making a group of people to build a self stabilizing circle CS553 Self-stabilization 2/20/2025 2

  3. System Model Consists of a set of n computers or state machines that communicate with each other Communication can be either message passing or shared memory Configuration for message passing system o Processes Pi and Pj are neighbors. Si is the state of the processor i and Qi,j is FIFO queue which contains all messages sent by processor Pi to its neighbor Pj CS553 Self-stabilization 2/20/2025 3

  4. System Model contd. Configuration for shared memory model o si is the state of Pi and ri is the content of the communication register Read, Write and State transition operations can be done in a single atomic operation(Composite atomicity) CS553 Self-stabilization 2/20/2025 4

  5. Self-Stabilizing systems Formal Definition A system is self-stabilizing if and only if o Starting from any state, it is guaranteed that the system will eventually reach a correct state (convergence) o Given that the system is in a correct state, it is guaranteed to stay in a correct state, provided that no fault happens (closure) CS553 Self-stabilization 2/20/2025 5

  6. Issues in the design of self- stabilizing algorithms The main issues in the design of self-stabilization algorithm o Number of states in each of the machines o Uniform and Non-Uniform networks o Central and Distributed demon o Reducing the number of the states in the token ring o Shared Memory models o Mutual Exclusion o Cost of self-stabilization CS553 Self-stabilization 2/20/2025 6

  7. Self-Stabilization in spite of distributed control Basis: The relation The system is in a legitimate state is kept invariant o Test each individual step to ensure that it doesn't violate the invariant and delay steps that could possibly cause violation until a more favorable system state has been reached Issue: Due to the lack of a shared memory, local actions taken on account of local information must accomplish a global objective. CS553 Self-stabilization 2/20/2025 7

  8. Problem definition A system built from N+1 finite state machines numbered 0 through N. (State space is the Cartesian product of individual state spaces of N+1 machines) Machines are arranged in a ring as below: http://deptinfo.unice.fr/twiki/pub/Minfo/DistributedAlgo/Cours_Failures Detectors-Consensus-SelfStabilization.pdf 2/20/2025 8

  9. Problemdefinition contd. A demon gives each time, one of the machines the command to adjust itself (In fair random order) As a function of its own state, (and states of its neighbors) a machine may be privileged The legitimate states are those in which exactly one machine is privileged CS553 Self-stabilization 2/20/2025 9

  10. Observations The problem cannot be solved if we require our machines to be identical because then, one privileged machine cannot be chosen o We can make all the machines different from one another or, o We can make one machine exceptional and all other mutually equal The adjustment command when given to a non privileged machine will have no effect on it. When it is given to a privileged machine, after adjustment, it loses its privilege Thus a legitimate state is one where exactly one machine is privileged at any given point CS553 Self-stabilization 2/20/2025 10

  11. Observations We must have one machine exceptional and all others to be mutually equal to one another. Let us suppose machine 0 is exceptional and rest of them from 1 to N are mutually equal. The 'being privileged' function can be defined to affect this behavior. Machine i is privileged if x[i] x[i-1]. Due to transitivity nature of equality, we can say that when every machine except for machine 0 is non privileged, then x[0] = x[N]. (x[i] is the state of i) If we define this to be the condition for machine 0 to be privileged then at least one machine will be privileged at any point. CS553 Self-stabilization 2/20/2025 11

  12. Observations If the adjustment that we make should cause the system to lose it privilege then it must be o if x[i] x[i-1] then x[i] := x[i-1] //for any non exceptional machine o if x[0] = x[N] then x[0] := (x[0] + 1) mod k // for the exceptional machine CS553 Self-stabilization 2/20/2025 12

  13. Circulation of token http://deptinfo.unice.fr/twiki/pub/Minfo/DistributedAlgo/Cours_FailuresDetect ors-Consensus-SelfStabilization.pdf 2/20/2025 13

  14. Dijkstras Self-stabilization with four-state machines Idea: For each machine i (0 i N) we introduce two Booleans called x[i] and up[i] respectively. Let machine 0 be bottommachine and machine N be topmachine . In the bottom machine up[0] = true holds permanently, in the top machine up[N] = false holds permanently. All other Booleans are variables. (Two Boolean variables make up for the four states, hence the title!) CS553 Self-stabilization 2/20/2025 14

  15. Dijkstras Self-stabilization with four-state machines contd. Two kinds of privileges are defined: privilege from below and privilege from above . For all machines except for the bottom machine, privilege from below is defined as: x[i] x[i-1] For the normal machines, corresponding move is defined as: x[i] := !x[i]; up[i] := true For top machine, it is reduced to: x[N] := !x[N] CS553 Self-stabilization 2/20/2025 15

  16. Dijkstras Self-stabilization with four-state machines contd. For all machines except for the top machine, privilege from above is defined as o x[i] = x[i+1] and up[i] and !up[i+1] For the normal machines, corresponding move is defined as o up[i] := false For bottom machine, it is reduced to o x[0] := !x[0] CS553 Self-stabilization 2/20/2025 16

  17. Dijkstras self-stabilizing token ring system contd. Third solution o When the number of states, k = 3 CS553 Self-stabilization 2/20/2025 17

  18. Dijkstras self-stabilizing token ring system contd. The algorithm for the bottom machine: o If S = 0 and R = 1, then the state of S is changed to 2 o If S = 1 and R = 2, then the state of S is changed to 0 o If S = 2 and R = 0, then the state of S is changed to 1 CS553 Self-stabilization 2/20/2025 18

  19. Dijkstras self-stabilizing token ring system contd. The algorithm for the top machine o If L = 0, then the state of S is changed to 1 o If L = 1, then the state of S is changed to 2 o If L = 2, then the state of S is changed to 0 CS553 Self-stabilization 2/20/2025 19

  20. Dijkstras self-stabilizing token ring system contd. The algorithm for all other machines: o If s = 0 and L = 1, then s = 1 o If s = 1 and L = 2, then s = 2 o If s = 2 and L = 0. then s = 0 CS553 Self-stabilization 2/20/2025 20

  21. Dijkstras self-stabilizing token ring system contd. CS553 Self-stabilization 2/20/2025 21

  22. Ghoshs solution Special networks where the number of states required by each processor is either 0 or 1 A node needs to use information from all its neighbors. Let s[i] denote the state of a machine i Let b denote an arbitrary state and its complementary state be CS553 Self-stabilization 2/20/2025 22

  23. Ghoshs solution CS553 Self-stabilization 2/20/2025 23

  24. Ghoshs solution contd. This algorithm assumes a large atomicity because each machine must be able to examine the states of all its neighbors in one atomic step(composite atomicity) CS553 Self-stabilization 2/20/2025 24

  25. Uniform vs. non-uniform networks In a uniform network, all machine execute the same algorithm It is often necessary to have non-uniformity among machines Exceptional machines executing different algorithm thus indicating non-uniformity CS553 Self-stabilization 2/20/2025 25

  26. Central and Distributed demons The presence of central demon is an undesirable constraint in self-stabilizing algorithms With distributed demon, each privileged machine makes a decision unilaterally Central demons help in verifying the weak correctness of the algorithm Burns et al. showed that central demon assumption is not necessary for three and four state algorithms CS553 Self-stabilization 2/20/2025 26

  27. Number of states in the token ring The objective is to minimize the number of states of a machine for efficient implementation Ghosh s solution uses only two states in a special network non-ring topology CS553 Self-stabilization 2/20/2025 27

  28. Shared Memory Models No processor has direct access to the state of its neighbors Only way state of the neighbors is through shared registers The system is represented as a graph in which each processor is represented as a node and there is a link between two nodes which denote that there is communication between two processors In self-stabilization algorithm, only one process can change a register at any instance and this happens after the system is stabilized CS553 Self-stabilization 2/20/2025 28

  29. Mutual Exclusion In mutual exclusion, each process has a critical section of code, and only one process can enter its critical section at any time, and every process that wants to enter its critical section must be able to enter its critical section in finite time If the process does not want to enter its critical section, it simply passes its privilege to its neighbor Since there is only one privilege in the system, each process enjoys a privilege an infinite amount of times Example: Token system, which has processes circulating tokens. If a process has a token, it enters its critical section. After a finite time, only one token exists in the system CS553 Self-stabilization 2/20/2025 29

  30. Cost of self-stabilization Definition of self-stabilization does not put any upper bound on the number of transitions to reach a safe state starting from an unsafe state Two concepts related to cost of self-stabilization o Convergence span: Maximum number of transitions that can be executed in a system starting from an arbitrary state, before it reaches a safe state o Response span: The maximum number of transitions that can be executed in a system to reach a specified target state, starting from some initial state Aim of the designer is to reduce the convergence span and response span CS553 Self-stabilization 2/20/2025 30

  31. Methodologies for design of self- stabilizing systems A malicious adversary(virus or hardware failure) might disrupt the distributed system. To be called self-stabilizing, a system must have the capability to recover when exposed to such attacks However, if the system is destroyed completely, then the self stabilization is not possible Layering and Modularization o How it works? Divide the system into smaller components, making each component self stabilizing and then integrate them to compose the system o First step is to build a self-stabilizing platform and any program written on that platform becomes self-stabilizing o We require primitives to provide a platform on which algorithms are built o Two primitives Common clock and Topology-based CS553 Self-stabilization 2/20/2025 31

  32. Methodologies for design of self- stabilizing systems contd. Common clock primitives o Maintaining time through the use of local clocks in shared memory systems o Two properties safety and progress For synchronous shared memory system o Safety: All clocks have the same value o Progress: At each step, each clock is incremented by the same amount For asynchronous systems with shared memory: o Safety: Clocks of 2 neighboring nodes differ by at most 1 o Progress: A clock is incremented to i+1 when clocks at all neighboring nodes have i or i+1 CS553 Self-stabilization 2/20/2025 32

  33. Methodologies for design of self- stabilizing systems contd. Topology based primitives Layer 1 o Elect a leader and construct a spanning tree Layer 2 o Algorithms for mutual exclusion or reset can be easily built on top of layer 1 Example o A two-layered self stabilizing algorithm for mutual exclusion First layer: creates a spanning tree from an arbitrarily connected graph Second layer: token-based system. When a node receives the token/privilege, it executes its critical section in a depth first manner The above two layers are superimposed to obtain a single self-stabilizing algorithm CS553 Self-stabilization 2/20/2025 33

  34. Communication Protocols Collection of processes that exchange messages over communication links in a network Adversely affected for several reasons 1. Initialization to an illegal state 2. A change in the mode of operation 3. Transmission error because loss of message 4. Process failure and recovery 5. A local memory crash CS553 Self-stabilization 2/20/2025 34

  35. Communication Protocols Protocol is stabilizing if and only if starting from any unsafe state, the protocol is guaranteed to converge to safe state within finite number of state transitions Gouda and Multari Properties to be self-stabilizing 1. It must be non-terminating 2. There are infinite number of safe states 3. There are timeout actions in a non-empty subset of process CS553 Self-stabilization 2/20/2025 35

  36. Dolev, Israeli & Morgan Algorithm Self-stabilizing BFS spanning tree construction Algorithm for - Semi-uniform system - Central demon - Read/Write atomicity Every node maintains two variables - Pointer to one of its incoming edges - distance in hops to the root http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf 2/20/2025 36

  37. Dolev, Israeli & Morgan Algorithm Nodes periodically exchange their distance value Root node always send 0 Each node chooses the neighbor with minimum distance as its new parent and updates its distance accordingly After reading all the neighbor for k times, distance value of a process is at least k+1 After all the nodes at distance d have computed their distance and updated register, their register no longer change Eventually, entire tree will be stabilized http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf 2/20/2025 37

  38. Dolev, Israeli & Morgan Algorithm http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf 2/20/2025 38

  39. Dolev, Israeli & Morgan Algorithm http://www.dcg.ethz.ch/lectures/ss06/distcomp/lecture/dolev_israeli_moran.pdf 2/20/2025 39

  40. Similar Algorithms Collin & Dolev Algorithm - Semi-uniform - DFS spanning-tree - Central demon - Read/Write atomicity Herman Algorithm - Semi-uniform - DFS spanning-tree - Central demon - Composite atomicity CS553 Self-stabilization 2/20/2025 40

  41. Afek, Kutten & Yung Algorithm BFS spanning tree in the read/write atomicity No assumption of a distinguished root process All nodes have globally unique identifier that can be totally ordered Largest identifier will be root of the tree Periodically, nodes exchange identifier information If a node has maximum identifier, then it becomes root If there is larger root identifier, then send join request and receive grant from the root CS553 Self-stabilization 2/20/2025 41

  42. Arora and Gouda Algorithm BFS spanning tree in the read/write atomicity with central demon No assumption of a distinguished root process Like Afek et al., all nodes have globally unique identifier that can be totally ordered and largest identifier will be root of the tree Needs a bound N on the number n of nodes in the network to work correctly Cycles are detected when distance bound grows to exceed N CS553 Self-stabilization 2/20/2025 42

  43. Afek and Bremler Algorithm For synchronous and asynchronous networks Node with smallest identifier is considered as root Based on new idea called Power Supply Nodes are expected to receive power (a continuous flow of message, one per round) from root Only legal roots may be source of power Nodes attached to fake roots eventually fail to receive power and make themselves the root of new tree When a node receives power from a neighbor with small identifier, it attaches itself to the tree CS553 Self-stabilization 2/20/2025 43

  44. 1-Maximal Independent Set A maximal independent set is a set of nodes such that every node not in the set is adjacent to a node in the set A 1-maximal independent set is a maximal independent set provided one cannot increase the cardinality of the independent set by removing one node and adding more nodes 2/20/2025 44 http://people.cs.clemson.edu/~goddard/papers/maxIndependent.pdf

  45. Shi et al., Algorithm A connected, undirected graph with node set V and edge set E N(i) denotes set of neighbor of node I Algorithm is presented as a set of rules, each with Boolean predicate and action. A node will be privileged if predicate is true If a node is privileged, it may execute the corresponding action, called move A central demon is assumed to exist Node is state 0 will be in desired set 2/20/2025 45 http://people.cs.clemson.edu/~goddard/papers/maxIndependent.pdf

  46. Shi et al., Algorithm Rules 1. If not involved in a transition process, then set state to the number of neighbors in state 0 or state 0'. 2. If in state 0 and adjacent to at least two 1s, change to state 0. 3. If in state 1 and adjacent to a 0', change state to 1 . 4. If in state 0' and adjacent to at least two 1's, change state to 2 . 5. If in state 1' and adjacent to no 0', change state to 0. 6. If in state 2' and adjacent to no 1', change state to 2. 2/20/2025 46 http://people.cs.clemson.edu/~goddard/papers/maxIndependent.pdf

  47. Shi et al., Algorithm 2/20/2025 47 CS553 Self-stabilization

  48. Shi et al., Algorithm 2/20/2025 48 CS553 Self-stabilization

  49. Probabilistic self-stabilizing leader election algorithm Dolev et al. algorithm to choose leader among n station During a time unit, stations can detect either silence, success or collision Silence no station tried to transmit a message Success only one station transmit a message Collision at least two stations attempted to transmit a message CS553 Self-stabilization 2/20/2025 49

  50. Probabilistic self-stabilizing leader election algorithm CS553 Self-stabilization 2/20/2025 50

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#