Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction: Distributed Systems
- Part One Protocols
- Part Two Fundamental Algorithms
- Part Three Fault Tolerance
- 13 Fault Tolerance in Distributed Systems
- 14 Fault Tolerance in Asynchronous Systems
- 15 Fault Tolerance in Synchronous Systems
- 16 Failure Detection
- 17 Stabilization
- Part Four Appendices
- References
- Index
17 - Stabilization
Published online by Cambridge University Press: 05 June 2012
Summary
The stabilizing algorithms considered in this chapter achieve fault-tolerant behavior in a manner radically different from that of the robust algorithms studied in the previous chapters. Robust algorithms follow a pessimistic approach, suspecting all information received, and precede all steps by sufficient checks to guarantee the validity of all steps of correct processes. Validity must be guaranteed in the presence of faulty processes, which necessitates restriction of the number of faults and of the fault model.
Stabilizing algorithms are optimistic, which may cause correct processes to behave inconsistently, but guarantee a return to correct behavior within finite time after all faulty behavior ceases. That is, stabilizing algorithms protect against transient failures; eventual repair is assumed, and this assumption allows us to abandon failure models and a bound on the number of failures. Rather than considering processes to be faulty, it is assumed that all processes operate correctly, but the configuration can be corrupted arbitrarily during a transient failure. Ignoring the history of the computation during the failure, we take the configuration at which the analysis of the algorithm starts to be the initial configuration of the (correctly operating) algorithm. An algorithm is therefore called stabilizing if it eventually starts to behave correctly (i.e., according to the specification of the algorithm), regardless of the initial configuration.
The concept of stabilization was proposed by Dijkstra [Dij74], but little work on it was done until the late 1980s; hence the subject can be considered relatively new.
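The definition above (correct behavior is eventually reached from any initial configuration) can be made concrete with a small simulation. The sketch below is an illustration only, not taken from this summary: it assumes the K-state token ring commonly attributed to [Dij74], a ring of N processes with K >= N states each, and a central scheduler that lets one privileged process move at a time. All function names (`privileged`, `step`, `run_until_legitimate`) are hypothetical.

```python
import random


def privileged(x):
    """Return the indices of processes that hold a privilege in configuration x.

    Assumed rule set (K-state token ring): process 0 is privileged when its
    state equals that of the last process; process i > 0 is privileged when
    its state differs from that of its predecessor i - 1.
    """
    n = len(x)
    result = []
    if x[0] == x[n - 1]:
        result.append(0)
    for i in range(1, n):
        if x[i] != x[i - 1]:
            result.append(i)
    return result


def step(x, K, rng):
    """Let one privileged process, chosen at random, make its move in place."""
    i = rng.choice(privileged(x))  # at least one process is always privileged
    if i == 0:
        x[0] = (x[0] + 1) % K      # process 0 advances its own state
    else:
        x[i] = x[i - 1]            # other processes copy their predecessor


def run_until_legitimate(x, K, rng, max_steps=10_000):
    """Run from an arbitrary configuration until exactly one privilege exists.

    Returns the number of steps taken. The max_steps guard is a safety net
    for this sketch, not part of the algorithm.
    """
    steps = 0
    while len(privileged(x)) != 1:
        if steps >= max_steps:
            raise RuntimeError("did not stabilize within max_steps")
        step(x, K, rng)
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(1)
    N, K = 5, 6                                  # assumption: K >= N
    config = [rng.randrange(K) for _ in range(N)]  # arbitrary (corrupted) start
    print("initial configuration:", config)
    print("steps to stabilize:", run_until_legitimate(config, K, rng))
    print("stabilized configuration:", config)
```

Note that the code never inspects how the starting configuration arose: the corrupted state is simply treated as the initial one, and the only claim checked is that a single privilege eventually circulates.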
- Type: Chapter
- Introduction to Distributed Algorithms, pp. 520-548. Publisher: Cambridge University Press. Print publication year: 2000