Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction: Distributed Systems
- Part One Protocols
- Part Two Fundamental Algorithms
- Part Three Fault Tolerance
- 13 Fault Tolerance in Distributed Systems
- 14 Fault Tolerance in Asynchronous Systems
- 15 Fault Tolerance in Synchronous Systems
- 16 Failure Detection
- 17 Stabilization
- Part Four Appendices
- References
- Index
15 - Fault Tolerance in Synchronous Systems
Published online by Cambridge University Press: 05 June 2012
Summary
The previous chapter studied the degree of fault tolerance achievable in completely asynchronous systems. Although reasonable robustness is attainable there, reliable systems in practice are always synchronous, in the sense that they rely on timers and on upper bounds on the message-delivery time. In these systems a higher degree of robustness is attainable, the algorithms are simpler, and in most cases the algorithms guarantee an upper bound on the response time.
The synchrony of the system makes it impossible for faulty processes to confuse correct processes by not sending information; indeed, if a process does not receive a message when expected, a default value is used instead, and the sender becomes suspected of being faulty. Thus, crashed processes are detected immediately and pose no difficult problems in synchronous systems; we concentrate on Byzantine failures in this chapter.
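The default-value rule described above can be illustrated with a minimal Python sketch of one receive step in a synchronous round. This is not the chapter's algorithm: the function name `collect_round`, the dictionary representation of received messages, and the default value `0` are all assumptions made for illustration only.

```python
# Hypothetical sketch: one receive step of a synchronous round.
# Any expected sender whose message did not arrive within the round
# is assigned a default value and added to the set of suspects.

DEFAULT = 0  # assumed default value; the source does not fix one


def collect_round(expected_senders, received, default=DEFAULT):
    """Return (values, suspected) for one round.

    expected_senders: iterable of process identifiers expected to send
    received: dict mapping sender identifier -> value that arrived in time
    """
    values = {}
    suspected = set()
    for sender in expected_senders:
        if sender in received:
            values[sender] = received[sender]
        else:
            # No message before the round's deadline: substitute the
            # default value and suspect the sender of being faulty.
            values[sender] = default
            suspected.add(sender)
    return values, suspected


values, suspected = collect_round(["p1", "p2", "p3"], {"p1": 1, "p3": 1})
```

In this sketch a silent process cannot stall the round: the receiver always ends up with one value per expected sender, which is the property the paragraph above attributes to synchrony.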
In Section 15.1 the problem of performing a broadcast in synchronous networks is studied; we present an upper bound (t < N/3) on the resilience, as well as two algorithms with optimal resilience. The algorithms are deterministic and achieve consensus; it is assumed that all processes know at what time the broadcast is initiated. Because consensus is not deterministically achievable in asynchronous systems (Theorem 14.8), it follows that in the presence of failures (even a single crash), synchronous systems exhibit strictly stronger computational power than asynchronous ones.
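As a small worked illustration of the resilience bound t < N/3 stated above, the following Python sketch computes the largest number of Byzantine processes that satisfies the bound for a given N. The function names are hypothetical and chosen only for this example.

```python
# Hypothetical helpers illustrating the bound t < N/3.

def within_resilience_bound(n, t):
    """True if t faulty processes out of n satisfy t < n/3."""
    # Integer form of t < n/3, avoiding floating-point division.
    return 3 * t < n


def max_tolerable_faults(n):
    """Largest integer t with t < n/3, for n >= 1."""
    return (n - 1) // 3


# With N = 4 one Byzantine process satisfies the bound; with N = 3 none does.
assert within_resilience_bound(4, 1)
assert not within_resilience_bound(3, 1)
```

Under this bound, tolerating one additional Byzantine process requires three more processes in total, for example N = 4 for t = 1 and N = 7 for t = 2.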
- Type: Chapter
- Information: Introduction to Distributed Algorithms, pp. 469-504
- Publisher: Cambridge University Press
- Print publication year: 2000