In this thesis, we explore mechanisms to tolerate misbehavior of components in distributed systems, focusing on replicated systems. First, we show that equivocation -- the act of telling different lies to different nodes -- is a fundamental weapon that adversaries can use to violate the safety of systems. To prevent equivocation, we propose Attested Append-Only Memory (A2M), a trusted system facility that is small, easy to implement, and easy to verify. A2M provides the programming abstraction of a trusted log, which leads to protocol designs immune to equivocation. Using A2M, we improve upon the state of the art in Byzantine-fault tolerant replicated state machines, producing A2M-enabled protocols (variants of Castro and Liskov's PBFT) that remain correct (linearizable) and keep making progress (live) even when half the replicas are faulty, improving the previous upper bound. We also applied A2M to achieve linearizability in a single-server shared storage in spite of faults. Our evaluation shows that this fault tolerance improvement is achieved with minor performance overhead.
Second, we address the problem of long-term fault tolerance. Typical Byzantine models require that the number of faulty nodes do not exceed a hard upper bound. Unfortunately, in long-running systems, uninterrupted good health is tough to guarantee due to rare, short-term overwhelming faults such as malicious attacks, leading to loss of all correctness properties. To combat this problem, we propose a tiered Byzantine fault model that has two fault bounds depending on the type of operations. We introduce a desirable property called Healthy-Write-Implies-Correct-Read (HWICR) which stipulates that the system will return correct data as long as it is written during a good period of system health. We then present TimeMachine (TM), a preserved name service, that uses a two-phase approach to provide HWICR under the tiered fault model. The approach alternates between service phase and proactive recovery phase, and important state changes happen only during proactive recovery. Our prototype demonstrates that TM meets the goal of the long-term naming service with reasonable performance.
Finally, we tackle a problem of replication among rational nodes in multiple administrative domains. We take a game-theoretic approach to quantify the effects of rationality on the social cost of replicated systems. We show that replication performed by selfish agents can be very inefficient, but with a proper incentive mechanism such as payment the system can be guided to socially optimal replication.