Imagine a slow winding river of network programming. At its mouth near the sea, you see all the kids splashing in the waves having fun with web scraping and their chatbot projects. As you work your way up the river, you'll find your usual sorts of HTTP servers and frameworks. Continuing still further, perhaps you'll encounter some message queues, remote procedure calls, and distributed objects. However, if you keep on going, you'll eventually start to see wild hackers doing unimaginable things with sockets, threads, async, and other low-level systems programming primitives. Beyond that is a world of darkness--a frightening territory as the river narrows and the banks close in with complexities everywhere. If you squint and look ahead, the river vanishes into the forest. The shells of abandoned GitHub projects line the shores. The horror. The horror. That's precisely the location where you will be dropped to start this week-long journey of attempting to implement the Raft Distributed Consensus algorithm from scratch. And likely failing.
The problem of Distributed Consensus relates to the challenge of making a group of machines operate as a collective whole that can survive the failure of one or more of its members. This behavior is a critical part of building reliable fault-tolerant systems. Raft is an algorithm that achieves just that. The goal is a modest one--implement Raft from scratch using nothing more than basic system programming libraries and your own wits. It will not be an easy task. It may be the hardest small bit of networking code you'll ever have to write, test, and debug. However, you will learn a lot in the process. Are you up to the challenge?
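To put a number on "survive the failure of one or more of its members": Raft can keep making progress only while a majority of the servers in the cluster are still able to talk to each other. The following is a minimal sketch of that arithmetic. The function names are made up for illustration and are not part of the course materials.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of servers that can fail while a majority remains."""
    return cluster_size - majority(cluster_size)


# Compare a few common cluster sizes.
for n in (3, 4, 5):
    print(f"{n} servers: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this rule a 5-machine cluster keeps working with 2 machines down, while going from 3 machines to 4 buys no additional fault tolerance.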
This course is for experienced programmers who want to deepen their knowledge of operating systems, concurrency, networks, and distributed systems. There are also strong elements of design, software architecture, and testing of complex systems.
This is a project-based course that involves a significant amount of coding, thinking, and discussion. Each day starts with a short presentation and exercises related to facets of the project. However, the bulk of each day (5-6 hours) is spent working on the project.
Implementing Raft is typically a multi-week project found in a graduate computer science course on Distributed Systems. You should have significant experience working in your preferred programming language such as Python, C, Go, or Rust. Core concepts and small exercises are presented in Python. However, this is not a Python course--you may implement the project in any language that you wish. Some prior experience with network programming, systems programming, and concurrency is highly advisable, although all of the concepts needed for the project are covered in the course.
Although the stated goal is to produce a working implementation of the Raft algorithm, the ultimate purpose of the course is to cover important topics from concurrency and distributed computing in a practical setting. These topics include:
A major challenge in completing the project is managing the complexity of testing, monitoring, and debugging in the presence of failures and nondeterministic execution. In a basic 5-machine configuration of Raft, you might have code executing with upwards of 60 threads, spread across multiple processes, interacting with various timers and queues. This will push the limits of your ability to comprehend what is happening. Much of the course is spent on coping strategies.
Note: Everything in this course is derived from first principles. There is no reliance upon third-party frameworks or packages.
This course is taught by David Beazley. From 1998 to 2005, David was an assistant professor in the Department of Computer Science at the University of Chicago where he taught graduate courses in operating systems and networks. David is well known in the Python world as the author of the Python Essential Reference, 4th Edition (Addison-Wesley) and Python Cookbook, 3rd Edition (O'Reilly Media). He has also given various talks about concurrency-related topics including the infamous Python GIL Talk and this bit of live coding. More recently, he has been working on the Curio project.