Holiday of Compilers!

Course Dates:

  • December 21, 2020 - January 1, 2021

Registration:

Due to COVID-19, courses are only being offered in an online format. Click here for more information.

• Online ($1500)

Instructor: David Beazley

Meeting Schedule:

The course meets for 7 days over the two week holiday period. The last day is merely a short wrap-up and demo-hour.

  • • December 21-23 (Part I).
  • • December 28-30 (Part II).
  • • January 1. (Compiler Showcase).

Other Courses | FAQ


The Pitch

Some people spend the holidays skiing in the Rockies. Some people spend the holidays at the beach. Some people write a compiler. You could be one of those people!

This is a special extended offering of my "Write a Compiler course, taught over the two week holiday period between December 21 and January 1. The course meets live for 3 days each week. We'll have a wrap-up and special compiler showcase on January 1. Because of the extra day, the extra time, and the extra cold, the course offers all sorts of extra challenges. Will you be the first to implement the complete type system or to implement array handling? Maybe. Anything is possible.

Course Overview

Compilers is often considered a capstone course for computer science majors. There is a reason for that--a compiler touches almost every topic of computer science ranging from theory to computer architecture. Writing a compiler is also an exercise in managing software complexity. Compilers have a lot of moving parts, involve unusual tools, are difficult to test, and are challenging to debug. You'll learn a lot by writing a compiler. Plus, you'll be able to brag about it later.

Target Audience

This class is for more experienced programmers who'd like to expand their skills by taking on a good challenge. If you're a self-taught programmer and you're curious about what writing a compiler is all about, then this is the course for you. You'll learn a lot more about programming languages and have a much better insight into the various coding tools you use every day.


"If you've seen any of David Beazley's fun and mind blowing talks, you can begin to imagine his week-long, intensive courses, held in his own Chicago office -- or "lair", as he calls it. It's even better than you think: starting with breakfast every day, David is a gracious host, a daredevil coder, and an inspiring teacher. I took his Compiler Course and I hope to attend more of his unique, hands-on workshops."

Luciano Ramalho, author of Fluent Python

Instruction Format

This course is almost entirely project focused. A typical day might consist of an hour of discussion followed by 7-8 hours of coding. During coding, there is typically considerable group discussion about coding techniques, design tradeoffs, testing, and other related topics.

Prerequisites

You might not think that you're ready to write a compiler, but if you've been coding for awhile and know the basics of data structures, it's something that you could probably tackle. No prior background in compilers is required although awareness of common programming language concepts (e.g., type systems, functions, classes, scoping rules, etc.) is strongly advised. Some knowledge of regular expressions, computer architecture (machine instructions, memory, etc.), and prior use of a compiled language is also recommended.

Syllabus

The course is structured around the goal of creating a small programming language called Wabbit. You'll write a compiler that can take Wabbit code and compile it to a native executable program via LLVM. Recent versions of the course have also directly targeted WebAssembly. The code produced by your compiler will have performance comparable to programs written in C. Along the way, you'll also implement an interpreter and a transpiler.

The project involves the following core problems:

  1. Data model. You need to figure out some way to represent a computer program, not as text, but as a proper data structure. Sometimes this is called an Abstract Syntax Tree (AST), but it's not necessarily directly tied to parsing.

  2. Parsing. You need to parse programs by converting them from text to the data model. This involves tokenizing text and understanding grammars. We'll look a bit at how parsing algorithms work and some of the tools used for parsing.

  3. Interpretation. You'll write an interpreter that can directly execute programs from the data model. It will be rather slow, but to make it it work, you'll have to understand how programs evaluate.

  4. Type checking. You'll write an static program analyzer that checks the source code for type errors and other semantic problems. We'll also discuss some things like Algebraic Type Systems.
  5. Control-flow analysis. You'll write a module for checking program control flow and finding more errors. This involves conditionals, looping, function calls, and other matters.

  6. Code generation. You'll have your compiler generate code for LLVM and/or WebAssembly. Once you have this, you'll have programs that execute at native speed comparable to C programs.

  7. Native code generation. Tools such as LLVM still hide a lot of low-level details. We'll conclude by talking about some lower-level issues such as register allocation, activation frames, function calls, linkers, and other matters.

It's important to note that a major goal is to build a stronger intuition for how all of the parts of a compiler actually work. Although there are a lot of existing frameworks and tools that can be used to assist in compiler construction, they are minimally used in this course. Instead, you will be creating a compiler from scratch from first principles.

Practical Takeaways

Although most programmers are unlikely to write a compiler in their day-to-day work, this course touches on a wide variety of practical topics that are applicable elsewhere. These include:

Are You Nuts?

Writing a compiler in such a short time? Is it even possible? To be sure, compilers is often regarded as one of the most difficult CS courses that one can take. If you take it at a University, you'll probably get a professor who will drag you through the infamous Dragon Book, spend a lot of time doing mathematical proofs (e.g., deriving the LALR(1) parsing algorithm), and make the focus of the course on preparing graduate students for future research in programming languages. I have taught that class. This is NOT that class.

Instead, this is a compilers course aimed at practioners. As such, the main focus is on coding and software development. Yes, you will learn about some important core compiler concepts such as regular expressions, context free grammars, type systems, and programming language semantics. However, instead of doing mathematical proofs involving parsing theory, we'll focus on how you would actually go about implementing a parser and doing things such as writing unit tests for it.

To be sure, you will write a lot of code in this course. The course runs over a two week period with six days of instruction. The final completed project might consist of 3000-5000 lines of Python and is every bit as involved as the project you typically find in a college-level compilers course for computer science majors.

About the Instructor

This class is led by David Beazley. Although most known for his work in the Python community, Dave was formerly a tenure-track assistant professor in the Department of Computer Science at the University of Chicago where he taught a Compilers course along with a variety of other topics in systems and programming languages. Dave is also the creator of the PLY and SLY parsing tools for Python. He recently gave a PyCon talk about these tools. In a past decade, he also created Swig, a C/C++ compiler for building scripting language extension modules.


Copyright (C) 2005-2024, David Beazley