Write a Compiler

Upcoming Course Dates:

  • March 13-17, 2023.


Due to COVID-19, courses are only being offered in an online format.

• In-person ($3000) -- Unavailable
• Online ($1500)

Instructor: David Beazley


Includes:

  • Breakfast and lunch (in-person only)
  • Course materials

Location: 5412 N Clark Street #218, Chicago, IL 60640



Shatter your brain by writing a compiler for a new programming language!

Compilers is often considered a capstone course for computer science majors. There is a reason for that--a compiler touches almost every topic of computer science ranging from theory to computer architecture. Writing a compiler is also an exercise in managing software complexity. Compilers have a lot of moving parts, involve unusual tools, are difficult to test, and are challenging to debug. You'll learn a lot by writing a compiler. Plus, you'll be able to brag about it later.

Target Audience

This class is for more experienced programmers who'd like to expand their skills by taking on a good challenge. If you're a self-taught programmer and you're curious about what writing a compiler is all about, then this is the course for you. You'll learn a lot more about programming languages and have a much better insight into the various coding tools you use every day.

"If you've seen any of David Beazley's fun and mind blowing talks, you can begin to imagine his week-long, intensive courses, held in his own Chicago office -- or "lair", as he calls it. It's even better than you think: starting with breakfast every day, David is a gracious host, a daredevil coder, and an inspiring teacher. I took his Compiler Course and I hope to attend more of his unique, hands-on workshops."

Luciano Ramalho, author of Fluent Python

Instruction Format

This course is almost entirely project focused. A typical day might consist of an hour of discussion followed by 7-8 hours of coding. During coding, there is typically considerable group discussion about coding techniques, design tradeoffs, testing, and other related topics.


You might not think that you're ready to write a compiler, but if you've been coding for a while and know the basics of data structures, it's something that you could probably tackle. No prior background in compilers is required, although awareness of common programming language concepts (e.g., type systems, functions, classes, scoping rules, etc.) is strongly advised. Some knowledge of regular expressions, computer architecture (machine instructions, memory, etc.), and prior use of a compiled language is also recommended.


The course is structured around the goal of creating a programming language called Wabbit. Wabbit is a small, statically typed, imperative language. You'll write a compiler that can take Wabbit code and compile it to a native executable program via LLVM. Recent versions of the course have also directly targeted WebAssembly. The code produced by your compiler will have performance comparable to programs written in C. Along the way, you'll also implement an interpreter and a static type checker.

The project involves the following core problems:

  1. Data model. You need to figure out some way to represent a computer program, not as text, but as a proper data structure. Sometimes this is called an Abstract Syntax Tree (AST), but it's not necessarily directly tied to parsing.

  2. Parsing. You need to parse programs by converting them from text to the data model. This involves tokenizing text and understanding grammars. We'll look a bit at how parsing algorithms work and you will write a recursive descent parser from scratch.

  3. Interpretation. You'll write a so-called "definitional interpreter" that can directly execute programs from the data model. It will be rather slow, but to make it work, you'll have to develop an understanding of what it means to "evaluate" a computer program. More generally, this is related to operational semantics. Having an interpreter can also play a useful role in testing and validation.

  4. Type checking. You'll write a static program analyzer that checks the source code for type errors and other semantic problems. We'll also discuss a few advanced topics such as Algebraic Type Systems.

  5. Code generation. You'll have your compiler generate code for LLVM and/or WebAssembly. Once you have this, you'll have programs that execute at native speed comparable to C programs.

  6. Native code generation. Tools such as LLVM still hide a lot of low-level details. We'll conclude by talking about some lower-level issues such as register allocation, activation frames, function calls, linkers, and other matters.
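To give a rough feel for the first three problems, here is a minimal sketch in Python for a toy language of integer arithmetic. It is not Wabbit and it is not the course's project code: the names Integer, BinOp, tokenize, Parser, parse, and evaluate are made up for illustration, and treating "/" as integer division is an assumption.

```python
import re
from dataclasses import dataclass

# Data model: the program as a data structure, not as text.
@dataclass
class Integer:
    value: int

@dataclass
class BinOp:
    op: str
    left: object
    right: object

# Tokenizing: text -> list of (kind, text) pairs.
TOKEN_RE = re.compile(r"\s*(?:(?P<INT>\d+)|(?P<OP>[-+*/()]))")

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"Bad character at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

# Recursive descent parsing: one method per grammar rule.
class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)  # end of input

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    # expr := term { ('+' | '-') term }
    def expr(self):
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.advance()[1]
            node = BinOp(op, node, self.term())
        return node

    # term := factor { ('*' | '/') factor }
    def term(self):
        node = self.factor()
        while self.peek()[1] in ("*", "/"):
            op = self.advance()[1]
            node = BinOp(op, node, self.factor())
        return node

    # factor := INT | '(' expr ')'
    def factor(self):
        kind, text = self.advance()
        if kind == "INT":
            return Integer(int(text))
        if text == "(":
            node = self.expr()
            if self.advance()[1] != ")":
                raise SyntaxError("Expected ')'")
            return node
        raise SyntaxError(f"Unexpected token {text!r}")

def parse(text):
    parser = Parser(tokenize(text))
    node = parser.expr()
    if parser.peek()[0] is not None:
        raise SyntaxError("Unexpected trailing input")
    return node

# Definitional interpreter: walk the data model and compute a value.
def evaluate(node):
    if isinstance(node, Integer):
        return node.value
    if isinstance(node, BinOp):
        left, right = evaluate(node.left), evaluate(node.right)
        if node.op == "+":
            return left + right
        if node.op == "-":
            return left - right
        if node.op == "*":
            return left * right
        if node.op == "/":
            return left // right  # assumption: integer division
    raise RuntimeError(f"Cannot evaluate {node!r}")

print(evaluate(parse("2 + 3 * 4")))  # prints 14
```

Notice that the parser and the interpreter never talk to each other directly. Both only know about the data model, which is part of why figuring out that representation comes first.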

It's important to note that a major goal is to build a stronger intuition for how all of the parts of a compiler actually work. Although there are a lot of existing frameworks and tools that can be used to assist in compiler construction, they are minimally used in this course. Instead, you will be creating a compiler from scratch, working from first principles.
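In the same spirit, type checking and code generation (items 4 and 5 above) can be sketched without any framework. The example below is only an illustration under stated assumptions: the node classes, the RULES table, and the check and generate functions are hypothetical names, not part of the course materials. The emitted strings borrow real WebAssembly instruction mnemonics (i32.const, i32.add, i32.mul, i32.lt_s), but the output is just a flat list of instructions, not a complete WebAssembly module.

```python
from dataclasses import dataclass

# Hypothetical data model for a tiny expression language.
@dataclass
class Integer:
    value: int

@dataclass
class Bool:
    value: bool

@dataclass
class BinOp:
    op: str
    left: object
    right: object

# Assumed typing rules: (operator, left type, right type) -> result type.
RULES = {
    ("+", "int", "int"): "int",
    ("*", "int", "int"): "int",
    ("<", "int", "int"): "bool",
}

def check(node):
    """Return the type name of an expression or raise TypeError."""
    if isinstance(node, Integer):
        return "int"
    if isinstance(node, Bool):
        return "bool"
    if isinstance(node, BinOp):
        left_type = check(node.left)
        right_type = check(node.right)
        result = RULES.get((node.op, left_type, right_type))
        if result is None:
            raise TypeError(
                f"Unsupported operation: {left_type} {node.op} {right_type}"
            )
        return result
    raise TypeError(f"Unknown node {node!r}")

# Operator -> WebAssembly-style stack instruction.
OPCODES = {"+": "i32.add", "*": "i32.mul", "<": "i32.lt_s"}

def generate(node, code=None):
    """Emit stack-machine instructions: operands first, then the operator."""
    if code is None:
        code = []
    if isinstance(node, Integer):
        code.append(f"i32.const {node.value}")
    elif isinstance(node, Bool):
        code.append(f"i32.const {int(node.value)}")  # booleans as 0 or 1
    elif isinstance(node, BinOp):
        generate(node.left, code)
        generate(node.right, code)
        code.append(OPCODES[node.op])
    return code

program = BinOp("+", Integer(2), BinOp("*", Integer(3), Integer(4)))
check(program)  # raises TypeError if the program is ill-typed
print("\n".join(generate(program)))
```

Running the checker before the generator means the code generator can assume every operator it sees has a matching instruction, which keeps it short.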

Practical Takeaways

Although most programmers are unlikely to write a compiler in their day-to-day work, this course touches on a wide variety of practical topics that are applicable elsewhere.

Are You Nuts?

Writing a compiler in only 5 days? Is it even possible? To be sure, compilers is often regarded as one of the most difficult CS courses that one can take. If you take it at a university, you'll probably get a professor who will drag you through the infamous Dragon Book, spend a lot of time doing mathematical proofs (e.g., deriving the LALR(1) parsing algorithm), and focus the course on preparing graduate students for future research in programming languages. I have taught that class. This is NOT that class.

Instead, this is a compilers course aimed at practitioners. As such, the main focus is on coding and software development. Yes, you will learn about some important core compiler concepts such as regular expressions, context-free grammars, type systems, and programming language semantics. However, instead of doing mathematical proofs involving parsing theory, we'll focus on how you would actually go about implementing a parser and doing things such as writing unit tests for it.
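As a small taste of what that looks like, here is a sketch of unit tests for a tokenizer using Python's standard unittest module. The tokenize function, its token names (INT, NAME, OP), and the toy syntax are all assumptions made up for this example; they are not Wabbit's actual lexical rules.

```python
import re
import unittest

# Hypothetical tokenizer for a toy syntax: integers, names, a few operators.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<INT>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<OP>[-+*/=()]))"
)

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"Bad character at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

class TokenizerTests(unittest.TestCase):
    def test_simple_assignment(self):
        self.assertEqual(
            tokenize("x = 42"),
            [("NAME", "x"), ("OP", "="), ("INT", "42")],
        )

    def test_whitespace_is_ignored(self):
        self.assertEqual(tokenize("  1+2  "), tokenize("1 + 2"))

    def test_empty_input(self):
        self.assertEqual(tokenize(""), [])

    def test_bad_character(self):
        with self.assertRaises(SyntaxError):
            tokenize("1 $ 2")
```

If you save this in a file, you can run the tests with python -m unittest followed by the file name. Tests for error cases, like the last one, tend to be the ones that catch the most bugs in a parser.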

To be sure, you will write a lot of code in this course. The course runs for more than 40 hours over five days. The final completed project consists of approximately 2500-3000 lines of Python and is every bit as involved as the project you typically find in a college-level compilers course for computer science majors. However, the project is structured in a way to help you succeed.

About the Instructor

This class is led by David Beazley. Although best known for his work in the Python community, Dave was formerly a tenure-track assistant professor in the Department of Computer Science at the University of Chicago, where he taught a Compilers course along with a variety of other topics in systems and programming languages. Dave is also the creator of the PLY and SLY parsing tools for Python. He recently gave a PyCon talk about these tools. In a previous decade, he also created SWIG, a C/C++ compiler for building scripting language extension modules.