Write a Compiler

Upcoming Course Dates (Chicago):

• March 16-20, 2020.

Instructor: David Beazley

Price: $2500

Includes:

  • Breakfast and lunch
  • Course materials

Location: 5412 N Clark Street #218, Chicago, IL 60640



Overview

Come to Chicago and shatter your brain by writing a compiler for a new programming language!

Compilers is often considered a capstone course for computer science majors. There is a reason for that--a compiler touches almost every topic of computer science ranging from theory to computer architecture. Writing a compiler is also an exercise in managing software complexity. Compilers have a lot of moving parts, involve unusual tools, are difficult to test, and are challenging to debug. You'll learn a lot by writing a compiler. Plus, you'll be able to brag about it later.

Target Audience

This class is for more experienced programmers who'd like to expand their skills by taking on a good challenge. If you're a self-taught programmer and you're curious about what writing a compiler is all about, then this is the course for you. You'll learn a lot more about programming languages and gain much better insight into the various coding tools you use every day.


"If you've seen any of David Beazley's fun and mind blowing talks, you can begin to imagine his week-long, intensive courses, held in his own Chicago office -- or "lair", as he calls it. It's even better than you think: starting with breakfast every day, David is a gracious host, a daredevil coder, and an inspiring teacher. I took his Compiler Course and I hope to attend more of his unique, hands-on workshops."

Luciano Ramalho, author of Fluent Python

Instruction Format

This course is almost entirely project-focused. A typical day might consist of an hour of discussion followed by 7-8 hours of coding. During coding, there is typically considerable group discussion about coding techniques, design tradeoffs, testing, and other related topics.

Prerequisites

You might not think that you're ready to write a compiler, but if you've been coding for a while and know the basics of Python, it's something that you could probably tackle. No prior background in compilers is required, although awareness of common programming language concepts (e.g., type systems, functions, classes, scoping rules, etc.) is strongly advised. Some knowledge of regular expressions and computer architecture (machine instructions, memory, etc.), along with prior use of a compiled language, is also recommended.

Syllabus

The course is structured around the goal of creating a small programming language (currently based on Go) and compiling it to executable programs via LLVM. Recent versions of the course have also targeted WebAssembly. The code produced by your compiler will have performance comparable to programs written in C. You'll also see how to make a Just-In-Time (JIT) compiler.

There are 8 project milestones:

  1. Lexing and tokenization. You'll write a tokenizer for recognizing text patterns.
  2. Parsing and Abstract Syntax Trees (ASTs). Expressions are parsed into an AST data structure. Although there are tools that can help with this, you'll learn how parsing works at a low level by writing a recursive descent parser from scratch.
  3. Type checking. You'll write a static program analyzer that checks the source code for type errors and other semantic problems.
  4. Intermediate code generation. The AST is converted to an intermediate representation based on a stack architecture.
  5. Code generation. The intermediate code is turned into runnable programs using LLVM and/or WebAssembly. With the completion of this stage, you have the basic foundation of the complete compiler. Subsequent steps add more functionality.
  6. Relations and booleans. Relational operators and boolean tests are added so that control flow structures can be added.
  7. Control flow. Conditionals and loops are added to the language. Control-flow analysis, basic blocks, and other related concepts are introduced.
  8. Functions. User-defined functions are added to the language.
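
To give a rough feel for how the early milestones fit together, here is a minimal sketch in Python. It is not the course's project code or language: the token names, the tuple-based AST, the instruction names (PUSH, ADD, and so on), and the tiny arithmetic grammar are all assumptions made up for illustration. It tokenizes an expression, parses it with a hand-written recursive descent parser, flattens the AST into stack-based intermediate code, and then interprets that code instead of handing it to LLVM or WebAssembly.

```python
import operator
import re

# Hypothetical token patterns for a tiny arithmetic language (illustration only).
TOKEN_SPEC = [("NUMBER", r"\d+"), ("OP", r"[+\-*/()]"), ("SKIP", r"\s+")]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Milestone 1: turn source text into a list of (kind, value) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"Bad character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":          # whitespace is discarded
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

class Parser:
    """Milestone 2: recursive descent parser producing a tuple-based AST.

    expr   : term (('+' | '-') term)*
    term   : factor (('*' | '/') factor)*
    factor : NUMBER | '(' expr ')'
    """
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = ("binop", op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = ("binop", op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "NUMBER":
            return ("num", int(value))
        if (kind, value) == ("OP", "("):
            node = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("Expected ')'")
            return node
        raise SyntaxError(f"Unexpected token {value!r}")

def parse(text):
    parser = Parser(tokenize(text))
    node = parser.expr()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("Unexpected trailing input")
    return node

OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate(node, code=None):
    """Milestone 4: flatten the AST into stack-machine instructions."""
    code = [] if code is None else code
    if node[0] == "num":
        code.append(("PUSH", node[1]))
    else:
        _, op, left, right = node
        generate(left, code)               # operands first (post-order walk)
        generate(right, code)
        code.append((OPCODES[op],))
    return code

# Assumption: integer-only arithmetic, so DIV is floor division.
ARITH = {"ADD": operator.add, "SUB": operator.sub,
         "MUL": operator.mul, "DIV": operator.floordiv}

def run(code):
    """Stand-in for milestone 5: interpret the stack code directly."""
    stack = []
    for op, *args in code:
        if op == "PUSH":
            stack.append(args[0])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(ARITH[op](left, right))
    return stack.pop()
```

For example, `generate(parse("1 + 2 * 3"))` yields `[("PUSH", 1), ("PUSH", 2), ("PUSH", 3), ("MUL",), ("ADD",)]`, and passing that list to `run` returns 7. A real compiler would insert type checking between parsing and code generation and would emit LLVM or WebAssembly rather than interpret the instructions.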

It's important to note that a major goal is to build a stronger intuition for how all of the parts of a compiler actually work. Although there are a lot of existing frameworks and tools that can be used to assist in compiler construction, they are NOT used in this course. Instead, you will be creating a compiler from scratch from first principles.

Practical Takeaways

Although most programmers are unlikely to write a compiler in their day-to-day work, this course touches on a wide variety of practical topics that are applicable elsewhere. These include:

Are You Nuts?

Writing a compiler in only 5 days? Is it even possible? To be sure, compilers is often regarded as one of the most difficult CS courses that one can take. If you take it at a university, you'll probably get a professor who will drag you through the infamous Dragon Book, spend a lot of time doing mathematical proofs (e.g., deriving the LALR(1) parsing algorithm), and focus the course on preparing graduate students for future research in programming languages. I have taught that class. This is NOT that class.

Instead, this is a compilers course aimed at practitioners. As such, the main focus is on coding and software development. Yes, you will learn about some important core compiler concepts such as regular expressions, context-free grammars, type systems, and programming language semantics. However, instead of doing mathematical proofs involving parsing theory, we'll focus on how you would actually go about implementing a parser and doing things such as writing unit tests for it.
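
As a small illustration of that practical focus, here is a sketch of unit tests for one piece of a compiler front end. The `tokenize` function and its token names are hypothetical stand-ins written for this example, not the course's code; the point is only that each piece of a compiler can be pinned down with small, direct tests using Python's standard `unittest` module.

```python
import unittest

def tokenize(text):
    """Hypothetical hand-written tokenizer: integers and one-character operators."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1                          # skip whitespace
        elif "0" <= ch <= "9":
            start = i
            while i < len(text) and "0" <= text[i] <= "9":
                i += 1                      # consume the whole run of digits
            tokens.append(("NUMBER", text[start:i]))
        elif ch in "+-*/()":
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"Bad character {ch!r} at position {i}")
    return tokens

class TokenizerTests(unittest.TestCase):
    def test_multi_digit_number(self):
        self.assertEqual(tokenize("1234"), [("NUMBER", "1234")])

    def test_operators_without_spaces(self):
        self.assertEqual(
            tokenize("1+(2*3)"),
            [("NUMBER", "1"), ("OP", "+"), ("OP", "("), ("NUMBER", "2"),
             ("OP", "*"), ("NUMBER", "3"), ("OP", ")")],
        )

    def test_whitespace_is_ignored(self):
        self.assertEqual(tokenize("  7 \t\n"), [("NUMBER", "7")])

    def test_empty_input(self):
        self.assertEqual(tokenize(""), [])

    def test_bad_character_is_reported(self):
        with self.assertRaises(SyntaxError):
            tokenize("1 $ 2")
```

Saved in a file, these tests can be run with `python -m unittest`. The same style extends to a parser: feed it a short source string and compare the resulting tree against the structure you expect.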

To be sure, you will write a lot of code in this course. The course runs for more than 40 hours over five days. The final completed project consists of approximately 2500-3000 lines of Python and is every bit as involved as the project you typically find in a college-level compilers course for computer science majors. However, the project is structured in a way to help you succeed.

About the Instructor

This class is led by David Beazley. Although best known for his work in the Python community, Dave was formerly a tenure-track assistant professor in the Department of Computer Science at the University of Chicago, where he taught a Compilers course along with a variety of other topics in systems and programming languages. Dave is also the creator of the PLY and SLY parsing tools for Python. He recently gave a PyCon talk about these tools. In a past decade, he also created SWIG, a C/C++ compiler for building scripting language extension modules.