Introduction to Compiler Design

Introduction to Compiler Design is a textbook is intended for an introductory course on compiler design, suitable for use in an undergraduate programme in computer science or related fields.

The book presents techniques for making realistic, though non-optimising compilers for simple programming languages using methods that are close to those used in "real" compilers, albeit slightly simplified in places for presentation purposes. All phases required for translating a high-level language to machine language is covered, including lexing, parsing, intermediate-code generation, machine-code generation and register allocation. Interpretation is covered briefly.

The book aims to be neutral with respect to implementation languages, so algorithms are presented in pseudo-code rather than in any specific programming language, and suggestions for implementation in several different language flavours are in many cases given. The techniques are illustrated with examples and exercises.

The author has taught compiler design at the University of Copenhagen for over a decade, and the book is based on material used in the undergraduate compiler design course there.

Additional material for use with this book is available below. More material will be added later.

Publication details

Introduction to Compiler Design is published by Springer Verlag.

Lecture slides

With thanks to Jost Berthold and Cosmin Oancea. A few slides refer to course-specific exercises.

Solutions to selected exercises

solutions.pdf

There is an overweight of solutions to exercises from Chapters 1--2. This might be remedied later.

Known misprints

Page 33: It is not true that Flex generates the DFA as program code. Other lexers (e.g., Quex), however, do.
Page 62: In the two lists of set inequalities on the middle of the page, the subset relations point the wrong way. I.e, replace a total of 8 occurrences of ⊆ with ⊇.
Page 63 (near top): The sentence "A grammar that can be parsed using LL(1) parsing is called an LL(1) grammar" is strictly speaking wrong, as you do not parse grammers, you parse strings relatively to a grammar. A more correct formulation is "A grammar where membership of strings in the language generated by the grammer can be tested using LL(1) parsing is called an LL(1) grammar".
Page 70 (near beginning of Section 2.13): The sentence "LR parsers is a class of bottom-up methods for parsing that accept a much larger class of grammars than LL(1)" is imprecise, as parsers do not accept grammers. Instead of "accept", write "can solve the parsing problem for".
Fig. 2.28 (page 75): Remove the occurrences of T and R from the "NFA states" column (states 0, 3 and 4).
Fig. 6.11 (page 139): Replace two occurrences of vase with base.
Page 153 (middle): Replace "move r_d, r_s" by "mov r_d, r_s" (to be consistent about using "mov" for move instructions).
Page 156 (exercise 7.2): Change ":= a+b^last" to "a := a+b^last".
Page 166: It is stated that graph colouring is NP-complete, but this is true only of the decision problem ("Can I do it with N colours?"), whereas the problem of assigning colours ("Do it with N colours!") is only proven NP-hard, so it is potentially more complex than the decision problem.

Torben Mogensen
DIKU, University of Copenhagen, Universitetsparken 5, DK-2100 København Ø
E-mail: torbenm@diku.dk
Mobile phone: (+45) 21849672 Fax: (+45) 35321401