Degree Level Course

Compiler Design

This course introduces the fundamental principles, structures, and techniques used in the design and implementation of language translators, particularly compilers. It provides a comprehensive understanding of the complete compilation process – from lexical and syntax analysis to intermediate representation, code optimization, and target code generation. Through theoretical foundations and hands-on implementation, students will learn to build a working compiler for a subset of the C programming language (nanoC) using tools like Flex and Bison, and will gain exposure to key compiler data structures such as symbol tables, abstract syntax trees, and control flow graphs. The course emphasizes both the theoretical underpinnings and practical aspects of compiler construction. • This course provides a comprehensive introduction to the fundamental concepts and techniques used in language translation. • It balances theoretical foundations in formal languages with practical implementation using the Flex-Bison tool-chain. • It uses a nanoC, designed as a subset of C, as a programming language for illustrating and providing a sample implementation of various concepts and practices of language translation

by Prof. Partha Pratim Das

Course ID: BSCS4032

Course Credits: 4

Course Type: Elective

Pre-requisites: BSCS3005 -  Programming in C

What you’ll learn

By the end of the course, students will be able to: – Understand the architecture of a compiler and the distinct roles of its front-end and back-end components in the language translation process.
– Apply formal methods in lexical and syntax analysis, including regular expressions, finite automata, and context-free grammars, and implement these using Flex and Bison.
– Design and manage compiler data structures, such as symbol tables, abstract syntax trees (AST), and intermediate representations like TAC, DAG, and SSA.
– Implement translation schemes for key language features in C, including expressions, control structures, type systems, and runtime semantics.
– Understand run-time environments and interactions with it, including memory organization, activation records, and function call protocols.
– Construct control flow graphs and perform basic block-level optimizations for improving the performance of generated code.
– Translate machine-independent code to assembly code, understand the basics of register allocation, and interface with target architectures like x86.
– Build a working compiler for picoC, transforming source code through all compilation stages into executable machine code, demonstrating an end-to-end compilation pipeline.

Course structure & Assessments

For details of standard course structure and assessments, visit Academics page.

WEEK 1 Introduction to Compiler Construction : Evolution of compilers; front-end and back-end phases (HLL → TAC → target code, optimization); compilation environment (preprocessors, linkers, loaders); generator tools (Flex/Bison ecosystem)
WEEK 2 Lexical Analysis : Tokens, lexemes, and buffer management; regular expressions and transition diagrams; NFA → DFA construction and minimization; scanner implementation using Flex
WEEK 3 Basic Syntax Analysis : Context-free grammars and derivations; recursive descent parsing; predictive parsing (FIRST/FOLLOW, LL(1)); grammar properties (ambiguity, nullable/unit productions); operator precedence parsing
WEEK 4 Syntax Analysis : Shift-reduce parsing; LR(0), SLR(1), LR(1), and LALR(1) parsing techniques; parsing tables and FSM; parser implementation using Bison
WEEK 5 Semantic Analysis & Attribute Grammars : Syntax-directed definitions and translation; S-attributed and L-attributed grammars; type checking and type systems; type inference (overloading, polymorphism); expression evaluation implementation
WEEK 6 Symbol Table & Memory Organization : Symbol table design and lookup (hashing); scope management with nested tables; memory layout and allocation; activation records and parameter passing; integration with lexer/parser
WEEK 7 Intermediate Code Generation – Declarations & Expressions : Intermediate representations (AST, DAG, TAC); arithmetic expression translation; array address computation; variable declarations and storage layout; implementation of expressions in nanoC
WEEK 8 Intermediate Code Generation – Control Flow & Boolean Logic : Boolean expressions and short-circuit evaluation; backpatching; translation of control structures (if, loops); procedure calls; implementation of control flow in nanoC
WEEK 9 Basic Code Optimization : Optimization goals and CFG; basic blocks and flow graphs; common subexpression elimination and copy propagation; dead code elimination and constant folding; peephole optimization
WEEK 10 Global Optimization & Data Flow Analysis : Data flow analysis (reaching definitions); liveness analysis; loop optimizations (LICM, induction variables); global expression optimization; inter-procedural analysis
WEEK 11 Target Code Generation : Code generation design and instruction selection; register allocation (graph coloring); TAC to assembly translation; instruction scheduling; function handling and calling conventions
WEEK 12 Advanced Runtime & Professional Tools : Garbage collection techniques; JIT compilation and profiling; parallelizing compilers (SIMD/MIMD); domain-specific compilers; emerging trends (LLVM, AI-driven compilation)
+ Show all weeks

Prescribed Books

The following are the suggested books for the course:

– Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Monica S Lam, Ravi Sethi, and Jeffrey D. Ullman. Contributor: Sorav Bansal, Updated 2nd Edition, Pearson, 2024

References: – Compiler Design in C by Allen I. Holub, Prentice-Hall, 1989 – Programming Languages and Operational Semantics by Maribel Fern´andez, Springer Nature, 2014

Language Manual: – The C Programming Language by Brian Kernighan and Dennis Ritchie, 2nd Edition, Prentice-Hall, 1988

Tools Manual: – Flex and Bison: Unix Text Processing Tools by John Levine, O’Reilly, 2009

About the Instructors

Prof. Partha Pratim Das
Professor, Computer Science and Engineering, IIT Kharagpur

Dr. Das obtained his B Tech, M Tech and Ph D degrees in1984, 1985 and 1988 respectively from IIT Kharagpur. He served as a faculty member in the Department of Computer Science and Engineering, IIT Kharagpur from 1988 to 1998. In 1998, he moved to the Industry and served in Senior Director positions. In 2011, Dr. Das joined back the Department as a Professor. He is the Joint PI of National Digital Library of India project of MoE and leads the national initiative to integrate the digital learning contents.

...  more

Dr. Das is a regular contributor to the SWAYAM Program and his courses on Programming in C++, Object Oriented Analysis and Design, and Data Base Management Systems have been attended by thousands of students since 2017. Over the last decade he has been teaching Compilers, Software Engineering, Image Processing, Foundations of Algorithms & Machine Learning, and Principles of Programming Languages for which he received top students’ feedback 6 times during 2014 to 2018. Dr. Das has published over 50 papers in international journals in his interest areas spanning Computer Vision, Digital Learning, Digital Geometry, Human-Computer Interaction, Medical Image Analysis, Computer Analysis of Indian Classical Dance, and Productivity in Software Engineering.

  less

Other courses by the same instructor: BSCS2001 - Database Management Systems

support@study.iitm.ac.in
7850999966
IITM BS Degree Office, 3rd Floor,
ICSR Building, IIT Madras,
Chennai - 600036

Please use only the above methods for program queries. Response time: 3 working days. During peak periods, Google Meet links will be shared. Call wait times may be longer.