SIC 305, Slot 2: Monday 9.30-10.25, Tuesday 10.35-11.30, Thursday 11.35-12.30

TA: Abhirup Ghosh (abhirup@cse).

Course coverage:

This course will cover the following topics:

    1. Code optimization
    2. Data flow analysis
    3. Register assignment
    4. Vectorization and parallelization of code
    5. Generation of high quality target code
    6. Selected current topics, e.g., exploiting multi-core CPUs

For each topic, we cover the classical techniques and a few research papers.

The course will contain homework assignments (with very little weightage), group projects, individual paper reading & presentation assignments, and exams.

The primary sources for the FIRST FEW lectures would be:

    1. The Aho-Lam-Sethi-Ullman Compilers book.
    2. Slides on code optimization and data flow analysis ( pdf version.)
    3. Slides on partial redundancy elimination  and a brief paper on E-path based PRE  
    4. Advanced Compiler Design & Implementation, Steven S Muchnick, Harcourt Asia/Morgan Kaufmann, 1997.

Honesty policy

Unless otherwise specified, students are both permitted and encouraged to discuss assignment problems but each person must write his/her own assignment completely independently. Discussion of project ideas is also permitted and encouraged, but looking over other persons' code is forbidden.

Plan/log of lectures

Lectures 1,2 (3, 4 January 2011)
Course introduction. Introduction to code optimization - kinds of optimization, levels of optimization

Lecture 3 (6 January 2011)
Optimizing transformations - compile time evaluation, CSE, constant propagation, variable propagation, code movement optimization,
loop optimization
Safety of code movement

Lecture 4 (10 January 2011)
Strength reduction, loop test replacement, Dead code elimination
DAG based local optimization

Lecture 5 (11 January 2011)
Abstract syntax tree, triples, quadruples.
Global optimization: Control flow analysis---dominators and post dominators. Data flow analysis---available expressions

Lecture 6 (13 January 2011)
Quiz 1
Data flow property, Transfer functions for basic blocks.

Lecture 7 (17 January 2011)
Available expressions.
MOP solution. Data flow equations.

Lecture 8 (18 January 2011)
Iterative solution of data flow equations.
Complexity of round-robin iterative data flow analysis.

Lecture 9 (18 January 2011)
Reaching definitions and Live variables.
The meet semi-lattice: Determining the lattice top and lattice bottom elements, partial order, etc.
Initializations of iterative data flow analysis for obtaining MFP.

Lecture 10 (24 January 2011)
Busy and very busy expressions.
Revisiting the meet semi-lattice theory---Monotonicity and its implications for complexity of round robin iterative analysis.
Worklist iterative data flow analysis.
Elements of the partial redundancy elimination (PRE) problem.

Lecture 11 (25 January 2011)
Partial redundancy elimination (PRE)---overview, historical.
PRE by using eliminatability paths.

Lecture 12 (31 January 2011)
E-path_PRE data flows and example.
Introduction to the SSA form.

Lecture 13 (1 February 2011)
Constant propagation using the SSA form (Paper by Wegman, Zadeck):
Simple constant, Sparse Simple constant. Ideas of Simple conditional constant propagation.

Lecture 14 (3 February 2011)
Sparse Conditional Constant Propagation.
Construction of the SSA representation (paper by Cytron et al.)

Lecture 15 (7 February 2011)
Discussion on construction of SSA representation.
Control dependence and its use in program slicing.
Introduction to register assignment.

Lecture 16 (8 February 2011)
Approaches to register assignment by Chow-Hennessey and Chaitin et al.
Chow-Hennessy approach.

Lecture 17 (10 February 2011)
Chow-Hennessy approach to register assignment.
Key idea of improved colouring approach by Briggs et al.

Lecture 18 (14 February 2011)
Comparison of Chow_Hennessy and Chaiting approaches.
Optimal code generation for expressions: Key issues---instruction selection, evaluation order, register utilization.
Parsing and pattern matching approaches.

Lecture 19 (15 February 2011)
Optimal code generation for expressions: Ershov approach---contiguous evaluation of an expression, register requirement of nodes in expression trees.
Overview of the Aho-Johnson approach. Width of program.

Lecture 20 (17 February 2011)
Strongly contiguous evaluation of an expression.

Lecture 21 (28 February 2011)
Code generation algorithm of Aho-Johnson.

Lecture 22 (1 March 2011)
Instruction-level parallelism.

Lecture 23 (3 March 2011)
Instruction scheduling within basic blocks.

Lecture 24 (7 March 2011)
Mid-sem #2.

Lecture 25 (8 March 2011)
Global scheduling of code.

Lecture 26 (10 March 2011)
Software Pipelining.

Lecture 27 (14 March 2011)
Software pipelining of acyclic dependences.

Lecture 28 (15 March 2011)
Software pipelining of cyclic dependences.

Lecture 29 (17 March 2011)
Software pipelining of cyclic dependences (contd.).
``Advanced compiler optimizations for supercomputers", by Padua, Wolfe, CACM, 1986.

Lecture 30 (21 March 2011)
Padua-Wolfe paper (contd.)

Lecture 31 (25 March 2011)
Impact of loop fusion and loop fission on cache performance.
Concurrentization in the presence of loop-carried dependences.
Affine expressions in array access. Iteration spaces. Data dependence and data reuse.

Lecture 32 (24 March 2011)
Overview of concurrentization in presence of loop-carried dependences --- Affine space partitioning.
Representing the iteration space: Matrix representation of inequalities. Projection. Fourier-Metzkin elimination.

Lecture 33 (28 March 2011)
Determination of loop bounds. Change of axis.

Lecture 34 (29 March 2011)
Legality of loop interchange.
Detection of data reuse.

Lecture 35 (31 March 2011)
Affine partitioning---formulation of space-partition constraints.

Lecture 36 (5 April 2011)
Determining processor loop bounds in concurrentized program.

Lecture 37 (7 April 2011)
Improving a concurrent program: Eliminating empty iterations.

Lecture 38 (11 April 2011)
Improving a concurrent program: Eliminating tests from loops.

Lecture 39 (12 April 2011)
Primitive affine transforms.