An Introduction to Program Analysis
A Tutorial in the Winter School on Foundation of Programming, 2019
organized by
Persistent Computing Institute


Uday Khedker, CSE, IIT Bombay


About Program Analysis Coverage Practice Slides



  About Program Analysis

Program analysis refers to semantic analysis of programs that aims to to discover useful properties of programs. It was originally invented for code optimization in compilers which tries to improve the performance of a program in terms of time, space, or power/energy consumed by the program during its execution. However, it has come to be used in many other applications that deal with programs such as verifying, validating, debugging, maintaining, migrating, enhancing, and understanding programs.

  Coverage

This tutorial provides an introduction to both basics and (some) frontiers of program analysis by discussing the following three useful data flow analyses:

Live variables analysis and constant propagation have been around for a long while and are mature analyses that every compiler employs. Pointer analysis has also been around for a long while but it has not reached the same degree of exploitation because for it needs to performed across procedures for finding out practically meaningful information. This hampers its scalability and leading to a precision-scalability tradeoff. Thus it is a ripe area of research. This tutorial provides a glimpse of some of the recent research efforts in pointer analysis.

Although the tutorial primarily focuses on the C/C++ language model, the concepts are generic and are applicable to Java too. More details of these analyses can be found in the text Data Flow Analysis: Theory and Practice.

  Practice

We will perform these analyses using SPAN (Synergistic Program Analyzer), a tool developed by Anshuman Dhuliya as a part of his research in our group at IIT Bombay. Please use this link to set it up on your laptops. The link also gives instructions for setting up its front end based on CLANG so that the analysis can be performed on any C program by generating spanir files corresponding to them. This step of setting up CLANG is a bit time consuming. If you have difficulties in setting up the front end, you can try to modify the spanir representation of a program which is a textual representation describing it in python data structures. A set of C programs and their spanir versions can be found here along with a README file containing useful information.


  Slides

The tutorial will be based on the following set of slides. They contain several animations so viewing them in presentation mode would be more effective.