CS615: Formal Specification and Verification of Programs
Autumn 2007

Instructor	Supratik Chakraborty	email: supratik [AT] cse ...
Friendly Teaching Assistants	Kaustubh Nimkar	email: knimkar [AT] cse ...
	Abhisekh Sankaran	email: abhisekh [AT] cse ...

Announcements

All scores and grades have been posted. Happy vacations!

Course Description

The purpose of this course is to train students in the art and science of specifying correctness properties of programs and proving them correct. The primary emphasis will be on theoretical aspects of specification formalisms and algorithmic verification. There will be occasion to study some of the issues that arise in deploying these techniques in practice, and solution strategies for them. Techniques covered in the course should be applicable in both software and hardware (modeled as processes) domains.

Prerequisites

This is a graduate level course, and good working knowledge (at undergraduate level) of propositional and first order logics, algorithm design and analysis, and automata theory are required for crediting/auditing this course.

A quick check for prerequisites: If you are comfortable answering all questions in this set , it is likely you meet the prerequisites for this course. However, if you are having problems answering these questions and still want to register for this course, please contact the instructor at the earliest.

Hours

What	When	Where
Lectures	Tue 15:30-17:00, Fri 14:00-15:30	Seminar Hall, Old CSE Bldg.
Office hour	Wed 16:00-17:00	CFDVS Lab 1 (Maths basement) (by email appointment)

Tutorials	When	Where
Tutorial 1	Tue, July 31 15.30-17.00	CSE Seminar Hall
Tutorial 2	Tue, Aug 7 15.30-17.00	CSE Seminar Hall
Tutorial 3	Fri, Sept 21 18.30-20.00	CSE Seminar Hall
Tutorial 4	Fri, Oct 26 18.30-20.00	CSE Seminar Hall
Tutorial 5	Fri, Nov 2 18.30-20.00	CSE Seminar Hall
Tutorial 6	Fri, Nov 16 14.30-15.30	CSE Seminar Hall

Grading

Quizzes + Homeworks	30%
Endterm	50%
Project	20%

Practice Problems

Practice problems will be given. These are not meant for submission and your performance in these will not contribute towards your final grade. You are encouraged to solve all problems by yourself or in small groups, in order to gain understanding of the ideas. You are also encouraged to approach the instructor or TA (by setting up an appointment by email) for help with practice problems, if needed.

Project Policy

Each student is required to do a small project as a hands-on exercise in formal verification of a system. Each project will involve choosing a code fragment (ideally approx 50-100 lines of uncommented C code) from the Linux kernel, and verifying some interesting properties of the chosen code fragment. This will require modeling the environment of the code fragment appropriately, and inserting appropriate environmental constraints, where appropriate. The code fragment and properties must be chosen by students and approved by the instructor/TA. We will use a restricted set of tools based on abstract interpretation in numerical abstract domains (octagons, polyhedra, etc) and tools based on predicate abstraction for the projects. The choice of tool should be made in consultation with the instructor/TA.

Students are required to execute the project in groups of at least two and at most three. Each project will be evaluated separately based on a project report, to be submitted prior to end-semester exams, and a live demonstration of the project according to a schedule to be announced later.

Links for downloading tools/documentation to be used in your projects:

BLAST
ARMC
SLAM/SDV
Dagger (link to be provided soon)

Some more logistics ...

There will be no midterm examination.

Endterm exam and quizzes will be open book, notes and any material brought to the exam hall.

If you miss a quiz for a genuine reason, meet the instructor after setting up an appointment.

Each quiz will carry equal weightage.

Informal discussion among students for the purpose of clarification is strongly encouraged.

There will be zero tolerance for dishonest means like copying solutions from others in quizzes, projects and end-term exam. Anybody found indulging in such activities will be summarily awarded an FR grade. The onus will be on the supposed offender to prove his or her innocence.

Quizzes, Assignments, Practice Problems ...

For some practice problems, please check previous years' course home pages [2006, 2005, 2004]

Homework 1	Due Sept 11, 2007 (5.00 pm)
Homework 2	Due Oct 3, 2007 (5.00 pm)
Quiz1	Sept 28, 2007 (2.00-3.30 pm)
Quiz2	Oct 30, 2007 (3.30-5.00 pm)
Endterm exam

Reading Material

Papers, chapters and articles to be announced in class.
Class notes.

Current Reading List

Propositional and predicate logic prerequisites

Chapters 1 and 2 of Logic in Computer Science: Modeling and Reasoning about Systems by Michael Huth and Mark Ryan, Cambridge University Press

Hoare Logic: Partial correctness proofs

Chapter 4 of Logic in Computer Science: Modeling and Reasoning about Systems by Michael Huth and Mark Ryan, Cambridge University Press
Sections 1, 2, 3, 4, 5 (upto subsection 5.2.3 and then subsections 5.3.1 and 5.3.2) and 7 (upto subsection 7.3) of Chapter 15, Methods and Logics for Proving Programs by P. Cousot, in Handbook of Theoretical Computer Science, Vol B (Formal Models and Semantics), edited by Jan Van Leeuwen, Elsevier and The MIT Press
Also available from Prof. Patrick Cousot's web page.
[You need to read the sections on least fixed point formulations of loop invariants to follow what has been taught in class. Be advised that this paper is relatively heavy on notation and requires familiarizing yourself with notations introduced in Sections 2, 3 and 4 of the paper.]

Boolean programs and predicate abstraction

A beautiful introduction to the paradigm of Boolean Programs by Tom Ball and Sriram Rajamani
BLAST: Berkeley Lazy Abstraction Software Verification Tool web site is a must-visit for all students of this course. We will be using this tool for some of our projects. You should go through the following papers from the list of papers available at the above web site.
- Lazy Abstraction by Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar and Gregoire Sutre, Proceedings of the 29th Annual Symposium on Principles of Programming Languages (POPL), ACM Press, 2002, pp. 58-70.
- Abstractions from Proofs by Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar and Kenneth L. McMillan, In ACM SIGPLAN-SIGACT Conference on Principles of Programming Languages, 2004. (You can skim through Section 3 and skip Section 6 of this paper)
- The Software Model Checker BLAST: Applications to Software Engineering by Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, and Rupak Majumdar, Int. Journal on Software Tools for Technology Transfer, 2007.

Abstract interpretation ...

A wonderful tutorial by Prof. Cousot, explaining in fairly simple language what abstract interpretation is all about. Highly recommended for everybody taking this course.
Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints, a classic 1977 paper by Patrick Cousot and Radhia Cousot.
Slides (in particular Lecture 15) of a course on Abstract Interpretation offered by Prof. Cousot at MIT in Spring 2005.
Automatic discovery of linear restraints among variables of a program by P. Cousot and N. Halbwachs, in Conference Record of the Fifth Annual Symposium on Principles of Programming Languages, pages 84-97, 1978. A link to the paper is available from Prof. Cousot's home page. This paper discusses the use of the widening operator for computing linear invariants.
For those interested in knowing how refinement can be done automatically in abstract domains like octagons and polyhedra, this is an interesting paper to read: Counterexample driven Refinement for Abstract Interpretation, by Bhargav S. Gulavani and Sriram K. Rajamani, in Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), March 2006

Shape analysis (reasoning about heaps) with 3-valued logic ...

Although we have not covered all topics discussed in the following papers, the POPL 99 paper is a good introduction to what we have covered in class. The TOPLAS 02 paper is a much more detailed version for those interested in additional reading.
- Parametric Shape Analysis via 3-valued Logic by M. Sagiv, T. Reps and R. Wilhelm, in Proceedings of ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 1999
- Parametric Shape Analysis via 3-valued Logic by M. Sagiv, T. Reps and R. Wilhelm, in ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 24, Number 3, 2002.

References for the more interested ...

A good set of slides prepared by Prof. S. Ramesh on material relevant to this course.
Prof. Cousot's course on Abstract Interpretation at MIT in Spring 2005.
Formal Methods educational resources from the world-wide web Formal Methods page.
The Temporal Logic of Reactive and Concurrent Systems: Specification by Zohar Manna and Amir Pnueli, Springer Verlag.
Temporal Verification of Reactive Systems: Safety by Zohar Manna and Amir Pnueli, Springer Verlag.

Lecture Schedule

Date	Topics details
Jul 27	Introduction, logistics. A simple programming language, examples of some programs in this language, motivation for formal reasoning about programs. Notion of state of a program in execution (process), reasoning about programs with variables of unbounded type using first-order logic formulae. Intuitive notion of loop invariants, and an ad-hoc proof of correctness of a program using suitably supplied loop invariants.
Jul 31	Tutorial on predicate logic by Abhisekh Sankaran. Compensation class on Aug 17.
Aug 3	No class. Compensation class on Aug 24.
Aug 7	Tutorial on predicate logic and simple program analysis by Abhisekh Sankaran and Kaustubh Nimkar. Compensation class on Aug 31.
Aug 10	No class. Compensation class on Oct 5.
Aug 14	Notions of preconditions, postconditions and Hoare triples. Understanding partial correctness, termination and total correctness of programs. Strongest postconditions and weakest preconditions: computation, interpretation and use. Understanding the relation between strongest postcondition and weakest precondition of a set of states with respect to a statement.
Aug 17	A proof system for Hoare logic for our simple programming language: Inference rule schemas for different constructs. Using these schemas to prove Hoare triples. Illustration through sample proof examples. Importance of obtaining the right loop invariants for proving interesting properties of programs. Discussion on how difficult it can get to obtain the right loop invariant.
Aug 17 Comp. class	Coding the halting problem of Turing Machines in our simple programming language. Discussion on why proving Hoare triples or coming up with tightest loop invariants is undecidable in general even for simple integer programs
Aug 21	Reduction of halting problem for Turing Machines to checking validity of Hoare triples for programs in our simple programming language. Motivation for imprecise but careful reasoning. Illustration of how values of suitably chosen predicates can be used to approximately describe the state of a program. Illustration of how this leads to a finite state transition diagram of a program without function calls. Utility of such a state transition diagram in obtaining simple loop invariants
Aug 24	Visualizing a program as a state set transformer. Discussion on how proving a Hoare triple is equivalent to searching whether the set of states reached from a given initial set of states has a specified property. Discussion on lattices, and powerset lattice in particular. Understanding the meaning of a program as a function on an appropriate powerset lattice.
Aug 24 Comp. class	Introduction to Galois connection and Galois insertion. Discussion on alpha and gamma functions in the context of program analysis, and why it makes sense for these functions to have certain properties.
Aug 28	More discussion on motivating properties of alpha and gamma functions relating concrete and domains for reasoning about programs. Computing reachable states of a program the limiting lub (least upper bound) a suitable function in the powerset or abstract lattice. Illustration through a simple example.
Aug 31	Illustration of reasoning based on different abstract domains for the same program. More discussions about limiting lubs: use in computing reachable states, loop invariants and invariants at arbitrary program locations. Computing the limiting lub as the limiting value of repeated applications of a suitably defined monotone function, discussion on monotonicity of image functions in concrete and abstract domains.
Aug 31 Comp. class	More discussion on the notion of using abstract domains for program analysis. Discussion on limiting lub, fixed points and expressing the limiting lub of previous class as the least fixed point of a suitably defined monotone function. Tarski's theorem on fixed points of monotone functions in complete lattices, least fixed points. Computing reachable states of a program, loop invariants, etc. as special cases of least fixed point computations.
Sept 4	Translating relations between points in concrete lattice to relations between corresponding points in abstract lattice through Galois connection, and vice versa. Defining abstract image/postcondition operators, computing the least fixed point of a suitable state transformer in the abstract lattice, relating to the least fixed point of state set transformer in concrete lattice. Advantages of computing overapproximations of least fixed points in abstract lattice. Illustration of least fixed point computation of a state set transformer in an infinite abstract lattice -- that of of finite sets of linear inequalities. Need for learning patterns between elements of a sequence to compute its limit. Introduction to the widen operator.
Sept 7	Discussion on widening: widening as a learning operation, properties of a widen operator, some analogies to finding limits of numerical sequences. Discussion on how the limit of a given increasing chain can be overapproximated by applying widening to another appropriately constructed sequence, that is element-wise greater than the original sequence. Need for detecting when widening operation applied to an increasing sequence has converged: difficulty of doing this for arbitrary increasing sequences, ease of doing this for specially constructed increasing sequence; constructing these special sequences in the context of program analysis.
Sept 18	General method of computing abstractions of loop invariants using widen operation over the abstract domain. Widen operation for the domain of intervals; illustration of where the domain of intervals with the anpve widen operation helps; illustration of where it doesn't help.
Sept 21	Abstract domain of Difference Bound Matrices (DBM)s, canonical DBMs, lub, glb and widen operation on DBMs, analyzing a program where using the abstract domain of intervals isn't sufficient to prove a post-condition, but the domain of DBMs suffices.
Sept 25	Abstract domain of polyhedra, lub, glb and widen operations on polyhedra. Illustrating potential improvements in precision when using the polyhedra abstract domain compared to the DBM abstract domain.
Sept 28	Quiz 1
Oct 2	Holiday: Gandhi Jayanti
Oct 5	More discussion on the polyhedra abstract domain and widen operator. Illustrating an example where polyhedral abstract domain fares very poorly. Importance of choosing the right abstract domain. Introduction to predicate abstraction.
Oct 5 Comp. class	Discussion on predicate abstraction techniques. Illustrating how counterexamples can be analyzed to identify the right set of predicates: symbolic simulation of counterexamples, constraint solving, unsatisfiable cores and predicate selection. Using a chosen set of predicates to overapproximate the transition relation of a program.
Oct 9	The paradigm of Boolean programs. Mapping Boolean programs (possibly with function calls) to push-down automata. Discussion on location reachability in push-down automata. Some illustrations of Boolean program construction.
Oct 12	More discussion on the construction of Boolean programs. Need to maintain an overapproximation of the set of traces of the original program. Expressing while statements using assume statements. Converting assignment and "assume" statements in the original program to a set of statements in the Boolean program.
Oct 16	Constructing Boolean programs: Translating "while", "if then else" and "assert" statements in the original program. Computing weakest preconditions of different program constructs for use in constructing Boolean programs. Discussion on CounterExample Guided Abstraction Refinement (CEGAR) loop in modern tools. Location-specific predicates as opposed to globally tracked predicates: advantages and utility.
Oct 19	Predicate abstraction with location-specific predicates: illustration and discussion. Craig's interpolation theorem and intuitive arguments of why interpolants are suitable indicators of location-specific meaningful predicates.
Oct 23	Craig's interpolation theorem and its application in deriving location-specific succinct predicates during counterexample analysis. Comparison with analysis that derives predicates that must be tracked at all locations in the program.
Oct 26	Guest lecture by Aditya Nori and Kanika Nema, Microsoft Research Labs India
Oct 30	Quiz 2
Nov 2	Reasoning about heaps: shape analysis with 2-valued and 3-valued logic structures. Representing abstract heap structures as directed graphs, representing directed graphs as logic structures, 3-valued information lattice, need for 3-valued logic structures to represent abstract heap structures. Discussion on use of different predicates for representing heap structures with varying properties
Nov 6	More discussion on 3-valued logic structures as abstract representations of heap, finiteness of universe given a set of predicates to track, discussion of role of summary nodes. Illustrating abstract shape analysis using 3-valued logic structures for a simple linked list manipulating program
Nov 9	Holiday: Diwali
Nov 13	Course evaluation Understanding how abstract shape analysis can be done using transformation of predicates representing shape properties of heap. Discussion on how simple program statements result in updation of predicates.
Nov 18 Comp. class	Discussion on updation of shape predicates due to different assignment statements involving pointer-type variables and pointer fields. Understanding the summary predicate and how different assignment statements cause this predicate to be updated. Summary of topics taught in this course.

CS615: Formal Specification and Verification of Programs Autumn 2007