CS615: Formal Specification and Verification of Programs
Autumn 2017


Instructor Supratik Chakraborty email: supratik [AT] cse ...
Friendly Teaching Assistant Sumith Kulal email: sumith [AT] cse ...

Announcements

  • Project Stage II submission and viva date: Nov 24, 2017, 3-5pm
  • Detailed instructions for your projects have been uploaded here.
  • You can download CAnalyzer API documentation, the VM for CAnalyzer and the code for a simple (and partial) implementation for the sign domain here. The site is password protected. Please check moodle for login credentials.
  • Please subscribe to the moodle page for CS615 and to the piazza page for CS 615A at IIT Bombay. Note that the course number is CS615A on piazza. Please see posting on moodle to get the access code for enrolling in piazza.

Course Description

The purpose of this course is to train students in the art and science of specifying correctness properties of programs and proving them correct. The primary emphasis will be on deductive and algorithmic verification techniques. There will be occasion to study some of the issues that arise in deploying these techniques in practice, and solution strategies for them. The course will make use of projects and tools to drive home different ideas taught in class.

Prerequisites

This is a graduate level course, and good working knowledge of propositional and first order logics, algorithm design and analysis are required for crediting/auditing this course. Some background in automata theory is also needed. These topics will not be reviewed in class, and the instructor will assume familiarity with them. Some topics in the theory of lattices are also required as pre-requisites. However, there will be an overview of these topics prior to their usage in lectures. If you are not sure whether you meet the prerequisites of this course, please meet the instructor at the earliest.
In propositional and first-order logic, you are expected to be familiar with the material in
  • Chapters 1 and 2 of Logic in Computer Science: Modelling and Reasoning about Systems by Michael Huth and Mark Ryan, Second Edition, Cambridge University Press
It would also help (although some of this will be quickly covered in the course) if you are familiar with the material in
  • Chapters 1 and 2 of Introduction to Lattices and Orders by B. A. Davey and H. A. Priestley, Second edition, Cambridge University Press,

Contact Hours

What When Where
Lectures Tue 17:30-18:55,
Fri 17:30-18:55
SIC 305, KReSIT Building
Office hour Wed 16:00-17:00 Room 314, New CSE Building
(by email appointment only)

Related links ...

Grading (tentative schedule)

In-class quizes and tool-based projects 20%
Midterm 20%
Endterm 40%
Individual end-term project 20%

Project Policy

Each student group (ideally of size 2) is required to do a small project as a hands-on exercise in building a formal verification tool. For more details of the projects, please check
here.

Each project will be evaluated separately based on a project report to be submitted prior to end-semester exams, and a live demonstration of the project.

The project evaluation schedule is as given below. During each of these evaluations, students must meet the instructor/TA and explain their project, and also give a demonstration of the project, if asked for.
Deliverables Date Where
Code and properties to be analyzed; abstract domain to be used; plan of implementation in CAnalyzer; a brief report Oct 27 Moodle
Final report; demonstration of working of abstract domain in CAnalyzer, using CAnalyzer with your domain to demonstrate property verification Nov 24 CFDVS Lab

List of some formal verification tools relevant to the contents of this course.

Some more logistics ...

  • Midterm and endterm exams, and quizes will be open book, open notes and any open material brought to the exam hall.
  • If you miss a quiz for a genuine reason, meet the instructor as soon as possible after setting up an appointment.
  • Each quiz/tool assignment will carry equal weightage.
  • Informal discussion among students for the purpose of clarification is strongly encouraged.
  • There will be zero tolerance for dishonest means like copying solutions from others in quizzes, projects and end-term exam. Anybody found indulging in such activities will be summarily awarded an FR grade. The onus will be on the supposed offender to prove his or her innocence. Please read this to understand the distinction between "copying" and "discussion"
  • Quizzes, Assignments, Practice Problems ...

    For some practice problems from previous years, please check previous years' course home pages. Note that the set of topics taught in earlier years may vary slightly from those taught in the current offering.

    Problem sets for current offering

    Problems Graded? Expected to be completed by
    Key-Hoare exercises, see also this document from Chalmers University
    Please read this note on Key-Hoare first.
    Ungraded Aug 18, 2017
    In-class Key-Hoare based quiz Graded Aug 18, 2017
    In-class quiz Graded Sep 8, 2017
    Midsem exam Graded Sep 14, 2017
    In class quiz Graded Nov 3, 2017
    Endsem exam Graded Nov 11, 2017

    Reading Material

    • Papers, chapters and articles to be announced in class.
    • Class notes.

    Current Reading List

    Propositional and predicate logic prerequisites
    1. Chapters 1 and 2 of Logic in Computer Science: Modeling and Reasoning about Systems by Michael Huth and Mark Ryan, Cambridge University Press
    Hoare Logic: Partial correctness proofs
    1. Chapter 4 of Logic in Computer Science: Modeling and Reasoning about Systems by Michael Huth and Mark Ryan, Cambridge University Press
    2. A deck of slides used in Microsoft Research Summer School 2008. Skip the parts about Separation Logic for now.
    3. A short note on using Key-Hoare.
    4. An introductory set of slides on program verification, programs as state transition systems, and the notions of abstract and concrete states
    Abstract interpretation
    1. A wonderful tutorial by Prof. Cousot, explaining in fairly simple language what abstract interpretation is all about. Highly recommended for everybody taking this course.
    2. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints, a classic 1977 paper by Patrick Cousot and Radhia Cousot.
    3. Invited talk on "Proving Programs Correct by Abstract Interpretation" at WEPL 2015.
    4. Slides (in particular Lecture 15) of a course on Abstract Interpretation offered by Prof. Cousot at MIT in Spring 2005.
    5. Automatic discovery of linear restraints among variables of a program by P. Cousot and N. Halbwachs, in Conference Record of the Fifth Annual Symposium on Principles of Programming Languages, pages 84-97, 1978. A link to the paper is available from Prof. Cousot's home page. This paper discusses the use of the widening operator for computing linear invariants.
    6. For those interested in knowing how refinement can be done automatically in abstract domains like octagons and polyhedra, this is an interesting paper to read: Counterexample driven Refinement for Abstract Interpretation, by Bhargav S. Gulavani and Sriram K. Rajamani, in Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), March 2006
    Boolean programs and predicate abstraction
    1. A beautiful introduction to the paradigm of Boolean Programs by Tom Ball and Sriram Rajamani
    2. The CPAChecker site is a must-visit for all students of this course. You should go through the following papers from the list of papers available at the above web site.
    3. Lazy Abstraction by Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar and Gregoire Sutre, Proceedings of the 29th Annual Symposium on Principles of Programming Languages (POPL), ACM Press, 2002, pp. 58-70.
    4. Abstractions from Proofs by Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar and Kenneth L. McMillan, In ACM SIGPLAN-SIGACT Conference on Principles of Programming Languages, 2004. (You can skim through Section 3 and skip Section 6 of this paper)
    5. A wonderful set of slides on the BLAST model checker, by Tom Henzinger, Ranjit Jhala and Rupak Majumdar. [I'm mirroring the slides here because I can no longer find a link to the original slides.]
    Program verification using constraint solving
    • The CBMC homepage. An excellent set of slides on CBMC are available here.
    • A tutorial on software model checking with Horn clauses. Note that not all material in the tutorial was covered in the course.
    • Another tutorial that deals with software verification using Horn clauses. Note again that we haven't covered all material in this tutorial in the course.
    • A delightful paper on Progam Analysis as Constraint Solving by Sumit Gulwani, Saurabh Srivastava and Ramarathnam Vemkatesan.
    Core constraint solving (propositional case)
    • A few slides on Tseitin encoding prepared by the instructor.
    • A wonderful tutorial on propositional satisfiability solvers and SMT solvers by Roberto Sebastiani.
    • Ashutosh Gupta's slides on SMT solving and interpolant generation.
    Separation Logic
    • A good repository of information about Separation Logic
    • A seminal paper by John C. Reynolds
    • A wonderful primer on separation logic by Peter O'Hearn
    • A gentle introduction to Separation Logic and Bi-abduction from Facebook Infer
    • Hongseok Yang's lecture notes on Program Verification using Separation Logic at Microsoft Summer School, Bangalore, 2008. Search for "Microsoft Summer School" and check the links to his lecture slides (Lectures 0 through 3) at Hongseok's Talks page.
    • A wonderful tutorial on Separation Logic given by Peter O'Hearn at CAV 2008.
    Shape analysis (reasoning about heaps) with 3-valued logic
    Although we have not covered all topics discussed in the following papers, the POPL 99 paper is a good introduction to what we have covered in class. The TOPLAS 02 paper is a much more detailed version for those interested in additional reading.
    • Parametric Shape Analysis via 3-valued Logic by M. Sagiv, T. Reps and R. Wilhelm, in Proceedings of ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 1999
    • Parametric Shape Analysis via 3-valued Logic by M. Sagiv, T. Reps and R. Wilhelm, in ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 24, Number 3, 2002.
    • An excellent set of slides on TVLA from T. Reps, M. Sagiv and R. Wilhelm.
      [I'm mirroring the slides here because I can no longer find a link to the original slides on the web.]
    Survey articles
    • A nice survey article on automated techniques for formal software verification by Vijay D'Silva, Daniel Kroening and Georg Weissenbacher.
    • Another comprehensive survey article on software model checking (as of 2009) by Ranjit Jhala and Rupak Majumdar.

    Lecture Schedule

    Date Topics
    Jul 18 Introduction and course logistics; notion of program state; programs as state transition systems; assertion checking as a reachability problem; notions of abstract and concrete states; importance of abstractions in program analysis and verification; brief demo of Key-Hoare tool
    Jul 21 No class
    Jul 25 A simple sequential, imperative programming language, sufficiency of the language for encoding Turing machines, a simple assertion language. Introduction to Hoare logic: Hoare triples, partial and total correctness semantics, weakest pre-conditions and strongest post-conditions and finding them for a simple assignment statement, Hoare axiom for an assignment statement and its illustration
    Jul 28 Hoare logic inference schema for sequential composition of statements, conditional statements and loop invariants. Centrality of loop invariants in analyzing programs with loops, uncomputability of tightest loop invariants, undecidabilty of checking validity of arbitrary Hoare triples in our simple languate
    Aug 1 Demonstration of the importance of good loop invariants using Key Hoare tool: integer division and array sorting. Importance of keeping all relevant information in an invariant, need for algorithmic techniques for determining good loop invariants.
    Aug 4 Introduction to the theory of abstract interpretation: powerset lattice of concrete states, abstract lattice and Galois connection between the two lattices, properties of Galois connections and their use in analyzing a program in the abstract domain, alternative definitions of Galois connection. Computing abstract post-condition of a statement from the concrete post-condition transformer function, practical issues with computing the most precise abstract post-condition and need for over-approximations.
    Aug 8 Abstract post-condition of a sequence of statements and of conditional statements, implications of representing conditions precisely in abstract domain, viewing a while loop as a sequence of conditional statements to understand how a loop invariant may be computed. Formulating an abstract loop invariant as the limit of a sequence of abstract values, constructing a suitable map G in the abstract lattice such that the abstract loop invariant is the limit of G^i(\bot), implications of monotonicity of G.
    Aug 11 More on the existence of non-trivial abstract loop invariants, their relation to fixpoints of appropriately constructed maps in the abstract lattice, pre-fixpoints, post-fixpoints, Knaster-Tarski fixpoint theorem. Approximately computing abstract loop invariants: use of widen operator, properties of widen operator
    Aug 18 In-class Key-Hoare based quiz. More on computing abstract loop invariants: use of widen operator to get a post-fix point, subsequent refining of abstract loop invariant.
    Aug 22 Examples of abstract domains and corresponding operators: intervals (boxes), defining inclusion, lub, glb and widen operators on intervals, a general template for defining widen using thresholds arranged in a lattice with finite height.
    Aug 25 The hexagon abstract domain: Difference Bound Matrices (DBM) and the need to canonicalize them, all-pairs shortest paths for canonicalizing DBMs, inclusion, lub and glb operators for DBMs
    Aug 29 More on hexagons: interplay of canonicalization and lub/glb computation, widen operator for hexagons as an extension of corresponding operator for intervals. Illustrating proving a Hoare triple using intervals and using hexagons.
    Sep 1 Introduction to predicate abstraction: formulas as domain elements, containment, lub, glb and widen operators, computing abstract post-condition using predicate abstraction, introduction to assume and assert statements and their differences, modeling loops and conditional statements using assumes, use of overapproximations and underapproximations in different contexts
    Sep 5 More on predicate abstraction, Constructing a complete Boolean program from a given program and a set of predicates; equivalence of predicate abstraction on original program and analysis of boolean state space of Boolean program
    Sep 8 Boolean abstractions of programs with function calls; equivalence of location reachability in push-down automata (PDA) and assertion checking in Boolean programs.
    Sep 14 Mid-semester exam
    Sep 19 More on Boolean abstractions of programs with function calls, illustration by a simple example. Demo and presentation of CAnalyzer -- tool to be used for course projects
    Sep 22 Refining Boolean abstraction of a program by analyzing spurious counterexample traces; discovery of new predicates; trace formulas and use of Craig interpolation in discovering new predicates (a nice set of slides can be found here)
    Sep 26 Discussion of mid-sem solutions. Product operators to combine abstract analyses using different domains: Cartesian product, reduced product, cardinal power (a nice set of slides by Prof. Agostino Cortesi can be found here)
    Sep 29 Further discussion of product operators and their complexity of implementation. Three ways of refining abstract analysis: product of multiple abstract domains, refining abstract domain, refining abstract operators. Illustration of refinement of lub, glb and widen operators in an abstract domain. Introduction to bug-hunting as opposed to proving properties of programs.
    Oct 3 Bounded assertion checking as a technique to hunt for "shallow" bugs in complex programs; translating programs without pointers and heap accesses to a system of constraints; correspondence between models of constraint and assertion violating traces in the program; keeping size of constraints linear in the size of the program -- difference between trace-based encoding of constraints and program-structure-based encoding of constraints.
    Oct 5 Session on CAnalyzer conducted by Dr. Hrishikesh Karmarkar and Prateek Saxena. CAnalyzer API documentation and sample code for the sign domain are available here.
    Oct 6 Further discussion on bounded assertion checking; translating programs that dereference pointers and access the heap to systems of constraints using uninterpreted functions; moving back and forth between error traces in such programs and models of the generated constraints; dealing with functions with bounded recursion.
    Oct 10 Formulating program verification as a problem of solving a Horn formula. Illustrating the subtleties in the formulation.
    Oct 13 Discussion on solving Horn clauses -- a few illustrative techniques. Solving Horn clause formulas with linear inequality templates and Farkas Lemma (see the paper by S. Gulwani, S. Srivastava and R. Venkatesan for reference).
    Oct 17 Introduction to Separation Logic for reasoning about heap memory manipulating programs; syntax, semantics and some illustrations of how separating conjunction and implication enjoy different properties compared to propositional conjunction and implication. Hoare logic and separation logic: frame rule
    Oct 21 No class. Compensation class on Wed, Oct 25
    Oct 24 Representing inductively defined data structures (e.g. lists, trees) using separation logic predicates; application of frame rule to prove properties of programs using separation logic; illustrative examples: disposing of a tree, maintaining acyclicity of a list, etc. A nice primer on separation logic by Peter O'Hearn
    Oct 25 Using the Frame Rule to obtain specification of a composition of program fragments from specifications of individual fragments; biabduction in separation logic, and its use in inferring specifications of programs from specifications of components; solving problems using bi-abduction and Frame Rule.
    Oct 27 Introduction to three-valued logical structures as an abstraction for heap, shape graphs, concrete (two-valued) and three-valued structures and their relation, interpretation of the 1/2 value in three-valued logic, interpretation of summary predicate, canonical abstraction and the embedding theorem for three-valued structures. Introduction to update formulas for three-valued structures with respect to simple program statements.
    Oct 31 Defining update formulas for three-valued structures with respect to additional program statements, abstract interpretation using three-valued structures, illustration through examples. Motivation for focusing a three-valued structure, focusing three-valued structures for simple program statements.
    Nov 3 More on focusing three-valued structures with respect to a given formula. Using instrumentation predicates to infer compatibility conditions, coercing focused three-valued structures using compatibility condition to improve the precision of representation.
    Nov 7 Practice problem solving using three-valued logic abstraction of heap. Discussion of quiz 3 solutions. Summarization of course and clarification of doubts.