Machine learning tools and techniques are becoming ubiquitous and are
being deployed in scenarios like autonomous driving, unmanned aerial
vehicles, automated screening of job applicants, automated screening
of loan applications, etc., where there is a need for guarantees on
safety, fairness and absence of bias in decision making. Existing
techniques for developing and training learning models, however, come
with few provable guarantees in this regard. The problem is
inherently hard, given the size and complexity of state-of-the-art
learning models. On the other hand, the field of program (and
hardware) verification has matured over the last four decades to
produce tools that are routinely used in industry. This course
will try to bridge these two areas and teach students how techniques
grounded in formal methods, which have been immensely successful in
program/hardware verification, can be adapted to reason about
guarantees in various machine learning settings. The area is nascent
as of today, but several very interesting results have already been
shown. While safety, fairness and absence of bias in machine learning
models motivate the introduction of this course, it will also cover
(time permitting) techniques where formal methods are used to guide
the training process and tune parameters of learning models. Overall,
the course intends to introduce students to an area of significant
importance that leverages advances in both formal methods and machine
learning.
Pre-requisites
UG students | CS228, CS213, CS337 or equivalent |
PG students | Either CS433 or CS719 or CS771. CS725 is recommended but not
essential |
Course Logistics
- There will be lectures and interaction sessions for clearing
doubts.
- All lectures will be given live online, and recordings of
lectures will be made available.
- MSTeams will be used for both lectures and live online
interaction sessions. Please subscribe to the CS 781 channel on
MSTeams using your IITB LDAP login. If you need a code to join
the team, send email to the instructor.
- We expect all students taking the course to be present on
campus, and not have any internet bandwidth or laptop/computing
device constraints. In case you have internet bandwidth or
laptop/computing device issues, please send email to the
instructor at the earliest.
- IIT Bombay rules permitting, we will also have weekly
  face-to-face discussion and interaction sessions at a venue to be
  announced. Those who can't attend these sessions in person can
  post questions on Moodle. As per the latest instructions, all
  interactions in the first two weeks of the semester will be
  online.
- Attending live interaction sessions for clearing doubts is not
  mandatory. No new material will be covered in these sessions;
  they will be used strictly for clearing doubts.
- Students must read the required reading material
  in order to keep pace with this course.
- Moodle will be the primary platform for discussion and
  dissemination of material. Please subscribe to the IITB Moodle
  page for CS 781.
Regular Lecture Schedule
MSTeams CS 781 channel, Tues and Fri, 5.30-7.00 pm |
Evaluation and assessment
Midsem | 30% |
Endsem | 30% |
Project | 20% |
Quizzes | 20% |
- All exams/quizzes will be open-book.
- If you miss an exam for a genuine reason, inform the instructor immediately.
- There will be no homework.
- Each graded quiz will carry equal weightage.
- Informal discussion among students for the purpose of clarification is
strongly encouraged.
- There will be zero tolerance for dishonest means, such as
  copying solutions from others, in any quiz, exam or project.
  Anybody found indulging in such activities will be summarily
  awarded an FR grade. The onus will be on the supposed offender
  to prove his or her innocence.
- If you are unsure about the distinction between copying and
discussion, please
read this.
|
Course project
- The course project carries 20% marks.
- The ideal size for a project team is 2 students. If you want to
do a course project alone, please contact the instructor.
- A project submission includes a demo, a presentation and a report.
- More information about course projects is posted here.
- Each group must choose a tool, say T, from among those
  studied in this course and do the following.
- Choose a machine-learning model, such as a deep neural
network N, for a specific task. Examples are
ACAS-Xu networks, MNIST networks, etc.
- Decide on a property P you'd like to
verify/analyze for the network.
- Train the network using publicly available datasets for
the task you'd like to verify properties of.
- Use the tool T to verify/analyze property P
  of ML model N.
- Submit a report of the exercise by Fri, Apr 15, 2022 (last date of examination)
- Important dates
- Wed Apr 6, 2022: Meeting with instructor to discuss your project plan
- Sat Apr 16, 2022: Demo and presentation of your project
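The project steps above can be illustrated with a minimal sketch. Everything here is hypothetical: the weights, the epsilon, and the helper names are made up for illustration, and the technique shown (naive interval bound propagation, assuming Python with numpy) is only a crude stand-in for the tools studied in the course. It checks a simple local-robustness property P of a toy feed-forward ReLU network N.

```python
# Illustrative sketch only: naive interval-bound-propagation check of a
# local robustness property P for a toy ReLU network N. Weights and
# epsilon below are hypothetical, not taken from any course material.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def robust_within_eps(layers, x, eps):
    """Soundly (but incompletely) check that every input in the
    L-infinity ball of radius eps around x gets the same output class."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:          # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    c = int(np.argmax(hi))               # candidate output class
    # Robust if the candidate's lower bound beats every rival's upper bound.
    return all(lo[c] > hi[j] for j in range(len(lo)) if j != c)

# Toy 2-input, 2-class network with hand-picked weights.
layers = [
    (np.array([[1.0, -1.0], [0.5, 0.5]]), np.zeros(2)),
    (np.array([[2.0, 0.0], [0.0, 1.0]]), np.zeros(2)),
]
x = np.array([1.0, 0.0])
print(robust_within_eps(layers, x, 0.1))   # small ball: provably robust
print(robust_within_eps(layers, x, 1.0))   # large ball: cannot prove
```

The interval domain used here is the coarsest abstraction possible; tools covered in the course (e.g. DeepPoly-style abstract domains or Reluplex-style constraint solving) are far more precise, which is exactly why a project should exercise one of them on a real network and property.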
Required Reading List
There are no standard reference textbooks in this area. There is a
recently published online e-book by Prof. Aws Albarghouthi
(University of Wisconsin-Madison) that is freely available. We will
follow this book for the topics it covers. For other topics, the
primary reference will be the papers and articles listed below
(roughly one paper per week). Some edits to the list of
papers/articles may be made during the course of the semester.
- E-book: Introduction
to Neural Network Verification by Aws Albarghouthi, 2020
- Papers and articles (all papers below have appeared in
conferences and/or journals; however, whenever an extended version
was found on arXiv/CoRR, the arXiv
citation has been given).
- Towards Verified
Artificial Intelligence by Sanjit A. Seshia, Dorsa Sadigh
and S. Sankar Sastry, 2016.
- Reluplex: An
Efficient SMT Solver for Verifying Deep Neural Networks by
Guy Katz, Clark Barrett, David Dill, Kyle Julian and Mykel
Kochenderfer, 2017
- An Abstract
Domain for Certifying Neural Networks by Gagandeep Singh,
Timon Gehr, Markus Püschel and Martin Vechev, 2019
- Verifying
Properties of Binarized Deep Neural Networks by Nina
Narodytska, Shiva Kasiviswanathan, Leonid Ryzhyk, Mooly Sagiv
and Toby Walsh, 2017
- Synthesizing
Pareto-optimal Interpretations for Black-box Models by
Hazem Torfah, Shetal Shah, Supratik Chakraborty, Akshay S and
Sanjit Seshia, 2020
- Abduction based
Explanations for Machine Learning Models by Alexey
Ignatiev, Nina Narodytska and Joao Marques-Silva, 2018
- Safe
Reinforcement Learning via Shielding by M. Alshiekh,
R. Bloem, R. Ehlers, B. Konighofer, S. Niekum, U. Topcu, 2017
- Safe
Reinforcement Learning via Formal Methods: Toward Safe
Control through Proof and Learning by Nathan Fulton and
Andre Platzer, 2018
- Verifiable
Reinforcement Learning via Policy Extraction by Osbert
Bastani, Yewen Pu and Armando Solar-Lezama, 2018
- Provably
Minimally Distorted Adversarial Examples by Nicholas
Carlini, Guy Katz, Clark Barrett and David Dill, 2017
Readings for your leisure
Quizzes and Exams
Exam/Quiz | Posted on | Answers due by |
Quiz 1 |
Feb 9, 2022 (8.30 am) | Feb 9, 2022 (9.45 am) |
Midsem Exam |
Feb 26, 2022 (8.30 am) | Feb 26, 2022 (10.30 am) |
Quiz 2 |
Mar 16, 2022 (8.30 am) | Mar 16, 2022 (9.30 am) |
Quiz 3 |
Apr 13, 2022 (8.30 am) | Apr 13, 2022 (9.30 am) |
Endsem Exam |
Apr 29, 2022 (5.30 pm) | Apr 29, 2022 (8.30 pm) |
Lectures and interactions
Week | Topic | Class Resources | External Resources |
Week 1 (Jan 4) | Course logistics and information, high-level overview of course |
Video recording,
Slides |
None |
Week 1 (Jan 7) | High-level overview of course (contd), quick survey of concepts from logic (propositional and first order logics, Hoare triples) |
Video recording,
Slides
Class notes
|
None |
Week 2 (Jan 11) | Quick survey of concepts from logic (Hoare triples, linear temporal logic) |
Video recording,
Class notes |
Excellent intro to LTL by B. Srivathsan |
Week 2 (Jan 14) | Quick survey of concepts from machine learning (feed-forward neural networks) |
Video recording,
Class notes |
None |
Week 3 (Jan 18) | Quick survey of concepts from machine learning (feed-forward neural networks), specifying properties of feed-forward neural networks |
Video recording,
Slides
Class notes |
Paper 1 |
Week 3 (Jan 21) | Specifying properties of feed-forward neural networks (contd): Problems and some approaches to overcoming them |
Video recording,
Slides
Class notes |
Paper 1 |
Week 4 (Jan 25) | Specifying properties of feed-forward neural networks (contd): Some more approaches |
Video recording,
Slides |
Paper 1 |
Week 4 (Jan 28) | Reluplex: Specialized constraint solving for neural network verification |
Video recording,
Paper annotations |
Paper 2 |
Week 5 (Feb 1) | Reluplex: Specialized constraint solving for neural network verification (contd) |
Video recording,
Paper annotations |
Paper 2 |
Week 5 (Feb 4) | Reluplex: Specialized constraint solving for neural network verification (concluding discussion) |
Video recording,
Paper annotations |
Paper 2 |
Week 6 (Feb 8) | Basics of abstract interpretation for reasoning about neural networks |
Video recording,
Paper annotations |
|
Week 6 (Feb 11) | Reasoning about neural networks using an abstract domain (DeepPoly): high-level idea of abstraction of ReLUs and affine transformations |
Video recording,
Paper annotations |
Paper 3 |
Week 7 (Feb 15) | Reasoning about neural networks using an abstract domain (DeepPoly): mathematical details of ReLU, sigmoid and tanh transformations, and also for affine transformations |
Video recording,
Paper annotations |
Paper 3 |
Week 7 (Feb 18) | Reasoning about neural networks using an abstract domain (DeepPoly): Robustness w.r.t. rotational transformations of inputs; discussion of Quiz 1 solutions |
Video recording,
Paper annotations |
Paper 3 |
Week 8 | Mid-semester examination |
|
|
Week 9 (Mar 1) | Discussion on pre-midsem topics |
Video recording,
Lecture notes |
|
Week 9 (Mar 4) | Binarized deep neural networks: modeling using MILP |
Video recording,
Paper annotations |
Paper 4
Slides on big-M encoding in MILP |
Week 10 (Mar 8) | Binarized deep neural networks: modeling using ILP and CNF propositional formula, modeling adversarial constraints |
Video recording,
Paper annotations |
Paper 4 |
Week 10 (Mar 11) | Binarized deep neural networks: Use of Craig interpolants to simplify CNF SAT solving; Synthesizing interpretations for black-box ML models: motivation, some examples |
Video recording,
Paper 4 annotations,
Paper 5 annotations,
|
Paper 4
Paper 5 |
Week 11 (Mar 14) | Synthesizing interpretations for black-box ML models: problem formulation, notion of Pareto optimality, basics of weighted MaxSAT, encoding problem in propositional logic |
Video recording,
Paper 5 annotations
|
Paper 5 |
Week 11 (Mar 18) | Institute holiday due to Holi |
|
|
Week 12 (Mar 22) | Synthesizing interpretations for black-box ML models: weighted maxSAT encoding of problem, Pareto optimality of solution, constructing Pareto optimal curve |
Video recording (coming soon),
Paper 5 annotations
|
Paper 5 |
Week 12 (Mar 25) | Synthesizing interpretations for black-box ML models: optimized construction of Pareto optimal curve, basics of Probably Approximately Correct (PAC) learning theory, using PAC theory to derive sample size for synthesizing interpretations of black-box ML models
| Video recording (coming soon),
Paper 5 annotations
|
Paper 5 |
Week 13 (Mar 29) | Synthesizing interpretations for white-box ML models: logical theory, implicants, prime implicants, abduction |
Video recording (coming soon),
Paper 6 annotations
|
Paper 6 |
Week 13 (Apr 1) | Synthesizing interpretations for white-box ML models: hitting sets, computing subset-minimal and cardinality-minimal explanations |
Video recording (coming soon),
Paper 6 annotations
|
Paper 6 |
Week 14 (Apr 5) | Synthesizing interpretations for white-box ML models: More explanation on how cardinality-minimal explanations are computed using minimal hitting sets |
Video recording (coming soon),
Paper 6 annotations
Helper slides
|
Paper 6 |
Week 14 (Apr 9) | Safe reinforcement learning via shielding: basics of reinforcement learning and idea of shields |
Video recording (coming soon),
Paper 7 annotations
|
Paper 7 |
Week 14 (Apr 10) | Safe reinforcement learning via shielding: pre-shielding and post-shielding, safety automata, Markov decision processes (MDP), abstraction of MDP, 2-player safety game, winning regions |
Video recording (coming soon),
Paper 7 annotations
|
Paper 7 |
Week 15 (Apr 12) | Safe reinforcement learning via shielding: pre-shielding and post-shielding, constructing
game graphs for shield synthesis, finding winning regions, synthesizing pre- and post-shields; discussion on questions from
other papers studied in course
| Video recording (coming soon),
Paper 7 annotations
|
Paper 7 |