Machine learning is increasingly being deployed in applications like autonomous driving, unmanned aerial vehicles, and automated screening of applicants for jobs and loans. In most of these applications, there is a need for guarantees of safety, fairness, and unbiasedness in decision making. Existing techniques for developing and training learning models, however, come with few provable guarantees in this regard. The problem of guaranteeing desirable behaviour of machine learning models is inherently hard, both because state-of-the-art learning models are very large, and because specifying precise properties of these models, especially for perceptual tasks like image recognition, is difficult. On the other hand, the field of program (and hardware) verification has matured over the last four decades to produce tools that are routinely used in industry. This course will try to bridge these two areas and teach students how techniques grounded in formal methods, which have been immensely successful in program/hardware verification, can be adapted to reason about guarantees in various machine learning settings. The area is nascent as of today, but several very interesting results have already been shown. While safety, fairness, and unbiasedness of machine learning models motivate the core of this course, it will also cover (time permitting) techniques where formal methods can be used to guide the training process and tune parameters of learning models. Overall, the course intends to introduce students to an area of significant importance that leverages advances in both formal methods and machine learning.
Pre-requisites
UG students | CS228, CS213, CS337 or equivalent (any of these can be taken along with this course) |
PG students | Either CS433 or CS719 or CS771; CS725 is recommended but not essential |
Course Logistics
- There will be lectures, and office hours for clearing doubts.
- Students are expected to work through the assigned reading material (roughly 3 hours per week) to keep pace with the course.
- Piazza will be the primary platform for discussions and answering queries.
- Moodle will be used primarily for announcements, surveys, etc.
Meeting Schedule
Lecture Hours | SIC 205 (KReSIT Building), Tue and Fri, 2:00-3:30 pm |
Office Hours | CC 314, Wed, 4-5 pm (please email the instructor if you wish to come for office hours) |
Evaluation and assessment
Midsem | 25% |
Endsem | 25% |
Project | 25% |
Quizzes | 25% |
- There will be ungraded homework assignments, some of which may require the use of publicly available tools.
- All exams/quizzes will be open-book.
- If you miss an exam for a genuine reason, contact the instructor immediately.
- Each graded quiz carries equal weight towards the overall 25% allotted to quizzes.
- Informal discussion among students for the purpose of clarification is strongly encouraged.
- There will be zero tolerance for dishonest means like copying solutions from others in any quiz, exam, or project. Anybody found indulging in such activities will be summarily awarded an FR grade. The onus will be on the supposed offender to prove his or her innocence.
- If you are unsure about the distinction between copying and discussion, please read this.
Course project
- The course project carries 25% of the total marks.
- Each project is to be done by a single student. If you absolutely need to pair up with another student, please contact the instructor.
- A project submission includes a demo, a presentation and a report.
- A project involves choosing a problem based on topics studied in class, identifying publicly available tool(s) likely to be useful for solving it, planning how the existing tools may be used to solve parts of the problem, and attempting to solve the problem using the identified tool(s). This may require modifying parts of the tool(s) to overcome some of their limitations. It is fine if the problem is not solved by the end of the project; the learning from the attempt is the expected benefit.
For the current semester, your project should pertain to one of the following
topics:
- Local robustness verification
- Finding abductive explanations and/or optimal explainable models
- Compositional reinforcement learning from logical specifications
- In case a non-trivial problem is solved by a student (or group of students), we keep open the possibility of developing the work further and attempting a research paper submission.
Tools required for course projects
- DeepPoly (within ERAN)
  - Installation instructions: you may encounter some hiccups when following the instructions on the github repository verbatim; we've indicated how to address these in comments in our installation instructions, which have been tested on Ubuntu 22.04.
- ɑβ-CROWN (available here)
  - Installation instructions: as with DeepPoly, see the comments in our installation instructions (also tested on Ubuntu 22.04) for fixes to hiccups in the github repository's instructions.
- Reluplex (within Marabou)
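To give a flavour of what these verifiers compute internally, here is a minimal, self-contained sketch of interval bound propagation through a single affine + ReLU layer, the simplest of the abstract domains discussed in class. This is an illustrative simplification written by us (the function and variable names are ours, not part of any tool's API); DeepPoly and CROWN refine this idea with linear bounding functions.

    import numpy as np

    def affine_bounds(W, b, lo, hi):
        # Propagate elementwise bounds [lo, hi] through y = W x + b.
        # Each output extreme is attained at a corner of the input box:
        # positive weights pull from the matching bound, negative
        # weights from the opposite one.
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

    def relu_bounds(lo, hi):
        # ReLU is monotone, so it maps an interval to an interval.
        return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

    # Toy example: a 2-input, 2-output layer under an L-infinity
    # perturbation of radius 0.1 around the point x0.
    W = np.array([[1.0, -1.0], [0.5, 2.0]])
    b = np.zeros(2)
    x0 = np.array([0.3, 0.7])
    eps = 0.1
    lo, hi = affine_bounds(W, b, x0 - eps, x0 + eps)
    lo, hi = relu_bounds(lo, hi)
    print(lo, hi)  # output ranges certified for every input in the box

Chaining such propagations across all layers yields sound (though often loose) bounds on a network's outputs; the tools listed above tighten these bounds using relational abstract domains and optimized linear relaxations.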
Practice problems, Quizzes and Exams
Exam/Quiz/Problem set | Posted on | Answers due by |
Practice Problem Set 1 | Aug 30, 2023 | |
Quiz 1 | Sep 6, 2023, 8:40am | Sep 6, 2023, 9:30am |
Midsem Exam | Sep 21, 2023, 8:30am | Sep 21, 2023, 11:00am |
Tool-based Practice Problem Set 2 | Oct 5, 2023, 8:00am | Oct 10, 2023, 2:00pm |
Quiz 2 | Nov 8, 2023, 8:40am | Nov 8, 2023, 9:40am |
Quiz 3 | Nov 11, 2023, 2:30pm | Nov 11, 2023, 3:30pm |
Endsem Exam | Nov 23, 2023, 5:30pm | Nov 23, 2023, 8:30pm |
Lectures and interactions
Date | Topic | Resources |
Jul 31 | Why formal methods in machine learning: an overview of the area | Slides |
Aug 1 | Specifying properties of ML components | Slides |
Aug 4 | Specifying properties: issues and approaches; abstraction as a modeling methodology | Slides-1, Slides-2 |
Aug 8 | More on the theory of abstraction: illustration with the interval abstract domain | Slides |
Aug 11 | Missed lecture (to be compensated) | |
Aug 18 | Computing abstract state transitions and their approximations; using abstractions for verifying local robustness properties of neural networks (the DeepPoly approach) | Slides; DeepPoly paper: Sec 1, 2, 3, 4 (prior to 4.1) |
Aug 22 | Further discussions on DeepPoly: abstracting ReLUs and complexity of constructing linear bounds in terms of inputs | DeepPoly paper: Sec 4.1, 4.4, 4.5 |
Aug 23 | [Compensation class] Concluding discussions on DeepPoly: abstracting activation functions other than ReLU; dealing with rotations and other perturbations (shear, shift, etc.) of inputs | DeepPoly paper: Sec 4.2, 4.3, 5 |
Aug 25 | The CROWN approach to finding bounding functions for outputs of deep neural networks | CROWN paper: Sec 1, 3.1, 3.3, Appendix A, Appendix D |
Aug 29 | More discussions on CROWN and on finding concrete bounds of outputs for input perturbations within various p-norm balls (see the note on dual norms after this table) | CROWN paper: Sec 3.1, 3.2, Appendix B; Dual norms: some intuition |
Sep 1 | Discussion on linear programming (LP) relaxation based techniques vis-a-vis linear relaxation based perturbation analysis (LiRPA) techniques; ɑ-CROWN: simultaneously optimizing linear lower bounds of multiple activation functions and output bounds of a neural network using LiRPA techniques; splitting on unstable ReLUs to improve precision | ɑ-CROWN paper: Sec 1, 2, 3.1 |
Sep 5 | More discussion on ɑ-CROWN; using ɑ-CROWN in conjunction with branch-and-bound; incompleteness in the absence of LP-based feasibility checks; building a complete verifier using ɑ-CROWN, LP feasibility checks and branch-and-bound; NP-completeness of finding a counterexample to robustness verification of neural networks with ReLU activation | ɑ-CROWN paper: Sec 3.2, 3.3, 3.4, Appendix A |
Sep 8 | Incorporating split constraints using Lagrange multipliers: the β-CROWN algorithm; using ɑ- and β-CROWN together, and the role of projected gradient descent/ascent optimizers in robustness verification | β-CROWN paper: Sec 1, 2, 3.1, 3.3, 3.4 |
Sep 12 | More discussion about β-CROWN and joint optimization of the ɑ and β parameters; towards a modified simplex algorithm for neural network verification: an abstract description of the simplex algorithm using a small set of guarded assignments | β-CROWN paper: Sec 3.1, 3.3, 3.4; Reluplex paper: Sec 1, 2 |
Sep 15 | The Reluplex algorithm as a set of guarded assignments, including a splitting rule; soundness and completeness of Reluplex; how Reluplex differs from solving multiple independent LP instances by simplex; illustration of Reluplex in action on a simple example | Reluplex paper: Sec 2, 3 |
Sep 21 | Mid-semester Exam | |
Sep 26 | More discussion on Reluplex, its strengths and limitations; introduction to logical abduction based explanation of neural network decisions | Reluplex paper: Sec 2, 3, 4; Slides for abductive explanations |
Sep 29 | A lazy algorithm for building abductive explanations of neural network decisions using counter-examples | Slides for abductive explanations; Abductive explanation paper |
Oct 3 | Demo of DeepPoly in action on MNIST networks and datasets; introduction to modeling ReLU-based neural networks as Mixed Integer Linear Programs (MILP); a sketch of the big-M encoding appears after this table | Abductive explanation paper: Sec "Encoding Neural Networks with MILP"; Slides on big-M encoding for MILP |
Oct 6 | Further discussion on abductive explanations of ML model predictions | Abductive explanation paper: all sections except "Related Work" |
Oct 10 | Duality between abductive explanations and counterexamples; adversarial examples from explanations; an implicit hitting set enumeration based algorithm for generating explanations/counterexamples | Duality paper: Sec 1, 2, 3 |
Oct 13 | Further discussions on the algorithm for generating explanations/counterexamples; application in understanding the relevance of points in a patch for MNIST image classification | Duality paper: Sec 3.3, 4, 5 |
Oct 17 | Missed lecture (to be compensated) | |
Oct 20 | Synthesizing optimal decision trees using SAT: encoding the structure of a binary decision tree using auxiliary variables and constraints | Optimal decision tree by SAT paper: Sec 1, 2, 3.1 |
Oct 24 | Institute holiday | |
Oct 27 | Synthesizing optimal decision trees using SAT: encoding labeling of nodes and matching of training examples | Optimal decision tree by SAT paper: Sec 3.2, 3.3, 4 |
Oct 31 | Encoding cardinality constraints using propositional formulas: arc consistency, generalized arc consistency and Sinz' encoding | Excellent slides on encoding by Ruben Martins: Slides 14-22 |
Nov 3 | Basics of Reinforcement Learning: some commonly used terminology | Paper with terminology explained: Sec 2 and 3 |
Nov 7 | Compositional reinforcement learning from logical specifications: problem definition, specifying complex tasks using SpectRL, abstract graph reachability problem | Compositional RL paper: Sec 1, 2, 3.1 |
Nov 10 | Compositional reinforcement learning from logical specifications: constructing abstract graphs from logical specs, the DiRL algorithm | Compositional RL paper: Appendix A, Sec 3.2, 4 |
Nov 10 | [Compensation class] Compositional reinforcement learning from logical specifications: discussion on calculating edge weights, relevance of Dijkstra's algorithm, theoretical bounds; discussion of solutions to Quiz 2 | Compositional RL paper: Sec 4, 5 |
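A note on dual norms, expanding on the Aug 29 entry above (a standard fact, stated here for convenience rather than taken from the course slides). Once CROWN produces a linear lower bound f(x) ≥ a·x + c that is valid on the whole perturbation ball, the concrete bound follows from Hölder's inequality:

    \min_{\|x - x_0\|_p \le \epsilon} (a^\top x + c) = a^\top x_0 + c - \epsilon \|a\|_q, \qquad \tfrac{1}{p} + \tfrac{1}{q} = 1.

Thus an L-infinity ball (p = ∞) uses the L1 norm of a, and an L2 ball uses the L2 norm.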
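Similarly, the big-M MILP encoding of ReLU from the Oct 3 entry can be written out explicitly (again, the standard formulation, not copied from the course slides). Given a pre-activation variable x with known bounds l ≤ x ≤ u, where l < 0 < u, the output y = max(0, x) is captured exactly using one binary variable a:

    y \ge 0, \quad y \ge x, \quad y \le u \cdot a, \quad y \le x - l \cdot (1 - a), \quad a \in \{0, 1\}.

When a = 0 the constraints force y = 0 (and x ≤ 0); when a = 1 they force y = x (and x ≥ 0). If l ≥ 0 or u ≤ 0 the ReLU is stable and no binary variable is needed, which is one reason bound-propagation techniques are useful for pre-computing tight l and u.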