This page will provide information on what is covered in each lecture and will be updated as the class progresses.

Course slides are available on Moodle.

Date Topics Reading


Overview of the course

  • Basic course information and administrative details
  • Supervised and unsupervised learning
  • Learning task, instances, features, labels, reward/loss, training, testing
Lecture slides
Chapter 1 of SS17



  • Overview of classification: setup, training, test, validation dataset, overfitting.
  • Classification families: linear discriminative, non-linear discriminative, decision trees, probabilistic (conditional and generative), nearest neighbor.

20/07/2018 - 03/08/2018

Decision tree classification

  • Purity, Gini index, entropy
  • Algorithms for constructing a decision tree
  • Pruning methods to avoid over-fitting
  • Regression trees
Chapter 3 of Mitchell97
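The purity measures listed above can be illustrated with a small sketch. This is not taken from the lecture material: the function names and the toy labels are assumptions made for illustration, and only the Python standard library is used.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of examples belonging to each class present."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_index(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits; only classes that occur are counted."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy split: a balanced parent node divided into two purer children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print("Gini(parent):", gini_index(parent))
print("Entropy(parent):", entropy(parent))
print("Information gain:", information_gain(parent, [left, right]))
```

A tree-construction algorithm would typically evaluate such a score for every candidate split and keep the one with the largest gain (or the largest drop in Gini index).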


Quiz 1

  • Quiz on the basics of probability and statistics, and on decision trees.
Sample reading: sections 3.1 to 3.9 here

01/08/2018 - 17/08/2018

Probabilistic classifiers

  • Basics assumed: probability, axioms of probability, random variables, common distributions, means, variance and other moments, joint distributions, and conditional distributions.
    • You are assumed to know this material. To revise, consult any standard textbook. Some handy references: Chapters 1--6 of PRS and Chapters 1 and 2 of Bis07.
  • Multivariate Gaussian (normal) distribution, covariance
  • Generative classifiers: LDA, QDA
  • Generative classifiers: Naive Bayes classification
  • Conditional classifier: Logistic regression
Chapters 4.2, 4.3.2 of Bis07
Example: naive Bayes
Lecture notes PDF and OneNote Link
Additional reading:
Mitchell's chapter


Quiz 2

Probabilistic classifiers

Hyperplane classifiers

  • Loss-regularization framework for classification
  • Loss functions: 0/1 ("true"), square, perceptron, logistic, hinge
  • Regularizers (Chapters 1.1 and 3.1.4 of Bis07)

Lecture notes
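As a rough illustration of the loss functions named above, each can be written as a function of the margin m = y * f(x), with labels y in {-1, +1}. This is a sketch under that assumption, not the notation of the lecture notes; the function names and the sample margins are hypothetical.

```python
import math


def zero_one_loss(margin):
    """0/1 ("true") loss: 1 for a misclassified point, counting margin 0 as an error."""
    return 1.0 if margin <= 0 else 0.0


def square_loss(margin):
    """Square loss (y - f(x))^2, which equals (1 - margin)^2 when y is +1 or -1."""
    return (1.0 - margin) ** 2


def perceptron_loss(margin):
    """Perceptron loss: penalizes only misclassified points, linearly."""
    return max(0.0, -margin)


def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)), computed without overflow."""
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def hinge_loss(margin):
    """Hinge loss: zero only when the margin is at least 1."""
    return max(0.0, 1.0 - margin)


# Compare the losses on a few sample margins.
for m in [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]:
    print(m, zero_one_loss(m), square_loss(m), perceptron_loss(m),
          round(logistic_loss(m), 4), hinge_loss(m))
```

In the loss-regularization framework, one of these losses is summed over the training set and added to a regularizer on the weights. The hinge loss is never smaller than the 0/1 loss, which is one reason it is used as a convex surrogate.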
29/08/2018, 31/08/2018

Convex Optimization (Review)

  • Review of convex functions and optimization of unconstrained functions.
    • Definition and properties of convex functions (Chapters 3.1.1 to 3.1.5, 3.2 of BV)
    • Unconstrained optimization algorithms: zeroth order, first order (Chapters 9.1 to 9.3 of BV, excluding convergence proofs)
    • Unconstrained optimization algorithms: second order (Chapter 9.5 of BV, excluding convergence proofs)

Convex functions notes
Optimization notes
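To make the first-order and second-order methods concrete, the sketch below minimizes a one-dimensional convex function with fixed-step gradient descent and with Newton's method. The test function f(x) = exp(x) + x^2, the step size, and the iteration counts are assumptions chosen for illustration; they are not taken from the course notes or from BV.

```python
import math


def f(x):
    """A smooth, strictly convex example function (hypothetical choice)."""
    return math.exp(x) + x * x


def grad(x):
    """First derivative of f."""
    return math.exp(x) + 2.0 * x


def hess(x):
    """Second derivative of f; always positive, so f is strictly convex."""
    return math.exp(x) + 2.0


def gradient_descent(x0, step=0.1, iters=200):
    """First-order method: move against the gradient with a fixed step size."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x


def newton(x0, iters=10):
    """Second-order method: rescale the gradient by the inverse curvature."""
    x = x0
    for _ in range(iters):
        x -= grad(x) / hess(x)
    return x


x_gd = gradient_descent(1.0)
x_newton = newton(1.0)
print("gradient descent:", x_gd, "gradient there:", grad(x_gd))
print("Newton's method: ", x_newton, "gradient there:", grad(x_newton))
```

Both methods reach the same minimizer here. Newton's method needs far fewer iterations but requires second-derivative information at every step.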


Feedforward Neural Networks

  • Feedforward networks
  • Backpropagation Algorithm

Lecture slides, slides as PDF
Chapter 6 of Deep Learning book


Convolutional Neural Networks

03/10/2018, 05/10/2018, 10/10/2018

Recurrent Neural Networks

  • Basic RNNs
  • Backpropagation through time
  • Applications: time series forecasting, language modeling (with word embeddings)
  • Encoder-decoder model with attention for sequence to sequence learning.

Lecture slides
Slides in PDF
Chapters 10.0 to 10.4 of Deep Learning book

10/10/2018, 12/10/2018, 17/10/2018


Lecture notes

24/10/2018, 26/10/2018
Combining models
Lecture notes: bagging
Lecture notes: boosting

31/10/2018, 02/11/2018
Support vector machines (Chapter 7 of Bis07)

  • Max margin motivation: low density, high stability
  • Margin geometry to primal SVM formulation for separable training data (demo)
  • Dual formulation and the role of alpha as a form of sparse local regression
  • Inseparable data, slack variables, hinge loss, upper bound on 0/1 training loss (demo)
  • Handling non-linear regression by lifting data points to a higher dimension (demo)
  • Polynomial, Gaussian, RBF kernels
  • Sequential minimal optimization (SMO) algorithm
Lecture notes, Wikipedia
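The link between lifting data points to a higher dimension and kernels can be sketched as follows. For two-dimensional inputs, the degree-2 polynomial kernel (x . z)^2 equals an ordinary inner product after an explicit lift to three dimensions. The function names, parameter defaults, and sample points are assumptions made for illustration, not part of the lecture material.

```python
import math


def dot(x, z):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, c=0.0):
    """Polynomial kernel (x . z + c)^degree."""
    return (dot(x, z) + c) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def lift_degree2(x):
    """Explicit feature map for the homogeneous degree-2 kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Hypothetical sample points.
x = (1.0, 2.0)
z = (3.0, -1.0)

print("kernel value:        ", polynomial_kernel(x, z))
print("lifted inner product:", dot(lift_degree2(x), lift_degree2(z)))
print("RBF kernel:          ", rbf_kernel(x, z, gamma=0.1))
```

The two printed polynomial values agree up to rounding, which is the point of the kernel trick: the dual SVM only needs kernel evaluations, so the lifted vectors never have to be built explicitly.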

Overview of graphical models
Overview of Markov Decision Processes and Reinforcement Learning (lecture by Sabyasachi Ghosh)