This page will provide information on what is covered in each lecture and will be updated as the class progresses.

Course slides are available on Moodle.

Date Topics Reading


Overview of the course

  • Basic course information and administrative details
  • Supervised and unsupervised learning
  • Learning task, instances, features, labels, reward/loss, training, testing
Lecture slides
Chapter 1 of SS17



  • Overview of classification: setup, training, test, validation dataset, overfitting.
  • Classification families: linear discriminative, non-linear discriminative, decision trees, probabilistic (conditional and generative), nearest neighbor.


Probability for ML (basics)

  • Probability, axioms of probability, random variables, common distributions, means, variance and other moments, joint distributions, and conditional distributions.
    • You are assumed to know this material. For revision, you can consult any standard textbook.
  • Multivariate Gaussian (normal) distribution, covariance
Sections 3.1 to 3.9 here,
or Chapters 1--6 of PRS,
or Chapters 1 and 2 of Bis07


Probabilistic Generative classifiers

  • Naive Bayes classification
Chapter 2.5 of DDL2019

14/08/2019, 16/08/2019

Decision tree classification

  • Purity, Gini index, entropy
  • Algorithms for constructing a decision tree
  • Pruning methods to avoid over-fitting
  • Regression trees
Chapter 3 of Mitchell97
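
As a small illustration of the purity measures listed above, the following sketch computes the Gini index and entropy of a set of labels and scores one candidate split. The function names and the toy labels are assumptions made for this example; they are not taken from the lecture material.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of examples belonging to each class."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def gini(labels):
    """Gini index: 1 - sum_k p_k^2 (0 for a pure node)."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Entropy in bits: -sum_k p_k log2 p_k (0 for a pure node)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def weighted_impurity(left, right, impurity):
    """Size-weighted impurity of the two children of a binary split."""
    n = len(left) + len(right)
    return (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)


# Hypothetical toy node: 5 "yes" and 5 "no" examples, split into two children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

gini_drop = gini(parent) - weighted_impurity(left, right, gini)
info_gain = entropy(parent) - weighted_impurity(left, right, entropy)
print(f"Gini decrease: {gini_drop:.3f}, information gain: {info_gain:.3f}")
```

A tree-construction algorithm would typically evaluate many candidate splits this way and keep the one with the largest decrease in impurity.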


Conditional Linear classifiers and Regressors

  • Logistic classifier
  • Linear regression
Chapter 3.1, 3.2 of DDL2019
Chapters 4.2, 4.3.2 of Bis07

Mitchell's chapter

Numerical Optimization (Review)

  • Review of convex functions and optimization of unconstrained functions.
    • Definition and properties of convex functions (Chapters 3.1.1 to 3.1.5, 3.2 of BV)
    • Unconstrained optimization algorithms: zeroth-order, first-order
Chapter 4.3 of Deep Learning book
Chapters 9.1 to 9.3 of BV excluding convergence proofs
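
To make the first-order unconstrained optimization item concrete, here is a minimal sketch of fixed-step gradient descent on a convex quadratic. The objective, step size, and tolerance are assumptions chosen for illustration, not values from the course.

```python
import math


def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Fixed-step gradient descent; stops when the gradient norm falls below tol."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical convex objective: f(x) = (x1 - 3)^2 + 2 (x2 + 1)^2,
# whose unique minimizer is (3, -1).
def grad_f(x):
    return [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]


x_star = gradient_descent(grad_f, [0.0, 0.0])
print(x_star)
```

With this step size each coordinate error shrinks by a constant factor per iteration; a step that is too large for the curvature of the objective would make the iterates diverge instead.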


Quiz 1

Up to and including decision trees.

04/09/2019 -- 13/09/2019
Support vector machines
  • Max margin motivation: low density, high stability
  • Margin geometry to primal SVM formulation for separable training data (demo)
  • Dual formulation and role of alpha in a form of sparse local regression
  • Inseparable data, slack variables, hinge loss, upper bound on 0/1 training loss (demo)
  • Handling non-linear regression by lifting data points to a higher dimension (demo)
  • Polynomial, Gaussian, RBF kernels
  • Sequential minimal optimization (SMO) algorithm
Chapter 7 of Bis07
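
The claim that the hinge loss upper-bounds the 0/1 training loss can be checked numerically. The sketch below assumes labels y in {-1, +1} and a real-valued score s = w·x + b, and it counts a score of exactly zero as an error; the function names are hypothetical.

```python
def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)


def zero_one_loss(y, score):
    """0/1 loss: 1 if the sign of the score disagrees with y (ties count as errors)."""
    return 1.0 if y * score <= 0 else 0.0


# Compare the two losses over a grid of scores for a positive example.
scores = [s / 4 for s in range(-12, 13)]  # -3.0, -2.75, ..., 3.0
for s in scores:
    assert hinge_loss(+1, s) >= zero_one_loss(+1, s)

print(hinge_loss(+1, 0.5), zero_one_loss(+1, 0.5))  # correct side, but inside the margin
```

A correctly classified point still incurs hinge loss if it lies inside the margin (0 < y·s < 1), which is how the soft-margin formulation penalizes slack.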

25/09/2019, 27/09/2019

Feedforward Neural Networks

  • Feedforward networks
  • Backpropagation Algorithm

Chapter 6 of Deep Learning book

04/10/2019, 09/10/2019

Convolutional Neural Networks

  • Motivation. Basic convolution operation. Pooling.
  • LeNet architecture for basic image classification task.
  • CNNs for object detection.
Chapters 6 and 12 of DDL book
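
The basic convolution and pooling operations can be sketched in plain Python. This example assumes the deep-learning convention of cross-correlation (the kernel is not flipped), no padding, stride 1 for the convolution, and non-overlapping max-pooling windows; the image and kernel are toy values chosen for illustration.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Max pooling over non-overlapping size x size windows."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# 1x2 horizontal-difference kernel: responds where intensity increases left to right.
kernel = [[-1, 1]]

feature_map = conv2d_valid(image, kernel)  # 5x4 map, nonzero only at the edge
pooled = max_pool2d(feature_map)           # 2x2 summary of the feature map
print(feature_map[0], pooled)
```

Pooling keeps the strongest response in each window, so the edge is still detected after downsampling even though its exact column is no longer recorded.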

11/10/2019, 16/10/2019
Recurrent Neural Networks
  • Basic RNNs
  • Backpropagation through time
  • Applications: time series forecasting, language modeling (with word embeddings)
  • Encoder-decoder model with attention for sequence to sequence learning.

Chapters 10.0 to 10.4 of Deep Learning book

18/10/2019, 19/10/2019, 23/10/2019


Lecture notes

Combining models
Chapter 14.2 of Bis07, Chapter 8.7 of HTF book

25/10/2019

Dimensionality Reduction

  • Principal component analysis (PCA): basic PCA, eigenvalue and eigenvector recap, demo
Chapter 12 of Bis07; Eigenfaces demo
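
As a rough illustration of how PCA connects to eigenvalues and eigenvectors, the sketch below finds the first principal component of a small 2-D data set as the top eigenvector of the sample covariance matrix. It uses power iteration rather than a full eigendecomposition to stay dependency-free; this is a choice made for the example, not necessarily the method shown in the lecture demo. The data points and function names are hypothetical.

```python
import math


def covariance_matrix(points):
    """Return the sample covariance matrix and the mean-centered data."""
    n, d = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(d)]
    centered = [[p[k] - mean[k] for k in range(d)] for p in points]
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    return cov, centered


def matvec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def top_component(cov, iters=200):
    """Power iteration: approximates the leading eigenpair of a symmetric PSD matrix.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iters):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))  # Rayleigh quotient
    return eigenvalue, v


# Hypothetical 2-D data with strongly correlated coordinates.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

cov, centered = covariance_matrix(data)
eigenvalue, direction = top_component(cov)
# 1-D PCA scores: projection of each centered point onto the first component.
scores = [sum(r[k] * direction[k] for k in range(2)) for r in centered]
print(eigenvalue, direction)
```

The variance of the projected scores equals the leading eigenvalue, which is why the components with the largest eigenvalues are the ones kept when reducing dimension.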

30/10/2019

Tutorial

01/11/2019, 08/11/2019

Overview of graphical models

06/11/2019

Quiz