This page provides updated information on what is covered in each lecture, along with PDFs of the handwritten material from each class.

The course notes are available from the course page.

Practice problem set: Please find here a growing list of practice problems. You can also refer to this wiki for definitions of several concepts relevant to this course, connections between those concepts through theorems, and proofs of those theorems.

It is very important that you attempt the homework problems that were posted at the end of (almost) each lecture.

You can consider solving the following problems from Hastie et al.: (A) 2.1, 3.2, 3.5, 3.6, 3.7, 3.12, 3.19, 3.21, 3.23, 3.27, 3.28, 3.29, 3.30, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 5.16, 6.2, 6.4, 6.6, 6.9, 6.11, 8.1, 8.2, 8.5, 8.7, 10.5a, 10.8, 10.12, 11.2, 11.3, 11.4, 11.5, 11.7, 12.1, 12.2, 12.9, 12.10, 12.11 (harder), 13.1, 14.1, 14.2, 14.23 (harder)

Date and Topics
05-01-2016
  • Introduction to Machine Learning
  • Unannotated Slides
  • Annotated Slides
  • Chapter 1 of Hastie et al.
  • Homework: Intuitively analyse the machine learning problem of handwritten digit recognition (slides 8 and 9) taking cues from the analysis on slides 6 and 7.
08-01-2016
12-01-2016
  • Basis functions/attributes, Least Squares Regression, and the geometrical interpretation of its solution
  • Unannotated Slides
  • Annotated Slides
  • Sections 2.2, 3.1 and 3.2 of Hastie et al.
  • Homework: Understand the concept of column space and the geometrical interpretation of the solution to the least squares regression problem. What is polynomial regression on k independent variables v1, v2, ..., vk, and how would you achieve it using linear regression? (See the sketch after this list.)
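A minimal sketch (not from the course materials) of how polynomial regression reduces to linear regression over a polynomial basis, using NumPy; the toy data and the cubic target are invented for illustration:

```python
import numpy as np

def polynomial_features(x, degree):
    """Map a 1-D input to the basis [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

# Illustrative toy data: y is roughly cubic in x, plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = 2 * x**3 - x + 0.1 * rng.standard_normal(x.shape)

# Polynomial regression = linear regression on the expanded features.
Phi = polynomial_features(x, degree=3)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # least squares solution
print(w)  # coefficients of 1, x, x^2, x^3
```

For k independent variables v1, ..., vk, the basis would instead contain monomials in those variables up to the chosen degree, and the same linear least squares machinery applies.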
15-01-2016
  • More matrix algebra for Least Squares Regression, motivation for feature selection and regularized learning (through constraints), and some basics of optimization
  • Unannotated Slides
  • Annotated Slides
  • Tutorial 1
  • Sections 3.2, 5.2.3 and Chapter 18 of Hastie et al., and Section 4.1.4 of the Optimization notes (assuming you are already comfortable with the material up to Section 4.1.3 from your basic calculus course).
  • Homework: On page 12 of Annotated Slides, based on the different inequalities between m and p, find the cases where this equation has (a) no solution, (b) one solution, and (c) multiple solutions.
  • Homework: On page 18 of Annotated Slides, find the solution to the least squares regression problem using the necessary condition (gradient = 0); a short derivation sketch follows below.
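A short derivation sketch of the homework answer, assuming the usual objective ||Xw - y||^2 with an n x p design matrix X (the slides' notation may differ):

```latex
% Sketch of the homework derivation (least squares via gradient = 0):
% minimize f(w) = ||Xw - y||^2 over w, with X the n x p design matrix.
\begin{align*}
  f(w) &= (Xw - y)^\top (Xw - y) \\
  \nabla f(w) &= 2 X^\top (Xw - y) = 0
    \;\Longrightarrow\; X^\top X \, w = X^\top y \\
  w^* &= (X^\top X)^{-1} X^\top y
    \quad \text{(assuming } X^\top X \text{ is invertible, i.e., } X \text{ has full column rank)}
\end{align*}
```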
19-01-2016
22-01-2016
  • Convex Sets, Convex Functions, Strictly Convex Functions, and the First-Order and Second-Order Definitions of Convex Functions
  • Unannotated Slides
  • Annotated Slides
  • Tutorial 2
  • Reference: Section 4.2 of Basics of Convex Optimization.
  • Homework: On the last page of Annotated Slides, explain why the error on the training data decreases as the degree increases up to 7. Why does the error on the test data also decrease up to degree 7? Now explain why the training error remains low even beyond degree 7 whereas the test error starts increasing. (See the sketch after this list.)
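The following sketch is not from the slides (the dataset, noise level, and range of degrees are invented for illustration), but it reproduces the qualitative behaviour the homework asks about: training error keeps decreasing with the polynomial degree while test error eventually rises.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    # Illustrative noisy target, not the data used on the slides.
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + 0.2 * rng.standard_normal(n)

x_train, y_train = sample(20)
x_test, y_test = sample(200)

for degree in range(1, 13):
    Phi_train = np.vander(x_train, degree + 1, increasing=True)
    Phi_test = np.vander(x_test, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)   # fit on training data
    train_err = np.mean((Phi_train @ w - y_train) ** 2)
    test_err = np.mean((Phi_test @ w - y_test) ** 2)
    print(f"degree {degree:2d}  train MSE {train_err:.4f}  test MSE {test_err:.4f}")
```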
29-01-2016
02-02-2016
05-02-2016
  • Solution to the homework on the two equivalent formulations of ridge regression, the Lasso and its two equivalent formulations, solution to quiz 1 problem 3, the Iterative Soft Thresholding Algorithm (ISTA) for Lasso (sketched after this list), and an introduction to Support Vector Regression
  • Unannotated Slides
  • Annotated Slides
  • Reference: Sections 3.4.2, 3.4.3, 3.8, 12.3.6 of Hastie et al.
  • Homework: Try deriving the KKT conditions for the two-norm regularized Support Vector Regression problem on slide 18 of the annotated slides.
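A minimal ISTA sketch for the Lasso objective (1/2)||Xw - y||^2 + lambda ||w||_1, with the step size set from the spectral norm of X; the function names and toy data are illustrative assumptions, not from the course materials.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iters=500):
    """ISTA for the Lasso objective (1/2)||Xw - y||^2 + lam * ||w||_1."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y)             # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Illustrative usage with a sparse ground truth.
rng = np.random.default_rng(2)
X = rng.standard_normal((50, 10))
w_true = np.zeros(10); w_true[:3] = [3.0, -2.0, 1.5]
y = X @ w_true + 0.1 * rng.standard_normal(50)
print(ista(X, y, lam=1.0).round(2))
```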
09-02-2016
  • Support Vector Regression: formulation of optimization problem and geometric interpretation, derivation of KKT conditions, geometric interpretation of KKT conditions
  • Unannotated Slides
  • Annotated Slides
  • Reference: Sections 3.4.2, 3.4.3, 3.8, 12.3.6 of Hastie et al.
  • Homework: Understand the reasons for regularization (discussed so far) summarized on page 2 of the annotated slides; a sketch of a standard SVR formulation and its KKT conditions follows below.
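For reference, a sketch of one common epsilon-insensitive SVR primal and the stationarity part of its KKT conditions; the two-norm regularized variant on slide 18 may differ in how the slack variables are penalized.

```latex
% One common epsilon-insensitive SVR primal (the slide-18 two-norm variant may differ):
\begin{align*}
  \min_{w,\,b,\,\xi,\,\xi^*} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \\
  \text{s.t.} \quad & y_i - w^\top x_i - b \le \epsilon + \xi_i, \\
                    & w^\top x_i + b - y_i \le \epsilon + \xi_i^*, \\
                    & \xi_i \ge 0,\ \xi_i^* \ge 0 .
\end{align*}
% Stationarity conditions from the KKT system, with dual variables
% alpha_i and alpha_i^* for the first two constraint families:
\begin{align*}
  w = \sum_i (\alpha_i - \alpha_i^*) x_i, \qquad
  \sum_i (\alpha_i - \alpha_i^*) = 0, \qquad
  0 \le \alpha_i,\ \alpha_i^* \le C .
\end{align*}
```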
12-02-2016
16-02-2016
19-02-2016
01-03-2016
04-03-2016
  • Rationale behind the perceptron update, gradient descent and stochastic gradient descent for perceptron updates, convergence proof of the perceptron update rule for the linearly separable case, and the kernel perceptron (a small perceptron sketch follows this list)
  • Unannotated Slides
  • Annotated Slides
  • Tutorial 6
  • Reference: Section 4.5.1 of Hastie et al.
  • Homework: Attempt tutorial 6
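A minimal perceptron sketch illustrating the mistake-driven update rule discussed in this lecture; the bias handling and toy data are illustrative assumptions, not from the tutorial.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Perceptron update rule: on a mistake (y_i * w.x_i <= 0), set w <- w + y_i * x_i.
    Labels y are in {-1, +1}; a bias term is absorbed by appending a constant feature."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])      # append bias feature
    w = np.zeros(Xb.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:                     # misclassified (or on the boundary)
                w += yi * xi                           # perceptron update
                mistakes += 1
        if mistakes == 0:                              # converged: data are separated
            break
    return w

# Illustrative usage on a linearly separable toy problem.
rng = np.random.default_rng(3)
X = rng.standard_normal((40, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
print(perceptron(X, y))
```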
08-03-2016
11-03-2016
15-03-2016
18-03-2016
22-03-2016
29-03-2016
01-04-2016
05-04-2016
  • Generative Classifiers, Multinomial Distribution and its Maximum Likelihood and Bayesian Estimates, Dirichlet Distribution, Multinomial Naive Bayes and its Maximum Likelihood Estimate, Bayesian Estimate for Naive Bayes (Tutorial 10 problem), Gaussian Discriminant Classifier
  • Unannotated Slides
  • Annotated Slides
  • References: Sections 6.6.3 (Naive Bayes Classifier) and 4.3 (Gaussian Discriminant Analysis: Quadratic and Linear Discriminant Analysis) of Hastie et al., Section 10 (Multivariate Bernoulli/Multinomial, Naive Bayes, Dirichlet distribution, ML and Bayesian Estimation) of the previous offering's class notes. A small Naive Bayes sketch follows this list.
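A minimal multinomial Naive Bayes sketch with a symmetric Dirichlet (Laplace-style) smoothing parameter; the function names and the toy count matrix are illustrative assumptions, not from the notes.

```python
import numpy as np

def fit_multinomial_nb(X, y, alpha=1.0):
    """Multinomial Naive Bayes. X holds per-class count features (n_docs x n_words),
    y holds class labels. alpha > 0 acts as a symmetric Dirichlet prior (alpha = 1 is
    Laplace smoothing); alpha = 0 gives the plain, unsmoothed maximum likelihood estimate."""
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    log_lik = []
    for c in classes:
        counts = X[y == c].sum(axis=0) + alpha          # smoothed word counts per class
        log_lik.append(np.log(counts / counts.sum()))   # estimated P(word | class)
    return classes, log_prior, np.array(log_lik)

def predict_nb(model, X):
    classes, log_prior, log_lik = model
    scores = X @ log_lik.T + log_prior                  # log P(class) + sum count * log P(word|class)
    return classes[np.argmax(scores, axis=1)]

# Illustrative usage with 4 "documents" over a 3-word vocabulary.
X = np.array([[3, 0, 1], [2, 0, 2], [0, 4, 1], [1, 3, 0]])
y = np.array([0, 0, 1, 1])
model = fit_multinomial_nb(X, y)
print(predict_nb(model, X))
```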
12-04-2016
15-04-2016
  • General EM Algorithm, its special case for GMM, Hard EM for GMM as K-Means, K-Medoid and K-Mode algorithms for clustering, distance measures, and Hierarchical Clustering Methods (a small K-Means sketch follows this list)
  • Unannotated Slides
  • Annotated Slides
  • References: Sections 8.5.1 (EM Algorithm for Mixture of Gaussians), 8.5.2 (General EM Algorithm), 13.2.1 and 14.3.6 and 14.3.7 (K-means clustering), 14.3.2 and 14.3.3 (Distance/dissimilarity measures), 14.3.10 (K-medoids algorithm), 14.3.12 (Extra: Hierarchical Clustering) of Hastie et al., Section 13 (Clustering) and Section 12.2 (EM Algorithm for clustering) of the previous offering's class notes, and Section 7.8 (more reading on unsupervised learning and the EM algorithm) of the notes on learning and inference in probabilistic models
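A minimal K-means sketch, viewed as the hard-EM special case of a spherical GMM discussed in this lecture; the initialization scheme and toy data are illustrative assumptions.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-means: alternate hard assignments (E-like step) and mean updates (M-like step)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # initialize from data points
    for _ in range(n_iters):
        # E-like step: assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-like step: recompute each center as the mean of its assigned points.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Illustrative usage on two well-separated blobs.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
print(kmeans(X, k=2)[0])
```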