This page lists what is covered in each lecture and will be updated as the class progresses.

Course slides are available on Moodle.

Overview of the course

• Basic course information and administrative details
• Supervised and unsupervised learning
• Learning task, instances, features, labels, reward/loss, training, testing

Chapter 1 of SS17

Quiz on Linear Algebra (basics)

• Vectors, Matrices, Tensors, Basic matrix operations, Special matrix types, Eigen decomposition
Chapters 2.1 to 2.11 here
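As a quick self-check on the quiz topics above, a minimal NumPy sketch of eigen decomposition (the matrix A is a made-up example, not from the readings):

```python
import numpy as np

# A small symmetric matrix (made-up example).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# For symmetric matrices, eigh returns real eigenvalues (ascending)
# and orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(A)

# Reconstruct A = V diag(lambda) V^T from its eigen decomposition.
A_rec = eigvecs @ np.diag(eigvals) @ eigvecs.T
assert np.allclose(A, A_rec)
```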

Tutorial on Python for ML

• To be conducted by TAs in an extra class. Attendance is optional.

Classification and regression

• Overview: setup, training, test, validation dataset, overfitting.
• Setting up vector notations, understanding multidimensional spaces.
Slides

Linear regression and classification

• Linear regression: Defining loss functions
• Logistic classifier: Defining logistic loss
Chapter 3.1, 3.2 of DDL2019
Chapters 4.2, 4.3.2 of Bis07

Mitchell's chapter
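The two loss functions above can be written down in a few lines of NumPy; this is an illustrative sketch (assuming labels y in {0, 1} for the logistic loss), not the notation of any particular reading:

```python
import numpy as np

def squared_loss(w, X, y):
    """Mean squared error of the linear predictor X @ w."""
    return np.mean((X @ w - y) ** 2)

def logistic_loss(w, X, y):
    """Mean negative log-likelihood of the logistic classifier,
    with labels y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))  # sigmoid of the scores
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Toy data for illustration.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([0.0, 1.0])
w0 = np.zeros(2)
```

At w = 0 the logistic classifier predicts probability 0.5 everywhere, so its loss is log 2 regardless of the data.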

Numerical Optimization (basics)

• Review of convex function and optimization of unconstrained functions.
• Matrix calculus: gradients and Hessians.
• Definition and properties of convex function (Chapters 3.1.1 to 3.1.5, 3.2 of BV)
Chapter 4.3 of Deep Learning book
Chapters 9.1 to 9.3 of BV excluding convergence proofs
Programming homework on training linear classifiers with various loss functions using stochastic gradient descent
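A minimal sketch of stochastic gradient descent on the logistic loss, in the spirit of the homework but not its solution (the toy data, step size, and epoch count are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_logistic(X, y, eta=0.1, epochs=100):
    """Train a logistic-loss linear classifier by SGD.
    Labels y in {0, 1}; fixed step size eta (an illustrative choice)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):              # one pass in random order
            p = 1.0 / (1.0 + np.exp(-(X[i] @ w)))  # predicted probability
            w -= eta * (p - y[i]) * X[i]           # single-example gradient step
    return w

# Toy 1-D data: label is 1 iff the feature is positive.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w = sgd_logistic(X, y)
preds = (X @ w > 0).astype(float)
```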

Probability for ML (basics)

• Probability, axioms of probability, random variables, common distributions, mean, variance and other moments, joint distributions, and conditional distributions.
Sections 3.1 to 3.9 here,
or Chapters 1 to 6 of PRS,
or Chapters 1 and 2 of Bis07
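As a small numerical check of the mean and variance definitions above, a simulation sketch (the Bernoulli parameter p = 0.3 is an arbitrary example value):

```python
import numpy as np

# For X ~ Bernoulli(p): E[X] = p and Var[X] = p * (1 - p).
p = 0.3
rng = np.random.default_rng(0)
samples = (rng.random(200_000) < p)  # boolean Bernoulli draws

mean_hat = samples.mean()  # empirical mean, close to p
var_hat = samples.var()    # empirical variance, close to p * (1 - p)
```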

Probabilistic Generative classifiers

• Naive Bayes classification
• Generative classifiers: LDA
Chapter 2.5 of DDL2019
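A minimal sketch of Gaussian naive Bayes training and prediction (the toy data, equal class priors, and the small variance floor are assumptions for illustration):

```python
import numpy as np

# Toy data: two well-separated classes in 2-D.
X = np.array([[1.0, 2.0], [1.2, 1.8], [5.0, 6.0], [5.2, 5.8]])
y = np.array([0, 0, 1, 1])

def fit_gnb(X, y):
    """Estimate per-class, per-feature Gaussian means and variances."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, means, vars_

def predict_gnb(x, classes, means, vars_):
    """Pick the class maximizing log p(x | c), assuming equal priors
    and conditionally independent Gaussian features."""
    logp = -0.5 * np.sum(np.log(2 * np.pi * vars_)
                         + (x - means) ** 2 / vars_, axis=1)
    return classes[np.argmax(logp)]

classes, means, vars_ = fit_gnb(X, y)
```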

Support vector machines

• Max margin motivation: low density, high stability
• Margin geometry to primal SVM formulation for separable training data (demo)
• Dual formulation and the role of the alpha coefficients (prediction as a form of sparse local regression)
• Inseparable data, slack variables, hinge loss, upper bound on 0/1 training loss (demo)
• Handling non-linearly separable data by lifting points to a higher-dimensional space (demo)
• Polynomial and Gaussian (RBF) kernels
• Sequential minimal optimization (SMO) algorithm
Chapter 7 of Bis07
Wikipedia
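The hinge loss and its relation to the 0/1 training loss can be sketched in a few lines (labels in {-1, +1}; the one-dimensional data is a toy example):

```python
import numpy as np

def hinge_loss(w, X, y):
    """Average hinge loss max(0, 1 - y * (w . x)), labels in {-1, +1}."""
    return np.mean(np.maximum(0.0, 1.0 - y * (X @ w)))

def zero_one_loss(w, X, y):
    """Fraction of training points with non-positive margin."""
    return np.mean(y * (X @ w) <= 0)

# Toy separable data on the line.
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
```

Whenever the margin is non-positive the hinge term is at least 1, so the hinge loss upper-bounds the 0/1 loss, as stated in the lecture.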

Feedforward Neural networks

• Feedforward networks
• Backpropagation Algorithm

Chapter 6 of Deep Learning book
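A minimal sketch of backpropagation through a one-hidden-layer network with a squared-error loss, checked against a finite-difference estimate (all shapes, weights, and data are made-up toy values):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # toy input
t = 1.0                       # toy target
W1 = rng.normal(size=(4, 3))  # hidden-layer weights
w2 = rng.normal(size=4)       # output weights

def forward(W1, w2, x):
    z = W1 @ x           # pre-activations
    h = np.tanh(z)       # hidden activations
    yhat = w2 @ h        # scalar output
    return z, h, yhat

# Forward pass, then backward pass applying the chain rule layer by layer.
z, h, yhat = forward(W1, w2, x)
dL_dy = 2.0 * (yhat - t)                  # dL/dyhat for squared error
g_w2 = dL_dy * h                          # gradient w.r.t. w2
dL_dh = dL_dy * w2                        # back through the output layer
dL_dz = dL_dh * (1.0 - np.tanh(z) ** 2)   # back through tanh
g_W1 = np.outer(dL_dz, x)                 # gradient w.r.t. W1

# Check one entry against a central finite difference.
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
W1m = W1.copy(); W1m[0, 0] -= eps
loss = lambda W: (forward(W, w2, x)[2] - t) ** 2
num = (loss(W1p) - loss(W1m)) / (2 * eps)
```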

Neural Network Architectures: CNNs

• Motivation; basic convolution operation; pooling.
• LeNet architecture for a basic image classification task.
Chapter 6 and 12 of DDL book
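The convolution and pooling operations above can be sketched directly (strictly, the cross-correlation used by most deep learning libraries; the 4x4 image and 2x2 kernel are made-up toy values):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (no flip, i.e. cross-correlation)."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling (odd trailing rows/cols dropped)."""
    H, W = x.shape
    return x[:H // 2 * 2, :W // 2 * 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

image = np.arange(16.0).reshape(4, 4)            # toy 4x4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])     # toy 2x2 filter
fmap = conv2d(image, kernel)                     # 3x3 feature map
pooled = max_pool2x2(fmap)
```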

Neural Architectures for Sequences

• RNNs
• Transformers
• Encoder-decoder model

Chapter 10.0 to 10.4 of Deep Learning book
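A minimal sketch of a vanilla RNN forward pass, showing the weight sharing across time steps (all sizes and weights are made-up toy values):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 3, 5                           # toy input and hidden sizes
W_xh = rng.normal(size=(d_h, d_in)) * 0.1  # input-to-hidden weights
W_hh = rng.normal(size=(d_h, d_h)) * 0.1   # hidden-to-hidden weights
b = np.zeros(d_h)

def rnn_forward(xs):
    """Apply the same weights at every step; the hidden state h
    carries context forward through the sequence."""
    h = np.zeros(d_h)
    hs = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b)
        hs.append(h)
    return np.array(hs)

xs = rng.normal(size=(7, d_in))  # toy sequence of length 7
hs = rnn_forward(xs)
```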

Combining models
Chapter 14.2 of Bis07, Chapter 8.7 of HTF book

Unsupervised learning

Lecture notes

Reinforcement Learning