This page will provide information on what is covered in each lecture and will be updated as the class progresses.

Course slides are available on Moodle.

Date	Topics	Reading
	Overview of the course: basic course information and administrative details; supervised and unsupervised learning; learning task, instances, features, labels, reward/loss, training, testing	Chapter 1 of SS17
	Quiz on Linear Algebra (basics): vectors, matrices, tensors, basic matrix operations, special matrix types, eigendecomposition	Chapter 2.1 to 2.11 here
	Tutorial on Python for ML: to be conducted by TAs in an extra class. Attendance is optional.
	Classification and regression: overview of setup, training, test, and validation datasets, and overfitting; setting up vector notation; understanding multidimensional spaces	Slides
	Linear regression and classification: linear regression (defining loss functions); logistic classifier (defining logistic loss)	Chapter 3.1, 3.2 of DDL2019; Chapters 4.2, 4.3.2 of Bis07; Mitchell's chapter
	Numerical Optimization (basics): review of convex functions and optimization of unconstrained functions; matrix calculus (gradients and Hessians); definition and properties of convex functions (Chapters 3.1.1 to 3.1.5, 3.2 of BV); unconstrained optimization algorithms (gradient descent and stochastic gradient descent)	Chapter 4.3 of Deep Learning book; Chapters 9.1 to 9.3 of BV, excluding convergence proofs
	Programming homework on training linear classifiers with various loss functions using stochastic gradient descent
	Probability for ML (basics): probability, axioms of probability, random variables, common distributions, means, variance and other moments, joint distributions, and conditional distributions	Sections 3.1 to 3.9 here, or Chapters 1 to 6 of PRS, or Chapters 1 and 2 of Bis07
	Probabilistic generative classifiers: Naive Bayes classification; generative classifiers (LDA)	Chapter 2.5 of DDL2019
	Support vector machines: max-margin motivation (low density, high stability); margin geometry to primal SVM formulation for separable training data (demo); dual formulation and role of alpha in a form of sparse local regression; inseparable data, slack variables, hinge loss, upper bound on 0/1 training loss (demo); handling non-linear regression by lifting data points to a higher dimension (demo); polynomial, Gaussian, RBF kernels; sequential minimal optimization (SMO) algorithm	Chapter 7 of Bis07; Wikipedia
	Feedforward neural networks: feedforward networks; backpropagation algorithm	Chapter 6 of Deep Learning book
	Neural Network Architectures (CNNs): motivation; basic convolution operation; pooling; LeNet architecture for a basic image classification task	Chapters 6 and 12 of DDL book
	Neural Architectures for Sequences: RNNs; Transformers; encoder-decoder model	Chapters 10.0 to 10.4 of Deep Learning book
	Combining models: bagging; Random Forests by Breiman; boosting: gradient boosting, AdaBoost (Chapter 14.3 of Bis07)	Chapter 14.2 of Bis07; Chapter 8.7 of HTF book
	Unsupervised learning: clustering, k-means (Chapter 14.3 of HTF, chapter in Bis07, online copy of Anil Jain's book), online demo; unsupervised representation learning	Lecture notes
	Reinforcement Learning: policy gradients
	LLMs (Foundation Models for Text): pre-training; in-context learning; RLHF
	Generative models for images: overview of text-to-image generators