CS 215: Data Interpretation and Analysis, Fall 2025

Instructor: Ajit Rajwade
Office: SIA-218, KReSIT Building
Email:

Lecture Timings: Slot 3: Monday 10:35 am to 11:30 am, Tuesday 11:35 am to 12:30 pm, Thursday 8:30 am to 9:25 am

Lecture Venue: LA 002

Instructor Office hours: Thursday 2:30 to 3:30 pm in KR 118 (or after class on Tuesdays in LA 002). (Feel free to send queries over email or moodle)

Teaching Assistants:

Srijan Das
Ayush Pratap Singh
Manivannan N
Mohammad Kashif Khan
Anirban Paul
Kumar Rajnish
Sabil Ahmad
Badisa Chennakesava Venkata Vignesh
Sameer Arvind Patil
S Ramachandran

Topics to be covered (tentative list)

Descriptive statistics
Discrete and continuous probability
Random variables and expectation
Special random variables: Bernoulli, Binomial, Geometric, Hypergeometric, Gaussian, Chi-square, Uniform, Poisson, Exponential
Hypothesis testing: z-test, t-test, Kolmogorov-Smirnov test
Parameter estimation: maximum likelihood, Bayesian, concept of asymptotic normality and Fisher information, Cramer-Rao bounds
Regression
Non-parametric probability density estimation
Multivariate Gaussians, Principal Components Analysis
Transformation and Simulation of Random Variables
We will cover some nice applications in image processing, basic machine learning and group testing alongside some key concepts.

Intended Audience

2nd year BTech students from CSE

Learning Materials and Textbooks

Lecture slides will be posted regularly. I may occasionally post links to videos, or additional material such as problem sets.
We will use moodle for posting assignments and grades
Course textbook: Introduction to Probability and Statistics for Engineers and Scientists: Fifth Edition (20-30 copies available in the library)

Computational Resources

Matlab

Online MATLAB @IITB, accessible through your LDAP id and password
Matlab tutorial 1
Matlab tutorial 2
Matlab tutorial 3
The MathWorks - MATLAB Tutorial
Matlab Primer
On-line Matlab Help
Writing Fast Matlab Code (pdf)
One more tutorial for writing fast matlab code
Code Vectorization Guide
Matlab Programming Style Guidelines (pdf)
Matlab array manipulation

Grading Policy (tentative)

Mid-sem exam: 25%
Final exam (cumulative): 30%
Programming and written assignments (about five): 30% - all to be done in groups of 2 students. This will possibly include a "project".
Two pre-announced quizzes: 15% total

Other Policies

Attendance is mandatory. Students with less than 90% attendance may be given a DX grade.
Assignments will be given out (typically) once every two or three weeks. They must be submitted on or before the deadline. No late assignments will be accepted. The programming components of the assignments will typically involve MATLAB, so you must be willing to learn it quickly.
We will adopt a zero-tolerance policy against any form of plagiarism or any other form of cheating. Just don't do it! In cases of plagiarism, givers and takers will both be considered equally responsible.
This course is (inherently) cumulative. The syllabus for the final exam will include everything taught during the semester.

Tutorials

See moodle

Quizzes

See moodle

Lecture Schedule:

Number
Date
Content of the Lecture
Assignments/Readings/Notes

1 28/07

Introduction, course overview and course policies

Slides: Course Overview

2 29/07
Descriptive Statistics
Terminology: population, sample, discrete and continuous valued attributes

Frequency tables, frequency polygons, line diagrams, pie charts, relative frequency tables

Histograms with examples for image intensity histograms, image gradient histograms

Histogram binning problem

Data summarization: Mean and Median

Slides: Descriptive Statistics

Readings: sections 2.1 and 2.2 of the textbook by Sheldon Ross

3 31/07

Data summarization: mean and median

"Proof" that median minimizes the sum of absolute deviations - using calculus

Proof that median minimizes the sum of absolute deviations, without using calculus

Concept of quantile/percentile

Calculation of mean and median in different ways from histogram or cumulative plots

Standard deviation and variance, some applications

Two-sided Chebyshev inequality with proof; One-sided Chebyshev inequality (Chebyshev-Cantelli inequality)

Slides: Descriptive Statistics

Non-calculus proof for median

Readings: sections 2.1 and 2.2 of the textbook by Sheldon Ross
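The claim that the median minimizes the sum of absolute deviations can be checked numerically. The following is a minimal Python sketch (the course itself uses MATLAB); the data values, the candidate grid and the helper name `sum_abs_dev` are made up for illustration and are not taken from the slides.

```python
import statistics

def sum_abs_dev(data, c):
    """Sum of absolute deviations of the data from a candidate centre c."""
    return sum(abs(x - c) for x in data)

# Made-up sample with one large outlier (assumption, for illustration only).
data = [2.0, 3.0, 5.0, 9.0, 40.0]
med = statistics.median(data)
mean = statistics.mean(data)

# Brute-force search over a grid of candidate centres in [0, 40].
grid = [i / 10 for i in range(0, 401)]
best = min(grid, key=lambda c: sum_abs_dev(data, c))

print("median:", med, "cost:", sum_abs_dev(data, med))
print("mean:  ", mean, "cost:", sum_abs_dev(data, mean))
print("grid minimizer:", best)
```

With an odd number of distinct points the minimizer is unique, so the grid search lands on the median; the mean, pulled towards the outlier, gives a larger cost.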

4 4/8

Two-sided Chebyshev inequality with proof; One-sided Chebyshev inequality (Chebyshev-Cantelli inequality)

Concept of correlation coefficient and formula for it; proof that its value lies between -1 and +1

Correlation coefficient: properties; uncentered correlation coefficient; limitations of correlation coefficient and Anscombe's quartet

Correlation and causation

Slides: Descriptive Statistics

Readings: sections 2.1,2.2,2.3,2.4,2.6 of the textbook by Sheldon Ross
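As a small illustration of the two-sided Chebyshev inequality for a data set, the Python sketch below compares the fraction of points lying at least k standard deviations from the mean with the bound 1/k^2. It assumes the population form of the standard deviation (division by n), for which the bound holds exactly; the data set and the function name `tail_fraction` are hypothetical.

```python
import statistics

def tail_fraction(data, k):
    """Fraction of points at least k population standard deviations from the mean.

    Assumes the data are not all identical, so the standard deviation is positive.
    """
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)  # population form: divides by n
    return sum(1 for x in data if abs(x - mu) >= k * sigma) / len(data)

# Made-up data with one outlier (assumption, for illustration only).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]
for k in (1.5, 2, 3):
    print(k, tail_fraction(data, k), "<=", 1 / k ** 2)
```

The bound is usually loose, as it is here: it has to hold for every data set, whatever its shape.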

5 5/8 Discrete Probability

Discrete probability: sample space, event, composition of events: union, intersection, complement, exclusive or, De Morgan's laws

Boole's and Bonferroni's inequalities

Conditional probability, Bayes rule, False Positive Paradox

Independent and mutually exclusive events

Birthday paradox

Slides: Discrete Probability

Readings: Chapter 3 of the textbook by Sheldon Ross
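The birthday paradox lends itself to a short computation. The Python sketch below assumes independent birthdays that are uniform over 365 days (no leap years); the function name `p_shared_birthday` is chosen here for illustration.

```python
def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), assuming independent
    birthdays that are uniformly distributed over `days` days."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (days - i) / days  # the (i+1)-th person avoids i taken days
    return 1.0 - p_distinct

for n in (10, 22, 23, 50):
    print(n, round(p_shared_birthday(n), 4))
```

Under these assumptions the probability first exceeds one half at n = 23.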

6 7/8

Independent and mutually exclusive events

Birthday paradox

MATLAB Tutorial

Code vectorization: vectors and matrix operations

Plotting graphs, scatterplots, images in MATLAB

Some functions for computing statistical quantities

MATLAB Demo Examples

Slides: Discrete Probability

Readings: Chapter 3 of the textbook by Sheldon Ross

7 11/8 Random Variables

Random variable: concept, discrete and continuous random variables

Probability mass function (pmf), cumulative distribution function (cdf) and probability density function (pdf)

Expected value for discrete and continuous random variables; Law of the Unconscious Statistician

Standard deviation, Markov's inequality, Chebyshev's inequality; proofs of these inequalities

Concept of covariance and its properties

Slides: Random Variables

Readings: Chapter 4 of the textbook by Sheldon Ross

8 12/8

Proof of the law of the unconscious statistician

Weak law of large numbers and its proof using Chebyshev's inequality; statement of strong law

Slides: Random Variables

Readings: Chapter 4 of the textbook by Sheldon Ross
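The weak law of large numbers and its Chebyshev-based proof can be illustrated with a small simulation. This Python sketch assumes i.i.d. Uniform(0, 1) samples (true mean 0.5, variance 1/12); the seed, sample sizes and function names are arbitrary choices made for this example.

```python
import random

def sample_mean_uniform(n, seed=0):
    """Mean of n independent Uniform(0, 1) draws (true mean is 0.5)."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n)) / n

def chebyshev_bound(n, eps, var=1.0 / 12.0):
    """Chebyshev bound on P(|sample mean - mu| >= eps), where var = Var(X_i).

    The sample mean has variance var / n, so the bound is var / (n * eps^2),
    capped at 1 because it is a probability bound.
    """
    return min(1.0, var / (n * eps ** 2))

for n in (10, 1000, 100000):
    err = abs(sample_mean_uniform(n) - 0.5)
    print(n, err, chebyshev_bound(n, eps=0.01))
```

The bound tends to 0 as n grows for any fixed eps, which is exactly the statement of the weak law.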

9 14/8

Joint PMF, PDF, CDF with examples; marginals obtained by integration of joint PDFs, CDFs, PMFs

Concept of independence of random variables

Slides: Random Variables

Readings: Chapter 4 of the textbook by Sheldon Ross

10 18/8

Conditional CDF, PDF, PMF; conditional expectation; examples

Moment generating functions: definition, genesis, properties

Slides: Random Variables

Readings: Chapter 4 of the textbook by Sheldon Ross

11 19/8

Conditional CDF, PDF, PMF; conditional expectation; examples

Moment generating functions: properties, uniqueness proofs, connection to Laplace transforms; mention of characteristic functions

Families of Random Variables

Bernoulli random variables: mean, median, mode, variance, MGF

Slides: Random Variables

Slides: Families of Random Variables

Readings: Chapter 4 of the textbook by Sheldon Ross

Readings: Chapter 5 of the textbook by Sheldon Ross

12 21/8

Binomial random variables: mean, median, mode, variance, MGF

Slides: Families of Random Variables

Readings: Chapter 5 of the textbook by Sheldon Ross

13 25/8

Gaussian distribution: definition, mean, variance, verification of integration to 1, MGF, error functions

Introduction to and basic statement of the central limit theorem, with examples

Slides: Families of Random Variables

Readings: chapter 5

Code for the central limit theorem

14 26/8

Properties of Gaussian: CDF and error function, MGF

Relation between CLT and Law of Large Numbers

Gaussian tail bounds

Distribution of sample mean and sample variance, Bessel's correction

Slides: Families of Random Variables

Readings: chapter 5

Code for the central limit theorem

15 28/8

Proof of central limit theorem

Chi-square distribution

Distribution of sample variance given Gaussian random variables

Slides: Families of Random Variables

Readings: chapter 5

Code for the central limit theorem

16 1/9

Uniform distribution: mean, mode, median, MGF, sampling from a PMF, probability integral transform

Hypergeometric distribution: mean, variance

Slides: Families of Random Variables

Readings: chapter 5

Code for the central limit theorem

Probability integral transform
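Sampling from a PMF with a uniform random number generator can be sketched as follows: accumulate the probabilities into a CDF, draw u from Uniform(0, 1), and return the first value whose cumulative probability exceeds u. This is an illustrative Python sketch, not the code linked above; the PMF, the labels and the function name `sample_from_pmf` are assumptions.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_from_pmf(values, probs, n, seed=0):
    """Draw n samples from a discrete distribution by inverting its CDF."""
    rng = random.Random(seed)
    cdf = list(accumulate(probs))      # cumulative probabilities
    last = len(values) - 1
    draws = []
    for _ in range(n):
        u = rng.random()               # uniform on [0, 1)
        # First index whose cumulative probability is strictly greater than u;
        # min() guards against floating-point round-off in the final CDF entry.
        idx = min(bisect_right(cdf, u), last)
        draws.append(values[idx])
    return draws

draws = sample_from_pmf(["a", "b", "c"], [0.2, 0.5, 0.3], n=100000)
freq_b = draws.count("b") / len(draws)
print("empirical P(b):", freq_b)
```

The binary search makes each draw cost O(log m) for a PMF with m values, after a one-time O(m) accumulation.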

17 2/9

Hypergeometric distribution: method of capture+recapture in ecology

Multinomial distribution: mean vector, covariance matrix, MGF

Slides: Families of Random Variables

Readings: chapter 5

Code for the central limit theorem

Probability integral transform

18 4/9

Poisson distribution: genesis and examples, Poisson limit theorem, mean, variance, MGF, Poisson thinning, relation to normal distribution

Exponential distribution: genesis, and relevance to Poisson distribution

Slides: Families of Random Variables

Readings: chapter 5

19 8/9

Exponential distribution: mean, variance, MGF, property of memorylessness

Parameter Estimation

Concept of parameter estimation (or parametric PDF/PMF estimation)

Maximum likelihood estimation (MLE)

MLE for parameters of Bernoulli, Poisson, Gaussian and uniform distributions

Least squares line fitting as an MLE problem

MLE for parameters of uniform distributions

Slides: Families of Random Variables

Readings: chapter 5

Slides: Parameter estimation

MLE derivations

Readings: Section 5.6 from the textbook by Sheldon Ross

Readings: Sections 7.1, 7.2, 7.5, 7.7, 9.2 (for least squares line fitting) of the textbook by Sheldon Ross

20 9/9

Least squares line fitting as an MLE problem

MLE for parameters of uniform distributions

Concept of ML estimate as a random variable, notion of confidence interval

Slides: Parameter estimation

MLE derivations

Readings: Section 5.6 from the textbook by Sheldon Ross

Readings: Sections 7.1, 7.2, 7.5, 7.7, 9.2 (for least squares line fitting) of the textbook by Sheldon Ross
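For least squares line fitting viewed as an MLE problem: if y_i = a + b x_i + noise, with independent zero-mean Gaussian noise of constant variance, maximizing the likelihood over (a, b) is the same as minimizing the sum of squared residuals. The Python sketch below implements the resulting closed-form solution; the function name `fit_line` and the toy data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares (Gaussian-noise MLE) estimates for y = intercept + slope * x.

    Assumes at least two distinct x values, so that sxx > 0.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Noise-free toy data on the line y = 1 + 2x (made up for illustration).
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```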

21 11/9

Concept of estimator bias, mean squared error, variance

Estimators for interval of uniform distribution: example of bias

Slides: Parameter estimation

MLE derivations

Readings: Section 5.6 from the textbook by Sheldon Ross

Readings: Sections 7.1, 7.2, 7.3, 7.5, 7.7, 9.2 (for least squares line fitting) of the textbook by Sheldon Ross
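One standard example of estimator bias, assumed here to be in the spirit of the lecture, is the MLE of theta for Uniform(0, theta): the sample maximum. Its expectation is n * theta / (n + 1), so it underestimates theta on average. The Python sketch below checks this by simulation; the parameter values, the seed and the function name are arbitrary.

```python
import random

def mean_of_max_estimator(theta, n, trials, seed=0):
    """Average of the MLE max(x_1..x_n) over repeated Uniform(0, theta) samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.uniform(0.0, theta) for _ in range(n))
    return total / trials

theta, n = 1.0, 5
simulated = mean_of_max_estimator(theta, n, trials=20000)
expected = n * theta / (n + 1)   # theoretical mean of the sample maximum
print("simulated:", simulated, "theory:", expected)
```

Multiplying the sample maximum by (n + 1) / n removes the bias.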

22 22/9

Concept of nonparametric density estimation; Concept of histogram as a probability density estimator

Bias, variance and MSE for a histogram estimator for a smooth density (with bounded first derivatives) which is non-zero on a finite-sized interval; derivation of optimal number of bins (equivalently, optimal binwidth) and optimal MSE O(n^{-2/3})

Derivation of MSE, bias, variance for a histogram. Read section 6.1 only. These notes are by Prof. Yen-Chi Chen from the Univ. of Washington, Seattle. A local copy of the pdf is here

23 23/9

Kernel density estimation for non-parametric PDF estimation.

Derivation of bias, variance, MSE for KDE.

Derivation of MSE, bias, variance for a KDE. These notes are by Prof. Yen-Chi Chen from the Univ. of Washington, Seattle. A local copy of the pdf is here
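A kernel density estimate places one scaled kernel at each data point and averages them. The Python sketch below uses a Gaussian kernel with bandwidth h; the kernel choice, the data and the bandwidth are assumptions made for illustration and need not match the lecture or the linked notes.

```python
import math

def gaussian_kde(data, h):
    """Return the function x -> (1 / (n h)) * sum_i K((x - x_i) / h),
    where K is the standard normal density and h > 0 is the bandwidth."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return f_hat

# Made-up sample and bandwidth (assumptions, for illustration only).
f_hat = gaussian_kde([-1.0, 0.0, 0.5, 2.0], h=0.5)

# Riemann-sum check that the estimate integrates to (approximately) one.
step = 0.01
area = sum(f_hat(-10.0 + i * step) for i in range(2001)) * step
print("f_hat(0):", f_hat(0.0), "area:", area)
```

A smaller h gives a spikier estimate (lower bias, higher variance) and a larger h a smoother one, which is the trade-off behind the MSE derivation.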

24 25/9

Confidence intervals in maximum likelihood estimation

Concept of two-sided confidence interval and one-sided confidence interval

Confidence interval for mean of a Gaussian with known standard deviation

Confidence interval for variance of a Gaussian

Slides: Parameter estimation

Chapter 7 of the textbook

25 29/9 Hypothesis Testing

Concept of statistical hypothesis test

Type I and Type II errors, p-value, test statistic, critical region

Hypothesis test for mean of a normal distribution with known variance

Slides

Chapter 8 of the textbook
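The test for the mean of a normal population with known variance can be sketched in a few lines. The Python sketch below computes the z statistic and a two-sided p-value using the standard normal CDF expressed through the complementary error function; the function name and the sample values are hypothetical.

```python
import math

def z_test_two_sided(sample, mu0, sigma):
    """Two-sided z-test of H0: mean = mu0, with known standard deviation sigma.

    Returns (z, p), where p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    n = len(sample)
    x_bar = sum(sample) / n
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Made-up sample with mean 0.98, n = 4, sigma = 1, so z = 1.96.
z, p = z_test_two_sided([0.98, 0.98, 0.98, 0.98], mu0=0.0, sigma=1.0)
print("z =", z, "p =", p)
```

At significance level alpha, H0 is rejected when p < alpha, or equivalently when |z| exceeds the corresponding critical value.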

26 30/9

One-sided test for mean of a normal distribution with known variance

Testing equality of means of two normal populations with known variances

Hypothesis test for mean of a normal distribution with unknown variance

Slides

Chapter 8 of the textbook

27 6/10

Hypothesis test for mean of a normal distribution with unknown variance

Derivation of the t-distribution

Testing equality of means of two normal populations with unknown but equal variances

Slides

Chapter 8 of the textbook

28 7/10

Hypothesis test regarding the variance of a normal population (with unknown mean)

Hypothesis test for the success parameter of a Bernoulli distribution

Hypothesis test: equality of success parameters in two Bernoulli distributions

Chi-square test for goodness of fit

Slides

Chapter 8 of the textbook

Chapter 11 for goodness of fit tests

29 7/10

Chi-square test for goodness of fit

Kolmogorov-Smirnov goodness of fit test: basic (non-constructive) derivation, computation of the test statistic efficiently; concept of empirical CDF, empirical CDF as an unbiased and low variance estimator of the true CDF

Brief mention of the Dvoretzky-Kiefer-Wolfowitz inequality

Two examples of hypothesis testing in medicine: placebo effect, control group

Slides

Chapter 8 of the textbook

Chapter 11 for goodness of fit tests
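The Kolmogorov-Smirnov statistic is the largest gap between the empirical CDF and the hypothesized CDF F. Since the empirical CDF only jumps at the sample points, it suffices to sort the sample and check the gap just before and at each jump. The Python sketch below assumes F is continuous; the function name and the test data are illustrative.

```python
def ks_statistic(sample, cdf):
    """One-sample KS statistic D = sup_x |F_n(x) - F(x)| for a continuous F.

    After sorting (O(n log n)), the supremum is attained at a sample point,
    either just before or at the jump of the empirical CDF.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform_cdf(x):
    """CDF of Uniform(0, 1), used here as the hypothesized distribution."""
    return min(1.0, max(0.0, x))

d = ks_statistic([0.7, 0.1, 0.4], uniform_cdf)
print("D =", d)
```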

30 9/10 Multivariate Normal Distribution

Expression for multivariate normal PDF; mean vector, covariance matrix

Properties of the covariance matrix

Concept of random normal vectors and derivation of the multivariate normal PDF using transformation of random variables

Slides

Wiki article on multivariate normal distribution

Statlect article on multivariate normal distribution

31 13/10

Concept of random normal vectors and derivation of the multivariate normal PDF using transformation of random variables

Sampling from a multivariate normal distribution using univariate Gaussians as a sub-routine

Special versions of the multivariate Gaussian: diagonal covariance matrix, covariance matrix proportional to identity

Slides

Wiki article on multivariate normal distribution

Statlect article on multivariate normal distribution

32 14/10

Marginalization: argument that individual elements of a multivariate normal random vector are normally distributed

Eigenvectors and eigenvalues of the covariance matrix

Matrices as Geometric Transformations

Whitening transformation

Slides

Wiki article on multivariate normal distribution

Statlect article on multivariate normal distribution

33 16/10

Matrices as Geometric Transformations

Whitening transformation

Joint and marginal PDFs

Slides

Wiki article on multivariate normal distribution

Statlect article on multivariate normal distribution

34 23/10

Joint and marginal PDFs

Isocontours of multivariate normal PDFs: hyperspheres, hyperellipsoids, oriented hyperellipsoids

Euclidean distance and Mahalanobis distance, concept of a distance measure

Different interpretations of the Mahalanobis distance, relation to chi-squared distribution

Slides

Wiki article on multivariate normal distribution

Statlect article on multivariate normal distribution
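The Mahalanobis distance of a point x from a mean mu, for covariance matrix C, is sqrt((x - mu)^T C^{-1} (x - mu)). The Python sketch below works only for the 2x2 case, using the closed-form matrix inverse so that no linear algebra library is needed; the function name and the inputs are made up for illustration.

```python
import math

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance in two dimensions.

    cov is ((a, b), (c, d)) and is assumed symmetric positive definite,
    so its inverse is (1 / det) * ((d, -b), (-c, a)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0 = x[0] - mu[0]
    dx1 = x[1] - mu[1]
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.sqrt(q)

identity = ((1.0, 0.0), (0.0, 1.0))
stretched = ((4.0, 0.0), (0.0, 1.0))   # variance 4 along the first axis
print(mahalanobis_2d((3.0, 4.0), (0.0, 0.0), identity))   # Euclidean case
print(mahalanobis_2d((2.0, 0.0), (0.0, 0.0), stretched))  # 2 units = 1 std. dev.
```

With the identity covariance the distance reduces to the Euclidean distance. For a Gaussian vector in d dimensions, the squared Mahalanobis distance from its mean follows a chi-squared distribution with d degrees of freedom.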

35 25/10

Principal components analysis (PCA): concept of data compression

Derivation of PCA: eigenvalues of covariance/scatter matrix as variance of the data when projected onto the eigenvectors, derivation of PCA for a single direction

Slides

36 27/10

Derivation of PCA: eigenvalues of covariance/scatter matrix as variance of the data when projected onto multiple directions, derivation of PCA for multiple directions

Application: eigenfaces, decay of eigenvalues of covariance matrices of face images

Slides

37 28/10 Maximum Likelihood and Bayesian Estimation

Regularity conditions

Efficiency of an estimator

Fisher information: intuition, different algebraic forms, examples for different distributions, significance in ML estimation

Statement of Cramer Rao lower bound

Slides

38 30/10

Statement of Cramer Rao lower bound; proof of Cramer Rao bound

Significance of Cramer Rao bound; examples of the bound for Poisson and Gaussian distributions

Concept of efficiency and asymptotic efficiency of an estimator

Slides

39 3/11

Concept of efficiency and asymptotic efficiency of an estimator

Asymptotic distribution of maximum likelihood estimates using the Fisher information

Functional invariance of maximum likelihood estimators

Concept of Bayesian estimation; posterior and prior distributions

Slides

40 4/11

Concept of Bayesian estimation; posterior and prior distributions

Maximum a posteriori (MAP) and Bayes estimators with examples

Concept of conjugate priors

Slides

41 6/11

Concept of conjugate priors

Example of conjugate priors: Bernoulli likelihood and Beta prior

Discussion regarding HW5

Slides
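For the Bernoulli likelihood with a Beta prior: a Beta(a, b) prior combined with k successes in n Bernoulli trials gives a Beta(a + k, b + n - k) posterior, which is what makes the prior conjugate. The Python sketch below performs this update and reports the posterior mean and the MAP estimate; the prior parameters, the observations and the function names are hypothetical.

```python
def beta_bernoulli_update(a, b, observations):
    """Posterior Beta parameters after observing a list of 0/1 outcomes."""
    k = sum(observations)          # number of successes
    n = len(observations)
    return a + k, b + (n - k)

def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)

def beta_mode(a, b):
    """Mode of Beta(a, b); this formula is valid only when a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)

# Assumed prior Beta(2, 2) and made-up data with 3 successes in 4 trials.
post_a, post_b = beta_bernoulli_update(2, 2, [1, 1, 0, 1])
print("posterior:", (post_a, post_b))
print("posterior mean:", beta_mean(post_a, post_b))
print("MAP estimate:", beta_mode(post_a, post_b))
```

For these data the ML estimate is 3/4, while the posterior mean and the MAP estimate are pulled towards the prior mean of 1/2.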
