Number |
Date |
Content of the Lecture |
Assignments/Readings/Notes |
| 1 |
28/07 |
- Introduction, course overview and course policies
|
|
| 2 |
29/07 |
Descriptive Statistics
- Terminology: population, sample, discrete and continuous valued attributes
- Frequency tables, frequency polygons, line diagrams, pie charts, relative frequency tables
- Histograms with examples for image intensity histograms, image gradient histograms
- Histogram binning problem
- Data summarization: Mean and Median
|
|
| 3 |
31/07 |
- Data summarization: mean and median
- "Proof" that median minimizes the sum of absolute deviations - using calculus
- Proof that median minimizes the sum of absolute deviations, without using calculus
- Concept of quantile/percentile
- Calculation of mean and median in different ways from histogram or cumulative plots
- Standard deviation and variance, some applications
- Two-sided Chebyshev inequality with proof; one-sided Chebyshev inequality (Chebyshev-Cantelli inequality)
|
|
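The claim from lecture 3 that the median minimizes the sum of absolute deviations is easy to check numerically. A small Python sketch (Python rather than the course's MATLAB; the sample values are made up for illustration):

```python
# Numerical check that the median minimizes sum_i |x_i - c| over c.
data = [2, 3, 5, 9, 14]  # arbitrary odd-length sample

def sum_abs_dev(c, xs):
    # Objective: sum of absolute deviations from center c.
    return sum(abs(x - c) for x in xs)

# For an odd-length sample, the median is the middle order statistic.
median = sorted(data)[len(data) // 2]

# Grid search over candidate centers from 0.0 to 20.0 in steps of 0.1.
candidates = [i / 10 for i in range(201)]
best = min(candidates, key=lambda c: sum_abs_dev(c, data))
```

For an even-length sample the minimizer is not unique (any point between the two middle values works), which is one reason the grid search is a useful sanity check alongside the proofs.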
| 4 |
4/8 |
- Two-sided Chebyshev inequality with proof; one-sided Chebyshev inequality (Chebyshev-Cantelli inequality)
- Concept of correlation coefficient and formula for it; proof that its value lies between -1 and +1
- Correlation coefficient: properties; uncentered correlation coefficient; limitations of correlation coefficient and Anscombe's quartet
- Correlation and causation
|
|
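The correlation coefficient from lecture 4 can be computed from scratch; the sketch below (illustrative Python, made-up data) also shows the Anscombe-style limitation that a perfectly symmetric nonlinear relationship can have zero correlation:

```python
import math

def pearson_r(xs, ys):
    # r = cov(x, y) / (sd(x) * sd(y)), using population (1/n) formulas.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

xs = [1, 2, 3, 4, 5]
r_linear = pearson_r(xs, [2 * x + 1 for x in xs])   # exactly linear: r = 1
r_quad = pearson_r(xs, [(x - 3) ** 2 for x in xs])  # symmetric parabola: r = 0
```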
| 5 |
5/8 |
Discrete Probability
- Discrete probability: sample space, event, composition of events: union, intersection, complement, exclusive or, De Morgan's laws
- Boole's and Bonferroni's inequalities
- Conditional probability, Bayes rule, False Positive Paradox
- Independent and mutually exclusive events
- Birthday paradox
|
|
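The birthday paradox from lecture 5 has a short exact computation; a Python sketch (not part of the course materials, just illustrative):

```python
def prob_shared_birthday(k, days=365):
    # P(at least two of k people share a birthday)
    # = 1 - prod_{i=0}^{k-1} (days - i) / days,
    # assuming all 365 birthdays are equally likely and independent.
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

# The probability first exceeds 1/2 at k = 23.
p23 = prob_shared_birthday(23)
p22 = prob_shared_birthday(22)
```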
| 6 |
7/8 |
- Independent and mutually exclusive events
- Birthday paradox
MATLAB Tutorial
- Code vectorization: vectors and matrix operations
- Plotting graphs, scatterplots, images in MATLAB
- Some functions for computing statistical quantities
|
|
| 7 |
11/8 |
Random Variables
- Random variable: concept, discrete and continuous random variables
- Probability mass function (pmf), cumulative distribution function (cdf) and probability density function (pdf)
- Expected value for discrete and continuous random variables; Law of the Unconscious Statistician
- Standard deviation, Markov's inequality, Chebyshev's inequality; proofs of these inequalities
- Concept of covariance and its properties
|
|
| 8 |
12/8 |
- Proof of the law of the unconscious statistician
- Weak law of large numbers and its proof using Chebyshev's inequality; statement of strong law
|
|
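The weak law of large numbers from lecture 8 can be illustrated by simulation; a Python sketch with fair coin flips (sample sizes and seed are arbitrary choices):

```python
import random

random.seed(0)

def sample_mean(n):
    # Mean of n Bernoulli(1/2) coin flips; the weak law of large numbers
    # (provable via Chebyshev's inequality) says this concentrates at 1/2.
    return sum(random.random() < 0.5 for _ in range(n)) / n

dev_small = abs(sample_mean(100) - 0.5)
dev_large = abs(sample_mean(100_000) - 0.5)

# The Chebyshev bound used in the proof, for P(|mean - 1/2| >= 0.01)
# at n = 100_000: Var(mean) = p(1-p)/n = 0.25/n, so the bound is
# 0.25 / (n * 0.01**2) = 0.025.
chebyshev_bound = 0.25 / (100_000 * 0.01 ** 2)
```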
| 9 |
14/8 |
- Joint PMF, PDF, CDF with examples; marginals obtained by integration of joint PDFs, CDFs, PMFs
- Concept of independence of random variables
|
|
| 10 |
18/8 |
- Conditional CDF, PDF, PMF; conditional expectation; examples
- Moment generating functions: definition, genesis, properties
|
|
| 11 |
19/8 |
- Conditional CDF, PDF, PMF; conditional expectation; examples
- Moment generating functions: properties, uniqueness proofs, connection to Laplace transforms; mention of characteristic functions
Families of Random Variables
- Bernoulli random variables: mean, median, mode, variance, MGF
|
|
| 12 |
21/8 |
- Binomial random variables: mean, median, mode, variance, MGF
|
|
| 13 |
25/8 |
- Gaussian distribution: definition, mean, variance, verification of integration to 1, MGF, error functions
- Introduction to and basic statement of the central limit theorem, with examples
|
|
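A quick illustration of the central limit theorem from lecture 13, using the classic Irwin-Hall trick (a Python sketch; the sample count and seed are arbitrary):

```python
import random

random.seed(1)

def approx_std_normal():
    # The sum of 12 independent Uniform(0,1) draws has mean 6 and
    # variance 12 * (1/12) = 1, so by the CLT (sum - 6) is
    # approximately N(0, 1).
    return sum(random.random() for _ in range(12)) - 6.0

samples = [approx_std_normal() for _ in range(20_000)]
# For a standard normal, about 68.27% of the mass lies within one
# standard deviation of the mean.
within_one_sd = sum(abs(z) <= 1.0 for z in samples) / len(samples)
```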
| 14 |
26/8 |
- Properties of Gaussian: CDF and error function, MGF
- Relation between CLT and Law of Large Numbers
- Gaussian tail bounds
- Distribution of sample mean and sample variance, Bessel's correction
|
|
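Bessel's correction from lecture 14 shows up clearly in simulation: averaging the 1/n variance estimator over many small samples underestimates the true variance, while dividing by n-1 does not. A hedged Python sketch (distribution, sample size, and trial count are arbitrary):

```python
import random

random.seed(2)

true_var = 4.0   # samples drawn from N(0, 2^2)
n = 5            # small n makes the bias of the 1/n estimator visible
trials = 50_000

biased_sum = 0.0
unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 2.0) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    biased_sum += ss / n          # divide by n: expectation (n-1)/n * var
    unbiased_sum += ss / (n - 1)  # Bessel's correction: divide by n - 1

biased_avg = biased_sum / trials      # should be near (4/5) * 4 = 3.2
unbiased_avg = unbiased_sum / trials  # should be near 4.0
```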
| 15 |
28/8 |
- Proof of central limit theorem
- Chi-square distribution
- Distribution of sample variance given Gaussian random variables
|
|
| 16 |
1/9 |
- Uniform distribution: mean, mode, median, MGF, sampling from a PMF, probability integral transform
- Hypergeometric distribution: mean, variance
|
|
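The probability integral transform from lecture 16 gives a one-line sampler for any distribution with an invertible CDF; a Python sketch using the exponential distribution (rate and sample count are arbitrary):

```python
import math
import random

random.seed(3)

def sample_exponential(lam):
    # Probability integral transform: if U ~ Uniform(0,1) and F is the
    # Exponential(lam) CDF F(x) = 1 - exp(-lam*x), then
    # F^{-1}(U) = -ln(1 - U) / lam has the Exponential(lam) distribution.
    u = random.random()
    return -math.log(1.0 - u) / lam

lam = 2.0
samples = [sample_exponential(lam) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)  # should approach 1/lam = 0.5
```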
| 17 |
2/9 |
- Hypergeometric distribution: method of capture+recapture in ecology
- Multinomial distribution: mean vector, covariance matrix, MGF
|
|
| 18 |
4/9 |
- Poisson distribution: genesis and examples, Poisson limit theorem, mean, variance, MGF, Poisson thinning, relation to normal distribution
- Exponential distribution: genesis, and relevance to Poisson distribution
|
|
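Poisson thinning from lecture 18 says that keeping each event of a Poisson(lam) count independently with probability p yields a Poisson(lam*p) count. A Python sketch checking the mean by simulation (sampler, rate, and trial count are illustrative choices):

```python
import math
import random

random.seed(4)

def poisson_sample(lam):
    # Knuth's multiplication method for sampling Poisson(lam).
    threshold = math.exp(-lam)
    prod = random.random()
    k = 0
    while prod > threshold:
        prod *= random.random()
        k += 1
    return k

lam, p_keep = 6.0, 0.5
trials = 20_000
kept_total = 0
for _ in range(trials):
    n_events = poisson_sample(lam)
    # Thinning: keep each event independently with probability p_keep;
    # the kept count is again Poisson, with rate lam * p_keep.
    kept_total += sum(random.random() < p_keep for _ in range(n_events))

kept_mean = kept_total / trials  # should approach lam * p_keep = 3.0
```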
| 19 |
8/9 |
- Exponential distribution: mean, variance, MGF, property of memorylessness
Parameter Estimation
- Concept of parameter estimation (or parametric PDF/PMF estimation)
- Maximum likelihood estimation (MLE)
- MLE for parameters of Bernoulli, Poisson, Gaussian and uniform distributions
- Least squares line fitting as an MLE problem
- MLE for parameters of uniform distributions
|
|
| 20 |
9/9 |
- Least squares line fitting as an MLE problem
- MLE for parameters of uniform distributions
- Concept of ML estimate as a random variable, notion of confidence interval
|
|
| 21 |
11/9 |
- Concept of estimator bias, mean squared error, variance
- Estimators for interval of uniform distribution: example of bias
|
|
| 22 |
22/9 |
- Concept of nonparametric density estimation; Concept of histogram as a probability density estimator
- Bias, variance and MSE for a histogram estimator for a smooth density (with bounded first derivatives) which is non-zero on a finite-sized interval; derivation of optimal number of bins (equivalently, optimal binwidth) and optimal MSE O(n^{-2/3})
|
|
| 23 |
23/9 |
- Kernel density estimation for non-parametric PDF estimation.
- Derivation of bias, variance, MSE for KDE.
|
|
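The kernel density estimator from lecture 23 is short enough to write out in full; a Python sketch with a Gaussian kernel (the data and bandwidth are made up), verifying that the estimate is a valid density:

```python
import math

def gaussian_kernel(u):
    # Standard normal kernel.
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h):
    # Kernel density estimate at x with bandwidth h: the average of
    # kernels centered at each data point, scaled by 1/h.
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

data = [1.0, 1.5, 2.0, 2.5, 3.0]
h = 0.5

# A valid density estimate integrates to 1; check with a Riemann sum
# over a grid wide enough to capture essentially all the mass.
step = 0.01
grid = [-5.0 + step * i for i in range(1501)]  # covers [-5, 10]
integral = sum(kde(x, data, h) for x in grid) * step
```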
| 24 |
25/9 |
- Confidence intervals in maximum likelihood estimation
- Concept of two-sided confidence interval and one-sided confidence interval
- Confidence interval for mean of a Gaussian with known standard deviation
- Confidence interval for variance of a Gaussian
|
|
| 25 |
29/9 |
Hypothesis Testing
- Concept of statistical hypothesis test
- Type I and Type II errors, p-value, test statistic, critical region
- Hypothesis test for mean of a normal distribution with known variance
|
- Slides
- Chapter 8 of the textbook
|
| 26 |
30/9 |
- One-sided test for mean of a normal distribution with known variance
- Testing equality of means of two normal populations with known variances
- Hypothesis test for mean of a normal distribution with unknown variance
|
- Slides
- Chapter 8 of the textbook
|
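The known-variance z-test from lectures 25-26 fits in a few lines; a Python sketch (the measurements, null mean, and sigma are invented for illustration):

```python
import math

def z_statistic(xs, mu0, sigma):
    # Test statistic for H0: mu = mu0 when sigma is known:
    # z = (xbar - mu0) / (sigma / sqrt(n)); under H0 it is N(0, 1).
    n = len(xs)
    xbar = sum(xs) / n
    return (xbar - mu0) / (sigma / math.sqrt(n))

def std_normal_cdf(z):
    # Phi(z) written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

xs = [5.2, 4.9, 5.6, 5.1, 5.4, 5.0, 5.3, 5.5]  # made-up measurements
z = z_statistic(xs, mu0=5.0, sigma=0.3)
p_two_sided = 2.0 * (1.0 - std_normal_cdf(abs(z)))
reject_at_5_percent = p_two_sided < 0.05
```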
| 27 |
6/10 |
- Hypothesis test for mean of a normal distribution with unknown variance
- Derivation of the t-distribution
- Testing equality of means of two normal populations with unknown but equal variances
|
- Slides
- Chapter 8 of the textbook
|
| 28 |
7/10 |
- Hypothesis test regarding the variance of a normal population (with unknown mean)
- Hypothesis test for the success parameter of a Bernoulli distribution
- Hypothesis test: equality of success parameters in two Bernoulli distributions
\
- Chi-square test for goodness of fit
|
- Slides
- Chapter 8 of the textbook
- Chapter 11 for goodness of fit tests
|
| 29 |
7/10 |
- Chi-square test for goodness of fit
- Kolmogorov-Smirnov goodness of fit test: basic (non-constructive) derivation, efficient computation of the test statistic; concept of empirical CDF, empirical CDF as an unbiased and low-variance estimator of the true CDF
- Brief mention of the Dvoretzky-Kiefer-Wolfowitz inequality
- Two examples of hypothesis testing in medicine: placebo effect, control group
|
- Slides
- Chapter 8 of the textbook
- Chapter 11 for goodness of fit tests
|
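The efficient computation of the KS statistic mentioned in lecture 29 exploits the fact that the supremum gap between the empirical and hypothesized CDFs is attained at a data point; a Python sketch (the data and CDFs are illustrative):

```python
def ks_statistic(data, cdf):
    # One-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    # between the empirical CDF and the hypothesized CDF. The sup is
    # attained at an order statistic, so it suffices to compare cdf(x)
    # with the empirical CDF just before and at each sorted data point.
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, f - i / n, (i + 1) / n - f)
    return d

data = [0.1, 0.3, 0.5, 0.7, 0.9]
# Against the Uniform(0,1) CDF F(x) = x the fit is good...
d_uniform = ks_statistic(data, lambda x: min(max(x, 0.0), 1.0))
# ...and against the mismatched CDF F(x) = x^3 the statistic is larger.
d_mismatch = ks_statistic(data, lambda x: min(max(x, 0.0), 1.0) ** 3)
```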
| 30 |
9/10 |
Multivariate Normal Distribution
- Expression for multivariate normal PDF; mean vector, covariance matrix
- Properties of the covariance matrix
- Concept of random normal vectors and derivation of the multivariate normal PDF using transformation of random variables
|
|
| 31 |
13/10 |
- Concept of random normal vectors and derivation of the multivariate normal PDF using transformation of random variables
- Sampling from a multivariate normal distribution using univariate Gaussians as a sub-routine
- Special versions of the multivariate Gaussian: diagonal covariance matrix, covariance matrix proportional to identity
|
|
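The multivariate normal sampling procedure from lecture 31 can be sketched in Python for the 2-D case (the mean vector and covariance matrix are arbitrary examples):

```python
import math
import random

random.seed(5)

# If z ~ N(0, I) and Sigma = L L^T (Cholesky factorization), then
# mu + L z ~ N(mu, Sigma): multivariate normal sampling using only
# univariate standard Gaussians as a subroutine.
mu = [1.0, -1.0]
Sigma = [[4.0, 2.0],
         [2.0, 3.0]]

# Cholesky factor of a 2x2 SPD matrix, written out explicitly.
l11 = math.sqrt(Sigma[0][0])
l21 = Sigma[1][0] / l11
l22 = math.sqrt(Sigma[1][1] - l21 * l21)

def sample():
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    return (mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2)

samples = [sample() for _ in range(50_000)]
n = len(samples)
mean_x = sum(s[0] for s in samples) / n
mean_y = sum(s[1] for s in samples) / n
# Sample covariance of the two coordinates; should approach Sigma[0][1].
cov_xy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / n
```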
| 32 |
14/10 |
- Marginalization: argument that individual elements of a multivariate normal random vector are normally distributed
- Eigenvectors and eigenvalues of the covariance matrix
- Matrices as geometric transformations
- Whitening transformation
|
|
| 33 |
16/10 |
- Matrices as geometric transformations
- Whitening transformation
- Joint and marginal PDFs
|
|
| 34 |
23/10 |
- Joint and marginal PDFs
- Isocontours of multivariate normal PDFs: hyperspheres, hyperellipsoids, oriented hyperellipsoids
- Euclidean distance and Mahalanobis distance, concept of a distance measure
- Different interpretations of the Mahalanobis distance, relation to chi-squared distribution
|
|
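The Euclidean-versus-Mahalanobis contrast from lecture 34 is easy to demonstrate for a diagonal 2x2 covariance; a Python sketch (the covariance and test points are invented):

```python
import math

def mahalanobis(x, mu, Sigma):
    # d(x, mu) = sqrt((x - mu)^T Sigma^{-1} (x - mu)), for 2-D inputs,
    # with the 2x2 inverse written out explicitly.
    a, b = Sigma[0]
    c, d = Sigma[1]
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Sigma^{-1} = (1/det) * [[d, -b], [-c, a]]
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.sqrt(q)

mu = (0.0, 0.0)
Sigma = [[4.0, 0.0],   # variance 4 along the first axis,
         [0.0, 1.0]]   # variance 1 along the second

# Both points are 2 Euclidean units from mu, but the first lies along
# the high-variance axis, so its Mahalanobis distance is smaller.
d_along_wide_axis = mahalanobis((2.0, 0.0), mu, Sigma)
d_along_narrow_axis = mahalanobis((0.0, 2.0), mu, Sigma)
```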
| 35 |
25/10 |
- Principal components analysis (PCA): concept of data compression
- Derivation of PCA: eigenvalues of covariance/scatter matrix as variance of the data when projected onto the eigenvectors, derivation of PCA for a single direction
|
|
| 36 |
27/10 |
- Derivation of PCA: eigenvalues of covariance/scatter matrix as variance of the data when projected onto multiple directions, derivation of PCA for multiple directions
- Application: eigenfaces, decay of eigenvalues of covariance matrices of face images
|
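For 2-D data, the PCA derivation from lectures 35-36 has a closed form: the first principal direction is the top eigenvector of the covariance matrix. A Python sketch on made-up data lying near the line y = 2x:

```python
import math

data = [(0.0, 0.1), (0.5, 0.9), (1.0, 2.1),
        (1.5, 2.9), (2.0, 4.1), (2.5, 4.9)]  # roughly y = 2x plus noise

n = len(data)
mx = sum(p[0] for p in data) / n
my = sum(p[1] for p in data) / n
# Covariance matrix entries (population 1/n normalization).
sxx = sum((p[0] - mx) ** 2 for p in data) / n
syy = sum((p[1] - my) ** 2 for p in data) / n
sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / n

# Largest eigenvalue of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]],
# via the trace/determinant formula for its characteristic polynomial.
tr = sxx + syy
det = sxx * syy - sxy * sxy
lam1 = tr / 2.0 + math.sqrt(tr * tr / 4.0 - det)

# An (unnormalized) eigenvector for lam1; its slope is the direction of
# maximum variance, which for this data should be close to 2.
v = (sxy, lam1 - sxx)
slope = v[1] / v[0]
```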
|
| 37 |
28/10 |
Maximum Likelihood and Bayesian Estimation
- Regularity conditions
- Efficiency of an estimator
- Fisher information: intuition, different algebraic forms, examples for different distributions, significance in ML estimation
- Statement of Cramér-Rao lower bound
|
|
| 38 |
30/10 |
- Statement of Cramér-Rao lower bound; proof of Cramér-Rao bound
- Significance of Cramér-Rao bound; examples of the bound for Poisson and Gaussian distributions
- Concept of efficiency and asymptotic efficiency of an estimator
|
|
| 39 |
3/11 |
- Concept of efficiency and asymptotic efficiency of an estimator
- Asymptotic distribution of maximum likelihood estimates using the Fisher information
- Functional invariance of MLE estimators
- Concept of Bayesian estimation; posterior and prior distributions
|
|
| 40 |
4/11 |
- Concept of Bayesian estimation; posterior and prior distributions
- Maximum a posteriori (MAP) and Bayes estimators with examples
- Concept of conjugate priors
|
|
| 41 |
6/11 |
- Concept of conjugate priors
- Example of conjugate priors: Bernoulli likelihood and Beta prior
- Discussion regarding HW5
|
|
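The Beta-Bernoulli conjugacy from lecture 41 makes the posterior update pure bookkeeping: a Beta(a, b) prior plus observed successes and failures gives a Beta(a + #successes, b + #failures) posterior. A Python sketch (the observations are invented):

```python
# Conjugate-prior update for the Bernoulli success parameter p:
# prior Beta(a, b) and i.i.d. Bernoulli observations yield the
# posterior Beta(a + #successes, b + #failures).
def beta_bernoulli_update(a, b, observations):
    for x in observations:
        if x == 1:
            a += 1
        else:
            b += 1
    return a, b

a0, b0 = 1, 1  # Beta(1, 1) is the uniform prior on [0, 1]
obs = [1, 1, 0, 1, 0, 1]  # 4 successes, 2 failures (made-up data)
a1, b1 = beta_bernoulli_update(a0, b0, obs)

posterior_mean = a1 / (a1 + b1)          # Beta mean: a / (a + b)
map_estimate = (a1 - 1) / (a1 + b1 - 2)  # Beta mode, valid for a, b > 1
mle = sum(obs) / len(obs)                # with a uniform prior, MAP = MLE
```

The last line illustrates a point connecting this lecture back to MLE: under the flat Beta(1, 1) prior, the MAP estimate coincides with the maximum likelihood estimate.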