CS705 Autumn 2006 Lecture Calendar
Scroll down for a tentative syllabus. JavaScript must
be enabled to follow IIT-internal paper links.
- 2006-07-27
-
- Administrative details
- List of some prerequisites
- Tentative course plan for the semester
- Example applications of statistical machine learning
- Introduction to regression and classification
- Formalizing a learning task, experience, reward
- Gaussian noise and linear least-square fit
- 2006-07-31
-
- A taste of Bayesian learning
- Hypothesis, prior and posterior distributions
- Posterior distribution and prediction
- Point estimates of posterior e.g. MAP
- Minimum description length and MAP
- Back to regression: linear least-square with square regularizer
- 2006-08-03
-
- Guest lecture by Dr. Sreeram Balakrishnan, IBM IRL Delhi: Long span contextual features for text classification and entity role labeling
2006-08-07
- SIGIR
- 2006-08-10
-
- Matlab and scilab tutorial, by Sandeep Deshmukh
- 2006-08-14
-
- Linear least square demo, Ridge penalty
- Lasso and its quadratic program
- Contrast between Ridge and Lasso, model sparsity
- From regression to classification
- Loss functions for classification and regression
- Canceled: Guest lecture by Rajesh Parekh, Yahoo Research: Data Mining and Research at Yahoo!: Insights, Lessons, and Challenges
- 2006-08-17
-
- "True loss" and various approximations
- Square loss and its limitations
- Choice of discriminant functions
- Class density and class discrimination
- Discriminants for multivariate Gaussian densities
- 2006-08-21
- Guest lecture on spectral methods and singular value decomposition by Prof. Abhiram Ranade
2006-08-24
-
- 2006-08-28
-
- Eigen demo
- Eigen-SVD connection
- SVD demo with low-rank plus noise matrix
- Connection between SVD and (regularized) least square
- PCA demo
- 2006-08-31
-
- Return to Linear discriminants and fitting criteria
- Hill-climbing, step size, and Newton method
- Derivation of the Perceptron from gradient descent considerations
- Kernel regression and kernel density estimation
- Bayesian interpretation and motivation for max-margin
- Max-margin formulation
- 2006-09-04
-
- Basic SVM QP for separable problems, scilab demo
- Inseparable problems and hinge loss
- Smooth approximations to hinge loss, direct primal optimization
- QP with slack variables, matlab/scilab demo
- 2006-09-07
-
- Primal-dual, Gordan's theorem, KKT necessary conditions
- Lagrangian saddle point
- Dual and Lagrangian, dualizing basic SVM, scilab demo
- Midterm exam, 09:30--11:30, room A1/A2 Math
2006-09-11
2006-09-14
- Midterm week
- 2006-09-18
-
- Dual QP optimization via SMO
- Using non-linear kernels with the dual formulation
- Dual with kernels, scilab demo
- Lagrangian support vector machines
- 2006-09-21
-
- Lagrangian and proximal support vector machines, scilab demo
- Finite Newton optimization of primal SVM by Keerthi and DeCoste
- 2006-09-25
-
- Complete Keerthi and DeCoste algorithm
- Non-convex optimization for SVMs
- Transductive SVM: pair swaps
- 2006-09-28
-
- Transductive SVM: deterministic annealing
- Graph Laplacian, its spectral properties, mincut
- Bagging and boosting -- joint lecture with IT 608
2006-10-02
- Gandhi Jayanti and Dussehra
- 2006-10-05
-
- Ratio cuts and spectral transduction
- Max-margin ranking and ordinal regression
-
- Extending max-margin formulation to general Ψ(x,y)
- Example applications: Markov chains, PCFG, etc.
- Max margin classification with joint features Ψ(x,y)
- Very large number of primal constraints and dual variables
- 2006-10-09
-
- The cutting plane algorithm and StructSVM
- 2006-10-12
-
- Approximate linear-time linear SVM and max-margin ranking
- SVM for multivariate performance measures
- 2006-10-16
-
- Complete multivariate performance measures
- Risk and generalization bounds, intro
- 2006-10-19
-
- Risk bounds: hypothesis consistent with training set
- Bounds on true risk minus empirical (training) risk
- Growth function, VC dimension
2006-10-23
- Diwali
- 2006-10-26
-
- Concluding part of growth function and VC-dimension
- Role of max-margin in bounding growth function (Alekh Agarwal)
- 2006-10-30
-
- Concluding part of max-margin and growth function (Alekh Agarwal)
- On to probabilistic learning: Sampling the posterior if Pr(h|D) is simple
- The Metropolis MCMC algorithm for complicated Pr(h|D)
- Metropolis-Hastings algorithm, proof of correctness
- Gibbs Q(h,h') for sampling
- 2006-11-02
-
- MCMC intuition and example
- Semisupervised or unsupervised generative models
- Expectation maximization and its variational interpretation
- EM demo and initialization issues
- Conditional probabilistic models
- Binomial deviance loss and logistic regression
- 2006-11-06
-
- Comparison between logistic, hinge and true loss
- Optimization techniques: IRLS, iterative scaling, generic Newton method
- Structured prediction, constraints, dualization
- Connections to StructSVM
- The connection with maximum entropy
- 2006-11-09
-
- Non-negative matrix factorization
- Iterative scaling, NMF demo
- Dyadic factors of boolean matrices
- Cross-associations and co-clustering
- Rate-distortion and relation to cross-association
- Information bottleneck and approximating distributions
- Final exam, 09:00--13:00, room A1/A2 Math