CS763: Computer Vision, Spring 2015

Instructor: Ajit Rajwade
Office: SIA-218, KReSIT Building
Email:

Lecture Venue: SIC-301
Lecture Timings: Slot 12, Monday and Thursday 5:05 to 6:30 pm
Instructor Office Hours (in room SIA-218): Tuesday and Friday 5:00 pm to 6:00 pm, or after class, or by appointment via email
Teaching Assistants: Preeti Gopal, Shashwat Rohilla, Satyam (emails: {preetig,shashwat,satyam} AT cse DOT iitb DOT ac DOT in)
TA office hours: TBD

Topics to be covered (tentative list)

Camera geometry, camera calibration, vanishing points
Computational tools for creating Image Panoramas: homographies, RANSAC for point-matching, SIFT (scale invariant feature transform) for detection of salient feature points
Algorithms for shape from shading and depth from needle map; optical flow, the Kanade-Lucas-Tomasi algorithm, applications of optical flow in underwater imagery; shape from stereo, epipolar geometry; structure from motion
Photometric stereo - deriving shape from multiple images of an object taken under different lighting conditions; applications to illumination invariant face recognition, face relighting
Machine Learning in computer vision: Face detection using Adaboost, Object detection using parts
Compressive sensing: summary of key theorems and results, proof of one or two key theorems, applications of compressive sensing, algorithms for compressive sensing reconstruction
Some of the above topics will make use of concepts from signal processing (Fourier transform, convolution) and linear algebra (principal components analysis (PCA) and singular value decomposition (SVD))

Intended Audience

First or second year M. Tech. students, Ph.D. students, third/fourth year B. Tech. students or fourth/fifth year dual-degree students. This course should be of interest to students from CSE and EE primarily, but also to students from some other departments such as Biosciences and Bio-engineering, Mechanical Engineering, Geosciences or Civil Engineering.

Pre-requisites

Exposure to basic mathematics: calculus, linear algebra and probability. Ability to program in C/C++/MATLAB and/or willingness to learn MATLAB.

You should have taken at least one of the following: CS 663 (Image Processing), CS 475/675 (Computer Graphics), CS 740 (Mathematical Methods for Visual Computing), or equivalent courses from other departments. I will expect you to be familiar with the Fourier transform and basic linear algebra (eigen-analysis, SVD).

Learning Materials and Textbooks

Lecture slides that will be regularly posted
"Introductory Techniques for 3D Computer Vision", Emanuele Trucco and Alessandro Verri, Prentice Hall.
Robot Vision, by B. K. P. Horn, MIT Press (Cambridge).
Computer Vision: Algorithms and Applications, by Richard Szeliski (freely downloadable!)
Computer Vision: A Modern Approach, Forsyth and Ponce, Pearson Education.

Computational Resources

MATLAB at IITB
MATLAB Tutorial: here or here
MATLAB Image Processing Toolbox tutorial: here
Matlab Tutorial
The MathWorks - MATLAB Tutorial
Matlab Primer
On-line Matlab Help
Writing Fast Matlab Code (pdf)
Code Vectorization Guide
Matlab Programming Style Guidelines (pdf)

Grading Policy

Mid-sem exam: 20%
Final exam (cumulative): 20%
Programming assignments (four or five in number) and course project: 55% - all to be done in groups of 2-3 students.
Attendance and class participation: 5%

Other Policies

Attendance is mandatory. Students with less than 80% attendance may be given a DX grade.
Assignments will be given out (typically) once every two or three weeks. They must be submitted on or before the deadline. No late assignments will be accepted. The programming components of the assignments will typically involve MATLAB, so you must be willing to learn it quickly.
We will adopt a zero-tolerance policy against any forms of plagiarism or any other form of cheating. Just don't do it! In cases of plagiarism, givers and takers will both be considered equally responsible.
This course is (inherently) cumulative. The syllabus for the final exam will include everything taught during the semester.

Lecture Schedule:

Date
Content of the Lecture
Assignments/Readings/Notes

05/01 (Mon)

Introduction, course overview

Course Overview

09/01 (Thurs)

Geometric Transformations in 2D: translation, rotation, scaling, shear, affine transformations

Geometric Transformations in 3D: translation, rotation about XYZ axes and arbitrary axes, composition of transformations in 3D

Pinhole camera model: relation between image and camera coordinates

Vanishing points

Slides
Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri.

12/01 (Mon)

Motivation for geometric camera calibration: intrinsic parameters (image and camera coordinate systems), extrinsic parameters (camera and world coordinate systems)

Camera calibration procedure in detail

Vanishing points and image center

Cross-ratio preservation in perspective projection and its applications

Slides
Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri (for notes on camera calibration) - check moodle.
Slides on numerical linear algebra: here and here.

15/01 (Thurs)

Cross-ratio preservation in perspective projection and its applications

Planar homography: derivation and solution

Slides
Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri (for notes on camera calibration) - check moodle.
Slides on numerical linear algebra: here and here.

19/01 (Mon)

Camera calibration method 2: direct solution for camera matrix

Need for camera lens, radial distortion due to lens, depth of field, aperture size (covered very briefly)

Image alignment: motion models (parametric and non-parametric)

Using control points to determine motion: affine, rotation (orthogonal procrustes problem)

Slides for camera geometry
Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri (for notes on camera calibration) - check moodle.
Slides on numerical linear algebra: here and here.
Slides for image alignment
Notes on the orthogonal procrustes problem (check the wiki article also)

22/01 (Thu)

Sketch of the SIFT procedure for automated control-point-based image alignment
Forward and reverse warping, field of view issues during image alignment
Image alignment using mean squared error, normalized cross-correlation, concept of joint histograms
Some applications of image alignment: template matching, mosaicing (panoramas), denoising and removal of glare from photographs of paintings

Slides for image alignment
Homework 1 posted. Due 5th Feb before 11:55 pm.

29/01 (Thu)

Image alignment using mean squared error, normalized cross-correlation, concept of joint histograms
Concept of entropy, joint entropy and its use in alignment of images with different intensity profiles

Introduction to robust methods in computer vision: concept of outlier with examples
Least squares method: maximum likelihood estimates under Gaussian noise
Limitations of least squares methods

Slides for image alignment
Slides for robust methods
For robust methods, also read appendix A.7 from Trucco and Verri (check moodle).
Homework 1 posted. Due 5th Feb before 11:55 pm.

02/02 (Mon)

Limitations of least squares methods
Laplacian distribution and the L1 norm, mean versus median
LMedS algorithm
RANSAC and its variants: applications to motion estimation

Slides for robust methods
For robust methods, also read appendix A.7 from Trucco and Verri (check moodle).
Optional reading: Robust Parameter Estimation in Computer Vision.
Homework 1 posted. Due 5th Feb before 11:55 pm.
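
As a rough illustration of the RANSAC idea listed for this lecture, here is a minimal sketch that fits a 2D line to points contaminated by gross outliers. It is not taken from the course slides: the function names (fit_line, ransac_line), the tolerance, the iteration count and the data are all assumptions chosen for the example, and the usual final least-squares refit on the inliers is omitted.

```python
import random

def fit_line(p, q):
    """Return (slope, intercept) of the line through two points, or None if it is vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def ransac_line(points, n_iters=200, tol=0.5, seed=0):
    """Keep the two-point line hypothesis that agrees with the most points."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    best_model, best_count = None, -1
    for _ in range(n_iters):
        model = fit_line(*rng.sample(points, 2))  # minimal sample: two points
        if model is None:
            continue
        slope, intercept = model
        # Consensus set size: points whose vertical residual is below tol.
        count = sum(abs(y - (slope * x + intercept)) < tol for x, y in points)
        if count > best_count:
            best_model, best_count = model, count
    return best_model, best_count

# Made-up data: ten points on y = 2x + 1 plus three gross outliers.
inliers = [(x, 2 * x + 1) for x in range(10)]
outliers = [(1, 10), (4, -3), (7, 30)]
model, count = ransac_line(inliers + outliers)
print(model, count)
```

An ordinary least-squares fit to the same thirteen points would be pulled towards the outliers, whereas the consensus step above ignores them as long as at least one sampled pair consists of two inliers.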

05/02 (Thurs)

Optical Flow: brightness constancy equation, aperture problem, Horn-Schunck method, Lucas-Kanade method
Comparing Horn-Schunck and Lucas-Kanade methods

Slides for optical flow
Some code to play with
Homework 1 posted. Due 5th Feb before 11:55 pm.
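
The Lucas-Kanade step named above can be sketched for a single window as follows. This is only a sketch under assumptions: the function name lucas_kanade_window and the gradient values are invented for illustration, the image derivatives Ix, Iy, It are taken as already computed, and the flow is assumed constant over the window so that the brightness constancy equation Ix*u + Iy*v + It = 0 can be solved in the least-squares sense.

```python
def lucas_kanade_window(ix, iy, it):
    """Least-squares flow (u, v) for one window, from Ix*u + Iy*v + It = 0 at every pixel."""
    # Entries of the 2x2 structure tensor A^T A and of the right-hand side A^T b.
    sxx = sum(a * a for a in ix)
    sxy = sum(a * b for a, b in zip(ix, iy))
    syy = sum(b * b for b in iy)
    sxt = sum(a * c for a, c in zip(ix, it))
    syt = sum(b * c for b, c in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All gradients (nearly) parallel: the aperture problem, no unique flow.
        raise ValueError("structure tensor is singular")
    # Closed-form solution of [[sxx, sxy], [sxy, syy]] [u, v]^T = -[sxt, syt]^T.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

# Made-up gradients for a four-pixel window, consistent with a true flow of (1, -0.5).
ix = [1.0, 0.0, 1.0, 2.0]
iy = [0.0, 1.0, 1.0, 1.0]
it = [-(a * 1.0 + b * -0.5) for a, b in zip(ix, iy)]
u, v = lucas_kanade_window(ix, iy, it)
print(u, v)
```

The singular case corresponds to the aperture problem: when every gradient in the window points in the same direction, only the flow component along that direction can be recovered.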

09/02 (Mon)

Details of the solution of Horn-Schunck equations
Multi-scale Lucas-Kanade method
Introduction to applications: feature point tracking and structure from motion (to be covered later)
Applications of optical flow in underwater image de-skewing and estimating the surface normals of the moving water surface (not on exam)

Slides for optical flow
Some code to play with
Read sections 8.3.1, 8.3.2 and 8.4.1 from Trucco and Verri
Homework 2 posted. Due 17th Feb before 11:55 pm.
Homework1 solutions.
Optional reading: Horn and Schunck with a multi-scale strategy
Optional reading (not on exam): Hiroshi Murase, "Surface shape reconstruction of a non-rigid transparent object using refraction and motion", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994 (a detailed derivation of the equations we derived in class is presented in a more intuitive way here).

12/02 (Thurs)

Feature point tracking: Kanade-Lucas-Tomasi (KLT) tracker

Slides
Some code for playing around with structure tensor

Recommended reading: Shi and Tomasi, "Good Features to Track", Cornell University Technical Report
Homework 2 posted. Due 17th Feb before 11:55 pm.
Homework1 solutions.

16/02 (Mon)

Structure from motion: motivation, factorization algorithm by Tomasi and Kanade

Slides
Read section 8.5.1 from Trucco and Verri
Recommended reading: Tomasi and Kanade, "Shape and motion from image streams under orthography: a factorization approach", International Journal of Computer Vision, 1992.
Homework 2 posted. Due 17th Feb before 11:55 pm.
Homework1 solutions.

19/02 (Thurs)

Midterm review

Homework2 solutions.
Midterm time-table

02/03 (Mon)

Shape from shading: image irradiance, scene radiance, reflectance model, Lambertian model, albedo, shape from shading objective function with regularizer and optimization, Phong reflectance model

Slides.
Sections 2.2.3 (up to and including the paragraph containing equation 2.4), 9.2, 9.3 and 9.4 of Trucco and Verri (note: we have not used calculus of variations in class, unlike what is given in section 9.3, but we end up with very similar update equations)

04/03 (Thurs)

Distribution of midterm papers

08/03 (Mon)

Shape from shading: stereographic projections; depth from needle map, Poisson equations, a look at some of its applications in image processing; photometric stereo when light source directions are known, issue of shadows, motivation for recognition of faces from 3D maps

Slides.
Sections 2.2.3 (up to and including the paragraph containing equation 2.4), 9.2, 9.3 and 9.4 of Trucco and Verri (note: we have not used calculus of variations in class, unlike what is given in section 9.3, but we end up with very similar update equations)
Browse through chapters 10 and 11 of the book by BKP Horn
HW3 out

12/03 (Thurs)

Adaboost: concept of ensemble of classifiers; basic algorithm; application to face detection

Slides.
HW3 out
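
For the basic algorithm mentioned in this lecture, one boosting round might look like the sketch below. It uses the common discrete AdaBoost form with labels in {-1, +1}, which may differ in notation from the slides and from the Viola-Jones paper; the function name adaboost_round and the toy labels are assumptions, and the weighted error is assumed to lie strictly between 0 and 1.

```python
import math

def adaboost_round(weights, labels, predictions):
    """One round of discrete AdaBoost: return (alpha, re-normalized sample weights)."""
    # Weighted error of the weak classifier on the training samples.
    eps = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    # Classifier weight; grows as the weighted error shrinks.
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    # Misclassified samples (y * h = -1) are up-weighted, correct ones down-weighted.
    unnormalized = [w * math.exp(-alpha * y * h)
                    for w, y, h in zip(weights, labels, predictions)]
    total = sum(unnormalized)
    return alpha, [w / total for w in unnormalized]

# Toy example: five samples with uniform weights; the weak classifier errs on the last one.
labels = [1, 1, -1, -1, 1]
predictions = [1, 1, -1, -1, -1]
weights = [0.2] * 5
alpha, new_weights = adaboost_round(weights, labels, predictions)
print(alpha, new_weights)
```

After the update the misclassified sample carries half of the total weight, so the same weak classifier would have a weighted error of 0.5 in the next round and a different one must be chosen.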

16/03 (Mon)

Adaboost: application to face detection; concept of false positive and false negative rates, concept of detection rate; concept of cascade of classifiers; algorithm for classifier cascade

Slides.
Paper by Viola and Jones (strongly recommended reading)
HW3 out

19/03 (Thurs)

Adaboost as coordinate descent, theory behind rules for updating the weights of the training samples and the weights of the classifiers; Theorem about the generalization error of Adaboost

Slides.
Paper by Viola and Jones (strongly recommended reading)

23/03 (Mon)

Stereo vision: introduction; concept of disparity and its relationship with depth
Calibrated and uncalibrated stereo
Epipolar geometry - epipoles, epipolar line, epipolar plane, epipolar constraint;
Essential and fundamental matrix; eight-point algorithm for fundamental matrix

Slides.
HW4 out
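
The relationship between disparity and depth mentioned above can be illustrated with a short sketch. It assumes the simplest configuration, a rectified pair of identical cameras with parallel optical axes, where depth Z = f * B / d for focal length f (in pixels), baseline B and disparity d (in pixels); the function name and the numbers are made up for the example.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified, parallel-axis stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Assumed values: 700-pixel focal length, 10 cm baseline.
for d in (70.0, 35.0, 7.0):
    print(d, depth_from_disparity(700.0, 0.1, d))
```

Depth is inversely proportional to disparity, so a fixed error in disparity (for instance one pixel) causes a much larger depth error for distant points than for nearby ones.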

26/03 (Thu)

Properties of essential and fundamental matrix, locating epipoles from fundamental matrix
Stereo reconstruction in fully calibrated case (both intrinsic and extrinsic parameters are known)
Stereo reconstruction when only intrinsic parameters are known
Correspondence problem: matching using SSD or cross-correlations
Dynamic programming method for correspondence matching

Slides.
HW4 out

30/03 (Mon)

Conventional sensing: measure and compress/throw paradigm; measuring devices as linear systems
Signal processing basics: discrete Fourier transform (DFT) and its inverse, discrete cosine transform (DCT) and its inverse, discrete Fourier and cosine bases as orthonormal matrices; Shannon's sampling theorem and its limitations
Candes' puzzling experiment
Concept of sparsity of images in orthonormal bases
Concept of incoherence between image representation basis (Psi) and the measurement matrix (Phi)
Reconstruction from compressed measurements: use of L0 norm (leading to NP-hard problem) and L1 norm (called basis pursuit) - theorem by Candes, Romberg, Tao on reconstruction using L1 norm

Slides (CS Theory).
HW5 out

06/04 (Mon)

Recap of key theorem by Candes, Romberg and Tao - interpretation of this theorem as a more powerful version of Shannon's sampling theorem
Intuition behind concept of incoherence
Restricted isometry property (RIP) for measurement matrices
Compressed sensing when the signal is compressible but not exactly sparse; dealing with noise
Random and RIP/Incoherence
Compressed sensing: L1 norm versus L2 norm

Slides (CS Theory).
Introductory article by Candes and Wakin

Introductory article by Romberg

HW5 out

09/04 (Thurs)

Compressive sensing: some toy experiments
Discussion of uniqueness of L0 norm solution in CS and its relation to RIP
Reconstruction algorithms for CS: Basis pursuit (category 1) and greedy approximation algorithms (category 2)
Two algorithms from category 2: Matching pursuit and orthogonal matching pursuit

Slides (CS Theory).
Slides (CS Algorithms).
Introductory article by Candes and Wakin

Introductory article by Romberg

HW3 solutions
HW4 solutions
HW5 out
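
The greedy idea behind matching pursuit, listed for this lecture, can be sketched in a few lines. This is an illustrative sketch only: the function name matching_pursuit, the tiny four-atom dictionary and the signal are assumptions, the atoms are assumed to have unit norm, and orthogonal matching pursuit (which re-fits all selected coefficients by least squares at every step) is not shown.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, atoms, n_iters=10, tol=1e-10):
    """Greedy sparse approximation of `signal` over unit-norm `atoms`."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iters):
        # Correlate the residual with every atom and pick the best match.
        scores = [dot(residual, atom) for atom in atoms]
        j = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[j]) < tol:
            break  # nothing left to explain
        coeffs[j] += scores[j]
        # Subtract the selected atom's contribution from the residual.
        residual = [r - scores[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual

# Made-up overcomplete dictionary in R^3: the standard basis plus one extra unit-norm atom.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
signal = [3.0, 0.0, -2.0]  # 2-sparse: 3 * atom 0 - 2 * atom 2
coeffs, residual = matching_pursuit(signal, atoms)
print(coeffs, residual)
```

On this toy signal the procedure selects atom 0 and then atom 2 and stops with a zero residual; for general dictionaries matching pursuit may revisit atoms and is not guaranteed to find the sparsest representation.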

13/04 (Mon)

Rice single pixel camera

Rice single pixel camera for video

Coded aperture snapshot spectral imager (CASSI) for hyperspectral image acquisition

Introduction to compressive video camera by Hitomi

Slides (CS Systems).
Slides (CS Theory).
Slides (CS Algorithms).
Introductory article by Candes and Wakin

Introductory article by Romberg

HW3 solutions
HW4 solutions
HW5 out

16/04 (Thurs)

Compressive video camera by Hitomi (not on exam)

Discussion of HW5

Slides (CS Systems).
Slides (CS Theory).
Slides (CS Algorithms).
Introductory article by Candes and Wakin

Introductory article by Romberg

HW5 out

Final exam timings

Project viva schedule

Homework solutions:

HW1
HW2
HW3
HW4
HW5
