CS763 - Computer Vision

Instructor: Ajit Rajwade
Office: SIA-218, KReSIT Building
Email:
Lecture Venue: SIC-301
Lecture Timings: Slot 12, Monday and Thursday 5:05 to 6:30 pm
Instructor Office Hours (in room SIA-218): Tuesday and Friday 5:00 pm to 6:00 pm, or after class, or by appointment via email
Teaching Assistants: Preeti Gopal, Shashwat Rohilla, Satyam; emails: {preetig,shashwat,satyam} AT cse DOT iitb DOT ac DOT in
TA office hours: TBD




Topics to be covered (tentative list)


Intended Audience

First or second year M. Tech. students, Ph.D. students, third/fourth year B. Tech. students or fourth/fifth year dual-degree students. This course should be of interest to students from CSE and EE primarily, but also to students from some other departments such as Biosciences and Bio-engineering, Mechanical Engineering, Geosciences or Civil Engineering.

Pre-requisites

  • Exposure to basic mathematics: calculus, linear algebra and probability. Ability to program in C/C++/MATLAB and/or willingness to learn MATLAB.
  • Should have taken at least one of the following: CS 663 (Image Processing), CS 475/675 (Computer Graphics), CS 740 (Mathematical Methods for Visual Computing), or equivalent courses from other departments. I will expect you to be familiar with the Fourier transform and basic linear algebra (eigen-analysis, SVD).

    Learning Materials and Textbooks

    Computational Resources


    Grading Policy


    Other Policies


    Lecture Schedule:


    Date

    Content of the Lecture

    Assignments/Readings/Notes

    05/01 (Mon)
    • Introduction, course overview
    09/01 (Thurs)
    • Geometric Transformations in 2D: translation, rotation, scaling, shear, affine transformations
    • Geometric Transformations in 3D: translation, rotation about XYZ axes and arbitrary axes, composition of transformations in 3D
    • Pinhole camera model: relation between image and camera coordinates
    • Vanishing points
    • Slides
    • Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri.
    12/01 (Mon)
    • Motivation for geometric camera calibration: intrinsic parameters (image and camera coordinate systems), extrinsic parameters (camera and world coordinate systems)
    • Camera calibration procedure in detail
    • Vanishing points and image center
    • Cross-ratio preservation in perspective projection and its applications
    • Slides
    • Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri (for notes on camera calibration) - check moodle.
    • Slides on numerical linear algebra: here and here.
    15/01 (Thurs)
    • Cross-ratio preservation in perspective projection and its applications
    • Planar homography: derivation and solution
    • Slides
    • Chapter 2 (section 2.4) and Chapter 6 of Trucco and Verri (for notes on camera calibration) - check moodle.
    • Slides on numerical linear algebra: here and here.
    19/01 (Mon)
    • Camera calibration method 2: direct solution for camera matrix
    • Need for camera lens, radial distortion due to lens, depth of field, aperture size (covered very briefly)

    • Image alignment: motion models (parametric and non-parametric)
    • Using control points to determine motion: affine, rotation (orthogonal procrustes problem)
    22/01 (Thurs)
    • Sketch of the SIFT procedure for automated control-point-based image alignment
    • Forward and reverse warping, field of view issues during image alignment
    • Image alignment using mean squared error, normalized cross-correlation, concept of joint histograms
    • Some applications of image alignment: template matching, mosaicing (panoramas), denoising and removal of glare from photographs of paintings
    • Slides for image alignment
    • Homework 1 posted. Due 5th Feb before 11:55 pm.
    29/01 (Thurs)
    • Image alignment using mean squared error, normalized cross-correlation, concept of joint histograms
    • Concept of entropy, joint entropy and its use in alignment of images with different intensity profiles
    • Introduction to robust methods in computer vision: concept of outlier with examples
    • Least squares method: maximum likelihood estimates under Gaussian noise
    • Limitations of least squares methods
    • Slides for image alignment
    • Slides for robust methods
    • For robust methods, also read appendix A.7 from Trucco and Verri (check moodle).
    • Homework 1 posted. Due 5th Feb before 11:55 pm.
    02/02 (Mon)
    • Limitations of least squares methods
    • Laplacian distribution and the L1 norm, mean versus median
    • LMedS algorithm
    • RANSAC and its variants: applications to motion estimation
    05/02 (Thurs)
    • Optical flow: brightness constancy equation, aperture problem, Horn-Schunck method, Lucas-Kanade method
    • Comparing Horn-Schunck and Lucas-Kanade methods
    • Slides for optical flow
    • Some code to play with
    • Homework 1 posted. Due 5th Feb before 11:55 pm.
    09/02 (Mon)
    • Details of the solution of Horn-Schunck equations
    • Multi-scale Lucas-Kanade method
    • Introduction to applications: feature point tracking and structure from motion (to be covered later)
    • Applications of optical flow in underwater image de-skewing and estimating the surface normals of the moving water surface (not on exam)
    12/02 (Thurs)
    • Feature point tracking: Kanade-Lucas-Tomasi (KLT) tracker
    16/02 (Mon)
    • Structure from motion: motivation, factorization algorithm by Tomasi and Kanade
    19/02 (Thurs)
    • Midterm review
    2/03 (Mon)
    • Shape from shading: image irradiance, scene radiance, reflectance model, Lambertian model, albedo, shape from shading objective function with regularizer and optimization, Phong reflectance model
    • Slides.
    • Sections 2.2.3 (up to and including the paragraph containing equation 2.4), 9.2, 9.3 and 9.4 of Trucco and Verri (note: unlike section 9.3, we have not used calculus of variations in class, but we end up with very similar update equations)
    4/03 (Thurs)
    • Distribution of midterm papers
    08/03 (Mon)
    • Shape from shading: stereographic projections; depth from needle map, Poisson equations, a look at some of its applications in image processing; photometric stereo when light source directions are known, issue of shadows, motivation for recognition of faces from 3D maps
    • Slides.
    • Sections 2.2.3 (up to and including the paragraph containing equation 2.4), 9.2, 9.3 and 9.4 of Trucco and Verri (note: unlike section 9.3, we have not used calculus of variations in class, but we end up with very similar update equations)
    • Browse through chapters 10 and 11 of the book by BKP Horn
    • HW3 out
    12/03 (Thurs)
    • Adaboost: concept of ensemble of classifiers; basic algorithm; application to face detection
    16/03 (Mon)
    • Adaboost: application to face detection; concept of false positive and false negative rates, concept of detection rate; concept of cascade of classifiers; algorithm for classifier cascade
    19/03 (Thurs)
    • Adaboost as coordinate descent, theory behind rules for updating the weights of the training samples and the weights of the classifiers; Theorem about the generalization error of Adaboost
    23/03 (Mon)
    • Stereo vision: introduction; concept of disparity and its relationship with depth
    • Calibrated and uncalibrated stereo
    • Epipolar geometry: epipoles, epipolar line, epipolar plane, epipolar constraint
    • Essential and fundamental matrix; eight-point algorithm for fundamental matrix
    26/03 (Thurs)
    • Properties of essential and fundamental matrix, locating epipoles from fundamental matrix
    • Stereo reconstruction in fully calibrated case (both intrinsic and extrinsic parameters are known)
    • Stereo reconstruction when only intrinsic parameters are known
    • Correspondence problem: matching using SSD or cross-correlations
    • Dynamic programming method for correspondence matching
    30/03 (Mon)
    • Conventional sensing: measure and compress/throw paradigm; measuring devices as linear systems
    • Signal processing basics: discrete Fourier transform (DFT) and its inverse, discrete cosine transform (DCT) and its inverse, discrete Fourier and cosine bases as orthonormal matrices; Shannon's sampling theorem and its limitations
    • Candes' puzzling experiment
    • Concept of sparsity of images in orthonormal bases
    • Concept of incoherence between image representation basis (Psi) and the measurement matrix (Phi)
    • Reconstruction from compressed measurements: use of L0 norm (leading to NP-hard problem) and L1 norm (called basis pursuit) - theorem by Candes, Romberg, Tao on reconstruction using L1 norm
    06/04 (Mon)
    • Recap of key theorem by Candes, Romberg and Tao - interpretation of this theorem as a more powerful version of Shannon's sampling theorem
    • Intuition behind concept of incoherence
    • Restricted isometry property (RIP) for measurement matrices
    • Compressed sensing when the signal is compressible but not exactly sparse; dealing with noise
    • Random measurement matrices and RIP/incoherence
    • Compressed sensing: L1 norm versus L2 norm
    09/04 (Thurs)
    • Compressive sensing: some toy experiments
    • Discussion of uniqueness of L0 norm solution in CS and its relation to RIP
    • Reconstruction algorithms for CS: Basis pursuit (category 1) and greedy approximation algorithms (category 2)
    • Two algorithms from category 2: Matching pursuit and orthogonal matching pursuit
    13/04 (Mon)
    • Rice single pixel camera
    • Rice single pixel camera for video
    • Coded aperture snapshot spectral imager (CASSI) for hyperspectral image acquisition
    • Introduction to compressive video camera by Hitomi
    16/04 (Thurs)
    • Compressive video camera by Hitomi (not on exam)
    • Discussion of HW5

    Final exam timings

    Project viva schedule

    Homework solutions: