CS 763 - Computer Vision



Course Information

Topics to be covered (tentative list)


Intended Audience

First or second year M. Tech. students, Ph.D. students, third/fourth year B. Tech. students or fourth/fifth year dual-degree students. This course should be of interest to students from CSE and EE primarily, but also to students from some other departments such as Aerospace, Physics, Biosciences and Bio-engineering, Mechanical Engineering, Geosciences or Civil Engineering.

Pre-requisites

  • Exposure to basic mathematics: calculus, linear algebra and probability. Ability to program in C/C++/MATLAB and/or willingness to learn MATLAB.
  • Should have taken at least one of the following: CS 663 (Image Processing), CS 475/675 (Computer Graphics), CS 740 (Mathematical Methods for Visual Computing), or equivalent courses from other departments. I will expect you to be familiar with the Fourier transform (or be willing to learn it quickly) and basic linear algebra (eigen-analysis, matrix inverse).

    Learning Materials and Textbooks

    Computational Resources


    Grading Policy


    Other Policies


    Course Projects

    Read this link for a list of project topics, and various instructions regarding course project submissions and expectations. I will keep updating it as new topics come to my mind. You can also refer to "extra interesting readings" in the lecture schedule.

    Lecture Schedule:


    Date

    Content of the Lecture

    Assignments/Readings/Notes

    Interesting Extra Readings (not for exam)

    3rd Jan
    • Course overview
    Slides
    7th Jan Camera Geometry
    • Transformations in 2D: translation, rotation, scaling, shearing; affine and rigid transformations
    • Transformations in 3D: translation, rotation about the X, Y and Z axes, rotation about an arbitrary axis, 3D affine, number of degrees of freedom
    • Composition of transformations in 2D and 3D with examples; concept of homogeneous coordinates in 2D and 3D
    • Concept of pinhole camera, need for pinhole, geometry of perspective projection through pinhole camera
    Slides
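
As a rough illustration of composing 2D transformations in homogeneous coordinates, here is a minimal Python sketch. The helper names (`translation`, `rotation`, `matmul`, `apply_transform`) are made up for this example and are not taken from the course slides. It rotates a point about a pivot by chaining three 3 x 3 matrices.

```python
import math

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    # Product of two 3 x 3 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_transform(m, point):
    # Lift (x, y) to (x, y, 1), multiply, then divide by the last coordinate.
    x, y = point
    h = [m[i][0] * x + m[i][1] * y + m[i][2] for i in range(3)]
    return (h[0] / h[2], h[1] / h[2])

# Rotate by 90 degrees about the pivot (1, 1):
# move the pivot to the origin, rotate, then move it back.
# The rightmost matrix is applied first.
rotate_about_pivot = matmul(translation(1, 1),
                            matmul(rotation(math.pi / 2), translation(-1, -1)))

print(apply_transform(rotate_about_pivot, (2.0, 1.0)))  # close to (1, 2)
```

Because translation becomes a matrix product in homogeneous coordinates, the whole chain collapses into a single matrix that can be reused for every point.
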
    10th Jan
    • Concept of pinhole camera, need for pinhole, geometry of perspective projection through pinhole camera
    • Weak perspective projection and orthographic projection
    • Concept of image coordinate system and camera coordinate system; intrinsic camera parameters
    • Concept of world coordinate system and its relationship to camera and image coordinate systems; extrinsic camera parameters
    • Concept of camera calibration and basic aim of camera calibration
    Slides
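
The chain from world coordinates to camera coordinates to image coordinates can be sketched as follows. This is a simplified model written for illustration only: it assumes zero skew, focal lengths `fx`, `fy` expressed in pixels, and a principal point `(cx, cy)`. The function name `project` and the numbers used are hypothetical.

```python
def project(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model."""
    # Extrinsic parameters: world -> camera coordinates, Xc = R * Xw + t.
    xc = [sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
          for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective division followed by the intrinsic parameters
    # (scaling by the focal length and shifting by the principal point).
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return (u, v)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project((0.5, -0.25, 2.0), identity, (0.0, 0.0, 0.0),
                fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(pixel)
```

Doubling the depth of the point halves its offset from the principal point, which is the inverse dependence on depth that distinguishes perspective from orthographic projection.
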
    14th Jan
    • Concept of camera calibration and basic aim of camera calibration
    • Algorithm for derivation of camera matrix (size 3 x 4) due to Faugeras and Toscani; derivation of camera parameters from the camera matrix
    • Motivation for camera calibration - implications for 3D reconstruction using two calibrated cameras
    • Perspective invariant - cross-ratio
    Slides
    17th Jan
    • Perspective invariant - cross-ratio - proof of cross-ratio being a perspective invariant
    • Use of cross-ratio and vanishing points in metrology - two different scenarios
    • Introduction to planar homography
    Slides
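
A small numerical check of the invariance of the cross-ratio, under the assumption that the four collinear points are described by a 1D coordinate along their line and that the perspective mapping of that line is written as x -> (p x + q) / (r x + s). The convention chosen for the cross-ratio below is one of several equivalent ones, and the function names are illustrative.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio of four collinear points given by 1D coordinates along the line."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def projective_map(x, p, q, r, s):
    """1D projective transformation; requires p * s - q * r != 0."""
    return (p * x + q) / (r * x + s)

points = [0.0, 1.0, 2.0, 3.0]
mapped = [projective_map(x, p=2.0, q=1.0, r=0.5, s=3.0) for x in points]

print(cross_ratio(*points))  # 4/3 for equally spaced points under this convention
print(cross_ratio(*mapped))  # same value up to rounding, although spacing changed
```

The mapped points are no longer equally spaced, so ratios of lengths are not preserved, but the ratio of ratios is.
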
    21st Jan
    • Introduction to planar homography
    • Derivation for planar homography; algorithm for homography estimation given N pairs of corresponding points from two images of a planar scene
    • Another camera calibration algorithm; orthocenter theorem for vanishing points
    24th Jan
    • Clarifications about homography

    Image Alignment
    • Problem statement: physically and digitally corresponding points
    • Motion models and degrees of freedom; non-rigid/deformable/non-parametric image alignment
    • Control point based image alignment using least squares - derivation for pseudo-inverse
    • Introduction to the SIFT algorithm
    • Applications of image alignment: Google art project
    27th Jan
    • Forward and reverse image warping - bilinear and nearest-neighbor interpolation
    • Image alignment using image similarity measures: mean squared error, normalized cross-correlation
    • Concept of field of view in image alignment using image similarity measures
    • Monomodal and multimodal image alignment
    • Concept of joint histograms and behaviour of joint histograms in multi-modal image alignment
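
Reverse warping with bilinear interpolation can be sketched as below. This is a minimal version for a pure translation, with images stored as lists of rows. Treating samples outside the source image as 0 is an arbitrary choice made for this example.

```python
import math

def bilinear(image, x, y):
    """Sample image at real-valued (x, y), where x is the column and y the row."""
    h, w = len(image), len(image[0])
    if not (0 <= x <= w - 1 and 0 <= y <= h - 1):
        return 0.0  # outside the field of view of the source image
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom

def reverse_warp_translation(image, tx, ty):
    """Fill each output pixel (x, y) from the source location (x - tx, y - ty)."""
    h, w = len(image), len(image[0])
    return [[bilinear(image, x - tx, y - ty) for x in range(w)] for y in range(h)]

image = [[0.0, 10.0], [20.0, 30.0]]
print(bilinear(image, 0.5, 0.5))
print(reverse_warp_translation(image, 0.5, 0.0))
```

Looping over output pixels and looking backwards into the source guarantees that every output pixel receives a value, whereas forward warping can leave holes.
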
    31st Jan
    • Concept of joint histograms and behaviour of joint histograms in multi-modal image alignment
    • Concept of entropy and joint entropy, algorithm for multimodal registration by minimizing joint entropy
    • Aspects of image registration: 2D/3D, motion model, monomodal or multimodal
    • Application scenarios for image alignment: template matching, video stabilization, panorama generation, face recognition, 3D to 2D alignment
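
The behaviour of the joint histogram under misalignment can be illustrated with a toy computation of joint entropy. The sketch assumes 8-bit intensities, a coarse 4-bin quantization, and tiny hand-made images. None of these choices come from the lectures.

```python
import math
from collections import Counter

def joint_entropy(image_a, image_b, bins=4, max_value=256):
    """Joint entropy (in bits) of binned intensity pairs of two equal-size images."""
    pairs = Counter()
    for row_a, row_b in zip(image_a, image_b):
        for a, b in zip(row_a, row_b):
            pairs[(a * bins // max_value, b * bins // max_value)] += 1
    total = sum(pairs.values())
    return -sum((n / total) * math.log2(n / total) for n in pairs.values())

# Two "modalities" of the same scene: intensities differ but structure agrees.
fixed = [[0, 0, 255, 255], [0, 0, 255, 255]]
aligned = [[200, 200, 50, 50], [200, 200, 50, 50]]
shifted = [[50, 200, 200, 50], [50, 200, 200, 50]]  # moved by one column

print(joint_entropy(fixed, aligned))  # concentrated joint histogram, low entropy
print(joint_entropy(fixed, shifted))  # dispersed joint histogram, higher entropy
```

A registration routine built on this idea would evaluate the joint entropy over candidate transformations and keep the one with the smallest value.
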
    3rd Feb
    • Least squares algorithm for determining orthonormal transformation between corresponding pairs of points: the orthogonal Procrustes problem

    Robust Methods in Computer Vision
    • Least squares problems and their relation to the Gaussian distribution on the noise
    • Examples of outliers in computer vision
    • Explanation of why the Gaussian distribution is unsuited to handling outliers
    • Introduction to the Laplacian distribution
    7th Feb
    • Introduction to the Laplacian distribution and Generalized Gaussian distribution
    • The importance of heavy-tailed distributions in robust statistics
    • Mean versus median: L2 fit versus L1 fit
    • Least median of squares algorithm (LMedS)
    • RanSaC (random sample consensus) algorithm
    • Use of RanSaC in robust determination of planar homographies
    • Variants of RanSaC
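
The sampling and consensus loop of RanSaC is sketched below on a deliberately simpler model than a homography: a 2D line, for which the minimal sample is two points. The iteration count, the inlier threshold and the fixed random seed are illustrative assumptions, not recommended settings.

```python
import random

def fit_line(p, q):
    """Line a*x + b*y + c = 0 through two points, with (a, b) of unit length."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        return None  # degenerate sample: the two points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, iterations=200, threshold=0.1, seed=0):
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        # 1. Fit the model to a random minimal sample.
        line = fit_line(*rng.sample(points, 2))
        if line is None:
            continue
        a, b, c = line
        # 2. Collect the consensus set: points within the distance threshold.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= threshold]
        # 3. Keep the model with the largest consensus set so far.
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers

points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(1.0, 10.0), (5.0, -4.0), (8.0, 30.0)]  # gross outliers
line, inliers = ransac_line(points)
print(len(inliers), "inliers out of", len(points))
```

For homography estimation the structure is the same, except that the minimal sample is four point correspondences and the residual is a reprojection error. A common final step, omitted here, is a least squares refit on the consensus set.
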
    10th Feb Structure from Motion
    • Motion as a cue to inference of 3D structure from images
    • Motion factorization algorithm by Tomasi and Kanade for inference of (sparse) 3D structure of a fixed object being observed by a moving orthographic camera (or a rigidly moving object, being observed by a fixed orthographic camera)
    • Aspects of the above algorithm: Eckart-Young theorem in SVD, metric constraints for inference of motion parameters and 3D structure
    • SVD: concept of SVD as a weighted summation of rank-one matrices
    17th Feb
    • Midsem review session
    28th Feb Optical Flow
    • Dealing with the aperture problem: regularization
    • Horn and Schunck method: algorithm using discrete formulation, steps of Jacobi's method for solving the resulting linear system, and comments about limitations
    3rd March
    • Distribution of midsem papers
    14th March
    • Lucas-Kanade algorithm for optical flow
    • Multi-scale Lucas-Kanade algorithm
    • Comparison of Horn-Schunck and Lucas-Kanade algorithms
    • Applications of optical flow
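
The core of the single-scale Lucas-Kanade step is a 2 x 2 least squares solve per window. The sketch below assumes the spatial derivatives `ix`, `iy` and the temporal derivative `it` have already been computed for the pixels in one window and are passed as flat lists. The function name and the singularity threshold `min_det` are illustrative.

```python
def lucas_kanade_window(ix, iy, it, min_det=1e-9):
    """Estimate one flow vector (u, v) from derivative samples in a window.

    Solves [sxx sxy; sxy syy] [u v]^T = -[sxt syt]^T, the normal equations of
    the brightness constancy constraint ix*u + iy*v + it = 0 over the window.
    Returns None when the 2 x 2 matrix is near singular (aperture problem).
    """
    sxx = sum(gx * gx for gx in ix)
    syy = sum(gy * gy for gy in iy)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    sxt = sum(gx * gt for gx, gt in zip(ix, it))
    syt = sum(gy * gt for gy, gt in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < min_det:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

# Synthetic window consistent with a true flow of (0.5, -0.25).
ix = [1.0, 0.0, 2.0, 1.0]
iy = [0.0, 1.0, 1.0, -1.0]
it = [-(0.5 * gx - 0.25 * gy) for gx, gy in zip(ix, iy)]
print(lucas_kanade_window(ix, iy, it))

# All gradients point along x: only one flow component is constrained.
print(lucas_kanade_window([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -3.0]))
```

The second call returns `None` because a window containing a single edge orientation cannot determine both components of the flow.
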
    17th March Feature Point Tracking
    • Feature point tracking: Kanade-Lucas-Tomasi (KLT) tracker
    • Motion models: patch-wise translation and patch-wise affine
    • Concept of a good feature point based on saliency (similar to criteria in Lucas-Kanade optical flow algorithm)
    • Tracking of salient feature points: using translation and affine models
    • Some results of KLT tracker
    • Applications of feature point tracking: mosaicing, video stabilization, structure from motion
    21st March Adaboost
    • Machine learning 101 jargon
    • Outline of Adaboost algorithm for binary classification - weak and strong classifiers
    • Concept of weight of weak classifier, weight of sample point in Adaboost
    • Concept of family of weak classifiers
    28th March
    • Theory behind Adaboost: objective function for Adaboost, and Adaboost as coordinate descent on this objective function
    • Derivation of weights of weak classifiers and weights of training samples
    • Comments on the generalization error of Adaboost
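
The weight updates can be made concrete with a small sketch of discrete Adaboost on 1D data, using threshold stumps as the family of weak classifiers. The toy dataset, the stump parameterization and the function names are assumptions made for this example.

```python
import math

def stump(x, threshold, polarity):
    """Weak classifier: predicts polarity if x > threshold, otherwise -polarity."""
    return polarity if x > threshold else -polarity

def train_adaboost(xs, ys, rounds):
    """xs: 1D features, ys: labels in {-1, +1}. Returns [(alpha, threshold, polarity)]."""
    n = len(xs)
    weights = [1.0 / n] * n
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    ensemble = []
    for _ in range(rounds):
        # Choose the stump with the smallest weighted training error.
        best = None
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump(x, t, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity)
        err, t, polarity = best
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        # Weight of the weak classifier.
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Misclassified samples gain weight, correct ones lose it; then normalize.
        weights = [w * math.exp(-alpha * y * stump(x, t, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    score = sum(alpha * stump(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
for rounds in (1, 3):
    model = train_adaboost(xs, ys, rounds)
    errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
    print(rounds, "round(s):", errors, "training errors")
```

No single stump separates this dataset, but the weighted vote of three stumps classifies all ten training points correctly.
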
    4th April
    • Adaboost for face detection
    • Computation of Haar-like features
    • Concept of classifier cascade for pruning away negative samples, concept of false positive rate and detection rate
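
Haar-like features are usually evaluated with an integral image, so that any rectangle sum costs four lookups. Below is a minimal sketch with one two-rectangle feature. The rectangle layout and function names are illustrative and do not reproduce any particular detector.

```python
def integral_image(image):
    """ii[y][x] is the sum of image rows 0..y-1 and columns 0..x-1.

    The result has one extra row and column of zeros, which removes the need
    for boundary checks in rect_sum.
    """
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
ii = integral_image(image)
print(rect_sum(ii, 1, 1, 2, 2))
print(two_rect_horizontal(ii, 0, 0, 4, 3))
```

The cost of a feature is independent of its size, which is what makes it practical to evaluate a very large family of weak classifiers at every window position and scale.
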
    11th April
    • Concept of (binocular geometric) stereo, stereo baseline, stereo disparity
    • Simplest case of stereo with aligned coordinate systems of the two cameras: inverse relation between depth and disparity
    • Parameters of a stereo system
    • Epipolar geometry: epipolar plane, left and right epipoles, left and right epipolar lines
    • Fully calibrated stereo with unaligned coordinate systems
    • Essential and fundamental matrices in a stereo system; eight point algorithm
    • Slides (up to and including slide 28 only)
    • Read sections 7.1 and 7.3 from Trucco and Verri
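
For the simplest stereo configuration, the inverse relation between depth and disparity follows in two lines. The notation here is an assumption made for this note: both cameras have focal length f, their optical axes are parallel, the right camera is displaced by the baseline B along the X axis of the left camera, and (X, Y, Z) is the scene point in the left camera frame, with image x-coordinates x_l and x_r.

```latex
x_l = \frac{f X}{Z}, \qquad x_r = \frac{f (X - B)}{Z}
\quad\Longrightarrow\quad
d = x_l - x_r = \frac{f B}{Z}
\quad\Longrightarrow\quad
Z = \frac{f B}{d}
```

For a fixed error in the measured disparity d, the resulting error in Z grows roughly as Z^2 / (f B), so a longer baseline improves depth resolution for distant points.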