CS763 - Computer Vision

Instructor: Ajit Rajwade
Office: SIA-218, KReSIT Building
Email:
Lecture Timings: Slot 12, Monday and Thursday 5:30 to 7:00 pm (note the new timing for slot 12)
Lecture Venue: SIC 301
Instructor Office Hours (in room SIA-218): Tuesday and Friday 5:00 pm to 6:00 pm, or after class, or by appointment via email
Teaching Assistants: Abhishek Chakraborty (abhishek.chakraborty.iitb AT gmail DOT com), Divyanshu Grover, Souvik Sinha Deb, Sougata Sinha {divyanshu, souviksd, sougata}@cse DOT iitb DOT ac DOT in



Topics to be covered (tentative list)


Intended Audience

First or second year M. Tech. students, Ph.D. students, third/fourth year B. Tech. students or fourth/fifth year dual-degree students. This course should be of interest to students from CSE and EE primarily, but also to students from some other departments such as Aerospace, Physics, Biosciences and Bio-engineering, Mechanical Engineering, Geosciences or Civil Engineering.

Pre-requisites

  • Exposure to basic mathematics: calculus, linear algebra and probability. Ability to program in C/C++/MATLAB and/or willingness to learn MATLAB.
  • Should have taken at least one of the following: CS 663 (Image Processing), CS 475/675 (Computer Graphics), CS 740 (Mathematical Methods for Visual Computing), or equivalent courses from other departments. I will expect you to be familiar with the Fourier transform (or be willing to learn it quickly) and basic linear algebra (eigen-analysis, matrix inverse).

    Learning Materials and Textbooks

    Computational Resources


    Grading Policy


    Other Policies


    Course Projects

    Read this link for a list of project topics, and various instructions regarding course project submissions and expectations. I will keep updating it as new topics come to my mind. You can also refer to "extra interesting readings" in the lecture schedule.

    Lecture Schedule:


    Date

    Content of the Lecture

    Assignments/Readings/Notes

    Interesting Extra Readings (not for exam)

    04/01 (Mon)
    • Introduction, course overview
    07/01 (Thu)
    • Transformations in 2D: translation, rotation, scaling, shearing, reflection; affine transformations; composition of affine transformations; number of degrees of freedom
    • Transformations in 3D: translation, rotation about cartesian axes, rotation about arbitrary axis, reflection across an arbitrary plane; affine transformations; number of degrees of freedom
    • Pinhole camera, purpose of pinhole; perspective projection and basic equations deriving relationship between coordinates of object point (in 3D) and those of its image (on the image plane)
    • Slides: pdf, pptx
    • Chapter 2 (section 2.4) and chapter 6 of Trucco and Verri (see moodle)
    11/01 (Mon)
    • Camera coordinate system and image coordinate system
    • Weak perspective projection, orthographic projection
    • Vanishing point: geometric and algebraic derivation, proof that vanishing points of coplanar lines are collinear
    • Camera calibration: camera coordinate system, world coordinate system, meaning of geometric camera calibration, checkerboard box for calibration, concept of extrinsic and intrinsic camera parameters
    • Motivation for camera calibration (implications for 3D reconstruction given two calibrated cameras)
    • Slides: pdf, pptx
    • Chapter 2 (section 2.4) and chapter 6 of Trucco and Verri
    14/01 (Thu)
    • Motivation for camera calibration (implications for 3D reconstruction given two calibrated cameras)
    • Camera calibration algorithm (using a minimum of 7 points): matrix formulation and solution to a homogeneous system of linear equations, extraction of camera parameters from the solution
    • Perspective invariant: cross ratio, application in visual metrology
    • Slides: pdf,pptx
    • Chapter 2 (section 2.4) and chapter 6 of Trucco and Verri (chapter 6 contains the camera calibration algorithms in detail)
    18/01 (Mon)
    • Perspective invariant: cross ratio, more applications in visual metrology, proof of invariance of the cross ratio, explanation of why calibrated cameras are not essential for visual metrology applications (namely, affine transformations also preserve cross ratios, and the coordinates of a point in the image coordinate system in pixels and in the camera coordinate system are related by an affine transformation)
    • Camera calibration algorithm: direct camera matrix estimation
    • Slides: pdf,pptx
    • Chapter 2 (section 2.4) and chapter 6 of Trucco and Verri (chapter 6 contains the camera calibration algorithms in detail)
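The invariance of the cross ratio covered in this lecture can be checked numerically. The snippet below is only a minimal sketch: it uses one common convention for the cross ratio of four collinear points (conventions differ across textbooks), represents the points by 1D coordinates along their line, and applies a 1D projective map whose matrix `H` holds arbitrary, hypothetical values.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio of four collinear points given by 1D coordinates along the line.

    Assumed convention: ((a - c) * (b - d)) / ((a - d) * (b - c)).
    """
    return ((a - c) * (b - d)) / ((a - d) * (b - c))


def projective_map(x, h):
    """Apply the 1D projective transformation x -> (h11*x + h12) / (h21*x + h22)."""
    (h11, h12), (h21, h22) = h
    return (h11 * x + h12) / (h21 * x + h22)


points = [0.0, 1.0, 3.0, 7.0]
H = ((2.0, 1.0), (0.5, 3.0))  # hypothetical invertible 2x2 matrix (determinant 5.5)
mapped = [projective_map(x, H) for x in points]

before = cross_ratio(*points)   # cross ratio of the original points
after = cross_ratio(*mapped)    # cross ratio after the projective map
print(before, after)            # the two values agree up to rounding error
```

Individual distances and ratios of distances change under `H`, but the cross ratio does not, which is what makes it usable for measurements from a single uncalibrated image.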
    20/01 (Wed: extra lecture)
    • Concept of planar homography: derivation of the key equations
    • Algorithm to find the homography from two images of a planar scene given N pairs of corresponding points from the two images
    • Slides: pdf,pptx
    • Chapter 2 (section 2.4) and chapter 6 of Trucco and Verri (chapter 6 contains the camera calibration algorithms in detail)
    21/01 (Thu)
    • Need for a lens in a camera, concept of depth of focus and its relation to aperture size, distortion due to a lens, human eye as a camera


    • Image alignment: concept and need
    • Motion estimation and image warping: two steps of image alignment
    • Overview of parametric and non-parametric motion models
    • Forward and reverse image warping
    • Control-point based parametric motion estimation for affine transformation, homography, and difficulties faced in rigid motion estimation
    • Slides (Camera Geometry): pdf,pptx
    • Slides (Image Alignment): pdf
    22/01 (Fri: extra lecture)
    • Problems with control-point based image alignment
    • Image similarity metrics for image alignment: mean squared error, normalized cross-correlation: issues with field of overlap
    • Concept of joint histogram and its potential use for image alignment.
    • Slides: pdf
    • HW1 out, due 2nd Feb before 11:55 pm
    25/01 (Mon)
    • Concept of joint histogram and its potential use for image alignment.
    • Concept of entropy and joint entropy, and its use for image alignment
    • Application scenarios for image alignment: template matching, face recognition, image panoramas, denoising image bursts, Google art project, 3D model to 2D image registration
    • Slides: pdf
    • HW1 out, due 2nd Feb before 11:55 pm
    27/01 (Wed: extra lecture)
    • Orthogonal procrustes problem for determining rotation matrix: detailed derivation

    • Least squares methods in computer vision: definition and examples, least squares as a maximum likelihood problem assuming Gaussian distribution on the noise
    • Meaning of robust and non-robust methods
    28/01 (Thu)
    • Role of Laplacian and Generalized Gaussian distributions with shape parameter beta <= 1
    • Mean versus median, robust mean using the q-norm (0 < q <= 1)
    • Example of outliers in computer vision problems
    • Least Median of Squares (LMedS) algorithm, and its analysis; relationship between parameters P, p, k and S in LMedS
    • Slides: pdf
    • Appendix A.7 from Trucco and Verri (see moodle)
    • HW1 out, due 2nd Feb before 11:55 pm
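As a rough illustration of the relationship between P, p, k and S in LMedS, the sketch below assumes the usual reading of the symbols: p is the fraction of inliers, k the number of points in each random subset, S the number of subsets drawn, and P the desired probability that at least one subset is free of outliers, so that P = 1 - (1 - p^k)^S. This assignment of symbols is an assumption and should be checked against the slides; the function name `num_subsets` is hypothetical.

```python
import math


def num_subsets(P, p, k):
    """Smallest S with 1 - (1 - p**k)**S >= P.

    P: desired probability that at least one subset is outlier-free (0 < P < 1)
    p: assumed fraction of inliers in the data (0 < p < 1)
    k: number of points drawn per subset
    """
    return math.ceil(math.log(1.0 - P) / math.log(1.0 - p ** k))


# Homography estimation needs k = 4 point pairs per subset.
print(num_subsets(0.99, 0.5, 4))  # 72 subsets when half the matches are outliers
print(num_subsets(0.99, 0.5, 2))  # 17 subsets for a 2-point model
```

The required number of subsets grows quickly with k, which is one reason to prefer the smallest subset that still determines the model.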
    30/01 (Sat: extra lecture)
    • RANSAC algorithm and its variants
    • Application of RANSAC in robust image alignment - homography estimation
    • Relationship between least squares methods and pseudo-inverse

    • Optical flow: problem definition
    • Aperture problem: demo, brightness constancy and small motion assumption, derivation of brightness constancy equation
    • Slides (robust methods): pdf
    • Appendix A.7 from Trucco and Verri (see moodle)
    • Slides (optical flow): pdf
    • Read sections 8.3 and 8.4 from Trucco and Verri
    • HW1 out, due 2nd Feb before 11:55 pm
    01/02 (Mon)
    • Concept of regularization and its application to optical flow
    • Horn and Schunck algorithm: discrete formulation, update equations using Jacobi's method for inverting large systems
    • Examples of the Horn and Schunck algorithm
    • Limitations of the Horn and Schunck algorithm
    • Slides (optical flow): pdf
    • Read sections 8.3 and 8.4 from Trucco and Verri
    • HW1 out, due 2nd Feb before 11:55 pm
    04/02 (Thu)
    • Lucas-Kanade algorithm and its multiscale version
    • Measures of reliability in the Lucas-Kanade algorithm
    • Applications of optical flow
    • Slides (optical flow): pdf
    • Read sections 8.3 and 8.4 from Trucco and Verri
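The core step of the Lucas-Kanade algorithm at a single scale can be sketched as follows, assuming the standard formulation: within a small window the flow (u, v) is taken to be constant, and the brightness constancy equation Ix*u + Iy*v + It = 0 at every pixel is solved in the least squares sense through the 2x2 structure tensor. The function name and the gradient values are hypothetical, and the gradients are assumed to be precomputed.

```python
def lucas_kanade_window(Ix, Iy, It):
    """Least squares flow (u, v) for one window from per-pixel gradients.

    Solves [[a, b], [b, c]] [u, v]^T = [p, q]^T, where the matrix is the
    structure tensor summed over the window.
    """
    a = sum(gx * gx for gx in Ix)
    b = sum(gx * gy for gx, gy in zip(Ix, Iy))
    c = sum(gy * gy for gy in Iy)
    p = -sum(gx * gt for gx, gt in zip(Ix, It))
    q = -sum(gy * gt for gy, gt in zip(Iy, It))
    det = a * c - b * b
    if abs(det) < 1e-12:
        # Singular structure tensor: the window shows the aperture problem.
        raise ValueError("structure tensor is singular")
    u = (c * p - b * q) / det
    v = (a * q - b * p) / det
    return u, v


# Synthetic window: gradients consistent with a true flow of (0.5, -0.25).
Ix = [1.0, 0.0, 2.0, -1.0]
Iy = [0.0, 1.0, 1.0, 3.0]
true_u, true_v = 0.5, -0.25
It = [-(gx * true_u + gy * true_v) for gx, gy in zip(Ix, Iy)]

u, v = lucas_kanade_window(Ix, Iy, It)
print(u, v)  # recovers the true flow up to rounding error
```

The size of `det` relative to the tensor entries (equivalently, the two eigenvalues of the structure tensor) is what the reliability measures for Lucas-Kanade are built on.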
    06/02 (Sat: extra lecture)
    • Optical flow: non-rigid image warping
    • SVD: reduced form, SVD as weighted summation of rank-1 matrices, Eckart-Young theorem for best low-rank approximation (will be required in structure from motion), application in image compression
    • Structure from motion: problem definition
    • Factorization algorithm by Tomasi and Kanade: assumptions
    • Statement of rank theorem in SfM and its proof
    • Slides (optical flow): pdf
    • Slides (SVD): pdf
    • Slides (structure from motion): pdf
    • Read section 8.5.1. from Trucco and Verri (for SfM)
    • HW1 solutions
    08/02 (Mon)
    • Factorization algorithm by Tomasi and Kanade: assumptions
    • Statement of rank theorem in SfM and its proof
    • Complete factorization algorithm, SfM under noise (application of the Eckart-Young theorem), Newton's method for imposing metric constraints for a unique factorization solution


    • Introduction to feature point tracking
    • Slides (structure from motion): pdf
    • Slides (feature point tracking): pdf
    • Read section 8.5.1. from Trucco and Verri (for SfM)
    • HW2 out (due 17th Feb)
    11/02 (Thu)
    • Introduction to feature point tracking
    • Motion estimation models: piecewise translation (similar to Lucas Kanade) and piecewise affine
    • Criterion for trackable points as per local structure tensor and its eigenvalues
    • Use of piecewise affine transformation model for feature tracking
    • Problem of disappearing feature points, newly appearing feature points, and occlusion during tracking
    • Comparison of piecewise translation and piecewise affine motion models
    • Applications of feature point tracking: facial expressions, video stabilization
    15/02 (Mon)
    • Introduction to shape from shading
    • Concept of image irradiance, scene radiance, and reflectance function
    • Lambert's cosine law, Lambertian reflectance model, concept of albedo, Phong's reflectance model
    • Shape from shading algorithm given orthographic projections and known reflectance model
    • Introduction to shape from needle map and the Poisson equation
    18/02 (Thu)
    • Midterm review session
    3/03 (Thu)
    • Distribution of midterm papers
    7/03 (Mon)
    • Stereographic projections in shape from shading
    • Poisson equation, issues in integrating gradient fields, solution to the Poisson equation using DFT, some interesting applications of the Poisson equation in image editing
    • Photometric stereo: determining surface normals and albedo from multiple images of a Lambertian object - each under a different light source direction
    • Cast and attached shadows: locating the shadows and specularities given at least 4 images
    10/03 (Thu)
    • Adaboost: concept of weak and strong classifier, machine learning 101: classifier, training and test set
    • Basic adaboost algorithm, example of families of classifiers
    14/03 (Mon)
    • Viola and Jones face detector: detailed architecture, Haar-like features, concept of classifier cascade: detection rate and false positive rate
    17/03 (Thu)
    • Theoretical treatment of Adaboost: derivation of algorithm, deriving rule for updating classifier weights, training sample weights, criterion for next weak classifier to be chosen
    • Coordinate descent, and Adaboost as coordinate descent
    • Generalization properties of Adaboost (brief overview): concept of classifier margin, empirical evidence of increase in margin of the strong classifier across rounds of Adaboost
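One round of the weight updates derived in this lecture can be sketched as below, assuming the standard discrete Adaboost rules with labels and weak classifier outputs in {-1, +1}: the classifier weight is alpha = 0.5 * ln((1 - eps) / eps), and each training sample weight is multiplied by exp(-alpha * y * h(x)) and then renormalized. The function name and the toy data are hypothetical.

```python
import math


def adaboost_round(weights, labels, predictions):
    """Return (alpha, new_weights) for one Adaboost round.

    weights: current sample weights, assumed to sum to 1
    labels, predictions: true labels and weak classifier outputs in {-1, +1}
    Assumes the weighted error eps satisfies 0 < eps < 1.
    """
    eps = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    updated = [w * math.exp(-alpha * y * h)
               for w, y, h in zip(weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, -1, -1]
predictions = [1, -1, -1, -1]  # the second sample is misclassified

alpha, new_weights = adaboost_round(weights, labels, predictions)
print(alpha, new_weights)
```

After the update the misclassified sample carries half of the total weight, so the weak classifier just chosen has weighted error exactly 0.5 on the new distribution and the next round is forced to pick a different one.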
    28/03 (Mon)
    • Concept of (geometric binocular) stereo: concept of stereo baseline, disparity
    • Stereo with aligned camera axes: relationship between stereo disparity and depth
    • Intrinsic and extrinsic parameters of a stereo system
    • Epipolar geometry: left and right epipoles, epipolar line, epipolar plane, epipolar constraint
    • Concept of essential and fundamental matrix
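For the case of stereo with aligned camera axes, the relationship between disparity and depth can be illustrated with a short sketch. It assumes two identical pinhole cameras with parallel optical axes, focal length f expressed in pixels, baseline B, and disparity d measured in pixels along the same image row, giving Z = f * B / d. The function name and the numbers are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair with parallel optical axes.

    The result is in the same units as the baseline.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline / disparity_px


# Hypothetical rig: f = 700 pixels, baseline = 0.1 m.
near = depth_from_disparity(700.0, 0.1, 35.0)  # about 2 m
far = depth_from_disparity(700.0, 0.1, 7.0)    # about 10 m
print(near, far)
```

Depth is inversely proportional to disparity, so a fixed error in disparity causes a much larger depth error for distant points than for nearby ones.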
    31/03 (Thu)
    • Degrees of freedom of essential and fundamental matrix, eight point algorithm, applications of fundamental/essential matrix
    • Stereo reconstruction: known parameters; under known intrinsic and unknown extrinsic parameters
    • Correspondence problem and "solving it": comparing patches, inferring regularized disparity maps, dynamic programming method (ordering constraint and its failure)
    4/04 (Mon)
    • Compressed sensing: problem statement and motivation - conventional sensing and compression, concept of signals expressed as linear combination of columns of orthonormal matrices, sparsity of signals in discrete cosine transform bases, concept of compressed sensing and its potential applications
    • Shannon's sampling theorem and its limitations
    • Candes' puzzling experiment
    • Concept of incoherence of the measurement matrix with the signal representation basis
    • Signal reconstruction: uniqueness issues, L0-norm optimization and its NP-hard nature, L1-norm optimization; key theorem by Candes, Romberg and Tao and commentary on it
    • Slides on CS theory: pdf
    • HW5 is out, due 15th April before 11:55 pm
    6/04 (Wed: extra lecture)
    • Theorem 1 by Candes, Romberg, Tao and commentary on it, comparisons to Shannon's sampling theorem
    • Motivation behind concept of incoherence
    • Concept of restricted isometry property and restricted isometry constant
    • Theorems 2 and 3 - CS reconstruction for compressible (not just sparse) signals, and under bounded noise
    • CS reconstruction: L1 norm versus L2 norm optimization
    • CS reconstruction: some toy experiments
    • Slides on CS theory: pdf
    • HW5 is out, due 15th April before 11:55 pm
    7/04 (Thu)
    • Candes' experiment and its results in terms of theorem 1
    • Algorithms for CS reconstruction: matching pursuit (MP) and orthogonal matching pursuit (OMP), comparison of these algorithms and their properties
    • Rice single pixel camera
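The matching pursuit (MP) algorithm from this lecture can be sketched in a few lines, assuming a dictionary of unit-norm atoms stored as plain Python lists: at each iteration the atom most correlated with the current residual is selected, its coefficient is updated, and its contribution is subtracted from the residual. In a compressed sensing setting the atoms would be the columns of the product of the measurement matrix and the sparsifying basis; the tiny dictionary here is hypothetical.

```python
import math


def matching_pursuit(atoms, signal, num_iters):
    """Greedy sparse approximation of `signal` over unit-norm `atoms`.

    Returns (coefficients, residual) after `num_iters` iterations.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(num_iters):
        # Inner product of every atom with the current residual.
        inner = [sum(a_i * r_i for a_i, r_i in zip(atom, residual))
                 for atom in atoms]
        # Select the atom with the largest absolute correlation.
        j = max(range(len(atoms)), key=lambda idx: abs(inner[idx]))
        coeffs[j] += inner[j]
        residual = [r_i - inner[j] * a_i
                    for r_i, a_i in zip(residual, atoms[j])]
    return coeffs, residual


s = 1.0 / math.sqrt(2.0)
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [s, s, 0.0],  # extra atom, making the dictionary overcomplete
]
signal = [3.0, 0.0, -2.0]  # equals 3 * atoms[0] - 2 * atoms[2]

coeffs, residual = matching_pursuit(atoms, signal, num_iters=2)
print(coeffs, residual)
```

MP may select the same atom more than once, which is why the coefficient is accumulated. OMP instead re-fits all selected coefficients by least squares at every iteration, so that the residual stays orthogonal to every atom chosen so far.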
    11/04 (Mon)
    • Rice single pixel camera
    • Coded aperture snapshot spectral imager (CASSI): compressive camera for acquisition of hyperspectral images; discussion on color filter arrays and demosaicing
    • Hitomi's video camera (not for exam)