Instructions for the Project:
- Projects can be done individually, in pairs, or in groups of at most 3. Under no circumstances should the group size exceed
3. I will expect more work from a group of 3 than from a group of fewer
people. M.Tech. students - I recommend you do the project alone.
- All group members should have individual unique contributions to
the group (which must be stated clearly in your report), and yet, be
aware of what the other group members did.
- Your project work may be a research paper implementation, or your
own idea. In the latter case, please speak to me to discuss issues such as scope and feasibility.
- Your project work may contribute to your thesis, but the work you
submit for this course must be done in this semester. It should be a self-contained piece of work.
- Project due date: week before finals.
- You may use MATLAB/C/C++/Java/Python + any packages (OpenCV,ITK,
etc.) for your project. But merely invoking calls to someone else's
software is not substance enough. You should have your own non-trivial
coding component. If software for the research paper you implement is
already available, you should use it only for comparison's sake - you
will be expected to implement the paper on your own. Please discuss
with me if you need any clarifications for your specific case.
- Due date for forming project group and deciding topic: 5th to 12th September (send me an email).
Instructions for Project Submission/Demo
- You will demo your project in my office sometime from Nov 13th to Nov 15th. Time-table is here.
- You should submit the following on moodle BEFORE you come for the demo: (1) a copy of your code with a neatly written README, and a link to all relevant data (any person should be able to
run your code independently), (2) a write-up containing (i) a statement of the problem you are solving, (ii) description of experiments and listing of results (tables/diagrams/graphs/images, etc.),
(iii) a list of inferences from the experiments (example: what's cool about the results, what's lacking in them and how to improve those results, etc.), (iv) a statement telling me what each group member contributed.
- I will ask you many questions about your implementation and about the paper(s) you read or implemented.
Links to Image Datasets
List of potential project topics:
(Note: While reading a research paper for the first time, read its abstract, conclusion, experimental results (in that order). Leave most of the introduction for much later. The introduction often contains a literature survey which might
be a tad overwhelming. Always start your project from simple implementations, building up step by step. Don't try the most challenging cases first or all at once.)
- Detection of citrus fruits from images. Background literature: here, and here.
- Image deblurring using reverse heat equation and reverse mean shift - read paper here and chapter 6 of the first author's PhD thesis here.
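To get a feel for this topic before reading the paper, here is a tiny numpy sketch of the core idea on a synthetic step edge: forward heat steps blur the image, and reverse heat steps (u_t = -laplacian(u)) partially undo the blur. The data, function names, and parameter values are my own toy choices, not from the paper - and note that the reverse heat equation is numerically unstable, which is exactly what the paper works around.

```python
import numpy as np

def laplacian(u):
    # 5-point discrete Laplacian with replicated (Neumann) borders
    p = np.pad(u, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u

def reverse_heat(u, steps=5, dt=0.1):
    # Reverse heat equation u_t = -laplacian(u): sharpens, but is unstable,
    # so keep dt small and the number of steps low.
    v = u.astype(float).copy()
    for _ in range(steps):
        v -= dt * laplacian(v)
    return v

# Blur a step edge with a few forward heat steps, then try to undo the blur.
img = np.zeros((32, 32)); img[:, 16:] = 1.0
blurred = img.copy()
for _ in range(5):
    blurred += 0.1 * laplacian(blurred)
sharpened = reverse_heat(blurred, steps=5, dt=0.1)
```

Running this, the sharpened result is closer to the original step than the blurred input is - but try increasing `steps` or `dt` and watch it blow up.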
Develop an interactive, semi-automated tool to restore "foggy" scenes:
Interactive Deweathering of an Image using Physical Models
Srinivasa G. Narasimhan and Shree K. Nayar
IEEE Workshop on Color and Photometric Methods in Computer Vision, in Conjunction with ICCV, October, 2003.
This one uses PCA (a technique we will learn in class) for object recognition:
H. Murase and S. Nayar, "Visual learning and recognition of 3D objects from appearance", International Journal of Computer Vision, 1995.
Useful data is here.
Apply the PCA technique to detect faces in images. Can you handle variations in pose, illumination, scale? Can you make your technique fast, faster, fastest?
Apply the PCA technique to detect cars in images. Can you handle variations in pose, illumination, scale, vehicle type and color? Can you make your technique fast, faster, fastest?
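For either of the two PCA detection projects above, the core machinery is the same: learn a PCA basis from training crops and score a candidate window by its reconstruction error in that basis (low error = object-like). Here is a minimal sketch on synthetic 1-D "patches" - the training pattern, sizes, and number of components are placeholders I made up, not values from the Murase-Nayar paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: a fixed pattern plus noise (stand-ins for real face/car crops,
# flattened to vectors).
pattern = rng.normal(size=64)
faces = pattern + 0.1 * rng.normal(size=(100, 64))

# PCA: centre the data and keep the top-k principal directions.
mean = faces.mean(axis=0)
X = faces - mean
_, _, Vt = np.linalg.svd(X, full_matrices=False)
basis = Vt[:5]                        # top 5 eigen-patches

def score(window):
    """Reconstruction error in the eigen-space; low error => object-like."""
    c = (window - mean) @ basis.T     # project onto the basis
    recon = mean + c @ basis          # reconstruct from the coefficients
    return np.linalg.norm(window - recon)

face_like = pattern + 0.1 * rng.normal(size=64)
non_face = rng.normal(size=64)
```

In a real detector you would slide this scoring over all windows of an image at several scales - which is precisely where the "fast, faster, fastest" question becomes interesting.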
Using PDEs for image compression:
M. Mainberger and J. Weickert, "Edge-Based Image Compression with Homogeneous Diffusion", CAIP 2009 AND C. Schmaltz, J. Weickert and A. Bruhn, "Beating the Quality of JPEG 2000 with Anisotropic Diffusion", DAGM 2009.
(This one is mathematical but you will learn a lot from this paper and also generate some really cool-looking results on color images) Smoothing of color images using PDEs:
"Regularization with PDEs: A Common Framework for Different Applications"
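As a warm-up for the paper's general framework, here is a sketch of the classical Perona-Malik edge-preserving diffusion on a grayscale image (the paper's vector-valued scheme is more general than this). The test image, step counts, and the conductance parameter k are arbitrary choices of mine.

```python
import numpy as np

def perona_malik(u, steps=20, dt=0.15, k=0.1):
    """Edge-preserving smoothing: diffuse strongly in flat regions,
    weakly across large gradients."""
    u = u.astype(float).copy()
    for _ in range(steps):
        p = np.pad(u, 1, mode="edge")
        # Differences to the four neighbours.
        dn = p[:-2, 1:-1] - u; ds = p[2:, 1:-1] - u
        dw = p[1:-1, :-2] - u; de = p[1:-1, 2:] - u
        g = lambda d: np.exp(-(d / k) ** 2)   # Perona-Malik conductance
        u += dt * (g(dn) * dn + g(ds) * ds + g(dw) * dw + g(de) * de)
    return u

rng = np.random.default_rng(1)
img = np.zeros((32, 32)); img[:, 16:] = 1.0   # step edge
noisy = img + 0.05 * rng.normal(size=img.shape)
smoothed = perona_malik(noisy)
```

The point to verify visually on real images: the noise in flat regions is suppressed while the step edge survives - a linear (heat-equation) filter cannot do both.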
For how many iterations does one run a PDE? Find out by analyzing the residual image: the difference between the noisy image and its filtered form.
D. Brunet, E. Vrscay and Z. Wang, "The use of residuals in image denoising", ICIAR 2009 (be prepared for some math, though the implementation is not so hard!)
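The basic diagnostic the paper builds on is easy to set up: if the filter removed only noise, the residual should itself look like pure noise (near-zero mean, no leftover image structure). A toy sketch with a mean filter standing in for the denoiser - the filter, image, and noise level are my own placeholders, not the paper's setup:

```python
import numpy as np

def box_filter(u, r=1):
    """Simple mean filter (a stand-in for any denoiser) via shifted sums."""
    p = np.pad(u, r, mode="edge")
    out = np.zeros_like(u, dtype=float)
    n = 2 * r + 1
    for dy in range(n):
        for dx in range(n):
            out += p[dy:dy + u.shape[0], dx:dx + u.shape[1]]
    return out / n ** 2

rng = np.random.default_rng(2)
clean = np.tile(np.linspace(0, 1, 64), (64, 1))   # smooth synthetic image
noisy = clean + 0.1 * rng.normal(size=clean.shape)
filtered = box_filter(noisy)
residual = noisy - filtered                        # should resemble pure noise
```

For the project, iterate the PDE and track statistics of `residual` (mean, variance, autocorrelation) to decide when to stop: structure creeping into the residual means the filter has started eating the image.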
A nice way of repairing holes in textures and text images (the math in the paper might seem daunting, but the actual algorithm/technique is easy and intuitive). Optionally, think of a nice application of this method to some image processing problem
different from what is in the paper.
Texture Synthesis by Non-parametric Sampling, Alexei A. Efros, Thomas Leung, In ICCV 1999
Learn how to use your IIT education to become a barber - rather, an e-barber :-)
Image-based Shaving, Minh Hoai Nguyen, Jean-Francois Lalonde, Alexei A. Efros, Fernando de la Torre, in Eurographics 2008.
Single-Image Super-Resolution: convert a low-resolution image into a higher resolution image using a simple machine learning method.
H. Chang, D.-Y. Yeung & Y. Xiong, Super-resolution through neighbor embedding, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, 27 June-2 July 2004, Volume: 1, pp. I-275 - I-282.
Another approach to image super-resolution here.
Image Quilting for Texture Synthesis and Transfer, check out this webpage
We will study a technique in class called the Hough Transform. It is used to detect shapes such as lines, circles, ellipses, etc. in images. Implement a Hough-transform based scheme to detect some or all of the following: roads and road-lanes in images, AND/OR
buildings in urban panoramas, AND/OR vehicles such as trucks in traffic videos, AND/OR advertisement billboards or traffic sign-posts from images. Can you speed up your implementation if you have a video sequence instead of a single image?
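To make the voting idea concrete, here is a bare-bones Hough accumulator for lines in the (rho, theta) parameterization, run on a synthetic edge map with one vertical line. The resolution choices are arbitrary; for real images you would first run an edge detector (e.g. Canny in OpenCV) and then vote only on edge pixels, exactly as below.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Minimal Hough accumulator for lines: rho = x*cos(t) + y*sin(t)."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))        # largest possible |rho|
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # Each edge pixel votes for every line passing through it.
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        acc[np.round(rhos).astype(int) + diag, np.arange(n_theta)] += 1
    return acc, thetas, diag

# Synthetic edge map: a single vertical line at x = 10.
edges = np.zeros((40, 40), dtype=bool)
edges[:, 10] = True
acc, thetas, diag = hough_lines(edges)
rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
best_rho, best_theta = rho_idx - diag, thetas[theta_idx]
```

The accumulator peak recovers (rho = 10, theta = 0), i.e. the vertical line. Circles and ellipses work the same way, just with higher-dimensional accumulators.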
Detection of faces by detecting regions of skin from color images. Look for a short tutorial here. Check references therein.
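A simple starting point is a per-pixel rule in RGB space; the thresholds below are one widely quoted heuristic, not something from the linked tutorial, and a serious project should compare against colour spaces such as HSV or YCbCr on real data.

```python
import numpy as np

def skin_mask(rgb):
    """Rule-based skin classifier on an RGB uint8 image (heuristic thresholds)."""
    r = rgb[..., 0].astype(int); g = rgb[..., 1].astype(int); b = rgb[..., 2].astype(int)
    mx = np.maximum(np.maximum(r, g), b)
    mn = np.minimum(np.minimum(r, g), b)
    return ((r > 95) & (g > 40) & (b > 20) & (mx - mn > 15)
            & (np.abs(r - g) > 15) & (r > g) & (r > b))

# Toy image: left half a typical skin tone, right half pure blue.
img = np.zeros((4, 8, 3), dtype=np.uint8)
img[:, :4] = (220, 170, 140)   # skin-like colour
img[:, 4:] = (0, 0, 255)       # non-skin colour
mask = skin_mask(img)
```

The interesting part of the project is what comes after the mask: cleaning it up (morphology), grouping skin blobs, and deciding which blobs are actually faces.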
A technique to use joint entropy to detect cuts or sudden changes in a video sequence (called video shot detection): "Information Theory-Based Shot Cut/Fade
Detection and Video Summarization". Find the paper here.
One more approach for video shot boundary detection using color image histograms: here.
And one more approach to video shot detection: Theodore Vlachos, Cut Detection in Video Sequences Using Phase Correlation, IEEE Signal Processing Letters, Vol. 7, No. 7, July 2000, pp. 173-175
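A baseline worth implementing before any of the three papers above: compare colour/intensity histograms of consecutive frames and declare a cut when the distance spikes. The sketch below uses a plain L1 distance on grayscale histograms of synthetic frames; the bin count and threshold are placeholders you would tune on real video.

```python
import numpy as np

def hist(frame, bins=16):
    h, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return h / h.sum()

def detect_cuts(frames, thresh=0.5):
    """Flag a cut wherever the L1 distance between consecutive
    frame histograms exceeds the threshold."""
    cuts = []
    for i in range(1, len(frames)):
        d = np.abs(hist(frames[i]) - hist(frames[i - 1])).sum()
        if d > thresh:
            cuts.append(i)
    return cuts

rng = np.random.default_rng(3)
# Synthetic "video": 5 dark frames, then an abrupt cut to 5 bright frames.
dark = [rng.integers(0, 60, (24, 32)) for _ in range(5)]
bright = [rng.integers(180, 255, (24, 32)) for _ in range(5)]
cuts = detect_cuts(dark + bright)
```

The papers above improve on exactly this baseline (entropy-based distances, phase correlation), so reporting your method's gains over it makes for a good experiments section.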
For machine learning people: implement the technique in this paper, which compares eigenfaces with the so-called "fisherfaces".
Application in forensics - detection of image splicing (a spliced image is a manipulated image created by pasting parts of different images to create a composite one that can trick a viewer into believing that it was a genuine/non-manipulated image):
Exposing Image Splicing with Inconsistent Local Noise Variances
Given an image frame, find the most similar image frame from a database of images, using color image histograms: see paper here.
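The classic similarity measure here is histogram intersection (Swain and Ballard's colour-indexing idea); whether the linked paper uses exactly this measure, you should check - the sketch below is just the general recipe on synthetic frames, with made-up bin counts.

```python
import numpy as np

def colour_hist(img, bins=8):
    """Per-channel histogram, concatenated and normalised to sum to 1."""
    hs = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(hs).astype(float)
    return h / h.sum()

def intersection(h1, h2):
    # Histogram intersection similarity; 1.0 means identical histograms.
    return np.minimum(h1, h2).sum()

rng = np.random.default_rng(4)
query = rng.integers(0, 80, (16, 16, 3))             # dark query frame
database = [rng.integers(0, 80, (16, 16, 3)),         # similar dark frame
            rng.integers(170, 255, (16, 16, 3))]      # very different bright frame
scores = [intersection(colour_hist(query), colour_hist(d)) for d in database]
best = int(np.argmax(scores))
```

Ranking the whole database by this score and returning the top match is the entire retrieval loop; the project's depth comes from the choice of colour space, binning, and distance.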
Detecting regions of text in natural scenes. Follow your own intuition or implement ideas from here. Some of the methods given in this paper can actually work quite well for this task.
If you scan or take pictures of pages from a book, the resulting images have multiple problems. They can be noisy, have poor contrast, be at an improper orientation, or be skewed (this especially happens to lines of text that lie in the central
portions of hard-bound books, because those pages cannot be laid out flat on a flat surface). Implement a system that can correct as many of these problems as you can. You may perform some of the steps interactively (in a semi-automatic way). You can see some links here.
- When you apply for a foreign country visa, you need to submit a photo. The photo has several specifications: it should contain a frontal view of your face, the entire face should be visible and no large head rotations are allowed, the expression should be
neutral, the background should be of a particular color (no cluttered backgrounds allowed), the resolution should be acceptable (not too low), there should be no scarves or other accessories occluding parts of the face, the spectacles should not have a large
glow on them, and so on. Try to implement a system that will check for as many of these specifications as you can (you need not do all). Let your imagination run wild! You can add other specs here that you deem fit. You can take a look at the photo requirements for a US visa. This was a course assignment in a previous offering of the course!
Take an image (a photo of a scene or a face) and convert it into a cartoon. You may apply anisotropic diffusion or bilateral filtering (the filter you implemented in assignment 2) to make the image look
piecewise constant, and follow
it up with edge detection. Make your outputs look
as good as possible. The process of converting a face image into a cartoon is called toonification. Here is a sample course project done at Stanford. And here (Flow based image abstraction) is an interesting paper, which applies a modification of
the bilateral filter to convert a photo into a cartoon. Here is another idea - to extend this for video. The paper is here. You can try to explore: what happens if you independently apply the toonification procedure to individual frames?
Do you think you will get weird results if one frame is slightly noisier than the previous one? Is there something you can do about it?
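The basic flatten-quantise-outline pipeline can be sketched in a few lines; this toy version runs a brute-force bilateral filter on a grayscale step image, quantises the intensities, and darkens strong-gradient pixels. All parameter values are arbitrary starting points, and the flow-based paper's method is considerably more refined than this.

```python
import numpy as np

def bilateral(u, radius=2, sigma_s=2.0, sigma_r=0.2):
    """Brute-force bilateral filter on a grayscale image in [0, 1]."""
    p = np.pad(u, radius, mode="edge")
    num = np.zeros_like(u); den = np.zeros_like(u)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = p[radius + dy: radius + dy + u.shape[0],
                        radius + dx: radius + dx + u.shape[1]]
            # Weight = spatial closeness * intensity closeness.
            w = (np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                 * np.exp(-((shifted - u) ** 2) / (2 * sigma_r ** 2)))
            num += w * shifted; den += w
    return num / den

def toonify(u, levels=4, edge_thresh=0.2):
    """Cartoon look: flatten with repeated bilateral filtering,
    quantise intensities, then darken strong-gradient pixels."""
    v = u
    for _ in range(3):
        v = bilateral(v)
    quant = np.round(v * (levels - 1)) / (levels - 1)
    gy, gx = np.gradient(v)
    edges = np.hypot(gx, gy) > edge_thresh
    out = quant.copy(); out[edges] = 0.0    # black outlines on edges
    return out

img = np.zeros((32, 32)); img[:, 16:] = 1.0
cartoon = toonify(img)
```

Swap in your assignment-2 bilateral filter and a proper edge detector, run it per-channel on colour images, and you have the baseline against which to explore the video-consistency question above.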