Research Projects Archive

Image Based Animations (Biswarup Choudhury and Ambareesha Raghothaman):
Image-based rendering techniques enable the synthesis of novel views of a scene directly from input images, unlike traditional computer graphics techniques, where the 3D geometry and surface reflectance properties of the surfaces in the scene need to be specified. It is very time-consuming to specify a realistic 3D model. Also, accurate specification of reflectance properties of materials in the scene is difficult.
Our endeavour is to take the idea of image-based rendering a step further by creating motion using images. Specifically, given a set of images of a static object under a carefully chosen set of “basis lighting configurations”, and an arbitrary environment, again in the form of images, our algorithm creates realistic motion along any arbitrary path, composed realistically under novel illumination configurations.

Rendering Rain Falling Effect (Pisith Hao):
There has been a manyfold increase in the computational speed of graphics hardware in recent times. This power afforded by modern graphics cards enables the possibility of simulating complex environmental phenomena, such as atmospheric special effects.
The main objective of the project is to create a realistic falling-rain effect in real time, taking into account the refraction and reflection of raindrops, by modeling the world as a cube map and using the environment mapping technique.

Customised 3D rendering engine (Lt Col Chetan Dewan):
The objective of the project is to customise a suitable open source rendering engine and transform it into a graphics development toolkit and rendering engine for developing simulations for armed forces training. The development continues an earlier project in which Coin3D, an open source engine, was selected and some customisation was carried out. That work was shelved because a recent survey by the author showed that Coin3D had stagnated over time and that a number of newer open source projects had far surpassed the enhanced Coin3D project in both features and performance. For more information you can also visit or

Radiosity (Alap Karapurkar):
Global illumination enables production of pictures that look less like those synthesized by computers. The methods advance our knowledge of the physical environment such as the simulation of light transport. The fast multipole method is a promising method for achieving fast solutions to many applied problems in engineering, biology, computer vision, and statistics. It has been called one of the ten most significant numerical algorithms discovered in the 20th century, and won its inventors, Vladimir Rokhlin and Leslie Greengard, the Steele prize. The algorithm enables the product of restricted dense matrices with a vector to be evaluated in O(N) operations, when direct multiplication requires O(N^2) operations. Despite these advantages, the FMM has not been widely used in computer graphics. In this report, we show how to apply techniques of the fast multipole method to accelerate a subset of global illumination called radiosity.
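The O(N^2) cost mentioned above is that of a dense matrix-vector product. The sketch below is only an illustration of where that product appears. It assumes the standard discrete radiosity system B = E + rho * F * B (not spelled out in the abstract) and solves it with a plain Jacobi-style iteration. The function and parameter names are hypothetical, and the code does not implement the fast multipole method; it shows the dense product that such a method would replace.

```python
def solve_radiosity(emission, reflectance, form_factors, iterations=50):
    """Iterate B = E + rho * (F @ B) for a small scene of N patches.

    emission[i]        : light emitted by patch i (E)
    reflectance[i]     : diffuse reflectance of patch i (rho), in [0, 1)
    form_factors[i][j] : fraction of energy leaving patch i that reaches patch j (F)
    """
    n = len(emission)
    radiosity = list(emission)
    for _ in range(iterations):
        # Dense matrix-vector product: N^2 multiply-adds per iteration.
        gathered = [
            sum(form_factors[i][j] * radiosity[j] for j in range(n))
            for i in range(n)
        ]
        radiosity = [emission[i] + reflectance[i] * gathered[i] for i in range(n)]
    return radiosity


# Two facing patches: patch 0 emits light, patch 1 only reflects it.
result = solve_radiosity(
    emission=[1.0, 0.0],
    reflectance=[0.5, 0.5],
    form_factors=[[0.0, 0.5], [0.5, 0.0]],
)
```

For this two-patch example the exact solution is B = (16/15, 4/15), which the iteration approaches quickly because the reflectances are well below 1.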

Global Illumination (Varun Singh):
Global illumination is the leading technology used today to produce increasingly realistic synthetic pictures. The generation of an image by the radiosity method consumes large amounts of time and space. The environment or scene to be rendered consists of surfaces divided into patches, and we need to calculate the interaction between these patches. The radiosity equation essentially requires us to evaluate an integral for the form factor between each pair of patches. Monte Carlo algorithms provide an approximate method for calculating definite integrals.
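As an illustration of the Monte Carlo idea, the sketch below estimates the form factor between two parallel, directly opposed unit squares by averaging the standard kernel cos(theta_i) cos(theta_j) / (pi r^2) over random point pairs. The geometry, the function name and the sample count are assumptions chosen for the example, not details taken from the project.

```python
import math
import random


def form_factor_parallel_squares(distance=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of the form factor between two facing unit squares."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Uniform random point on each square (both have area 1).
        x1, y1 = rng.random(), rng.random()
        x2, y2 = rng.random(), rng.random()
        r2 = (x2 - x1) ** 2 + (y2 - y1) ** 2 + distance ** 2
        # Both normals point along the axis joining the squares,
        # so both cosines equal distance / r.
        cos_i = cos_j = distance / math.sqrt(r2)
        total += cos_i * cos_j / (math.pi * r2)
    area_j = 1.0
    return area_j * total / samples


estimate = form_factor_parallel_squares()
```

For this configuration the closed-form value is approximately 0.1998, so the estimate can be checked against it; the error shrinks roughly as one over the square root of the number of samples.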

Realtime Raytracing of Point Based Models (Sriram Kashyap, Rhushabh Goradia):
Point-based representations of objects have been recently used as an alternative to triangle-based representations. Starting with a z-buffer style rendering, recent work has progressed to rendering point based models using raycasting, and more general raytracing, for producing photo-realistic illumination effects such as shadows and refraction. Our work advances the state of the art by showing how to render large models (several million points) in real time. We use a GPU to simultaneously provide effects involving shadows, reflection, and refraction in real time. Our system relies on an efficient way of storing and accessing hierarchical data structures on the GPU, as well as novel techniques to handle ray intersections with point based entities.
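To make the notion of a ray intersecting a point based entity concrete, here is a minimal CPU sketch that treats a point sample as an oriented disk (a "splat") and tests a ray against it. This is a common textbook formulation and an assumption on our part; it is not the novel technique or the GPU data structure referred to above, and all names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def intersect_ray_splat(origin, direction, center, normal, radius):
    """Return the ray parameter t of the hit on a disk-shaped splat, or None."""
    denom = dot(direction, normal)
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane of the splat
    offset = [c - o for c, o in zip(center, origin)]
    t = dot(offset, normal) / denom
    if t <= 0.0:
        return None  # plane is behind the ray origin
    hit = [o + t * d for o, d in zip(origin, direction)]
    dist2 = sum((h - c) ** 2 for h, c in zip(hit, center))
    if dist2 > radius * radius:
        return None  # hits the plane but outside the disk
    return t


hit_t = intersect_ray_splat((0, 0, -5), (0, 0, 1), (0, 0, 0), (0, 0, 1), 1.0)
miss_t = intersect_ray_splat((2, 0, -5), (0, 0, 1), (0, 0, 0), (0, 0, 1), 1.0)
```

In a full renderer this test would be applied only to the splats stored in the leaves reached by traversing the hierarchy.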

System for Tumor Infiltrating Lymphocyte Detection (Andrew Janowczyk):
On January 1, 2012, in the United States there were approximately 13,683,850 men and women alive who had a history of cancer, making it a common threat to all families. As technology becomes more efficient, computer aided diagnostic (CAD) tools for identification, prognosis prediction and recurrence likelihood are becoming a reality. The work in this thesis revolves around two methods that advance this technological front. First, we discuss Hierarchical Normalized Cuts (HNCuts), which has been expressly designed for high-throughput, high-quality segmentation of stained cells from histopathology images. Second, we discuss how the output of HNCuts can be fed into our Local Morphometric Scale (LMS) algorithm to provide pixel level classification of tumor versus stroma regions. We complete the talk by presenting an application and associated results in the domain of tumor infiltrating lymphocyte (TIL) detection, a valuable prognostic indicator for patient outcome.

Hierarchical Normalized Cuts: Unsupervised Segmentation of Vascular Biomarkers from Ovarian Cancer Tissue Microarrays (Andrew Janowczyk):
Research has shown that tumor vascular markers (TVMs) may serve as potential ovarian cancer (OCa) biomarkers for prognosis prediction. The ability to quickly and quantitatively estimate vascular stained regions may yield an image based metric linked to disease survival and outcome. In this paper, we present a general, robust and efficient unsupervised segmentation algorithm, termed Hierarchical Normalized Cuts (HNCut), and show its application in precisely quantifying the presence and extent of a TVM on OCa tissue microarrays (TMAs). The strength of HNCut is in the use of a hierarchically represented data structure that bridges the mean shift (MS) and the normalized cuts (NCut) algorithms. This allows HNCut to traverse a pyramid of the input image at various color resolutions, efficiently and accurately segmenting the object class of interest (in this case ESM-1 vascular stained regions) by simply annotating half a dozen pixels belonging to the target class.

Efficient Light Field based CameraWalk (Biswarup Choudhury and Aviral Pandey):
The light field rendering method is an interesting variation on achieving realism. Once authentic imagery has been acquired using a camera gantry, or a handheld camera, detailed novel views can be synthetically generated from various viewpoints. One common application of this technique is when a user “walks” through a virtual world. In this situation, only a subset of the previously stored light field is required, and a considerable computational burden is encountered in processing the input light field to obtain this subset. In this paper, we show that appropriate portions of the light field can be cached at select “nodal points” that depend on the camera walk. Once cached sparsely and quickly, scenes can be rendered efficiently from any point on the walk.

Efficient Image Updates using Lightfields (Biswarup Choudhury and Aviral Pandey):
The light field rendering method is an interesting variation on achieving realism. Once authentic imagery has been acquired using a camera gantry, or a handheld camera, detailed novel views can be synthetically generated from various viewpoints.
One common application of this technique is when a user “walks” through a virtual world. In this situation, only a subset of the previously stored light field is required, and a considerable computational burden is encountered in processing the input light field to obtain this subset. In this paper, we show that appropriate portions of the light field can be cached at select “nodal points” that depend on the camera walk. Once cached sparsely and quickly, scenes can be rendered efficiently from any point on the walk.

Image Based Rendering (N N Kalyan):
The process of modelling the appearance and dynamics of the real world is quite complex, and it has produced compelling imagery in computer graphics. Unfortunately, current geometry-based methods have several drawbacks. Recently, building models directly from photographs has received more interest, as it has an advantage in producing photo-realistic images from image inputs. This is called image-based rendering. However, it also suffers from a few disadvantages. There are many algorithms for image-based rendering, and there are also many hybrid approaches that draw strength from both geometry-based and image-based rendering. The seminar deals with techniques for rendering photorealistic images.

A Survey of Image-Based Relighting Techniques (Biswarup Choudhury)
Image-based Relighting (IBRL) has recently attracted a lot of research interest for its ability to relight real objects or scenes under novel illuminations captured in natural or synthetic environments. Complex lighting effects such as subsurface scattering, interreflection, shadowing, mesostructural self-occlusion, refraction and other relevant phenomena can be generated using IBRL. The main advantage of image-based graphics is that the rendering time is independent of scene complexity, as the rendering is actually a process of manipulating image pixels instead of simulating light transport. The goal of this paper is to provide a complete and systematic overview of the research in image-based relighting. We observe that essentially all IBRL techniques can be broadly classified into three categories, based on how the scene and illumination information is captured: reflectance function based, basis function based, and plenoptic function based. We discuss the characteristics of each of these categories and their representative methods. We also discuss sampling density and types of light source, which are relevant issues in IBRL.
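The "manipulating image pixels" view can be illustrated with the simplest basis function based case. Because light transport is linear, an image under a light that is a weighted sum of basis lights equals the same weighted sum of the images captured under each basis light. The sketch below assumes grayscale images stored as nested lists and weights that are already known; the names are hypothetical and the code is not taken from any of the surveyed methods.

```python
def relight(basis_images, weights):
    """Combine images captured under basis lights into a relit image.

    basis_images : list of images, each a list of rows of pixel intensities
    weights      : one coefficient per basis image, describing the novel
                   illumination as a combination of the basis lights
    """
    height = len(basis_images[0])
    width = len(basis_images[0][0])
    relit = [[0.0] * width for _ in range(height)]
    for image, weight in zip(basis_images, weights):
        for y in range(height):
            for x in range(width):
                relit[y][x] += weight * image[y][x]
    return relit


# Two 1x2 basis images, each lit by a different basis light.
relit_image = relight([[[1.0, 0.0]], [[0.0, 2.0]]], [0.5, 0.25])
```

The cost depends only on the number of pixels and basis images, not on the geometric complexity of the scene.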

Vision-Based Posing of 3D Virtual Actors (Ameya Vaidya and Appu Shaji):
Construction of key poses is one of the most tedious and time-consuming steps in synthesizing 3D virtual actors. Recent alternate schemes expect the user to specify two inputs. Along with a neutral 3D reference model, more intuitive 2D inputs such as sketches, photographs or video frames are provided. Using these, of all the possible configurations, the “best” 3D virtual actor is posed. In this work, we provide a solution to this ill-posed problem. We first give a solution to the problem of finding an approximate view consistent with the 2D sketch. Elements of this rigid-body solution are novel. Next, we provide a new solution to the process of extending or retracting limbs to more accurately suit the sketch. This posing algorithm is based on a search scheme inspired by anthropometric evidence. Less physical work is required by the actor to reach the desired pose from the base position. We also show that our algorithm converges to an acceptable solution much faster than previous methods.

Markerless Motion Capture from Monocular Videos (Vishal Mamania & Appu Shaji):
Motion capture has attracted a lot of attention in recent times because of its power to generate large quantities of realistic animation economically and relatively quickly. The data so acquired is being used in a variety of situations, notably commercial movies and games. Most of this work is done in motion capture studios in a very controlled environment. In this work, we generalize the motion capture environment. Specifically, we perform the motion capture of Bharatanatyam. The proposed method uses domain specific knowledge to track major joints of the human in motion from the two-dimensional input data. We then make use of various physical and motion constraints regarding the human body to construct a set of feasible 3D poses. A graph based approach is used to find an optimal sequence of feasible poses that represents the original motion in the video.

Grafting Locomotive Motion (Shrinath Shanbaug):
The notion of transplanting limbs to enhance a motion capture database is appealing and has been recently introduced. A key difficulty in the process is identifying believable combinations. Not all transplantations are successful; we also need to identify appropriate frames in the different clips that are cut and pasted. In this paper, we describe motion grafting, a method to synthesize new believable motion using existing motion captured data. In our deterministic scheme designed for locomotive actions, motion grafts increase the number of combinations by mixing independent kinematic chains with a base motion in a given clip. Our scheme uses a cluster graph data structure to establish correlation among grafts so that the result is believable and synchronized.

Search and Transitioning for Motion Captured Sequences (Shrinath Shanbaug & Suddha Basu):
Animators today have started using motion captured (mocap) sequences to drive characters. Mocap allows rapid acquisition of highly realistic animation data. Consequently, animators have at their disposal an enormous number of mocap sequences, which ironically has created a new retrieval problem. Thus, while working with mocap databases, an animator often needs to work with a subset of useful clips. Once the animator selects a candidate working set of motion clips, she then needs to identify appropriate transition points amongst these clips for maximal reuse. In this paper, we describe methods for querying mocap databases and identifying transitions for a given set of clips. We preprocess clips (and clip subsequences), and precompute frame locations to allow interactive stitching. In contrast with existing methods that view individual clips as nodes, we reduce the granularity for optimal reuse.

Synthesizing New Walk and Climb Motions from a Single Motion Captured Walk Sequence (Shrinath Shanbaug):
We describe a method to dynamically synthesize believable, variable stride, and variable foot lift motions for human walks and climbs. Our method is derived from a single motion captured walk sequence, and is guided by a simple kinematic walk model. The method allows control in the form of stride and lift parameters. It generates a range of variations while maintaining individualistic nuances of the captured performance.

Projector-camera based Solutions for Simulation System (Nilesh Heda):
Projector based display systems are widely used as they offer an attractive combination of dense pixels over large regions. Traditionally, the projector is used for presentation purposes on a single planar surface. However, it can also be used for displaying on multi-planar irregular surfaces. In this work, we discuss methods to use a projector along with a camera for displaying on irregular surfaces using projector-camera homography. In particular, we would like to develop a shooting-range simulator system using a projector-camera system and laser pointer based interaction.
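For a single planar surface, a projector-camera homography is a 3x3 matrix that maps camera pixels to projector pixels. The sketch below is a minimal illustration under stated assumptions: exactly four point correspondences, no three of them collinear, and the last matrix entry fixed to 1. It shows how such a matrix could be estimated and then used to map a detected laser dot from camera to projector coordinates. The function names and the sample coordinates are hypothetical; a real system would typically use more correspondences and a robust estimator.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Homography (last entry fixed to 1) mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and similarly for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through the homography, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical calibration: camera-space corners and where the projector drew them.
camera_pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
projector_pts = [(10, 20), (110, 30), (100, 140), (5, 120)]
H = homography_from_points(camera_pts, projector_pts)
mapped = [apply_homography(H, p) for p in camera_pts]
```

Once H is known, a laser dot detected at a camera pixel can be passed through apply_homography to find the projector pixel it corresponds to.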

Arrhythmia classification using local Hölder exponents and support vector machine (Aniruddha Joshi):
We proposed a novel hybrid Hölder-SVM detection algorithm for arrhythmia classification. Its performance was evaluated on the benchmark MIT-BIH arrhythmia database, and an accuracy of around 96% was achieved. The distinct scaling properties of different types of heart rhythms may be of clinical importance.

Data Mining in Biomedical Signals (Aniruddha Joshi):
This project addresses the arrhythmia classification problem. Classification of 12 classes of arrhythmia (including normal) using local Hölder exponents and support vector machines has been completed. The project currently aims at computerizing the Indian Ayurvedic technique of “Pulse Diagnosis”, to analyze the behavior of the arterial pulse according to age, specific disorders and so on.

5th International Seminar on Ayurvedic education, research and drug standardization (Aniruddha Joshi):
The “Nadi” or pulse has been used as a fundamental tool for diagnosis in “Ayurveda”. We provided a systematic measurement scheme to establish an objective diagnosis. The pulse waveforms show different rhythms, intensities and frequency contents in normal subjects and in the disorders considered, and can thus be classified by contemporary machine learning algorithms.

Content-Based Video Retrieval (Satwik Hebbar):
Automatic content based schemes, as opposed to human endeavor, have become important as users attempt to organize massive data presented in the form of multimedia data such as home or movie videos. One important goal, be it in shot detection, or scene detection, or compression is the ability to find the foreground pixels. This higher level task benefits from a graph-based description of the video. The normalized cut framework is appealing because it looks at the video from a global perspective. Unfortunately due to quadratic storage and time complexity, the algorithm appears to be infeasible to use on large videos. In this work, we combine a local approach that promises a good segmentation [1] with the normalized cut approach [2] and make graph based schemes tractable.
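To make the normalized cut criterion and its quadratic storage cost concrete, the sketch below evaluates the criterion for a given two-way partition of a small graph. It assumes the usual definition Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V) over a symmetric affinity matrix; the function name and the toy weights are hypothetical. The affinity matrix has one entry per pair of nodes, which is why storage grows quadratically with the number of pixels.

```python
def normalized_cut(weights, part_a):
    """Normalized cut value of the partition (part_a, rest) of a weighted graph.

    weights : symmetric N x N affinity matrix as nested lists
    part_a  : iterable of node indices forming one side of the partition
    """
    n = len(weights)
    a = set(part_a)
    b = set(range(n)) - a
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total connection from each side to every node in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in range(n))
    assoc_b = sum(weights[i][j] for i in b for j in range(n))
    return cut / assoc_a + cut / assoc_b


# Four nodes forming two tightly linked pairs joined by one weak edge.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
good = normalized_cut(W, {0, 1})  # separates the two pairs
bad = normalized_cut(W, {0})      # isolates a single node
```

Splitting along the weak edge gives a much smaller value than isolating one node, which is the behaviour that makes the criterion attractive as a global measure.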

Video Shot-Detection Using Learning Techniques (M. Nithya):
Video has become an interactive medium of communication in everyday life. The sheer volume of video makes it extremely difficult to browse through and find the required information. Without knowing about its content, it is difficult to search the video. Manually analyzing the contents and indexing them is time consuming. The apparent alternative is to detect some events in the video automatically. The first step in automating the system is shot detection, which breaks the massive volume of video into smaller chunks called shots. This project aims at identifying shots.
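For orientation, the sketch below shows a classic non-learning baseline for shot detection: compare intensity histograms of consecutive frames and report a boundary when they differ by more than a threshold. It is not the learning-based technique of this project; the frame format (nested lists of 8-bit intensities), the bin count and the threshold are all assumptions made for the example.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a frame given as rows of pixel values."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
    total = sum(counts)
    return [c / total for c in counts]


def shot_boundaries(frames, threshold=0.5):
    """Indices of frames that start a new shot, by histogram difference."""
    boundaries = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # L1 distance between normalized histograms lies in [0, 2].
        difference = sum(abs(p - c) for p, c in zip(previous, current))
        if difference > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


# Three dark frames followed by two bright frames.
dark = [[10, 12], [11, 9]]
bright = [[200, 210], [205, 220]]
cuts = shot_boundaries([dark, dark, dark, bright, bright])
```

A fixed threshold of this kind is sensitive to lighting changes and gradual transitions, which is one motivation for learning the decision rule instead.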

Object Recognition and Content Based Image Retrieval (Krishna Kumar Rai):
This work explores the task of searching images in small databases based on the user's interest. Given a query image, we try to find similar images. To do so, we need to analyze the visual content of the images. Here the term 'visual content' refers to the color, texture, shape, spatial layout or any other visual information that can be derived from the image. Such an approach is generally called content based image retrieval, or CBIR. CBIR has a wide range of applications, including remote sensing, art collections, photo archives and medical records. There are various ways to perform CBIR, including object recognition, region matching and metadata search. Object recognition is the most semantically rich way to search images, but it is also the most difficult to perform effectively. Although this work focuses mainly on object recognition, we have developed two techniques for CBIR. The first is based on recognizing objects in the images to find similar images, while the second is based on finding similar regions (or segments) in the images.

Committee Based Active Learning for CBIR (Brahm Kiran Singh):
Active Learning has been shown to be a very effective tool for enhancing retrieval results in text retrieval. In Content-Based Image Retrieval (CBIR) it is used more and more frequently, and very good results have been obtained.
This project is an extension of the Committee-Based Active Learning approach for CBIR systems. Though this project employs the approach using SVMs, the approach, like the traditional committee-based approach, is much more general and puts no restriction on the type of learning machines used. The aim is to employ the committee-based method of Active Learning to improve the performance of CBIR systems. This report discusses a modification of the classical committee-based approach in which the initial sample selection for training the learners is changed: the database is sampled in a way that is not entirely random.

Human Pose Extraction from Monocular Videos using Non-rigid Factorization (Appu Shaji and Sharat Chandran):
We focus on the problem of automatically extracting the 3D configuration of human poses from 2D image features tracked over a finite interval of time. This problem is highly non-linear in nature and confounds standard regression techniques. Our approach effectively marries a non-rigid factorization algorithm with prior statistical models learned from an archival motion capture database. We show that a stand-alone non-rigid factorization algorithm is highly unsuitable for this problem. However, when coupled with the learned statistical model in the form of a constrained non-linear programming method, it yields a substantially better solution.

Intelligent Video Capturing (Ashutosh Sahu):
Videoconferencing has proved to be an effective tool for interaction between geographically distributed work teams. It has supported the transmission and recording of meetings for many years. However, such recordings of meetings are often monotonous and tedious to watch. Quite often, just one camera is used to capture video. Without multiple views, the users at the other end may lack the visual information needed to understand the meeting in its full context. Moreover, the shot does not change often, unless managed by hired professionals. This motivates the design of an automatic meeting capture system that uses cost-effective equipment and unobtrusive tracking, for capturing videos both in real-time and offline environments. The camera control algorithm running the system controls shot selection and handles errors, in both cases. Unlike the real-time environment, the offline environment offers a lot of flexibility and scope to intelligently switch between various shots captured by various cameras. Audio feed has been the tried-and-tested method to detect speakers in the meeting. The goal is to explore the extent to which vision techniques using training and machine learning concepts can be applied for the purpose of detection of speakers.

Motion Segmentation (Abhishek Ranjan):
Motion segmentation is a video analysis technique which aims at identifying and separating the most prominently moving groups in a video. This technique has been used for solving various computer vision problems. Several approaches have been proposed to perform motion segmentation. An essential part of many of these approaches is the analysis of the frames of the video. Since these frames are images, image segmentation plays an important role in motion segmentation. In this project we have studied some of these approaches for motion and image segmentation. We concentrated on graph theoretic approaches to image segmentation.

Motion Factorisation by Geometrical Optimisation on SE3 Manifold (Appu Shaji and Sharat Chandran and David Suter):
We present a novel formulation of the popular factorisation-based solution for Structure from Motion. Since our measurement matrices are populated with incomplete and inaccurate data, SVD-based total least squares solutions are less than appropriate. Instead, we approach the problem as a non-linear unconstrained minimisation problem on the product manifold of the Special Euclidean Group ($SE_3$). The restriction of the domain of optimisation to the $SE_3$ product manifold not only implies that each intermediate solution is a plausible object motion, but also ensures better intrinsic stability for the minimisation algorithm.

Isometry-based Structure from Motion (K. P. Ashwin):
Structure from motion refers to the process of estimating the 3D structure of a moving object using image measurements taken over a period of time. This has been an active area of research for many years and is generally considered a hard problem. In this work, we propose a novel solution to the age-old structure from motion problem. We review this problem in the setting of Riemannian geometry, in which shapes are represented by points on the surface of a high dimensional space. We use isometric constraints on the shape to identify the most likely deformation of the model.

Structure From Motion in Isometry Shape Space (Chirag Patel):
Structure from Motion refers to the process of estimating the camera motion and the rigid or deforming three-dimensional structure from image measurements taken over a period of time. The problem is one of the most difficult in the field of computer vision, and many researchers have worked on it. In this project, we introduce a novel framework to solve the problem. We reduce the Structure from Motion problem to an optimization problem in a shape space with a Riemannian metric that discourages non-isometric deformations. The proposed approach is expected to work better with most of the deforming objects we see in our surroundings.

Skeleton-based Pose Estimation of Human Figures (Lakulish Antani):
Pose estimation of human figures is a challenging open problem. Model-based approaches, which can incorporate prior knowledge of the structure and appearance of human figures in various poses, are the most promising line of research in this area. Most models of the human body represent it in terms of a tree of interconnected parts. Given such a model, two broad classes of pose estimation algorithms exist: top-down and bottom-up. Top-down algorithms locate body parts in a top-down order with respect to the tree of parts, performing a structure-guided search. Bottom-up algorithms, on the other hand, first look for potential parts irrespective of which parts they may be (usually based on local image properties such as edges), and then assemble a human figure using a subset of these candidate parts. Both approaches have pros and cons, and there are compelling reasons to develop a hybrid approach.
We describe a model-based pose estimation algorithm that combines top-down and bottom-up approaches in a simple manner. We describe a bottom-up part detector based on the skeleton transform, and a skeleton computation pipeline that uses existing algorithms for computing a pruned skeleton transform of any image. We describe a top-down pose estimation algorithm based on pictorial structures which we combine with the skeleton-based part detector. We also describe a way of imposing a structure on the space of candidate parts by computing a hierarchy between skeleton fragments, and use this structure to facilitate pose estimation. We compare our pose estimation algorithm with the classic pictorial structures algorithm, and compare our skeleton-based part detector with another edge-based part detector, and provide ideas for improving our method.

3D File-format Converter (Veerendra Singh):
It converts the standard OpenFlight file format to the Coin3D file format. My role in the project was to import the Degree-Of-Freedom feature from the OpenFlight format to the Coin3D format.

Face Detection and Localisation (Abhineet Sawa):
Automatic recognition of human faces is a significant problem. A complete face processing system should accomplish the following tasks. For an arbitrary picture, determine whether it contains any faces. If so, determine the number of faces as well as their position and size. Identify a person from his/her face. Describe the facial expression (smile, surprise and so on). Make a description of each face. Find a certain face according to a given description. A first step in any face processing system is face detection. Detection from a single image is a challenging task because of variability in scale, location, orientation, pose, facial expression, occlusion, and lighting conditions.

Building Vision Based Interaction Systems (Nekhil Agrawal):
The main aim of this project is to build a virtual model and enable multiple users to interact with the model, using concepts of computer vision and computer graphics. The development of the project can be divided into two major portions.
1. When the user presses the trigger of the camera mounted on his gun, we use camera calibration techniques to figure out where the user is located in the room and the direction of the hit.
2. The corresponding location and orientation in the model, and the point of hit, are then calculated.