The problem of recovering the shape and motion of 3D deformable objects from monocular video sequences (popularly called structure from motion) is extremely challenging. Integrating ideas from diverse fields such as differential geometry, machine learning, and non-linear and global optimisation theory, we have proposed novel optimisation algorithms for solving the structure from motion problem.
We use variations of the elegant Motion Factorisation framework to solve the structure from motion problem. Since our problem context is more practical and challenging, finding the correct factorisation is difficult with existing methods. The prevalent closed-form solutions are known to be ill-posed and seldom work well on challenging datasets. We overcome this difficulty using a special iterative non-linear optimisation scheme.
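To make the factorisation idea concrete, here is a minimal sketch of the classical starting point: the 2D point tracks are stacked into a measurement matrix W (two rows per frame, one column per tracked point), and a truncated SVD splits it into a motion factor and a shape factor. This is the textbook low-rank baseline, not the iterative scheme described above. The function name `factorise_rank_k`, the orthographic camera assumption, and the choice of rank are assumptions made for illustration; a rigid scene gives rank 3, and a deformable shape modelled with K basis shapes is usually taken to give rank 3K.

```python
import numpy as np


def factorise_rank_k(W, k=3):
    """Split a 2F x P measurement matrix into motion (2F x k) and shape (k x P).

    Hypothetical helper: assumes orthographic projection and complete tracks.
    """
    # Subtract each row's mean to remove the per-frame 2D translation.
    t = W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)

    # Keep the k largest singular values and share them between both factors.
    root = np.sqrt(s[:k])
    M = U[:, :k] * root           # motion factor, 2F x k
    S = root[:, None] * Vt[:k]    # shape factor, k x P
    return M, S, t
```

The factors returned here are only defined up to an invertible k x k matrix Q, since (M Q)(Q^-1 S) reproduces the same measurements. Choosing Q so that the motion factor contains valid camera rotations is the step where closed-form approaches tend to struggle and where non-linear refinement is typically applied.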
Summarised below are a few of the advances in optimisation techniques for the motion factorisation problem that we were able to achieve.
The majority of the optimisation techniques found in the computer vision literature are blind to the geometric properties of the underlying data or of the process that generates the data. For example, in the problem we are handling, the parameter space consists of geometric objects such as rotation matrices, which have a non-Euclidean structure. The classical optimisation routines were designed with computational convenience in mind and do not explicitly handle the geometric constraints placed on the data or the parameter space. Recent work on numerical techniques has shown how the geometric properties of the data and the parameters can be preserved without hampering the performance of the optimiser. In our work we show how such methods can be adapted for motion factorisation problems. The key observation is that the geometric parameters reside on manifolds whose analytical structure is well understood. Moreover, these analytical structures can be used to preserve the geometric properties during the optimisation.
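As an illustration of what optimising on the manifold means in practice, the sketch below runs gradient descent directly on the rotation group SO(3) for a toy alignment problem: find the rotation R that minimises 0.5 * ||R A - B||^2 for two 3 x N point sets. It is not the algorithm used in this work. The function names, the toy cost, and the step-size rule are all assumptions chosen for the example. The point it shows is that each update is a multiplication by an exact rotation, so the iterate never leaves the manifold and no re-orthogonalisation step is needed.

```python
import numpy as np


def skew_part(X):
    """Skew-symmetric part of X (projects onto the tangent space of SO(3))."""
    return 0.5 * (X - X.T)


def exp_so3(Omega):
    """Matrix exponential of a 3x3 skew-symmetric matrix (Rodrigues' formula)."""
    w = np.array([Omega[2, 1], Omega[0, 2], Omega[1, 0]])
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + Omega  # first-order approximation for tiny angles
    return (np.eye(3)
            + (np.sin(theta) / theta) * Omega
            + ((1.0 - np.cos(theta)) / theta**2) * (Omega @ Omega))


def fit_rotation(A, B, steps=500):
    """Minimise 0.5 * ||R A - B||_F^2 over rotations R, starting from identity."""
    R = np.eye(3)
    # Conservative fixed step (an assumption, not a tuned value).
    step = 1.0 / np.sum(A * A)
    for _ in range(steps):
        G = (R @ A - B) @ A.T        # ordinary Euclidean gradient
        Omega = skew_part(R.T @ G)   # gradient expressed in the tangent space
        R = R @ exp_so3(-step * Omega)  # move along the manifold
    return R
```

A Euclidean optimiser would instead update R by subtracting a multiple of G and then have to project the result back onto the set of rotations, or carry the orthogonality constraints explicitly. Here the constraint is satisfied by construction at every iteration.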
This video sequence shows the 3D reconstruction we obtain for a synthetic dataset consisting of a 3D animation of a shark. The top row shows the actual 2D input point tracks, whereas the bottom row plots our reconstruction juxtaposed with the ground truth. The scene is viewed from a camera directly above the shark.
3D reconstruction for a motion capture dataset. The top row shows the actual 2D input point tracks, the middle row shows the reconstructed shape rendered from a novel viewpoint, and the bottom row plots our reconstruction juxtaposed with the ground truth. The scene is viewed from a camera placed directly above the head.
Rendering of the recovered 3D pose from novel viewpoints. The top row shows the raw frames with features overlaid. The middle and bottom rows show the recovered 3D pose rendered from two novel viewpoints. The front view is identical and not shown.
