Learning 3D Human Pose from Structure and Motion
Abstract
3D human pose estimation from a single image is a challenging problem,
especially for in-the-wild settings due to the lack of 3D annotated data.
We propose two anatomically inspired loss functions and use them with a
weakly-supervised learning framework to jointly learn from large-scale
in-the-wild 2D and indoor/synthetic 3D data. We also present a simple
temporal network that exploits temporal and structural cues present in
predicted pose sequences to temporally harmonize the pose estimations.
We carefully analyze the proposed contributions through loss surface
visualizations and sensitivity analysis to facilitate deeper understanding
of their working mechanism. Jointly, the two networks capture the anatomical
constraints in static and kinetic states of the human body. Our complete
pipeline improves the state-of-the-art by 11.8% and 12% on Human3.6M and
MPI-INF-3DHP, respectively, and runs at 30 FPS on a commodity graphics card.
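
The abstract does not name the two anatomically inspired losses. As a hypothetical illustration of what such a loss can look like, the sketch below penalizes differences between left and right bone lengths in a predicted 3D pose. The toy 7-joint skeleton, the joint indices, and the function names are all assumptions made for this example and may not match the authors' formulation.

```python
import numpy as np

# Hypothetical joint indices for a toy 7-joint lower-body skeleton
# (an assumption for illustration, not the layout used by the authors).
PELVIS, L_HIP, L_KNEE, L_ANKLE, R_HIP, R_KNEE, R_ANKLE = range(7)

# Each entry pairs a left bone with its right counterpart;
# a bone is a (parent_joint, child_joint) tuple.
SYMMETRIC_BONES = [
    ((L_HIP, L_KNEE), (R_HIP, R_KNEE)),      # thighs
    ((L_KNEE, L_ANKLE), (R_KNEE, R_ANKLE)),  # shins
]


def bone_length(pose, bone):
    """Euclidean length of one bone in a (num_joints, 3) pose array."""
    parent, child = bone
    return np.linalg.norm(pose[child] - pose[parent])


def symmetry_loss(pose):
    """Mean absolute left/right bone-length difference (0 for a symmetric pose)."""
    diffs = [
        abs(bone_length(pose, left) - bone_length(pose, right))
        for left, right in SYMMETRIC_BONES
    ]
    return float(np.mean(diffs))
```

A term of this kind needs no 3D ground truth, so in principle it could also be evaluated on predictions for images that carry only 2D annotations; how the authors actually weight and combine their losses is not stated here.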
Materials for Download
Visual Results
Citation
Rishabh Dabral, Anurag Mundhada, Uday Kusupati, Safeer Afaque, Abhishek Sharma, Arjun Jain