S3VQA introduces a Select, Substitute, and Search (SSS) approach to open-domain visual question answering. S3 answers a VQA query by first reformulating the input question and then retrieving facts from an external knowledge source.
RUDDER contains videos that describe the creation of scientific toys from waste material. Existing datasets pair videos with their relevant sentences/captions in English, whereas RUDDER provides videos, sentences/captions, and audio as well.
We provide a detailed analysis of modality bias in the existing HAN architecture, where one modality is completely ignored during prediction. We also propose a variant of feature aggregation in HAN that leads to an absolute gain for the visual modality.
Jayaprakash A
IIT Bombay
Abhishek
IIT Bombay
Jatin Lamba
IIT Bombay
Mayank Kothyari
IIT Bombay
Rishabh Dabral
IIT Bombay
Preethi Jyothi
IIT Bombay
Ganesh Ramakrishnan
IIT Bombay