Video Analytics and Surveillance software for real-time analysis (intrusion, loitering, tracking) and post-mortem analysis (video search, summarization)
CORDS (COResets and Data Subset selection), DISTIL (Deep dIverSified inTeractIve Learning), SPEAR (Semi-suPervisEd dAta pRogramming), SUBMODLIB (SUBMODular optimization LIBrary)
Optical Character Recognition (OCR) and Scene text Recognition for Indian languages and Indian context
Video analytics software for security applications, covering real-time analysis such as intrusion detection, loitering detection and tracking (codenamed SurakshaVyuha, now being productized by SrivisifAI as 3rdAI) and post-mortem analysis: video search (Jigyasa) and summarization (VideoSummy and VISIOCITY). These are ongoing projects, initiated by Prof. Ganesh Ramakrishnan in 2016 and incorporated in 2017 into the National Centre of Excellence in Technology for Internal Security (NCETIS). NCETIS has been actively facilitating and promoting the work to the Indian Navy, the Indian Army and several state police forces. We also feature below video analytics for compliance and quality monitoring (Drishti), a project with the Ministry of Rural Development, Government of India.
Now being productized as 3rdAI, this CCTV Video Analytics solution includes real-time intrusion detection, perimeter monitoring, loitering detection, object tracking, etc.
Credentials to use the software can be provided upon acknowledgement/request.
Jigyasa is a Video Repository Indexing and Search platform with features like text search, face search, etc.
Credentials to use the software can be provided upon acknowledgement/request.
VideoSummy can condense hours’ worth of video into a couple of minutes by preserving key events and vignettes from your original video and removing repetitive visual information. Read in detail about our VISIOCITY dataset, benchmark algorithms and evaluations for video summarization across domains such as surveillance, TV shows, sports, education and events (birthdays and weddings), and the different facets of summarization therein.
Drishti is an ongoing project with the Ministry of Rural Development (MoRD), Government of India, to provide an automated compliance and quality monitoring solution for their DDU-GKY skill development scheme (Deen Dayal Upadhyaya Grameen Kaushalya Yojana). It supports analytics such as instructor face recognition, student count, uniform compliance and punctuality detection on classroom videos.
Software and libraries for data-efficient machine learning.
CORDS (COResets and Data Subset selection)
DISTIL (Deep dIverSified inTeractIve Learning)
SPEAR (Semi-suPervisEd dAta pRogramming)
SUBMODLIB (SUBMODular optimization LIBrary)
Visit the project page https://decile.org
Optical Character Recognition (OCR) and Scene text Recognition for Indian languages and Indian context
End-to-end framework for error detection and correction in Indic OCR. The system takes a PDF file of a book as input and obtains the OCR output using IndSenz OCR and Google OCR. It then corrects the OCRed text and provides suggestions for words that probably contain OCR mistakes, so that the user can correct any remaining errors.
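As a rough illustration of how words that "probably have mistakes" might be surfaced, the sketch below flags positions where two OCR outputs of the same page disagree. This is only an assumption about one plausible signal, not a description of the actual system. The function name `flag_disagreements` and the sample strings are hypothetical, and the two input strings stand in for the outputs of any two OCR engines.

```python
from difflib import SequenceMatcher


def flag_disagreements(ocr_a: str, ocr_b: str):
    """Return (words_from_a, words_from_b) pairs where the two outputs differ.

    Hypothetical helper: disagreement between two OCR engines is used here
    as a cheap hint that a word may have been misrecognized.
    """
    words_a, words_b = ocr_a.split(), ocr_b.split()
    # Align the two word sequences; autojunk is disabled so that frequent
    # words are not silently ignored on long pages.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # replace, insert or delete
            flagged.append((words_a[i1:i2], words_b[j1:j2]))
    return flagged


if __name__ == "__main__":
    for a_words, b_words in flag_disagreements(
        "the quick brown fox", "the quick brawn fox"
    ):
        print("check:", a_words, "vs", b_words)
```

Each flagged pair could then be shown to the user as a candidate for manual correction, with both readings offered as suggestions.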
Where Human-Like Intelligence Meets Drone Technology
Our Behavior Analysis system is an AI-driven solution that detects and interprets human behaviors in real time. Designed for drone-based surveillance, it efficiently monitors vast or hard-to-reach areas and identifies complex anomalies such as violence and loitering. Powered by MiPhi Semiconductors chips, the system runs heavy behavior analysis models on lower-end GPUs such as the RTX A6000, achieving faster inference through an optimized compute–time trade-off. These chips enable real-time processing and threat detection without the need for extensive high-end hardware resources.
Building on this foundation, our system incorporates Open-World Video Anomaly Detection (VAD) to handle unseen or evolving behaviors, ensuring adaptability in dynamic environments. By integrating machine learning and vision-language models, it continuously learns from live feeds, generates real-time alerts, and minimizes false positives for reliable situational awareness.
Our project focuses on enhancing drone usability through intelligent path planning to address issues such as signal loss during long-range flights. We are reviewing existing path planning algorithms, which were traditionally designed for 2D grids, and improving them to function efficiently in 3D environments. The goal is to enable future integration with machine learning models for autonomous flight control. Additionally, we are optimizing these algorithms for deployment on edge devices to ensure real-time performance and scalability.
Our current work focuses mostly on a single agent and forms the foundation for this area, but we already have a roadmap for extending it to multi-agent environments. Key research topics include dynamic formation maintenance for swarms of drones and collaborative behaviour specialized to the task at hand.
The goal of this line of research is to build drones that not only have specialized roles but can also work together to perform more general roles.
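To make the step from 2D grids to 3D concrete, here is a minimal sketch of A* search on a 3D occupancy grid. It is an illustration under stated assumptions, not the project's planner: the grid representation (`grid[x][y][z] == 1` for an obstacle), the 26-connected neighbourhood, the Euclidean step cost and heuristic, and the names `astar_3d` and `euclid` are all hypothetical choices.

```python
import heapq
from itertools import product

# 26-connected neighbourhood: every offset in {-1, 0, 1}^3 except (0, 0, 0).
MOVES = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def euclid(a, b):
    """Straight-line distance between two grid cells."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def astar_3d(grid, start, goal):
    """Return a list of cells from start to goal, or None if unreachable."""
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    open_heap = [(euclid(start, goal), 0.0, start)]  # (f, g, cell)
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        for move in MOVES:
            nxt = tuple(c + m for c, m in zip(cell, move))
            if not all(0 <= n < d for n, d in zip(nxt, dims)):
                continue  # outside the grid
            if grid[nxt[0]][nxt[1]][nxt[2]]:
                continue  # obstacle
            new_cost = cost + euclid(cell, nxt)
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(
                    open_heap, (new_cost + euclid(nxt, goal), new_cost, nxt)
                )
    return None


if __name__ == "__main__":
    empty = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    print(astar_3d(empty, (0, 0, 0), (2, 2, 2)))
```

Compared with the 2D case, only the neighbourhood and the distance function change, but the branching factor grows from 8 to 26, which is one reason edge deployment needs care. This sketch also lets diagonal moves cut past obstacle corners; a real planner would likely add a clearance check.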
To enhance the decision-making capabilities of personnel monitoring drone feeds, whether live through mobile ground control stations or during post-processing of stored footage, it is crucial to develop models that can generate concise summaries and extract answers to natural language questions. Video summarization plays a vital role in efficiently processing large volumes of visual data, ensuring that only the most relevant information is retained. A particularly effective approach involves leveraging submodular functions, which provide a lightweight framework for selecting representative frames while maintaining diversity and coverage. Video Question Answering (Video QA) can be effectively handled by Vision-Language Models (VLMs), which can be further enhanced using subset selection. We are also working on adapting these models to improve drone-based surveillance and monitoring.
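The idea of using a submodular function to pick representative frames can be sketched with the facility location function, f(S) = sum over frames i of the maximum similarity between i and any selected frame in S, maximized greedily under a frame budget. This is a minimal pure-Python illustration, not the project's implementation: the per-frame feature vectors, the cosine similarity, the clipping of negative similarities to zero, and the names `select_frames` and `cosine` are assumptions made for the example.

```python
def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0


def select_frames(features, budget):
    """Greedily pick up to `budget` frame indices maximizing facility location."""
    n = len(features)
    sim = [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]
    selected = []
    # coverage[i] = best similarity of frame i to any frame chosen so far.
    # Starting at 0.0 effectively clips negative similarities to zero.
    coverage = [0.0] * n
    for _ in range(min(budget, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much adding frame j improves total coverage.
            gain = sum(max(sim[i][j] - coverage[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        coverage = [max(coverage[i], sim[i][best_j]) for i in range(n)]
    return selected


if __name__ == "__main__":
    # Two near-duplicate "scenes": frames 0-1 and frames 2-3.
    frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(select_frames(frames, budget=2))
```

Because the marginal gain of a frame shrinks once a similar frame is already in the summary, the greedy loop naturally favours one frame per scene, which is the diversity and coverage behaviour described above. The same selection routine could in principle be used to choose which frames to pass to a VLM for Video QA.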
Selected publications related to the projects.
In Proceedings of The 11th IEEE Winter Conference on Applications of Computer Vision (WACV 2023).
Accepted paper at the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022).
Accepted paper at the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi (Demo paper).
In Proceedings of The 36th AAAI Conference on Artificial Intelligence (AAAI 2022).
In Proceedings of The 35th AAAI Conference on Artificial Intelligence (AAAI 2021).
Accepted paper at the 38th International Conference on Machine Learning (ICML 2021).
Accepted paper at the 38th International Conference on Machine Learning (ICML 2021).
Accepted paper at the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021 Findings).
Accepted at The 29th ACM International Conference on Multimedia (ACM MM 2021).
Accepted at ICDAR 2021 Workshop on Camera-Based Document Analysis and Recognition (CBDAR 2021, 9th edition).
Accepted paper at The 44th International ACM Conference on Research and Development in Information Retrieval (SIGIR), Resource Track, 2021.
In Proceedings of The 9th IEEE Winter Conference on Applications of Computer Vision (WACV 2021).
In Proceedings of The 28th ACM International Conference on Multimedia (ACM MM 2020), Seattle, USA.
Accepted paper at the 21st INTERSPEECH Conference (Interspeech 2020), Shanghai, China.
Accepted paper at the 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA.
Accepted paper at the 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA.
Accepted paper at the 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA.
Accepted paper at the 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA.
Best demo paper award. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, CoDS-COMAD '19, Kolkata, India.
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA.
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA.
International Conference on Document Analysis and Recognition (ICDAR) 2017, Kyoto, Japan.
1st International Workshop on Open Services and Tools for Document Analysis (ICDAR-OST) 2017, Kyoto, Japan.
Proceedings of the 17th World Sanskrit Conference, Vancouver, 2018.
Research and Innovation Symposium in Computing (RISC) 2017 (Most Admiring Poster Presentation Award), IIT-Bombay, India.
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Beijing, China, July 2015.
Featured on: Digital Terminal, Tech Gig, CIOL, MSN, LinkedIn
“This collaboration is focused on accelerating research, development, and commercialisation of AI-powered hardware and software solutions, with a key focus on advancing the efforts for AI-Powered Drone Innovation.”
Official Facebook Post by Govt. of India
Ganesh Ramakrishnan, Faculty, CSE (Principal Investigator); Rishabh Iyer, Faculty, UT Dallas (Collaborator)
Rohit Saluja (Research Scholar, Document Analytics); Vishal Kaushal (Research Scholar, Video Analytics)
Vikram Bansal, Senior Scientist; Ajoy Raj, Assistant Project Manager; Ramana Raja Budala, Project Research Engineer
Palak Oza, MDM (Project Staff); Om Wagh, MDM (Project Staff); Himanshu Patil, MDM (Project Staff)
Anurag Borkar, MDM (Research Student); Harsh Khurana, MDM (PhD Student); Saurbh Singh Jamwal, MDM (Research Student)
Pankaj Singh, Aify (Collaborator)
SrivisifAI Technologies Pvt. Ltd. (Industry Partner)
MiPhi Semiconductors Private Limited (Industry Partner)
For further details, or if you are interested in using the software, please get in touch.