"My father's wit, and my mother's tongue, assist me!" ~William Shakespeare

Hello, नमस्ते, Olá, สวัสดี, Bonjour 🤗

I am Sanjeev Kumar, a TCS Research Fellow and PhD student in the Department of Computer Science and Engineering (CSE) at IIT Bombay, working under the guidance of Prof. Preethi Jyothi and the late Prof. Pushpak Bhattacharyya. My research explores extremely low-resource Indian languages, primarily those of the Indian state of Bihar, such as Angika, Magahi, Bhojpuri, and Maithili, and investigates how well knowledge transfers from SOTA language models trained on large amounts of data. The aim is to leverage this knowledge to develop tools for extremely low-resource languages.
My research interests center on Machine Learning for Natural Language Processing (NLP). I am particularly interested in crafting NLP solutions tailored for extremely low-resource Indian languages. These solutions are intended to offer consistent support across diverse languages, domains, populations, and individuals.

Updates

  1. 📄 [Feb-2026] Preprint titled "Evaluating Extremely Low-Resource Machine Translation: A Comparative Study of ChrF++ and BLEU Metrics". Paper
  2. 🎓 [Feb-2026] Selected as a Student Volunteer at EACL 2026, Rabat, Morocco.
  3. 📄 [Jan-2026] Paper titled "SrcMix: Mixing of Related Source Languages Benefits Extremely Low-resource Machine Translation" accepted at EACL 2026 (Findings)! Paper, Code, Dataset
  4. ✈️🎓 [Dec-2025] Visiting Research Scholar at The University of Sheffield, UK, hosted by Prof. Nikos Aletras.
  5. 🌍 [Jun-2025] Awarded a partial scholarship to attend MLSS 2025, Dakar, Senegal. [Couldn't avail]
  6. 🌍 [Mar-2025] Awarded a scholarship to attend the Oxford Machine Learning Summer School (OxML'25) in London, UK. [Couldn't avail]
  7. 🎤 [Oct-2024] Attended the NVIDIA AI Summit in Mumbai, India, from October 23–25, 2024.
  8. ✈️ [Aug-2024] Attended the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) in Bangkok, Thailand.
  9. ✈️ [Jul-2024] Attended LxMLS'2024 in Lisbon, Portugal.
  10. 📄 [May-2024] Paper titled "Part-of-speech Tagging for Extremely Low-resource Indian Languages" accepted at ACL 2024 (Findings)! Paper, Poster, Code, Dataset
  11. 🌍 [May-2024] Selected to attend LxMLS'2024 in Lisbon, Portugal, from July 11th to July 17th.
  12. 🛺 [Dec-2023] Attended IndoML-2023 at IIT Bombay, India.
  13. 🚆 [Dec-2023] Attended ICON-2023 at University of Goa, India.
  14. 🗣️ [Dec-2023] Gave a talk at a seminar titled "Decoding Big Data: Unveiling Insights and Applications" at Pillai College of Engineering, Navi Mumbai, India. Slides
  15. 🎓 [Jul-2023] Selected for TCS Research Fellowships Cycle-17 (2023–2027) 🎉 Dept Spotlight
  16. 🎓 [Jul-2023] Officially a PhD candidate now! 🎉
  17. ✈️ [Feb-2023] Attended the ACM India annual event at OSIT Bhopal, India.