Title: The Open Data Indic Challenge
Ms. Alolita Sharma, Wikipedia
Date & Time: May 30, 2014 14:00
Venue: Conference Room, C Block, 1st Floor, Department of Computer Science and Engineering, Kanwal Rekhi (KReSIT) Building
India has 22 official languages, each with millions of native speakers, as well as many smaller languages, written in 8 major scripts with distinct vocabularies and grammatical rules. But on Wikipedia and other large repositories of user-generated content on the Web, all these languages are very poorly represented compared to their real-world usage. Many factors contribute to this gap, ranging from limited access to the Internet and computing devices to a lack of language tools such as fonts and input tools, and of linguistic resources such as dictionaries, terminology glossaries, and structured and linked data. As the Web transforms itself into the mobile Web, there is an enormous opportunity to make information in Indic languages accessible to more than a billion people. To make this opportunity real, user-generated content needs to be jumpstarted with open data of all categories. Non-digital dictionaries, terminology glossaries, geo-data, and other kinds of categorized data need to be transformed into digitally consumable, standardized, and structured open data repositories that users can draw on to create and contribute digital content on platforms like Wikipedia. This talk will examine the barriers to creating open data repositories for Indic languages, and possible solutions that leverage platforms like Wikipedia and language technologies for reading and creating content on the Web and mobile Web.
Speaker Profile:
Alolita Sharma is Director of Engineering for Internationalization and Localization at Wikipedia. She is driving the initiative for Wikipedia to build open source tools and technologies to support hundreds of languages. She is an engineering manager and software engineer who has worked with open source software and promoted its adoption for more than a decade. She is on the board of the Software Freedom Law Center and is a passionate advocate of open source and the open Web. She holds Bachelor's and Master's degrees in Computer Science and speaks internationally on the multilingual Web, language technologies and standards, open source trends, women in technology, and building successful developer communities.
