CSAW: Curating and Searching the Annotated Web
Our ambition is to annotate mentions of named entities and
quantities on billions of Web pages,
improve query analysis, and thus
enable searching with entities and relationships at
unprecedented quality and scale.
Papers and Talk Slides
- Neural Architecture
for Question Answering Using a Knowledge Graph and Web Corpus.
With Uma Sawant, Saurabh Garg, and Ganesh Ramakrishnan.
- Question Answering Using a Knowledge Graph and Web
Corpus. Uma Sawant, Soumen Chakrabarti, Ganesh Ramakrishnan.
SIGWEB Newsletter, 2018.
- Task-Specific Representation Learning for Web-scale Entity
Disambiguation. Rijula Kar, Susmija Reddy, Sourangshu Bhattacharya,
Anirban Dasgupta, Soumen Chakrabarti. AAAI 2018.
- Knowledge Graph and Corpus Driven Segmentation and
Answer Inference for Telegraphic Entity-seeking Queries.
Mandar Joshi, Uma Sawant and Soumen Chakrabarti.
- Quantity Queries on Web Tables: Annotation, Response and Consensus Models. Sunita Sarawagi and Soumen Chakrabarti. SIGKDD 2014.
- Joint Bootstrapping
of Corpus Annotations and Entity Types.
Siddhanth Jain, Hrushikesh Mohapatra and Soumen Chakrabarti. EMNLP 2013.
- Web-scale Entity Annotation Using MapReduce.
Shashank Gupta, Varun Chandramouli and Soumen Chakrabarti.
- Learning Joint Query Interpretation
and Response Ranking. Uma Sawant and Soumen Chakrabarti.
- Compressed Data Structures for Annotated
Web Search. Soumen Chakrabarti, Sasidhar Kasturi, Bharath Balakrishnan,
Ganesh Ramakrishnan, and Rohit Saraf. WWW 2012.
- Annotating and Searching Web Tables Using Entities, Types and
Relationships. By Girija Limaye, Sunita Sarawagi and Soumen
Chakrabarti. In VLDB
- Collective Annotation of Wikipedia Entities in Web Text,
by Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, and Soumen Chakrabarti,
in SIGKDD 2009. Talk slides.
- Learning to Rank for Quantity Consensus Queries,
by Somnath Banerjee, Soumen Chakrabarti and Ganesh Ramakrishnan,
in SIGIR 2009.
Demo, poster, press, etc.
- Web-scale Entity-Relation Search Architecture. By Devshree Sane,
Ganesh Ramakrishnan, Soumen Chakrabarti. Poster
in WWW 2011.
- Curating and Searching the Annotated Web,
by Amit Singh, Sayali Kulkarni, Somnath Banerjee,
Ganesh Ramakrishnan, and Soumen Chakrabarti, in SIGKDD 2009.
- Search market to
get another engine. Business Standard, Thursday, Aug 27, 2009. (Disclaimer: We are definitely not in the engine market.)
We have some ancient Java code on SVN that we can share on request.
More recent code is here:
Related projects, services, products, links
(In approximate order of recency) Soumen Chakrabarti,
Saurabh Garg, Uma Sawant, Ganesh Ramakrishnan,
Shashank Gupta, Siddhanth Jain, Hrushikesh Mohapatra, Sasidhar
Kasturi, Devshree Sane, Apoorv Sharma,
Amit Singh, Sayali Kulkarni, Somnath Banerjee.
Partly supported by grants from IBM, nVidia, Google, HP Labs, Yahoo,
Microsoft Research, NetApp and SAP.