COMAD 2005

Contributed Papers and Other Sessions

Preface

Conference Committee

Program Committee

Editors:

Jayant R. Haritsa

Indian Institute of Science, Bangalore, India

T.M. Vijayaraman

Persistent Systems Private Limited, Pune, India

No part of this publication may be reproduced in any form or by any means without prior written permission from CSI.

The opinions expressed and figures provided in the COMAD-2005 proceedings are the sole responsibility of the authors. The publishers and the editors bear no responsibility in this regard.

Contributed Papers

Session: Data Mining I

Learning to extract information from large websites using sequential models ................. 3

V.G.Vinod Vydiswaran, Sunita Sarawagi

Preference Queries with SV-Semantics............................................................................. 15

Werner Kießling

SynDECA: A Tool to Generate Synthetic Datasets for Evaluation of Clustering Algorithms ......................................................................................................................................... 27

Jhansi Rani Vennam, Soujanya Vadapalli

Session: Data Warehousing and Active Databases

Functional Dependency Driven Auxiliary Relation Selection for Materialized Views Maintenance..................................................................................................................... 37

Mukesh Mohania, P.Radha Krishna, K.V.N.N Pavan Kumar, Kamalakar Karlapalem, Millist Vincent

Partially Materialized Partitioned Views.......................................................................... 46

Satyanarayana R Valluri

Formalization and Detection of Events Using Interval-Based Semantics ......................... 58

Raman Adaikkalavan, Sharma Chakravarthy

Session: Streams and XML

A Temporal Foundation for Continuous Queries over Data Streams................................ 70

Jürgen Krämer, Bernhard Seeger

Estimating Missing Values in Related Sensor Data Streams.............................................. 83

Mihail Halatchev, Le Gruenwald

Efficient Handling of Sibling Axis in XPath...................................................................... 95

G.V.Subramanyam, P. Sreenivasa Kumar

Session: Applications I

TAP: A Platform for Enabling Enterprises to Develop Business Specific Text Analytic Applications...................................................................................................................... 103

Neeraj Agrawal, Scott Holmes, Sachindra Joshi, Sumit Negi

Model Driven Development of Content Management Applications................................... 112

Prasad M. Deshpande, Brendan McNichols, Michael Richmond, Savitha Srinivasan, Vladimir Zbarsky

Session: Specialized Indexing

An Architecture for Searching and Indexing Latex Equations in Scientific Literature...... 122

Ashish Lohia, Kirti Sinha, Soujanya Vadapalli, Kamalakar Karlapalem

Indexing Large Moving Objects from Past to Future with PCFI+-Index........................... 131

Zhao-Hong Liu, Xiao-Li Liu, Jun-Wei Ge, Hae-Young Bae

LWI and Safari: A New Index Structure and Query Model for Graph Databases ........... 138

Srinath Srinivasa, Martin Maier, Mandar R. Mutalikdesai, Gowrishankar K. A., Gopinath P. S.

Session: Data Mining II

Association Rules Mining Using Heavy Itemsets............................................................... 148

Girish K. Palshikar, Mandar S. Kale, Manoj M. Apte

Fast Frequent Pattern Mining in Real-Time...................................................................... 156

Rajanish Dass, Ambuj Mahanti

Session: Applications II

Database Access Design for E-Business - A Case Study................................................... 168

Santosh Dwivedi, Bernard Menezes, Ashish Singh

A Comparative Study of Mobile Agent and Client-Server Technologies in a Real Application......................................................................................................................................... 176

R. B. Patel, Kumkum Garg

Time Series Forecasting through Clustering - A Case Study............................................. 183

Vipul Kedia, Vamsidhar Thummala, Kamalakar Karlapalem

Other Sessions

Keynote Addresses.................................................................................................... 193

Invited Talks................................................................................................................. 195

Panels............................................................................................................................ 197

Tutorials........................................................................................................................ 199

Author Index................................................................................................................. 201

Keynote Addresses

Keynote 1

Title: Timber: A Native XML Database Management System

Speaker: H V Jagadish, University of Michigan, USA, and NUS, Singapore

Keynote 2

Title: Challenges of Information Integration

Speaker: C N Ram, President, HDFC Bank, India

Invited Talks

Invited Talk 1

Title: Tesla - On Demand Information Systems

Speaker: V. S. Batra, IBM India

Invited Talk 2

Title: Emerging Trends in OLAP

Speaker: Vaishnavi Sashikanth, Hyperion Solutions, USA

Invited Talk 3

Title: Small Device Data Management

Speaker: Rajkumar Sen, IIT Bombay, India

Invited Talk 4

Title: Privacy-Preserving Data Mining

Speaker: Shipra Agrawal, IISc, Bangalore, India

Panel Discussions

Panel 1: Core DB Research: Is It Relevant Anymore?

Traditionally, database research has concentrated on performance and scalability, and has assumed a DB-centric view of the world in which database systems were deployed as whole systems. The advent of the Internet has not only brought forward a user-centric view of the world but has also forced database and other systems to play the role of a component within a larger system or infrastructure. This has necessitated support for newer types of data structures (e.g., HTML/XML), their storage and processing, as well as schema-less databases and related topics such as information exploration.

In addition, the database community is addressing a number of new areas. These include mining (stream, text, web, …), stream data processing, information integration, the Semantic Web, and non-traditional information management (e.g., bio, environmental, …). We are also witnessing a convergence of sorts between information retrieval and database research.

This panel will try to sort out the role of database research as we see it today and the relevance of traditional techniques as well as the new techniques being pursued. Because the role of a DBMS as a component in a larger system is becoming equally important, the panel will address this aspect as well.

Panelists

Sharma Chakravarthy, University of Texas-Arlington (Moderator)

H V Jagadish, University of Michigan-Ann Arbor

V Govindarajan, Aztec Software, Bangalore

S Seshadri, Yahoo India, Bangalore

Panel 2: Database Research for Social Empowerment: Does It Exist?

This panel addresses the question of "socially relevant" "research" issues in database technology. Is database research influenced by social and cultural contexts? If so, what kinds of research questions exist in disadvantaged contexts? If not, does that mean the socio-cultural context has absolutely no relevance to database research? What drives research in databases? What would it take to build a vibrant database research setup in marginalized contexts?

These questions will be debated by a set of four panelists, who are known for their work in various areas of database research. They will take a stance on the larger question and give their views on whether it is pertinent or counterproductive to focus database research on specific socio-cultural contexts.

Panelists

Srinath Srinivasa, IIIT Bangalore

Vikram Pudi, IIIT Hyderabad

P Krishna Reddy, IIIT Hyderabad

Sunita Sarawagi, IIT Bombay

A Kumaran, IISc Bangalore

Tutorials

Tutorial 1

Querying and Mining Data Streams: You Only Get One Look

Rajeev Rastogi, Lucent Bell Labs, India

Tutorial 2

Approximate Query Processing Techniques

Gautam Das, University of Texas-Arlington, USA

Tutorial 3

The Continued Saga of DB-IR Integration

Ricardo Baeza-Yates, University of Chile

Tutorial 4

Web Information Retrieval

Krishna Bharat, Google India

Author Index

Adaikkalavan, Raman 58

Agrawal, Neeraj 103

Apte, Manoj M. 148

Bae, Hae-Young 131

Chakravarthy, Sharma 58

Dass, Rajanish 156

Deshpande, Prasad M. 112

Dwivedi, Santosh 168

Garg, Kumkum 176

Ge, Jun-Wei 131

Palshikar, Girish K. 148

Gopinath P. S. 138

Gowrishankar K. A. 138

Gruenwald, Le 83

Halatchev, Mihail 83

Holmes, Scott 103

Joshi, Sachindra 103

Kale, Mandar S. 148

Karlapalem, Kamalakar 37, 122, 183

Kedia, Vipul 183

Kießling, Werner 15

Krämer, Jürgen 70

Liu, Xiao-Li 131

Liu, Zhao-Hong 131

Lohia, Ashish 122

Mahanti, Ambuj 156

Maier, Martin 138

McNichols, Brendan 112

Menezes, Bernard 168

Mohania, Mukesh 37

Mutalikdesai, Mandar R. 138

Negi, Sumit 103

Patel, R. B. 176

Pavan Kumar, K.V.N.N 37

Radha Krishna, P. 37

Richmond, Michael 112

Sarawagi, Sunita 3

Seeger, Bernhard 70

Singh, Ashish 168

Sinha, Kirti 122

Sreenivasa Kumar, P. 95

Srinivasa, Srinath 138

Srinivasan, Savitha 112

Subramanyam, G.V. 95

Thummala, Vamsidhar 183

Vadapalli, Soujanya 27, 122

Valluri, Satyanarayana R 46

Vennam, Jhansi Rani 27

Vincent, Millist 37

Vinod Vydiswaran, V.G. 3

Zbarsky, Vladimir 112