CS 632: Advanced DBMS
(better titled perhaps as Advances in Data Based Systems)

Krithi Ramamritham and S. Sudarshan

Spring 2006  

Previous offerings: 2004, 2003, 2002, 2001, 2000, 1999.

This year's format will include paper presentations from all students taking the course, in addition to exam(s) and a project. Reading material will consist primarily of research papers.

Marks

Project Ideas

Seminar Ideas


Textbook (for background material only): Database System Concepts, 5th Ed. Avi Silberschatz, Hank Korth, and S. Sudarshan. McGraw Hill, 2005. (Slides for all book chapters)

    Serializability

    Guest lectures by Prof. Alan Fekete, Univ. Sydney
  1. Topics in Database Isolation: Lecture 1: Isolation Levels (Talk in pdf) (Homework problems) (Jan 3, 2006)
  2. Topics in Database Isolation: Lecture 2: Safe Use of Low Isolation (Talk in pdf). Based on: Allocating Isolation Levels to Transactions, by Alan Fekete, PODS 2005, and A Read-Only Transaction Anomaly Under Snapshot Isolation, by Alan Fekete, Elizabeth O'Neil, and Patrick O'Neil, SIGMOD Record 33(3), Sep 2004 (Jan 4, 2006)
  3. Lecture 3: Replica Management (Jan 6, 2006)

    Adaptive Query Processing

    Guest Lecture by Prof. Amol Deshpande, Univ. Maryland College Park
  4. Adaptive Query Processing with Eddies (ppt) (10 Jan 2006)
    Assigned reading: Eddies: Continuously Adaptive Query Processing, Avnur and Hellerstein, SIGMOD 2000.
    Also: Probabilistic Databases talk by Amol Deshpande on 12 Jan 2006 (not officially part of CS632)

    Pervasive Environments and Semantic Web

    Guest Lectures by Prof. Anupam Joshi, Univ. Maryland Baltimore County
  5. Managing Data (and Services) in Pervasive Environments (ppt) (Jan 12, 2006) (Related paper: On Data Management in Pervasive Computing Environments, by Filip Perich, Anupam Joshi, Timothy Finin, and Yelena Yesha, IEEE Trans. on Knowledge and Data Engg, VOL. 16, NO. 5, MAY 2004)
  6. A Gentle Introduction to the Semantic Web (aka the Researcher's Web 2.0) (ppt) (Jan 14, 2006)
    Also: Talks on Video Segmentation by Sharat Chandran (IIT Bombay) and Medical Image Mining by Arcot Sowmya (U. New South Wales), part of the IITB-UNSW workshop. (Jan 17, 2006)

    Query Processing and Optimization

  7. DBMS performance. A multi-dimensional challenge by Vadiraja Bhatt, Sr. Staff Software Engineer, Sybase Inc. Pune (Talk in ppt) (Jan 20, 2006)
  8. Rule-Based Query Optimization using the Volcano Framework.
    Chapter 2 from Multiquery Optimization and Applications, Prasan Roy, PhD thesis, 2000 (Jan 24, 2006)
  9. Materialized View Selection and Maintenance Using Multi-Query Optimization,
    Hoshi Mistry, Prasan Roy, S. Sudarshan and Krithi Ramamritham
    SIGMOD 2001
    PPT (Jan 27, 2006)
  10. Optimizing Nested Queries with Parameter Sort Orders, Ravindra Guravannavar, Ramanujam H.S., S. Sudarshan
    (Talk by Ravi G.) PPT

    Information Integration

    Guest lecture by Harrick Vin (TRDDC Pune and Univ. Texas Austin) on Conquering Complexity Through Managed Evolution; the talk describes activities and challenges involved in the Data-intensive Computing initiative, including an information integration and inference architecture being developed at TRDDC. (Jan 31, 2006)

    Pervasive Computing / Semantic Web

  11. On Data Management in Pervasive Computing Environments, Filip Perich, Anupam Joshi, Timothy Finin, and Yelena Yesha, IEEE Trans. on Knowledge and Data Engg, VOL. 16, NO. 5, MAY 2004 (Manish) (Tue 7 Feb 2006) PDF

    Adaptive Query Processing

  12. Content-Based Routing: Different Plans for Different Data, Pedro Bizarro, Shivnath Babu, David DeWitt, Jennifer Widom, VLDB 2005 (Saju Dominic) (Fri 10 Feb) PPT PDF

    Materialization and Caching

  13. Robust Query Processing through Progressive Optimization, Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman, Hamid Pirahesh, SIGMOD 2004: 659-670 (Raja Agrawal) (Fri 10 Feb) PPT PDF
  14. Optimizing Refresh of a Set of Materialized Views, Nathan Folkert, Abhinav Gupta, Andrew Witkowski, Sankar Subramanian, Srikanth Bellamkonda, Shrikanth Shankar, Tolga Bozkaya, Lei Sheng (Oracle, USA) VLDB 2005 (Sudhir Jorwekar) (Tue 15 Feb) Slides
  15. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server, Surajit Chaudhuri, Vivek R. Narasayya, VLDB 1997: 146-155 (Suresh Iyengar) (Friday 17 Feb) Slides PPT (This talk will also cover the following paper, which is NOT on the reading list for the course: Automated Selection of Materialized Views and Indexes in SQL Databases, Sanjay Agrawal, Surajit Chaudhuri, Vivek R. Narasayya, VLDB 2000: 496-505)
  16. Compressing SQL workloads, Surajit Chaudhuri, Ashish Kumar Gupta, Vivek R. Narasayya, SIGMOD Conference 2002: 488-499 (Amit Pathak) (Friday 17 Feb) Slides PS

    Midsemester Exam

    Wed Feb 22, 9.30-11.30 AM

    Data Dissemination/Streams

  17. An Efficient and Resilient Approach to Filtering and Disseminating Streaming Data, Shetal Shah, Shyamshankar Dharmarajan and Krithi Ramamritham, 29th VLDB Conference, September 9 - 12, 2003, Berlin, Germany, pp.57 - 68.
    Talk (in ppt) by Shetal Shah, Tue Feb 28
  18. Query Processing, Resource Management, and Approximation in a Data Stream Management System
    Motwani, Widom, Arasu, Babcock, Babu, Datar, Manku, Olston, Rosenstein and Varma, CIDR 2003
    (Fri Mar 3) (PODS 2002 talk by Motwani)

    Sensor Networks

  19. TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks,
    Samuel Madden, Michael J. Franklin, Joseph Hellerstein, and Wei Hong,
    Proceedings of the Fifth Symposium on Operating Systems Design and Implementation (OSDI ’02), December 9 - 11, 2002, Boston, MA, USA.

    Extra paper (not part of assigned reading material):
    Temporal In-network Aggregation in Sensor Networks (TiNA): TiNA: A scheme for temporal coherency-aware in-network aggregation, M. A. Sharaf, J. Beaver, A. Labrinidis, and P. K. Chrysanthis, In Proc. of MobiDE, 2003. A longer version is to appear in VLDB Journal: Balancing Energy Efficiency and Quality of Aggregate Data in Sensor Networks.
    Talk (in pdf) by Dhananjay Muli and Sandeep Satpal, Tuesday Mar 6 2006

    Peer to Peer Systems

  20. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications, I. Stoica, R. Morris, D. Karger, M. Frans Kaashoek, H. Balakrishnan, In Proc. ACM SIGCOMM Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication, 2001. Expanded version appears in IEEE/ACM Trans. Networking, 11(1), February 2003.

    Extra Paper (not assigned reading):
    A Scalable Content-Addressable Network, S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker, In Proc. ACM SIGCOMM Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication, 2001.
    Talk (in pdf) by Sandeep Shelke and Shrirang Shirodkar, Friday Mar 9 2006

    Hardware and Databases

  21. Making B+-trees Cache Conscious in Main Memory, Jun Rao and Kenneth A. Ross, SIGMOD 2000.
    (Extra papers, not required reading: Weaving Relations for Cache Performance, Anastassia Ailamaki, David J. DeWitt, Mark D. Hill and Marios Skounakis, VLDB 2001.
    Buffering Database Operations for Enhanced Instruction Cache Performance, Jingren Zhou and Kenneth A. Ross, SIGMOD 2004)
    Talk (pdf) by Unmesh and Kamlesh, Tue Mar 13, 2006

    XML Query Processing

  22. Structural Joins: A Primitive for Efficient XML Query Pattern Matching, D. Srivastava, S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, Y. Wu, ICDE 2002.
    (Extra paper, not required reading: ORDPATHs: Insert-Friendly XML Node Labels, Patrick E. O'Neil, Elizabeth J. O'Neil, Shankar Pal, Istvan Cseri, Gideon Schaller, Nigel Westbury, SIGMOD 2004: 903-908.)
    Talk (ppt) by Bhavana Dalvi and Uma Sawant, Fri Mar 16, 2006

    IR and DB

  23. Keyword Searching and Browsing in Databases using BANKS
    Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and S. Sudarshan, ICDE 2002
    Talk (ppt) by Gaurav Kumar Bijay and Esha Palta, Tue Mar 20, 2006 (Old talk: PPT)

    IR and XML

  24. XRANK: Ranked Keyword Search over XML Documents, L. Guo, F. Shao, C. Botev, J. Shanmugasundaram, SIGMOD 2003
    Extra Paper: The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents, by Jens Graupmann, Ralf Schenkel and Gerhard Weikum, VLDB 2005.
    Talk (ppt) by Meghana Kshirsagar and Nitin Gupta, Fri Mar 23, 2006

    Misc

  25. Postgres-R(SI): Combining Replica Control with Concurrency Control based on Snapshot Isolation, Shuqing Wu, Bettina Kemme, ICDE 2005: 422-433
    Talk (ppt) by (Rishiraj Gupta, Tue Mar 27, 2006)
  26. Estimating Progress of Execution of SQL Queries, Surajit Chaudhuri, Vivek Narasayya, and Ravishankar Ramamurthy, SIGMOD 2004.
    Talk (pdf) by (Santosh Kumar C, Tue Mar 27, 2006)

    Privacy

  27. Mondrian Multidimensional K-Anonymity, K. LeFevre, D. DeWitt, and R. Ramakrishnan, ICDE 2006
    Note: The following paper was also covered in detail, and is part of suggested reading, although you need not read it in full detail:
    Incognito: Efficient Full-Domain K-Anonymity, K. LeFevre, D. DeWitt, and R. Ramakrishnan, SIGMOD 2005.
    (Extra paper: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression, Pierangela Samarati and Latanya Sweeney, Procs. of the IEEE Symposium on Research in Security and Privacy, 1998.)
    Talk (pdf) by (Vibhooti Verma and Parul Halwe, Friday March 31)

    OLAP

  28. OLAP Over Uncertain and Imprecise Data, Douglas Burdick, Prasad Deshpande, T. S. Jayram, Raghu Ramakrishnan and Shivakumar Vaithyanathan, VLDB 2005
    Talk: OLAP basics (pdf) and OLAP on uncertain/imprecise data (pdf)
    by (Manuj and Jayesh (aka Kamalakar Chaudhuri), Tue April 4, 2006)

    XML and Java

    Note: Extra paper, familiarity with the talk is sufficient:
    XJ: Facilitating XML Processing in Java, Matthew Harren, Mukund Raghavachari, Oded Shmueli, Michael G. Burke, Rajesh Bordawekar and Igor Pechtchanski, WWW 2005.
    Talk (ppt): Guest lecture by Rajesh Bordawekar (IBM T.J. Watson)
    Ad for IBM Eclipse innovation programme (http://www.ibm.com/university/eclipseinnovation)

    Stream Processing

  29. Load Shedding for Aggregation Queries over Data Streams, Brian Babcock, Mayur Datar, Rajeev Motwani, ICDE 2004
    Talk: (pdf) (V. Mahesh Kumar, Apr 11, 2006)
    Extra Paper (general familiarity is sufficient)
    NiagaraCQ: A Scalable Continuous Query System for Internet Databases, Chen, DeWitt, Tian and Wang, SIGMOD 2000
    Talk: (pdf) (Narasimham, Apr 11, 2006)

    Wrap Up Session

    April 14, 2006. Note: General familiarity with the following paper/talk is sufficient.
    Query Caching and View Selection for XML Databases, Bhushan Mandhani and Dan Suciu, VLDB 2005
