CS 632: Advanced DBMS

S. Sudarshan

Spring 2009  

Previous offerings: 2007, 2006, 2004, 2003, 2002, 2001, 2000, 1999.

About The Course

Reading material will consist primarily of research papers. Every student will present a research paper of their choice, either from the list below or another paper subject to the instructor's approval. There will also be two exams (midsem/endsem) and a course project.
Textbook (for background material only): Database System Concepts, 5th Ed. Avi Silberschatz, Hank Korth, and S. Sudarshan. McGraw Hill, 2005. (Slides for all book chapters)

The list of papers will get refined as we go along in 2009.

    Query Optimization

  1. Rule-Based Query Optimization using the Volcano Framework.
    Chapter 2 from Multiquery Optimization and Applications,
    Prasan Roy, PhD thesis, IIT Bombay, 2000. ppt
    (Jan 5, 2009)
  2. Efficient and Extensible Algorithms for Multi-Query Optimization,
    Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobhe,
    In ACM SIGMOD Conf. on the Management of Data, 2000. ppt
    (Jan 12, 2009)
  3. Rewriting Procedures for Batched Bindings
    Ravindra Guravannavar and S. Sudarshan, VLDB 2008
    Talk by Ravi G. (ppt)(Jan 15, 2009)

    Extra papers, not included for course.

  4. Query Optimization Over Web Services
    Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom and Rajeev Motwani, SIGMOD 06
    ppt (Jan 19, 2009)
    Tech Report containing proofs
  5. Execution strategies for SQL subqueries
    Mostafa Elhemali, César A. Galindo-Legaria, Torsten Grabs, Milind Joshi
    SIGMOD Conference 2007: 993-1004
    Talk from SIGMOD 07 (ppt) Class lecture (ppt) (Jan 22, 2009)
  6. Query Processing for SQL Updates
    César A. Galindo-Legaria, Stefano Stefani, Florian Waas
    SIGMOD Conference 2004: 844-849
    talk (ppt) (29 Jan 2009)

    Adaptive Query Processing

  7. Eddies: Continuously Adaptive Query Processing,
    Avnur and Hellerstein, SIGMOD 2000.
    (Eddies (ppt), taken from http://web.cs.wpi.edu/~cs561/s05/talks/eddy-sigmod00-cs561.ppt) (Adaptive Query Processing using Eddies (ppt) by Amol Deshpande) (Feb 2, 2009)
  8. Robust Query Processing through Progressive Optimization,
    Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman, Hamid Pirahesh, SIGMOD 2004: 659-670
    ppt (Feb 5, 2009)

    IR and DB

  9. Keyword Searching and Browsing in Databases using BANKS
    Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and S. Sudarshan, ICDE 2002
    (Long talk by Sudarshan, ppt), (Short talk by Ramdas, pdf) (9 Feb 2009) (Extra papers: Bidirectional Search (VLDB 2005) and Sphere Search (VLDB 2005))
  10. Efficient Computation of Diverse Query Results
    Erik Vee, Utkarsh Srivastava, Jayavel Shanmugasundaram, Prashant Bhat, Sihem Amer-Yahia, ICDE 2008
    (Erik Vee's ICDE talk ppt, and modified version for CS632 ppt) (12 Feb 2009)

    Week of 16-20 March: Midsemester Exam

    Massively Parallel Database/Storage Systems

  11. Background reading: The parallel database chapter and the distributed database chapter from DB Concepts.
    Slides: Chapter 21: Parallel Databases, and Chapter 22: Distributed Databases (plus 3PC, not available on book site)
  12. Bigtable: A Distributed Storage System for Structured Data
    Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, OSDI 06
    Video of talk by Jeff Dean: Local mp4 copy OR on video.google.com
    Class Presentation (3 and 5 March 2009)
  13. PNUTS: Yahoo!'s Hosted Data Serving Platform,
    Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni.
    VLDB (industry track) 2008.
    VLDB Talk by Brian Cooper (ppt)
    Related paper, not required reading
  14. Pig Latin: A Not-So-Foreign Language for Data Processing
    Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar and Andrew Tomkins
    SIGMOD 2008
    Talk (ppt) by Sandeep Patidar (March 9, 2009)

    Peer to Peer Systems

  15. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications,
    I. Stoica, R. Morris, D. Karger, M. Frans Kaashoek, H. Balakrishnan,
    In Proc. ACM SIGCOMM 2001. Expanded version appears in IEEE/ACM Trans. Networking, 11(1), February 2003.
    P2P overview talk (in pdf) (March 5, 2009)

    Extra papers (not assigned reading):

    OLAP

  16. OLAP Over Uncertain and Imprecise Data
    Douglas Burdick, Prasad Deshpande, T. S. Jayram, Raghu Ramakrishnan and Shivakumar Vaithyanathan, VLDB 2005
    (Talk: OLAP basics (pdf) and OLAP on uncertain/imprecise data (pdf))
    Talk odp and ppt by Parakram Majumdar (March 12, 2009)

    Data Dissemination

  17. An Efficient and Resilient Approach to Filtering and Disseminating Streaming Data,
    Shetal Shah, Shyamshankar Dharmarajan and Krithi Ramamritham, VLDB 2003
    (Dynamic Data Dissemination Talk ppt by Krithi Ramamritham in 2007)
    Talk (in ppt) by Shetal Shah (from 2006)
    Talk by Ajinkya Joshi (March 16, 2009)

    XML Query Processing

  18. Structural Joins: A Primitive for Efficient XML Query Pattern Matching,
    D. Srivastava, S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, Y. Wu, ICDE 2002.
    Talk by Parag Abhyankar (19 March 2009)
    Extra paper, not required reading:
    (No class on 23 March 2009)
  19. Fast Computation of Database Operations using Graphics Processors
    Naga K. Govindaraju, Brandon Lloyd, Wei Wang, Ming C. Lin, Dinesh Manocha
    SIGMOD Conference 2004: 215-226
    Talk by Mahendra Chavan (26 March 2009)
    Extra paper, not required reading:

    Database Testing

  20. Automating the Detection of Snapshot Isolation Anomalies
    Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, S. Sudarshan
    VLDB 2007: 1263-1274
    Talk by Ajitav Sahoo (30 March 2009)
    Related papers:
  21. Massive Stochastic Testing of SQL
    Donald R. Slutz
    VLDB 1998: 618-622
    Talk by Manan Shah (1 April 2009)

    Data Storage

  22. Column-stores vs. row-stores: how different are they really?
    Daniel J. Abadi, Samuel Madden, Nabil Hachem
    SIGMOD Conference 2008: 967-980
    Talk by Bikmal Harikrishna (2 April 2009)

    Security and Privacy

  23. Redundancy and Information Leakage in Fine-Grained Access Control,
    Govind Kabra, Ravishankar Ramamurthy and S. Sudarshan
    Talk by Adil Sandalwala (6 April 2009)
    Also: SIGMOD Talk, Overview of Database Security, and an Overview of Fine-Grained Authorization


    Other interesting papers on privacy, not covered this year:

    Data Streams

  24. NiagaraCQ: A Scalable Continuous Query System for Internet Databases
    Chen, DeWitt, Tian and Wang, SIGMOD 2000
    Talk by K. Naresh (8 April 2009)

    Extra paper, not required reading

    Uncertain and Probabilistic Data

  25. Efficient query evaluation on probabilistic databases.
    N. Dalvi and D. Suciu, VLDB 2004
    Talk by Veeranjaneyulu Sadhanala (9 April 2009)
    NOTE: Read Section 4.3 from the journal version below. The conference version has a mistake that is corrected in the journal version.
    Extra paper (not required reading)
  26. Current research directions in data management: A discussion (9 April 2009)