CS 632: Advanced DBMS

S. Sudarshan

Spring 2010  

Previous offerings: 2009, 2007, 2006, 2004, 2003, 2002, 2001, 2000, 1999.

End paper and Midsem paper from 2009

About The Course

Reading material will consist primarily of research papers. All students will have to present a research paper of their choice, either from the list below or another paper subject to the instructor's approval. There will also be two exams (midsem and endsem) and a course project. Anyone who does an exceptional course project that has the potential to be a publishable paper is eligible for a straight AA grade. Otherwise the grading breakup will be: midsem 25, endsem 40, project 25, and seminar presentation 10.

Textbook (for background material only)

Database System Concepts, 6th Ed.
Avi Silberschatz, Hank Korth, and S. Sudarshan. McGraw Hill, 2010.
(book home page, Local copy of slides for all chapters)

The list of papers below is from 2009, and will get refined as we go along in 2010.

    Query Optimization

  1. Rule-Based Query Optimization using the Volcano Framework.
    Chapter 2 from Multiquery Optimization and Applications,
    Prasan Roy, PhD thesis, IIT Bombay, 2000. ppt
    (Jan 4, 2010)
  2. Efficient and Extensible Algorithms for Multi-Query Optimization,
    Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobhe,
    In ACM SIGMOD Conf. on the Management of Data, 2000. ppt
    (Jan 7, 2010)
  3. Rewriting Procedures for Batched Bindings
    Ravindra Guravannavar and S. Sudarshan, VLDB 2008
    Talk (ppt)(Jan 11, 2010)
    Related papers, not required reading:
  4. Execution strategies for SQL subqueries
    Mostafa Elhemali, Cesar A. Galindo-Legaria, Torsten Grabs, Milind Joshi
    SIGMOD Conference 2007: 993-1004
    Talk from SIGMOD 07 (ppt) Class lecture (ppt) (Jan 14, 2010)
  5. Query Processing for SQL Updates
    Cesar A. Galindo-Legaria, Stefano Stefani, Florian Waas
    SIGMOD Conference 2004: 844-849
    talk (ppt) (18 Jan 2010)

    Adaptive Query Processing

  6. Eddies: Continuously Adaptive Query Processing,
    Avnur and Hellerstein, SIGMOD 2000.
    (Eddies(ppt)) (taken from http://web.cs.wpi.edu/~cs561/s05/talks/eddy-sigmod00-cs561.ppt)
    (Adaptive Query Processing using Eddies (ppt) by Amol Deshpande) (Jan 21, 25, 2010)
  7. Robust Query Processing through Progressive Optimization,
    Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman, Hamid Pirahesh, SIGMOD 2004: 659-670
    PPT (Jan 25, 28, 2010)
  8. Scalable Join Processing on Very Large RDF Graphs
    Thomas Neumann and Gerhard Weikum, SIGMOD 2009 (Feb 4, 2010)
    (talk on basic rdf3x, talk on scalable join proc)

    IR and DB

  9. Keyword Searching and Browsing in Databases using BANKS
    Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and S. Sudarshan, ICDE 2002
    (Long talk by Sudarshan, ppt), (Short talk by Ramdas, pdf) (Feb 8 and 11, 2010)

    Related papers, not required reading:
  10. Combining Keyword Search and Forms for Ad Hoc Querying of Databases
    Eric Chu, Akanksha Baid, Xiaoyong Chai, AnHai Doan and Jeffrey Naughton, SIGMOD 2009 (Feb 22, 2010) (talk (pptx))

    Related papers, not required reading:

    Week of 13-20 Feb: Midsemester Exam

    Massively Parallel Database/Storage Systems

  11. Background reading: The parallel database chapter and the distributed database chapter from DB Concepts.
    Slides: Chapter 18: Parallel Databases, and Chapter 19: Distributed Databases (plus 3PC, not available on book site) (Feb 22, 2010)
  12. Bigtable: A Distributed Storage System for Structured Data
    Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, OSDI 2006
    Video of talk by Jeff Dean: Local mp4 copy OR on video.google.com
    Class Presentation (Feb 25, 2010)
  13. PNUTS: Yahoo!'s Hosted Data Serving Platform,
    Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni.
    VLDB (industry track) 2008.
    VLDB Talk by Brian Cooper (ppt) (March 4, 2010)
    Related papers, not required reading:
  14. MapReduce: Simplified Data Processing on Large Clusters
    Jeffrey Dean and Sanjay Ghemawat, OSDI 2004,
    Talk by Dinesh Dharme (8 March 2010)
  15. Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters
    Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao and D. Stott Parker, SIGMOD 2007
    Talk by Senthilnathan N (8 March 2010)

    Database Testing

  16. Reverse Query Processing
    Carsten Binnig, Donald Kossmann and Eric Lo, ICDE 2007,
    Talk by Bhupesh Chawda (11 March 2010)
    Related papers, not required reading:
  17. Automating the Detection of Snapshot Isolation Anomalies
    Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, S. Sudarshan
    VLDB 2007: 1263-1274
    Talk by Shailendra Shrivastav (15 March 2010)
    Related papers, not required reading:

    Peer to Peer Systems

  18. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications,
    I. Stoica, R. Morris, D. Karger, M. Frans Kaashoek, H. Balakrishnan,
    In Proc. ACM SIGCOMM 2001. Expanded version appears in IEEE/ACM Trans. Networking, 11(1), February 2003.
    P2P overview talk (in pdf) and
    talk by Mohammed Junaid and Gopalakrishnan S. (18 March 2010)
    Related papers, not required reading:

    Consistency and Asynchrony

  19. Consistency Rationing in the Cloud: Pay only when it matters
    Tim Kraska, Martin Hentschel, Gustavo Alonso and Donald Kossmann, VLDB 2009
    Talk by Sandeep and Rajashekar (22 March 2010)
  20. Asynchronous view maintenance for VLSD databases
    Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Raghu Ramakrishnan, SIGMOD 2009
    Talk by Purva Joshi (05 April 2010)

    Data Streams

  21. Finding the Frequent Items in Streams of Data
    Graham Cormode and Marios Hadjieleftheriou, VLDB 2008 and CACM 52(10), Oct 2009
    Talk by Ankur Agarwal (25 March 2010)

    Related paper, not required reading

    Data Storage

  22. Column-stores vs. row-stores: how different are they really?
    Daniel J. Abadi, Samuel Madden, Nabil Hachem:
    SIGMOD Conference 2008: 967-980
    Talk by Karthik SR (07 April 2010). See also VLDB 09 tutorial on column stores by Harizopoulos, Abadi and Boncz

    Security and Privacy

  23. Redundancy and Information Leakage in Fine-Grained Access Control,
    Govind Kabra, Ravishankar Ramamurthy and S. Sudarshan
    Talk by Aditya Joshi and Subhait Datta (08 April 2010)
    Also: SIGMOD Talk, Overview of database security and an Overview of Fine-grained Authorization


    Other interesting papers on privacy, not covered this year:

    Dependence Detection

  24. Integrating conflicting data: the role of source dependence.
    Xin Luna Dong, Laure Berti-Equille and Divesh Srivastava. Procs. VLDB Endowment (PVLDB), 2(1): 550-561, 2009.
    talk by Divesh Srivastava (12 April 2010)
    Additional reading

    XML Query Processing

  25. Structural Joins: A Primitive for Efficient XML Query Pattern Matching,
    D. Srivastava, S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, Y. Wu, ICDE 2002.
    Talk by Sandhya and Prabhas Samant (14 April 2010)
    Related papers, not required reading:

    Uncertain and Probabilistic Data

  26. OLAP Over Uncertain and Imprecise Data
    Douglas Burdick, Prasad Deshpande, T. S. Jayram, Raghu Ramakrishnan and Shivakumar Vaithyanathan, VLDB 2005
    Talks: OLAP basics (pdf) and OLAP on uncertain/imprecise data (pdf)
    Talk (odp and ppt), and talk by T. S. Jayram (15 April 2010)
    Related material if you are interested, but not part of CS632:
  27. Current research directions in data management: A discussion (15 April 2009)