CS 632:  Advanced DBMS 
 S. Sudarshan  
Spring 2009  
 
  Previous offerings: 2007, 2006, 2004, 2003, 2002, 2001, 2000, 1999.
About The Course
Reading material will consist primarily of research papers. 
All students will have to present a research paper
of their choice, either from the list below or another
paper, subject to the instructor's approval.
There will also be two exams (midsem/endsem) and a course
project.
Textbook (for background material only):
Database System Concepts, 5th Ed.
Avi Silberschatz, Hank Korth, and S. Sudarshan.
McGraw Hill, 2005.
( Slides for all book chapters )
 
The list of papers will get refined as we go along in 2009.  
Query Optimization
- 
Rule-Based Query Optimization using the 
Volcano Framework.  
 Chapter 2 from 
Multiquery Optimization and Applications,
 Prasan Roy, PhD thesis, IIT Bombay, 2000. 
ppt
 (Jan 5, 2009)
-  Efficient and Extensible Algorithms for Multi-Query Optimization,
 Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobhe,
 In ACM SIGMOD Conf. on the Management of Data, 2000.
ppt
 (Jan 12, 2009)
- 
Rewriting Procedures for Batched Bindings
 Ravindra Guravannavar and S. Sudarshan, VLDB 2008
 Talk by Ravi G. (ppt)(Jan 15, 2009)
Extra papers, not included for course.
 
-   
Query Optimization Over Web Services
 Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom and Rajeev Motwani,
SIGMOD 06
 ppt (Jan 19, 2009)
 Tech Report containing proofs
- 
Execution strategies for SQL subqueries
 Mostafa Elhemali, César A. Galindo-Legaria, Torsten Grabs, Milind Joshi
 SIGMOD Conference 2007: 993-1004
 Talk
from SIGMOD 07 (ppt)
Class lecture (ppt) (Jan 22, 2009)
- 
Query Processing for SQL Updates
 César A. Galindo-Legaria, Stefano Stefani, Florian Waas
 SIGMOD Conference 2004: 844-849
 talk (ppt) (29 Jan 2009)
Adaptive Query Processing
-  
Eddies: Continuously Adaptive Query Processing, 
 Avnur and Hellerstein, SIGMOD 2000.
 (Eddies(ppt)) (taken from
http://web.cs.wpi.edu/~cs561/s05/talks/eddy-sigmod00-cs561.ppt)
(Adaptive Query Processing using Eddies (ppt) by Amol Deshpande) 
(Feb 2, 2009)
-  
Robust Query Processing through Progressive Optimization,
 Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman, 
Hamid Pirahesh,
SIGMOD 2004: 659-670
 PPT (Feb 5, 09)
 IR and DB
-  Keyword Searching and Browsing 
in Databases using BANKS 
 Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and
S. Sudarshan, ICDE 2002
 (Long talk by Sudarshan, ppt),
(Short talk by Ramdas, pdf) (9 Feb 2009)
(Extra papers:  Bidirectional Search (VLDB 2005) 
and Sphere Search (VLDB 2005))
-  
Efficient Computation of Diverse Query Results
 Erik Vee, Utkarsh Srivastava, Jayavel Shanmugasundaram, 
Prashant Bhat, Sihem Amer Yahia
ICDE 2008
 (Erik Vee's ICDE talk ppt, and modified version for CS632
ppt)  (12 Feb 2009)
Week of 16-20 March: Midsemester Exam
Massively Parallel Database/Storage Systems
-  Background reading: The parallel database chapter and the
distributed database chapter from DB Concepts.  
 Slides:  Chapter 21: Parallel Databases, and 
Chapter 22: Distributed Databases (plus 3PC, not available on book site)
-  
Bigtable: A Distributed Storage System for Structured Data
 Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, OSDI 06
 Video of talk by Jeff Dean: Local mp4 copy OR on video.google.com
 Class Presentation (3 and 5 March 2009)
 -  You can also read about Google App Engine's Datastore API, 
a Python API that is reportedly built on top of Google's
Megastore, which is itself supposedly a relational engine on top of 
Bigtable.  However, no details of Megastore are public, and 
the only online information comes from (believe it or not) 
a blog entry by James Hamilton of Microsoft SQL Server, 
derived from a talk by Jonas Karlsson at SIGMOD 2008.
 
- 
PNUTS: Yahoo!'s Hosted Data Serving Platform,
 Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni.
 VLDB (industry track) 2008.
 VLDB Talk by Brian Cooper (ppt)
 Related paper, not required reading
-  Database implementation on S3 (Brantner et al., SIGMOD 2008)
 
- 
Pig Latin: A Not-So-Foreign Language for Data Processing
 Chris Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar and Andrew Tomkins
 SIGMOD 2008
 Talk (ppt) by Sandeep Patidar (March 9, 2009)
Peer to Peer Systems
-  
  Chord: A Scalable Peer-to-Peer Lookup Service for Internet
  Applications,
 I. Stoica, R. Morris, D. Karger, M. Frans Kaashoek, H. Balakrishnan,
 In Proc. ACM SIGCOMM 2001. 
Expanded version appears in IEEE/ACM Trans. Networking, 11(1), February 2003.
 P2P overview talk (in pdf)  (March 5, 2009)
( Extra Papers (not assigned reading): 
 
 
- 
A Scalable Content-Addressable Network,
 S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker, 
In Proc. ACM SIGCOMM 2001)
 
-    Querying the Internet with PIER
 Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo,
Scott Shenker, and Ion Stoica, VLDB 03
 (Talk:ppt)
  OLAP
-  
OLAP Over Uncertain and Imprecise Data
Douglas Burdick, Prasad Deshpande, T. S. Jayram, Raghu Ramakrishnan 
and Shivakumar Vaithyanathan, VLDB 2005
 (Talk:
Olap basics(pdf) and 
OLAP on uncertain/imprecise data (pdf) 
)
 Talk (odp and ppt) by Parakram Majumdar (March 12, 2009)
Data Dissemination
- 
An Efficient and Resilient Approach to Filtering and Disseminating 
Streaming Data,
 Shetal Shah, Shyamshankar Dharmarajan  and  Krithi Ramamritham,
VLDB 2003
 (Dynamic Data Dissemination Talk ppt
by Krithi Ramamritham in 2007)
 Talk (in ppt) by Shetal Shah (from 2006)
 Talk by Ajinkya Joshi (March 16, 2009)
XML Query Processing
-  Structural Joins: A Primitive for 
Efficient XML Query Pattern Matching, 
 D. Srivastava, S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, Y.Wu, 
ICDE 2002.
 Talk by
Parag Abhyankar (19 March 2009)
 (Extra paper, not required reading:
(No class on 23 March 2009)
- 
 Fast Computation of Database Operations using Graphics Processors
 Naga K. Govindaraju, Brandon Lloyd, Wei Wang, Ming C. Lin, Dinesh Manocha
 SIGMOD Conference 2004: 215-226
 Talk by Mahendra Chavan (26 March 2009)
 (Extra paper, not required reading:
Database Testing
- 
 Automating the Detection of Snapshot 
Isolation Anomalies
 Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, S. Sudarshan
 VLDB 2007: 1263-1274
 Talk by Ajitav Sahoo (30 March 2009)
 Related papers:
-  Massive Stochastic Testing of SQL
Donald R. Slutz
 VLDB 1998: 618-622
 Talk by
 Manan Shah (1 April 2009)
Data Storage
-  
Column-stores vs. row-stores: how different are they really?
 Daniel J. Abadi, Samuel Madden, Nabil Hachem:
 SIGMOD Conference 2008: 967-980
 Talk by 
 Bikmal Harikrishna (2 April 2009)
Security and Privacy
-  
Redundancy and Information Leakage
in Fine-Grained Access Control,
 Govind Kabra, Ravishankar Ramamurthy and S. Sudarshan
 Talk by 
 (Adil Sandalwala, 6 April 2009)
 Also:  SIGMOD Talk, 
Overview of database security and 
an Overview of Finegrained Authorization
 
 
 Other interesting papers on privacy, not covered this year:
Data Streams
-  
NiagaraCQ: A Scalable Continuous Query System for Internet Databases
 Chen, DeWitt, Tian and Wang, SIGMOD 2000
 Talk by  (K. Naresh, 8 April 2009)
 
 Extra paper, not required reading
Query Processing, Resource Management, and 
	Approximation in a Data Stream Management System 
 Motwani, Widom, Arasu, Babcock, Babu, Datar, Manku, Olston, 
	Rosenstein and Varma, CIDR 2003
 (Fri Mar 3)(PODS 2002 talk by Motwani)
  Uncertain and Probabilistic Data
-  
Efficient query evaluation on probabilistic databases.
 N. Dalvi and D. Suciu.  VLDB 2004.
 Talk by Veeranjaneyulu Sadhanala (9 April 2009)
 NOTE: read Section 4.3 from the journal version below;
the conference version has a mistake that is corrected in the journal version.
 Extra paper (not required reading)
-  Current research directions in data management: A discussion
(9 April 2009)