CS 632: Advanced DBMS
S. Sudarshan
Spring 2009
Previous offerings: 2007, 2006, 2004, 2003, 2002, 2001, 2000, 1999.
About The Course
Reading material will consist primarily of research papers.
Every student will present a research paper
of their choice, either from the list below or another
paper, subject to the instructor's approval.
There will also be two exams (midsem/endsem) and a course
project.
Textbook (for background material only):
Database System Concepts, 5th Ed.
Avi Silberschatz, Hank Korth, and S. Sudarshan.
McGraw Hill, 2005.
(Slides for all book chapters)
The list of papers will be refined as the course proceeds in 2009.
Query Optimization
-
Rule-Based Query Optimization using the
Volcano Framework.
Chapter 2 from
Multiquery Optimization and Applications,
Prasan Roy, PhD thesis, IIT Bombay, 2000.
ppt
(Jan 5, 2009)
- Efficient and Extensible Algorithms for Multi-Query Optimization,
Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobe,
In ACM SIGMOD Conf. on the Management of Data, 2000.
ppt
(Jan 12, 2009)
-
Rewriting Procedures for Batched Bindings
Ravindra Guravannavar and S. Sudarshan, VLDB 2008
Talk by Ravi G. (ppt) (Jan 15, 2009)
Extra papers, not included for course.
-
Query Optimization Over Web Services
Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom and Rajeev Motwani,
SIGMOD 06
ppt (Jan 19, 2009)
Tech Report containing proofs
-
Execution strategies for SQL subqueries
Mostafa Elhemali, César A. Galindo-Legaria, Torsten Grabs, Milind Joshi
SIGMOD Conference 2007: 993-1004
Talk from SIGMOD 07 (ppt)
Class lecture (ppt) (Jan 22, 2009)
-
Query Processing for SQL Updates
César A. Galindo-Legaria, Stefano Stefani, Florian Waas
SIGMOD Conference 2004: 844-849
talk (ppt) (29 Jan 2009)
Adaptive Query Processing
-
Eddies: Continuously Adaptive Query Processing,
Avnur and Hellerstein, SIGMOD 2000.
(Eddies (ppt), taken from
http://web.cs.wpi.edu/~cs561/s05/talks/eddy-sigmod00-cs561.ppt)
(Adaptive Query Processing using Eddies (ppt) by Amol Deshpande)
(Feb 2, 2009)
-
Robust Query Processing through Progressive Optimization,
Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman,
Hamid Pirahesh,
SIGMOD 2004: 659-670
ppt (Feb 5, 2009)
IR and DB
- Keyword Searching and Browsing
in Databases using BANKS
Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and
S. Sudarshan, ICDE 2002
(Long talk by Sudarshan, ppt),
(Short talk by Ramdas, pdf) (9 Feb 2009)
(Extra papers: Bidirectional Search (VLDB 2005)
and Sphere Search (VLDB 2005))
-
Efficient Computation of Diverse Query Results
Erik Vee, Utkarsh Srivastava, Jayavel Shanmugasundaram,
Prashant Bhat, Sihem Amer-Yahia
ICDE 2008
(Erik Vee's ICDE talk ppt, and modified version for CS632
ppt) (12 Feb 2009)
Week of 16-20 March: Midsemester Exam
Massively Parallel Database/Storage Systems
- Background reading: The parallel database chapter and the
distributed database chapter from DB Concepts.
Slides: Chapter 21: Parallel Databases, and
Chapter 22: Distributed Databases (plus 3PC, not available on the book site)
-
Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, OSDI 06
Video of talk by Jeff Dean: Local mp4 copy OR on video.google.com
Class Presentation (3 and 5 March 2009)
- You can also read about the Google App Engine Datastore API,
a Python API that is allegedly built on top of Google's
Megastore, which is itself supposedly a relational engine on top of
Bigtable. However, no details of Megastore are public, and
the only online information comes from (believe it or not)
a blog entry by James Hamilton of Microsoft SQL Server,
derived from a talk by Jonas Karlsson at SIGMOD 2008.
-
PNUTS: Yahoo!'s Hosted Data Serving Platform,
Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni.
VLDB (industry track) 2008.
VLDB Talk by Brian Cooper (ppt)
Related paper, not required reading
- Database implementation on S3 (Brantner et al., SIGMOD 2008)
-
Pig Latin: A Not-So-Foreign Language for Data Processing
Chris Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar and Andrew Tomkins
SIGMOD 2008
Talk (ppt) by Sandeep Patidar (March 9, 2009)
Peer to Peer Systems
-
Chord: A Scalable Peer-to-Peer Lookup Service for Internet
Applications,
I. Stoica, R. Morris, D. Karger, M. Frans Kaashoek, H. Balakrishnan,
In Proc. ACM SIGCOMM 2001.
Expanded version appears in IEEE/ACM Trans. Networking, 11(1), February 2003.
P2P overview talk (in pdf) (March 5, 2009)
(Extra papers (not assigned reading):
-
A Scalable Content-Addressable Network,
S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker,
In Proc. ACM SIGCOMM 2001)
- Querying the Internet with PIER
Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo,
Scott Shenker, and Ion Stoica, VLDB 03
(Talk: ppt)
OLAP
-
OLAP Over Uncertain and Imprecise Data
Douglas Burdick, Prasad Deshpande, T. S. Jayram, Raghu Ramakrishnan
and Shivakumar Vaithyanathan, VLDB 2005
(Talk: OLAP basics (pdf) and
OLAP on uncertain/imprecise data (pdf))
Talk (odp and ppt) by Parakram Majumdar (March 12, 2009)
Data Dissemination
-
An Efficient and Resilient Approach to Filtering and Disseminating
Streaming Data,
Shetal Shah, Shyamshankar Dharmarajan and Krithi Ramamritham,
VLDB 2003
(Dynamic Data Dissemination Talk ppt
by Krithi Ramamritham in 2007)
Talk (in ppt) by Shetal Shah (from 2006)
Talk by Ajinkya Joshi (March 16, 2009)
XML Query Processing
- Structural Joins: A Primitive for
Efficient XML Query Pattern Matching,
D. Srivastava, S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, Y. Wu,
ICDE 2002.
Talk by Parag Abhyankar (19 March 2009)
(No class on 23 March 2009)
-
Fast Computation of Database Operations using Graphics Processors
Naga K. Govindaraju, Brandon Lloyd, Wei Wang, Ming C. Lin, Dinesh Manocha
SIGMOD Conference 2004: 215-226
Talk by Mahendra Chavan (26 March 2009)
Database Testing
-
Automating the Detection of Snapshot
Isolation Anomalies
Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, S. Sudarshan
VLDB 2007: 1263-1274
Talk by Ajitav Sahoo (30 March 2009)
Related papers:
- Massive Stochastic Testing of SQL
Donald R. Slutz
VLDB 1998: 618-622
Talk by Manan Shah (1 April 2009)
Data Storage
-
Column-stores vs. row-stores: how different are they really?
Daniel J. Abadi, Samuel Madden, Nabil Hachem:
SIGMOD Conference 2008: 967-980
Talk by Bikmal Harikrishna (2 April 2009)
Security and Privacy
-
Redundancy and Information Leakage
in Fine-Grained Access Control,
Govind Kabra, Ravishankar Ramamurthy and S. Sudarshan
Talk by Adil Sandalwala (6 April 2009)
Also: SIGMOD talk,
an overview of database security, and
an overview of fine-grained authorization
Other interesting papers on privacy, not covered this year:
Data Streams
-
NiagaraCQ: A Scalable Continuous Query System for Internet Databases
Chen, DeWitt, Tian and Wang, SIGMOD 2000
Talk by K. Naresh (8 April 2009)
Extra paper, not required reading
Query Processing, Resource Management, and
Approximation in a Data Stream Management System
Motwani, Widom, Arasu, Babcock, Babu, Datar, Manku, Olston,
Rosenstein and Varma, CIDR 2003
(Fri Mar 3)(PODS 2002 talk by Motwani)
Uncertain and Probabilistic Data
-
Efficient query evaluation on probabilistic databases.
N. Dalvi and D. Suciu. VLDB 2004.
Talk by Veeranjaneyulu Sadhanala (9 April 2009)
NOTE: Read Section 4.3 from the journal version below;
the conference version contains a mistake that is corrected in the journal version.
Extra paper (not required reading)
- Current research directions in data management: A discussion
(9 April 2009)