CS 632: Advanced DBMS
S. Sudarshan
Spring 2010
Previous offerings: 2009, 2007, 2006, 2004, 2003, 2002, 2001, 2000, 1999.
End paper and Midsem paper from 2009
About The Course
Reading material will consist primarily of research papers.
All students will have to present a research paper
of their choice, either from the list below or another
paper, subject to the instructor's approval.
There will also be two exams (midsem/endsem) and a course
project.
Anyone who does an exceptional course project that has the
potential to be a publishable paper is eligible for a
straight AA grade. Otherwise the grading breakup would be
midsem 25, endsem 40, project 25 and seminar presentation 10.
Textbook (for background material only)
Database System Concepts, 6th Ed.
Avi Silberschatz, Hank Korth, and S. Sudarshan.
McGraw Hill, 2010.
(book home page,
Local copy of slides for all chapters)
The list of papers below is from 2009, and will get refined as we
go along in 2010.
Query Optimization
-
Rule-Based Query Optimization using the
Volcano Framework.
Chapter 2 from
Multiquery Optimization and Applications,
Prasan Roy, PhD thesis, IIT Bombay, 2000.
ppt
(Jan 4, 2010)
- Efficient and Extensible Algorithms for Multi-Query Optimization,
Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobhe,
In ACM SIGMOD Conf. on the Management of Data, 2000.
ppt
(Jan 7, 2010)
-
Rewriting Procedures for Batched Bindings
Ravindra Guravannavar and S. Sudarshan, VLDB 2008
Talk (ppt)(Jan 11, 2010)
Related papers, not required reading:
-
Execution strategies for SQL subqueries
Mostafa Elhemali, Cesar A. Galindo-Legaria, Torsten Grabs, Milind Joshi
SIGMOD Conference 2007: 993-1004
Talk
from SIGMOD 07 (ppt)
Class lecture (ppt) (Jan 14, 2010)
-
Query Processing for SQL Updates
Cesar A. Galindo-Legaria, Stefano Stefani, Florian Waas
SIGMOD Conference 2004: 844-849
talk (ppt) (18 Jan 2010)
Adaptive Query Processing
-
Eddies: Continuously Adaptive Query Processing,
Avnur and Hellerstein, SIGMOD 2000.
(Eddies(ppt)) (taken from
http://web.cs.wpi.edu/~cs561/s05/talks/eddy-sigmod00-cs561.ppt)
(Adaptive Query Processing using Eddies (ppt) by Amol Deshpande)
(Jan 21, 25, 2010)
-
Robust Query Processing through Progressive Optimization,
Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman,
Hamid Pirahesh,
SIGMOD 2004: 659-670
PPT (Jan 25, 28, 2010)
-
Scalable Join Processing on Very Large RDF Graphs
Thomas Neumann and Gerhard Weikum, SIGMOD 2009 (Feb 4, 2010)
(talk on basic rdf3x,
talk on scalable join processing)
IR and DB
- Keyword Searching and Browsing
in Databases using BANKS
Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and
S. Sudarshan, ICDE 2002
(Long talk by Sudarshan, ppt),
(Short talk by Ramdas, pdf)
(Feb 8 and 11 2010)
Related papers, not required reading:
-
Combining Keyword Search and Forms for Ad Hoc Querying of Databases
Eric Chu, Akanksha Baid, Xiaoyong Chai, AnHai Doan and Jeffrey Naughton,
SIGMOD 2009 (Feb 22, 2010) (talk (pptx))
Week of 13-20 Feb: Midsemester Exam
Massively Parallel Database/Storage Systems
- Background reading: The parallel database chapter and the
distributed database chapter from DB Concepts.
Slides:
Chapter 18: Parallel Databases, and
Chapter 19: Distributed Databases (plus 3PC, not available on book site)
(Feb 22, 2010)
-
Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, OSDI 2006
Video of talk by Jeff Dean: Local mp4 copy OR on video.google.com
Class Presentation (Feb 25, 2010)
- You can also read about the Google App Engine Datastore API,
a Python API that is allegedly built on top of Google's
Megastore, which itself is supposedly a relational engine on top of
Bigtable. However, no details of Megastore are public, and
the only online information comes from (believe it or not)
a blog entry by James Hamilton of Microsoft SQL Server,
derived from a talk by Jonas Karlsson at SIGMOD 2008.
-
PNUTS: Yahoo!'s Hosted Data Serving Platform,
Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni.
VLDB (industry track) 2008.
VLDB Talk by Brian Cooper (ppt) (March 4, 2010)
Related papers, not required reading:
-
MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat, OSDI 2004,
Talk by Dinesh Dharme (8 March 2010)
-
Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters
Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao and D. Stott Parker,
SIGMOD 2007
Talk by Senthilnathan N (8 March 2010)
Database Testing
- Reverse Query Processing
Carsten Binnig, Donald Kossmann and Eric Lo, ICDE 2007,
Talk by Bhupesh Chawda (11 March 2010)
Related papers, not required reading:
-
Automating the Detection of Snapshot
Isolation Anomalies
Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, S. Sudarshan
VLDB 2007: 1263-1274
Talk by Shailendra Shrivastav (15 March 2010)
Peer to Peer Systems
-
Chord: A Scalable Peer-to-Peer Lookup Service for Internet
Applications,
I. Stoica, R. Morris, D. Karger, M. Frans Kaashoek, H. Balakrishnan,
In Proc. ACM SIGCOMM 2001.
Expanded version appears in IEEE/ACM Trans. Networking, 11(1), February 2003.
(P2P overview talk (in pdf) and
talk by Mohammed Junaid and Gopalakrishnan S. (18 March 2010))
Related papers, not required reading:
-
A Scalable Content-Addressable Network,
S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker,
In Proc. ACM SIGCOMM 2001
- Querying the Internet with PIER
Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo,
Scott Shenker, and Ion Stoica, VLDB 03
(Talk:ppt)
Consistency and Asynchrony
- Consistency Rationing in the Cloud: Pay only when it matters
Tim Kraska, Martin Hentschel, Gustavo Alonso and Donald Kossmann, VLDB 2009
Talk by Sandeep and Rajashekar (22 March 2010)
-
Asynchronous view maintenance for VLSD databases
Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Raghu Ramakrishnan, SIGMOD 2009
Talk by Purva Joshi (05 April 2010)
Data Streams
-
Finding the Frequent Items in Streams of Data
Graham Cormode and Marios Hadjieleftheriou, VLDB 2008 and CACM 52(10) Oct 2009
Talk by
Ankur Agarwal (25 March 2010)
Related paper, not required reading
Query Processing, Resource Management, and
Approximation in a Data Stream Management System
Motwani, Widom, Arasu, Babcock, Babu, Datar, Manku, Olston,
Rosenstein and Varma, CIDR 2003
(PODS 2002 talk by Motwani)
Data Storage
-
Column-stores vs. row-stores: how different are they really?
Daniel J. Abadi, Samuel Madden, Nabil Hachem:
SIGMOD Conference 2008: 967-980
Talk by
Karthik SR (07 April 2010).
See also VLDB 09 tutorial on column stores by Harizopoulos,
Abadi and Boncz
Security and Privacy
-
Redundancy and Information Leakage
in Fine-Grained Access Control,
Govind Kabra, Ravishankar Ramamurthy and S. Sudarshan
Talk by
(Aditya Joshi and Subhait Datta, 08 April 2010)
Also: SIGMOD Talk,
Overview of database security and
an Overview of Finegrained Authorization
Other interesting papers on privacy, not covered this year:
-
l-Diversity: Privacy Beyond k-Anonymity,
Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer and
Muthuramakrishnan Venkitasubramaniam
Talk: ppt
-
Mondrian Multidimensional K-Anonymity
K. LeFevre, D. DeWitt, and R. Ramakrishnan.
ICDE 2006
-
Incognito: Efficient Full-Domain
K-Anonymity.
K. LeFevre, D. DeWitt, and R. Ramakrishnan, SIGMOD 2005.
-
Protecting Privacy when Disclosing Information: k-Anonymity
and its Enforcement through Generalization and Suppression,
Pierangela Samarati and Latanya Sweeney,
Procs. of the IEEE Symposium on Research in Security and Privacy, 1998.
Talk (pdf)
Dependence Detection
- Integrating conflicting data: the role of source dependence.
Xin Luna Dong, Laure Berti-Equille and Divesh Srivastava.
Procs. VLDB Endowment (PVLDB), 2(1): 550-561, 2009.
talk by Divesh Srivastava (12 April 2010)
Additional reading
XML Query Processing
- Structural Joins: A Primitive for
Efficient XML Query Pattern Matching,
D. Srivastava, S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, Y.Wu,
ICDE 2002.
Talk by
Sandhya and Prabhas Samant (14 April 2010)
Uncertain and Probabilistic Data
-
OLAP Over Uncertain and Imprecise Data
Douglas Burdick, Prasad Deshpande, T. S. Jayram, Raghu Ramakrishnan
and Shivakumar Vaithyanathan, VLDB 2005
(Talks: OLAP basics (pdf) and
OLAP on uncertain/imprecise data (pdf))
Talk (odp and ppt), and talk by T. S. Jayram
(15 April 2010)
Related material if you are interested, but not part of CS632:
- Current research directions in data management: A discussion
(15 April 2009)