CS631 : Implementation techniques in DBMS

Spring 2019


Previous years: Note: Offerings in 2016/17 are not listed here; there was no offering in 2018. Autumn 2015, Autumn 2014, Autumn 2013, Autumn 2012, Spring 2012, 2006, 2005 (Exams in 2012 Spring: midsem and endsem)
Instructor: Sriram Srinivasan and S. Sudarshan
Teaching Assistants: Sai Prasad Kousika and Karnam Ravi Shankar (saiprasad@cse and krshankar@cse)
TA Office Hours: Sai: Mon 10.30 AM-12.30 PM, Wed 10.30 AM-12.30 PM, Ravi: Mon 2.00-4.00 PM, Tue 2.00-4.00 PM Venue SIC 212 (Info Lab)
Faculty Office hours: Sudarshan: Mon 5.00-6.00 PM, Wed 12.00 - 1.00 PM, Room SIA 206

Lecture schedule Slot 2: Mon 9.30-10.25, Tue 10.35-11.30, Thu 11.35-12.30

Course Websites: In addition to this static Web page, we will also be using Moodle for assignment submissions, and Piazza for discussions. More information on Piazza sign up to be provided.
Textbook Database System Concepts, Silberschatz, Korth and Sudarshan, 6th edition (2010), McGraw Hill + Excerpts from the 7th edition (released March 2019).

Other reading material will be made available periodically
Book Slides, Errata, solutions to selected exercises and other resources are available at: http://db-book.com

Course contents We will be covering Implementation techniques including storage and indexing, query processing, and transaction processing (Chapters 10-16), Database architectures, parallel and distributed databases/big data (Chapters 17-19 of 6th ed plus new chapters from the 7th ed). We may also cover several research papers during the course. This year we will have an increased focus on Big Data, starting with using Big Data systems, and then diving into their internals. You will also set up PostgreSQL, and learn how to make small changes to it as part of the course assignments. The infrastructure for your course project will be your choice of PostgreSQL or any of the open source big data systems. If you wish to do it on any other infrastructure, you can discuss it with me.

Evaluation scheme Midsem: 20, Endsem: 35, Quizzes: 7+8, Assignments: 4+2+4, Project: 20. (Original, now overridden: Quizes/Homework 17%, Mid-sem 20%, Assignments 8%, Project 15%, and End-sem 40% Note : All quizzes will be surprise quizzes. The best N-2 out of N quiz scores will be counted.

Audit Requirements: Must attend all classes and take all exams. No need to do homeworks/assignments/projects.



Project Information
  1. PostgreSQL project topic suggestions (under construction, will add more ideas) Information about Projects TBA

PostgreSQL Resources Click Here (Instruction on creating patch files for submitting your project)


Schedule of Lectures

Topic No. Date TopicNotes
1 Jan 7 Introduction/Overview
2 Jan 8, 10 and 14 Chapter 10: Storage and File Structure Assignment: Download and compile PostgreSQL, and open in Eclipse, using instructions provided here
Physical Storage in PostgreSQL, Look inside above link for information on file layout, database page layout, and free space map (and lots more details src/backend/storage/freespace/README). And how PostgreSQL stores oversized attributes using the TOAST technique
3 Jan 15, 17, 21, 22 Chapter 11: Indexing Index access method interface in PostgreSQL (read only if you want to create new index types)
4 Jan 24, 28, 29 Chapter 12: Query Processing .
TBD Extra class: Overview of PostgreSQL internals -1 PostgreSQL resources, in particular see Tom Lane's talk
5 Jan 31, Feb 4, 5, 7 Chapter 13: Query Optimization Statistics in PostgreSQL, and examples of row estimation in PostgreSQL in particular see how PostgreSQL special cases most common values (MCVs).
Assignments on query plans in PostgreSQL
TBD Extra class: Overview of PostgreSQL internals -2, Debugging in Eclipse ..
6 Feb 11, 12 Chapter 14 Transactions .
7 Feb 14, 18, 19, 21 Chapter 15: Concurrency Control .
Feb 22 - 28 No Class (Midsem Exam) .
8 Mar 4, 5, 7 Chapter 16: Recovery .
9 Mar 11, 12 Chapter 17: Database Architecture
10 Mar 14, 18, 19 Parallel and Distributed Storage .
11 Mar 25, 26, 28 Parallel and Distributed Query Processing
12 Apr 1, 2, 4, 8, 9, 11 Parallel and Distributed Transaction Processing .
13 Apr 15 RAFT: In search of an Understandable Consensus Algorithm, Diego Ongarro and John Ousterhout, USENIX ATC 2014 .
14 Apr 16 Chapter 24: Advanced Application Development + Read Chapter 24 slides on performance tuning and performance benchmarks + Overview of CS 632 .