CS631 : Implementation techniques in DBMS

Autumn 2019

Previous years: Note: Offerings in 2016/17 are not listed here; there was no offering in 2018. Spring 2019, Autumn 2015, Autumn 2014, Autumn 2013, Autumn 2012, Spring 2012, 2006, 2005 (Exams in 2012 Spring: midsem and endsem)
Instructor: S. Sudarshan
Teaching Assistants: Ashwini Jain and Balaji P, (aswinijain@cse and psbalaji@cse)
TA Office Hours: Ashwini: Tue 1130-1230, Wed 530-630 Balaji: Mon 1030-1130, Wed 6.30-7.30 Faculty Office hours: Sudarshan: To be fixed Room SIA 206.

Lecture schedule Slot 1: Mon 8.30 to 9.25, Tue 9.30-10.25, Thu 10.35-11.30

Course Websites: In addition to this static Web page, we will also be using Moodle for assignment submissions, and Piazza for discussions. More information on Piazza sign up to be provided.
Textbook Database System Concepts, Silberschatz, Korth and Sudarshan, 7th edition (2019), McGraw Hill. (Note: The 7th ed is not yet available in India. You can use corresponding chapters of the 6th edition along with book slides till it becomes available.)

Other reading material will be made available periodically
Book Slides, Errata, solutions to selected exercises and other resources are available at: https://db-book.com

Course contents We will be covering Implementation techniques including storage and indexing, query processing, and transaction processing (Chapters 12-19), Database System Architectures (Chapter 20) and Parallel and Distributed Storage, Query Processing and Transaction Processing (Chapters 21-23). We may also cover several research papers during the course. As in the Spring 2019 offering, we will have an increased focus on Big Data, starting with using Big Data systems, and then diving into their internals. You will also set up PostgreSQL, and learn how to make small changes to it as part of the course assignments. The infrastructure for your course project will be your choice of PostgreSQL or any of the open source big data systems. If you wish to do it on any other infrastructure, you can discuss it with me.

Evaluation scheme Midsem: 20, Endsem: 35, Quizzes: 7+8, Assignments: 4+2+4, Project: 20.

Audit Requirements: Must attend all classes and take all exams. No need to do homeworks/assignments/projects.

Project Information
  1. PostgreSQL project topic suggestions (under construction, will add more ideas) Information about Projects TBA

PostgreSQL Resources Click Here (Instruction on creating patch files for submitting your project)

Schedule of Lectures

Topic No. Date TopicNotes
1 July 29 Introduction/Overview
2 July 30, Aug 1 Chapter 12 and 13: Physical Storage Systems and Data Storage Structures Assignment: Download and compile PostgreSQL, and open in Eclipse, using instructions provided here
Physical Storage in PostgreSQL, Look inside above link for information on file layout, database page layout, and free space map (and lots more details src/backend/storage/freespace/README). And how PostgreSQL stores oversized attributes using the TOAST technique
3 Aug 5 Chapter 10: Big Data
4 Aug 6, 8, 12 Chapter 14: Indexing Index access method interface in PostgreSQL (read only if you want to create new index types)
5 Aug 13, 19, 20 Chapter 15: Query Processing .
TBD Extra class: Overview of PostgreSQL internals -1 PostgreSQL resources, in particular see Tom Lane's talk
6 Aug 22, 26, 27, 29 Chapter 16: Query Optimization Statistics in PostgreSQL, and examples of row estimation in PostgreSQL in particular see how PostgreSQL special cases most common values (MCVs).
Assignments on query plans in PostgreSQL
TBD Extra class: Overview of PostgreSQL internals -2, Debugging in Eclipse ..
7 Sep 2, 3 Chapter 17 Transactions .
8 Sep 5, 9, 10, 12 Chapter 18: Concurrency Control .
Sep 16 - 20 No Class (Midsem Exam) .
9 Sep 23, 24, 26 Chapter 19: Recovery .
9 Sep 30, Oct 1, 3 Chapter 20: Database Architecture
10 Oct 7, 8, 10 Parallel and Distributed Storage .
11 Oct 14, 15, 17 Parallel and Distributed Query Processing
12 Oct 21, 22, 24, 28, 29, 31 Parallel and Distributed Transaction Processing .
13 Nov 2 RAFT: In search of an Understandable Consensus Algorithm, Diego Ongarro and John Ousterhout, USENIX ATC 2014 .
14 Nov 4, 5 Chapter 25: Advanced Application Development .
15 Nov 7 Current Research Areas + CS 632 .