CS 744: Design and Engineering of Computing Systems

Autumn 2017

CS744
Lectures: Slot 1 (Mon 8:30 am, Tue 9:30 am, Thu 10:35 am), Venue: SIC 201.

Instructor: Mythili Vutukuru (mythili @ cse.iitb.ac.in)

Instructor Office Hours: By appointment (please send email to fix up a time).

TAs: Trishal Patel (trishal@cse), Akash Kanase (akanase@cse), Priyanka Naik (ppnaik@cse)

Announcements
  • End semester exam is on Nov 17, Fri, 2-5 pm in New CSE building 103.
  • Please signup for demo/viva slots for evaluation of phases 3 and 4 of the project. More information about the project is here.

Course Overview

The course is designed to be a broad introduction to computer systems, covering the basic as well as some advanced topics pertaining to the design and engineering of computer systems. It is primarily aimed at PG students (Masters and Ph.D) who wish to pursue research in the area of systems. The course is a hands-on course, with a heavy programming-based project component.

At the end of the course, the students will be able to understand the basic principles behind the design of real computer systems, and will be able to build a simple system themselves as part of the course project. The course will enable the students to understand computer systems end-to-end, starting from the application design, to the operating system interactions, all the way down to the hardware components.

The topics covered in the course include:

Grading

Textbooks and References

There is no one textbook that covers all the topics in the course. For the operating systems concepts, please refer to my undergrad OS course offering CS347 homepage that has some lecture notes, practice problems, and textbook references. For advanced concepts, links to research papers will be posted on this page.


Schedule

Lecture# Date Topics Notes References
1 July 17 (Mon) Introduction, course logistics, course overview. lec01.txt [CS347_Introduction]
2 July 18 (Tue) Processes. lec02.txt [CS347_Processes_Threads]
3 July 20 (Thu) Threads and Concurrency. lec03.txt A sample tutorial on pthreads (you can find many more online)

[CS347_Synchronization]
4 July 24 (Mon) Conditional Variables and Thread Synchronization. lec04.txt
5 July 25 (Tue) Semaphores. Inter-process synchronization and communication. lec05.txt Little Book of Semaphores has many interesting synchronization problems.
6 July 27 (Thu) Socket programming: synchronous and event-driven I/O. lec06.txt A sample tutorial on socket programming, and a sample tutorial on event-driven I/O using epoll.
- July 31 (Mon) NO CLASS.
- Aug 1 (Tue) Tutorial on multithreaded programming (pthreads) and socket programming.
7 Aug 3 (Thu) Disk I/O and filesystems overview. lec07.txt [CS347_Filesystems_IO]
8 Aug 7 (Mon) Memory subsystem: overview. lec08.txt [CS347_Memory]
9 Aug 8 (Tue) Memory allocation and memory access. lec09.txt
10 Aug 10 (Thu) Performance evaluation basics, open systems. lec10.txt
11 Aug 14 (Mon) Performance evaluation: closed systems. lec11.txt
- Aug 15 (Tue) HOLIDAY (Independence Day)
12 Aug 17 (Thu) Performance evaluation: case studies. lec12.txt [Open vs. Closed] "Open Versus Closed: A Cautionary Tale", Schroeder et al.

[Performance comparison of web servers] "Comparing the Performance of Web Server Architectures", Pariag et al.
13 Aug 21 (Mon) CPU caches: introduction lec13.txt Section 3.0-3.3 of [Drepper] "What Every Programmer Should Know About Memory", Ulrich Drepper.
14 Aug 22 (Tue) Cache coherence in multicore systems lec14.txt
15 Aug 24 (Thu) Understanding cache performance with experiments. lec15.txt
16 Aug 28 (Mon) Techniques to improve cache performance lec16.txt Section 6.2 of [Drepper].
- Aug 29 (Tue) Tutorial on debugging and profiling tools (gdb, valgrind/memcheck, valgrind/massif, oprofile).
- Aug 31 (Thu) NO CLASS (Office hours to clear doubts on project phase 1 in Circular Hall, 4th floor of KReSIT).
- Sep 4 (Mon) Midsem review and practice problems Practice problems and solutions.
17 Sep 5 (Tue) Multicore systems: overview. lec17.txt
- Sep 7 (Thu) Midsem review.
- Sep 18 (Mon) Midsem paper discussion. Midsem question paper and solutions.
18 Sep 19 (Tue) Atomic instructions, locking, scalable locks. lec18.txt
19 Sep 21 (Thu) Performance overhead of synchronization. lec19.txt [Everything about sync] "Everything You AlwaysWanted to Know About Synchronization but Were Afraid to Ask", David et al.
20 Sep 25 (Mon) Lockfree datastructures. lec20.txt [Lockfree List] "A Pragmatic Implementation of Non-Blocking Linked-Lists", Harris.

Also, section 8.1 of [Drepper].
21 Sep 26 (Tue) More lockfree techniques: Lightweight Synchronization with RCU and RLU, Transactional memory. lec21.txt [Read-Log-Update] "Read-Log-Update: A Lightweight Synchronization Mechanism for Concurrent Programming", Matveev et al.

For a brief overview of transactional memory, Section 8.2 of [Drepper]
22 Sep 28 (Thu) OS modifications for multicore scalability. lec22.txt [Mosbench] "An Analysis of Linux Scalability to Many Cores", Boyd-Wickizer et al.

[Barrelfish] "The Multikernel: A new OS architecture for scalable multicore systems", Baumann et al.
23 Oct 3 (Tue) Multicore scalability of Applications on Linux. lec23.txt
24 Oct 5 (Thu) Dynamic memory allocation in multicore systems. lec24.txt [Hoard] "Hoard: A Scalable Memory Allocator for Multithreaded Applications", Berger et al.
- Oct 9 (Mon) NO CLASS (finish up project phase 2 submissions)
25 Oct 10 (Tue) Network I/O optimizations: overview. lec25.txt
26 Oct 12 (Thu) Network stack design for high performance. lec26.txt [Megapipe] "MegaPipe: A New Programming Interface for Scalable Network I/O", Han et al.

[mTCP] "mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems", Jeong et al.
27 Oct 16 (Mon) High performance key-value stores: a case study. lec27.txt [MICA] "MICA: A Holistic Approach to Fast In-Memory Key-Value Storage", Lim et al.

[Pilaf] "Using One-Sided RDMA Reads to Build a Fast, CPU-Efficient Key-Value Store", Mitchell et al.

28 Oct 17 (Tue) Networking research: some recent advances. lec28.txt
29 Oct 23 (Mon) Distributed systems: overview. lec29.txt
30 Oct 24 (Tue) Strong consistency: Raft consensus algorithm. lec30.txt [Raft] "In Search of an Understandable Consensus Algorithm", Ongaro and Ousterhout. Read sections 1, 2, and 5 (until 5.4.1).
31 Oct 26 (Thu) Weak consistency: a case study of Amazon's Dynamo key-value store. lec31.txt [Dynamo] "Dynamo: Amazon's Highly Available Key-value Store", DeCandia et al. Read until Section 4.6.
32 Oct 30 (Mon) Distributed transactions. lec32.txt
- Oct 31 (Tue) NO CLASS (work on project phase 3)
- Nov 2 (Thu) NO CLASS (work on project phase 3)
- Nov 6 (Mon) Endsem review.
- Nov 7 (Tue) Endsem review. Practice problems and solutions.
- Nov 9 (Thu) NO CLASS (office hours to clear endsem doubts)