Systems for Machine Learning

Mythili Vutukuru
Department of Computer Science and Engineering, IIT Bombay



This page contains the course material for CS 794 (Systems for Machine Learning). In this course, you will learn about the infrastructure that underpins modern deep learning systems. Topics covered include GPU architecture and networking, ML programming using CUDA and frameworks like PyTorch, hardware-aware performance optimizations, distributed training, and LLM inference optimizations.

Prerequisites: Students should have taken a basic course in machine learning and be familiar with concepts related to neural network training and inference. The course involves a significant programming component in CUDA and PyTorch, so students must be comfortable with C++ and Python programming in a Linux environment. Students are expected to use free GPU resources available on cloud platforms to complete the take-home assignments.

News: We are working on a textbook for Systems for Machine Learning. The draft chapters, with accompanying slides and programming assignments, are listed below. Check back here for updates.


Chapter# Title Chapter PDF Slides Programming Assignments
1 Introduction to SysML
2 Review of Deep Learning Concepts
3 Programming AI Hardware pdf link GitHub link
4 Hardware-aware Optimizations
5 Machine Learning Programming Frameworks
6 Distributed Training
7 Networking Optimizations
8 LLM Inference Optimizations





The course material for the Spring 2026 offering of this course is archived below.


Lecture# Topics Slides References Programming Assignments
0 Introduction to the course slides PA0: Kaggle setup
1 Overview of deep learning concepts slides PA1: KV caching for GPT model in PyTorch
2 Hardware for AI acceleration slides
3 Hardware-aware performance optimizations slides
4 CUDA programming slides PA2: Optimizations to Matrix Multiplication; PA3: Optimizations to a simple MLP; PA4: Flash Attention
5 High-level ML programming frameworks slides
6 Distributed training slides
7 LLM inference optimizations slides
8 Networking optimizations slides

References: The following textbooks (available online) provide a good background on several topics covered in the course. I am grateful to the authors for permitting me to use content from these books in my slides for the Spring 2026 offering.