CS 747: Foundations of Intelligent and Learning Agents
(Autumn 2021)

(Picture source: https://upload.wikimedia.org/wikipedia/commons/7/7c/Emacs_Tetris_vector_based_detail.svg.)



  Shivaram Kalyanakrishnan
  Office: Room 220, New CSE Building
  Phone: 7704
  E-mail: shivaram@cse.iitb.ac.in

Teaching Assistants

  Santhosh Kumar G.
  E-mail: santhoshkg@iitb.ac.in

  Dibyangshu Mukherjee
  E-mail: dbnshu@cse.iitb.ac.in

  Keshav Agarwal
  E-mail: keshavagarwal@cse.iitb.ac.in

  Ashish Aggarwal
  E-mail: ashishaggarwal@cse.iitb.ac.in

  Debabrata Biswal
  E-mail: dbiswal@cse.iitb.ac.in

  Mohith Jagalmohanan
  E-mail: 203050073@iitb.ac.in

  Shashank Shet
  E-mail: shashankshet@cse.iitb.ac.in

  Venkata Sai Baba Reddy Velugoti
  E-mail: velugoti@cse.iitb.ac.in

  Arance Kurmi
  E-mail: arance@cse.iitb.ac.in

  Yash Gadhia
  E-mail: 180100130@iitb.ac.in

  Rohan Gupta
  E-mail: 180010048@iitb.ac.in

  Gagan Jain
  E-mail: 180100043@iitb.ac.in

  Gosula Vinayaka
  E-mail: 180050033@iitb.ac.in

Course Description

Today's computing systems are increasingly adaptive and autonomous: they are akin to intelligent, decision-making "agents". With its roots in artificial intelligence and machine learning, this course covers the foundational principles of designing such agents. Topics covered include: (1) agency, intelligence, and learning; (2) exploration and multi-armed bandits; (3) Markov Decision Problems and planning; (4) reinforcement learning; (5) multi-agent systems and multi-agent learning; and (6) case studies.

The course will adopt a "hands-on" approach, with programming assignments designed to highlight the relationship between theory and practice. Case studies will offer an end-to-end view of deployed agents. It is hoped that students can apply the learnings from this course to the benefit of their respective pursuits in various areas of computer science and related fields.

This offering of the course (in the on-line mode) will be very similar to the Autumn 2020 offering. The same video lectures (possibly with minor changes) will be used. The evaluation pattern will also be similar.


The course is open to all Ph.D. students, all masters students, and undergraduate/dual-degree students in their third (or higher) year of study.

The course does not formally have other courses as prerequisites. However, lectures and assignments will assume that the student is comfortable with probability and algorithms. Introduction to Probability by Grinstead and Snell is an excellent resource on basic probability. Any student who is not comfortable with the contents of chapters 1 through 7 (and is unable to solve the exercises) is advised against taking CS 747.

The course has an intensive programming component: based on ideas discussed in class, the student must be able to independently design, implement, and evaluate programs in Python. The student must be prepared to spend a significant amount of time on the programming component of the course.

Students who are unsure about their preparedness for taking the course are strongly advised to watch the lectures from weeks 1, 2, and 3 from the Autumn 2020 offering, to attempt the quizzes from those weeks, and also to go through Programming Assignment 1. If they are unable to get a reasonable grasp of the material or to negotiate the quizzes and programming assignment, they are advised against taking CS 747.

On-line Mode

Weekly Plan

Each "week" of the course will run Tuesday through the subsequent Monday.

Details of the web-based interaction, as well as a form for requesting the instructor to call back, will be provided on Moodle. In addition, students will be given a feedback form through which they can communicate issues related to the course at any time.


The evaluation will be based on the following three components.

The total marks from the three components will be at least 115. However, letter grades will be assigned by taking 100 as the denominator. If a student scores X marks in total, the effective marks for deciding the letter grade will be the minimum of X and 100. The purpose of this scheme is to provide students some amount of slack, given that many students might be facing difficulties related to the pandemic and the on-line mode. Students can still aspire to the maximum grade even if they fail to turn in, or perform poorly in, a couple of quizzes or a programming assignment.
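
As an illustration only, the capping rule described above can be sketched in a few lines of Python. The function name effective_marks and the sample totals are hypothetical and are not part of the course material; the only rule taken from this page is that the effective marks are the minimum of X and 100.

```python
def effective_marks(total_marks: float, cap: float = 100.0) -> float:
    """Return the marks used for deciding the letter grade: min(X, cap)."""
    if total_marks < 0:
        raise ValueError("total marks cannot be negative")
    # Marks earned beyond the cap act as slack and do not raise the score further.
    return min(total_marks, cap)


# Hypothetical totals, scored out of a maximum of at least 115.
for total in (72.5, 100, 108):
    print(total, "->", effective_marks(total))
```

Under this rule, a student who misses some marks in one component can make them up in another, but any total above 100 is treated as exactly 100.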

All assessments will be based on individual work.

Submissions to the quizzes and the programming assignments must be turned in through Moodle. Late submissions will not be evaluated; they will receive no marks.

Students auditing the course must have an effective score of 50 or more marks in the course to be awarded an "AU" grade.


Moodle will be the primary course management system. Marks for the assessments will be maintained on the class Moodle page; discussion fora will also be hosted on Moodle. Students who do not have an account on Moodle for the course must send TA Santhosh Kumar G. a request by e-mail, specifying the roll number/employee number for account creation.

Academic Honesty

Students are expected to adhere to the highest standards of integrity and academic honesty. Academic violations, as detailed below, will be dealt with strictly, in accordance with the institute's procedures and disciplinary actions for academic malpractice.

Students are expected to work alone on all the quizzes, the programming assignments, and the end-semester examination. While they are free to discuss the material presented in class with their peers, they must not discuss the contents of the assessments (neither the questions, nor the solutions) with classmates (or anybody other than the instructor and TAs). They must not share code, even if it only pertains to functionalities that are perceived not to be relevant to the core logic of the assessment (for example, file-handling and plotting). They also may not look at solutions to the given quiz/assignment or related ones on the Internet. Violations will be considered acts of dishonesty.

Students are allowed to use resources on the Internet for programming (say to understand a particular command or a data structure), and also to understand concepts (so a Wikipedia page or someone's lecture notes or a textbook can certainly be consulted). It is also okay to use libraries or code snippets for portions unrelated to the core logic of the assignment, typically for operations such as moving data, network communication, etc. However, students must cite every resource consulted or used, whatever the reason, in a file named references.txt, which must be included in the submission. Failure to list any resource used will be considered an academic violation.

Copying or consulting any external sources during the examination will be treated as cheating.

If in any doubt as to what is legitimate collaboration and what is not, students must ask the instructor.

Texts and References

Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2nd edition, MIT Press, 2018. On-line version.

Algorithms for Reinforcement Learning, Csaba Szepesvári, Morgan & Claypool, 2009. On-line version.

Selected research papers.


This page will serve as the primary source of information regarding the course, the schedule, and related announcements. The Moodle page for the course will be used for recording grades and for students to post questions/comments.

E-mail is the best means of communicating with the instructor outside of office hours; students must send e-mail with "[CS747]" in the header.




Slides and videos on this page are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Permission for their use beyond the scope of the license may be sought by writing to shivaram@cse.iitb.ac.in.
