CS 747: Foundations of Intelligent and Learning Agents
(Autumn 2019)

(Picture source: https://elephantcountry.org/sites/default/files/styles/main_image_800width/public/images/2016-07/elephant-funny.jpg)


  Shivaram Kalyanakrishnan
  Office: Room 220, New CSE Building
  Phone: 7704
  E-mail: shivaram@cse.iitb.ac.in

Teaching Assistants

  Sabyasachi Ghosh
  E-mail: 174050001@iitb.ac.in

  Sounak Bhattacharya
  E-mail: sounak@cse.iitb.ac.in

  Aman Jangde
  E-mail: amanjangde@cse.iitb.ac.in

  Vinod Kushwaha
  E-mail: vinod@cse.iitb.ac.in


Lectures will be held during Slot 11 (3.30 p.m. – 4.55 p.m. on Tuesdays and Fridays) in LA 202. Office hours will immediately follow class and run until 5.30 p.m. on the same days. Meetings can also be arranged by appointment.

Course Description

Today's computing systems are becoming increasingly adaptive and autonomous: they are akin to intelligent, decision-making "agents". With its roots in artificial intelligence and machine learning, this course covers the foundational principles of designing such agents. Topics covered include: (1) agency, intelligence, and learning; (2) exploration and multi-armed bandits; (3) Markov Decision Problems and planning; (4) reinforcement learning; (5) multi-agent systems and multi-agent learning; and (6) case studies.

The course will adopt a "hands-on" approach, with programming assignments designed to highlight the relationship between theory and practice. Case studies, as well as invited talks from experts, will offer an "end-to-end" view of deployed agents. It is hoped that students can apply what they learn in this course to their own pursuits in various areas of computer science and related fields.


The course is open to all Ph.D. students, all masters students, and undergraduate/dual-degree students in their fourth (or higher) year of study.

The course does not formally have other courses as prerequisites. However, class lectures and assignments will assume that the student is comfortable with probability and algorithms. Introduction to Probability by Grinstead and Snell is an excellent resource on basic probability. Any student who is not comfortable with the contents of chapters 1 through 7 (and is unable to solve the exercises) is advised against taking CS 747.

The course has an intensive programming component: based on ideas discussed in class, the student must be able to independently design, implement, and evaluate programs in a language of their choice. The student must be prepared to spend a significant amount of time on the programming component of the course.


Grades will be based on four programming assignments, each worth 10 marks; a course project worth 20 marks; a mid-semester examination worth 15 marks; and an end-semester examination worth 25 marks. The course project will be undertaken in groups of size at most 4; all other assessments will be based on individual work.

The programming assignments and project must be turned in through Moodle. Late submissions will not be evaluated; they will receive no marks.

Students auditing the course must score 50 or more marks in the course to be awarded an "AU" grade.
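
As a quick sanity check on the arithmetic above, the following minimal sketch totals the stated marks and applies the audit threshold. The component names and the helper functions total_marks and audit_passes are hypothetical, chosen only for illustration; they are not part of any official grading tool.

```python
# Mark allocation as stated in the grading policy above.
# The dictionary keys are illustrative names, not official labels.
MAX_MARKS = {
    "programming_assignments": 4 * 10,  # four assignments, 10 marks each
    "course_project": 20,
    "mid_semester_exam": 15,
    "end_semester_exam": 25,
}

# Minimum total needed by an auditing student for an "AU" grade.
AUDIT_THRESHOLD = 50


def total_marks(scores):
    """Sum a student's marks, checking each against its stated maximum."""
    for component, score in scores.items():
        if component not in MAX_MARKS:
            raise KeyError(f"unknown component: {component}")
        if not 0 <= score <= MAX_MARKS[component]:
            raise ValueError(f"{component}: {score} is outside 0..{MAX_MARKS[component]}")
    return sum(scores.values())


def audit_passes(scores):
    """Return True if the total meets the audit threshold."""
    return total_marks(scores) >= AUDIT_THRESHOLD


if __name__ == "__main__":
    example = {
        "programming_assignments": 28,
        "course_project": 12,
        "mid_semester_exam": 6,
        "end_semester_exam": 9,
    }
    print("Maximum total:", sum(MAX_MARKS.values()))
    print("Example total:", total_marks(example))
    print("Meets audit threshold:", audit_passes(example))
```

The components sum to 100 marks, so the audit threshold of 50 corresponds to half of the total.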

Getting accounts

  1. Marks for the assessments will be maintained on the class Moodle page. Students who do not have an account on Moodle must send the instructor a request by e-mail with their roll number/employee number.
  2. All programming assignments will be evaluated on the Software Lab 2 machines. Students who do not have a CSE computing account should request access on this page.

Academic Honesty

Students are expected to adhere to the highest standards of integrity and academic honesty. Academic violations, as detailed below, will be dealt with strictly, in accordance with the institute's procedures and disciplinary actions for academic malpractice.

Copying or consulting any external sources during the examinations will be treated as cheating.

Students are expected to work alone on all the programming assignments. They may not share code or consult with classmates (or anybody other than the instructor and TAs) about their solutions. They also may not look at solutions to the given assignment or related ones on the Internet. Violations will be considered acts of dishonesty.

Students are allowed to use resources on the Internet for programming (say, to understand a particular command or a data structure), and also to understand concepts (so a Wikipedia page, someone's lecture notes, or a textbook can certainly be consulted). It is also okay to use libraries or code snippets for portions unrelated to the core logic of the assignment, typically for operations such as moving data or network communication. However, students must cite every resource consulted or used, whatever the reason, in a file named references.txt, which must be included in the submission. Failure to list any source will be considered an academic violation.

Texts and References

Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2nd edition, MIT Press, 2018. On-line version.

Algorithms for Reinforcement Learning, Csaba Szepesvári, Morgan & Claypool, 2009. On-line version.

Selected research papers.


This page will serve as the primary source of information regarding the course, the schedule, and related announcements. The Moodle page for the course will primarily be used for recording grades.

E-mail is the best means of communicating with the instructor; students must include "[CS747]" in the subject line and mark a copy to the TAs.