CS 747: Foundations of Intelligent and Learning Agents
(Autumn 2016)

(Picture source: http://www.nature.com/polopoly_fs/7.33483.1453824868!/image/WEB_Go---1.jpg_gen/derivatives/landscape_630/WEB_Go---1.jpg.)


Instructor

  Shivaram Kalyanakrishnan
  Office: Room 220, New CSE Building
  Phone: 7704
  E-mail: shivaram@cse.iitb.ac.in

Teaching Assistants

  Samiran Roy
  Office: 402, New CSE Building, Desk 39
  E-mail: samiranroy@cse.iitb.ac.in

  A. Siddharth
  Office: 402, New CSE Building, Desk 40
  E-mail: siddarth@cse.iitb.ac.in

  Ashish Ramteke
  Office: SynerG Lab, KReSIT Building, Desk B-2
  E-mail: ashishr@cse.iitb.ac.in


Class

Lectures will be held in F. C. Kohli Auditorium, KReSIT Building, during Slot 6: 11.05 a.m. – 12.30 p.m. on Wednesdays and Fridays.

Office hours immediately follow class and run until 1.15 p.m. on Wednesdays and Fridays. Meetings can also be arranged by appointment.

Course Description

Today's computing systems are becoming increasingly adaptive and autonomous: they are akin to intelligent, decision-making "agents". With its roots in artificial intelligence and machine learning, this course covers the foundational principles of designing such agents. Topics covered include: (1) agency, intelligence, and learning; (2) exploration and multi-armed bandits; (3) Markov Decision Problems and planning; (4) reinforcement learning; (5) search; (6) multi-agent systems and multi-agent learning; and (7) case studies.

The course will adopt a "hands-on" approach, with programming assignments designed to highlight the relationship between theory and practice. Case studies, as well as invited talks from experts, will offer an "end-to-end" view of deployed agents. It is hoped that students can apply what they learn in this course to their own pursuits in various areas of computer science and related fields.


Prerequisites

The course is open to all Ph.D. students, all masters students, and undergraduate/dual-degree students in their fourth (or higher) year of study.

The course has no formal prerequisite courses. However, class lectures and assignments will assume that the student is comfortable with probability and algorithms. The course has an intensive programming component: based on ideas discussed in class, the student must be able to independently design, implement, and evaluate programs in a language of their choice. The student must be prepared to spend a significant amount of time on the programming component of the course.


Evaluation

Grades will be decided based on four programming assignments, each worth 10 marks; a programming project worth 20 marks; a mid-semester examination worth 15 marks; and an end-semester examination worth 25 marks, for a total of 100 marks.

The programming assignments and project must be turned in through Moodle.

Students auditing the course must score 50 or more marks out of 100 in the course to be awarded an "AU" grade.

Academic Honesty

Students are expected to adhere to the highest standards of integrity and academic honesty. Acts such as copying in the examinations and sharing code for the programming assignments will be dealt with strictly, in accordance with the institute's procedures and disciplinary actions for academic malpractice.

Texts and References

Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, MIT Press, 1998. On-line version.

Artificial Intelligence: A Modern Approach, Stuart J. Russell and Peter Norvig, 3rd edition, Prentice-Hall, 2009.

Algorithms for Reinforcement Learning, Csaba Szepesvári, Morgan & Claypool, 2009. On-line version.

Dynamic Programming and Optimal Control, Volume II, Dimitri P. Bertsekas, 4th edition, Athena Scientific, 2012.

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Sébastien Bubeck and Nicolò Cesa-Bianchi, Foundations and Trends in Machine Learning, Volume 5, Number 1, 2012. On-line version.

Selected research papers.


Communication

This page will serve as the primary source of information regarding the course, the schedule, and related announcements. The Moodle page for the course will be used for sharing resources for the lectures and assignments, and also for recording grades.

E-mail is the best means of communicating with the instructor; students must send e-mail with "[CS747]" in the header, with a copy marked to the TAs.