Title: Tutorial: Exploration and Multi-armed Bandits
Shivaram Kalyanakrishnan, Yahoo! Labs, Bangalore
Date & Time: October 30, 2013 14:30
Venue: S9, Old CSE Building, Second Floor
Imagine that we wish to increase the click-through rate (CTR) on the links in a web page, and designers have provided us with three possible configurations for rendering the page: D1 (black text on white background), D2 (white text on black background), and D3 (D1 with a smaller font size). The propensity of users to click on links will likely differ among D1, D2, and D3. Given that the CTRs of these configurations are unknown to begin with, how should we adaptively split traffic among them in order to maximize the total number of clicks? Beyond this illustrative example, the "explore or exploit?" dilemma arises in numerous practical applications, including clinical trials, packet routing, and game playing. This tutorial will introduce the theory of stochastic multi-armed bandits, which addresses how to trade off the exploration of competing options against the exploitation of the best one. Arguments based on intuition, mathematics, and experiments will lead the audience through the discussion. All are welcome to attend.
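To make the traffic-splitting question above concrete, here is a minimal sketch in Python of one simple strategy, epsilon-greedy, applied to the three configurations. It is only an illustration and not necessarily a method covered in the tutorial. The CTR values, the function name epsilon_greedy, and the parameters epsilon and seed are all assumptions made for the example.

```python
import random


def epsilon_greedy(true_ctrs, n_visitors, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy traffic splitting over page configurations.

    true_ctrs holds the click probabilities, which the algorithm never
    reads directly; it only observes simulated clicks.
    Returns (shows, clicks): per-configuration counts of page views and clicks.
    """
    rng = random.Random(seed)
    n_arms = len(true_ctrs)
    shows = [0] * n_arms
    clicks = [0] * n_arms
    for _ in range(n_visitors):
        # Explore with probability epsilon, or while some arm is still untried.
        if rng.random() < epsilon or 0 in shows:
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest empirical CTR so far.
            arm = max(range(n_arms), key=lambda a: clicks[a] / shows[a])
        # Simulate whether this visitor clicks.
        reward = 1 if rng.random() < true_ctrs[arm] else 0
        shows[arm] += 1
        clicks[arm] += reward
    return shows, clicks


# Hypothetical CTRs, chosen only for illustration (not from the announcement).
HYPOTHETICAL_CTRS = {"D1": 0.05, "D2": 0.03, "D3": 0.04}

shows, clicks = epsilon_greedy(list(HYPOTHETICAL_CTRS.values()), n_visitors=10000)
for name, s, c in zip(HYPOTHETICAL_CTRS, shows, clicks):
    print(f"{name}: shown {s} times, {c} clicks")
```

A larger epsilon spends more traffic on exploration, while a smaller one commits earlier to whichever configuration currently looks best. Choosing that balance well is the trade-off the tutorial addresses.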
Speaker Profile:
Shivaram Kalyanakrishnan is a scientist at Yahoo! Labs, Bangalore. His primary research interests lie in artificial intelligence and machine learning, spanning areas such as reinforcement learning, agents and multiagent systems, humanoid robotics, multi-armed bandits, and online advertising. He obtained his Ph.D. in Computer Science from the University of Texas at Austin (2011), and his B.Tech. in Computer Science and Engineering from the Indian Institute of Technology Madras (2004). He has used robot soccer extensively as a test domain for his research, and has actively contributed to initiatives such as RoboCup and the Reinforcement Learning competitions.
