Deep Reinforcement Learning for Scrabble
For my Master's thesis, which I'm working on with Prof. Shivaram Kalyanakrishnan at IIT Bombay, we're bootstrapping off genetic algorithms to build a deep Q-learning agent for the game of Scrabble. Our baseline has achieved a 40% win rate against Quackle, one of the stronger open-source agents, and we're confident of improving on this significantly. You can find the report for the first stage of my thesis here.
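As a rough illustration of the learning rule underlying such an agent, here is a minimal tabular Q-learning sketch; the deep version replaces the table with a neural network, and the states, actions, and rewards below are toy placeholders rather than our actual Scrabble encoding.

```python
import random

# Minimal tabular Q-learning (the "deep" variant swaps the table for a
# neural network). States/actions/rewards are illustrative placeholders.
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1
Q = {}  # (state, action) -> estimated value

def q_value(state, action):
    return Q.get((state, action), 0.0)

def choose_action(state, actions):
    # epsilon-greedy exploration over the available moves
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q_value(state, a))

def update(state, action, reward, next_state, next_actions):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max((q_value(next_state, a) for a in next_actions), default=0.0)
    td_target = reward + GAMMA * best_next
    Q[(state, action)] = q_value(state, action) + ALPHA * (td_target - q_value(state, action))

# one hypothetical transition: playing a word scored 10 points
update("s0", "play", 10.0, "s1", ["play", "exchange"])
```

In the deep variant, the `update` step becomes a gradient step on the squared TD error, with the network mapping a board/rack representation to move values.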
Causal Impact Modeling
During my internship at Adobe, we used a propensity-based model in the Rubin causal framework to measure the causal impact of marketing emails on user behaviour. We have filed a patent, P5668-US: Estimation of Causal Impact of Digital Marketing Content, with the USPTO, and I also received a pre-placement offer to join Adobe after graduation.
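To give a flavour of propensity-based estimation in the Rubin potential-outcomes framework, here is a sketch of an inverse-propensity-weighted (IPW) average-treatment-effect estimate; the data and propensity scores are made up for illustration (in practice the propensities would come from a model of treatment assignment given user covariates), and this is not the exact estimator from the patent.

```python
# Inverse-propensity-weighted (IPW) estimate of the average treatment
# effect. "Treatment" here would be receiving the marketing email.
# Propensities e_i = P(treated | covariates) are assumed given.

def ipw_ate(outcomes, treated, propensities):
    n = len(outcomes)
    treated_term = sum(y * t / e
                       for y, t, e in zip(outcomes, treated, propensities))
    control_term = sum(y * (1 - t) / (1 - e)
                       for y, t, e in zip(outcomes, treated, propensities))
    return treated_term / n - control_term / n

# toy data: outcome y (e.g. conversion), treatment indicator t, propensity e
y = [1.0, 0.0, 1.0, 0.0]
t = [1, 1, 0, 0]
e = [0.5, 0.5, 0.5, 0.5]
effect = ipw_ate(y, t, e)
```

With uniform propensities, as in this toy example, the estimate reduces to a simple difference of group means; non-uniform propensities reweight users to correct for selection into treatment.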
Curriculum Learning
I worked with Prof. Shivaram on implementing the curriculum learning technique proposed by Bengio et al. on the MNIST and Basic Shapes data sets. The idea is to improve optimization speed and accuracy by feeding increasingly high-entropy subsets of the training data to the optimizer. Each subset corresponds to an objective that is “easier” (more convex) than the next, and provides a good initialization point for optimizing over the next one. Some methods we tried were using the confusion matrix to create easily separable subsets, and introducing noise to identify good image-related features. You can find our report here.
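The subset-scheduling idea can be sketched as follows; the difficulty score here is a generic stand-in (in our experiments it came from confusion-matrix separability and injected noise), and the function names are hypothetical.

```python
# Curriculum learning skeleton: build cumulative training subsets of
# increasing difficulty, so each stage starts from the optimum of an
# easier objective. The difficulty function is a placeholder.

def make_curriculum(examples, difficulty, n_stages=3):
    """Return cumulative subsets ordered from easiest to hardest."""
    ordered = sorted(examples, key=difficulty)
    stage_size = -(-len(ordered) // n_stages)  # ceiling division
    return [ordered[: (i + 1) * stage_size] for i in range(n_stages)]

# toy examples tagged with an illustrative difficulty score
examples = [("x1", 0.9), ("x2", 0.1), ("x3", 0.5),
            ("x4", 0.3), ("x5", 0.7), ("x6", 0.2)]
stages = make_curriculum(examples, difficulty=lambda ex: ex[1], n_stages=3)

for stage in stages:
    pass  # train_model_on(stage): each stage includes all earlier, easier data
```

Each stage's training set contains every easier stage's data, so the optimizer is warm-started on a smoother objective before the full, high-entropy set is introduced.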
NP-completeness of LCPs
My first research experience was proving the NP-completeness of the Linear Complementarity Problem (LCP) via a new reduction from Independent Set (IS). I did this by adapting approaches from a related problem, after suitably defining an LCP for each instance of IS. Here is the report.
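For reference, an LCP instance is given by a matrix $M$ and a vector $q$, and asks for a vector $z$ satisfying the standard conditions (this is the textbook definition, not the specific construction from the reduction):

```latex
z \ge 0, \qquad w = Mz + q \ge 0, \qquad z^{\top} w = 0.
```

The reduction builds one such $(M, q)$ pair from each Independent Set instance so that a solution exists exactly when an independent set of the target size does.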