Talks & Seminars
Title: How to Escape Saddle Points Efficiently?
Dr. Praneeth Netrapalli, Microsoft Research India
Date & Time: March 25, 2019, 11:35 AM
Venue: Department of Computer Science and Engineering, Room No. 109, 1st Floor, New CSE/CC Building
Abstract:
Non-convex optimization is ubiquitous in modern machine learning applications. Gradient descent based algorithms (such as gradient descent or Nesterov's accelerated gradient descent) are widely used in practice since they have been observed to perform well on these problems. From a theoretical standpoint, however, this is quite surprising, since these non-convex problems are teeming with highly suboptimal saddle points, and it is well known that gradient descent (and its variants) can get stuck at such saddle points. A key question in resolving this apparent paradox is whether gradient descent based algorithms can efficiently escape saddle points, where the Hessian has negative eigenvalues, and converge to second-order stationary points, where the Hessian is positive semidefinite. We answer this question in the affirmative by showing the following: (1) Perturbed gradient descent converges to second-order stationary points in a number of iterations that depends only poly-logarithmically on the dimension (i.e., it is almost "dimension-free"); the convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. (2) A variant of Nesterov's accelerated gradient descent converges to second-order stationary points at a faster rate than perturbed gradient descent. The key technical components behind these results are (1) understanding the geometry around saddle points and (2) designing a potential function for Nesterov's accelerated gradient descent for non-convex problems. Based on joint work with Chi Jin, Rong Ge, Sham M. Kakade and Mike Jordan.
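To make the perturbation idea concrete, the following is a minimal sketch, not the exact algorithm from the papers, of perturbed gradient descent on a toy strict-saddle function. The test function f(x, y) = x^2 - y^2 (whose Hessian at the origin has a negative eigenvalue), the step size, the perturbation radius, and the stopping thresholds are all illustrative assumptions.

```python
# Sketch of perturbed gradient descent on a toy saddle.
# Assumed test function: f(x, y) = x^2 - y^2; the origin is a strict saddle.
# Hyperparameters below are illustrative, not the values from the paper.
import numpy as np

def f(z):
    x, y = z
    return x**2 - y**2

def grad_f(z):
    x, y = z
    return np.array([2.0 * x, -2.0 * y])

def perturbed_gradient_descent(z0, eta=0.05, g_thresh=1e-3, radius=1e-2,
                               max_iters=500, seed=0):
    rng = np.random.default_rng(seed)
    z = np.array(z0, dtype=float)
    for t in range(max_iters):
        g = grad_f(z)
        if np.linalg.norm(g) <= g_thresh:
            # Near a first-order stationary point: add a small random
            # perturbation so the iterate can slide off the saddle along
            # a direction of negative curvature.
            z = z + radius * rng.standard_normal(z.shape)
        else:
            z = z - eta * g  # ordinary gradient step
        if f(z) < f([0.0, 0.0]) - 1.0:
            # Function value is now well below the saddle value: escaped.
            return z, t
    return z, max_iters

# Starting on the stable manifold of the saddle (y = 0), plain gradient
# descent converges to the origin and stays there; the perturbation lets
# the iterate escape along the y-direction, where f decreases.
point, iters = perturbed_gradient_descent([0.5, 0.0])
print(f"escaped after {iters} iterations at {point}")
```

The sketch only illustrates the mechanism discussed in the abstract: take plain gradient steps while the gradient is large, and inject a small random perturbation once the gradient is small, so that the iterate does not remain stuck at a saddle point.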
Speaker Profile:
Praneeth Netrapalli has been a researcher at Microsoft Research India, Bengaluru, since August 2016. Prior to this, he was a postdoctoral researcher at Microsoft Research New England in Cambridge, MA. He obtained his MS and PhD in ECE from UT Austin, where he was advised by Sujay Sanghavi. Before coming to UT, he worked in the Quantitative Analysis group at Goldman Sachs, Bangalore. He obtained his B.Tech in Electrical Engineering from IIT Bombay. His research focuses on designing efficient algorithms for machine learning problems, primarily via stochastic and non-convex optimization.