Tuesday, December 6, 2016 - 12:00pm to 1:00pm
Location:3305 Newell-Simon Hall
Speaker:CHRISTIAN KROER, Ph.D. Student http://www.cs.cmu.edu/~ckroer/
For More Information, Contact:firstname.lastname@example.org
We study the problem of computing a Nash equilibrium in large-scale two-player zero-sum extensive-form games. While this problem can be solved in polynomial time, sparse iterative methods, in particular regret- based methods and first-order methods (FOMs), are much cheaper and effective and hence usually preferred for large games. Thus far, regret-based methods have largely been favored in practice in spite of their theoretically inferior convergence rates. In this talk we investigate the acceleration of FOMs both theoretically and numerically.
The convergence rates of FOMs depend heavily on the properties of the distance-generating function (DGF) that they are based on. We investigate the acceleration of FOMs for solving extensive-form games through a better design of the dilated entropy function--a class of DGFs related to the domains associated with the extensive-form games--and suggesting new sampling schemes for stochastic FOMs. Our new weighting scheme for the dilated entropy function establishes the first bound on the strong-convexity parameter for DGFs over general treeplexes, i.e., the strategy spaces of general sequential games, and it has no dependence on the branching factor of the player. Thus, we establish first explicit FOM convergence rates for general treeplexes, and significantly improve upon previous results given only for some special cases. Building on this result, we also introduce a class of gradient estimators which, along with our DGF, leads to the first stochastic FOM for extensive-form games.
Experimentally, we investigate the performance of three FOMs (the excessive gap technique, mirror prox, and stochastic mirror prox) and compare their performance to the regret-based algorithms. Equipped with our distance-generating function, we find that mirror prox and the excessive gap technique outperform the prior regret-based methods for finding medium accuracy solutions.
Joint work with Kevin Waugh, Fatma Kilinc-Karzan, and Tuomas Sandholm.
Presented in Partial Fulfillment of the CSD Speaking Skills Requirement