Brian Hu Zhang

New Solution Concepts and Algorithms for Equilibrium Computation and Learning in Extensive-Form Games and Beyond

Abstract

Computational game theory has led to significant breakthroughs in AI dating back to the start of AI as a discipline. For example, it has been instrumental in enabling superhuman AI in recreational games, ranging from two-player zero-sum games such as chess, go, and heads-up poker, to multiplayer games such as six-player poker and Hanabi, and even to games involving human language such as Diplomacy. It has also empowered a growing range of non-recreational applications, such as trading, machine learning robustness and safety, negotiation, conflict resolution, mechanism (e.g., auction) design, information design, security, political campaigning, and self-driving cars.

This thesis pushes the boundaries of computational game theory, especially in imperfect-information sequential (extensive-form) games, which are the most prevalent in practical applications, both zero-sum and beyond. We will present new theoretical concepts and frameworks, state-of-the-art and often provably optimal algorithms for computing and learning equilibria, and new ways to apply such algorithms to real-world problems, including problems in economics such as mechanism and information design. We will also draw connections to the broader literature on optimization, yielding new and more efficient algorithms for solving variational inequalities.

The thesis consists of two parts. We now highlight selected significant results from each part.

Part I covers the computation of optimal solutions to extensive-form games. We derive new scalable algorithms, solution concepts, and complexity results for computing optimal equilibria in a variety of extensive-form settings including two-player zero-sum games, team games, extensive-form correlated equilibria, and mechanism design. Though seemingly unrelated, the solutions to several of these problems turn out to rest on similar ideas surrounding the construction of a mediator that facilitates correlation or communication among the players, and often involve reductions to two-player zero-sum games, enabling the toolbox of zero-sum techniques, including those in this part, to be applied much more broadly. Among other results, we use our algorithm for hidden-role games to implement the first superhuman agent for Fog of War chess (also known as "dark chess"), a popular imperfect-information variant of chess in which common-knowledge closures are too large to be tackled by prior subgame-solving techniques; and to compute exact optimal strategies for several variants of The Resistance: Avalon.
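To make concrete what the zero-sum toolbox that such reductions unlock looks like, the following is a minimal sketch, not code from the thesis, of computing a maximin (Nash) strategy of a two-player zero-sum normal-form game by linear programming; the payoff matrix and the use of scipy.optimize.linprog are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Compute the row player's maximin strategy and game value for
    a zero-sum game with payoff matrix A (A[i, j] = row player's payoff).

    Solves: maximize v subject to
        sum_i x_i * A[i, j] >= v  for every column j,
        sum_i x_i = 1,  x_i >= 0.
    """
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0  # linprog minimizes, so minimize -v to maximize v
    # Row j encodes: v - sum_i x_i * A[i, j] <= 0
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # x must be a probability distribution; v is unconstrained in the equality
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Toy usage: matching pennies has value 0 and uniform optimal strategies.
x, v = solve_zero_sum(np.array([[1.0, -1.0], [-1.0, 1.0]]))
print(x, v)  # approximately [0.5, 0.5], 0.0
```

The point of reductions to two-player zero-sum games is precisely that, once a problem is cast in this form, decades of scalable zero-sum solving machinery (of which this LP is the simplest instance) become applicable.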

Part II covers algorithms for learning agents in games, as well as novel connections to optimization more broadly. The performance of a learning algorithm can be measured by the agent's regret. Different notions of regret can be characterized by different sets of strategy transformation functions ("deviations"); larger deviation sets result in tighter notions of regret. We develop new algorithms that achieve more robust notions of equilibrium, in particular robustness to arbitrary low-dimensional sets of deviations, along with matching lower bounds. We also show how to use techniques from the theory of learning in games to solve variational inequalities, leading to, among other results, the first polylog(1/ε)-time algorithm for solving variational inequalities assuming only the Minty property. Finally, we connect our earlier methods for computing optimal equilibria to the theory of learning in games, resulting in the first algorithms for steering no-regret learning agents to desirable equilibria.
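To fix notation for the "deviations" view described above (this is the standard Φ-regret formulation; the symbols are ours, not quoted from the thesis): if the learner plays strategies x_1, ..., x_T from a set X against utility functions u_1, ..., u_T, and Φ is a set of transformations φ : X → X, then its Φ-regret is

```latex
% Phi-regret after T rounds, for a deviation set Phi of maps phi : X -> X:
\mathrm{Reg}_T^{\Phi} \;=\; \max_{\phi \in \Phi} \sum_{t=1}^{T}
  \Bigl( u_t\bigl(\phi(x_t)\bigr) - u_t(x_t) \Bigr).
```

Taking Φ to be the constant maps recovers external regret; enlarging Φ (for example, to all swap deviations) strengthens the guarantee, which is the sense in which larger deviation sets give tighter notions of regret and correspondingly more robust equilibria.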
