Maria Balcan

Nina Balcan

Cadence Design Systems Professor


CMU Scholars Page

Office 8205 Gates and Hillman Centers


Phone (412) 268-5295

Machine Learning Department
Computer Science Department

Research/Teaching Statement

My research tackles fundamental questions in Machine Learning, Algorithmic Game Theory, and Algorithms. My work develops deep new connections between these areas, using ideas and insights from each of them to solve some of their central and emerging challenges in innovative ways.

Foundations for Machine Learning  Machine learning studies the design of automatic methods for extracting information from data and has become a tremendously successful discipline with a wide variety of important applications in areas such as robotics, healthcare, information retrieval, and sustainability. Its past successful evolution was heavily influenced by mathematical foundations developed for several core problems including generalizing from labeled data. However, with the variety of applications of machine learning across science, engineering, and computing in the age of Big Data, re-examining the underlying foundations of the field has become imperative. A major goal of my research is to substantially advance the field of machine learning by developing foundations and algorithms for a number of important modern learning paradigms. These include interactive learning, where the algorithm and the domain expert engage in a dialogue to facilitate more accurate learning from less data compared to the classic approach of passively observing labeled data; distributed learning, where a large dataset is distributed across multiple servers and the challenge lies in learning with limited communication; and multi-task learning, where the goal is to solve multiple related learning problems from less data by taking advantage of relationship among the learning tasks. My goal is to provide new frameworks explaining the fundamental underlying principles, as well as new powerful, principled, and practical learning algorithms designed to satisfy the new types of constraints and challenges of these modern settings (including statistical efficiency, computational efficiency, noise tolerance, limited supervision or interaction, privacy, low communication, and incentives).

Algorithmic Game Theory  Traditionally, complex systems involving multiple agents each with their own interests in mind have been analyzed through purely game theoretic lenses, but technologies such as the Internet have triggered an increased growth of research concerning algorithmic aspects as well. Yet these approaches are often limited to studying static concepts. My work goes further and shows how machine learning methods can help tackle fundamental open questions regarding information-gathering and dynamics in these settings. For example, in past work, I showed an exciting application of machine learning to automate aspects of auction design and formally address problems of market analysis for designing combinatorial pricing mechanisms with near-optimal revenue guarantees. Along different lines, my current work develops a new approach to analyzing the overall behavior of complex systems in which multiple agents with limited information are selfishly adapting their behavior over time based on past experience. My goal is to develop general techniques for influencing the behavior of natural learning dynamics towards globally good states, as well as to provide powerful tools to reason about economic agents as adaptive, learning entities.

Analysis of Algorithms beyond the Worst Case  Many important optimization problems are unfortunately provably hard even to approximate well on worst-case instances. However, real-world instances often satisfy certain natural regularities or stability properties. A recent direction in my work is designing algorithms for important optimization problems with strong formal guarantees under natural stability assumptions about the input instances. For example, in the context of clustering I showed that approximation stability assumptions (implicit when modeling clustering as approximately optimizing a distance-based objective, e.g., k-means) could be leveraged to overcome worst-case hardness results. I am interested to further analyze in this framework other problems of finding hidden structure in data. I additionally plan to identify other meaningful and generally applicable models of computation beyond worst-case analysis, that accurately model real-world instances and could provide a useful alternative to traditional worst-case models in a broad range of optimization problems.