Doctoral Thesis Proposal - Bernardo Subercaseaux
— 10:15am
Location:
In Person
-
Reddy Conference Room, Gates Hillman 4405
Speaker:
BERNARDO SUBERCASEAUX
,
Ph.D. Student
Computer Science Department
Carnegie Mellon University
https://bsubercaseaux.github.io/
Automated reasoning engines, and SAT solvers in particular, have become a powerful tool for tackling hard combinatorial problems. While SAT solvers have improved dramatically, their effectiveness still depends critically on how problems are encoded into conjunctive normal form (CNF). Encoding choices routinely account for runtime differences of several orders of magnitude, yet their design remains more art than science—guided by intuition and hard-won experience rather than a principled theory. This thesis aims to shed light on the design of effective SAT encodings, from both practical and theoretical perspectives.
The first part is dedicated to the art of encodings. We show how clever problem-specific encodings, together with other automated reasoning techniques, have allowed us to solve a variety of open problems in discrete mathematics ranging from graph coloring to discrete geometry. These encodings benefit from mathematical insights incorporated via additional constraints or auxiliary variables that represent semantically important aspects of the problem. Notably, some of these insights are themselves inspired by computational experiments, establishing a symbiotic relationship between automated reasoning and discrete mathematics. We also apply this art to computational problems in explainable AI—uncovering patterns in decision trees, nearest-neighbor classifiers, and other symbolic models—demonstrating the transferability of our techniques beyond mathematics.
The second part is dedicated to the foundations of a science of encodings. We present a landscape of theoretical results regarding the number of clauses and auxiliary variables required to encode several building blocks of propositional encodings, from cardinality constraints to arbitrary k-CNF functions. We posit that encoding boolean functions into CNF with as few clauses as possible offers a fascinating yet largely unexplored territory for circuit complexity. By leveraging auxiliary variables and wide clauses, this clause-minimization model permits rich combinatorial structures that surpass known lower bounds for circuits. Furthermore, while clause-minimization does not always correlate with solver performance, theoretical developments in this model have led us to novel encodings that run faster on practical problems. More broadly, this thesis aims to establish that the clause-minimization model is not only an elegant theory, but also a realistic path towards empirical speedups.
Thesis Committee
Marijn Heule (Chair)
Jeremy Avigad
Ruben Martins
Stefan Szeider (Technische Universität Wien)
Ryan Williams (Massachusetts Institute of Technology)
Additional Information
For More Information:
matthewstewart@cmu.edu