Mingkuan Xu
Optimization and Simulation of Quantum Circuits
Abstract
Optimizing and simulating quantum circuits at scale are critical bottlenecks in realizing the full potential of quantum computing. This thesis delivers a comprehensive suite of systems to overcome these limitations, spanning both physical compilation and classical simulation.
For quantum circuit optimization, we first automate the discovery and verification of transformation rules by introducing Equivalence Circuit Class (ECC) sets and an efficient generation algorithm for arbitrary gate sets. We deploy these rules within the superoptimizer Quartz, which utilizes a cost-based backtracking search to escape local minima that trap traditional rule-based methods. To eliminate the exponential runtime penalties inherent to pure search, we introduce QALM, a hybrid optimization framework. By unifying search-based exploration with greedy rule-based exploitation in an interleaved control loop, QALM dynamically escapes local minima without exhaustive backtracking. Evaluated across a large benchmark suite of 248 diverse circuits, QALM outperforms existing search-based optimizers in final circuit quality while operating with the computational efficiency of rule-based systems.
While the prior two approaches aim for global optimization, this problem is intrinsically QMA-hard, creating a bottleneck for large programs. To circumvent this issue and scale up, we introduce OAC, a cut-and-meld circuit optimization algorithm. OAC cuts a circuit into subcircuits, applies an existing oracle optimizer independently, and seamlessly melds the results. This algorithm operates with a linear number of oracle calls while attaining local optimality. Empirical evaluation shows that OAC significantly improves the efficiency of state-of-the-art optimizers while enhancing overall quality.
Beyond physical execution, the scalable simulation of quantum circuits on classical hardware presents another major challenge. To address this, we present Atlas, a distributed GPU-based simulator that hierarchically partitions circuits to exploit data parallelism while minimizing communication overhead. By leveraging integer linear programming to allocate structurally related gates to nearby GPUs and dynamic programming for kernel scheduling, Atlas runs significantly faster than prior state-of-the-art GPU simulators.
Together, these frameworks provide a robust toolchain, improving both the execution of quantum circuits on physical devices and their scalable classical simulation.