Master of Science in Computer Science Thesis Defense
— 3:00pm
Location:
In Person

NewellSimon 3002
Speaker:
SHUCHEN LI
,
Master's Student, Computer Science Department, Carnegie Mellon University
Robust Mean Estimation Against Oblivious Adversaries
We give the first polynomial time algorithm that achieves arbitrarily high accuracy for robust mean estimation, when the adversary corrupts the sample before the Laplace noise is added. For sufficiently small constant α > 0 and accuracy ε > 0, our algorithm takes as input a sample Y ⊆ ℝ^{d} of size n ≥ poly(d, ε^{i}) obtained by an i.i.d. sample X ⊆ ℝ^{d} of size n from the Laplace distribution with unknown mean μ and covariance I, while adversarially corrupting an alpha fraction of the points before the Laplace noise is added. Our algorithm runs in Õ(nd) time, and outputs an estimation μ^{ˆ} that ∥ μ^{ˆ}  μ ∥_{2} ≤ ε with high probability.
Our algorithm transforms the sample to a Fouriersparse signal that encodes the true mean μ, and applies the sparse Fourier transform to decode μ. In Huber’s contamination model, where the adversary corrupts an α fraction of the sample after the noise is added, it is known that there is a Ω(α) lower bound on the error of the estimator. In contrast, our algorithm can estimate the mean arbitrarily closely in this corruptionbeforenoise setting.
Our algorithm in this new setting has many possible applications. For example, the one that motivated us is maxaffine regression. In maxaffine regression, the model is the maximum of k linear models, where the noise is added after taking the maximum. If we can extend our algorithm to the listdecodable setting, then we immediately get an algorithm for the maxaffine regression.
Thesis Committee:
Pravesh Kothari (Chair)
David Woodruff