Computer Science Thesis Proposal

Tuesday, August 2, 2016 -
1:00pm to 2:00pm


8102 Gates & Hillman Centers


SHIVA KAUL, Ph.D. Student

For More Information, Contact:

A learning algorithm is agnostic if it doesn’t presume a perfect model of how input data produce output data. Such algorithms are difficult to design, even for the basic task of classifying data as well as the best linear separator. This has led to a persistent rift between practice and theory: popular algorithms, such as SVMs and logistic regression, are susceptible to noise; provable agnostic algorithms involve brute-force sampling (which uses too much time) or fitting polynomials (which uses too much data.) We recently introduced a new classification algorithm, KG, which is both practical and agnostic. I achieves promising experimental performance for both natural and artificial problems. We seek to deepen our theoretical understanding of the algorithm and expand its practical applications. The main question we shall answer is: when is KG provably fast? It eventually converges to the correct solution for a wide variety of input distributions. However, these intersect with a litany of hardness results, so restricting the input distribution seems necessary. Based on experimental evidence and the mechanics of the algorithm, we believe it is possible the algorithm runs in polynomial time when the inputs are normally distributed. If so, this algorithm would solve a notorious problem in computer science: learning logarithmically-sparse parities with noise. This would resolve a variety of challenges in learning theory, such as learning DNFs (encountered in 1984 by Valiant) and learning log-juntas (the subject of a prize offered in 2003 by Blum). As exciting as this possibility seems, it does not contradict known hardness results, nor does it upset the consensus on related problems in cryptography or complexity theory. We propose to gain more experimental and theoretical evidence for this possibility. In practice, many classification tasks involve multiple classes. When the number of classes is large, we do not believe fast agnostic classification is possible. We posit stronger lower bounds for classification with a growing number of classes which depend on P != NP rather than weaker conjectures about refuting random constraint satisfaction problems. Thesis Committee: Geoff Gordon (Chair) Avrim Blum Nina Balcan Varun Kanade (University of Oxford) Copy of Thesis Summary


Thesis Proposal