Computer Science Thesis Proposal

Tuesday, December 3, 2019 - 1:00pm


3305 Newell-Simon Hall



Coding for Synchronization Errors

Coding theory is the study of algorithms and techniques that facilitate reliable information transmission over noisy mediums, mostly through combinatorial objects called error-correcting codes. Following the inspiring works of Shannon and Hamming, a sophisticated and extensive body of research on error correcting codes has led to a deep and detailed theoretical understanding as well as practical implementations that have helped fuel the digital revolution. Error-correcting codes can be found in essentially all modern communication and computation systems. While being remarkably successful in understanding the theoretical limits and trade-offs of reliable communication under errors and erasures, the coding theory literature lags significantly behind when it comes to overcoming errors that concern the timing of communications. In particular, the study of correcting synchronization errors, i.e., symbol insertions and deletions, while initially introduced by Levenshtein in the 60s, has significantly fallen behind our highly sophisticated knowledge of codes for Hamming-type errors.

This thesis investigates coding against synchronization errors under a variety of models and attempts to understand trade-offs between different qualities of interest in respective coding schemes such as rate, distance, and algorithmic qualities of the code. Most of our results rely on synchronization strings, simple yet powerful pseudo-random objects that have proven to be very effective solutions for coping with synchronization errors in various settings.

Through indexing with strings that satisfy certain pseudo-random properties, we provide synchronization codes that achieve near-optimal rate-distance trade-off. We further attempt to provide constructions that enable fast encoding/decoding procedures. We study the same problem under the list-decoding regime, where the decoder is expected to provide a short list of codewords that is guaranteed to contain the sent message. We will also try to better understand fundamental limits of list-decoding for synchronization errors such as the list-decoding capacity or maximal error resilience for list-decodable codes. This thesis furthermore studies synchronization strings and other related pseudo-random string properties as combinatorial objects that are of independent interest. Such combinatorial objects will be used to extend some of our techniques to alternative communication problems such as coding from block transposition errors or coding for interactive communication.

Thesis Committee:
Bernhard Haeupler (Chair)
Venkatesan Guruswami
Rashmi Vinayak
Madhu Sudan (Harvard University)
Sergey Yekhanin (Microsoft Research)

Additional Proposal Information

For More Information, Contact:


Thesis Proposal