Computer Science Masters Thesis Presentation

Tuesday, July 21, 2015 - 4:00pm


8102 Gates & Hillman Centers


YIHUA FANG, 5th Year Masters Student

To use distributed systems effectively in machine learning (ML) applications, practitioners need considerable expertise in the area. Highly abstracted distributed frameworks such as Hadoop reduce the complexity of writing distributed code, but their performance does not match that of specialized implementations. In light of this observation, the Petuum project was intended to provide a new framework for implementing highly efficient distributed ML applications through a high-level programming interface. It consists of a core distributed ML system named Bosen, built on the Parameter Server (PS) paradigm, and a collection of ML applications implemented on top of Bosen. Unlike a general-purpose distributed framework, Bosen is designed to maximize the performance of ML algorithms by taking advantage of data correlation, staleness, and other statistical properties. Its interface provides a relatively easy-to-use programming model for read and write access to ML models, and it follows a Stale Synchronous Parallel (SSP) consistency model, which improves performance by allowing workers to read older, stale values. Our work consists of a number of performance improvements for the Bosen project, together with the Petuum JBosen project, a Java version of Bosen with a simplified system interface and implementation. To improve the performance of Bosen, we designed and experimented with new consistency models on top of SSP, new designs for the in-memory cache, and a fixed schedule to reduce network communication. For the JBosen project, we designed and implemented a new distributed ML framework in Java that incorporates the PS architecture and the SSP consistency model from the Bosen project. However, unlike the Bosen project, which aims to maximize performance, the JBosen project aims to maximize usability. It also serves to reach a wider audience who, for various reasons, are unable to use Bosen, which is written in C++, and to offer an easy starting point for developers who want to write programs on Bosen.

Thesis Committee:
Eric Xing
Kayvon Fatahalian

