Joint Systems Design and Implementation / Intel Science & Technology Center / Database Seminar

Thursday, October 25, 2018 - 12:00pm to 1:00pm

Location:

Panther Hollow Conference Room 4101 Robert Mehrabian Collaborative Innovation Center

Speaker:

FELIPE ARAMBURU, Chief Technology Officer at BlazingDB - https://www.linkedin.com/in/felipe-aramburu-707a5b48/

The Design & Implementation Of BlazingDB: An Open-Source GPU-Accelerated Database Management System


BlazingDB has spent the past six months working on an open-source project (libgdf) alongside Anaconda and NVIDIA. Libgdf is a library of computational primitives built on top of a memory layout that is similar to Apache Arrow but optimized for GPUs. We have created a distributed, GPU-accelerated ETL pipeline that takes a user from reading data in Parquet, to performing SQL operations over that dataset, and finally to feeding that data into XGBoost, a machine learning library that allows us to leverage GPUs.

In this talk, we will present the design and implementation of BlazingDB for GPU query processing. We will discuss how BlazingDB performs query optimization, distributes workloads over compute resources, and communicates between the different layers. We will also present our methods for using FlatBuffers for CPU data and CUDA IPC for GPU data. Lastly, we will describe our relational algebra engine, which operates on data via CUDA IPC, interprets query plans, and stores result sets. We will leverage the solutions mentioned above to accelerate a machine learning use case using the XGBoost library.

Felipe Aramburu is a maker. From aquaponics, beer, and cheese-making to home automation, he is obsessed with creating. Before becoming CTO of BlazingDB, he and his brother ran a consulting company based out of Peru, where they originally built BlazingDB as a tool to help them with their own consulting work. Before that, he was the CTO of kWhOURs, which provided a SaaS solution for energy auditing. Through BlazingDB he has become a high-performance junkie who spends nights dreaming about how hybrid processing systems are going to change the world.

Faculty Host: Andy Pavlo

Partially funded by Yahoo! Labs

Event Website:

http://www.pdl.cmu.edu/SDI/2018/102518.html

Keywords:

Seminar Series