Parallel Data Laboratory Summer Talks - Shasank Chavan and Greg Ganger

— 2:00pm

Virtual Presentations - Remote Access - Zoom


      Vice President, Data, In-Memory and AI Technologies, Oracle       

—  Leveraging Generative AI with Oracle AI Vector Search

AI Vector Search in Oracle 23ai is a new, transformative way to search your unstructured business data intelligently, efficiently, and accurately, using AI techniques to match on the semantics, or meaning, of the underlying data. With the addition of a new VECTOR datatype, new approximate search indexes, and new SQL operators and extensions, enterprises can quickly and easily leverage AI Vector Search to build modern, generative AI applications with just a few lines of SQL! And with this simplicity comes power, as AI Vector Search is fully integrated with Oracle’s enterprise-grade functionality, such as transactions, RAC, and Exadata. This talk will dive into the mechanics of AI Vector Search, providing a solid understanding of its implementation and benefits. 
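The core idea behind the talk, matching on meaning by comparing vector embeddings rather than keywords, can be illustrated outside the database. The following is a minimal Python sketch, not Oracle's API: the document names, embeddings, and the `vector_search` helper are all invented for illustration, and a real system would use an approximate index rather than an exact scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy stand-in for a VECTOR column: document id -> embedding.
# In practice, embeddings come from an ML model and live in the database.
documents = {
    "invoice_2023": [0.9, 0.1, 0.0],
    "support_ticket": [0.1, 0.8, 0.2],
    "travel_policy": [0.0, 0.2, 0.9],
}

def vector_search(query_vec, docs, top_k=2):
    """Exact nearest-neighbor search by cosine similarity.
    At scale, an approximate index (e.g., graph- or partition-based)
    replaces this full scan."""
    scored = sorted(docs.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

print(vector_search([0.85, 0.15, 0.05], documents))
# -> ['invoice_2023', 'support_ticket']
```

The semantic match falls out of vector geometry: the query vector lies closest to the invoice embedding, so that document ranks first even though no keywords were compared.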

Shasank Chavan is the Vice President of the Data, In-Memory and AI Technologies group at Oracle. He leads an organization of brilliant engineers working at the nexus of AI systems and modern databases. His team is currently hyper-focused on developing a next-generation, AI-centric data storage engine, designed for in-memory OLTP, analytics, and vector search capabilities to power the AI and generative AI revolution to come. Shasank earned his BS/MS in Computer Science at the University of California, San Diego. He has accumulated 50+ patents over 25 years of work on systems software technology. 

      Jatras Professor of Electrical and Computer Engineering, Carnegie Mellon University
      Director, Parallel Data Laboratory (PDL)

—  Cluster Storage Systems Need Declarative IO Interfaces 

Storage systems continue to be built around decades-old imperative interfaces, like read/write and get/put. Although this low-level interface can be used by any framework or application, it can lead to significant IO inefficiencies, especially for cases (e.g., data maintenance tasks like compaction, integrity checks, and rebalancing) in which caches tend to be least effective. Although not a new fact, IO efficiency is reaching emergency status, as the IOPS/TB (or BW/TB) available from each storage device in large-scale cluster storage drops with each increase in device capacity; new approaches are needed to use the available IOPS/TB more efficiently. It's time to augment cluster storage with declarative interfaces, whereby data maintenance tasks and data management applications can register their need for sets of data items and allow the storage system to orchestrate the corresponding IO. So, rather than converting order-flexible and time-flexible needs into an arbitrary ordering of "do this now" imperative IO, the flexibility can be exposed to and exploited by the storage system. With this flexibility, significant opportunities arise for eliminating redundant IO (e.g., data read for an integrity check could also be used for rebalancing), smoothing IO bursts, and coalescing IOs. This talk will describe the declarative IO concept, argue for its importance, discuss our early exploration of it, and invite discussion and collaboration.   
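The abstract's register-a-set-of-needs idea can be sketched concretely. The following Python toy is a hypothetical illustration of the concept, not PDL's actual system: the `DeclarativeStore` class and its `register`/`run` methods are invented names. Two tasks declare overlapping sets of blocks, and the "storage system" issues one device read per distinct block, sharing it among all interested tasks, which is how redundant IO gets eliminated.

```python
from collections import defaultdict

class DeclarativeStore:
    """Hypothetical declarative IO layer: tasks register the *set* of
    items they need (no ordering promised), and the store schedules IO."""

    def __init__(self, backing):
        self.backing = backing              # item -> bytes (stands in for disk)
        self.interest = defaultdict(list)   # item -> interested callbacks
        self.reads = 0                      # device IOs actually issued

    def register(self, items, callback):
        """Declare a need for a set of items; order and timing are flexible."""
        for item in items:
            self.interest[item].append(callback)

    def run(self):
        """Scheduler: one IO per distinct item, fanned out to every task."""
        for item, callbacks in self.interest.items():
            data = self.backing[item]       # the single, shared device read
            self.reads += 1
            for cb in callbacks:
                cb(item, data)
        self.interest.clear()

backing = {f"blk{i}": b"x" for i in range(4)}
store = DeclarativeStore(backing)
seen = defaultdict(list)

# An integrity check and a rebalancer both need overlapping blocks.
store.register(["blk0", "blk1", "blk2"], lambda k, d: seen["check"].append(k))
store.register(["blk1", "blk2", "blk3"], lambda k, d: seen["rebalance"].append(k))
store.run()

print(store.reads)  # 4 device reads instead of 6: the overlap is shared
```

With imperative read calls, the two tasks would issue six reads; with declared sets, the scheduler sees the overlap on blk1 and blk2 and issues four, and it is equally free to reorder or batch them to smooth bursts.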

Greg Ganger is the Jatras Professor of ECE and CS (by courtesy) at Carnegie Mellon University (CMU). Since 2001, he has also served as the Director of CMU's Parallel Data Laboratory (PDL), a research center focused on data storage and processing systems. He has broad research interests in computer systems, including storage/file systems, cloud computing, ML systems, distributed systems, and operating systems. He earned his collegiate degrees from the University of Michigan and did a postdoc at MIT before joining CMU. He still loves playing basketball; he's lost a step but developed a sweet 3-point shot. And, no, the surfing pictures are not photoshopped. 

Zoom Participation.  See announcement.
