Database Seminar - Jark Wu
Location:
Virtual Presentation (ET) - Remote Access via Zoom
Speaker:
JARK WU, Original Creator, Apache Fluss; Lead, Flink SQL and Fluss Teams, Alibaba Cloud
https://www.linkedin.com/in/jarkwu/
Monday, December 8, 2025, 4:30 – 5:30pm
Modern data lakehouses promise unified batch and streaming processing, yet their storage layer remains inherently batch-oriented—optimized for large, immutable files. This mismatch forces streaming workloads to rely on external systems (e.g., Kafka), while analytical queries operate on stale snapshots, breaking end-to-end freshness.
In this talk, I’ll present Apache Fluss (incubating), a lakehouse-native streaming storage system designed to bridge this gap. Fluss rethinks streaming storage from the ground up for analytical workloads. Its core abstraction is a columnar stream built on Apache Arrow, enabling sub-second ingestion and high-throughput analytical scans. Furthermore, Fluss introduces the "Streaming Lakehouse" concept, in which Fluss serves as the real-time data layer on top of the lakehouse. It allows query engines to seamlessly combine fresh streaming data in Fluss with historical data in the lakehouse (Iceberg) to achieve truly real-time data analytics.
—
Jark Wu is the original creator of Apache Fluss and a PMC member of Apache Flink. He currently leads the Flink SQL (streaming compute) and Fluss (streaming storage) teams at Alibaba Cloud, where he is dedicated to building a serverless Flink cloud service. His work has focused on data streaming systems for over a decade.
This talk is part of the Future Data Systems Seminar Series.
Zoom Participation. See announcement.
For More Information:
db-www@cs.cmu.edu