Thesis Oral Defense - Haithem Turki

— 5:00pm

In Person and Virtual - ET - Reddy Conference Room, Gates Hillman 4405 and Zoom

HAITHEM TURKI, Ph.D. Candidate, Computer Science Department, Carnegie Mellon University

Towards City-Scale Neural Rendering

Advances in neural rendering techniques have led to significant progress towards photo-realistic novel view synthesis. When combined with increases in data processing and compute capability, this promises to unlock numerous VR applications, including virtual telepresence, search and rescue, and autonomous driving. Large-scale virtual reality, long the domain of science fiction, feels markedly more tangible. 

This thesis explores the frontier of large-scale neural rendering by building upon Neural Radiance Fields (NeRFs), a family of methods attracting attention due to their state-of-the-art rendering quality and conceptual simplicity. In the less than three years since NeRF was introduced, research groups around the world have published at least 3,000 papers on these methods across numerous use cases. However, many shortcomings remain. The first is scale itself. Only a handful of existing methods capture scenes larger than a room. Those that do handle only static reconstruction, which limits their applicability. Another is speed, as rendering falls below interactive thresholds. Current acceleration methods remain too slow or degrade quality at high resolution. Quality is a third issue, as NeRF assumes ideal viewpoint conditions that are unrealistic in practice and degrades when they are violated.

We first explore scaling within the context of static reconstruction. We design a sparse network structure that specializes parameters to different regions of the scene and can be trained in parallel, allowing us to scale linearly as we increase model capacity (versus quadratically in the original NeRF) and to reconstruct urban-scale environments orders of magnitude larger than prior work. We then address dynamic reconstruction of entire cities and build the largest dynamic NeRF representation to date. To accelerate rendering, we improve sampling efficiency through a hybrid surface-volume representation that encourages the model to represent as much of the world as possible through surfaces (which require few samples per ray) while maintaining the freedom to render transparency and finer details (which pure surface representations struggle to capture). Finally, we propose a fast anti-aliasing method that greatly improves rendering quality when training with data collected from freeform camera trajectories. Importantly, our method incurs minimal performance overhead and is compatible with the scale and speed improvements previously mentioned.

Thesis Committee:

Deva Ramanan (Chair)
Shubham Tulsiani
Jessica K. Hodgins
Martial Hebert
Jonathan T. Barron (Google DeepMind)

In Person and Zoom Participation.  See announcement.
