This dissertation explores the synthesis of novel views of complex scenes by optimizing a volumetric scene function from a sparse set of input views. Our approach represents the scene as a neural radiance field (NeRF): a field of densities and emitted radiance parameterized by 5D coordinates comprising spatial location (x, y, z) and viewing direction (θ, φ). NeRF renders photorealistic novel views that surpass previous techniques, and has inspired numerous follow-up works and extensions in the computer vision and graphics communities. To enhance the representation of high-frequency details in NeRFs, we introduce a Fourier feature mapping that enables networks to learn high-frequency functions in low-dimensional problem domains, including NeRF. We demonstrate the benefits of learning initial weight parameters through standard meta-learning algorithms, yielding accelerated convergence, stronger priors, and improved generalization for coordinate-based networks. In addition, we improve the scalability of NeRFs with a proposed method capable of representing arbitrarily large scenes, enabling city-scale reconstructions from data captured under diverse environmental conditions. Finally, we present the Nerfstudio framework, a comprehensive suite of modular components and tools for developing and deploying NeRF-based methods. This framework provides researchers and practitioners with real-time visualization, streamlined data pipelines, and export capabilities, helping democratize NeRFs and extend their impact beyond research settings. With their potential to transform computer graphics, virtual reality, augmented reality, and other domains, NeRFs hold promise for revolutionizing the way we perceive and interact with digital worlds.
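The Fourier feature mapping mentioned above can be sketched briefly: low-dimensional input coordinates are projected through a matrix of frequencies and passed through sinusoids before entering the network, lifting them into a higher-dimensional space where high-frequency content is easier to fit. The following is a minimal illustrative sketch, not the dissertation's implementation; the Gaussian frequency matrix `B` and its scale are assumed hyperparameters.

```python
import numpy as np

def fourier_features(v, B):
    """Map coordinates v of shape (n, d) to features of shape (n, 2m),
    where B of shape (m, d) holds the sampled frequencies."""
    proj = 2.0 * np.pi * v @ B.T                       # (n, m) projections
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
B = rng.normal(scale=10.0, size=(256, 3))  # random Gaussian frequencies (assumed scale)
xyz = rng.uniform(size=(4, 3))             # four sample 3D points in [0, 1)^3
feats = fourier_features(xyz, B)
print(feats.shape)                         # (4, 512)
```

The resulting 512-dimensional features would then be fed to a coordinate-based MLP in place of the raw (x, y, z) inputs.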



