Abstract
Neural radiance fields (NeRF) encode a scene into a neural representation that enables photo-realistic rendering of novel views. However, a successful reconstruction from RGB images requires a large number of input views taken under static conditions, typically up to a few hundred images for a room-sized scene. In addition, the predicted volume density is usually less accurate than the predicted RGB, which leads to ghosting artifacts. To address these issues, a metric depth learning module is incorporated into the classic NeRF framework so that a depth prior can be leveraged for faster training and better rendering quality.
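The depth prior supervises the geometry (density) side of the model through the expected per-ray depth produced by NeRF's volume rendering. Below is a minimal PyTorch sketch of how that expected depth is computed from the predicted densities along a ray; the function and argument names are illustrative, not the exact names used in this codebase.

```python
import torch

def expected_ray_depth(sigma: torch.Tensor, t_vals: torch.Tensor) -> torch.Tensor:
    """Expected termination depth per ray from standard NeRF volume rendering.

    sigma:  (N_rays, N_samples) predicted volume densities along each ray
    t_vals: (N_rays, N_samples) sample distances along each ray
    Returns (N_rays,) expected depth, which the metric depth prior can supervise.
    """
    # Distances between adjacent samples (last interval padded with a large value).
    deltas = torch.cat([t_vals[:, 1:] - t_vals[:, :-1],
                        torch.full_like(t_vals[:, :1], 1e10)], dim=-1)
    # Opacity of each sample: alpha_i = 1 - exp(-sigma_i * delta_i).
    alpha = 1.0 - torch.exp(-sigma * deltas)
    # Accumulated transmittance T_i = prod_{j < i} (1 - alpha_j).
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha + 1e-10], dim=-1), dim=-1)[:, :-1]
    weights = alpha * trans            # per-sample contribution w_i
    return (weights * t_vals).sum(-1)  # depth = sum_i w_i * t_i
```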
Overview of the Architecture
Loss Function
$$ L = L_{rgb} + \lambda_d L_{depth} $$

where $\lambda_d$ denotes the weight of the depth loss $L_{depth}$.
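As a concrete illustration, the sketch below combines a standard NeRF photometric loss with a depth term that penalizes deviation of the rendered expected depth from the metric depth prior. It assumes both losses are mean squared errors over a ray batch; the function names, tensor layout, and default weight are hypothetical, not the exact implementation in this project.

```python
import torch
import torch.nn.functional as F

def combined_loss(rendered_rgb: torch.Tensor,
                  gt_rgb: torch.Tensor,
                  rendered_depth: torch.Tensor,
                  prior_depth: torch.Tensor,
                  lambda_d: float = 0.1) -> torch.Tensor:
    """Total loss L = L_rgb + lambda_d * L_depth.

    rendered_rgb / gt_rgb:        (N_rays, 3) predicted and ground-truth ray colors
    rendered_depth / prior_depth: (N_rays,) expected ray depth and metric depth prior
    lambda_d:                     weight of the depth term (illustrative default)
    """
    # Photometric term: standard NeRF MSE on ray colors.
    l_rgb = F.mse_loss(rendered_rgb, gt_rgb)
    # Depth term: pull the rendered expected depth toward the metric depth prior.
    l_depth = F.mse_loss(rendered_depth, prior_depth)
    return l_rgb + lambda_d * l_depth
```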
Results
Reconstructed model’s depth map with the depth learning prior (proposed):
Reconstructed model’s depth map without the depth learning prior (original NeRF):
Final Rendering Results: