Existing monocular depth estimation methods based on deep learning use RGB images as input and deeptrain models to learn the relationship between the RGB value and the depth value. Under different lighting conditions, the relative depth between the objects in the scene never changes. Thus, we suppose that information such as illumination and luster contained in the original image is depth-irrelevant. Based on this conjecture, we propose RADepthNet, which separates depth-related features from depthirrelevant ones and incorporates boundary features. Because depth values tend to be smooth inside the object but abrupt at the boundary, a boundary is a crucial clue for depth estimation. Thus, we propose using boundary features to improve depth prediction results.
To verify the effectiveness of the proposed method, we conducted depth estimation experiments on two datasets: NYU-Depth v2 and a newly built soccer video dataset, FIFADataset, which is publicly available.
Our main contributions are as follows:
- We propose RADepthNet, a novel network that separates depth-irrelevant information from depth-related information and fuses depth-related information with boundary features for better depth estimation.
- We propose a reflectance extraction module to decompose an image into depth-related reflectance and depth-irrelevant illumination based on the retinex theory and implicit constraints. Using only depth-related reflectance maps, we are able to reduce the interference of depth-irrelevant information on depth estimation accuracy.
- We construct a new dataset, FIFADataset, for depth estimation of soccer scenes, containing 6.5k pairs of RGB images and depth maps of soccer scenes extracted from FIFA soccer games. We conducted experiments using NYU-Depth v2 and FIFADataset. Extensive qualitative and quantitative results show that our model achieves state-of-the-art performance in monocular depth estimation.
However, FIFADataset is a synthetic dataset. The domain adaptation problem from synthetic to real datasets is a significant challenge. In the future, we will consider using synthetic data for depth estimation of real soccer game videos and improving the tracking accuracy of players and soccer balls by combining depth information.
Virtual Reality & Intelligent Hardware
RADepthNet: Reflectance-Aware Monocular Depth Estimation
Article Publication Date