News Release

RADepthNet: Reflectance-aware monocular depth estimation

Peer-Reviewed Publication

Beijing Zhongke Journal Publishing Co. Ltd.

Qualitative comparison with the state-of-the-art on the FIFADataset.

image: Lee et al. can make predictions close to the ground-truth depth for some images, but the predicted values in some regions deviate seriously from the ground truth. Xue et al. can generally predict the depth of small objects, such as players, but there is a large deviation between the predicted values and the ground truth. Our method outputs depth predictions closer to the ground truth.

Credit: Beijing Zhongke Journal Publishing Co. Ltd.

Existing monocular depth estimation methods based on deep learning take RGB images as input and train deep models to learn the relationship between RGB values and depth values. Under different lighting conditions, however, the relative depth between the objects in a scene does not change. Thus, we suppose that information such as illumination and luster contained in the original image is depth-irrelevant. Based on this conjecture, we propose RADepthNet, which separates depth-related features from depth-irrelevant ones and incorporates boundary features. Because depth values tend to be smooth inside an object but change abruptly at its boundary, boundaries are a crucial clue for depth estimation. Thus, we propose using boundary features to improve depth prediction results.
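The observation that depth is smooth inside an object but jumps at its boundary can be illustrated with a small sketch. This is only a toy demonstration on a synthetic depth map, not the boundary feature branch of RADepthNet; the function name `depth_boundaries`, the threshold, and the toy scene are assumptions made for illustration.

```python
import numpy as np

def depth_boundaries(depth, threshold):
    """Mark pixels where depth changes abruptly between neighbours."""
    # Finite-difference gradients along rows (axis 0) and columns (axis 1).
    dy, dx = np.gradient(depth.astype(np.float64))
    magnitude = np.hypot(dx, dy)
    return magnitude > threshold

# Hypothetical toy scene: a background at 10 m with a square object at 3 m.
depth = np.full((8, 8), 10.0)
depth[2:6, 2:6] = 3.0

# True on both sides of the object outline, False in the flat interior
# of the object and in the flat background.
edges = depth_boundaries(depth, threshold=1.0)
```

In this toy case the gradient is zero wherever depth is constant, so only pixels next to the object outline are flagged, which is why a boundary map carries information that complements per-pixel appearance.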

To verify the effectiveness of the proposed method, we conducted depth estimation experiments on two datasets: NYU-Depth v2 and a newly built soccer video dataset, FIFADataset, which is publicly available.

Our main contributions are as follows:

- We propose RADepthNet, a novel network that separates depth-irrelevant information from depth-related information and fuses depth-related information with boundary features for better depth estimation.

- We propose a reflectance extraction module that decomposes an image into depth-related reflectance and depth-irrelevant illumination, based on Retinex theory and implicit constraints. By using only the depth-related reflectance maps, we reduce the interference of depth-irrelevant information with depth estimation accuracy.

- We construct a new dataset, FIFADataset, for depth estimation of soccer scenes, containing 6.5k pairs of RGB images and depth maps of soccer scenes extracted from FIFA soccer games. We conducted experiments using NYU-Depth v2 and FIFADataset. Extensive qualitative and quantitative results show that our model achieves state-of-the-art performance in monocular depth estimation.
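Retinex theory models an image I as the pixel-wise product of reflectance R and illumination L, I = R * L. The sketch below shows a classical, non-learned approximation of that decomposition, in which the illumination is estimated by smoothing the image. It is not the reflectance extraction module of RADepthNet, which is learned and relies on implicit constraints; the function names `box_blur` and `retinex_decompose`, the blur radius, and the random test image are assumptions made for illustration.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with edge padding, used as a smooth illumination estimate."""
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def retinex_decompose(image, radius=2, eps=1e-6):
    """Split a grayscale image I into (R, L) such that I = R * L."""
    image = image.astype(np.float64)
    # Assumption: illumination varies slowly, so a blurred image approximates it.
    illumination = box_blur(image, radius) + eps
    reflectance = image / illumination
    return reflectance, illumination

# Hypothetical scene, then the same scene under half the global brightness.
rng = np.random.default_rng(0)
scene = rng.uniform(0.2, 1.0, size=(16, 16))
r_bright, l_bright = retinex_decompose(scene)
r_dim, l_dim = retinex_decompose(0.5 * scene)
```

Here `r_bright` and `r_dim` agree up to the small `eps` term while the illumination estimates differ by the brightness factor, which mirrors the idea that reflectance is the lighting-independent part of the image. This only covers a global brightness change; spatially varying lighting is harder and is one reason a learned decomposition is used.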

However, FIFADataset is a synthetic dataset. The domain adaptation problem from synthetic to real datasets is a significant challenge. In the future, we will consider using synthetic data for depth estimation of real soccer game videos and improving the tracking accuracy of players and soccer balls by combining depth information.

