DeepStereoVision

This project is funded by the German Federal Ministry of Education and Research (BMBF).

Stereo matching is one of the most popular approaches to depth estimation with binocular vision. In essence, by finding matching points between a pair of rectified left and right images, one can compute the disparity (the horizontal displacement) between these points and, given the camera calibration, infer the depth. This passive strategy contrasts with active techniques, which use lasers or light sources to measure depth, and it can therefore handle scenarios that are challenging for active sensors.
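The step from disparity to depth described above can be made concrete. For a rectified pair, depth is commonly computed as Z = f * B / d, where f is the focal length in pixels, B is the baseline between the two cameras in meters, and d is the disparity in pixels. The following is a minimal sketch of that relation; the function names and the calibration values are illustrative assumptions and are not taken from the project code.

```python
import math


def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Return depth in meters for one pixel using Z = f * B / d.

    A non-positive disparity means no valid match (or a point at
    infinity), so the depth is reported as infinity.
    """
    if disparity_px <= 0:
        return math.inf
    return focal_length_px * baseline_m / disparity_px


def disparity_map_to_depth_map(disparity_map, focal_length_px, baseline_m):
    """Apply the per-pixel conversion to a 2D list of disparities."""
    return [
        [disparity_to_depth(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]


# Hypothetical calibration: 700 px focal length, 0.5 m baseline.
example_disparities = [[0.0, 10.0], [20.0, 40.0]]
example_depths = disparity_map_to_depth_map(example_disparities, 700.0, 0.5)
print(example_depths)
```

Because depth is inversely proportional to disparity, a fixed disparity error translates into a larger depth error for distant points than for nearby ones.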

Stereo vision has benefited from deep neural networks, which have made binocular depth estimation more accurate. In particular, end-to-end networks, which handle the whole pipeline in a single network, have substantially increased disparity estimation accuracy compared with classical methods. However, these networks have a very high computational complexity, which often prevents them from running even on a moderate GPU.

As such, designing lightweight deep networks is becoming one of the most active research areas in the computer vision community. This is equally essential for deep stereo matching networks, especially when they must run on devices with memory constraints or in scenarios that demand fast runtimes. Accordingly, in this project, we mainly investigate methods to lower the computational burden of these networks while maintaining the high accuracy of the disparity maps.
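One widely used way to reduce this burden, and the one named in the title of a publication listed on this page, is to replace standard convolutions with separable ones. The following sketch only counts weights to illustrate the potential saving for a single 3D convolution layer. The depthwise-plus-pointwise factorization shown here is the generic form, which may differ from the exact designs in the publications, and the kernel size and channel counts are assumed values.

```python
def standard_conv3d_weights(kernel_size, in_channels, out_channels):
    """Weights in a standard 3D convolution (bias terms ignored)."""
    return kernel_size ** 3 * in_channels * out_channels


def separable_conv3d_weights(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable 3D convolution (bias ignored).

    Depthwise step: one k x k x k filter per input channel.
    Pointwise step: a 1 x 1 x 1 convolution that mixes channels.
    """
    depthwise = kernel_size ** 3 * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Assumed layer configuration: 3 x 3 x 3 kernel, 32 input and 32 output channels.
standard_count = standard_conv3d_weights(3, 32, 32)
separable_count = separable_conv3d_weights(3, 32, 32)
reduction_factor = standard_count / separable_count
print(standard_count, separable_count, round(reduction_factor, 2))
```

For this assumed configuration the separable layer needs 1,888 weights instead of 27,648, roughly a 14.6-fold reduction. Whether accuracy is preserved has to be evaluated empirically.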

Pipeline for Deep Neural Network-Based Stereo Methods

Datasets

In this project, we mainly work with the following publicly available datasets:

[1]	KITTI 2015 and KITTI 2012 Datasets
[2]	Sceneflow Dataset
[3]	Middlebury Dataset
[4]	DrivingStereo

An Interface for Evaluating DNN-based Depth Perception Models

An application developed in the Team Project (summer semester 2021) [Code]

Publications

[1]	Faranak Shamsafar, Samuel Woerz, Rafia Rahim, and Andreas Zell. “MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching”. In IEEE Winter Conference on Applications of Computer Vision (WACV), Hawaii, USA, January 2022. [Link] [Code] (*equal contribution)
[2]	Rafia Rahim, Faranak Shamsafar, and Andreas Zell. “Separable Convolutions for Optimizing 3D Stereo Networks”. In IEEE International Conference on Image Processing (ICIP), pages 3208-3212, Anchorage, AK, USA, September 2021. [Link] [Code] [Poster]