Efficient Stereo Matching Leveraging Deep Local and Context Information

Stereo matching is a challenging problem with respect to weak texture, discontinuities, illumination difference and occlusions. Therefore, a deep learning framework is presented in this paper, which focuses on the first and last stages of typical stereo methods: the matching cost computation and the...

Bibliographic Details
Main Authors: Xiaoqing Ye, Jiamao Li, Han Wang, Hexiao Huang, Xiaolin Zhang
Format: Article
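Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects: Stereo vision; matching cost; disparity refinement; convolutional neural network; occlusion restoration
Online Access: https://ieeexplore.ieee.org/document/8047938/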
id doaj-20a31df0e1dc4039888c914233844bd3
record_format Article
spelling doaj-20a31df0e1dc4039888c914233844bd3
Record updated: 2021-03-29T20:13:20Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Date: 2017-01-01
Volume: 5
Pages: 18745-18755
DOI: 10.1109/ACCESS.2017.2754318
Article number: 8047938
Title: Efficient Stereo Matching Leveraging Deep Local and Context Information
Xiaoqing Ye (https://orcid.org/0000-0003-3268-880X), Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China
Jiamao Li, Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China
Han Wang, Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China
Hexiao Huang, Shanghai Open University, Shanghai, China
Xiaolin Zhang, Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China
URL: https://ieeexplore.ieee.org/document/8047938/
Keywords: Stereo vision; matching cost; disparity refinement; convolutional neural network; occlusion restoration
collection DOAJ
language English
format Article
sources DOAJ
author Xiaoqing Ye
Jiamao Li
Han Wang
Hexiao Huang
Xiaolin Zhang
title Efficient Stereo Matching Leveraging Deep Local and Context Information
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description Stereo matching is a challenging problem with respect to weak texture, discontinuities, illumination difference and occlusions. Therefore, a deep learning framework is presented in this paper, which focuses on the first and last stages of typical stereo methods: the matching cost computation and the disparity refinement. For matching cost computation, two patch-based network architectures are exploited to allow a trade-off between speed and accuracy, both of which leverage multi-size and multi-layer pooling units with no strides to learn cross-scale feature representations. For disparity refinement, unlike traditional handcrafted refinement algorithms, we incorporate the initial optimal and sub-optimal disparity maps before outlier detection. Furthermore, diverse base learners are encouraged to focus on specific replacement tasks, corresponding to the smooth regions and details. Experiments on different datasets demonstrate the effectiveness of our approach, which is able to obtain sub-pixel accuracy and restore occlusions to a great extent. Specifically, our accurate framework attains near-peak accuracy in both non-occluded and occluded regions, and our fast framework achieves competitive performance against the fast algorithms on the Middlebury benchmark.
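The description mentions feeding both the initial optimal and sub-optimal disparity maps into the refinement stage. The abstract does not define these terms, so the following is only a minimal sketch under one plausible reading: the optimal map holds the lowest-cost disparity per pixel and the sub-optimal map holds the second-lowest. The function name `best_two_disparities` and the `(D, H, W)` cost-volume layout are assumptions made for illustration, not taken from the paper.

```python
import numpy as np


def best_two_disparities(cost_volume):
    """Return (optimal, sub_optimal) disparity maps from a cost volume.

    cost_volume has shape (D, H, W): cost_volume[d, y, x] is the matching
    cost of disparity d at pixel (y, x), where lower means a better match.
    This layout is an assumption of the sketch, not the paper's definition.
    """
    cost_volume = np.asarray(cost_volume)
    if cost_volume.ndim != 3 or cost_volume.shape[0] < 2:
        raise ValueError("expected shape (D, H, W) with D >= 2")
    # Sort disparity candidates per pixel by ascending cost; a stable sort
    # keeps the smaller disparity first when two costs are tied.
    order = np.argsort(cost_volume, axis=0, kind="stable")
    optimal = order[0]      # lowest-cost disparity per pixel
    sub_optimal = order[1]  # second-lowest-cost disparity per pixel
    return optimal, sub_optimal


if __name__ == "__main__":
    # Toy volume: 3 disparity candidates for a 1 x 2 image.
    demo = np.array([[[0.5, 0.2]],
                     [[0.1, 0.9]],
                     [[0.3, 0.4]]])
    opt, sub = best_two_disparities(demo)
    print(opt, sub)
```

Both maps have the image's height and width, so under this reading they could be stacked as input channels for a learned refinement step.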
topic Stereo vision
matching cost
disparity refinement
convolutional neural network
occlusion restoration
url https://ieeexplore.ieee.org/document/8047938/