Efficient Stereo Matching Leveraging Deep Local and Context Information
Stereo matching is a challenging problem with respect to weak texture, discontinuities, illumination difference and occlusions. Therefore, a deep learning framework is presented in this paper, which focuses on the first and last stage of typical stereo methods: the matching cost computation and the...
Main Authors: | Xiaoqing Ye, Jiamao Li, Han Wang, Hexiao Huang, Xiaolin Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2017-01-01 |
Series: | IEEE Access |
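Subjects: | Stereo vision, matching cost, disparity refinement, convolutional neural network, occlusion restoration |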
Online Access: | https://ieeexplore.ieee.org/document/8047938/ |
id |
doaj-20a31df0e1dc4039888c914233844bd3 |
---|---|
record_format |
Article |
spelling |
doaj-20a31df0e1dc4039888c914233844bd3; 2021-03-29T20:13:20Z; eng; IEEE; IEEE Access; 2169-3536; 2017-01-01; 5; 18745; 18755; 10.1109/ACCESS.2017.2754318; 8047938; Efficient Stereo Matching Leveraging Deep Local and Context Information; Xiaoqing Ye (0, https://orcid.org/0000-0003-3268-880X); Jiamao Li (1); Han Wang (2); Hexiao Huang (3); Xiaolin Zhang (4); Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China; Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China; Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China; Shanghai Open University, Shanghai, China; Chinese Academy of Sciences, Shanghai Institute of Microsystem and Information Technology, Shanghai, China; Stereo matching is a challenging problem with respect to weak texture, discontinuities, illumination difference and occlusions. Therefore, a deep learning framework is presented in this paper, which focuses on the first and last stage of typical stereo methods: the matching cost computation and the disparity refinement. For matching cost computation, two patch-based network architectures are exploited to allow the trade-off between speed and accuracy, both of which leverage multi-size and multi-layer pooling unit with no strides to learn cross-scale feature representations. For disparity refinement, unlike traditional handcrafted refinement algorithms, we incorporate the initial optimal and sub-optimal disparity maps before outlier detection. Furthermore, diverse base learners are encouraged to focus on specific replacement tasks, corresponding to the smooth regions and details. Experiments on different datasets demonstrate the effectiveness of our approach, which is able to obtain sub-pixel accuracy and restore occlusions to a great extent. Specifically, our accurate framework attains near-peak accuracy both in non-occluded and occluded region and our fast framework achieves competitive performance against the fast algorithms on Middlebury benchmark.; https://ieeexplore.ieee.org/document/8047938/; Stereo vision; matching cost; disparity refinement; convolutional neural network; occlusion restoration |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xiaoqing Ye; Jiamao Li; Han Wang; Hexiao Huang; Xiaolin Zhang |
spellingShingle |
Xiaoqing Ye; Jiamao Li; Han Wang; Hexiao Huang; Xiaolin Zhang; Efficient Stereo Matching Leveraging Deep Local and Context Information; IEEE Access; Stereo vision; matching cost; disparity refinement; convolutional neural network; occlusion restoration |
author_facet |
Xiaoqing Ye; Jiamao Li; Han Wang; Hexiao Huang; Xiaolin Zhang |
author_sort |
Xiaoqing Ye |
title |
Efficient Stereo Matching Leveraging Deep Local and Context Information |
title_short |
Efficient Stereo Matching Leveraging Deep Local and Context Information |
title_full |
Efficient Stereo Matching Leveraging Deep Local and Context Information |
title_fullStr |
Efficient Stereo Matching Leveraging Deep Local and Context Information |
title_full_unstemmed |
Efficient Stereo Matching Leveraging Deep Local and Context Information |
title_sort |
efficient stereo matching leveraging deep local and context information |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2017-01-01 |
description |
Stereo matching is a challenging problem with respect to weak texture, discontinuities, illumination differences and occlusions. Therefore, a deep learning framework is presented in this paper, which focuses on the first and last stages of typical stereo methods: the matching cost computation and the disparity refinement. For matching cost computation, two patch-based network architectures are exploited to allow a trade-off between speed and accuracy, both of which leverage multi-size and multi-layer pooling units with no strides to learn cross-scale feature representations. For disparity refinement, unlike traditional handcrafted refinement algorithms, we incorporate the initial optimal and sub-optimal disparity maps before outlier detection. Furthermore, diverse base learners are encouraged to focus on specific replacement tasks, corresponding to the smooth regions and details. Experiments on different datasets demonstrate the effectiveness of our approach, which is able to obtain sub-pixel accuracy and restore occlusions to a great extent. Specifically, our accurate framework attains near-peak accuracy in both non-occluded and occluded regions, and our fast framework achieves competitive performance against the fast algorithms on the Middlebury benchmark. |
topic |
Stereo vision; matching cost; disparity refinement; convolutional neural network; occlusion restoration |
url |
https://ieeexplore.ieee.org/document/8047938/ |
work_keys_str_mv |
AT xiaoqingye efficientstereomatchingleveragingdeeplocalandcontextinformation AT jiamaoli efficientstereomatchingleveragingdeeplocalandcontextinformation AT hanwang efficientstereomatchingleveragingdeeplocalandcontextinformation AT hexiaohuang efficientstereomatchingleveragingdeeplocalandcontextinformation AT xiaolinzhang efficientstereomatchingleveragingdeeplocalandcontextinformation |
_version_ |
1724195067944501248 |