CSANet: Channel and Spatial Mixed Attention CNN for Pedestrian Detection

Current mainstream pedestrian detectors tend to profit directly from convolutional neural networks (CNNs) that are designed for image classification. While requiring a large downsampling factor to produce high-level semantic features, CNNs cannot adaptively focus on the useful channels and regions o...

Bibliographic Details
Main Authors:	Yunbo Zhang, Pengfei Yi, Dongsheng Zhou, Xin Yang, Deyun Yang, Qiang Zhang, Xiaopeng Wei
Format:	Article
Language:	English
Published:	IEEE 2020-01-01
Series:	IEEE Access
Subjects:	Convolutional neural network; dual attention network; pedestrian detection
Online Access:	https://ieeexplore.ieee.org/document/9060881/

id	doaj-b863a4c7ad7c4fd589080e34e971e247
record_format	Article
spelling	doaj-b863a4c7ad7c4fd589080e34e971e247; 2021-03-30T02:20:59Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 76243-76252; DOI 10.1109/ACCESS.2020.2986476; IEEE Xplore document 9060881
	Title: CSANet: Channel and Spatial Mixed Attention CNN for Pedestrian Detection
	Authors: Yunbo Zhang (0); Pengfei Yi (1); Dongsheng Zhou (2), https://orcid.org/0000-0003-3414-9623; Xin Yang (3), https://orcid.org/0000-0002-8046-722X; Deyun Yang (4); Qiang Zhang (5), https://orcid.org/0000-0003-3776-9799; Xiaopeng Wei (6)
	Affiliations:
	(0), (1), (2), (5) Key Laboratory of Advanced Design and Intelligent Computing (Ministry of Education), School of Soft Engineering, Dalian University, Dalian, China
	(3), (6) School of Computer Science and Technology, Dalian University of Technology, Dalian, China
	(4) School of Information Science and Technology, Taishan University, Tai’an, China
	Online access: https://ieeexplore.ieee.org/document/9060881/
	Keywords: Convolutional neural network; dual attention network; pedestrian detection
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Yunbo Zhang Pengfei Yi Dongsheng Zhou Xin Yang Deyun Yang Qiang Zhang Xiaopeng Wei
author_sort	Yunbo Zhang
title	CSANet: Channel and Spatial Mixed Attention CNN for Pedestrian Detection
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2020-01-01
description	Current mainstream pedestrian detectors tend to build directly on convolutional neural networks (CNNs) designed for image classification. Such CNNs require a large downsampling factor to produce high-level semantic features, yet they cannot adaptively focus on the useful channels and regions of the feature maps, which limits pedestrian detection accuracy. To obtain higher accuracy, we propose a single-stage pedestrian detector with channel and spatial attention (CSANet), which automatically locates useful channels and regions while extracting features. Unlike the backbones of mainstream pedestrian detectors, the backbone of CSANet effectively highlights likely pedestrian regions and suppresses the background. Specifically, we model contextual dependencies along the channel and spatial dimensions of the feature maps, respectively. The channel attention module selectively directs the CNN to focus on key channels by integrating associated features. Meanwhile, the spatial attention module highlights semantic pixels by aggregating similar features across all channels. Finally, the two modules are connected in series to further enhance the representation of the feature maps. Experimental results show that CSANet achieves state-of-the-art performance with an $MR^{-2}$ of 3.55% on the Caltech dataset and competitive performance on the CityPersons dataset while maintaining high computational efficiency.
topic	Convolutional neural network; dual attention network; pedestrian detection
url	https://ieeexplore.ieee.org/document/9060881/
_version_	1724185334219014144
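
The abstract describes a channel attention module and a spatial attention module connected in series, but this record gives no equations or layer details. The following is only a rough sketch of that idea, assuming a softmax affinity form of self-attention with a residual connection; the function names, the `scale` parameter, the channel-then-spatial order, and the plain nested-list feature map of shape C x N (channels by flattened positions) are all illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    # Plain matrix product of nested lists: (p x q) @ (q x r) -> (p x r).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_scaled(x, y, scale):
    # Residual connection: x + scale * y, elementwise.
    return [[xi + scale * yi for xi, yi in zip(xr, yr)] for xr, yr in zip(x, y)]

def channel_attention(x, scale=1.0):
    # x is C x N. Each channel is re-expressed as a weighted mix of all
    # channels, weighted by channel-to-channel similarity (C x C affinity).
    affinity = [softmax(r) for r in matmul(x, transpose(x))]
    return add_scaled(x, matmul(affinity, x), scale)

def spatial_attention(x, scale=1.0):
    # Each position is re-expressed as a weighted mix of all positions,
    # weighted by position-to-position similarity (N x N affinity).
    xt = transpose(x)                               # N x C
    affinity = [softmax(r) for r in matmul(xt, x)]  # N x N
    mixed = transpose(matmul(affinity, xt))         # back to C x N
    return add_scaled(x, mixed, scale)

def csa_block(x, scale=1.0):
    # The two modules applied in series (order assumed: channel, then spatial).
    return spatial_attention(channel_attention(x, scale), scale)

# Toy feature map: 2 channels, 3 spatial positions.
features = [[1.0, 0.0, 2.0],
            [0.5, 1.5, 0.0]]
enhanced = csa_block(features)
```

The output keeps the input shape, so under these assumptions such a block could sit between backbone stages without changing downstream layers; setting `scale` to 0 reduces it to the identity.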

Similar Items