Vehicle Pedestrian Detection Method Based on Spatial Pyramid Pooling and Attention Mechanism

Object detection for vehicles and pedestrians is extremely difficult to achieve in autopilot applications for the Internet of vehicles, and it is a task that requires the ability to locate and identify smaller targets even in complex environments. This paper proposes a single-stage object detection...

Bibliographic Details
Main Authors: Mingtao Guo, Donghui Xue, Peng Li, He Xu
Format: Article
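Language: English
Published: MDPI AG 2020-12-01
Series: Information
Subjects: Internet of vehicles; autonomous driving; object detection; attention mechanisms; spatial pyramid pooling
Online Access: https://www.mdpi.com/2078-2489/11/12/583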
id doaj-69241b3d83344e1f8492cd8623f3013f
record_format Article
spelling doaj-69241b3d83344e1f8492cd8623f3013f
2020-12-17T00:04:53Z
eng
MDPI AG
Information
2078-2489
2020-12-01
11
583
583
10.3390/info11120583
Vehicle Pedestrian Detection Method Based on Spatial Pyramid Pooling and Attention Mechanism
Mingtao Guo (0)
Donghui Xue (1)
Peng Li (2)
He Xu (3)
(0-3) School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
collection DOAJ
language English
format Article
sources DOAJ
author Mingtao Guo
Donghui Xue
Peng Li
He Xu
title Vehicle Pedestrian Detection Method Based on Spatial Pyramid Pooling and Attention Mechanism
publisher MDPI AG
series Information
issn 2078-2489
publishDate 2020-12-01
description Object detection for vehicles and pedestrians is extremely difficult to achieve in autopilot applications for the Internet of vehicles, and it is a task that requires the ability to locate and identify smaller targets even in complex environments. This paper proposes a single-stage object detection network (YOLOv3-promote) for the detection of vehicles and pedestrians in complex environments in cities, which improves on the traditional You Only Look Once version 3 (YOLOv3). First, spatial pyramid pooling is used to fuse local and global features in an image to better enrich the expression ability of the feature map and to more effectively detect targets with large size differences in the image; second, an attention mechanism is added to the feature map to weight each channel, thereby enhancing key features and removing redundant features, which allows for strengthening the ability of the feature network to discriminate between target objects and backgrounds; lastly, the anchor box derived from the K-means clustering algorithm is fitted to the final prediction box to complete the positioning and identification of target vehicles and pedestrians. The experimental results show that the proposed method achieved 91.4 mAP (mean average precision), 83.2 F1 score, and 43.7 frames per second (FPS) on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, and the detection performance was superior to the conventional YOLOv3 algorithm in terms of both accuracy and speed.
topic Internet of vehicles
autonomous driving
object detection
attention mechanisms
spatial pyramid pooling
url https://www.mdpi.com/2078-2489/11/12/583
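
Illustrative note on the description: the abstract states that anchor boxes are derived with the K-means clustering algorithm but gives no further detail in this record. The sketch below shows one common way such anchors are obtained for YOLO-style detectors, clustering ground-truth box sizes with 1 - IoU as the distance. The distance measure, the function names (iou_wh, kmeans_anchors), the parameters, and the sample box sizes are all assumptions made for illustration, not the authors' implementation or data.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors, using 1 - IoU as distance.

    Hypothetical helper: the record does not specify the distance or the
    number of anchors used in the paper.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise from k distinct sample boxes
    for _ in range(iters):
        # Assignment step: each box joins the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        # Update step: move each anchor to the mean size of its cluster.
        new_anchors = []
        for anchor, members in zip(anchors, clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchor)  # empty cluster: keep old anchor
        if new_anchors == anchors:
            break  # converged
        anchors = new_anchors
    # Return anchors ordered from smallest to largest area.
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Made-up box sizes in pixels, for illustration only (not KITTI data).
boxes = [(10, 30), (12, 34), (11, 28), (60, 40),
         (64, 44), (58, 38), (150, 90), (160, 100)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is why it is a frequent choice for anchor estimation; whether the paper does the same cannot be confirmed from this record.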