Multi-Head Self-Attention for 3D Point Cloud Classification

3D point cloud classification has been an active research topic in recent years. A 3D point cloud differs from regular data such as images and text: because a point cloud is unordered, two-dimensional (2D) convolutional neural networks (CNNs) are difficult to apply directly. When features are extracted from the input data, it is important to capture both global and local information effectively. In this paper, we propose a 3D model classification method based on a multi-head self-attention mechanism that consumes sparse point clouds and learns a robust latent representation of the 3D point cloud. The framework is composed of self-attention layers, multilayer perceptrons (MLPs), a fully connected (FC) layer, a max-pooling layer, and a softmax layer. The feature vector of each point includes its spatial coordinates and shape descriptors, and these are encoded by the self-attention layers to capture the relationships among points. The outputs of the attention heads are concatenated and fed into MLPs for feature extraction. Once the MLPs have transformed them to the expected dimension, a max-pooling layer aggregates them into high-level features, which are then passed to the fully connected layer. The softmax layer determines the category of the 3D model. The proposed method is evaluated on ModelNet40. Experimental results show that it is robust to rotation variance, position variance, and point sparsity.
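
As a concrete illustration of the pipeline the abstract describes (per-point features, multi-head self-attention, shared MLPs, max-pooling, and a classification head), here is a minimal PyTorch sketch. It is not the authors' implementation: the embedding width, head count, MLP sizes, the 33-dimensional shape descriptor, and the use of torch.nn.MultiheadAttention in place of the paper's attention layers are all assumptions made for illustration only.

import torch
import torch.nn as nn

class PointCloudMHSAClassifier(nn.Module):
    """Hypothetical multi-head self-attention classifier for point clouds."""
    def __init__(self, point_dim=3 + 33, embed_dim=64, num_heads=4, num_classes=40):
        super().__init__()
        # Project each point's raw features (xyz + shape descriptor) to the attention width.
        self.input_mlp = nn.Sequential(nn.Linear(point_dim, embed_dim), nn.ReLU())
        # Multi-head self-attention over the N points of a cloud; the head outputs
        # are concatenated internally by nn.MultiheadAttention.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Shared MLPs lift the attended per-point features to a higher dimension.
        self.feature_mlp = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(),
            nn.Linear(256, 1024), nn.ReLU())
        # Fully connected head; softmax is applied by the loss or at inference time.
        self.fc = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, num_classes))

    def forward(self, points):              # points: (B, N, point_dim)
        x = self.input_mlp(points)          # (B, N, embed_dim)
        x, _ = self.attn(x, x, x)           # self-attention: query = key = value
        x = self.feature_mlp(x)             # (B, N, 1024)
        x = x.max(dim=1).values             # max-pooling over points -> (B, 1024)
        return self.fc(x)                   # class logits, (B, num_classes)

# Usage: classify a batch of two sparse clouds with 256 points each.
model = PointCloudMHSAClassifier()
logits = model(torch.randn(2, 256, 36))    # 36 = 3 coordinates + assumed 33-dim descriptor
pred = logits.softmax(dim=-1).argmax(dim=-1)

The max over the point dimension yields a permutation-invariant cloud-level feature, which is how this kind of architecture copes with the unordered nature of a point cloud.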

Bibliographic Details
Main Authors: Xue-Yao Gao, Yan-Zhao Wang, Chun-Xiang Zhang, Jia-Qi Lu
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Volume: 9
Pages: 18137-18147
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3050488
Subjects: point cloud; convolutional neural network; self-attention mechanism; feature vector; spatial coordinates; shape descriptor
Online Access: https://ieeexplore.ieee.org/document/9319138/
Record ID: doaj-5e7b5ad95c584b338a1e96e76c2a0d80 (DOAJ record, last updated 2021-03-30T15:25:28Z)
Author Affiliations:
Xue-Yao Gao (https://orcid.org/0000-0001-8046-5796), School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
Yan-Zhao Wang (https://orcid.org/0000-0002-1954-3537), School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
Chun-Xiang Zhang (https://orcid.org/0000-0002-7676-6630), School of Software and Microelectronics, Harbin University of Science and Technology, Harbin, China
Jia-Qi Lu (https://orcid.org/0000-0001-7938-3371), College of Arts and Sciences, Northeast Agricultural University, Harbin, China