Multi-Head Self-Attention for 3D Point Cloud Classification
3D point cloud classification has been an active research topic in recent years. Unlike regular data such as images and text, a 3D point cloud is unordered, which makes two-dimensional (2D) convolutional neural networks (CNNs) hard to apply. When features are extracted from input data, it is important to e...
Main Authors: | Xue-Yao Gao, Yan-Zhao Wang, Chun-Xiang Zhang, Jia-Qi Lu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
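Subjects: | point cloud; convolutional neural network; self-attention mechanism; feature vector; spatial coordinates; shape descriptor |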
Online Access: | https://ieeexplore.ieee.org/document/9319138/ |
id |
doaj-5e7b5ad95c584b338a1e96e76c2a0d80 |
---|---|
record_format |
Article |
spelling |
doaj-5e7b5ad95c584b338a1e96e76c2a0d80; 2021-03-30T15:25:28Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 18137-18147; doi:10.1109/ACCESS.2021.3050488; 9319138; Multi-Head Self-Attention for 3D Point Cloud Classification; Xue-Yao Gao (https://orcid.org/0000-0001-8046-5796), Yan-Zhao Wang (https://orcid.org/0000-0002-1954-3537), Chun-Xiang Zhang (https://orcid.org/0000-0002-7676-6630), Jia-Qi Lu (https://orcid.org/0000-0001-7938-3371); School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China; School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China; School of Software and Microelectronics, Harbin University of Science and Technology, Harbin, China; College of Arts and Sciences, Northeast Agricultural University, Harbin, China; 3D point cloud classification has been an active research topic in recent years. Unlike regular data such as images and text, a 3D point cloud is unordered, which makes two-dimensional (2D) convolutional neural networks (CNNs) hard to apply. When features are extracted from input data, it is important to capture both global and local information effectively. In this paper, we propose a 3D model classification method based on a multi-head self-attention mechanism that consumes sparse point clouds and learns a robust latent representation of the 3D point cloud. The framework is composed of self-attention layers, multilayer perceptrons (MLPs), a fully connected (FC) layer, a max-pooling layer, and a softmax layer. The feature vector of each point includes spatial coordinates and shape descriptors, which are encoded by self-attention layers to extract the relationships among them. The outputs of the attention layers are concatenated and fed into MLPs to extract features. Once the MLPs have transformed them to the expected dimension, a max-pooling layer is applied to obtain high-level features, which are then passed to the fully connected layer. The softmax layer determines the category of the 3D model. The proposed method is evaluated on ModelNet40. Experimental results show that it is robust to rotation variance, position variance, and point sparsity.; https://ieeexplore.ieee.org/document/9319138/; Index terms: point cloud; convolutional neural network; self-attention mechanism; feature vector; spatial coordinates; shape descriptor |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xue-Yao Gao, Yan-Zhao Wang, Chun-Xiang Zhang, Jia-Qi Lu |
title |
Multi-Head Self-Attention for 3D Point Cloud Classification |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
3D point cloud classification has been an active research topic in recent years. Unlike regular data such as images and text, a 3D point cloud is unordered, which makes two-dimensional (2D) convolutional neural networks (CNNs) hard to apply. When features are extracted from input data, it is important to capture both global and local information effectively. In this paper, we propose a 3D model classification method based on a multi-head self-attention mechanism that consumes sparse point clouds and learns a robust latent representation of the 3D point cloud. The framework is composed of self-attention layers, multilayer perceptrons (MLPs), a fully connected (FC) layer, a max-pooling layer, and a softmax layer. The feature vector of each point includes spatial coordinates and shape descriptors, which are encoded by self-attention layers to extract the relationships among them. The outputs of the attention layers are concatenated and fed into MLPs to extract features. Once the MLPs have transformed them to the expected dimension, a max-pooling layer is applied to obtain high-level features, which are then passed to the fully connected layer. The softmax layer determines the category of the 3D model. The proposed method is evaluated on ModelNet40. Experimental results show that it is robust to rotation variance, position variance, and point sparsity. |
topic |
point cloud; convolutional neural network; self-attention mechanism; feature vector; spatial coordinates; shape descriptor |
url |
https://ieeexplore.ieee.org/document/9319138/ |
_version_ |
1724179522465562624 |
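
The pipeline outlined in the description (self-attention heads, concatenation, MLP, max-pooling, fully connected layer, softmax) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no layer sizes or attention formula, so the scaled dot-product attention, the single ReLU layer standing in for the MLPs, the random untrained weights, and every name and dimension below (`classify`, `init_params`, 6 input features per point, 2 heads) are assumptions made for illustration. Only the 40 output classes follow from ModelNet40.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def attention_head(x, wq, wk, wv):
    """One head of scaled dot-product self-attention over all points (assumed form)."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # (n_points x n_points) pairwise relationships
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def mlp(x, w):
    """A single shared linear layer with ReLU, applied to every point independently."""
    return [[max(0.0, val) for val in row] for row in matmul(x, w)]


def classify(points, params):
    """Attention heads -> concatenate -> MLP -> max-pool -> FC -> softmax."""
    heads = [attention_head(points, *head) for head in params["heads"]]
    concat = [sum((h[i] for h in heads), []) for i in range(len(points))]
    feats = mlp(concat, params["mlp"])
    pooled = [max(col) for col in zip(*feats)]  # order-independent global feature
    logits = matmul([pooled], params["fc"])[0]
    return softmax(logits)


def init_params(rng, d_in, d_head, n_heads, d_mlp, n_classes):
    """Hypothetical parameter layout: (Wq, Wk, Wv) per head, then MLP and FC weights."""
    return {
        "heads": [tuple(rand_matrix(rng, d_in, d_head) for _ in range(3))
                  for _ in range(n_heads)],
        "mlp": rand_matrix(rng, n_heads * d_head, d_mlp),
        "fc": rand_matrix(rng, d_mlp, n_classes),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # 6 features per point: 3 coordinates plus 3 placeholder descriptor values (assumption).
    params = init_params(rng, d_in=6, d_head=4, n_heads=2, d_mlp=16, n_classes=40)
    cloud = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(32)]
    probs = classify(cloud, params)
    print("predicted class index:", probs.index(max(probs)))
```

Because self-attention without positional encoding treats the points as a set and max-pooling takes a per-feature maximum, shuffling the input points leaves the predicted probabilities unchanged up to floating-point rounding, which is the property that lets such a network handle unordered point clouds.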