Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition

Speech feature learning is the key to speech mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. The traditional feature extraction method is effective but cannot find the inter-feature structure to generate the new...


Bibliographic Details
Main Authors:	Hong Chen, Yuan Lin, Yongming Li, Wei Wang, Pin Wang, Yan Lei
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Embedded hybrid feature sparse stacked autoencoder, ensemble learning, feature fusion, L1 regularization, speech mental health recognition
Online Access:	https://ieeexplore.ieee.org/document/9348883/

id	doaj-4e1029e37cad4c62884db95a4caa020b
record_format	Article
spelling	doaj-4e1029e37cad4c62884db95a4caa020b; 2021-03-30T15:27:03Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 28729-28741; DOI 10.1109/ACCESS.2021.3057382; IEEE Xplore document 9348883
	Hong Chen (Chongqing University Cancer Hospital, Chongqing, China)
	Yuan Lin, https://orcid.org/0000-0003-1564-5613 (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
	Yongming Li, https://orcid.org/0000-0002-7542-4356 (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
	Wei Wang (Chongqing University Cancer Hospital, Chongqing, China)
	Pin Wang, https://orcid.org/0000-0002-4214-0488 (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
	Yan Lei (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Hong Chen, Yuan Lin, Yongming Li, Wei Wang, Pin Wang, Yan Lei
title	Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2021-01-01
description	Speech feature learning is the key to recognizing mental health from speech. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. Traditional feature extraction is effective but cannot exploit the structure among features to generate new, high-quality features. This paper proposes an embedded hybrid feature deep sparse stacked autoencoder ensemble method to address this problem. First, speech features are extracted based on prior knowledge; these are called the original features. Second, the original features are embedded into the deep network (a sparse stacked autoencoder) to filter the output of the hidden layer, which enhances the complementarity between the deep features and the original features. Third, an L1-regularized feature selection mechanism reduces the hybrid feature set formed by combining the deep features and the original features. Finally, a manifold projection classifier ensemble is designed to improve the stability of classification. In addition, the paper proposes, for the first time, a speech collection scheme for mental health recognition, and the authors construct a large-scale Chinese mental health speech database to verify the proposed algorithm. In the experiments, the proposed algorithm is compared with representative related algorithms, and the results show that it achieves better classification accuracy. The proposed method combines the advantages of deep feature learning and traditional feature extraction more efficiently to solve the small-sample problem.
topic	Embedded hybrid feature sparse stacked autoencoder, ensemble learning, feature fusion, L1 regularization, speech mental health recognition
url	https://ieeexplore.ieee.org/document/9348883/
_version_	1724179401470377984
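
The abstract in this record describes an L1-regularized selection step applied to a hybrid feature set formed from deep and original features. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a plain coordinate-descent Lasso on a regression-style target, uses synthetic random numbers in place of real speech features and autoencoder outputs, and all names (soft_threshold, lasso_select, make_hybrid) are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def make_hybrid(original, deep):
    """Concatenate each sample's original and deep feature vectors."""
    return [o + d for o, d in zip(original, deep)]


def lasso_select(features, targets, lam, n_iters=100):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by coordinate descent.

    Returns the weights and the indices of features with nonzero weight.
    """
    n, d = len(features), len(features[0])
    weights = [0.0] * d
    residual = list(targets)  # y - Xw, with w initialized to zero
    col_sq = [sum(row[j] ** 2 for row in features) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that ignores feature j.
            rho = sum(
                row[j] * (r + weights[j] * row[j])
                for row, r in zip(features, residual)
            ) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * row[j] for row, r in zip(features, residual)]
                weights[j] = new_w
    selected = [j for j, w in enumerate(weights) if w != 0.0]
    return weights, selected


# Synthetic stand-ins (assumption): 3 "original" and 3 "deep" features per sample.
rng = random.Random(0)
n_samples = 200
original = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_samples)]
deep = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_samples)]
hybrid = make_hybrid(original, deep)

# Only hybrid features 0 (original) and 4 (deep) drive the synthetic target.
targets = [3.0 * row[0] - 2.0 * row[4] + rng.gauss(0, 0.1) for row in hybrid]

weights, selected = lasso_select(hybrid, targets, lam=0.1)
print("selected feature indices:", selected)
```

A larger lam removes more features; with a very large value every weight is driven to zero and nothing is selected. The paper's task is classification, so an L1-penalized classifier would be the closer analogue; the squared-error form is used here only to keep the sketch short.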


Similar Items