Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition
Speech feature learning is the key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. Traditional feature extraction methods are effective but cannot exploit the inter-feature structure to generate the new...
Main Authors: | Hong Chen, Yuan Lin, Yongming Li, Wei Wang, Pin Wang, Yan Lei |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
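Subjects: | Embedded hybrid feature sparse stacked autoencoder, ensemble learning, feature fusion, L1 regularization, speech mental health recognition |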
Online Access: | https://ieeexplore.ieee.org/document/9348883/ |
id |
doaj-4e1029e37cad4c62884db95a4caa020b |
---|---|
record_format |
Article |
spelling |
doaj-4e1029e37cad4c62884db95a4caa020b; 2021-03-30T15:27:03Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 28729; 28741; 10.1109/ACCESS.2021.3057382; 9348883; Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition; Hong Chen (0); Yuan Lin (1); https://orcid.org/0000-0003-1564-5613; Yongming Li (2); https://orcid.org/0000-0002-7542-4356; Wei Wang (3); Pin Wang (4); https://orcid.org/0000-0002-4214-0488; Yan Lei (5); Chongqing University Cancer Hospital, Chongqing, China; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China; Chongqing University Cancer Hospital, Chongqing, China; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China; Speech feature learning is the key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. Traditional feature extraction methods are effective but cannot exploit the inter-feature structure to generate new high-quality features. This paper proposes an embedded hybrid feature deep sparse stacked autoencoder ensemble method to solve this problem. First, speech features are extracted based on prior knowledge; these are called the original features. Second, the original features are embedded into the deep network (a sparse stacked autoencoder) to filter the output of the hidden layer, which enhances the complementarity between the deep features and the original features. Third, an L1-regularized feature selection mechanism is designed to reduce the hybrid feature set formed by combining the deep features and the original features. Finally, a manifold projection classifier ensemble is designed to enhance the stability of classification. In addition, this paper proposes, for the first time, a speech collection scheme for mental health recognition, and a large-scale Chinese mental health speech database is constructed to verify the proposed algorithm. In the experiments, the proposed algorithm is compared with representative related algorithms, and the results show that it achieves better classification accuracy. The proposed method combines the advantages of deep feature learning and traditional feature extraction more efficiently to address the small-sample problem.; https://ieeexplore.ieee.org/document/9348883/; Embedded hybrid feature sparse stacked autoencoder; ensemble learning; feature fusion; L1 regularization; speech mental health recognition |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hong Chen; Yuan Lin; Yongming Li; Wei Wang; Pin Wang; Yan Lei |
author_sort |
Hong Chen |
title |
Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
Speech feature learning is the key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. Traditional feature extraction methods are effective but cannot exploit the inter-feature structure to generate new high-quality features. This paper proposes an embedded hybrid feature deep sparse stacked autoencoder ensemble method to solve this problem. First, speech features are extracted based on prior knowledge; these are called the original features. Second, the original features are embedded into the deep network (a sparse stacked autoencoder) to filter the output of the hidden layer, which enhances the complementarity between the deep features and the original features. Third, an L1-regularized feature selection mechanism is designed to reduce the hybrid feature set formed by combining the deep features and the original features. Finally, a manifold projection classifier ensemble is designed to enhance the stability of classification. In addition, this paper proposes, for the first time, a speech collection scheme for mental health recognition, and a large-scale Chinese mental health speech database is constructed to verify the proposed algorithm. In the experiments, the proposed algorithm is compared with representative related algorithms, and the results show that it achieves better classification accuracy. The proposed method combines the advantages of deep feature learning and traditional feature extraction more efficiently to address the small-sample problem. |
topic |
Embedded hybrid feature sparse stacked autoencoder; ensemble learning; feature fusion; L1 regularization; speech mental health recognition |
url |
https://ieeexplore.ieee.org/document/9348883/ |
_version_ |
1724179401470377984 |
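
The description in this record names a hybrid feature set (original plus deep features), L1-regularized feature selection, and a classifier ensemble. The following is a minimal sketch of those ideas only, not the authors' implementation. Everything in it is an assumption made for illustration: the data are synthetic, the "deep" features are simple products of original features standing in for autoencoder hidden-layer outputs, the L1 step is a plain lasso solved by proximal gradient descent (ISTA), and the ensemble is reduced to a majority vote with no manifold projection.

```python
import random
from collections import Counter


def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_ista(X, y, lam, steps=500, lr=0.1):
    """Minimise (1/2n)*||Xw - y||^2 + lam*||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Residuals of the current linear fit.
        resid = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        # Gradient of the squared-error term.
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(d)]
        # Gradient step followed by L1 shrinkage.
        w = [soft_threshold(w[j] - lr * grad[j], lr * lam) for j in range(d)]
    return w


def majority_vote(predictions):
    """Combine the labels predicted by several base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


random.seed(0)

# Hypothetical stand-in for hand-crafted ("original") speech features.
original = [[random.gauss(0, 1) for _ in range(4)] for _ in range(80)]

# Hypothetical stand-in for deep features; a real pipeline would use the
# hidden-layer outputs of a trained sparse stacked autoencoder instead.
deep = [[x[1] * x[2], x[0] * x[3]] for x in original]

# Hybrid feature set: original and deep features side by side (6 columns).
hybrid = [o + h for o, h in zip(original, deep)]

# Toy target that depends on one original feature (index 0)
# and one deep feature (index 4), plus a little noise.
y = [2.0 * row[0] + 1.5 * row[4] + random.gauss(0, 0.1) for row in hybrid]

# L1-regularized selection: keep features whose weight is not shrunk to zero.
weights = lasso_ista(hybrid, y, lam=0.2)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)

# Toy ensemble decision from three hypothetical base-classifier outputs.
print("ensemble label:", majority_vote([1, 0, 1]))
```

The L1 penalty is what makes this a selection step rather than just a regression: weights of uninformative columns are driven to exactly zero, so the reduced hybrid set can be read directly from the nonzero entries of `weights`.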