Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition

Speech feature learning is key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. Traditional feature extraction is effective but cannot find the inter-feature structure to generate the new...

Bibliographic Details
Main Authors: Hong Chen, Yuan Lin, Yongming Li, Wei Wang, Pin Wang, Yan Lei
Format: Article
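Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Embedded hybrid feature sparse stacked autoencoder; ensemble learning; feature fusion; L1 regularization; speech mental health recognition
Online Access: https://ieeexplore.ieee.org/document/9348883/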
id doaj-4e1029e37cad4c62884db95a4caa020b
record_format Article
spelling doaj-4e1029e37cad4c62884db95a4caa020b 2021-03-30T15:27:03Z eng
IEEE, IEEE Access, ISSN 2169-3536, 2021-01-01, vol. 9, pp. 28729-28741
DOI 10.1109/ACCESS.2021.3057382, IEEE document 9348883
Hong Chen (Chongqing University Cancer Hospital, Chongqing, China)
Yuan Lin, https://orcid.org/0000-0003-1564-5613 (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
Yongming Li, https://orcid.org/0000-0002-7542-4356 (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
Wei Wang (Chongqing University Cancer Hospital, Chongqing, China)
Pin Wang, https://orcid.org/0000-0002-4214-0488 (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
Yan Lei (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China)
collection DOAJ
language English
format Article
sources DOAJ
author Hong Chen
Yuan Lin
Yongming Li
Wei Wang
Pin Wang
Yan Lei
title Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Speech feature learning is key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small-sample problem. Traditional feature extraction is effective but cannot find the inter-feature structure needed to generate new high-quality features. This paper proposes an embedded hybrid feature deep sparse stacked autoencoder ensemble method to address this problem. First, speech features are extracted based on prior knowledge; these are called original features. Second, the original features are embedded into the deep network (a sparse stacked autoencoder) to filter the output of the hidden layer, which enhances the complementarity between the deep features and the original features. Third, an L1-regularized feature selection mechanism reduces the hybrid feature set formed by combining the deep features and the original features. Finally, a manifold projection classifier ensemble is designed to improve the stability of classification. In addition, the paper proposes, for the first time, a speech collection scheme for mental health recognition, and constructs a large-scale Chinese mental health speech database to verify the proposed algorithm. In the experiments, the proposed algorithm is compared with representative related algorithms, and the results show that it achieves better classification accuracy. The proposed method combines the advantages of deep feature learning and traditional feature extraction more efficiently to solve the small-sample problem.
topic Embedded hybrid feature sparse stacked autoencoder
ensemble learning
feature fusion
L1 regularization
speech mental health recognition
url https://ieeexplore.ieee.org/document/9348883/
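The description above mentions an L1-regularized selection step that reduces the hybrid feature set (original features combined with deep features). The record does not give the paper's actual formulation, so the following is only a minimal sketch of the general idea, assuming a plain Lasso objective solved by proximal gradient descent (ISTA). All function names, the toy data, and the parameter values are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: a generic Lasso-based selector, NOT the paper's method.
# All names and the toy data below are hypothetical.

def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, zero out small values."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(rows, targets, lam, step=0.5, iterations=200):
    """Minimise (1/2n)*||Xw - y||^2 + lam*||w||_1 by proximal gradient descent.

    `step` must not exceed 1/L, where L is the largest eigenvalue of (1/n) X^T X;
    0.5 is safe for the toy data below but is not a general-purpose choice.
    """
    n, d = len(rows), len(rows[0])
    weights = [0.0] * d
    for _ in range(iterations):
        # Residuals Xw - y for the current weights.
        residuals = [
            sum(x_j * w_j for x_j, w_j in zip(row, weights)) - y
            for row, y in zip(rows, targets)
        ]
        # Gradient of the smooth least-squares term.
        gradient = [
            sum(row[j] * r for row, r in zip(rows, residuals)) / n
            for j in range(d)
        ]
        # Gradient step followed by soft-thresholding (the L1 part).
        weights = [
            soft_threshold(w_j - step * g_j, step * lam)
            for w_j, g_j in zip(weights, gradient)
        ]
    return weights


def select_features(weights, tolerance=1e-9):
    """Indices of features whose weight survived the L1 penalty."""
    return [j for j, w_j in enumerate(weights) if abs(w_j) > tolerance]


# Toy stand-ins: one "original" feature and two "deep" features per sample.
original_features = [[1.0], [1.0], [-1.0], [-1.0]]
deep_features = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]

# Hybrid feature set: concatenate original and deep features sample by sample.
hybrid = [o + d for o, d in zip(original_features, deep_features)]

# The toy target depends only on the first (original) feature.
targets = [2.0, 2.0, -2.0, -2.0]

weights = lasso_ista(hybrid, targets, lam=0.5)
selected = select_features(weights)
print(selected)
```

On this toy data only the first column carries information about the target, so the penalty drives the other two weights to zero and the selector keeps column 0 alone. With real speech features, the penalty strength and step size would need tuning, and the selected subset would then be passed on to the classifier ensemble.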