Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition

Master's === National Chung Hsing University === Institute of Statistics === 107 === Over the past 10 years, with the rise of artificial neural networks, machine learning has advanced rapidly in speech and image recognition. Chinese pronunciation can be divided into two parts: vowel and consonant. Taking <ㄈㄚˇ> as an example, <ㄈ> is the conson...

Bibliographic Details
Main Authors:	You-Cheng Lin, 林祐丞
Other Authors:	李宗寶
Format:	Others
Language:	zh-TW
Published:	2019
Online Access:	http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5337011%22.&searchmode=basic

id	ndltd-TW-107NCHU5337011
record_format	oai_dc
spelling	ndltd-TW-107NCHU53370112019-11-30T06:09:39Z http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5337011%22.&searchmode=basic Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition 最近鄰居法與卷積神經網路池化對中文母音辨識之探討 You-Cheng Lin 林祐丞 Master's National Chung Hsing University Institute of Statistics 107 Over the past 10 years, with the rise of artificial neural networks, machine learning has advanced rapidly in speech and image recognition. Chinese pronunciation can be divided into two parts: vowel and consonant. Taking <ㄈㄚˇ> as an example, <ㄈ> is the consonant and <ㄚˇ> is the vowel. There are 160 types of vowels, 36 types of consonants, and 5 tones, which together compose 1,391 Chinese pronunciations. In this thesis, the k-nearest neighbor (KNN) method and the convolutional neural network (CNN) are used to identify the vowel. The data were recorded from 20 speakers. After sampling, endpoint detection, and frame cutting, the parameter matrix of each pronunciation has dimension 53x39. The study then applies the KNN method and CNN models with different hyperparameters under no pooling, max pooling, and average pooling, and explores the effect of each combination on recognition accuracy. Because the CNN performs nonlinear fitting, its recognition rate is much higher than that of KNN, which is treated here as a linear method. The CNN configuration with the highest recognition rate uses 45 frames, a 5x5 kernel, kernel counts of (512, 1024, 2048), and average pooling; with it, the vowel recognition rate reaches 0.9647. 李宗寶 2019 thesis 21 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chung Hsing University === Institute of Statistics === 107 === Over the past 10 years, with the rise of artificial neural networks, machine learning has advanced rapidly in speech and image recognition. Chinese pronunciation can be divided into two parts: vowel and consonant. Taking <ㄈㄚˇ> as an example, <ㄈ> is the consonant and <ㄚˇ> is the vowel. There are 160 types of vowels, 36 types of consonants, and 5 tones, which together compose 1,391 Chinese pronunciations. In this thesis, the k-nearest neighbor (KNN) method and the convolutional neural network (CNN) are used to identify the vowel. The data were recorded from 20 speakers. After sampling, endpoint detection, and frame cutting, the parameter matrix of each pronunciation has dimension 53x39. The study then applies the KNN method and CNN models with different hyperparameters under no pooling, max pooling, and average pooling, and explores the effect of each combination on recognition accuracy. Because the CNN performs nonlinear fitting, its recognition rate is much higher than that of KNN, which is treated here as a linear method. The CNN configuration with the highest recognition rate uses 45 frames, a 5x5 kernel, kernel counts of (512, 1024, 2048), and average pooling; with it, the vowel recognition rate reaches 0.9647.
author2	李宗寶
author_facet	李宗寶 You-Cheng Lin 林祐丞
author	You-Cheng Lin 林祐丞
spellingShingle	You-Cheng Lin 林祐丞 Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition
author_sort	You-Cheng Lin
title	Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition
title_short	Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition
title_full	Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition
title_fullStr	Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition
title_full_unstemmed	Applying K-Nearest Neighbor and Convolutional Neural Network pooling on Mandarin Vowel Recognition
title_sort	applying k-nearest neighbor and convolutional neural network pooling on mandarin vowel recognition
publishDate	2019
url	http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5337011%22.&searchmode=basic
work_keys_str_mv	AT youchenglin applyingknearestneighborandconvolutionalneuralnetworkpoolingonmandarinvowelrecognition AT línyòuchéng applyingknearestneighborandconvolutionalneuralnetworkpoolingonmandarinvowelrecognition AT youchenglin zuìjìnlínjūfǎyǔjuǎnjīshénjīngwǎnglùchíhuàduìzhōngwénmǔyīnbiànshízhītàntǎo AT línyòuchéng zuìjìnlínjūfǎyǔjuǎnjīshénjīngwǎnglùchíhuàduìzhōngwénmǔyīnbiànshízhītàntǎo
_version_	1719300442630914048
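
The description field of this record compares max pooling and average pooling inside a CNN, with k-nearest neighbor (KNN) as the baseline classifier. As a rough illustration only, and not the thesis code, the pure-Python sketch below shows what the two pooling operations do to a 53x39 feature matrix and how a KNN majority vote works on flattened features. The 2x2 non-overlapping window, the Euclidean distance, the synthetic input, and the names pool2d, flatten, and knn_predict are all assumptions made for this example.

```python
"""Toy sketch of max/average pooling and a k-nearest-neighbor vote.

Assumptions (not from the thesis): 2x2 non-overlapping windows,
Euclidean distance, and synthetic feature values.
"""
from collections import Counter
from math import dist


def pool2d(matrix, size=2, mode="max"):
    """Apply non-overlapping size x size pooling to a 2-D list of numbers.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    rows = len(matrix) // size
    cols = len(matrix[0]) // size
    pooled = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the values covered by this window.
            window = [
                matrix[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            if mode == "max":
                out_row.append(max(window))
            elif mode == "average":
                out_row.append(sum(window) / len(window))
            else:
                raise ValueError(f"unknown pooling mode: {mode}")
        pooled.append(out_row)
    return pooled


def flatten(matrix):
    """Turn a 2-D list into one flat feature vector."""
    return [value for row in matrix for value in row]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic 53x39 matrix standing in for one utterance's parameter matrix.
features = [[float(r + c) for c in range(39)] for r in range(53)]
max_pooled = pool2d(features, mode="max")
avg_pooled = pool2d(features, mode="average")
print(len(max_pooled), len(max_pooled[0]))  # 26 19
```

With a 2x2 window the 53x39 input shrinks to 26x19, since the odd final row and column are dropped. Max pooling keeps the largest value in each window, while average pooling keeps the window mean. A flattened matrix from either variant could then be passed to knn_predict as a feature vector.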
