Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation

碩士 === 國立中央大學 === 資訊工程學系 === 107 === As the education system has evolved over the past few years, domestic universities have committed to improving students’ learning outcomes. The most common way of evaluating learning outcomes is through questionnaires, filled in by students in the middle and at the end of...


Bibliographic Details
Main Authors:	Yu-Ju Chen, 陳昱儒
Other Authors:	蔡孟峰
Format:	Others
Language:	zh-TW
Published:	2019
Online Access:	http://ndltd.ncl.edu.tw/handle/jhgkaq

id	ndltd-TW-107NCU05392115
record_format	oai_dc
spelling	ndltd-TW-107NCU05392115 2019-10-22T05:28:14Z http://ndltd.ncl.edu.tw/handle/jhgkaq Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation 基於隱含狄利克雷分布進行開放式問卷之主題導向文字探勘 Yu-Ju Chen 陳昱儒 碩士 國立中央大學 資訊工程學系 107 蔡孟峰 2019 學位論文 ; thesis 43 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	碩士 === 國立中央大學 === 資訊工程學系 === 107 === As the education system has evolved over the past few years, domestic universities have committed to improving students’ learning outcomes. The most common way of evaluating learning outcomes is through questionnaires, filled in by students in the middle and at the end of each semester. To let students give more detailed feedback, these questionnaires usually contain a section for free-text comments. The comment section is designed for students to write any thoughts and opinions; there are no restrictions or rules on how it should be written. This human-generated text is unstructured and often contains writing mistakes and misused words. Lacking structure, such text is hard to process as ordinary data with standard data mining techniques. Thus, we aim to analyze the text data from course evaluation questionnaires through text mining. Because the content is miscellaneous and there is not enough human-labeled data, it is hard to apply supervised classification methods to this text. Therefore, we use an unsupervised topic analysis technique to find the latent topic distribution of the data. Topic modeling can infer latent topic distributions and cluster similar documents without topic labels or training data being defined beforehand. We perform topic modeling by implementing latent Dirichlet allocation (LDA) using Gibbs sampling, and then use the resulting LDA model to estimate topics for unseen data. In this thesis, we apply topic analysis to the comment section of the course evaluation questionnaire. We believe that with this automatic topic modeling method, analysts can analyze questionnaire text data more efficiently. Moreover, future work on automatic questionnaire analysis can be built on this approach.
author2	蔡孟峰
author_facet	蔡孟峰 Yu-Ju Chen 陳昱儒
author	Yu-Ju Chen 陳昱儒
spellingShingle	Yu-Ju Chen 陳昱儒 Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation
author_sort	Yu-Ju Chen
title	Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation
title_short	Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation
title_full	Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation
title_fullStr	Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation
title_full_unstemmed	Topic-oriented Text Mining on Open-ended Questionnaires using Latent Dirichlet Allocation
title_sort	topic-oriented text mining on open-ended questionnaires using latent dirichlet allocation
publishDate	2019
url	http://ndltd.ncl.edu.tw/handle/jhgkaq
work_keys_str_mv	AT yujuchen topicorientedtextminingonopenendedquestionnairesusinglatentdirichletallocation AT chényùrú topicorientedtextminingonopenendedquestionnairesusinglatentdirichletallocation AT yujuchen jīyúyǐnhándílìkèléifēnbùjìnxíngkāifàngshìwènjuǎnzhīzhǔtídǎoxiàngwénzìtànkān AT chényùrú jīyúyǐnhándílìkèléifēnbùjìnxíngkāifàngshìwènjuǎnzhīzhǔtídǎoxiàngwénzìtànkān
_version_	1719274233757958144
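
The abstract in this record mentions LDA fitted with Gibbs sampling and then applied to unseen data. The record does not give the thesis's actual implementation, hyperparameters, or preprocessing, so the following is only a minimal sketch of a generic collapsed Gibbs sampler with a simple fold-in step for a new document. The function names (train_lda, infer_topics), the values of alpha and beta, and the toy corpus of word ids are all hypothetical and chosen purely for illustration.

```python
import random


def train_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                               # tokens per topic
    assign = []
    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic-word distributions estimated from the final counts.
    return [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in range(n_topics)
    ]


def infer_topics(doc, phi, alpha=0.1, n_iter=100, seed=0):
    """Estimate the topic mixture of an unseen document, keeping phi fixed."""
    rng = random.Random(seed)
    n_topics = len(phi)
    counts = [0] * n_topics
    zs = []
    for _ in doc:
        k = rng.randrange(n_topics)
        zs.append(k)
        counts[k] += 1
    for _ in range(n_iter):
        for i, w in enumerate(doc):
            counts[zs[i]] -= 1
            weights = [(counts[t] + alpha) * phi[t][w] for t in range(n_topics)]
            zs[i] = rng.choices(range(n_topics), weights=weights)[0]
            counts[zs[i]] += 1
    total = len(doc) + n_topics * alpha
    return [(c + alpha) / total for c in counts]


# Hypothetical toy corpus: word ids 0-2 and 3-5 stand in for two comment themes.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 2], [5, 4, 3, 5]]
phi = train_lda(docs, n_topics=2, vocab_size=6)
theta = infer_topics([0, 1, 1, 2], phi)
print([round(p, 3) for p in theta])
```

Real questionnaire comments would first need tokenization (and, for Chinese text, word segmentation) to produce the word ids used here; that step is outside this sketch.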


Similar Items