Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets

Learning Analytics is becoming a key tool for the analysis and improvement of digital education processes, and its potential benefit grows with the size of the student cohorts generating data. In the context of Open Education, the potentially massive student cohorts and the global audience represent...

Bibliographic Details
Main Authors: Álvaro Martínez Navarro, Pablo Moreno-Ger
Format: Article
Language: English
Published: Universidad Internacional de La Rioja (UNIR) 2018-09-01
Series: International Journal of Interactive Multimedia and Artificial Intelligence
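Subjects: Clustering, Computer Languages, Data Analysis, Engineering Students, Performance Evaluation, Unsupervised Learning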
Online Access: http://www.ijimai.org/journal/node/2111
id doaj-274c57d923ff4d2b91881a8b21f2c12a
record_format Article
spelling doaj-274c57d923ff4d2b91881a8b21f2c12a
2020-11-24T22:05:47Z
eng
Universidad Internacional de La Rioja (UNIR)
International Journal of Interactive Multimedia and Artificial Intelligence
1989-1660
1989-1660
2018-09-01
5296
10.9781/ijimai.2018.02.003
ijimai.2018.02.003
Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
Álvaro Martínez Navarro
Pablo Moreno-Ger
Learning Analytics is becoming a key tool for the analysis and improvement of digital education processes, and its potential benefit grows with the size of the student cohorts generating data. In the context of Open Education, the potentially massive student cohorts and the global audience represent a great opportunity for significant analyses and breakthroughs in the field of learning analytics. However, these potentially huge datasets require proper analysis techniques, and different algorithms, tools and approaches may perform better in this specific context. In this work, we compare different clustering algorithms using an educational dataset. We start by identifying the most relevant algorithms in Learning Analytics and benchmark them to determine, according to internal validation and stability measurements, which algorithms perform better. We analyzed seven algorithms, and determined that K-means and PAM were the best performers among partition algorithms, and DIANA was the best performer among hierarchical algorithms.
http://www.ijimai.org/journal/node/2111
Clustering
Computer Languages
Data Analysis
Engineering Students
Performance Evaluation
Unsupervised Learning
collection DOAJ
language English
format Article
sources DOAJ
author Álvaro Martínez Navarro
Pablo Moreno-Ger
spellingShingle Álvaro Martínez Navarro
Pablo Moreno-Ger
Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
International Journal of Interactive Multimedia and Artificial Intelligence
Clustering
Computer Languages
Data Analysis
Engineering Students
Performance Evaluation
Unsupervised Learning
author_facet Álvaro Martínez Navarro
Pablo Moreno-Ger
author_sort Álvaro Martínez Navarro
title Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
title_short Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
title_full Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
title_fullStr Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
title_full_unstemmed Comparison of Clustering Algorithms for Learning Analytics with Educational Datasets
title_sort comparison of clustering algorithms for learning analytics with educational datasets
publisher Universidad Internacional de La Rioja (UNIR)
series International Journal of Interactive Multimedia and Artificial Intelligence
issn 1989-1660
1989-1660
publishDate 2018-09-01
description Learning Analytics is becoming a key tool for the analysis and improvement of digital education processes, and its potential benefit grows with the size of the student cohorts generating data. In the context of Open Education, the potentially massive student cohorts and the global audience represent a great opportunity for significant analyses and breakthroughs in the field of learning analytics. However, these potentially huge datasets require proper analysis techniques, and different algorithms, tools and approaches may perform better in this specific context. In this work, we compare different clustering algorithms using an educational dataset. We start by identifying the most relevant algorithms in Learning Analytics and benchmark them to determine, according to internal validation and stability measurements, which algorithms perform better. We analyzed seven algorithms, and determined that K-means and PAM were the best performers among partition algorithms, and DIANA was the best performer among hierarchical algorithms.
topic Clustering
Computer Languages
Data Analysis
Engineering Students
Performance Evaluation
Unsupervised Learning
url http://www.ijimai.org/journal/node/2111
work_keys_str_mv AT alvaromartineznavarro comparisonofclusteringalgorithmsforlearninganalyticswitheducationaldatasets
AT pablomorenoger comparisonofclusteringalgorithmsforlearninganalyticswitheducationaldatasets
_version_ 1725824619410096128