Plagiarism Detection in Students' Theses Using The Cosine Similarity Method

The main graduation requirement for students is to write a final scientific paper. One of the factors determining the quality of a student's scientific work is its uniqueness and innovation. This research aims to apply data mining methods to detect similarities in titles, abstra...


Bibliographic Details
Main Authors: Oppi Anda Resta, Addin Aditya, Febry Eka Purwiantono
Format: Article
Language: English
Published: Politeknik Ganesha Medan 2021-05-01
Series: Sinkron
Subjects: plagiarism, text mining, cosine similarity, tf-idf, student theses
Online Access: https://jurnal.polgan.ac.id/index.php/sinkron/article/view/10909
id doaj-843cabd86c2e425392a72b0fa58a570a
record_format Article
spelling doaj-843cabd86c2e425392a72b0fa58a570a; 2021-05-01T16:43:25Z; eng; Politeknik Ganesha Medan; Sinkron; 2541-044X; 2541-2019; 2021-05-01; 5; 2; 305; 313; 10.33395/sinkron.v5i2.10909; 1173; Plagiarism Detection in Students' Theses Using The Cosine Similarity Method; Oppi Anda Resta; Addin Aditya; Febry Eka Purwiantono; The main graduation requirement for students is to write a final scientific paper. One of the factors determining the quality of a student's scientific work is its uniqueness and innovation. This research aims to apply data mining methods to detect similarities in the titles, abstracts, or topics of students' final scientific papers so that plagiarism does not occur. In this research, the cosine similarity method is combined with preprocessing and TF-IDF to calculate the level of similarity for the title and abstract of a student's final scientific paper; the results are then displayed and compared with the existing final project repository, using a threshold value to decide whether the work is accepted or rejected. Based on the test data and training data to which the TF-IDF method was applied, the level of similarity between the training data document and the test data document is 8%. This shows that the student thesis is still classified as unique and does not contain plagiarized content. The findings of this study can help the university manage the administration of student theses so that plagiarism does not occur. Further study is needed on additional methods that increase the accuracy of the system so that it runs faster and optimally.; https://jurnal.polgan.ac.id/index.php/sinkron/article/view/10909; plagiarism; text mining; cosine similarity; tf-idf; student theses
collection DOAJ
language English
format Article
sources DOAJ
author Oppi Anda Resta
Addin Aditya
Febry Eka Purwiantono
spellingShingle Oppi Anda Resta
Addin Aditya
Febry Eka Purwiantono
Plagiarism Detection in Students' Theses Using The Cosine Similarity Method
Sinkron
plagiarism
text mining
cosine similarity
tf-idf
student theses
author_facet Oppi Anda Resta
Addin Aditya
Febry Eka Purwiantono
author_sort Oppi Anda Resta
title Plagiarism Detection in Students' Theses Using The Cosine Similarity Method
title_short Plagiarism Detection in Students' Theses Using The Cosine Similarity Method
title_full Plagiarism Detection in Students' Theses Using The Cosine Similarity Method
title_fullStr Plagiarism Detection in Students' Theses Using The Cosine Similarity Method
title_full_unstemmed Plagiarism Detection in Students' Theses Using The Cosine Similarity Method
title_sort plagiarism detection in students' theses using the cosine similarity method
publisher Politeknik Ganesha Medan
series Sinkron
issn 2541-044X
2541-2019
publishDate 2021-05-01
description The main graduation requirement for students is to write a final scientific paper. One of the factors determining the quality of a student's scientific work is its uniqueness and innovation. This research aims to apply data mining methods to detect similarities in the titles, abstracts, or topics of students' final scientific papers so that plagiarism does not occur. In this research, the cosine similarity method is combined with preprocessing and TF-IDF to calculate the level of similarity for the title and abstract of a student's final scientific paper; the results are then displayed and compared with the existing final project repository, using a threshold value to decide whether the work is accepted or rejected. Based on the test data and training data to which the TF-IDF method was applied, the level of similarity between the training data document and the test data document is 8%. This shows that the student thesis is still classified as unique and does not contain plagiarized content. The findings of this study can help the university manage the administration of student theses so that plagiarism does not occur. Further study is needed on additional methods that increase the accuracy of the system so that it runs faster and optimally.
topic plagiarism
text mining
cosine similarity
tf-idf
student theses
url https://jurnal.polgan.ac.id/index.php/sinkron/article/view/10909
work_keys_str_mv AT oppiandaresta plagiarismdetectioninstudentsthesesusingthecosinesimilaritymethod
AT addinaditya plagiarismdetectioninstudentsthesesusingthecosinesimilaritymethod
AT febryekapurwiantono plagiarismdetectioninstudentsthesesusingthecosinesimilaritymethod
_version_ 1721496946921701376
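
The description in this record outlines a pipeline of preprocessing, TF-IDF weighting, cosine similarity, and a threshold decision. The following is a minimal sketch of that general technique, not the authors' implementation. The function names, the tokenizer (lowercasing and splitting on non-letters), the smoothed IDF formula, and the threshold of 0.5 are all assumptions made for illustration; the record does not state which preprocessing steps, weighting variant, or threshold the article uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs (an assumed stand-in for preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def check_submission(submission, repository, threshold=0.5):
    """Return (highest similarity, accepted) for a submission against a repository.

    The threshold value is hypothetical; a real deployment would tune it.
    """
    vectors = tfidf_vectors(list(repository) + [submission])
    query = vectors[-1]
    scores = [cosine_similarity(query, v) for v in vectors[:-1]]
    best = max(scores, default=0.0)
    return best, best < threshold
```

Under these assumptions, a submission would be flagged when its highest similarity against any repository entry reaches the threshold, and accepted otherwise. A score such as the 8% reported in the description would fall well below any plausible cutoff.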