Design and Evaluation of Outlier Detection Based on Semantic Condensed Nearest Neighbor

Social media contain abundant information about events and news occurring all over the world. The growth of social media has had a great impact on domains such as marketing, e-commerce, health care, e-governance, and politics. Twitter was developed as one of the social media platforms...

Bibliographic Details
Main Authors: Batchanaboyina M. Rao, Devarakonda Nagaraju
Format: Article
Language: English
Published: De Gruyter 2019-05-01
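Series: Journal of Intelligent Systems
Subjects: UC Irvine Machine Learning Repository; normalization; outlier detection; semantic approach using condensed nearest neighbor (SACNN); Twitter
Online Access: https://doi.org/10.1515/jisys-2018-0476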
id doaj-852ce49586b24513b15b6cff52eb414c
record_format Article
spelling doaj-852ce49586b24513b15b6cff52eb414c | 2021-09-06T19:40:39Z | eng | De Gruyter | Journal of Intelligent Systems | ISSN 0334-1860, 2191-026X | 2019-05-01 | vol. 29, no. 1, pp. 1416-1424 | 10.1515/jisys-2018-0476 | Design and Evaluation of Outlier Detection Based on Semantic Condensed Nearest Neighbor | Batchanaboyina M. Rao (Computer Science and Engineering, Acharya Nagarjuna University, Guntur, A.P.-522510, India); Devarakonda Nagaraju (Department of Information Technology, Lakireddy Balireddy College of Engineering, Mylavaram, Krishna (DT), A.P.-521230, India) | https://doi.org/10.1515/jisys-2018-0476 | uc irvine machine learning repository; normalization; outlier detection; semantic approach using condensed nearest neighbor (sacnn); twitter
collection DOAJ
language English
format Article
sources DOAJ
author Batchanaboyina M. Rao
Devarakonda Nagaraju
title Design and Evaluation of Outlier Detection Based on Semantic Condensed Nearest Neighbor
publisher De Gruyter
series Journal of Intelligent Systems
issn 0334-1860
2191-026X
publishDate 2019-05-01
description Social media contain abundant information about events and news occurring all over the world. The growth of social media has had a great impact on domains such as marketing, e-commerce, health care, e-governance, and politics. Twitter was developed as one of the social media platforms and is now among the most popular. There are 1 billion user profiles and millions of active users who post tweets daily. In this research, buzz detection in social media was carried out by the semantic approach using the condensed nearest neighbor (SACNN). The Twitter and Tom’s Hardware data are stored in the UC Irvine Machine Learning Repository, and this dataset is used in this research for outlier detection. The min–max normalization technique is applied to the social media dataset, and missing values are replaced by the normalized value. The condensed nearest neighbor (CNN) is used for semantic analysis of the database, and the threshold is calculated from the optimized value provided by the proposed method. The threshold value is used to classify buzz and non-buzz discussions in the social media database. The results showed that the SACNN achieved 99% accuracy and a lower relative error than the existing methods.
topic uc irvine machine learning repository
normalization
outlier detection
semantic approach using condensed nearest neighbor (sacnn)
twitter
url https://doi.org/10.1515/jisys-2018-0476
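
The description mentions min–max normalization followed by a threshold that separates buzz from non-buzz discussions. The record gives no implementation details, so the following is only a minimal sketch of those two generic steps. The function names `min_max_normalize` and `label_buzz`, the handling of constant columns, and the rule "score at or above the threshold means buzz" are all assumptions made for illustration, not the authors' method.

```python
from typing import List


def min_max_normalize(column: List[float]) -> List[float]:
    """Rescale a feature column linearly to the range [0, 1].

    Assumption: a constant column (max == min) maps to all zeros,
    since the usual formula would divide by zero.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def label_buzz(scores: List[float], threshold: float) -> List[str]:
    """Label each score as "buzz" or "non-buzz".

    Assumption: scores at or above the threshold count as buzz.
    How the threshold itself is derived is not stated in the record.
    """
    return ["buzz" if s >= threshold else "non-buzz" for s in scores]


if __name__ == "__main__":
    normalized = min_max_normalize([2.0, 4.0, 6.0])
    print(normalized)
    print(label_buzz(normalized, threshold=0.5))
```

In this sketch the threshold is passed in by the caller; the paper derives it from an optimized value produced by the proposed method, which the record does not describe.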