Ontology boosted deep learning for disease name extraction from Twitter messages

Abstract: This paper presents an ontology-based deep learning approach for extracting disease names from Twitter messages. The approach relies on simple features obtained via conceptual representations of messages to obtain results that outperform those from word-level models. The significance of th...


Bibliographic Details
Main Authors: Mark Abraham Magumba, Peter Nabende, Ernest Mwebaze
Format: Article
Language: English
Published: SpringerOpen 2018-09-01
Series: Journal of Big Data
Subjects: Epidemiology; Twitter; Sentiment analysis; Text classification; Concept ontology; Data mining
Online Access: http://link.springer.com/article/10.1186/s40537-018-0139-2
id doaj-66f2098fea4744d49dfc0aaa7c429dfe
record_format Article
spelling doaj-66f2098fea4744d49dfc0aaa7c429dfe | 2020-11-25T01:26:59Z | eng | SpringerOpen | Journal of Big Data | 2196-1115 | 2018-09-01 | 51119 | 10.1186/s40537-018-0139-2 | Ontology boosted deep learning for disease name extraction from Twitter messages | Mark Abraham Magumba (0), Peter Nabende (1), Ernest Mwebaze (2) | (0) Department of Information Systems, Makerere University College of Computing and Information Sciences; (1) Department of Information Systems, Makerere University College of Computing and Information Sciences; (2) Department of Computer Science, Makerere University College of Computing and Information Sciences | Abstract: This paper presents an ontology-based deep learning approach for extracting disease names from Twitter messages. The approach relies on simple features obtained via conceptual representations of messages to obtain results that outperform those from word-level models. The significance of this development is that it can potentially reduce the cost of generating named entity recognition models by reducing the cost of annotating training data: ontology creation is a one-time cost because, at the conceptual level, the ontology is meant to be fairly static and reusable. This is of great importance for social media text like Twitter messages, where there is a large, unbounded lexicon with spatial and temporal variations and other inherent biases that make it logistically untenable to annotate a representative amount of text for general-purpose models for live applications. | http://link.springer.com/article/10.1186/s40537-018-0139-2 | Epidemiology; Twitter; Sentiment analysis; Text classification; Concept ontology; Data mining
collection DOAJ
language English
format Article
sources DOAJ
author Mark Abraham Magumba
Peter Nabende
Ernest Mwebaze
title Ontology boosted deep learning for disease name extraction from Twitter messages
publisher SpringerOpen
series Journal of Big Data
issn 2196-1115
publishDate 2018-09-01
description Abstract: This paper presents an ontology-based deep learning approach for extracting disease names from Twitter messages. The approach relies on simple features obtained via conceptual representations of messages to obtain results that outperform those from word-level models. The significance of this development is that it can potentially reduce the cost of generating named entity recognition models by reducing the cost of annotating training data: ontology creation is a one-time cost because, at the conceptual level, the ontology is meant to be fairly static and reusable. This is of great importance for social media text like Twitter messages, where there is a large, unbounded lexicon with spatial and temporal variations and other inherent biases that make it logistically untenable to annotate a representative amount of text for general-purpose models for live applications.
topic Epidemiology
Twitter
Sentiment analysis
Text classification
Concept ontology
Data mining
url http://link.springer.com/article/10.1186/s40537-018-0139-2