Analysis of machine learning algorithms on bioinformatics data of varying quality

One of the main applications of machine learning in bioinformatics is the construction of classification models which can accurately classify new instances using information gained from previous instances. With the help of machine learning algorithms (such as supervised classification and gene selec...

Bibliographic Details
Other Authors: Shanab, Ahmad Abu (author)
Format: Others
Language: English
Published: Florida Atlantic University
Subjects:
Online Access: http://purl.flvc.org./fau/fd/FA00004425
http://purl.flvc.org/fau/fd/FA00004425
id ndltd-fau.edu-oai-fau.digital.flvc.org-fau_31342
record_format oai_dc
spelling ndltd-fau.edu-oai-fau.digital.flvc.org-fau_31342 2019-07-04T03:52:03Z Analysis of machine learning algorithms on bioinformatics data of varying quality FA00004425 Shanab, Ahmad Abu (author) Khoshgoftaar, Taghi M. (Thesis advisor) Florida Atlantic University (Degree grantor) College of Engineering and Computer Science Department of Computer and Electrical Engineering and Computer Science 154 p. application/pdf Electronic Thesis or Dissertation Text English One of the main applications of machine learning in bioinformatics is the construction of classification models which can accurately classify new instances using information gained from previous instances. With the help of machine learning algorithms (such as supervised classification and gene selection), new meaningful knowledge can be extracted from bioinformatics datasets that can help in disease diagnosis and prognosis as well as in prescribing the right treatment for a disease. One particular challenge encountered when analyzing bioinformatics datasets is data noise, which refers to incorrect or missing values in datasets. Noise can be introduced as a result of experimental errors (e.g. faulty microarray chips, insufficient resolution, image corruption, and incorrect laboratory procedures), as well as other errors (errors during data processing, transfer, and/or mining). A special type of data noise is class noise, which occurs when an instance/example is mislabeled. Previous research showed that class noise has a detrimental impact on machine learning algorithms (e.g. worsened classification performance and unstable feature selection). In addition to data noise, gene expression datasets can suffer from the problems of high dimensionality (a very large feature space) and class imbalance (unequal distribution of instances between classes). As a result of these inherent problems, constructing accurate classification models becomes more challenging.
Florida Atlantic University Artificial intelligence Bioinformatics Machine learning System design Theory of computation Includes bibliography. Dissertation (Ph.D.)--Florida Atlantic University, 2015. FAU Electronic Theses and Dissertations Collection Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. http://purl.flvc.org./fau/fd/FA00004425 http://purl.flvc.org/fau/fd/FA00004425 http://rightsstatements.org/vocab/InC/1.0/ https://fau.digital.flvc.org/islandora/object/fau%3A31342/datastream/TN/view/Analysis%20of%20machine%20learning%20algorithms%20on%20bioinformatics%20data%20of%20varying%20quality.jpg
collection NDLTD
language English
format Others
sources NDLTD
topic Artificial intelligence
Bioinformatics
Machine learning
System design
Theory of computation
description One of the main applications of machine learning in bioinformatics is the construction of classification models which can accurately classify new instances using information gained from previous instances. With the help of machine learning algorithms (such as supervised classification and gene selection), new meaningful knowledge can be extracted from bioinformatics datasets that can help in disease diagnosis and prognosis as well as in prescribing the right treatment for a disease. One particular challenge encountered when analyzing bioinformatics datasets is data noise, which refers to incorrect or missing values in datasets. Noise can be introduced as a result of experimental errors (e.g. faulty microarray chips, insufficient resolution, image corruption, and incorrect laboratory procedures), as well as other errors (errors during data processing, transfer, and/or mining). A special type of data noise is class noise, which occurs when an instance/example is mislabeled. Previous research showed that class noise has a detrimental impact on machine learning algorithms (e.g. worsened classification performance and unstable feature selection). In addition to data noise, gene expression datasets can suffer from the problems of high dimensionality (a very large feature space) and class imbalance (unequal distribution of instances between classes). As a result of these inherent problems, constructing accurate classification models becomes more challenging. === Includes bibliography. === Dissertation (Ph.D.)--Florida Atlantic University, 2015. === FAU Electronic Theses and Dissertations Collection
author2 Shanab, Ahmad Abu (author)
title Analysis of machine learning algorithms on bioinformatics data of varying quality
publisher Florida Atlantic University
url http://purl.flvc.org./fau/fd/FA00004425
http://purl.flvc.org/fau/fd/FA00004425