Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications

Massive amounts of biological data are being accumulated in science. Searching for significant meaningful information and patterns from different types of data is necessary towards gaining knowledge from these large amounts of data available to users. However, data mining techniques do not normally...

Bibliographic Details
Main Author: Al-Azzam, Omar Ghazi
Format: Others
Published: North Dakota State University 2017
Subjects: Data mining; Gene mapping
Online Access: https://hdl.handle.net/10365/26509
id ndltd-ndsu.edu-oai-library.ndsu.edu-10365-26509
record_format oai_dc
spelling ndltd-ndsu.edu-oai-library.ndsu.edu-10365-26509 2021-09-28T17:11:28Z Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications Al-Azzam, Omar Ghazi Data mining. Gene mapping. Massive amounts of biological data are being accumulated in science. Searching for significant meaningful information and patterns from different types of data is necessary towards gaining knowledge from these large amounts of data available to users. However, data mining techniques do not normally deal with significance. Integrating data mining techniques with standard statistical procedures provides a way for mining statistically significant, interesting information from both structured and unstructured data. In this dissertation, different algorithms for mining significant biological information from both unstructured and structured data are proposed. A weighted-density-based approach is presented for mining item data from unstructured textual representations. Different algorithms in the area of radiation hybrid mapping are developed for mining significant information from structured binary data. The proposed algorithms have different applications in the ordering problem in radiation hybrid mapping, including identifying unreliable markers and building solid framework maps. Effectiveness of the proposed algorithms towards improving map stability is demonstrated. Map stability is determined based on resampling analysis. The proposed algorithms deal effectively and efficiently with multidimensional data and also reduce computational cost dramatically. Evaluation shows that the proposed algorithms outperform comparative methods in terms of both accuracy and computational cost. 2017-09-26T20:42:25Z 2017-09-26T20:42:25Z 2012 text/dissertation https://hdl.handle.net/10365/26509 NDSU Policy 190.6.2 https://www.ndsu.edu/fileadmin/policy/190.pdf application/pdf North Dakota State University
collection NDLTD
format Others
sources NDLTD
topic Data mining.
Gene mapping.
spellingShingle Data mining.
Gene mapping.
Al-Azzam, Omar Ghazi
Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications
description Massive amounts of biological data are being accumulated in science. Searching for significant meaningful information and patterns from different types of data is necessary towards gaining knowledge from these large amounts of data available to users. However, data mining techniques do not normally deal with significance. Integrating data mining techniques with standard statistical procedures provides a way for mining statistically significant, interesting information from both structured and unstructured data. In this dissertation, different algorithms for mining significant biological information from both unstructured and structured data are proposed. A weighted-density-based approach is presented for mining item data from unstructured textual representations. Different algorithms in the area of radiation hybrid mapping are developed for mining significant information from structured binary data. The proposed algorithms have different applications in the ordering problem in radiation hybrid mapping, including identifying unreliable markers and building solid framework maps. Effectiveness of the proposed algorithms towards improving map stability is demonstrated. Map stability is determined based on resampling analysis. The proposed algorithms deal effectively and efficiently with multidimensional data and also reduce computational cost dramatically. Evaluation shows that the proposed algorithms outperform comparative methods in terms of both accuracy and computational cost.
author Al-Azzam, Omar Ghazi
author_facet Al-Azzam, Omar Ghazi
author_sort Al-Azzam, Omar Ghazi
title Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications
title_short Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications
title_full Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications
title_fullStr Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications
title_full_unstemmed Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications
title_sort mining for significant information from unstructured and structured biological data and its applications
publisher North Dakota State University
publishDate 2017
url https://hdl.handle.net/10365/26509
work_keys_str_mv AT alazzamomarghazi miningforsignificantinformationfromunstructuredandstructuredbiologicaldataanditsapplications
_version_ 1719485545592127488