A Study on Data Mining Analysis in Bio-chip Data

Bibliographic Details
Main Authors: Kuo-Sheng Lin, 林國勝
Other Authors: 簡禎富
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/97070805880627034008
Description
Summary: Doctoral === National Tsing Hua University === Department of Industrial Engineering and Engineering Management === 96 === Owing to growing breakthroughs in biochip microarray and gene cloning technologies, biotechnology is now an emerging and promising industry worldwide. Although advances in information technology enable the complex computation and comprehensive data storage that biotechnology requires, a number of critical issues remain to be addressed for both practice and research. This study aims to develop a data mining framework, including a proposed cluster analysis algorithm, for analyzing large bio-chip datasets, which differ from the data encountered in manufacturing and service industries. Bio-chip data are high-dimensional: they contain more attributes than specimens. Feature selection and extraction are therefore critical for removing noisy features and reducing dimensionality in microarray analysis. In particular, genes that distinguish normal from abnormal individuals are extracted as decision rules to clarify the relationships between genes and diseases, and the in-group and with-group relationships among genes also need to be established. We use a cDNA microarray dataset of breast cancer patients to validate the proposed approach. We first extract significant genes from more than 44,000 genes, then use a decision tree to derive classification rules, and finally use the proposed algorithm to build cluster relationships, presented as a table, to support medical diagnosis and reference. The results show the practical viability of this framework.
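
The three steps named in the summary (gene selection, rule derivation, gene clustering) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the summary does not name the selection statistic, the decision tree method, or the proposed clustering algorithm, so the sketch substitutes a Welch t-statistic ranking, a one-level decision tree, and grouping by a correlation threshold. The data are synthetic and are not the breast cancer dataset used in the thesis; all function names and parameters are hypothetical.

```python
import random
import statistics


def make_data(seed=0):
    """Synthetic stand-in: 20 specimens x 200 genes, 5 of them class-dependent."""
    rng = random.Random(seed)
    y = [0] * 10 + [1] * 10  # 0 = normal, 1 = abnormal
    shifts = [6.0] * 3 + [-6.0] * 2 + [0.0] * 195  # 3 up, 2 down, rest noise
    X = [[rng.gauss(0, 1) + s * label for s in shifts] for label in y]
    return X, y


def welch_t(a, b):
    """Welch's t-statistic for two samples of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / (va / len(a) + vb / len(b)) ** 0.5


def select_genes(X, y, k):
    """Rank genes by |t| between the two classes and keep the top k indices."""
    scores = []
    for g in range(len(X[0])):
        normal = [row[g] for row, label in zip(X, y) if label == 0]
        abnormal = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(normal, abnormal)), g))
    return [g for _, g in sorted(scores, reverse=True)[:k]]


def best_stump(X, y, genes):
    """One-level decision tree: the (gene, threshold) rule with the fewest errors."""
    best = None
    for g in genes:
        values = sorted(set(row[g] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((row[g] > t) != bool(label) for row, label in zip(X, y))
            errors = min(errors, len(y) - errors)  # allow either rule direction
            if best is None or errors < best[0]:
                best = (errors, g, t)
    return best


def pearson(a, b):
    """Pearson correlation of two equally long value lists."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = (sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b)) ** 0.5
    return num / den


def cluster_genes(X, genes, min_corr):
    """Group genes into connected components of the 'correlation >= min_corr' graph."""
    columns = {g: [row[g] for row in X] for g in genes}
    clusters = []
    for g in genes:
        linked = [c for c in clusters
                  if any(pearson(columns[g], columns[h]) >= min_corr for h in c)]
        merged = [g] + [h for c in linked for h in c]
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters


X, y = make_data()
selected = select_genes(X, y, k=5)
errors, gene, threshold = best_stump(X, y, selected)
clusters = cluster_genes(X, selected, min_corr=0.7)

print(f"rule: split on gene {gene} at {threshold:.2f} ({errors} training errors)")
for i, members in enumerate(clusters, 1):
    print(f"cluster {i}: genes {sorted(members)}")
```

The final loop prints the clusters as a simple table of gene indices, loosely mirroring the table listing mentioned in the summary. On real microarray data the class differences would be far weaker than in this synthetic example, so the choice of `k` and `min_corr` would need validation rather than the fixed values assumed here.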