Robust imputation method for missing values in microarray data

Abstract. Background: When analyzing microarray gene expression data, missing values are often encountered. Most multivariate statistical methods proposed for microarray data analysis cannot be applied when the data have missing values. Numerous imputatio...

Bibliographic Details
Main Authors: Park Taesung, Lee Eun-Kyung, Yoon Dankyu
Format: Article
Language: English
Published: BMC 2007-05-01
Series: BMC Bioinformatics
id doaj-d9af8867e6f643caaa5bef17b495bccd
record_format Article
spelling doaj-d9af8867e6f643caaa5bef17b495bccd; 2020-11-25T01:45:00Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2007-05-01; 8; Suppl 2; S6; 10.1186/1471-2105-8-S2-S6; Robust imputation method for missing values in microarray data; Park Taesung; Lee Eun-Kyung; Yoon Dankyu
collection DOAJ
language English
format Article
sources DOAJ
author Park Taesung
Lee Eun-Kyung
Yoon Dankyu
title Robust imputation method for missing values in microarray data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2007-05-01
description Abstract
Background: When analyzing microarray gene expression data, missing values are often encountered. Most multivariate statistical methods proposed for microarray data analysis cannot be applied when the data have missing values. Numerous imputation algorithms have been proposed to estimate the missing values. In this study, we develop a robust least squares estimation with principal components (RLSP) method by extending the local least squares imputation (LLSimpute) method. The basic idea of our method is to employ quantile regression to estimate the missing values, using the estimated principal components of a selected set of similar genes.
Results: The performance of the proposed method was evaluated using the normalized root mean squared error and compared with that of previously proposed imputation methods. The proposed RLSP method clearly outperformed the weighted k-nearest neighbors imputation (kNNimpute) method and the LLSimpute method, and its results were competitive with those of the Bayesian principal component analysis (BPCA) method.
Conclusion: Adapting the principal components of the selected genes and employing the quantile regression model improved the robustness and accuracy of missing value imputation. Thus, according to our empirical studies, the proposed RLSP method is more robust and accurate than the widely used kNNimpute and LLSimpute methods.
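The abstract names the normalized root mean squared error as the evaluation metric, but this record does not give its formula. The following is a minimal sketch, assuming one commonly used definition (root mean squared difference between imputed and true values, divided by the standard deviation of the true values); the paper may normalize differently. The function name `nrmse` and the toy values are hypothetical and are not taken from the study.

```python
import math
from statistics import pstdev


def nrmse(true_values, imputed_values):
    """Normalized root mean squared error over artificially masked entries.

    Assumed definition (not stated in this record):
        sqrt(mean((imputed - true)^2)) / std(true)
    """
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared difference between imputed and known true values.
    mse = sum((g - t) ** 2 for t, g in zip(true_values, imputed_values)) / len(true_values)
    # Population standard deviation of the true values is the normalizer.
    spread = pstdev(true_values)
    if spread == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")
    return math.sqrt(mse) / spread


# Hypothetical toy values: entries that were masked and then imputed.
true_vals = [1.0, 2.0, 3.0, 4.0]
perfect = list(true_vals)          # an imputation that recovers every value
mean_fill = [2.5, 2.5, 2.5, 2.5]   # filling every entry with the overall mean

print(nrmse(true_vals, perfect))
print(nrmse(true_vals, mean_fill))
```

Under this assumed definition, a perfect imputation scores 0 and filling every entry with the mean of the true values scores about 1, so values well below 1 indicate an imputation that does better than a constant fill.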