Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage
One-class modeling is a useful approach in metabolomics for the untargeted detection of abnormal metabolite profiles when information from a set of reference observations is available to model “normal” or baseline metabolite profiles. Such outlying profiles are typically identified by comparing th...
Main Authors: | Alberto Brini, Vahe Avagyan, Ric C. H. de Vos, Jack H. Vossen, Edwin R. van den Heuvel, Jasper Engel |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-04-01 |
Series: | Metabolites |
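Subjects: | high-dimensional data; one-class model; untargeted metabolomics; Mahalanobis distance; eigenvalue-shrinkage; critical value |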
Online Access: | https://www.mdpi.com/2218-1989/11/4/237 |
id |
doaj-f21024ffe9b14ecdad59b5bcf12b5b29 |
---|---|
record_format |
Article |
spelling |
doaj-f21024ffe9b14ecdad59b5bcf12b5b29 2021-04-13T23:03:08Z eng MDPI AG Metabolites 2218-1989 2021-04-01 11 237 237 10.3390/metabo11040237 Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage
Alberto Brini (0); Vahe Avagyan (1); Ric C. H. de Vos (2); Jack H. Vossen (3); Edwin R. van den Heuvel (4); Jasper Engel (5)
(0) Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands; (1) Biometris, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands; (2) Bioscience, Wageningen University and Research, Droevendaalsesteeg 1, 6700 AA Wageningen, The Netherlands; (3) Plant Breeding, Wageningen University and Research, Droevendaalsesteeg 1, 6700 AJ Wageningen, The Netherlands; (4) Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands; (5) Biometris, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
One-class modeling is a useful approach in metabolomics for the untargeted detection of abnormal metabolite profiles when information from a set of reference observations is available to model “normal” or baseline metabolite profiles. Such outlying profiles are typically identified by comparing the distance between an observation and the reference class to a critical limit. Often, multivariate distance measures such as the Mahalanobis distance (MD) or principal component-based measures are used. These approaches, however, are either not applicable to untargeted metabolomics data, or their results are unreliable. In this paper, five distance measures for one-class modeling in untargeted metabolomics are proposed. They are based on a combination of the MD and five so-called eigenvalue-shrinkage estimators of the covariance matrix of the reference class. A simple cross-validation procedure is proposed to set the critical limit for outlier detection. Simulation studies are used to identify which distance measure provides the best performance for one-class modeling, in terms of type I error and power to identify abnormal metabolite profiles. Empirical evidence demonstrates that this method has a better type I error (false positive rate) and higher outlier detection power than the standard (principal component-based) one-class models. The method is illustrated by its application to untargeted metabolomics data from liquid chromatography coupled to mass spectrometry (LC-MS) and nuclear magnetic resonance spectroscopy (NMR), taken from two studies on food safety assessment and diagnosis of rare diseases, respectively.
https://www.mdpi.com/2218-1989/11/4/237
high-dimensional data; one-class model; untargeted metabolomics; Mahalanobis distance; eigenvalue-shrinkage; critical value |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Alberto Brini Vahe Avagyan Ric C. H. de Vos Jack H. Vossen Edwin R. van den Heuvel Jasper Engel |
spellingShingle |
Alberto Brini Vahe Avagyan Ric C. H. de Vos Jack H. Vossen Edwin R. van den Heuvel Jasper Engel Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage Metabolites high-dimensional data one-class model untargeted metabolomics Mahalanobis distance eigenvalue-shrinkage critical value |
author_facet |
Alberto Brini Vahe Avagyan Ric C. H. de Vos Jack H. Vossen Edwin R. van den Heuvel Jasper Engel |
author_sort |
Alberto Brini |
title |
Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage |
title_short |
Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage |
title_full |
Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage |
title_fullStr |
Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage |
title_full_unstemmed |
Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage |
title_sort |
improved one-class modeling of high-dimensional metabolomics data via eigenvalue-shrinkage |
publisher |
MDPI AG |
series |
Metabolites |
issn |
2218-1989 |
publishDate |
2021-04-01 |
description |
One-class modeling is a useful approach in metabolomics for the untargeted detection of abnormal metabolite profiles when information from a set of reference observations is available to model “normal” or baseline metabolite profiles. Such outlying profiles are typically identified by comparing the distance between an observation and the reference class to a critical limit. Often, multivariate distance measures such as the Mahalanobis distance (MD) or principal component-based measures are used. These approaches, however, are either not applicable to untargeted metabolomics data, or their results are unreliable. In this paper, five distance measures for one-class modeling in untargeted metabolomics are proposed. They are based on a combination of the MD and five so-called eigenvalue-shrinkage estimators of the covariance matrix of the reference class. A simple cross-validation procedure is proposed to set the critical limit for outlier detection. Simulation studies are used to identify which distance measure provides the best performance for one-class modeling, in terms of type I error and power to identify abnormal metabolite profiles. Empirical evidence demonstrates that this method has a better type I error (false positive rate) and higher outlier detection power than the standard (principal component-based) one-class models. The method is illustrated by its application to untargeted metabolomics data from liquid chromatography coupled to mass spectrometry (LC-MS) and nuclear magnetic resonance spectroscopy (NMR), taken from two studies on food safety assessment and diagnosis of rare diseases, respectively. |
topic |
high-dimensional data; one-class model; untargeted metabolomics; Mahalanobis distance; eigenvalue-shrinkage; critical value |
url |
https://www.mdpi.com/2218-1989/11/4/237 |
work_keys_str_mv |
AT albertobrini improvedoneclassmodelingofhighdimensionalmetabolomicsdataviaeigenvalueshrinkage AT vaheavagyan improvedoneclassmodelingofhighdimensionalmetabolomicsdataviaeigenvalueshrinkage AT ricchdevos improvedoneclassmodelingofhighdimensionalmetabolomicsdataviaeigenvalueshrinkage AT jackhvossen improvedoneclassmodelingofhighdimensionalmetabolomicsdataviaeigenvalueshrinkage AT edwinrvandenheuvel improvedoneclassmodelingofhighdimensionalmetabolomicsdataviaeigenvalueshrinkage AT jasperengel improvedoneclassmodelingofhighdimensionalmetabolomicsdataviaeigenvalueshrinkage |
_version_ |
1721528393308045312 |
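
The description above combines the Mahalanobis distance with a shrunken covariance estimate and sets the critical limit by cross-validation. The record does not say which five eigenvalue-shrinkage estimators or which cross-validation scheme the article uses, so the following is only a minimal sketch under stated assumptions: a linear shrinkage of the sample covariance toward its mean eigenvalue with a fixed intensity `alpha`, a leave-one-out loop, and an empirical quantile as the limit. All function names and parameters are hypothetical and are not taken from the article.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sample_covariance(rows, mu):
    """Unbiased sample covariance matrix (singular when p >= n)."""
    n, p = len(rows), len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov

def shrink_covariance(cov, alpha):
    """ASSUMED estimator: (1 - alpha) * S + alpha * mean_eigenvalue * I.
    Every eigenvalue is pulled linearly toward trace(S) / p, so the
    result is positive definite for alpha > 0."""
    p = len(cov)
    mean_eig = sum(cov[i][i] for i in range(p)) / p
    return [[(1 - alpha) * cov[i][j] + (alpha * mean_eig if i == j else 0.0)
             for j in range(p)] for i in range(p)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def shrunken_md2(x, rows, alpha=0.5):
    """Squared Mahalanobis distance of x to the reference rows,
    using the shrunken covariance instead of the sample covariance."""
    mu = mean_vector(rows)
    cov = shrink_covariance(sample_covariance(rows, mu), alpha)
    d = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))

def loo_critical_limit(rows, alpha=0.5, level=0.95):
    """ASSUMED scheme: leave each reference observation out, measure its
    distance to the model built on the rest, take an empirical quantile."""
    dists = sorted(shrunken_md2(rows[i], rows[:i] + rows[i + 1:], alpha)
                   for i in range(len(rows)))
    return dists[min(len(dists) - 1, int(level * len(dists)))]

# Synthetic demo: 10 reference profiles, 20 variables (more variables than samples).
random.seed(0)
reference = [[random.gauss(0.0, 1.0) for _ in range(20)] for _ in range(10)]
outlier = [10.0] * 20  # strongly shifted profile
limit = loo_critical_limit(reference)
print(shrunken_md2(outlier, reference) > limit)
```

In practice the shrinkage intensity would be chosen from the data rather than fixed at 0.5, and a linear algebra library would replace the hand-written solver; both are kept simple here so the sketch runs without dependencies.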