Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage

One-class modelling is a useful approach in metabolomics for the untargeted detection of abnormal metabolite profiles, when information from a set of reference observations is available to model “normal” or baseline metabolite profiles. Such outlying profiles are typically identified by comparing th...


Bibliographic Details
Main Authors: Alberto Brini, Vahe Avagyan, Ric C. H. de Vos, Jack H. Vossen, Edwin R. van den Heuvel, Jasper Engel
Format: Article
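Language: English
Published: MDPI AG 2021-04-01
Series: Metabolites
Subjects: high-dimensional data; one-class model; untargeted metabolomics; Mahalanobis distance; eigenvalue-shrinkage; critical value
Online Access: https://www.mdpi.com/2218-1989/11/4/237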
id doaj-f21024ffe9b14ecdad59b5bcf12b5b29
record_format Article
spelling doaj-f21024ffe9b14ecdad59b5bcf12b5b29
2021-04-13T23:03:08Z
eng
MDPI AG
Metabolites
2218-1989
2021-04-01
11
237
237
10.3390/metabo11040237
Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage
Alberto Brini (0)
Vahe Avagyan (1)
Ric C. H. de Vos (2)
Jack H. Vossen (3)
Edwin R. van den Heuvel (4)
Jasper Engel (5)
Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
Biometris, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
Bioscience, Wageningen University and Research, Droevendaalsesteeg 1, 6700 AA Wageningen, The Netherlands
Plant Breeding, Wageningen University and Research, Droevendaalsesteeg 1, 6700 AJ Wageningen, The Netherlands
Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
Biometris, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
One-class modelling is a useful approach in metabolomics for the untargeted detection of abnormal metabolite profiles, when information from a set of reference observations is available to model “normal” or baseline metabolite profiles. Such outlying profiles are typically identified by comparing the distance between an observation and the reference class to a critical limit. Often, multivariate distance measures such as the Mahalanobis distance (MD) or principal component-based measures are used. These approaches, however, are either not applicable to untargeted metabolomics data, or their results are unreliable. In this paper, five distance measures for one-class modeling in untargeted metabolomics are proposed. They are based on a combination of the MD and five so-called eigenvalue-shrinkage estimators of the covariance matrix of the reference class. A simple cross-validation procedure is proposed to set the critical limit for outlier detection. Simulation studies are used to identify which distance measure provides the best performance for one-class modeling, in terms of type I error and power to identify abnormal metabolite profiles. Empirical evidence demonstrates that this method has a better type I error (false positive rate) and greater outlier detection power than the standard (principal component-based) one-class models. The method is illustrated by its application to liquid chromatography coupled to mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy untargeted metabolomics data from two studies on food safety assessment and diagnosis of rare diseases, respectively.
https://www.mdpi.com/2218-1989/11/4/237
high-dimensional data
one-class model
untargeted metabolomics
Mahalanobis distance
eigenvalue-shrinkage
critical value
collection DOAJ
language English
format Article
sources DOAJ
author Alberto Brini
Vahe Avagyan
Ric C. H. de Vos
Jack H. Vossen
Edwin R. van den Heuvel
Jasper Engel
title Improved One-Class Modeling of High-Dimensional Metabolomics Data via Eigenvalue-Shrinkage
publisher MDPI AG
series Metabolites
issn 2218-1989
publishDate 2021-04-01
description One-class modelling is a useful approach in metabolomics for the untargeted detection of abnormal metabolite profiles, when information from a set of reference observations is available to model “normal” or baseline metabolite profiles. Such outlying profiles are typically identified by comparing the distance between an observation and the reference class to a critical limit. Often, multivariate distance measures such as the Mahalanobis distance (MD) or principal component-based measures are used. These approaches, however, are either not applicable to untargeted metabolomics data, or their results are unreliable. In this paper, five distance measures for one-class modeling in untargeted metabolomics are proposed. They are based on a combination of the MD and five so-called eigenvalue-shrinkage estimators of the covariance matrix of the reference class. A simple cross-validation procedure is proposed to set the critical limit for outlier detection. Simulation studies are used to identify which distance measure provides the best performance for one-class modeling, in terms of type I error and power to identify abnormal metabolite profiles. Empirical evidence demonstrates that this method has a better type I error (false positive rate) and greater outlier detection power than the standard (principal component-based) one-class models. The method is illustrated by its application to liquid chromatography coupled to mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy untargeted metabolomics data from two studies on food safety assessment and diagnosis of rare diseases, respectively.
topic high-dimensional data
one-class model
untargeted metabolomics
Mahalanobis distance
eigenvalue-shrinkage
critical value
url https://www.mdpi.com/2218-1989/11/4/237
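
The description above outlines a one-class model built from a Mahalanobis distance, a shrunken covariance estimate, and a cross-validated critical limit. The following is a minimal Python sketch of that general idea, not the authors' implementation. This record does not name the five eigenvalue-shrinkage estimators, so the sketch assumes a simple linear shrinkage of the sample eigenvalues toward their mean as a stand-in, and it assumes leave-one-out as the cross-validation scheme. The function names, `alpha`, `quantile`, and the simulated data are all hypothetical.

```python
import numpy as np


def shrunk_covariance(X, alpha):
    """Sample covariance with eigenvalues pulled toward their mean.

    ASSUMPTION: linear shrinkage is a stand-in; the record does not
    specify which eigenvalue-shrinkage estimators the paper uses.
    """
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (X.shape[0] - 1)
    evals, evecs = np.linalg.eigh(S)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative round-off
    shrunk = (1.0 - alpha) * evals + alpha * evals.mean()
    # Rebuild V diag(shrunk) V^T; positive definite for alpha > 0.
    return (evecs * shrunk) @ evecs.T


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x from the reference class."""
    diff = x - mean
    return float(diff @ np.linalg.solve(cov, diff))


def loo_critical_limit(X, alpha, quantile=0.95):
    """Critical limit from leave-one-out distances of reference samples.

    ASSUMPTION: leave-one-out with an empirical quantile is one simple
    cross-validation choice, not necessarily the paper's procedure.
    """
    n = X.shape[0]
    dists = np.empty(n)
    for i in range(n):
        train = np.delete(X, i, axis=0)
        cov = shrunk_covariance(train, alpha)
        dists[i] = mahalanobis_sq(X[i], train.mean(axis=0), cov)
    return float(np.quantile(dists, quantile))


# Simulated example: 20 reference samples, 50 features (more features than samples).
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 50))
alpha = 0.5  # hypothetical shrinkage intensity

limit = loo_critical_limit(reference, alpha)
cov = shrunk_covariance(reference, alpha)
center = reference.mean(axis=0)

new_profile = rng.normal(size=50) + 5.0  # profile shifted in every feature
d_new = mahalanobis_sq(new_profile, center, cov)
is_outlier = d_new > limit
print(f"distance={d_new:.1f}, limit={limit:.1f}, outlier={is_outlier}")
```

The shrinkage step is what makes the distance usable here: with more features than samples the plain sample covariance is singular, whereas any `alpha` above zero keeps every eigenvalue positive so the linear solve is well defined.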