Investigation of Super Learner Methodology on HIV-1 Small Sample: Application on Jaguar Trial Data

Background. Many statistical models have been tested to predict phenotypic or virological response from genotypic data. A statistical framework called Super Learner has been introduced either to compare different methods/learners (discrete Super Learner) or to combine them in a Super Learner predict...

Bibliographic Details
Main Authors:	Allal Houssaïni, Lambert Assoumou, Anne Geneviève Marcelin, Jean Michel Molina, Vincent Calvez, Philippe Flandre
Format:	Article
Language:	English
Published:	Hindawi Limited 2012-01-01
Series:	AIDS Research and Treatment
Online Access:	http://dx.doi.org/10.1155/2012/478467

id	doaj-a7f229e5f65844afbec403b107a2c237
record_format	Article
spelling	doaj-a7f229e5f65844afbec403b107a2c237; 2020-11-24T22:15:27Z; eng; Hindawi Limited; AIDS Research and Treatment; 2090-1240; 2090-1259; 2012-01-01; 2012; 10.1155/2012/478467; 478467; Investigation of Super Learner Methodology on HIV-1 Small Sample: Application on Jaguar Trial Data; Allal Houssaïni (INSERM, UMR-S 943, 56 Boulevard Vincent Auriol, BP 335, 75625 Paris Cedex 13, France); Lambert Assoumou (INSERM, UMR-S 943, 56 Boulevard Vincent Auriol, BP 335, 75625 Paris Cedex 13, France); Anne Geneviève Marcelin (INSERM, UMR-S 943, 56 Boulevard Vincent Auriol, BP 335, 75625 Paris Cedex 13, France); Jean Michel Molina (Service des Maladies Infectieuses, Hôpital Saint Louis, AP-HP, Paris, France); Vincent Calvez (INSERM, UMR-S 943, 56 Boulevard Vincent Auriol, BP 335, 75625 Paris Cedex 13, France); Philippe Flandre (INSERM, UMR-S 943, 56 Boulevard Vincent Auriol, BP 335, 75625 Paris Cedex 13, France); http://dx.doi.org/10.1155/2012/478467
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Allal Houssaïni Lambert Assoumou Anne Geneviève Marcelin Jean Michel Molina Vincent Calvez Philippe Flandre
title	Investigation of Super Learner Methodology on HIV-1 Small Sample: Application on Jaguar Trial Data
publisher	Hindawi Limited
series	AIDS Research and Treatment
issn	2090-1240 2090-1259
publishDate	2012-01-01
description	Background. Many statistical models have been tested to predict phenotypic or virological response from genotypic data. A statistical framework called Super Learner has been introduced either to compare different methods/learners (discrete Super Learner) or to combine them in a Super Learner prediction method. Methods. The Jaguar trial is used to apply the Super Learner framework. The Jaguar study is an “add-on” trial comparing the efficacy of adding didanosine to an ongoing failing regimen. Our aim was also to investigate the impact of different cross-validation strategies and different loss functions. Four different splits between training set and validation set were tested with two loss functions. Six statistical methods were compared. We assessed performance by evaluating R² values and accuracy by calculating the rate of patients correctly classified. Results. Our results indicated that the more recent Super Learner methodology of building a new predictor based on a weighted combination of different methods/learners provided good performance. A simple linear model provided results similar to those of this new predictor. Slight discrepancies arose between the two loss functions investigated, and slight differences also arose between results based on cross-validated risks and results from the full dataset. The Super Learner methodology and the linear model classified around 80% of patients correctly. The difference between the lowest and highest rates was around 10 percent. The number of mutations retained in the different learners also varied from one to 41. Conclusions. The more recent Super Learner methodology combining the predictions of many learners provided good performance on our small dataset.
url	http://dx.doi.org/10.1155/2012/478467

Similar Items