Comparison of Data Fusion Methods as Consensus Scores for Ensemble Docking

Ensemble docking is a widely applied concept in structure-based virtual screening—to at least partly account for protein flexibility—usually granting a significant performance gain at a modest cost of speed. From the individual, single-structure docking scores, a consensus score...

Bibliographic Details
Main Authors: Dávid Bajusz, Anita Rácz, Károly Héberger
Format: Article
Language: English
Published: MDPI AG 2019-07-01
Series: Molecules
Subjects:
SRD
AUC
Online Access: https://www.mdpi.com/1420-3049/24/15/2690
id doaj-fea8b84cbf024bda805638cad806a75b
record_format Article
doi 10.3390/molecules24152690
volume 24
issue 15
article_number 2690
affiliation Dávid Bajusz: Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, H-1117 Budapest, Hungary
affiliation Anita Rácz: Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, H-1117 Budapest, Hungary
affiliation Károly Héberger: Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, H-1117 Budapest, Hungary
collection DOAJ
sources DOAJ
issn 1420-3049
description Ensemble docking is a widely applied concept in structure-based virtual screening that at least partly accounts for protein flexibility, usually granting a significant performance gain at a modest cost in speed. From the individual, single-structure docking scores, a consensus score needs to be produced by data fusion: this is usually done by taking the best docking score from the available pool (in most cases, and in this study as well, this is the minimum score). Nonetheless, a number of other fusion rules can be applied. We report here the results of a detailed statistical comparison of seven fusion rules for ensemble docking, on five case studies of current drug targets, based on four performance metrics. Sevenfold cross-validation and analysis of variance (ANOVA) allowed us to highlight the best fusion rules. The results are presented in bubble plots, which unite the four performance metrics in a single, comprehensive image. Notably, we suggest the use of the geometric and harmonic means as better alternatives to the generally applied minimum fusion rule.
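The fusion step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fuse_docking_scores`, the choice of four rules, and the handling of negative scores (means are taken over the negated, positive values and the sign is restored afterwards) are all assumptions made for illustration. The record does not list the seven rules or say how the geometric and harmonic means were applied to negative docking scores.

```python
from statistics import fmean, geometric_mean, harmonic_mean


def fuse_docking_scores(scores):
    """Fuse one ligand's per-structure docking scores into consensus scores.

    Assumption: every score is negative and lower means better binding.
    Geometric and harmonic means are undefined for negative inputs, so this
    sketch averages the negated scores and flips the sign back. That
    convention is hypothetical and not taken from the article.
    """
    if not scores:
        raise ValueError("at least one docking score is required")
    if any(s >= 0 for s in scores):
        raise ValueError("this sketch assumes strictly negative docking scores")

    strengths = [-s for s in scores]  # positive values, larger is better
    return {
        "minimum": min(scores),  # the conventional 'best score' rule
        "arithmetic_mean": fmean(scores),
        "geometric_mean": -geometric_mean(strengths),
        "harmonic_mean": -harmonic_mean(strengths),
    }


# Hypothetical scores of one ligand docked into two protein structures.
consensus = fuse_docking_scores([-8.0, -2.0])
for rule, value in consensus.items():
    print(f"{rule}: {value:.2f}")
```

With these made-up inputs, the minimum rule reports only the single best pose, while the three means are pulled toward the weaker score, which is the kind of difference between fusion rules that the study compares.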
topic ensemble docking
data fusion
SRD
ROC curve
AUC
BEDROC