QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping

Abstract An affinity fingerprint is a vector of a compound's affinities or potencies against a reference panel of protein targets. Here, we present the QAFFP fingerprint, a 440-element in silico QSAR-based affinity fingerprint whose components are predicted by Random Forest regressi...

Bibliographic Details
Main Authors: C. Škuta, I. Cortés-Ciriano, W. Dehaen, P. Kříž, G. J. P. van Westen, I. V. Tetko, A. Bender, D. Svozil
Format: Article
Language: English
Published: BMC 2020-05-01
Series: Journal of Cheminformatics
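Subjects: Affinity fingerprint; Biological fingerprint; QSAR; Similarity searching; Bioactivity modeling; Scaffold hopping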
Online Access: http://link.springer.com/article/10.1186/s13321-020-00443-6
id doaj-030fcddcbd334c029e28370217cfcb3d
record_format Article
spelling doaj-030fcddcbd334c029e28370217cfcb3d
2020-11-25T03:24:01Z
eng
BMC
Journal of Cheminformatics
1758-2946
2020-05-01
vol. 12, no. 1, pp. 1-16
10.1186/s13321-020-00443-6
QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping
C. Škuta (CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i.)
I. Cortés-Ciriano (Centre for Molecular Informatics, Department of Chemistry, University of Cambridge)
W. Dehaen (CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i.)
P. Kříž (Department of Mathematics, Faculty of Chemical Engineering, University of Chemistry and Technology Prague)
G. J. P. van Westen (Computational Drug Discovery, Drug Discovery and Safety, LACDR, Leiden University)
I. V. Tetko (Helmholtz Zentrum Muenchen – German Research Center for Environmental Health (GmbH) and BIGCHEM GmbH)
A. Bender (Centre for Molecular Informatics, Department of Chemistry, University of Cambridge)
D. Svozil (CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i.)
collection DOAJ
language English
format Article
sources DOAJ
author C. Škuta
I. Cortés-Ciriano
W. Dehaen
P. Kříž
G. J. P. van Westen
I. V. Tetko
A. Bender
D. Svozil
title QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping
publisher BMC
series Journal of Cheminformatics
issn 1758-2946
publishDate 2020-05-01
description Abstract An affinity fingerprint is a vector of a compound's affinities or potencies against a reference panel of protein targets. Here, we present the QAFFP fingerprint, a 440-element in silico QSAR-based affinity fingerprint whose components are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented, and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared to that of the 1024-bit Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~0.65 and ~0.70 for similarity searching, depending on the data set, and ~0.85 for classification) and EF5 (~4.67 and ~5.82 for similarity searching, depending on the data set, and ~2.10 for classification), comparable to those of the Morgan2 fingerprint (similarity searching AUC of ~0.57 and ~0.66 and EF5 of ~4.09 and ~6.41, depending on the data set; classification AUC of ~0.87 and EF5 of ~2.16). However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping: it retrieves 1146 of the 1749 existing scaffolds, whereas the Morgan2 fingerprint retrieves only 864.
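The retrieval metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Toy 4-element vectors stand in for the 440-element QAFFP fingerprint, and the binarization cutoff of 6.0, the use of the Tanimoto coefficient on the binary version, and all function names are assumptions made for illustration only. The sketch binarizes real-valued fingerprints, ranks a small library against a query, and computes ROC AUC and an enrichment factor (EF5 corresponds to `fraction=0.05`).

```python
from math import ceil


def binarize(fp, cutoff=6.0):
    """Map a real-valued affinity fingerprint to bits (1 if value >= cutoff).

    The cutoff is a hypothetical value, not the one used in the paper.
    """
    return [1 if v >= cutoff else 0 for v in fp]


def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length bit lists."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def roc_auc(scores, labels):
    """ROC AUC: probability that a random active outscores a random inactive.

    Ties count as 0.5. Assumes at least one active and one inactive.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def enrichment_factor(scores, labels, fraction=0.05):
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    n_top = max(1, ceil(fraction * len(ranked)))
    hits_top = sum(l for _, l in ranked[:n_top])
    return (hits_top / n_top) / (sum(labels) / len(labels))


# Toy data: (predicted affinities, 1 = active / 0 = inactive). Values are made up.
query = [7.2, 4.1, 6.5, 5.0]
library = [
    ([6.9, 4.5, 6.1, 5.2], 1),
    ([6.3, 6.8, 5.1, 4.0], 0),
    ([5.0, 4.0, 6.4, 4.9], 1),
    ([4.2, 6.6, 5.5, 6.1], 0),
]

query_bits = binarize(query)
scores = [tanimoto(query_bits, binarize(fp)) for fp, _ in library]
labels = [label for _, label in library]

print("AUC:", roc_auc(scores, labels))
# A 50% cut is used here only because the toy library has four compounds.
print("EF at 50%:", enrichment_factor(scores, labels, fraction=0.5))
```

An enrichment factor of 1 means the top of the ranking is no richer in actives than the library as a whole, so the EF5 values quoted in the abstract are multiples of that random baseline.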
topic Affinity fingerprint
Biological fingerprint
QSAR
Similarity searching
Bioactivity modeling
Scaffold hopping
url http://link.springer.com/article/10.1186/s13321-020-00443-6