Combining structural and bioactivity-based fingerprints improves prediction performance and scaffold hopping capability

Abstract This study aims to improve on existing activity prediction methods by augmenting chemical structure fingerprints with bioactivity-based fingerprints derived from high-throughput screening (HTS) data (HTSFPs), thereby showcasing the benefits of combining different descriptor types...

Bibliographic Details
Main Authors: Oliver Laufkötter, Noé Sturm, Jürgen Bajorath, Hongming Chen, Ola Engkvist
Format: Article
Language: English
Published: BMC 2019-08-01
Series: Journal of Cheminformatics
Subjects: Machine learning, Random forest, High throughput screening, Activity prediction, HTSFP, ECFP
Online Access: http://link.springer.com/article/10.1186/s13321-019-0376-1
id doaj-f6952732ab6b4d9c8851248ff8ad6ff5
record_format Article
spelling doaj-f6952732ab6b4d9c8851248ff8ad6ff5
timestamp 2020-11-25T03:52:42Z
issn 1758-2946
citation Volume 11, issue 1, pages 1-14
doi 10.1186/s13321-019-0376-1
affiliation Oliver Laufkötter: Hit Discovery, Discovery Sciences, R&D, AstraZeneca
Noé Sturm: Hit Discovery, Discovery Sciences, R&D, AstraZeneca
Jürgen Bajorath: Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität
Hongming Chen: Hit Discovery, Discovery Sciences, R&D, AstraZeneca
Ola Engkvist: Hit Discovery, Discovery Sciences, R&D, AstraZeneca
description Abstract This study aims to improve on existing activity prediction methods by augmenting chemical structure fingerprints with bioactivity-based fingerprints derived from high-throughput screening (HTS) data (HTSFPs), thereby showcasing the benefits of combining different descriptor types. This type of descriptor would be applied in an iterative screening scenario for more targeted compound set selection. The HTSFPs were generated from HTS data obtained from PubChem and combined with an ECFP4 structural fingerprint. The resulting bioactivity-structure hybrid (BaSH) fingerprint was benchmarked against the individual ECFP4 and HTSFP fingerprints. Performance was evaluated via retrospective analysis of a subset of the PubChem HTS data. Results showed that the BaSH fingerprint has improved predictive performance as well as scaffold hopping capability. The BaSH fingerprint identified unique compounds compared to both the ECFP4 and the HTSFP fingerprints, indicating synergistic effects between the two. A feature importance analysis showed that a small subset of the HTSFP features contributes most to the overall performance of the BaSH fingerprint. This hybrid approach allows for activity prediction of compounds with only sparse HTSFPs, owing to the supporting effect of the structural fingerprint.
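The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how HTSFP values are encoded or how untested assays are handled, so the real-valued activity scores, the `missing=0.0` fill value, the assay identifiers, and the function names `build_htsfp` and `build_bash` are all assumptions made for illustration. The structural bits are a toy stand-in for a precomputed ECFP4 fingerprint.

```python
from typing import Dict, List, Sequence


def build_htsfp(assay_ids: Sequence[str],
                results: Dict[str, float],
                missing: float = 0.0) -> List[float]:
    """Lay out one compound's HTS results in a fixed assay order.

    Assays in which the compound was never tested receive `missing`
    (an assumption of this sketch). Compounds tested in few assays
    therefore end up with a sparse HTSFP.
    """
    return [results.get(assay, missing) for assay in assay_ids]


def build_bash(ecfp_bits: Sequence[int], htsfp: Sequence[float]) -> List[float]:
    """Concatenate structural bits and bioactivity values into one vector."""
    return [float(bit) for bit in ecfp_bits] + list(htsfp)


# Hypothetical assay identifiers defining the HTSFP column order.
ASSAYS = ["assay_A", "assay_B", "assay_C"]

# Toy 4-bit stand-in for a precomputed ECFP4 bit vector.
toy_ecfp = [1, 0, 0, 1]

# This compound was tested in a single assay, so its HTSFP is sparse.
toy_results = {"assay_B": 2.5}

htsfp = build_htsfp(ASSAYS, toy_results)
bash = build_bash(toy_ecfp, htsfp)
print(bash)  # [1.0, 0.0, 0.0, 1.0, 0.0, 2.5, 0.0]
```

Even when the bioactivity part is almost empty, as here, the structural part of the hybrid vector still carries information, which is the supporting effect the abstract refers to. A classifier such as the random forest named in the record's keywords would then be trained on these concatenated vectors.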
topic Machine learning
Random forest
High throughput screening
Activity prediction
HTSFP
ECFP