Combining structural and bioactivity-based fingerprints improves prediction performance and scaffold hopping capability
Abstract This study aims to improve upon existing activity prediction methods by augmenting chemical structure fingerprints with bioactivity-based fingerprints derived from high-throughput screening (HTS) data (HTSFPs), thereby showcasing the benefits of combining different descriptor types....
Main Authors: | Oliver Laufkötter, Noé Sturm, Jürgen Bajorath, Hongming Chen, Ola Engkvist |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2019-08-01 |
Series: | Journal of Cheminformatics |
Subjects: | Machine learning, Random forest, High throughput screening, Activity prediction, HTSFP, ECFP |
Online Access: | http://link.springer.com/article/10.1186/s13321-019-0376-1 |
id |
doaj-f6952732ab6b4d9c8851248ff8ad6ff5 |
---|---|
record_format |
Article |
spelling |
doaj-f6952732ab6b4d9c8851248ff8ad6ff5; 2020-11-25T03:52:42Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2019-08-01; 11; 1; 1; 14; 10.1186/s13321-019-0376-1
Combining structural and bioactivity-based fingerprints improves prediction performance and scaffold hopping capability
Oliver Laufkötter (0), Noé Sturm (1), Jürgen Bajorath (2), Hongming Chen (3), Ola Engkvist (4)
(0) Hit Discovery, Discovery Sciences, R&D, AstraZeneca
(1) Hit Discovery, Discovery Sciences, R&D, AstraZeneca
(2) Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität
(3) Hit Discovery, Discovery Sciences, R&D, AstraZeneca
(4) Hit Discovery, Discovery Sciences, R&D, AstraZeneca
Abstract This study aims to improve upon existing activity prediction methods by augmenting chemical structure fingerprints with bioactivity-based fingerprints derived from high-throughput screening (HTS) data (HTSFPs), thereby showcasing the benefits of combining different descriptor types. This type of descriptor would be applied in an iterative screening scenario for more targeted compound set selection. The HTSFPs were generated from HTS data obtained from PubChem and combined with an ECFP4 structural fingerprint. The bioactivity-structure hybrid (BaSH) fingerprint was benchmarked against the individual ECFP4 and HTSFP fingerprints. Their performance was evaluated via retrospective analysis of a subset of the PubChem HTS data. Results showed that the BaSH fingerprint has improved predictive performance as well as scaffold hopping capability. The BaSH fingerprint identified unique compounds compared to both the ECFP4 and the HTSFP fingerprints, indicating synergistic effects between the two fingerprints. A feature importance analysis showed that a small subset of the HTSFP features contributes most to the overall performance of the BaSH fingerprint. This hybrid approach allows for activity prediction of compounds with only sparse HTSFPs, owing to the supporting effect of the structural fingerprint.
http://link.springer.com/article/10.1186/s13321-019-0376-1
Machine learning; Random forest; High throughput screening; Activity prediction; HTSFP; ECFP |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Oliver Laufkötter, Noé Sturm, Jürgen Bajorath, Hongming Chen, Ola Engkvist |
title |
Combining structural and bioactivity-based fingerprints improves prediction performance and scaffold hopping capability |
publisher |
BMC |
series |
Journal of Cheminformatics |
issn |
1758-2946 |
publishDate |
2019-08-01 |
description |
Abstract This study aims to improve upon existing activity prediction methods by augmenting chemical structure fingerprints with bioactivity-based fingerprints derived from high-throughput screening (HTS) data (HTSFPs), thereby showcasing the benefits of combining different descriptor types. This type of descriptor would be applied in an iterative screening scenario for more targeted compound set selection. The HTSFPs were generated from HTS data obtained from PubChem and combined with an ECFP4 structural fingerprint. The bioactivity-structure hybrid (BaSH) fingerprint was benchmarked against the individual ECFP4 and HTSFP fingerprints. Their performance was evaluated via retrospective analysis of a subset of the PubChem HTS data. Results showed that the BaSH fingerprint has improved predictive performance as well as scaffold hopping capability. The BaSH fingerprint identified unique compounds compared to both the ECFP4 and the HTSFP fingerprints, indicating synergistic effects between the two fingerprints. A feature importance analysis showed that a small subset of the HTSFP features contributes most to the overall performance of the BaSH fingerprint. This hybrid approach allows for activity prediction of compounds with only sparse HTSFPs, owing to the supporting effect of the structural fingerprint. |
topic |
Machine learning, Random forest, High throughput screening, Activity prediction, HTSFP, ECFP |
url |
http://link.springer.com/article/10.1186/s13321-019-0376-1 |
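The hybrid descriptor described in the abstract can be illustrated with a minimal sketch. It assumes that "combined" means plain concatenation of the structural bit vector with the HTS-derived vector, and that untested assays are filled with a neutral value; this record specifies neither, so both are assumptions. The function names, the assay identifiers, and the four-bit stand-in for ECFP4 are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def htsfp(activity: Dict[str, float],
          assay_order: Sequence[str],
          missing: float = 0.0) -> List[float]:
    """Lay out per-assay activity values in a fixed assay order.

    Assays in which the compound was never tested receive `missing`
    (an assumed fill value), which is what makes an HTSFP sparse.
    """
    return [activity.get(assay, missing) for assay in assay_order]


def hybrid_fingerprint(structural_bits: Sequence[int],
                       hts_values: Sequence[float]) -> List[float]:
    """Concatenate structural bits and HTS-based values into one vector."""
    return [float(bit) for bit in structural_bits] + list(hts_values)


# Toy inputs: hypothetical assay names and a 4-bit stand-in for ECFP4.
assays = ["assay_A", "assay_B", "assay_C"]
ecfp_like = [1, 0, 0, 1]

# A compound tested in only one assay still gets a full-length vector,
# because the structural part is always available.
sparse_hts = htsfp({"assay_B": 1.0}, assays)
hybrid = hybrid_fingerprint(ecfp_like, sparse_hts)
print(hybrid)  # [1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```

A vector of this shape could then be passed to any classifier that accepts fixed-length numeric features, such as the random forest named in the record's subject terms.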