A community-powered search of machine learning strategy space to find NMR property prediction models.

The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of...

Bibliographic Details
Main Authors: Lars A Bratholm, Will Gerrard, Brandon Anderson, Shaojie Bai, Sunghwan Choi, Lam Dang, Pavel Hanchar, Addison Howard, Sanghoon Kim, Zico Kolter, Risi Kondor, Mordechai Kornbluth, Youhan Lee, Youngsoo Lee, Jonathan P Mailoa, Thanh Tu Nguyen, Milos Popovic, Goran Rakocevic, Walter Reade, Wonho Song, Luka Stojanovic, Erik H Thiede, Nebojsa Tijanic, Andres Torrubia, Devin Willmott, Craig P Butts, David R Glowacki
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0253612
id doaj-7bd641a416ff403b90f3c0a0e9726773
record_format Article
spelling doaj-7bd641a416ff403b90f3c0a0e9726773; 2021-08-03T04:33:21Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2021-01-01; 16; 7; e0253612; 10.1371/journal.pone.0253612
collection DOAJ
language English
format Article
sources DOAJ
author Lars A Bratholm
Will Gerrard
Brandon Anderson
Shaojie Bai
Sunghwan Choi
Lam Dang
Pavel Hanchar
Addison Howard
Sanghoon Kim
Zico Kolter
Risi Kondor
Mordechai Kornbluth
Youhan Lee
Youngsoo Lee
Jonathan P Mailoa
Thanh Tu Nguyen
Milos Popovic
Goran Rakocevic
Walter Reade
Wonho Song
Luka Stojanovic
Erik H Thiede
Nebojsa Tijanic
Andres Torrubia
Devin Willmott
Craig P Butts
David R Glowacki
title A community-powered search of machine learning strategy space to find NMR property prediction models.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2021-01-01
description The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published 'in-house' efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties.
url https://doi.org/10.1371/journal.pone.0253612