Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology

Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitid...

Bibliographic Details
Main Authors: Ashley I. Heinson, Yawwani Gunawardana, Bastiaan Moesker, Carmen C. Denman Hume, Elena Vataga, Yper Hall, Elena Stylianou, Helen McShane, Ann Williams, Mahesan Niranjan, Christopher H. Woelk
Format: Article
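Language: English
Published: MDPI AG 2017-02-01
Series: International Journal of Molecular Sciences
Subjects: reverse vaccinology; machine learning; support vector machine; bacterial protective antigen; bacterial pathogen
Online Access: http://www.mdpi.com/1422-0067/18/2/312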
id doaj-392e525a902443cca08fb5901a171dab
record_format Article
spelling doaj-392e525a902443cca08fb5901a171dab
2020-11-25T01:29:28Z
eng
MDPI AG
International Journal of Molecular Sciences
1422-0067
2017-02-01
18
2
312
10.3390/ijms18020312
ijms18020312
Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology
Ashley I. Heinson (0): Faculty of Medicine, University of Southampton, Southampton SO17 1BJ, UK
Yawwani Gunawardana (1): Faculty of Medicine, University of Southampton, Southampton SO17 1BJ, UK
Bastiaan Moesker (2): Faculty of Medicine, University of Southampton, Southampton SO17 1BJ, UK
Carmen C. Denman Hume (3): London School of Hygiene and Tropical Medicine (LSHTM), Department of Pathogen Molecular Biology, London WC1E 7HT, UK
Elena Vataga (4): Solutions, University of Southampton, Southampton SO17 1BJ, UK
Yper Hall (5): Public Health England, National Infection Service, Porton Down Salisbury, SP4 0JG, UK
Elena Stylianou (6): The Jenner Institute, University of Oxford, Oxford OX3 7DQ, UK
Helen McShane (7): The Jenner Institute, University of Oxford, Oxford OX3 7DQ, UK
Ann Williams (8): Public Health England, National Infection Service, Porton Down Salisbury, SP4 0JG, UK
Mahesan Niranjan (9): Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK
Christopher H. Woelk (10): Faculty of Medicine, University of Southampton, Southampton SO17 1BJ, UK
collection DOAJ
language English
format Article
sources DOAJ
author Ashley I. Heinson
Yawwani Gunawardana
Bastiaan Moesker
Carmen C. Denman Hume
Elena Vataga
Yper Hall
Elena Stylianou
Helen McShane
Ann Williams
Mahesan Niranjan
Christopher H. Woelk
title Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology
publisher MDPI AG
series International Journal of Molecular Sciences
issn 1422-0067
publishDate 2017-02-01
description Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.
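The description mentions permutation analysis as the way a signal for protective antigens was demonstrated. The record does not give the procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: classifier scores are held fixed, the BPA/non-BPA labels are shuffled, and the observed AUC is compared with the AUCs obtained under shuffling. The data are synthetic, the function names `auc` and `permutation_p_value` are hypothetical, and the authors' own analysis (for example, whether the SVM was retrained on each permutation) may differ.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of (positive, negative) pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_permutations=1000, seed=0):
    """Return the observed AUC and a one-sided permutation p-value for it."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between scores and labels
        if auc(scores, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Synthetic stand-in data (assumption): 50 "BPA" and 50 "non-BPA" scores,
# drawn so that positives tend to score higher than negatives.
data_rng = random.Random(42)
labels = [1] * 50 + [0] * 50
scores = [data_rng.gauss(1.0, 1.0) for _ in range(50)] + [
    data_rng.gauss(0.0, 1.0) for _ in range(50)
]

observed_auc, p_value = permutation_p_value(scores, labels, n_permutations=500)
print(f"observed AUC = {observed_auc:.3f}, permutation p-value = {p_value:.4f}")
```

A small p-value here means that shuffled labels rarely reach the observed AUC, which is the sense in which a permutation analysis can indicate a real signal rather than chance separation.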
topic reverse vaccinology
machine learning
support vector machine
bacterial protective antigen
bacterial pathogen
url http://www.mdpi.com/1422-0067/18/2/312