Bioactive molecule prediction using extreme gradient boosting

Following the explosive growth in chemical and biological data, the shift from traditional methods of drug discovery to computer-aided means has made data mining and machine learning methods integral parts of today's drug discovery process. In this paper, extreme gradient boosting (Xgboost), wh...

Bibliographic Details
Main Authors: Mustapha, I. B. (Author), Saeed, F. (Author)
Format: Article
Language: English
Published: MDPI AG, 2016.
Subjects: QA75 Electronic computers. Computer science
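Online Access: Get fulltext (http://eprints.utm.my/id/eprint/72233/1/FaisalSaeed2016_BioactiveMoleculePredictionUsingExtreme.pdf)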
LEADER 01504 am a22001453u 4500
001 72233
042 |a dc 
100 1 0 |a Mustapha, I. B.  |e author 
700 1 0 |a Saeed, F.  |e author 
245 0 0 |a Bioactive molecule prediction using extreme gradient boosting 
260 |b MDPI AG,   |c 2016. 
856 |z Get fulltext  |u http://eprints.utm.my/id/eprint/72233/1/FaisalSaeed2016_BioactiveMoleculePredictionUsingExtreme.pdf 
520 |a Following the explosive growth in chemical and biological data, the shift from traditional methods of drug discovery to computer-aided means has made data mining and machine learning methods integral parts of today's drug discovery process. In this paper, extreme gradient boosting (Xgboost), which is an ensemble of Classification and Regression Trees (CART) and a variant of the Gradient Boosting Machine, was investigated for the prediction of biological activity based on a quantitative description of the compound's molecular structure. Seven datasets well known in the literature were used, and the experimental results show that Xgboost can outperform machine learning algorithms like Random Forest (RF), Support Vector Machines (LSVM), Radial Basis Function Neural Network (RBFN) and Naïve Bayes (NB) for the prediction of biological activities. In addition to its ability to detect minority activity classes in highly imbalanced datasets, it showed remarkable performance on both high and low diversity datasets. 
546 |a en 
650 0 4 |a QA75 Electronic computers. Computer science