Understanding the underlying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein.

The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncove...

Bibliographic Details
Main Authors: Mansour Ebrahimi, Parisa Aghagolzadeh, Narges Shamabadi, Ahmad Tahmasebi, Mohammed Alsharifi, David L Adelson, Farhid Hemmatzadeh, Esmaeil Ebrahimie
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4014573?pdf=render
id doaj-0de64721b5e84ec8ae1e75b7b5872a01
record_format Article
spelling doaj-0de64721b5e84ec8ae1e75b7b5872a01; 2020-11-25T00:23:38Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2014-01-01; vol. 9, no. 5, e96984; doi:10.1371/journal.pone.0096984
collection DOAJ
language English
format Article
sources DOAJ
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2014-01-01
description The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncover the physicochemical attributes which govern HA subtyping, we performed a large-scale functional analysis of over 7000 sequences of 16 different HA subtypes. A large number (896) of physicochemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models which can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on the calculated protein characteristics dataset as well as on 10 trimmed datasets generated by the attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross-validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features that are associated with HA subtyping. The Random Forest tree induction algorithm and the RBF kernel function of SVM (scaled by grid search) showed a high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption. Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represents a new avenue for understanding and predicting the possible future structure of influenza pandemics.
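As a rough illustration of the kind of amino acid attributes the description refers to, the following Python sketch computes a count and a percentage for the residues named as key features. It is not the authors' pipeline and does not reproduce their 896 characteristics. Several points are assumptions made for illustration: "frequency" is read as a raw residue count, "percentage" as that count divided by sequence length, "Try" is mapped to tryptophan (W), and the names `KEY_RESIDUES` and `composition_features` are hypothetical.

```python
from collections import Counter

# Residues highlighted in the description, as one-letter codes.
# Assumption: "Try" is taken to mean tryptophan (W); the record does not say.
KEY_RESIDUES = {"Gln": "Q", "Tyr": "Y", "Cys": "C", "Glu": "E", "Trp": "W"}


def composition_features(sequence: str) -> dict[str, float]:
    """Return count ("freq_") and percentage ("pct_") features per key residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    features: dict[str, float] = {}
    for name, letter in KEY_RESIDUES.items():
        # Assumption: "frequency" = raw count, "percentage" = count / length * 100.
        features[f"freq_{name}"] = float(counts[letter])
        features[f"pct_{name}"] = 100.0 * counts[letter] / len(seq)
    return features


if __name__ == "__main__":
    # Toy fragment for demonstration only; not a real HA sequence.
    toy = "MKQQYCEWQA"
    for key, value in composition_features(toy).items():
        print(f"{key}: {value:.1f}")
```

A feature table built this way (one row per sequence, one column per attribute) is the sort of input that attribute weighting and classifiers such as Random Forest or an RBF-kernel SVM would then consume.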