Feature selection methods for identifying genetic determinants of host species in RNA viruses.

Despite environmental, social and ecological dependencies, the emergence of zoonotic viruses in human populations is also clearly affected by genetic factors that determine cross-species transmission potential. RNA viruses pose an interesting case study given that their mutation rates are orders of magnitud...

Bibliographic Details
Main Authors: Ricardo Aguas, Neil M Ferguson
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC3794897?pdf=render
id doaj-58f9d64b345242929077db49d5ab9b7b
record_format Article
spelling doaj-58f9d64b345242929077db49d5ab9b7b; record timestamp 2020-11-24T21:55:55Z; volume 9, issue 10, article e1003254; DOI 10.1371/journal.pcbi.1003254
collection DOAJ
language English
format Article
sources DOAJ
author Ricardo Aguas
Neil M Ferguson
title Feature selection methods for identifying genetic determinants of host species in RNA viruses.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2013-01-01
description Despite environmental, social and ecological dependencies, the emergence of zoonotic viruses in human populations is also clearly affected by genetic factors that determine cross-species transmission potential. RNA viruses pose an interesting case study given that their mutation rates are orders of magnitude higher than those of any other pathogen, as reflected by the recent emergence of SARS and influenza, for example. Here, we show how feature selection techniques can be used to reliably classify viral sequences by host species, and to identify the crucial minority of host-specific sites in pathogen genomic data. The variability in alleles at those sites can be translated into prediction probabilities that a particular pathogen isolate is adapted to a given host. We illustrate the power of these methods by: 1) identifying the sites explaining SARS coronavirus differences between human, bat and palm civet samples; 2) showing how cross-species jumps of rabies virus among bat populations can be readily identified; and 3) identifying, de novo, likely functional influenza host-discriminant markers.