Identification of natural selection in genomic data with deep convolutional neural network

Bibliographic Details
Main Authors: Bertorelle, G. (Author), Nguembang Fadja, A. (Author), Riguzzi, F. (Author), Trucchi, E. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: View Fulltext in Publisher (https://doi.org/10.1186/s13040-021-00280-9)
LEADER 02154nam a2200313Ia 4500
001 10.1186-s13040-021-00280-9
008 220427s2021 CNT 000 0 und d
020 |a 17560381 (ISSN) 
245 1 0 |a Identification of natural selection in genomic data with deep convolutional neural network 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s13040-021-00280-9 
520 3 |a Background: With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning, and specifically Convolutional Neural Networks, have been proposed to make inferences on demographic and adaptive processes using genomic data. Although it has already been shown to be powerful and efficient in other fields of investigation, Supervised Machine Learning has yet to be explored enough to unfold its enormous potential in evolutionary genomics. Results: The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy. © 2021, The Author(s). 
650 0 4 |a adult 
650 0 4 |a article 
650 0 4 |a convolutional neural network 
650 0 4 |a Convolutional Neural Networks 
650 0 4 |a deep learning 
650 0 4 |a Deep Learning 
650 0 4 |a Genomic data 
650 0 4 |a human tissue 
650 0 4 |a Inference of natural selection 
650 0 4 |a natural selection 
650 0 4 |a simulation 
650 0 4 |a supervised machine learning 
700 1 |a Bertorelle, G.  |e author 
700 1 |a Nguembang Fadja, A.  |e author 
700 1 |a Riguzzi, F.  |e author 
700 1 |a Trucchi, E.  |e author 
773 |t BioData Mining