A systematic review of the application of machine learning in the detection and classification of transposable elements

Background Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and...

Bibliographic Details
Main Authors: Simon Orozco-Arias, Gustavo Isaza, Romain Guyot, Reinel Tabares-Soto
Format: Article
Language: English
Published: PeerJ Inc. 2019-12-01
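Series: PeerJ
Subjects: Transposable elements; Retrotransposons; Detection; Classification; Bioinformatics; Machine learning
Online Access: https://peerj.com/articles/8311.pdf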
id doaj-54195f0e1ee94ae4aae2befdd2a7c939
record_format Article
spelling doaj-54195f0e1ee94ae4aae2befdd2a7c939 2020-11-25T01:40:58Z eng PeerJ Inc. PeerJ 2167-8359 2019-12-01 7 e8311 10.7717/peerj.8311
A systematic review of the application of machine learning in the detection and classification of transposable elements
Simon Orozco-Arias (0)
Gustavo Isaza (1)
Romain Guyot (2)
Reinel Tabares-Soto (3)
Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France
Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
Background Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. Methodology We followed the Systematic Literature Review (SLR) process, applying the six stages of the review protocol from it, but added a previous stage, which aims to detect the need for a review. Then search equations were formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. Results Several ML approaches have already been tested on other bioinformatics problems with promising results, yet there are few algorithms and architectures available in literature focused specifically on TEs, despite representing the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. Following the SLR, it was possible to notice that the use of ML for TE analyses (detection and classification) is an open problem, and this new field of research is growing in interest.
https://peerj.com/articles/8311.pdf
Transposable elements
Retrotransposons
Detection
Classification
Bioinformatics
Machine learning
collection DOAJ
language English
format Article
sources DOAJ
author Simon Orozco-Arias
Gustavo Isaza
Romain Guyot
Reinel Tabares-Soto
spellingShingle Simon Orozco-Arias
Gustavo Isaza
Romain Guyot
Reinel Tabares-Soto
A systematic review of the application of machine learning in the detection and classification of transposable elements
PeerJ
Transposable elements
Retrotransposons
Detection
Classification
Bioinformatics
Machine learning
author_facet Simon Orozco-Arias
Gustavo Isaza
Romain Guyot
Reinel Tabares-Soto
author_sort Simon Orozco-Arias
title A systematic review of the application of machine learning in the detection and classification of transposable elements
title_sort systematic review of the application of machine learning in the detection and classification of transposable elements
publisher PeerJ Inc.
series PeerJ
issn 2167-8359
publishDate 2019-12-01
description Background Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies have demonstrated their deep impact on species diversity, adaptation to the environment, and disease. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none has achieved reliable results across different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to several scientific problems. Methodology We followed the Systematic Literature Review (SLR) process, applying the six stages of its review protocol and adding a preliminary stage that aims to detect the need for a review. Search equations were then formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer the research questions. Results Several ML approaches have already been tested on other bioinformatics problems with promising results, yet few algorithms and architectures in the literature focus specifically on TEs, even though TEs represent the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their use in TE analyses is still limited. The SLR showed that the use of ML for TE analyses (detection and classification) remains an open problem and that this new field of research is attracting growing interest.
topic Transposable elements
Retrotransposons
Detection
Classification
Bioinformatics
Machine learning
url https://peerj.com/articles/8311.pdf
_version_ 1725043312334733312