A systematic review of the application of machine learning in the detection and classification of transposable elements
Background Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and...
Main Authors: | Simon Orozco-Arias, Gustavo Isaza, Romain Guyot, Reinel Tabares-Soto |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2019-12-01 |
Series: | PeerJ |
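Subjects: | Transposable elements, Retrotransposons, Detection, Classification, Bioinformatics, Machine learning |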
Online Access: | https://peerj.com/articles/8311.pdf |
id |
doaj-54195f0e1ee94ae4aae2befdd2a7c939 |
---|---|
record_format |
Article |
spelling |
doaj-54195f0e1ee94ae4aae2befdd2a7c939 2020-11-25T01:40:58Z eng PeerJ Inc. PeerJ 2167-8359 2019-12-01 7 e8311 10.7717/peerj.8311 A systematic review of the application of machine learning in the detection and classification of transposable elements Simon Orozco-Arias (0), Gustavo Isaza (1), Romain Guyot (2), Reinel Tabares-Soto (3) Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia; Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia; Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France; Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia Background: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. Methodology: We followed the Systematic Literature Review (SLR) process, applying the six stages of its review protocol, but added a preliminary stage, which aims to detect the need for a review. Search equations were then formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. Results: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet few algorithms and architectures in the literature focus specifically on TEs, even though TEs represent the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. The SLR showed that the use of ML for TE analyses (detection and classification) is an open problem, and this new field of research is growing in interest. https://peerj.com/articles/8311.pdf Transposable elements; Retrotransposons; Detection; Classification; Bioinformatics; Machine learning |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Simon Orozco-Arias Gustavo Isaza Romain Guyot Reinel Tabares-Soto |
spellingShingle |
Simon Orozco-Arias Gustavo Isaza Romain Guyot Reinel Tabares-Soto A systematic review of the application of machine learning in the detection and classification of transposable elements PeerJ Transposable elements Retrotransposons Detection Classification Bioinformatics Machine learning |
author_facet |
Simon Orozco-Arias Gustavo Isaza Romain Guyot Reinel Tabares-Soto |
author_sort |
Simon Orozco-Arias |
title |
A systematic review of the application of machine learning in the detection and classification of transposable elements |
title_short |
A systematic review of the application of machine learning in the detection and classification of transposable elements |
title_full |
A systematic review of the application of machine learning in the detection and classification of transposable elements |
title_fullStr |
A systematic review of the application of machine learning in the detection and classification of transposable elements |
title_full_unstemmed |
A systematic review of the application of machine learning in the detection and classification of transposable elements |
title_sort |
systematic review of the application of machine learning in the detection and classification of transposable elements |
publisher |
PeerJ Inc. |
series |
PeerJ |
issn |
2167-8359 |
publishDate |
2019-12-01 |
description |
Background: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. Methodology: We followed the Systematic Literature Review (SLR) process, applying the six stages of its review protocol, but added a preliminary stage, which aims to detect the need for a review. Search equations were then formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. Results: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet few algorithms and architectures in the literature focus specifically on TEs, even though TEs represent the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. The SLR showed that the use of ML for TE analyses (detection and classification) is an open problem, and this new field of research is growing in interest. |
topic |
Transposable elements Retrotransposons Detection Classification Bioinformatics Machine learning |
url |
https://peerj.com/articles/8311.pdf |
work_keys_str_mv |
AT simonorozcoarias asystematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT gustavoisaza asystematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT romainguyot asystematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT reineltabaressoto asystematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT simonorozcoarias systematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT gustavoisaza systematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT romainguyot systematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements AT reineltabaressoto systematicreviewoftheapplicationofmachinelearninginthedetectionandclassificationoftransposableelements |
_version_ |
1725043312334733312 |