Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network

The enhancer is a short regulatory element that plays a major role in up-regulating eukaryotic gene expression. Identifying enhancers experimentally is time-consuming and costly; therefore, an accurate computational tool is much needed in this area. Existing techniques were devel...


Bibliographic Details
Main Authors: Jhabindra Khanal, Hilal Tayara, Kil To Chong
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Convolutional neural network; DNA sequence; deep learning; enhancers; K-mers; word2vec
Online Access: https://ieeexplore.ieee.org/document/9044822/
id doaj-955fad8826b64cfbbb444b9f89372902
record_format Article
spelling doaj-955fad8826b64cfbbb444b9f89372902
2021-03-30T02:56:07Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
8
58369
58376
10.1109/ACCESS.2020.2982666
9044822
Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
Jhabindra Khanal 0 https://orcid.org/0000-0001-6470-1365
Hilal Tayara 1 https://orcid.org/0000-0001-5678-3479
Kil To Chong 2 https://orcid.org/0000-0002-1952-0001
Department of Electronics and Information Engineering, Chonbuk National University, Jeonju, South Korea
Department of Electronics and Information Engineering, Chonbuk National University, Jeonju, South Korea
Advanced Electronics and Information Research Center, Chonbuk National University, Jeonju, South Korea
The enhancer is a short regulatory element that plays a major role in up-regulating eukaryotic gene expression. Identifying enhancers experimentally is time-consuming and costly; therefore, an accurate computational tool is much needed in this area. Existing techniques rely on handcrafted features followed by machine learning, whereas the proposed model extracts enhancer features from raw DNA sequences by integrating a natural language processing (NLP) technique, word2vec, with a convolutional neural network (CNN). The resulting tool, iEnhancer-CNN, predicts both enhancers and their strength. The evaluation results show that iEnhancer-CNN is superior to the existing state-of-the-art models: it improved the accuracy of enhancer identification and enhancer strength identification by 2.6% and 11.4%, respectively. A web server for iEnhancer-CNN is freely available at https://home.jbnu.ac.kr/NSCL/iEnhancer-CNN.htm.
https://ieeexplore.ieee.org/document/9044822/
Convolutional neural network
DNA sequence
deep learning
enhancers
K-mers
word2vec
collection DOAJ
language English
format Article
sources DOAJ
author Jhabindra Khanal
Hilal Tayara
Kil To Chong
spellingShingle Jhabindra Khanal
Hilal Tayara
Kil To Chong
Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
IEEE Access
Convolutional neural network
DNA sequence
deep learning
enhancers
K-mers
word2vec
author_facet Jhabindra Khanal
Hilal Tayara
Kil To Chong
author_sort Jhabindra Khanal
title Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
title_short Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
title_full Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
title_fullStr Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
title_full_unstemmed Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network
title_sort identifying enhancers and their strength by the integration of word embedding and convolution neural network
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description The enhancer is a short regulatory element that plays a major role in up-regulating eukaryotic gene expression. Identifying enhancers experimentally is time-consuming and costly; therefore, an accurate computational tool is much needed in this area. Existing techniques rely on handcrafted features followed by machine learning, whereas the proposed model extracts enhancer features from raw DNA sequences by integrating a natural language processing (NLP) technique, word2vec, with a convolutional neural network (CNN). The resulting tool, iEnhancer-CNN, predicts both enhancers and their strength. The evaluation results show that iEnhancer-CNN is superior to the existing state-of-the-art models: it improved the accuracy of enhancer identification and enhancer strength identification by 2.6% and 11.4%, respectively. A web server for iEnhancer-CNN is freely available at https://home.jbnu.ac.kr/NSCL/iEnhancer-CNN.htm.
topic Convolutional neural network
DNA sequence
deep learning
enhancers
K-mers
word2vec
url https://ieeexplore.ieee.org/document/9044822/
work_keys_str_mv AT jhabindrakhanal identifyingenhancersandtheirstrengthbytheintegrationofwordembeddingandconvolutionneuralnetwork
AT hilaltayara identifyingenhancersandtheirstrengthbytheintegrationofwordembeddingandconvolutionneuralnetwork
AT kiltochong identifyingenhancersandtheirstrengthbytheintegrationofwordembeddingandconvolutionneuralnetwork
_version_ 1724184325250875392
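
Illustrative note on the abstract: the record lists K-mers and word2vec as subjects, and the abstract says features are extracted from raw DNA sequences with word2vec and a CNN. The record does not give the actual preprocessing, so the following is only a minimal sketch of one common way to turn a DNA sequence into "words" for an embedding layer. It assumes overlapping k-mers with stride 1 and k = 3, and the function names (kmer_tokens, build_vocab, encode) are hypothetical. It is not the authors' pipeline.

```python
from itertools import product


def kmer_tokens(sequence, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1).

    k=3 is an assumption for illustration; the record does not state k.
    """
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(k=3, alphabet="ACGT"):
    """Map every possible k-mer to an integer index starting at 1.

    Index 0 is reserved for k-mers outside the alphabet (e.g. containing N).
    """
    return {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}


def encode(sequence, vocab, k=3):
    """Convert a sequence into the list of integer ids of its k-mers."""
    return [vocab.get(token, 0) for token in kmer_tokens(sequence, k)]


if __name__ == "__main__":
    vocab = build_vocab(k=3)
    print(kmer_tokens("ACGTAC"))    # the k-mer "words" of the sequence
    print(encode("ACGTAC", vocab))  # integer ids, one per k-mer
```

The token lists would be the "sentences" given to a word2vec trainer, and the integer ids would index the resulting embedding matrix before the convolutional layers. Both of those later steps are left out here because the record gives no details about them.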