Integration of Residue Attributes for Sequence Diversity Characterization of Terpenoid Enzymes

Progress in the “omics” fields such as genomics, transcriptomics, proteomics, and metabolomics has engendered a need for innovative analytical techniques to derive meaningful information from the ever-increasing molecular data. KNApSAcK motorcycle DB is a popular database for enzymes related to seco...

Bibliographic Details
Main Authors: Nelson Kibinge, Shun Ikeda, Naoaki Ono, Md. Altaf-Ul-Amin, Shigehiko Kanaya
Format: Article
Language: English
Published: Hindawi Limited 2014-01-01
Series: BioMed Research International
Online Access: http://dx.doi.org/10.1155/2014/753428
id doaj-39fbf27d63594a26b1006e7ddf9294af
record_format Article
spelling doaj-39fbf27d63594a26b1006e7ddf9294af
2020-11-24T23:37:52Z
eng
Hindawi Limited
BioMed Research International
2314-6133
2314-6141
2014-01-01
2014
10.1155/2014/753428
753428
Integration of Residue Attributes for Sequence Diversity Characterization of Terpenoid Enzymes
Nelson Kibinge, Shun Ikeda, Naoaki Ono, Md. Altaf-Ul-Amin, Shigehiko Kanaya
Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan (all five authors)
http://dx.doi.org/10.1155/2014/753428
collection DOAJ
language English
format Article
sources DOAJ
author Nelson Kibinge
Shun Ikeda
Naoaki Ono
Md. Altaf-Ul-Amin
Shigehiko Kanaya
title Integration of Residue Attributes for Sequence Diversity Characterization of Terpenoid Enzymes
publisher Hindawi Limited
series BioMed Research International
issn 2314-6133
2314-6141
publishDate 2014-01-01
description Progress in the “omics” fields such as genomics, transcriptomics, proteomics, and metabolomics has engendered a need for innovative analytical techniques to derive meaningful information from the ever-increasing molecular data. KNApSAcK motorcycle DB is a popular database for enzymes related to secondary metabolic pathways in plants. One of the challenges in analyzing protein sequence data in such repositories is the standard notation of sequences as strings of alphabetical characters. This notation lacks a natural underlying metric that would make the sequences amenable to computation. In view of this requirement, we applied a novel integration of selected biochemical and physical attributes of amino acids, derived from the amino acid index and quantified on a numerical scale, to examine the diversity of peptide sequences of terpenoid synthases accumulated in KNApSAcK motorcycle DB. We initially generated a reduced amino acid index table, a set of biochemical and physical properties obtained by random forest feature selection of important indices from the amino acid index. Principal component analysis was then applied to characterize the enzymes involved in the synthesis of terpenoids. Incorporating residue attributes into the analyses increased the variance explained.
url http://dx.doi.org/10.1155/2014/753428