Improved measurements of RNA structure conservation with generalized centroid estimators
Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structure...
Main Authors: | Yohei Okada, Yutaka Saito, Kengo Sato, Yasubumi Sakakibara |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2011-08-01 |
Series: | Frontiers in Genetics |
Subjects: | non-coding RNAs; centroid estimators; structure conservation index |
Online Access: | http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00054/full |
id |
doaj-edd077e5b9f64e6a97b6979e064f7f31 |
---|---|
record_format |
Article |
spelling |
doaj-edd077e5b9f64e6a97b6979e064f7f31; 2020-11-25T01:09:20Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2011-08-01; 2; 10.3389/fgene.2011.00054; 12574; Improved measurements of RNA structure conservation with generalized centroid estimators; Yohei Okada (Keio University), Yutaka Saito (Keio University), Kengo Sato (Keio University), Yasubumi Sakakibara (Keio University); http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00054/full; non-coding RNAs; centroid estimators; structure conservation index |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yohei Okada, Yutaka Saito, Kengo Sato, Yasubumi Sakakibara |
author_sort |
Yohei Okada |
title |
Improved measurements of RNA structure conservation with generalized centroid estimators |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2011-08-01 |
description |
Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone could not be an appropriate measure for identifying ncRNAs since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are unfortunately not suitable for identifying ncRNAs in some cases including the genome-wide search and incur high false discovery rate. In this study, we propose improved measurements based on SCI and BPD, applying generalized centroid estimators to incorporate the robustness against low quality multiple alignments. Our experiments show that our proposed methods achieve higher accuracy than the original SCI and BPD for not only human-curated structural alignments but also low quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable with that of the original SCI on structural alignments generated with RAF, a high quality structural aligner, for which two-fold expensive computational time is required on average. We conclude that our methods are more suitable for genome-wide alignments which are of low quality from the point of view on secondary structures than the original SCI and BPD. |
topic |
non-coding RNAs; centroid estimators; structure conservation index |
url |
http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00054/full |