Improved measurements of RNA structure conservation with generalized centroid estimators

Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structure...

Bibliographic Details
Main Authors: Yohei Okada, Yutaka Saito, Kengo Sato, Yasubumi Sakakibara
Format: Article
Language: English
Published: Frontiers Media S.A. 2011-08-01
Series: Frontiers in Genetics
Subjects: non-coding RNAs, centroid estimators, structure conservation index
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00054/full
id doaj-edd077e5b9f64e6a97b6979e064f7f31
record_format Article
spelling doaj-edd077e5b9f64e6a97b6979e064f7f31
2020-11-25T01:09:20Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2011-08-01
2
10.3389/fgene.2011.00054
12574
Improved measurements of RNA structure conservation with generalized centroid estimators
Yohei Okada (Keio University)
Yutaka Saito (Keio University)
Kengo Sato (Keio University)
Yasubumi Sakakibara (Keio University)
Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone could not be an appropriate measure for identifying ncRNAs since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are unfortunately not suitable for identifying ncRNAs in some cases including the genome-wide search and incur high false discovery rate. In this study, we propose improved measurements based on SCI and BPD, applying generalized centroid estimators to incorporate the robustness against low quality multiple alignments. Our experiments show that our proposed methods achieve higher accuracy than the original SCI and BPD for not only human-curated structural alignments but also low quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable with that of the original SCI on structural alignments generated with RAF, a high quality structural aligner, for which two-fold expensive computational time is required on average. We conclude that our methods are more suitable for genome-wide alignments which are of low quality from the point of view on secondary structures than the original SCI and BPD.
http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00054/full
non-coding RNAs
centroid estimators
structure conservation index
collection DOAJ
language English
format Article
sources DOAJ
author Yohei Okada
Yutaka Saito
Kengo Sato
Yasubumi Sakakibara
spellingShingle Yohei Okada
Yutaka Saito
Kengo Sato
Yasubumi Sakakibara
Improved measurements of RNA structure conservation with generalized centroid estimators
Frontiers in Genetics
non-coding RNAs
centroid estimators
structure conservation index
author_facet Yohei Okada
Yutaka Saito
Kengo Sato
Yasubumi Sakakibara
author_sort Yohei Okada
title Improved measurements of RNA structure conservation with generalized centroid estimators
title_short Improved measurements of RNA structure conservation with generalized centroid estimators
title_full Improved measurements of RNA structure conservation with generalized centroid estimators
title_fullStr Improved measurements of RNA structure conservation with generalized centroid estimators
title_full_unstemmed Improved measurements of RNA structure conservation with generalized centroid estimators
title_sort improved measurements of rna structure conservation with generalized centroid estimators
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2011-08-01
description Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone could not be an appropriate measure for identifying ncRNAs since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are unfortunately not suitable for identifying ncRNAs in some cases including the genome-wide search and incur high false discovery rate. In this study, we propose improved measurements based on SCI and BPD, applying generalized centroid estimators to incorporate the robustness against low quality multiple alignments. Our experiments show that our proposed methods achieve higher accuracy than the original SCI and BPD for not only human-curated structural alignments but also low quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable with that of the original SCI on structural alignments generated with RAF, a high quality structural aligner, for which two-fold expensive computational time is required on average. We conclude that our methods are more suitable for genome-wide alignments which are of low quality from the point of view on secondary structures than the original SCI and BPD.
topic non-coding RNAs
centroid estimators
structure conservation index
url http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00054/full
work_keys_str_mv AT yoheieokada improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
AT yutakaesaito improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
AT kengoesato improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
AT yasubumiesakakibara improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
_version_ 1725179574343434240