Characterization and visualization of RNA secondary structure Boltzmann ensemble via information theory


Bibliographic Details
Main Authors: Luan Lin, Wilson H. McKerrow, Bryce Richards, Chukiat Phonsom, Charles E. Lawrence
Format: Article
Language: English
Published: BMC 2018-03-01
Series: BMC Bioinformatics
Subjects: RNA
Online Access: http://link.springer.com/article/10.1186/s12859-018-2078-5
Description
Summary: Background: The nearest neighbor model and associated dynamic programming algorithms allow for the efficient estimation of the RNA secondary structure Boltzmann ensemble. However, because a given RNA secondary structure contains only a fraction of the helices that could form from a given sequence, the Boltzmann ensemble is multimodal. Several methods exist for clustering structures and finding those modes, but less attention has been given to the underlying reason for this multimodality: the presence of conflicting basepairs. Information theory, and more specifically mutual information, provides a method to identify the basepairs that are key to the secondary structure. Results: To this end, we find the most informative basepairs and visualize their effect on the secondary structure. Knowing whether a most informative basepair is present tells us not only the status of that particular pair but also provides a large amount of information about which other pairs are present or absent. We find that a few basepairs account for a large amount of the structural uncertainty. Identifying these pairs points to small changes in sequence or stability that will have a large effect on structure. Conclusion: We provide a novel algorithm that uses mutual information to identify the key basepairs that lead to a multimodal Boltzmann distribution. We then visualize the effect of these pairs on the overall Boltzmann ensemble.
ISSN: 1471-2105
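
The Summary describes using mutual information to find basepairs whose presence or absence says the most about the rest of the structure. The sketch below illustrates that general idea only and is not the authors' algorithm. It assumes a sample of structures is already available, each represented as a set of (i, j) basepairs, and it scores each basepair by its summed pairwise mutual information with every other basepair's presence/absence indicator. The function names, the scoring rule, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length 0/1 sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_basepairs(structures):
    """Rank basepairs by summed mutual information with all other basepairs.

    Assumption: `structures` is a list of sets of (i, j) tuples, e.g. a
    sample drawn from the Boltzmann ensemble by some external tool.
    """
    pairs = sorted(set().union(*structures))
    indicators = {p: [int(p in s) for s in structures] for p in pairs}
    scores = {p: 0.0 for p in pairs}
    for a, b in combinations(pairs, 2):
        mi = mutual_information(indicators[a], indicators[b])
        scores[a] += mi
        scores[b] += mi
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hand-made toy sample (hypothetical data): two mutually exclusive helices
# plus one pair, (20, 25), that appears in only one structure.
sample = [
    {(1, 10), (2, 9), (3, 8)},
    {(1, 10), (2, 9), (3, 8)},
    {(4, 15), (5, 14)},
    {(4, 15), (5, 14), (20, 25)},
]
ranking = rank_basepairs(sample)
for pair, score in ranking:
    print(pair, round(score, 3))
```

In this toy sample the pairs belonging to the two competing helices score highest, because knowing the status of any one of them fixes the status of the others, while the rarely seen pair (20, 25) scores lowest.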