The Purine Bias of Coding Sequences is Determined by Physicochemical Constraints on Proteins

For this report, we analyzed protein secondary structures in relation to the statistics of three nucleotide codon positions. The purpose of this investigation was to find which properties of the ribosome, tRNA or protein level, could explain the purine bias (Rrr) as it is observed in coding DNA. We...


Bibliographic Details
Main Authors: Miguel Ponce De Leon, Antonio Basilio De Miranda, Fernando Alvarez-Valin, Nicolas Carels
Format: Article
Language: English
Published: SAGE Publishing 2014-01-01
Series: Bioinformatics and Biology Insights
Online Access: https://doi.org/10.4137/BBI.S13161
id doaj-642e5a71ff7e4d3180a2b0e81c1b60a6
record_format Article
spelling doaj-642e5a71ff7e4d3180a2b0e81c1b60a6 2020-11-25T03:19:23Z eng SAGE Publishing Bioinformatics and Biology Insights 1177-9322 2014-01-01 8 10.4137/BBI.S13161 The Purine Bias of Coding Sequences is Determined by Physicochemical Constraints on Proteins
Miguel Ponce De Leon (0), Antonio Basilio De Miranda (1), Fernando Alvarez-Valin (2), Nicolas Carels (3)
(0) Secció Biomatemática, Facultad de Ciencias, Universidad de la República, Iguá, Montevideo, Uruguay.
(1) Fundação Oswaldo Cruz (FIOCRUZ), Instituto Oswaldo Cruz (IOC), Laboratório de Genômica Funcional e Bioinformática, Rio de Janeiro, RJ, Brazil.
(2) Secció Biomatemática, Facultad de Ciencias, Universidad de la República, Iguá, Montevideo, Uruguay.
(3) Fundação Oswaldo Cruz (FIOCRUZ), Instituto Oswaldo Cruz (IOC), Laboratório de Genômica Funcional e Bioinformática, Rio de Janeiro, RJ, Brazil.
https://doi.org/10.4137/BBI.S13161
collection DOAJ
language English
format Article
sources DOAJ
author Miguel Ponce De Leon
Antonio Basilio De Miranda
Fernando Alvarez-Valin
Nicolas Carels
title The Purine Bias of Coding Sequences is Determined by Physicochemical Constraints on Proteins
publisher SAGE Publishing
series Bioinformatics and Biology Insights
issn 1177-9322
publishDate 2014-01-01
description For this report, we analyzed protein secondary structures in relation to the statistics of three nucleotide codon positions. The purpose of this investigation was to find which properties of the ribosome, tRNA or protein level, could explain the purine bias (Rrr) as it is observed in coding DNA. We found that the Rrr pattern is the consequence of a regularity (the codon structure) resulting from physicochemical constraints on proteins and thermodynamic constraints on ribosomal machinery. The physicochemical constraints on proteins mainly come from the hydropathy and molecular weight (MW) of secondary structures as well as the energy cost of amino acid synthesis. These constraints appear through a network of statistical correlations, such as (i) the cost of amino acid synthesis, which is in favor of a higher level of guanine in the first codon position, (ii) the constructive contribution of hydropathy alternation in proteins, (iii) the spatial organization of secondary structure in proteins according to solvent accessibility, (iv) the spatial organization of secondary structure according to amino acid hydropathy, (v) the statistical correlation of MW with protein secondary structures and their overall hydropathy, (vi) the statistical correlation of thymine in the second codon position with hydropathy and the energy cost of amino acid synthesis, and (vii) the statistical correlation of adenine in the second codon position with amino acid complexity and the MW of secondary protein structures. Amino acid physicochemical properties and functional constraints on proteins constitute a code that is translated into a purine bias within the coding DNA via tRNAs. In that sense, the Rrr pattern within coding DNA is the effect of information transfer on nucleotide composition from protein to DNA by selection according to the codon positions. Thus, coding DNA structure and ribosomal machinery co-evolved to minimize the energy cost of protein coding given the functional constraints on proteins.
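The abstract refers to the statistics of the three codon positions without showing how such a statistic is obtained. As a minimal sketch, and not the authors' method, the purine (A or G) fraction at each codon position of an in-frame coding sequence could be computed as below. The function name and the toy sequence are hypothetical, and the sketch assumes that Rrr denotes a purine excess at the first codon position relative to the second and third, which the record itself does not spell out.

```python
PURINES = {"A", "G"}


def purine_fraction_by_codon_position(cds: str) -> list[float]:
    """Return the purine fraction at codon positions 1, 2 and 3.

    Assumes `cds` starts in frame; a trailing partial codon is ignored,
    and ambiguous bases (anything other than A, C, G, T) are skipped.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop an incomplete final codon
    purine_counts = [0, 0, 0]
    base_counts = [0, 0, 0]
    for i in range(usable):
        base = seq[i]
        if base not in "ACGT":
            continue
        pos = i % 3  # 0, 1, 2 correspond to codon positions 1, 2, 3
        base_counts[pos] += 1
        if base in PURINES:
            purine_counts[pos] += 1
    return [p / n if n else 0.0 for p, n in zip(purine_counts, base_counts)]


if __name__ == "__main__":
    toy_cds = "ATGGCTGAAGGTAAA"  # hypothetical example, not data from the article
    print(purine_fraction_by_codon_position(toy_cds))
```

On a real gene set, a first-position fraction clearly above the other two would be consistent with the pattern the abstract describes; a single short sequence such as the toy example says nothing about the bias in general.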
url https://doi.org/10.4137/BBI.S13161