Genome characterization through a mathematical model of the genetic code: an analysis of the whole chromosome 1 of A. thaliana

The objective of this work is to characterize the genome of chromosome 1 of A. thaliana, a small flowering plant used as a model organism in studies of biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome...


Bibliographic Details
Main Author: Properzi, Enrico <1976>
Other Authors: Rosa, Rodolfo
Format: Doctoral Thesis
Language: en
Published: Alma Mater Studiorum - Università di Bologna 2013
Subjects: SECS-S/02 Statistica per la ricerca sperimentale e tecnologica
Online Access: http://amsdottorato.unibo.it/5164/
id ndltd-unibo.it-oai-amsdottorato.cib.unibo.it-5164
record_format oai_dc
spelling ndltd-unibo.it-oai-amsdottorato.cib.unibo.it-5164 2014-03-24T16:30:17Z Genome characterization through a mathematical model of the genetic code: an analysis of the whole chromosome 1 of A. thaliana Properzi, Enrico <1976> SECS-S/02 Statistica per la ricerca sperimentale e tecnologica Alma Mater Studiorum - Università di Bologna Rosa, Rodolfo 2013-02-18 Doctoral Thesis PeerReviewed application/pdf en http://amsdottorato.unibo.it/5164/ info:eu-repo/semantics/openAccess
collection NDLTD
language en
format Doctoral Thesis
sources NDLTD
topic SECS-S/02 Statistica per la ricerca sperimentale e tecnologica
description The objective of this work is to characterize the genome of chromosome 1 of A. thaliana, a small flowering plant used as a model organism in studies of biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome: genes, exons, coding sequences (CDS), introns, long introns, intergenes, untranslated regions (UTR) and regulatory sequences. To accomplish this task, I transformed nucleotide sequences into binary sequences based on the definition of three different dichotomic classes. The descriptive analysis of the binary strings indicates the presence of regularities in each portion of the genome considered. In particular, there are remarkable differences between coding sequences (CDS and exons) and non-coding sequences, suggesting that the reading frame is important only for coding sequences and that dichotomic classes can be useful to recognize them. Then, I assessed the existence of short-range dependence between binary sequences computed on the basis of the different dichotomic classes. I used three different measures of dependence: the well-known chi-squared test and two indices derived from the concept of entropy, i.e. Mutual Information (MI) and Sρ, a normalized version of the “Bhattacharyya-Hellinger-Matusita distance”. The results show a significant short-range dependence structure only for the coding sequences; its existence is a clue to an underlying error detection and correction mechanism. Further studies are needed to assess how the information carried by dichotomic classes could discriminate between coding and non-coding sequences and, therefore, contribute to unveiling the role of the mathematical structure in error detection and correction mechanisms. Still, I have shown the potential of the approach presented for understanding the management of genetic information.
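As an illustration of the three dependence measures named in the description, the following is a minimal sketch, not the thesis's own implementation. It assumes two equal-length binary sequences that have already been derived from nucleotide data (the dichotomic class definitions are not given in this record, so the transformation step is omitted). The function name `dependence_measures` is hypothetical, MI is reported in bits, and Sρ is computed in its usual discrete form, half the sum of squared differences between the square roots of the joint probabilities and of the products of the marginals; the estimator actually used in the thesis may differ.

```python
import math
from collections import Counter


def dependence_measures(x, y):
    """Return (chi2, mi_bits, s_rho) for two equal-length binary sequences.

    Hypothetical helper for illustration only; plug-in estimates from
    relative frequencies, with no significance testing.
    """
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))  # counts of each observed (x_t, y_t) pair
    px = Counter(x)             # marginal counts of x
    py = Counter(y)             # marginal counts of y

    chi2 = mi = s_rho = 0.0
    for a in px:
        for b in py:
            p_ab = joint[(a, b)] / n                # joint relative frequency
            p_ind = (px[a] / n) * (py[b] / n)       # value under independence
            # Pearson chi-squared statistic: sum of (O - E)^2 / E
            chi2 += n * (p_ab - p_ind) ** 2 / p_ind
            # Mutual information; empty cells contribute zero
            if p_ab > 0:
                mi += p_ab * math.log2(p_ab / p_ind)
            # S_rho: half the squared Hellinger-type distance
            s_rho += 0.5 * (math.sqrt(p_ab) - math.sqrt(p_ind)) ** 2
    return chi2, mi, s_rho


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    print(dependence_measures([0, 1, 0, 1], [0, 1, 0, 1]))
    print(dependence_measures([0, 0, 1, 1], [0, 1, 0, 1]))
```

All three quantities are zero when the empirical joint distribution factorizes exactly into its marginals, and grow as the two sequences become more dependent. To probe short-range dependence at a lag k, one could pass `x[:-k]` and `y[k:]` as the two arguments.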
author2 Rosa, Rodolfo
author Properzi, Enrico <1976>
title Genome characterization through a mathematical model of the genetic code: an analysis of the whole chromosome 1 of A. thaliana
publisher Alma Mater Studiorum - Università di Bologna
publishDate 2013
url http://amsdottorato.unibo.it/5164/