Analyzing Codon Usage and Coding Sequence Length Biases Across the Tree of Life

Bibliographic Details
Main Author: Miller, Justin B
Format: Others
Published: BYU ScholarsArchive 2018
Online Access: https://scholarsarchive.byu.edu/etd/7603
https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=8603&context=etd
Description
Summary: Although codon usage bias has been shown to persist through non-random mutations and selection, many avenues of research into its applications remain unexplored. In this dissertation, we present several new applications of codon usage bias and their practical uses in a phylogenetic context. In Chapter 1, we review the literature and provide background on other software applications of codon usage bias. In Chapter 2, we show that codon aversion in orthologs is phylogenetically conserved in tetrapods. We extend this analysis in Chapter 3 by exploring codon use and aversion across the Tree of Life, providing frameworks for other researchers to analyze different species subsets. In Chapter 4, we present a novel algorithm that recovers species relationships using codon aversion, without regard to orthologous relationships. In Chapter 5, we present several other algorithms that recover species relationships using biases in codon pairing. Chapter 6 analyzes the relationship between codon usage bias in viruses that infect humans and the proteins found in the tissues they infect. In Chapter 7, we present our discovery that coding sequence lengths are conserved in orthologous genes, which allowed us to accurately recover orthologous gene relationships and reduce overall ortholog identification runtime by over 96%. In Chapter 8, we discuss a novel algorithm for extracting a ramp of slowly translated codons located at the beginning of gene sequences, allowing researchers to quickly identify translational bottlenecks. Finally, Chapter 9 touches on future applications of codon usage bias in phylogenetics. This dissertation represents a major advance in phylogenetics by providing a framework and paradigm shift toward using codon usage and coding sequence length biases in future analyses.