Quantitative integration of biological knowledge for the analysis of high-throughput genomic data

The development of high-throughput technologies has changed the way in which we approach questions in biology by allowing us to assess the relative state of tens of thousands of genes or gene products in a single assay. A great deal of research has focused on developing statistical methods to identi...

Bibliographic Details
Main Author: Chittenden, Thomas William
Other Authors: Holmes, Chris
Published: University of Oxford 2012
Subjects: 572.8
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.559860
id ndltd-bl.uk-oai-ethos.bl.uk-559860
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-559860; 2017-06-27T03:28:31Z; Quantitative integration of biological knowledge for the analysis of high-throughput genomic data; Chittenden, Thomas William; Holmes, Chris; 2012; The development of high-throughput technologies has changed the way in which we approach questions in biology by allowing us to assess the relative state of tens of thousands of genes or gene products in a single assay. A great deal of research has focused on developing statistical methods to identify biologically relevant sets of genes whose collective state correlates with a given phenotype under study. However, placing these gene sets into an intellectual framework that allows for hypothesis generation and mechanistic interpretation remains a significant challenge. To address these issues, we first apply and then extend a well-established Gene Ontology singular enrichment analysis method to quantitatively assess overrepresented biological themes within lists of somatically mutated and abnormally expressed genes from publicly available human breast, colorectal, lung, prostate, and renal cancer datasets. We further validate the utility of this novel approach with actual experimental laboratory investigations. Finally, we describe a general strategy for constructing prediction models by integrating prior biological knowledge with gene expression data from three large human breast cancer datasets. We show how this biological network-based model improves performance and interoperability by identifying genes more closely related to breast cancer etiology and patient survival. The work presented throughout this manuscript indicates the utility and proposes the future development of such methodologies to address many of the contemporary concerns associated with the analysis of a wide array of high-dimensional genomic data types.; 572.8; University of Oxford; http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.559860; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 572.8
description The development of high-throughput technologies has changed the way in which we approach questions in biology by allowing us to assess the relative state of tens of thousands of genes or gene products in a single assay. A great deal of research has focused on developing statistical methods to identify biologically relevant sets of genes whose collective state correlates with a given phenotype under study. However, placing these gene sets into an intellectual framework that allows for hypothesis generation and mechanistic interpretation remains a significant challenge. To address these issues, we first apply and then extend a well-established Gene Ontology singular enrichment analysis method to quantitatively assess overrepresented biological themes within lists of somatically mutated and abnormally expressed genes from publicly available human breast, colorectal, lung, prostate, and renal cancer datasets. We further validate the utility of this novel approach with actual experimental laboratory investigations. Finally, we describe a general strategy for constructing prediction models by integrating prior biological knowledge with gene expression data from three large human breast cancer datasets. We show how this biological network-based model improves performance and interoperability by identifying genes more closely related to breast cancer etiology and patient survival. The work presented throughout this manuscript indicates the utility and proposes the future development of such methodologies to address many of the contemporary concerns associated with the analysis of a wide array of high-dimensional genomic data types.
author2 Holmes, Chris
author Chittenden, Thomas William
title Quantitative integration of biological knowledge for the analysis of high-throughput genomic data
publisher University of Oxford
publishDate 2012
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.559860