Machine Learning and Rank Aggregation Methods for Gene Prioritization from Heterogeneous Data Sources

Gene prioritization involves ranking genes by possible relevance to a disease of interest. This is important in order to narrow down the set of genes to be investigated biologically, and over the years, several computational approaches have been proposed for automatically prioritizing genes using s...

Bibliographic Details
Main Author:	Laha, Anirban
Other Authors:	Agarwal, Shivani
Language:	en_US
Published:	2017
Subjects:	Gene Prioritization; Gene Ranking; Bipartite Ranking; Learning To Rank; Rank Aggregation Methods; Bipartite Instance Ranking; Rank Aggregation; Ranking of Genes; Gene Data Sources; Genes Bipartite Ranking; Bipartite Graph Ranking; Bioinformatics
Online Access:	http://hdl.handle.net/2005/2866 http://etd.ncsi.iisc.ernet.in/abstracts/3725/G26678-Abs.pdf

id	ndltd-IISc-oai-etd.ncsi.iisc.ernet.in-2005-2866
record_format	oai_dc
spelling	ndltd-IISc-oai-etd.ncsi.iisc.ernet.in-2005-2866; 2017-12-06T03:57:55Z; Agarwal, Shivani; 2017-12-05T16:42:18Z; 2017-12-05T16:42:18Z; 2017-12-05; 2013; Thesis; http://hdl.handle.net/2005/2866; http://etd.ncsi.iisc.ernet.in/abstracts/3725/G26678-Abs.pdf; en_US; G26678
collection	NDLTD
language	en_US
sources	NDLTD
topic	Gene Prioritization; Gene Ranking; Bipartite Ranking; Learning To Rank; Rank Aggregation Methods; Bipartite Instance Ranking; Rank Aggregation; Ranking of Genes; Gene Data Sources; Genes Bipartite Ranking; Bipartite Graph Ranking; Bioinformatics
description	Gene prioritization involves ranking genes by possible relevance to a disease of interest. This is important in order to narrow down the set of genes to be investigated biologically, and over the years, several computational approaches have been proposed for automatically prioritizing genes using some form of gene-related data, mostly using statistical or machine learning methods. Recently, Agarwal and Sengupta (2009) proposed the use of learning-to-rank methods, which have been used extensively in information retrieval and related fields, to learn a ranking of genes from a given data source, and used this approach to successfully identify novel genes related to leukemia and colon cancer using only gene expression data. In this work, we explore the possibility of combining such learning-to-rank methods with rank aggregation techniques to learn a ranking of genes from multiple heterogeneous data sources, such as gene expression data, gene ontology data, protein-protein interaction data, etc. Rank aggregation methods have their origins in voting theory, and have been used successfully in meta-search applications to aggregate webpage rankings from different search engines. Here we use graph-based learning-to-rank methods to learn a ranking of genes from each individual data source represented as a graph, and then apply rank aggregation methods to aggregate these rankings into a single ranking over the genes. The thesis describes our approach, reports experiments with various data sets, and presents our findings and initial conclusions.
author2	Agarwal, Shivani
author_facet	Agarwal, Shivani Laha, Anirban
author	Laha, Anirban
author_sort	Laha, Anirban
title	Machine Learning and Rank Aggregation Methods for Gene Prioritization from Heterogeneous Data Sources
publishDate	2017
url	http://hdl.handle.net/2005/2866 http://etd.ncsi.iisc.ernet.in/abstracts/3725/G26678-Abs.pdf
work_keys_str_mv	AT lahaanirban machinelearningandrankaggregationmethodsforgeneprioritizationfromheterogeneousdatasources
_version_	1718563290026606592
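
The description above says that per-source gene rankings are combined into a single ranking by rank aggregation methods with origins in voting theory. The record does not say which aggregation rules the thesis uses, so the following is only a minimal sketch of one classical voting rule, the Borda count, chosen here as an assumption for illustration. The function name `borda_aggregate` and the gene labels and per-source rankings are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine several rankings (best item first) with the Borda count.

    Assumption: each ranking is a list of gene identifiers without repeats.
    An item at position p in a ranking of length n earns n - 1 - p points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, gene in enumerate(ranking):
            scores[gene] += n - 1 - position
    # Highest total score first; ties are broken alphabetically
    # so that the output is deterministic.
    return sorted(scores, key=lambda g: (-scores[g], g))


# Hypothetical rankings, one per data source (not taken from the thesis).
expression_rank = ["G1", "G2", "G3", "G4"]
ontology_rank = ["G2", "G1", "G4", "G3"]
ppi_rank = ["G2", "G3", "G1", "G4"]

aggregated = borda_aggregate([expression_rank, ontology_rank, ppi_rank])
print(aggregated)
```

In this sketch a gene that is absent from one source's ranking simply earns no points from that source; other conventions for partial rankings are possible.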
