Biomarker Prioritisation and Power Estimation Using Ensemble Gene Regulatory Network Inference

Inferring the topology of a gene regulatory network (GRN) from gene expression data is a challenging but important undertaking for gaining a better understanding of gene regulation. Key challenges include working with noisy data and dealing with a higher number of genes than samples. Although a numb...

Bibliographic Details
Main Authors:	Furqan Aziz, Animesh Acharjee, John A. Williams, Dominic Russ, Laura Bravo-Merodio, Georgios V. Gkoutos
Format:	Article
Language:	English
Published:	MDPI AG 2020-10-01
Series:	International Journal of Molecular Sciences
Subjects:	gene regulatory network; causal modelling; omics integration; experimental design
Online Access:	https://www.mdpi.com/1422-0067/21/21/7886

id	doaj-564ea5358573445cbedda3422acfc679
record_format	Article
spelling	doaj-564ea5358573445cbedda3422acfc679; 2020-11-25T03:37:46Z; eng; MDPI AG; International Journal of Molecular Sciences; 1661-6596; 1422-0067; 2020-10-01; 21; 7886; 7886; 10.3390/ijms21217886; Biomarker Prioritisation and Power Estimation Using Ensemble Gene Regulatory Network Inference; Furqan Aziz (0); Animesh Acharjee (1); John A. Williams (2); Dominic Russ (3); Laura Bravo-Merodio (4); Georgios V. Gkoutos (5); Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham B15 2TT, UK (affiliation listed for each of authors 0 to 5); https://www.mdpi.com/1422-0067/21/21/7886; gene regulatory network; causal modelling; omics integration; experimental design
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Furqan Aziz Animesh Acharjee John A. Williams Dominic Russ Laura Bravo-Merodio Georgios V. Gkoutos
title	Biomarker Prioritisation and Power Estimation Using Ensemble Gene Regulatory Network Inference
publisher	MDPI AG
series	International Journal of Molecular Sciences
issn	1661-6596 1422-0067
publishDate	2020-10-01
description	Inferring the topology of a gene regulatory network (GRN) from gene expression data is a challenging but important step towards a better understanding of gene regulation. Key challenges include noisy data and datasets with more genes than samples. Although a number of methods have been proposed to infer the structure of a GRN, the inference algorithms they adopt differ widely, making meaningful comparison difficult. In this study, we used two methods, MIDER (Mutual Information Distance and Entropy Reduction) and PLSNET (Partial least squares based feature selection), to infer the structure of a GRN directly from data, and computationally validated our results. Both methods were applied to gene expression datasets from inflammatory bowel disease (IBD), pancreatic ductal adenocarcinoma (PDAC), and acute myeloid leukaemia (AML) studies. In each case, gene regulators were successfully identified. For example, in the IBD dataset the UGT1A family genes were identified as key regulators, while in the PDAC dataset the SULF1 and THBS2 genes were highlighted. We further demonstrate that an ensemble-based approach that combines the output of the MIDER and PLSNET algorithms can infer the structure of a GRN from data with higher accuracy. We also estimated the number of samples required for potential future validation studies. The proposed analysis framework thus provides both a prediction of candidate regulator genes for potential validation experiments and an estimate of the number of samples those experiments would require.
topic	gene regulatory network; causal modelling; omics integration; experimental design
url	https://www.mdpi.com/1422-0067/21/21/7886

Similar Items