Discovering collectively informative descriptors from high-throughput experiments

Abstract. Background: Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the r...

Bibliographic Details
Main Authors:	Ward William O, Jeffries Clark D, Perkins Diana O, Wright Fred A
Format:	Article
Language:	English
Published:	BMC 2009-12-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/10/431

id	doaj-f628f51c0bd244eca896830bbeefe074
record_format	Article
spelling	doaj-f628f51c0bd244eca896830bbeefe074 2020-11-25T00:33:39Z eng BMC BMC Bioinformatics 1471-2105 2009-12-01 10 1 431 10.1186/1471-2105-10-431 Discovering collectively informative descriptors from high-throughput experiments Ward William O Jeffries Clark D Perkins Diana O Wright Fred A http://www.biomedcentral.com/1471-2105/10/431
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Ward William O Jeffries Clark D Perkins Diana O Wright Fred A
spellingShingle	Ward William O Jeffries Clark D Perkins Diana O Wright Fred A Discovering collectively informative descriptors from high-throughput experiments BMC Bioinformatics
author_facet	Ward William O Jeffries Clark D Perkins Diana O Wright Fred A
author_sort	Ward William O
title	Discovering collectively informative descriptors from high-throughput experiments
title_short	Discovering collectively informative descriptors from high-throughput experiments
title_full	Discovering collectively informative descriptors from high-throughput experiments
title_fullStr	Discovering collectively informative descriptors from high-throughput experiments
title_full_unstemmed	Discovering collectively informative descriptors from high-throughput experiments
title_sort	discovering collectively informative descriptors from high-throughput experiments
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2009-12-01
description	Abstract. Background: Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yield new insights that guide future research. Results: This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From the resulting lists of n descriptors ranked by informativeness, BLANKET determines shortlists of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the number of descriptors appearing in both shortlists, in only the first shortlist, in only the second shortlist, or in neither shortlist. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1, 2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists. Conclusions: Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors.
url	http://www.biomedcentral.com/1471-2105/10/431
work_keys_str_mv	AT wardwilliamo discoveringcollectivelyinformativedescriptorsfromhighthroughputexperiments AT jeffriesclarkd discoveringcollectivelyinformativedescriptorsfromhighthroughputexperiments AT perkinsdianao discoveringcollectivelyinformativedescriptorsfromhighthroughputexperiments AT wrightfreda discoveringcollectivelyinformativedescriptorsfromhighthroughputexperiments
_version_	1725315637256912896
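
Illustrative note: the scoring step summarized in the abstract can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes both experiments rank the same n descriptors, and the function names (rfet, contingency, score_plane) and the max_len parameter are hypothetical. The paper's threshold, which is calibrated against random lists, is not reproduced here.

```python
from math import comb


def contingency(list_a, list_b, p, q):
    """Counts for the 2x2 table built from the top-p and top-q shortlists.

    Assumes list_a and list_b are rankings of the same n descriptors.
    Returns (both, only_a, only_b, neither).
    """
    top_a, top_b = set(list_a[:p]), set(list_b[:q])
    both = len(top_a & top_b)
    only_a = len(top_a) - both
    only_b = len(top_b) - both
    neither = len(list_a) - both - only_a - only_b
    return both, only_a, only_b, neither


def rfet(n, p, q, k):
    """Right-tailed Fisher exact p-value.

    Probability that two random shortlists of lengths p and q, drawn from
    n descriptors, share at least k descriptors (hypergeometric upper tail).
    """
    tail = sum(comb(p, i) * comb(n - p, q - i) for i in range(k, min(p, q) + 1))
    return tail / comb(n, q)


def score_plane(list_a, list_b, max_len):
    """RFET score for every pair of shortlist lengths (p, q) up to max_len."""
    n = len(list_a)
    scores = {}
    for p in range(1, max_len + 1):
        for q in range(1, max_len + 1):
            both, _, _, _ = contingency(list_a, list_b, p, q)
            scores[(p, q)] = rfet(n, p, q, both)
    return scores


if __name__ == "__main__":
    # Toy rankings of ten descriptors (made-up data for illustration only).
    ranked_a = list("abcdefghij")
    ranked_b = list("acbjihgfed")
    plane = score_plane(ranked_a, ranked_b, max_len=5)
    best = min(plane, key=plane.get)
    print("lowest-scoring (p, q):", best, "score:", plane[best])
```

A small score means the two shortlists overlap more than random lists of the same lengths usually would; selecting pairs below a calibrated threshold is the step the abstract attributes to BLANKET.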

Similar Items