Relative impact of key sources of systematic noise in Affymetrix and Illumina gene-expression microarray experiments

Abstract. Background: Systematic processing noise, which includes batch effects, is very common in microarray experiments but is often ignored despite its potential to confound or compromise experimental results. Compromised results are most likely when r...

Bibliographic Details
Main Authors: Kitchen Robert R, Sabine Vicky S, Simen Arthur A, Dixon J Michael, Bartlett John MS, Sims Andrew H
Format: Article
Language: English
Published: BMC 2011-12-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/12/589
id doaj-c5de05c6895c4ffea6d5c4b9fe42cf7f
record_format Article
spelling doaj-c5de05c6895c4ffea6d5c4b9fe42cf7f 2020-11-24T22:21:03Z eng BMC BMC Genomics 1471-2164 2011-12-01 12 1 589 10.1186/1471-2164-12-589 Relative impact of key sources of systematic noise in Affymetrix and Illumina gene-expression microarray experiments Kitchen Robert R, Sabine Vicky S, Simen Arthur A, Dixon J Michael, Bartlett John MS, Sims Andrew H http://www.biomedcentral.com/1471-2164/12/589
collection DOAJ
language English
format Article
sources DOAJ
author Kitchen Robert R
Sabine Vicky S
Simen Arthur A
Dixon J Michael
Bartlett John MS
Sims Andrew H
title Relative impact of key sources of systematic noise in Affymetrix and Illumina gene-expression microarray experiments
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2011-12-01
description Abstract
Background: Systematic processing noise, which includes batch effects, is very common in microarray experiments but is often ignored despite its potential to confound or compromise experimental results. Compromised results are most likely when re-analysing or integrating datasets from public repositories, because each dataset is generated under different conditions. To better understand the relative noise contributions of various experimental-design factors, we assessed several Illumina and Affymetrix datasets for technical variation between replicate hybridisations of Universal Human Reference RNA (UHRR) and individual or pooled breast-tumour RNA.
Results: A varying degree of systematic noise was observed in each of the datasets; however, in all cases the relative amount of variation between standard control RNA replicates was greatest at earlier points in the sample-preparation workflow. For example, 40.6% of the total variation in reported expression was attributed to replicate extractions, compared to 13.9% due to amplification/labelling and 10.8% between replicate hybridisations. Deliberate probe-wise batch-correction methods were effective in reducing the magnitude of this variation, although the level of improvement depended on the sources of noise included in the model. Systematic noise introduced at the chip, run, and experiment levels of a combined Illumina dataset was found to be highly dependent upon the experimental design. Both UHRR and pools of RNA derived from the samples of interest modelled technical variation well, although the pools were significantly better correlated (4% average improvement) and better emulated the effects of systematic noise, over all probes, than the UHRRs. The effect of this noise was not uniform over all probes: low GC-content probes were more vulnerable to batch variation than probes with a higher GC content.
Conclusions: The magnitude of systematic processing noise in a microarray experiment varies across probes and experiments; however, procedures earlier in the sample-preparation workflow are generally liable to introduce the most noise. Careful experimental design is important to protect against noise, detailed meta-data should always be provided, and diagnostic procedures for detecting bias should be routinely performed prior to downstream analyses in microarray studies.
url http://www.biomedcentral.com/1471-2164/12/589