Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies.

A diversity of tools is available for identification of variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, a tendency exists to rely on the results from a single tool alone. The quality of the output variant call...


Bibliographic Details
Main Authors: Matthew A Field, Vicky Cho, T Daniel Andrews, Chris C Goodnow
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0143199
id doaj-d9a9160d8d2d434d9de1370582bae9c0
record_format Article
spelling doaj-d9a9160d8d2d434d9de1370582bae9c0; 2021-03-03T19:57:32Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2015-01-01; 10(11): e0143199; 10.1371/journal.pone.0143199; https://doi.org/10.1371/journal.pone.0143199
collection DOAJ
language English
format Article
sources DOAJ
author Matthew A Field
Vicky Cho
T Daniel Andrews
Chris C Goodnow
author_sort Matthew A Field
title Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description A diversity of tools is available for identifying variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, a tendency exists to rely on the results from a single tool alone. The quality of the output variant calls is highly variable, however, depending on factors such as sequence library quality as well as the choice of short-read aligner, variant caller, and variant caller filtering strategy. Here we present a two-part study. First, we use the high-quality 'genome in a bottle' reference set to demonstrate the significant impact that the choice of aligner, variant caller, and variant caller filtering strategy has on overall variant call quality, and further how certain variant callers outperform others with increased sample contamination, an important consideration when analyzing sequenced cancer samples. This analysis confirms previous work showing that combining the variant calls of multiple tools results in the best-quality variant set, for either specificity or sensitivity, depending on whether the intersection or union of all variant calls is used, respectively. Second, we analyze a melanoma cell line derived from a control lymphocyte sample to determine whether software choices affect the detection of clinically important melanoma risk-factor variants, finding that only one of the three such variants is unanimously detected under all conditions. Finally, we describe a cogent strategy for implementing a clinical variant detection pipeline; a strategy that requires careful software selection, optimized variant caller filtering, and combined variant calls in order to effectively minimize false negative variants. While implementing such features represents an increase in complexity and computation, the results offer indisputable improvements in data quality.
url https://doi.org/10.1371/journal.pone.0143199
_version_ 1714824867264593920
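
The abstract's point that the intersection of call sets favors specificity while the union favors sensitivity can be illustrated with a small sketch. This is not the authors' pipeline: the three call sets, the truth set, and the function names below are hypothetical toy values chosen for illustration, and precision (the fraction of reported calls that are true) is used here as a simple stand-in for the specificity side of the trade-off.

```python
from typing import Iterable, List, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def combine(call_sets: Iterable[Set[Variant]], mode: str) -> Set[Variant]:
    """Combine per-caller variant sets by 'union' or 'intersection'."""
    sets: List[Set[Variant]] = [set(s) for s in call_sets]
    if not sets:
        return set()
    if mode == "union":
        return set.union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    raise ValueError(f"unknown mode: {mode!r}")


def sensitivity(calls: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of true variants that were called."""
    return len(calls & truth) / len(truth) if truth else 0.0


def precision(calls: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of called variants that are true."""
    return len(calls & truth) / len(calls) if calls else 0.0


# Hypothetical toy data: four true variants and two false positives.
A = ("chr1", 1000, "A", "G")
B = ("chr1", 2000, "C", "T")
C = ("chr2", 1500, "G", "A")
D = ("chr3", 3000, "T", "C")
X = ("chr4", 4000, "A", "T")  # false positive from the first caller
Y = ("chr5", 5000, "G", "C")  # false positive from the second caller

truth = {A, B, C, D}
caller_1 = {A, B, X}
caller_2 = {A, B, C, Y}
caller_3 = {A, C, D}

union_calls = combine([caller_1, caller_2, caller_3], "union")
intersection_calls = combine([caller_1, caller_2, caller_3], "intersection")

for name, calls in (("union", union_calls), ("intersection", intersection_calls)):
    print(name, sensitivity(calls, truth), precision(calls, truth))
```

In this toy case the union recovers every true variant but also keeps both false positives, while the intersection keeps only calls made by all three callers, so it reports no false positives but misses most true variants. Real call sets would normally be read from VCF files and normalized before comparison, which this sketch does not attempt.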