QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data

Abstract Background Mixed infections of Mycobacterium tuberculosis and antibiotic heteroresistance continue to complicate tuberculosis (TB) diagnosis and treatment. Detection of mixed infections has been limited to molecular genotyping techniques, which lack the sensitivity and resolution to accurat...

Bibliographic Details
Main Authors: Christine Anyansi, Arlin Keo, Bruce J. Walker, Timothy J. Straub, Abigail L. Manson, Ashlee M. Earl, Thomas Abeel
Format: Article
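Language: English
Published: BMC 2020-01-01
Series: BMC Genomics
Subjects: Tuberculosis; Mycobacterium tuberculosis; Mixed infection; Metagenomics; Strain level classification; Strain identification
Online Access: https://doi.org/10.1186/s12864-020-6486-3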
id doaj-9d60d92478e145f19b9181138a009445
record_format Article
spelling doaj-9d60d92478e145f19b9181138a009445 2021-01-31T16:11:53Z eng BMC BMC Genomics 1471-2164 2020-01-01 21 1 1 16 10.1186/s12864-020-6486-3 QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data
Christine Anyansi (Delft Bioinformatics Lab, Delft University of Technology)
Arlin Keo (Delft Bioinformatics Lab, Delft University of Technology)
Bruce J. Walker (Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard)
Timothy J. Straub (Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard)
Abigail L. Manson (Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard)
Ashlee M. Earl (Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard)
Thomas Abeel (Delft Bioinformatics Lab, Delft University of Technology)
https://doi.org/10.1186/s12864-020-6486-3
collection DOAJ
language English
format Article
sources DOAJ
author Christine Anyansi
Arlin Keo
Bruce J. Walker
Timothy J. Straub
Abigail L. Manson
Ashlee M. Earl
Thomas Abeel
author_sort Christine Anyansi
title QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2020-01-01
description Abstract
Background: Mixed infections of Mycobacterium tuberculosis and antibiotic heteroresistance continue to complicate tuberculosis (TB) diagnosis and treatment. Detection of mixed infections has been limited to molecular genotyping techniques, which lack the sensitivity and resolution to accurately estimate the multiplicity of TB infections. In contrast, whole genome sequencing offers sensitive views of the genetic differences between strains of M. tuberculosis within a sample. Although metagenomic tools exist to classify strains in a metagenomic sample, most have been developed for more divergent species and therefore cannot provide the sensitivity required to disentangle strains within closely related bacterial species such as M. tuberculosis. Here we present QuantTB, a method to identify and quantify individual M. tuberculosis strains in whole genome sequencing data. QuantTB uses SNP markers to determine the combination of strains that best explains the allelic variation observed in a sample. QuantTB outputs a list of identified strains, their corresponding relative abundances, and a list of drugs for which resistance-conferring mutations (or heteroresistance) have been predicted within the sample.
Results: We show that QuantTB has a high degree of resolution: it can differentiate communities differing by fewer than 25 SNPs and identify strains down to 1× coverage. Using simulated data, we found that QuantTB outperformed other metagenomic strain identification tools at detecting strains and quantifying strain multiplicity. In a real-world scenario, using a dataset of 50 paired clinical isolates from a study of patients with either reinfections or relapses, we found that QuantTB could detect mixed infections and reinfections at rates concordant with a manually curated approach.
Conclusion: QuantTB can determine infection multiplicity, identify heteroresistance patterns, enable differentiation between relapse and reinfection, and clarify transmission events across seemingly unrelated patients, even in low-coverage (1×) samples. QuantTB outperforms existing tools and promises to serve as a valuable resource for both clinicians and researchers working with clinical TB samples.
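The abstract states that QuantTB uses SNP markers to find the combination of strains that best explains the allelic variation in a sample, and that it reports relative abundances. The following is only an illustrative sketch of that general idea, not QuantTB's actual implementation: it assumes a simple greedy selection, and the function name `classify_strains`, the parameters `min_support` and `min_depth`, and the toy data are all hypothetical.

```python
from statistics import median


def classify_strains(sample_counts, strain_markers, min_support=0.8, min_depth=1):
    """Greedy sketch (an assumption, not the published algorithm).

    sample_counts: {(position, allele): read depth observed in the sample}
    strain_markers: {strain name: set of (position, allele) SNP markers}
    Returns {strain name: relative abundance} for the selected strains.
    """
    explained = set()  # markers already attributed to a selected strain
    depths = {}        # selected strain -> median depth of its supported markers
    remaining = dict(strain_markers)

    while remaining:
        best_name, best_score, best_hits = None, 0.0, []
        for name, markers in remaining.items():
            # Score each strain only on markers not yet explained.
            novel = [m for m in markers if m not in explained]
            if not novel:
                continue
            hits = [m for m in novel if sample_counts.get(m, 0) >= min_depth]
            score = len(hits) / len(novel)
            if score > best_score:
                best_name, best_score, best_hits = name, score, hits
        # Stop when no remaining strain is sufficiently supported by the reads.
        if best_name is None or best_score < min_support:
            break
        depths[best_name] = median(sample_counts[m] for m in best_hits)
        explained.update(remaining.pop(best_name))

    total = sum(depths.values())
    if total == 0:
        return {}
    return {name: depth / total for name, depth in depths.items()}


# Toy data (hypothetical): two strains present at roughly 3:1, one absent.
strain_markers = {
    "A": {(100, "A"), (200, "T"), (300, "G")},
    "B": {(150, "C"), (250, "G"), (350, "T")},
    "C": {(100, "A"), (400, "C"), (500, "T")},  # shares one marker with A
}
sample_counts = {
    (100, "A"): 30, (200, "T"): 30, (300, "G"): 30,
    (150, "C"): 10, (250, "G"): 10, (350, "T"): 10,
}
result = classify_strains(sample_counts, strain_markers)
print(result)  # {'A': 0.75, 'B': 0.25}
```

In this toy run, strain C is rejected because its only supported marker is shared with strain A, which shows why markers already explained by a selected strain are excluded when scoring the rest.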
topic Tuberculosis
Mycobacterium tuberculosis
Mixed infection
Metagenomics
Strain level classification
Strain identification
url https://doi.org/10.1186/s12864-020-6486-3