Combining comparative genomic analysis with machine learning reveals some promising diagnostic markers to identify five common pathogenic non‐tuberculous mycobacteria

Summary Non‐tuberculous mycobacteria (NTM) can cause various respiratory diseases, and even death in severe cases, and their incidence has increased rapidly worldwide. To date, it remains difficult to use routine diagnostic methods and strain identification to precisely diagnose the various types of NTM infectio...

Bibliographic Details
Main Authors: Xinmiao Jia, Linfang Yang, Cuidan Li, Yingchun Xu, Qiwen Yang, Fei Chen
Format: Article
Language: English
Published: Wiley 2021-07-01
Series: Microbial Biotechnology
Online Access: https://doi.org/10.1111/1751-7915.13815
id doaj-bcf3b6ba06dc40cc8737622ec983aa4d
record_format Article
spelling doaj-bcf3b6ba06dc40cc8737622ec983aa4d
2021-07-26T21:47:23Z
eng
Wiley
Microbial Biotechnology
1751-7915
2021-07-01
14(4): 1539-1549
10.1111/1751-7915.13815
Combining comparative genomic analysis with machine learning reveals some promising diagnostic markers to identify five common pathogenic non‐tuberculous mycobacteria
Xinmiao Jia: Medical Research Center, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Beijing 100730, China
Linfang Yang: Departments of Dermatology, Affiliated Xingtai People’s Hospital of Hebei Medical University, Xingtai, Hebei 054001, China
Cuidan Li: CAS Key Laboratory of Genome Sciences & Information, China National Center for Bioinformation, Chinese Academy of Sciences, Beijing Institute of Genomics, Beijing 100101, China
Yingchun Xu: Department of Clinical Laboratory, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing 100730, China
Qiwen Yang: Department of Clinical Laboratory, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing 100730, China
Fei Chen: CAS Key Laboratory of Genome Sciences & Information, China National Center for Bioinformation, Chinese Academy of Sciences, Beijing Institute of Genomics, Beijing 100101, China
https://doi.org/10.1111/1751-7915.13815
collection DOAJ
language English
format Article
sources DOAJ
author Xinmiao Jia
Linfang Yang
Cuidan Li
Yingchun Xu
Qiwen Yang
Fei Chen
title Combining comparative genomic analysis with machine learning reveals some promising diagnostic markers to identify five common pathogenic non‐tuberculous mycobacteria
publisher Wiley
series Microbial Biotechnology
issn 1751-7915
publishDate 2021-07-01
description Summary Non‐tuberculous mycobacteria (NTM) can cause various respiratory diseases, and even death in severe cases, and their incidence has increased rapidly worldwide. To date, it remains difficult to precisely diagnose the various types of NTM infection using routine diagnostic methods and strain identification. We combined systematic comparative genomics with machine learning to select new diagnostic markers for precisely identifying five common pathogenic NTMs (Mycobacterium kansasii, Mycobacterium avium, Mycobacterium intracellulare, Mycobacterium chelonae, Mycobacterium abscessus). A panel of six genes and two SNPs (nikA, benM, codA, pfkA2, mpr, yjcH, rrl C2638T, rrl A1173G) was selected to identify the five NTMs simultaneously with high accuracy (> 90%). Notably, a panel containing only the six genes also classified the five NTMs well (accuracy > 90%). Additionally, both panels could precisely differentiate the five NTMs from M. tuberculosis (accuracy > 99%). We also identified new marker genes, SNPs and combinations that accurately discriminate each of the five NTMs individually, making it possible to precisely diagnose infection with a specific NTM. Our research not only reveals novel, promising diagnostic markers to advance precision diagnosis of NTM infections, but also offers insight into precisely identifying genetically close pathogens through comparative genomics and machine learning.
url https://doi.org/10.1111/1751-7915.13815