Deep Learning–Based COVID-19 Pneumonia Classification Using Chest CT Images: Model Generalizability

Since the outbreak of the COVID-19 pandemic, worldwide research efforts have focused on using artificial intelligence (AI) technologies on various medical data of COVID-19–positive patients in order to identify or classify various aspects of the disease, with promising reported results. However, con...

Bibliographic Details
Main Authors: Dan Nguyen, Fernando Kay, Jun Tan, Yulong Yan, Yee Seng Ng, Puneeth Iyengar, Ron Peshock, Steve Jiang
Format: Article
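Language: English
Published: Frontiers Media S.A. 2021-06-01
Series: Frontiers in Artificial Intelligence
Subjects: deep learning, generalizability, convolutional neural network, classification, computed tomography, COVID-19
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2021.694875/full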
id doaj-1cd1a43a39824a368149286bdca6276b
record_format Article
spelling doaj-1cd1a43a39824a368149286bdca6276b 2021-06-29T05:33:27Z eng
Frontiers Media S.A.
Frontiers in Artificial Intelligence
ISSN 2624-8212
2021-06-01, volume 4
DOI 10.3389/frai.2021.694875, article 694875
Deep Learning–Based COVID-19 Pneumonia Classification Using Chest CT Images: Model Generalizability
Dan Nguyen: Medical Artificial Intelligence and Automation (MAIA) Laboratory, University of Texas Southwestern Medical Center, Dallas, TX, United States; Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Fernando Kay: Department of Radiology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Jun Tan: Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Yulong Yan: Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Yee Seng Ng: Department of Radiology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Puneeth Iyengar: Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Ron Peshock: Department of Radiology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Steve Jiang: Medical Artificial Intelligence and Automation (MAIA) Laboratory, University of Texas Southwestern Medical Center, Dallas, TX, United States; Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States
Since the outbreak of the COVID-19 pandemic, worldwide research efforts have focused on using artificial intelligence (AI) technologies on various medical data of COVID-19–positive patients in order to identify or classify various aspects of the disease, with promising reported results. However, concerns have been raised over their generalizability, given the heterogeneous factors in training datasets. This study aims to examine the severity of this problem by evaluating deep learning (DL) classification models trained to identify COVID-19–positive patients on 3D computed tomography (CT) datasets from different countries. We collected one dataset at UT Southwestern (UTSW) and three external datasets from different countries: CC-CCII Dataset (China), COVID-CTset (Iran), and MosMedData (Russia). We divided the data into two classes: COVID-19–positive and COVID-19–negative patients. We trained nine identical DL-based classification models by using combinations of datasets with a 72% train, 8% validation, and 20% test data split. The models trained on a single dataset achieved accuracy/area under the receiver operating characteristic curve (AUC) values of 0.87/0.826 (UTSW), 0.97/0.988 (CC-CCII), and 0.86/0.873 (COVID-CTset) when evaluated on their own dataset. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better. However, the performance dropped close to an AUC of 0.5 (random guess) for all models when evaluated on a different dataset outside of their training datasets. Including MosMedData, which only contained positive labels, in the training datasets did not necessarily help the performance on other datasets. Multiple factors likely contributed to these results, such as patient demographics and differences in image acquisition or reconstruction, causing a data shift among different study cohorts.
https://www.frontiersin.org/articles/10.3389/frai.2021.694875/full
deep learning
generalizability
convolutional neural network
classification
computed tomography
COVID-19
collection DOAJ
language English
format Article
sources DOAJ
author Dan Nguyen
Fernando Kay
Jun Tan
Yulong Yan
Yee Seng Ng
Puneeth Iyengar
Ron Peshock
Steve Jiang
title Deep Learning–Based COVID-19 Pneumonia Classification Using Chest CT Images: Model Generalizability
publisher Frontiers Media S.A.
series Frontiers in Artificial Intelligence
issn 2624-8212
publishDate 2021-06-01
description Since the outbreak of the COVID-19 pandemic, worldwide research efforts have focused on using artificial intelligence (AI) technologies on various medical data of COVID-19–positive patients in order to identify or classify various aspects of the disease, with promising reported results. However, concerns have been raised over their generalizability, given the heterogeneous factors in training datasets. This study aims to examine the severity of this problem by evaluating deep learning (DL) classification models trained to identify COVID-19–positive patients on 3D computed tomography (CT) datasets from different countries. We collected one dataset at UT Southwestern (UTSW) and three external datasets from different countries: CC-CCII Dataset (China), COVID-CTset (Iran), and MosMedData (Russia). We divided the data into two classes: COVID-19–positive and COVID-19–negative patients. We trained nine identical DL-based classification models by using combinations of datasets with a 72% train, 8% validation, and 20% test data split. The models trained on a single dataset achieved accuracy/area under the receiver operating characteristic curve (AUC) values of 0.87/0.826 (UTSW), 0.97/0.988 (CC-CCII), and 0.86/0.873 (COVID-CTset) when evaluated on their own dataset. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better. However, the performance dropped close to an AUC of 0.5 (random guess) for all models when evaluated on a different dataset outside of their training datasets. Including MosMedData, which only contained positive labels, in the training datasets did not necessarily help the performance on other datasets. Multiple factors likely contributed to these results, such as patient demographics and differences in image acquisition or reconstruction, causing a data shift among different study cohorts.
topic deep learning
generalizability
convolutional neural network
classification
computed tomography
COVID-19
url https://www.frontiersin.org/articles/10.3389/frai.2021.694875/full
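
The abstract mentions a 72% train, 8% validation, and 20% test split and reports AUC, with 0.5 corresponding to a random guess. The following is a minimal Python sketch of both ideas, not the authors' code: the function names `stratified_split` and `auc`, the per-class stratification, the fixed seed, and the toy cohort are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train=0.72, val=0.08, seed=0):
    """Split sample indices into train/val/test, stratified by label.

    Whatever remains after train and val becomes the test set
    (20% with the default fractions). Hypothetical helper.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    parts = {"train": [], "val": [], "test": []}
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(train * len(indices))
        n_val = round(val * len(indices))
        parts["train"] += indices[:n_train]
        parts["val"] += indices[n_train:n_train + n_val]
        parts["test"] += indices[n_train + n_val:]
    return parts


def auc(labels, scores):
    """Rank-based AUC: the probability that a positive sample scores
    higher than a negative one, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort: 100 negative and 100 positive scans.
labels = [0] * 100 + [1] * 100
parts = stratified_split(labels)

# A model that separates the classes perfectly, and one that outputs
# the same score for every scan (no discriminative information).
perfect_auc = auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
uninformative_auc = auc([0, 0, 1, 1], [0.5, 0.5, 0.5, 0.5])
```

In this sketch the uninformative model lands at an AUC of 0.5, the random-guess level the abstract refers to. The `ValueError` branch reflects a general property of AUC: it cannot be computed on a test set that contains only one class, such as a positive-only cohort.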