Generalizability of deep learning models for dental image analysis
Abstract We assessed the generalizability of deep learning models and how to improve it. Our exemplary use case was the detection of apical lesions on panoramic radiographs. We employed two datasets of panoramic radiographs from two centers, one in Germany (Charité, Berlin, n = 650) and one in India (KGMU, Lucknow, n = 650). First, U-Net-type models were trained on images from Charité (n = 500) and assessed on test sets from Charité and KGMU (each n = 150). Second, the relevance of image characteristics was explored using pixel-value transformations that aligned the image characteristics of the two datasets. Third, the effect of cross-center training on generalizability was evaluated by stepwise replacing Charité with KGMU images in the training set. Last, we assessed the impact of the dental status (presence of root-canal fillings or restorations). Models trained only on Charité images showed a (mean ± SD) F1-score of 54.1 ± 0.8% on Charité and 32.7 ± 0.8% on KGMU data (p < 0.001, t-test). Aligning the image characteristics between the centers did not improve generalizability. However, gradually increasing the fraction of KGMU images in the training set (from 0 to 100%) improved the F1-score on KGMU images (46.1 ± 0.9%) at a moderate decrease on Charité images (50.9 ± 0.9%, p < 0.01). Model performance was good on KGMU images showing root-canal fillings and/or restorations, but much lower on KGMU images without them. Our deep learning models were not generalizable across centers; cross-center training improved generalizability. Notably, the dental status, but not the image characteristics, was relevant. Understanding the reasons behind limited generalizability helps to mitigate such problems.
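The abstract does not specify which pixel-value transformations were used to align the image characteristics of the two datasets. One common instance of such a transformation is histogram matching, which maps the intensity distribution of one center's radiographs onto the other's. A minimal NumPy sketch under that assumption (the function name and toy data below are illustrative, not from the study):

```python
import numpy as np

def match_histograms(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Map the pixel-value distribution of `source` onto that of
    `reference` via empirical CDF matching."""
    # Unique pixel values, their positions, and their frequencies
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical cumulative distribution functions of both images
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source quantile, look up the corresponding reference value
    matched = np.interp(src_cdf, ref_cdf, ref_values)
    return matched[src_idx].reshape(source.shape)

rng = np.random.default_rng(0)
src = rng.integers(0, 128, size=(64, 64))   # darker "center A" image
ref = rng.integers(64, 256, size=(64, 64))  # brighter "center B" image
aligned = match_histograms(src, ref)
```

After matching, `aligned` follows the reference intensity distribution while keeping the spatial content of the source image; the study's negative finding for this step suggests that such distribution-level alignment alone did not close the cross-center gap.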
Main Authors: | Joachim Krois, Anselmo Garcia Cantu, Akhilanand Chaurasia, Ranjitkumar Patil, Prabhat Kumar Chaudhari, Robert Gaudin, Sascha Gehrung, Falk Schwendicke |
---|---|
Affiliations: | Charité - Universitätsmedizin Berlin; King George's Medical University; AIIMS |
Format: | Article |
Language: | English |
Published: | Nature Publishing Group, 2021-03-01 |
Series: | Scientific Reports |
ISSN: | 2045-2322 |
Online Access: | https://doi.org/10.1038/s41598-021-85454-5 |