Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging
Breast cancer is one of the most lethal and widespread diseases affecting women worldwide. As a result, it is necessary to diagnose breast cancer accurately and efficiently utilizing the most cost-effective and widely used methods. In this research, we demonstrated that synthetically created high-qu...
| Published in: | Mathematics |
|---|---|
| Main Authors: | Hari Mohan Rai, Serhii Dashkevych, Joon Yoo |
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-09-01 |
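| Subjects: | ultrasound imaging; breast cancer detection; deep learning; synthetic dataset generation; StyleGAN3; EfficientNet-B7 |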
| Online Access: | https://www.mdpi.com/2227-7390/12/18/2808 |
| _version_ | 1850305787990114304 |
|---|---|
| author | Hari Mohan Rai; Serhii Dashkevych; Joon Yoo |
| author_facet | Hari Mohan Rai; Serhii Dashkevych; Joon Yoo |
| author_sort | Hari Mohan Rai |
| collection | DOAJ |
| container_title | Mathematics |
| description | Breast cancer is one of the most lethal and widespread diseases affecting women worldwide, so it must be diagnosed accurately and efficiently using the most cost-effective and widely used methods. In this research, we demonstrate that high-quality synthetic ultrasound data outperforms conventional augmentation strategies for diagnosing breast cancer with deep learning. We trained a deep-learning model based on the EfficientNet-B7 architecture on a dataset of 3186 ultrasound images acquired from multiple publicly available sources, together with 10,000 synthetic images generated with a generative adversarial network (StyleGAN3). The model was trained using five-fold cross-validation and evaluated on four metrics: accuracy, recall, precision, and F1 score. Integrating the synthetic data into the training set raised classification performance, as measured by the F1 score, from 88.72% to 92.01%, an improvement of more than 3 percentage points over the genuine dataset with common augmentation. This demonstrates the power of generative models to expand and improve the quality of training datasets in medical-imaging applications. Various data augmentation procedures were also investigated to improve the training set’s diversity and representativeness. This research emphasizes the relevance of modern artificial intelligence and machine-learning technologies in medical imaging by providing an effective strategy for categorizing ultrasound images, which may lead to increased diagnostic accuracy and optimal treatment options. The proposed techniques are highly promising and have strong potential for future clinical application in the diagnosis of breast cancer. |
| format | Article |
| id | doaj-art-8f7554d8a8e347fd8f2851c59cd601b2 |
| institution | Directory of Open Access Journals |
| issn | 2227-7390 |
| language | English |
| publishDate | 2024-09-01 |
| publisher | MDPI AG |
| record_format | Article |
| spelling | doaj-art-8f7554d8a8e347fd8f2851c59cd601b2; 2025-08-19T23:29:16Z; eng; MDPI AG; Mathematics; 2227-7390; 2024-09-01; vol. 12, no. 18, art. 2808; 10.3390/math12182808; Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging; Hari Mohan Rai (School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea); Serhii Dashkevych (Department of Computer Engineering, Vistula University, Stokłosy 3, 02-787 Warszawa, Poland); Joon Yoo (School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea); https://www.mdpi.com/2227-7390/12/18/2808; ultrasound imaging; breast cancer detection; deep learning; synthetic dataset generation; StyleGAN3; EfficientNet-B7 |
| spellingShingle | Hari Mohan Rai; Serhii Dashkevych; Joon Yoo; Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging; ultrasound imaging; breast cancer detection; deep learning; synthetic dataset generation; StyleGAN3; EfficientNet-B7 |
| title | Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging |
| title_full | Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging |
| title_fullStr | Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging |
| title_full_unstemmed | Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging |
| title_short | Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging |
| title_sort | next generation diagnostics the impact of synthetic data generation on the detection of breast cancer from ultrasound imaging |
| topic | ultrasound imaging; breast cancer detection; deep learning; synthetic dataset generation; StyleGAN3; EfficientNet-B7 |
| url | https://www.mdpi.com/2227-7390/12/18/2808 |
| work_keys_str_mv | AT harimohanrai nextgenerationdiagnosticstheimpactofsyntheticdatagenerationonthedetectionofbreastcancerfromultrasoundimaging AT serhiidashkevych nextgenerationdiagnosticstheimpactofsyntheticdatagenerationonthedetectionofbreastcancerfromultrasoundimaging AT joonyoo nextgenerationdiagnosticstheimpactofsyntheticdatagenerationonthedetectionofbreastcancerfromultrasoundimaging |
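
The description field names four evaluation metrics (accuracy, recall, precision, and F1 score) and reports a rise from 88.72% to 92.01%. As a minimal sketch of how such figures are typically derived, the Python below computes the four metrics from confusion-matrix counts and expresses the reported change in percentage points. It assumes a binary setup (for example benign vs. malignant), which the abstract does not specify; the function names and the counts are hypothetical illustrations, not the authors' code or data. Only the two percentages come from the record.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Compute accuracy, precision, recall and F1 from confusion counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


def percentage_point_gain(before: float, after: float) -> float:
    """Absolute difference between two percentages, rounded to 2 decimals."""
    return round(after - before, 2)


# Hypothetical confusion counts, chosen only to exercise the formulas.
m = binary_metrics(tp=90, fp=10, fn=10, tn=90)
print(m)

# The two percentages below are the ones reported in the abstract.
gain = percentage_point_gain(88.72, 92.01)
print(f"gain: {gain} percentage points")
```

The gain is reported in percentage points (an absolute difference) rather than as a relative percentage, which keeps the comparison between the two training setups unambiguous.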
