Conditional sliding windows: An approach for handling data limitation in colorectal histopathology image classification

Large amounts of data are required for the training process with a convolutional neural network (CNN) because small datasets with low variation will cause over-fitting, and the model cannot predict new data with high accuracy. Additionally, the non-availability of histopathological medical data pres...

Bibliographic Details
Main Authors: Toto Haryanto, Heru Suhartanto, Aniati Murni Arymurthy, Kusmardi Kusmardi
Format: Article
Language: English
Published: Elsevier 2021-01-01
Series: Informatics in Medicine Unlocked
Subjects: Augmentation; Convolutional neural network; Conditional sliding windows; Histopathology
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914821000551
id doaj-0b50f33e3e86464db342922536e01be1
record_format Article
spelling doaj-0b50f33e3e86464db342922536e01be1
2021-04-18T06:28:13Z
eng
Elsevier
Informatics in Medicine Unlocked
2352-9148
2021-01-01
23
100565
Conditional sliding windows: An approach for handling data limitation in colorectal histopathology image classification
Toto Haryanto (0): Faculty of Computer Science, Universitas Indonesia, Kampus Universitas Indonesia, Depok, 16424, Indonesia; Department of Computer Science, IPB University, Kampus IPB Dramaga, Bogor, 16680, Indonesia; Corresponding author. Faculty of Computer Science, Universitas Indonesia, Kampus Universitas Indonesia, Depok, 16424, Indonesia.
Heru Suhartanto (1): Faculty of Computer Science, Universitas Indonesia, Kampus Universitas Indonesia, Depok, 16424, Indonesia
Aniati Murni Arymurthy (2): Faculty of Computer Science, Universitas Indonesia, Kampus Universitas Indonesia, Depok, 16424, Indonesia
Kusmardi Kusmardi (3): Department of Pathology Anatomy, Faculty of Medicine, Universitas Indonesia, Kampus Universitas Indonesia, Jakarta, Indonesia
http://www.sciencedirect.com/science/article/pii/S2352914821000551
Augmentation
Convolutional neural network
Conditional sliding windows
Histopathology
collection DOAJ
language English
format Article
sources DOAJ
author Toto Haryanto
Heru Suhartanto
Aniati Murni Arymurthy
Kusmardi Kusmardi
title Conditional sliding windows: An approach for handling data limitation in colorectal histopathology image classification
publisher Elsevier
series Informatics in Medicine Unlocked
issn 2352-9148
publishDate 2021-01-01
description Training a convolutional neural network (CNN) requires large amounts of data because small datasets with low variation cause over-fitting, and the resulting model cannot predict new data with high accuracy. In addition, histopathological medical data are difficult to obtain because they cannot be accessed without ethical permission. This study therefore proposes a conditional sliding window algorithm to obtain sub-sample data from histopathology images. Two sets of original data were used: one from the Warwick dataset, with dimensions of 775 × 522 pixels, and the other from the Department of Pathology and Anatomy, Faculty of Medicine, Universitas Indonesia. The algorithm was inspired by the conventional sliding window method but adds conditions: the window starts at the left on (x, y) pixel coordinates and moves from left to right, then from top to bottom, until the entire image is covered. New images were produced at two sizes, 200 × 200 and 300 × 300 pixels. To avoid loss of information, overlaps of 25 and 50 pixels were used. In this study, CNN 7-5-7 was designed and proposed to perform the classification. The conditional sliding window algorithm can produce varying numbers of sub-samples depending on the image and window size. The images produced were used to develop a CNN model, which predicted benign and malignant tissues more accurately than the model trained on the original dataset. The sensitivity values for both the Warwick public dataset and the dataset generated in this study are above 0.80, which shows that the proposed CNN architecture is more stable than existing methods such as AlexNet and DenseNet121. This study addressed the limited availability of colorectal histopathological training data by developing a conditional sliding window algorithm, which can also be applied to generate other histopathological data. The proposed CNN 7-5-7 is the fastest architecture to train while remaining comparable to state-of-the-art methods. The dataset was also used to develop a model for colorectal cancer identification, which was integrated into a web-based application for further implementation.
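The left-to-right, top-to-bottom scan with overlap described in the abstract can be illustrated with a minimal sketch. This is a plain overlapping sliding window, not the authors' implementation: the record does not spell out the added conditions, so the function name `window_origins`, the rule that windows crossing the image border are skipped, and the choice to return only top-left coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple


def window_origins(width: int, height: int, window: int, overlap: int) -> List[Tuple[int, int]]:
    """Return top-left (x, y) corners of square windows over an image.

    Hypothetical helper: consecutive windows share `overlap` pixels, so the
    step between them is `window - overlap`. Windows that would cross the
    right or bottom border are skipped (an assumption, not taken from the paper).
    """
    stride = window - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the window size")

    origins = []
    # Outer loop over rows, inner loop over columns:
    # left to right first, then top to bottom.
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            origins.append((x, y))
    return origins


# Warwick image size from the abstract (775 x 522 pixels),
# 200-pixel window with a 25-pixel overlap.
origins = window_origins(775, 522, 200, 25)
print(len(origins))  # 8
print(origins[:4])   # [(0, 0), (175, 0), (350, 0), (525, 0)]
```

Each returned corner identifies a crop spanning `window` pixels in both directions, for example `image[y:y + window, x:x + window]` if the image is held in an array. Under these assumptions a single 775 × 522 image yields 8 sub-samples at 200 × 200 with a 25-pixel overlap, and 2 at 300 × 300 with a 50-pixel overlap, which shows how the number of sub-samples depends on the image and window size.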
topic Augmentation
Convolutional neural network
Conditional sliding windows
Histopathology
url http://www.sciencedirect.com/science/article/pii/S2352914821000551