Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.

Background: Heart failure (HF) is a major cause of morbidity and mortality. However, much of the clinical data is unstructured in the form of radiology reports, while the process of data collection and curation is arduous and time-consuming. Purpose: We utilized...

Bibliographic Details
Main Authors:	Mohit Pandey, Zhuoran Xu, Evan Sholle, Gabriel Maliakal, Gurpreet Singh, Zahra Fatima, Daria Larine, Benjamin C Lee, Jing Wang, Alexander R van Rosendael, Lohendran Baskaran, Leslee J Shaw, James K Min, Subhi J Al'Aref
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2020-01-01
Series:	PLoS ONE
Online Access:	https://doi.org/10.1371/journal.pone.0236827

id	doaj-c23fb8fb87a24e859f2068195955f9c1
record_format	Article
spelling	doaj-c23fb8fb87a24e859f2068195955f9c1 2021-04-23T04:30:30Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2020-01-01 15 7 e0236827 10.1371/journal.pone.0236827 https://doi.org/10.1371/journal.pone.0236827
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Mohit Pandey Zhuoran Xu Evan Sholle Gabriel Maliakal Gurpreet Singh Zahra Fatima Daria Larine Benjamin C Lee Jing Wang Alexander R van Rosendael Lohendran Baskaran Leslee J Shaw James K Min Subhi J Al'Aref
spellingShingle	Mohit Pandey Zhuoran Xu Evan Sholle Gabriel Maliakal Gurpreet Singh Zahra Fatima Daria Larine Benjamin C Lee Jing Wang Alexander R van Rosendael Lohendran Baskaran Leslee J Shaw James K Min Subhi J Al'Aref Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing. PLoS ONE
author_facet	Mohit Pandey Zhuoran Xu Evan Sholle Gabriel Maliakal Gurpreet Singh Zahra Fatima Daria Larine Benjamin C Lee Jing Wang Alexander R van Rosendael Lohendran Baskaran Leslee J Shaw James K Min Subhi J Al'Aref
author_sort	Mohit Pandey
title	Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
title_short	Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
title_full	Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
title_fullStr	Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
title_full_unstemmed	Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
title_sort	extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing.
publisher	Public Library of Science (PLoS)
series	PLoS ONE
issn	1932-6203
publishDate	2020-01-01
description	Background: Heart failure (HF) is a major cause of morbidity and mortality. However, much of the clinical data is unstructured in the form of radiology reports, while the process of data collection and curation is arduous and time-consuming. Purpose: We utilized a machine learning (ML)-based natural language processing (NLP) approach to extract clinical terms from unstructured radiology reports. Additionally, we investigate the prognostic value of the extracted data in predicting all-cause mortality (ACM) in HF patients. Materials and methods: This observational cohort study utilized 122,025 thoracoabdominal computed tomography (CT) reports from 11,808 HF patients obtained between 2008 and 2018. 1,560 CT reports were manually annotated for the presence or absence of 14 radiographic findings, in addition to age and gender. Thereafter, a Convolutional Neural Network (CNN) was trained, validated and tested to determine the presence or absence of these features. Further, the ability of CNN to predict ACM was evaluated using Cox regression analysis on the extracted features. Results: 11,808 CT reports were analyzed from 11,808 patients (mean age 72.8 ± 14.8 years; 52.7% (6,217/11,808) male) from whom 3,107 died during the 10.6-year follow-up. The CNN demonstrated excellent accuracy for retrieval of the 14 radiographic findings with area-under-the-curve (AUC) ranging between 0.83-1.00 (F1 score 0.84-0.97). Cox model showed the time-dependent AUC for predicting ACM was 0.747 (95% confidence interval [CI] of 0.704-0.790) at 30 days. Conclusion: An ML-based NLP approach to unstructured CT reports demonstrates excellent accuracy for the extraction of predetermined radiographic findings, and provides prognostic value in HF patients.
url	https://doi.org/10.1371/journal.pone.0236827
_version_	1714662251776966656

Similar Items