Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care

OBJECTIVES: To evaluate whether different approaches to note text preparation (known as preprocessing) affect machine learning model performance for mortality prediction in the ICU. DESIGN: Clinical note text was used to build machine learning models for adults admitted to the ICU. Preproc...


Bibliographic Details
Main Authors: Malini Mahendra, MD, Yanting Luo, MS, Hunter Mills, MS, Gundolf Schenk, PhD, Atul J. Butte, MD, PhD, R. Adams Dudley, MD, MBA
Format: Article
Language: English
Published: Wolters Kluwer 2021-06-01
Series: Critical Care Explorations
Online Access: http://journals.lww.com/10.1097/CCE.0000000000000450
id doaj-b2bbf1d881894cc78cae91ddc4982399
record_format Article
spelling doaj-b2bbf1d881894cc78cae91ddc4982399
2021-06-28T03:11:59Z
eng
Wolters Kluwer
Critical Care Explorations
2639-8028
2021-06-01, volume 3, issue 6, e0450
10.1097/CCE.0000000000000450
202106000-00007
Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care
Malini Mahendra, MD (1)
Yanting Luo, MS (2)
Hunter Mills, MS (3)
Gundolf Schenk, PhD (3)
Atul J. Butte, MD, PhD (3)
R. Adams Dudley, MD, MBA (4)
1 Department of Pediatrics, Division of Pediatric Critical Care, UCSF Benioff Children’s Hospital, University of California San Francisco, San Francisco, CA.
2 Philip R. Lee Institute for Health Policy Studies, University of California San Francisco, San Francisco, CA.
3 Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA.
4 School of Medicine, School of Public Health, and Institute for Health Informatics, University of Minnesota, Minneapolis, MN.
http://journals.lww.com/10.1097/CCE.0000000000000450
collection DOAJ
language English
format Article
sources DOAJ
author Malini Mahendra, MD
Yanting Luo, MS
Hunter Mills, MS
Gundolf Schenk, PhD
Atul J. Butte, MD, PhD
R. Adams Dudley, MD, MBA
title Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care
publisher Wolters Kluwer
series Critical Care Explorations
issn 2639-8028
publishDate 2021-06-01
description OBJECTIVES: To evaluate whether different approaches to note text preparation (known as preprocessing) affect machine learning model performance for mortality prediction in the ICU.
DESIGN: Clinical note text was used to build machine learning models for adults admitted to the ICU. Preprocessing strategies studied were none (raw text), cleaning text, stemming, term frequency-inverse document frequency vectorization, and creation of n-grams. Model performance was assessed by the area under the receiver operating characteristic curve (AUROC). Models were trained and internally validated on University of California San Francisco data using 10-fold cross-validation. These models were then externally validated on Beth Israel Deaconess Medical Center data.
SETTING: ICUs at University of California San Francisco and Beth Israel Deaconess Medical Center.
SUBJECTS: Ten thousand patients in the University of California San Francisco training and internal testing dataset and 27,058 patients in the Beth Israel Deaconess Medical Center external validation dataset.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: Mortality rate at Beth Israel Deaconess Medical Center and University of California San Francisco was 10.9% and 7.4%, respectively. Data are presented as AUROC (95% CI) for models validated at University of California San Francisco and as AUROC for models validated at Beth Israel Deaconess Medical Center. Models built and trained on University of California San Francisco data for the prediction of in-hospital mortality improved from the raw note text model (AUROC, 0.84; 95% CI, 0.80–0.89) to the term frequency-inverse document frequency model (AUROC, 0.89; 95% CI, 0.85–0.94). When the models developed at University of California San Francisco were applied to Beth Israel Deaconess Medical Center data, model performance increased similarly, from raw note text (AUROC, 0.72) to the term frequency-inverse document frequency model (AUROC, 0.83).
CONCLUSIONS: Differences in preprocessing strategies for note text affected model discrimination. A preprocessing pathway that combined cleaning, stemming, and term frequency-inverse document frequency vectorization produced the greatest improvement in model performance. Further study is needed, with particular emphasis on how to manage author implicit bias present in note text, before natural language processing algorithms are implemented in the clinical setting.
url http://journals.lww.com/10.1097/CCE.0000000000000450
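Illustrative note: the preprocessing steps named in the description field (cleaning, stemming, term frequency-inverse document frequency weighting, and n-grams) can be sketched roughly as follows. This is a minimal sketch and not the authors' code. The example notes are invented, `crude_stem` is a toy suffix stripper standing in for a real stemmer, and the weighting uses the plain tf * ln(N / df) form, which may differ from the variant used in the study.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase and replace every run of non-letters with a single space."""
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()


def crude_stem(token):
    """Toy suffix stripper (assumption: a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens, n):
    """Return contiguous n-grams, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs):
    """Map each tokenized document to {term: tf * idf}, idf = ln(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented example notes, not data from the study.
notes = [
    "Patient intubated; sedated and ventilated.",
    "Patient extubated, breathing comfortably.",
]
tokens = [[crude_stem(t) for t in clean(note).split()] for note in notes]
vectors = tfidf(tokens)
bigrams = ngrams(tokens[0], 2)
```

In this toy corpus a term that occurs in every note (here "patient") receives a weight of zero, which is how the weighting down-ranks words that do not distinguish one note from another.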