KLOSURE: Closing in on open–ended patient questionnaires with text mining

Abstract. Background: Knee injury and Osteoarthritis Outcome Score (KOOS) is an instrument used to quantify patients’ perceptions about their knee condition and associated problems. It is administered as a 42-item closed-ended questionnaire in which patients are asked to self-assess five outcomes: pai...

Bibliographic Details
Main Authors: Irena Spasić, David Owen, Andrew Smith, Kate Button
Format: Article
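Language: English
Published: BMC 2019-11-01
Series: Journal of Biomedical Semantics
Subjects: Text mining; Natural language processing; Text classification; Named entity recognition; Sentiment analysis; Patient reported outcome measure
Online Access: http://link.springer.com/article/10.1186/s13326-019-0215-3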
id doaj-1b25cbdf9cd0432bbff4fd6501cb8d7f
record_format Article
spelling doaj-1b25cbdf9cd0432bbff4fd6501cb8d7f 2020-11-25T04:05:21Z eng
BMC. Journal of Biomedical Semantics, ISSN 2041-1480. 2019-11-01; volume 10, issue S1, pages 1-11. doi:10.1186/s13326-019-0215-3
KLOSURE: Closing in on open–ended patient questionnaires with text mining
Irena Spasić, School of Computer Science & Informatics, Cardiff University
David Owen, School of Computer Science & Informatics, Cardiff University
Andrew Smith, School of Psychology, Cardiff University
Kate Button, School of Healthcare Sciences, Cardiff University
collection DOAJ
language English
format Article
sources DOAJ
publisher BMC
series Journal of Biomedical Semantics
issn 2041-1480
publishDate 2019-11-01
description Abstract
Background: Knee injury and Osteoarthritis Outcome Score (KOOS) is an instrument used to quantify patients’ perceptions about their knee condition and associated problems. It is administered as a 42-item closed-ended questionnaire in which patients are asked to self-assess five outcomes: pain, other symptoms, activities of daily living, sport and recreation activities, and quality of life. We developed KLOG as a 10-item open-ended version of the KOOS questionnaire in an attempt to obtain deeper insight into patients’ opinions including their unmet needs. However, the open-ended nature of the questionnaire incurs analytical overhead associated with the interpretation of responses. The goal of this study was to automate such analysis. We implemented KLOSURE as a system for mining free-text responses to the KLOG questionnaire. It consists of two subsystems, one concerned with feature extraction and the other one concerned with classification of feature vectors. Feature extraction is performed by a set of four modules whose main functionalities are linguistic pre-processing, sentiment analysis, named entity recognition and lexicon lookup respectively. Outputs produced by each module are combined into feature vectors. The structure of feature vectors will vary across the KLOG questions. Finally, Weka, a machine learning workbench, was used for classification of feature vectors.
Results: The precision of the system varied between 62.8 and 95.3%, whereas the recall varied from 58.3 to 87.6% across the 10 questions. The overall performance in terms of F-measure varied between 59.0 and 91.3% with an average of 74.4% and a standard deviation of 8.8.
Conclusions: We demonstrated the feasibility of mining open-ended patient questionnaires. By automatically mapping free text answers onto a Likert scale, we can effectively measure the progress of rehabilitation over time. In comparison to traditional closed-ended questionnaires, our approach offers much richer information that can be utilised to support clinical decision making. In conclusion, we demonstrated how text mining can be used to combine the benefits of qualitative and quantitative analysis of patient experiences.
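Note on the reported metrics: the abstract gives precision, recall and F-measure per question but does not state which F-measure variant was used. The sketch below assumes the common balanced form (F1, the harmonic mean of precision and recall) and a simple unweighted average across questions. The function name `f_measure` and the sample values are hypothetical illustrations, not figures or code from the article.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both inputs are zero: define the score as 0 instead of dividing by zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Hypothetical (precision, recall) pairs for two questions; NOT values from the paper.
sample_scores = [(0.80, 0.70), (0.90, 0.85)]

per_question = [f_measure(p, r) for p, r in sample_scores]
unweighted_average = sum(per_question) / len(per_question)

for (p, r), f in zip(sample_scores, per_question):
    print(f"precision={p:.2f} recall={r:.2f} F1={f:.3f}")
print(f"average F1={unweighted_average:.3f}")
```

Because the minimum precision and minimum recall in the abstract need not come from the same question, the reported F-measure range cannot be recomputed from the precision and recall ranges alone.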
topic Text mining
Natural language processing
Text classification
Named entity recognition
Sentiment analysis
Patient reported outcome measure
url http://link.springer.com/article/10.1186/s13326-019-0215-3