Incorporating Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Natural Language Processing Analysis of Patient-Generated Texts About Systemic Lupus Erythematosus

Background: Gaining insights that cannot be obtained from health care databases from patients has become an important topic in pharmacovigilance. Objective: Our objective was to demonstrate a use case, in which patient-generated data were incorporated in pharmacovigil...

Bibliographic Details
Main Authors: Shinichi Matsuda, Takumi Ohtomo, Shiho Tomizawa, Yuki Miyano, Miwako Mogi, Hiroshi Kuriki, Terumi Nakayama, Shinichi Watanabe
Format: Article
Language: English
Published: JMIR Publications 2021-06-01
Series: JMIR Public Health and Surveillance
Online Access: https://publichealth.jmir.org/2021/6/e29238
id doaj-ad4da90dbf6947a0a474dfa51f81a970
record_format Article
spelling doaj-ad4da90dbf6947a0a474dfa51f81a970; 2021-06-30T00:31:00Z; eng; JMIR Publications; JMIR Public Health and Surveillance; ISSN 2369-2960; 2021-06-01; volume 7, issue 6, e29238; DOI 10.2196/29238
Incorporating Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Natural Language Processing Analysis of Patient-Generated Texts About Systemic Lupus Erythematosus
Shinichi Matsuda (https://orcid.org/0000-0003-1822-1090)
Takumi Ohtomo (https://orcid.org/0000-0002-6438-6315)
Shiho Tomizawa (https://orcid.org/0000-0002-5354-0583)
Yuki Miyano (https://orcid.org/0000-0002-9964-6690)
Miwako Mogi (https://orcid.org/0000-0002-3137-8087)
Hiroshi Kuriki (https://orcid.org/0000-0002-3954-9039)
Terumi Nakayama (https://orcid.org/0000-0001-8677-1567)
Shinichi Watanabe (https://orcid.org/0000-0003-0420-5786)
https://publichealth.jmir.org/2021/6/e29238
collection DOAJ
language English
format Article
sources DOAJ
author Shinichi Matsuda
Takumi Ohtomo
Shiho Tomizawa
Yuki Miyano
Miwako Mogi
Hiroshi Kuriki
Terumi Nakayama
Shinichi Watanabe
title Incorporating Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Natural Language Processing Analysis of Patient-Generated Texts About Systemic Lupus Erythematosus
publisher JMIR Publications
series JMIR Public Health and Surveillance
issn 2369-2960
publishDate 2021-06-01
description Background: Gaining insights that cannot be obtained from health care databases from patients has become an important topic in pharmacovigilance. Objective: Our objective was to demonstrate a use case, in which patient-generated data were incorporated in pharmacovigilance, to understand the epidemiology and burden of illness in Japanese patients with systemic lupus erythematosus. Methods: We used data on systemic lupus erythematosus, an autoimmune disease that substantially impairs quality of life, from 2 independent data sets. To understand the disease’s epidemiology, we analyzed a Japanese health insurance claims database. To understand the disease’s burden, we analyzed text data collected from Japanese disease blogs (tōbyōki) written by patients with systemic lupus erythematosus. Natural language processing was applied to these texts to identify frequent patient-level complaints, and term frequency–inverse document frequency was used to explore patient burden during treatment. We explored health-related quality of life based on patient descriptions. Results: We analyzed data from 4694 and 635 patients with systemic lupus erythematosus in the health insurance claims database and tōbyōki blogs, respectively. Based on health insurance claims data, the prevalence of systemic lupus erythematosus is 107.70 per 100,000 persons. Tōbyōki text data analysis showed that pain-related words (eg, pain, severe pain, arthralgia) became more important after starting treatment. We also found an increase in patients’ references to mobility and self-care over time, which indicated increased attention to physical disability due to disease progression. Conclusions: A classical medical database represents only a part of a patient's entire treatment experience, and analysis using solely such a database cannot represent patient-level symptoms or patient concerns about treatments. This study showed that analysis of tōbyōki blogs can provide added information on patient-level details, advancing patient-centric pharmacovigilance.
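The abstract names term frequency–inverse document frequency (TF-IDF) but this record does not describe how it was computed. The following is a minimal sketch of the general technique only, not the authors' pipeline. It assumes the texts are already tokenized (Japanese blog text would first need a morphological tokenizer, omitted here), uses a smoothed IDF of the form log((1 + N) / (1 + df)) + 1 as one common variant, and runs on invented toy word lists rather than study data. The function name `tf_idf` and both sample lists are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Uses relative term frequency and a smoothed IDF,
    log((1 + N) / (1 + df)) + 1, which is an assumption of this sketch.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return scores


# Hypothetical, pre-tokenized toy texts (NOT data from the study).
before_treatment = ["fatigue", "rash", "fever", "fatigue"]
after_treatment = ["pain", "arthralgia", "pain", "fatigue"]

before_scores, after_scores = tf_idf([before_treatment, after_treatment])

# The highest-scoring term in the post-treatment text.
top_after = max(after_scores, key=after_scores.get)
print(top_after)
```

Comparing a term's score across the two groups of text is one simple way to see which words gain weight after treatment starts; terms shared by both groups receive the lowest IDF and so contribute less to the contrast.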
url https://publichealth.jmir.org/2021/6/e29238