Automatic allergy classification based on Russian unstructured medical texts

Most medical data in hospital information system databases are stored in unstructured form. Techniques for processing unstructured records are widely described in scientific papers that focus on English-language data. This paper proposes a method for intelligent analysis of unstructured allergy anam...


Bibliographic Details
Main Authors: Iuliia D. Lenivtceva, Georgy D. Kopanitsa
Format: Article
Language: English
Published: Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University) 2021-06-01
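Series: Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki
Subjects: medical data structuring; allergy; intolerance; machine learning; unstructured text analysis; interoperability
Online Access: https://ntv.ifmo.ru/file/article/20517.pdf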
id doaj-d8fc923971f947ea9cfc7b5401fab1ac
record_format Article
spelling doaj-d8fc923971f947ea9cfc7b5401fab1ac
2021-06-21T12:09:26Z
eng
ISSN 2226-1494, 2500-0373
2021-06-01, vol. 21, no. 3, pp. 433-436
DOI 10.17586/2226-1494-2021-21-3-433-436
Iuliia D. Lenivtceva (https://orcid.org/0000-0002-5572-5151), Engineer, ITMO University, Saint Petersburg, 197101, Russian Federation
Georgy D. Kopanitsa (https://orcid.org/0000-0002-6231-8036), PhD, Leading Researcher, ITMO University, Saint Petersburg, 197101, Russian Federation
collection DOAJ
sources DOAJ
issn 2226-1494
2500-0373
publishDate 2021-06-01
description Most medical data in hospital information system databases are stored in unstructured form. Techniques for processing unstructured records are widely described in scientific papers that focus on English-language data. This paper proposes a method for the intelligent analysis of unstructured allergy anamneses written in Russian, in order to identify the presence and type of a patient's allergies and intolerances. The method is based on machine learning algorithms and uses international standards for medical data exchange and terminology, such as FHIR and SNOMED CT. In the experiment, about 12 thousand medical records were processed. The F-measure of the developed classification models ranged from 0.93 to 0.96. In the future, the structured data can be used in models for predicting medical risks. Further development of methods for structuring medical texts will ensure the interoperability of medical data.
topic medical data structuring
allergy
intolerance
machine learning
unstructured text analysis
interoperability