Design and Implementation of a Structured Electronic Form for Celiac Disease Pathology Reports: A Text Mining Approach

Bibliographic Details
Main Authors: Azadeh Kamel-Ghalibaf, Farzaneh Khadem-Sameni, Majid Jangi, Mohammad Reza Mazaheri-Habibi, Kobra Etminani
Format: Article
Language: Persian (fas)
Published: Vesnu Publications 2016-04-01
Series: مدیریت اطلاعات سلامت (Health Information Management)
Online Access: http://him.mui.ac.ir/index.php/him/article/view/2393
Description
Summary:
Introduction: Pathology reports generally use an unstructured text format and contain a complex web of relations between medical concepts. To enable computers to understand and analyze the free text of these reports, we aimed to convert these concepts and their relations into a structured format.
Methods: The training, validation, and evaluation of this implementation study were based on a corpus of 258 pathology reports with a positive diagnosis of celiac disease, randomly selected from the records of 2 pathology laboratories. Our proposed system consisted of 3 phases: standardization of celiac disease pathology reports using the Delphi technique with 3 experts; information extraction from free-text reports with text mining techniques using the Stanford Parser; and automatic classification of celiac disease stages in the Marsh system using the J48 decision tree algorithm.
Results: We succeeded in extracting information from free-text pathology reports and assigning each piece of information to its pre-defined field in the standardized template form with an accuracy of 76%. After the Marsh stage was determined for each report in the third phase, the system showed an average overall accuracy of 62%. Evaluated as an independent system with manually corrected, gold-standard input, the third phase achieved an accuracy greater than 84%.
Conclusion: Compared with narrative reports, standardized synoptic pathology reporting offers enhanced completeness, improved consistency, avoidance of confusion and error, and faster and safer transmission of critical pathological data.
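The second and third phases described in the summary (filling a structured template from free text, then assigning a Marsh stage) can be illustrated with a minimal sketch. This is not the authors' implementation: simple regular expressions stand in for the Stanford Parser, hand-written rules following the commonly used Marsh-Oberhuber criteria stand in for the learned J48 decision tree, and the template fields (`iel_increased`, `crypt_hyperplasia`, `villous_atrophy`) are hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class CeliacTemplate:
    """Hypothetical structured form; the field names are illustrative only."""
    iel_increased: bool = False
    crypt_hyperplasia: bool = False
    villous_atrophy: str = "none"  # one of: "none", "partial", "subtotal", "total"


def extract_fields(report: str) -> CeliacTemplate:
    """Fill the template from free text with naive keyword patterns.

    Assumption: regexes replace the parser-based extraction of the original
    system. Negation ("no crypt hyperplasia") is deliberately not handled.
    """
    text = report.lower()
    form = CeliacTemplate()
    form.iel_increased = bool(
        re.search(r"increased\s+(intraepithelial\s+lymphocytes|iels?)", text)
    )
    form.crypt_hyperplasia = bool(re.search(r"crypt\s+hyperplasia", text))
    match = re.search(r"(partial|subtotal|total)\s+villous\s+atrophy", text)
    if match:
        form.villous_atrophy = match.group(1)
    return form


def marsh_stage(form: CeliacTemplate) -> str:
    """Assign a stage with fixed rules (assumed Marsh-Oberhuber criteria).

    Assumption: these rules replace the J48 tree, which the original
    system learned from annotated reports.
    """
    if form.villous_atrophy != "none":
        return {"partial": "3a", "subtotal": "3b", "total": "3c"}[form.villous_atrophy]
    if form.iel_increased and form.crypt_hyperplasia:
        return "2"
    if form.iel_increased:
        return "1"
    return "0"


if __name__ == "__main__":
    sample = (
        "Duodenal biopsy shows increased intraepithelial lymphocytes, "
        "crypt hyperplasia and subtotal villous atrophy."
    )
    form = extract_fields(sample)
    print(form, marsh_stage(form))
```

Separating extraction from classification mirrors the evaluation reported in the summary: the classifier can be tested on its own with manually corrected template input, independently of extraction errors.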
ISSN: 1735-7853, 1735-9813