An Automatic Staging Classification Model for Esophageal Cancer Pathological Reports

Bibliographic Details
Main Authors: Yung-Han Sun, 孫詠涵
Other Authors: C. H. Chen
Format: Others
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/40529700074976516273
Description
Summary: Master's === Chang Gung University === In-service Master's Program, College of Management, Information Management Division === 99 === More than 40,000 Taiwan residents died of cancer in 2009, according to statistics from the Department of Health, Taiwan. Owing to advances in medical experience and knowledge over the last decade, the prognosis of cancer patients has improved significantly, and more drugs and alternative treatments are available to help patients at relatively late stages of cancer. Cancer staging is an important indicator for assessing the effects of cancer treatment and prognosis. Its effectiveness may be affected by the interpretation proficiency of the cancer registration staff who read the pathological reports of cancer patients, and the manual interpretation process is inefficient and time-consuming. The aim of this study was to explore the effectiveness of computationally converting pathological reports of esophageal cancer into cancer staging reports by using efficient document classification techniques.
Materials and Methods: Pathological reports of 234 patients who underwent esophagectomy from 2000 to 2008 in the Division of Thoracic Surgery, Taipei Veterans General Hospital, Taiwan, were collected. Text mining techniques were used to analyze cancer staging related keywords in the reports and to convert each report into a weighted frequency vector of keywords. The J48 decision tree induction algorithm, a supervised learning algorithm, was then applied to the 234 vectors to evaluate the performance of the document classification model for automatic cancer staging.
Results: The average prediction accuracy for cell type reached 95.74%, and the accuracies for T, N, and M status reached 86.14%, 90.73%, and 94.89%, respectively.
Conclusions: For esophageal cancer, the J48 decision tree induction algorithm achieved a high average prediction accuracy. The model may efficiently and effectively help physicians or cancer registration staff improve the accuracy of pathological cancer staging and reduce the time spent on staging when processing large amounts of data in studies.
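
The two steps named in the methods (weighted keyword frequency vectors, then decision tree induction) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the record does not give the keyword list, the weighting scheme, or any report text, so the keywords, the four toy reports, their T labels, and the TF-IDF weighting below are all hypothetical. J48 is Weka's implementation of C4.5; the sketch does not build a full tree and only computes the C4.5-style gain ratio used to choose a single split.

```python
import math
from collections import Counter

# Hypothetical staging-related keywords (the actual vocabulary is not given).
KEYWORDS = ["squamous", "adenocarcinoma", "adventitia", "submucosa", "metastasis"]

# Made-up report snippets with made-up T labels, for illustration only.
REPORTS = [
    ("squamous cell carcinoma invading the submucosa", "T1"),
    ("squamous cell carcinoma confined to mucosa no metastasis", "T1"),
    ("adenocarcinoma invading the adventitia with nodal metastasis", "T3"),
    ("squamous cell carcinoma extending into adventitia", "T3"),
]


def tfidf_vectors(texts, keywords):
    """Convert each text into a TF-IDF weighted keyword frequency vector."""
    tokenized = [text.lower().split() for text in texts]
    n = len(tokenized)
    # Document frequency: number of reports that contain each keyword.
    df = {k: sum(1 for toks in tokenized if k in toks) for k in keywords}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Term count times a smoothed inverse document frequency.
        vectors.append(
            [counts[k] * math.log((1 + n) / (1 + df[k])) for k in keywords]
        )
    return vectors


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (c / total) * math.log2(c / total) for c in Counter(labels).values()
    )


def gain_ratio(column, labels):
    """Gain ratio of a binary split: keyword weight > 0 versus weight == 0."""
    present = [y for x, y in zip(column, labels) if x > 0]
    absent = [y for x, y in zip(column, labels) if x <= 0]
    if not present or not absent:
        return 0.0  # the keyword does not split the data at all
    n = len(labels)
    p, a = len(present) / n, len(absent) / n
    gain = entropy(labels) - p * entropy(present) - a * entropy(absent)
    split_info = -(p * math.log2(p) + a * math.log2(a))
    return gain / split_info


texts = [text for text, _ in REPORTS]
labels = [label for _, label in REPORTS]
vectors = tfidf_vectors(texts, KEYWORDS)

# Score every keyword as a candidate root split and keep the best one.
scores = {
    k: gain_ratio([v[i] for v in vectors], labels)
    for i, k in enumerate(KEYWORDS)
}
best_keyword = max(scores, key=scores.get)
print(best_keyword, scores)
```

A full C4.5 learner would apply this selection recursively to each branch, search thresholds on the continuous weights rather than testing only presence, and prune the resulting tree. The sketch stops at the root split to keep the idea visible.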