Ellipsis Handling in A Medical Diagnosis Dialog System


Bibliographic Details
Main Author: 鮑建威
Other Authors: Chuan-Jie Lin
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/50928352094819149040
Description
Summary: Master's thesis === National Taiwan Ocean University (國立臺灣海洋大學) === Department of Computer Science and Engineering (資訊工程學系) === 100 === A computerized virtual patient (CVP) is a domain-specific dialog system. In this thesis, we handle ellipsis in medical diagnosis dialogs. Virtual patients are an important teaching tool for medical students: they help students learn how to judge a patient's condition through medical diagnosis. A CVP needs to resolve spoken-language phenomena such as ellipsis, which is the target of this work. If ellipsis is not handled, it is difficult to find the corresponding questions and answers in the standard question set of the teaching text. Ellipsis handling consists of ellipsis detection, type classification, and recovery. There are many domain-specific dialog systems, but none is similar to ours. Ellipses in this thesis are classified according to the omitted element. A medical diagnosis template stores the necessary information from the dialog. Our system is a hybrid of a rule-based module and machine learning. The rule-based module uses information from the template to detect, classify, and recover ellipses. If an ellipsis cannot be detected by the rule-based module, the machine learning component is applied. We train a classifier to detect ellipsis. Features include lexical surface forms, word information, POS tags, verb tense, punctuation, and special terms identified by observation; these features can also be combined. After detection, the rule-based module classifies and recovers the ellipsis. The training and test data come from virtual-patient teaching records and medical diagnosis records from a hospital. Our machine learning method is the conditional random field (CRF). Training is performed with 10-fold cross-validation. In training, with the best features and feature combinations, the ellipsis detection classifier achieves an F-measure of 86.73%, and subsequent recovery by the rule-based module achieves an F-measure of 78.95%. Using information in the diagnosis template to detect and recover ellipsis achieves an F-measure of 82.58%. The complete ellipsis system achieves an F-measure of 85.54%.
In testing, the ellipsis system achieves a recall of 77.4%, a precision of 79.36%, and an F-measure of 78.35%.
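
The summary reports recall, precision and an "f-value" but does not state how the f-value is computed. The sketch below assumes it is the usual balanced F-measure (F1), the harmonic mean of precision and recall. The function name f_measure is a hypothetical helper for illustration and is not taken from the thesis.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the "f-value" in the summary is this balanced F1 score.
    """
    if precision + recall == 0:
        # Both scores are zero, so the harmonic mean is defined as zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set precision and recall from the summary, written as fractions.
test_f = f_measure(precision=0.7936, recall=0.774)
print(f"{test_f:.2%}")
```

With the reported test-set precision and recall this gives roughly 78.4%, in line with the reported 78.35%. A small gap of this kind can come from rounding of the published figures or from a different averaging scheme, which the summary does not specify.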