Developing integrated evolutionary approaches to improve process mining fitness for noise-free and noisy logs

Bibliographic Details
Main Authors: Hsin-Jung Cheng, 鄭歆蓉
Other Authors: Chao Ou-Yang
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/19652768192393448280
Description
Summary: Doctoral dissertation === National Taiwan University of Science and Technology (國立臺灣科技大學) === Department of Industrial Management === 103 === As information technology has changed, business activities have become markedly more complex, and the information implicit in complex business processes is difficult to understand. A process model describes the internal activities or behavior of a business. Process mining (PM) is a technique that extracts a process model from an event log to represent the process behavior recorded in that log. A mined process model with high fitness reflects most of the process behavior recorded in the event log. Previous studies have shown that mined models with high fitness can be used in process improvement tasks such as fraud detection, continuous process improvement, and benchmarking. Event logs may, however, contain incomplete or incorrect process data; such logs are called noisy logs. This dissertation therefore studies the discovery of process models with high fitness from both noise-free and noisy event logs.
First, this dissertation considers the discovery of parallel (AND) structures in PM. Several problematic structures arise in PM, including parallel (AND) structures, exclusive-choice (XOR) structures, non-free-choice structures, loops, and noise. Some PM approaches address only one or a few of these structures. Genetic process mining (GPM) is a well-known PM method that can handle most of them simultaneously. However, GPM still cannot effectively discover parallel structures from noise-free logs. This dissertation proposes a PM approach that integrates GPM, particle swarm optimization (PSO), and differential evolution (DE) to find process models with high fitness for noise-free logs involving multiple parallel structures. The results show that the proposed approach improves the fitness of the process models obtained for such logs.
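To make the notion of fitness concrete, the following is a minimal sketch and not the fitness measure used in the dissertation, which the summary does not specify. It assumes a deliberately simplified "model" (a set of start activities, end activities, and allowed directly-follows pairs) and scores a log by the fraction of traces the model can replay. The function names, the activity labels A to D, and the example logs are all hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple

Trace = Sequence[str]
Pair = Tuple[str, str]


def directly_follows(traces: Iterable[Trace]) -> Set[Pair]:
    """Collect every (a, b) pair where b immediately follows a in some trace."""
    return {(a, b) for t in traces for a, b in zip(t, t[1:])}


def trace_fits(trace: Trace, start: Set[str], end: Set[str],
               allowed: Set[Pair]) -> bool:
    """A trace fits if it starts and ends correctly and every step is allowed."""
    if not trace:
        return False
    if trace[0] not in start or trace[-1] not in end:
        return False
    return all(pair in allowed for pair in zip(trace, trace[1:]))


def fitness(log: Iterable[Trace], start: Set[str], end: Set[str],
            allowed: Set[Pair]) -> float:
    """Fraction of traces in the log that the simplified model can replay."""
    traces = list(log)
    if not traces:
        return 0.0
    return sum(trace_fits(t, start, end, allowed) for t in traces) / len(traces)


# Hypothetical noise-free log: B and C run in parallel between A and D,
# so both orderings appear.
clean_log = [("A", "B", "C", "D"), ("A", "C", "B", "D")]
allowed_pairs = directly_follows(clean_log)

# Hypothetical noisy log: one trace lost both B and C during recording.
noisy_log = clean_log + [("A", "D")]

clean_fitness = fitness(clean_log, {"A"}, {"D"}, allowed_pairs)
noisy_fitness = fitness(noisy_log, {"A"}, {"D"}, allowed_pairs)
```

Under these assumptions the noise-free log replays completely, while only two of the three traces in the noisy log replay. Published fitness measures are usually finer grained than this all-or-nothing count per trace.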
Second, this dissertation considers the discovery of process models from noisy logs. In some sectors (such as hospitals), data must sometimes be recorded manually, and data on tasks performed during emergencies may be missing. Manual logging can introduce errors, and missing events leave logs incomplete; event logs that contain such incorrect or incomplete process data are called noisy logs. This dissertation develops a PSOSA approach that combines particle swarm optimization (PSO) with simulated annealing (SA) to discover process models with high fitness from noisy logs. The results indicate that the PSOSA approach improves the fitness of process models mined from noisy logs.
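The summary does not say how PSO and SA are combined in the PSOSA approach. As a rough illustration of one common way to hybridize the two, the sketch below runs a standard PSO velocity update and uses an SA-style Metropolis rule when updating each particle's personal best, so that a worse position is occasionally accepted while the temperature is high. Everything here is an assumption made for illustration: the function name `pso_sa`, the parameter values, and the toy `sphere` objective, which stands in for a cost such as one minus the fitness of a candidate process model.

```python
import math
import random


def sphere(x):
    """Toy objective to minimize; a stand-in for a model-quality cost."""
    return sum(v * v for v in x)


def pso_sa(objective, dim=2, particles=15, iters=200,
           t0=1.0, cooling=0.97, seed=0):
    """PSO whose personal-best update uses an SA-style acceptance rule."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [objective(p) for p in pos]
    g = min(range(particles), key=pbest_cost.__getitem__)
    best, best_cost = pbest[g][:], pbest_cost[g]
    temp = t0
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard PSO update: inertia + cognitive pull + social pull.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cost = objective(pos[i])
            delta = cost - pbest_cost[i]
            # SA step: always accept an improvement; accept a worse
            # position with probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                pbest[i], pbest_cost[i] = pos[i][:], cost
            # The global best only ever improves.
            if cost < best_cost:
                best, best_cost = pos[i][:], cost
        temp *= cooling  # geometric cooling schedule
    return best, best_cost


best_position, best_value = pso_sa(sphere)
```

The occasional acceptance of worse personal bests is what SA contributes in this sketch: it lets particles move away from a local optimum early on, and the effect fades as the temperature cools. Whether the dissertation applies SA at this point in the search or elsewhere is not stated in the summary.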