Summary: | Master's === National Taiwan Normal University === Department of Information and Computer Education === 94 === A graph is a kind of structured data used to model the various relations among data in the real world. Mining frequent sub-graph patterns requires sub-graph isomorphism checking, which is NP-complete, and is therefore an NP-hard problem; mining frequent sub-graph patterns in data streams is even more complicated. In this thesis, the graph data arriving at each time point is collected for mining the frequent sub-graph patterns at that time point. We assume that a change in the frequent sub-graph patterns takes several time points to occur, so it is not necessary to re-mine the frequent sub-graph patterns at every time point. The frequent sub-graph patterns discovered at the first time point are called base patterns. An efficient method, the FGCD algorithm, is proposed to detect changes in the base patterns at the following time points. The FGCD algorithm approximately counts the frequencies of the base patterns in the set of newly arriving graphs and calculates the percentage of base patterns that remain frequent, in order to decide whether the trend of frequent sub-graph patterns is changing and, if so, to trigger re-mining. Storage structures for graphs are designed, and the downward closure property among frequent sub-graphs is applied, so that sub-graph patterns can be matched efficiently. According to the experimental results, FGCD can approximately estimate the percentage of base patterns that remain frequent. When the trend of frequent sub-graph patterns does not change, the FGCD algorithm maintains the frequent sub-graph patterns approximately and more efficiently than re-mining.
|
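The change-detection idea described in the summary (count base patterns in the newly arriving graphs, compute the fraction that remain frequent, and trigger re-mining when that fraction drops) can be illustrated with a minimal sketch. This is not the FGCD algorithm itself: the summary does not give its storage structures or its approximate counting scheme. The sketch assumes, purely for illustration, that every node label is unique within a graph, so that a pattern occurs in a graph exactly when its edge set is a subset of the graph's edge set and no isomorphism search is needed. All names (`make_graph`, `remaining_frequent`, `remaining_ratio`, `should_remine`) and both thresholds are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]  # assumption: unique node labels, undirected edges


def make_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build a graph as a set of edges with endpoints in sorted order."""
    return frozenset((u, v) if u <= v else (v, u) for u, v in edges)


def remaining_frequent(base_patterns: Iterable[Graph],
                       new_graphs: List[Graph],
                       min_support: float) -> Set[Graph]:
    """Return the base patterns that are still frequent in new_graphs.

    Counting here is exact (a simplification); the summary describes
    approximate counting.
    """
    patterns = list(base_patterns)
    if not new_graphs:
        # No new data, so there is no evidence that any pattern changed.
        return set(patterns)
    needed = min_support * len(new_graphs)
    still_frequent: Set[Graph] = set()
    infrequent: List[Graph] = []
    # Check smaller patterns first so downward closure can prune larger ones.
    for pattern in sorted(patterns, key=len):
        # Downward closure: a pattern containing an infrequent sub-pattern
        # cannot be frequent, so its occurrences need not be counted.
        if any(small <= pattern for small in infrequent):
            continue
        count = sum(1 for graph in new_graphs if pattern <= graph)
        if count >= needed:
            still_frequent.add(pattern)
        else:
            infrequent.append(pattern)
    return still_frequent


def remaining_ratio(base_patterns: Iterable[Graph],
                    new_graphs: List[Graph],
                    min_support: float) -> float:
    """Fraction of base patterns that remain frequent in new_graphs."""
    patterns = list(base_patterns)
    if not patterns:
        return 1.0
    kept = remaining_frequent(patterns, new_graphs, min_support)
    return len(kept) / len(patterns)


def should_remine(base_patterns: Iterable[Graph],
                  new_graphs: List[Graph],
                  min_support: float,
                  min_remaining_ratio: float) -> bool:
    """Signal re-mining when too few base patterns remain frequent."""
    ratio = remaining_ratio(base_patterns, new_graphs, min_support)
    return ratio < min_remaining_ratio
```

In this sketch, `min_remaining_ratio` plays the role of the percentage threshold mentioned in the summary: while the ratio stays at or above it, the base patterns are kept and the cost of a full re-mining pass is avoided.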