Seeing the forest for the trees: tree-based uncertain frequent pattern mining
Many frequent pattern mining algorithms operate on precise data, where each data point is an exact accounting of a phenomenon (e.g., I have exactly two sisters). Alas, reasoning this way is a simplification for many real-world observations. Measurements, predictions, environmental factors, human erro...
Main Author: | MacKinnon, Richard Kyle |
---|---|
Other Authors: | Leung, Carson K.-S. (Computer Science) |
Published: | Springer International Publishing, 2016 |
Subjects: | Data mining; Databases |
Online Access: | http://hdl.handle.net/1993/31059 |
id |
ndltd-MANITOBA-oai-mspace.lib.umanitoba.ca-1993-31059 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-MANITOBA-oai-mspace.lib.umanitoba.ca-1993-31059 2016-01-15T03:52:16Z
Seeing the forest for the trees: tree-based uncertain frequent pattern mining
MacKinnon, Richard Kyle
Leung, Carson K.-S. (Computer Science)
Wang, Yang (Computer Science)
Wang, Xikui (Statistics)
Data mining
Databases
Many frequent pattern mining algorithms operate on precise data, where each data point is an exact accounting of a phenomenon (e.g., I have exactly two sisters). Alas, reasoning this way is a simplification for many real-world observations. Measurements, predictions, environmental factors, human error, etc. all introduce a degree of uncertainty into the mix. Tree-based frequent pattern mining algorithms such as FP-growth are particularly efficient due to their compact in-memory representations of the input database, but their uncertain extensions can require many more tree nodes. I propose new algorithms with tightened upper bounds to expected support, Tube-S and Tube-P, which mine frequent patterns from uncertain data. Extensive experimentation and analysis on datasets with different probability distributions show the tightness of my bounds in different situations.
February 2016
2016-01-13T22:43:27Z 2016-01-13T22:43:27Z 2014-05 2014-09 2014-09 2014-12 2014-12
MacKinnon, R.K., Leung, C.K.-S., Tanbeer, S.K. (2014) A scalable data analytics algorithm for mining frequent patterns from uncertain data. In Proc. PAKDDW 2014: 404-416. Springer International Publishing.
Leung, C.K.-S., MacKinnon, R.K. (2014) BLIMP: a compact tree structure for uncertain frequent pattern mining. In Proc. DaWaK 2014: 115-123. Springer International Publishing.
Leung, C.K.-S., MacKinnon, R.K., Tanbeer, S.K. (2014) Tightening upper bounds to the expected support for uncertain frequent pattern mining. In Proc. KES 2014: 328-337. Elsevier.
MacKinnon, R.K., Strauss, T.D., Leung, C.K.-S. (2014) DISC: efficient uncertain frequent pattern mining with tightened upper bounds. In Proc. ICDMW 2014: 1038-1045. IEEE Computer Society Press.
Leung, C.K.-S., MacKinnon, R.K., Tanbeer, S.K. (2014) Fast algorithms for frequent itemset mining from uncertain data. In Proc. ICDM 2014: 893-898. IEEE Computer Society Press.
http://hdl.handle.net/1993/31059
Springer International Publishing Springer International Publishing Elsevier IEEE Computer Society Press IEEE Computer Society Press
|
collection |
NDLTD |
sources |
NDLTD |
topic |
Data mining Databases |
description |
Many frequent pattern mining algorithms operate on precise data, where each data point is an exact accounting of a phenomenon (e.g., I have exactly two sisters). Alas, reasoning this way is a simplification for many real-world observations. Measurements, predictions, environmental factors, human error, etc. all introduce a degree of uncertainty into the mix. Tree-based frequent pattern mining algorithms such as FP-growth are particularly efficient due to their compact in-memory representations of the input database, but their uncertain extensions can require many more tree nodes. I propose new algorithms with tightened upper bounds to expected support, Tube-S and Tube-P, which mine frequent patterns from uncertain data. Extensive experimentation and analysis on datasets with different probability distributions show the tightness of my bounds in different situations. February 2016 |
author2 |
Leung, Carson K.-S. (Computer Science) |
author |
MacKinnon, Richard Kyle |
title |
Seeing the forest for the trees: tree-based uncertain frequent pattern mining |
publisher |
Springer International Publishing |
publishDate |
2016 |
url |
http://hdl.handle.net/1993/31059 |