Abstract:
Frequent-regular itemset mining has attracted considerable attention and has been
used in several applications. In this framework, an itemset that frequently and regularly
occurs in a database is identified as interesting. However, without prior knowledge,
setting appropriate support and regularity thresholds to measure the interestingness of
itemsets is quite difficult. Poorly chosen thresholds may yield no results, only a few, or
an overwhelming number, so that users cannot make further use of these itemsets. In addition,
mining interesting itemsets over data streams is a challenging task in various domains.
Therefore, to cope with these issues, we propose an approach to mine top-k
frequent-regular itemsets over data streams. To mine such itemsets, the concept of a
sliding window is applied, in which recent occurrences are considered to be more
important than earlier occurrences. An efficient single-pass algorithm, called
TFRIM-DS, is also introduced to mine the set of k itemsets that occur regularly and
have the highest support in the current window. In addition, a bit-vector with
a reuse technique is designed to efficiently maintain the occurrence
information of each itemset. Experiments were conducted and show the efficiency of
the proposed TFRIM-DS in mining top-k frequent-regular itemsets over sliding windows
of data streams.
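
As an illustration only, the following Python sketch shows how bit-vectors over a
sliding window can yield the k regular itemsets with the highest support. It is not
the TFRIM-DS algorithm: it enumerates candidates by brute force and omits the reuse
technique. The regularity measure (largest gap between consecutive occurrences,
including the gaps at both ends of the window) is an assumption, since the abstract
does not define it, and the names `regularity`, `top_k_frequent_regular`,
`max_regularity`, and `max_size` are hypothetical.

```python
from collections import deque
from itertools import combinations


def regularity(bits, width):
    """Largest gap between consecutive occurrences in a window of `width` slots.

    Bit i of `bits` is set when the itemset occurs in slot i (0 = oldest).
    Assumed definition: the gap before the first occurrence and the gap after
    the last one are both counted.
    """
    prev, worst = -1, 0
    for pos in range(width):
        if (bits >> pos) & 1:
            worst = max(worst, pos - prev)
            prev = pos
    return max(worst, width - 1 - prev)


def top_k_frequent_regular(window, k, max_regularity, max_size=2):
    """Brute-force sketch: k regular itemsets with the highest support."""
    width = len(window)

    # One bit-vector per item, stored as a Python int.
    item_bits = {}
    for pos, transaction in enumerate(window):
        for item in transaction:
            item_bits[item] = item_bits.get(item, 0) | (1 << pos)

    results = []
    items = sorted(item_bits)
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            # An itemset occurs where all of its items occur: AND the vectors.
            bits = (1 << width) - 1
            for item in itemset:
                bits &= item_bits[item]
            support = bin(bits).count("1")
            if support and regularity(bits, width) <= max_regularity:
                results.append((itemset, support))

    # Highest support first; ties broken by itemset for a stable order.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results[:k]


# A window of the 4 most recent transactions; older ones are dropped.
window = deque(maxlen=4)
for transaction in [{"a", "b"}, {"a"}, {"a", "b"}, {"b", "c"}, {"a", "b"}]:
    window.append(transaction)

top = top_k_frequent_regular(window, k=2, max_regularity=2)
print(top)
```

Here `deque(maxlen=4)` stands in for the sliding window: appending a fifth transaction
discards the oldest one, so only recent occurrences contribute to support and
regularity. A practical implementation would update the bit-vectors incrementally
rather than rebuild them for every window.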