Techniques to improve the efficiency of the Apriori algorithm
- Hash-based technique
- Transaction reduction
- Dynamic itemset counting
Apriori Algorithm – Frequent Pattern Algorithms
The Apriori algorithm was the first algorithm proposed for frequent itemset mining. It was later improved by R. Agrawal and R. Srikant and came to be known as Apriori. The algorithm uses two steps, “join” and “prune”, to reduce the search space. It is an iterative approach that discovers frequent itemsets one size at a time: frequent K-itemsets are used to find frequent (K+1)-itemsets.
An itemset is judged not frequent by the following rules, where P(I) is the support of I, that is, the fraction of transactions that contain I:
- If P(I) < minimum support threshold, then I is not frequent.
- If P(I+A) < minimum support threshold, then I+A is not frequent, where A is another item added to the itemset.
- If an itemset has support below the minimum support, then all of its supersets also fall below the minimum support and can be ignored. This is called the antimonotone property.
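The rules above can be checked on a small example. The following is a minimal sketch, not a reference implementation: the `support` helper, the toy transactions, and the 0.5 threshold are all assumptions made up for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# Hypothetical toy database of four transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
min_support = 0.5  # assumed threshold

pair = {"milk", "butter"}
superset = {"bread", "milk", "butter"}

# The pair appears in only one of four transactions, so it is not frequent.
pair_is_frequent = support(pair, transactions) >= min_support

# Antimonotone property: a superset can never have higher support than
# the itemset it extends, so it does not need to be counted at all.
superset_cannot_exceed = support(superset, transactions) <= support(pair, transactions)

print(pair_is_frequent, superset_cannot_exceed)
```

Because adding an item can only shrink the set of matching transactions, once `pair` falls below the threshold every itemset that contains it can be skipped without scanning the database.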
The steps followed in the Apriori Algorithm of data mining are:
- Join Step: This step generates candidate (K+1)-itemsets from the frequent K-itemsets by joining the set of K-itemsets with itself.
- Prune Step: This step scans the database to count each candidate itemset. If a candidate does not meet the minimum support, it is regarded as infrequent and is removed. This step is performed to reduce the size of the candidate set.
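The join and prune steps can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the original authors' implementation: the function name `apriori`, the use of an absolute `min_count` instead of a fractional threshold, and the toy data are all hypothetical. The subset check inside the join loop applies the antimonotone property before any counting happens.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for itemsets found in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Frequent 1-itemsets: count each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 1
    while current:
        # Join step: combine pairs of frequent k-itemsets into (k+1)-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Antimonotone check: every k-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k)):
                candidates.add(union)

        # Prune step: scan the database, count each candidate, and drop
        # those that do not meet the minimum support count.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy database; min_count=2 means "at least 2 transactions".
toy = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(toy, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this toy data the candidate {bread, milk, butter} is never counted, because its subset {milk, butter} is already known to be infrequent. A production implementation would typically generate candidates more selectively (for example, joining only itemsets that share their first K-1 items) instead of testing every pair.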