Support: The support of an association pattern is the percentage of task-relevant data transaction for which the pattern is true.

Support (A): Number of tuples containing A / Total number of tuples

Support (A = > B): Number of tuples containing A and B / Total number of tuples

  • If minsup is set too high, we could miss itemsets involving interesting rare items (e.g., expensive products)
  • If minsup is set too low, it is computationally expensive and the number of itemsets is very large

Confidence: Confidence is defined as the measure of certainty or trustworthiness associated with each discovered pattern.

Confidence (A = > B): Number of tuples containing A and B / Total count of A

Itemset

  • A collection of one or more Example: {Milk, Bread, Diaper}
  • An itemset that contains k items is called k-itemset.

Frequent Itemset

  • An itemset whose support is greater than or equal to a minimum support threshold.

Association Rule


  • An implication expression of the form X => Y, where X and Y are Example: {Milk, Diaper} => {Beer}

Maximal Frequent Itemset:

-     An itemset is maximal if none of its immediate supersets is frequent.

Closed Itemset:

  • An itemset is closed if none of its immediate supersets has same support as of the itmeset.

Lift

  • Lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response with respect to the population as a whole, measured against a random choice targeting model.
  • Lift can be found by dividing the confidence by the unconditional probability of the consequent, or by dividing the support by the probability of the antecedent times the probability of the consequent.
  • If some rule had a lift of 1, it would imply that the probability of occurrence of the antecedent and that of the consequent are independent of each other. When two events are independent of each other, no rule can be drawn involving those two events.
  • If the lift is > 1, that lets us know the degree to which those two occurrences are dependent on one another, and makes those rules potentially useful for predicting the consequent in future data sets.
  • Lift = P(Y | X ) /P(Y )