Mining Association Rules
Md Tabrez Nafis
Department of Computer Science & Engineering
JAMIA HAMDARD, New Delhi
Association Rules
Mining Association Rules between Sets of Items in Large Databases (R. Agrawal, T. Imielinski & A. Swami), 1993.
Fast Algorithms for Mining Association Rules (R. Agrawal & R. Srikant), 1994.
What are Association Rules?
Study of “what goes with what”
“Customers who bought X also bought Y”
What symptoms go with what diagnosis
Transaction-based or event-based
Also called “market basket analysis” and
“affinity analysis”
Originated with study of customer
transactions databases to determine
associations among items purchased
Association rule mining
It is an important data mining model studied
extensively by the database and data mining
community.
Assume all data are categorical.
Initially used for Market Basket Analysis to find
how items purchased by customers are related.
Used in many recommender
systems
Basket Data
Retail organizations, e.g., supermarkets,
collect and store massive amounts of sales
data, called basket data.
A record consists of
transaction date
items bought
Or, basket data may consist of items bought
by a customer over a period.
Generating Rules
Terms
“IF” part = antecedent
“THEN” part = consequent
“Item set” = the items (e.g., products)
comprising the antecedent or consequent
Antecedent and consequent are disjoint
(i.e., have no items in common)
Tiny Example: Phone Faceplates
Many Rules are Possible
For example: Transaction 1 supports several
rules, such as
“If red, then white” (“If a red faceplate is
purchased, then so is a white one”)
“If white, then red”
“If red and white, then green”
+ several more
Frequent Item Sets
Ideally, we want to create all possible
combinations of items
Problem: computation time grows
exponentially as # items increases
Solution: consider only “frequent item
sets”
Criterion for frequent: support
Support
Support = # (or percent) of transactions that
include both the antecedent and the
consequent
Example: support for the item set {red,
white} is 4 out of 10 transactions, or 40%
Example Association Rule
90% of transactions that purchase bread and
butter also purchase milk
Antecedent: bread and butter
Consequent: milk
Confidence factor: 90%
Association rules
Support
Every association rule has a support and a confidence.
“The support is the percentage of transactions that demonstrate the rule.”
Example: Database with transactions (customer_# : item_a1, item_a2, …)
1: 1, 3, 5.
2: 1, 8, 14, 17, 12.
3: 4, 6, 8, 12, 9, 104.
4: 2, 1, 8.
support {8,12} = 2 (or 50%: 2 of 4 customers)
support {1, 5} = 1 (or 25%: 1 of 4 customers)
support {1} = 3 (or 75%: 3 of 4 customers)
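A minimal Python sketch of this support computation (the function name and set-based representation are our own, for illustration only):

def support(itemset, transactions):
    # Number of transactions that contain every item in `itemset`.
    return sum(1 for t in transactions if itemset <= t)

transactions = [{1, 3, 5},
                {1, 8, 14, 17, 12},
                {4, 6, 8, 12, 9, 104},
                {2, 1, 8}]

print(support({8, 12}, transactions))  # 2 (50%: 2 of 4 customers)
print(support({1, 5}, transactions))   # 1 (25%)
print(support({1}, transactions))      # 3 (75%)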
2. Association rules
Support
An itemset is called frequent if its support is equal to or
greater than an agreed-upon minimal value – the support
threshold.
Adding to the previous example:
if the threshold is 50%,
then the itemsets {8,12} and {1} are called frequent.
2. Association rules
Confidence
Every association rule has a support and a confidence.
An association rule is of the form: X => Y
X => Y: if someone buys X, they also buy Y
The confidence is the conditional probability that, given X
is present in a transaction, Y will also be present.
Confidence measure, by definition:
confidence(X => Y) = support(X ∪ Y) / support(X)
2. Association rules
Confidence
We should only consider rules derived from
itemsets with high support, and that also have
high confidence.
“A rule with low confidence is not meaningful.”
Rules don’t explain anything, they just point out
hard facts in data volumes.
3. Example
Example: Database with transactions ( customer_# : item_a1, item_a2, … )
1: 3, 5, 8.
2: 2, 6, 8.
3: 1, 4, 7, 10.
4: 3, 8, 10.
5: 2, 5, 8.
6: 1, 5, 6.
7: 4, 5, 6, 8.
8: 2, 3, 4.
9: 1, 5, 7, 8.
10: 3, 8, 9, 10.
Conf ( {5} => {8} ) ?
supp({5}) = 5 , supp({8}) = 7 , supp({5,8}) = 4,
then conf( {5} => {8} ) = 4/5 = 0.8 or 80%
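The same computation as a small Python sketch (reusing the illustrative `support` helper from the earlier snippet; the database is the 10-transaction example above):

def support(itemset, transactions):
    return sum(1 for t in transactions if itemset <= t)

def confidence(x, y, transactions):
    # conf(X => Y) = support(X ∪ Y) / support(X)
    return support(x | y, transactions) / support(x, transactions)

transactions = [{3, 5, 8}, {2, 6, 8}, {1, 4, 7, 10}, {3, 8, 10}, {2, 5, 8},
                {1, 5, 6}, {4, 5, 6, 8}, {2, 3, 4}, {1, 5, 7, 8}, {3, 8, 9, 10}]

print(confidence({5}, {8}, transactions))  # 4/5 = 0.8, or 80%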
Conf ( {5} => {8} ) ? 80% Done. Conf ( {8} => {5} ) ?
supp({5}) = 5 , supp({8}) = 7 , supp({5,8}) = 4,
then conf( {8} => {5} ) = 4/7 = 0.57 or 57%
Conf ( {5} => {8} ) ? 80% Done.
Conf ( {8} => {5} ) ? 57% Done.
Rule ( {5} => {8} ) is more meaningful than
rule ( {8} => {5} ).
Conf ( {9} => {3} ) ?
supp({9}) = 1 , supp({3}) = 4 , supp({3,9}) = 1,
then conf( {9} => {3} ) = 1/1 = 1.0 or 100%. OK?
Not really: the confidence is 100%, but the rule is
supported by only one transaction, so it is not meaningful.
The model: data
I = {i1, i2, …, im}: a set of items.
Transaction t :
t is a set of items, with t ⊆ I.
Transaction Database T: a set of transactions
T = {t1, t2, …, tn}.
Transaction data:
supermarket data
Market basket transactions:
t1: {bread, cheese, milk}
t2: {apple, eggs, salt, yogurt}
… …
tn: {biscuit, eggs, milk}
Concepts:
An item: an item/article in a basket
I: the set of all items sold in the store
A transaction: items purchased in a basket; it may
have TID (transaction ID)
A transactional dataset: A set of transactions
The model: rules
A transaction t contains X, a set of items
(itemset) in I, if X ⊆ t.
An association rule is an implication of the
form:
X => Y, where X, Y ⊆ I, and X ∩ Y = ∅
An itemset is a set of items.
E.g., X = {milk, bread, cereal} is an itemset.
A k-itemset is an itemset with k items.
E.g., {milk, bread, cereal} is a 3-itemset
Support and Confidence
Support count: The support count of an
itemset X, denoted by X.count, in a data set
T is the number of transactions in T that
contain X. Assume T has n transactions.
Then,
support = (X ∪ Y).count / n
confidence = (X ∪ Y).count / X.count
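For example, with X = {5} and Y = {8} in the earlier 10-transaction database: (X ∪ Y).count = 4, X.count = 5, and n = 10, so support = 4/10 = 40% and confidence = 4/5 = 80%.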
Goal and key features
Goal: Find all rules that satisfy the user-
specified minimum support (minsup) and
minimum confidence (minconf).
Key Features
Completeness: find all rules.
No target item(s) on the right-hand-side
Mining with data on hard disk (not in memory)
Transaction data representation
A simplistic view of shopping baskets:
some important information is not considered, e.g.,
the quantity of each item purchased and
the price paid.
Many mining algorithms
There are a large number of them!!
They use different strategies and data structures.
Their resulting sets of rules are all the same.
Given a transaction data set T, a minimum support and
a minimum confidence, the set of association rules existing in
T is uniquely determined.
Any algorithm should find the same set of rules
although their computational efficiencies and
memory requirements may be different.
We study only one: the Apriori Algorithm
The Apriori Algorithm
The name, Apriori, is based on the fact that the algorithm
uses prior knowledge of frequent itemset properties
Apriori employs an iterative approach known as a level-wise
search, where k-itemsets are used to explore (k+1)-
itemsets
The first pass determines the frequent 1-itemsets
denoted L1
Each subsequent pass k consists of two phases
First, the frequent itemsets Lk-1 are used to generate the
candidate itemsets Ck
Next, the database is scanned and the support of candidates in
Ck is counted
The frequent itemsets Lk are determined
Generating Frequent Item Sets
For k products…
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that
meet the support criterion
3. Use the list of one-item sets to generate
list of two-item sets that meet the support
criterion
4. Use list of two-item sets to generate list of
three-item sets
5. Continue up through k-item sets
The Apriori algorithm
Probably the best known algorithm
Two steps:
Find all itemsets that have minimum support
(frequent itemsets, also called large itemsets).
Use frequent itemsets to generate rules.
E.g., a frequent itemset
{Chicken, Clothes, Milk} [sup = 3/7]
and one rule from the frequent itemset
Clothes => Milk, Chicken [sup = 3/7, conf = 3/3]
The Apriori Algorithm — Example
Minimum support = 50% (i.e., a count of 2 in the 4 transactions).

Database D:
TID 100: 1, 3, 4
TID 200: 2, 3, 5
TID 300: 1, 2, 3, 5
TID 400: 2, 5

Scan D → C1: {1}:2, {2}:3, {3}:3, {4}:1, {5}:3
L1: {1}:2, {2}:3, {3}:3, {5}:3

C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
Scan D → C2: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
L2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2

C3: {2,3,5}
Scan D → C3: {2,3,5}:2
L3: {2,3,5}:2
Picture of A-Priori
[Figure: in Pass 1, the item counts are taken and the frequent items identified; in Pass 2, only the pairs of frequent items are counted.]
A-Priori Using Triangular Matrix
Why would we even mention the infrequent items?
[Figure: Pass 1 takes the item counts; the frequent items are renumbered consecutively (old # → new #, with infrequent items getting no new number), so that in Pass 2 the counts of pairs of frequent items fit in a compact triangular matrix.]
Frequent Triples, Etc.
For each size of itemsets k, we construct two
sets of k-sets (sets of size k):
Ck = candidate k-sets = those that might be
frequent sets (support > s) based on information
from the pass for itemsets of size k – 1.
Lk = the set of truly frequent k-sets.
[Figure: C1 → filter → L1 → construct → C2 → filter → L2 → construct → C3 → … In the first pass, count all the items; in the second pass, count all pairs of items from L1 (the frequent items); then go on to triples, and so on.]
Step 1: Mining all frequent itemsets
A frequent itemset is an itemset whose support
is ≥ minsup.
Key idea: The apriori property (downward
closure property): any subsets of a frequent
itemset are also frequent itemsets
[Figure: the lattice of itemsets over A, B, C, D. Level 1: A, B, C, D; level 2: AB, AC, AD, BC, BD, CD; level 3: ABC, ABD, ACD, BCD.]
The Algorithm
Iterative algorithm (also called level-wise search):
Find all 1-item frequent itemsets; then all 2-item
frequent itemsets, and so on.
In each iteration k, only consider itemsets that
contain some frequent (k-1)-itemset.
Find frequent itemsets of size 1: F1
From k = 2:
Ck = candidates of size k: those itemsets of size k
that could be frequent, given Fk-1
Fk = those itemsets in Ck that are actually frequent,
Fk ⊆ Ck (needs one scan of the database).
Example – minsup = 0.5: finding the frequent itemsets.
Dataset T:
T100: 1, 3, 4
T200: 2, 3, 5
T300: 1, 2, 3, 5
T400: 2, 5

(itemset:count)
1. Scan T → C1: {1}:2, {2}:3, {3}:3, {4}:1, {5}:3
   F1: {1}:2, {2}:3, {3}:3, {5}:3
   C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
2. Scan T → C2: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
   F2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2
   C3: {2,3,5}
3. Scan T → C3: {2,3,5}:2
   F3: {2,3,5}
Details: the algorithm
Algorithm Apriori(T)
  C1 ← init-pass(T);
  F1 ← {f | f ∈ C1, f.count/n ≥ minsup};  // n: no. of transactions in T
  for (k = 2; Fk-1 ≠ ∅; k++) do
    Ck ← candidate-gen(Fk-1);
    for each transaction t ∈ T do
      for each candidate c ∈ Ck do
        if c is contained in t then
          c.count++;
      end
    end
    Fk ← {c ∈ Ck | c.count/n ≥ minsup}
  end
  return F ← ∪k Fk;
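A runnable Python sketch of this algorithm (our own illustrative code, not from the paper; the join here simply unions pairs of frequent (k-1)-itemsets and keeps a candidate only if all its (k-1)-subsets are frequent, a simpler variant of the candidate-gen function described next):

from itertools import combinations

def apriori(transactions, minsup):
    # Returns {frozenset: count} of all frequent itemsets (support ratio >= minsup).
    n = len(transactions)
    counts = {}                                   # pass 1: count 1-itemsets
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    F = {s: c for s, c in counts.items() if c / n >= minsup}
    frequent = dict(F)
    k = 2
    while F:
        prev = list(F)
        candidates = set()                        # join + prune
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                c = prev[i] | prev[j]
                if len(c) == k and all(frozenset(s) in F
                                       for s in combinations(c, k - 1)):
                    candidates.add(c)
        counts = {c: 0 for c in candidates}       # one scan of the database
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        F = {s: c for s, c in counts.items() if c / n >= minsup}
        frequent.update(F)
        k += 1
    return frequent

T = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
print(apriori(T, 0.5))   # includes frozenset({2, 3, 5}): 2, matching F3 above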
Apriori candidate generation
The candidate-gen function takes Fk-1 and
returns a superset (called the candidates)
of the set of all frequent k-itemsets. It has
two steps
join step: Generate all possible candidate
itemsets Ck of length k
prune step: Remove those candidates in Ck
that cannot be frequent.
Candidate-gen function
Function candidate-gen(Fk-1)
  Ck ← ∅;
  forall f1, f2 ∈ Fk-1
      with f1 = {i1, … , ik-2, ik-1}
      and f2 = {i1, … , ik-2, i'k-1}
      and ik-1 < i'k-1 do
    c ← {i1, …, ik-1, i'k-1};  // join f1 and f2
    Ck ← Ck ∪ {c};
    for each (k-1)-subset s of c do
      if (s ∉ Fk-1) then
        delete c from Ck;  // prune
    end
  end
  return Ck;
An example
F3 = {{1, 2, 3}, {1, 2, 4}, {1, 3, 4},
{1, 3, 5}, {2, 3, 4}}
After join
C4 = {{1, 2, 3, 4}, {1, 3, 4, 5}}
After pruning:
C4 = {{1, 2, 3, 4}}
because {1, 4, 5} is not in F3 ({1, 3, 4, 5} is removed)
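A Python sketch of candidate-gen on this example (illustrative code; it applies the prune inline instead of inserting and then deleting, which yields the same Ck):

from itertools import combinations

def candidate_gen(F_prev, k):
    # F_prev: the frequent (k-1)-itemsets, as a set of frozensets.
    Ck = set()
    for f1 in F_prev:
        for f2 in F_prev:
            s1, s2 = sorted(f1), sorted(f2)
            # Join step: f1 and f2 share the first k-2 items.
            if s1[:-1] == s2[:-1] and s1[-1] < s2[-1]:
                c = f1 | f2
                # Prune step: every (k-1)-subset of c must be frequent.
                if all(frozenset(s) in F_prev for s in combinations(c, k - 1)):
                    Ck.add(c)
    return Ck

F3 = {frozenset(s) for s in [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}]}
print(candidate_gen(F3, 4))
# {frozenset({1, 2, 3, 4})} -- {1,3,4,5} is formed by the join
# but pruned, because {1,4,5} is not in F3.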
Step 2: Generating rules from frequent itemsets
Frequent itemsets → association rules:
one more step is needed to generate
association rules.
For each frequent itemset X,
for each proper nonempty subset A of X:
let B = X − A;
A => B is an association rule if
confidence(A => B) ≥ minconf, where
support(A => B) = support(A ∪ B) = support(X), and
confidence(A => B) = support(A ∪ B) / support(A).
Generating rules: an example
Suppose {2,3,4} is frequent, with sup=50%
Proper nonempty subsets: {2,3}, {2,4}, {3,4}, {2}, {3}, {4}, with
sup=50%, 50%, 75%, 75%, 75%, 75% respectively
These generate the following association rules:
{2,3} => {4}, confidence = 100%
{2,4} => {3}, confidence = 100%
{3,4} => {2}, confidence = 67%
{2} => {3,4}, confidence = 67%
{3} => {2,4}, confidence = 67%
{4} => {2,3}, confidence = 67%
All rules have support = 50%
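A Python sketch of this rule-generation step (illustrative; the support values below are the percentages from this example, supplied as a lookup table since they were already recorded during itemset generation):

from itertools import combinations

def gen_rules(itemset, sup, minconf):
    # Generate all rules A => B (with B = X - A) from one frequent itemset X.
    rules = []
    X = frozenset(itemset)
    for r in range(1, len(X)):                    # proper nonempty subsets A
        for A in map(frozenset, combinations(X, r)):
            conf = sup[X] / sup[A]                # conf(A => B) = sup(X) / sup(A)
            if conf >= minconf:
                rules.append((set(A), set(X - A), conf))
    return rules

sup = {frozenset({2, 3, 4}): 0.50,
       frozenset({2, 3}): 0.50, frozenset({2, 4}): 0.50, frozenset({3, 4}): 0.75,
       frozenset({2}): 0.75, frozenset({3}): 0.75, frozenset({4}): 0.75}

for A, B, conf in gen_rules({2, 3, 4}, sup, minconf=0.6):
    print(A, "=>", B, f"confidence={conf:.0%}")
# Prints the six rules above: two at 100%, four at 67%.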
Generating rules: summary
To recap, in order to obtain A => B, we need
to have support(A ∪ B) and support(A).
All the required information for confidence
computation has already been recorded in
itemset generation. No need to see the data
T any more.
This step is not as time-consuming as
frequent itemsets generation.
On Apriori Algorithm
Seems to be very expensive
Level-wise search
K = the size of the largest itemset
It makes at most K passes over data
In practice, K is bounded (around 10).
The algorithm is very fast. Under some conditions,
all rules can be found in linear time.
Scales up to large data sets.
More on association rule mining
Clearly the space of all association rules is
exponential.
The mining exploits sparseness of data, and
high minimum support and high minimum
confidence values.
Still, it always produces a huge number of
rules, thousands, tens of thousands, millions,
...
Advantage of the Apriori Algorithm
(Summary)
The Apriori Algorithm is complete: it finds every
frequent itemset (and hence every rule) above the thresholds.
Limitations of the Apriori Algorithm
(Summary)
Needs several iterations over the data.
Candidate generation can be
extremely slow (pairs, triplets, etc.).
Uses a uniform minimum support
threshold.
Difficulty finding rarely occurring events.
Alternative methods (other than Apriori)
can address this by using a non-uniform
minimum support threshold.
Contd…
The counting method iterates through all of
the transactions each time.
Constant items (items present in nearly every
transaction) make the algorithm a lot heavier.
Huge memory consumption
End