MBA 739 – Advanced Analytics
Week 3 – Association Rules Assignment
The data shown in Table 1 below and the output in Table 2 are based on a subset of a dataset on
cosmetic purchases ([Link]) at a large chain drugstore. The store wants to analyze associations
among purchases of these items for purposes of point-of-sale display, guidance to sales personnel in
promoting cross-sales, and guidance for piloting an eventual time-of-purchase electronic
recommender system to boost cross-sales. Consider first only the data shown in Table 1, given in
binary matrix form.
Table 1: Excerpt from data on cosmetics purchases in binary matrix form
Trans. #  Bag  Blush  Nail Polish  Brushes  Concealer  Eyebrow Pencils  Bronzer
1         0    1      1            1        1          0                1
2         0    0      1            0        1          0                1
3         0    1      0            0        1          1                1
4         0    0      1            1        1          0                1
5         0    1      0            0        1          0                1
6         0    0      0            0        1          0                0
7         0    1      1            1        1          0                1
8         0    0      1            1        0          0                1
9         0    0      0            0        1          0                0
10        1    1      1            1        0          0                0
11        0    0      1            0        0          0                1
12        0    0      1            1        1          0                1
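
To make the binary matrix form concrete, the 12 rows of Table 1 can be held as lists of 0/1 flags and summed by column to see how often each item was purchased. The following is a minimal Python sketch rather than the R workflow the assignment asks for; the names `ITEMS`, `TABLE_1`, `item_counts`, and `transactions` are illustrative, and only the excerpt in Table 1 is used, not the complete dataset.

```python
# Item columns of Table 1, in order (the transaction number is omitted).
ITEMS = ["Bag", "Blush", "Nail Polish", "Brushes",
         "Concealer", "Eyebrow Pencils", "Bronzer"]

# The 12 transactions of Table 1; 1 means the item was purchased.
TABLE_1 = [
    [0, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0, 1],
]

# Column sums: number of transactions that contain each item.
item_counts = {
    item: sum(row[i] for row in TABLE_1)
    for i, item in enumerate(ITEMS)
}

# The same data as one set of purchased items per transaction.
transactions = [
    {item for item, flag in zip(ITEMS, row) if flag}
    for row in TABLE_1
]

# Print items from most to least frequent, with count and relative frequency.
for item, count in sorted(item_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item:<16} {count:>2}  {count / len(TABLE_1):.2f}")
```

The relative frequencies printed here describe the 12-row excerpt only, so they need not match an item frequency plot built from the complete dataset.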
Table 2: Association rules for cosmetics purchases data
Questions
a. Consider the results of the association rules analysis shown in Table 2.
i. For the first row, explain the “confidence” output and how it is calculated.
ii. For the first row, explain the “support” output and how it is calculated.
iii. For the first row, explain the “lift” output and how it is calculated.
iv. For the first row, explain in words the rule that is represented there.
b. Now, use the complete dataset on the cosmetics purchases (in the file [Link]). Using
the file in RStudio cloud, apply association rules to these data (use the default parameters).
i. Generate an Item Frequency Plot. Which three cosmetics appear most frequently?
ii. Generate the corpus of rules for the cosmetics dataset where support is greater than
0.12. Sort the list by lift. How many rules are created? What are the antecedents,
consequents, support, confidence, and lift of the first five rows?
iii. If a rule you generated in this dataset had a lift of 6.87, is that a good rule?
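
As a reference for how the three measures in the questions above are usually defined, the sketch below computes support, confidence, and lift for one rule using only the 12 transactions of Table 1. It is written in Python rather than R and is a sketch under stated assumptions: the rule {Brushes} → {Nail Polish} is chosen purely for illustration and is not claimed to be the first row of Table 2, and the helper names `support` and `rule_metrics` are hypothetical. The definitions used are the standard ones: support of a rule is the fraction of transactions containing both sides, confidence is that support divided by the support of the antecedent, and lift is confidence divided by the support of the consequent.

```python
ITEMS = ["Bag", "Blush", "Nail Polish", "Brushes",
         "Concealer", "Eyebrow Pencils", "Bronzer"]

# The 12 transactions of Table 1 in binary matrix form.
TABLE_1 = [
    [0, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0, 1],
]

# One set of purchased items per transaction.
transactions = [
    {item for item, flag in zip(ITEMS, row) if flag}
    for row in TABLE_1
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for antecedent -> consequent.

    Assumes both sides occur at least once; otherwise a division by
    zero is raised.
    """
    rule_support = support(antecedent | consequent, transactions)
    confidence = rule_support / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return rule_support, confidence, lift


# Illustrative rule only; not taken from Table 2.
s, c, l = rule_metrics({"Brushes"}, {"Nail Polish"}, transactions)
print(f"support={s:.3f} confidence={c:.3f} lift={l:.3f}")
```

In this excerpt all 6 transactions with Brushes also contain Nail Polish, and Nail Polish appears in 8 of the 12 transactions, so the rule has support 6/12 = 0.5, confidence 6/6 = 1.0, and lift 1.0 / (8/12) = 1.5. A lift above 1 means the two sides occur together more often than they would if purchases were independent.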