1 - Assignment - Association Rule Mining

This document provides instructions for a group assignment to simulate the FP-Growth algorithm in RapidMiner. It describes generating transaction data, transforming the data into customer-oriented and item-oriented formats, and applying FP-Growth and association rule generation with specified parameters. Students are asked to perform the tasks, document their process and solutions, and answer 5 questions about the frequent patterns, itemsets, and rules generated.

Uploaded by

hadi
Copyright
© All Rights Reserved

Simulating FP-Growth with the RapidMiner tool.

Deadline: Tuesday February 28, 2023.

Instructions: This is a group assignment; each group may have at most 2 students. Upload
your solution to the LMS as a report (Word/PDF) that details the operations performed for
each task, explains the solution you have come up with, and includes the corresponding screenshots.

We want to generate a dataset that contains 100 transactions for 8 customers across 10 items.
For this we can use the Generate Transaction Data operator with the aforementioned parameters,
together with a cluster value of 10 and a local random seed of 1992.
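
To make the shape of such a dataset concrete outside RapidMiner, here is a minimal Python sketch. It does not reproduce the Generate Transaction Data operator: the column layout (customer, item, amount), the uniform sampling, and the function name are assumptions, and the cluster parameter is not modeled. It only shows how a fixed local seed makes the generated rows reproducible.

```python
import random

def generate_transactions(n_transactions=100, n_customers=8, n_items=10, seed=1992):
    """Return a list of (customer, item, amount) rows.

    Hypothetical stand-in for the operator's output: the columns and the
    uniform sampling are assumptions, and clusters are not modeled.
    """
    rng = random.Random(seed)  # a local seed keeps the data reproducible
    rows = []
    for _ in range(n_transactions):
        customer = f"Customer {rng.randint(1, n_customers)}"
        item = f"Item {rng.randint(1, n_items)}"
        amount = rng.randint(1, 5)  # assumed quantity range
        rows.append((customer, item, amount))
    return rows

rows = generate_transactions()
print(len(rows))  # 100
```

Calling the function twice with the same seed returns identical rows, which is what the local random seed is for.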

Task 1. The generated data may not be suitable for the FP-Growth operator, so we need to perform some
data transformation operations such as Aggregate, Rename, Set Role, and One Column.

We will get the dataset transformed as follows:

Figure 1. Customer-oriented data transformation.
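
The idea behind the customer-oriented transformation can be sketched in Python as follows. This is an illustration under assumptions, not the RapidMiner process itself: the input is taken to be (customer, item) pairs, and the helper name is hypothetical. Each customer ends up with a single text value that lists all of that customer's items, joined by a separator.

```python
from collections import defaultdict

def items_in_one_column(rows, separator="|"):
    """Group (customer, item) rows into one separator-joined item list per customer."""
    baskets = defaultdict(set)
    for customer, item in rows:
        baskets[customer].add(item)  # a set drops repeated purchases of the same item
    return {c: separator.join(sorted(items)) for c, items in sorted(baskets.items())}

rows = [("Customer 1", "Item 2"), ("Customer 1", "Item 5"),
        ("Customer 2", "Item 2"), ("Customer 1", "Item 2")]
table = items_in_one_column(rows)
print(table)
# {'Customer 1': 'Item 2|Item 5', 'Customer 2': 'Item 2'}
```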

Task 2. We can also perform another transformation that may include operations such as Pivot Items,
Rename by Replacing, Replace Missing Values, Numeric to Binominal and Item Type operations.
Afterwards, we will get the original dataset transformed as follows:

Figure 2. Item-oriented data transformation.
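
The item-oriented layout can be illustrated with a short Python sketch. It assumes the input is a list of (customer, item) pairs and uses a hypothetical helper name; it mirrors the idea of pivoting items into columns and turning missing entries into a false value, not the exact behavior of the RapidMiner operators.

```python
def pivot_items(rows):
    """Return (item_columns, table) where table maps each customer to a
    list of booleans, one per item column (True = customer bought the item)."""
    customers = sorted({customer for customer, _ in rows})
    items = sorted({item for _, item in rows})
    bought = set(rows)
    # A missing (customer, item) pair becomes False, like a replaced missing value.
    table = {c: [(c, i) in bought for i in items] for c in customers}
    return items, table

rows = [("Customer 1", "Item 2"), ("Customer 1", "Item 5"), ("Customer 2", "Item 2")]
columns, table = pivot_items(rows)
print(columns)  # ['Item 2', 'Item 5']
print(table)    # {'Customer 1': [True, True], 'Customer 2': [True, False]}
```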

Now we can use the FP-Growth operator with the following parameters for Task 1:

• Input format: item list in a column
• Item separators: |
• Escape character: \
• Trim item names: checked
• Min support: 0.8
• Min items per itemset: 1
• Max items per itemset: 0
• Max number of itemsets: 1000000
• Find min number of itemsets: unchecked
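
To show what the separator, escape character, and trim settings mean for an item list stored in one column, here is a small Python sketch. The parsing rules are an assumption about how such settings are commonly interpreted (an escaped separator is kept as a literal character), and the function name is hypothetical; RapidMiner's own parsing may differ in detail.

```python
def split_items(cell, separator="|", escape="\\", trim=True):
    """Split one cell into item names, honoring an escape character."""
    items, current, i = [], [], 0
    while i < len(cell):
        ch = cell[i]
        if ch == escape and i + 1 < len(cell):
            current.append(cell[i + 1])  # keep the escaped character literally
            i += 2
            continue
        if ch == separator:
            items.append("".join(current))  # separator closes the current item
            current = []
        else:
            current.append(ch)
        i += 1
    items.append("".join(current))
    if trim:
        items = [item.strip() for item in items]
    return items

print(split_items("Item 1 | Item 2|Item\\|3"))
# ['Item 1', 'Item 2', 'Item|3']
```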

For Task 2, FP-Growth will take the following parameters:

• Input format: items in dummy columns
• Positive value: empty
• Min requirement: support
• Min support: 0.7
• Min items per itemset: 1
• Max items per itemset: 0
• Max number of itemsets: 1000000
• Find min number of itemsets: unchecked
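
The role of the support threshold and the itemset size limits can be illustrated with a brute-force Python sketch. This is deliberately not FP-Growth (no FP-tree is built); it enumerates candidate itemsets directly, which is only practical for a handful of items. The function name and the toy baskets are made up for illustration, and treating a maximum size of 0 as "no upper limit" is an assumption about the parameter.

```python
from itertools import combinations

def frequent_itemsets(baskets, min_support, min_items=1, max_items=0):
    """Return {itemset tuple: support} for all itemsets meeting min_support."""
    n = len(baskets)
    all_items = sorted(set().union(*baskets))
    limit = max_items if max_items > 0 else len(all_items)  # 0 = no upper limit (assumed)
    result = {}
    for size in range(min_items, limit + 1):
        found = False
        for candidate in combinations(all_items, size):
            # support = fraction of baskets that contain every item of the candidate
            support = sum(1 for b in baskets if set(candidate) <= b) / n
            if support >= min_support:
                result[candidate] = support
                found = True
        if not found:
            break  # no frequent itemset of this size, so none larger can exist
    return result

baskets = [{"Item 1", "Item 2", "Item 3"}, {"Item 1", "Item 2"}, {"Item 1", "Item 3"},
           {"Item 1", "Item 2", "Item 3"}, {"Item 2", "Item 3"}]
patterns = frequent_itemsets(baskets, min_support=0.7)
print(patterns)
# {('Item 1',): 0.8, ('Item 2',): 0.8, ('Item 3',): 0.8}
```

In this toy data every pair of items appears in only 3 of 5 baskets (support 0.6), so no pair passes a threshold of 0.7. Lowering the threshold to 0.6 adds the three pairs.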

Task 3. Add the Create Association Rules operator with the following parameters:

• Criterion: confidence
• Min confidence: 0.8
• Gain theta: 0
• Laplace k: 1.0
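
The confidence criterion can be sketched in Python as follows. The confidence of a rule premise -> conclusion is the support of the whole itemset divided by the support of the premise. The support values below are made up for illustration, the function name is hypothetical, and the gain theta and Laplace k parameters are not modeled because they only apply to other criteria.

```python
from itertools import combinations

def association_rules(supports, min_confidence):
    """Build (premise, conclusion, confidence) rules from frequent itemsets.

    `supports` maps frozenset itemsets to their support and must contain
    every subset of each itemset (true for a complete frequent itemset list).
    """
    rules = []
    for itemset, support in supports.items():
        for size in range(1, len(itemset)):
            for premise in combinations(sorted(itemset), size):
                premise = frozenset(premise)
                confidence = support / supports[premise]
                if confidence >= min_confidence:
                    rules.append((premise, itemset - premise, confidence))
    return rules

# Made-up supports, for illustration only.
supports = {
    frozenset({"Item 1"}): 1.0,
    frozenset({"Item 4"}): 0.8,
    frozenset({"Item 1", "Item 4"}): 0.8,
}
rules = association_rules(supports, min_confidence=0.8)
print(len(rules))  # 2
print(sum(1 for _, conclusion, _ in rules if conclusion == {"Item 4"}))  # 1
print(sum(1 for _, _, confidence in rules if confidence == 1.0))         # 1
```

The last two lines show how rules can be counted by conclusion or by confidence once they are available as a list.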

We may get an overall process pipeline as follows:

Figure 3. Overall process flow

Note: The “Items in one Column” and “Item Type Columns” contain subprocess operations for data
transformation.

Answer the following questions:

1. How many frequent patterns are generated?
2. What is the maximum size of a frequent itemset, and which items does it contain?
3. How many association rules are generated?
4. How many rules are generated with Item 4 as the consequent (conclusion)?
5. How many rules are generated with confidence = 1?
