The ECLAT (Equivalence Class Clustering and bottom-up Lattice Traversal) algorithm is a widely used technique for mining frequent itemsets, the first stage of association rule mining. It is often a faster alternative to the Apriori algorithm. Unlike Apriori, which works on a horizontal database layout and searches breadth-first (BFS), ECLAT uses a vertical database representation and searches depth-first (DFS).
This vertical approach reduces the number of database scans: once the tidsets are built, support is computed by intersecting them instead of rescanning the transactions. This usually makes ECLAT faster than Apriori on large datasets, although the tidsets themselves must fit in memory.
Key Differences between ECLAT and Apriori
- Apriori Algorithm: Uses a horizontal database layout and follows BFS, requiring multiple database scans.
- ECLAT Algorithm: Uses a vertical database layout and follows DFS, reducing the number of database scans.
For example, Apriori first identifies the frequent single items and then expands them level by level into larger itemsets, rescanning the database at each level to count support. ECLAT avoids these repeated scans by storing, for each item, the set of IDs of the transactions that contain it (its tidset). The support of an itemset is then the size of the intersection of its items' tidsets.
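The difference between the two layouts can be illustrated with a minimal sketch. It assumes the nine transactions implied by the tidsets in the worked example later in this article; the function names `support_horizontal` and `support_vertical` are hypothetical and chosen only for illustration.

```python
# Horizontal layout: one entry per transaction (assumed from the worked example).
transactions = {
    "T1": {"Bread", "Butter", "Jam"},
    "T2": {"Butter", "Coke"},
    "T3": {"Butter", "Milk"},
    "T4": {"Bread", "Butter", "Coke"},
    "T5": {"Bread", "Milk"},
    "T6": {"Butter", "Milk"},
    "T7": {"Bread", "Milk"},
    "T8": {"Bread", "Butter", "Milk", "Jam"},
    "T9": {"Bread", "Butter", "Milk"},
}

# Vertical layout: one tidset per item (same data, stored by item).
tidsets = {
    "Bread": {"T1", "T4", "T5", "T7", "T8", "T9"},
    "Butter": {"T1", "T2", "T3", "T4", "T6", "T8", "T9"},
    "Milk": {"T3", "T5", "T6", "T7", "T8", "T9"},
    "Coke": {"T2", "T4"},
    "Jam": {"T1", "T8"},
}


def support_horizontal(itemset, transactions):
    """Apriori-style counting: scan every transaction for the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def support_vertical(itemset, tidsets):
    """ECLAT-style counting: intersect the tidsets of the items."""
    return len(set.intersection(*(tidsets[item] for item in itemset)))


pair = {"Bread", "Butter"}
print(support_horizontal(pair, transactions))  # scans all 9 transactions
print(support_vertical(pair, tidsets))         # intersects 2 tidsets
```

Both functions return the same support. The horizontal version touches every transaction on each call, while the vertical version touches only the tidsets of the items involved.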
How ECLAT Algorithm Works
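Let's walk through an example to better understand how the ECLAT algorithm works. Consider the following transaction dataset represented as a Boolean matrix, where 1 means the item appears in the transaction:
| Transaction | Bread | Butter | Milk | Coke | Jam |
|---|---|---|---|---|---|
| T1 | 1 | 1 | 0 | 0 | 1 |
| T2 | 0 | 1 | 0 | 1 | 0 |
| T3 | 0 | 1 | 1 | 0 | 0 |
| T4 | 1 | 1 | 0 | 1 | 0 |
| T5 | 1 | 0 | 1 | 0 | 0 |
| T6 | 0 | 1 | 1 | 0 | 0 |
| T7 | 1 | 0 | 1 | 0 | 0 |
| T8 | 1 | 1 | 1 | 0 | 1 |
| T9 | 1 | 1 | 1 | 0 | 0 |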
The core idea of the ECLAT algorithm is to compute the support of an itemset by intersecting the tidsets of its items, which avoids generating candidate itemsets that never occur in the dataset. Here's a breakdown of the steps:
Step 1: Create the Tidset
The first step is to generate the tidset for each individual item. A tidset is simply a list of transaction IDs where the item appears. For example:
k = 1, minimum support = 2
| Item | Tidset |
|---|---|
| Bread | {T1, T4, T5, T7, T8, T9} |
| Butter | {T1, T2, T3, T4, T6, T8, T9} |
| Milk | {T3, T5, T6, T7, T8, T9} |
| Coke | {T2, T4} |
| Jam | {T1, T8} |
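Step 1 can be sketched in a few lines of Python. The Boolean matrix below is an assumption: its rows are derived from the tidsets listed above, and `build_tidsets` is a hypothetical helper name.

```python
items = ["Bread", "Butter", "Milk", "Coke", "Jam"]

# Boolean matrix: one row per transaction, one column per item (assumed data).
matrix = {
    "T1": [1, 1, 0, 0, 1],
    "T2": [0, 1, 0, 1, 0],
    "T3": [0, 1, 1, 0, 0],
    "T4": [1, 1, 0, 1, 0],
    "T5": [1, 0, 1, 0, 0],
    "T6": [0, 1, 1, 0, 0],
    "T7": [1, 0, 1, 0, 0],
    "T8": [1, 1, 1, 0, 1],
    "T9": [1, 1, 1, 0, 0],
}


def build_tidsets(items, matrix):
    """Convert the horizontal Boolean matrix into one tidset per item."""
    tidsets = {item: set() for item in items}
    for tid, row in matrix.items():
        for item, present in zip(items, row):
            if present:
                tidsets[item].add(tid)
    return tidsets


tidsets = build_tidsets(items, matrix)
for item in items:
    # The support of a single item is the size of its tidset.
    print(item, sorted(tidsets[item]), "support =", len(tidsets[item]))
```

This is the only pass over the original transactions. Every later step works on the tidsets alone.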
Step 2: Calculate the Support of Itemsets by Intersecting Tidsets
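ECLAT then proceeds by recursively combining the tidsets. The support of an itemset is the number of transaction IDs in the intersection of its items' tidsets. Pairs whose intersection is empty ({Milk, Coke} and {Coke, Jam}) are omitted below. {Bread, Coke} and {Milk, Jam} each occur in only one transaction, so they fall below the minimum support and are not extended further. For example: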
k = 2
| Itemset | Tidset |
|---|---|
| {Bread, Butter} | {T1, T4, T8, T9} |
| {Bread, Milk} | {T5, T7, T8, T9} |
| {Bread, Coke} | {T4} |
| {Bread, Jam} | {T1, T8} |
| {Butter, Milk} | {T3, T6, T8, T9} |
| {Butter, Coke} | {T2, T4} |
| {Butter, Jam} | {T1, T8} |
| {Milk, Jam} | {T8} |
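The k = 2 table can be reproduced with a short sketch that intersects every pair of tidsets. The variable names are illustrative, and the tidsets are copied from Step 1.

```python
from itertools import combinations

tidsets = {
    "Bread": {"T1", "T4", "T5", "T7", "T8", "T9"},
    "Butter": {"T1", "T2", "T3", "T4", "T6", "T8", "T9"},
    "Milk": {"T3", "T5", "T6", "T7", "T8", "T9"},
    "Coke": {"T2", "T4"},
    "Jam": {"T1", "T8"},
}
min_support = 2

pairs = {}
for a, b in combinations(tidsets, 2):
    common = tidsets[a] & tidsets[b]  # set intersection gives the pair's tidset
    if common:  # pairs that never occur together are skipped
        pairs[(a, b)] = common

for pair, tids in pairs.items():
    status = "frequent" if len(tids) >= min_support else "infrequent"
    print(pair, sorted(tids), status)

# Only pairs that meet the minimum support are carried to the next level.
frequent_pairs = {p: t for p, t in pairs.items() if len(t) >= min_support}
```

In this sketch, `frequent_pairs` holds the six pairs with support of at least 2, which are the only ones worth extending to 3-itemsets.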
Step 3: Recursive Call and Generation of Larger Itemsets
The algorithm continues recursively, combining frequent k-itemsets that share a common prefix into (k+1)-itemsets and checking their support by intersecting the tidsets. The recursion continues until no further frequent itemsets can be generated.
k = 3
| Itemset | Tidset |
|---|---|
| {Bread, Butter, Milk} | {T8, T9} |
| {Bread, Butter, Jam} | {T1, T8} |
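The recursion described in this step can be sketched as a depth-first search. This is a minimal illustration rather than a reference implementation: the function name `eclat`, its signature, and the use of Python sets for tidsets are all assumptions.

```python
def eclat(prefix, candidates, min_support, frequent):
    """Depth-first search over itemsets that start with `prefix`.

    candidates: list of (item, tidset) pairs; each tidset is the tidset of
    prefix + (item,) and already meets the minimum support.
    """
    for i, (item, tids) in enumerate(candidates):
        itemset = prefix + (item,)
        frequent[itemset] = tids
        # Extend only with items that come later in the list,
        # so every itemset is generated exactly once.
        extensions = []
        for other, other_tids in candidates[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                extensions.append((other, common))
        if extensions:  # recursion stops when nothing frequent remains
            eclat(itemset, extensions, min_support, frequent)
    return frequent


tidsets = {
    "Bread": {"T1", "T4", "T5", "T7", "T8", "T9"},
    "Butter": {"T1", "T2", "T3", "T4", "T6", "T8", "T9"},
    "Milk": {"T3", "T5", "T6", "T7", "T8", "T9"},
    "Coke": {"T2", "T4"},
    "Jam": {"T1", "T8"},
}
min_support = 2

start = [(item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support]
frequent = eclat((), start, min_support, {})

for itemset, tids in frequent.items():
    print(itemset, sorted(tids))
```

On the example data the largest itemsets found have three items, ("Bread", "Butter", "Milk") and ("Bread", "Butter", "Jam"). The four-item combination is never stored because its tidset contains only T8.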
Step 4: Stop When No More Frequent Itemsets Can Be Found
The algorithm stops once no more itemset combinations meet the minimum support threshold.
k = 4
| Itemset | Tidset |
|---|---|
| {Bread, Butter, Milk, Jam} | {T8} |
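We stop at k = 4 because the only candidate, {Bread, Butter, Milk, Jam}, appears in a single transaction (T8), which is below the minimum support of 2, and no other itemsets remain to combine. Keeping only the itemsets with a support of at least 2, we can read the following recommendations from the dataset: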
| Items Bought | Recommended Products |
|---|---|
| Bread | Butter |
| Bread | Milk |
| Bread | Jam |
| Butter | Milk |
| Butter | Coke |
| Butter | Jam |
| Bread and Butter | Milk |
| Bread and Butter | Jam |
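The article does not spell out how the recommendation table is derived from the frequent itemsets. One reading that reproduces it, shown here as an assumption, treats the last item of each frequent itemset as the recommendation and the remaining items as the purchase. The helper name `recommendations` is hypothetical, and the sketch does not compute rule confidence.

```python
# Frequent itemsets with two or more items, with their support counts,
# taken from the k = 2 and k = 3 tables above.
frequent = {
    ("Bread", "Butter"): 4,
    ("Bread", "Milk"): 4,
    ("Bread", "Jam"): 2,
    ("Butter", "Milk"): 4,
    ("Butter", "Coke"): 2,
    ("Butter", "Jam"): 2,
    ("Bread", "Butter", "Milk"): 2,
    ("Bread", "Butter", "Jam"): 2,
}
min_support = 2


def recommendations(frequent, min_support):
    """Split each frequent itemset into (items bought, recommended product)."""
    rows = []
    for itemset, support in frequent.items():
        if len(itemset) < 2 or support < min_support:
            continue
        *bought, recommended = itemset  # assumption: last item is recommended
        rows.append((" and ".join(bought), recommended))
    return rows


for bought, recommended in recommendations(frequent, min_support):
    print(f"{bought} -> {recommended}")
```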
Advantages of the ECLAT Algorithm
- Efficient in Dense Datasets: Performs better than Apriori in datasets with frequent co-occurrences.
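- Fewer Database Scans: The vertical representation is built once, so the original transactions do not have to be rescanned for each itemset size.
- Fast Itemset Intersection: Computing itemset support via TID-set intersections is faster than scanning transactions repeatedly.
- Better Scalability: Depth-first search only needs the tidsets along the current search path, rather than all candidates of a level at once, which helps on larger datasets.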
Disadvantages of the ECLAT Algorithm
- High Memory Requirement: Large TID sets, and the intermediate TID sets created during recursion, can consume significant memory.
- Not Suitable for Sparse Data: Works better in dense datasets, but performance drops for sparse datasets, where many intersections are small or empty and yield no frequent itemsets.
- Sensitive to Large Transactions: A transaction with many items adds its ID to many TID sets and increases the number of item combinations to intersect, which can make mining expensive.
Applications of ECLAT Algorithm
- Market Basket Analysis: Identifying items that are frequently purchased together.
- Recommendation Systems: Suggesting products based on past purchase patterns.
- Medical Diagnosis: Finding co-occurring symptoms in medical records.
- Web Usage Mining: Analyzing web logs to understand user behavior.
- Fraud Detection: Discovering frequent patterns in fraudulent activities.