Module 3 Collaborative Filtering
Collaborative filtering is a technique for predicting a user's tastes and finding
the items that the user might prefer, based on information collected from other
users with similar tastes or preferences. It rests on the assumption that if
person X and person Y have reacted similarly to some items, they are likely to
hold the same opinion of other items too.
Collaborative Filtering based Recommendation Systems
Recommendation systems are an important topic in machine learning. Two different
techniques are used in recommendation systems to filter options: collaborative
filtering and content-based filtering.
Recommendation systems predict the preferences or ratings that users would give to
items. They are widely used for movies, news, advertisements, music, etc.
Well-known examples of recommendation systems are YouTube, IMDb, Amazon, Flipkart,
etc.
Many recommendation systems use collaborative filtering to find similar patterns or
information about the users. This technique picks out items that a user is likely to
enjoy on the basis of the ratings or reactions of similar users.
An example of collaborative filtering is predicting the rating a particular user would
give a movie, based on that user's ratings for other movies and other users' ratings
for all movies. This concept is widely used in recommending movies, news, applications,
and many other items.
Types of Collaborative Filtering
There are two types of Collaborative Filtering:
User-User-based Similarity/Collaborative Filtering
Item-Item-based Similarity/Collaborative Filtering
Of the two, item-item-based collaborative filtering is the more popular.
User-User-Based Collaborative Filtering
User-user collaborative filtering is a recommendation method that looks for
similar users based on the items they have already liked or rated positively.
Here, we look for users who have rated various items in the same way and
then estimate the missing rating with the help of these users' ratings.
Compute the similarity between two users using the cosine similarity formula.
Cosine similarity measures the similarity between two vectors of an inner product
space as the cosine of the angle between them. For two rating vectors A and B:
similarity(A, B) = cos(theta) = (A . B) / (||A|| * ||B||)
Source: Wikipedia
Sim_ij = similarity(user_i, user_j)
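The cosine similarity between two users can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name cosine_similarity, the two rating vectors, and the choice to return 0.0 for an all-zero vector are assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_a * norm_b)

# Hypothetical ratings given by two users to the same four items
user_i = [5, 3, 4, 4]
user_j = [3, 1, 2, 3]

sim_ij = cosine_similarity(user_i, user_j)
print(f"Sim_ij = {sim_ij:.3f}")
```

A value close to 1 means the two users rate items in nearly the same proportions, while a value close to 0 means their ratings have little in common.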
Item-Item Based Collaborative Filtering
The idea is simple and quite similar to user-user similarity. Item-item
similarity addresses a problem that occurs with user-user similarity: a user's
interests change over time.
Here, we explore the relationship between pairs of items (users who bought
Y also bought Z). We find the missing rating with the help of the ratings the
user has given to other items.
Sim_ij = similarity(item_i, item_j)
Item-to-item collaborative filtering was first invented and used by Amazon in 1998.
Rather than matching the user to similar customers, it matches each of the
user's purchased and rated items to similar items, then combines those similar
items into a recommendation list. Now, let us discuss how it works.
Item to Item Similarity: The first step is to build the model by finding the
similarity between all the item pairs. The similarity between item pairs can be
found in different ways. One of the most common methods is to use cosine
similarity.
Prediction Computation: The second stage produces the recommendation itself.
To predict a missing rating, it takes the items the user has already rated that
are most similar to the missing item, and computes a weighted sum of the user's
ratings for those items, using the item similarities as weights.
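As a worked equation, the weighted sum described above is commonly written in the following form. The notation is an assumption introduced here, not taken from the text: \hat{r}_{u,i} is the predicted rating of user u for item i, N_u(i) is the set of items similar to i that user u has already rated, r_{u,j} is the user's known rating for item j, and sim(i, j) is the item-item similarity.

```latex
\hat{r}_{u,i} =
  \frac{\sum_{j \in N_u(i)} \mathrm{sim}(i, j)\, r_{u,j}}
       {\sum_{j \in N_u(i)} \left| \mathrm{sim}(i, j) \right|}
```

Dividing by the sum of the similarities keeps the prediction on the same scale as the ratings themselves.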
Example:
Consider the following example. The table below lists some items and the users
who have rated them. The ratings are explicit and on a scale of 1 to 5. Each
entry in the table is the rating given by the i-th user to the j-th item. In
most cases the majority of cells are empty, since a user rates only a few items.
Here, there are 4 users and 3 items. We need to find the missing rating for
each user.
User/Item   Item_1   Item_2   Item_3
User_1      2        –        3
User_2      5        2        –
User_3      3        3        1
User_4      –        2        2
Step 1: Finding similarities of all the item pairs.
Form the item pairs. In this example the item pairs are (Item_1, Item_2),
(Item_1, Item_3), and (Item_2, Item_3). Take each pair in turn and find all the
users who have rated both items in the pair. Form a vector for each item from
those users' ratings and calculate the similarity between the two items using
the cosine formula stated above.
Sim(Item1, Item2)
In the table, we can see that only User_2 and User_3 have rated both Item_1 and Item_2.
Thus, let I1 be the vector for Item_1 and I2 the vector for Item_2, with U2 and U3
denoting the axes for User_2 and User_3. Then,
I1 = 5 U2 + 3 U3 and,
I2 = 2 U2 + 3 U3
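Item-item similarity matrix (upper triangle; each item has similarity 1 with itself):
Sim(I1, I2) = (5*2 + 3*3) / (sqrt(5^2 + 3^2) * sqrt(2^2 + 3^2))
Sim(I1, I3) = (2*3 + 3*1) / (sqrt(2^2 + 3^2) * sqrt(3^2 + 1^2))
Sim(I2, I3) = (3*1 + 2*2) / (sqrt(3^2 + 2^2) * sqrt(1^2 + 2^2))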
Sim(Item2, Item3)
In the table, we can see that only User_3 and User_4 have rated both Item_2 and Item_3.
Thus, let I2 be the vector for Item_2 and I3 the vector for Item_3. Then,
I2 = 3 U3 + 2 U4 and,
I3 = 1 U3 + 2 U4
Sim(Item1, Item3)
In the table, we can see that only User_1 and User_3 have rated both Item_1 and Item_3.
Thus, let I1 be the vector for Item_1 and I3 the vector for Item_3. Then,
I1 = 2 U1 + 3 U3 and,
I3 = 3 U1 + 1 U3
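Step 1 can be checked with a short Python sketch. It assumes the ratings table from the example, stored as a dictionary with None for missing ratings. The names ratings, items, and item_similarity are illustrative choices, and the cosine similarity is taken only over users who rated both items, as described above.

```python
import math
from itertools import combinations

# Ratings from the example table; None marks a missing rating.
ratings = {
    "User_1": {"Item_1": 2, "Item_2": None, "Item_3": 3},
    "User_2": {"Item_1": 5, "Item_2": 2, "Item_3": None},
    "User_3": {"Item_1": 3, "Item_2": 3, "Item_3": 1},
    "User_4": {"Item_1": None, "Item_2": 2, "Item_3": 2},
}
items = ["Item_1", "Item_2", "Item_3"]

def item_similarity(item_a, item_b):
    """Cosine similarity over the users who rated both items."""
    pairs = [
        (row[item_a], row[item_b])
        for row in ratings.values()
        if row[item_a] is not None and row[item_b] is not None
    ]
    dot = sum(a * b for a, b in pairs)
    norm_a = math.sqrt(sum(a * a for a, _ in pairs))
    norm_b = math.sqrt(sum(b * b for _, b in pairs))
    # Assumption: items with no common raters get similarity 0.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

similarities = {
    (a, b): item_similarity(a, b) for a, b in combinations(items, 2)
}
for (a, b), value in similarities.items():
    print(f"Sim({a}, {b}) = {value:.2f}")
# Sim(Item_1, Item_2) = 0.90
# Sim(Item_1, Item_3) = 0.79
# Sim(Item_2, Item_3) = 0.87
```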
Step 2: Generating the missing ratings in the table
In this step, we calculate the ratings that are missing from the table.
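Step 2 can be sketched as follows, assuming the weighted-sum prediction described under Prediction Computation: each missing rating is the similarity-weighted average of the ratings the same user gave to the other items. The similarity values are the Step 1 expressions evaluated directly. The names sim, predict, and predictions are illustrative, and returning None when no similar rated item exists is an assumption.

```python
import math

# Ratings from the example table; None marks a missing rating.
ratings = {
    "User_1": {"Item_1": 2, "Item_2": None, "Item_3": 3},
    "User_2": {"Item_1": 5, "Item_2": 2, "Item_3": None},
    "User_3": {"Item_1": 3, "Item_2": 3, "Item_3": 1},
    "User_4": {"Item_1": None, "Item_2": 2, "Item_3": 2},
}

# Item-item cosine similarities from Step 1 (symmetric, so keyed by unordered pair).
sim = {
    frozenset({"Item_1", "Item_2"}): 19 / math.sqrt(34 * 13),
    frozenset({"Item_1", "Item_3"}): 9 / math.sqrt(13 * 10),
    frozenset({"Item_2", "Item_3"}): 7 / math.sqrt(13 * 5),
}

def predict(user, target):
    """Weighted sum of the user's known ratings, weighted by similarity to target."""
    known = {
        item: rating
        for item, rating in ratings[user].items()
        if item != target and rating is not None
    }
    total_weight = sum(sim[frozenset({target, item})] for item in known)
    if total_weight == 0:
        return None  # assumption: no prediction without a similar rated item
    weighted = sum(sim[frozenset({target, item})] * r for item, r in known.items())
    return weighted / total_weight

predictions = {
    (user, item): predict(user, item)
    for user, row in ratings.items()
    for item, rating in row.items()
    if rating is None
}
for (user, item), value in predictions.items():
    print(f"{user}, {item}: {value:.2f}")
# User_1, Item_2: 2.49
# User_2, Item_3: 3.43
# User_4, Item_1: 2.00
```

User_4's predicted rating for Item_1 is exactly 2 because both of that user's known ratings are 2, so any weighted average of them is also 2.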
References:
https://www.geeksforgeeks.org/item-to-item-based-collaborative-filtering/