M04 Matrix Factorization
Matrix Factorization
ZK Abdurahman Baizal
Collaborative Filtering Techniques
Introduction
• While user-based and item-based collaborative filtering
methods are simple and intuitive, matrix factorization
techniques are usually more effective because they allow us to
discover the latent features underlying the interactions
between users and items.
Number Factorization
• Assume you have a number, say 𝑧.
• Factorizing 𝑧 means:
– select two other numbers (e.g. 𝑥 and 𝑦) such that multiplying
them gives exactly 𝑧
• So the factorization can be written as
𝑧 = 𝑥 ∗ 𝑦
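As a small illustration of 𝑧 = 𝑥 ∗ 𝑦, the Python sketch below lists the pairs of positive integers whose product is 𝑧. The function name `factor_pairs` and the restriction to positive integers are assumptions made for this example only.

```python
def factor_pairs(z):
    """Return all pairs (x, y) of positive integers with x <= y and x * y == z."""
    pairs = []
    x = 1
    while x * x <= z:
        if z % x == 0:          # x divides z, so y = z // x completes the pair
            pairs.append((x, z // x))
        x += 1
    return pairs


print(factor_pairs(12))  # [(1, 12), (2, 6), (3, 4)]
```

Note that a number usually has more than one factorization; the same holds for matrices.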
Matrix Factorization (MF)
• Assume you have a matrix, say 𝑍.
• Analogous to number factorization, the matrix factorization (MF)
task can be defined as
– select two other matrices (e.g. 𝑋 and 𝑌), and
– make sure that multiplying them gives a matrix approximately
equal to 𝑍 (we denote this product as 𝑍̅).
• So the matrix factorization can be written as
𝑍 ≈ 𝑍̅ = 𝑋 ∗ 𝑌
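The idea can be sketched in plain Python: multiply two small matrices 𝑋 and 𝑌 and compare the product with 𝑍. The helper `matmul` and all the numbers below are hypothetical, hand-picked values for illustration; they are not taken from the slides.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    inner = len(Y)
    assert all(len(row) == inner for row in X), "inner dimensions must agree"
    return [[sum(X[i][t] * Y[t][j] for t in range(inner))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


# Hypothetical 3 x 2 matrix to be factorized.
Z = [[2.0, 4.0],
     [1.0, 2.1],
     [3.0, 6.0]]

X = [[2.0], [1.0], [3.0]]   # 3 x 1
Y = [[1.0, 2.0]]            # 1 x 2

Z_bar = matmul(X, Y)        # 3 x 2 product, the approximation of Z

# Largest absolute difference between Z and its approximation.
max_error = max(abs(Z[i][j] - Z_bar[i][j])
                for i in range(len(Z)) for j in range(len(Z[0])))
print(Z_bar)
print(max_error)
```

Here the product reproduces every entry of 𝑍 except one, which is off by about 0.1. This is why the relation is written with ≈ rather than =.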
MF-Basic Idea
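Given a group of users (U1 to U5) and a set of items (D1 to D4), with
the ratings assigned so far ("-" marks an item the user has not rated):
      D1  D2  D3  D4
U1    5   3   -   1
U2    4   -   -   1
U3    1   1   -   5
U4    1   -   -   4
U5    0   1   5   4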
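One possible way to hold this table in Python is a list of rows, with `None` standing in for each "-" entry. Reading "-" as "not yet rated" is an assumption, and the variable names are illustrative. The 0 for U5/D1 is kept exactly as it appears in the table.

```python
users = ["U1", "U2", "U3", "U4", "U5"]
items = ["D1", "D2", "D3", "D4"]

# Rows are users, columns are items; None marks a "-" entry in the table.
R = [
    [5, 3,    None, 1],
    [4, None, None, 1],
    [1, 1,    None, 5],
    [1, None, None, 4],
    [0, 1,    5,    4],
]

# The (user, item) pairs without a rating are the ones we would like to predict.
missing = [(users[u], items[d])
           for u in range(len(users))
           for d in range(len(items))
           if R[u][d] is None]
observed = len(users) * len(items) - len(missing)

print(missing)
print(observed)
```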
• The intuition behind using matrix factorization to solve this
problem is that there should be some latent features that
determine how a user rates an item.
• For example, two users would give high ratings to a certain movie if they both
like the actors/actresses of the movie, or if the movie is an action movie,
which is a genre preferred by both users.
• If we can discover these latent features, we should be able to
predict the rating of a certain user for a certain item, because
the features associated with the user should match the features
associated with the item.
• We have a set 𝑈 of users and a set 𝐷 of items. Let 𝑅, of size
|𝑈| × |𝐷|, be the matrix that contains all the ratings that the
users have assigned to the items.
• Also, we assume that we would like to discover 𝑘 latent
features. Our task, then, is to find two matrices 𝑃 (of size 𝑘 × |𝑈|)
and 𝑄 (of size 𝑘 × |𝐷|) such that their product approximates 𝑅:
𝑅 ≈ 𝑃ᵀ × 𝑄 = 𝑅̂
Each column of 𝑃 (that is, each row of 𝑃ᵀ) represents the strength of
the associations between a user and the features.
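To make the shapes concrete, here is a minimal sketch with 𝑘 = 2 features, 3 users and 2 items. With 𝑃 of size 𝑘 × |𝑈| and 𝑄 of size 𝑘 × |𝐷|, the features of user u sit in column u of 𝑃 and those of item d in column d of 𝑄. The predicted rating is the sum over features of their products. The values of `P` and `Q` are hand-picked for illustration, not learned, and the function name `predict` is hypothetical.

```python
# k = 2 latent features, 3 users, 2 items (all values are illustrative).
P = [[1.0, 0.0, 0.5],   # feature 0: strength for users 0, 1, 2
     [0.0, 1.0, 0.5]]   # feature 1: strength for users 0, 1, 2
Q = [[4.0, 1.0],        # feature 0: strength for items 0, 1
     [1.0, 5.0]]        # feature 1: strength for items 0, 1


def predict(P, Q, u, d):
    """Entry (u, d) of P^T x Q: dot product of column u of P and column d of Q."""
    return sum(P[f][u] * Q[f][d] for f in range(len(P)))


n_users = len(P[0])
n_items = len(Q[0])

# R_hat has size |U| x |D|, the same shape as the rating matrix R.
R_hat = [[predict(P, Q, u, d) for d in range(n_items)]
         for u in range(n_users)]
print(R_hat)  # [[4.0, 1.0], [1.0, 5.0], [2.5, 3.0]]
```

User 2, who weights both features equally, gets predictions that lie between those of users 0 and 1.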