Modul Belajar 01 Rekomendasi Sistem
Uploaded by Rudy Ariyanto

Modul Belajar #01

Date: 22.05.2025

User-Based Collaborative Filtering

Example of User-Based Collaborative Filtering for Product Recommendations

This example demonstrates how to build a user-based collaborative filtering (CF) system for product recommendations using Python. We'll use a small synthetic dataset for clarity.

1. Dataset Example

Consider an e-commerce dataset where users rate products on a 1-5 scale. Here's a sample user-item rating matrix:

User  Product A  Product B  Product C  Product D
U1        5          3          4          1
U2        3          1          2          3
U3        4          3          4          3
U4        3          3          1          5
U5        1          5          5          2

2. Key Steps in User-Based CF

(1) Calculate User Similarity (Cosine Similarity)

We measure how similar users are based on their ratings.


Cosine Similarity Formula:

sim(u, v) = \frac{\sum_i r_{u,i} \cdot r_{v,i}}{\sqrt{\sum_i r_{u,i}^2} \cdot \sqrt{\sum_i r_{v,i}^2}}

Example:
Calculate similarity between U1 and U3:
 Both rated Products A, B, C, D.

 Dot product: (5×4) + (3×3) + (4×4) + (1×3) = 20 + 9 + 16 + 3 = 48

 Magnitude of U1: √(5² + 3² + 4² + 1²) = √(25+9+16+1) = √51 ≈ 7.14

 Magnitude of U3: √(4² + 3² + 4² + 3²) = √(16+9+16+9) = √50 ≈ 7.07

 Cosine similarity = 48 / (7.14 × 7.07) ≈ 0.95 (highly similar)
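The hand computation above can be sanity-checked with a small helper (a minimal sketch using numpy; the `cosine_sim` function name is introduced here for illustration, it is not part of the module's later code):

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine similarity between two rating vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# U1 and U3 from the rating matrix above
print(round(cosine_sim([5, 3, 4, 1], [4, 3, 4, 3]), 2))  # 0.95
```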

(2) Find Nearest Neighbors

For a target user (e.g., U5), compute similarity with every other user and pick the top-2 most similar:

 Sim(U5, U1) = 42 / (√55 × √51) ≈ 0.79

 Sim(U5, U2) = 24 / (√55 × √23) ≈ 0.68

 Sim(U5, U3) = 45 / (√55 × √50) ≈ 0.86

 Sim(U5, U4) = 33 / (√55 × √44) ≈ 0.67

Top-2 neighbors for U5: U3 (0.86), U1 (0.79)
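Neighbor selection can also be done programmatically. A sketch using numpy's argsort, masking the self-similarity entry so a user is never chosen as their own neighbor (variable names here are illustrative):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

ratings = np.array([
    [5, 3, 4, 1],
    [3, 1, 2, 3],
    [4, 3, 4, 3],
    [3, 3, 1, 5],
    [1, 5, 5, 2],
])

user_sim = cosine_similarity(ratings)
target = 4                         # U5
sims = user_sim[target].copy()
sims[target] = -np.inf             # mask self-similarity (always 1.0)
top2 = np.argsort(sims)[::-1][:2]  # indices of the two nearest neighbors
print(top2)
```

Masking with `-np.inf` is slightly more robust than simply skipping the first sorted entry, because it keeps working even if two users happen to have identical rating vectors (similarity exactly 1.0).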

(3) Predict Ratings for Unseen Products

In this toy matrix U5 has rated every product, so for illustration treat U5's rating for Product A as unknown and predict it from U5's two most similar users, U3 (≈ 0.86) and U1 (≈ 0.79), weighting each neighbor's rating by its similarity:

\hat{r}_{u,i} = \frac{\sum_{v} sim(u, v) \cdot r_{v,i}}{\sum_{v} |sim(u, v)|}

 U3's rating for Product A = 4, similarity ≈ 0.86

 U1's rating for Product A = 5, similarity ≈ 0.79

 Prediction = (0.86×4 + 0.79×5) / (0.86 + 0.79) ≈ 4.48

(4) Generate Recommendations

 Score the candidate products the same way:

o Product A: ≈ 4.48

o Product C: (0.86×4 + 0.79×4) / (0.86 + 0.79) = 4.00

 Recommend the highest-predicted products first: Product A (4.48) > Product C (4.00). In a real system, only products the target user has not yet rated would be scored and recommended.
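The recommendation step can be sketched end-to-end: score every product for the target user with the same similarity-weighted average, then rank the products by predicted score (a sketch under the assumptions above; names like `preds` and `ranking` are illustrative):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

ratings = np.array([
    [5, 3, 4, 1],
    [3, 1, 2, 3],
    [4, 3, 4, 3],
    [3, 3, 1, 5],
    [1, 5, 5, 2],
])

user_sim = cosine_similarity(ratings)
target = 4
sims = user_sim[target].copy()
sims[target] = -np.inf
neighbors = np.argsort(sims)[::-1][:2]   # top-2 nearest neighbors

# Similarity-weighted average of the neighbors' ratings, per product
w = user_sim[target, neighbors]
preds = w @ ratings[neighbors] / w.sum()

products = ["A", "B", "C", "D"]
ranking = np.argsort(preds)[::-1]
for i in ranking:
    print(f"Product {products[i]}: {preds[i]:.2f}")
```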

3. Python Implementation

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# User-item ratings matrix (rows: users, columns: products)
ratings = np.array([
    [5, 3, 4, 1],
    [3, 1, 2, 3],
    [4, 3, 4, 3],
    [3, 3, 1, 5],
    [1, 5, 5, 2]
])

# Compute user-user similarity matrix
user_sim = cosine_similarity(ratings)

# Example: find the top-2 users most similar to U5 (index 4),
# skipping the first sorted entry (self-similarity is always 1.0)
target_user = 4
similar_users = np.argsort(user_sim[target_user])[::-1][1:3]

# Predict U5's rating for Product A (index 0) as a
# similarity-weighted average of the neighbors' ratings
numerator = sum(user_sim[target_user, u] * ratings[u, 0] for u in similar_users)
denominator = sum(user_sim[target_user, u] for u in similar_users)
predicted_rating = numerator / denominator
print(f"Predicted rating for U5 on Product A: {predicted_rating:.2f}")

Output:

Predicted rating for U5 on Product A: 4.48

4. Practical Considerations

1. Cold Start Problem: New users/items lack ratings. Solutions:

o Hybrid models (combine CF with content-based filtering).

o Use demographic data for new users.

2. Scalability: User-based CF is computationally expensive for large datasets, since similarities must be computed between every pair of users.

o Switch to item-based CF or matrix factorization (e.g., SVD).

3. Real-World Adjustments:

o Incorporate implicit feedback (clicks, cart additions).

o Apply weighting (e.g., time decay for recent purchases).
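As a sketch of the item-based alternative mentioned above: transposing the rating matrix makes rows correspond to products, so the same cosine routine yields an item-item similarity matrix, which is typically smaller and more stable than the user-user one:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

ratings = np.array([
    [5, 3, 4, 1],
    [3, 1, 2, 3],
    [4, 3, 4, 3],
    [3, 3, 1, 5],
    [1, 5, 5, 2],
])

# Rows of ratings.T are products, so this compares products to products
item_sim = cosine_similarity(ratings.T)
print(item_sim.shape)            # prints (4, 4): one row/column per product
print(round(item_sim[0, 2], 2))  # similarity of Product A and Product C
```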

Conclusion

This example illustrates how user-based collaborative filtering works:

1. Compute user similarities.

2. Find nearest neighbors.

3. Predict ratings for unseen products.

4. Recommend top-scoring items.

For production systems, consider libraries like:

 Surprise (a Python scikit for building and analyzing recommender systems)

 LightFM (hybrid recommendation models)


Appendix:
Python

 Process

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# User-item ratings matrix (rows: users, columns: products)
ratings = np.array([
    [5, 3, 4, 1],
    [3, 1, 2, 3],
    [4, 3, 4, 3],
    [3, 3, 1, 5],
    [1, 5, 5, 2]
])

# Compute user-user similarity matrix
user_sim = cosine_similarity(ratings)

# Example: find the top-2 users most similar to U5 (index 4)
target_user = 4
similar_users = np.argsort(user_sim[target_user])[::-1][1:3]  # exclude self

# Predict U5's rating for Product A (index 0)
numerator = sum(user_sim[target_user, u] * ratings[u, 0] for u in similar_users)
denominator = sum(user_sim[target_user, u] for u in similar_users)
predicted_rating = numerator / denominator
print(f"Predicted rating for U5 on Product A: {predicted_rating:.2f}")

 Output
Predicted rating for U5 on Product A: 4.48
