CS548S15 Showcase Web Mining

This document summarizes Amazon's item-to-item collaborative filtering recommender system. It begins by explaining the need for recommender systems and how Amazon uses recommendations to personalize online shopping experiences. It then describes Amazon's recommendation algorithm which finds similar items instead of similar users, precomputes item similarities offline for efficient online recommendations, and provides high quality recommendations even for new users. Comparisons show item-to-item collaborative filtering provides better scalability and recommendation quality than traditional user-based collaborative filtering and cluster models.

Uploaded by

P.Anand

CS 548 Spring 2015 Web Mining Showcase

By Salah Ahmed, Hai Liu, Shaocheng Wang, Sijing Yang

Showcasing Work by Amazon.com on Recommender Systems
References:
1. Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76-80.
2. Takács, G., Pilászy, I., Németh, B., & Tikk, D. (2009). Scalable collaborative filtering approaches for large recommender systems. Journal of Machine Learning Research, 10, 623-656.
3. Candillier, L., Meyer, F., & Boullé, M. (2007). Comparing state-of-the-art collaborative filtering systems. Proceedings of the 5th International Conference on Machine Learning and Data Mining in Pattern Recognition, Leipzig, Germany, 548-562. doi: 10.1007/978-3-540-73499-4_41
4. Alluhaidan, A. (2013). Recommender system using collaborative filtering algorithm. Technical Library, Paper 155. [Link]
5. Ricci, F. Slides on "Item-to-Item Collaborative Filtering and Matrix Factorization". [Link] [Link]
6. Amatriain, X., & Mobasher, B. (2014). KDD 2014 tutorial: The recommender problem revisited. [Link]

Why Recommender?

“We are leaving the age of information and entering the age of recommendation.”

—— Chris Anderson in “The Long Tail”

The Age of Recommendation

Search: User → Items

Recommend: Items → User


Amazon: A personalized online store


Recommender Problem
A good recommender
• Shows programming titles to a software engineer and baby toys to a new mother.
• Does not recommend items the user already knows or would find anyway.
• Expands the user’s taste without offending or annoying them.

Challenges
• Huge amounts of data: tens of millions of customers and millions of distinct catalog items.
• Results must be returned in real time.
• New customers have limited information.
• Older customers can have a glut of information.
• Customer data is volatile.

Amazon’s solution
1. Amazon Recommendation Engine
• Amazon’s system that implements the recommendation algorithm.
• The recommendation algorithm is designed to personalize the online store for each customer.

2. Algorithm feature
• Most recommendation algorithms start by finding a set of similar customers whose purchased and rated items overlap the user’s purchased and rated items.
• Amazon’s item-to-item collaborative filtering focuses on finding similar items instead of similar customers.

3. Recommendation Engine Workflow
[Figure: recommendation engine workflow; image not reproduced]

Traditional Recommendation Algorithms

The two most commonly used traditional algorithms:

1. User Based Collaborative Filtering

2. Cluster Models

User Based Collaborative Filtering
Approach
• Represents each customer as an N-dimensional vector of items
• Vector components are positive for purchased or positively rated items and negative for negatively rated items
• Finds similar customers using the cosine similarity between these vectors
• Generates recommendations based on the few customers who are most similar to the user
• Ranks each item according to how many similar customers purchased it
Problems
• Computationally expensive: O(MN) in the worst case, where
─ M is the number of customers and
─ N is the number of items
• Dimensionality reduction can improve performance, but it also reduces the quality of the recommendations
• For very large data sets, such as 10 million customers and 1 million items, the algorithm encounters severe performance and scaling issues
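
The user-based approach above can be sketched roughly as follows. This is a minimal illustration, not Amazon's or any library's implementation: the toy `ratings` data, the `cosine` helper, the function name `recommend_user_based`, and the neighbourhood size `k` are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data: customer -> {item: value}.
# +1 = purchased or positively rated, -1 = negatively rated.
ratings = {
    "alice": {"book_a": 1, "book_b": 1, "toy_x": -1},
    "bob":   {"book_a": 1, "book_b": 1, "book_c": 1},
    "carol": {"toy_x": 1, "toy_y": 1},
    "dave":  {"book_a": 1, "book_c": 1, "book_d": 1},
}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_user_based(user, ratings, k=2):
    """Rank unseen items by how many of the k most similar customers bought them."""
    target = ratings[user]
    # Scanning every other customer at request time is the costly O(MN) step.
    neighbours = sorted(
        (c for c in ratings if c != user),
        key=lambda c: cosine(target, ratings[c]),
        reverse=True,
    )[:k]
    counts = {}
    for c in neighbours:
        for item, value in ratings[c].items():
            if value > 0 and item not in target:
                counts[item] = counts.get(item, 0) + 1
    # Most frequently purchased first; ties broken alphabetically.
    return sorted(counts, key=lambda i: (-counts[i], i))

print(recommend_user_based("alice", ratings))
```

Note that all of the similarity work happens when the recommendation is requested, which is why this approach struggles as the number of customers grows.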

Cluster Models
Approach
• Divides the customer base into many segments and treats the task as a classification problem
• Assigns the user to the segment containing the most similar customers
• Uses the purchases and ratings of the customers in that segment to generate recommendations
• Cluster models have better online scalability and performance than collaborative filtering because they compare the user to a controlled number of segments rather than the entire customer base
Problems
• Quality of the recommendations is low
• The recommendations are less relevant because the similar customers that the cluster models find are not the most similar customers
• Quality can be improved with many fine-grained segments, but then classifying the user online becomes almost as expensive as finding similar customers using collaborative filtering
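
The online half of a cluster model can be sketched as below. The sketch assumes the segments were already built offline by some clustering step, which is not shown, and that each segment is summarised by a centroid vector. The segment names, item names, and functions are hypothetical.

```python
from math import sqrt

# Hypothetical segments, assumed to come from an offline clustering step.
# Each centroid is the average purchase vector of the segment's customers.
segments = {
    "programmers": {"python_book": 0.9, "keyboard": 0.6, "monitor": 0.4},
    "new_parents": {"baby_toy": 0.8, "diapers": 0.9, "stroller": 0.5},
}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def assign_segment(user_vector, segments):
    """Classify the user into the segment with the most similar centroid."""
    return max(segments, key=lambda s: cosine(user_vector, segments[s]))

def recommend_from_segment(user_vector, segments, top_n=2):
    """Recommend the segment's most popular items that the user does not own."""
    segment = assign_segment(user_vector, segments)
    centroid = segments[segment]
    candidates = [item for item in centroid if item not in user_vector]
    return segment, sorted(candidates, key=lambda i: -centroid[i])[:top_n]

print(recommend_from_segment({"python_book": 1}, segments))
```

The user is compared with only two centroids here instead of every customer, which illustrates the scalability benefit; the cost is that everyone in a segment receives much the same recommendations.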

Amazon’s Item-to-Item CF

• Difference from user-to-user CF

[Figure not reproduced. Picture from: Francesco Ricci, slides [5]]

How It Works
• Matches each of the user’s purchased and rated items to similar items
• Combines those similar items into a recommendation list
An iterative algorithm:
• Builds a similar-items table by finding items that customers tend to purchase together
• Rather than comparing every pair of items, it calculates the similarity between a single product and all related products, that is, the products bought by the same customers
• The similarity between two items is computed with the cosine measure
• Each vector corresponds to an item rather than a customer
• The vector’s M dimensions correspond to the customers who have purchased that item
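
A minimal sketch of how such a similar-items table might be built is shown below. It assumes binary purchase vectors (a customer either bought an item or did not), so the cosine of two item vectors reduces to the co-purchase count divided by the square roots of the two items' buyer counts. The `purchases` data and the function name `build_similar_items` are made up for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase history: customer -> set of purchased items.
purchases = {
    "c1": {"A", "B"},
    "c2": {"A", "B", "C"},
    "c3": {"B", "C"},
    "c4": {"A"},
}

def build_similar_items(purchases):
    """Offline step: cosine similarity between items that were bought together."""
    buyers = defaultdict(set)              # item -> customers who purchased it
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    table = {}
    for i1 in buyers:                          # each item in the catalog
        co_counts = defaultdict(int)
        for customer in buyers[i1]:            # each customer who bought i1
            for i2 in purchases[customer]:     # each item that customer bought
                if i2 != i1:
                    co_counts[i2] += 1         # record that i1 and i2 co-occur
        # Binary vectors: dot product = co-purchase count,
        # norm of an item vector = sqrt(number of buyers).
        sims = {
            i2: n / (sqrt(len(buyers[i1])) * sqrt(len(buyers[i2])))
            for i2, n in co_counts.items()
        }
        table[i1] = sorted(sims.items(), key=lambda kv: (-kv[1], kv[0]))
    return table

table = build_similar_items(purchases)
for item in sorted(table):
    print(item, [(other, round(score, 3)) for other, score in table[item]])
```

Only item pairs that share at least one customer are ever scored, which is why the work stays well below the all-pairs worst case when most customers buy few items.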
Offline Computation and Online Recommendation

Offline Computation:
• Builds the similar-items table, which is extremely time-intensive: O(N²M) in the worst case
• In practice, it is closer to O(NM), as most customers have very few purchases
• Sampling customers can reduce runtime even further with little reduction in quality

Online Recommendation:
• Given a similar-items table, the algorithm
─ finds items similar to each of the user’s purchases and ratings,
─ aggregates those items, and then
─ recommends the most popular or correlated items.
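
The online step can be sketched as a simple lookup-and-aggregate over a precomputed table. The table contents, the scores, and the function name `recommend` are hypothetical. Summing similarity scores is one possible way to aggregate, chosen here as an assumption; the slides do not specify the exact ranking rule.

```python
from collections import defaultdict

# Hypothetical precomputed table: item -> [(similar item, similarity score), ...]
similar_items = {
    "A": [("B", 0.67), ("C", 0.41)],
    "B": [("C", 0.82), ("A", 0.67)],
    "C": [("B", 0.82), ("A", 0.41)],
    "D": [("C", 0.50)],
}

def recommend(user_items, similar_items, top_n=3):
    """Online step: look up neighbours of each owned item and aggregate scores."""
    scores = defaultdict(float)
    for item in user_items:
        for other, score in similar_items.get(item, []):
            if other not in user_items:    # skip items the user already has
                scores[other] += score
    # Highest aggregated score first; ties broken alphabetically.
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return ranked[:top_n]

print(recommend({"A", "D"}, similar_items))
```

The cost of this step depends only on how many items the user has bought or rated and on the length of each neighbour list, not on the catalog size or the number of customers.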

Scalability and Quality: Comparison

User-based collaborative filtering:
─ little or no offline computation
─ impractical on large data sets, unless it uses dimensionality reduction, sampling, or partitioning
─ dimensionality reduction, sampling, or partitioning reduces recommendation quality

Cluster models:
─ can perform much of the computation offline,
─ but recommendation quality is relatively poor

Item-to-item collaborative filtering:
─ scalability and performance are achieved by creating the expensive similar-items table offline
─ the online component, "looking up similar items", scales independently of the catalog size or the number of customers
─ fast for extremely large data sets
─ recommendation quality is excellent since it recommends highly correlated similar items
─ unlike traditional collaborative filtering, the algorithm performs well with limited user data, producing high-quality recommendations based on as few as two or three items

Results:
• The MovieLens dataset contains 1 million ratings from 6,040 users on 3,900
movies.
• The best overall results are reached by the item-by-item based approach. It
needs 170 seconds to construct the model and 3 seconds to predict 100,021
ratings.
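
For a rough sense of scale, the figures quoted above can be turned into a per-prediction time and a rating-matrix density. This is only back-of-the-envelope arithmetic on the slide's rounded numbers, treating "1 million" as exact:

```latex
\[
\frac{3\ \text{s}}{100{,}021\ \text{ratings}} \approx 3.0 \times 10^{-5}\ \text{s}
\approx 30\ \mu\text{s per predicted rating}
\]
\[
\text{density} \approx \frac{1{,}000{,}000}{6{,}040 \times 3{,}900}
= \frac{1{,}000{,}000}{23{,}556{,}000} \approx 0.042
\]
```

So only about 4% of the user-movie pairs carry a rating, which is consistent with the observation that most customers rate or purchase very few items.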

Table from: Candillier, L., Meyer, F., & Boullé, M. (2007). Comparing state-of-the-art collaborative filtering systems [3]
Some Related Applications

• Pandora

• Netflix

• Google YouTube

Pandora music recommendation service
How It Works:
• Bases its recommendations on data from the Music Genome Project
• Musicians assign about 400 attributes to each song, which takes about half an hour per song
• Uses these attributes to find songs that are similar to the user’s favorite songs
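
A content-based lookup of this kind can be sketched as a nearest-neighbour search over attribute vectors. The three attributes and their scores below are invented for illustration; the actual Music Genome Project attributes are not listed in the slides, and cosine similarity is an assumed choice of distance.

```python
from math import sqrt

# Hypothetical attribute scores per song (the real attribute set is not given).
songs = {
    "song_a": {"tempo": 0.9, "distortion": 0.8, "vocals": 0.2},
    "song_b": {"tempo": 0.8, "distortion": 0.9, "vocals": 0.3},
    "song_c": {"tempo": 0.2, "distortion": 0.1, "vocals": 0.9},
}

def cosine(u, v):
    """Cosine similarity between two sparse attribute vectors stored as dicts."""
    dot = sum(u[a] * v[a] for a in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similar_songs(favourite, songs, top_n=1):
    """Return the songs whose attribute vectors are closest to the favourite's."""
    scored = [
        (cosine(songs[favourite], attrs), name)
        for name, attrs in songs.items()
        if name != favourite
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_n]]

print(similar_songs("song_a", songs))
```

No data about other listeners is needed, which matches the benefit listed next; the hand-labelled attributes are the part that is expensive to scale.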

Benefits:
• Accurate; needs little information about the user to get started

Drawback:
• Does not scale very well, and Pandora's library often feels somewhat limited


Netflix movie recommendation system
What It Is
• Makes recommendations by comparing the watching and searching habits of similar users, as well as by offering movies that share characteristics with films that a user has rated highly
• Collaborative, content-based, knowledge-based, and demographic techniques serve as the basis of its recommendation system. An ensemble of 107 different algorithmic approaches is blended into a single prediction
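
Blending several predictors into one prediction can be illustrated with a weighted average. The slides do not say how the blend is computed, so the linear form, the three model names, and the weights below are all assumptions made purely for illustration.

```python
# Hypothetical predicted ratings for one (user, movie) pair from three models.
predictions = {"collaborative": 4.2, "content_based": 3.8, "demographic": 3.0}

# Hypothetical blend weights; in practice they would be fitted on held-out ratings.
weights = {"collaborative": 0.6, "content_based": 0.3, "demographic": 0.1}

def blend(predictions, weights):
    """Weighted average of the individual model predictions."""
    total = sum(weights[name] for name in predictions)
    return sum(weights[name] * predictions[name] for name in predictions) / total

print(round(blend(predictions, weights), 2))
```

Dividing by the total weight keeps the result on the same rating scale even if the weights do not sum exactly to one.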

Benefit:
• Each of these techniques has known shortcomings; using multiple techniques together achieves some synergy between them.


Google YouTube recommendation system
Why:
• Focuses on bringing users the videos it believes they will be interested in
• Aims to increase the number of videos watched and the length of time spent watching, and to maximize enjoyment
• Ultimately, Google can increase revenue by showing more ads

Interesting things:
• In 2010, YouTube gave up its old recommendation system based on random walks and changed to a new one based on Amazon’s item-to-item collaborative filtering
• Amazon’s item-to-item collaborative filtering appears to be the best for video recommendation


Web Mining, CS 548

Questions and Comments?


Thank You
