Music Personalization:
Realtime Platforms
♫ + ML + You = ❤
CrunchConf, Budapest, October 30, 2015
Esh Kumar
Machine Learning & Data Products @ Spotify NYC
@eshvk
Who am I?
• UT Austin Machine Learning
• Building Large Scale Recommendation Systems @
Mozilla, StumbleUpon & Spotify
75 M+ Active Users
58 Markets
1 TB of Logs/Day
1200+ Node Hadoop Cluster
Products
•Discover … to find new albums
•Discover Weekly … A weekly Playlist
•Editorial Playlist Recommendations
•Radio
Music Personalization
•Understanding People
➡ User Experience, Cultural Variations
•Understanding Content
➡ Genres, Cultural knowledge
•Models
➡ Collaborative Filtering, Content Based
ML
Content
User
• News, Blogs, NLP
• Manually tag attributes
• Curation
• CF
30 Million Songs…
What To Play?
75 Million Users … 1 Person Every 3 Secs…
Recommendation Systems
• Predict user response to options.
• Rich field: matrix completion, ranking, text models, latent factor models.
• Several conferences annually: RecSys, NIPS, ICML, etc.
• Industry researchers include NFLX, GOOG, MS and more…
Collaborative Filtering
Hey,
I like tracks P, Q, R, S!
Well,
I like tracks Q, R, S, T!
Then you should check out
track P!
Nice! Btw try track T!
Model you based on songs you played…
Predict your future based on similar users…
Millions of users and billions of streams…
… so there is someone like you out there.
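The P/Q/R/S exchange above can be sketched as a tiny user-based collaborative filter. This is only an illustration: the Jaccard overlap measure, the function names, and the toy users are assumptions made here and are not taken from the talk.

```python
def jaccard(a, b):
    """Overlap between two sets of liked tracks, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes, k=1):
    """Suggest tracks liked by the k most similar users but not yet by `user`."""
    mine = likes[user]
    others = [u for u in likes if u != user]
    # Rank the other users by how much their taste overlaps with ours.
    neighbours = sorted(others, key=lambda u: jaccard(mine, likes[u]), reverse=True)[:k]
    scores = {}
    for u in neighbours:
        weight = jaccard(mine, likes[u])
        for track in likes[u] - mine:
            scores[track] = scores.get(track, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical users mirroring the dialogue on the slide.
likes = {
    "first": {"P", "Q", "R", "S"},
    "second": {"Q", "R", "S", "T"},
    "third": {"X", "Y"},
}
for_first = recommend("first", likes)
for_second = recommend("second", likes)
```

With these toy sets, each of the two overlapping listeners is pointed at the one track the other has that they lack.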
Collaborative Filtering
The Netflix Prize.
A million dollars for beating NFLX’s
best algorithms by ~ 10%.
Similarity
Our problem is to figure out how similar two
items are.
Mathematically, this means modeling a function
Similarity(x,y) for all users and items, if possible.
How do we do this?
Matrix Completion. A matrix expresses a system. We model the
data in the form of a matrix. For example, play counts for all songs
and all users could be:
$$
\text{Users}\left\{
\overbrace{
\begin{pmatrix}
s_{1,1} & s_{1,2} & 14 & \cdots & s_{1,n} \\
s_{2,1} & s_{2,2} & 2 & \cdots & s_{2,n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
s_{m,1} & s_{m,2} & 1 & \cdots & s_{m,n}
\end{pmatrix}
}^{\text{Song Plays}}
\right.
\approx
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m \end{pmatrix}
\begin{pmatrix} t_1 & t_2 & \cdots & t_n \end{pmatrix}
$$
Call Me Maybe, Esh: Esh listened to Call Me Maybe once…
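As a rough illustration of how such a play-count matrix could be assembled from listening logs, here is a minimal NumPy sketch. The event format (one `(user, song)` pair per play), the function name, and the sample events are assumptions for this example; at the scale quoted in the talk a dense array would not fit in memory and a sparse representation would be needed.

```python
import numpy as np


def play_count_matrix(events):
    """Turn (user, song) play events into a users x songs count matrix."""
    users = sorted({u for u, _ in events})
    songs = sorted({s for _, s in events})
    user_index = {u: i for i, u in enumerate(users)}
    song_index = {s: j for j, s in enumerate(songs)}
    counts = np.zeros((len(users), len(songs)), dtype=np.int64)
    for user, song in events:
        counts[user_index[user], song_index[song]] += 1
    return users, songs, counts


# Hypothetical log: Esh plays "Call Me Maybe" once, another user plays it twice.
events = [
    ("esh", "Call Me Maybe"),
    ("ann", "Call Me Maybe"),
    ("ann", "Call Me Maybe"),
    ("ann", "Hello"),
]
users, songs, plays = play_count_matrix(events)
esh_plays = plays[users.index("esh"), songs.index("Call Me Maybe")]
```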
Matrix Completion is well studied …
Start with random vectors around the origin. Run alternating least
squares or gradient descent or stochastic gradient descent… All this
is Hadoopable™.
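One way to picture the recipe above is a small alternating least squares loop in NumPy. This is a sketch under assumptions, not the production job: the rank, the regularization strength `reg`, the iteration count, and the choice to fit only observed entries are all made up for illustration, and the synthetic matrix stands in for real play counts.

```python
import numpy as np


def als(R, mask, rank=2, reg=0.1, iters=20, seed=0):
    """Factor R ~ U @ T.T using only the entries where mask is True."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    # Start with small random vectors around the origin.
    U = 0.1 * rng.standard_normal((m, rank))
    T = 0.1 * rng.standard_normal((n, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Fix track vectors, solve a small ridge regression per user.
        for i in range(m):
            obs = mask[i]
            U[i] = np.linalg.solve(T[obs].T @ T[obs] + ridge, T[obs].T @ R[i, obs])
        # Fix user vectors, solve a small ridge regression per track.
        for j in range(n):
            obs = mask[:, j]
            T[j] = np.linalg.solve(U[obs].T @ U[obs] + ridge, U[obs].T @ R[obs, j])
    return U, T


def rmse(R, U, T, mask):
    """Root mean squared error over the observed entries."""
    err = (R - U @ T.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))


# Synthetic rank-2 data with roughly 70% of the entries observed.
rng = np.random.default_rng(1)
R = rng.random((6, 2)) @ rng.random((5, 2)).T
mask = rng.random(R.shape) < 0.7
U, T = als(R, mask)
baseline = float(np.sqrt(np.mean(R[mask] ** 2)))  # error of predicting all zeros
fitted = rmse(R, U, T, mask)
```

Each per-user and per-track solve is independent of the others, which is what makes this style of update easy to spread across many machines.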
30 Million Songs…
What To Play?
75 Million People … 1 Person Every 3 Secs…
1.5 Billion Playlists
Language Models
• Language models work well too. For example, a
playlist could be considered as a document and
you could learn the latent vectors for tracks
(words).
• Then represent a User as a linear combination
of their Tracks.
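The second bullet can be sketched as follows. The talk does not say which weights are used in the linear combination, so weighting each track by its share of the user's plays is an assumption here, as are the function name and the two-dimensional toy vectors.

```python
import numpy as np


def user_vector(track_vectors, play_counts):
    """Combine a user's track vectors, weighted by share of plays (assumed weights)."""
    total = sum(play_counts.values())
    dim = len(next(iter(track_vectors.values())))
    combined = np.zeros(dim)
    for track, count in play_counts.items():
        combined += (count / total) * np.asarray(track_vectors[track], dtype=float)
    return combined


# Hypothetical latent vectors for two tracks and one user's play counts.
track_vectors = {"track_a": [1.0, 0.0], "track_b": [0.0, 1.0]}
plays = {"track_a": 3, "track_b": 1}
vec = user_vector(track_vectors, plays)
```

Because the result lives in the same space as the track vectors, it can be compared directly against any track.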
word2vec
Words with similar contexts have similar
meaning
word2vec
[Figure: a target word and its context words]
word2vec
Target Words and Corresponding Contexts
          shining  bright  trees  dark  green
stars        61      50     10     30      1
sun          71      60      5      2      0
cucumber      2       1     15      3     40
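To make the table concrete, the rows can be treated as vectors and compared with cosine similarity. This is a plain co-occurrence comparison for illustration, not word2vec training itself; the counts are copied from the table and the function name is chosen here.

```python
import numpy as np


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Context counts in the column order: shining, bright, trees, dark, green.
stars = [61, 50, 10, 30, 1]
sun = [71, 60, 5, 2, 0]
cucumber = [2, 1, 15, 3, 40]

stars_vs_sun = cosine(stars, sun)
stars_vs_cucumber = cosine(stars, cucumber)
```

"stars" and "sun" share most of their contexts and score close to 1, while "stars" and "cucumber" score close to 0.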
word2vec
[Diagram labels: Playlists, CPU, Vectors; Read; Get Vectors & Update]
Vectors are awesome!
•Unique fingerprint for every user, track, album, artist & even playlist, all in the same space.
•Similarity is easily computable: Euclidean distance or cosine similarity.
Approximate Nearest Neighbors
•Fast approximate nearest neighbor search.
• Locality Sensitive Hashing
• https://github.com/spotify/annoy
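A minimal sketch of locality sensitive hashing with random hyperplanes is shown below. It is not how the linked library is implemented; the class name, the number of hash bits, and the single hash table are assumptions chosen to keep the idea visible in a few lines.

```python
from collections import defaultdict

import numpy as np


class HyperplaneLSH:
    """Bucket vectors by which side of each random hyperplane they fall on."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _key(self, v):
        # One bit per hyperplane: 1 if the vector lies on its positive side.
        return tuple((self.planes @ v > 0).astype(int))

    def add(self, item, v):
        v = np.asarray(v, dtype=float)
        self.vectors[item] = v
        self.buckets[self._key(v)].append(item)

    def query(self, v, k=5):
        """Return up to k items from the query's bucket, best cosine first."""
        v = np.asarray(v, dtype=float)
        candidates = self.buckets.get(self._key(v), [])

        def cos(item):
            w = self.vectors[item]
            return float(v @ w / (np.linalg.norm(v) * np.linalg.norm(w)))

        return sorted(candidates, key=cos, reverse=True)[:k]


# Toy data: two vectors pointing the same way and one pointing the opposite way.
x = np.array([0.3, -1.2, 0.7, 2.0])
index = HyperplaneLSH(dim=4)
index.add("same", x)
index.add("scaled", 2 * x)
index.add("opposite", -x)
hits = index.query(x)
```

Only the query's own bucket is scanned, so the search is fast but can miss true neighbours that landed in a different bucket; practical systems use several tables or trees to reduce that risk.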
Vectors are great for Infrastructure too…
•Machine Learning can be decomposed &
abstracted away.
•A Lambda Architecture involving Machine
Learning becomes eas(ier).
•Platforms for Personalization become
possible….
The Record Store…
The List Maker …
How do you scale this?
Tools of the trade
• Build models in Python (NumPy, SciPy).
• Jobs in Scalding + Luigi ( https://github.com/spotify/luigi )
• Storm for real time.
• In-house RPC for serving requests.
Storm 101
• Realtime Stream Processing.
• Like Hadoop but easier.
• Fault tolerant.
• Java, Clojure (yay!) and more!
Storm @ Spotify
• Major users are Ads & Personalization!
• Every team manages its own cluster. For personalization, we have a 12 node cluster.
• A relatively new technology compared to Hadoop™.
So why Storm?
• Hadoop is slowwww. The daily user vector job takes ~16 hours to run. Small Data FTW!
• New Users are important; they need a friend!
• What moment are you in? Gym, running, etc.?
Getting Data Across The Globe
Pipeline + Platform
[Architecture diagram. Components shown:]
• User
• Listens, Playlists, Realtime Listens
• Kafka
• HDFS
• Latent Vector Models
• Track, Artist, Album Vectors
• User Vector Generation Job
• Compressed Listening History
• Spout, Bolts
• Cassandra
• Backend Systems
•Top Albums
•Top Tracks
•Top Playlists
Discover New User
•Going from two weeks of no
recommendations to recommendations as
soon as a user plays a track.
•Successful A/B test
•First team to build a production ready
personalization feature using Storm.
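The first bullet can be pictured as a per-listen update of a user vector. The talk does not describe the exact update the topology performs, so the running mean below, the class name, and the in-memory dictionaries (standing in for whatever store holds the vectors) are all assumptions for illustration.

```python
import numpy as np


class RealtimeUserVectors:
    """Keep a running mean of the track vectors each user has played."""

    def __init__(self, track_vectors):
        self.track_vectors = {
            t: np.asarray(v, dtype=float) for t, v in track_vectors.items()
        }
        self.means = {}
        self.counts = {}

    def on_listen(self, user, track):
        """Fold one play event into the user's vector and return it."""
        vec = self.track_vectors.get(track)
        if vec is None:
            # Unknown track: leave the user's vector unchanged.
            return self.means.get(user)
        n = self.counts.get(user, 0) + 1
        prev = self.means.get(user, np.zeros_like(vec))
        # Incremental mean: new_mean = prev + (x - prev) / n.
        self.means[user] = prev + (vec - prev) / n
        self.counts[user] = n
        return self.means[user]


# Hypothetical track vectors and a brand-new user.
store = RealtimeUserVectors({"t1": [1.0, 0.0], "t2": [0.0, 1.0]})
before = store.means.get("new_user")
after_first = store.on_listen("new_user", "t1")
after_second = store.on_listen("new_user", "t2")
```

A new user has no vector until the first play arrives, and from then on every listen refines it without waiting for a batch job.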
Lessons Learnt …
• Boring technology works well. Complicated Storm Topology = Bad. (Dan McKinley)
• Storm is nice. Would have preferred reusing batch
Scalding Code. Maybe Spark Streaming?
• Grow your API from one use case to another. Don’t
solve for everything at one time.
Join the band!
• Machine Learning, Data & Backend Gigs.
• Now touring in New York, Boston & Stockholm!
• https://www.spotify.com/jobs/
Thanks!
Esh Kumar
@eshvk
