Graph Neural Network For Context-Aware Recommendation
Asma Sattar · Davide Bacciu
https://doi.org/10.1007/s11063-022-10917-3
Abstract
Recommendation problems are naturally tackled as a link prediction task in a bipartite graph
between user and item nodes, labelled with rating information on edges. To provide personal
recommendations and improve the performance of the recommender system, it is necessary
to integrate side information along with user-item interactions. The integration of context is a
key success factor in recommendation systems because it allows catering for user preferences
and opinions, especially when this pertains to the circumstances surrounding the interaction
between users and items. In this paper, we propose a context-aware Graph Convolutional
Matrix Completion which captures structural information and integrates the user’s opinion on
items along with the surrounding context on edges and static features of user and item nodes.
Our graph encoder produces user and item representations with respect to context, features
and opinion. The decoder takes the aggregated embeddings to predict the user-item score
considering the surrounding context. We have evaluated the performance of our model on
five publicly available datasets and compared it with state-of-the-art algorithms. Throughout
this, we show how it can effectively integrate user opinion along with the surrounding context
to produce a final node representation which is aware of the favourite circumstances of the
particular node.
Symbols:
A_uvc : 3D matrix between users, items and context
A_r : 2D user's opinion matrix
U_c : 2D user's contextual importance matrix
V_c : 2D item's contextual importance matrix
U_F : 2D user's static feature matrix
V_F : 2D item's static feature matrix
N_F : Total number of user's features
Corresponding author: Asma Sattar
[email protected]
Davide Bacciu
[email protected]
1 Dipartimento di Informatica, Università di Pisa, L.go B. Pontecorvo 3, 56121 Pisa, Italy
Abbreviations:
GCMC: Graph convolutional matrix completion
GCMC + feat: Graph convolutional matrix completion with user and item features
cGCMC: Context-aware graph convolutional matrix completion
cGCMC + feat (cGCMC_F): Context-aware graph convolutional matrix completion with user and item features
1 Introduction
With the rapid development of e-commerce and social media platforms in the last few years,
recommender systems have gathered notable attention [1, 2]. They provide a methodology
to identify users' requirements and predict their interests by mining users' histories and
their interactions with items (e.g., purchase, watch, click, and read). Recommender systems
can take various forms depending upon the application, e.g., playlist generator for video
and music services (Netflix, YouTube), friend suggestions on Instagram and Facebook, and
product suggestion on eBay and Amazon. One of the most common and general approaches
for recommendation is Collaborative Filtering (CF) [3, 4], which assumes that similar users have
similar preferences and hence like similar items. This approach models explicit feedback
(e.g., ratings) or implicit feedback (e.g., clicks, reads) to reconstruct the user's interactions.
Recently, approaches based on Graph Neural Networks (GNNs) have been demonstrated to
be highly effective on various tasks defined over relational data, such as protein structure
and knowledge graphs [5]. The main idea of GNN is to produce the representation of a node
by aggregating features from its neighbouring nodes iteratively, as shown in Fig. 1. Each
GNN layer gathers the embeddings (messages) of nearby nodes and summarizes them via
an aggregation function (e.g., sum), so that stacking k layers collects information from up to
k-hop neighbourhoods. After aggregation, the node's current state is updated.
Many of these approaches treat recommendation tasks as link prediction in bipartite graphs
via matrix completion [6, 7]. The bipartite graph can be represented as an adjacency matrix
between user and item nodes, where the task is to predict entries inside the matrix (also
known as link prediction). Recently, many researchers contributed towards the development
of GNN-based collaborative filtering for modelling user-item interactions in the form of a
message passing neural network between user and item nodes [8, 9].
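As a minimal sketch of this message-passing scheme (illustrative tensors and sizes only, not the exact layers used by our model), a single generic GNN layer sum-aggregates neighbour embeddings through the adjacency matrix, applies a shared weight matrix, and passes the result through a non-linearity:

```python
import torch

def gnn_layer(H, A, W):
    # H: node embeddings (N x d); A: adjacency matrix (N x N); W: weight matrix (d x d')
    messages = A @ H                   # sum-aggregate the embeddings of each node's neighbours
    return torch.relu(messages @ W)    # update: shared linear transform + non-linearity

# Stacking k such layers lets every node see information from up to k-hop neighbours.
H = torch.rand(5, 8)                          # 5 nodes with 8-dimensional embeddings (toy values)
A = (torch.rand(5, 5) > 0.5).float()          # toy adjacency matrix
W1, W2 = torch.rand(8, 8), torch.rand(8, 8)
H2 = gnn_layer(gnn_layer(H, A, W1), A, W2)    # two layers -> 2-hop receptive field
```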
A wide range of techniques including CF based approaches for recommender systems
solely focus on rating information provided by users. Despite the popularity of these
approaches, they have limited performance in real-world applications as they neglect side
information such as static features of nodes (user’s and item’s profile), and surrounding con-
text information (e.g., mood, time, weather) that can improve performance by enhancing the
personalization in recommender systems. The surrounding context reflects the fact that user
choices change with time and are highly dependent on the context under which they interact
with the item. For example, time and weather information highly impact the choice of users
in restaurant recommendation, while the user's mood influences which song they are most
likely to listen to.
Fig. 1 Graph Neural Network with message passing up to k-hop neighbours. Neighbouring nodes and edges
share information and influence each other's updated embeddings
Fig. 2 A user's interaction with an item (e.g., a movie) is surrounded by a certain context (e.g., weather, mood,
weekend) that influences the user's opinion on the item. This data takes the form of a 3D matrix between users,
items and context
2 Related Work
The vast majority of the work in the field of context-aware recommendation frameworks has
been devoted to the improvement of matrix factorization (MF) approaches. These approaches
work by decomposing the user-item interaction matrix into lower dimension matrices [20, 21].
Despite their good performance, these approaches are unable to capture the user/item-context
correlation, as they consider context as features of the user and item [22]. The Neural Factorization
Machine (NFM) is a deep learning method to model high-order nonlinear feature interactions
for sparse data [15]. In [23], a neural network model has been proposed that captures the
impact of context on users and items. It learns the importance of context, but the simplicity of
this model limits the ability to capture the real influence of the relationship between features.
Recently, GNN based approaches have been introduced to tackle recommendation tasks
on graph-structured representations of the problem [24]. These methods are suitable for
modelling the interactions of nodes and the structural features of the graph in a flexible and explicit
way. Fi-GNN [25] utilizes a graph structure to naturally represent the characteristics of
multiple feature fields, in which every node corresponds to a feature field, and these different
fields can interact through edges to model the node interactions in the graph. STAR-GCN [7]
stacks multiple identical GCN encoder-decoders combined with intermediate supervision to
improve the final prediction performance. GCMC [6] leverages the bipartite graph between
user and item nodes to learn the node representations. Both GCMC and STAR-GCN treat
all neighbours of a node equally. IGMC [26] is an inductive approach to user-item matrix
completion for recommendation, which does not consider any side information.
Previous GNN based collaborative filtering approaches [27, 28] are unable to capture
the collaborative filtering effect, as they discard the collaborative signals that are hidden
in user-item interaction. In [8], NGCF model successfully encodes user-item high-order
connectivity by exploiting user-item bipartite graph. GCF-YA [29] is a deep graph neural
network implementation of collaborative filtering, based on information propagation and
attention mechanism to predict missing links between users and items. GraphRec [30] tackles
social recommendation by aggregating the historical behaviour of individuals from user-user
and user-item bipartite graph for recommendation.
Context information on the user has been successfully used to improve recommendation
performance [16, 31]. Recently, we have seen work on dynamic graphs that integrate inter-
action times as context information [32–34]. DGCF [35] integrates the time interval between
the previous and current interaction of user-item pairs inside their embedding to get the lat-
est node representations for recommendation. DyRep is an inductive deep learning approach
that learns from the temporally evolving interactions between user and item nodes.
These approaches solely consider time information and hence cannot integrate any
other context information.
The above GNN based approaches consider the rating information as the user’s opinion on
the edges between the user and item nodes in a bipartite graph. Some approaches only consider
user and item static features, or integrate time as context to capture dynamically evolving
environments. All these approaches ignore the surrounding context information that can
improve performance. In the following, we show how it is possible to extend such approaches
to consider dynamic and time-varying contextual features influencing recommendations.
3 Problem Definition
We have categorized data for context-aware recommendation into four categories: items,
users, context, and interactions. Context can be defined as the surrounding knowledge that
is associated with the user-item interaction, e.g., time, company, mood, location, etc. In this
work, we have defined a 3D rating/opinion interaction matrix between user, item and context
A_uvc ∈ R^{N_u × N_v × N_c}, where N_u is the total number of users, N_v is the total number of
items, and N_c is the total number of different context attributes (as shown in Fig. 2). The rating scale
ranges from one to five stars, such that A_uvc ∈ {1, ..., 5}^{N_u × N_v × N_c}, except for the InCarMusic
dataset, where the maximum rating is six. Users and items are associated with multiple static features
describing the characteristics of individuals. For example, static user features are gender and age,
and static product features can be colour, brand, category, etc. Let N_Fu and N_Fv represent the
total number of features of users and items, respectively. The importance of the contextual
features varies from person to person and from item to item.
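As a rough illustration of how such a 3D interaction tensor can be assembled (the interaction log, attribute names and sizes below are hypothetical, and the encoding of context values may differ from the one used in our experiments):

```python
import numpy as np

# Hypothetical interaction log: (user index, item index, rating, active context attributes)
interactions = [
    (0, 1, 4, {"weekend": 1, "mood_happy": 1}),
    (2, 0, 5, {"weekend": 0, "mood_happy": 1}),
]
N_u, N_v, N_c = 3, 2, 2                      # number of users, items, context attributes
ctx_index = {"weekend": 0, "mood_happy": 1}

# A_uvc[u, v, c] stores the rating user u gave item v under context attribute c
A_uvc = np.zeros((N_u, N_v, N_c))
for u, v, r, ctx in interactions:
    for name, value in ctx.items():
        if value:                            # record the rating only for active attributes
            A_uvc[u, v, ctx_index[name]] = r
```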
Given such data, the recommendation problem is then cast as a task aiming to predict the
existence of a labelled link between a user and an item considering the knowledge about the
surrounding context. This work aims to introduce context information to matrix completion
tasks with mechanisms for finding which context attributes are important for a target user
and item. Details of the learning model are discussed in Sect. 4.
In this section, we present our link prediction model for a bipartite graph between users and
items with context information on edges. We extend the graph convolutional autoencoder in
[6] (GCMC+feat, in the following). GCMC+feat leverages rating information using
a 2D user-item opinion/rating matrix along with static node features, ignoring the context
information on edges. The major contribution of our approach, dubbed context-aware graph
convolutional matrix completion (cGCMC_F), is to utilize context features on the edges.
Fig. 3 High-level architecture of the proposed context-aware graph convolutional autoencoder. The user's opinion
on items is modeled using a local weight-sharing GCN. User and item features, as well as user-context and item-
context importance, are modeled using dense neural networks, while the user-item-context interaction is modeled with a GCN
with global weight sharing
The proposed architecture has three main blocks, shown in Fig. 3. From top to bottom: the first
block represents the input data, i.e., the user's opinion/rating on items, the profiles of users and
items, the user-item-context interaction graph with edges labeled with context and rating, and
the favourite context of users and items. The second block represents the graph encoder.
Inside the graph encoder, GCMC+feat operates on the 2D user-item rating matrix, while
cGCMC_F is our proposed extension that leverages context information on edges and maps
the user-item-context interaction to a 3D matrix. The graph encoder is composed of two graph
convolutional neural network layers and two dense neural network layers. Each layer operates
on different data to produce user and item representations with respect to rating opinion, static
node features, and context information. In our earlier algorithms cGCMC_old and
cGCMC_F,old [19], this multi-perspective representation of each user and item is accumulated
without attention weights, while in cGCMC and cGCMC_F the accumulation is performed
through an attention mechanism. Further details regarding the encoder are given in
Sect. 4.1. The decoder (discussed in Sect. 4.2) utilizes the encoded representations to predict
the link in a bipartite graph.
The user opinions represented in the adjacency matrix A (Eq. 1) capture the user's preference
for items in the bipartite graph. We use a local weight-sharing graph convolutional layer
for modelling the user's opinion. The local weight-sharing mechanism allows having different
convolutional weights based on the edge types. The number of weight matrices is equal
to the number of available rating levels R. The customized message propagation for graph
convolutions uses an edge-type-specific parameter matrix W_r. After the message propagation
step, we aggregate the incoming messages at each node with one of two alternative aggregation
functions, sum and stack:
• stack aggregation: concatenating all edge-specific matrices along their first dimension.
• sum aggregation: performing an addition of all edge-specific matrices.
Overall, this edge-specific message propagation is more effective than general
global message propagation. Our model selection experiments considered summation and
concatenation as alternatives, and we selected the former for its best overall performance
(in validation). Details of this spectral convolutional layer are defined in the following:
z_u^o = \operatorname{Agg}_{i=0}^{R}\big(\mathrm{GCN}(X_v, A_i)\big) = \sigma\Big(\operatorname{Agg}_{i=0}^{R}\big(\tilde{A}_i X_v W_i^v\big)\Big) \qquad (3)

z_v^o = \operatorname{Agg}_{i=0}^{R}\big(\mathrm{GCN}(X_u, A_i^T)\big) = \sigma\Big(\operatorname{Agg}_{i=0}^{R}\big(\tilde{A}_i^T X_u W_i^u\big)\Big) \qquad (4)
where X_u and X_v are the one-hot unique vectors for the user and item nodes. The term R is
the maximal rating a user can give to an item, W_i^u and W_i^v denote the R trainable weight
matrices, and σ is a non-linear activation function such as ReLU. The matrices Ã_i and Ã_i^T are
the normalized adjacency matrix A_i and its transpose, respectively:
\tilde{A}_i = D^{-1/2} A_i D^{-1/2} \quad \forall\, i = 0, \dots, R \qquad (5)
where D is the diagonal degree matrix (so that D^{-1/2} places the reciprocal square root of each
node's degree on the diagonal). Similarly, A_i^T is normalized to obtain Ã_i^T (using Eq. 5).
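The sketch below illustrates this edge-type-specific convolution with sum aggregation, the variant we selected (tensor shapes, initialization and the toy example are illustrative assumptions; the stack variant would concatenate the per-rating messages instead of summing them):

```python
import torch

def normalize(A):
    # Eq. (5): degree normalization of the (possibly rectangular) rating-specific adjacency matrix
    d_row = A.sum(dim=1).clamp(min=1.0).pow(-0.5)
    d_col = A.sum(dim=0).clamp(min=1.0).pow(-0.5)
    return d_row.unsqueeze(1) * A * d_col.unsqueeze(0)

def opinion_conv(X_v, A_per_rating, W_per_rating):
    # Eq. (3): one adjacency matrix A_i and one weight matrix W_i^v per rating level;
    # messages are summed over rating levels and passed through a non-linearity
    msgs = [normalize(A_i) @ X_v @ W_i for A_i, W_i in zip(A_per_rating, W_per_rating)]
    return torch.relu(torch.stack(msgs).sum(dim=0))

# toy example: 4 users, 6 items, R = 5 rating levels, one-hot item features
R, N_u, N_v = 5, 4, 6
A_per_rating = [(torch.rand(N_u, N_v) > 0.8).float() for _ in range(R)]
W_per_rating = [torch.rand(N_v, 16) for _ in range(R)]
z_u_o = opinion_conv(torch.eye(N_v), A_per_rating, W_per_rating)   # user opinion embeddings
```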
The user-item-context interaction matrix A_uvc is normalized by dividing each context attribute value
by the total count of context attributes recorded at the time of the user-item interaction. The
normalized context attributes are then accumulated to obtain A_c ∈ R^{N_u × N_v}:

A_c[u][v] = \frac{1}{N_c^{uv}} \sum_{i=0}^{N_c^{uv}} c_i^{uv} \qquad (6)

where u and v are the user and item indexes in the matrix, N_c^{uv} represents the number of context
attributes recorded when user u rated item v, and c_i^{uv} denotes the i-th individual context value under
which user u rated item v.
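A small sketch of Eq. (6), under the assumption that the context values of each interaction are stored along the third axis of A_uvc and that unrecorded attributes are zero:

```python
import numpy as np

def accumulate_context(A_uvc):
    # Eq. (6): for every user-item pair, sum the recorded context values and divide by
    # the number of context attributes recorded for that interaction (N_c^{uv})
    N_u, N_v, _ = A_uvc.shape
    A_c = np.zeros((N_u, N_v))
    for u in range(N_u):
        for v in range(N_v):
            recorded = A_uvc[u, v] != 0
            n_uv = recorded.sum()
            if n_uv > 0:
                A_c[u, v] = A_uvc[u, v][recorded].sum() / n_uv
    return A_c
```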
We propose to leverage graph convolutions to model user-context-item interactions in
the matrix A_c, with the same message propagation rule as used for modelling the user's opin-
ion (Eqs. 3 and 4), but with a single global weight matrix. We denote the user and
item representations with respect to context attributes as z_u^{c1} and z_v^{c1}, respectively. The user's
behaviour varies with the change in the surrounding context, which makes them react differ-
ently to the same item under different contexts. Similarly, an item gets a different rating when
the surrounding context changes. This makes the context information naturally dynamic. For
modelling this dynamic user-context and item-context relation, we perform a statistical
analysis of the training data and identify an α importance factor for each user and item, respec-
tively. The α factor gives more importance to the favourite contexts of users and items. We
store the extracted user preferences in U_C:
U_C[u][c] = \sum_{i,j}^{N_u,\, N_c^{uv_i}} A_{uvc}[u][v_i][c_j] \ast \alpha[r], \quad r \in \{1, \dots, R\} \qquad (7)
where N_u here denotes the neighbours of user u and N_c^{uv_i} represents the number of context attributes
under which the user provides opinion r. We obtain the context importance for each
item in a similar way (Eq. 7) and store it in V_C. Both matrices are normalized to have values
between 0 and 1. We use a simple dense neural network layer to process this information.
The weight matrices used for this purpose are initialized from a uniform random distribution, and
node dropout is applied to the hidden layers to prevent overfitting. The operations of this
layer are defined as:
z_u^{c2} = \sigma\big(U_C W_3^c + b_c\big) \qquad (8)

z_v^{c2} = \sigma\big(V_C W_4^c + b_c\big) \qquad (9)
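The following sketch illustrates Eqs. (7)-(9) under simplifying assumptions (a dense rating matrix, a global max normalization, and illustrative layer sizes); it is not the exact implementation:

```python
import numpy as np
import torch

def user_context_importance(A_uvc, ratings, alpha):
    # Eq. (7): accumulate each user's context values over their rated items (neighbours),
    # weighting every contribution by alpha[r] of the rating r given in that interaction
    N_u, N_v, N_c = A_uvc.shape
    U_C = np.zeros((N_u, N_c))
    for u in range(N_u):
        for v in range(N_v):
            r = int(ratings[u, v])
            if r > 0:                              # item v is a neighbour of user u
                U_C[u] += A_uvc[u, v] * alpha[r]
    return U_C / max(U_C.max(), 1e-9)              # one possible way to normalize to [0, 1]

# Eqs. (8)-(9): a dense layer maps the importance matrices to context representations
N_c, d_c2 = 12, 10                                 # illustrative sizes
dense_u = torch.nn.Linear(N_c, d_c2)
# z_u_c2 = torch.relu(dense_u(torch.as_tensor(U_C, dtype=torch.float32)))
```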
To get the final user and item context representations, we integrate z_u^{c1} with z_u^{c2}, and
z_v^{c1} with z_v^{c2}:

z_u^c = \sigma\big( (z_u^{c1} \oplus z_u^{c2})\, W_5^c + b_c \big) \qquad (10)

z_v^c = \sigma\big( (z_v^{c1} \oplus z_v^{c2})\, W_6^c + b_c \big) \qquad (11)
where W_5^c and W_6^c are trainable weight matrices and b_c is a bias term.
The static features of user and item nodes are represented as U_F and V_F, respectively. We
do not feed these features directly into the graph convolution layer, as they degrade
performance when the user-item content features are sparse. Therefore, we use a separate
dense neural network layer to obtain the static feature representations for the user and item nodes:
z_u^f = \sigma\big(U_F W_3^f + b_f\big) \qquad (12)

z_v^f = \sigma\big(V_F W_4^f + b_f\big) \qquad (13)

where W_3^f and W_4^f are trainable weight matrices and b_f is a bias term.
We accumulate the user's representations from the rating/opinion (Eq. 3), feature (Eq. 12)
and context (Eq. 10) perspectives. Here, we introduce learnable attention weights for
the three representations in cGCMC_F. In cGCMC_old [19], we accumulated these
embeddings without considering any learnable attention weights. The last layer of the graph
encoder is a dense neural network layer and is responsible for producing the final embedding
with or without attention weights. For cGCMC_F, the user's final representation is defined as:
z_u = \sigma\big(\big[\, w_u^o \ast z_u^o \,\oplus\, w_u^c \ast z_u^c \,\oplus\, w_u^f \ast z_u^f \,\big]\, W_6 + b\big) \qquad (14)
Similarly, the item's representations from the rating/opinion, context and feature perspectives are
concatenated after applying the attention weights to get the final item embedding:

z_v = \sigma\big(\big[\, w_v^o \ast z_v^o \,\oplus\, w_v^c \ast z_v^c \,\oplus\, w_v^f \ast z_v^f \,\big]\, W_7 + b\big) \qquad (15)
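A sketch of the attention-weighted accumulation of Eqs. (14)-(15) is given below; whether the attention weights are scalars or vectors is not fully specified above, so this sketch uses one learnable scalar per representation, and all dimensions are illustrative:

```python
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    # Eqs. (14)-(15): scale the opinion, context and feature representations by learnable
    # attention weights, concatenate them, and project with a final dense layer
    def __init__(self, d_o, d_c, d_f, d_out):
        super().__init__()
        self.w = nn.Parameter(torch.rand(3))        # attention weights: random init, then learned
        self.dense = nn.Linear(d_o + d_c + d_f, d_out)

    def forward(self, z_o, z_c, z_f):
        fused = torch.cat([self.w[0] * z_o, self.w[1] * z_c, self.w[2] * z_f], dim=-1)
        return torch.relu(self.dense(fused))

# illustrative dimensions; the final embedding size of 75 follows Sect. 5.2.2
fuse = AttentiveFusion(d_o=400, d_c=100, d_f=10, d_out=75)
z_u = fuse(torch.rand(4, 400), torch.rand(4, 100), torch.rand(4, 10))
```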
4.2 Decoder
We use a bilinear decoder that takes the context-aware embeddings of the user-item interaction and
reconstructs the rating matrix Â between users and items. Here, we address this problem as
a classification task in which each rating level is treated as a separate class. The decoder produces a
probability distribution over all classes through a bilinear operation:

p(\hat{A}_{ij} = r) = \frac{e^{\, z_{u_i}^T Q_r z_{v_j}}}{\sum_{s \in R} e^{\, z_{u_i}^T Q_s z_{v_j}}} \qquad (16)

where Q_r is a trainable matrix associated with rating level r.
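A sketch of this bilinear decoder is shown below; converting the class distribution into a scalar rating via its expectation is our assumption, since that step is not spelled out above:

```python
import torch
import torch.nn as nn

class BilinearDecoder(nn.Module):
    # one trainable matrix Q_r per rating level; softmax over the bilinear scores z_u^T Q_r z_v
    def __init__(self, d, num_ratings):
        super().__init__()
        self.Q = nn.Parameter(torch.randn(num_ratings, d, d))

    def forward(self, z_u, z_v):
        scores = torch.einsum('d,rde,e->r', z_u, self.Q, z_v)   # one score per rating level
        probs = torch.softmax(scores, dim=0)                    # Eq. (16)
        levels = torch.arange(1, probs.numel() + 1, dtype=probs.dtype)
        return probs, (probs * levels).sum()                    # assumed: expected rating

decoder = BilinearDecoder(d=75, num_ratings=5)
probs, rating = decoder(torch.rand(75), torch.rand(75))
```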
We evaluate the performance of the proposed algorithm using the MAE (Eq. 18) and RMSE
(Eq. 19) metrics with respect to the rating assigned by the user to their interaction with the
item. The choice of these metrics over classification-based ones is driven by the nature of the
ratings, which are ordinal rather than multinomial; hence it is important to capture how closely
the prediction approximates the expected rating (which is not the case for classification-based
metrics). Our model is trained in an end-to-end fashion by minimizing the root mean square
error between the actual rating (A_ij) and the reconstructed rating (Â_ij).
\mathrm{MAE} = \frac{1}{n} \sum_{i,j} \big|\hat{A}_{i,j} - A_{i,j}\big| \qquad (18)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i,j} \big(\hat{A}_{i,j} - A_{i,j}\big)^2} \qquad (19)
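For reference, a direct implementation of Eqs. (18)-(19) over the n observed test ratings:

```python
import numpy as np

def mae(pred, actual):
    return np.abs(pred - actual).mean()              # Eq. (18)

def rmse(pred, actual):
    return np.sqrt(((pred - actual) ** 2).mean())    # Eq. (19)
```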
5 Experiments
5.1 Datasets
Table 1 The statistical information defining number of users, items and context attributes along with the edge density and rating levels for each of the datasets used in our
experiments
Dataset LDOS-CoMoDa DePaul Travel-STS InCarMusic Tijuana-Restaurant
No of Context variables 12 3 14 8 6
Rating Scale 1−5 1−5 1−5 1−6 1−5
No of Ratings 2278 2270 2534 4012 1422
Density 0.0154 0.6581 0.0313 0.6872 0.711
neutral, scared, disgusted), interaction (1st interaction with a movie, N th interaction with a
movie), physical (ill, healthy), companion (alone, friends, partner, family, colleagues, parents,
public). Besides this information, LDOS-CoMoDa also has profile features for users (gender,
age, city, country) and movies (director, language, actor, genre).
DePaulMovie2 is a movie dataset collected by researchers of the DePaul University, with
ratings acquired by survey. Students have been asked to rate movies subject to 3 context
variables: location (home, Cinema), time (weekend, weekday), and companion (partner,
family, alone) information. This dataset does not have user’s and item’s profile features.
Travel-STS3 dataset contains information about places visited by tourists. The context
information includes distance (nearby, far away), time available (half a day, one day, more
than one day), temperature (warm, hot, burning, cool, cold, freezing), season (summer, win-
ter, spring, autumn), crowdedness (empty, crowded, not crowded), mood (happy, active, sad,
lazy), budget (high spender, budget traveler, price for quality), weather (sunny, cloudy, rainy,
clear sky, thunderstorm, snowing), companion (with children, with friends/colleagues, alone,
with family, with girlfriend/boyfriend), weekend (weekday, weekend), travel goal (visiting
friends, religion, business, health care, education, social event, scenic/landscape, hedonis-
tic/fun, activity/sport), means of transport (bicycle, car, public transport, no transportation
means) and knowledge of surrounding (returning visitor, completely new area, citizen of the
area). This dataset also contains user profile features (age, gender).
InCarMusic3 dataset consists of music tracks recommended to passengers based on the
surrounding contextual information. The context information includes driving style (sport
driving, relaxed driving), road type (highway, city, serpentine), landscape (mountains, coast
line, urban, country side), sleepiness (sleepy, awake), traffic conditions (busy road, free road,
traffic jam), mood (happy, active, sad, lazy), weather (sunny, cloudy, rainy, snowing), and
natural phenomena (day time, morning, night, afternoon).
Tijuana Restaurant3 is a restaurant dataset gathered via a survey in which respondents answered 8 questions
about various nearby restaurants. Every restaurant selected was assessed
multiple times, once for every possible context setting. The context information includes
combinations of time and location (c1 : weekday and school, c2 : weekday and home, c3 :
weekday and work, c4 : weekend and school, c5 : weekend and home, and c6 : weekend and
work).
The density values in Table 1 represent the fraction of positive links between the nodes.
The Tijuana-Restaurant dataset has a small number of nodes connected by a high number of
edges, while the LDOS-CoMoDa dataset has a greater number of nodes connected by few
edges (compared to the other datasets). Overall, the effect of high or low density values on the
performance of our models is shown to be negligible in Sect. 6.
Our PyTorch implementation4 of the cGCMC and cGCMC_F models is publicly available.
We have used 60% of the data as a training set, 20% as a validation set and 20% as a test set for
each dataset. The data splitting is performed five times. Each time the data is shuffled with a
different random seed before dividing into splits. The average performance of all algorithms
after five runs with different random splits is presented in Sect. 6.
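A sketch of this splitting protocol is shown below (the seed values and the dataset size used in the example are illustrative choices, not the exact seeds used in our runs):

```python
import numpy as np

def split_indices(n_interactions, seed):
    # shuffle with the given seed, then take 60% train / 20% validation / 20% test
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_interactions)
    n_tr, n_val = int(0.6 * n_interactions), int(0.2 * n_interactions)
    return idx[:n_tr], idx[n_tr:n_tr + n_val], idx[n_tr + n_val:]

# five runs with different random splits; reported results are averaged over these
splits = [split_indices(2278, seed) for seed in range(5)]   # 2278 = ratings in LDOS-CoMoDa
```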
2 https://cran.r-project.org/web/packages/contextual/vignettes/
3 https://github.com/irecsys/CARSKit/blob/master/context-aware_data_sets/
4 https://github.com/asmaAdil/cGCMC
Table 2 The average time (sec.) taken by cGCMC and cGCMC_F for each dataset
Algorithm Dataset Avg. time per epoch Avg. time for prediction
Table 3 cGCMC_F encoder and decoder layers and their respective best output dimension hyperparameter
values
cGCMC_F Layers    Output dimension
We report the computational costs (in seconds) of cGCMC and cGCMC_F, obtained by com-
puting the average time required by a single training epoch and the average time required by
the prediction step (i.e., on the whole test set). Results are presented in Table 2.
5.2.2 Hyper-parameters
We have evaluated our approach under different configurations. The best value for each
hyper-parameter is shown in bold. We searched the embedding size for the user's
opinion representation d_o in [300, 400, 500, 600], the static feature representation d_f in
[5, 10, 15, 20, 25], and the contextual representations d_c1 in [50, 100, 150, 200, 250] (for the GCN)
and d_c2 in [5, 10, 15, 20, 25] (for the dense layer), as shown in Table 3. We chose the batch
size from [40, 80, 120, 150, 200]. The last layer of the encoder is set to produce embeddings
of size 75. The node dropout rate (P_drop) is tuned in [0.3, 0.4, 0.5, 0.6, 0.7]. P_drop is the
probability of randomly dropping all outgoing messages from specific nodes to train under the
denoising setup. The α importance factor is set to [0.2, 0.3, 0.5, 0.7, 0.8] for r ∈ R; the initial values
can be chosen randomly as long as they respect the constraint α[r_1] < α[r_2] ⟺ r_1 < r_2. Any set of
initial values can be used provided it satisfies this constraint: the context in which the user gives a
high rating should receive more weight. The attention weights for the opinion, feature, and context
representations are first set to random values and then learned to give appropriate weights to
each of these representations before combining them. All neurons use the ReLU nonlinearity and
Adam is employed as the optimization algorithm. The model is trained for 200 epochs. For
baseline algorithms, all parameters are initialized as mentioned in the corresponding papers.
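For instance, the α initialization and its monotonicity constraint can be written as follows (a minimal sketch; the dictionary form is our illustration):

```python
# initial alpha importance factors, one per rating level r = 1..5
alpha = dict(zip(range(1, 6), [0.2, 0.3, 0.5, 0.7, 0.8]))
# alpha[r1] < alpha[r2] <=> r1 < r2: contexts tied to higher ratings get larger weights
assert all(alpha[r] < alpha[r + 1] for r in range(1, 5))
```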
5.3 Benchmarks
In the evaluation phase, we have evaluated the test set using predictive performance in terms of
mean absolute error (M AE) and root mean square error (R M S E). We compare our approach
with several link prediction algorithms from the literature as follows :
• SocialMF [14] is a matrix factorization approach that exploits user-user trust information
along with user opinion on the item to predict items for users.
• SVD++ [36] improves the conventional SVD approach by allowing the joint use of
explicit (e.g., user’s rating opinion), and implicit (e.g., purchases, visited items) infor-
mation.
• PMF [37] is a matrix factorization approach for sparse datasets. This exploits the user-
item interactions only to learn user and item embeddings, while forgoing the context
features.
• BiasedMF [38] is an improvement to traditional matrix factorization that incorporates
user, item, and global bias factors.
• GCMC [6] models user’s opinion leveraging the rating matrix between users and items
for matrix completion task.
• GCMC+feat [6] extends GCMC by integrating static features inside the user and item
nodes for link prediction in a bipartite graph.
• GraphRec^{uu}_{uv} [30] exploits the social relations between users along with user-
item interactions for link prediction in a user-item bipartite graph.
6 Performance Comparison
Table 4 presents a comparison between the previous versions of our algorithms (subscripted
with 'old') and the extended versions, while Table 5 presents the performance comparison of our
approach with other state-of-the-art algorithms. Two of our datasets (LDOS-CoMoDa and Travel-
STS) contain user and item (description) features along with the user's opinion on items and
context information. For the other three datasets (DePaul, InCarMusic, Tijuana-Restaurant),
we only have the user's opinion on the item and contextual information. The algorithms that
integrate user and item feature information are not applicable to the latter category of
datasets (indicated inside the tables with the NA mark, as in "Not Applicable").
• A clear performance difference can be seen between the old and extended versions of our
model on all datasets (provided in Table 4). This is purely due to the newly introduced
attention factor in the last layer of the encoder.
• Basic matrix factorization approaches, PMF and BiasedMF, which solely model
user-item interactions as isolated instances, ignore side information, thus limiting their rep-
resentation ability. These approaches perform worse than all other baseline algorithms
on all datasets because of their inability to integrate knowledge about the surroundings.
• SVD++, SocialMF, and GraphRec^{uu}_{uv} perform better than the basic matrix factoriza-
tion approaches, as they capture and integrate knowledge about the individual user in the
form of social trust or by using implicit feedback. Despite integrating side information,
these approaches perform worse than our method, which additionally benefits from
learning the surrounding context.
• When comparing our proposed algorithm with the GNN-based approaches (GCMC and
GCMC+feat), we can identify a significant improvement in performance, motivated
by the capability of providing context-aware recommendations.
Table 4 Test performance comparison between the previous ('old') and extended versions of our algorithms. Best results are marked in bold

ALGORITHM           LDOS-CoMoDa                DePaul                     Travel-STS                 InCarMusic                 Tijuana-Restaurant
                    MAE          RMSE          MAE          RMSE          MAE          RMSE          MAE          RMSE          MAE          RMSE
cGCMC               0.77 ± 0.01  1.03 ± 0.01   1.03 ± 0.01  1.21 ± 0.01   0.85 ± 0.02  1.03 ± 0.02   1.15 ± 0.01  1.25 ± 0.01   0.93 ± 0.01  1.23 ± 0.01
cGCMC_old [19]      0.938 ± 0.01 1.15 ± 0.01   1.04 ± 0.01  1.21 ± 0.01   0.96 ± 0.02  1.17 ± 0.02   1.19 ± 0.01  1.30 ± 0.01   1.07 ± 0.01  1.28 ± 0.01
cGCMC_F             0.853 ± 0.01 1.10 ± 0.01   NA           NA            0.91 ± 0.02  1.12 ± 0.02   NA           NA            NA           NA
cGCMC_F,old [19]    0.918 ± 0.01 1.127 ± 0.01  NA           NA            0.932 ± 0.02 1.14 ± 0.02   NA           NA            NA           NA
Table 5 Test-set performance comparison with state-of-the-art algorithms. Best results are marked in bold

ALGORITHM                LDOS-CoMoDa                DePaul                     Travel-STS                 InCarMusic                 Tijuana-Restaurant
                         MAE          RMSE          MAE          RMSE          MAE          RMSE          MAE          RMSE          MAE          RMSE
cGCMC                    0.77 ± 0.01  1.03 ± 0.01   1.03 ± 0.01  1.21 ± 0.01   0.85 ± 0.02  1.03 ± 0.02   1.15 ± 0.01  1.25 ± 0.01   0.93 ± 0.01  1.23 ± 0.01
cGCMC_F                  0.853 ± 0.01 1.10 ± 0.01   NA           NA            0.91 ± 0.02  1.12 ± 0.02   NA           NA            NA           NA
GCMC [6]                 1.12 ± 0.01  1.33 ± 0.01   1.18 ± 0.00  1.42 ± 0.00   1.09 ± 0.02  1.32 ± 0.02   1.21 ± 0.01  1.42 ± 0.01   1.25 ± 0.01  1.47 ± 0.01
GCMC+feat [6]            1.001 ± 0.01 1.24 ± 0.01   NA           NA            0.95 ± 0.01  1.23 ± 0.01   NA           NA            NA           NA
SocialMF [14]            0.96 ± 0.01  1.28 ± 0.02   1.06 ± 0.01  1.29 ± 0.01   1.12 ± 0.01  1.46 ± 0.01   1.34 ± 0.01  1.56 ± 0.02   1.09 ± 0.01  1.28 ± 0.01
SVD++ [36]               1.10 ± 0.01  1.45 ± 0.01   1.17 ± 0.02  1.40 ± 0.01   1.20 ± 0.01  1.36 ± 0.02   1.20 ± 0.01  1.41 ± 0.01   1.13 ± 0.01  1.32 ± 0.01
PMF [37]                 1.38 ± 0.00  1.75 ± 0.00   1.19 ± 0.01  1.44 ± 0.01   1.14 ± 0.00  1.49 ± 0.00   1.37 ± 0.01  1.58 ± 0.01   1.30 ± 0.01  1.54 ± 0.01
BiasedMF [38]            1.46 ± 0.02  1.78 ± 0.02   1.20 ± 0.02  1.46 ± 0.02   1.13 ± 0.01  1.45 ± 0.01   1.45 ± 0.02  1.65 ± 0.02   1.41 ± 0.02  1.68 ± 0.02
GraphRec^{uu}_{uv} [30]  1.16 ± 0.02  1.32 ± 0.02   1.25 ± 0.03  1.45 ± 0.03   1.20 ± 0.02  1.36 ± 0.02   1.25 ± 0.02  1.40 ± 0.02   1.18 ± 0.01  1.34 ± 0.01
[Fig. 4: MAE of the model with the α importance factor versus the α-ablated variant, on LDOS-CoMoDa, Tijuana-Restaurant, Travel-STS, InCarMusic, and DePaul]
Overall, our model outperforms all baseline approaches on all datasets, which supports the
importance of taking the surrounding context into consideration to provide accurate
recommendations.
The major contribution of our approach is to organize context features on edges with the user-item
interaction in an effective way. We use the α importance factor to learn the favourite sur-
rounding context features of the target user and item for context-aware link prediction. We hence
carry out an ablation study to validate the rationale and usefulness of α. We have already explained
how context importance varies from person to person and how different context attributes affect
items differently. Fig. 4 demonstrates the positive effect of capturing this impor-
tance factor in our model. This is clearly due to prioritizing the contexts which are important
for users and items by giving them more weight.
We have three kinds of representations for each individual user and item (opinion, feature,
and context, as described in Sect. 4.1). For the accumulation of these three representations,
we found that concatenation performs better than summation, which is why we report
results with concatenation only. We have
introduced learnable attention weights for each representation before accumulating them.
These learnable weights assign a different significance to each representation (i.e., the opin-
ion, contextual, and feature representations of users and items) in the final embedding. The user's
(or item's) opinion representation contains information about the neighbouring nodes with
respect to opinion information. Similarly, the contextual representation captures the neighbour-
ing nodes with respect to contextual information. The final representation of the user is
an accumulation of these along with a dense feature representation. We believe that each of these
representations has its own impact on the final node representation, weighted by a factor
which we call a learnable weight. It might be possible that for some users the opinion-based neigh-
bourhood information matters more than the contextual or feature information, and vice versa for others.
[Figure: MAE across LDOS-CoMoDa, Tijuana-Restaurant, Travel-STS, InCarMusic, and DePaul]
7 Conclusion
We have focused our work on emphasizing the impact of knowledge about the surrounding
context on user-item interaction. To this end, we have organized context, opinion, and item
features into a bipartite graph and an associated multidimensional matrix. We approached
the resulting matrix completion task using a graph convolutional autoencoder. Our graph
encoder captures the context information along with opinion in user-item interactions. We
also showed how the model leverages context information to capture the user’s behaviour
in relation to the surrounding context, giving attention to the most important contextual
aspects of the user and item. Furthermore, the bilinear decoder predicts the labelled edges
between the user and item. To demonstrate the effectiveness of our approach, we tested it
on five public datasets, showing significant improvements over state-of-the-art baselines.
We have conducted various experiments to verify the benefit of the context representation.
The application of our model is not limited to product recommender systems on smart
devices, i.e., music/movie/travel/fashion recommendations. The model can also be used
for other intelligent prediction tasks by developing it further for specific domains, such as personal
medical reminders for the elderly or smart device setting controllers based on the surrounding
context.
In this work, the accumulative approach unifies all context information, neglecting the
dynamic nature of some contextual attributes. This may result in losing the diversity of
individual context attributes. In the future, we would like to explore multi-dimensional edge-
feature-based GNNs and multi-way interactions between users and items to capture the
dynamic behaviours more realistically. Furthermore, we intend to investigate the use of separate
embeddings for user and item contexts and to evaluate the performance on large-scale datasets.
On a different side, we want to extend our model to deal with heterogeneous graphs which
consist of nodes of different types and different context information on different edges.
Funding Open access funding provided by Universitá di Pisa within the CRUI-CARE Agreement.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if changes were made. The images or other third party material in this article are included in the
article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is
not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
1. Katarya R, Verma OP (2016) Recent developments in affective recommender systems. Physica A 461:182–
190
2. Karimova F (2016) A survey of e-commerce recommender systems. Eur Sci J 12(34):75–89
3. Li X, Li D (2019) An improved collaborative filtering recommendation algorithm and recommendation
strategy. Mobile Information Systems, p 3560968
4. Zarzour H, Maazouzi F, Soltani M, Chemam C (2018) An improved collaborative filtering recommen-
dation algorithm for big data. In: IFIP International Conference on Computational Intelligence and Its
Applications, pp 660–668. Springer
5. Bacciu D, Errica F, Micheli A, Podda M (2020) A gentle introduction to deep learning for graphs. Neural
Netw 129:203–221. https://doi.org/10.1016/j.neunet.2020.06.006
6. Berg Rvd, Kipf TN, Welling M (2017) Graph convolutional matrix completion. arXiv preprint
arXiv:1706.02263
7. Zhang J, Shi X, Zhao S, King I (2019) Star-gcn: Stacked and reconstructed graph convolutional networks
for recommender systems. In: The 28th International Joint Conference on Artificial Intelligence, pp
4264–4270
8. Wang X, He X, Wang M, Feng F, Chua T-S (2019) Neural graph collaborative filtering. In: Proceedings of
the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval,
pp 165–174
9. Wang X, Jin H, Zhang A, He X, Xu T, Chua T-S (2020) Disentangled graph collaborative filtering.
In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in
Information Retrieval, pp 1001–1010
10. Shi Y, Larson M, Hanjalic A (2014) Collaborative filtering beyond the user-item matrix: A survey of the
state of the art and future challenges. ACM Computing Surveys (CSUR) 47(1):1–45
11. Baltrunas L, Amatriain X (2009) Towards time-dependant recommendation based on implicit feedback.
In: Workshop on Context-aware Recommender Systems (CARS’09), pp 25–30. Citeseer
12. Panniello U, Tuzhilin A, Gorgoglione M, Palmisano C, Pedone A (2009) Experimental comparison of
pre-vs. post-filtering approaches in context-aware recommender systems. In: Proceedings of the Third
ACM Conference on Recommender Systems, pp 265–268
13. Karatzoglou A, Amatriain X, Baltrunas L, Oliver N (2010) Multiverse recommendation: n-dimensional
tensor factorization for context-aware collaborative filtering. In: Proceedings of the Fourth ACM Confer-
ence on Recommender Systems, pp 79–86
14. Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in
social networks. In: Proceedings of the Fourth ACM Conference on Recommender Systems, pp 135–142
15. He X, Chua T-S (2017) Neural factorization machines for sparse predictive analytics. In: Proceedings of
the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval,
pp 355–364
16. Lian J, Zhou X, Zhang F, Chen Z, Xie X, Sun G (2018) xdeepfm: Combining explicit and implicit
feature interactions for recommender systems. In: Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining, pp 1754–1763
17. Xin X, Chen B, He X, Wang D, Ding Y, Jose J (2019) Cfm: Convolutional factorization machines for
context-aware recommendation. IJCAI 19:3926–3932
18. Liu H, Zhang H, Hui K, He H (2015) Overview of context-aware recommender system research. In: 3rd
International Conference on Mechatronics, Robotics and Automation. Atlantis Press
19. Sattar A, Bacciu D (2021) Context-aware graph convolutional autoencoder. In: International Work-
Conference on Artificial Neural Networks, pp 279–290. Springer
20. Peng H, Jin Y, Lv X, Wang X (2019) A context aware poi recommendation algorithm based on matrix
decomposition. J Comput Sci 42:1797–1811
21. Baltrunas L, Ludwig B, Ricci F (2011) Matrix factorization techniques for context aware recommendation.
In: Proceedings of the Fifth ACM Conference on Recommender Systems, pp 301–304
22. Gao Q, Ma P (2020) Graph neural network and context-aware based user behavior prediction and recom-
mendation system research. Computational Intelligence and Neuroscience, p 8812370
23. Chen J, Zhang H, He X, Nie L, Liu W, Chua T-S (2017) Attentive collaborative filtering: Multimedia
recommendation with item-and component-level attention. In: Proceedings of the 40th International ACM
SIGIR Conference on Research and Development in Information Retrieval, pp 335–344
24. Wu S, Zhang W, Sun F, Cui B (2020) Graph neural networks in recommender systems: A survey. arXiv
preprint arXiv:2011.02260
25. Li Z, Cui Z, Wu S, Zhang X, Wang L (2019) Fi-gnn: Modeling feature interactions via graph neural
networks for ctr prediction. In: Proceedings of the 28th ACM International Conference on Information
and Knowledge Management, pp 539–548
26. Zhang M, Chen Y (2020) Inductive matrix completion based on graph neural networks. In: International
Conference on Learning Representations
27. Zheng L, Lu C-T, Jiang F, Zhang J, Yu, PS (2018) Spectral collaborative filtering. In: Proceedings of the
12th ACM Conference on Recommender Systems, pp 311–319
28. Wu Y, Liu H, Yang Y (2018) Graph convolutional matrix completion for bipartite edge prediction. In:
KDIR, pp 49–58
29. Yin R, Li K, Zhang G, Lu J (2019) A deeper graph neural network for recommender systems. Knowl-Based
Syst 185:105020
30. Fan W, Ma Y, Li Q, He Y, Zhao E, Tang J, Yin D (2019) Graph neural networks for social recommendation.
In: The World Wide Web Conference, pp 417–426
31. Rendle S (2010) Factorization machines. In: 2010 IEEE International Conference on Data Mining, pp
995–1000. IEEE
32. Trivedi R, Farajtabar M, Biswal P, Zha H (2018) Representation learning over dynamic graphs. arXiv
preprint arXiv:1803.04051
33. Rossi E, Chamberlain B, Frasca F, Eynard D, Monti F, Bronstein M (2020) Temporal graph networks for
deep learning on dynamic graphs. arXiv preprint arXiv:2006.10637
34. Li X, Zhang M, Wu S, Liu Z, Wang L, Philip SY (2020) Dynamic graph collaborative filtering. In: 2020
IEEE International Conference on Data Mining (ICDM), pp 322–331. IEEE
35. Sankar A, Wu Y, Gou L, Zhang W, Yang H (2018) Dynamic graph representation learning via self-attention
networks. arXiv preprint arXiv:1812.09430
36. Xian Z, Li Q, Li G, Li L (2017) New collaborative filtering algorithms based on svd++ and differential
privacy. Mathematical Problems in Engineering
37. Mnih A, Salakhutdinov RR (2008) Probabilistic matrix factorization. In: Advances in Neural Information
Processing Systems, pp 1257–1264
38. Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer
42(8):30–37
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.