COL775/COL7375: Assignment 1
The IndiaMART RecSys Challenge: Predicting Next-Product Engagement
Contents
1 Overview and Introduction
  1.1 The Business Imperative of Recommendation
  1.2 A Brief History of Recommendation Paradigms
  1.3 The Deep Learning Revolution: Capturing Sequential Dynamics
  1.4 Assignment Objective
2 The Competition & The IndiaMART Dataset
  2.1 Data Split and Leaderboard
  2.2 Training Data
  2.3 Test Data
3 Evaluation, Timeline, and Grading
  3.1 Evaluation Metric: Click-Through Rate (CTR) @ 6
  3.2 Assignment Deadlines
  3.3 Submission Format
  3.4 Grading Policy
4 Checkpoint Details
5 Code of Conduct & Tips
  5.1 Rules and Regulations
  5.2 Best Practices
6 Resources for Your Journey
  6.1 Tutorials, Blogs, and Codebases
  6.2 Inspiration from Past Kaggle Competitions
  6.3 Foundational & Classic Papers
  6.4 Sequential Recommendation Papers
  6.5 Graph-Based & Advanced Papers
1 Overview and Introduction
1.1 The Business Imperative of Recommendation
In large-scale digital marketplaces, recommendation systems are a critical component of the
platform’s economic engine and user experience. For a B2B platform like IndiaMART, which
connects millions of buyers and sellers, the ability to intelligently guide a user’s journey from one
product to the next is paramount. An effective recommendation can reveal a highly relevant
product that a user might not have discovered on their own, thereby increasing session duration,
conversion rates, and overall user satisfaction. Conversely, poor recommendations can lead to user
frustration and abandonment. The impact of these systems on both revenue and user retention
is, therefore, immense.
1.2 A Brief History of Recommendation Paradigms
Figure 1: Overview of recommender systems. Source: Ahmadian et al., "Recommender Systems based on Non-negative Matrix Factorization: A Survey," IEEE Transactions on Artificial Intelligence, 2025.
The field of recommendation systems has evolved through several dominant paradigms. Understanding this evolution provides context for the modern, sequence-aware models you will be building.
• Content-Based Filtering: The earliest systems operated on the principle of item similarity: "If you liked X, you might also like Y, because Y's properties are similar to X's." These systems analyze item attributes (e.g., a movie's genre, a product's category) to make recommendations. While effective, they often suffer from a lack of novelty and can trap users in a "filter bubble."
• Collaborative Filtering (CF): This paradigm shifted the focus from item content to
collective user behavior, operating on the motto: "Users who agreed in the past are likely
to agree in the future."
– Memory-Based (Neighborhood) Methods: These algorithms operate directly on
the user-item interaction matrix. They identify "neighbors", either similar users or
similar items, and aggregate their behavior to make predictions.
∗ User-Based CF: Predicts a user's preference for an item by finding a neighborhood of users with similar interaction histories and calculating a weighted average of their ratings on that item.
∗ Item-Based CF: Famously pioneered by Amazon, this method builds an item-item similarity matrix based on co-interaction patterns (e.g., users who bought item A also tended to buy item B). It is often more scalable and performant than user-based CF in practice.
– Model-Based Methods: Instead of using the raw interaction matrix at prediction
time, these approaches learn a parameterized model to uncover latent factors explaining
the observed interactions. The bulk of modern research resides here.
∗ Clustering-Based: A simple model-based approach where users or items are
clustered into groups based on their interaction patterns. Predictions are then
made by averaging the behavior within a given cluster.
∗ Matrix Factorization: A breakthrough class of models that decomposes the sparse user-item interaction matrix into two smaller, dense matrices of latent factors, one for users and one for items. These latent vectors (embeddings) capture the underlying user tastes and item properties. A minimal sketch follows this list.
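To make the factorization idea concrete, here is a minimal sketch that fits a two-factor decomposition of a toy implicit-feedback matrix with plain SGD. Everything here (the matrix, dimensions, learning rate) is illustrative; it is not tied to the assignment data.

import numpy as np

# Toy implicit-feedback matrix: rows = users, columns = items, 1 = interacted.
R = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)

n_users, n_items = R.shape
n_factors, lr, reg, epochs = 2, 0.05, 0.01, 200

rng = np.random.default_rng(0)
P = 0.1 * rng.standard_normal((n_users, n_factors))   # user latent factors
Q = 0.1 * rng.standard_normal((n_items, n_factors))   # item latent factors

for _ in range(epochs):
    for u in range(n_users):
        for i in range(n_items):
            err = R[u, i] - P[u] @ Q[i]               # reconstruction error
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])

# Recommend for user 0 by scoring every item with the learned embeddings.
print(np.round(P[0] @ Q.T, 2))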
1.3 The Deep Learning Revolution: Capturing Sequential Dynamics
While powerful, traditional CF methods often treat a user’s history as an unordered set of
interactions. This approach fails to capture the crucial element of sequence and intent. A
user’s interest is not static; it evolves within a single browsing session. The product you look at
now is a powerful predictor of what you’ll look at next—far more so than a product you viewed
two months ago.
This is the core of Session-Based Recommendation. The objective is to predict the user’s
immediate next action by leveraging the short-term context of the current session. Deep learning
architectures designed for sequential data, such as those used in natural language processing,
have proven exceptionally effective. By treating a sequence of product IDs as a sequence of words
in a sentence, we can build models that "understand" a user’s immediate intent and predict the
continuation of their journey.
• Recurrent Neural Networks (RNNs): Architectures like GRUs and LSTMs are
inherently designed to process sequences. By feeding a user’s browsing history as a sequence
of item embeddings, RNNs can model the temporal dynamics of their intent and predict
the next item. This approach, popularized by GRU4Rec, is a foundational technique for
this assignment.
• Graph Neural Networks (GNNs): User-item interaction data can be naturally represented as a bipartite graph, or, more relevant to this task, sessions can be modeled as a graph of item-to-item transitions. GNNs learn powerful item embeddings by aggregating information from their neighbors in the graph, effectively capturing complex connectivity patterns that other methods might miss. The SR-GNN paper is a key example of this approach; a small graph-construction sketch follows.
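As a small illustration of the session-as-graph view, the snippet below builds the weighted item-to-item transition graph that SR-GNN-style models consume. The sessions are made up; only the counting logic matters.

from collections import defaultdict

# Hypothetical sessions: each is an ordered list of product IDs.
sessions = [["A", "B", "C", "B"],
            ["B", "C", "D"]]

# edge_weight[(src, dst)] counts how often dst directly follows src.
edge_weight = defaultdict(int)
for session in sessions:
    for src, dst in zip(session, session[1:]):
        edge_weight[(src, dst)] += 1

# Normalize outgoing weights per source node, as SR-GNN does for its adjacency matrix.
out_total = defaultdict(int)
for (src, _), w in edge_weight.items():
    out_total[src] += w
adjacency = {edge: w / out_total[edge[0]] for edge, w in edge_weight.items()}
print(adjacency)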
1.4 Assignment Objective
This assignment places you at the forefront of this modern recommendation challenge. You will
work with a large-scale, real-world dataset from IndiaMART. Your mission is to develop a
sophisticated recommendation model that, given a user’s current product view (ProdID1), can
accurately predict and rank the top 6 unique products they are most likely to engage with
next. Your success will be measured by a direct business metric: the Click-Through Rate
(CTR) of your recommendations on a hidden, future dataset. This is a direct simulation of
deploying and evaluating a recommendation model in a live production environment.
2 The Competition & The IndiaMART Dataset
This assignment is a competitive challenge hosted on Kaggle, providing you with a real-time
leaderboard and a platform to test your models.
• Competition Name: COL775/COL7375 x IndiaMART RecSys Challenge
• https://www.kaggle.com/competitions/col-775-col-7375-x-india-mart-rec-sys-challenge
• Invite Link: https://www.kaggle.com/t/a157ac26d6c0aa073a5d4c21577621e5
• The dataset and a sample notebook for evaluation have been provided in the competition.
2.1 Data Split and Leaderboard
The dataset is split temporally to simulate a real-world scenario where you must predict future
behavior based on past data.
• Training Set: The data you will use to train your models.
• Public Leaderboard Set: A small block of data (about 15 days) used to calculate the
public leaderboard score. This gives you a snapshot of your model’s performance but is
susceptible to overfitting.
• Private Leaderboard Set: Your final, grade-determining rank is based on a final hidden
dataset that comes after the public leaderboard period. This is the true test of your model’s
ability to generalize to the future.
2.2 Training Data
You are provided with two rich datasets: one detailing user interactions and the other containing
product metadata.
1. Interactions.csv/train.csv: This is the core dataset, logging the sequence of user activities. It captures PDP-to-PDP transitions (PDP: Product Description Page), i.e., the movement of a user from one product page to another.
Col ID  Col Name         Description
A       User_ID          The unique, hashed ID of the user.
B       modid            The device ID used for browsing (e.g., desktop, mobile). This can capture different user behaviors across platforms.
C       Timestamp        The timestamp of the ProdID1 interaction.
D       ProdID1          The product ID of the first (source) product.
E       ProdID2          The product ID of the second (destination) product.
F       mcatid1          Category ID for ProdID1 at event time.
G       mcatid2          Category ID for ProdID2.
H       subcatid1        Higher-level category ID for ProdID1.
I       subcatid2        Higher-level category ID for ProdID2.
J       Time_Lag         Time elapsed between the ProdID1 and ProdID2 views (seconds.milliseconds). This can be a powerful feature indicating user intent.
K       Transition_Type  The specific user behavior that led from ProdID1 to ProdID2. This is a critical feature that provides context:
                         0: PDP-PDP transition through a widget, but the specific widget or position is unknown (m-site, m-app).
                         1-1000: identifies the widget and the position within the widget clicked on the IM PDP page (1-100: first widget; 101-200: second widget; 201-300: third widget, and so on). Currently only 1-20 are used, corresponding to the six positions on the first widget.
                         1001: transition through IM search.
                         1002: transition through Google search.
                         1003: transition through MCAT page.
                         1004: other type of transition.
                         1005: transition through CITY page.
                         1006: PDP page reload.
                         1007: seller page to PDP page.
                         1008: other search engines.
                         1010: Contact Supplier / Get Latest Quote.
                         1011: Click to Call or view mobile number.
                         1012: image clicked.
L       grp_id1          Industry/group identifier for ProdID1 (e.g., variant/collection/brand family).
M       grp_id2          Industry/group identifier for ProdID2.
N       user_city        User's city at event time (inferred/account-level).
O       cityid1          Listing city ID for ProdID1.
P       cityid2          Listing city ID for ProdID2.
Table 1: Description of columns in Interactions.csv
Note 1: Special Product IDs: The following special product IDs must be created to mark session boundaries:
• Start of session: -1
• End of session: -2
Note 2: Data Integrity: For each ProductID, the number of times it appears in column D
must be exactly the same as the number of times it appears in column E.
2. Product_Data.csv: Contains rich metadata for the products.
Col ID  Column Name            Description
A       pc_item_display_id     Primary product identifier; the join key to ProdID1/ProdID2 in interactions.csv. Unique, anonymized, and stable over time.
B       pc_item_name           Display title of the product. May contain spelling errors, capitalization variance, or promotional text. Useful for text-based embeddings (TF-IDF, word2vec, transformer encoders).
C       pc_item_glusr_usr_id   Seller account ID. Groups products by supplier; useful for modeling supplier-level diversity, fairness, or cold-start behavior.
D       city_id                Listing city identifier (numeric key) giving the geographic location of the supplier/product. Matches cityid1/cityid2 in interactions.
E       city_name              Human-readable name of the listing city. Redundant with city_id, but useful for sanity checks and interpretable reporting.
F       glcat_mcat_id          Master category identifier (coarse taxonomy level). Maps to mcatid1/mcatid2 in interactions.
G       glcat_mcat_name        Text name for the master category (e.g., "Plywoods"). Descriptive label for reporting and error analysis.
H       glcat_cat_id           Sub-category identifier (finer taxonomy than glcat_mcat_id). Maps to subcatid1/subcatid2 in interactions.
I       glcat_cat_name         Text name for the sub-category (e.g., "Marine Plywood"). Useful for category-aware sampling, embeddings, and explanations.
J       specs                  Semi-structured key:value product specifications; may include attributes such as dimensions, material, and brand. Free text with inconsistent delimiters; requires parsing/normalization to exploit fully.
2.3 Test Data
The test dataset test.csv is similar to train.csv, but all information regarding ProdID2 has been removed (columns such as grp_id2, cityid2, and mcatid2), and a special index column has been added that serves as the row-alignment key for submissions.
3 Evaluation, Timeline, and Grading
3.1 Evaluation Metric: Click-Through Rate (CTR) @ 6
For every ProdID1 in the test set, your model produces 6 predictions. If the true ProdID2 is
among your 6 predictions, it is a "Hit".
Formula: Let $N$ be the total number of interactions in the test set, let $R_i$ be the set of 6 products you recommend for the $i$-th interaction, and let $y_i$ be the true next product. Then
$$\mathrm{CTR@6} = \frac{1}{N}\sum_{i=1}^{N}\mathbb{1}\left[\,y_i \in R_i\,\right].$$
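In code, the metric reduces to a few lines. Below is a small sketch for local validation; it is not the official Kaggle scorer, and the example inputs are made up.

def ctr_at_6(y_true, recommendations):
    """y_true: true next products; recommendations: one 6-item list per interaction."""
    assert len(y_true) == len(recommendations)
    hits = sum(1 for y, recs in zip(y_true, recommendations) if y in recs)
    return hits / len(y_true)

# Toy example: one hit out of two interactions gives CTR@6 = 0.5.
print(ctr_at_6(["p3", "p9"],
               [["p1", "p2", "p3", "p4", "p5", "p6"],
                ["p1", "p2", "p7", "p4", "p5", "p6"]]))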
3.2 Assignment Deadlines
There will be three deadlines to ensure steady progress. Note that all the deadlines are at 7:00
p.m. IST!
Submission         Deadline                         Deliverables
Checkpoint 1       7:00 p.m., September 21, 2025    A baseline submission using frequency-based methods. (30 marks)
Checkpoint 2       7:00 p.m., October 05, 2025      An improved submission using ML/DL techniques only on product details. (30 marks)
Final Submission   7:00 p.m., October 26, 2025      Final Kaggle submission based on both user sessions + product details. (40 marks)
3.3 Submission Format
All submissions to Kaggle must follow a strict CSV schema. Each row corresponds to one test
interaction (identified by index) and contains exactly six predictions. Thus, you need to provide
predictions for all the rows in the test dataset, with index being the unique identifier.
• The header must be:
index,predictions
• Each row begins with the test index, followed by the six distinct product IDs predicted as the most likely next products (ProdID2), separated by commas.
• No sentinel IDs (-1, -2) may appear in predictions.
• All predictions in a row must be unique.
Example.
index,predictions
0,"eGSyx5PM,2DVEEARU,bJQEE1GD,XSD0e4lg,cVmIQnPp,kVk9FFHM"
1,"ja1yduXM,Yneo6XiB,6zrCSZJL,XSD0e4lg,cVmIQnPp,kVk9FFHM"
3.4 Grading Policy
The grading is designed to reward both competitive performance and the depth of your research.
• Competitive Component (30%): Your final grade for this part is determined by your
team’s rank on the private Kaggle leaderboard. Ranks will be grouped into tiers to
assign scores.
• Non-Competitive Component (70%): Graded independently of your rank, this is
based on the quality, depth, and clarity of your Final Report and submitted code. A
rigorous report detailing extensive experimentation and insightful analysis can earn full
marks, even if the final rank is not at the very top. This component rewards the process of
scientific inquiry.
• Furthermore, a demo will be held after the deadlines, where we will discuss your implementation and judge your understanding of the principles behind the models. The final score for the assignment will be the product of the demo score (between 0 and 1) and the assignment score.
• All three checkpoints will be graded independently.
4 Checkpoint Details
Checkpoint 1: Frequency-Based Baseline (Non-Competitive)
All students must submit the same baseline method to establish a common starting point. The
requirement is to construct a simple frequency-based predictor directly from train.csv:
• Compute the frequency of products appearing as ProdID2 across all users and sessions for a
given ProdID1 in the training data.
• Sort all products by descending frequency and output the top 6 products as the recommendations for ProdID1.
• Ensure predictions are valid: 6 unique product IDs, no sentinels (-1, -2). If fewer than 6 products appear as ProdID2 for a given ProdID1, set the remaining recommendations to "unknown"; e.g., if only 2 ProdID2 values are observed, the recommendations will be "6zrCSZJL,Yneo6XiB,unknown,unknown,unknown,unknown".
• For each row in the test set, predict the top 6 products based only on ProdID1. A pandas sketch of this baseline follows the list.
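The sketch below implements the baseline with pandas. Column names follow Table 1; treat it as a starting point rather than a reference implementation.

import pandas as pd

train = pd.read_csv("train.csv")

# Count how often each ProdID2 follows each ProdID1, then keep the 6 most frequent.
counts = (train.groupby(["ProdID1", "ProdID2"]).size()
               .rename("freq").reset_index())
counts = counts[~counts["ProdID2"].isin([-1, -2, "-1", "-2"])]   # drop sentinels
top6 = (counts.sort_values(["ProdID1", "freq"], ascending=[True, False])
              .groupby("ProdID1")["ProdID2"]
              .apply(lambda s: list(s.head(6)))
              .to_dict())

def recommend(prod_id):
    recs = top6.get(prod_id, [])
    return recs + ["unknown"] * (6 - len(recs))   # pad short lists, as the spec allows

test = pd.read_csv("test.csv")
test["predictions"] = test["ProdID1"].map(lambda p: ",".join(map(str, recommend(p))))
test[["index", "predictions"]].to_csv("submission.csv", index=False)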
Deliverable. A Kaggle submission file produced by this method and the code used to generate
the submission. The code will be submitted on Moodle. (30 marks)
Checkpoint 2: Product Centered Approach (Competitive + Non-Competitive)
Once the frequency baseline is established, you are expected to innovate and improve. For this checkpoint, you must use feature engineering and DL models based only on the product details, disregarding any session- or user-level signals. Examples include (a TF-IDF sketch follows this list):
• Matrix factorization on an item–attribute matrix (products × categories/features), TF-IDF on product specs, feature engineering on mcatid, subcatid, group IDs, supplier city, etc.
• Autoencoders on product features: reconstruct product feature vectors and use latent space
similarity as recommendation scores.
• Simple Machine Learning Models: Logistic regression or tree-based models (e.g., LightGBM) using features such as category matches or city overlaps.
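As one concrete example, the sketch below builds a TF-IDF similarity recommender over product titles with scikit-learn. It assumes the Product_Data.csv columns described above; parameters such as ngram_range are illustrative, not tuned.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

products = pd.read_csv("Product_Data.csv")

# Embed each product title; the specs column could be concatenated in as well.
vectorizer = TfidfVectorizer(min_df=2, ngram_range=(1, 2))
X = vectorizer.fit_transform(products["pc_item_name"].fillna(""))

def similar_products(row_idx, k=6):
    """Return the k products whose titles are most similar to the one at row_idx."""
    sims = cosine_similarity(X[row_idx], X).ravel()
    order = sims.argsort()[::-1]
    top = [i for i in order if i != row_idx][:k]
    return products["pc_item_display_id"].iloc[top].tolist()

print(similar_products(0))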
Deliverable. A working submission using at least one ML/DL method or a significantly improved heuristic over the global baseline, accompanied by a report including your observations, validation analysis, and ablations. The code has to be submitted on Moodle, and the report on Gradescope. The report must contain the link to your model uploaded on Kaggle (make sure we can access it!). (30 marks)
Checkpoint 3: Product + User Centered Approach (Competitive + Non-Competitive)
Develop full-fledged recommendation pipelines that integrate modern deep learning methods and
hybrid strategies, based on both the product specific features and past user interactions. You
may experiment with:
• Sequential Models: GRU4Rec, SASRec, BERT4Rec — models that consume user sessions as sequences of product embeddings. A skeletal GRU example follows this list.
• Graph Models: SR-GNN or LightGCN over item–item transitions to capture structural
connectivity.
• Two-Stage Architectures: Candidate generation (e.g., co-visitation, popularity, simple CF)
followed by re-ranking with a feature-rich model (Transformer, MLP, LightGBM).
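To illustrate the sequential family, here is a skeletal GRU4Rec-style next-item model in PyTorch. The hyperparameters and the random toy batch are placeholders; a real pipeline needs proper session batching, padding/masking, and (for large catalogs) sampled softmax or negative sampling.

import torch
import torch.nn as nn

class NextItemGRU(nn.Module):
    """Embeds a session of item IDs, runs a GRU, and scores the catalog at each step."""
    def __init__(self, n_items, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(n_items, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_items)

    def forward(self, sessions):                # sessions: (batch, seq_len) item IDs
        h, _ = self.gru(self.emb(sessions))     # (batch, seq_len, hidden_dim)
        return self.out(h)                      # logits over the catalog at each step

n_items = 1000                                  # placeholder catalog size
model = NextItemGRU(n_items)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=0)   # ignore padded positions

batch = torch.randint(1, n_items, (8, 12))      # toy batch: 8 sessions of length 12
logits = model(batch[:, :-1])                   # predict the item at each next step
loss = loss_fn(logits.reshape(-1, n_items), batch[:, 1:].reshape(-1))
loss.backward()
opt.step()

# At inference: take the last position's logits, mask already-seen items, keep top 6.
top6 = logits[:, -1].topk(6).indices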
Deliverable. A final Kaggle submission, full reproducible code on Moodle, and a detailed
report documenting experiments, ablations, error analysis, limitations, and the model link on
Gradescope. (40 marks)
5 Code of Conduct & Tips
5.1 Rules and Regulations
• Team Size: Up to 4 members.
• Open Source: You may use any open-source code, but you must cite it in your report
and be able to explain its functionality.
• Pre-trained Models: Use of pre-trained weights and embeddings is permitted and
encouraged.
5.2 Best Practices
Reproducibility
• Use a virtual environment (venv or conda) and record exact package versions in a requirements.txt
or environment.yml.
• Fix random seeds (Python, NumPy, PyTorch/TensorFlow) to make experiments repeatable; a helper snippet follows this list.
• Provide a clear script or README.md that reproduces your submission from raw data.
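For instance, a small seed-fixing helper (PyTorch shown; adapt the calls to your framework):

import os
import random
import numpy as np
import torch

def set_seed(seed=42):
    """Fix the common sources of randomness so experiments are repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)

set_seed(42)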
Data Handling
• Do not use test data or future-period data when computing statistics or building features
(avoid leakage).
• Use temporal validation splits only: training data must precede validation data chronologically. A sketch follows this list.
• Split into train/validation before fitting encoders, scalers, or embeddings.
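A temporal hold-out is a few lines of pandas once interactions are sorted. This sketch assumes the Timestamp column parses as a datetime; the 15-day window simply mirrors the public leaderboard period.

import pandas as pd

train = pd.read_csv("train.csv")
train["Timestamp"] = pd.to_datetime(train["Timestamp"])   # assumes a parseable format
train = train.sort_values("Timestamp")

# Hold out the most recent 15 days as validation, mimicking the leaderboard split.
cutoff = train["Timestamp"].max() - pd.Timedelta(days=15)
fit_df = train[train["Timestamp"] <= cutoff]
val_df = train[train["Timestamp"] > cutoff]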
Modeling
• Begin with simple baselines (frequency, co-visitation) before complex models.
• For sequential models, mask future items correctly during training.
• Validate on your own held-out set; avoid tuning only on the public leaderboard.
Robustness
• Prototype on small data subsets; scale up after verifying correctness.
• Ensure every test row produces 6 unique predictions; use a global-popularity fallback if
needed.
• Double-check submission format: correct headers, no sentinel IDs (-1, -2).
Reporting
• Document splits, baselines, models tried, results, and failures.
• Report compute details (hardware, runtime, parameters).
• State limitations and assumptions clearly.
• Make sure the links to models and/or Kaggle notebooks work!
We wish you the best of luck. We are excited to see the powerful and creative models you will
build!
6 Resources for Your Journey
This section provides a curated list of research papers and resources to guide your project. We
strongly encourage you to read the relevant papers for the models you choose to implement.
Understanding the original motivations and architectures is a key part of this assignment.
6.1 Tutorials, Blogs, and Codebases
• NVIDIA Merlin: An open-source framework for building large-scale recommender systems. Exploring their tutorials can provide great insights. https://github.com/NVIDIA-Merlin/Merlin
• "Recommender Systems" by Google Developers: A high-level course on the funda-
mentals. https://developers.google.com/machine-learning/recommendation
• Papers With Code: An excellent resource for finding implementations under the Sequential Recommendation task. https://paperswithcode.com/task/sequential-recommendation
6.2 Inspiration from Past Kaggle Competitions
One of the most valuable skills in machine learning is the ability to learn from the community.
The winning solutions from past Kaggle competitions are a treasure trove of practical techniques
and robust strategies. We highly encourage you to study the top-ranking solutions. Your goal
is not to copy-paste code, but to understand the strategies behind winning solutions.
Here are a few highly relevant competitions:
1. H&M Personalized Fashion Recommendations
• Link: https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations
• Relevance: This competition is a masterclass in handling large-scale implicit feedback.
The dominant strategy was a two-stage candidate generation + re-ranking architecture,
a powerful pattern for this type of problem.
2. OTTO - Multi-Objective Recommender System
• Link: https://www.kaggle.com/competitions/otto-recommender-system
• Relevance: The data format (session-based clickstreams) is almost identical to yours. The winning solutions demonstrated the power of carefully constructed co-occurrence matrices for candidate generation, making this competition a masterclass in feature engineering for session data.
6.3 Foundational & Classic Papers
These papers introduce the core concepts that underpin modern recommender systems.
• Amazon.com Recommendations: Item-to-Item Collaborative Filtering (2003)
https://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf
1. Item-Based Collaborative Filtering Recommendation Algorithms
• Citation: Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). WWW ’01.
• Remarks: The original paper that formalized item-based collaborative filtering.
• Link to Paper: https://dl.acm.org/doi/10.1145/371920.372071
2. Matrix Factorization Techniques for Recommender Systems
• Citation: Koren, Y., Bell, R., & Volinsky, C. (2009). IEEE Computer.
• Remarks: The quintessential paper summarizing the power of matrix factorization,
made famous during the Netflix Prize. A must-read.
• Link to Paper: https://datajobs.com/data-science-repo/Recommender-Systems-[Netflix].pdf
3. Collaborative Filtering for Implicit Feedback Datasets
• Citation: Hu, Y., Koren, Y., & Volinsky, C. (2008). ICDM ’08.
• Remarks: Extremely relevant for this assignment. It introduces the popular
Alternating Least Squares (ALS) method for implicit feedback (clicks, views).
• Link to Paper: http://yifanhu.net/PUB/cf.pdf
4. BPR: Bayesian Personalized Ranking from Implicit Feedback
• Citation: Rendle, S., et al. (2009). UAI ’09.
• Remarks: Introduces BPR, a powerful loss function that directly optimizes for ranking,
which is crucial for recommendation.
• Link to Paper: https://arxiv.org/abs/1205.2618
5. Factorization Machines
• Citation: Rendle, S. (2010). ICDM ’10.
• Remarks: A powerful model that generalizes matrix factorization and is highly effective
for handling sparse categorical features.
• Link to Paper: https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
6. Wide & Deep Learning for Recommender Systems
• Citation: Cheng, H. T., et al. (2016). RecSys ’16.
• Remarks: A seminal paper from Google showing how to combine deep networks (for
learning feature interactions) with a linear model (for memorization).
• Link to Paper: https://arxiv.org/abs/1606.07792
6.4 Sequential Recommendation Papers
This is the core research area for your assignment.
7. Session-based Recommendations with Recurrent Neural Networks (GRU4Rec)
• Citation: Hidasi, B., et al. (2016). ICLR ’16.
• Remarks: The foundational paper that successfully applied RNNs (specifically GRUs)
to session-based recommendation, setting a new standard for performance.
• Link to Paper: https://arxiv.org/abs/1511.06939
8. Self-Attentive Sequential Recommendation (SASRec)
• Citation: Kang, W. C., & McAuley, J. (2018). ICDM ’18.
• Remarks: A key paper that replaces the RNN with a Transformer (self-attention)
model. It is often more effective at capturing long-range dependencies and is a very
strong baseline to aim for.
• Link to Paper: https://arxiv.org/abs/1808.09781
9. BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
• Citation: Sun, F., et al. (2019). CIKM ’19.
• Remarks: Adapts the cloze task objective from language modeling (BERT) to recommendation, allowing for learning bidirectional sequence representations.
• Link to Paper: https://arxiv.org/abs/1904.06690
6.5 Graph-Based & Advanced Papers
10. Session-based Recommendation with Graph Neural Networks (SR-GNN)
• Citation: Wu, S., et al. (2019). AAAI ’19.
• Remarks: A seminal paper on applying GNNs to session data, constructing a graph
for each session to learn item transitions.
• Link to Paper: https://arxiv.org/abs/1811.00855
11. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation
• Citation: He, X., et al. (2020). SIGIR ’20.
• Remarks: A powerful and simplified GCN model that has become a very strong
baseline in collaborative filtering.
• Link to Paper: https://arxiv.org/abs/2002.02126