COL775/COL7375: Assignment 1
The IndiaMART RecSys Challenge: Predicting Next-Product Engagement
Contents
1 Overview and Introduction
  1.1 The Business Imperative of Recommendation
  1.2 A Brief History of Recommendation Paradigms
  1.3 The Deep Learning Revolution: Capturing Sequential Dynamics
  1.4 Assignment Objective
2 The Competition & The IndiaMART Dataset
  2.1 Data Split and Leaderboard
  2.2 Training Data
  2.3 Test Data
3 Evaluation, Timeline, and Grading
  3.1 Evaluation Metric: Click-Through Rate (CTR) @ 6
  3.2 Assignment Deadlines
  3.3 Submission Format
  3.4 Grading Policy
4 Checkpoint Details
5 Code of Conduct & Tips
  5.1 Rules and Regulations
  5.2 Best Practices
6 Resources for Your Journey
  6.1 Tutorials, Blogs, and Codebases
  6.2 Inspiration from Past Kaggle Competitions
  6.3 Foundational & Classic Papers
  6.4 Sequential Recommendation Papers
  6.5 Graph-Based & Advanced Papers
1 Overview and Introduction
1.1 The Business Imperative of Recommendation
In large-scale digital marketplaces, recommendation systems are a critical component of the
platform’s economic engine and user experience. For a B2B platform like IndiaMART, which
connects millions of buyers and sellers, the ability to intelligently guide a user’s journey from one
product to the next is paramount. An effective recommendation can reveal a highly relevant
product that a user might not have discovered on their own, thereby increasing session duration,
conversion rates, and overall user satisfaction. Conversely, poor recommendations can lead to user
frustration and abandonment. The impact of these systems on both revenue and user retention
is, therefore, immense.
1.2 A Brief History of Recommendation Paradigms
Figure 1: Overview of recommender systems. Source: Ahmadian et al., "Recommender Systems based on Non-negative Matrix Factorization: A Survey," IEEE Transactions on Artificial Intelligence, 2025.
The field of recommendation systems has evolved through several dominant paradigms. Understanding this evolution provides context for the modern, sequence-aware models you will be building.
• Content-Based Filtering: The earliest systems operated on the principle of item similarity: "If you liked X, you might also like Y, because Y's properties are similar to X's." These systems analyze item attributes (e.g., a movie's genre, a product's category) to make recommendations. While effective, they often suffer from a lack of novelty and can trap users in a "filter bubble."
• Collaborative Filtering (CF): This paradigm shifted the focus from item content to
collective user behavior, operating on the motto: "Users who agreed in the past are likely
to agree in the future."
– Memory-Based (Neighborhood) Methods: These algorithms operate directly on
the user-item interaction matrix. They identify "neighbors", either similar users or
similar items, and aggregate their behavior to make predictions.
∗ User-Based CF: Predicts a user's preference for an item by finding a neighborhood of users with similar interaction histories and calculating a weighted average of their ratings on that item.
∗ Item-Based CF: Famously pioneered by Amazon, this method builds an item-item similarity matrix based on co-interaction patterns (e.g., users who bought item A also tended to buy item B). It is often more scalable and performant than user-based CF in practice.
– Model-Based Methods: Instead of using the raw interaction matrix at prediction
time, these approaches learn a parameterized model to uncover latent factors explaining
the observed interactions. The bulk of modern research resides here.
∗ Clustering-Based: A simple model-based approach where users or items are
clustered into groups based on their interaction patterns. Predictions are then
made by averaging the behavior within a given cluster.
∗ Matrix Factorization: A breakthrough class of models that decomposes the sparse user-item interaction matrix into two smaller, dense matrices of latent factors, one for users and one for items. These latent vectors (embeddings) capture the underlying user tastes and item properties. A minimal sketch follows this list.
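To make the factorization idea concrete, here is a minimal sketch that fits a two-factor decomposition of a toy implicit-feedback matrix with plain SGD. Everything here (the matrix, dimensions, learning rate) is illustrative; it is not tied to the assignment data.

import numpy as np

# Toy implicit-feedback matrix: rows = users, columns = items, 1 = interacted.
R = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)

n_users, n_items = R.shape
n_factors, lr, reg, epochs = 2, 0.05, 0.01, 200

rng = np.random.default_rng(0)
P = 0.1 * rng.standard_normal((n_users, n_factors))   # user latent factors
Q = 0.1 * rng.standard_normal((n_items, n_factors))   # item latent factors

for _ in range(epochs):
    for u in range(n_users):
        for i in range(n_items):
            err = R[u, i] - P[u] @ Q[i]               # reconstruction error
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])

# Recommend for user 0 by scoring every item with the learned embeddings.
print(np.round(P[0] @ Q.T, 2))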
1.3 The Deep Learning Revolution: Capturing Sequential Dynamics
While powerful, traditional CF methods often treat a user’s history as an unordered set of
interactions. This approach fails to capture the crucial element of sequence and intent. A
user’s interest is not static; it evolves within a single browsing session. The product you look at
now is a powerful predictor of what you’ll look at next—far more so than a product you viewed
two months ago.
This is the core of Session-Based Recommendation. The objective is to predict the user’s
immediate next action by leveraging the short-term context of the current session. Deep learning
architectures designed for sequential data, such as those used in natural language processing,
have proven exceptionally effective. By treating a sequence of product IDs as a sequence of words
in a sentence, we can build models that "understand" a user’s immediate intent and predict the
continuation of their journey.
• Recurrent Neural Networks (RNNs): Architectures like GRUs and LSTMs are
inherently designed to process sequences. By feeding a user’s browsing history as a sequence
of item embeddings, RNNs can model the temporal dynamics of their intent and predict
the next item. This approach, popularized by GRU4Rec, is a foundational technique for
this assignment.
• Graph Neural Networks (GNNs): User-item interaction data can be naturally represented as a bipartite graph, or, more relevant to this task, sessions can be modeled as a graph of item-to-item transitions. GNNs learn powerful item embeddings by aggregating information from their neighbors in the graph, effectively capturing complex connectivity patterns that other methods might miss. The SR-GNN paper is a key example of this approach; a small graph-construction sketch follows.
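As a small illustration of the session-as-graph view, the snippet below builds the weighted item-to-item transition graph that SR-GNN-style models consume. The sessions are made up; only the counting logic matters.

from collections import defaultdict

# Hypothetical sessions: each is an ordered list of product IDs.
sessions = [["A", "B", "C", "B"],
            ["B", "C", "D"]]

# edge_weight[(src, dst)] counts how often dst directly follows src.
edge_weight = defaultdict(int)
for session in sessions:
    for src, dst in zip(session, session[1:]):
        edge_weight[(src, dst)] += 1

# Normalize outgoing weights per source node, as SR-GNN does for its adjacency matrix.
out_total = defaultdict(int)
for (src, _), w in edge_weight.items():
    out_total[src] += w
adjacency = {edge: w / out_total[edge[0]] for edge, w in edge_weight.items()}
print(adjacency)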
1.4 Assignment Objective
This assignment places you at the forefront of this modern recommendation challenge. You will
work with a large-scale, real-world dataset from IndiaMART. Your mission is to develop a
sophisticated recommendation model that, given a user’s current product view (ProdID1), can
accurately predict and rank the top 6 unique products they are most likely to engage with
next. Your success will be measured by a direct business metric: the Click-Through Rate
(CTR) of your recommendations on a hidden, future dataset. This is a direct simulation of
deploying and evaluating a recommendation model in a live production environment.
2 The Competition & The IndiaMART Dataset
This assignment is a competitive challenge hosted on Kaggle, providing you with a real-time
leaderboard and a platform to test your models.
• Competition Name: COL775/COL7375 x IndiaMART RecSys Challenge
• https://www.kaggle.com/competitions/col-775-col-7375-x-india-mart-rec-sys-challenge
• Invite Link: https://www.kaggle.com/t/a157ac26d6c0aa073a5d4c21577621e5
• The dataset and a sample notebook for evaluation have been provided in the competition.
2.1 Data Split and Leaderboard
The dataset is split temporally to simulate a real-world scenario where you must predict future
behavior based on past data.
• Training Set: The data you will use to train your models.
• Public Leaderboard Set: A small block of data (about 15 days) used to calculate the
public leaderboard score. This gives you a snapshot of your model’s performance but is
susceptible to overfitting.
• Private Leaderboard Set: Your final, grade-determining rank is based on a final hidden
dataset that comes after the public leaderboard period. This is the true test of your model’s
ability to generalize to the future.
2.2 Training Data
You are provided with two rich datasets: one detailing user interactions and the other containing
product metadata.
1. Interactions.csv/train.csv: This is the core dataset, logging the sequence of user activities. It captures PDP-to-PDP transitions (PDP: Product Description Page), i.e., the movement of a user from one product page to another.
Col ID  Col Name         Description
A       User_ID          The unique, hashed ID of the user.
B       modid            The device ID used for browsing (e.g., desktop, mobile). This can capture different user behaviors across platforms.
C       Timestamp        The timestamp of the ProdID1 interaction.
D       ProdID1          The product ID of the first (source) product.
E       ProdID2          The product ID of the second (destination) product.
F       mcatid1          Category ID for ProdID1 at event time.
G       mcatid2          Category ID for ProdID2.
H       subcatid1        Higher-level category ID for ProdID1.
I       subcatid2        Higher-level category ID for ProdID2.
J       Time_Lag         Time elapsed between the ProdID1 and ProdID2 views (seconds.milliseconds). This can be a powerful feature indicating user intent.
K       Transition_Type  The specific user behavior that led from ProdID1 to ProdID2. This is a critical feature that provides context:
                         0: PDP-PDP transition through a widget, but the specific widget or position is unknown (m-site, m-app).
                         1-1000: identifies the widget and the position within the widget clicked on the IM PDP page (1-100: first widget; 101-200: second widget; 201-300: third widget, and so on). Currently only 1-20 are used, corresponding to the six positions on the first widget.
                         1001: transition through IM search.
                         1002: transition through Google search.
                         1003: transition through MCAT page.
                         1004: other type of transition.
                         1005: transition through CITY page.
                         1006: PDP page reload.
                         1007: seller page to PDP page.
                         1008: other search engines.
                         1010: Contact Supplier / Get Latest Quote.
                         1011: Click to Call or view mobile number.
                         1012: image clicked.
L       grp_id1          Industry/group identifier for ProdID1 (e.g., variant/collection/brand family).
M       grp_id2          Industry/group identifier for ProdID2.
N       user_city        User's city at event time (inferred/account-level).
O       cityid1          Listing city ID for ProdID1.
P       cityid2          Listing city ID for ProdID2.
Table 1: Description of columns in Interactions.csv
Note 1: Special Product IDs: The following special product IDs must be created to mark session boundaries:
• Start of session: -1
• End of session: -2
Note 2: Data Integrity: For each ProductID, the number of times it appears in column D
must be exactly the same as the number of times it appears in column E.
2. Product_Data.csv: Contains rich metadata for the products.
Col ID  Column Name            Description
A       pc_item_display_id     Primary product identifier; the join key to ProdID1/ProdID2 in interactions.csv. Unique, anonymized, and stable over time.
B       pc_item_name           Display title of the product. May contain spelling errors, capitalization variance, or promotional text. Useful for text-based embeddings (TF-IDF, word2vec, transformer encoders).
C       pc_item_glusr_usr_id   Seller account ID. Groups products by supplier; useful for modeling supplier-level diversity, fairness, or cold-start behavior.
D       city_id                Listing city identifier (numeric key) giving the geographic location of the supplier/product. Matches cityid1/cityid2 in interactions.
E       city_name              Human-readable name of the listing city. Redundant with city_id, but useful for sanity checks and interpretable reporting.
F       glcat_mcat_id          Master category identifier (coarse taxonomy level). Maps to mcatid1/mcatid2 in interactions.
G       glcat_mcat_name        Text name for the master category (e.g., "Plywoods"). Descriptive label for reporting and error analysis.
H       glcat_cat_id           Sub-category identifier (finer taxonomy than glcat_mcat_id). Maps to subcatid1/subcatid2 in interactions.
I       glcat_cat_name         Text name for the sub-category (e.g., "Marine Plywood"). Useful for category-aware sampling, embeddings, and explanations.
J       specs                  Semi-structured key:value product specifications; may include attributes such as dimensions, material, and brand. Free text with inconsistent delimiters; requires parsing/normalization to exploit fully.
2.3 Test Data
The test dataset test.csv is similar to train.csv, but all information regarding ProdID2 has been removed (columns such as grp_id2, cityid2, and mcatid2), and a special index column has been added that serves as the row-alignment key for submissions.
3 Evaluation, Timeline, and Grading
3.1 Evaluation Metric: Click-Through Rate (CTR) @ 6
For every ProdID1 in the test set, your model produces 6 predictions. If the true ProdID2 is
among your 6 predictions, it is a "Hit".
Formula: Let $N$ be the total number of interactions in the test set, let $R_i$ be the set of 6 products you recommend for the $i$-th interaction, and let $y_i$ be the true next product. Then
$$\mathrm{CTR@6} = \frac{1}{N}\sum_{i=1}^{N}\mathbb{1}\left[\,y_i \in R_i\,\right].$$
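In code, the metric reduces to a few lines. Below is a small sketch for local validation; it is not the official Kaggle scorer, and the example inputs are made up.

def ctr_at_6(y_true, recommendations):
    """y_true: true next products; recommendations: one 6-item list per interaction."""
    assert len(y_true) == len(recommendations)
    hits = sum(1 for y, recs in zip(y_true, recommendations) if y in recs)
    return hits / len(y_true)

# Toy example: one hit out of two interactions gives CTR@6 = 0.5.
print(ctr_at_6(["p3", "p9"],
               [["p1", "p2", "p3", "p4", "p5", "p6"],
                ["p1", "p2", "p7", "p4", "p5", "p6"]]))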
3.2 Assignment Deadlines
There will be three deadlines to ensure steady progress. Note that all the deadlines are at 7:00
p.m. IST!
Submission         Deadline                         Deliverables
Checkpoint 1       7:00 p.m., September 21, 2025    A baseline submission using frequency-based methods. (30 marks)
Checkpoint 2       7:00 p.m., October 05, 2025      An improved submission using ML/DL techniques only on product details. (30 marks)
Final Submission   7:00 p.m., October 26, 2025      Final Kaggle submission based on both user sessions + product details. (40 marks)
3.3 Submission Format
All submissions to Kaggle must follow a strict CSV schema. Each row corresponds to one test
interaction (identified by index) and contains exactly six predictions. Thus, you need to provide
predictions for all the rows in the test dataset, with index being the unique identifier.
• The header must be:
index,predictions
• Each row begins with the test index, followed by the six distinct product IDs predicted as the most likely next products (ProdID2), separated by commas.
• No sentinel IDs (-1, -2) may appear in predictions.
• All predictions in a row must be unique.
Example.
index,predictions
0,"eGSyx5PM,2DVEEARU,bJQEE1GD,XSD0e4lg,cVmIQnPp,kVk9FFHM"
1,"ja1yduXM,Yneo6XiB,6zrCSZJL,XSD0e4lg,cVmIQnPp,kVk9FFHM"
3.4 Grading Policy
The grading is designed to reward both competitive performance and the depth of your research.
• Competitive Component (30%): Your final grade for this part is determined by your
team’s rank on the private Kaggle leaderboard. Ranks will be grouped into tiers to
assign scores.
• Non-Competitive Component (70%): Graded independently of your rank, this is
based on the quality, depth, and clarity of your Final Report and submitted code. A
rigorous report detailing extensive experimentation and insightful analysis can earn full
marks, even if the final rank is not at the very top. This component rewards the process of
scientific inquiry.
• Furthermore, a demo will be held after the deadlines, where we will discuss your implementation and judge your understanding of the principles behind the models. The final score for the assignment will be the product of the demo score (between 0 and 1) and the assignment score.
• All three checkpoints will be graded independently.
4 Checkpoint Details
Checkpoint 1: Frequency-Based Baseline (Non-Competitive)
All students must submit the same baseline method to establish a common starting point. The
requirement is to construct a simple frequency-based predictor directly from train.csv:
• Compute the frequency of products appearing as ProdID2 across all users and sessions for a
given ProdID1 in the training data.
• Sort all products by descending frequency and output the top 6 products as the recommendations for ProdID1.
• Ensure predictions are valid: 6 unique product IDs, no sentinels (-1, -2). If fewer than 6 products appear as ProdID2 for a given ProdID1, set the remaining recommendations to "unknown"; e.g., if only 2 ProdID2 values are observed, the recommendations will be "6zrCSZJL,Yneo6XiB,unknown,unknown,unknown,unknown".
• For each row in the test set, predict the top 6 products based only on ProdID1. A pandas sketch of this baseline follows the list.
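The sketch below implements the baseline with pandas. Column names follow Table 1; treat it as a starting point rather than a reference implementation.

import pandas as pd

train = pd.read_csv("train.csv")

# Count how often each ProdID2 follows each ProdID1, then keep the 6 most frequent.
counts = (train.groupby(["ProdID1", "ProdID2"]).size()
               .rename("freq").reset_index())
counts = counts[~counts["ProdID2"].isin([-1, -2, "-1", "-2"])]   # drop sentinels
top6 = (counts.sort_values(["ProdID1", "freq"], ascending=[True, False])
              .groupby("ProdID1")["ProdID2"]
              .apply(lambda s: list(s.head(6)))
              .to_dict())

def recommend(prod_id):
    recs = top6.get(prod_id, [])
    return recs + ["unknown"] * (6 - len(recs))   # pad short lists, as the spec allows

test = pd.read_csv("test.csv")
test["predictions"] = test["ProdID1"].map(lambda p: ",".join(map(str, recommend(p))))
test[["index", "predictions"]].to_csv("submission.csv", index=False)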
Deliverable. A Kaggle submission file produced by this method and the code used to generate
the submission. The code will be submitted on Moodle. (30 marks)
Checkpoint 2: Product Centered Approach (Competitive + Non-Competitive)
Once the frequency baseline is established, you are expected to innovate and improve. For this checkpoint, you must use feature engineering and DL models based only on the product details, disregarding any session- or user-level signals. Examples include (a TF-IDF sketch follows this list):
• Matrix factorization on an item–attribute matrix (products × categories/features), TF-IDF on product specs, feature engineering on mcatid, subcatid, group IDs, supplier city, etc.
• Autoencoders on product features: reconstruct product feature vectors and use latent space
similarity as recommendation scores.
• Simple Machine Learning Models: Logistic regression or tree-based models (e.g., LightGBM) using features such as category matches or city overlaps.
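As one concrete example, the sketch below builds a TF-IDF similarity recommender over product titles with scikit-learn. It assumes the Product_Data.csv columns described above; parameters such as ngram_range are illustrative, not tuned.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

products = pd.read_csv("Product_Data.csv")

# Embed each product title; the specs column could be concatenated in as well.
vectorizer = TfidfVectorizer(min_df=2, ngram_range=(1, 2))
X = vectorizer.fit_transform(products["pc_item_name"].fillna(""))

def similar_products(row_idx, k=6):
    """Return the k products whose titles are most similar to the one at row_idx."""
    sims = cosine_similarity(X[row_idx], X).ravel()
    order = sims.argsort()[::-1]
    top = [i for i in order if i != row_idx][:k]
    return products["pc_item_display_id"].iloc[top].tolist()

print(similar_products(0))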
Deliverable. A working submission using at least one ML/DL method or a significantly improved heuristic over the global baseline, accompanied by a report including your observations, validation analysis, and ablations. The code has to be submitted on Moodle, and the report on Gradescope. The report must contain the link to your model uploaded on Kaggle (make sure we can access it!). (30 marks)
Checkpoint 3: Product + User Centered Approach (Competitive + Non-Competitive)
Develop full-fledged recommendation pipelines that integrate modern deep learning methods and
hybrid strategies, based on both the product specific features and past user interactions. You
may experiment with:
• Sequential Models: GRU4Rec, SASRec, BERT4Rec — models that consume user sessions as sequences of product embeddings. A skeletal GRU example follows this list.
• Graph Models: SR-GNN or LightGCN over item–item transitions to capture structural
connectivity.
• Two-Stage Architectures: Candidate generation (e.g., co-visitation, popularity, simple CF)
followed by re-ranking with a feature-rich model (Transformer, MLP, LightGBM).
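To illustrate the sequential family, here is a skeletal GRU4Rec-style next-item model in PyTorch. The hyperparameters and the random toy batch are placeholders; a real pipeline needs proper session batching, padding/masking, and (for large catalogs) sampled softmax or negative sampling.

import torch
import torch.nn as nn

class NextItemGRU(nn.Module):
    """Embeds a session of item IDs, runs a GRU, and scores the catalog at each step."""
    def __init__(self, n_items, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(n_items, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_items)

    def forward(self, sessions):                # sessions: (batch, seq_len) item IDs
        h, _ = self.gru(self.emb(sessions))     # (batch, seq_len, hidden_dim)
        return self.out(h)                      # logits over the catalog at each step

n_items = 1000                                  # placeholder catalog size
model = NextItemGRU(n_items)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=0)   # ignore padded positions

batch = torch.randint(1, n_items, (8, 12))      # toy batch: 8 sessions of length 12
logits = model(batch[:, :-1])                   # predict the item at each next step
loss = loss_fn(logits.reshape(-1, n_items), batch[:, 1:].reshape(-1))
loss.backward()
opt.step()

# At inference: take the last position's logits, mask already-seen items, keep top 6.
top6 = logits[:, -1].topk(6).indices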
Deliverable. A final Kaggle submission, full reproducible code on Moodle, and a detailed
report documenting experiments, ablations, error analysis, limitations, and the model link on
Gradescope. (40 marks)
5 Code of Conduct & Tips
5.1 Rules and Regulations
• Team Size: Up to 4 members.
• Open Source: You may use any open-source code, but you must cite it in your report
and be able to explain its functionality.
• Pre-trained Models: Use of pre-trained weights and embeddings is permitted and
encouraged.
5.2 Best Practices
Reproducibility
• Use a virtual environment (venv or conda) and record exact package versions in a requirements.txt
or environment.yml.
• Fix random seeds (Python, NumPy, PyTorch/TensorFlow) to make experiments repeatable; a helper snippet follows this list.
• Provide a clear script or README.md that reproduces your submission from raw data.
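For instance, a small seed-fixing helper (PyTorch shown; adapt the calls to your framework):

import os
import random
import numpy as np
import torch

def set_seed(seed=42):
    """Fix the common sources of randomness so experiments are repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)

set_seed(42)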
Data Handling
• Do not use test data or future-period data when computing statistics or building features
(avoid leakage).
• Use temporal validation splits only: training data must precede validation data chronologically. A sketch follows this list.
• Split into train/validation before fitting encoders, scalers, or embeddings.
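A temporal hold-out is a few lines of pandas once interactions are sorted. This sketch assumes the Timestamp column parses as a datetime; the 15-day window simply mirrors the public leaderboard period.

import pandas as pd

train = pd.read_csv("train.csv")
train["Timestamp"] = pd.to_datetime(train["Timestamp"])   # assumes a parseable format
train = train.sort_values("Timestamp")

# Hold out the most recent 15 days as validation, mimicking the leaderboard split.
cutoff = train["Timestamp"].max() - pd.Timedelta(days=15)
fit_df = train[train["Timestamp"] <= cutoff]
val_df = train[train["Timestamp"] > cutoff]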
Modeling
• Begin with simple baselines (frequency, co-visitation) before complex models.
• For sequential models, mask future items correctly during training.
• Validate on your own held-out set; avoid tuning only on the public leaderboard.
Robustness
• Prototype on small data subsets; scale up after verifying correctness.
• Ensure every test row produces 6 unique predictions; use a global-popularity fallback if
needed.
• Double-check submission format: correct headers, no sentinel IDs (-1, -2).
Reporting
• Document splits, baselines, models tried, results, and failures.
• Report compute details (hardware, runtime, parameters).
• State limitations and assumptions clearly.
• Make sure the links to models and/or Kaggle notebooks work!
We wish you the best of luck. We are excited to see the powerful and creative models you will
build!
6 Resources for Your Journey
This section provides a curated list of research papers and resources to guide your project. We
strongly encourage you to read the relevant papers for the models you choose to implement.
Understanding the original motivations and architectures is a key part of this assignment.
6.1 Tutorials, Blogs, and Codebases
• NVIDIA Merlin: An open-source framework for building large-scale recommender systems. Exploring their tutorials can provide great insights. https://github.com/NVIDIA-Merlin/Merlin
• "Recommender Systems" by Google Developers: A high-level course on the funda-
mentals. https://developers.google.com/machine-learning/recommendation
• Papers With Code: An excellent resource for finding implementations under the Sequential Recommendation task. https://paperswithcode.com/task/sequential-recommendation
6.2 Inspiration from Past Kaggle Competitions
One of the most valuable skills in machine learning is the ability to learn from the community.
The winning solutions from past Kaggle competitions are a treasure trove of practical techniques
and robust strategies. We highly encourage you to study the top-ranking solutions. Your goal
is not to copy-paste code, but to understand the strategies behind winning solutions.
Here are a few highly relevant competitions:
1. H&M Personalized Fashion Recommendations
• Link: https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations
• Relevance: This competition is a masterclass in handling large-scale implicit feedback.
The dominant strategy was a two-stage candidate generation + re-ranking architecture,
a powerful pattern for this type of problem.
2. OTTO - Multi-Objective Recommender System
• Link: https://www.kaggle.com/competitions/otto-recommender-system
• Relevance: The data format (session-based clickstreams) is almost identical to yours. The winning solutions demonstrated the power of carefully constructed co-occurrence matrices for candidate generation, making this competition a masterclass in feature engineering for session data.
6.3 Foundational & Classic Papers
These papers introduce the core concepts that underpin modern recommender systems.
• Amazon.com Recommendations: Item-to-Item Collaborative Filtering (2003)
https://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf
1. Item-Based Collaborative Filtering Recommendation Algorithms
• Citation: Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). WWW ’01.
• Remarks: The original paper that formalized item-based collaborative filtering.
• Link to Paper: https://dl.acm.org/doi/10.1145/371920.372071
2. Matrix Factorization Techniques for Recommender Systems
• Citation: Koren, Y., Bell, R., & Volinsky, C. (2009). IEEE Computer.
• Remarks: The quintessential paper summarizing the power of matrix factorization,
made famous during the Netflix Prize. A must-read.
• Link to Paper: https://datajobs.com/data-science-repo/Recommender-Systems-[Netflix].pdf
3. Collaborative Filtering for Implicit Feedback Datasets
• Citation: Hu, Y., Koren, Y., & Volinsky, C. (2008). ICDM ’08.
• Remarks: Extremely relevant for this assignment. It introduces the popular
Alternating Least Squares (ALS) method for implicit feedback (clicks, views).
• Link to Paper: http://yifanhu.net/PUB/cf.pdf
4. BPR: Bayesian Personalized Ranking from Implicit Feedback
• Citation: Rendle, S., et al. (2009). UAI ’09.
• Remarks: Introduces BPR, a powerful loss function that directly optimizes for ranking,
which is crucial for recommendation.
• Link to Paper: https://arxiv.org/abs/1205.2618
5. Factorization Machines
• Citation: Rendle, S. (2010). ICDM ’10.
• Remarks: A powerful model that generalizes matrix factorization and is highly effective
for handling sparse categorical features.
• Link to Paper: https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
6. Wide & Deep Learning for Recommender Systems
• Citation: Cheng, H. T., et al. (2016). RecSys ’16.
• Remarks: A seminal paper from Google showing how to combine deep networks (for
learning feature interactions) with a linear model (for memorization).
• Link to Paper: https://arxiv.org/abs/1606.07792
6.4 Sequential Recommendation Papers
This is the core research area for your assignment.
7. Session-based Recommendations with Recurrent Neural Networks (GRU4Rec)
• Citation: Hidasi, B., et al. (2016). ICLR ’16.
• Remarks: The foundational paper that successfully applied RNNs (specifically GRUs)
to session-based recommendation, setting a new standard for performance.
• Link to Paper: https://arxiv.org/abs/1511.06939
8. Self-Attentive Sequential Recommendation (SASRec)
• Citation: Kang, W. C., & McAuley, J. (2018). ICDM ’18.
• Remarks: A key paper that replaces the RNN with a Transformer (self-attention)
model. It is often more effective at capturing long-range dependencies and is a very
strong baseline to aim for.
• Link to Paper: https://arxiv.org/abs/1808.09781
9. BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
• Citation: Sun, F., et al. (2019). CIKM ’19.
• Remarks: Adapts the cloze task objective from language modeling (BERT) to recommendation, allowing for learning bidirectional sequence representations.
• Link to Paper: https://arxiv.org/abs/1904.06690
6.5 Graph-Based & Advanced Papers
10. Session-based Recommendation with Graph Neural Networks (SR-GNN)
• Citation: Wu, S., et al. (2019). AAAI ’19.
• Remarks: A seminal paper on applying GNNs to session data, constructing a graph
for each session to learn item transitions.
• Link to Paper: https://arxiv.org/abs/1811.00855
11. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation
• Citation: He, X., et al. (2020). SIGIR ’20.
• Remarks: A powerful and simplified GCN model that has become a very strong
baseline in collaborative filtering.
• Link to Paper: https://arxiv.org/abs/2002.02126