Major Report20202
Major Report20202
ON
Session 2022-2025
This is to certify that the project report entitled “Video game sales prediction” is done by us
is an authentic work carried out for the partial fulfilment of the requirements for the award
of the degree of Bachelor of Computer Applications (2022-25) under the guidance of DR.
Ruchi Aggarwal. The matter embodied in this project work has not been submitted earlierfor
the award of any degree or diploma to the best of our knowledge and belief.
[Harshit Sharma]
[00225502022]
ACKNOWLEDGEMENT
I would like to express my heartfelt gratitude to several individuals who played a pivotal role in
the successful completion of this project report.
I am deeply thankful to Dr. Ruchi Agarwal, our Project Mentor, for his continuous guidance, expert
knowledge, and constant motivation. His mentorship has been invaluable in steering this project
towards its goals.
Special thanks are due to our project guide, DR. Ruchi Agarwal, for her expert guidance, patience,
and dedication. Her insights and feedback were instrumental in the successful execution of this
project. I am also grateful to all my peers and colleagues who contributed to this project in various
ways.
Lastly, I would like to acknowledge the support and encouragement of my family and friends
throughout this project.
[Harshit Sharma]
[00225502022]
SYNOPIS
2. Problem Statement:
The video game industry is an ever-evolving, highly competitive market, where developers and
publishers strive to create captivating experiences that stand out among the vast number of
annual releases. As technology advances and player demographics broaden, understanding
market trends, evolving player preferences, and key factors influencing sales has become
essential for game developers, publishers, and stakeholders alike. This project seeks to analyze
global video game sales data to uncover patterns and actionable insights that drive the industry's
success. By identifying trends and utilizing predictive modeling, we aim to equip industry
stakeholders with data-driven strategies to anticipate market demands, optimize game
development and marketing, and ultimately, improve the potential for future success in this
dynamic market.
4. Objective:
The primaryobjective of this project is to analyze historical video game sales data to:
Develop robust predictive models capable of forecasting sales for upcoming games
across various platforms and regions.
Identify key factors and trends influencing game performance, including genre,
platform, release timing, and marketing efforts.
Provide actionable insights for optimizing game development, marketing strategies, and
distribution channels to maximize potential sales and improve return on investment.
5. Scope:
This project will focus on a comprehensive analysis of global video game sales data, covering
the past four decades, to understand long-term trends and shifts in the gaming industry. It will
explore sales performance across a wide range of platforms, from legacy consoles to modern-
day gaming systems, encompassing different hardware generations. The analysis will include
various game genres, highlighting popular and emerging genres, as well as regional preferences
across key markets such as North America, Europe, Asia, and others. The scope extends to the
development of predictive models for sales forecasting, utilizing historical data to provide
insights into future trends. Additionally, the project will assess factors influencing game sales,
including seasonality, marketing efforts, and technological advancements, to provide a holistic
view of the video game market dynamics.
Hardware:
Processor: A Pentium® 4 CPU 2.26 GHz or higher for smooth computational
performance.
RAM: Minimum 128 MB RAM, though 4 GB or more is recommended for
efficient handling of machine learning tasks.
Display: A screen resolution of at least 800 x 600, though higher resolutions such
as 1024 x 768 or more are preferable for better visualization.
Optical Drive: CD ROM Driver for external media support.
Software:
Programming Environment: Python, with tools like Jupyter Notebooks for
development and libraries like Pandas, NumPy, Scikit-Learn, Matplotlib, and
Seaborn for data analysis, visualization, and machine learning.
Operating System: Any Windows (95/98/2000/XP/NT and above) or Linux based
system.
Documentation: Microsoft Word or WordPad for creating reports and
documentation.
As we know the video game industry is an ever-evolving and a dynamic industry, so sales
figures play a very important role in accessing the popularity of video games. Here we have
used applied linear regression algorithm to predict the global sales.
In this project, we collected and analyzed a vast dataset comprising information on Genre, Platform
and publisher of Game with Region Sales such as North America Salse, Japan Sales, Europe Sales
and global Sales. We used this dataset to train and fine-tune a predictive model capable of
estimating Global Sales with high precision.
Keywords: video game sales prediction, machine learning, performance measures, linear
regression.
TABLE OF CONTENT
CERTIFICATE --
ACKNOWLEDGEMENT --
SYNOPSIS --
ABSTRACT --
LIST OF FIGURES --
LIST OF TABLES --
CHAPTER -1 INTRODUCTION:
1.1 Description of the topic: (Introduction the project's background and context.)
1.3 Objectives
3.1.5 Modelling
3.1.8 Conclusion
4.3 Results
5.1 Conclusion
REFERENCES
List of Figures:
INTRODUCTION
1.1 Description of the topic:
Video game sales prediction involves forecasting the future performance of a game by analyzing
historical sales data. This task is inherently challenging due to the volatility and rapid evolution
of the gaming industry, with trends shifting based on technological advancements, consumer
preferences, and external factors like global events. However, accurate predictions can provide
significant advantages for stakeholders, such as publishers, developers, and investors, enabling
informed decisions related to game development, resource allocation, marketing strategies, and
pricing models.
Various methods are available for sales prediction, ranging from traditional statistical models to
more advanced machine learning algorithms and deep learning techniques. Statistical methods
may rely on historical trends, while machine learning and deep learning models can uncover
complex patterns in data, adapt to changes over time, and improve accuracy. Machine learning
algorithms, in particular, offer flexibility in handling large, diverse datasets with features such as
game genre, platform, release time, and regional preferences. These methods enable better
forecasts, even in the face of fluctuating market conditions, making them highly effective for the
dynamic nature of the video game industry. In our project, we have specifically chosen machine
learning algorithms due to their ability to handle this complexity and provide reliable predictions.
The goal of this project is to develop a predictive model capable of estimating the global
sales of video games, based on various influencing factors. This model will aim to predict
sales performance across different geographical regions, including but not limited to
North America, Europe, and Japan, based on historical data, genre, platform, release year,
and other game-specific attributes.
Key aspects of this project include building a model that is both accurate and reliable in
its predictions, capable of providing insightful sales forecasts for new games and helping
stakeholders—such as game developers, publishers, and marketers—make data-driven
decisions. The model should be able to generalize effectively, meaning it should not only
fit well on the training data but also perform accurately on new, unseen games.
The project will involve several stages, starting with data preprocessing and exploratory
data analysis to identify patterns and key predictors of sales. Various machine learning
algorithms will be tested and evaluated, potentially including linear regression, decision
trees, and ensemble methods, to find the best-performing model for this task. Finally, the
chosen model will be optimized and validated using appropriate techniques to ensure its
robustness, making it a valuable tool for forecasting sales trends in the gaming industry.
This approach will support better inventory planning, marketing strategies, and ultimately
drive business success in the highly competitive video game market.
1.3 Objectives:
The project will focus on a robust dataset of global video game sales spanning the
last four decades, ensuring a comprehensive analysis of market dynamics. It will
examine sales performance across various platforms, genres, and regions, enabling
a detailed understanding of how different factors contribute to success.
The scope of the project includes developing and validating predictive models that
not only forecast future sales but also identify key factors such as platform
popularity, genre trends, and regional preferences that influence video game
success. By doing so, the project will provide stakeholders with the tools they need
to optimize their strategies in a highly competitive market.
Team consists of three members Harsh Rai, Harshit Sharma, Ayush Joshi and the work is distributed
in the following manner.
The data collection was done with the agreement of both the team members and found by the
both from Kaggle website.
The preprocessing part was done with the help of both team members.
Key Highlights:
Focused on developing the predictive model for global sales estimation.
Utilized Python, scikit-learn for model development and training.
Gungun:
Created presentation on the project.
Used Canva graphic designing tool creation of presentation.
1.5.2 PERT Chart
Fig. 1.1 Pert Chart: Describing the work flow of the whole project.
1.6 Organization of the report:
The organization of this report is designed to facilitate clarity and ease of navigation, enabling
readers to access specific sections quickly and understand the logical flow of the analysis. This
structure will help the reader follow the video game sales prediction process from the initial stages
of exploration and data preparation through to the final stages of modeling and evaluation. Each
section serves a distinct purpose, contributing to a comprehensive understanding of the methods
and insights obtained. The report is divided into the following sections:
1. Introduction: This section introduces the purpose of the report, outlining the problem
statement and objectives. It explains why video game sales prediction is relevant and its
potential applications within the gaming industry.
2. Literature Review: A summary of existing work on sales prediction methodologies, especially
within the context of video game markets. This section highlights past research, commonly
used models, and existing gaps that the report aims to address.
3. Data Collection and Exploration: This section describes the data sources, attributes, and
preprocessing steps involved. It includes exploratory data analysis (EDA) to uncover trends,
correlations, and any missing or anomalous values, ensuring the dataset is suitable for
modeling.
4. Feature Engineering: Here, the report details any transformations or new features created to
improve the predictive power of the model. This might include temporal features, genre- based
factors, or platform popularity trends.
5. Model Selection and Development: This section discusses the models chosen for the task
(e.g., linear regression, decision trees, random forests, etc.), the rationale for each, and the
training process. It also includes parameter tuning and cross-validation methods to enhance
model accuracy and generalizability.
6. Evaluation Metrics: Details the performance metrics used to assess model efficacy, such as
Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared (R²), as well as an
explanation of why these metrics are suitable for sales prediction.
7. Results and Analysis: This section presents the results of the models, compares their
performance, and interprets the outcomes. It may also include visualizations that demonstrate
model predictions against actual sales figures to highlight accuracy.
8. Conclusion and Future Work: A summary of the key findings, the report concludes by
discussing the implications of the results and offering recommendations for future work, such
as expanding the dataset or testing additional algorithms to refine the prediction model further.
This structured organization ensures that readers can systematically follow the progression of
the project, gaining insights into each phase of video game sales prediction and facilitating the
application of findings in similar contexts.
CHAPTER 2
LITERATURE REVIEW
2.1 Summary of papers studied
1. Joe Cox, 2013, “What makes a blockbuster video game? An empirical analysis of US
video game sales”
Joe Cox, 2013: This study investigates the factors that affect the likelihood of a video game
becoming a blockbuster. A unique dataset is collected consisting of approximately 1,800
observations relating to individual video games titles released across a variety of platforms.
Due to the long-tailed nature of the dependent variable, OLS regressions of the logarithm
ofunit sales are estimated alongside binary logistic regressions based on three different
thresholds of sales success.
TM Geethanjali, Ranjan D, Swaraj HY, Thejaskumar MV, Chandana HP, 2020: This
paperaims to predict the top-selling video game sales in North America between 1983 and
2016 The dataset is collected from an internet platform known as Kaggle.com. Exploitation
the dataset, the RStudio IDE tool and R-programming language are used for data cleaning,
analysis, and representation. The machine learning algorithm used in this project is linear
regression. They point the results of this Analysis in Associate in nursing intuitive
methodology by visualizing outcome victimization ggplot2 in R.
3. Bodduru Keerthana, Dr. K.Venkata Rao, 2019, “Sales Prediction On Video Games
Using Machine Learning.”
Bodduru Keerthana, Dr. K.Venkata Rao, 2019: In this paper, they briefly analysed the
concept of gaming sales data and sales predictions. The various machine learning
techniques and measures used for this sales prediction. On the basis of a performance
evaluation, best suited predictive models like linear regression, support vector regression,
random forest anddecision trees etc. are used for the sales trend predictions. The results
summarized in performance measures are root mean square error, r-square, and mean
absolute error. The studies found that the best fit model is Random forest algorithm, which
shows maximum accuracy in future sales prediction.
4. Alice Yufa, Jonathan L. Yu, Henry Chan, Paul D. Berger, 2019, “Predicting Global
Video-Game Sales”
Alice Yufa, Jonathan L. Yu, Henry Chan, Paul D. Berger, 2019: In this paper they examine
video games, and consider selected independent variables and explore their relationship
to global sales. Key variables that are identified include the number of critics that rate the
video game, and the average score that the critics give to the game. they also find a non-
intuitive result concerning user ratings of the game.
5. Amar Aziz, Shuhaida Ismail, Muhammad Fakri Othman, Aida Mustapha, 2017,
“Empirical Analysis on Sales of Video Games: A Data Mining Approach”
Amar Aziz, Shuhaida Ismail, Muhammad Fakri Othman, Aida Mustapha, 2017: This
paper studies factors that make the sales of video games becomes a blockbuster. The
dataset used is collected from an online database maintained by VGChartz.com. Using
the dataset, the Rapid Miner tool is used to select the features or factors and produce
efficient estimation of the data. The techniques used in this project included the k-
Nearest Neighbour (k-NN), Random Forest and Decision Tree.
6. John Sacranie, 2010, “Consumer Perceptions & Video Game Sales: A Meeting of the
Minds”
John Sacranie, 2010: This paper examines the determinants of video game software sales.
This paper incorporates several of variables that previous literature have, but also add new
one that is quality. To account for quality, the model incorporates the average review score
agame receives from professional critics. The results indicate that indeed, quality does play
a major role in consumers’ purchase decisions.
7. Jianbin Li, Yufan Zheng, Haoran Hu, Junhui Lu, Choujun Zhan, 2021, "Predicting
Video Game Sales Based on Machine Learning and Hybrid Feature Selection Method"
Jianbin Li, Yufan Zheng, Haoran Hu, Junhui Lu, Choujun Zhan, 2021: In this paper, we
proposed a new hybrid feature selection method Pearson correlation coefficient - Random
Forest Feature Selection(PCC-RFFS), and utilized 9 machine learning methods combined
with PCC-RFFS to predict the sales of video games. For the filter-based stage, we used the
absolute value of Pearson correlation coefficient and feature ranking technique, while the
wrapper-based stage is based on Random Forest to measure the importance between
featuresand target.
8. Luis Alberto Mendieta Caballero, 2015, “The Impact of Online Ratings on Video Game
Sales”
T Luis Alberto Mendieta Caballero, 2015: his study intends to quantify the impact online
ratings have over video game sales by conducting a linear regression analysis on 300 titles
forthe previous console generation (PlayStation® 3 and Xbox® 360) using a data from the
video game industry to understand the existing influence on this particular market.
Afterwards, we compare results to previous ones and discuss the managerial implications for
upcoming gaming generations.
9. Wesley J. Marshall and Andrew L. Weaver, 2012, "The Effect of Video Game Platform on
Game Sales: A Comparative Analysis"
Wesley J. Marshall and Andrew L. Weaver, 2012: This study explores the impact of the platform
on which a video game is released on its overall sales performance. Using data from multiple
platforms, the authors analyze how platform exclusivity, cross-platform availability, and console
type affect the game’s popularity and sales. Through a combination of regression analysis and data
visualization, they find that platform choice plays a significant role in sales, with console-specific
games often seeing different market trajectories compared to those released across multiple
platforms.
10. Marie Müller, 2018, "Analyzing the Impact of Social Media on Video Game Sales"
Marie Müller, 2018: This paper examines the correlation between social media presence and video
game sales performance. By analyzing Twitter mentions, YouTube reviews, and Reddit discussions
for various titles, the study assesses how social media hype and user-generated content impact sales.
The authors use sentiment analysis and time-series forecasting to track trends, finding that games
with high engagement levels on social media platforms tend to experience higher sales. They
conclude that social media has become a significant predictor of a game's commercial success,
especially for independent titles and smaller studios leveraging community support.
2.2 Findings of studied researches
1 Random Forest:
RMSE: 1.4648
Accuracy: 0.9605
2 Support vector regression
RMSE: 1.9773
Accuracy: 0.8154
3 Decision Tree
RMSE: 2.8762
Accuracy: 0.8036
4 Linear regression
RMSE: 2.4830
Accuracy: 0.5734
Choujun Zhan:
School of Electrical and
Computer Engineering
Nanfang College,
Guangzhou Guangzhou,
China
Purpose and Importance of System Design for Video Game Sales Analysis
The global video game industry generates billions of dollars in revenue each year, with hundreds
of new titles competing for market share. As games are distributed across various platforms,
regions, and formats, having a reliable system for tracking and analyzing sales data is essential
for gaining a competitive edge. The insights drawn from such a system enable companies to
tailor their marketing strategies, identify profitable regions, and forecast future demand based
on current and past sales trends. This data-driven approach not only helps game developers
prioritize resources but also supports publishers in making strategic distribution decisions.
Components:
o Data Source: The primary data source is a CSV file containing comprehensive sales
data, including fields such as game title, genre, platform, release year, and sales figures
across various regions. This data serves as the foundation for all subsequent analysis.
Data Flow:
o Input: The CSV file containing the raw sales data is imported using the Pandas
library.
o Preprocessing: Data cleaning, handling missing values, and preparing the dataset for
modeling.
o Modeling and Analysis: The clean data is split into training and test sets, and then
passed through machine learning algorithms to create predictive models.
o Output: Predictions, along with visual insights, are produced, showcasing key metrics
like genre popularity, sales trends by region, and projected future sales.
Technology Stack:
o Python Libraries: The project uses a suite of Python libraries. Pandas handles data
import and manipulation, Seaborn and Matplotlib create visualizations, and Scikit-
Learn performs machine learning tasks, including data splitting, model training, and
evaluation. Together, these tools create a seamless pipeline for analyzing and predicting
video game sales trends.
3.1.3 Data Sources:
The data source consists of a dataset CSV file downloaded from Kaggle website. Kaggle is a
data science competition platform and online community of data scientists and machine
learning practitioners under Google LLC. Kaggle enables users to find and publish datasets,
explore and build models in a web-based data science environment, work with other data
scientists and machine learning engineers, and enter competitions to solve data science
challenges.
Link: https://www.kaggle.com/datasets/gregorut/videogamesales
The machine learning models we employed for estimating global Sales was Linear regression.
The main theme of this algorithm is to find a mathematical equation for continuous variables
Y when we have one or more X variables. This algorithm establishes a relation between two
variables one variable is predicted variable and another one is resulting variable whose value
is derived from the predictive variable.
Y=aX+b
X is predicted variable
Since we were using python for machine learning algorithm we used sci-kit learn library for
model selection.
The training dataset taken was 85% of the data and 15% was used for testing. We also used
cross validation technique and performance metrics used was R2_score and for error detection
we used MSE, RMSE, MAE.
Linear regression is a type of supervised machine learning algorithm that computes the linear
relationship between a dependent variable and one or more independent features. Here’s how
it works:
The first step in building any predictive model, including linear regression, is to gather a
dataset that contains both the independent variables, also known as features, and the dependent
variable, also referred to as the target. Independent variables are the predictors that provide
information relevant to the prediction, while the dependent variable is the outcome we aim to
predict. The quality and relevance of this data are crucial, as they directly impact the model's
performance. In our case, we carefully collected and prepared a dataset that captures all
necessary aspects to ensure our linear regression model could be accurately trained and
evaluated.
Linear regression operates under the assumption of a linear relationship between the
independent variables and the dependent variable. This relationship suggests that any change
in the independent variable(s) will result in a proportional change in the dependent variable.
The general form of a simple linear regression model, which includes only one independent
variable, can be represented by the following equation:
y=β0+β1x+ϵ
.
3.2.5.3 Model Training:
The goal is to find the best values for a and b that minimize the difference between the
predicted values and the actual target values. This is typically done using a method called
"least squares optimization." The model iteratively adjustsa and b to minimize the sum of the
squared differences between predicted and actual values.
3.2.5.4 Predictions:
Once the model is trained, you can use it to make predictions. Givennew values of the
independent variable(s), you can calculate the corresponding predicted values of the
dependent variable using the linear equation.
Hardware Software
Pentium® 4 CPU 2.26GHz, 128 Any window-based operating
MB RAM system (Windows
95/98/2000/XP/NT or above).
Screen resolution of at least 800
x 600 required for proper and Word pad or Microsoft Word
complete viewing of screens.
Higher resolution would not be a
problem.
CD ROM Driver.
Table 4.1 Showing the Hardware and Software requirement of the system developed
Data Collection is the systematic process of gathering information from multiple sources, including
surveys, experiments, public databases, web scraping, sensors, and social media. The quality and
relevance of the collected data are crucial for the success of the model, as they significantly impact
its performance and reliability. Here are the steps we followed to collect and prepare our dataset:
After defining our data needs, we gathered the dataset from reliable sources. In this project, we
chose to use a public dataset of video game sales available on the Kaggle platform. This dataset
provided comprehensive information on various video games, including sales by region, genre,
and platform, all of which were pertinent to our analysis. Using a reputable source like Kaggle
ensured that the data was credible, widely used, and well-maintained.
Assess data quality:
Following data collection, we assessed the dataset for quality and completeness. This step
involved checking for any errors, inconsistencies, and missing values, as well as ensuring that
each entry met the accuracy requirements for our analysis. Data quality assessments are
essential to avoid any potential issues that could impact model performance.
Following these steps enabled us to gather, validate, and prepare the dataset, ensuring that we
had a reliable foundation for our linear regression model
Now, we will follow the steps given below:
Following this, we have imported the data into a dataframe named dataset,
Detailed information of data,
The process of cleaning, transforming, and enriching data to make it suitable for machine
learning. This may involve cleaning the data, handling missing values, and feature
engineering.
Cleaning the data: This involves removing errors and inconsistencies from the data.
For example, you may need to remove duplicate entries, correct typos, or convert data
types.
Handling missing values: This involves deciding how to deal with missing values in
the data. There are a variety of methods for handling missing values, such as imputing
them with the mean or median value, or deleting them from the dataset.
Feature engineering: This involves creating new features from the existing data. This
can be done to improve the performance of the machine learning model. For example,
you may create a new feature that represents the difference between two existing
features.
In this step we first started with checking the no. of null value in each columns, as follows
Before proceeding to data cleaning process, first we have to create a copy of dataframe to
avoid the loss of original data, as shown in
Now, we found that year column has the most null value so we filled it with its Median, as
shown in
Now, we will remove the N/A value present in any other column (i.e. Publisher’s column),
Removing outliers
Our final dataset,
EDA, or Exploratory Data Analysis, is a crucial initial step in the data science process. It
involves the analysis and visualization of data to understand its main characteristics, detect
patterns, spot anomalies, and generate hypotheses. It consists of data visualization, handling
missing data, outlier detection etc.
Heatmap to see the correlation between Genre and Various Sales:
Fig 4.1 Heatmap showing correlation between Genre and Various Sales.
Bar Chart to show what genre games have been made most.
Fig 4.2 Bar Chart to show what genre games have been made most.
pairplot to show relations of columns.
Model selection is the process of choosing a machine learning model that is appropriate for
the problem you are trying to solve. There are a variety of different machine learning
algorithms available, each with its own strengths and weaknesses like, Linear regression,
Logistic regression, Decision trees, Random forests, Support vector machines, Neural
networks
The type of problem we are trying to solve (e.g., classification, regression, clustering,
etc.)
The characteristics of our data (e.g., the number of features, the presence of missing
values, etc.)
The desired level of accuracy and explainability.
so considering the above factors we chose Linear Regression model
Linear regression is a machine learning algorithm that is used to model and predict continuous
(numerical) target variables. It is one of the simplest and most widely used machine learning
algorithms. Linear regression works by fitting a linear line to the data, where the slope and
intercept of the line represent the relationship between the input features and the target
variable.
Linear regression is a good choice for your project because the target value is numerical.
Linear regression is specifically designed to predict numerical target variables.
Here,
Independent variable=Global_sales
Intercept=b
Model training is the process of fitting a machine learning model to a dataset. This involves
feeding the model the data and allowing it to learn the relationships between the features and
the target variable.
Now, splitting our dataset into training and testing sets.
Dependent values,
Target value,
Checking the error values in the model:
This is important to do to ensure that the model does not overfit the training data.
There are a variety of different model evaluation metrics that you can use, depending on the
type of problem you are trying to solve. For example, if you are trying to solve a classification
problem, you may use metrics such as accuracy, precision, recall, and F1 score.
Result Scores:
Comparison tables of the models that we used in our project:
Table. 4.2 Comparison of model at different test train size used in the project.
Visual representation of all the findings:
Fig 4.5 Bar graph to show most globally sales games based on genre.
Fig 4.6 Representation of global sales of games based on genre from 1990-2018
https://www.kaggle.com/datasets/gregorut/videogamesales
https://pure.port.ac.uk/ws/portalfiles/portal/1090821/COXj_2013_cright_MDE_What
_makes_a_blockbuster_video_game.pdf
https://ijcrt.org/papers/IJCRT2005182.pdf
https://www.jetir.org/papers/JETIR1907H50.pdf
https://www.questjournals.org/jrbm/papers/vol7-issue3/I07036064.pdf
https://www.researchgate.net/publication/326510277_Empirical_Analysis_on_Sales_
of_Video_Games_A_Data_Mining_Approach
https://www.researchgate.net/profile/Choujun-
Zhan/publication/360045887_Predicting_Video_Game_Sales_Based_on_Machine_Le
arning_and_Hybrid_Feature_Selection_Method/links/62b83dc11010dc02cc5eb5b6/Pr
edicting-Video-Game-Sales-Based-on-Machine-Learning-and-Hybrid-Feature-
Selection-Method.pdf
https://digitalcommons.iwu.edu/cgi/viewcontent.cgi?article=1107&context=econ_hon
proj
https://run.unl.pt/bitstream/10362/15686/1/Mendieta_2015.pdf
ResearchGate. (2022). Predicting Video Game Sales Based on Machine Learning and
Hybrid Feature Selection Method. Available at:
https://www.researchgate.net/publication/360045887_Predicting_Video_Game_Sales_Base
d_on_Machine_Learning_and_Hybrid_Feature_Selection_Method
Journal of Applied Data Science and Machine Learning. (2023). Enhancing Video Game Sales
Prediction with Advanced Regression Models. Available at: https://jadsml.org/papers/vol10-
issue2/Enhancing-Video-Game-Sales.pdf