Predicting IPL Auction Prices
Introduction to Business Analytics
By: Arun 22037
    Ayush Kumar 22044
    Abhishek Modi 22010
    Vibhor Kumar 22270
INTRODUCTION
               Overview:
               This project uses machine learning to predict the auction prices of IPL (Indian Premier
               League) players from their attributes. By applying regression analysis to player data, we
               estimate the price at which a player is likely to be sold at auction.
               Importance:
               •   Strategic Financial Decision-making: Predicting auction prices holds immense value for
                   IPL teams in formulating their financial strategies. It assists franchises in budget allocation
                   and optimal player selection during auctions.
               •   Player Valuation: Understanding player values through statistical modeling aids teams in
                   assessing player worth, negotiating contracts, and forming a balanced team composition
                   for the tournament.
               Objectives:
               •   Auction Price Forecasting: Utilize attributes like nationality, player type, batting average,
                   and bowling average to predict the probable auction price of IPL players.
               •   Regression Techniques Demonstration: Showcase the practical application of regression
                   algorithms in forecasting player values, demonstrating the predictive capabilities of
                   machine learning in a real-world sports context.
DATASET
               Our dataset contains 7 columns and 205 rows. To determine a player's price in the IPL
               auction, we consider several important factors: Nationality, Type, Status
               (Capped/Uncapped), Batting Average, and Bowling Average. These factors are given the
               greatest weight when determining a player's auction price.
               Factors            Details
               Nationality        The player's country of origin or nationality (e.g., Indian, Overseas).
               Player Type        The player's role or category in cricket (e.g., Batsman, Bowler, All-Rounder).
               Status             Whether the player is Capped (has international-level experience) or Uncapped.
               Batting Average    The average number of runs scored by the player per dismissal while batting in T20s.
               Bowling Average    The average number of runs conceded by the player per wicket taken while bowling in T20s.
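The two averages defined above are simple ratios. A minimal sketch of the arithmetic follows; the function names and the sample figures are illustrative assumptions, not values from the project dataset.

```python
def batting_average(runs_scored, dismissals):
    """Runs scored per dismissal; returns None if the player was never dismissed."""
    if dismissals == 0:
        return None
    return runs_scored / dismissals


def bowling_average(runs_conceded, wickets_taken):
    """Runs conceded per wicket taken; returns None if no wickets were taken."""
    if wickets_taken == 0:
        return None
    return runs_conceded / wickets_taken


# Made-up career figures for a hypothetical player.
print(batting_average(450, 15))   # 30.0
print(bowling_average(520, 20))   # 26.0
```

A higher batting average and a lower bowling average both indicate stronger performance, so the two columns pull the expected price in opposite directions.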
METHODOLOGY
               Data Loading & Preprocessing:
               • Importing Data: Used the Pandas library to load the IPL auction dataset ('IPL.xlsx') into a DataFrame for further analysis.
               • Exploratory Data Analysis (EDA):
                 • Inspected the data structure with df.info() to check column data types, missing values, and dataset size.
                 • Used scatter plots and pair plots (Seaborn and Matplotlib) to identify relationships and distributions among attributes.
                 • Created a heatmap of the correlations among attributes using sns.heatmap(df.iloc[:,[5,6,7]].corr(), annot=True).
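The loading and inspection steps can be sketched roughly as follows. Because 'IPL.xlsx' is not available here, the sketch builds a tiny made-up DataFrame instead of reading the file. The column names Batting_Avg and Bowling_Avg and all the values are assumptions for illustration; only Sold_Price, Nationality, Type, and Status are names taken from the slides.

```python
import pandas as pd

# In the project the data came from an Excel file:
#   df = pd.read_excel("IPL.xlsx")
# The rows below are made-up stand-ins so the sketch runs without that file.
df = pd.DataFrame({
    "Nationality": ["Indian", "Overseas", "Indian", "Overseas", "Indian", "Indian"],
    "Type": ["Batsman", "Bowler", "All-Rounder", "Batsman", "Bowler", "All-Rounder"],
    "Status": ["Capped", "Capped", "Uncapped", "Capped", "Uncapped", "Capped"],
    "Batting_Avg": [32.5, 8.1, 21.4, 28.9, 5.6, 25.0],    # hypothetical column name
    "Bowling_Avg": [0.0, 24.3, 29.8, 0.0, 27.5, 31.2],    # hypothetical column name
    "Sold_Price": [850, 420, 150, 700, 60, 500],          # arbitrary units
})

# Column dtypes, non-null counts, and dataset size.
df.info()

# Number of missing values in each column.
missing = df.isna().sum()
print(missing)

# Correlations among the numeric columns; this is the matrix a heatmap would display.
numeric_cols = ["Batting_Avg", "Bowling_Avg", "Sold_Price"]
corr = df[numeric_cols].corr()
print(corr.round(2))
```

Selecting the numeric columns by name rather than by position (as in df.iloc[:,[5,6,7]]) keeps the correlation step working if the column order in the spreadsheet changes.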
               Model Development:
               • Data Preparation:
                  • Separated the dataset into features (independent variables) and the target variable (Sold_Price).
                  • Split the dataset into training and testing sets using train_test_split from sklearn.model_selection.
               • Feature Encoding:
                  • Used OneHotEncoder and ColumnTransformer to encode the categorical features (Nationality, Type, Status) into a numerical format suitable for model training.
               • Model Building:
                  • Built a Linear Regression model inside a pipeline using make_pipeline and LinearRegression.
                  • Fitted the model on the training data using pipe.fit(X_train, Y_train).
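A minimal sketch of the preparation, encoding, and fitting steps, assuming synthetic data in place of the real auction file. The numeric column names, the 80/20 split, the random seeds, and the price formula used to generate the data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the auction data (not the project's real values).
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "Nationality": rng.choice(["Indian", "Overseas"], size=n),
    "Type": rng.choice(["Batsman", "Bowler", "All-Rounder"], size=n),
    "Status": rng.choice(["Capped", "Uncapped"], size=n),
    "Batting_Avg": rng.uniform(5, 45, size=n).round(1),    # hypothetical column name
    "Bowling_Avg": rng.uniform(15, 40, size=n).round(1),   # hypothetical column name
})
# Made-up price rule plus noise, used only so there is something to fit.
df["Sold_Price"] = (
    100
    + 12 * df["Batting_Avg"]
    - 3 * df["Bowling_Avg"]
    + 150 * (df["Status"] == "Capped")
    + rng.normal(0, 20, size=n)
).round(0)

# Separate the features from the target variable.
X = df.drop(columns="Sold_Price")
Y = df["Sold_Price"]

# Hold out 20% of the rows for testing (the split ratio is an assumption).
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

# One-hot encode the categorical columns; pass numeric columns through unchanged.
categorical = ["Nationality", "Type", "Status"]
encoder = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)

# Chain encoding and regression so both are fitted on the training data only.
pipe = make_pipeline(encoder, LinearRegression())
pipe.fit(X_train, Y_train)

print(pipe.predict(X_test[:3]))
```

Keeping the encoder inside the pipeline means the same encoding is applied automatically at prediction time, and handle_unknown="ignore" avoids an error if a category appears in the test set that was absent from training.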
               Prediction & Evaluation:
               • Prediction for New Player Data:
                  • Created a new record for a hypothetical player and used the model to predict the auction price for this player.
               • Model Performance Evaluation:
                  • Assessed the model's performance using Mean Squared Error (MSE) and R-squared (mean_squared_error, r2_score).
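The prediction and evaluation steps might look like the sketch below. It refits a small pipeline on synthetic data so that it runs on its own; the data, the numeric column names, and the hypothetical player's attributes are assumptions, and the printed scores say nothing about the project's actual results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (not the project's real values).
rng = np.random.default_rng(1)
n = 80
df = pd.DataFrame({
    "Nationality": rng.choice(["Indian", "Overseas"], size=n),
    "Type": rng.choice(["Batsman", "Bowler", "All-Rounder"], size=n),
    "Status": rng.choice(["Capped", "Uncapped"], size=n),
    "Batting_Avg": rng.uniform(5, 45, size=n).round(1),    # hypothetical column name
    "Bowling_Avg": rng.uniform(15, 40, size=n).round(1),   # hypothetical column name
})
df["Sold_Price"] = (
    100 + 12 * df["Batting_Avg"] - 3 * df["Bowling_Avg"]
    + 150 * (df["Status"] == "Capped") + rng.normal(0, 20, size=n)
)

X = df.drop(columns="Sold_Price")
Y = df["Sold_Price"]
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

encoder = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"),
                   ["Nationality", "Type", "Status"])],
    remainder="passthrough",
)
pipe = make_pipeline(encoder, LinearRegression())
pipe.fit(X_train, Y_train)

# A hypothetical new player, with columns in the same order as the training features.
new_player = pd.DataFrame([{
    "Nationality": "Indian",
    "Type": "All-Rounder",
    "Status": "Capped",
    "Batting_Avg": 30.0,
    "Bowling_Avg": 25.0,
}])
predicted_price = pipe.predict(new_player)[0]
print(f"Predicted price: {predicted_price:.1f}")

# Evaluate on the held-out test set.
Y_pred = pipe.predict(X_test)
mse = mean_squared_error(Y_test, Y_pred)
r2 = r2_score(Y_test, Y_pred)
print(f"MSE: {mse:.1f}  R-squared: {r2:.3f}")
```

MSE is expressed in squared price units, so its square root is often easier to interpret; R-squared gives the share of price variance on the test set that the model explains.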
FINDINGS
           Figure: Pair Plot of Dataset
           Figure: Correlation Matrix Heatmap
           Figure: Actual vs Predicted Prices on Test Set
LIMITATIONS AND SCOPE FOR IMPROVEMENT
               Limitations:
               •   Feature Set Limitation: The model's predictive power might be constrained by the
                   limited set of features considered (Nationality, Type, Batting Average, Bowling
                   Average). The exclusion of other potentially influential factors could impact its
                   accuracy in estimating auction prices.
               •   Data Quality: Potential limitations in data quality, such as missing or incomplete
                   information, might affect the model's performance and predictive capability. These
                   data inadequacies might lead to biased or less accurate predictions.
               •   Model Complexity: The model's simplicity (linear regression) might limit its ability
                   to capture intricate nonlinear relationships between attributes and auction prices.
                   More complex models might better represent the underlying complexities.
               Enhancement Opportunities:
               •   Feature Expansion: Consider augmenting the feature set by incorporating additional
                   relevant attributes, such as player performance in specific match formats, recent
                   form, injury history, or team dynamics, to enhance predictive accuracy.
               •   Improved Data Collection: Focus on acquiring more comprehensive and diverse data
                   sources to mitigate missing data issues and provide a more holistic view of player
                   attributes and performance metrics.
               •   Advanced Modeling Techniques: Explore the application of more sophisticated modeling
                   approaches beyond linear regression, such as ensemble methods (Random Forests,
                   Gradient Boosting), which could potentially capture nonlinear relationships more
                   effectively.
               Validation and Iteration:
               •   Model Validation: Implement rigorous validation techniques, such as cross-validation,
                   to assess model stability and generalizability, ensuring that the model performs
                   consistently across various datasets and scenarios.
               •   Iterative Approach: Adopt an iterative approach to model development by continuously
                   refining the model based on new data insights and feedback from predictions to
                   enhance its accuracy and reliability.
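As one way to act on the cross-validation and ensemble suggestions, the sketch below compares linear regression with a random forest using 5-fold cross-validation. It is an illustration on synthetic data, not something done in the project: the data, the numeric column names, the fold count, and the forest settings are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (not the project's real values).
rng = np.random.default_rng(2)
n = 100
df = pd.DataFrame({
    "Nationality": rng.choice(["Indian", "Overseas"], size=n),
    "Type": rng.choice(["Batsman", "Bowler", "All-Rounder"], size=n),
    "Status": rng.choice(["Capped", "Uncapped"], size=n),
    "Batting_Avg": rng.uniform(5, 45, size=n).round(1),    # hypothetical column name
    "Bowling_Avg": rng.uniform(15, 40, size=n).round(1),   # hypothetical column name
})
df["Sold_Price"] = (
    100 + 12 * df["Batting_Avg"] - 3 * df["Bowling_Avg"]
    + 150 * (df["Status"] == "Capped") + rng.normal(0, 20, size=n)
)
X = df.drop(columns="Sold_Price")
Y = df["Sold_Price"]


def build_pipeline(model):
    """Wrap a regressor with one-hot encoding of the categorical columns."""
    encoder = ColumnTransformer(
        transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"),
                       ["Nationality", "Type", "Status"])],
        remainder="passthrough",
    )
    return make_pipeline(encoder, model)


# Shuffled 5-fold split with a fixed seed so both models see the same folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# One R-squared score per fold for each model.
scores = {
    name: cross_val_score(build_pipeline(model), X, Y, cv=cv, scoring="r2")
    for name, model in models.items()
}
for name, fold_scores in scores.items():
    print(f"{name}: mean R^2 = {fold_scores.mean():.3f} (std {fold_scores.std():.3f})")
```

With only about 200 rows in the real dataset, the spread of the fold scores is as informative as the mean: a large standard deviation would suggest that a single train/test split gives an unreliable picture of model quality.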
THANK YOU!!
"In the ever-evolving world of cricket auctions, predictive models pave
the way for strategic decisions. Anticipating player values is not just a
statistic, but a game-changer in shaping winning teams."