By Dr Norshakirah Ab Aziz
BDA Lab 3: Performing Exploratory Data Analysis (EDA)
*Using the dataset scraped from:
https://www.basketball-reference.com/leagues/NBA_2023_per_game.html
Your Name: Wan Muhammad Iman Bin Wan Md Nazan
Matric No: 20000692
Task Lab: Answer the following questions. Provide your answer, code, and output screen for each
question.
1. Which team does the player who scores the most points per game (PTS) come from? (1M)
Your answer: PHI
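# Show the full row for the player with the highest points per game.
imported_df[(imported_df.PTS == imported_df.PTS.max())]
# Keep only the columns needed to read off the team (Tm).
result = imported_df.loc[imported_df['PTS'] ==
                         imported_df['PTS'].max(), ['Player', 'PTS', 'Tm']]
print(result)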
2. Which position does the player play? (1M)
Your answer: C
# Same top-scorer filter as in question 1, now including the position (Pos).
imported_df[(imported_df.PTS == imported_df.PTS.max())]
result_2 = imported_df.loc[imported_df['PTS'] ==
                           imported_df['PTS'].max(), ['Player', 'Pos', 'PTS', 'Tm']]
print(result_2)
3. How many games did the player play in the season? (2M)
Your answer: 66
# Same top-scorer filter, now including games played (G).
imported_df[(imported_df.PTS == imported_df.PTS.max())]
result_3 = imported_df.loc[imported_df['PTS'] ==
                           imported_df['PTS'].max(), ['Player', 'Pos', 'PTS', 'Tm', 'G']]
print(result_3)
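Questions 1 to 3 all read different columns of the same row, so the lookup can be done once. The following is a minimal sketch, assuming a small made-up table (`demo_df`, with hypothetical players and values) in place of `imported_df`; it uses `idxmax` to locate the top scorer and then reads the team, position, and games played from that single row.

```python
import pandas as pd

# Hypothetical stand-in for imported_df; all values are made up for illustration.
demo_df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C"],
    "Pos": ["PG", "C", "SF"],
    "Tm": ["AAA", "BBB", "CCC"],
    "G": [70, 64, 58],
    "PTS": [21.4, 30.2, 28.9],
})

# Scraped tables can arrive as strings, so coerce PTS to numbers first.
demo_df["PTS"] = pd.to_numeric(demo_df["PTS"], errors="coerce")

# idxmax returns the index label of the first row holding the maximum PTS.
top_scorer = demo_df.loc[demo_df["PTS"].idxmax()]

# One row answers all three questions: team (Tm), position (Pos), games (G).
print(top_scorer[["Player", "Tm", "Pos", "G", "PTS"]])
```

Note that `idxmax` returns only the first row if several players tie for the maximum, whereas the `== max()` filter used in the report returns every tied row.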
4. Which player had the highest Assists Per Game? (2M)
Your answer: James Harden
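# Show the full row for the player with the highest assists per game.
imported_df[(imported_df.AST == imported_df.AST.max())]
# Keep only the player name and the AST value.
result_4 = imported_df.loc[imported_df['AST'] ==
                           imported_df['AST'].max(), ['Player', 'AST']]
print(result_4)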
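To see how far ahead the leader is, it can help to list the next few players as well. The following is a minimal sketch, assuming a made-up table (`demo_df`, hypothetical names and values) in place of `imported_df`; it uses `DataFrame.nlargest` to return the top three rows by AST in descending order.

```python
import pandas as pd

# Hypothetical stand-in for imported_df; all values are made up for illustration.
demo_df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D"],
    "AST": [10.2, 7.5, 9.8, 3.1],
})

# nlargest sorts by AST in descending order and keeps the first 3 rows.
top_assists = demo_df.nlargest(3, "AST")
print(top_assists)
```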
5. How many players score more than 25 points per game? (2M)
Your answer: 24 Players
# Filter to rows with PTS above 25, then count the remaining rows.
total_players_over_25_pts = len(imported_df[imported_df['PTS'] > 25])
print("Total players with more than 25 points:",
      total_players_over_25_pts)
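The count above is a count of rows, which equals the number of players only if each player appears once. The following is a minimal sketch of a check for that, assuming (this is an assumption about the scraped table, not something stated in the report) that a player who changed teams may appear on several rows, for example one per team plus a combined row. The table `demo_df` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for imported_df; "Player B" appears on three rows
# to mimic a traded player (combined row plus one row per team).
demo_df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player B", "Player B", "Player C"],
    "Tm": ["AAA", "TOT", "BBB", "CCC", "DDD"],
    "PTS": [27.0, 26.5, 28.0, 24.0, 19.3],
})

over_25 = demo_df[demo_df["PTS"] > 25]

row_count = len(over_25)                     # counts rows, duplicates included
player_count = over_25["Player"].nunique()   # counts distinct player names

print("Rows with more than 25 PTS:", row_count)
print("Distinct players with more than 25 PTS:", player_count)
```

If the two numbers differ on the real data, the distinct-player count is the one that answers the question as worded.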
6. Create a correlation matrix/heatmap. Then explain which 3 variables have the highest
correlation to each other. (2M)
Your answer:
# Assumes seaborn is imported as sns and corr holds the correlation matrix of the numeric columns.
sns.heatmap(corr)
The 3 variables with the highest correlation to each other are FG%, 2P%, and eFG%. Their
pairwise correlations are:
• FG% (Field Goal Percentage) and 2P% (2-Point Field Goal Percentage), with a correlation
of 0.8. This is a strong positive correlation: as a player's 2-point field goal percentage
increases, their overall field goal percentage also tends to increase.
• FG% and eFG% (Effective Field Goal Percentage), with a correlation of 0.6. This is a
positive correlation, but weaker than the one between FG% and 2P%. eFG% adjusts field goal
percentage for the fact that 3-point shots are worth more points than 2-point shots, so the
two measures tend to rise together.
• 2P% and eFG%, with a correlation of 0.54. This is also a positive correlation, slightly
weaker than the one between FG% and eFG%. Players with a higher 2-point field goal
percentage tend to have a higher effective field goal percentage.
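The ranking of pairs can also be read from the correlation matrix programmatically rather than by eye from the heatmap. The following is a minimal sketch, assuming a small made-up table (`demo_df`, hypothetical values) in place of `imported_df`; it computes the Pearson correlation matrix with `DataFrame.corr`, lists each pair of columns once, and sorts the pairs from highest to lowest correlation.

```python
from itertools import combinations

import pandas as pd

# Hypothetical stand-in for the numeric columns of imported_df;
# all values are made up for illustration.
demo_df = pd.DataFrame({
    "FG%": [0.45, 0.50, 0.48, 0.55, 0.42],
    "2P%": [0.50, 0.56, 0.53, 0.60, 0.47],
    "3P%": [0.36, 0.33, 0.38, 0.30, 0.35],
    "FT%": [0.80, 0.70, 0.85, 0.65, 0.78],
})

# Pearson correlation between every pair of columns.
corr = demo_df.corr()

# combinations() yields each unordered pair once and skips the diagonal,
# so a column is never paired with itself and no pair is listed twice.
pairs = pd.Series({
    (a, b): corr.loc[a, b] for a, b in combinations(corr.columns, 2)
})

# The three most strongly positively correlated pairs.
top_pairs = pairs.sort_values(ascending=False).head(3)
print(top_pairs)
```

To rank by strength regardless of sign, sort `pairs.abs()` instead.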
***You are required to submit:
Lab report in PDF format with the screenshot of code and output of your lab task.
Submission: INDIVIDUAL SUBMISSION
File type: PDF file (.pdf)
File name: matricNumber-Lab3.pdf (example: 20002199-Lab3.pdf)
Deadline: Before Lab 4 Session, submit according to GA instruction (see the PPT slides).
*Copying is prohibited. Late submission without acceptable reasons will not be considered.