The document outlines a Python script for analyzing a Netflix dataset using pandas and seaborn. It includes creating a heatmap to visualize the correlation between IMDb Score, Hidden Gem Score, and IMDb Votes, as well as plotting line graphs to compare IMDb Votes and Scores across different movie types. Additionally, it notes a FutureWarning regarding the deprecated 'ci' parameter in seaborn's lineplot function.

Uploaded by

anuj rawat

nloypqbmz

December 26, 2024

[1]: import pandas as pd

# Load the dataset
url = 'https://itv-contentbucket.s3.ap-south-1.amazonaws.com/Exams/AWP/pandas/Netflix.csv'
df = pd.read_csv(url)

1) A heatmap to understand the correlation between IMDb Score, Hidden Gem Score, and IMDb Votes
[2]: import seaborn as sns
import matplotlib.pyplot as plt

# Select the relevant columns
correlation_data = df[['IMDb Score', 'Hidden Gem Score', 'IMDb Votes']].dropna()

# Calculate the correlation matrix
correlation_matrix = correlation_data.corr()

# Create the heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation between IMDb Score, Hidden Gem Score, and IMDb Votes')
plt.show()
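
Each cell of the heatmap is a pairwise correlation coefficient; `DataFrame.corr()` uses the Pearson method by default. As a rough illustration of what one such cell represents, here is a minimal standard-library sketch. The function name `pearson` and the sample values are assumptions made up for this example and are not taken from Netflix.csv.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only (not from the dataset)
scores = [6.1, 7.4, 8.2, 5.5, 7.9]
votes = [1200, 54000, 310000, 800, 98000]

r = pearson(scores, votes)
print(f"{r:.2f}")  # same two-decimal format as fmt='.2f' in the heatmap
```

The result always lies between -1 and 1, and a column correlated with itself gives 1, which is why the diagonal of the heatmap is all ones.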

2) Plot lines for categories of every movie type and analyze how they have received IMDb Votes. Create a subplot to compare the same categories with IMDb Score.
[3]: import seaborn as sns
import matplotlib.pyplot as plt

# Ensure 'Series or Movie' is treated as a string and the IMDb columns as numbers
df['Series or Movie'] = df['Series or Movie'].astype(str)
df['IMDb Votes'] = pd.to_numeric(df['IMDb Votes'], errors='coerce')
df['IMDb Score'] = pd.to_numeric(df['IMDb Score'], errors='coerce')

# One figure with two side-by-side subplots
plt.figure(figsize=(14, 6))

# Subplot 1: IMDb Votes
plt.subplot(1, 2, 1)
sns.lineplot(data=df, x='Series or Movie', y='IMDb Votes', marker='o', ci=None)
plt.title('IMDb Votes for Movie Types')
plt.xlabel('Movie Type')
plt.ylabel('IMDb Votes')

# Subplot 2: IMDb Score
plt.subplot(1, 2, 2)
sns.lineplot(data=df, x='Series or Movie', y='IMDb Score', marker='o', ci=None)
plt.title('IMDb Score for Movie Types')
plt.xlabel('Movie Type')
plt.ylabel('IMDb Score')

plt.tight_layout()
plt.show()
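
When several rows share the same x value, `sns.lineplot` aggregates them with its default estimator, the mean, so each marker in the plots above is a per-category average. The sketch below reproduces that aggregation with the standard library only. The helper name `mean_by_category` and the rows are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up (type, IMDb votes) rows for illustration only (not from the dataset)
rows = [
    ("Movie", 1000),
    ("Movie", 3000),
    ("Series", 500),
    ("Series", 1500),
    ("Series", 1000),
]


def mean_by_category(pairs):
    """Group values by category and return the mean of each group."""
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}


votes_by_type = mean_by_category(rows)
print(votes_by_type)

# With the notebook's DataFrame, the pandas equivalent would be:
# df.groupby('Series or Movie')['IMDb Votes'].mean()
```

Because the x axis holds only a few categories, the same averages could also be shown as a bar chart. The line here simply connects the category means.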

C:\Users\lenovo\AppData\Local\Temp\ipykernel_10248\35240728.py:14: FutureWarning:

The `ci` parameter is deprecated. Use `errorbar=None` for the same effect.

  sns.lineplot(data=df, x='Series or Movie', y='IMDb Votes', marker='o', ci=None)

C:\Users\lenovo\AppData\Local\Temp\ipykernel_10248\35240728.py:21: FutureWarning:

The `ci` parameter is deprecated. Use `errorbar=None` for the same effect.

  sns.lineplot(data=df, x='Series or Movie', y='IMDb Score', marker='o', ci=None)
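
The warning can be silenced by following its own advice and passing `errorbar=None` instead of `ci=None`. The sketch below assumes the `errorbar` parameter is available from seaborn 0.12 onward and falls back to `ci` on older releases. The helper names `no_errorbar_kwargs` and `plot_votes` are hypothetical and not part of seaborn or the original notebook.

```python
def no_errorbar_kwargs(seaborn_version):
    """Return the keyword that disables error bars for a given seaborn version.

    Assumption: `errorbar` replaced `ci` starting with seaborn 0.12.
    """
    major, minor = (int(part) for part in seaborn_version.split(".")[:2])
    if (major, minor) >= (0, 12):
        return {"errorbar": None}
    return {"ci": None}


def plot_votes(df):
    """Draw the IMDb Votes line plot without triggering the FutureWarning."""
    # Imported here so the helper above can be used without seaborn installed
    import seaborn as sns

    kwargs = no_errorbar_kwargs(sns.__version__)
    return sns.lineplot(
        data=df, x='Series or Movie', y='IMDb Votes', marker='o', **kwargs
    )
```

If only a recent seaborn needs to be supported, the helper is unnecessary: replacing `ci=None` with `errorbar=None` in both `sns.lineplot` calls is enough.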
