Assignment 2

Uploaded by

aryan540787

ASSIGNMENT 2

Q1. CONVERT A CONTINUOUS VARIABLE INTO QUANTILE BINS AND
PLOT THEM USING COUNTPLOTS. USE THE DIAMONDS DATASET AND
DO THIS FOR COLUMNS X, Y, Z.
ANS1. To achieve this, you can use the seaborn library in Python, along with
pandas for data manipulation. Here's a step-by-step guide to convert continuous
variables into quantile bins and plot them using count plots for columns X, Y, and
Z in the diamonds dataset:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Load the diamonds dataset
diamonds = sns.load_dataset('diamonds')

# Columns to bin
columns = ['x', 'y', 'z']

# Create five quantile bins (labelled 0-4) for each column
for col in columns:
    diamonds[f'{col}_quantile'] = pd.qcut(diamonds[col], q=5, labels=False)

# Melt the dataset to long format for countplot
diamonds_melted = pd.melt(
    diamonds, value_vars=[f'{col}_quantile' for col in columns]
)

# Plot the bin counts, one bar colour per original column
sns.countplot(data=diamonds_melted, x='value', hue='variable')
plt.show()
```

This code creates a count plot of the quantile bins for columns X, Y, and Z in the
diamonds dataset. Each bin covers a range of values of its column, and the count
plot shows how many data points fall in each bin. Because `pd.qcut` places its cut
points at quantiles, the five bins of each column hold roughly equal counts. A
more concise version:

```python
import seaborn as sns
import pandas as pd

diamonds = sns.load_dataset('diamonds')
columns = ['x', 'y', 'z']

for col in columns:
    diamonds[f'{col}_quantile'] = pd.qcut(diamonds[col], q=5, labels=False)

sns.countplot(data=pd.melt(diamonds, value_vars=[f'{col}_quantile' for col in columns]),
              x='value', hue='variable')
```

This code accomplishes the same task in a more condensed form.
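
To see what `pd.qcut` and `pd.melt` contribute before anything is plotted, the following minimal sketch applies the same two steps to a small made-up DataFrame. The `toy` frame and its values are assumptions for illustration and are not part of the diamonds data.

```python
import pandas as pd

# Hypothetical toy data standing in for the diamonds columns
toy = pd.DataFrame({'x': range(1, 11), 'y': range(10, 101, 10)})

# Assign each value to one of five quantile bins, labelled 0-4
for col in ['x', 'y']:
    toy[f'{col}_quantile'] = pd.qcut(toy[col], q=5, labels=False)

# Long format: one row per (column name, bin label) pair
long_toy = pd.melt(toy, value_vars=['x_quantile', 'y_quantile'])

# Number of rows in each bin, per original column
bin_counts = long_toy.groupby(['variable', 'value']).size()
print(bin_counts)
```

Each bin holds two of the ten values here. In real data, tied values can make the counts only approximately equal.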

Q2. PLOT A BIVARIATE RELATIONSHIP BETWEEN ALL THE
CATEGORICAL AND NUMERIC COLUMNS OF HOUSING DATA TO
UNDERSTAND THE RELATIONSHIP BETWEEN THE VARIABLES.
WHAT APPROACHES DID YOU TAKE?
ANS 2. To plot a bivariate relationship between all the categorical and numeric
columns of the housing data, you can use seaborn's `pairplot` function. This
function creates a grid of scatterplots and histograms for each pair of variables,
allowing you to visualize the relationships between them. Here's a step-by-step
approach:

1. Load the housing dataset.

2. Separate the categorical and numeric columns.
3. Use `pairplot` to plot the bivariate relationships between all pairs of variables.

Here's how you can do it:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Load the housing dataset (replace 'housing_data.csv' with the actual file name)
housing_data = pd.read_csv('housing_data.csv')

# Separate categorical and numeric columns
numeric_columns = housing_data.select_dtypes(include=['int64', 'float64'])
categorical_columns = housing_data.select_dtypes(include=['object'])

# Encode each categorical column as integer codes so pairplot can draw it
encoded_categoricals = categorical_columns.apply(
    lambda x: x.astype('category').cat.codes
)

# Plot bivariate relationships using pairplot
sns.pairplot(data=numeric_columns.join(encoded_categoricals), diag_kind='hist')
plt.show()
```

In this approach:

- We load the housing dataset.

- We separate the numeric and categorical columns.
- We convert the categorical columns to numeric using `.apply(lambda x:
x.astype('category').cat.codes)` so that we can include them in the `pairplot`.
- Finally, we use `pairplot` to visualize the relationships between all pairs of
numeric and converted categorical variables. The `diag_kind='hist'` parameter
specifies that histograms should be plotted along the diagonal.
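
The integer encoding used above can be inspected on its own. The sketch below uses an invented `furnishing` column (an assumption, not a column known to exist in the housing file) to show how `.cat.codes` numbers the categories.

```python
import pandas as pd

# Hypothetical categorical column; the labels are invented for illustration
furnishing = pd.Series(['semi', 'furnished', 'unfurnished', 'semi'])

# Categories are sorted alphabetically, then numbered from 0
as_category = furnishing.astype('category')
codes = as_category.cat.codes

print(list(as_category.cat.categories))
print(codes.tolist())
```

The codes follow the alphabetical order of the labels, so their numeric spacing carries no meaning. In the pair plot, panels involving an encoded column are best read as one vertical strip per category rather than as a numeric trend.
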
Q3. GENERATE A HEATMAP BETWEEN ALL THE CONTINUOUS
VARIABLES OF THE HOUSING DATASET. MAKE SURE YOU
UNDERSTAND WHICH VARIABLES ARE CONTINUOUS IN THE GIVEN
HOUSING INPUT DATASET.
ANS 3. Continuous variables in a dataset are those that can take any value within a
certain range. In the context of the housing dataset, continuous variables might
include features like square footage and price. Counts such as the number of
bedrooms are numeric but discrete, so check each numeric column before treating
it as continuous. To generate a heatmap between all the continuous variables in
the housing dataset, you can use the `corr()` function to calculate the correlation
matrix and then visualize it using seaborn's heatmap. Here's how you can do it:

```python
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

# Load the housing dataset (replace 'housing_data.csv' with the actual file name)
housing_data = pd.read_csv('housing_data.csv')

# Select the numeric columns (this also picks up discrete counts stored as numbers)
continuous_variables = housing_data.select_dtypes(include=['int64', 'float64'])

# Calculate the correlation matrix
correlation_matrix = continuous_variables.corr()

# Plot heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Heatmap of Continuous Variables')
plt.show()
```

In this code:

- We load the housing dataset.

- We select the numeric columns using `select_dtypes`.
- We calculate the correlation matrix using the `corr()` function.
- Finally, we plot the heatmap using seaborn's `heatmap` function, specifying
parameters such as `annot=True` to display the correlation values on the heatmap
and `cmap='coolwarm'` to choose the color palette.

The same code with a comment on each step:

```python
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

# Load the housing dataset (replace 'housing_data.csv' with the actual file name)
housing_data = pd.read_csv('housing_data.csv')

# Select the numeric columns
continuous_variables = housing_data.select_dtypes(include=['int64', 'float64'])

# Calculate the correlation matrix
correlation_matrix = continuous_variables.corr()

# Plot heatmap
plt.figure(figsize=(10, 8))  # Set the size of the heatmap
# annot=True shows correlation values, cmap='coolwarm' selects the color
# palette, fmt=".2f" formats values to 2 decimal places
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Heatmap of Continuous Variables')  # Set the title of the heatmap
plt.show()  # Display the heatmap
```

The steps in more detail:

1. **Load the dataset**: Reads the housing dataset into a pandas DataFrame.
Replace `'housing_data.csv'` with the actual filename.

2. **Filter continuous variables**: Selects the numeric columns from the dataset
using `select_dtypes`. This ensures that only numerical variables are included in
the correlation analysis.

3. **Calculate the correlation matrix**: Computes the correlation coefficients
between all pairs of selected variables using the `corr()` method. This generates
a symmetric matrix where each element represents the correlation between two
variables.

4. **Plot the heatmap**: Creates a heatmap using seaborn's `heatmap` function.
The heatmap visualizes the correlation matrix, with each cell color representing the
strength and direction of the correlation between two variables. Parameters such as
`annot=True`, `cmap='coolwarm'`, and `fmt=".2f"` are used to enhance the
visualization by displaying correlation values, selecting a color palette, and
formatting values to two decimal places, respectively.

5. **Display the plot**: Finally, the heatmap is displayed using `plt.show()`.

This heatmap allows you to quickly identify patterns of association between pairs
of continuous variables in the housing dataset. With the diverging `coolwarm`
palette, strong positive correlations appear in deep red, strong negative
correlations in deep blue, and correlations near zero in pale, neutral shades.
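
The properties of the correlation matrix described above can be checked on synthetic data. This is a minimal sketch: the column names `area`, `price`, and `age` and all values are made up for illustration and do not come from the housing file.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric columns; names and values are made up for illustration
rng = np.random.default_rng(0)
area = rng.uniform(500, 3000, size=200)
price = 100 * area + rng.normal(0, 20000, size=200)  # rises with area
age = rng.uniform(0, 50, size=200)                   # generated independently of area

frame = pd.DataFrame({'area': area, 'price': price, 'age': age})
corr = frame.corr()  # Pearson correlation by default
print(corr.round(2))
```

The diagonal is 1 because each variable is perfectly correlated with itself, and the matrix is symmetric, so the upper and lower triangles of the heatmap carry the same information.
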
Q4. FIND THE ROOTS OF THE EQUATION x^2 - 6x + 5 = 0 AND
POINT OUT THE ROOTS ON A MATPLOTLIB GRAPH.
ANS 4. To find the roots of the equation \(x^2 - 6x + 5 = 0\), you can use the
quadratic formula:

\[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\]

where \(a\), \(b\), and \(c\) are the coefficients of the quadratic equation
\(ax^2 + bx + c = 0\).

For the given equation, \(a = 1\), \(b = -6\), and \(c = 5\). The discriminant is
\(b^2 - 4ac = 36 - 20 = 16\), so the roots are \(x = (6 \pm 4)/2\), that is,
\(x = 5\) and \(x = 1\).

Now, let's calculate the roots and then plot them on a Matplotlib graph:

```python
import numpy as np
import matplotlib.pyplot as plt

# Coefficients of the quadratic equation
a = 1
b = -6
c = 5

# Calculate the discriminant
discriminant = b**2 - 4*a*c

# Check if the discriminant is positive, zero, or negative
if discriminant > 0:
    # Two real and distinct roots
    root1 = (-b + np.sqrt(discriminant)) / (2*a)
    root2 = (-b - np.sqrt(discriminant)) / (2*a)
    print("Roots are real and distinct.")
    print("Root 1:", root1)
    print("Root 2:", root2)

    # Plot the quadratic equation
    x = np.linspace(min(root1, root2) - 1, max(root1, root2) + 1, 100)
    y = a*x**2 + b*x + c
    plt.plot(x, y, label='y = x^2 - 6x + 5')

    # Plot the roots
    plt.scatter([root1, root2], [0, 0], color='red', label='Roots')

elif discriminant == 0:
    # One real root (repeated)
    root = -b / (2*a)
    print("Root is real and repeated.")
    print("Root:", root)

    # Plot the quadratic equation
    x = np.linspace(root - 1, root + 1, 100)
    y = a*x**2 + b*x + c
    plt.plot(x, y, label='y = x^2 - 6x + 5')

    # Plot the root
    plt.scatter(root, 0, color='red', label='Root')

else:
    # Complex roots
    real_part = -b / (2*a)
    imaginary_part = np.sqrt(abs(discriminant)) / (2*a)
    root1 = complex(real_part, imaginary_part)
    root2 = complex(real_part, -imaginary_part)
    print("Roots are complex.")
    print("Root 1:", root1)
    print("Root 2:", root2)

    # Complex roots cannot be plotted on a real number line,
    # but we can still plot the quadratic equation
    x = np.linspace(real_part - 1, real_part + 1, 100)
    y = a*x**2 + b*x + c
    plt.plot(x, y, label='y = x^2 - 6x + 5')

# Add labels and legend
plt.xlabel('x')
plt.ylabel('y')
plt.title('Quadratic Equation and Roots')
plt.axhline(0, color='black', linewidth=0.5)
plt.axvline(0, color='black', linewidth=0.5)
plt.grid(color='gray', linestyle='--', linewidth=0.5)
plt.legend()
plt.show()
```
This code calculates the roots of the quadratic equation and plots the equation
along with the roots on a Matplotlib graph. Depending on the nature of the roots
(real and distinct, real and repeated, or complex), it plots accordingly.
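
As a cross-check on the formula-based roots, the sketch below uses NumPy's polynomial helpers instead of the quadratic formula. This is an alternative route and not the method used in the answer above.

```python
import numpy as np

# Coefficients of x^2 - 6x + 5, highest power first
coefficients = [1, -6, 5]

roots = np.sort(np.roots(coefficients))
print(roots)

# Each root should make the polynomial evaluate to (numerically) zero
residuals = np.polyval(coefficients, roots)
print(residuals)
```

Substituting each root back into the polynomial is a quick way to confirm that the red markers in the plot sit on the x-axis.
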
Q5. GRAPH SIN(X) AND COS(X) IN MATPLOTLIB. MAKE SURE BOTH
CURVES ARE DRAWN IN ONE PLOT ONLY.
ANS 5. To graph both \( \sin(x) \) and \( \cos(x) \) on the same plot using
Matplotlib, you can simply compute the values of both functions over a range of
\( x \) values and then plot them together. Here's how you can do it:

```python
import numpy as np
import matplotlib.pyplot as plt

# Generate x values
x = np.linspace(0, 2*np.pi, 100)

# Compute sin(x) and cos(x) values

sin_values = np.sin(x)
cos_values = np.cos(x)

# Plot sin(x) and cos(x) on the same plot

plt.plot(x, sin_values, label='sin(x)')
plt.plot(x, cos_values, label='cos(x)')

# Add labels and legend

plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of sin(x) and cos(x)')
plt.legend()

# Show plot
plt.show()
```

This code generates \( x \) values ranging from 0 to \( 2\pi \) (a full period of the
sine and cosine functions), computes the corresponding \( \sin(x) \) and \( \cos(x) \)
values, and then plots them on the same graph using `plt.plot()`. Finally, it adds
labels, a title, and a legend to the plot using the `plt.xlabel()`, `plt.ylabel()`,
`plt.title()`, and `plt.legend()` functions respectively. A more concise version:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2*np.pi, 100)

plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, np.cos(x), label='cos(x)')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of sin(x) and cos(x)')
plt.legend()
plt.show()
```

This code accomplishes the same task in a more condensed form. It generates \(x\)
values from 0 to \(2\pi\), computes \( \sin(x) \) and \( \cos(x) \) values, plots them
on the same graph, adds labels, a title, and a legend, and finally displays the plot.
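
When reading the combined plot, it helps to know how the two curves relate. The sketch below is a numerical check under the same \(x\) range: it confirms that the cosine curve is the sine curve shifted by \(\pi/2\), and that the curves cross at \(\pi/4\) and \(5\pi/4\). The variable names are illustrative.

```python
import numpy as np

x = np.linspace(0, 2*np.pi, 100)

# cos(x) equals sin(x) shifted left by pi/2
shift_holds = np.allclose(np.cos(x), np.sin(x + np.pi/2))

# The curves cross where sin(x) = cos(x): x = pi/4 and x = 5*pi/4 on [0, 2*pi]
crossings = np.array([np.pi/4, 5*np.pi/4])
gap_at_crossings = np.sin(crossings) - np.cos(crossings)

print(shift_holds, gap_at_crossings)
```
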
Q6. GENERATE A 3D PLOT USING NUMPY MESHGRID TO REPRESENT
A PEAK AND A VALLEY IN A GRAPH.
ANS 6. You can use NumPy's meshgrid to create a grid of points and
then plot a function that represents a peak and a valley in 3D using Matplotlib.
Here's an example:

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Generate grid points

x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)

# Define the peak and valley function

Z_peak = np.sin(np.sqrt(X**2 + Y**2)) # Peak function
Z_valley = -np.sin(np.sqrt(X**2 + Y**2)) # Valley function

# Create 3D plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Plot the peak

ax.plot_surface(X, Y, Z_peak, cmap='viridis', edgecolor='none')

# Plot the valley

ax.plot_surface(X, Y, Z_valley, cmap='magma', edgecolor='none')

# Set labels and title

ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.set_title('Peak and Valley')

# Show plot
plt.show()
```

In this code:

- We create a grid of points in the \(xy\)-plane using `np.meshgrid`.

- We define two functions, one representing a peak and the other a valley.
- We create a 3D plot using Matplotlib and plot the surfaces of the peak and valley
functions.
- Finally, we set labels for the axes and a title for the plot and display it.

You can adjust the parameters of the functions and the range of the grid points to
visualize different peaks and valleys. A more concise version:

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)

Z_peak = np.sin(np.sqrt(X**2 + Y**2))

Z_valley = -np.sin(np.sqrt(X**2 + Y**2))
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

ax.plot_surface(X, Y, Z_peak, cmap='viridis', edgecolor='none')

ax.plot_surface(X, Y, Z_valley, cmap='magma', edgecolor='none')

ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.set_title('Peak and Valley')

plt.show()
```

This code generates a grid of points, calculates the values for a peak and a valley
function, and plots them in 3D using matplotlib.
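
An alternative is to put the peak and the valley on a single surface. The sketch below is one possible choice and not the function used in the answer above: it assumes the example surface \(z = x\,e^{-x^2 - y^2}\), which rises on the positive-\(x\) side and dips on the negative-\(x\) side, and it locates both extremes on the meshgrid.

```python
import numpy as np

x = np.linspace(-2, 2, 201)
y = np.linspace(-2, 2, 201)
X, Y = np.meshgrid(x, y)  # both arrays have shape (len(y), len(x))

# Assumed example surface: positive bump for x > 0, negative dip for x < 0
Z = X * np.exp(-X**2 - Y**2)

# Locate the highest and lowest grid points
peak_idx = np.unravel_index(np.argmax(Z), Z.shape)
valley_idx = np.unravel_index(np.argmin(Z), Z.shape)
peak_xy = (X[peak_idx], Y[peak_idx])
valley_xy = (X[valley_idx], Y[valley_idx])
print("peak at", peak_xy, "valley at", valley_xy)
```

Passing these `X`, `Y`, and `Z` arrays to `ax.plot_surface` in place of the two separate surfaces would draw both features in one plot.
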
Q7. WHY DO YOU THINK IT IS IMPORTANT TO MELT THE
DATAFRAME BEFORE WE DO SOME PLOTTING? IN WHAT
SCENARIOS WILL THIS BE IMPORTANT?
ANS 7. Melting a DataFrame is often important in scenarios where the data is in a
wide format (one column per variable) and needs to be converted into a long format
(one column holding the variable names and one holding the values) for
visualization or analysis purposes. Here are a few reasons why melting data might
be important and the scenarios where it's particularly useful:

1. **Plotting Multiple Variables**: When you have multiple columns
representing different variables, melting the DataFrame allows you to plot these
variables against each other more easily. For example, when you want to create a
line plot or scatter plot with multiple lines or markers representing different
variables, melting the data makes it simpler to achieve this.

2. **Facet Plots**: In scenarios where you want to create facet plots (plots with
multiple subplots arranged in a grid, each showing a different subset of the data),
melting the data is often a prerequisite. Libraries like Seaborn provide functions
like `sns.catplot()` or `sns.relplot()` which expect data in a long format for creating
facet plots.

3. **Statistical Analysis**: Some statistical analysis techniques, such as ANOVA
or regression analysis, require data to be in a long format. Melting the data allows
you to reshape it for these analyses.

4. **Database Structure**: Sometimes, you might receive data in a wide format
because it's easier to store or manipulate in a database. However, for analytical
purposes or visualization, it's often more convenient to work with the data in a long
format. Melting the data helps in converting it to a format that's more suitable for
analysis and visualization.

5. **Machine Learning**: In machine learning tasks like feature engineering,
you might need to reshape the data to create new features or to fit certain models.
Melting can be a useful preprocessing step in such cases.

6. **Time Series Analysis**: In time series analysis, you might have data in a
wide format where each column represents a time point. Melting the DataFrame
can transform it into a long format where each row represents one observation at
one time point, making it easier to perform time-based analysis or plotting.

7. **Hierarchical Data**: If you have hierarchical or nested data where multiple
variables are grouped within different levels of a hierarchy, melting the DataFrame
can help flatten the hierarchy and make the data more amenable to analysis and
visualization.

8. **Data Integration**: When integrating data from multiple sources with
different structures, melting can help standardize the data format and make it easier
to combine or merge datasets.

9. **Interactive Visualizations**: For interactive visualizations using tools like
Plotly or Bokeh, melting the data can be beneficial as it allows for more flexibility
in creating interactive plots with dynamic features like tooltips or hover
information.

10. **Longitudinal Studies**: In longitudinal studies where data is collected
from the same subjects over multiple time points, melting the DataFrame can
organize the data by subject and time point, facilitating longitudinal analysis or
plotting.

In summary, melting a DataFrame is an important data transformation technique
that can improve the flexibility, usability, and interpretability of the data,
especially in scenarios involving complex or multi-dimensional datasets.
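
The wide-to-long reshaping discussed above can be illustrated with a small sketch. The `wide` table, its column names, and its values are hypothetical and chosen only to show the shape change that `pd.melt` performs.

```python
import pandas as pd

# Hypothetical wide table: one row per city, one column per month
wide = pd.DataFrame({
    'city': ['A', 'B'],
    'jan': [5, 12],
    'feb': [7, 14],
    'mar': [11, 18],
})

# Long format: one row per (city, month) observation
long = pd.melt(wide, id_vars='city', var_name='month', value_name='temperature')
print(long)
```

With the data in this shape, a call such as `sns.lineplot(data=long, x='month', y='temperature', hue='city')` can draw one line per city without looping over columns.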
