Data Visualization Techniques in Python

The document contains code and outputs from 8 problems visualizing different datasets in Python. The problems include: 1) Plotting GDP over time in the US using matplotlib. 2) Plotting GDP from 2001-2010 in red on the same axes. 3) Creating a 2x2 subplot to show the percentage of degrees awarded to women over time in different subjects. 4) Using Seaborn to show the relationship between life expectancy and GDP per capita from the Gapminder dataset. 5) Improving the previous plot to include population as the hue.

Uploaded by

Analyn Enrico

Worksheet 2.6
DS100-1
DATA VISUALIZATION IN PYTHON
APPLIED DATA SCIENCE
Name: Enrico, Dionne Marc L.

Write code in a Jupyter notebook as required by each problem. Copy both the code and its output as a screenshot and paste
them here. Be sure to apply the necessary customizations.

1 Import gdp_usa.csv. Use matplotlib to show the increase in GDP each year.
Code and Output

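import matplotlib.pyplot as plt
import pandas as pd

# Load the data and give the value column a descriptive name
df = pd.read_csv('gdp_usa.csv', header=0)
df = df.rename(columns={"VALUE": "GDP"})

# Line plot of GDP against date
df.plot(x="DATE", y=["GDP"], kind="line", figsize=(10, 5))
plt.show()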

2 Use the previous import to prepare another plot (in red) showing only the years 2001 to 2010.
Code and Output

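import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('gdp_usa.csv', header=0)

# Keep only 2001 through 2010 (assumes DATE holds numeric years)
df = df[(df['DATE'] > 2000) & (df['DATE'] <= 2010)]

# Same line plot as before, drawn in red
df.plot(x="DATE", y=["VALUE"], kind="line", figsize=(10, 5), color='red')
plt.show()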
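The filter above compares DATE with integers, which only works if the column holds numeric years. The worksheet does not show the file's contents, so as a hedged alternative, here is a minimal sketch for the case where DATE is stored as text such as "2001-01-01". The small DataFrame is a made-up stand-in for gdp_usa.csv, not the real data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for gdp_usa.csv with DATE stored as text (values are made up)
df = pd.DataFrame({
    "DATE": [f"{year}-01-01" for year in range(1995, 2016)],
    "VALUE": [7000.0 + 500.0 * i for i in range(21)],
})

# Convert the text column to datetimes so the year can be extracted
df["DATE"] = pd.to_datetime(df["DATE"])
years = df["DATE"].dt.year

# Keep the rows whose year falls in 2001 through 2010, inclusive
subset = df[(years >= 2001) & (years <= 2010)]

# Draw the filtered series in red
ax = subset.plot(x="DATE", y="VALUE", kind="line", figsize=(10, 5), color="red")
ax.set_ylabel("GDP")
```

In a notebook the figure appears when the cell finishes; in a plain script, add plt.show() at the end.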

3 Import ‘[Link]’. Prepare a 2x2 subplot to visualize the % degrees awarded to women yearly in
Agriculture (green), Business (yellow), Education (blue) and Psychology (red).
Code and Output

import pandas as pd
import matplotlib.pyplot as plt

Degrees = pd.read_csv('[Link]')

# Top left: Agriculture in green
plt.subplot(2, 2, 1)
plt.plot(Degrees['Year'], Degrees['Agriculture'], color='green')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Agriculture')

# Top right: Business in yellow
plt.subplot(2, 2, 2)
plt.plot(Degrees['Year'], Degrees['Business'], color='yellow')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Business')

# Bottom left: Education in blue
plt.subplot(2, 2, 3)
plt.plot(Degrees['Year'], Degrees['Education'], color='blue')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Education')

# Bottom right: Psychology in red
plt.subplot(2, 2, 4)
plt.plot(Degrees['Year'], Degrees['Psychology'], color='red')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Psychology')

plt.tight_layout()
plt.show()
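The four panels repeat the same steps, so the 2x2 grid can also be built in a loop. The sketch below uses plt.subplots and the colors the problem asks for. The DataFrame is a made-up stand-in, because the worksheet shows neither the file name nor its values.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data; these percentages are illustrative, not real
degrees = pd.DataFrame({
    "Year": [1970, 1980, 1990, 2000, 2010],
    "Agriculture": [5.0, 30.0, 33.0, 45.0, 49.0],
    "Business": [9.0, 37.0, 47.0, 50.0, 48.0],
    "Education": [74.0, 74.0, 78.0, 76.0, 79.0],
    "Psychology": [44.0, 63.0, 72.0, 77.0, 77.0],
})

# Subject-to-color mapping taken from the problem statement
colors = {
    "Agriculture": "green",
    "Business": "yellow",
    "Education": "blue",
    "Psychology": "red",
}

# One figure holding a 2x2 grid of axes
fig, axes = plt.subplots(2, 2, figsize=(10, 6))

# axes.flat walks the grid row by row, matching the dictionary order
for ax, (subject, color) in zip(axes.flat, colors.items()):
    ax.plot(degrees["Year"], degrees[subject], color=color)
    ax.set_title(subject)
    ax.set_xlabel("Year")
    ax.set_ylabel("% Degrees")

fig.tight_layout()
```

Keeping the subject-to-color mapping in one dictionary makes it easier to check the colors against the problem statement.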

4 Import [Link]. Use matplotlib to show how life expectancy varies with gdp_cap.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('[Link]')

# Scatter plot of life expectancy against GDP per capita
sns.scatterplot(x='life_exp', y='gdp_cap', data=data)
plt.xlabel('Life Expectancy')
plt.ylabel('GDP per Capita')
plt.title('Life Expectancy vs GDP per Capita')
plt.show()

5 Improve the previous visualization by including population.


Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('[Link]')

# Same scatter plot, with point color encoding population
sns.scatterplot(x='life_exp', y='gdp_cap', data=data, hue='population')
plt.xlabel('Life Expectancy')
plt.ylabel('GDP per Capita')
plt.title('Life Expectancy vs GDP per Capita')
plt.show()
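Population can also be shown as marker size rather than color, giving a bubble chart. The sketch below is one possible variation, not the worksheet's required answer. It uses randomly generated stand-in data with the same column names, and the log scale on the GDP axis is a design choice, not something the problem asks for.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in for the gapminder data (not real values)
rng = np.random.default_rng(0)
n = 30
data = pd.DataFrame({
    "gdp_cap": rng.uniform(500, 40000, n),
    "population": rng.integers(1_000_000, 1_000_000_000, n),
})
data["life_exp"] = 40 + 8 * np.log10(data["gdp_cap"]) + rng.normal(0, 2, n)

# size= maps population to marker area; sizes= sets the smallest and largest markers
ax = sns.scatterplot(
    x="gdp_cap", y="life_exp", size="population",
    sizes=(20, 400), data=data, legend=False,
)

# GDP per capita spans orders of magnitude, so a log axis spreads the points out
ax.set_xscale("log")
ax.set(
    xlabel="GDP per capita (log scale)",
    ylabel="Life expectancy",
    title="Life Expectancy vs GDP per Capita",
)
```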

6 Refer to the gapminder import. Use seaborn to show the relationship among all numeric data. Group the data according to
continent.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('[Link]')

# Scatter plot with points colored by continent
sns.scatterplot(x='life_exp', y='gdp_cap', data=data, hue='cont')
plt.xlabel('Life Expectancy')
plt.ylabel('GDP per Capita')
plt.title('Life Expectancy vs GDP per Capita by Continent')
plt.show()

7 Import [Link]. Use seaborn to show how horsepower varies with acceleration. We also want to see the distribution of
both variables in the same plot.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('[Link]')

# jointplot draws the scatter plot with each variable's distribution on the margins
g = sns.jointplot(x='hp', y='accel', data=data)
g.set_axis_labels('Horsepower', 'Acceleration')
g.figure.suptitle('Horsepower vs Acceleration', y=1.02)
plt.show()

8 Using the auto dataset, prepare side-by-side plots showing the distribution of the number of cylinders. Use a swarm plot on
the left side and a strip plot on the right side.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('[Link]')

# Left panel: swarm plot
plt.subplot(1, 2, 1)
sns.swarmplot(x='cyl', y='marker', data=data, size=0.6)
plt.xlabel('Number of Cylinders')
plt.ylabel('Cylinder Markers')
plt.title('Swarm Plot')

# Right panel: strip plot
plt.subplot(1, 2, 2)
sns.stripplot(x='cyl', y='marker', data=data)
plt.xlabel('Number of Cylinders')
plt.ylabel('Cylinder Markers')
plt.title('Strip Plot')

plt.tight_layout()
plt.show()
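Side-by-side seaborn plots can also be built by creating both axes up front and passing each one through the ax parameter. The sketch below uses randomly generated stand-in data. Plotting hp on the vertical axis is an assumption made for illustration, since the worksheet does not show the contents of the auto file.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Randomly generated stand-in for the auto data (not real values)
rng = np.random.default_rng(2)
data = pd.DataFrame({
    "cyl": rng.choice([4, 6, 8], size=60),
    "hp": rng.uniform(50, 220, size=60),
})

# One row, two columns; a shared y axis keeps the panels comparable
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Swarm plot on the left: points are spread out so none overlap
sns.swarmplot(x="cyl", y="hp", data=data, ax=ax_left)
ax_left.set_title("Swarm Plot")
ax_left.set_xlabel("Number of Cylinders")

# Strip plot on the right: points are jittered at random and may overlap
sns.stripplot(x="cyl", y="hp", data=data, ax=ax_right)
ax_right.set_title("Strip Plot")
ax_right.set_xlabel("Number of Cylinders")

fig.tight_layout()
```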
