INTERMEDIATE PYTHON

Line plot (1)

• print() the last item from both the year and the pop list to see what the predicted population for the year 2100 is. Use two print() functions.
• Before you can start, you should import matplotlib.pyplot as plt. pyplot is a sub-package of matplotlib, hence the dot.
• Use plt.plot() to build a line plot. year should be mapped on the horizontal axis, pop on the vertical axis. Don’t forget to finish off with the plt.show() function to actually display the plot.

# edited/added
import numpy as np
year=list(range(1950,2100+1))
pop=list(np.loadtxt('pop1.txt', dtype=float))
# Print the last item from year and pop
print(year[-1])
print(pop[-1])
# Import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
# Make a line plot: year on the x-axis, pop on the y-axis
plt.plot(year, pop)
# Display the plot with plt.show()
plt.show()

Line Plot (2): Interpretation
Have another look at the plot you created in the previous exercise; it’s shown on the right. Based on the plot, in approximately what year will there be more than ten billion human beings on this planet?

2040
2060
2085
2095

Line plot (3)

• Print the last item from both the list gdp_cap, and the list life_exp; it is information about Zimbabwe.
• Build a line chart, with gdp_cap on the x-axis, and life_exp on the y-axis. Does it make sense to plot this data on a line plot?
• Don’t forget to finish off with a plt.show() command, to actually display the plot.

# edited/added
gdp_cap=list(np.loadtxt('gdp_cap.txt', dtype=float))
life_exp=list(np.loadtxt('life_exp.txt', dtype=float))
# Print the last item of gdp_cap and life_exp
print(gdp_cap[-1])
print(life_exp[-1])
# Make a line plot, gdp_cap on the x-axis, life_exp on the y-axis
plt.plot(gdp_cap, life_exp)
# Display the plot
plt.show()
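
The line-plot steps above can be tried without the course data files. The sketch below is only an illustration: the year and pop values are made up (they are not the contents of pop1.txt), and the Agg backend is an assumption chosen so the code runs without a display. In an interactive session you would drop the two backend lines and call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

# Hypothetical, made-up values; the exercises load the real ones from text files.
year = [1950, 1970, 1990, 2010]
pop = [2.53, 3.69, 5.26, 6.97]

# Index -1 selects the last item of each list, i.e. the last point on the line.
last_point = (year[-1], pop[-1])
print(last_point)

# First argument goes on the horizontal axis, second on the vertical axis.
# plt.plot() returns a list with one Line2D object per plotted line.
lines = plt.plot(year, pop)
print(list(lines[0].get_xdata()))

# Clear the figure so later plots start from a clean slate.
plt.clf()
```

A line plot connects the points in the order they appear in the lists, which is why it suits data with a natural ordering such as years.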

Scatter Plot (1)

• Change the line plot that’s coded in the script to a scatter plot.
• A correlation will become clear when you display the GDP per capita on a logarithmic scale. Add the line plt.xscale('log').
• Finish off your script with plt.show() to display the plot.

# Change the line plot below to a scatter plot
plt.scatter(gdp_cap, life_exp)
# Put the x-axis on a logarithmic scale
plt.xscale('log')
# Show plot
plt.show()

Scatter plot (2)

• Start from scratch: import matplotlib.pyplot as plt.
• Build a scatter plot, where pop is mapped on the horizontal axis, and life_exp is mapped on the vertical axis.
• Finish the script with plt.show() to actually display the plot. Do you see a correlation?

# edited/added
pop=list(np.loadtxt('pop2.txt', dtype=float))
# Import package
import matplotlib.pyplot as plt
# Build Scatter plot
plt.scatter(pop, life_exp)
# Show plot
plt.show()

Build a histogram (1)

• Use plt.hist() to create a histogram of the values in life_exp. Do not specify the number of bins; Python will set the number of bins to 10 by default for you.
• Add plt.show() to actually display the histogram. Can you tell which bin contains the most observations?

# Create histogram of life_exp data
plt.hist(life_exp)
# Display histogram
plt.show()

Build a histogram (2): bins

• Build a histogram of life_exp, with 5 bins. Can you tell which bin contains the most observations?
• Build another histogram of life_exp, this time with 20 bins. Is this better?

# Build histogram with 5 bins
plt.hist(life_exp, bins = 5)
# Show and clear plot
plt.show()
plt.clf()
# Build histogram with 20 bins
plt.hist(life_exp, bins = 20)
# Show and clear plot again
plt.show()
plt.clf()

Build a histogram (3): compare

• Build a histogram of life_exp with 15 bins.
• Build a histogram of life_exp1950, also with 15 bins. Is there a big difference with the histogram for the 2007 data?

# edited/added
life_exp1950=list(np.loadtxt('life_exp1950.txt', dtype=float))
# Histogram of life_exp, 15 bins
plt.hist(life_exp, bins = 15)
# Show and clear plot
plt.show()
plt.clf()
# Histogram of life_exp1950, 15 bins
plt.hist(life_exp1950, bins = 15)
# Show and clear plot again
plt.show()
plt.clf()

Choose the right plot (1)
You’re a professor teaching Data Science with Python, and you want to visually assess if the grades on your exam follow a particular distribution. Which plot do you use?

Line plot
Scatter plot
Histogram

Choose the right plot (2)
You’re a professor in Data Analytics with Python, and you want to visually assess if longer answers on exam questions lead to higher grades. Which plot do you use?

Line plot
Scatter plot
Histogram

Labels

• The strings xlab and ylab are already set for you. Use these variables to set the label of the x- and y-axis.
• The string title is also coded for you. Use it to add a title to the plot.
• After these customizations, finish the script with plt.show() to actually display the plot.

# Basic scatter plot, log scale
plt.scatter(gdp_cap, life_exp)
plt.xscale('log')
# Strings
xlab = 'GDP per Capita [in USD]'
ylab = 'Life Expectancy [in years]'
title = 'World Development in 2007'
# Add axis labels
plt.xlabel(xlab)
plt.ylabel(ylab)
# Add title
plt.title(title)
# After customizing, display the plot
plt.show()

Ticks

• Use tick_val and tick_lab as inputs to the xticks() function to make the plot more readable.
• As usual, display the plot with plt.show() after you’ve added the customizations.

# Scatter plot
plt.scatter(gdp_cap, life_exp)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
# Definition of tick_val and tick_lab
tick_val = [1000, 10000, 100000]
tick_lab = ['1k', '10k', '100k']
# Adapt the ticks on the x-axis
plt.xticks(tick_val, tick_lab)
# After customizing, display the plot
plt.show()

Sizes

• Run the script to see how the plot changes.
• Looks good, but increasing the size of the bubbles will make things stand out more.
o Import the numpy package as np.
o Use np.array() to create a numpy array from the list pop. Call this NumPy array np_pop.
o Double the values in np_pop by setting the value of np_pop equal to np_pop * 2. Because np_pop is a NumPy array, each array element will be doubled.
o Change the s argument inside plt.scatter() to be np_pop instead of pop.

# Import numpy as np
import numpy as np
# Store pop as a numpy array: np_pop
np_pop = np.array(pop)
# Double np_pop
np_pop = np_pop * 2
# Update: set s argument to np_pop
plt.scatter(gdp_cap, life_exp, s = np_pop)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000, 10000, 100000],['1k', '10k', '100k'])
# Display the plot
plt.show()

Colors

• Add c = col to the arguments of the plt.scatter() function.
• Change the opacity of the bubbles by setting the alpha argument to 0.8 inside plt.scatter(). Alpha can be set from zero to one, where zero is totally transparent, and one is not at all transparent.

# edited/added
col=list(np.loadtxt('col.txt', dtype=str))
# Specify c and alpha inside plt.scatter()
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha = 0.8)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])
# Show the plot
plt.show()

Additional Customizations

• Add plt.grid(True) after the plt.text() calls so that gridlines are drawn on the plot.

# Scatter plot
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha = 0.8)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])
# Additional customizations
plt.text(1550, 71, 'India')
plt.text(5700, 80, 'China')
# Add grid() call
plt.grid(True)
# Show the plot
plt.show()

Interpretation
If you have a look at your colorful plot, it’s clear that people live longer in countries with a higher GDP per capita. No high income countries have really short life expectancy, and no low income countries have very long life expectancy. Still, there is a huge difference in life expectancy between countries on the same income level. Most people live in middle income countries where difference in lifespan is huge between countries; depending on how income is distributed and how it is used.
What can you say about the plot?

The countries in blue, corresponding to Africa, have both low life expectancy and a low GDP per capita.
There is a negative correlation between GDP per capita and life expectancy.
China has both a lower GDP per capita and lower life expectancy compared to India.

Motivation for dictionaries

• Use the index() method on countries to find the index of 'germany'. Store this index as ind_ger.
• Use ind_ger to access the capital of Germany from the capitals list. Print it out.

# Definition of countries and capital
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']
# Get index of 'germany': ind_ger
ind_ger = countries.index('germany')
# Use ind_ger to print out capital of Germany
print(capitals[ind_ger])

Create dictionary

• With the strings in countries and capitals, create a dictionary called europe with 4 key:value pairs. Beware of capitalization! Make sure you use lowercase characters everywhere.
• Print out europe to see if the result is what you expected.

# Definition of countries and capital
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']
# From string in countries and capitals, create dictionary europe
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo'}
# Print europe
print(europe)

Access dictionary

• Check out which keys are in europe by calling the keys() method on europe. Print out the result.
• Print out the value that belongs to the key 'norway'.

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }
# Print out the keys in europe
print(europe.keys())
# Print out value that belongs to key 'norway'
print(europe['norway'])
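
The difference between the two-list lookup and the dictionary lookup can be put side by side. This is a small sketch using the same countries and capitals lists; building the dictionary with dict(zip(...)) is one possible shortcut and is not the approach the exercise asks for (which types the pairs out by hand).

```python
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']

# Two-list lookup: find the position first, then index the second list.
capital_via_lists = capitals[countries.index('germany')]

# zip() pairs the items up; dict() turns the pairs into key:value entries.
europe = dict(zip(countries, capitals))

# Dictionary lookup: the key leads straight to the value.
capital_via_dict = europe['germany']

print(capital_via_lists, capital_via_dict)
print(list(europe.keys()))
```

Both lookups give the same capital, but the dictionary version does not depend on the two lists staying in the same order.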
Dictionary Manipulation (1)

• Add the key 'italy' with the value 'rome' to europe.
• To assert that 'italy' is now a key in europe, print out 'italy' in europe.
• Add another key:value pair to europe: 'poland' is the key, 'warsaw' is the corresponding value.
• Print out europe.

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }
# Add italy to europe
europe['italy'] = 'rome'
# Print out italy in europe
print('italy' in europe)
# Add poland to europe
europe['poland'] = 'warsaw'
# Print europe
print(europe)

Dictionary Manipulation (2)

• The capital of Germany is not 'bonn'; it’s 'berlin'. Update its value.
• Australia is not in Europe, Austria is! Remove the key 'australia' from europe.
• Print out europe to see if your cleaning work paid off.

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'bonn',
          'norway':'oslo', 'italy':'rome', 'poland':'warsaw',
          'australia':'vienna' }
# Update capital of germany
europe['germany'] = 'berlin'
# Remove australia
del(europe['australia'])
# Print europe
print(europe)

Dictionariception

• Use chained square brackets to select and print out the capital of France.
• Create a dictionary, named data, with the keys 'capital' and 'population'. Set them to 'rome' and 59.83, respectively.
• Add a new key-value pair to europe; the key is 'italy' and the value is data, the dictionary you just built.

# Dictionary of dictionaries
europe = { 'spain': { 'capital':'madrid', 'population':46.77 },
           'france': { 'capital':'paris', 'population':66.03 },
           'germany': { 'capital':'berlin', 'population':80.62 },
           'norway': { 'capital':'oslo', 'population':5.084 } }
# Print out the capital of France
print(europe['france']['capital'])
# Create sub-dictionary data
data = { 'capital':'rome', 'population':59.83 }
# Add data to europe under key 'italy'
europe['italy'] = data
# Print europe
print(europe)
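
The dictionary operations from the exercises above (add, update, delete, membership test, chained brackets) can be combined in one short sketch. The two-country starting dictionary is trimmed for brevity, and the updated population figure is a made-up value used only to show the syntax.

```python
europe = {
    'spain': {'capital': 'madrid', 'population': 46.77},
    'france': {'capital': 'paris', 'population': 66.03},
}

# Adding a new key and updating an existing one use the same assignment syntax.
europe['italy'] = {'capital': 'rome', 'population': 59.83}
europe['spain']['population'] = 47.0  # hypothetical value, for illustration only

# Chained brackets: first select the country, then the field inside it.
rome = europe['italy']['capital']

# del removes a key; 'in' checks membership without raising a KeyError.
del europe['france']
has_france = 'france' in europe

# items() yields (key, value) pairs; here every value is itself a dictionary.
summary = [name + ': ' + info['capital'] for name, info in europe.items()]
print(summary)
```

Dictionaries keep insertion order in current Python versions, so the summary lists spain before italy.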
Dictionary to DataFrame (1)

• Import pandas as pd.
• Use the pre-defined lists to create a dictionary called my_dict. There should be three key value pairs:
o key 'country' and value names.
o key 'drives_right' and value dr.
o key 'cars_per_cap' and value cpc.
• Use pd.DataFrame() to turn your dict into a DataFrame called cars.
• Print out cars and see how beautiful it is.

# Pre-defined lists
names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']
dr = [True, False, False, False, True, True, True]
cpc = [809, 731, 588, 18, 200, 70, 45]
# Import pandas as pd
import pandas as pd
# Create dictionary my_dict with three key:value pairs: my_dict
my_dict = { 'country':names, 'drives_right':dr, 'cars_per_cap':cpc }
# Build a DataFrame cars from my_dict: cars
cars = pd.DataFrame(my_dict)
# Print cars
print(cars)

Dictionary to DataFrame (2)

• Hit Run Code to see that, indeed, the row labels are not correctly set.
• Specify the row labels by setting cars.index equal to row_labels.
• Print out cars again and check if the row labels are correct this time.

import pandas as pd
# Build cars DataFrame
names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']
dr = [True, False, False, False, True, True, True]
cpc = [809, 731, 588, 18, 200, 70, 45]
cars_dict = { 'country':names, 'drives_right':dr, 'cars_per_cap':cpc }
cars = pd.DataFrame(cars_dict)
print(cars)
# Definition of row_labels
row_labels = ['US', 'AUS', 'JPN', 'IN', 'RU', 'MOR', 'EG']
# Specify row labels of cars
cars.index = row_labels
# Print cars again
print(cars)

CSV to DataFrame (1)

• To import CSV files you still need the pandas package: import it as pd.
• Use pd.read_csv() to import cars.csv data as a DataFrame. Store this DataFrame as cars.
• Print out cars. Does everything look OK?

# Import pandas as pd
import pandas as pd
# Import the cars.csv data: cars
cars = pd.read_csv('cars.csv')
# Print out cars
print(cars)
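
What happens to row labels when a DataFrame goes through a CSV file can be checked without cars.csv itself. The sketch below is an assumption-laden stand-in: it uses a three-row subset of the lists above and keeps the CSV text in memory with io.StringIO rather than reading a file from disk.

```python
import io
import pandas as pd

names = ['United States', 'Australia', 'Japan']
cpc = [809, 731, 588]
row_labels = ['US', 'AUS', 'JPN']

# Passing index= at construction has the same effect as assigning cars.index afterwards.
cars = pd.DataFrame({'country': names, 'cars_per_cap': cpc}, index=row_labels)

# CSV text held in memory, used here in place of the cars.csv file.
csv_text = cars.to_csv()

# Read back with default settings: the label column arrives as ordinary data.
plain = pd.read_csv(io.StringIO(csv_text))

# Read back with index_col=0: the first column becomes the row labels again.
labelled = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(plain.columns.tolist())
print(labelled.index.tolist())
```

With the labels restored, a row can be selected by name, for example labelled.loc['JPN'].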
CSV to DataFrame (2)

• Run the code with Run Code and assert that the first column should actually be used as row labels.
• Specify the index_col argument inside pd.read_csv(): set it to 0, so that the first column is used as row labels.
• Has the printout of cars improved now?

# Import pandas as pd
import pandas as pd
# Fix import by including index_col
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out cars
print(cars)

Square Brackets (1)

• Use single square brackets to print out the country column of cars as a Pandas Series.
• Use double square brackets to print out the country column of cars as a Pandas DataFrame.
• Use double square brackets to print out a DataFrame with both the country and drives_right columns of cars, in this order.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out country column as Pandas Series
print(cars['country'])
# Print out country column as Pandas DataFrame
print(cars[['country']])
# Print out DataFrame with country and drives_right columns
print(cars[['country', 'drives_right']])

Square Brackets (2)

• Select the first 3 observations from cars and print them out.
• Select the fourth, fifth and sixth observation, corresponding to row indexes 3, 4 and 5, and print them out.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out first 3 observations
print(cars[0:3])
# Print out fourth, fifth and sixth observation
print(cars[3:6])

loc and iloc (1)

• Use loc or iloc to select the observation corresponding to Japan as a Series. The label of this row is JPN, the index is 2. Make sure to print the resulting Series.
• Use loc or iloc to select the observations for Australia and Egypt as a DataFrame. You can find out about the labels/indexes of these rows by inspecting cars in the IPython Shell. Make sure to print the resulting DataFrame.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out observation for Japan
print(cars.iloc[2])
# Print out observations for Australia and Egypt
print(cars.loc[['AUS', 'EG']])

loc and iloc (2)
• Print out the drives_right value of the row corresponding to Morocco (its row label is MOR)
• Print out a sub-DataFrame, containing the observations for Russia and Morocco and the columns country and drives_right.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out drives_right value of Morocco
print(cars.iloc[5, 2])
# Print sub-DataFrame
print(cars.loc[['RU', 'MOR'], ['country', 'drives_right']])

loc and iloc (3)

• Print out the drives_right column as a Series using loc or iloc.
• Print out the drives_right column as a DataFrame using loc or iloc.
• Print out both the cars_per_cap and drives_right column as a DataFrame using loc or iloc.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out drives_right column as Series
print(cars.iloc[:, 2])
# Print out drives_right column as DataFrame
print(cars.iloc[:, [2]])
# Print out cars_per_cap and drives_right as DataFrame
print(cars.loc[:, ['cars_per_cap', 'drives_right']])

Equality

• In the editor on the right, write code to see if True equals False.
• Write Python code to check if -5 * 15 is not equal to 75.
• Ask Python whether the strings "pyscript" and "PyScript" are equal.
• What happens if you compare booleans and integers? Write code to see if True and 1 are equal.

# Comparison of booleans
print(True == False)
# Comparison of integers
print(-5 * 15 != 75)
# Comparison of strings
print("pyscript" == "PyScript")
# Compare a boolean with a numeric
print(True == 1)

Greater and less than

• Write Python expressions, wrapped in a print() function, to check whether:
o x is greater than or equal to -10. x has already been defined for you.
o "test" is less than or equal to y. y has already been defined for you.
o True is greater than False.

# Comparison of integers
x = -3 * 6
print(x >= -10)
# Comparison of strings
y = "test"
print("test" <= y)
# Comparison of booleans
print(True > False)

Compare arrays

• Which areas in my_house are greater than or equal to 18?
• You can also compare two NumPy arrays element-wise. Which areas in my_house are smaller than the ones in your_house?
• Make sure to wrap both commands in a print() statement so that you can inspect the output!

# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])
# my_house greater than or equal to 18
print(my_house >= 18)
# my_house less than your_house
print(my_house < your_house)

Boolean Operators

and, or, not (1)

• Write Python expressions, wrapped in a print() function, to check whether:
o my_kitchen is bigger than 10 and smaller than 18.
o my_kitchen is smaller than 14 or bigger than 17.
o double the area of my_kitchen is smaller than triple the area of your_kitchen.

# Define variables
my_kitchen = 18.0
your_kitchen = 14.0
# my_kitchen bigger than 10 and smaller than 18?
print(my_kitchen > 10 and my_kitchen < 18)
# my_kitchen smaller than 14 or bigger than 17?
print(my_kitchen < 14 or my_kitchen > 17)
# Double my_kitchen smaller than triple your_kitchen?
print(my_kitchen * 2 < your_kitchen * 3)

and, or, not (2)

x=8
y=9
not(not(x < 3) and not(y > 14 or y > 10))

What will the result be if you execute these three commands in the IPython Shell?
NB: Notice that not has a higher priority than and and or, it is executed first.

True
False
Running these commands will result in an error.

Boolean operators with NumPy
• Generate boolean arrays that answer the following questions:
• Which areas in my_house are greater than 18.5 or smaller than 10?
• Which areas are smaller than 11 in both my_house and your_house? Make sure to wrap both commands in print() statement, so that you can inspect the output.

# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])
# my_house greater than 18.5 or smaller than 10
print(np.logical_or(my_house > 18.5, my_house < 10))
# Both my_house and your_house smaller than 11
print(np.logical_and(my_house < 11, your_house < 11))

if, elif, else

Warmup
To experiment with if and else a bit, have a look at this code sample:
area = 10.0
if(area < 9) :
    print("small")
elif(area < 12) :
    print("medium")
else :
    print("large")
What will the output be if you run this piece of code in the IPython Shell?

small
medium
large
The syntax is incorrect; this code will produce an error.

if

• Examine the if statement that prints out "looking around in the kitchen." if room equals "kit".
• Write another if statement that prints out "big place!" if area is greater than 15.

# Define variables
room = "kit"
area = 14.0
# if statement for room
if room == "kit" :
    print("looking around in the kitchen.")
# if statement for area
if area > 15 :
    print("big place!")

Add else

# Define variables
room = "kit"
area = 14.0
# if-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
else :
    print("looking around elsewhere.")
# if-else construct for area
if area > 15 :
    print("big place!")
else :
    print("pretty small.")

Customize further: elif

# Define variables
room = "bed"
area = 14.0
# if-elif-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
elif room == "bed":
    print("looking around in the bedroom.")
else :
    print("looking around elsewhere.")
# if-elif-else construct for area
if area > 15 :
    print("big place!")
elif area > 10 :
    print("medium size, nice!")
else :
    print("pretty small.")

Driving right (1)

• Extract the drives_right column as a Pandas Series and store it as dr.
• Use dr, a boolean Series, to subset the cars DataFrame. Store the resulting selection in sel.
• Print sel, and assert that drives_right is True for all observations.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Extract drives_right column as Series: dr
dr = cars['drives_right']
# Use dr to subset cars: sel
sel = cars[dr]
# Print sel
print(sel)

Driving right (2)

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Convert code to a one-liner
sel = cars[cars['drives_right']]
# Print sel
print(sel)

Cars per capita (1)

• Select the cars_per_cap column from cars as a Pandas Series and store it as cpc.
• Use cpc in combination with a comparison operator and 500. You want to end up with a boolean Series that’s True if the corresponding country has a cars_per_cap of more than 500 and False otherwise. Store this boolean Series as many_cars.
• Use many_cars to subset cars, similar to what you did before. Store the result as car_maniac.
• Print out car_maniac to see if you got it right.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Create car_maniac: observations that have a cars_per_cap over 500
cpc = cars['cars_per_cap']
many_cars = cpc > 500
car_maniac = cars[many_cars]
# Print car_maniac
print(car_maniac)

Cars per capita (2)

• Use the code sample provided to create a DataFrame medium, that includes all the observations of cars that have a cars_per_cap between 100 and 500.
• Print out medium.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Import numpy, you'll need this
import numpy as np
# Create medium: observations with cars_per_cap between 100 and 500
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 100, cpc < 500)
medium = cars[between]
# Print medium
print(medium)

while: warming up
Can you tell how many printouts the following while loop will do?
x=1
while x < 4 :
    print(x)
    x=x+1

0
1
2
3
4

Basic while loop

• Create the variable offset with an initial value of 8.
• Code a while loop that keeps running as long as offset is not equal to 0. Inside the while loop:
o Print out the sentence "correcting...".
o Next, decrease the value of offset by 1. You can do this with offset = offset - 1.
o Finally, still within your loop, print out offset so you can see how it changes.

# Initialize offset
offset = 8
# Code the while loop
while offset != 0 :
    print("correcting...")
    offset = offset - 1
    print(offset)

Add conditionals

• Inside the while loop, complete the if-else statement:
o If offset is greater than zero, you should decrease offset by 1.
o Else, you should increase offset by 1.
• If you’ve coded things correctly, hitting Submit Answer should work this time.

# Initialize offset
offset = -6
# Code the while loop
while offset != 0 :
    print("correcting...")
    if offset > 0 :
        offset = offset - 1
    else :
        offset = offset + 1
    print(offset)

Loop over a list
Write a for loop that iterates over all elements of the areas list and prints out every element separately.

# areas list
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
# Code the for loop
for area in areas :
    print(area)

Indexes and values (1)

• Adapt the for loop in the sample code to use enumerate() and use two iterator variables.
• Update the print() statement so that on each run, a line of the form "room x: y" should be printed, where x is the index of the list element and y is the actual list element, i.e. the area. Make sure to print out this exact string, with the correct spacing.

# areas list
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
# Change for loop to use enumerate() and update print()
for index, area in enumerate(areas) :
    print("room " + str(index) + ": " + str(area))

Indexes and values (2)
For non-programmer folks, room 0: 11.25 is strange. Wouldn’t it be better if the count started at 1?
Adapt the print() function in the for loop so that the first printout becomes "room 1: 11.25", the second one "room 2: 18.0" and so on.

# areas list
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
# Adapt the printout
for index, area in enumerate(areas) :
    print("room " + str(index + 1) + ": " + str(area))

Loop over list of lists


Write a for loop that goes through each sublist of house and prints out the x is y sqm, where x is the name of the room and y is the area of the room.

# house list of lists
house = [["hallway", 11.25],
         ["kitchen", 18.0],
         ["living room", 20.0],
         ["bedroom", 10.75],
         ["bathroom", 9.50]]
# Build a for loop from scratch
for x in house :
    print("the " + x[0] + " is " + str(x[1]) + " sqm")

Loop over dictionary

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin',
          'norway':'oslo', 'italy':'rome', 'poland':'warsaw', 'austria':'vienna' }
# Iterate over europe
for key, value in europe.items() :
    print("the capital of " + str(key) + " is " + str(value))

Loop over NumPy array

• Import the numpy package under the local alias np.
• Write a for loop that iterates over all elements in np_height and prints out "x inches" for each element, where x is the value in the array.
• Write a for loop that visits every element of the np_baseball array and prints it out.

# Import numpy as np
import numpy as np
# edited/added
import pandas as pd
mlb = pd.read_csv('baseball.csv')
np_height = np.array(mlb['Height'])
np_weight = np.array(mlb['Weight'])
baseball = [[180, 78.4],
            [215, 102.7],
            [210, 98.5],
            [188, 75.2]]
np_baseball = np.array(baseball)
# For loop over np_height
for x in np_height[:5]: # edited/added
    print(str(x) + " inches")
# For loop over np_baseball
for x in np.nditer(np_baseball) :
    print(x)

Loop over DataFrame (1)
Write a for loop that iterates over the rows of cars and on each iteration perform two print() calls: one to print out the row label and one to print out all of the row's contents.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Iterate over rows of cars
for lab, row in cars.iterrows() :
    print(lab)
    print(row)

Loop over DataFrame (2)

• Using the iterators lab and row, adapt the code in the for loop such that the first iteration prints out "US: 809", the second iteration "AUS: 731", and so on.
• The output should be in the form "country: cars_per_cap". Make sure to print out this exact string (with the correct spacing).
o You can use str() to convert your integer data to a string so that you can print it in conjunction with the country label.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Adapt for loop
for lab, row in cars.iterrows() :
    print(lab + ": " + str(row['cars_per_cap']))

Add column (1)

• Use a for loop to add a new column, named COUNTRY, that contains an uppercase version of the country names in the "country" column. You can use the string method upper() for this.
• To see if your code worked, print out cars. Don’t indent this code, so that it’s not part of the for loop.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Code for loop that adds COUNTRY column
for lab, row in cars.iterrows() :
    cars.loc[lab, "COUNTRY"] = row["country"].upper()
# Print cars
print(cars)

Add column (2)

• Replace the for loop with a one-liner that uses .apply(str.upper). The call should give the same result: a column COUNTRY should be added to cars, containing an uppercase version of the country names.
• As usual, print out cars to see the fruits of your hard labor.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Use .apply(str.upper)
cars["COUNTRY"] = cars["country"].apply(str.upper)

Random float

• seed(): sets the random seed, so that your results are reproducible between simulations. As an argument, it takes an integer of your choosing. If you call the function, no output will be generated.
• rand(): if you don’t specify any arguments, it generates a random float between zero and one.

• Import numpy as np.
• Use seed() to set the seed; as an argument, pass 123.
• Generate your first random float with rand() and print it out.

# Import numpy as np
import numpy as np
# Set the seed
np.random.seed(123)
# Generate and print random float
print(np.random.rand())

Roll the dice

• Use randint() with the appropriate arguments to randomly generate the integer 1, 2, 3, 4, 5 or 6. This simulates a dice. Print it out.
• Repeat the outcome to see if the second throw is different. Again, print out the result.

# Import numpy and set seed
import numpy as np
np.random.seed(123)
# Use randint() to simulate a dice
print(np.random.randint(1,7))
# Use randint() again
print(np.random.randint(1,7))

Determine your next move

• Roll the dice. Use randint() to create the variable dice.
• Finish the if-elif-else construct by replacing ___:
o If dice is 1 or 2, you go one step down.
o If dice is 3, 4 or 5, you go one step up.
o Else, you throw the dice again. The number of eyes is the number of steps you go up.
• Print out dice and step. Given the value of dice, was step updated correctly?

# NumPy is imported, seed is set
# Starting step
step = 50
# Roll the dice
dice = np.random.randint(1,7)
# Finish the control construct
if dice <= 2 :
    step = step - 1
elif dice <= 5 :
    step = step + 1
else :
    step = step + np.random.randint(1,7)
# Print out dice and step
print(dice)
print(step)

Random Walk

The next step

• Make a list random_walk that contains the first step, which is the integer 0.
• Finish the for loop:
o The loop should run 100 times.
o On each iteration, set step equal to the last element in the random_walk list. You can use the index -1 for this.
o Next, let the if-elif-else construct update step for you.
o The code that appends step to random_walk is already coded.
• Print out random_walk.

# NumPy is imported, seed is set
# Initialize random_walk
random_walk = [0]
# Complete the ___
for x in range(100) :
    # Set step: last element in random_walk
    step = random_walk[-1]
    # Roll the dice
    dice = np.random.randint(1,7)
    # Determine next step
    if dice <= 2:
        step = step - 1
    elif dice <= 5:
        step = step + 1
else: step = step + np.random.randint(1,7)
step = step + np.random.randint(1,7)
random_walk.append(step)
# append next_step to random_walk
random_walk.append(step) print(random_walk)
# Print random_walk
Visualize the walk
print(random_walk)
 Import matplotlib.pyplot as plt.
How low can you go?  Use plt.plot() to plot random_walk.
 Finish off with plt.show() to actually display the plot.
 Use max() in a similar way to make sure that step doesn’t go below
zero if dice <= 2.
# NumPy is imported, seed is set
 Hit Submit Answer and check the contents of random_walk.
# Initialization
# NumPy is imported, seed is set random_walk = [0]
# Initialize random_walk for x in range(100) :
random_walk = [0] step = random_walk[-1]
for x in range(100) : dice = np.random.randint(1,7)
step = random_walk[-1]
dice = np.random.randint(1,7) if dice <= 2:
step = max(0, step - 1)
if dice <= 2: elif dice <= 5:
# Replace below: use max to make sure step can't go below 0 step = step + 1
step = max(0, step - 1) else:
elif dice <= 5: step = step + np.random.randint(1,7)
step = step + 1
else: random_walk.append(step)
# Import matplotlib.pyplot as pltimport matplotlib.pyplot as plt step = max(0, step - 1)
# Plot random_walk elif dice <= 5:
plt.plot(random_walk) step = step + 1
# Show the plot else:
plt.show() step = step + np.random.randint(1,7)
random_walk.append(step)
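
The random-walk exercises rely on the course environment having NumPy imported and the seed set already. As a minimal, self-contained sketch of the same walk, the rule can be wrapped in a function. Two assumptions are made here: simulate_walk is a hypothetical helper name that does not appear in the exercises, and the standard-library random module is swapped in for np.random so the block runs without extra packages. Note that random.randint(1, 6) includes both endpoints, whereas np.random.randint(1, 7) excludes the upper one.

```python
import random


def simulate_walk(n_throws=100, rng=None):
    """Return the positions of one random walk (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    walk = [0]
    for _ in range(n_throws):
        step = walk[-1]
        dice = rng.randint(1, 6)  # both endpoints are included
        if dice <= 2:
            step = max(0, step - 1)  # one step down, never below 0
        elif dice <= 5:
            step = step + 1  # one step up
        else:
            step = step + rng.randint(1, 6)  # throw again, climb that many steps
        walk.append(step)
    return walk


# The seed value is arbitrary; it only makes repeated runs identical.
walk = simulate_walk(rng=random.Random(123))
print(len(walk), min(walk), walk[-1])
```

Because the generator differs from NumPy's, the individual positions will not match the ones produced in the exercises; only the rules of the walk are the same.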
Distribution

Simulate multiple walks

• Fill in the specification of the for loop so that the random walk is
  simulated 10 times.
• After the random_walk array is entirely populated, append the array
  to the all_walks list.
• Finally, after the top-level for loop, print out all_walks.

# NumPy is imported; seed is set
# Initialize all_walks (don't change this line)
all_walks = []
# Simulate random walk 10 times
for i in range(10):
    # Code from before
    random_walk = [0]
    for x in range(100):
        step = random_walk[-1]
        dice = np.random.randint(1, 7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1, 7)
        random_walk.append(step)
    # Append random_walk to all_walks
    all_walks.append(random_walk)
# Print all_walks
print(all_walks)

Visualize all walks

• Use np.array() to convert all_walks to a NumPy array, np_aw.
• Try to use plt.plot() on np_aw. Also include plt.show(). Does it work
  out of the box?
• Transpose np_aw by calling np.transpose() on np_aw. Call the
  result np_aw_t. Now every row in np_aw_t represents the
  position after 1 throw for the 10 random walks.
• Use plt.plot() to plot np_aw_t; also include a plt.show(). Does it look
  better this time?

# numpy and matplotlib imported, seed set.
# initialize and populate all_walks
all_walks = []
for i in range(10):
    random_walk = [0]
    for x in range(100):
        step = random_walk[-1]
        dice = np.random.randint(1, 7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1, 7)
        random_walk.append(step)
    all_walks.append(random_walk)
# Convert all_walks to NumPy array: np_aw
np_aw = np.array(all_walks)
# Plot np_aw and show
plt.plot(np_aw)
plt.show()
# Clear the figure
plt.clf()
# Transpose np_aw: np_aw_t
np_aw_t = np.transpose(np_aw)
# Plot np_aw_t and show
plt.plot(np_aw_t)
plt.show()

Implement clumsiness

• Change the range() function so that the simulation is performed 250
  times.
• Finish the if condition so that step is set to 0 if a random float is less
  than or equal to 0.001. Use np.random.rand().

# numpy and matplotlib imported, seed set
# Simulate random walk 250 times
all_walks = []
for i in range(250):
    random_walk = [0]
    for x in range(100):
        step = random_walk[-1]
        dice = np.random.randint(1, 7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1, 7)
        # Implement clumsiness
        if np.random.rand() <= 0.001:
            step = 0
        random_walk.append(step)
    all_walks.append(random_walk)
# Create and plot np_aw_t
np_aw_t = np.transpose(np.array(all_walks))
plt.plot(np_aw_t)
plt.show()

Plot the distribution

• To make sure we’ve got enough simulations, go crazy. Simulate the
  random walk 500 times.
• From np_aw_t, select the last row. This contains the endpoint of all
  500 random walks you’ve simulated. Store this NumPy array as ends.
• Use plt.hist() to build a histogram of ends. Don’t forget plt.show() to
  display the plot.

# numpy and matplotlib imported, seed set
# Simulate random walk 500 times
all_walks = []
for i in range(500):
    random_walk = [0]
    for x in range(100):
        step = random_walk[-1]
        dice = np.random.randint(1, 7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1, 7)
        if np.random.rand() <= 0.001:
            step = 0
        random_walk.append(step)
    all_walks.append(random_walk)
# Create np_aw_t
np_aw_t = np.transpose(np.array(all_walks))
# Select last row from np_aw_t: ends
ends = np_aw_t[-1, :]
# Plot histogram of ends, display plot
plt.hist(ends)
plt.show()

Calculate the odds

The histogram of the previous exercise was created from a NumPy
array ends that contains 500 integers. Each integer represents the end point
of a random walk. To calculate the chance that this end point is greater than
or equal to 60, you can count the number of integers in ends that are greater
than or equal to 60 and divide that number by 500, the total number of
simulations.

Well then, what’s the estimated chance that you’ll reach at least 60 steps high
if you play this Empire State Building game? The ends array is everything
you need; it’s available in your Python session so you can make calculations
in the IPython Shell.

• 48.8%
• 76.6%
• 78.4%
• 95.9%
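
The counting recipe for the odds (count the end points that are at least 60, then divide by the number of simulations) can be sketched as follows. This is an illustration under stated assumptions: estimate_odds is a hypothetical helper name, and sample_ends is made-up data rather than output of the simulation, so the printed value says nothing about which of the listed options is correct.

```python
def estimate_odds(ends, target=60):
    """Fraction of end points that are greater than or equal to target."""
    hits = sum(1 for end in ends if end >= target)
    return hits / len(ends)


# Made-up end points, for illustration only (not simulation output)
sample_ends = [45, 60, 72, 59, 88, 61, 30, 95]
print(estimate_odds(sample_ends))  # 5 of the 8 values are >= 60, so 0.625
```

The helper only iterates over its argument, so it should also accept the 500-element ends array from the session; for a NumPy array, np.mean(ends >= 60) computes the same fraction.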
