Python Data Science Toolbox Overview

PYTHON DATA SCIENCE TOOLBOX (PART 1)

Strings in Python

Execute the following code in the shell:

    object1 = "data" + "analysis" + "visualization"
    object2 = 1 * 3
    object3 = "1" * 3

What are the values in object1, object2, and object3, respectively?

• object1 contains "data + analysis + visualization", object2 contains "1*3", object3 contains 13.
• object1 contains "data+analysis+visualization", object2 contains 3, object3 contains "13".
• object1 contains "dataanalysisvisualization", object2 contains 3, object3 contains "111".

Recapping built-in functions

• Assign str(x) to a variable y1: y1 = str(x)
• Assign print(x) to a variable y2: y2 = print(x)
• Check the types of the variables x, y1, and y2.

What are the types of x, y1, and y2?

• They are all str types.
• x is a float, y1 is a float, and y2 is a str.
• x is a float, y1 is a str, and y2 is a NoneType.
• They are all NoneType types.

Write a simple function

    def square():
        new_value = 4 ** 2
        return new_value

• Complete the function header by adding the appropriate function name, shout.
• In the function body, concatenate the string, 'congratulations' with another string, '!!!'. Assign the result to shout_word.
• Print the value of shout_word.
• Call the shout function.

    # Define the function shout
    def shout():
        """Print a string with three exclamation marks"""
        # Concatenate the strings: shout_word
        shout_word = 'congratulations' + '!!!'

        # Print shout_word
        print(shout_word)

    # Call shout
    shout()

Single-parameter functions

• Complete the function header by adding the parameter name, word.
• Assign the result of concatenating word with '!!!' to shout_word.
• Print the value of shout_word.
• Call the shout() function, passing to it the string, 'congratulations'.

    # Define shout with the parameter, word
    def shout(word):
        """Print a string with three exclamation marks"""
        # Concatenate the strings: shout_word
        shout_word = word + '!!!'

        # Print shout_word
        print(shout_word)

    # Call shout with the string 'congratulations'
    shout('congratulations')

Functions that return single values

• In the function body, concatenate the string in word with '!!!' and assign to shout_word.
• Replace the print() statement with the appropriate return statement.
• Call the shout() function, passing to it the string, 'congratulations', and assigning the call to the variable, yell.
• To check if yell contains the value returned by shout(), print the value of yell.

    # Define shout with the parameter, word
    def shout(word):
        """Return a string with three exclamation marks"""
        # Concatenate the strings: shout_word
        shout_word = word + '!!!'

        # Replace print with return
        return shout_word

    # Pass 'congratulations' to shout: yell
    yell = shout('congratulations')

    # Print yell
    print(yell)

Functions with multiple parameters

• Modify the function header such that it accepts two parameters, word1 and word2, in that order.
• Concatenate each of word1 and word2 with '!!!' and assign to shout1 and shout2, respectively.
• Concatenate shout1 and shout2 together, in that order, and assign to new_shout.
• Pass the strings 'congratulations' and 'you', in that order, to a call to shout(). Assign the return value to yell.

    # Define shout with parameters word1 and word2
    def shout(word1, word2):
        """Concatenate strings with three exclamation marks"""
        # Concatenate word1 with '!!!': shout1
        shout1 = word1 + '!!!'

        # Concatenate word2 with '!!!': shout2
        shout2 = word2 + '!!!'

        # Concatenate shout1 with shout2: new_shout
        new_shout = shout1 + shout2


        # Return new_shout
        return new_shout

    # Pass 'congratulations' and 'you' to shout: yell
    yell = shout('congratulations', 'you')

    # Print yell
    print(yell)

A brief introduction to tuples

• Print out the value of nums in the IPython shell. Note the elements in the tuple.
• In the IPython shell, try to change the first element of nums to the value 2 by doing an assignment: nums[0] = 2. What happens?
• Unpack nums to the variables num1, num2, and num3.
• Construct a new tuple, even_nums composed of the same elements in nums, but with the 1st element replaced with the value, 2.

    # edited/added
    nums = (3, 4, 6)

    # Unpack nums into num1, num2, and num3
    num1, num2, num3 = nums

    # Construct even_nums
    even_nums = (2, num2, num3)

Functions that return multiple values

• Modify the function header such that the function name is now shout_all, and it accepts two parameters, word1 and word2, in that order.
• Concatenate the string '!!!' to each of word1 and word2 and assign to shout1 and shout2, respectively.
• Construct a tuple shout_words, composed of shout1 and shout2.
• Call shout_all() with the strings 'congratulations' and 'you' and assign the result to yell1 and yell2 (remember, shout_all() returns 2 variables!).

    # Define shout_all with parameters word1 and word2
    def shout_all(word1, word2):
        """Return a tuple of strings"""
        # Concatenate word1 with '!!!': shout1
        shout1 = word1 + '!!!'

        # Concatenate word2 with '!!!': shout2
        shout2 = word2 + '!!!'

        # Construct a tuple with shout1 and shout2: shout_words
        shout_words = (shout1, shout2)

        # Return shout_words
        return shout_words

    # Pass 'congratulations' and 'you' to shout_all(): yell1, yell2
    yell1, yell2 = shout_all('congratulations', 'you')

    # Print yell1 and yell2
    print(yell1)
    print(yell2)

Bringing it all together (1)

• Import the pandas package with the alias pd.
• Import the file '[Link]' using the pandas function read_csv(). Assign the resulting DataFrame to df.
• Complete the for loop by iterating over col, the 'lang' column in the DataFrame df.
• Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to the value corresponding to this key in the dictionary, else add the key to langs_count and set the corresponding value to 1. Use the loop variable entry in your code.

    # Import pandas
    import pandas as pd

    # Import Twitter data as DataFrame: df
    df = pd.read_csv('[Link]')

    # Initialize an empty dictionary: langs_count
    langs_count = {}

    # Extract column from DataFrame: col
    col = df['lang']

    # Iterate over lang column in DataFrame
    for entry in col:

        # If the language is in langs_count, add 1
        if entry in langs_count.keys():
            langs_count[entry] += 1
        # Else add the language to langs_count, set the value to 1
        else:
            langs_count[entry] = 1

    # Print the populated dictionary
    print(langs_count)

Bringing it all together (2)

• Define the function count_entries(), which has two parameters. The first parameter is df for the DataFrame and the second is col_name for the column name.
• Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to its current value, else add the key to langs_count and set its value to 1. Use the loop variable entry in your code.
• Return the langs_count dictionary from inside the count_entries() function.
• Call the count_entries() function by passing to it tweets_df and the name of the column, 'lang'. Assign the result of the call to the variable result.

    # edited/added
    tweets_df = pd.read_csv('[Link]')

    # Define count_entries()
    def count_entries(df, col_name):
        """Return a dictionary with counts of
        occurrences as value for each key."""

        # Initialize an empty dictionary: langs_count
        langs_count = {}

        # Extract column from DataFrame: col
        col = df[col_name]

        # Iterate over lang column in DataFrame
        for entry in col:

            # If the language is in langs_count, add 1
            if entry in langs_count.keys():
                langs_count[entry] += 1
            # Else add the language to langs_count, set the value to 1
            else:
                langs_count[entry] = 1

        # Return the langs_count dictionary
        return langs_count

    # Call count_entries(): result
    result = count_entries(tweets_df, 'lang')

    # Print the result
    print(result)

Pop quiz on understanding scope

    def func1():
        num = 3
        print(num)

    def func2():
        global num
        double_num = num * 2
        num = 6
        print(double_num)

Try calling func1() and func2() in the shell, then answer the following questions:
What are the values printed out when you call func1() and func2()?
What is the value of num in the global scope after calling func1() and func2()?

• func1() prints out 3, func2() prints out 6, and the value of num in the global scope is 3.
• func1() prints out 3, func2() prints out 3, and the value of num in the global scope is 3.
• func1() prints out 3, func2() prints out 10, and the value of num in the global scope is 10.
• func1() prints out 3, func2() prints out 10, and the value of num in the global scope is 6.

The keyword global

• Use the keyword global to alter the object team in the global scope.
• Change the value of team in the global scope to the string "justice league". Assign the result to team.
• Hit the Submit button to see how executing your newly defined function change_team() changes the value of the name team!

    # Create a string: team
    team = "teen titans"

    # Define change_team()
    def change_team():
        """Change the value of the global variable team."""
        # Use team in global scope
        global team

        # Change the value of team in global: team
        team = "justice league"

    # Print team
    print(team)

    # Call change_team()
    change_team()

    # Print team
    print(team)

Python’s built-in scope

Here you’re going to check out Python’s built-in scope, which is really just a built-in module called builtins. However, to query builtins, you’ll need to import builtins ‘because the name builtins is not itself built in…No, I’m serious!’ (Learning Python, 5th edition, Mark Lutz). After executing import builtins in the IPython Shell, execute dir(builtins) to print a list of all the names in the module builtins. Have a look and you’ll see a bunch of names that you’ll recognize! Which of the following names is NOT in the module builtins?

• ‘sum’
• ‘range’
• ‘array’
• ‘tuple’

Nested Functions I

• Complete the function header of the nested function with the function name inner() and a single parameter word.
• Complete the return value: each element of the tuple should be a call to inner(), passing in the parameters from three_shouts() as arguments to each call.

    # Define three_shouts
    def three_shouts(word1, word2, word3):
        """Returns a tuple of strings
        concatenated with '!!!'."""

        # Define inner
        def inner(word):
            """Returns a string concatenated with '!!!'."""
            return word + '!!!'

        # Return a tuple of strings
        return (inner(word1), inner(word2), inner(word3))

    # Call three_shouts() and print
    print(three_shouts('a', 'b', 'c'))

Nested Functions II

• Complete the function header of the inner function with the function name inner_echo() and a single parameter word1.
• Complete the function echo() so that it returns inner_echo.
• We have called echo(), passing 2 as an argument, and assigned the resulting function to twice. Your job is to call echo(), passing 3 as an argument. Assign the resulting function to thrice.
• Hit Submit to call twice() and thrice() and print the results.
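A nested function can keep using a value from the function that created it, even after that outer function has returned. The sketch below illustrates the idea with hypothetical names (make_repeater, repeat) that are not part of the exercises; it is an illustration only, not the exercise solution.

```python
def make_repeater(n):
    """Return a function that repeats its argument n times."""
    def repeat(text):
        # n is read from the enclosing scope each time repeat() runs
        return text * n
    return repeat


# Each call to make_repeater() produces an independent function
double = make_repeater(2)
triple = make_repeater(3)

print(double('ab'), triple('ab'))
```

Here double and triple share the same inner code but remember different values of n.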
    # Define echo
    def echo(n):
        """Return the inner_echo function."""

        # Define inner_echo
        def inner_echo(word1):
            """Concatenate n copies of word1."""
            echo_word = word1 * n
            return echo_word

        # Return inner_echo
        return inner_echo

    # Call echo: twice
    twice = echo(2)

    # Call echo: thrice
    thrice = echo(3)

    # Call twice() and thrice() then print
    print(twice('hello'), thrice('hello'))

The keyword nonlocal and nested functions

• Assign to echo_word the string word, concatenated with itself.
• Use the keyword nonlocal to alter the value of echo_word in the enclosing scope.
• Alter echo_word to echo_word concatenated with '!!!'.
• Call the function echo_shout(), passing it a single argument 'hello'.

    # Define echo_shout()
    def echo_shout(word):
        """Change the value of a nonlocal variable"""

        # Concatenate word with itself: echo_word
        echo_word = word*2

        # Print echo_word
        print(echo_word)

        # Define inner function shout()
        def shout():
            """Alter a variable in the enclosing scope"""
            # Use echo_word in nonlocal scope
            nonlocal echo_word

            # Change echo_word to echo_word concatenated with '!!!'
            echo_word = echo_word + '!!!'

        # Call function shout()
        shout()

        # Print echo_word
        print(echo_word)

    # Call function echo_shout() with argument 'hello'
    echo_shout('hello')

Functions with one default argument
• Complete the function header with the function name shout_echo. It accepts an argument word1 and a default argument echo with default value 1, in that order.
• Use the * operator to concatenate echo copies of word1. Assign the result to echo_word.
• Call shout_echo() with just the string, "Hey". Assign the result to no_echo.
• Call shout_echo() with the string "Hey" and the value 5 for the default argument, echo. Assign the result to with_echo.

    # Define shout_echo
    def shout_echo(word1, echo=1):
        """Concatenate echo copies of word1 and three
        exclamation marks at the end of the string."""

        # Concatenate echo copies of word1 using *: echo_word
        echo_word = word1 * echo

        # Concatenate '!!!' to echo_word: shout_word
        shout_word = echo_word + '!!!'

        # Return shout_word
        return shout_word

    # Call shout_echo() with "Hey": no_echo
    no_echo = shout_echo("Hey")

    # Call shout_echo() with "Hey" and echo=5: with_echo
    with_echo = shout_echo("Hey", echo=5)

    # Print no_echo and with_echo
    print(no_echo)
    print(with_echo)

Functions with multiple default arguments

• Complete the function header with the function name shout_echo. It accepts an argument word1, a default argument echo with default value 1 and a default argument intense with default value False, in that order.
• In the body of the if statement, make the string object echo_word upper case by applying the method .upper() on it.
• Call shout_echo() with the string, "Hey", the value 5 for echo and the value True for intense. Assign the result to with_big_echo.
• Call shout_echo() with the string "Hey" and the value True for intense. Assign the result to big_no_echo.

    # Define shout_echo
    def shout_echo(word1, echo=1, intense=False):
        """Concatenate echo copies of word1 and three
        exclamation marks at the end of the string."""

        # Concatenate echo copies of word1 using *: echo_word
        echo_word = word1 * echo

        # Make echo_word uppercase if intense is True
        if intense is True:
            # Make uppercase and concatenate '!!!': echo_word_new
            echo_word_new = echo_word.upper() + '!!!'
        else:
            # Concatenate '!!!' to echo_word: echo_word_new
            echo_word_new = echo_word + '!!!'

        # Return echo_word_new
        return echo_word_new
    # Call shout_echo() with "Hey", echo=5 and intense=True: with_big_echo
    with_big_echo = shout_echo("Hey", echo=5, intense=True)

    # Call shout_echo() with "Hey" and intense=True: big_no_echo
    big_no_echo = shout_echo("Hey", intense=True)

    # Print with_big_echo and big_no_echo
    print(with_big_echo)
    print(big_no_echo)

Functions with variable-length arguments (*args)

• Complete the function header with the function name gibberish. It accepts a single flexible argument *args.
• Initialize a variable hodgepodge to an empty string.
• Return the variable hodgepodge at the end of the function body.
• Call gibberish() with the single string, "luke". Assign the result to one_word.
• Hit the Submit button to call gibberish() with multiple arguments and to print the value to the Shell.

    # Define gibberish
    def gibberish(*args):
        """Concatenate strings in *args together."""

        # Initialize an empty string: hodgepodge
        hodgepodge = ''

        # Concatenate the strings in args
        for word in args:
            hodgepodge += word

        # Return hodgepodge
        return hodgepodge

    # Call gibberish() with one string: one_word
    one_word = gibberish("luke")

    # Call gibberish() with five strings: many_words
    many_words = gibberish("luke", "leia", "han", "obi", "darth")

    # Print one_word and many_words
    print(one_word)
    print(many_words)

Functions with variable-length keyword arguments (**kwargs)

• Complete the function header with the function name report_status. It accepts a single flexible argument **kwargs.
• Iterate over the key-value pairs of kwargs to print out the keys and values, separated by a colon ':'.
• In the first call to report_status(), pass the following keyword-value pairs: name="luke", affiliation="jedi" and status="missing".
• In the second call to report_status(), pass the following keyword-value pairs: name="anakin", affiliation="sith lord" and status="deceased".

    # Define report_status
    def report_status(**kwargs):
        """Print out the status of a movie character."""

        print("\nBEGIN: REPORT\n")

        # Iterate over the key-value pairs of kwargs
        for key, value in kwargs.items():
            # Print out the keys and values, separated by a colon ':'
            print(key + ": " + value)

        print("\nEND REPORT")

    # First call to report_status()
    report_status(name="luke", affiliation="jedi", status="missing")

    # Second call to report_status()
    report_status(name="anakin", affiliation="sith lord", status="deceased")

Bringing it all together (1)

• Complete the function header by supplying the parameter for a DataFrame df and the parameter col_name with a default value of 'lang' for the DataFrame column name.
• Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'. Assign the result to result1. Note that since 'lang' is the default value of the col_name parameter, you don’t have to specify it here.
• Call count_entries() by passing the tweets_df DataFrame and the column name 'source'. Assign the result to result2.

    # Define count_entries()
    def count_entries(df, col_name='lang'):
        """Return a dictionary with counts of
        occurrences as value for each key."""

        # Initialize an empty dictionary: cols_count
        cols_count = {}

        # Extract column from DataFrame: col
        col = df[col_name]

        # Iterate over the column in DataFrame
        for entry in col:

            # If entry is in cols_count, add 1
            if entry in cols_count.keys():
                cols_count[entry] += 1

            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1

        # Return the cols_count dictionary
        return cols_count

    # Call count_entries(): result1
    result1 = count_entries(tweets_df, col_name='lang')

    # Call count_entries(): result2
    result2 = count_entries(tweets_df, col_name='source')

    # Print result1 and result2
    print(result1)
    print(result2)

Bringing it all together (2)

• Complete the function header by supplying the parameter for the DataFrame df and the flexible argument *args.
• Complete the for loop within the function definition so that the loop occurs over the tuple args.
• Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'. Assign the result to result1.
• Call count_entries() by passing the tweets_df DataFrame and the column names 'lang' and 'source'. Assign the result to result2.

    # Define count_entries()
    def count_entries(df, *args):
        """Return a dictionary with counts of
        occurrences as value for each key."""

        # Initialize an empty dictionary: cols_count
        cols_count = {}

        # Iterate over column names in args
        for col_name in args:

            # Extract column from DataFrame: col
            col = df[col_name]

            # Iterate over the column in DataFrame
            for entry in col:

                # If entry is in cols_count, add 1
                if entry in cols_count.keys():
                    cols_count[entry] += 1

                # Else add the entry to cols_count, set the value to 1
                else:
                    cols_count[entry] = 1

        # Return the cols_count dictionary
        return cols_count

    # Call count_entries(): result1
    result1 = count_entries(tweets_df, 'lang')

    # Call count_entries(): result2
    result2 = count_entries(tweets_df, 'lang', 'source')

    # Print result1 and result2
    print(result1)
    print(result2)

Pop quiz on lambda functions

• How would you write a lambda function add_bangs that adds three exclamation points '!!!' to the end of a string a?
• How would you call add_bangs with the argument 'hello'?

• The lambda function definition is: add_bangs = (a + '!!!'), and the function call is: add_bangs('hello').
• The lambda function definition is: add_bangs = (lambda a: a + '!!!'), and the function call is: add_bangs('hello').
• The lambda function definition is: (lambda a: a + '!!!') = add_bangs, and the function call is: add_bangs('hello').

Writing a lambda function you already know

Take a look at this function definition:
    def echo_word(word1, echo):
        """Concatenate echo copies of word1."""
        words = word1 * echo
        return words

• Define the lambda function echo_word using the variables word1 and echo. Replicate what the original function definition for echo_word() does above.
• Call echo_word() with the string argument 'hey' and the value 5, in that order. Assign the call to result.

    # Define echo_word as a lambda function: echo_word
    echo_word = (lambda word1, echo: word1 * echo)

    # Call echo_word: result
    result = echo_word('hey', 5)

    # Print result
    print(result)

Map() and lambda functions

For example:

    nums = [2, 4, 6, 8, 10]

    result = map(lambda a: a ** 2, nums)

• In the map() call, pass a lambda function that concatenates the string '!!!' to a string item; also pass the list of strings, spells. Assign the resulting map object to shout_spells.
• Convert shout_spells to a list and print out the list.

    # Create a list of strings: spells
    spells = ['protego', 'accio', 'expecto patronum', 'legilimens']

    # Use map() to apply a lambda function over spells: shout_spells
    shout_spells = map(lambda item: item + '!!!', spells)

    # Convert shout_spells to a list: shout_spells_list
    shout_spells_list = list(shout_spells)

    # Print the result
    print(shout_spells_list)

Filter() and lambda functions

• In the filter() call, pass a lambda function and the list of strings, fellowship. The lambda function should check if the number of characters in a string member is greater than 6; use the len() function to do this. Assign the resulting filter object to result.
• Convert result to a list and print out the list.

    # Create a list of strings: fellowship
    fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas',
                  'gimli', 'gandalf']

    # Use filter() to apply a lambda function over fellowship: result
    result = filter(lambda member: len(member) > 6, fellowship)

    # Convert result to a list: result_list
    result_list = list(result)

    # Print result_list
    print(result_list)

Reduce() and lambda functions

Remember gibberish() from a few exercises back?
    # Define gibberish
    def gibberish(*args):
        """Concatenate strings in *args together."""
        hodgepodge = ''
        for word in args:
            hodgepodge += word
        return hodgepodge

• Import the reduce function from the functools module.
• In the reduce() call, pass a lambda function that takes two string arguments item1 and item2 and concatenates them; also pass the list of strings, stark. Assign the result to result. The first argument to reduce() should be the lambda function and the second argument is the list stark.

    # Import reduce from functools
    from functools import reduce

    # Create a list of strings: stark
    stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

    # Use reduce() to apply a lambda function over stark: result
    result = reduce(lambda item1, item2: item1 + item2, stark)

    # Print the result
    print(result)

Pop quiz about errors

Take a look at the following function calls to len():

    len('There is a beast in every man and it stirs when you put a sword in his hand.')

    len(['robb', 'sansa', 'arya', 'eddard', 'jon'])

    len(525600)

    len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey'))

Which of the function calls raises an error and what type of error is raised?

• The call len('There is a beast in every man and it stirs when you put a sword in his hand.') raises a TypeError.
• The call len(['robb', 'sansa', 'arya', 'eddard', 'jon']) raises an IndexError.
• The call len(525600) raises a TypeError.
• The call len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey')) raises a NameError.

Error handling with try-except

• Initialize the variables echo_word and shout_words to empty strings.
• Add the keywords try and except in the appropriate locations for the exception handling block.
• Use the * operator to concatenate echo copies of word1. Assign the result to echo_word.
• Concatenate the string '!!!' to echo_word. Assign the result to shout_words.

    # Define shout_echo
    def shout_echo(word1, echo=1):
        """Concatenate echo copies of word1 and three
        exclamation marks at the end of the string."""

        # Initialize empty strings: echo_word, shout_words
        echo_word = ''
        shout_words = ''
        # Add exception handling with try-except
        try:
            # Concatenate echo copies of word1 using *: echo_word
            echo_word = word1 * echo

            # Concatenate '!!!' to echo_word: shout_words
            shout_words = echo_word + '!!!'
        except:
            # Print error message
            print("word1 must be a string and echo must be an integer.")

        # Return shout_words
        return shout_words

    # Call shout_echo
    shout_echo("particle", echo="accelerator")

Error handling by raising an error

• Complete the if statement by checking if the value of echo is less than 0.
• In the body of the if statement, add a raise statement that raises a ValueError with message 'echo must be greater than or equal to 0' when the value supplied by the user to echo is less than 0.

    # Define shout_echo
    def shout_echo(word1, echo=1):
        """Concatenate echo copies of word1 and three
        exclamation marks at the end of the string."""

        # Raise an error with raise
        if echo < 0:
            raise ValueError('echo must be greater than or equal to 0')

        # Concatenate echo copies of word1 using *: echo_word
        echo_word = word1 * echo

        # Concatenate '!!!' to echo_word: shout_word
        shout_word = echo_word + '!!!'

        # Return shout_word
        return shout_word

    # Call shout_echo
    shout_echo("particle", echo=5)

Bringing it all together

Bringing it all together (1)

• In the filter() call, pass a lambda function and the sequence of tweets as strings, tweets_df['text']. The lambda function should check if the first 2 characters in a tweet x are 'RT'. Assign the resulting filter object to result. To get the first 2 characters in a tweet x, use x[0:2]. To check equality, use a Boolean filter with ==.
• Convert result to a list and print out the list.

    # Select retweets from the Twitter DataFrame: result
    result = filter(lambda x: x[0:2] == 'RT', tweets_df['text'])

    # Create list from filter object result: res_list
    res_list = list(result)

    # Print all retweets in res_list
    for tweet in res_list:
        print(tweet)

Bringing it all together (2)

• Add a try block so that when the function is called with the correct arguments, it processes the DataFrame and returns a dictionary of results.
• Add an except block so that when the function is called incorrectly, it displays the following error message: 'The DataFrame does not have a ' + col_name + ' column.'.

    # Define count_entries()
    def count_entries(df, col_name='lang'):
        """Return a dictionary with counts of
        occurrences as value for each key."""

        # Initialize an empty dictionary: cols_count
        cols_count = {}

        # Add try block
        try:
            # Extract column from DataFrame: col
            col = df[col_name]

            # Iterate over the column in DataFrame
            for entry in col:

                # If entry is in cols_count, add 1
                if entry in cols_count.keys():
                    cols_count[entry] += 1
                # Else add the entry to cols_count, set the value to 1
                else:
                    cols_count[entry] = 1

            # Return the cols_count dictionary
            return cols_count

        # Add except block
        except:
            print('The DataFrame does not have a ' + col_name + ' column.')

    # Call count_entries(): result1
    result1 = count_entries(tweets_df, 'lang')

    # Print result1
    print(result1)

Bringing it all together (3)

• If col_name is not a column in the DataFrame df, raise a ValueError 'The DataFrame does not have a ' + col_name + ' column.'.
• Call your new function count_entries() to analyze the 'lang' column of tweets_df. Store the result in result1.
• Print result1. This has been done for you, so hit ‘Submit Answer’ to check out the result. In the next exercise, you’ll see that it raises the necessary ValueErrors.

    # Define count_entries()
    def count_entries(df, col_name='lang'):
        """Return a dictionary with counts of


        occurrences as value for each key."""

        # Raise a ValueError if col_name is NOT in DataFrame
        if col_name not in df.columns:
            raise ValueError('The DataFrame does not have a ' + col_name + ' column.')

        # Initialize an empty dictionary: cols_count
        cols_count = {}

        # Extract column from DataFrame: col
        col = df[col_name]

        # Iterate over the column in DataFrame
        for entry in col:

            # If entry is in cols_count, add 1
            if entry in cols_count.keys():
                cols_count[entry] += 1
            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1

        # Return the cols_count dictionary
        return cols_count

    # Call count_entries(): result1
    result1 = count_entries(tweets_df, 'lang')

    # Print result1
    print(result1)

Bringing it all together: testing your error handling skills

You have just written error handling into your count_entries() function so that, when the user passes the function a column (as 2nd argument) NOT contained in the DataFrame (1st argument), a ValueError is thrown. You’re now going to play with this function: it is loaded into pre-exercise code, as is the DataFrame tweets_df. Try calling count_entries(tweets_df, 'lang') to confirm that the function behaves as it should. Then call count_entries(tweets_df, 'lang1'): what is the last line of the output?

• ‘ValueError: The DataFrame does not have the requested column.’
• ‘ValueError: The DataFrame does not have a lang1 column.’
• ‘TypeError: The DataFrame does not have the requested column.’