DS100-1 WS 2.5 Enrico, DM

This document contains code examples for cleaning and manipulating data in Python. It includes examples of importing CSV files, concatenating files, melting a dataframe, extracting columns from another column, and merging files. The examples demonstrate common data cleaning tasks like handling missing data, transforming column types, and joining datasets.

Uploaded by

Analyn Enrico

Worksheet 2.5
DS100-1
CLEANING DATA IN PYTHON
APPLIED DATA SCIENCE
Name: Enrico, Dionne Marc L.

Write code in a Jupyter notebook as required by each problem. Copy both the code and its output as a screenshot and paste
them here.

1 Import literacy_birth_rate.csv and assign it to a DataFrame named data_1. Write code that explores this
DataFrame. List at least 4 problems associated with this DataFrame.
Code and Output

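import pandas as pd
# Load the CSV into data_1, the name the problem asks for
data_1 = pd.read_csv("literacy_birth_rate.csv")
# info() prints its report directly and returns None, so it is not wrapped in print()
data_1.info()
print(data_1.head())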
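
The problem also asks for a list of issues in the data. A minimal sketch of checks that tend to surface such issues is shown below; the frame `toy` and its column names are hypothetical stand-ins, since the contents of literacy_birth_rate.csv are not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real file's columns may differ.
toy = pd.DataFrame({
    "Country": ["A", "B", "B", None],
    "female literacy": ["90.5", "77.0", "77.0", "n/a"],
    "fertility": [1.8, 2.7, 2.7, np.nan],
})

# Missing values per column
missing_per_column = toy.isna().sum()
# Rows that exactly repeat an earlier row
duplicate_rows = int(toy.duplicated().sum())
# Numbers stored as text show up with a non-numeric dtype
column_dtypes = toy.dtypes

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
print(column_dtypes)
```

Running the same three checks on data_1 would give concrete evidence for each problem that is listed.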

2 Import the following files: uber_apr.csv, uber_may.csv, uber_jun.csv. Concatenate these files into a single file,
uber. Print the first 6 lines of the resulting DataFrame. Ensure that the indexes are in order.
Code and Output

import pandas as pd

# Read in the csv files using the read_csv function
apr_df = pd.read_csv("uber_apr.csv")
may_df = pd.read_csv("uber_may.csv")
jun_df = pd.read_csv("uber_jun.csv")

# ignore_index=True renumbers the rows 0..n-1 so the index stays in order
uber_df = pd.concat([apr_df, may_df, jun_df], ignore_index=True)

print(uber_df.head(6))
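
The requirement that the indexes be in order comes down to how pd.concat treats row labels. The sketch below uses two made-up frames (`apr` and `may`, with a hypothetical `pickups` column) rather than the Uber files, and compares the default behavior with `ignore_index=True`.

```python
import pandas as pd

# Made-up frames standing in for two monthly files
apr = pd.DataFrame({"pickups": [10, 20]})
may = pd.DataFrame({"pickups": [30, 40]})

# Default: each frame keeps its own labels, so they repeat
kept = pd.concat([apr, may])
# ignore_index=True discards the old labels and numbers rows 0..n-1
reset = pd.concat([apr, may], ignore_index=True)

print(list(kept.index))   # [0, 1, 0, 1]
print(list(reset.index))  # [0, 1, 2, 3]
```

With repeated labels, a lookup such as `kept.loc[0]` returns two rows, which is why a sequential index is usually preferred after concatenation.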

3 Import tuberculosis.csv. Print the first five lines. Melt the DataFrame, keeping the country and year columns fixed.
Print the last five lines of the melted DataFrame.
Code and Output

import pandas as pd
tb = pd.read_csv("tuberculosis.csv")
print(tb.head())
tb_melt = pd.melt(tb, id_vars=['country', 'year'])
print(tb_melt.tail())
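
To make the effect of pd.melt easier to see, here is a small sketch on a made-up wide frame. The column names `m014` and `f014` are assumptions for illustration, not taken from tuberculosis.csv.

```python
import pandas as pd

# Made-up wide frame: one column per measured group
wide = pd.DataFrame({
    "country": ["AA", "BB"],
    "year": [2000, 2000],
    "m014": [1, 2],
    "f014": [3, 4],
})

# id_vars stay as columns; every other column becomes a (variable, value) row
long = pd.melt(wide, id_vars=["country", "year"])
print(long)
```

Each of the two original rows yields one output row per melted column, so the result has four rows and the columns country, year, variable, and value.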

4 Use the melted DataFrame in the previous problem. Create (and populate) a gender and an age column from the variable
column. Print the first five lines of the resulting DataFrame. Convert the age column to a numeric data type. Hint: use
pd.to_numeric, with the errors parameter equal to 'coerce'. Show evidence that this column has indeed been
transformed into a numeric type.
Code and Output

import pandas as pd

tb = pd.read_csv("tuberculosis.csv")
print(tb.head())
tb_melt = pd.melt(tb, id_vars=['country', 'year'])
print(tb_melt.tail())

# First character of variable is the gender code; the rest is the age group
tb_melt['gender'] = tb_melt['variable'].str[0]
tb_melt['age'] = tb_melt['variable'].str[1:]
print(tb_melt.head())

# Values that cannot be parsed as numbers become NaN
tb_melt['age'] = pd.to_numeric(tb_melt['age'], errors='coerce')
# Evidence of the conversion: report the column's actual dtype
print("age dtype:", tb_melt['age'].dtype)
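
The string slicing and the `errors='coerce'` hint can be checked on a tiny made-up column. The codes `m014`, `f1524`, and `mu` below are assumed for illustration; the point is that a remainder that is not a number turns into NaN instead of raising an error.

```python
import pandas as pd

# Made-up codes: a gender letter followed by an age group, or 'u' for unknown
melted = pd.DataFrame({"variable": ["m014", "f1524", "mu"]})

# First character is the gender code
melted["gender"] = melted["variable"].str[0]
# The remainder is parsed as a number; unparseable text becomes NaN
melted["age"] = pd.to_numeric(melted["variable"].str[1:], errors="coerce")

print(melted)
print(pd.api.types.is_numeric_dtype(melted["age"]))  # True
```

Because NaN is a floating-point value, a coerced column that contains any NaN is stored as a float type rather than an integer type.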

5 Merge the files site.csv and visited.csv into a single DataFrame. Join on the name column of site and the site
column of visited. Make sure that the index labels are in order. Print the resulting DataFrame.
Code and Output

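import pandas as pd

# Read the two files into separate DataFrames
site = pd.read_csv("site.csv")
visited = pd.read_csv("visited.csv")

# Join on site.name == visited.site; merge returns a fresh 0..n-1 index
df = pd.merge(site, visited, left_on="name", right_on="site")
print(df)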
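
Merging differs from concatenating: a merge matches rows on key columns, while a concatenation only stacks rows. The sketch below uses made-up frames whose values and extra columns (`lat`, `ident`) are hypothetical; only the key column names `name` and `site` come from the problem statement.

```python
import pandas as pd

# Made-up stand-ins for site.csv and visited.csv
site = pd.DataFrame({"name": ["DR-1", "MSK-4"], "lat": [-49.85, -48.87]})
visited = pd.DataFrame({
    "ident": [619, 622, 837],
    "site": ["DR-1", "DR-1", "MSK-4"],
})

# Match site.name against visited.site; one output row per matching pair
merged = pd.merge(site, visited, left_on="name", right_on="site")
print(merged)
print(list(merged.index))  # [0, 1, 2]
```

Here one site matches two visits, so its row is repeated. The result carries both key columns and a sequential index, so no separate reset is needed.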
