WEBINTEL GUIDED LAB ACTIVITY Introduction To Pandas

The document demonstrates how to use Pandas, a Python library, to load and manipulate tabular data. It shows how to load data from a CSV file into a DataFrame, view the data, select columns and rows, add new columns, rearrange columns, filter rows, aggregate statistics, and save the DataFrame to other file formats. Various DataFrame methods and attributes like head(), shape, info(), loc, iloc, describe(), sort_values(), and groupby() are used to explore and manipulate the data.

Uploaded by

Joseph Lacuerda


In [ ]: # pandas is a software library written for the Python programming language
# for data manipulation and analysis. In particular, it offers data structures
# and operations for manipulating numerical tables and time series.
import pandas

In [ ]: # A DataFrame is two-dimensional, size-mutable, potentially heterogeneous tabular data.
# The data structure also contains labeled axes (rows and columns).
# gapminder.tsv is tab-separated, so read_csv needs sep='\t'.
df = pandas.read_csv('gapminder.tsv', sep='\t')
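
For readers who do not have gapminder.tsv at hand, the loading step can be tried on an in-memory stand-in. This is a minimal sketch: `sample_tsv` and `sample_df` are hypothetical names, and the three rows are illustrative values laid out with the same six columns the lab uses later.

```python
import io

import pandas

# Hypothetical three-row sample in a tab-separated layout (illustrative values only)
sample_tsv = (
    "country\tcontinent\tyear\tlifeExp\tpop\tgdpPercap\n"
    "Afghanistan\tAsia\t1952\t28.8\t8425333\t779.4\n"
    "Albania\tEurope\t1952\t55.2\t1282697\t1601.1\n"
    "Algeria\tAfrica\t1952\t43.1\t9279525\t2449.0\n"
)

# read_csv accepts any file-like object, so StringIO stands in for the file on disk
sample_df = pandas.read_csv(io.StringIO(sample_tsv), sep="\t")
print(sample_df)
```

Without `sep="\t"` the whole line would be parsed as a single column, because the default separator is a comma.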

In [ ]: df.head() # show the first 5 rows

In [ ]: df.columns # show the column names

In [ ]: df.index # show the row index

In [ ]: df.values # show the values as a NumPy array

In [ ]: type(df) # show the type of the data frame

In [ ]: # The shape attribute of pandas.DataFrame stores the number of rows and columns
# as a tuple (number of rows, number of columns).
df.shape

In [ ]: # shape is an attribute, not a method; this call raises TypeError ('tuple' object is not callable)
df.shape()

In [ ]: # The info() method prints a concise summary of a DataFrame.
df.info()

In [ ]: df['country'] # select a single column

In [ ]: # selecting a single column returns a Series (not a DataFrame) holding the country values
country_df = df['country']

In [ ]: country_df.head() # show the first 5 values

In [ ]: # A Series is a one-dimensional labeled array capable of
# holding data of any type (integer, string, float, Python objects, etc.).
type(country_df) # show the type: a pandas Series
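
The difference between a Series and a one-column DataFrame comes down to the brackets used. A minimal sketch, assuming a small hypothetical frame named `demo` rather than the lab's `df`:

```python
import pandas

# Hypothetical two-row frame used only for illustration
demo = pandas.DataFrame({"country": ["Afghanistan", "Albania"], "year": [1952, 1952]})

as_series = demo["country"]    # single brackets with one name: a Series
as_frame = demo[["country"]]   # a list of names: a DataFrame, even with one column

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The Series has a one-element shape tuple, while the DataFrame keeps both a row count and a column count.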

In [ ]: # create a new data frame that holds only the following columns
subset = df[['country', 'continent', 'year']]

In [ ]: subset.head() # show the first 5 rows

In [ ]: pandas.__version__ # show the installed pandas version

In [ ]: # loc is label-based, which means that you have to specify rows and columns based
# on their row and column labels
df.loc[2]

In [ ]: df.loc[[2, 0]] # a list of labels returns those rows in the given order

In [ ]: # iloc is integer-position based, so you have to specify rows and columns by their integer position
df.iloc[2]
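
With the default 0, 1, 2, ... index, `loc[2]` and `iloc[2]` happen to return the same row, which hides the distinction. A minimal sketch, assuming a hypothetical frame `demo` whose index labels are deliberately out of order:

```python
import pandas

# Hypothetical frame whose index labels do not match the row positions
demo = pandas.DataFrame({"value": ["a", "b", "c"]}, index=[2, 0, 1])

by_label = demo.loc[2]       # the row whose index label is 2 (the first row)
by_position = demo.iloc[2]   # the third row by position (its label is 1)

print(by_label["value"], by_position["value"])
```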

In [ ]: df.head() # show the first 5 rows

In [ ]: # The ix indexer was an early addition to the library that allowed selecting rows and columns
# by either integer location or label. DEPRECATED: it was removed in pandas 1.0,
# so this cell raises AttributeError on current versions; use loc or iloc instead.
df.ix[2]

In [ ]: # create a new data frame that holds all rows from the columns year and pop
subset = df.loc[:, ['year', 'pop']]

In [ ]: subset.head() # show the first 5 rows

In [ ]: # show the year and pop columns for the rows where year equals 1967
df.loc[df['year'] == 1967, ['year', 'pop']]

In [ ]: # show the year and pop columns for the rows where year equals 1967 and pop > 1,000,000
df.loc[(df['year'] == 1967) & (df['pop'] > 1_000_000),
       ['year', 'pop']]
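
The combined filter is easier to follow when each condition is named first. A minimal sketch on a hypothetical four-row frame (`demo`, `is_1967`, and `is_large` are illustrative names):

```python
import pandas

# Hypothetical data: only country "B" satisfies both conditions
demo = pandas.DataFrame({
    "country": ["A", "B", "C", "D"],
    "year": [1967, 1967, 1972, 1967],
    "pop": [500_000, 2_000_000, 3_000_000, 1_000_000],
})

is_1967 = demo["year"] == 1967        # boolean Series, one value per row
is_large = demo["pop"] > 1_000_000    # strict comparison, so exactly 1,000,000 is excluded

# & combines the masks element-wise; Python's `and` would raise a ValueError here
result = demo.loc[is_1967 & is_large, ["year", "pop"]]
print(result)
```

When the conditions are written inline, each comparison needs its own parentheses because `&` binds more tightly than `==` and `>`.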

In [ ]: # display every row using a for loop
for index, row in df.iterrows():
    print(index, row)

In [ ]: # display one column of every row using a for loop
for index, row in df.iterrows():
    print(index, row['country'])
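
`iterrows()` builds a new Series for every row, which is convenient but slow on large frames. As a hedged alternative (not part of the original lab), `itertuples()` yields lightweight named tuples, and a whole column can be pulled out without any loop. The frame `demo` below is hypothetical.

```python
import pandas

# Hypothetical two-row frame used only for illustration
demo = pandas.DataFrame({"country": ["Afghanistan", "Albania"], "year": [1952, 1952]})

# itertuples yields one named tuple per row; the index is exposed as .Index
for row in demo.itertuples(index=True):
    print(row.Index, row.country)

# When only one column is needed, no loop is required at all
countries = demo["country"].tolist()
print(countries)
```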

In [ ]: # The describe() method computes summary statistics (count, mean, std, min, percentiles, max)
# for the numeric columns of a Series or DataFrame.
df.describe()

In [ ]: # show the data frame sorted by the 'country' column in descending order
df.sort_values('country', ascending=False)

In [ ]: # show the data frame sorted by 'country' (A-Z), then by 'pop' (high to low)
df.sort_values(['country', 'pop'], ascending=[True, False])
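
A small worked case makes the two-key ordering visible. A minimal sketch, assuming a hypothetical frame `demo` with repeated country names:

```python
import pandas

# Hypothetical data with two rows per country
demo = pandas.DataFrame({
    "country": ["B", "A", "A", "B"],
    "pop": [10, 5, 20, 1],
})

# First key ascending (A before B), second key descending within each country
ordered = demo.sort_values(["country", "pop"], ascending=[True, False])
print(ordered)
```

`sort_values` returns a new frame and keeps the original index labels, so the result's index shows where each row came from.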

In [ ]: # add a new column to the data frame
df['new_continent'] = df['continent']

In [ ]: df.head()

In [ ]: # drop a specific column
df = df.drop(columns=['new_continent'])
df.head()

In [ ]: # add a new column to the data frame, selecting the source column by position (column 1 is 'continent')
df['new_continent'] = df.iloc[:, 1]
df

In [ ]: # rearrange the data frame columns
df = df[['country', 'continent', 'new_continent', 'year', 'lifeExp', 'pop', 'gdpPercap']]
df

In [ ]: # create and save a new csv file (the row index is written as the first column)
df.to_csv('modified.csv')

In [ ]: # create and save a new csv file without the index
df.to_csv('modified.csv', index=False)
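
The effect of `index=False` shows up when the file is read back. The sketch below writes to in-memory buffers instead of `modified.csv` so that it leaves no files behind; `demo` and the buffer names are hypothetical.

```python
import io

import pandas

# Hypothetical two-row frame used only for illustration
demo = pandas.DataFrame({"country": ["Afghanistan", "Albania"], "year": [1952, 1952]})

with_index = io.StringIO()
demo.to_csv(with_index)                    # default: the index becomes an unnamed first column

without_index = io.StringIO()
demo.to_csv(without_index, index=False)    # only the real columns are written

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(repr(header_with), repr(header_without))

# Reading the first version back yields an extra 'Unnamed: 0' column
reloaded = pandas.read_csv(io.StringIO(with_index.getvalue()))
print(list(reloaded.columns))
```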

In [ ]: # create and save a new excel file without the index
# (writing .xlsx requires an Excel engine such as openpyxl to be installed)
df.to_excel('modified.xlsx', index=False)

In [ ]: # create and save a new text file without the index, separated by tabs
df.to_csv('modified.txt', index=False, sep='\t')

In [ ]: # filter the data frame to rows whose country contains 'Afg'
df.loc[df['country'].str.contains('Afg')]

In [ ]: # filter the data frame to rows whose country does not contain 'Afg'
df.loc[~df['country'].str.contains('Afg')]

In [ ]: # filter the data frame using a regular expression ('Afg' or 'Alb')
df.loc[df['country'].str.contains('Afg|Alb', regex=True)]

In [ ]: # filter the data frame using a case-insensitive regular expression
import re
df.loc[df['country'].str.contains('afg|alb', flags=re.I, regex=True)]
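
The masks produced by `str.contains` can be inspected on their own before they are used for filtering. A minimal sketch on a hypothetical Series named `countries`; the last entry is made up to show the effect of case-insensitive matching.

```python
import re

import pandas

# Hypothetical values; the last one only matches when case is ignored
countries = pandas.Series(["Afghanistan", "Albania", "Algeria", "afghan hound"])

# Case-sensitive alternation: matches 'Afg' or 'Alb' anywhere in the string
case_sensitive = countries.str.contains("Afg|Alb", regex=True)

# re.IGNORECASE (same flag as re.I) also matches lowercase 'afg'
case_insensitive = countries.str.contains("afg|alb", flags=re.IGNORECASE, regex=True)

print(case_sensitive.tolist())
print(case_insensitive.tolist())
```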

In [ ]: # change the value 'Asia' in the column new_continent to 'Asya'
df.loc[df['new_continent'] == 'Asia', 'new_continent'] = 'Asya'
df

In [ ]: # assign one scalar to both columns in every row where pop > 1,000,000
df.loc[df['pop'] > 1000000, ['new_continent','continent']] = 'Hello World'

In [ ]: # assign one value per column (new_continent = 'Hello', continent = 'World') in the same rows
df.loc[df['pop'] > 1000000, ['new_continent','continent']] = ['Hello', 'World']
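
Conditional assignment with `loc` can be checked on a frame small enough to read at a glance. A minimal sketch, assuming a hypothetical four-row frame `demo` in which three rows exceed the threshold:

```python
import pandas

# Hypothetical data: rows 1, 2 and 3 have pop > 1,000,000
demo = pandas.DataFrame({
    "pop": [500, 2_000_000, 3_000_000, 4_000_000],
    "continent": ["Asia", "Europe", "Africa", "Asia"],
    "new_continent": ["Asia", "Europe", "Africa", "Asia"],
})

mask = demo["pop"] > 1_000_000

# A list with one value per selected column is applied column by column
demo.loc[mask, ["new_continent", "continent"]] = ["Hello", "World"]
print(demo)
```

Rows where the mask is False are left untouched, so the first row keeps its original values.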

In [ ]: # reload the file saved earlier, discarding the in-memory changes made since then
df = pandas.read_csv('modified.csv')

In [ ]: df

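In [ ]: # aggregate statistics using groupby: mean of each numeric column per continent
# (numeric_only=True skips text columns such as country, which pandas 2.0+ refuses to average)
df.groupby(['continent']).mean(numeric_only=True)

In [ ]: # the same means, sorted by mean population (high to low)
df.groupby(['continent']).mean(numeric_only=True).sort_values('pop', ascending=False)

In [ ]: # the same means, sorted by mean GDP per capita (high to low)
df.groupby(['continent']).mean(numeric_only=True).sort_values('gdpPercap', ascending=False)

In [ ]: # count the non-null values in each column per continent
df.groupby(['continent']).count()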
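
The groupby cells above can be reproduced on a frame small enough to verify by hand. This is a minimal sketch with hypothetical data; the `agg` call with named outputs (`mean_life`, `total_pop`) is an addition for illustration and is not part of the original lab.

```python
import pandas

# Hypothetical data: two countries per continent
demo = pandas.DataFrame({
    "country": ["A", "B", "C", "D"],
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "lifeExp": [40.0, 60.0, 70.0, 80.0],
    "pop": [100, 300, 50, 150],
})

# Mean of every numeric column per continent; the text column 'country' is skipped
means = demo.groupby("continent").mean(numeric_only=True)

# Named aggregation: choose a different statistic for each column
summary = demo.groupby("continent").agg(
    mean_life=("lifeExp", "mean"),
    total_pop=("pop", "sum"),
)

print(means)
print(summary)
```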

In [ ]: # add a helper column of ones so that rows can be counted per group
df['count'] = 1
df

In [ ]: # number of rows per continent
df.groupby(['continent']).count()['count']
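
The helper column works because `count()` tallies non-null values, and a column of ones is never null. As a hedged alternative, `size()` counts rows directly without adding a column. The frame `demo` below is hypothetical and includes one missing value to show the difference.

```python
import pandas

# Hypothetical data with one missing lifeExp value in Asia
demo = pandas.DataFrame({
    "continent": ["Asia", "Asia", "Europe"],
    "lifeExp": [40.0, None, 70.0],
})

rows_per_group = demo.groupby("continent").size()                  # counts rows
non_null_life = demo.groupby("continent")["lifeExp"].count()       # counts non-null values only

print(rows_per_group)
print(non_null_life)
```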

In [ ]:
