Data Frame Notes1

A DataFrame is a 2D data structure that organizes data into rows and columns like an Excel spreadsheet. It allows storing and manipulating tabular data in Python. A DataFrame can be created from lists, dictionaries, NumPy arrays, and other DataFrames. It consists of columns that can be of different data types. Indexes are used to label rows and columns. New columns and rows can be added, and existing ones can be modified or deleted.

Uploaded by

ANIKET RATHOUR

What is a DataFrame?

A DataFrame is a 2D (two-dimensional) data structure, i.e., data is arranged in tabular form, in rows and columns.

In other words, a pandas DataFrame is similar to an Excel sheet.

Let's understand it through an example.

   Name     Age  Department        Charges  Gender
0  Arprit   62   Surgery           300      M
1  Zarina   22   ENT               250      F
2  Kareem   32   Orthopaedic       200      M
3  Arun     12   Surgery           300      M
4  Zubin    30   ENT               250      M
5  Kettaki  16   ENT               250      F
6  Ankita   29   Cardiology        800      F
7  Zareen   45                     300      F
8  Kush     19   Cardiology        800      M
9  Shilpa   23   Nuclear Medicine  400      F

The numbers 0 to 9 on the left are known as Indexes, the headings (Name, Age, Department, Charges, Gender) are known as Columns, and the entries inside the table are the Data Values.
1. Create DataFrame
A pandas DataFrame can be created using the following constructor:
pandas.DataFrame(data, index, columns, dtype, copy)

The parameters of the constructor are as follows:

Sr.No  Parameter  Description

1  data  The data can take various forms such as ndarray, Series, map, list, dict, constants or another DataFrame.

2  index  Row labels for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.

3  columns  Column labels for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.

4  dtype  Data type to force on the columns. Optional; if it is not given, the type of each column is inferred from the data.

5  copy  Whether to copy the data from the input. The default is False in these notes (newer pandas versions choose the default based on the type of input).
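To make the parameter list concrete, here is a minimal sketch that passes data, index, columns and dtype explicitly. The row labels and the numbers are made up for illustration only.

```python
import numpy as np
import pandas as pd

# data: a 2x2 NumPy array (illustrative values, not from the notes)
values = np.array([[300, 250], [200, 800]])

charges = pd.DataFrame(
    data=values,
    index=["row1", "row2"],        # row labels (hypothetical names)
    columns=["Surgery", "ENT"],    # column labels
    dtype="float64",               # force every column to be stored as float
)
print(charges)

# When index and columns are left out, pandas labels both axes 0, 1, ..., n-1
default_labels = pd.DataFrame(values)
print(list(default_labels.index), list(default_labels.columns))
```

Leaving out dtype lets pandas infer the type of each column from the data.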

www.pythonclassroomdiary.wordpress.com by Sangeeta M Chauhan , PGT CS KV NO.3 Gwalior


A pandas DataFrame can be created using various inputs such as:

• Lists
• Dictionaries
• Series
• NumPy ndarrays
• Another DataFrame
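Lists and dictionaries are covered in the sections that follow. As a rough sketch of the remaining inputs (Series, NumPy ndarrays and another DataFrame), something like the following should work. The variable names and values are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# From a dictionary of Series: the Series index becomes the row labels
ages = pd.Series([28, 34], index=["Shraddha", "Shanti"])
charges = pd.Series([300, 250], index=["Shraddha", "Shanti"])
from_series = pd.DataFrame({"Age": ages, "Charges": charges})

# From a 2D NumPy ndarray: column labels are supplied separately
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["A", "B"])

# From another DataFrame: copy=True asks for an independent copy
from_frame = pd.DataFrame(from_array, copy=True)
from_frame.loc[0, "A"] = 99    # changes the copy only, not from_array

print(from_series)
print(from_array)
print(from_frame)
```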

1.1 Create an Empty DataFrame


>>> import pandas as pd
>>> df=pd.DataFrame()
>>> df

Empty DataFrame
Columns: []
Index: []

1.2 Create a DataFrame from Lists


Example 1
>>> MyList=[10,20,30,40]
>>> MyFrame=pd.DataFrame(MyList)
>>> MyFrame

0
0 10
1 20
2 30
3 40

Example 2: (Nested List)


>>> Friends = [['Shraddha','Doctor'],['Shanti','Teacher'],['Monica','Engineer']]
>>> MyFrame=pd.DataFrame(Friends,columns=['Name','Occupation'])
>>> MyFrame

Name Occupation
0 Shraddha Doctor
1 Shanti Teacher
2 Monica Engineer

1.3 Creation of a DataFrame from Dictionary of ndarrays / Lists


• All the ndarrays must be of the same length.
• If an index is passed, the length of the index must equal the length of the arrays.
• If no index is passed, then by default the index will be range(n), where n is the array length.
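The three rules above can be checked with a short sketch. The flag names (unequal_ok, short_index_ok) are hypothetical helpers used only to record what happened.

```python
import numpy as np
import pandas as pd

data = {"Name": np.array(["Shraddha", "Shanti", "Monica"]),
        "Age": np.array([28, 34, 29])}

# No index passed: the row labels default to 0, 1, 2
df = pd.DataFrame(data)
print(list(df.index))

# Arrays of unequal length are rejected with a ValueError
try:
    pd.DataFrame({"Name": ["Shraddha", "Shanti"], "Age": [28, 34, 29]})
    unequal_ok = True
except ValueError as err:
    unequal_ok = False
    print("unequal lengths:", err)

# An index whose length differs from the arrays is rejected as well
try:
    pd.DataFrame(data, index=["a", "b"])
    short_index_ok = True
except ValueError as err:
    short_index_ok = False
    print("wrong index length:", err)
```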



Example 1 (without index)
>>> data = {'Name':['Shraddha', 'Shanti', 'Monica',
'Yogita'],'Age':[28,34,29,39]}
>>> df = pd.DataFrame(data)
>>> df
Name Age
0 Shraddha 28
1 Shanti 34
2 Monica 29
3 Yogita 39

Example 2 (with index)


>>> data = {'Name':['Shraddha', 'Shanti', 'Monica',
'Yogita'],'Age':[28,34,29,39]}
>>> df = pd.DataFrame(data, index=['Friend1','Friend2','Relative1','Relative2'])
>>> df
Name Age
Friend1 Shraddha 28
Friend2 Shanti 34
Relative1 Monica 29
Relative2 Yogita 39

1.4 Create a DataFrame from List of Dictionaries


Here we pass a list of dictionaries to create a DataFrame. The dictionary
keys are taken as column names by default.

Example 1:
>>> Mydict= [{'Won': 15, 'Loose': 2},{'Won': 5, 'Loose': 10},
{'Won': 8, 'Loose': 9},{'Won':4}]
>>> df = pd.DataFrame(Mydict)
>>> df
Loose Won
0 2.0 15
1 10.0 5
2 9.0 8
3 NaN 4

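Notice that the missing value is stored as NaN (Not a Number). Because NaN is a floating-point value, the other entries of the Loose column are shown as floats (2.0, 10.0, 9.0).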
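A small sketch of how the missing entry can be detected, assuming the same 'Won' and 'Loose' keys as in the example. isna() is the pandas method for testing for NaN.

```python
import pandas as pd

# The second dictionary has no 'Loose' key
results = [{"Won": 15, "Loose": 2}, {"Won": 4}]
df = pd.DataFrame(results)

# isna() marks the cells that hold NaN
missing = df["Loose"].isna().tolist()
print(missing)

# The column with a NaN is stored as float, the complete column stays integer
print(df.dtypes)
```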

Example 2:
>>> Mydict=[{'Won': 15, 'Loose': 2},{'Won': 5, 'Loose': 10},{'Won': 8, 'Loose': 9}]
>>> df = pd.DataFrame(Mydict, index=['India', 'Pakistan','Australia'])
>>> df

Loose Won
India 2 15
Pakistan 10 5
Australia 9 8



Example 3
We can also create a DataFrame by specifying a list of dictionaries, row
indices, and column indices.

>>> L_dict = [{'Maths': 78, 'Chemistry': 78,'Physics':87},{'Maths': 67, 'Chemistryb': 70},{'Physics':77,'Maths':87}]

>>> df1 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'], columns=['Physics', 'Chemistry','Maths'])    # A

>>> df1

Physics Chemistry Maths


Student1 87.0 78.0 78
Student2 NaN NaN 67
Student3 77.0 NaN 87

>>> df2 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'], columns=['Chemistry','Maths'])    # B

>>> df2

Chemistry Maths
Student1 78.0 78
Student2 NaN 67
Student3 NaN 87

>>> df3 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'], columns=['English','Chemistry','Maths'])    # C

>>> df3

English Chemistry Maths


Student1 NaN 78.0 78
Student2 NaN NaN 67
Student3 NaN NaN 87
Observe the lines marked A, B and C above. The output of A, B and C
depends on the columns mentioned while creating the DataFrame. If a
dictionary key matches a specified column, the corresponding data is
shown. If a specified column does not match any key, NaN is displayed
for it.
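The matching rule can be reduced to a smaller sketch. The two-student list below is a cut-down, illustrative version of L_dict, not the data from the notes.

```python
import pandas as pd

marks = [{"Maths": 78, "Chemistry": 78, "Physics": 87},
         {"Maths": 67}]

# Only the requested columns appear in the result:
# 'Maths' matches a key, 'English' matches none, 'Physics' is not requested
df = pd.DataFrame(marks, index=["Student1", "Student2"],
                  columns=["English", "Maths"])
print(df)
```

Keys that are not listed in columns (here Chemistry and Physics) are simply left out of the frame.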
2. Addition of New Column & Row
2.1 Column Addition
>>> L_dict = [{'Maths': 78, 'Chemistry': 78,'Physics':87},{'Maths': 67,
'Chemistry': 70},{'Physics':77,'Maths':87,'Chemistry':90}]

>>> df3 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'], columns=['English','Chemistry','Maths'])

>>> df3['Physics']=[45,56,65]



A new column 'Physics' has been added with new data:
>>> df3

English Chemistry Maths Physics


Student1 NaN 78 78 45
Student2 NaN 70 67 56
Student3 NaN 90 87 65

• We can also update column data using the same method


>>> df3['English']=[78,98,89]

>>> df3
English Chemistry Maths Physics
Student1 78 78 78 45
Student2 98 70 67 56
Student3 89 90 87 65
• We can add a new column using data stored in the existing frame
>>> df3['Total']=df3.English+df3.Chemistry+df3.Maths+df3.Physics
>>> df3
English Chemistry Maths Physics Total
Student1 78 78 78 45 279
Student2 98 70 67 56 291
Student3 89 90 87 65 331

Look, a new column Total has been added with the total of marks in the other subjects.
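The column steps above (adding, updating, deriving) can also be written with bracket access and sum(axis=1). This is an alternative sketch, not the method used in the notes. The marks are the ones shown above; the subjects list is a helper name introduced here.

```python
import pandas as pd

df = pd.DataFrame({"Chemistry": [78, 70, 90], "Maths": [78, 67, 87]},
                  index=["Student1", "Student2", "Student3"])

# Assigning to a label that does not exist adds a column
df["Physics"] = [45, 56, 65]
df["English"] = [78, 98, 89]

# Assigning to an existing label would overwrite that column instead

# Derived column: add the subject columns row by row
subjects = ["English", "Chemistry", "Maths", "Physics"]   # helper list
df["Total"] = df[subjects].sum(axis=1)
print(df)
```

Bracket access also works for column names that contain spaces or clash with DataFrame method names, where the df.Name form cannot be used.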

2.2 Row Addition


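i. To add a row by specifying the row index (label). The list must contain one value for each column; the examples in this section use a frame with the three columns English, Chemistry and Maths.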
>>> df3.loc['Student4']=[45,67,45]

>>> df3
English Chemistry Maths
Student1 78 78 78
Student2 98 70 67
Student3 89 90 87
Student4 45 67 45
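ii. To modify a row by specifying its row index number (position). Note that iloc can only modify an existing row; assigning to a position that does not exist raises an IndexError.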
>>> df3.iloc[3]=[45,67,45]

>>> df3
English Chemistry Maths
Student1 78 78 78
Student2 98 70 67
Student3 89 90 87
Student4 45 67 45

>>> df3.iloc[3]=[65,77,90]

>>> df3
English Chemistry Maths
Student1 78 78 78
Student2 98 70 67
Student3 89 90 87
Student4 65 77 90
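The difference between loc and iloc when assigning rows can be sketched as follows, using the three-column frame from this section. The enlarged flag is a hypothetical name used only to record the outcome.

```python
import pandas as pd

df = pd.DataFrame({"English": [78, 98, 89],
                   "Chemistry": [78, 70, 90],
                   "Maths": [78, 67, 87]},
                  index=["Student1", "Student2", "Student3"])

# loc with a new label appends a row
df.loc["Student4"] = [45, 67, 45]

# iloc with an existing position overwrites that row
df.iloc[3] = [65, 77, 90]

# iloc with a position that does not exist cannot add a row
try:
    df.iloc[4] = [1, 2, 3]
    enlarged = True
except IndexError as err:
    enlarged = False
    print("iloc could not add a row:", err)

print(df)
```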



3. Deletion of an Existing Column/Row from DataFrame
3.1 Column Deletion
There are three methods to delete a column
i. Using del
>>> del df3['Total']

>>> df3
English Chemistry Maths Physics
Student1 78 78 78 45
Student2 98 70 67 56
Student3 89 90 87 65

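ii. Using pop()
pop() removes the column and returns it as a Series. It is shown here as an alternative to del, on a frame that still contains the Total column.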
>>> df3.pop('Total')
Student1 279
Student2 291
Student3 331
Name: Total, dtype: int64

iii. Using drop()
To delete a particular column we must specify axis=1. drop() returns a new DataFrame; df3 itself is unchanged unless the result is assigned back.
>>> df3.drop('Physics',axis=1)
English Chemistry Maths
Student1 78 78 78
Student2 98 70 67
Student3 89 90 87
3.2 Row Deletion
>>> df3.drop('Student3')

English Chemistry Maths Physics


Student1 78 78 78 45
Student2 98 70 67 56
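A compact sketch comparing the deletion methods on a small frame. The frame is illustrative, with Total computed from the three columns shown. It highlights which calls change the frame in place and which return a new one.

```python
import pandas as pd

df = pd.DataFrame({"English": [78, 98, 89],
                   "Maths": [78, 67, 87],
                   "Physics": [45, 56, 65],
                   "Total": [201, 221, 241]},
                  index=["Student1", "Student2", "Student3"])

# pop() removes the column in place and returns it as a Series
total = df.pop("Total")

# del removes the column in place and returns nothing
del df["Physics"]

# drop() returns a new frame; df keeps all of its rows and columns
without_maths = df.drop("Maths", axis=1)    # axis=1: drop a column
without_student3 = df.drop("Student3")      # default axis=0: drop a row

print(df)
print(without_maths)
print(without_student3)
```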

4. Accessing Data from Data Frame


4.1 Accessing Column
>>> df3['English']
Student1 NaN
Student2 NaN
Student3 NaN
Name: English, dtype: float64

>>> df3.English
Student1 NaN
Student2 NaN
Student3 NaN
Name: English, dtype: float64

4.2 Accessing Rows
Accessing rows from the start index to the end-1 index
>>> df3[1:2]



English Chemistry Maths
Student2 98 70 67

Here we are also mentioning the column index
>>> df3.iloc[0:1,[0,1]]
English Chemistry
Student1 78 78

>>> df3.loc['Student3']
English 89.0
Chemistry 90.0
Maths 87.0
Physics 66.0
Name: Student3, dtype: float64
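The row-access forms above can be compared side by side in a short sketch, using a three-column frame with the marks from the earlier examples. The last line (a label slice with loc) is an addition that is not in the notes.

```python
import pandas as pd

df = pd.DataFrame({"English": [78, 98, 89],
                   "Chemistry": [78, 70, 90],
                   "Maths": [78, 67, 87]},
                  index=["Student1", "Student2", "Student3"])

second = df[1:2]                  # rows from position 1 up to, not including, 2
subset = df.iloc[0:1, [0, 1]]     # first row, columns at positions 0 and 1
row = df.loc["Student3"]          # one row by label, returned as a Series

# Unlike position slices, a label slice with loc includes its end label
by_label = df.loc["Student1":"Student2"]

print(second)
print(subset)
print(row)
print(by_label)
```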

4.3 Accessing Data Value


>>> df3.Physics['Student3']
OR
>>> df3.Physics[2]
OR
>>> df3.at['Student3','Physics']
OR
>>> df3.loc['Student3','Physics']

66.0

All 4 commands will generate the same output.
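A sketch that checks the equivalent ways of reading a single value. It assumes a small frame with float marks. It writes the positional lookup as .iloc[2] rather than Physics[2], because recent pandas versions warn when a plain integer is used as a position on a Series with string labels. iat is the positional counterpart of at and is an addition here.

```python
import pandas as pd

df = pd.DataFrame({"Physics": [45.0, 56.0, 66.0],
                   "Maths": [78.0, 67.0, 87.0]},
                  index=["Student1", "Student2", "Student3"])

by_series_label = df["Physics"]["Student3"]    # column first, then row label
by_series_position = df["Physics"].iloc[2]     # column first, then row position
by_at = df.at["Student3", "Physics"]           # fast access to one cell by labels
by_loc = df.loc["Student3", "Physics"]         # general label-based access
by_iat = df.iat[2, 0]                          # one cell by row and column position

print(by_series_label, by_series_position, by_at, by_loc, by_iat)
```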
