
Notebook PYTHON DATA SCIENCE

The document provides an introduction to Python data science and Pandas. It covers Pandas Series, DataFrames, reading and writing data, merging and joining DataFrames. Key Pandas functions like head, tail, describe and aggregation functions are also covered.

Uploaded by

darayir140

PYTHON DATA SCIENCE

May 27, 2021

1 Hello Students, I, Abhishek Dixit, Welcome You

2 To Learn Python Data Science Tutorials

Name,Email,Phone Number,Address
Bob Smith,[email protected],123-456-7890,123 Fake Street
Mike Jones,[email protected],098-765-4321,321 Fake venue
[ ]: import csv

[ ]: with open('/content/MODIFIED MID SEMESTER EXAMINATION MARKS.csv', 'r') as f:
         reader = csv.reader(f)
         for row in reader:
             print(row)

[ ]: import json

[ ]: with open('/content/sample_data/anscombe.json') as f:
         records = json.load(f)
         for record in records:
             print(record)

[ ]: from google.colab import files
     uploaded = files.upload()

[ ]: with open('/content/MODIFIED MID SEMESTER EXAMINATION MARKS.csv', 'w', newline='') as file:
         writer = csv.writer(file)
         writer.writerow(['NAME', 'ROLL NO.', 'MARKS'])
         writer.writerow(['Abhishek', '0901FT201001.', 15])
         writer.writerow(['Pawan', '0901FT201112.', 17])

[ ]: with open('/content/MODIFIED MID SEMESTER EXAMINATION MARKS.csv', 'r') as file:
         reader = csv.DictReader(file)
         for row in reader:
             print(row)

OrderedDict([('NAME', 'Abhishek'), ('ROLL NO.', '0901FT201001.'), ('MARKS', '15')])
OrderedDict([('NAME', 'Pawan'), ('ROLL NO.', '0901FT201112.'), ('MARKS', '17')])
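
The write-then-read cycle above can be tried outside Colab as well. The following is a minimal sketch, assuming a temporary directory and a hypothetical file name `marks.csv` (neither is part of the original notebook). It opens the file with `newline=''`, as the `csv` module documentation recommends, and uses `csv.DictWriter` as the counterpart of `csv.DictReader`. On Python 3.8 and later, `DictReader` yields plain `dict` rows rather than `OrderedDict`.

```python
import csv
import os
import tempfile

# Rows mirroring the cells above.
rows = [
    {'NAME': 'Abhishek', 'ROLL NO.': '0901FT201001.', 'MARKS': 15},
    {'NAME': 'Pawan', 'ROLL NO.': '0901FT201112.', 'MARKS': 17},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'marks.csv')  # assumed file name

    # newline='' lets the csv module control line endings itself.
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['NAME', 'ROLL NO.', 'MARKS'])
        writer.writeheader()
        writer.writerows(rows)

    with open(path, 'r', newline='') as f:
        read_back = [dict(row) for row in csv.DictReader(f)]

# Every value comes back as a string, including MARKS.
print(read_back)
```

Note that `MARKS` was written as an integer but is read back as text; convert it with `int()` if arithmetic is needed.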

[ ]: import os

[ ]: os.remove('/content/MODIFIED MID SEMESTER EXAMINATION MARKS.csv')

[ ]: os.makedirs('Test.csv')

[ ]: os.rmdir('/content/Test.csv')

3 PANDAS
[ ]: import pandas
     print('Now Pandas is imported and ready to use')

Now Pandas is imported and ready to use

[ ]: import pandas as pd

[ ]: print(pd.__version__)

1.1.5
Series
DataFrame
[ ]: ms=pd.Series(data, index)

[ ]: v=[11, 17, 22]


data=pd.Series(v)
print(data)

0 11
1 17
2 22
dtype: int64

[ ]: v=[11, 17, 22]


data=pd.Series(v, index = [1, 2, 3])
print(data)

1 11
2 17
3 22
dtype: int64

[ ]: v=[11, 17, 22]


data=pd.Series(v, index = ['a', 'b', 'c'])
print(data)

[ ]: print(data['a'])

11

[ ]: calories = {"Day1": 420, "Day2": 320, "day3": 356}
     pd.Series(calories, index= ["Day1", 2, 3])
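
The cell above passes index labels that only partly match the dictionary keys, and its output is not shown. A small sketch of what to expect, reusing the same `calories` dictionary: labels found among the keys keep their values, while labels that are missing (`2` and `3` here) are filled with `NaN`, which also turns the values into floats. The second Series, `subset`, is an added illustration.

```python
import pandas as pd

calories = {"Day1": 420, "Day2": 320, "day3": 356}

# Only "Day1" is a key of the dictionary; 2 and 3 are not.
s = pd.Series(calories, index=["Day1", 2, 3])
print(s)

# Requesting labels that all exist keeps every requested value.
subset = pd.Series(calories, index=["Day1", "Day2"])
print(subset)
```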

[ ]: series1 = pd.Series([1, 2, 3 ,4], index =['a', 'b','c','d'])
     series2 = pd.Series([1, 3, 6 ,4], index =['a', 'e','c','f'])

[ ]: series1+series2

[ ]: series1-series2

[ ]: series1*series2

[ ]: series1/series2
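
The four cells above have no recorded output. As a sketch of the behaviour, using the same two Series: arithmetic between Series aligns on index labels, so only labels present on both sides (`'a'` and `'c'`) produce a number and every other label becomes `NaN`. The `add(..., fill_value=0)` call is an addition not used in the notebook; it treats a label missing from one side as 0.

```python
import pandas as pd

series1 = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
series2 = pd.Series([1, 3, 6, 4], index=['a', 'e', 'c', 'f'])

# Operators align on labels; only 'a' and 'c' exist in both Series.
total = series1 + series2
print(total)

# add() with fill_value substitutes 0 for a label missing on one side.
filled = series1.add(series2, fill_value=0)
print(filled)
```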

DATAFRAME
pd.DataFrame(data,index)

[ ]: data = {"Name": ['Abhishek', 'Ram', 'Kartik'],


"Age":[30, 28, 3]}
v= pd.DataFrame(data)
print(v)

Name Age
0 Abhishek 30
1 Ram 28
2 Kartik 3

[ ]: v.loc[0]

[ ]: Name Abhishek
Age 30
Name: 0, dtype: object

[ ]: v.iloc[0]

[ ]: Name Abhishek
Age 30
Name: 0, dtype: object

[ ]: data = {"Name": ['Abhishek', 'Suman', 'Kartik'],


"Age":[30, 28, 3],
"Gender":['M','F','M']}
v= pd.DataFrame(data, index= [3, 4, 5])
print(v)

Name Age Gender


3 Abhishek 30 M
4 Suman 28 F
5 Kartik 3 M

[ ]: print(v.loc[3])

Name Abhishek
Age 30
Gender M
Name: 3, dtype: object

[ ]: v.iloc[3]
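
With the index set to `[3, 4, 5]`, `loc` and `iloc` no longer agree: `loc` looks up the label 3, while `iloc` counts positions from 0. The sketch below rebuilds the same DataFrame and shows that `v.iloc[3]` is out of bounds for a three-row frame; the `try`/`except` wrapper and the variable names are illustrative additions.

```python
import pandas as pd

v = pd.DataFrame({"Name": ['Abhishek', 'Suman', 'Kartik'],
                  "Age": [30, 28, 3],
                  "Gender": ['M', 'F', 'M']}, index=[3, 4, 5])

by_label = v.loc[3]      # row whose index label is 3
by_position = v.iloc[0]  # first row, whatever its label is

try:
    v.iloc[3]            # valid positions are 0, 1 and 2
    out_of_bounds = False
except IndexError as err:
    out_of_bounds = True
    print('IndexError:', err)
```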

[ ]: v.loc[(v.Age>=10) & (v.Gender =='M')]

[ ]: Name Age Gender


3 Abhishek 30 M

[ ]: data = {"Name": ['Abhishek', 'Ram', 'Kartik'],


"Age":[30, 28, 3]}
v= pd.DataFrame(data)
print(v)

Name Age
0 Abhishek 30
1 Ram 28
2 Kartik 3

[ ]: v.iloc[[0,1]]

[ ]: Name Age
0 Abhishek 30
1 Ram 28

[ ]: v.iloc[[0,2]]

[ ]: Name Age
0 Abhishek 30
2 Kartik 3

[ ]: v.iloc[0:3]

[ ]: Name Age
0 Abhishek 30
1 Ram 28
2 Kartik 3

[ ]: data = {"Name": ['Abhishek', 'Ram', 'Kartik'],


"Age":[30, 28, 3],
"Gender":['M','F','M']}
v= pd.DataFrame(data, index = ['a','b','c'])
print(v)

Name Age Gender


a Abhishek 30 M
b Ram 28 F
c Kartik 3 M

[ ]: v.loc[['a','b'],['Name', 'Age']]

[ ]: Name Age
a Abhishek 30
b Ram 28

[ ]: v.shape

[ ]: (3, 3)

[ ]: df1=pd.DataFrame({'A': ['A0','A1', 'A2', 'A3'],


'B': ['B0','B1', 'B2', 'B3'],
'C': ['C0','C1', 'C2', 'C3'],
'D': ['D0','D1', 'D2', 'D3']}, index = [0, 1, 2 ,3])
df2=pd.DataFrame({'A': ['A4','A5', 'A6', 'A7'],
'B': ['B4','B5', 'B6', 'B7'],
'C': ['C4','C5', 'C6', 'C7'],
'D': ['D4','D5', 'D6', 'D7']}, index = [4, 5, 6 ,7])

[ ]: pd.concat([df1,df2], axis=0)

[ ]: pd.concat([df1,df2], axis=1)
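
Neither `concat` call above has recorded output. A reduced sketch with two-row frames (shortened versions of `df1` and `df2`, chosen here for brevity) shows the difference: `axis=0` stacks rows, while `axis=1` places the frames side by side and aligns rows on the index, so frames with disjoint indexes are padded with `NaN`.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=[0, 1])
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']}, index=[2, 3])

stacked = pd.concat([df1, df2], axis=0)       # rows appended
side_by_side = pd.concat([df1, df2], axis=1)  # columns appended, rows aligned on index

# Giving df2 the same index as df1 removes the NaN padding.
aligned = pd.concat([df1, df2.reset_index(drop=True)], axis=1)

print(stacked.shape, side_by_side.shape, aligned.shape)
```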

pd.merge(left, right, how='inner' or 'outer', on='key')

[ ]: left =pd.DataFrame({'KEY': ['A0','A1', 'A2', 'A3'],
'B': ['B0','B1', 'B2', 'B3'],
'C': ['C0','C1', 'C2', 'C3']})
right =pd.DataFrame({'KEY': ['A0','A1', 'A2', 'A4'],
'D': ['B0','B1', 'B6', 'B3'],
'E': ['C0','C5', 'C2', 'C3']})

[ ]: pd.merge(left, right, how='inner', on='KEY')

[ ]: KEY B C D E
0 A0 B0 C0 B0 C0
1 A1 B1 C1 B1 C5
2 A2 B2 C2 B6 C2

[ ]: pd.merge(left, right, how='outer', on='KEY')

[ ]: KEY B C D E
0 A0 B0 C0 B0 C0
1 A1 B1 C1 B1 C5
2 A2 B2 C2 B6 C2
3 A3 B3 C3 NaN NaN
4 A4 NaN NaN B3 C3
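
Besides `'inner'` and `'outer'`, `pd.merge` also accepts `how='left'` and `how='right'`. The sketch below uses trimmed versions of `left` and `right` (only one value column each, an assumption made for brevity) and adds `indicator=True`, which is not used in the notebook, to show where each row of an outer merge came from.

```python
import pandas as pd

left = pd.DataFrame({'KEY': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'KEY': ['A0', 'A1', 'A2', 'A4'],
                      'D': ['B0', 'B1', 'B6', 'B3']})

# how='left' keeps every row of `left`, matched or not.
left_join = pd.merge(left, right, how='left', on='KEY')

# indicator=True adds a `_merge` column naming the source of each row.
outer = pd.merge(left, right, how='outer', on='KEY', indicator=True)
origin = dict(zip(outer['KEY'], outer['_merge'].astype(str)))
print(origin)
```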

[ ]: left=pd.DataFrame({
'B': ['B0','B1', 'B2', 'B3'],
'C': ['C0','C1', 'C2', 'C3'],
}, index = ['A0','A1', 'A2' ,'A3'])
right=pd.DataFrame({
'D': ['D4','D5', 'D6', 'D7'],
'E' : ['E4', 'E5', 'E6', 'E7']}, index = ['A0','A1', 'A2','A4'])

[ ]: print(left)

B C
A0 B0 C0
A1 B1 C1
A2 B2 C2
A3 B3 C3

[ ]: left.join(right)

[ ]: B C D E
A0 B0 C0 D4 E4
A1 B1 C1 D5 E5
A2 B2 C2 D6 E6
A3 B3 C3 NaN NaN

[ ]: right.join(left)

[ ]: D E B C
A0 D4 E4 B0 C0
A1 D5 E5 B1 C1
A2 D6 E6 B2 C2
A4 D7 E7 NaN NaN

[ ]: left.join(right, how ='outer')

[ ]: B C D E
A0 B0 C0 D4 E4
A1 B1 C1 D5 E5
A2 B2 C2 D6 E6
A3 B3 C3 NaN NaN
A4 NaN NaN D7 E7

[ ]: data=pd.DataFrame({'B': [1,2,1,3], 'C':[2,1,2,4]}, index = ['A0', 'A1','A2', 'A3'])

[ ]: data

[ ]: B C
A0 1 2
A1 2 1
A2 1 2
A3 3 4

[ ]: data['C'].unique()

[ ]: array([2, 1, 4])

[ ]: data['B'].nunique()

[ ]: 3

[ ]: data['C'].value_counts()

[ ]: 2 2
1 1
4 1
Name: C, dtype: int64

[ ]: data['B'].apply(lambda x: x*x)

[ ]: A0 1
A1 4
A2 1
A3 9
Name: B, dtype: int64

[ ]: data.columns

[ ]: Index(['B', 'C'], dtype='object')

[ ]: data.index

[ ]: Index(['A0', 'A1', 'A2', 'A3'], dtype='object')

[ ]: data.sort_values('B')

[ ]: B C
A0 1 2
A2 1 2
A1 2 1
A3 3 4

3.1 Where is the Pandas Codebase?


The source code for Pandas is hosted in this GitHub repository: https://github.com/pandas-dev/pandas
Latest version at the time of writing (May 2021): 1.2.4
Release date: Apr 12, 2021.

[ ]: import pandas as pd

[ ]: data = pd.read_csv('/content/SAMPLE.csv')
print(data)

[ ]: data = pd.read_csv('/content/SAMPLE.csv')
print(data.to_string())

[ ]: d = pd.read_csv('/content/MODIFIED MID SEMESTER EXAMINATION MARKS.csv')


print(d.to_string())

[ ]: len(data)

[ ]: 12

[ ]: nd=pd.read_json('/content/sample_data/anscombe.json')
print(nd.to_string())

[ ]: pd.read_excel('/content/MID SEMESTER EXAMINATION.xlsx')

[ ]: data={"Name": ['Abhishek', 'Pawan','Kartik'],


"age": [31, 27, 3]}
v=pd.DataFrame(data)
print(v)

Name age
0 Abhishek 31
1 Pawan 27
2 Kartik 3

[ ]: v.to_csv('WRITECSV')

[ ]: view=pd.read_csv('/content/WRITECSV')
print(view)

[ ]: v.to_csv('WRITECSV', index=False)

[ ]: view=pd.read_csv('/content/WRITECSV')
view

[ ]: v.to_excel('WRITEEXCEL.xlsx', index=False)

[ ]: view=pd.read_excel('/content/WRITEEXCEL.xlsx')
view

[ ]: data = pd.read_csv('/content/SAMPLE.csv')
print(data.head(6))

[ ]: data = pd.read_csv('/content/SAMPLE.csv')
print(data.tail(6))

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
print(data.to_string())

    S. No.     ROLL. NO.              STUDENT_NAME  DATE OF EXAM  MARKS  PERCENTAGE
 0       1  0901IT181001           AADITYA KHANTAL    25-05-2021   15.0        75.0
 1       2  0901IT181002              ADITYA JOSHI    05-25-2021    NaN         NaN
 2       3  0901IT181003                 AJAY GARG     25-May-21   16.0        80.0
 3       4  0901IT181004           AKASH KACHHAWAY    25-05-2021   17.0        85.0
 4       5  0901IT181005          AKSHAT KOTHAVADE    25-05-2021   12.0        60.0
 5       6  0901IT181006  ALAKH NIRANJAN THAKURIYA    05-25-2021   11.0        55.0
 6       7  0901IT181007                ALOK KUMAR    05-25-2021   13.0        65.0
 7       8  0901IT181008                AMAN DIXIT    25-05-2021    NaN         NaN
 8       9  0901IT181009              AMIT BAMNIYA    25-05-2021   12.0        60.0
 9      10  0901IT181010               ANKIT KUMAR    25-05-2021   12.5        62.5
10      11  0901IT181011          ANKIT RAJ TIRKEY     25-May-21   13.0        65.0
11      12  0901IT181011          ANKIT RAJ TIRKEY     25-May-21   13.0        65.0

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
newdata=data.dropna()
print(newdata.to_string())

    S. No.     ROLL. NO.              STUDENT_NAME  DATE OF EXAM  MARKS  PERCENTAGE
 0       1  0901IT181001           AADITYA KHANTAL    25-05-2021   15.0        75.0
 2       3  0901IT181003                 AJAY GARG     25-May-21   16.0        80.0
 3       4  0901IT181004           AKASH KACHHAWAY    25-05-2021   17.0        85.0
 4       5  0901IT181005          AKSHAT KOTHAVADE    25-05-2021   12.0        60.0
 5       6  0901IT181006  ALAKH NIRANJAN THAKURIYA    05-25-2021   11.0        55.0
 6       7  0901IT181007                ALOK KUMAR    05-25-2021   13.0        65.0
 8       9  0901IT181009              AMIT BAMNIYA    25-05-2021   12.0        60.0
 9      10  0901IT181010               ANKIT KUMAR    25-05-2021   12.5        62.5
10      11  0901IT181011          ANKIT RAJ TIRKEY     25-May-21   13.0        65.0
11      12  0901IT181011          ANKIT RAJ TIRKEY     25-May-21   13.0        65.0

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
newdata=data.dropna(axis=1)
print(newdata.to_string())
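
The output of `dropna(axis=1)` is not recorded above. As a sketch on a small stand-in frame (hypothetical data, since the CSV file itself is not available here): `dropna()` removes rows that contain a missing value, whereas `dropna(axis=1)` removes whole columns that contain one. Both return a new DataFrame and leave the original untouched.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: MARKS has a gap, NAME does not.
df = pd.DataFrame({'NAME': ['A', 'B', 'C'],
                   'MARKS': [15.0, float('nan'), 16.0]})

rows_kept = df.dropna()        # drops row 1
cols_kept = df.dropna(axis=1)  # drops the MARKS column

print(rows_kept)
print(cols_kept)
```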

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
data.fillna(12)

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
data['MARKS'].fillna(12)

[ ]: 0 15.0
1 12.0
2 16.0
3 17.0
4 12.0
5 11.0
6 13.0
7 12.0
8 12.0
9 12.5
10 13.0
11 13.0
Name: MARKS, dtype: float64

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
data['MARKS'].fillna(12, inplace=True)
data

[ ]: S. No. ROLL. NO. … MARKS PERCENTAGE


0 1 0901IT181001 … 15.0 75.0
1 2 0901IT181002 … 12.0 NaN
2 3 0901IT181003 … 16.0 80.0
3 4 0901IT181004 … 17.0 85.0
4 5 0901IT181005 … 12.0 60.0
5 6 0901IT181006 … 11.0 55.0
6 7 0901IT181007 … 13.0 65.0
7 8 0901IT181008 … 12.0 NaN
8 9 0901IT181009 … 12.0 60.0
9 10 0901IT181010 … 12.5 62.5
10 11 0901IT181011 … 13.0 65.0
11 12 0901IT181011 … 13.0 65.0

[12 rows x 6 columns]

[ ]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
data = data.rename(columns= {'STUDENT_NAME':'NAME_STUDENT'})
print(data.to_string())

[ ]: data =pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
x = data["MARKS"].mean()
data["MARKS"].fillna(x, inplace=True)
data

[ ]: S. No. ROLL. NO. … MARKS PERCENTAGE


0 1 0901IT181001 … 15.00 75.0
1 2 0901IT181002 … 13.45 NaN
2 3 0901IT181003 … 16.00 80.0
3 4 0901IT181004 … 17.00 85.0
4 5 0901IT181005 … 12.00 60.0
5 6 0901IT181006 … 11.00 55.0
6 7 0901IT181007 … 13.00 65.0
7 8 0901IT181008 … 13.45 NaN
8 9 0901IT181009 … 12.00 60.0
9 10 0901IT181010 … 12.50 62.5
10 11 0901IT181011 … 13.00 65.0
11 12 0901IT181011 … 13.00 65.0

[12 rows x 6 columns]
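
Calling `fillna(..., inplace=True)` on a selected column works in the pandas version used in this notebook, but newer pandas releases may warn about this pattern because the selected column can be a temporary copy. A minimal sketch of the assignment form, which avoids relying on `inplace`, follows; the MARKS values are copied from the table above and the frame is rebuilt inline as an assumption, since the CSV is not available here.

```python
import pandas as pd

# MARKS values copied from the table above; None marks the missing entries.
data = pd.DataFrame({'MARKS': [15.0, None, 16.0, 17.0, 12.0, 11.0,
                               13.0, None, 12.0, 12.5, 13.0, 13.0]})

mean_marks = data['MARKS'].mean()  # missing values are skipped
data['MARKS'] = data['MARKS'].fillna(mean_marks)

print(round(mean_marks, 2))
print(data['MARKS'].tolist())
```

The same assignment form works with `median()` or `mode()[0]` in place of `mean()`.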

[ ]: data =pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
x = data["MARKS"].median()
data["MARKS"].fillna(x, inplace=True)
data

[ ]: S. No. ROLL. NO. … MARKS PERCENTAGE


0 1 0901IT181001 … 15.0 75.0
1 2 0901IT181002 … 13.0 NaN
2 3 0901IT181003 … 16.0 80.0
3 4 0901IT181004 … 17.0 85.0
4 5 0901IT181005 … 12.0 60.0
5 6 0901IT181006 … 11.0 55.0
6 7 0901IT181007 … 13.0 65.0
7 8 0901IT181008 … 13.0 NaN
8 9 0901IT181009 … 12.0 60.0
9 10 0901IT181010 … 12.5 62.5
10 11 0901IT181011 … 13.0 65.0
11 12 0901IT181011 … 13.0 65.0

[12 rows x 6 columns]

[ ]: data =pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')
x = data["MARKS"].mode()[0]
data["MARKS"].fillna(x, inplace=True)
data

[ ]: data.describe()

[ ]: data.info()

[2]: import pandas as pd

[6]: data =pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')


print(data.to_string())

ROLL. NO. STUDENT_NAME DATE OF EXAM MARKS PERCENTAGE


0 0901IT181001 AADITYA KHANTAL 25-05-2021 15.0 75.0
1 0901IT181002 ADITYA JOSHI 05-25-2021 NaN NaN
2 0901IT181003 AJAY GARG 25-May-21 16.0 80.0
3 0901IT181004 AKASH KACHHAWAY 25-05-2021 17.0 85.0
4 0901IT181005 AKSHAT KOTHAVADE 25-05-2021 12.0 60.0
5 0901IT181006 ALAKH NIRANJAN THAKURIYA 05-25-2021 11.0 55.0
6 0901IT181007 ALOK KUMAR 05-25-2021 13.0 65.0
7 0901IT181008 AMAN DIXIT 25-05-2021 NaN NaN
8 0901IT181009 AMIT BAMNIYA 25-05-2021 12.0 60.0
9 0901IT181010 ANKIT KUMAR 25-05-2021 12.5 62.5
10 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0
11 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0

[5]: data['DATE OF EXAM']=pd.to_datetime(data['DATE OF EXAM'])


print(data.to_string())

ROLL. NO. STUDENT_NAME DATE OF EXAM MARKS PERCENTAGE


0 0901IT181001 AADITYA KHANTAL 2021-05-25 15.0 75.0
1 0901IT181002 ADITYA JOSHI 2021-05-25 NaN NaN
2 0901IT181003 AJAY GARG 2021-05-25 16.0 80.0
3 0901IT181004 AKASH KACHHAWAY 2021-05-25 17.0 85.0
4 0901IT181005 AKSHAT KOTHAVADE 2021-05-25 12.0 60.0
5 0901IT181006 ALAKH NIRANJAN THAKURIYA 2021-05-25 11.0 55.0
6 0901IT181007 ALOK KUMAR 2021-05-25 13.0 65.0
7 0901IT181008 AMAN DIXIT 2021-05-25 NaN NaN
8 0901IT181009 AMIT BAMNIYA 2021-05-25 12.0 60.0
9 0901IT181010 ANKIT KUMAR 2021-05-25 12.5 62.5
10 0901IT181011 ANKIT RAJ TIRKEY 2021-05-25 13.0 65.0
11 0901IT181011 ANKIT RAJ TIRKEY 2021-05-25 13.0 65.0

[ ]: data['DATE OF EXAM'].apply(lambda x:pd.to_datetime(x).strftime('%m-%d-%y'))
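
The DATE OF EXAM column mixes three spellings of the same date. The conversion above relies on pandas guessing the format of each value, and newer pandas releases may reject or misread such mixed input. One possible alternative, sketched here on an inline Series rather than the CSV, is to try each known format explicitly and keep the first one that parses. The order of the formats is an assumption: a value such as `05-06-2021` would be read day-first because that format is tried first.

```python
import pandas as pd

# The three date spellings that appear in the DATE OF EXAM column.
raw = pd.Series(['25-05-2021', '05-25-2021', '25-May-21'])

# Try one explicit format at a time; errors='coerce' turns mismatches into NaT.
parsed = pd.to_datetime(raw, format='%d-%m-%Y', errors='coerce')
parsed = parsed.fillna(pd.to_datetime(raw, format='%m-%d-%Y', errors='coerce'))
parsed = parsed.fillna(pd.to_datetime(raw, format='%d-%b-%y', errors='coerce'))

# Once the column is datetime, .dt.strftime formats it without a lambda.
formatted = parsed.dt.strftime('%m-%d-%y')
print(formatted.tolist())
```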

[12]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')


data['MARKS'].fillna(21, inplace=True)
data

[12]: ROLL. NO. STUDENT_NAME DATE OF EXAM MARKS PERCENTAGE


0 0901IT181001 AADITYA KHANTAL 25-05-2021 15.0 75.0
1 0901IT181002 ADITYA JOSHI 05-25-2021 21.0 NaN
2 0901IT181003 AJAY GARG 25-May-21 16.0 80.0
3 0901IT181004 AKASH KACHHAWAY 25-05-2021 17.0 85.0
4 0901IT181005 AKSHAT KOTHAVADE 25-05-2021 12.0 60.0
5 0901IT181006 ALAKH NIRANJAN THAKURIYA 05-25-2021 11.0 55.0
6 0901IT181007 ALOK KUMAR 05-25-2021 13.0 65.0
7 0901IT181008 AMAN DIXIT 25-05-2021 21.0 NaN
8 0901IT181009 AMIT BAMNIYA 25-05-2021 12.0 60.0
9 0901IT181010 ANKIT KUMAR 25-05-2021 12.5 62.5
10 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0
11 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0

[15]: for x in data.index:
          if data.loc[x,'MARKS']>20:
              data.loc[x, 'MARKS'] = 12
      print(data.to_string())

ROLL. NO. STUDENT_NAME DATE OF EXAM MARKS PERCENTAGE


0 0901IT181001 AADITYA KHANTAL 25-05-2021 15.0 75.0
1 0901IT181002 ADITYA JOSHI 05-25-2021 12.0 NaN
2 0901IT181003 AJAY GARG 25-May-21 16.0 80.0
3 0901IT181004 AKASH KACHHAWAY 25-05-2021 17.0 85.0
4 0901IT181005 AKSHAT KOTHAVADE 25-05-2021 12.0 60.0
5 0901IT181006 ALAKH NIRANJAN THAKURIYA 05-25-2021 11.0 55.0
6 0901IT181007 ALOK KUMAR 05-25-2021 13.0 65.0
7 0901IT181008 AMAN DIXIT 25-05-2021 12.0 NaN
8 0901IT181009 AMIT BAMNIYA 25-05-2021 12.0 60.0
9 0901IT181010 ANKIT KUMAR 25-05-2021 12.5 62.5
10 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0
11 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0
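
The row-by-row loop above can also be written as a single assignment with a boolean mask. This is a sketch on hypothetical MARKS values, not the notebook's CSV; it gives the same result as the loop (every mark above 20 is replaced with 12) without iterating over the index.

```python
import pandas as pd

data = pd.DataFrame({'MARKS': [15.0, 21.0, 16.0, 21.0]})  # hypothetical values

# The mask selects the offending rows; loc assigns to all of them at once.
data.loc[data['MARKS'] > 20, 'MARKS'] = 12

print(data['MARKS'].tolist())
```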

[ ]: print(data.duplicated())

[ ]: data.drop_duplicates()
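
As a sketch of the two calls above, on a three-row stand-in with the repeated roll number from the table: `duplicated()` returns a boolean Series that is `True` for a row repeating an earlier one, and `drop_duplicates()` returns a new DataFrame, so the result has to be assigned if it should be kept.

```python
import pandas as pd

data = pd.DataFrame({'ROLL. NO.': ['0901IT181010', '0901IT181011', '0901IT181011'],
                     'MARKS': [12.5, 13.0, 13.0]})

flags = data.duplicated()         # True for a row that repeats an earlier one
deduped = data.drop_duplicates()  # new frame; `data` itself is unchanged

print(flags.tolist())
print(len(data), len(deduped))
```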

[ ]: data.corr()
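
`corr()` is only meaningful for numeric columns. The pandas version used in this notebook skips text columns silently, while newer releases may raise an error instead, so selecting the numeric columns explicitly is a safer habit. The sketch below uses three rows taken from the table above; because PERCENTAGE is MARKS multiplied by 5 in that data, the correlation comes out as 1.

```python
import pandas as pd

data = pd.DataFrame({'STUDENT_NAME': ['AADITYA KHANTAL', 'AJAY GARG', 'AKASH KACHHAWAY'],
                     'MARKS': [15.0, 16.0, 17.0],
                     'PERCENTAGE': [75.0, 80.0, 85.0]})

# Restrict the frame to numeric columns before computing correlations.
corr = data[['MARKS', 'PERCENTAGE']].corr()
print(corr)
```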

[19]: import matplotlib.pyplot as plt

[20]: data =pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')


print(data.to_string())

ROLL. NO. STUDENT_NAME DATE OF EXAM MARKS PERCENTAGE


0 0901IT181001 AADITYA KHANTAL 25-05-2021 15.0 75.0
1 0901IT181002 ADITYA JOSHI 05-25-2021 NaN NaN
2 0901IT181003 AJAY GARG 25-May-21 16.0 80.0
3 0901IT181004 AKASH KACHHAWAY 25-05-2021 17.0 85.0
4 0901IT181005 AKSHAT KOTHAVADE 25-05-2021 12.0 60.0
5 0901IT181006 ALAKH NIRANJAN THAKURIYA 05-25-2021 11.0 55.0
6 0901IT181007 ALOK KUMAR 05-25-2021 13.0 65.0
7 0901IT181008 AMAN DIXIT 25-05-2021 NaN NaN
8 0901IT181009 AMIT BAMNIYA 25-05-2021 12.0 60.0
9 0901IT181010 ANKIT KUMAR 25-05-2021 12.5 62.5
10 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0
11 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0

[21]: data.plot()
plt.show()

[22]: data = pd.read_csv('/content/SAMPLE_FOR_CLEANING.csv')


data['MARKS'].fillna(12, inplace=True)
data['PERCENTAGE'].fillna(60, inplace=True)
data

[22]: ROLL. NO. STUDENT_NAME DATE OF EXAM MARKS PERCENTAGE


0 0901IT181001 AADITYA KHANTAL 25-05-2021 15.0 75.0
1 0901IT181002 ADITYA JOSHI 05-25-2021 12.0 60.0
2 0901IT181003 AJAY GARG 25-May-21 16.0 80.0
3 0901IT181004 AKASH KACHHAWAY 25-05-2021 17.0 85.0
4 0901IT181005 AKSHAT KOTHAVADE 25-05-2021 12.0 60.0
5 0901IT181006 ALAKH NIRANJAN THAKURIYA 05-25-2021 11.0 55.0
6 0901IT181007 ALOK KUMAR 05-25-2021 13.0 65.0
7 0901IT181008 AMAN DIXIT 25-05-2021 12.0 60.0
8 0901IT181009 AMIT BAMNIYA 25-05-2021 12.0 60.0
9 0901IT181010 ANKIT KUMAR 25-05-2021 12.5 62.5
10 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0
11 0901IT181011 ANKIT RAJ TIRKEY 25-May-21 13.0 65.0

[ ]: data.plot()
plt.show()

[24]: data.plot(kind='bar')
plt.show()
