A PROJECT REPORT ON
INDIAN
CRICKET
ANALYSIS
SUBJECT: INFORMATICS PRACTICES [065]
ACADEMIC YEAR: 2022 - 23
BOARD ROLL NUMBER:
CLASS: XII – HUMANITIES
CERTIFICATE
This is to certify that Suhani Kashyap, Roll no. ……….., of
class 12 section E has prepared the project file as per the
prescribed syllabus of
INFORMATICS PRACTICES CLASS XII
(C.B.S.E.)
under my supervision. I am completely satisfied with her
performance.
I wish her all the success in life.
External examiner’s sign                Teacher’s sign
                                        (Miss Mamta Choudhary)
INDEX
Sr. No.   TITLE
1         ACKNOWLEDGEMENT
2         PROJECT ANALYSIS
3         MODULES
4         FUNCTIONS
5         DETAILED DESCRIPTION OF PROJECT
6         OPTIONS PROVIDED
7         SOURCE CODE
8         SCREENSHOTS
9         BIBLIOGRAPHY
ACKNOWLEDGMENT
IT IS WITH PLEASURE THAT I ACKNOWLEDGE
MY SINCERE GRATITUDE TO OUR TEACHER,
Miss Mamta Choudhary, WHO UNDERTOOK THE
RESPONSIBILITY OF TEACHING THE SUBJECT
INFORMATICS PRACTICES. I HAVE BENEFITED
GREATLY FROM HER CLASSES.
I AM ESPECIALLY FILLED WITH GRATITUDE
TO OUR PRINCIPAL, Dr. Racchna Saddi, WHO HAS
ALWAYS BEEN A SOURCE OF
ENCOURAGEMENT AND SUPPORT AND
WITHOUT WHOSE INSPIRATION THIS PROJECT
WOULD NOT HAVE BEEN A SUCCESS. I
WOULD LIKE TO PLACE ON RECORD MY
HEARTFELT THANKS TO HER.
I WOULD ALSO LIKE TO THANK MY FAMILY
FOR ENCOURAGING AND SUPPORTING ME
DURING MY ACADEMICS AND PROVIDING ME
WITH SUCH FACILITIES TO MAKE MY FUTURE
BRIGHTER.
FINALLY, I WOULD LIKE TO EXPRESS MY
SINCERE APPRECIATION TO ALL THE OTHER
STUDENTS OF MY BATCH FOR THEIR FRIENDSHIP
AND THE FINE TIMES THAT WE ALL SHARED
TOGETHER.
PROJECT ANALYSIS
THE INDIAN CRICKET ANALYSIS PROJECT
IS DESIGNED TO ANALYSE CRICKET DATA
USING THE CONCEPTS OF PYTHON, ITS
INTERFACE WITH MATPLOTLIB, AND FILE
HANDLING. THIS PROJECT HELPS
TO ENTER THE RECORDS IN DIFFERENT
DATA STRUCTURES AND STORE ALL
RECORDS PERMANENTLY IN FILES. ALSO,
DATA ANALYSIS IS DONE USING DIFFERENT
TYPES OF CHARTS. THE ADMIN CAN CREATE
NEW DATA STRUCTURES AS PER
REQUIREMENTS AND CAN ALSO ADD,
UPDATE, DELETE OR SEARCH FOR A
PARTICULAR RECORD IN EXISTING DATA
STRUCTURES. THE PROJECT DATABASE
CONSISTS OF DATA FIELDS SUCH AS score,
runs_scored, balls_faced, strike_rate, fours, sixes,
opposition, ground, date. AN INDIVIDUAL
PASSWORD IS GIVEN TO EACH
ADMIN SO THAT THEY CAN ENTER THE
SYSTEM AND PERFORM THE REQUIRED
OPERATIONS. THE USER IS PROVIDED WITH
THE OPTION TO ENTER DIFFERENT
DATA STRUCTURES AND WORK WITH
THEM. IF ANY INCORRECT DATA IS
ENTERED, APPROPRIATE
VALIDATION TAKES PLACE. THE
PROGRAM CAN BE TERMINATED WHENEVER
THE USER WANTS BY SELECTING THE EXIT
OPTION.
MODULES
1) import datetime
It provides classes for working with dates and
times. It is used here to validate dates entered
as DD-MM-YYYY.
2) import subprocess
It allows you to start new processes and connect
to them.
3) import os
It provides functions to interact with the
operating system.
4) import matplotlib.pyplot
It imports only the pyplot module, which is used
for graphical representation of data.
5) import pandas
It provides data structures and data analysis
tools for the Python programming language.
6) import numpy
It is the fundamental package for scientific
computing with Python and contains a
powerful N-dimensional array object.
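As a minimal sketch of how pandas and numpy can work together on this kind of data, the example below flags not-out innings in a score column. The three sample scores are made up for illustration and are not taken from the project's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample scores; a trailing '*' marks a not-out innings.
df = pd.DataFrame({'score': ['45*', '12', '100*']})

# np.where(condition, value_if_true, value_if_false) works element-wise.
df['not_out'] = np.where(df['score'].str.endswith('*'), 1, 0)

print(df)
```

The same idea appears in the analysis script, where the not_out column is later summed to count not-out innings.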
FUNCTIONS
1) to_datetime() :
The pandas to_datetime() function is used to
convert its argument to datetime. E.g. a date
stored as the text 12-12-2022 is parsed into a
datetime value.
2) subprocess.run() :
The subprocess module allows you to spawn
new processes, connect to their
input/output/error pipes, and obtain their return
codes. This module is intended to replace several
older modules and functions.
3) exit() :
This function terminates the Python program.
The value supplied as an argument to exit() is
returned to the operating system as the
program’s return code or exit code.
4) os.system() :
This function executes the command in a
subshell.
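The functions listed above can be tried out with a short sketch like the one below. It is only an illustration under stated assumptions: the exit code 3 is an arbitrary value, and the child process calls sys.exit(), which behaves like exit() for this purpose.

```python
import subprocess
import sys

import pandas as pd

# Parse a DD-MM-YYYY string; the explicit format avoids day/month ambiguity.
parsed = pd.to_datetime('12-12-2022', format='%d-%m-%Y')
print(parsed.year, parsed.month, parsed.day)

# Run a child Python process that exits with code 3 (an arbitrary value
# chosen for this sketch) and read that code back from the result.
result = subprocess.run([sys.executable, '-c', 'import sys; sys.exit(3)'])
print('return code:', result.returncode)
```

Passing the command as a list that starts with sys.executable runs the child with the same Python interpreter and does not depend on a program called python being on the system path.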
DETAILED DESCRIPTION
OF THE PROJECT
There is one Data Frame, named the Data table.
Its data is saved permanently in a CSV file
(data.csv) and is analysed using different charts.
DATAFRAMES:
1) Data TABLE :
COLUMNS:
score, runs_scored, balls_faced, strike_rate, fours,
sixes, opposition, ground, date.
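A header-only CSV file with these columns could be created as in the sketch below. It writes to a temporary folder (an assumption made here so that a real data.csv is not overwritten); the project itself writes data.csv in the working directory.

```python
import os
import tempfile

import pandas as pd

COLUMNS = ['score', 'runs_scored', 'balls_faced', 'strike_rate',
           'fours', 'sixes', 'opposition', 'ground', 'date']

# Temporary location used only for this sketch.
csv_path = os.path.join(tempfile.mkdtemp(), 'data.csv')

# An empty DataFrame with named columns writes a header-only CSV file.
pd.DataFrame(columns=COLUMNS).to_csv(csv_path, index=False)

# Reading the file back gives the same columns and no rows.
data = pd.read_csv(csv_path)
print(list(data.columns), len(data))
```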
OPTIONS PROVIDED
In this menu-driven Python program, the following
options are provided.
There are two different interfaces to work in:
1) Admin Interface
2) User Interface
Options in admin interface:
To create Data Frames
To add records
To update records
To search records
To display all records
To delete records
To analyse Indian cricketer’s data
Exit
Options in user interface:
Display all records
Search records
Analyse data
Exit
CODE WITH SCREENSHOTS
1. Starting screen :
import os
import subprocess
import sys

os.system("cls")

def user_input_int(label):
    # Keep asking until the user types a whole number
    while True:
        value = input(label).strip()
        if value.isdecimal():
            return int(value)
        else:
            print("Invalid Input[Required 0-9] ")

def user_input(label):
    # Keep asking until the input starts with a letter or a digit
    while True:
        value = input(label).strip()
        if value[:1].isalnum():
            return value
        else:
            print("Invalid Input[Required Aa0-Za9] ")

while True:
    os.system('cls')
    print("="*10,"WELCOME TO INDIAN CRICKET ANALYSIS","="*20)
    print("="*80)
    print("="*10,"SELECT THE INTERFACE IN WHICH YOU WANT TO WORK","="*20)
    print("\n")
    print("\t 1.ADMIN INTERFACE")
    print("\t 2.USER INTERFACE")
    print("\t 3.Exit Application")
    ch=user_input_int("ENTER YOUR CHOICE:")
    if ch==1:
        print("="*10,"WELCOME TO LOGIN SCREEN FOR ADMIN","="*20)
        uname=input("\tENTER USERID:")
        ps=input("\tENTER PASSWORD:")
        if uname=='admin' and ps=="admin":
            # Run the admin script with the same Python interpreter
            subprocess.run([sys.executable, 'adminADD.py'])
        else:
            print("INCORRECT USERNAME AND PASSWORD")
            input('Press enter to continue')
    elif ch==2:
        print("="*10,"WELCOME TO LOGIN SCREEN FOR USER","="*20)
        uname=user_input("ENTER USERID:")
        ps=input("ENTER PASSWORD:")
        if uname=='user' and ps=="user":
            # Run the user script with the same Python interpreter
            subprocess.run([sys.executable, 'userFunction.py'])
        else:
            print("INCORRECT USERNAME AND PASSWORD")
            input('Press enter to continue')
    elif ch==3:
        exit()
    else:
        os.system("cls")
        print("="*10,"ERROR !!!","="*20)
        print("INVALID REQUEST")
        input('Press enter to continue')
Screen
Input screen for admin login
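When validating text input, note that a range comparison such as value >= 'a' and value <= 'z' compares whole strings in dictionary order, so it rejects inputs like Zimbabwe or 95. The sketch below contrasts that comparison with a check on the first character only. Both helper names are hypothetical and are used only for this illustration.

```python
def in_range_check(value):
    # Whole-string comparison against 'a'..'z', 'A'..'Z' and '0'..'9'.
    return (('a' <= value <= 'z') or ('A' <= value <= 'Z')
            or ('0' <= value <= '9'))

def first_char_check(value):
    # Accept any non-empty input that starts with a letter or a digit.
    return value[:1].isalnum()

for sample in ['India', 'Zimbabwe', '95', '']:
    print(repr(sample), in_range_check(sample), first_char_check(sample))
```

Slicing with value[:1] instead of indexing with value[0] keeps the check safe for an empty string, which is simply rejected.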
1. Admin Interface :
# adminADD.py
import datetime
import os
import subprocess
import sys

import pandas as pd

date_format = '%d-%m-%Y'
os.system('cls')

def user_date(label):
    while True:
        try:
            value = input(label).strip()
            # validating the date using strptime() function
            datetime.datetime.strptime(value, date_format)
            return value
        # If the date validation goes wrong
        except ValueError:
            # printing the appropriate text if ValueError occurs
            print("Incorrect data format, should be DD-MM-YYYY")

def user_input_int(label):
    # Keep asking until the user types a whole number
    while True:
        value = input(label).strip()
        if value.isdecimal():
            return int(value)
        else:
            print("Invalid Input[Required 0-9] ")

def user_input(label):
    # Keep asking until the input starts with a letter or a digit
    while True:
        value = input(label).strip()
        if value[:1].isalnum():
            return value
        else:
            print("Invalid Input[Required Aa0-Za9] ")

def AdminMenu():
    print('='*20,"WELCOME TO ADD RECORDS SCREEN",'='*20)
    print('SELECT APPROPRIATE OPTION TO WORK')
    print("1. Create New Blank CSV DATABASE")
    print("2. Add")
    print("3. EDIT")
    print("4. SEARCH")
    print("5. Display All Data")
    print("6. DELETE")
    print("7. Analysis")
    print("8. EXIT")

ch=0
while ch!=8:
    os.system('cls')
    AdminMenu()
    ch = user_input_int("ENTER YOUR CHOICE....")
    ############################Data.CSV###################################
    if ch==1:
        headerList=['score','runs_scored','balls_faced','strike_rate',
                    'fours','sixes','opposition','ground','date']
        # An empty DataFrame with these columns writes only the header row
        pd.DataFrame(columns=headerList).to_csv('data.csv',index=False)
        print("New Database Created SUCCESSFULLY!!")
        input('Press enter to continue')
    ##########################ADD#####################################
    elif ch==2:
        # Read every column as text so that typed values fit any column
        data=pd.read_csv('data.csv',dtype=str)
        score=user_input("ENTER score:")
        runs_scored=user_input("ENTER runs_scored:")
        balls_faced=user_input("ENTER balls_faced:")
        strike_rate=user_input("ENTER strike_rate:")
        fours=user_input("ENTER No of fours:")
        sixes=user_input("ENTER No Of Six:")
        opposition=user_input("ENTER opposition Team Name:")
        ground=user_input("Venue :")
        date=user_date("Date (DD-MM-YYYY):")
        # Append the record as a new row after the last existing record
        data.loc[len(data)]=[score,runs_scored,balls_faced,strike_rate,
                             fours,sixes,opposition,ground,date]
        data.to_csv('data.csv',index=False)
        os.system('cls')
        print("DATA HAS BEEN STORED SUCCESSFULLY ")
        input('Press enter to continue')
    ############################EDIT###################################
    elif ch==3:
        data=pd.read_csv('data.csv',dtype=str)
        print("SEE RECORDS")
        print(data)
        no=user_input_int('ENTER THE RECORD NUMBER TO BE MODIFIED')
        if no in data.index:
            score=user_input("ENTER score:")
            runs_scored=user_input("ENTER runs_scored:")
            balls_faced=user_input("ENTER balls_faced:")
            strike_rate=user_input("ENTER strike_rate:")
            fours=user_input("ENTER No of fours:")
            sixes=user_input("ENTER No Of Six:")
            opposition=user_input("ENTER opposition Team Name:")
            ground=user_input("Venue :")
            date=user_date("Date (DD-MM-YYYY):")
            # Overwrite the whole row of the chosen record
            data.loc[no]=[score,runs_scored,balls_faced,strike_rate,
                          fours,sixes,opposition,ground,date]
            data.to_csv('data.csv',index=False)
            os.system('cls')
            print("DATA HAS BEEN MODIFIED SUCCESSFULLY")
        else:
            print("INVALID RECORD NUMBER")
        input('Press enter to continue')
    ##############################Search####################################
    elif ch==4:
        data=pd.read_csv('data.csv',dtype=str)
        keyvalue=input("Search By Venue: ")
        # na=False skips blank venues; regex=False matches the typed text as it is
        contain_values = data[data['ground'].str.contains(keyvalue,na=False,regex=False)]
        if contain_values.empty:
            print("No record found")
        else:
            print("Record found")
            print(contain_values)
        input('Press enter to continue')
    ##############################view all#################################
    elif ch==5:
        data=pd.read_csv('data.csv')
        print("SEE RECORD")
        print(data)
        input('Press enter to continue')
    ##############################DELETE###################################
    elif ch==6:
        data=pd.read_csv('data.csv')
        print("SEE RECORD")
        print(data)
        no=user_input_int('ENTER THE RECORD NUMBER TO BE DELETED:')
        os.system('cls')
        if no in data.index:
            print('THE RECORD AT',no,'IS:')
            print(data.loc[no,:])
            confirm=input('ARE YOU SURE YOU WANT TO DELETE THIS RECORD PERMANENTLY?(Y/N):')
            if confirm.strip().upper()=='Y':
                med=data.drop(no)
                med.to_csv('data.csv',index=False)
                os.system('cls')
                print("RECORD HAS BEEN DELETED SUCCESSFULLY")
            else:
                os.system('cls')
                print("RECORD HAS NOT BEEN DELETED")
        else:
            print("INVALID RECORD NUMBER")
        input('Press enter to continue')
    ##############################Analysis#################################
    elif ch==7:
        # Run the analysis script with the same Python interpreter
        subprocess.run([sys.executable, 'AnalyzeCricket.py'])
    elif ch==8:
        break
    else:
        print("INVALID OPTION!!!")
        input('Press enter to continue')
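The add, update and delete steps of the admin menu can also be tried on an in-memory table without touching data.csv. The following is a minimal sketch under stated assumptions: both records are made up for illustration, and pd.concat is used here as one possible way to append a row.

```python
import pandas as pd

COLUMNS = ['score', 'runs_scored', 'balls_faced', 'strike_rate',
           'fours', 'sixes', 'opposition', 'ground', 'date']

# Hypothetical records used only for illustration.
first = ['45*', '45', '50', '90.0', '4', '1', 'Australia', 'Mumbai', '12-12-2022']
second = ['12', '12', '20', '60.0', '1', '0', 'England', 'Delhi', '15-12-2022']

data = pd.DataFrame([first], columns=COLUMNS)

# Add: concatenate a one-row DataFrame and renumber the records.
data = pd.concat([data, pd.DataFrame([second], columns=COLUMNS)],
                 ignore_index=True)
rows_after_add = len(data)

# Update: overwrite a single field of record number 0.
data.loc[0, 'ground'] = 'Chennai'
ground_after_update = data.loc[0, 'ground']

# Delete: drop record number 0 and renumber the remaining records.
data = data.drop(0).reset_index(drop=True)

print(rows_after_add, ground_after_update, len(data))
print(data)
```

Renumbering after a delete keeps the record numbers shown on screen continuous, which is what a CSV round trip with index=False also does.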
Creating Data Frame :
data table
######Data.CSV###############################
if ch==1:
    headerList=['score','runs_scored','balls_faced','strike_rate',
                'fours','sixes','opposition','ground','date']
    # An empty DataFrame with these columns writes only the header row
    pd.DataFrame(columns=headerList).to_csv('data.csv',index=False)
    print("New Database Created SUCCESSFULLY!!")
    input('Press enter to continue')
Adding data into Data Frame :
elif ch==2:
    # Read every column as text so that typed values fit any column
    data=pd.read_csv('data.csv',dtype=str)
    score=user_input("ENTER score:")
    runs_scored=user_input("ENTER runs_scored:")
    balls_faced=user_input("ENTER balls_faced:")
    strike_rate=user_input("ENTER strike_rate:")
    fours=user_input("ENTER No of fours:")
    sixes=user_input("ENTER No Of Six:")
    opposition=user_input("ENTER opposition Team Name:")
    ground=user_input("Venue :")
    date=user_date("Date (DD-MM-YYYY):")
    # Append the record as a new row after the last existing record
    data.loc[len(data)]=[score,runs_scored,balls_faced,strike_rate,
                         fours,sixes,opposition,ground,date]
    data.to_csv('data.csv',index=False)
    os.system('cls')
    print("DATA HAS BEEN STORED SUCCESSFULLY ")
    input('Press enter to continue')
Modifying data into Data Frame :
##########EDIT######################
elif ch==3:
    data=pd.read_csv('data.csv',dtype=str)
    print("SEE RECORDS")
    print(data)
    no=user_input_int('ENTER THE RECORD NUMBER TO BE MODIFIED')
    if no in data.index:
        score=user_input("ENTER score:")
        runs_scored=user_input("ENTER runs_scored:")
        balls_faced=user_input("ENTER balls_faced:")
        strike_rate=user_input("ENTER strike_rate:")
        fours=user_input("ENTER No of fours:")
        sixes=user_input("ENTER No Of Six:")
        opposition=user_input("ENTER opposition Team Name:")
        ground=user_input("Venue :")
        date=user_date("Date (DD-MM-YYYY):")
        # Overwrite the whole row of the chosen record
        data.loc[no]=[score,runs_scored,balls_faced,strike_rate,
                      fours,sixes,opposition,ground,date]
        data.to_csv('data.csv',index=False)
        print("DATA HAS BEEN MODIFIED SUCCESSFULLY")
    else:
        print("INVALID RECORD NUMBER")
    input('Press enter to continue')
After modify
Search by Venue
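elif ch==4:
    data=pd.read_csv('data.csv',dtype=str)
    keyvalue=input("Search By Venue: ")
    # na=False skips blank venues; regex=False matches the typed text as it is
    contain_values = data[data['ground'].str.contains(keyvalue,na=False,regex=False)]
    if contain_values.empty:
        print("No record found")
    else:
        print("Record found")
        print(contain_values)
    input('Press enter to continue')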
Deleting data into Data Frame :
elif ch==6:
    data=pd.read_csv('data.csv')
    print("SEE RECORD")
    print(data)
    no=user_input_int('ENTER THE RECORD NUMBER TO BE DELETED:')
    os.system('cls')
    if no in data.index:
        print('THE RECORD AT',no,'IS:')
        print(data.loc[no,:])
        confirm=input('ARE YOU SURE YOU WANT TO DELETE THIS RECORD PERMANENTLY?(Y/N):')
        if confirm.strip().upper()=='Y':
            med=data.drop(no)
            med.to_csv('data.csv',index=False)
            os.system('cls')
            print("RECORD HAS BEEN DELETED SUCCESSFULLY")
        else:
            os.system('cls')
            print("RECORD HAS NOT BEEN DELETED")
    else:
        print("INVALID RECORD NUMBER")
    input('Press enter to continue')
2. User Interface :
elif ch==2:
    print("="*10,"WELCOME TO LOGIN SCREEN FOR USER","="*20)
    uname=user_input("ENTER USERID:")
    ps=input("ENTER PASSWORD:")
    if uname=='user' and ps=="user":
        # Run the user script with the same Python interpreter
        # (sys.executable needs "import sys" at the top of the script)
        subprocess.run([sys.executable, 'userFunction.py'])
    else:
        print("INCORRECT USERNAME AND PASSWORD")
elif ch==3:
    exit()
else:
    os.system("cls")
    print("="*10,"ERROR !!!","="*20)
    print("INVALID REQUEST")
# userFunction.py
import datetime
import os
import subprocess
import sys

import pandas as pd

date_format = '%d-%m-%Y'
os.system('cls')

def user_date(label):
    while True:
        try:
            value = input(label).strip()
            # validating the date using strptime() function
            datetime.datetime.strptime(value, date_format)
            return value
        # If the date validation goes wrong
        except ValueError:
            # printing the appropriate text if ValueError occurs
            print("Incorrect data format, should be DD-MM-YYYY")

def user_input_int(label):
    # Keep asking until the user types a whole number
    while True:
        value = input(label).strip()
        if value.isdecimal():
            return int(value)
        else:
            print("Invalid Input[Required 0-9] ")

def user_input(label):
    # Keep asking until the input starts with a letter or a digit
    while True:
        value = input(label).strip()
        if value[:1].isalnum():
            return value
        else:
            print("Invalid Input[Required Aa0-Za9] ")

def UserMenu():
    print('='*20,"WELCOME TO ADD RECORDS SCREEN",'='*20)
    print('SELECT APPROPRIATE OPTION TO WORK')
    print("1. Display All Data")
    print("2. Search")
    print("3. Analysis")
    print("4. EXIT")

ch=0
while ch!=4:
    os.system('cls')
    UserMenu()
    ch = user_input_int("ENTER YOUR CHOICE....")
    ##############################view all#################################
    if ch==1:
        data=pd.read_csv('data.csv')
        print("SEE RECORD")
        print(data)
        input('Press enter to continue')
    ##############################Search###################################
    elif ch==2:
        data=pd.read_csv('data.csv',dtype=str)
        keyvalue=input("Search By Venue: ")
        # na=False skips blank venues; regex=False matches the typed text as it is
        contain_values = data[data['ground'].str.contains(keyvalue,na=False,regex=False)]
        if contain_values.empty:
            print("No record found")
        else:
            print("Record found")
            print(contain_values)
        input('Press enter to continue')
    ##############################Analysis#################################
    elif ch==3:
        # Run the analysis script with the same Python interpreter
        subprocess.run([sys.executable, 'AnalyzeCricket.py'])
    #######################################################################
    elif ch==4:
        exit()
    else:
        print("INVALID OPTION!!!")
        input('Press enter to continue')
Analysis
Search:
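##############################Search###################################
elif ch==2:
    data=pd.read_csv('data.csv',dtype=str)
    keyvalue=input("Search By Venue: ")
    # na=False skips blank venues; regex=False matches the typed text as it is
    contain_values = data[data['ground'].str.contains(keyvalue,na=False,regex=False)]
    if contain_values.empty:
        print("No record found")
    else:
        print("Record found")
        print(contain_values)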
Result
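The venue search can be tried on a small in-memory table, as in the sketch below. The records are made up for illustration, and the case=False option is an addition assumed here so that eden also finds Eden Gardens; the project's own search is case-sensitive.

```python
import pandas as pd

# Hypothetical records for illustration; one venue is missing on purpose.
data = pd.DataFrame({
    'runs_scored': [45, 12, 78],
    'ground': ['Wankhede Stadium', None, 'Eden Gardens'],
})

keyvalue = 'eden'

# case=False ignores letter case, na=False treats a missing venue as
# "no match", and regex=False matches the typed text literally.
matches = data[data['ground'].str.contains(keyvalue, case=False,
                                           na=False, regex=False)]

if matches.empty:
    print('No record found')
else:
    print(matches)
```

Without na=False, a missing venue produces a missing value in the filter, and pandas cannot use such a filter to select rows.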
Analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df= pd.read_csv('data.csv')

#Data Cleaning and Preparation
# removing the first 2 characters in the opposition string
#df['opposition'] = df['opposition'].apply(lambda x: x[2:])
print(df.head())

# creating a feature for match year (dates are stored as DD-MM-YYYY)
match_dates = pd.to_datetime(df['date'], format='%d-%m-%Y', errors='coerce')
df['year'] = match_dates.dt.year

# creating a feature for being not out
df['score'] = df['score'].apply(str)
df['not_out'] = np.where(df['score'].str.endswith('*'), 1, 0)

# dropping the odi_number feature because it adds no value to the analysis
#df.drop(columns='odi_number', inplace=True)

# dropping those innings where the player did not bat and storing in a
# new DataFrame (copy() avoids changing df through df_new)
df_new = df.loc[((df['score'] != 'DNB') & (df['score'] != 'TDNB')),
                'runs_scored':].copy()

# fixing the data types of numerical columns
df_new['runs_scored'] = df_new['runs_scored'].astype(int)
df_new['balls_faced'] = df_new['balls_faced'].astype(int)
df_new['strike_rate'] = df_new['strike_rate'].astype(float)
df_new['fours'] = df_new['fours'].astype(int)
df_new['sixes'] = df_new['sixes'].astype(int)

#Career Statistics
first_match_date = df['year'].min() # year of first match
print('First match:', first_match_date)
last_match_date = df['year'].max() # year of last match
print('\nLast match:', last_match_date)
number_of_matches = df.shape[0] # number of matches played in career
print('\nNumber of matches played:', number_of_matches)
number_of_inns = df_new.shape[0] # number of innings
print('\nNumber of innings played:', number_of_inns)
not_outs = df_new['not_out'].sum() # number of not outs in career
print('\nNot outs:', not_outs)
runs_scored = df_new['runs_scored'].sum() # runs scored in career
print('\nRuns scored in career:', runs_scored)
balls_faced = df_new['balls_faced'].sum() # balls faced in career
print('\nBalls faced in career:', balls_faced)
career_sr = (runs_scored / balls_faced)*100 # career strike rate
print('\nCareer strike rate: {:.2f}'.format(career_sr))
career_avg = (runs_scored / (number_of_inns - not_outs)) # career average
print('\nCareer average: {:.2f}'.format(career_avg))
# df and df_new share row labels, so the label of the highest innings
# in df_new can be used to look up the score text in df
highest_score = df.loc[df_new['runs_scored'].idxmax(), 'score'] # highest score
print('\nHighest score in career:', highest_score)
hundreds = df_new.loc[df_new['runs_scored'] >= 100].shape[0] # number of 100s
print('\nNumber of 100s:', hundreds)
fifties = df_new.loc[(df_new['runs_scored']>=50)
                     &(df_new['runs_scored']<100)].shape[0] # number of 50s
print('\nNumber of 50s:', fifties)
fours = df_new['fours'].sum() # number of fours in career
print('\nNumber of 4s:', fours)
sixes = df_new['sixes'].sum() # number of sixes in career
print('\nNumber of 6s:', sixes)

#Analysis
# number of matches played against different oppositions
df['opposition'].value_counts().plot(
    kind='bar', title='Number of matches against different oppositions',
    figsize=(8, 5))

# runs scored against different oppositions
runs_scored_by_opposition = pd.DataFrame(
    df_new.groupby('opposition')['runs_scored'].sum())
runs_scored_by_opposition.plot(
    kind='bar', title='Runs scored against different oppositions',
    figsize=(8, 5))
plt.xlabel('')

# batting average against different oppositions
innings_by_opposition = pd.DataFrame(
    df_new.groupby('opposition')['date'].count())
not_outs_by_opposition = pd.DataFrame(
    df_new.groupby('opposition')['not_out'].sum())
temp = runs_scored_by_opposition.merge(
    innings_by_opposition, left_index=True, right_index=True)
average_by_opposition = temp.merge(
    not_outs_by_opposition, left_index=True, right_index=True)
average_by_opposition.rename(columns = {'date': 'innings'}, inplace=True)
average_by_opposition['eff_num_of_inns'] = (
    average_by_opposition['innings'] - average_by_opposition['not_out'])
average_by_opposition['average'] = (
    average_by_opposition['runs_scored']
    / average_by_opposition['eff_num_of_inns'])
average_by_opposition.replace(np.inf, np.nan, inplace=True)
major_nations = df.opposition.unique()
# The chart below is disabled (kept as a string, so it does not run)
'''
plt.figure(figsize = (8, 5))
plt.plot(average_by_opposition.loc[major_nations, 'average'].values, marker='o')
plt.plot([career_avg]*len(major_nations), '--')
plt.title('Average against major teams')
plt.xticks(range(0, 7), major_nations)
plt.ylim(20, 70)
plt.legend(['Avg against opposition', 'Career average']);
'''
plt.show()

# career average progression, innings by innings
df_new.reset_index(drop=True, inplace=True)
career_average = pd.DataFrame()
career_average['runs_scored_in_career'] = df_new['runs_scored'].cumsum()
career_average['innings'] = df_new.index + 1
career_average['not_outs_in_career'] = df_new['not_out'].cumsum()
career_average['eff_num_of_inns'] = (
    career_average['innings'] - career_average['not_outs_in_career'])
career_average['average'] = (
    career_average['runs_scored_in_career']
    / career_average['eff_num_of_inns'])
plt.figure(figsize = (8, 5))
plt.plot(career_average['average'])
plt.plot([career_avg]*career_average.shape[0], '--')
plt.title('Career average progression by innings')
plt.xlabel('Number of innings')
plt.legend(['Avg progression', 'Career average'])
plt.show()
input('Press enter to continue')
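The two career formulas used in the analysis, batting average (runs divided by innings minus not outs) and strike rate (runs divided by balls faced, times 100), can be checked by hand on a tiny table. The sketch below uses three made-up innings, and the function name batting_average is hypothetical.

```python
import pandas as pd

def batting_average(innings):
    """Runs scored divided by dismissals (innings minus not outs)."""
    dismissals = len(innings) - innings['not_out'].sum()
    if dismissals == 0:
        return float('nan')  # the average is undefined if never dismissed
    return innings['runs_scored'].sum() / dismissals

# Hypothetical innings for illustration: three innings, one of them not out.
sample = pd.DataFrame({
    'runs_scored': [50, 30, 40],
    'balls_faced': [60, 40, 50],
    'not_out': [0, 1, 0],
})

average = batting_average(sample)
strike_rate = sample['runs_scored'].sum() / sample['balls_faced'].sum() * 100

print('Average: {:.2f}'.format(average))
print('Strike rate: {:.2f}'.format(strike_rate))
```

Here 120 runs over 2 dismissals gives an average of 60, and 120 runs off 150 balls gives a strike rate of 80, which matches what the formulas in the listing would report for the same data.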
BIBLIOGRAPHY
https://www.geeksforgeeks.org/
https://www.edureka.co/
https://stackoverflow.com/
Informatics Practices: A Textbook for Class XII
(IP textbook)