Descriptive Analytics
Prof. Rajiv Kumar
IIM Kashipur
Business Analytics (1 of 2)
Business analytics is the process of collating, sorting, processing, and studying business data, and
using statistical models and iterative methodologies to transform data into business insights.
• Descriptive Analytics
• Predictive Analytics
• Prescriptive Analytics
Business Analytics (2 of 2)
Descriptive Analytics
• Techniques that describe past performance and history
• Example: Creating a report with charts and graphs that explain the data
Predictive Analytics
• Techniques that extract information from data and use it to predict future trends and identify behavioral
patterns
• Example: Using past sales data to predict future sales data
Prescriptive Analytics
• Techniques that create models indicating the best decision to make or course of action to take
• Example: An airline using past purchasing data as inputs to a model that recommends the best pricing
strategy across all flights, allowing the company to maximize revenue.
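The contrast between the three types can be sketched on a toy series. This is a minimal illustration, not a method taken from the slides: the sales figures are made up, the straight-line forecast is just one simple predictive choice, and the demand curve in the prescriptive step is an assumed formula rather than something fitted to data.

```python
import numpy as np

# Hypothetical monthly sales (units) for six past months; illustrative only.
months = np.arange(1, 7)
sales = np.array([100.0, 110.0, 120.0, 130.0, 140.0, 150.0])

# Descriptive: summarize what has already happened.
average_sales = sales.mean()

# Predictive: fit a straight line to past sales and extrapolate to month 7.
slope, intercept = np.polyfit(months, sales, deg=1)
forecast_month_7 = slope * 7 + intercept

# Prescriptive: choose the price that maximizes revenue under an ASSUMED
# linear demand curve, demand = 200 - 2 * price.
candidate_prices = np.arange(10, 100, 10)
revenue = candidate_prices * (200 - 2 * candidate_prices)
best_price = candidate_prices[revenue.argmax()]

print(average_sales, forecast_month_7, best_price)
```

The descriptive step only reports a summary, the predictive step produces an estimate for an unseen period, and the prescriptive step returns a recommended action.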
Example. IPL Auction Data Set (IPL_Data.xlsx)
Loading the dataset into a DataFrame
import pandas as pd
ipl_auction_df = pd.read_excel('IPL_Data.xlsx')
Display DataFrame
print(ipl_auction_df)
Displaying first few records of the DataFrame
print(ipl_auction_df.head(5))
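If IPL_Data.xlsx is not at hand, the same calls can be tried on a small stand-in DataFrame. The sketch below assumes placeholder rows (they are not real auction records) and borrows a few column names used in this deck; `toy_df` is a hypothetical name.

```python
import pandas as pd

# Placeholder rows, not real auction records; column names follow this deck.
toy_df = pd.DataFrame({
    'PLAYER NAME': ['Player A', 'Player B', 'Player C',
                    'Player D', 'Player E', 'Player F'],
    'COUNTRY': ['IND', 'AUS', 'IND', 'SA', 'IND', 'AUS'],
    'PLAYING ROLE': ['Batsman', 'Bowler', 'Allrounder',
                     'Batsman', 'Bowler', 'Batsman'],
    'SOLD PRICE': [500000, 300000, 800000, 450000, 200000, 650000],
})

first_three = toy_df.head(3)  # first three rows
last_two = toy_df.tail(2)     # last two rows
print(first_three)
print(last_two)
```

head(n) and tail(n) return new DataFrames, so they are convenient for a quick look without printing every record.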
Listing the column names of the DataFrame
list(ipl_auction_df)
Transpose DataFrame
ipl_auction_df.transpose()
Dimension of DataFrame
ipl_auction_df.shape
Finding Summary of the DataFrame
ipl_auction_df.info()
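What each of these calls returns is easier to see on a tiny DataFrame. The following is a sketch on placeholder data (the rows and the name `toy_df` are assumptions, not part of the IPL file).

```python
import pandas as pd

# Placeholder rows; only the column names are taken from this deck.
toy_df = pd.DataFrame({
    'PLAYER NAME': ['Player A', 'Player B', 'Player C', 'Player D'],
    'AGE': [1, 2, 2, 3],
    'SOLD PRICE': [500000.0, 300000.0, 800000.0, 450000.0],
})

column_names = list(toy_df)                  # same as list(toy_df.columns)
n_rows, n_cols = toy_df.shape                # (number of rows, number of columns)
transposed_shape = toy_df.transpose().shape  # rows and columns swapped
column_types = toy_df.dtypes                 # one dtype per column

print(column_names, n_rows, n_cols, transposed_shape)
print(column_types)
```

Note that shape is an attribute, not a method, so it is written without parentheses.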
Slicing and Indexing a dataframe
Selecting Rows by Indexes
ipl_auction_df[0:5] #First five rows
ipl_auction_df[-5:] #Last five rows
Selecting Columns by Column Names
ipl_auction_df['PLAYER NAME'][0:5] # First five values of 'PLAYER NAME'
ipl_auction_df[['PLAYER NAME', 'COUNTRY']][0:5] # First five rows of 'PLAYER NAME' and 'COUNTRY'
Selecting Rows and Columns by indexes
ipl_auction_df.iloc[4:9, 1:4] # Rows at positions 4 to 8, columns at positions 1 to 3
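Position-based selection with iloc excludes the end position, whereas label-based selection with loc includes the end label. The sketch below contrasts the two on placeholder data; `toy_df` and its rows are assumptions made for illustration.

```python
import pandas as pd

# Placeholder rows with the default integer index 0..5.
toy_df = pd.DataFrame({
    'PLAYER NAME': ['Player A', 'Player B', 'Player C',
                    'Player D', 'Player E', 'Player F'],
    'COUNTRY': ['IND', 'AUS', 'IND', 'SA', 'IND', 'AUS'],
    'SOLD PRICE': [500000, 300000, 800000, 450000, 200000, 650000],
})

# iloc: integer positions, end excluded -> rows 1 and 2, columns 0 and 1.
by_position = toy_df.iloc[1:3, 0:2]

# loc: index labels, end included -> rows labelled 1, 2 and 3.
by_label = toy_df.loc[1:3, ['PLAYER NAME', 'COUNTRY']]

print(by_position)
print(by_label)
```

With the default integer index the two look similar, so the off-by-one difference in the end point is the detail to watch.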
Value Counts and Cross Tabulations
Counting Occurrences of Each Unique Value in a Column
ipl_auction_df.COUNTRY.value_counts()
Cross-tabulation between two columns
pd.crosstab(ipl_auction_df['AGE'], ipl_auction_df['PLAYING ROLE'])
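A small worked example may help in reading the two outputs. It uses placeholder data (the rows and the name `toy_df` are assumptions); `normalize=True` is a standard value_counts option that returns proportions instead of counts.

```python
import pandas as pd

# Placeholder rows; column names follow this deck.
toy_df = pd.DataFrame({
    'COUNTRY': ['IND', 'AUS', 'IND', 'SA', 'IND', 'AUS'],
    'AGE': [1, 2, 2, 3, 2, 1],
    'PLAYING ROLE': ['Batsman', 'Bowler', 'Batsman',
                     'Batsman', 'Bowler', 'Bowler'],
})

# How many rows fall under each country, most frequent first.
country_counts = toy_df['COUNTRY'].value_counts()

# The same counts expressed as proportions of all rows.
country_share = toy_df['COUNTRY'].value_counts(normalize=True)

# Rows are AGE values, columns are PLAYING ROLE values, cells are counts.
age_by_role = pd.crosstab(toy_df['AGE'], toy_df['PLAYING ROLE'])

print(country_counts)
print(country_share)
print(age_by_role)
```

Bracket access such as toy_df['PLAYING ROLE'] is needed for column names that contain spaces; attribute access such as toy_df.COUNTRY only works for names that are valid Python identifiers.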
Sorting dataframe by column values
ipl_auction_df[['PLAYER NAME', 'SOLD PRICE']].sort_values('SOLD PRICE')[0:5]
ipl_auction_df[['PLAYER NAME', 'SOLD PRICE']].sort_values('SOLD PRICE', ascending = False)[0:5]
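For "top n" and "bottom n" questions, pandas also offers nlargest and nsmallest, which give the same rows as sorting and then slicing. A sketch on placeholder data (`toy_df` and its values are assumptions):

```python
import pandas as pd

# Placeholder rows; column names follow this deck.
toy_df = pd.DataFrame({
    'PLAYER NAME': ['Player A', 'Player B', 'Player C', 'Player D', 'Player E'],
    'SOLD PRICE': [500000, 300000, 800000, 450000, 200000],
})

# Sort in descending order, then keep the first two rows.
top_two = toy_df.sort_values('SOLD PRICE', ascending=False).head(2)

# Equivalent shortcuts for the highest and lowest values.
top_two_alt = toy_df.nlargest(2, 'SOLD PRICE')
cheapest_two = toy_df.nsmallest(2, 'SOLD PRICE')

print(top_two)
print(cheapest_two)
```

sort_values returns a sorted copy by default, so the original DataFrame keeps its row order.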
Grouping and Aggregating
What is the average SOLD PRICE for each age category?
ipl_auction_df.groupby('AGE')['SOLD PRICE'].mean()
Filtering Records from Dataframe based on conditions
# Which players have hit more than 80 sixes in the IPL tournament so far?
ipl_auction_df[ipl_auction_df['SIXERS'] > 80 ][['PLAYER NAME', 'SIXERS']]
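Grouping and filtering can be checked by hand on a few rows. The sketch below uses placeholder data (`toy_df` and its values are assumptions) and also shows how two conditions are combined, which requires `&` with each condition in parentheses.

```python
import pandas as pd

# Placeholder rows; column names follow this deck.
toy_df = pd.DataFrame({
    'PLAYER NAME': ['Player A', 'Player B', 'Player C',
                    'Player D', 'Player E', 'Player F'],
    'AGE': [1, 2, 2, 3, 2, 1],
    'SIXERS': [10, 95, 40, 120, 5, 81],
    'SOLD PRICE': [500000, 300000, 800000, 450000, 200000, 650000],
})

# Split rows by AGE, then average SOLD PRICE within each group.
mean_price_by_age = toy_df.groupby('AGE')['SOLD PRICE'].mean()

# A boolean mask keeps only the rows where the condition is True.
big_hitters = toy_df[toy_df['SIXERS'] > 80][['PLAYER NAME', 'SIXERS']]

# Two conditions: wrap each in parentheses and join them with &.
young_big_hitters = toy_df[(toy_df['SIXERS'] > 80) & (toy_df['AGE'] == 1)]

print(mean_price_by_age)
print(big_hitters)
print(young_big_hitters)
```

The result of groupby(...).mean() is a Series indexed by the group key, so mean_price_by_age[1] gives the average for AGE category 1.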
Calculate Max of “SOLD PRICE”
print(ipl_auction_df['SOLD PRICE'].max())
Calculate Min of “SOLD PRICE”
print(ipl_auction_df['SOLD PRICE'].min())
Calculate Mean of “SOLD PRICE”
print(ipl_auction_df['SOLD PRICE'].mean())
Calculate Mode of “SOLD PRICE”
print(ipl_auction_df['SOLD PRICE'].mode())
Calculate Median of “SOLD PRICE”
print(ipl_auction_df['SOLD PRICE'].median())
Calculate Standard Deviation of “SOLD PRICE”
print(ipl_auction_df['SOLD PRICE'].std())
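Two details of these calls are worth seeing on numbers small enough to verify by hand: mode() returns a Series (there can be ties), and std() is the sample standard deviation (it divides by n - 1 by default). The values below are placeholders, and `prices` is a hypothetical name.

```python
import pandas as pd

# Placeholder prices chosen so the statistics are easy to check by hand.
prices = pd.Series([100, 200, 200, 400, 600], name='SOLD PRICE')

price_max = prices.max()
price_min = prices.min()
price_mean = prices.mean()
price_median = prices.median()
price_mode = prices.mode().iloc[0]  # mode() returns a Series; take the first value
price_std = prices.std()            # sample standard deviation (ddof=1)

# describe() reports count, mean, std, min, quartiles and max in one call.
summary = prices.describe()

print(price_max, price_min, price_mean, price_median, price_mode, price_std)
print(summary)
```

Here the mean (300) is well above the median (200) because the largest value pulls the average up, which is a common pattern in price data.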
Calculate correlation
# Correlation between 'AVE' and 'SOLD PRICE'
ipl_auction_df[['AVE', 'SOLD PRICE']].corr()
#correlation between 'SR-B', 'AVE', 'SIXERS', 'SOLD PRICE'
ipl_auction_df[['SR-B', 'AVE', 'SIXERS', 'SOLD PRICE']].corr()
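By default corr() computes the Pearson correlation coefficient for every pair of columns. The sketch below reproduces one entry of the matrix by hand on placeholder values (`toy_df` and its numbers are assumptions) to show what that number means.

```python
import numpy as np
import pandas as pd

# Placeholder values; column names follow this deck.
toy_df = pd.DataFrame({
    'AVE': [1.0, 2.0, 3.0, 4.0, 5.0],
    'SOLD PRICE': [2.0, 4.0, 5.0, 4.0, 5.0],
})

# Deviations of each column from its own mean.
x_dev = toy_df['AVE'] - toy_df['AVE'].mean()
y_dev = toy_df['SOLD PRICE'] - toy_df['SOLD PRICE'].mean()

# Pearson r: sum of cross-deviations over the root of the product
# of the summed squared deviations.
r_manual = (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())

# The same value read from the correlation matrix.
corr_matrix = toy_df.corr()
r_pandas = corr_matrix.loc['AVE', 'SOLD PRICE']

print(r_manual, r_pandas)
```

The matrix is symmetric with ones on the diagonal, and each off-diagonal value lies between -1 and 1; a value near 0 indicates little linear relationship, not necessarily no relationship at all.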