Analysing Ad Budget

This document analyzes an advertising budget dataset to predict sales. It includes:
1. Importing and inspecting the dataset, which contains advertising budgets and sales figures.
2. Creating the feature and target objects used to train and test a linear regression model that predicts sales from the advertising budgets.
3. Splitting the data into training and testing sets, fitting the model, and predicting sales for the testing set.
4. Calculating the root mean squared error (the square root of the MSE) to evaluate the model's performance.

Uploaded by

Srikanth

12/12/2019 Assignment 01

Assignment 01: Evaluate the Ad Budget Dataset of XYZ Firm

The comments/sections provided are your cues to perform the assignment. You don't need to limit yourself to the
number of rows/cells provided. You can add additional rows in each section to add more lines of code.

If at any point in time you need help on solving this assignment, view our demo video to understand the different
steps of the code.

Happy coding!

1: Import the dataset

In [1]: #Import the required libraries


import pandas as pd

In [3]: #Import the advertising dataset


df_data = pd.read_csv('C:\\Users\\srikanth.ganji\\Desktop\\@SG\\OLD_Users_srikanth.ganji_Desktp\\Desktop\\CDS\\lilsmp\\ASSIGNMENTS\\Lesson 8\\Advertising_Budget_and_Sales\\Advertising Budget and Sales.csv')

2: Analyze the dataset

In [9]: #View the initial few records of the dataset


df_data.head()

Out[9]:
   Unnamed: 0  TV Ad Budget ($)  Radio Ad Budget ($)  Newspaper Ad Budget ($)  Sales ($)
0           1             230.1                 37.8                     69.2       22.1
1           2              44.5                 39.3                     45.1       10.4
2           3              17.2                 45.9                     69.3        9.3
3           4             151.5                 41.3                     58.5       18.5
4           5             180.8                 10.8                     58.4       12.9
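
The `Unnamed: 0` column in the output above is what pandas produces when a CSV's first column has no header name, which usually means a saved row index was read back as data. A minimal sketch of reading that column as the index instead, assuming the file's first header field is empty; the inline CSV text is a stand-in built from the rows shown above, not the actual file:

```python
import io
import pandas as pd

# Stand-in for the CSV file: header and first rows as shown in df_data.head().
# The empty first header field is an assumption about how the file is laid out.
csv_text = """,TV Ad Budget ($),Radio Ad Budget ($),Newspaper Ad Budget ($),Sales ($)
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
"""

# Default read: the unnamed first column becomes a data column called 'Unnamed: 0'
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that first column as the row index instead
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df_default.columns))
print(list(df_indexed.columns))
```

The assignment never uses `Unnamed: 0` as a feature, so either way of reading the file gives the same model; `index_col=0` only keeps the frame tidier.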

In [10]: #Check the total number of elements in the dataset


df_data.size

Out[10]: 1000

file:///C:/Users/srikanth.ganji/Downloads/Analysing Ad Budget.html 1/4



3: Find the features or media channels used by the firm

In [7]: #Check the number of observations (rows) and attributes (columns) in the dataset
df_data.shape

Out[7]: (200, 5)
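
The `size` and `shape` outputs are consistent with each other: `size` counts every cell, so it equals rows times columns (200 × 5 = 1000). A small sketch of that relationship on a hypothetical two-column frame:

```python
import pandas as pd

# Small hypothetical frame; the assignment's dataset has 200 rows and 5 columns
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# shape is a (rows, columns) tuple; size is the total number of cells
n_rows, n_cols = df.shape
print(df.size == n_rows * n_cols)

# Same arithmetic for the shape reported above
print(200 * 5)
```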

In [8]: #View the names of each of the attributes


df_data.columns

Out[8]: Index(['Unnamed: 0', 'TV Ad Budget ($)', 'Radio Ad Budget ($)',
       'Newspaper Ad Budget ($)', 'Sales ($)'],
      dtype='object')

4: Create objects to train and test the model; find the sales figures for each channel

In [11]: #Create a feature object from the columns


X_feature = df_data[['Newspaper Ad Budget ($)','Radio Ad Budget ($)','TV Ad Budget ($)']]

In [12]: #View the feature object


X_feature.head()

Out[12]:
   Newspaper Ad Budget ($)  Radio Ad Budget ($)  TV Ad Budget ($)
0                     69.2                 37.8             230.1
1                     45.1                 39.3              44.5
2                     69.3                 45.9              17.2
3                     58.5                 41.3             151.5
4                     58.4                 10.8             180.8

In [13]: #Create a target object (Hint: use the sales column as it is the response of the dataset)
Y_target = df_data['Sales ($)']

In [14]: #View the target object


Y_target.head()

Out[14]: 0 22.1
1 10.4
2 9.3
3 18.5
4 12.9
Name: Sales ($), dtype: float64


In [15]: #Verify if all the observations have been captured in the feature object
X_feature.shape

Out[15]: (200, 3)

In [16]: #Verify if all the observations have been captured in the target object
Y_target.shape

Out[16]: (200,)

5: Split the original dataset into training and testing datasets for the model

In [17]: #Split the dataset (by default, 75% is the training data and 25% is the testing data)
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X_feature, Y_target, random_state=1)

In [19]: #Verify if the training and testing datasets are split correctly (Hint: use the shape attribute)
print(X_train.shape)
print(X_test.shape)
print(Y_train.shape)
print(Y_test.shape)

(150, 3)
(50, 3)
(150,)
(50,)
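
The 150/50 split above comes from scikit-learn's default of holding out 25% of the rows when neither `test_size` nor `train_size` is given. A minimal sketch on synthetic data of the same dimensions (the random arrays are stand-ins, not the advertising data), showing that spelling out `test_size=0.25` yields the same split:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same dimensions as the assignment's data
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = rng.random(200)

# No test_size/train_size given: 25% of the rows are held out for testing
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Passing test_size explicitly documents the intent; same seed, same split
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(X, y, test_size=0.25, random_state=1)

print(X_tr.shape, X_te.shape)
print(np.array_equal(X_te, X_te2))
```

Fixing `random_state` is what makes the split, and therefore the coefficients and error reported later, reproducible between runs.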

6: Create a model to predict the sales outcome

In [21]: #Create a linear regression model


from sklearn.linear_model import LinearRegression
linReg = LinearRegression()
linReg.fit(X_train,Y_train)

Out[21]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)

In [24]: #Print the intercept and coefficients


print(linReg.intercept_)
print(linReg.coef_)

2.8769666223179176
[0.00345046 0.17915812 0.04656457]
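
The coefficients are printed in the column order of `X_feature`, that is Newspaper, Radio, then TV. A short sketch that labels each printed value and applies the fitted equation by hand to the first record from `X_feature.head()`; the numbers are copied from the outputs above, and the variable names are illustrative only:

```python
# Values copied from the printed intercept and coefficients above
intercept = 2.8769666223179176
coefs = [0.00345046, 0.17915812, 0.04656457]

# Column order used when X_feature was built
features = ['Newspaper Ad Budget ($)', 'Radio Ad Budget ($)', 'TV Ad Budget ($)']

for name, c in zip(features, coefs):
    print(f"{name}: {c}")

# Budgets of the first record shown in X_feature.head()
row0 = [69.2, 37.8, 230.1]

# Linear model: prediction = intercept + sum(coefficient * budget)
pred0 = intercept + sum(c * x for c, x in zip(coefs, row0))
print(round(pred0, 2))
```

Read this way, the radio budget has the largest per-unit coefficient and the newspaper budget the smallest. The hand-computed value can be compared with the recorded sales of 22.1 for that record.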


In [27]: #Predict the outcome for the testing dataset


y_pred = linReg.predict(X_test)
y_pred

Out[27]: array([21.70910292, 16.41055243, 7.60955058, 17.80769552, 18.6146359 ,
23.83573998, 16.32488681, 13.43225536, 9.17173403, 17.333853 ,
14.44479482, 9.83511973, 17.18797614, 16.73086831, 15.05529391,
15.61434433, 12.42541574, 17.17716376, 11.08827566, 18.00537501,
9.28438889, 12.98458458, 8.79950614, 10.42382499, 11.3846456 ,
14.98082512, 9.78853268, 19.39643187, 18.18099936, 17.12807566,
21.54670213, 14.69809481, 16.24641438, 12.32114579, 19.92422501,
15.32498602, 13.88726522, 10.03162255, 20.93105915, 7.44936831,
3.64695761, 7.22020178, 5.9962782 , 18.43381853, 8.39408045,
14.08371047, 15.02195699, 20.35836418, 20.57036347, 19.60636679])

7: Calculate the Mean Square Error (MSE)

In [29]: #Import required libraries for calculating MSE (mean square error)
from sklearn import metrics
import numpy as np

In [30]: #Calculate the RMSE (root mean squared error), i.e. the square root of the MSE


print(np.sqrt(metrics.mean_squared_error(Y_test,y_pred)))

1.404651423032894
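
Because the cell above wraps `mean_squared_error` in `np.sqrt`, the printed 1.4047 is the root mean squared error (RMSE); the MSE itself is its square, about 1.97. A minimal NumPy-only sketch of the two quantities, using made-up actual and predicted values purely for illustration:

```python
import numpy as np

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([22.1, 10.4, 9.3])
y_hat = np.array([21.0, 11.0, 9.0])

errors = y_true - y_hat

# Mean squared error: average of the squared errors (squared units of the target)
mse = np.mean(errors ** 2)

# Root mean squared error: back in the same units as the target
rmse = np.sqrt(mse)

print(mse, rmse)
```

RMSE is often the easier figure to interpret, since it is expressed in the same units as the `Sales ($)` column.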
