PRACTICAL 10
Practical 10: Perform Linear Regression on given warehouse data
Linear regression is a basic and commonly used type of predictive analysis. The overall idea
of regression is to examine two things:
(1) Does a set of predictor variables do a good job of predicting an outcome (dependent)
variable?
(2) Which variables in particular are significant predictors of the outcome variable, and in what
way do they impact it, as indicated by the magnitude and sign of the beta estimates?
These regression estimates are used to explain the relationship between one dependent variable
and one or more independent variables.
The simplest form of the regression equation, with one dependent and one independent variable,
is defined by the formula y = a + b*x, where y = estimated dependent variable score, a = constant
(intercept), b = regression coefficient (slope), and x = score on the independent variable.
E.g.: y = a + b*x
a = -5.0895
b = 0.9675
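As a worked instance of the formula, the example coefficients above can be applied to a single input. The value x = 20 is an arbitrary illustrative choice, not a value taken from the practical's data:

```latex
\begin{aligned}
y &= a + b \cdot x \\
  &= -5.0895 + 0.9675 \times 20 \\
  &= -5.0895 + 19.35 \\
  &= 14.2605
\end{aligned}
```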
1. Open a blank Excel workbook & insert data with the columns {Customer ID, Age[X], Sales[Y]}. Save as → linear.xls
2. Open the data (linear.xls) in Power BI → Get Data → Excel → Edit.
3. Browse to the data (linear.xls) → select sheet1 → click Load.
4. Add the calculated columns XY, XSquare, and YSquare, which the Linear Regression formulas
need.
On the right side of the screen, under Fields:
Right click sheet1 → New column {XY = Sheet1[Age[X]]]*Sheet1[Sales[Y]]]}
Right click sheet1 → New column {XSquare = Sheet1[Age[X]]]*Sheet1[Age[X]]]}
Right click sheet1 → New column {YSquare = Sheet1[Sales[Y]]]*Sheet1[Sales[Y]]]}
(In DAX, a closing bracket inside a column name is written as ]], so the column Age[X] is
referenced as [Age[X]]].)
5. Under Fields, right click sheet1 and add new measures. These hold the column totals that act
as constants in the later formulas.
Right click sheet1 → New measure {XSum = SUM(Sheet1[Age[X]]])}
Right click sheet1 → New measure {YSum = SUM(Sheet1[Sales[Y]]])}
Right click sheet1 → New measure {XSquareSum = SUM(Sheet1[XSquare])}
Right click sheet1 → New measure {YSquareSum = SUM(Sheet1[YSquare])}
Right click sheet1 → New measure {XYSum = SUM(Sheet1[XY])}
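Steps 4 and 5 can be cross-checked outside Power BI. The following is a minimal Python sketch, assuming a made-up 10-row data set: the `ages` and `sales` values are hypothetical and are not the practical's data. The lists mirror the calculated columns and the totals mirror the measures.

```python
# Hypothetical sample data for 10 customers (NOT the practical's data).
ages = [22, 25, 28, 31, 35, 38, 42, 45, 50, 54]   # X
sales = [15, 19, 21, 26, 28, 33, 35, 39, 44, 47]  # Y

# Step 4: calculated columns, one value per row.
xy = [x * y for x, y in zip(ages, sales)]
x_square = [x * x for x in ages]
y_square = [y * y for y in sales]

# Step 5: measures, one aggregate per column.
x_sum = sum(ages)
y_sum = sum(sales)
x_square_sum = sum(x_square)
y_square_sum = sum(y_square)
xy_sum = sum(xy)

print(x_sum, y_sum, x_square_sum, y_square_sum, xy_sum)
```

Replacing the two lists with the values typed into linear.xls should reproduce the totals shown on the Power BI cards.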
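6. Calculate rnumerator. The factor 10 is n, the number of data rows; change it if your data has
a different number of rows.
Right click sheet1 → New measure {rnumerator = 10*[XYSum]-[XSum]*[YSum]}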
7. Calculate rdenominator1 & rdenominator2.
Right click sheet1 →
New measure {rdenominator1 = (10*[XSquareSum]-[XSum]^2)*(10*[YSquareSum]-[YSum]^2)}
Right click sheet1 → New measure {rdenominator2 = SQRT([rdenominator1])}
8. Calculate r.
Right click sheet1 → New measure {r= DIVIDE([rnumerator],[rdenominator2])}
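The chain of measures in steps 6 to 8 computes the Pearson correlation coefficient from raw sums. A minimal Python sketch of the same arithmetic follows; the function name `pearson_r` is illustrative, and `n` is taken from the data length rather than hard-coded as 10.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation from raw sums, mirroring the measures in steps 5 to 8."""
    n = len(xs)  # the practical hard-codes this as 10
    x_sum, y_sum = sum(xs), sum(ys)
    xy_sum = sum(x * y for x, y in zip(xs, ys))
    x_square_sum = sum(x * x for x in xs)
    y_square_sum = sum(y * y for y in ys)

    r_numerator = n * xy_sum - x_sum * y_sum
    r_denominator1 = (n * x_square_sum - x_sum ** 2) * (n * y_square_sum - y_sum ** 2)
    r_denominator2 = math.sqrt(r_denominator1)

    # DAX DIVIDE returns BLANK on division by zero; None plays that role here.
    if r_denominator2 == 0:
        return None
    return r_numerator / r_denominator2
```

A value close to 1 or -1 indicates a strong linear relationship; a value near 0 indicates a weak one.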
9. Click the Report icon on the left side of the screen.
10. Select the Card visualization to view the calculated results.
11. Calculate R-squared using the formula r*r.
Right click sheet1 → New measure {rSquared = [r]*[r]}
12. Select the Card visualization to view the calculated results.
13. Calculate Slope.
Right click sheet1 → New measure {Slope = (10*[XYSum]-[XSum]*[YSum])/(10*
[XSquareSum]-[XSum]^2)}
14. Calculate Y-Intercept.
Right click sheet1 → New measure {YIntercept =DIVIDE([YSum]*[XSquareSum]-
[XSum]*[XYSum],(10*[XSquareSum]-[XSum]^2))}
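The Slope and YIntercept measures share the denominator n*ΣX² - (ΣX)². A minimal Python sketch of both formulas is shown here, assuming `n` is the number of rows; the function name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept from raw sums (steps 13 and 14)."""
    n = len(xs)  # the practical hard-codes this as 10
    x_sum, y_sum = sum(xs), sum(ys)
    xy_sum = sum(x * y for x, y in zip(xs, ys))
    x_square_sum = sum(x * x for x in xs)

    denominator = n * x_square_sum - x_sum ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = (n * xy_sum - x_sum * y_sum) / denominator
    y_intercept = (y_sum * x_square_sum - x_sum * xy_sum) / denominator
    return slope, y_intercept
```

One quick sanity check is to feed it points that lie exactly on a known line, such as y = 2x + 1, and confirm that the slope and intercept come back as 2 and 1.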
15. Use the formula y = b*x + a to calculate the predicted sales when age (x) is 10, 30,
& 50.
Right click sheet1 →
New measure {10 sales = [Slope]*10+[YIntercept]}
New measure {30 sales = [Slope]*30+[YIntercept]}
New measure {50 sales = [Slope]*50+[YIntercept]}
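The three prediction measures all evaluate the same line at different x values. The sketch below assumes, for illustration only, the example coefficients quoted earlier in this practical (a = -5.0895, b = 0.9675); the Slope and YIntercept computed from your own data may differ. The function name `predict_sales` is illustrative.

```python
# Example coefficients from the introduction of this practical (assumed here
# for illustration; substitute the values shown on your Power BI cards).
Y_INTERCEPT = -5.0895  # a
SLOPE = 0.9675         # b


def predict_sales(age):
    """Evaluate y = b*x + a for a given age (x)."""
    return SLOPE * age + Y_INTERCEPT


for age in (10, 30, 50):
    print(f"age {age}: predicted sales {predict_sales(age):.4f}")
```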
16. Select Scatter Chart from Visualizations.
17. Add Customer ID as Legend, Age as X-axis, and Sales as Y-axis.
18. Select Trend Line under the Analytics tool to see the fitted line on the graph. Analytics → Trend Line → Add.