
Find High Leverage Values for a Regression Model in R
To find the high leverage values for a regression model, we first compute the hat values with the hatvalues function, then define a condition for high leverage and extract the observations that satisfy it. The hat values are the diagonal elements of the hat matrix and measure how far each observation's predictor values lie from those of the other observations; they are not the predicted values. For example, if we have a regression model M, the hat values can be found with the command hatvalues(M), and the observations whose hat values are greater than 0.05 can be found by using the below code −
which(hatvalues(M)>0.05)
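To make the meaning of a hat value concrete, the sketch below rebuilds the leverages by hand from the hat matrix H = X(X'X)^(-1)X' and compares them with hatvalues. The built-in cars data set and the names M, X, H and h_manual are chosen only for illustration and are not part of the examples in this article.

```r
# Illustrative model on the built-in cars data set (an assumption, not from the article).
M <- lm(dist ~ speed, data = cars)

# Design matrix X: an intercept column plus the speed column.
X <- model.matrix(M)

# Hat matrix H = X (X'X)^(-1) X'; its diagonal holds the leverages.
H <- X %*% solve(crossprod(X)) %*% t(X)
h_manual <- diag(H)

# The manual values should agree with hatvalues() up to floating-point error.
all.equal(unname(h_manual), unname(hatvalues(M)))

# The leverages sum to the number of estimated coefficients (2 here, up to rounding).
sum(hatvalues(M))
```

Because the leverages always sum to the number of coefficients, a fixed cutoff such as 0.05 means different things for different sample sizes, which is worth keeping in mind when choosing the condition.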
Example1
Consider the below data frame −
x1<-c(100,rnorm(20))
y1<-c(100,rnorm(20))
df1<-data.frame(x1,y1)
df1
Output
              x1            y1
1  100.000000000  1.000000e+02
2    0.719522993  2.605494e-01
3    1.090771537 -1.865098e-04
4   -1.011001579 -8.651429e-01
5    0.371659744 -4.091896e-01
6   -1.604978659  2.156159e-01
7    0.062783668  3.888361e-01
8   -1.039950551 -1.191512e+00
9    0.221366328 -3.761471e-01
10  -1.034533695 -1.374490e+00
11   0.802379399  2.332555e+00
12  -0.005172285  8.169747e-01
13   2.082001055 -8.485883e-01
14   1.078353650  1.659481e+00
15   1.493539847  1.546731e+00
16  -0.504001706 -6.429116e-01
17  -0.251490099 -1.853292e-01
18   0.802616056 -1.794599e+00
19   0.333122883  8.486888e-02
20   1.284447868  4.464472e-01
21  -0.158662245  6.892470e-01
Creating the regression model between x1 and y1 −
Model1<-lm(y1~x1,data=df1)
Finding the hat values using Model1 −
hatvalues(Model1)
         1          2          3          4          5          6          7
0.99823012 0.04953700 0.04921783 0.05140777 0.04986241 0.05219527 0.05017270
         8          9         10         11         12         13         14
0.05144442 0.05001088 0.05143755 0.04946325 0.05024367 0.04850787 0.04922804
        15         16         17         18         19         20         21
0.04890439 0.05079436 0.05050904 0.04946304 0.04990002 0.04906284 0.05040753
Finding the high leverage values by considering hat values greater than 0.05 −
which(hatvalues(Model1)>0.05)
 1  4  6  7  8  9 10 12 16 17 21
 1  4  6  7  8  9 10 12 16 17 21
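The 0.05 cutoff flags 11 of the 21 observations here, even though only observation 1 (hat value 0.998) stands apart from the rest. A widely used rule of thumb, offered here as an alternative rather than as the method of this article, is to flag hat values above 2p/n, where p is the number of estimated coefficients and n is the number of observations. The sketch below applies it to data generated the same way as Example1; the seed and the names h, p, n, cutoff and flagged are assumptions added for illustration.

```r
# Seed chosen only so the sketch is reproducible; the article does not set one.
set.seed(1)
x1 <- c(100, rnorm(20))
y1 <- c(100, rnorm(20))
df1 <- data.frame(x1, y1)
Model1 <- lm(y1 ~ x1, data = df1)

h <- hatvalues(Model1)
p <- length(coef(Model1))  # number of estimated coefficients (intercept and slope)
n <- nobs(Model1)          # number of observations used in the fit
cutoff <- 2 * p / n        # rule-of-thumb threshold, about 0.19 for p = 2, n = 21

# Only observation 1, the deliberately extreme point, exceeds this cutoff.
flagged <- which(h > cutoff)
flagged
```

Some references use 3p/n instead; either way the threshold scales with the model size and sample size rather than being fixed in advance.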
Example2
x2<-rpois(20,5)
y2<-rpois(20,1)
df2<-data.frame(x2,y2)
df2
Output
   x2 y2
1   3  2
2   4  1
3   7  0
4   6  0
5   6  0
6   6  1
7   2  3
8   7  1
9   3  2
10  5  0
11  6  1
12  6  0
13  3  0
14  5  1
15  6  0
16  5  0
17  9  1
18  3  0
19  5  0
20  3  4
Creating the regression model between x2 and y2 −
Model2<-lm(y2~x2,data=df2)
Finding the hat values using Model2 −
hatvalues(Model2)
         1          2          3          4          5          6          7
0.11666667 0.06666667 0.11666667 0.06666667 0.06666667 0.06666667 0.20000000
         8          9         10         11         12         13         14
0.11666667 0.11666667 0.05000000 0.06666667 0.06666667 0.11666667 0.05000000
        15         16         17         18         19         20
0.06666667 0.05000000 0.31666667 0.11666667 0.05000000 0.11666667
Finding the high leverage values by considering hat values greater than 0.08 −
which(hatvalues(Model2)>0.08)
 1  3  7  8  9 13 17 18 20
 1  3  7  8  9 13 17 18 20
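The repeated hat values in Example2 follow from the closed form for a model with a single predictor, h_i = 1/n + (x_i - mean(x))^2 / Sxx, so observations with the same distance from the mean of x2 share the same leverage. The sketch below checks this using the x2 and y2 values copied from the Example2 output, since the original rpois draws cannot be regenerated without a seed; the names n, Sxx and h_formula are introduced only for illustration.

```r
# x2 and y2 copied from the Example2 output so the results are reproducible.
x2 <- c(3, 4, 7, 6, 6, 6, 2, 7, 3, 5, 6, 6, 3, 5, 6, 5, 9, 3, 5, 3)
y2 <- c(2, 1, 0, 0, 0, 1, 3, 1, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 4)
Model2 <- lm(y2 ~ x2)

# Closed form for one predictor: h_i = 1/n + (x_i - mean(x))^2 / Sxx
n <- length(x2)
Sxx <- sum((x2 - mean(x2))^2)
h_formula <- 1 / n + (x2 - mean(x2))^2 / Sxx

# The formula should reproduce hatvalues() up to floating-point error.
all.equal(h_formula, unname(hatvalues(Model2)))

# Same observations as in the Example2 result: 1 3 7 8 9 13 17 18 20
which(h_formula > 0.08)
```

Note that y2 does not enter the formula at all: leverage depends only on the predictor values, which is why it identifies unusual x values rather than unusual responses.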