Geographically Weighted Negative Binomial Regression-Incorporating Overdispersion
All content following this page was uploaded by Alan Ricardo da Silva on 11 October 2016.
ABSTRACT
Geographically Weighted Negative Binomial Regression (GWNBR) was developed by Silva and Rodrigues
(2014) and it is a generalization of Geographically Weighted Poisson Regression (GWPR) proposed by
Nakaya et al. (2005) and of Poisson and negative binomial regressions. This paper presents a SAS®
macro, written in SAS/IML, that estimates the GWNBR model and uses PROC GMAP to draw the maps.
INTRODUCTION
Local spatial regression differs from global spatial regression by analyzing the relationship between
variables separately for each unit of study instead of pooling them. In fact, regions j closer to region
i have greater influence on the estimates of the regression coefficients at i than regions that are far
apart. Because each region has its own adjustment, the final result is a better representation of the process
as a whole. The reason for using this analysis is the violation of the assumption of stationarity demanded
by global models, which attribute the same relation between variables to all units of study.
Nonstationarity, or spatial heterogeneity, is the process where responses vary with location, area or
other characteristics of the spatial units (Anselin, 1988). For this kind of local modeling, Geographically
Weighted Regression (GWR) is used (see Fotheringham et al. (2002)).
However, this technique is used when the distribution of the data is Gaussian. In many applications, the
dependent variable represents a count, which makes classic GWR inappropriate to model this type of data.
Poisson and negative binomial are the most adequate distributions for the modeling of count data, and
Geographically Weighted Poisson Regression (GWPR) was developed by Nakaya et al. (2005). Silva and
Rodrigues (2014) developed GWR using negative binomial distribution, named GWNBR, and they showed
that GWNBR incorporates GWPR. Thus, the main objective of this paper is to show a SAS® macro to
estimate the parameters of the GWNBR model.
The most used global Negative Binomial Regression (NB-2) considers a logarithm link function.
Parameterizing this model in terms of $\mu_j / t_j$, where $t_j$ is an offset variable, $\mu_j$ is the
predicted mean, $\alpha$ is the parameter of overdispersion, $\beta_k$ is the parameter related to the
explicative variable $x_k$, for $k = 1, \dots, K$, and $y_j$ is the j-th dependent variable for
$j = 1, \dots, n$, we have
$$ y_j \sim NB\left[ t_j \exp\left( \sum_k \beta_k x_{jk} \right), \alpha \right] \qquad (1) $$
where NB represents Negative Binomial.
GWNBR is an extension of the global or non-spatial model (1), which allows the spatial variation of the
parameters $\beta_k$ and $\alpha$. Without limiting the functional form of this variation, GWNBR produces
non-parametric surfaces of the parameter estimates. This local model is described as the following:
$$ y_j \sim NB\left[ t_j \exp\left( \sum_k \beta_k(u_j, v_j) x_{jk} \right), \alpha(u_j, v_j) \right] \qquad (2) $$
where $(u_j, v_j)$ are the locations (coordinates) of the data points j, for $j = 1, \dots, n$.
The parameter estimation of the global model (1) is performed iteratively with the combination of the
Newton-Raphson (NR) and Iteratively Reweighted Least Squares (IRLS) methods, i.e., using $\hat{\alpha}$,
which is estimated by the NR method, we estimate the vector $\beta$ using the IRLS method. Thus, from this
new $\hat{\beta}$, we update $\hat{\alpha}$, and so forth until convergence is obtained. However,
modifications in the NR and IRLS algorithms are necessary to incorporate the local variations.
As shown in Silva and Rodrigues (2014), the analytical solution for the local log-likelihood of GWNBR is
given by
$$ \hat{\beta}(u_i, v_i)^{(m+1)} = \left[ X' W(u_i, v_i) A(u_i, v_i)^{(m)} X \right]^{-1} X' W(u_i, v_i) A(u_i, v_i)^{(m)} z(u_i, v_i)^{(m)} \qquad (3) $$
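where $X$ is the matrix of the explicative variables, with one row per observation and a leading column of ones,
$$ X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1K} \\ 1 & x_{21} & \cdots & x_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nK} \end{pmatrix} \qquad (4) $$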
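and $A(u_i, v_i)^{(m)}$ is an $n \times n$ GLM diagonal weighting matrix for iteration m and location i
$$ A(u_i, v_i)^{(m)} = \begin{pmatrix} a_{i1}^{(m)} & 0 & \cdots & 0 \\ 0 & a_{i2}^{(m)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{in}^{(m)} \end{pmatrix} \qquad (6) $$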
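$$ z(u_i, v_i)^{(m)} = X \hat{\beta}(u_i, v_i)^{(m)} + \frac{y_j - \mu_j\left(\hat{\beta}(u_i, v_i)^{(m)}\right)}{a_{ij}^{(m)} \left[ 1 + \alpha_i \times \mu_j\left(\hat{\beta}(u_i, v_i)^{(m)}\right) \right]} \qquad (7) $$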
where
$$ R(u_i, v_i) = \left[ X' W(u_i, v_i) A(u_i, v_i) X \right]^{-1} X' W(u_i, v_i) A(u_i, v_i) \qquad (10) $$
and the elements of $W(u_i, v_i)$ and $A(u_i, v_i)$ are given by (5) and (6), respectively.
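The IRLS update in Equation (3) can be illustrated with a small sketch. It is written in Python rather than SAS/IML purely for illustration, and the names `solve` and `irls_step` are hypothetical helpers, not part of the macro. The GLM weight and the adjusted response follow the expressions used for `Ai` and `zj` in the macro of Appendix I, and the kernel weights `w` are assumed to be already computed.

```python
import math

def solve(a, b):
    """Solve the small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def irls_step(X, y, w, beta, alpha, offset=None):
    """One local IRLS update of beta at a regression point with kernel weights w."""
    n, p = len(y), len(beta)
    offset = offset or [0.0] * n
    eta = [sum(X[j][k] * beta[k] for k in range(p)) + offset[j] for j in range(n)]
    mu = [math.exp(e) for e in eta]
    # GLM weights a_j, replaced by 1e-5 when non-positive (as the macro does for Ai)
    a = [mu[j] / (1 + alpha * mu[j])
         + (y[j] - mu[j]) * alpha * mu[j] / (1 + alpha * mu[j]) ** 2 for j in range(n)]
    a = [v if v > 0 else 1e-5 for v in a]
    # adjusted response z_j with the offset removed
    z = [eta[j] + (y[j] - mu[j]) / (a[j] * (1 + alpha * mu[j])) - offset[j] for j in range(n)]
    # normal equations (X' W A X) beta = X' W A z
    lhs = [[sum(w[j] * a[j] * X[j][r] * X[j][c] for j in range(n)) for c in range(p)]
           for r in range(p)]
    rhs = [sum(w[j] * a[j] * X[j][r] * z[j] for j in range(n)) for r in range(p)]
    return solve(lhs, rhs)

# Toy data (made up): intercept-only model, equal weights, fixed alpha.
X = [[1.0], [1.0], [1.0], [1.0]]
y = [2.0, 3.0, 5.0, 6.0]
w = [1.0, 1.0, 1.0, 1.0]
beta = [0.0]
for _ in range(25):
    beta = irls_step(X, y, w, beta, alpha=0.5)
```

For an intercept-only model the fitted mean should equal the sample mean, so `beta[0]` is expected to approach the logarithm of the mean of `y`.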
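After an estimate of $\beta(u_i, v_i)$ is obtained, the parameters $\alpha_i$ will be estimated using the
NR method, working with $\theta_i = 1/\alpha_i$ and based on the local log-likelihood given by (Silva and
Rodrigues, 2014)
$$ L(\theta_i \mid y_j, \beta(u_i, v_i)) = \sum_{j=1}^{n} \Big\{ y_j \log[\mu_j(\beta(u_i, v_i))] - [y_j + \theta_i] \times \log[\theta_i + \mu_j(\beta(u_i, v_i))] + \theta_i \log[\theta_i] + \log[\Gamma(y_j + \theta_i)] - \log[\Gamma(\theta_i)] - \log[\Gamma(y_j + 1)] \Big\} w(d_{ij}) \qquad (11) $$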
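As an illustration of Equation (11), the sketch below evaluates the kernel-weighted negative binomial log-likelihood for a given value of theta (the inverse of the overdispersion parameter, as in the macro comment "alpha=1/par, where par=theta"). It is a Python illustration, not the macro's code; the function name `local_loglik` and the toy inputs are assumptions.

```python
import math

def local_loglik(theta, y, mu, w):
    """Kernel-weighted NB log-likelihood in theta for fitted means mu and weights w."""
    total = 0.0
    for yj, mj, wj in zip(y, mu, w):
        total += wj * (yj * math.log(mj)
                       - (yj + theta) * math.log(theta + mj)
                       + theta * math.log(theta)
                       + math.lgamma(yj + theta)
                       - math.lgamma(theta)
                       - math.lgamma(yj + 1))
    return total

# Sanity check on a single observation with unit weight: the implied
# probabilities over all counts should sum to one (made-up theta and mu).
pmf_total = sum(math.exp(local_loglik(2.0, [k], [3.0], [1.0])) for k in range(400))
```

Observations with zero weight contribute nothing, which is how distant data points drop out of the local fit.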
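Maximizing the local log-likelihood (11) using the univariate NR method, we obtain
$$ \theta_i^{(m+1)} = \theta_i^{(m)} - \left[ H_i^{(m)} \right]^{-1} U_i^{(m)} \qquad (12) $$
where $U_i^{(m)}$ and $H_i^{(m)}$ are the first and second derivatives of the local log-likelihood with
respect to $\theta_i^{(m)}$, i.e.,
$$ U_i^{(m)} = \sum_{j=1}^{n} \left\{ \psi(\theta_i^{(m)} + y_j) - \psi(\theta_i^{(m)}) + \log(\theta_i^{(m)}) + 1 - \log[\theta_i^{(m)} + \mu_j(\beta(u_i, v_i))] - \frac{\theta_i^{(m)} + y_j}{\theta_i^{(m)} + \mu_j(\beta(u_i, v_i))} \right\} w(d_{ij}) \qquad (13) $$
$$ H_i^{(m)} = \sum_{j=1}^{n} \left\{ \psi'(\theta_i^{(m)} + y_j) - \psi'(\theta_i^{(m)}) + \frac{1}{\theta_i^{(m)}} - \frac{2}{\theta_i^{(m)} + \mu_j(\beta(u_i, v_i))} + \frac{\theta_i^{(m)} + y_j}{\left[ \theta_i^{(m)} + \mu_j(\beta(u_i, v_i)) \right]^2} \right\} w(d_{ij}) \qquad (14) $$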
where $\psi(\cdot)$ and $\psi'(\cdot)$ are the digamma and trigamma functions, respectively, which are given by
$$ \psi(z) = \frac{d \log \Gamma(z)}{dz} \quad \text{and} \quad \psi'(z) = \frac{d \psi(z)}{dz} = \frac{d^2 \log \Gamma(z)}{dz^2} $$
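The univariate Newton-Raphson update for theta can be sketched as follows. The score and Hessian expressions mirror the `g` and `hess` lines of the macro in Appendix I. Everything else is an assumption made for this Python illustration: the series-based `digamma` and `trigamma` helpers (the Python standard library has none), the halving safeguard for non-positive iterates (simpler than the macro's own safeguards), and the toy data.

```python
import math

def digamma(x):
    """psi(x) for x > 0: upward recurrence, then an asymptotic series."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1 / 12 - f * (1 / 120 - f / 252))

def trigamma(x):
    """psi'(x) for x > 0: upward recurrence, then an asymptotic series."""
    r = 0.0
    while x < 6.0:
        r += 1.0 / (x * x)
        x += 1.0
    f = 1.0 / (x * x)
    return r + 1.0 / x + f / 2 + f / x * (1 / 6 - f * (1 / 30 - f / 42))

def score_hess(theta, y, mu, w):
    """First and second derivatives of the weighted log-likelihood in theta."""
    g = sum(wj * (digamma(theta + yj) - digamma(theta) + math.log(theta) + 1
                  - math.log(theta + mj) - (theta + yj) / (theta + mj))
            for yj, mj, wj in zip(y, mu, w))
    h = sum(wj * (trigamma(theta + yj) - trigamma(theta) + 1 / theta
                  - 2 / (theta + mj) + (yj + theta) / (theta + mj) ** 2)
            for yj, mj, wj in zip(y, mu, w))
    return g, h

def nr_theta(y, mu, w, theta=1.0, tol=1e-8, max_iter=100):
    """Newton-Raphson iterations theta <- theta - g / h until the step is tiny."""
    for _ in range(max_iter):
        g, h = score_hess(theta, y, mu, w)
        new = theta - g / h
        if new <= 0:            # safeguard chosen for this sketch only
            new = theta / 2
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta

# Made-up overdispersed counts with a common fitted mean and equal weights.
y = [0, 0, 1, 1, 2, 3, 5, 8, 12, 20]
mu = [5.2] * len(y)
w = [1.0] * len(y)
theta_hat = nr_theta(y, mu, w)
alpha_hat = 1.0 / theta_hat
```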
Using the delta method, Silva and Rodrigues (2014) found that, for a function $g(\cdot)$ that is
differentiable in $\theta$, if the distribution of $\hat{\theta}_n$ converges to $N(\theta, \sigma_n^2)$, then
the distribution of $g(\hat{\theta}_n)$ also converges, to $N(g(\theta), [g'(\theta)]^2 \times \sigma_n^2)$
(Casella and Berger, 2001). If, under certain conditions, the estimators that maximize the local likelihood
are asymptotically normal, unbiased and consistent (Staniswalis, 1989), then the standard error of
$\hat{\alpha}_i = 1/\hat{\theta}_i$ follows from this result, because $Var(\hat{\theta}_i) = -1/H_i$, where
$H_i$ is given in (14), and $[g'(\theta_i)]^2 = 1/\theta_i^4$ for $g(\theta_i) = 1/\theta_i$.
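A minimal numerical sketch of this delta-method step, assuming the transformation is alpha = 1/theta (as in the macro, which sets `alpha=1/par`) and that the variance of the theta estimate is minus the inverse of the second derivative. The function name `alpha_and_se` and the input values are hypothetical.

```python
import math

def alpha_and_se(theta, hess):
    """Return alpha = 1/theta and its delta-method standard error."""
    var_theta = -1.0 / hess            # the Hessian is negative at a maximum
    se_theta = math.sqrt(var_theta)
    alpha = 1.0 / theta
    se_alpha = se_theta / theta ** 2   # |d(1/theta)/d theta| = 1 / theta^2
    return alpha, se_alpha

alpha, se_alpha = alpha_and_se(theta=2.0, hess=-4.0)
```

With these made-up inputs the standard error of theta is 0.5, so the standard error of alpha is 0.5 / 4 = 0.125.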
Thus, for each regression point i, the NR and IRLS algorithms are used alternately until the parameter
estimates achieve convergence.
To complete the fitting of the model, it is necessary to estimate the bandwidth of the chosen kernel function,
which helps determine the weights $w(d_{ij})$. One possibility could be to estimate it such that it minimizes the
corrected AIC criterion
$$ AIC_c = -2 L(\beta, \alpha) + 2k + \frac{2k(k+1)}{n - k - 1} \qquad (16) $$
where $k$ is the effective number of parameters and $L(\beta, \alpha)$ is the log-likelihood of GWNBR shown in
Equation (11).
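The criterion in Equation (16) is simple to compute once the log-likelihood and the effective number of parameters are available. The Python sketch below uses a hypothetical function name and made-up inputs.

```python
def aicc(loglik, k, n):
    """AIC with the small-sample correction term 2k(k+1)/(n-k-1)."""
    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)

value = aicc(loglik=-100.0, k=4, n=50)   # 208 + 40/45
```

Because the effective number of parameters grows as the bandwidth shrinks, the criterion penalizes very small bandwidths even when they improve the log-likelihood.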
The effective number of parameters of GWNBR can be written as $k = k_1 + k_2$, where $k_1$ and $k_2$ are the
effective numbers of parameters due to $\beta$ and $\alpha$, respectively. Following the method developed by
Nakaya et al. (2005), $k_1$ is given by the trace of the matrix $S$ (the name used in the macro of Appendix I),
whose i-th row is $x_i R(u_i, v_i)$, with $R(u_i, v_i)$ given in Equation (10).
GWNBRG MODEL
To avoid the difficulty associated with the estimation of $k_2$, Silva and Rodrigues (2014) proposed the
Geographically Weighted Negative Binomial Regression with global overdispersion, named GWNBRg. In this
model, the spatial variation is allowed only for $\beta(u_i, v_i)$, i.e.,
$$ y_j \sim NB\left[ t_j \exp\left( \sum_k \beta_k(u_j, v_j) x_{jk} \right), \alpha \right] \qquad (19) $$
where the parameters are the same as in Equation (2).
In the GWNBRg model, the estimation of the parameter $\alpha$ is made globally, i.e., Silva and Rodrigues (2014)
assumed that all of the parameters in the model are stationary in order to estimate a global overdispersion
$\hat{\alpha}$, which is then used in the local estimates $\beta(u_i, v_i)$. Consequently, they proposed that the
estimate of $\alpha$ in the GWNBRg model be the same as that obtained through non-spatial (or global) negative
binomial regression.
The parameters $\beta(u_i, v_i)$ are estimated using the IRLS method, as in the GWNBR model described earlier,
assuming that $\hat{\alpha}_i = \hat{\alpha}$ for all i. Note that it is not necessary to alternate the NR and
IRLS methods, because once $\alpha$ is estimated globally, $\beta(u_i, v_i)$ is estimated for each regression
point i using only the IRLS method.
Because there is no spatial variation for $\alpha$, its contribution to the effective number of parameters in
the model is unity, i.e., $k_2 = 1$. Consequently, the bandwidth can be found using the AIC criterion (16),
with $k = k_1 + 1$.
More details about GWR, bandwidth and other topics can be found in Fotheringham et al. (2002); Paez et
al. (2011); Leung et al. (2000); Wheeler and Tiefelsdorf (2005); Farber and Paez (2007); McMillen (2010);
Cleveland and Devlin (1988).
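The kernel weights $w(d_{ij})$ depend on a distance and a bandwidth. The sketch below, in Python for illustration only, follows the expressions that appear in the macro of Appendix I: a great-circle distance with an Earth radius of 6371 km, a Gaussian kernel, and a bisquare kernel. The function names are hypothetical, and coordinates are assumed to be longitude and latitude in degrees.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate value, as in the macro comment

def great_circle_km(lon1, lat1, lon2, lat2):
    """Great-circle distance (spherical law of cosines), coordinates in degrees."""
    rad = math.pi / 180.0
    c = (math.sin(lat1 * rad) * math.sin(lat2 * rad)
         + math.cos(lat1 * rad) * math.cos(lat2 * rad)
         * math.cos(abs(lon1 - lon2) * rad))
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return math.acos(c) * EARTH_RADIUS_KM

def gaussian_weight(d, h):
    """Gaussian kernel: exp(-0.5 * (d / h)^2)."""
    return math.exp(-0.5 * (d / h) ** 2)

def bisquare_weight(d, h):
    """Bisquare kernel: (1 - (d / h)^2)^2 within the bandwidth, zero beyond it."""
    return (1.0 - (d / h) ** 2) ** 2 if d <= h else 0.0

quarter_circle = great_circle_km(0.0, 0.0, 90.0, 0.0)
```

Both kernels give weight 1 at distance zero; the bisquare kernel reaches exactly zero at the bandwidth, while the Gaussian kernel only decays towards zero.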
SAS® MACRO
The SAS® macros use the IML (Interactive Matrix Language) procedure, and their parameters are described
as follows. The %golden macro runs the Golden Section Search to find the optimal bandwidth, and the
%gwnbr macro estimates the GWNBR model.
%golden(data=,y=,x=,lat=,long=,method=,type=,gwr=,offset=,out=);
%gwnbr(data=,y=,x=,lat=,long=,h=,grid=,latg=,longg=,gwr=,method=,alphag=,
offset=,geocod=,out=);
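The idea behind a Golden Section Search can be sketched generically as follows. This is a Python illustration of the general technique, not a transcription of the %golden macro; the function name, the stopping rule and the toy objective are assumptions. In the macro the objective would be the CV or AIC value obtained for a candidate bandwidth.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-6, max_iter=200):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    r = (math.sqrt(5.0) - 1.0) / 2.0       # about 0.618
    h1 = lo + (1.0 - r) * (hi - lo)
    h2 = lo + r * (hi - lo)
    f1, f2 = f(h1), f(h2)
    for _ in range(max_iter):
        if abs(hi - lo) < tol:
            break
        if f1 < f2:
            # minimum lies in [lo, h2]; the old h1 becomes the new h2
            hi, h2, f2 = h2, h1, f1
            h1 = lo + (1.0 - r) * (hi - lo)
            f1 = f(h1)
        else:
            # minimum lies in [h1, hi]; the old h2 becomes the new h1
            lo, h1, f1 = h1, h2, f2
            h2 = lo + r * (hi - lo)
            f2 = f(h2)
    return (lo + hi) / 2.0

# Toy objective with a known minimum at 40 (a stand-in for a bandwidth criterion).
h_opt = golden_section_min(lambda h: (h - 40.0) ** 2, 1.0, 200.0)
```

Each iteration reuses one of the two interior evaluations, so only one new evaluation of the (expensive) model fit is needed per step.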
The macro calls used for the best models are as below:
The problem detected in the estimation of the bandwidth was a clue of the lack of fit obtained with the
Poisson distribution. Thus, we can conclude that the fleet of vehicles used in road freight transportation
probably exhibits overdispersion. Therefore, the negative binomial distribution is the best approach for the
modeling of these variables.
As an example, the surface of the parameters for GWNBR with bandwidth equal to 53.2 km and their
standard errors are shown in Figure 1. The greater values for the intercept ($\beta_0$) are concentrated in the
Vitoria metropolitan region (southeast side), which is the capital of the state. This finding reflects the greater
amount of vehicles located in that place. In contrast, the parameter estimates for $\beta_1$ are smaller in that
area because there are many industries and the link function is exponential.
Figure 1. Surface of the parameter estimates and standard errors obtained with the GWNBR model
Output 1 shows the output of the %gwnbr macro, where we can see, besides descriptive statistics, the actual
alpha-level, t-critical and the number of parameters estimated by the GWNBR model, following Silva and
Fotheringham (2015), and pseudo $R^2$ and adjusted $R^2$ measures, considering deviance (pctdev and
adjpctdev, respectively) and considering likelihood (pctll and adjpctll, respectively), following
Cameron and Windmeijer (1996).
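The deviance-based measures can be illustrated with the expressions that appear in the macro of Appendix I, where `pctdev=1-dev/devnull` and the adjusted version rescales by `(n-1)/(n-npar)`. The Python function names and the numbers below are made up for illustration.

```python
def pct_deviance(dev, devnull):
    """Pseudo R-squared: share of the null deviance explained by the model."""
    return 1.0 - dev / devnull

def adjusted_pct(pct, n, npar):
    """Adjusted version, penalizing the effective number of parameters npar."""
    return 1.0 - (n - 1.0) / (n - npar) * (1.0 - pct)

pctdev = pct_deviance(dev=50.0, devnull=200.0)    # 0.75
adjpctdev = adjusted_pct(pctdev, n=101, npar=11)  # 1 - (100/90) * 0.25
```

The same adjustment is applied in the macro to the likelihood-based measure to obtain `adjpctll` from `pctll`.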
CONCLUSION
Geographically Weighted Negative Binomial Regression (GWNBR) is an important tool to incorporate
overdispersion into the local model. The equations proposed by Silva and Rodrigues (2014) were encoded
in the SAS/IML language, so this model can now be estimated; the GWPR proposed by Nakaya et al. (2005)
can also be estimated as a particular case of GWNBR. The illustration showed that, when overdispersion is
present, the Poisson distribution is not adequate to model the data, and the quality of fit of the negative
binomial distribution, mainly GWNBR, is superior.
REFERENCES
Anselin, L. 1988. Spatial Econometrics: Methods and Models. Santa Barbara: Kluwer Academic
Publishers.
Cameron, A. C. and Windmeijer, F. A. G. 1996. “R-Squared Measures for Count Data
Regression Models with Applications to Health-Care Utilization”. Journal of Business and
Economic Statistics 14(2): 209-220.
Casella, G. and Berger, R. L. 2001. Statistical Inference. 2nd ed. Duxbury.
Cleveland, W. S. and Devlin, S. J. 1988. “Locally-weighted Regression: An Approach to
Regression Analysis by Local Fitting”. Journal of the American Statistical Association. 83: 596-
610.
Farber, S. and Paez, A. 2007. “A Systematic Investigation of Cross-Validation in GWR Model
Estimation: Empirical Analysis and Monte Carlo Simulations”. Journal of Geographical Systems
9(4): 371-396.
Fotheringham, A. S., Brunsdon, C. and Charlton, M. 2002. Geographically Weighted Regression.
Wiley.
Leung, Y., Mei, C.-L. and Zhang, W.-X. 2000. “Statistical Tests for Spatial Nonstationarity Based
on the Geographically Weighted Regression Model”. Environment and Planning A 32: 9-32.
McMillen, D. P. 2010. “Issues in Spatial Data Analysis”. Journal of Regional Science 50(1): 119-
141.
Nakaya, T., Fotheringham, A. S., Brunsdon, C. and Charlton, M. 2005. “Geographically Weighted
Poisson Regression for Disease Association Mapping”. Statistics in Medicine 24: 2695-2717.
Paez, A., Farber, S. and Wheeler, D. 2011. “A Simulation-based Study of Geographically
Weighted Regression as a Method for Investigating Spatially Varying Relationships”. Environment
and Planning A 43(12): 2992-3010.
Silva, A. R. and Rodrigues, T. C. V. 2014. “Geographically Weighted Negative Binomial
Regression - Incorporating Overdispersion”. Statistics and Computing 24: 769-783.
Silva, A. R. and Fotheringham, A. S. 2015. “The Multiple Testing Issue in Geographically
Weighted Regression”. Geographical Analysis. Forthcoming.
Staniswalis, J. G. 1989. “The kernel Estimate of a Regression Function in Likelihood-based
Models”. Journal of the American Statistical Association 84: 276-283.
Wheeler, D. and Tiefelsdorf, M. 2005. “Multicollinearity and Correlation among Local Regression
Coefficients in Geographically Weighted Regression”. Journal of Geographical Systems 7(2):
161-187.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Name: Alan Ricardo da Silva
Enterprise: Universidade de Brasília
Address: Campus Universitário Darcy Ribeiro, Departamento de Estatística, Prédio CIC/EST sala A1
35/28
City, State ZIP: Brasília, DF, Brazil, 70910-900
Work Phone: +5561 3107 3672
E-mail: [email protected]
Web: www.est.unb.br
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
APPENDIX I – SAS® MACRO
/************************************************/ b&i=j(n,1,0);
/**********************************************/ b&i[i,1]=b[(i-1)*npar+&i+1,5];
end;
/************************************************/ %end;
npar=&par+1; quit;
b&i=j(1,8,0);
b&i[1,]=b[(i-1)*npar+&i+1,];
%macro perm(data=,geocod=,x=,y=);
append from b&i;
proc iml;
end;
use &data;
%end;
read all var{&geocod &x &y} into tab;
quit;
close &data;
%mend beta;
n=nrow(tab);
u = 1:n;
%mend perm;
use _alpha_;
/************************************************/
n=nrow(b);
npar=&par+1;
%macro
n=n/npar; estac(data=,y=,x=,lat=,long=,h=,grid=,latg=,longg=,
gwr=,method=,alphag=,offset=,geocod=,rep=);
vk=0;
%let nvar=0;
%do i=0 %to ∥
%do %while(%scan(%str(&x),&nvar+1)~=); %macro
golden(data=,y=,x=,lat=,long=,method=,type=,gwr=,of
%let nvar=%eval(&nvar+1); fset=,out=);
%end; proc iml;
%gwnbr(data=&data,y=&y,x=&x,lat=&lat,long=&long,h=& use &data;
h,grid=&grid,
read all var {&y} into y;
latg=&latg,longg=&longg,gwr=&gwr,method=&method,alp
hag=&alphag, read all var {&x} into x;
%vk(&nvar); n=nrow(y);
data vk2; set vk; i=1; run; %if &offset= %then %do; offset=j(n,1,0);
%end;
%do it=2 %to (&rep+1);
%else %do; read all var {&offset} into
%perm(data=&data,geocod=&geocod,x=&long,y= offset; %end;
&lat);
close &data;
%gwnbr(data=perm,y=&y,x=&x,lat=&lat,long=&
long,h=&h,grid=&grid, x=j(n,1,1)||x;
count=j(1,nvar,0); raio=arcos(-1)/180;
do v=1 to nvar;
ang=sin(COORD[i,2]*raio)*sin(COORD[j,2]*ra
do i=1 to n; io)+cos(COORD[i,2]*raio)*cos(COORD[j,2]*raio)*cos(d
if*raio);
if x[i,v]>=x[1,v] then
count[v]=count[v]+1; arco=arcos(ang);
end; d[1]=i;
end; d[2]=j;
count=count/n*100; d[3]=arco*6371 /*Earth's Radius =
6371 (approximately)*/;
print count;
append from d;
varnames="b0":"b&nvar"||"alpha";
end;
create pvalor_est from count [colname=varnames];
else do;
append from count;
d[1]=i;
quit;
d[2]=j;
%mend estac;
d[3]=sqrt((COORD[i,1]-
COORD[j,1])**2+(COORD[i,2]-COORD[j,2])**2);
append from d;
end;
/***********************************************/
end;
/************ GOLDEN SECTION SEARCH ***********/
end;
/*******************************************/
close _dist_;
finish dist;
hess=choose(hess=0,1E-23,hess);
run dist(coord,n);
par0=par;
use _dist_;
par=par0-inv(hess)*g;
read all into d;
if aux1>50 &
maxd=int(max(d[,3])+1); par>1E5 then do;
free d; dpar=
0.0001;
close _dist_;
h1=h0+(1-r)*(h3-h0); end;
do while (abs(ddpar)>0.00001);
tt=choose(tt=0,1E-10,tt);
aux1=0; dpar=1;
dev=2*sum(y#log(tt)-
parold=par;
(y+1/a)#log((1+a*y)/(1+a*u)));
do while
ddev=dev-olddev;
(abs(dpar)>0.001);
end;
aux1=aux1+1;
if aux2>4 then ddpar=1E-
if par<0 then do;
9;
else ddpar=par-parold;
par=0.00001;
end;
end;
alpha=a;
par=choose(par<1E-10,1E-10,par); end;
g=sum(digamma(par+y)- n=nrow(y);
digamma(par)+log(par)+1-log(par+u)-
(par+y)/(par+u)); aux2=0;
hess=sum(trigamma(par+y)- do i=1 to n;
trigamma(par)+1/par-
2/(par+u)+(y+par)/((par+u)#(par+u))); d=j(1,3,0);
dist=d;
hess=choose(abs(hess)<1E-23,sign(hess)*1E-
do j=1 to n;
23,hess);
if abs(coord[,1])<180 then do;
dif=abs(COORD[i,1]-COORD[j,1]); if
dist[jj,3]<=h then w[jj]=(1-(dist[jj,3]/h)**2)**2;
raio=arcos(-1)/180;
else
w[jj]= 0;
ang=sin(COORD[i,2]*raio)*sin(COORD[j,2]*ra
io)+cos(COORD[i,2]*raio)*cos(COORD[j,2]*raio)*cos(d end;
if*raio);
end;
if i=j then
arco=0; end;
end; if
dist[jj,4]<= h & dist[jj,3]^=0 then w[jj,1]=(1-
u=nrow(dist); (dist[jj,3]/hn)**2)**2;
w=j(u,1,0); else
w[jj,1]=0;
if method= "fixed" then do;
if type="cv" then do; w[jj,2]=dist[jj,2];
do jj=1 to u; end;
if end;
dist[jj,3]<=maxd*0.8 & dist[jj,3]^=0 then
w[jj]=exp(-0.5*(dist[jj,3]/h)**2); else do;
else do jj=1 to n;
w[jj]= 0;
if
end; dist[jj,4]<=h then w[jj,1]=(1-
(dist[jj,3]/hn)**2)**2;
end;
else
else do; w[jj,1]=0;
do jj=1 to u;
w[jj,2]=dist[jj,2];
if
dist[jj,3]<=maxd*0.8 then w[jj]=exp(- end;
0.5*(dist[jj,3]/h)**2);
end;
else
w[jj]= 0; call sort(w,{2});
end; end;
end; wi=w[,1];
end; ym=sum(y)/nrow(y);
else aux1=0;
w[jj]= 0;
dpar=1;
end;
parold=par;
end;
if gwr="global" |
else do; gwr="poisson" then do;
do jj=1 to u; dpar=0.00001;
if gwr= "global" then par=1/a;
end;
/* computing alpha=1/par, where par=theta */
do while (abs(dpar)>0.001);
  aux1=aux1+1;
  if gwr="local" then do;
    par=choose(par<1E-10,1E-10,par);
    g=sum((digamma(par+y)-digamma(par)+log(par)+1-log(par+uj)-(par+y)/(par+uj))#w[,1]);
    hess=sum((trigamma(par+y)-trigamma(par)+1/par-2/(par+uj)+(y+par)/((par+uj)#(par+uj)))#w[,1]);
  end;
  hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess);
  hess=choose(hess=0,1E-23,hess);
  par0=par;
  par=par0-inv(hess)*g;
  if par<=0 then do;
    count=count+1;
    if count<10 then par=0.000001;
    else par=abs(par);
  end;
  if aux1>50 & par>1E5 then do;
    dpar= 0.0001;
    aux2=aux2+1;
    if aux2=1 then par=2 ;
    else if aux2=2 then par=1E5;
    else if aux2=3 then par=0.0001;
  end;
  else do;
    dpar=par-par0;
    if par<1E-3 then dpar=dpar*100;
  end;
end;
if gwr= "poisson" then alpha=0;
else alpha=1/par;
dev=0; ddev=1; cont=0;
/* computing beta */
do while (abs(ddev)>0.000001);
  cont=cont+1;
  uj=choose(uj>1E100,1E100,uj);
  aux= (alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
  Ai=(uj/(1+alpha*uj))+(y-uj)#aux;
  Ai=choose(Ai<=0,1E-5,Ai);
  zj=nj+(y-uj)/(Ai#(1+alpha*uj)) - offset;
  if det(x`*(wi#Ai#x))=0 then bi=j(ncol(x),1,0);
  else bi=inv(x`*(wi#Ai#x))*x`*(wi#Ai#zj);
  nj=x*bi + offset;
  nj=choose(nj>1E2,1E2,nj);
  uj=exp(nj);
  olddev=dev;
  uj=choose(uj<1E-150,1E-150,uj);
  tt=y/uj;
  tt=choose(tt=0,1E-10,tt);
  if gwr= "poisson" then dev=2*sum(y#log(tt)-(y-uj));
  else dev=2*sum(y#log(tt)-(y+1/alpha)#log((1+alpha*y)/(1+alpha*uj)));
  if cont>100 then ddev= 0.0000001;
  else ddev=dev-olddev;
end;
jj=jj+1;
if gwr="global" | gwr="poisson" | aux2>4 | jj>50 | ddpar=0.0000001 then ddpar=1E-9;
else do;
  ddpar=par-parold;
  if par<1E-3 then ddpar=ddpar*100;
end;
end;
Ai2=(uj/(1+alpha*uj))+(y-uj)#(alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
if Ai2[><,]<1E-5 then Ai2=choose(Ai2<1E-5,1E-5,Ai2);
Ai=Ai2;
if det(x`*(wi#Ai#x))=0 then else do;
S[i,]=j(1,n,0);
if type="aic" then pos=2;
else S[i,]=
x[i,]*inv(x`*(wi#Ai#x))*(x#wi#Ai)`; else pos=4;
npar2=golden2[3];
if type="cv" then do; res2=golden2[pos];
pos=1; append;
create &out var{h1 res1 h2 res2}; if g1<g2 then do;
end; xmin=h1;
npar=golden1[3]; read all var{&geocod} into
geocod_;
golden=g1;
%end;
end;
close &data;
else do;
%if &grid^= %then %do;
xmin=h2;
use &grid;
npar=golden2[3];
read all var{&longg &latg} into POINTS;
golden=g2;
close &grid;
end;
geocod_=nrow(points,1,0);
end;
%end;
else do;
x=j(n,1,1)||x;
xmin = (h3+h0)/2;
yhat=j(n,1,0);
golden = cv(xmin);
h=&h;
npar=golden[3];
gwr="&gwr"; *global,local, poisson;
golden=golden[pos];
method="&method"; *fixed, adaptive1, adaptiven;
end;
m=nrow(POINTS);
h1 = xmin;
bii=j(ncol(x)*m,2,0); alphaii= j(m,2,0);
res1 = golden;
xcoord=j(ncol(x)*m,1,0); ycoord=j(ncol(x)*m,1,0);
npar1=npar;
&geocod= j(ncol(x)*m,1,0);
h2 = .;
sebi=j(ncol(x)*m,1,0); sealphai= j(m,1,0);
res2 = .;
S=j(n,n,0);
npar2=.;
yp=y-sum(y)/n;
append;
probai=j(m,1,0); probbi=j(m,1,0);
if type="cv" then print golden xmin;
yhat=j(m,1,0);
else print golden xmin npar;
res= j(m,1,0);
quit;
if gwr^="poisson" then do;
%mend golden;
ym=sum(y)/nrow(y);
u=(y+ym)/2;
n=log(u);
parold=par;
%macro
gwnbr(data=,y=,x=,lat=,long=,h=,grid=,latg=,longg=, do while (abs(dpar)>0.001);
gwr=,method=,alphag=,offset=,geocod=,out=);
aux1=aux1+1;
proc iml;
if par<0 then
use &data; par=0.00001;
par0=par; if abs(COORD[,1])<180 then do;
par=par0-inv(hess)*g; dif=abs(POINTS[i,1]-COORD[j,1]);
nj=log(uj); aux2=aux2+1;
if gwr="global" | gwr="poisson"
then do; else alpha=1/par;
hess=sum((trigamma(par+y)- nj=choose(nj>1E2,1E2,nj);
trigamma(par)+1/par-
uj=exp(nj);
2/(par+uj)+(y+par)/((par+uj)#(par+uj)))#w[,1]);
olddev=dev;
end;
uj=choose(uj<1E-150,1E-
par0=par;
150,uj);
hess=choose(abs(hess)<1E-
tt=y/uj;
23,sign(hess)*1E-23,hess);
tt=choose(tt=0,1E-10,tt);
hess=choose(hess=0,1E-
23,hess); if gwr= "poisson" then
dev=2*sum(y#log(tt)-(y-uj));
par=par0-inv(hess)*g;
else dev=2*sum(y#log(tt)-
if par<=0 then do;
(y+1/alpha)#log((1+alpha*y)/(1+alpha*uj)));
count=count+1;
if cont>100 then ddev=
if count=1 then 0.0000001; *MAXINTB;
par=0.000001;
else ddev=dev-olddev;
else if count=2
end;
then par=0.0001;
jj=jj+1;
else
par=1/alphag; *print jj bi;
end; if gwr="global" | gwr="poisson" |
aux2>4 | count>3 | jj>200 then ddpar=1E-9;
if aux1>100 & par>1E5
then do; *MAXINTA; else do;
dpar= 0.0001; ddpar=par-parold;
if aux2=0 then if par<1E-3 then
par=1/alphag + 0.0011; ddpar=ddpar*100;
if aux2=1 then end;
par=2 ;
/* print j aux1 cont aux2 count parold par
else if aux2=2 ddpar;*/
then par=1E5;
end;
else if aux2=3
then par=0.0001; if aux2>4 then probai[i]=1;
if count>3 then probai[i]=2; b=bii[,2];
Ai2=(uj/(1+alpha*uj))+(y- alphai=alphaii[,2];
uj)#(alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
_id_= bii[,1];
if Ai2[><,]<1E-5 then do;
_ida_=alphaii[,1];
probbi[i]=1;
Ai2=choose(Ai2<1E-5,1E-5,Ai2);
_beta_=shape(bii[,1:2],n);
end;
i=do(2,ncol(_beta_),2);
Ai=Ai2;
_beta_=_beta_[,i];
%if &grid= | &grid=&data %then %do;
call qntl(qntl,_beta_);
if det(x`*(wi#Ai#x))=0 then
S[i,]=j(1,n,0); qntl=qntl//(qntl[3,]-qntl[1,]);
r=1/alpha;
sealpha=ser/(r**2); _stdbeta_=shape(sebi,n);
alphaii[i,1]=i; qntls=qntls//(qntls[3,]-qntls[1,]);
alphaii[i,2]= alpha; descripts=_stdbeta_[:,]//_stdbeta_[><,]//_stdbeta_[
<>,];
end;
m1=(i-1)*ncol(x)+1;
print qntls[label="Quantiles of GWNBR Standard
m2=m1+(ncol(x)-1); Errors"
sebi[m1:m2,1]=seb; rowname={"P25", "P50", "P75", "IQR"}
colname={'Intercept' &x}],,
bii[m1:m2,1]=i;
descripts[label="Descriptive Statistics of Standard
bii[m1:m2,2]=bi;
Errors" rowname={"Mean", "Min", "Max"}
xcoord[m1:m2,1]= POINTS[i,1];
colname={'Intercept' &x}];
ycoord[m1:m2,1]= POINTS[i,2];
&geocod[m1:m2,1]= geocod_[i,1];
%if &grid= | &grid=&data %then %do;
%if &grid= | &grid=&data %then %do;
yhat=choose(yhat<1E-150,1E-150,yhat);
yhat[i]=uj[i];
tt=y/yhat;
%end;
tt=choose(tt=0,1E-10,tt);
end;
if gwr= "poisson" then do;
tstat= bii[,2]/sebi;
dev=2*sum(y#log(tt)-(y-yhat));
probtstat=2*(1-probnorm(abs(tstat)));
devnull=2*sum(y#log(y/y[:])-(y-
if gwr^="poisson" then do; y[:]));
aprobtstat=2*(1-probnorm(abs(atstat))); end;
*check for normality;
else do;
end;
dev=2*sum(y#log(tt)-
else do; (y+1/alphai)#log((1+alphai#y)/(1+alphai#yhat)));
atstat=j(n,1,0); devnull=2*sum(y#log(y/y[:])-
(y+1/alphai)#log((1+alphai#y)/(1+alphai#y[:])));
aprobtstat=j(n,1,1);
pctdev=1-dev/devnull;
end;
end;
if gwr^="poisson" then do; print gwr method ll dev pctdev adjpctdev
pctll adjpctll npar aic aicc bic;
a2=y+1/alphai; b2=1/alphai;
create _res_ from res[colname={"_id_"
algamma=j(n,1,0); "xcoord" "ycoord" "yobs" "yhat" "res" "resraw"}];
blgamma=j(n,1,0);
append from res;
do i=1 to nrow(y);
stat=ll|| dev|| pctdev || adjpctdev||
algamma[i]=lgamma(a2[i]); pctll || adjpctll || npar|| aic|| aicc|| bic;
blgamma[i]=lgamma(b2[i]); create _stat_ from stat[colname={"l1"
"dev" "pctdev" "adjpctdev" "pctll" "adjpctll"
end;
"npar" "aic" "aicc" "bic"}];
end;
append from stat;
c2=y+1;
%end;
clgamma=j(n,1,0);
%else %do; print gwr method; %end;
do i=1 to nrow(y);
clgamma[i]=lgamma(c2[i]);
create _beta_ var{_id_ &geocod xcoord ycoord b sebi
end; tstat probtstat}; * _beta_ has beta vector for each
point i;
if gwr^="poisson" then do;
append;
ll=sum(y#log(alphai#yhat)-
(y+1/alphai)#log(1+alphai#yhat)+ algamma - blgamma xcoord=COORD[,1];ycoord=COORD[,2];
- clgamma );
&geocod=unique(&geocod)`;
if gwr="global" & alphai^=1/parg
sig_alpha=j(n,1,"not significant at 90%");
then npar=trace(S);
v1=npar;
else npar=trace(S)+1;
do i=1 to n;
ll1=sum(y#log(y/(alphai#yhat))-
y+(y+1/alphai)#log(1+alphai#yhat)-algamma+blgamma); if aprobtstat[i]<0.01*(ncol(x)/v1) then
sig_alpha[i]="significant at 95%";
llnull=sum(y#log(y/y[:]));
else if aprobtstat[i]<0.1*(ncol(x)/v1) then
pctll=1-ll1/llnull;
sig_alpha[i]="significant at 90%";
end;
else sig_alpha[i]="not significant at 90%";
else do;
end;
ll=sum(-yhat+y#log(yhat)-clgamma);
create _alpha_ var{_ida_ &geocod xcoord ycoord
npar=trace(S); alphai sealphai atstat aprobtstat sig_alpha probai
probbi}; * _alpha_ has alpha vector for each point
pctll=pctdev; i;
end; append;
adjpctdev=1-((nrow(y)-1)/(nrow(y)- _tstat_=_beta_/_stdbeta_;
npar))*(1-pctdev);
_probt_=2*(1-probnorm(abs(_tstat_)));
adjpctll=1-((nrow(y)-1)/(nrow(y)-
npar))*(1-pctll); _bistdt_=geocod_||COORD||_beta_||_stdbeta_||_tstat_
||_probt_;
resord=y-yhat;
_colname1_={"Intercept" &x};
sigma2= (resord`*resord)/(n-npar);
_label_=repeat("std_",ncol(x))//repeat("tstat_",nco
sii=vecdiag(S); l(x))//repeat("probt_",ncol(x));
do j=1 to ncol(x); %macro map(data=,var=, map=, geocod=);
end; run;
end; quit;
_label_=repeat("sig_",ncol(x));
_colname_=concat(_label_,repeat(_colname1_`,1))`;
create _sig_parameters2_ from
_sig_[colname=_colname_];
/*
%let nvar=0;
%do %while(%scan(%str(&x),&nvar+1)~=);
%let nvar=%eval(&nvar+1);
%end;
use _beta_;
close _beta_;
n=nrow(b);
npar=&nvar+1;
do i=1 to (n/npar);
b&i[1,]=b[(i-1)*npar+&i+1,];
end;
%end;
*/
quit;
%mend gwnbr;