Geographically Weighted Negative Binomial Regression—incorporating


overdispersion

Article  in  Statistics and Computing · September 2014


DOI: 10.1007/s11222-013-9401-9


Paper 8000-2016
A SAS® Macro for Geographically Weighted Negative Binomial Regression
Alan Ricardo da Silva, Universidade de Brasília, Dep. de Estatística, Brazil
Thais Carvalho Valadares Rodrigues, Universidade de Brasília, Dep. de Estatística, Brazil

ABSTRACT
Geographically Weighted Negative Binomial Regression (GWNBR) was developed by Silva and Rodrigues
(2014). It generalizes the Geographically Weighted Poisson Regression (GWPR) proposed by
Nakaya et al. (2005), as well as the global Poisson and negative binomial regressions. This paper
presents a SAS® macro, written in SAS/IML, that estimates the GWNBR model and uses PROC GMAP to
draw the maps.

INTRODUCTION
Local spatial regression differs from global spatial regression by analyzing the relationship between
variables separately for each unit of study instead of pooling all units into a single fit. Regions j
closer to region i have greater influence on the estimates of the regression coefficients for region i
than regions that are far apart. Because each region has its own adjustment, the final result is a better
representation of the process as a whole. The reason for using this analysis is the violation of the
assumption of stationarity demanded by global models, which attribute the same relation between
variables to all units of study. Nonstationarity, or spatial heterogeneity, is the process where responses
vary with location, area or other characteristics of the spatial units (Anselin, 1988). In this setting,
Geographically Weighted Regression (GWR) (see Fotheringham et al. (2002)) is used for local modeling.
However, this technique assumes that the data are Gaussian. In many applications, the dependent
variable represents a count, which makes classic GWR inappropriate for this type of data. Poisson and
negative binomial are the most adequate distributions for modeling count data, and Geographically
Weighted Poisson Regression (GWPR) was developed by Nakaya et al. (2005). Silva and Rodrigues (2014)
developed GWR using the negative binomial distribution, named GWNBR, and showed that GWNBR
incorporates GWPR. Thus, the main objective of this paper is to present a SAS® macro to estimate the
parameters of the GWNBR model.

A BASIC OUTLINE OF GWNBR

The most used global Negative Binomial Regression (NB-2) considers a logarithm link function.
Parameterizing this model in terms of $\mu_j / t_j$, where $t_j$ is an offset variable, $\mu_j$ is the
predicted mean, $\alpha$ is the parameter of overdispersion, $\beta_k$ is the parameter related to the
explicative variable $x_k$, for $k = 1, \ldots, K$, and $y_j$ is the j-th dependent variable for
$j = 1, \ldots, n$, we have
$$ y_j \sim NB\left[ t_j \exp\left( \sum_k \beta_k x_{jk} \right), \alpha \right] \tag{1} $$
where NB represents Negative Binomial.

GWNBR is an extension of the global or non-spatial model (1), which allows the spatial variation of the
parameters $\beta_k$ and $\alpha$. Without limiting the functional form of this variation, GWNBR produces
non-parametric surfaces of the parameter estimates. This local model is described as follows:
$$ y_j \sim NB\left[ t_j \exp\left( \sum_k \beta_k(u_j, v_j) x_{jk} \right), \alpha(u_j, v_j) \right] \tag{2} $$
where $(u_j, v_j)$ are the locations (coordinates) of the data points j, for $j = 1, \ldots, n$.

The parameter estimation of the global model (1) is performed iteratively by combining the
Newton-Raphson (NR) and Iteratively Reweighted Least Squares (IRLS) methods, i.e., using $\hat{\alpha}$,
which is estimated by the NR method, we estimate the vector $\beta$ using the IRLS method. Then, from
this new $\hat{\beta}$, we update $\hat{\alpha}$, and so forth until convergence is obtained. However,
modifications in the NR and IRLS algorithms are necessary to incorporate the local variations.
As shown in Silva and Rodrigues (2014), the analytical solution for the local log-likelihood of GWNBR is
given by
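$$ \hat{\beta}(u_i, v_i)^{(m+1)} = \left[ X' W(u_i, v_i) A(u_i, v_i)^{(m)} X \right]^{-1} X' W(u_i, v_i) A(u_i, v_i)^{(m)} z(u_i, v_i)^{(m)} \tag{3} $$
where $X$ is the $n \times (K+1)$ matrix of the explicative variables, including the column of ones for
the intercept,
$$ X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1K} \\ 1 & x_{21} & \cdots & x_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nK} \end{bmatrix} \tag{4} $$
$W(u_i, v_i)$ is an $n \times n$ GWR diagonal weighting matrix for point i
$$ W(u_i, v_i) = \begin{bmatrix} w_{i1} & 0 & \cdots & 0 \\ 0 & w_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_{in} \end{bmatrix} \tag{5} $$
and $A(u_i, v_i)^{(m)}$ is an $n \times n$ GLM diagonal weighting matrix for iteration m and location i
$$ A(u_i, v_i)^{(m)} = \begin{bmatrix} a_{i1}^{(m)} & 0 & \cdots & 0 \\ 0 & a_{i2}^{(m)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{in}^{(m)} \end{bmatrix} \tag{6} $$
The adjusted dependent variable $z(u_i, v_i)^{(m)}$ has elements
$$ z_j(u_i, v_i)^{(m)} = x_j \hat{\beta}(u_i, v_i)^{(m)} + \frac{y_j - \mu_j\left(\hat{\beta}(u_i, v_i)^{(m)}\right)}{a_{ij}^{(m)} \left( 1 + \alpha_i \times \mu_j\left(\hat{\beta}(u_i, v_i)^{(m)}\right) \right)} \tag{7} $$
where $x_j$ is the j-th row of $X$.
Silva and Rodrigues (2014) used the IRLS method with the observed Fisher Information Matrix, and the
elements $a_{ij}^{(m)}$ ($j = 1, \ldots, n$) of GWNBR (6) are the following, writing $\mu_j$ for
$\mu_j\left(\hat{\beta}(u_i, v_i)^{(m)}\right)$:
$$ a_{ij}^{(m)} = \frac{\mu_j}{1 + \alpha_i \mu_j} + \frac{\left( y_j - \mu_j \right) \alpha_i \mu_j}{1 + 2 \alpha_i \mu_j + \alpha_i^2 \mu_j^2} \tag{8} $$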
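
As a rough illustration of Equations (7) and (8), the following Python sketch computes one IRLS weight and one adjusted dependent value for a single observation. It is not the SAS/IML code of the macro: the function names `irls_weight` and `working_response` and the scalar interface are assumptions made for this example only.

```python
def irls_weight(y, mu, alpha):
    """Element a_ij of Equation (8) for one observation.

    y: observed count, mu: current fitted mean, alpha: local overdispersion.
    """
    am = alpha * mu
    return mu / (1.0 + am) + (y - mu) * am / (1.0 + 2.0 * am + am ** 2)


def working_response(eta, y, mu, alpha, a):
    """Element z_j of Equation (7).

    eta: current linear predictor x_j * beta, a: weight from irls_weight.
    """
    return eta + (y - mu) / (a * (1.0 + alpha * mu))


# With alpha = 0 the weight reduces to the Poisson weight mu.
print(irls_weight(5.0, 2.0, 0.0))
```

When the observed count equals the fitted mean, the second term of the weight vanishes and the adjusted value equals the linear predictor.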

As in GWPR (Nakaya et al., 2005), the covariance matrix of the parameter estimates can be estimated by
$$ Cov\left[ \hat{\beta}(u_i, v_i) \right] = C(u_i, v_i) A(u_i, v_i)^{-1} C'(u_i, v_i) \tag{9} $$
where
$$ C(u_i, v_i) = \left[ X' W(u_i, v_i) A(u_i, v_i) X \right]^{-1} \times X' W(u_i, v_i) A(u_i, v_i) \tag{10} $$
and the elements of $W(u_i, v_i)$ and $A(u_i, v_i)$ are given by (5) and (6), respectively.
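After an estimate of $\beta(u_i, v_i)$ is obtained, the parameters $\theta_i = 1/\alpha_i$ will be
estimated using the NR method based on the local log-likelihood given by (Silva and Rodrigues, 2014)
$$ L(\theta_i \mid x_j, \beta(u_i, v_i)) = \sum_{j=1}^{n} \Big\{ y_j \log\left[ \mu_j(\beta(u_i, v_i)) \right] - \left[ y_j + \theta_i \right] \times \log\left[ \theta_i + \mu_j(\beta(u_i, v_i)) \right] + \theta_i \log[\theta_i] + \log\left[ \Gamma(y_j + \theta_i) \right] - \log\left[ \Gamma(\theta_i) \right] - \log\left[ \Gamma(y_j + 1) \right] \Big\} w(d_{ij}) \tag{11} $$
Maximizing the local log-likelihood (11) using the univariate NR method, we obtain
$$ \theta_i^{(m+1)} = \theta_i^{(m)} - \left[ H_i^{(m)} \right]^{-1} U_i^{(m)} \tag{12} $$
where $U_i^{(m)}$ and $H_i^{(m)}$ are the first and second derivatives of the local log-likelihood with
respect to $\theta_i^{(m)}$, i.e.,
$$ U_i^{(m)} = \left. \frac{\partial L(\theta_i)}{\partial \theta_i} \right|_{\theta_i^{(m)}} = \sum_{j=1}^{n} \left\{ \psi\left( \theta_i^{(m)} + y_j \right) - \psi\left[ \theta_i^{(m)} \right] + \log\left[ \theta_i^{(m)} \right] + 1 - \log\left[ \theta_i^{(m)} + \mu_j(\beta(u_i, v_i)) \right] - \frac{\theta_i^{(m)} + y_j}{\theta_i^{(m)} + \mu_j(\beta(u_i, v_i))} \right\} w(d_{ij}) \tag{13} $$
$$ H_i^{(m)} = \left. \frac{\partial^2 L(\theta_i)}{\partial \theta_i^2} \right|_{\theta_i^{(m)}} = \sum_{j=1}^{n} \left\{ \psi'\left( \theta_i^{(m)} + y_j \right) - \psi'\left[ \theta_i^{(m)} \right] + \frac{1}{\theta_i^{(m)}} - \frac{2}{\theta_i^{(m)} + \mu_j(\beta(u_i, v_i))} + \frac{\theta_i^{(m)} + y_j}{\left( \theta_i^{(m)} + \mu_j(\beta(u_i, v_i)) \right)^2} \right\} w(d_{ij}) \tag{14} $$
where $\psi(\cdot)$ and $\psi'(\cdot)$ are the digamma and trigamma functions, respectively, which are given by
$$ \psi(z) = \frac{\partial \log \Gamma(z)}{\partial z} \quad \text{and} \quad \psi'(z) = \frac{\partial \psi(z)}{\partial z} = \frac{\partial^2 \log \Gamma(z)}{\partial z^2} $$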

Using the delta method, Silva and Rodrigues (2014) noted that, for a function $g(\cdot)$ that is
differentiable at $\theta$, if the distribution of $\hat{\theta}_n$ converges to $N(\theta, \sigma_n^2)$,
then the distribution of $g(\hat{\theta}_n)$ converges to $N(g(\theta), [g'(\theta)]^2 \times \sigma_n^2)$
(Casella and Berger, 2001). If, under certain conditions, the estimators that maximize the
local likelihood are asymptotically normal, unbiased and consistent (Staniswalis, 1989), then
$$ \hat{\alpha}_i = \frac{1}{\hat{\theta}_i} \quad \text{and} \quad Var(\hat{\alpha}_i) = \frac{-1}{H_i \, \hat{\theta}_i^4} \tag{15} $$
because $Var(\hat{\theta}_i) = -1/H_i$, where $H_i$ is given in (14), and $[g'(\theta_i)]^2 = 1/\theta_i^4$.
Thus, for each regression point i, the NR and IRLS algorithms are used alternately until the parameter
estimates achieve convergence.

To complete the fitting of the model, it is necessary to estimate the bandwidth of the chosen kernel
function, which helps determine the weights $w(d_{ij})$. One possibility could be to estimate it such that
it minimizes the corrected AIC criterion (AICc):
$$ AICc = -2 L(\beta, \alpha) + 2k + \frac{2k(k+1)}{n - k - 1} \tag{16} $$
where k is the effective number of parameters and $L(\beta, \alpha)$ is the log-likelihood of GWNBR shown in
Equation (11).
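
As a small worked illustration of Equation (16), the Python sketch below evaluates AICc from a log-likelihood, an effective number of parameters and a sample size. The function name `aicc` and the example numbers are assumptions chosen for this illustration and are not taken from the paper's data.

```python
def aicc(loglik, k, n):
    """Corrected AIC of Equation (16).

    loglik: maximized log-likelihood, k: effective number of parameters,
    n: number of observations (requires n > k + 1).
    """
    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)


# Hypothetical values: log-likelihood -100 with k = 3 and n = 50.
print(round(aicc(-100.0, 3.0, 50), 4))
```

The correction term grows quickly as k approaches n, which is why very small bandwidths (few points per local regression) are penalized.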
The effective number of parameters of GWNBR can be written as $k = k_1 + k_2$, where $k_1$ and $k_2$ are
the effective number of parameters due to $\beta$ and $\alpha$, respectively. Following the method developed
by Nakaya et al. (2005), $k_1$ is given by the trace of the matrix $S$, whose rows are
$$ s_j = x_j \left[ X' W(u_j, v_j) A(u_j, v_j) X \right]^{-1} X' W(u_j, v_j) A(u_j, v_j) \tag{17} $$
where $x_j$ is the j-th row of $X$.
However, to date, it has not been possible to estimate $k_2$, i.e., the contribution of the surface of
$\alpha$ to the effective number of parameters of the model. Consequently, Silva and Rodrigues (2014)
opted to estimate the bandwidth using the cross-validation criterion given by (Fotheringham et al., 2002):
$$ CV(b) = \sum_{j=1}^{n} \left[ y_j - \hat{y}_{\neq j}(b) \right]^2 \tag{18} $$
where $\hat{y}_{\neq j}(b)$ is the estimated value for point j, omitting the observation j, and b is the
bandwidth.
Note that the indetermination of $k_2$ does not prevent the fitting of GWNBR. However, this
indetermination makes it difficult to compare models because the complexity of the GWNBR, which is given
by the effective number of parameters, is unknown.

GWNBRg MODEL
To avoid the difficulty associated with the estimation of $k_2$, Silva and Rodrigues (2014) proposed the
Geographically Weighted Negative Binomial Regression with global $\alpha$, named GWNBRg. In this model,
the spatial variation is allowed only for $\beta(u_i, v_i)$, i.e.,
$$ y_j \sim NB\left[ t_j \exp\left( \sum_k \beta_k(u_j, v_j) x_{jk} \right), \alpha \right] \tag{19} $$
where the parameters are the same as in Equation (2).
In the GWNBRg model, the parameter $\alpha$ is estimated globally, i.e., Silva and Rodrigues (2014)
assumed that all of the parameters in the model are stationary in order to estimate a global
overdispersion $\hat{\alpha}$, which is then used in the local estimates $\beta(u_i, v_i)$. Consequently,
they proposed that the estimate of $\alpha$ in the GWNBRg model be the same as that obtained through
non-spatial (or global) negative binomial regression.
The parameters $\beta(u_i, v_i)$ are estimated using the IRLS method, as in the GWNBR model described
earlier, assuming that $\hat{\alpha}_i = \hat{\alpha}$ for all i. Note that it is not necessary to alternate
the NR and IRLS methods: once $\alpha$ is estimated globally, $\beta(u_i, v_i)$ is estimated for each
regression point i using only the IRLS method.
Because there is no spatial variation for $\alpha$, its contribution to the effective number of parameters
in the model is unity, i.e., $k_2 = 1$. Consequently, the bandwidth can be found using the AICc criterion
(16), where
$$ L(\beta(u_i, v_i), \alpha \mid x_j, y_j) = \sum_{j=1}^{n} \left\{ y_j \log(\alpha \mu_j) - \left( y_j + 1/\alpha \right) \log\left( 1 + \alpha \mu_j \right) + \log \Gamma\left( y_j + 1/\alpha \right) - \log \Gamma\left( \frac{1}{\alpha} \right) - \log \Gamma\left( y_j + 1 \right) \right\} \tag{20} $$
and $k = tr(S) + 1$.
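
To make Equation (20) concrete, here is a minimal Python sketch of the negative binomial log-likelihood with a single global overdispersion. It is not the macro's SAS/IML code; the function name `nb_loglik` and the list-based interface are assumptions for this example, and the fitted means are taken as given rather than computed from local coefficients.

```python
import math


def nb_loglik(y, mu, alpha):
    """Negative binomial (NB-2) log-likelihood of Equation (20).

    y: observed counts, mu: fitted means (same length), alpha: overdispersion > 0.
    """
    total = 0.0
    for yj, muj in zip(y, mu):
        total += (yj * math.log(alpha * muj)
                  - (yj + 1.0 / alpha) * math.log(1.0 + alpha * muj)
                  + math.lgamma(yj + 1.0 / alpha)
                  - math.lgamma(1.0 / alpha)
                  - math.lgamma(yj + 1.0))
    return total


# Sanity check: the implied probabilities sum to one over the counts.
print(sum(math.exp(nb_loglik([k], [2.0], 0.5)) for k in range(200)))
```

Summing the exponentiated terms over all counts returns a value very close to 1, which is a quick check that the terms of (20) form a proper probability function.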
More details about GWR, bandwidth selection and other topics can be found in Fotheringham et al. (2002),
Paez et al. (2011), Leung et al. (2000), Wheeler and Tiefelsdorf (2005), Farber and Paez (2007),
McMillen (2010) and Cleveland and Devlin (1988).

SAS® MACRO
The SAS® macros use the IML (Interactive Matrix Language) procedure, and their parameters are described
below. The %golden macro runs the Golden Section Search to find the optimal bandwidth, and the
%gwnbr macro estimates the GWNBR model.

%golden(data=,y=,x=,lat=,long=,method=,type=,gwr=,offset=,out=);

%gwnbr(data=,y=,x=,lat=,long=,h=,grid=,latg=,longg=,gwr=,method=,alphag=,
offset=,geocod=,out=);

Input arguments of %golden and %gwnbr macros are defined as follows:


• data = specifies the data set to be analyzed.
• y = specifies the response or dependent variable.
• x = specifies the independent or explicative variables.
• lat = specifies the latitude or y axis variable.
• long = specifies the longitude or x axis variable.
• method = specifies the method to be used to find the bandwidth: FIXED, ADAPTIVE1 (one bandwidth
related to the number of neighbors) or ADAPTIVEN (n bandwidths).
• type = specifies the statistic to be used to find the bandwidth: AIC, CV (Cross Validation) or DEV
(Deviance).
• gwr = specifies the model: GLOBAL (GWNBRg), LOCAL (GWNBR), POISSON.
• offset = specifies the offset variable.
• out = specifies output data set.
• h = specifies the bandwidth.
• grid = specifies the data set with the coordinates at which the parameters are to be estimated.
• latg = specifies the latitude or y axis variable of the grid data set.
• longg = specifies the longitude or x axis variable of the grid data set.
• alphag= specifies the global alpha value to be used in GWNBRg.
• geocod = specifies the geocoding variable.
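
The method= options map to different kernels in the appendix code: a Gaussian kernel of the form exp(-0.5 (d/h)^2) for the fixed bandwidth and a bisquare kernel of the form (1 - (d/h)^2)^2 for the adaptive ones. The Python sketch below shows these two shapes only. The function names are hypothetical, and the macro's extra conditions (a distance cutoff for the fixed kernel and dropping the point itself under cross-validation) are left out.

```python
import math


def gaussian_weight(d, h):
    """Gaussian kernel weight for distance d and fixed bandwidth h."""
    return math.exp(-0.5 * (d / h) ** 2)


def bisquare_weight(d, h):
    """Bisquare kernel weight: positive inside the bandwidth, zero outside."""
    if d <= h:
        return (1.0 - (d / h) ** 2) ** 2
    return 0.0


def adaptive_bandwidth(distances, k):
    """Distance to the k-th nearest point (k counted from 1)."""
    return sorted(distances)[k - 1]


# Example: bisquare weights using the distance to the 3rd nearest point.
d = [0.0, 3.0, 1.0, 7.0]
h = adaptive_bandwidth(d, 3)
print([bisquare_weight(x, h) for x in d])
```

With an adaptive bandwidth the kernel width follows the local density of points, so sparse areas borrow information from farther away than dense ones.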
ILLUSTRATION
To illustrate how GWNBR works, we use the data set presented by Silva and Rodrigues (2014) on
the vehicles used for road freight transportation in Espirito Santo, Brazil. We also fit the Poisson and
negative binomial models, in their global and spatial forms. The results of the golden section search and
of the fits are in Table 1.
Model    Method    Type  Golden      b         L(β, α)    AICc
GWNBR    Fixed     AIC   892.84      29.4 km   -419.74    892.84
GWNBR    Fixed     CV    67550836    53.2 km   -440.09    899.93
GWNBR    Adaptive  AIC   886.97      18        -416.49    886.97
GWNBR    Adaptive  CV    64188661    49        -446.04    907.01
GWNBRg   Fixed     AIC   908.66      53.1 km   -444.41    908.66
GWNBRg   Fixed     CV    67668447    53.2 km   -444.45    908.66
GWNBRg   Adaptive  AIC   910.08      34        -443.06    910.08
GWNBRg   Adaptive  CV    64303765    49        -447.99    910.91
GWPR     Fixed     AIC   1705.10     9.4 km    -569.77    1705.10
GWPR     Fixed     CV    7309793     28.5 km   -2055.79   4152.17
GWPR     Adaptive  AIC   1549.66     5         -467.89    1549.67
GWPR     Adaptive  CV    14127354    41        -3106.87   6228.27
NBR      Fixed     -     -           -         -458.48    923.29
PR       Fixed     -     -           -         -5001.36   10006.88
Table 1. Golden Section Search and GWNBR Results for Espirito Santo Data, Brazil
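
The bandwidths in Table 1 come from a golden section search over the chosen criterion. The Python sketch below is a textbook version of that search, not a transcription of the %golden macro: the function name `golden_section_min`, its arguments and the toy criterion are assumptions for this example. In the macro, the criterion would be the CV, AICc or deviance returned for a candidate bandwidth.

```python
import math


def golden_section_min(f, lo, hi, tol=0.1):
    """Minimize a unimodal function f on [lo, hi] by golden section search.

    Stops when the bracket is narrower than 2 * tol and returns its midpoint.
    """
    r = (math.sqrt(5.0) - 1.0) / 2.0   # about 0.618, the golden ratio fraction
    c = 1.0 - r
    h0, h3 = lo, hi
    h1 = h0 + c * (h3 - h0)
    h2 = h0 + r * (h3 - h0)
    f1, f2 = f(h1), f(h2)
    while abs(h3 - h0) > 2.0 * tol:
        if f2 < f1:
            # Minimum lies in [h1, h3]: drop the left part, reuse h2 as new h1.
            h0, h1 = h1, h2
            h2 = h0 + r * (h3 - h0)
            f1, f2 = f2, f(h2)
        else:
            # Minimum lies in [h0, h2]: drop the right part, reuse h1 as new h2.
            h3, h2 = h2, h1
            h1 = h0 + c * (h3 - h0)
            f2, f1 = f1, f(h1)
    return (h0 + h3) / 2.0


# Toy criterion with its minimum at h = 3.
print(golden_section_min(lambda h: (h - 3.0) ** 2, 0.0, 10.0, tol=1e-4))
```

Each iteration costs one new evaluation of the criterion, which matters here because every evaluation refits the local model at all regression points.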

The macro calls used for the best models are as follows:

/****** GWR=GWNBR and METHOD=FIXED AND TYPE=CV *********/


%golden(data=data_gwnbr,y=fleet,x=industry,lat=y,long=x,method=fixed,
type=cv,gwr=local,out=band);
%gwnbr(data=data_gwnbr,y=fleet,x=industry,lat=y,long=x,h=53.2,gwr=local,
method=fixed,geocod=geocod,out=gwnbr);

/****** GWR=GWNBRg and METHOD=FIXED AND TYPE=AIC *********/


%golden(data=data_gwnbr,y=fleet,x=industry,lat=y,long=x,method=fixed,
type=aic,gwr=global,out=band);
%gwnbr(data=data_gwnbr,y=fleet,x=industry,lat=y,long=x,h=53.068412,
gwr=global, method=fixed,geocod=geocod,out=gwr);

/****** GWR=GWPR and METHOD=FIXED AND TYPE=AIC *********/


%golden(data=data_gwnbr,y=fleet,x=industry,lat=y,long=x,method=fixed,
type=aic,gwr=poisson,out=band);
%gwnbr(data=data_gwnbr,y=fleet,x=industry,lat=y,long=x,h=9.3796134,
gwr=poisson,method=fixed,geocod=geocod,out=gwr);

As shown in Table 1, the PR and GWPR models were the worst models. The bandwidth found for GWPR
with the best $L(\beta, \alpha)$ is inappropriate because there are only between 1 and 9 points for each
regression. The problem detected in the estimation of the bandwidth was a clue to the lack of fit obtained
with the Poisson distribution. Thus, we can conclude that the fleet of vehicles used in road freight
transportation probably exhibits overdispersion. Therefore, the negative binomial distribution is the best
approach for modeling these data.
As an example, the surfaces of the parameter estimates for GWNBR with bandwidth equal to 53.2 km and
their standard errors are shown in Figure 1. The greater values for the intercept ($\beta_0$) are
concentrated in the Vitoria metropolitan region (southeast side), which is the capital of the state. This
finding reflects the greater number of vehicles located there. In contrast, the parameter estimates for
$\beta_1$ are smaller in that area because there are many industries and the link function is exponential.

Figure 1. Surface of the parameter estimates and standard errors obtained with the GWNBR model
Output 1 shows the output of the %gwnbr macro, where we can see, besides descriptive statistics, the
actual alpha-level, t-critical and the number of parameters estimated by the GWNBR model, following Silva
and Fotheringham (2015), and pseudo $R^2$ and adjusted $R^2$ measures, considering deviance (pctdev and
adjpctdev, respectively) and considering likelihood (pctll and adjpctll, respectively), following
Cameron and Windmeijer (1996).

Output 1. Output from %gwnbr macro

CONCLUSION
Geographically Weighted Negative Binomial Regression (GWNBR) is an important tool for incorporating
overdispersion into the local model. The equations proposed by Silva and Rodrigues (2014) were encoded
in the SAS/IML language, so this model can now be estimated. The same macro can also estimate the GWPR
proposed by Nakaya et al. (2005). The illustration showed that, when overdispersion is present, the
Poisson distribution is not adequate to model the data, and the quality of fit of the negative
binomial models, mainly GWNBR, is superior.

REFERENCES

Anselin, L. 1988. Spatial Econometrics: Methods and Models. Santa Barbara: Kluwer Academic
Publishers.
Cameron, A. C. and Windmeijer, F. A. G. 1996. “R-Squared Measures for Count Data
Regression Models with Applications to Health-Care Utilization”. Journal of Business and
Economic Statistics 14(2): 209-220.
Casella, G. and Berger, R. L. 2001. Statistical Inference. 2nd ed. Duxbury.
Cleveland, W. S. and Devlin, S. J. 1988. “Locally-weighted Regression: An Approach to
Regression Analysis by Local Fitting”. Journal of the American Statistical Association. 83: 596-
610.
Farber, S. and Paez, A. 2007. “A Systematic Investigation of Cross-Validation in GWR Model
Estimation: Empirical Analysis and Monte Carlo Simulations”. Journal of Geographical Systems
9(4): 371-396.
Fotheringham, A. S., Brunsdon, C. and Charlton, M. 2002. Geographically Weighted Regression.
Wiley.
Leung, Y., Mei, C.-L. and Zhang, W.-X. 2000. “Statistical Tests for Spatial Nonstationarity Based
on the Geographically Weighted Regression Model”. Environment and Planning A 32: 9-32.
McMillen, D. P. 2010. “Issues in Spatial Data Analysis”. Journal of Regional Science 50(1): 119-
141.
Nakaya, T., Fotheringham, A. S., Brunsdon, C. and Charlton, M. 2005. “Geographically Weighted
Poisson Regression for Disease Association Mapping”. Statistics in Medicine 24: 2695-2717.
Paez, A., Farber, S. and Wheeler, D. 2011. “A Simulation-based Study of Geographically
Weighted Regression as a Method for Investigating Spatially Varying Relationships”. Environment
and Planning A 43(12): 2992-3010.
Silva, A. R. and Rodrigues, T. C. V. 2014. “Geographically Weighted Negative Binomial
Regression - Incorporating Overdispersion”. Statistics and Computing 24: 769-783.
Silva, A. R. and Fotheringham, A. S. 2015. “The Multiple Testing Issue in Geographically
Weighted Regression”. Geographical Analysis. Forthcoming.
Staniswalis, J. G. 1989. “The kernel Estimate of a Regression Function in Likelihood-based
Models”. Journal of the American Statistical Association 84: 276-283.
Wheeler, D. and Tiefelsdorf, M. 2005. “Multicollinearity and Correlation among Local Regression
Coefficients in Geographically Weighted Regression”. Journal of Geographical Systems 7(2):
161-187.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Name: Alan Ricardo da Silva
Enterprise: Universidade de Brasília
Address: Campus Universitário Darcy Ribeiro, Departamento de Estatística, Prédio CIC/EST sala A1
35/28
City, State ZIP: Brasília, DF, Brazil, 70910-900
Work Phone: +5561 3107 3672
E-mail: [email protected]
Web: www.est.unb.br

Name: Thais Carvalho Valadares Rodrigues


Enterprise: Universidade de Brasília
Address: Campus Universitário Darcy Ribeiro, Departamento de Estatística, Prédio CIC/EST sala A1
35/28
City, State ZIP: Brasília, DF, Brazil, 70910-900
Work Phone: +5561 3107 3672
E-mail: [email protected]
Web: www.est.unb.br

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
APPENDIX I – SAS® MACRO

/************************************************/
/********* SAS MACROS ************************/
/**********************************************/

/***********************************************/
/********** CREATING PARAMETERS ESTIMATES********/
/************************************************/
%macro beta(par);
proc iml;
use _beta_;
read all into b;
close _beta_;
n=nrow(b);
npar=&par+1;
%do i=0 %to &par;
  b&i=j(1,8,0);
  nome={"id" "geocod" "x" "y" "b" "sebi" "tstat" "probtstat"};
  create b&i from b&i[colname=nome];
  do i=1 to (n/npar); * from i=1 to N;
    b&i[1,]=b[(i-1)*npar+&i+1,];
    append from b&i;
  end;
%end;
quit;
%mend beta;

/***********************************************/
/***** PARAMETERS FOR STATIONARITY TEST *********/
/************************************************/
%macro vk(par);
proc iml;
use _beta_;
read all into b;
close _beta_;
use _alpha_;
read all var {alphai} into alpha;
close _alpha_;
n=nrow(b);
npar=&par+1;
n=n/npar;
vk=0;
%do i=0 %to &par;
  b&i=j(n,1,0);
  do i=1 to n;
    b&i[i,1]=b[(i-1)*npar+&i+1,5];
  end;
  vk&i=sum((b&i - b&i[:] )##2)/n ;
  vk=vk||vk&i;
%end;
vka= sum((alpha - alpha[:] )##2)/n ;
vk=vk||vka;
idx = setdif(1:(npar+2),1);
vk = vk[,idx];
create vk from vk;
append from vk;
quit;
%mend vk;

/************************************************/
/***** PERMUTATION FOR STATIONARITY TEST ********/
/***********************************************/
%macro perm(data=,geocod=,x=,y=);
proc iml;
use &data;
read all var{&geocod &x &y} into tab;
close &data;
n=nrow(tab);
u = 1:n;
call randgen(u, "Uniform");
_u_=rank(u);
create perm var{_u_};
append;
quit;
/* the original listing drops a literal y here; &y is assumed to be intended */
data perm; merge perm &data(drop= &geocod &x &y) ;
run;
proc sort data=perm; by _u_; run;
data perm; merge perm &data(keep=&geocod &x &y);
run;
%mend perm;

/**********************************************/
/*********** STATIONARITY TEST ******************/
/************************************************/
%macro estac(data=,y=,x=,lat=,long=,h=,grid=,latg=,longg=,
gwr=,method=,alphag=,offset=,geocod=,rep=);
%let nvar=0;
%do %while(%scan(%str(&x),&nvar+1)~=);
  %let nvar=%eval(&nvar+1);
%end;
%gwnbr(data=&data,y=&y,x=&x,lat=&lat,long=&long,h=&h,grid=&grid,
latg=&latg,longg=&longg,gwr=&gwr,method=&method,alphag=&alphag,
offset=&offset,geocod=&geocod);
%vk(&nvar);
data vk2; set vk; i=1; run;
%do it=2 %to (&rep+1);
  %perm(data=&data,geocod=&geocod,x=&long,y=&lat);
  %gwnbr(data=perm,y=&y,x=&x,lat=&lat,long=&long,h=&h,grid=&grid,
  latg=&latg,longg=&longg,gwr=&gwr,method=&method,alphag=&alphag,
  offset=&offset,geocod=&geocod);
  %vk(&nvar);
  data vk; set vk; i=&it; run;
  proc append base=vk2 data=vk force; run;
%end;

proc iml;
use vk2;
read all into x;
close vk2;
nvar=ncol(x)-1;
n=nrow(x);
count=j(1,nvar,0);
do v=1 to nvar;
  do i=1 to n;
    if x[i,v]>=x[1,v] then count[v]=count[v]+1;
  end;
end;
count=count/n*100;
print count;
varnames="b0":"b&nvar"||"alpha";
create pvalor_est from count [colname=varnames];
append from count;
quit;
%mend estac;

/***********************************************/
/************ GOLDEN SECTION SEARCH ***********/
/*******************************************/
%macro golden(data=,y=,x=,lat=,long=,method=,type=,gwr=,offset=,out=);
proc iml;
use &data;
read all var {&y} into y;
read all var {&x} into x;
read all var{&long &lat} into COORD;
n=nrow(y);
%if &offset= %then %do; offset=j(n,1,0); %end;
%else %do; read all var {&offset} into offset; %end;
close &data;
x=j(n,1,1)||x;
method= "&method"; *fixed, adaptive1 or adaptiven;
type="&type"; *aic , cv or dev ;
gwr="&gwr"; *global, local, poisson;
print method type gwr;

start dist(coord,n);
  d=j(1,3,0);
  nome={"idi" "idj" "d"};
  create _dist_ from d[colname=nome];
  do i=1 to n;
    do j=i+1 to n;
      if abs(coord[,1])<180 then do;
        dif=abs(COORD[i,1]-COORD[j,1]);
        raio=arcos(-1)/180;
        ang=sin(COORD[i,2]*raio)*sin(COORD[j,2]*raio)
            +cos(COORD[i,2]*raio)*cos(COORD[j,2]*raio)*cos(dif*raio);
        arco=arcos(ang);
        d[1]=i;
        d[2]=j;
        d[3]=arco*6371 /*Earth's Radius = 6371 (approximately)*/;
        append from d;
      end;
      else do;
        d[1]=i;
        d[2]=j;
        d[3]=sqrt((COORD[i,1]-COORD[j,1])**2+(COORD[i,2]-COORD[j,2])**2);
        append from d;
      end;
    end;
  end;
  close _dist_;
finish dist;
run dist(coord,n);
use _dist_;
read all into d;
maxd=int(max(d[,3])+1);
free d;
close _dist_;

if method= "adaptive1" then do;
  h0= 5 ; h3= n;
end;
else if method= "adaptiven" | method= "fixed" then do;
  h0= 0 ; h3= maxd;
end;
r=0.61803399; c=1-r;
if method= "adaptive1" then tol=0.9; else tol=0.1;
h1=h0+(1-r)*(h3-h0);
h2=h0+r*(h3-h0);
print h0 h1 h2 h3;

start cv(h) global(method, n, coord, x, y, type, maxd, gwr, offset);
  alphaii= j(n,2,0);
  yhat=j(n,1,0);
  S=j(n,n,0);
  if gwr="global" then do;
    ym=sum(y)/nrow(y);
    u=(y+ym)/2;
    n=log(u);
    par=1; ddpar=1; j=0; aux2=0;
    do while (abs(ddpar)>0.00001);
      aux1=0; dpar=1;
      parold=par;
      do while (abs(dpar)>0.001);
        aux1=aux1+1;
        if par<0 then do;
          par=0.00001;
        end;
        par=choose(par<1E-10,1E-10,par);
        g=sum(digamma(par+y)-digamma(par)+log(par)+1-log(par+u)-(par+y)/(par+u));
        hess=sum(trigamma(par+y)-trigamma(par)+1/par-2/(par+u)+(y+par)/((par+u)#(par+u)));
        hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess);
        hess=choose(hess=0,1E-23,hess);
        par0=par;
        par=par0-inv(hess)*g;
        if aux1>50 & par>1E5 then do;
          dpar= 0.0001;
          aux2=aux2+1;
          if aux2=1 then par=2 ;
          else if aux2=2 then par=1E5;
          else if aux2=3 then par=0.0001;
        end;
        else dpar=par-par0;
      end;
      a=1/par; dev=0; ddev=1;
      i=0;
      do while (abs(ddev)>0.00001);
        i=i+1;
        w=(u/(1+a*u))+(y-u)#(a*u/(1+2*a*u+a*a*u#u));
        z=n+(y-u)/(w#(1+a*u)) - offset;
        b=inv((x#w)`*x)*(x#w)`*z;
        n=x*b + offset;
        u=exp(n);
        olddev=dev;
        tt=y/u;
        tt=choose(tt=0,1E-10,tt);
        dev=2*sum(y#log(tt)-(y+1/a)#log((1+a*y)/(1+a*u)));
        ddev=dev-olddev;
      end;
      if aux2>4 then ddpar=1E-9;
      else ddpar=par-parold;
    end;
    alpha=a;
  end;
  n=nrow(y);
  aux2=0;
  do i=1 to n;
    d=j(1,3,0);
    dist=d;
    do j=1 to n;
      if abs(coord[,1])<180 then do;
        dif=abs(COORD[i,1]-COORD[j,1]);
        raio=arcos(-1)/180;
        ang=sin(COORD[i,2]*raio)*sin(COORD[j,2]*raio)
            +cos(COORD[i,2]*raio)*cos(COORD[j,2]*raio)*cos(dif*raio);
        if i=j then arco=0;
        else arco=arcos(ang);
        d1=arco*6371;
      end;
      else d1=sqrt((COORD[i,1]-COORD[j,1])**2+(COORD[i,2]-COORD[j,2])**2);
      d[1]=i; d[2]=j; d[3]=d1;
      if j=1 then dist=d;
      else dist=dist//d;
    end;
    u=nrow(dist);
    w=j(u,1,0);
    if method= "fixed" then do;
      if type="cv" then do;
        do jj=1 to u;
          if dist[jj,3]<=maxd*0.8 & dist[jj,3]^=0 then
            w[jj]=exp(-0.5*(dist[jj,3]/h)**2);
          else w[jj]= 0;
        end;
      end;
      else do;
        do jj=1 to u;
          if dist[jj,3]<=maxd*0.8 then w[jj]=exp(-0.5*(dist[jj,3]/h)**2);
          else w[jj]= 0;
        end;
      end;
    end;
    else if method= "adaptiven" then do;
      if type="cv" then do;
        do jj=1 to u;
          if dist[jj,3]<=h & dist[jj,3]^=0 then w[jj]=(1-(dist[jj,3]/h)**2)**2;
          else w[jj]= 0;
        end;
      end;
      else do;
        do jj=1 to u;
          if dist[jj,3]<=h then w[jj]=(1-(dist[jj,3]/h)**2)**2;
          else w[jj]= 0;
        end;
      end;
    end;
    else if method= "adaptive1" then do;
      call sort(dist,{3});
      dist=dist||(1:n)`;
      w=j(n,2,0);
      hn=dist[h,3];
      if type="cv" then do;
        do jj=1 to n;
          if dist[jj,4]<= h & dist[jj,3]^=0 then
            w[jj,1]=(1-(dist[jj,3]/hn)**2)**2;
          else w[jj,1]=0;
          w[jj,2]=dist[jj,2];
        end;
      end;
      else do;
        do jj=1 to n;
          if dist[jj,4]<=h then w[jj,1]=(1-(dist[jj,3]/hn)**2)**2;
          else w[jj,1]=0;
          w[jj,2]=dist[jj,2];
        end;
      end;
      call sort(w,{2});
    end;
    wi=w[,1];
    ym=sum(y)/nrow(y);
    uj=(y+ym)/2;
    nj=log(uj);
    if i=1 | aux2=5 then par=1; else par=alphaii[i-1,2];
    ddpar=1; jj=0; count=0; aux2=0;
    do while (abs(ddpar)>0.000001);
      aux1=0;
      dpar=1;
      parold=par;
      if gwr="global" | gwr="poisson" then do;
        dpar=0.00001;
        if gwr= "global" then par=1/a;
      end;
      /* computing alpha=1/par, where par=theta */
      do while (abs(dpar)>0.001);
        aux1=aux1+1;
        if gwr="local" then do;
          par=choose(par<1E-10,1E-10,par);
          g=sum((digamma(par+y)-digamma(par)+log(par)+1-log(par+uj)-
                (par+y)/(par+uj))#w[,1]);
          hess=sum((trigamma(par+y)-trigamma(par)+1/par-
                   2/(par+uj)+(y+par)/((par+uj)#(par+uj)))#w[,1]);
        end;
        hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess);
        hess=choose(hess=0,1E-23,hess);
        par0=par;
        par=par0-inv(hess)*g;
        if par<=0 then do;
          count=count+1;
          if count<10 then par=0.000001;
          else par=abs(par);
        end;
        if aux1>50 & par>1E5 then do;
          dpar= 0.0001;
          aux2=aux2+1;
          if aux2=1 then par=2 ;
          else if aux2=2 then par=1E5;
          else if aux2=3 then par=0.0001;
        end;
        else do;
          dpar=par-par0;
          if par<1E-3 then dpar=dpar*100;
        end;
      end;
      if gwr= "poisson" then alpha=0;
      else alpha=1/par;
      dev=0; ddev=1; cont=0;
      /* computing beta */
      do while (abs(ddev)>0.000001);
        cont=cont+1;
        uj=choose(uj>1E100,1E100,uj);
        aux= (alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
        Ai=(uj/(1+alpha*uj))+(y-uj)#aux;
        Ai=choose(Ai<=0,1E-5,Ai);
        zj=nj+(y-uj)/(Ai#(1+alpha*uj)) - offset;
        if det(x`*(wi#Ai#x))=0 then bi=j(ncol(x),1,0);
        else bi=inv(x`*(wi#Ai#x))*x`*(wi#Ai#zj);
        nj=x*bi + offset;
        nj=choose(nj>1E2,1E2,nj);
        uj=exp(nj);
        olddev=dev;
        uj=choose(uj<1E-150,1E-150,uj);
        tt=y/uj;
        tt=choose(tt=0,1E-10,tt);
        if gwr= "poisson" then dev=2*sum(y#log(tt)-(y-uj));
        else dev=2*sum(y#log(tt)-(y+1/alpha)#log((1+alpha*y)/(1+alpha*uj)));
        if cont>100 then ddev= 0.0000001;
        else ddev=dev-olddev;
      end;
      jj=jj+1;
      if gwr="global" | gwr="poisson" | aux2>4 | jj>50 | ddpar=0.0000001
        then ddpar=1E-9;
      else do;
        ddpar=par-parold;
        if par<1E-3 then ddpar=ddpar*100;
      end;
    end;
    Ai2=(uj/(1+alpha*uj))+(y-uj)#(alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
    if Ai2[><,]<1E-5 then Ai2=choose(Ai2<1E-5,1E-5,Ai2);
    Ai=Ai2;
    if det(x`*(wi#Ai#x))=0 then S[i,]=j(1,n,0);
    else S[i,]= x[i,]*inv(x`*(wi#Ai#x))*(x#wi#Ai)`;
    yhat[i]=uj[i];
    alphaii[i,1]=i;
    alphaii[i,2]= alpha;
  end;
  alpha= alphaii[,2];
  yhat=choose(yhat<1E-150,1E-150,yhat);
  tt=y/yhat;
  tt=choose(tt=0,1E-10,tt);
  if gwr= "poisson" then dev=2*sum(y#log(tt)-(y-yhat));
  else dev=2*sum(y#log(tt)-(y+1/alpha)#log((1+alpha#y)/(1+alpha#yhat)));
  if gwr ^= "poisson" then do;
    a2=y+1/alpha; b2=1/alpha; c2=y+1;
  end;
  else do;
    a2=y; b2=1/(alpha+1e-8); c2=y+1;
  end;
  algamma=j(n,1,0); blgamma=j(n,1,0); clgamma=j(n,1,0);
  do i=1 to nrow(y);
    algamma[i]=lgamma(a2[i]);
    blgamma[i]=lgamma(b2[i]); clgamma[i]=lgamma(c2[i]);
  end;
  if gwr^="poisson" then do;
    ll=sum(y#log(alpha#yhat)-(y+1/alpha)#log(1+alpha#yhat)+ algamma - blgamma -
       clgamma );
    npar=trace(S)+1;
  end;
  else do;
    ll=sum(-yhat+y#log(yhat)-clgamma);
    npar=trace(S);
  end;
  /*AIC= 2*npar + dev;*/
  AIC= 2*npar -2*ll;
  AICC= AIC +(2*npar*(npar+1))/(n-npar-1);
  CV=(y-yhat)`*(y-yhat);
  res=cv||aicc||npar||dev;
  return (res);
finish;

if type="cv" then do;
  pos=1;
  create &out var{h1 res1 h2 res2};
end;
else do;
  if type="aic" then pos=2;
  else pos=4;
  create &out var{h1 res1 npar1 h2 res2 npar2};
end;
res1=cv(h1); npar1=res1[3]; res1=res1[pos];
res2=cv(h2); npar2=res2[3]; res2=res2[pos];
append;
do while(abs(h3-h0) > tol*2);
  if res2<res1 then do;
    h0=h1;
    h1=h2;
    h2=c*h1+r*h3;
    res1=res2;
    npar1=npar2;
    res2=cv(h2);
    npar2=res2[3];
    res2=res2[pos];
  end;
  else do;
    h3=h2;
    h2=h1;
    h1=c*h2+r*h0;
    res2=res1;
    npar2=npar1;
    res1=cv(h1);
    npar1=res1[3];
    res1=res1[pos];
  end;
  append;
end;
if method= "adaptive1" then do;
  xmin = (h3+h0)/2;
  h2=ceil(xmin);
  h1=floor(xmin);
  golden1 = cv(h1);
  g1= golden1[pos];
  golden2= cv(h2);
  g2= golden2[pos];
  npar1=golden1[3];
  res1=golden1[pos];
  npar2=golden2[3];
  res2=golden2[pos];
  append;
  if g1<g2 then do;
    xmin=h1;
    npar=golden1[3];
    golden=g1;
  end;
  else do;
    xmin=h2;
    npar=golden2[3];
    golden=g2;
  end;
end;
else do;
  xmin = (h3+h0)/2;
  golden = cv(xmin);
  npar=golden[3];
  golden=golden[pos];
end;
h1 = xmin;
res1 = golden;
npar1=npar;
h2 = .;
res2 = .;
npar2=.;
append;
if type="cv" then print golden xmin;
else print golden xmin npar;
quit;
%mend golden;

/***********************************************/
/**************** GWNBR ***********************/
/***********************************************/
%macro gwnbr(data=,y=,x=,lat=,long=,h=,grid=,latg=,longg=,
gwr=,method=,alphag=,offset=,geocod=,out=);
proc iml;
use &data;
read all var {&y} into y;
read all var {&x} into x;
read all var{&long &lat} into COORD;
n=nrow(y);
%if &offset= %then %do; offset=j(n,1,0); %end;
%else %do; read all var {&offset} into offset; %end;
%if &grid= %then %do;
  read all var{&long &lat} into POINTS;
  read all var{&geocod} into geocod_;
%end;
close &data;
%if &grid^= %then %do;
  use &grid;
  read all var{&longg &latg} into POINTS;
  close &grid;
  /* the original listing reads nrow(points,1,0); j(nrow(points),1,0) is assumed */
  geocod_=j(nrow(points),1,0);
%end;
x=j(n,1,1)||x;
yhat=j(n,1,0);
h=&h;
gwr="&gwr"; *global,local, poisson;
method="&method"; *fixed, adaptive1, adaptiven;
m=nrow(POINTS);
bii=j(ncol(x)*m,2,0); alphaii= j(m,2,0);
xcoord=j(ncol(x)*m,1,0); ycoord=j(ncol(x)*m,1,0);
&geocod= j(ncol(x)*m,1,0);
sebi=j(ncol(x)*m,1,0); sealphai= j(m,1,0);
S=j(n,n,0);
yp=y-sum(y)/n;
probai=j(m,1,0); probbi=j(m,1,0);
yhat=j(m,1,0);
res= j(m,1,0);
if gwr^="poisson" then do;
  ym=sum(y)/nrow(y);
  u=(y+ym)/2;
  n=log(u);
  par=1; ddpar=1; j=0; aux2=0;
  do while (abs(ddpar)>0.00001);
    aux1=0;
    dpar=1;
    parold=par;
    do while (abs(dpar)>0.001);
      aux1=aux1+1;
      if par<0 then par=0.00001;
      par=choose(par<1E-10,1E-10,par);
      g=sum(digamma(par+y)-digamma(par)+log(par)+1-log(par+u)-(par+y)/(par+u));
      hess=sum(trigamma(par+y)-trigamma(par)+1/par-2/(par+u)+(y+par)/((par+u)#(par+u)));
      hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess); *CHECK!!!;
      hess=choose(hess=0,1E-23,hess);
      par0=par;
      par=par0-inv(hess)*g;
      if aux1>50 & par>1E5 then do;
        dpar= 0.0001;
        aux2=aux2+1;
        if aux2=1 then par=2 ;
        else if aux2=2 then par=1E5;
        else if aux2=3 then par=0.0001;
      end;
      else dpar=par-par0;
    end;
    a=1/par; dev=0; ddev=1; i=0;
    do while (abs(ddev)>0.00001);
      i=i+1;
      w=(u/(1+a*u))+(y-u)#(a*u/(1+2*a*u+a*a*u#u));
      w=choose(w<=0,1E-5,w);
      z=n+(y-u)/(w#(1+a*u)) - offset;
      b=inv((x#w)`*x)*(x#w)`*z;
      n=x*b + offset;
      n=choose(n>1E2,1E2,n);
      u=exp(n);
      olddev=dev;
      tt=y/u;
      tt=choose(tt=0,1E-10,tt);
      dev=2*sum(y#log(tt)-(y+1/a)#log((1+a*y)/(1+a*u)));
      ddev=dev-olddev;
    end;
    if aux2>4 then ddpar=1E-9;
    else ddpar=par-parold;
  end;
  %if &alphag= %then %do; alphag=a;%end;
  %else %if &alphag=0 %then %do; alphag=1e-8;%end;
  %else %do; alphag=&alphag;%end;
  bg=b;
  parg=par;
end;
if gwr="global" then print alphag aux2;
n=nrow(y);
aux2=0;
do i=1 to m;
  d=j(1,3,0);
  do j=1 to n;
    if abs(COORD[,1])<180 then do;
      dif=abs(POINTS[i,1]-COORD[j,1]);
      raio=arcos(-1)/180;
      ang=sin(POINTS[i,2]*raio)*sin(COORD[j,2]*raio)+cos(POINTS[i,2]*raio)*cos(COORD[j,2]*raio)*cos(dif*raio);
      if round(ang,0.000000001)=1 then arco=0;
      else arco=arcos(ang);
      d1=arco*6371 /*Earth's Radius = 6371 (approximately)*/;
    end;
    else d1=sqrt((POINTS[i,1]-COORD[j,1])**2+(POINTS[i,2]-COORD[j,2])**2);
    d[1]=i; d[2]=j; d[3]=d1;
    *cleaning dist where i value changes;
    if j=1 then dist=d;
    else dist=dist//d;
  end;
  w=j(n,1,0);
  if method= "fixed" then do;
    do jj=1 to n;
      w[jj]=exp(-0.5*(dist[jj,3]/h)**2);
    end;
  end;
  else if method= "adaptiven" then do;
    do jj=1 to n;
      if dist[jj,3]<=h then w[jj]=(1-(dist[jj,3]/h)**2)**2;
      else w[jj]= 0;
    end;
  end;
  else if method= "adaptive1" then do;
    w=j(n,2,0);
    call sort(dist,{3});
    dist=dist||(1:n)`;
    hn=dist[h,3]; *bandwidth for the point i;
    do jj=1 to n;
      if dist[jj,4]<=h then w[jj,1]=(1-(dist[jj,3]/hn)**2)**2;
      else w[jj,1]=0;
      w[jj,2]=dist[jj,2];
    end;
    call sort(w,{2});
  end;
  wi=w[,1];
  ym=sum(y)/nrow(y);
  uj=(y+ym)/2;
  nj=log(uj);
  ddpar=1; jj=0; count=0; aux2=0;
  if i=1 | aux2=5 | count=4 then par=1; else par=alphaii[i-1,2];
  do while (abs(ddpar)>0.000001);
    dpar=1;
    if ddpar=1 then parold=1.8139;
    else parold=par;
    aux1=0;
    if gwr="global" | gwr="poisson" then do;
      dpar=0.00001;
      if gwr= "global" then par=1/alphag;
    end;
    /* computing alpha=1/par, where par=theta=r */
    do while (abs(dpar)>0.001);
      aux1=aux1+1;
      if gwr="local" then do;
        par=choose(par<1E-10,1E-10,par);
        g=sum((digamma(par+y)-digamma(par)+log(par)+1-log(par+uj)-(par+y)/(par+uj))#w[,1]);
        hess=sum((trigamma(par+y)-trigamma(par)+1/par-2/(par+uj)+(y+par)/((par+uj)#(par+uj)))#w[,1]);
      end;
      par0=par;
      hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess);
      hess=choose(hess=0,1E-23,hess);
      par=par0-inv(hess)*g;
      if par<=0 then do;
        count=count+1;
        if count=1 then par=0.000001;
        else if count=2 then par=0.0001;
        else par=1/alphag;
      end;
      if aux1>100 & par>1E5 then do; *MAXINTA;
        dpar= 0.0001;
        if aux2=0 then par=1/alphag + 0.0011;
        if aux2=1 then par=2 ;
        else if aux2=2 then par=1E5;
        else if aux2=3 then par=0.0001;
        aux2=aux2+1;
      end;
      else do;
        dpar=par-par0;
        if par<1E-3 then dpar=dpar*100;
      end;
    end;
    if gwr= "poisson" then alpha=0;
    else alpha=1/par;
    dev=0; ddev=1; cont=0;
    /* computing beta */
    do while (abs(ddev)>0.000001);
      cont=cont+1;
      Ai=(uj/(1+alpha*uj))+(y-uj)#(alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
      Ai=choose(Ai<=0,1E-5,Ai);
      zj=nj+(y-uj)/(Ai#(1+alpha*uj))-offset;
      if det(x`*(wi#Ai#x))=0 then bi=j(ncol(x),1,0);
      else bi=inv(x`*(wi#Ai#x))*x`*(wi#Ai#zj);
      nj=x*bi + offset;
      nj=choose(nj>1E2,1E2,nj);
      uj=exp(nj);
      olddev=dev;
      uj=choose(uj<1E-150,1E-150,uj);
      tt=y/uj;
      tt=choose(tt=0,1E-10,tt);
      if gwr= "poisson" then dev=2*sum(y#log(tt)-(y-uj));
      else dev=2*sum(y#log(tt)-(y+1/alpha)#log((1+alpha*y)/(1+alpha*uj)));
      if cont>100 then ddev= 0.0000001; /* MAXINTB */
      else ddev=dev-olddev;
    end;
    jj=jj+1;
    *print jj bi;
    if gwr="global" | gwr="poisson" | aux2>4 | count>3 | jj>200 then ddpar=1E-9;
    else do;
      ddpar=par-parold;
      if par<1E-3 then ddpar=ddpar*100;
    end;
    /* print j aux1 cont aux2 count parold par ddpar;*/
  end;
  if aux2>4 then probai[i]=1;
  if count>3 then probai[i]=2;
  Ai2=(uj/(1+alpha*uj))+(y-uj)#(alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
  if Ai2[><,]<1E-5 then do;
    probbi[i]=1;
    Ai2=choose(Ai2<1E-5,1E-5,Ai2);
  end;
  Ai=Ai2;
  %if &grid= | &grid=&data %then %do;
    if det(x`*(wi#Ai#x))=0 then S[i,]=j(1,n,0);
    else S[i,]= x[i,]*inv(x`*(wi#Ai#x))*(x#wi#Ai)`;
  %end;
  C=inv(x`*(wi#Ai#x));
  varb= C;
  seb=sqrt(vecdiag(varb));
  if gwr^="poisson" then do;
    ser=sqrt(1/abs(hess));
    r=1/alpha;
    sealpha=ser/(r**2);
    sealphai[i,1]=sealpha;
    alphaii[i,1]=i;
    alphaii[i,2]= alpha;
  end;
  m1=(i-1)*ncol(x)+1;
  m2=m1+(ncol(x)-1);
  sebi[m1:m2,1]=seb;
  bii[m1:m2,1]=i;
  bii[m1:m2,2]=bi;
  xcoord[m1:m2,1]= POINTS[i,1];
  ycoord[m1:m2,1]= POINTS[i,2];
  &geocod[m1:m2,1]= geocod_[i,1];
  %if &grid= | &grid=&data %then %do;
    yhat[i]=uj[i];
  %end;
end;
tstat= bii[,2]/sebi;
probtstat=2*(1-probnorm(abs(tstat)));
if gwr^="poisson" then do;
  atstat= alphaii[,2]/sealphai;
  aprobtstat=2*(1-probnorm(abs(atstat)));
  *check for normality;
end;
else do;
  atstat=j(n,1,0);
  aprobtstat=j(n,1,1);
end;
b=bii[,2];
alphai=alphaii[,2];
_id_= bii[,1];
_ida_=alphaii[,1];

_beta_=shape(bii[,1:2],n);
i=do(2,ncol(_beta_),2);
_beta_=_beta_[,i];
call qntl(qntl,_beta_);
qntl=qntl//(qntl[3,]-qntl[1,]);
descriptb=_beta_[:,]//_beta_[><,]//_beta_[<>,];

print qntl[label="Quantiles of GWNBR Parameter Estimates"
           rowname={"P25", "P50", "P75", "IQR"}
           colname={'Intercept' &x}],,
      descriptb[label="Descriptive Statistics"
           rowname={"Mean", "Min", "Max"}
           colname={'Intercept' &x}];

_stdbeta_=shape(sebi,n);
call qntl(qntls,_stdbeta_);
qntls=qntls//(qntls[3,]-qntls[1,]);
descripts=_stdbeta_[:,]//_stdbeta_[><,]//_stdbeta_[<>,];

print qntls[label="Quantiles of GWNBR Standard Errors"
            rowname={"P25", "P50", "P75", "IQR"}
            colname={'Intercept' &x}],,
      descripts[label="Descriptive Statistics of Standard Errors"
            rowname={"Mean", "Min", "Max"}
            colname={'Intercept' &x}];

%if &grid= | &grid=&data %then %do;
  yhat=choose(yhat<1E-150,1E-150,yhat);
  tt=y/yhat;
  tt=choose(tt=0,1E-10,tt);
  if gwr= "poisson" then do;
    dev=2*sum(y#log(tt)-(y-yhat));
    devnull=2*sum(y#log(y/y[:])-(y-y[:]));
    pctdev=1-dev/devnull;
  end;
  else do;
    dev=2*sum(y#log(tt)-(y+1/alphai)#log((1+alphai#y)/(1+alphai#yhat)));
    devnull=2*sum(y#log(y/y[:])-(y+1/alphai)#log((1+alphai#y)/(1+alphai#y[:])));
    pctdev=1-dev/devnull;
  end;
  if gwr^="poisson" then do;
    a2=y+1/alphai; b2=1/alphai;
    algamma=j(n,1,0);
    blgamma=j(n,1,0);
    do i=1 to nrow(y);
      algamma[i]=lgamma(a2[i]);
      blgamma[i]=lgamma(b2[i]);
    end;
  end;
  c2=y+1;
  clgamma=j(n,1,0);
  do i=1 to nrow(y);
    clgamma[i]=lgamma(c2[i]);
  end;
  if gwr^="poisson" then do;
    ll=sum(y#log(alphai#yhat)-(y+1/alphai)#log(1+alphai#yhat)+ algamma - blgamma - clgamma );
    if gwr="global" & alphai^=1/parg then npar=trace(S);
    else npar=trace(S)+1;
    ll1=sum(y#log(y/(alphai#yhat))-y+(y+1/alphai)#log(1+alphai#yhat)-algamma+blgamma);
    llnull=sum(y#log(y/y[:]));
    pctll=1-ll1/llnull;
  end;
  else do;
    ll=sum(-yhat+y#log(yhat)-clgamma);
    npar=trace(S);
    pctll=pctdev;
  end;
  adjpctdev=1-((nrow(y)-1)/(nrow(y)-npar))*(1-pctdev);
  adjpctll=1-((nrow(y)-1)/(nrow(y)-npar))*(1-pctll);
  resord=y-yhat;
  sigma2= (resord`*resord)/(n-npar);
  sii=vecdiag(S);
  res=resord/sqrt(sigma2#(1-sii));
  res=unique(_id_)`||COORD[,1]||COORD[,2]||y||yhat||res||resord;
  /*AIC= 2*npar + dev;*/
  AIC= 2*npar - 2*ll;
  AICC= AIC +(2*npar*(npar+1))/(n-npar-1);
  BIC= npar*log(n) - 2*ll ;
  _malpha_=0.05*(ncol(x)/npar);
  _t_critical_=abs(tinv(_malpha_/2,n-npar));
  print _malpha_[label="alpha-level=0.05"]
        _t_critical_[format=comma6.2 label="t-Critical"]
        npar;
  print gwr method ll dev pctdev adjpctdev pctll adjpctll npar aic aicc bic;
  create _res_ from res[colname={"_id_" "xcoord" "ycoord" "yobs" "yhat" "res" "resraw"}];
  append from res;
  stat=ll|| dev|| pctdev || adjpctdev|| pctll || adjpctll || npar|| aic|| aicc|| bic;
  create _stat_ from stat[colname={"l1" "dev" "pctdev" "adjpctdev" "pctll" "adjpctll"
                                   "npar" "aic" "aicc" "bic"}];
  append from stat;
%end;
%else %do; print gwr method; %end;

create _beta_ var{_id_ &geocod xcoord ycoord b sebi tstat probtstat}; * _beta_ has beta vector for each point i;
append;
xcoord=COORD[,1];ycoord=COORD[,2];
&geocod=unique(&geocod)`;
sig_alpha=j(n,1,"not significant at 90%");
v1=npar;
do i=1 to n;
  if aprobtstat[i]<0.01*(ncol(x)/v1) then sig_alpha[i]="significant at 95%";
  else if aprobtstat[i]<0.1*(ncol(x)/v1) then sig_alpha[i]="significant at 90%";
  else sig_alpha[i]="not significant at 90%";
end;
create _alpha_ var{_ida_ &geocod xcoord ycoord alphai sealphai atstat aprobtstat sig_alpha probai probbi}; * _alpha_ has alpha vector for each point i;
append;
_tstat_=_beta_/_stdbeta_;
_probt_=2*(1-probnorm(abs(_tstat_)));
_bistdt_=geocod_||COORD||_beta_||_stdbeta_||_tstat_||_probt_;
_colname1_={"Intercept" &x};
_label_=repeat("std_",ncol(x))//repeat("tstat_",ncol(x))//repeat("probt_",ncol(x));
_colname_={"&geocod" "x" "y"}||_colname1_||concat(_label_,repeat(_colname1_`,3))`;
call change(_colname_, "_ ", "_");
call change(_colname_, "_ ", "_");
create _parameters_ from _bistdt_[colname=_colname_];
append from _bistdt_;
close _parameters_;

_sig_=j(n,ncol(x),"not significant at 90%");
v1=npar;
do i=1 to n;
  do j=1 to ncol(x);
    if _probt_[i,j]<0.01*(ncol(x)/v1) then _sig_[i,j]="significant at 99%";
    else if _probt_[i,j]<0.05*(ncol(x)/v1) then _sig_[i,j]="significant at 95%";
    else if _probt_[i,j]<0.1*(ncol(x)/v1) then _sig_[i,j]="significant at 90%";
    else _sig_[i,j]="not significant at 90%";
  end;
end;
_colname1_={"Intercept" &x};
_label_=repeat("sig_",ncol(x));
_colname_=concat(_label_,repeat(_colname1_`,1))`;
create _sig_parameters2_ from _sig_[colname=_colname_];
append from _sig_;
/*
%let nvar=0;
%do %while(%scan(%str(&x),&nvar+1)~=);
  %let nvar=%eval(&nvar+1);
%end;
use _beta_;
read all into b;
close _beta_;
n=nrow(b);
npar=&nvar+1;
%do i=0 %to &nvar;
  b&i=j(1,8,0);
  nome={"_id_" "&geocod" "xcoord" "ycoord"
        "b" "sebi" "tstat" "probtstat"};
  create &out._b&i from b&i[colname=nome];
  do i=1 to (n/npar);
    b&i[1,]=b[(i-1)*npar+&i+1,];
    append from b&i;
  end;
%end;
*/
quit;
%mend gwnbr;

%macro map(data=,var=, map=, geocod=);
goptions reset=all;
proc gmap data=&data map=&map all;
  id &geocod;
  choro &var / legend=legend1;
  legend1 position=(middle right) across=1
          mode=reserve label=(position=top j=c);
run;
quit;
%mend map;
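
The two macros in this listing might be invoked along the following lines. This is only a sketch under stated assumptions, not a call taken from the paper: the data set names (work.counts, maps.regions), the variable names (cases, income, density, lon, lat, region_id) and the bandwidth value are hypothetical placeholders. Only the parameter names, the option values (gwr=local, method=fixed) and the output table name _parameters_ come from the listing itself. The region identifier is assumed to be numeric, because the macro concatenates it with numeric matrices.

```sas
/* Hypothetical call: every data set and variable name below is a placeholder. */
/* gwr=local requests a local overdispersion parameter; method=fixed uses the  */
/* Gaussian kernel with the fixed bandwidth h. With longitudes below 180 in    */
/* absolute value the macro computes great-circle distances in km, so h=50     */
/* would then be read as 50 km.                                                */
%gwnbr(data=work.counts, y=cases, x=income density,
       lat=lat, long=lon, h=50,
       gwr=local, method=fixed, geocod=region_id);

/* Map the local coefficient of the (hypothetical) covariate income, read from */
/* the _parameters_ table written by the call above. maps.regions is assumed   */
/* to be a map data set carrying the same region_id variable.                  */
%map(data=_parameters_, var=income, map=maps.regions, geocod=region_id);
```

Leaving grid=, alphag= and offset= empty makes the macro fit at the observation coordinates, take the overdispersion parameter from the global fit when needed, and use a zero offset, as the %if branches at the top of the macro show.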
