Developments in Statistical Modelling
analysis, among others. The papers cover a wide range of applications, with a notably large number of contributions in the field of biostatistics. The contributions are drawn from the frequentist and Bayesian traditions in roughly equal measure, although this is a categorization that this community has long since moved beyond.
We would like to draw particular attention to the keynote contribution ‘Statistical
Modelling for Big and Little Data’ by Robin Henderson, which looks at contemporary
data problems from the wider viewpoint of data science, encompassing statistics and
machine learning, highlighting that both small and large data sets come with their own
challenges, and can be equally simple—or hard!—to model and analyse.
We are looking forward to an enjoyable and stimulating conference, and we hope that this volume will contribute to initiating and sustaining discussions about problems in statistical modelling, potentially triggering new developments and ideas. We already look forward to the next edition of the workshop, in Limerick in 2025, where perhaps some of these will be presented.
Acknowledgements. The Editors wish to thank the Durham Research Methods
Centre (DRMC) for their financial support of the conference.
Martin P. Boer
1 Introduction
Using B-splines for penalized regression, also known as P-splines [3], can offer computational efficiency due to the local character of B-splines: it leads to sparse systems of linear equations that can be solved easily. The primary challenge, however, lies in determining the optimal penalty parameter. One effective approach to this problem uses mixed models and restricted maximum likelihood (REML) [8]. Various methods have been suggested to convert the original penalized B-spline model into a mixed model. The drawback of most existing transformations to mixed models is that they do not preserve the local character of the B-splines, which diminishes the computational efficiency.

In [1] a new method was proposed, using a sparse transformation to mixed models. This method is computationally more efficient than other approaches. In this paper we show that REML can be used directly for P-splines, without a transformation to a mixed model.
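To fix ideas, the following sketch (our own illustration, not part of the method in this paper) fits a one-dimensional P-spline in the sense of [3] for a fixed penalty parameter $\lambda$; choosing $\lambda$ automatically is precisely the problem that the REML approach below addresses.

```r
## Minimal 1D P-spline: cubic B-splines on equally spaced knots with a
## second-order difference penalty; lambda is fixed by hand here.
library(splines)

set.seed(1)
n <- 200
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.2)

deg  <- 3                                   # cubic B-splines
nseg <- 17                                  # number of segments
dx   <- (max(x) - min(x)) / nseg
knots <- seq(min(x) - deg * dx, by = dx, length.out = nseg + 2 * deg + 1)
B <- splineDesign(knots, x, ord = deg + 1)  # n x q basis, q = nseg + deg
q <- ncol(B)

D <- diff(diag(q), differences = 2)         # (q - 2) x q difference matrix
lambda <- 1                                 # fixed penalty parameter
a_hat <- solve(t(B) %*% B + lambda * t(D) %*% D, t(B) %*% y)
y_fit <- B %*% a_hat                        # fitted P-spline curve
```

Because each row of B contains only deg + 1 non-zero entries, the penalized normal equations above are sparse and banded; this is the computational advantage that the methods below aim to preserve.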
We first briefly discuss the sparse mixed model formulation proposed by Boer [1]. For the moment we assume the one-dimensional case, to keep the notation simple, and extend to the two-dimensional case later on. First we introduce some notation. Let $y = (y_1, y_2, \ldots, y_n)^\top$ be the response variable, depending on the
Analytic expressions for $\xi_{i,j,p}$ are given in [6]. The matrix $G$ is a $q \times k$ matrix with $(i,j)$th entry $\xi_{i,j,p}$. The $(q - k) \times q$ matrix $D$ contains the $k$-th order difference penalties. The matrix $X$ is defined by $X = BG = [\mathbf{1} \,|\, \mathbf{x} \,|\, \cdots \,|\, \mathbf{x}^{k-1}]$, where $DG = 0$. Let $B_*$ be a $k \times q$ matrix with $(i,j)$th entry $B_{j,p}(x_{*,i})$, where the $k$ reference points $x_{*,i}$ can be chosen arbitrarily, subject to the condition $|B_* G| \neq 0$; see [1] for further details.
The sparse mixed model formulation for P-splines is defined by [1]
\[
y = X\beta + Zu + e, \qquad u \sim N(0, \Sigma), \quad e \sim N(0, R), \qquad (1)
\]
with $X = BG$, $Z = B$, $R^{-1} = \theta_0 I_n$ and $\Sigma^{-1} = \theta_1 (D^\top D + B_*^\top B_*)$. The restricted log-likelihood of Eq. (1) is given by
\[
2 \log L = \log|R^{-1}| + \log|\Sigma^{-1}| - \log|C| - \hat{e}^\top R^{-1} \hat{e} - \hat{u}^\top \Sigma^{-1} \hat{u}, \qquad (2)
\]
where $\hat{e} = y - X\hat{\beta} - Z\hat{u}$. The mixed model equations are given by
\[
\begin{pmatrix} X^\top R^{-1} X & X^\top R^{-1} Z \\ Z^\top R^{-1} X & Z^\top R^{-1} Z + \Sigma^{-1} \end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix} =
\begin{pmatrix} X^\top R^{-1} y \\ Z^\top R^{-1} y \end{pmatrix}, \qquad (3)
\]
where the matrix on the left-hand side is the mixed model coefficient matrix $C$. The matrices $R^{-1}$, $\Sigma^{-1}$, and $C$ are sparse for the P-splines mixed model, and therefore $\log L$ and its partial derivatives with respect to the precision parameters $\theta_m$ can be calculated in a computationally efficient way.
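To make the structure of Eqs. (1)-(3) concrete, the following sketch continues the 1D example above (reusing B, D, knots, q, y and deg), with $\theta_0$ and $\theta_1$ held fixed, whereas the paper estimates them by REML. Taking $G$ as a polynomial null-space basis of $D$ is one valid choice satisfying $DG = 0$ (the paper constructs $G$ from the analytic expressions in [6]), and the reference points $x_*$ are arbitrary, subject to $|B_* G| \neq 0$.

```r
## Sketch of the sparse mixed model (1) and the mixed model equations (3).
library(Matrix)

k <- 2                                   # second-order differences -> k = 2
G <- cbind(1, seq_len(q))                # polynomial null-space basis, DG = 0
X <- B %*% G                             # fixed-effect design, X = BG
x_star <- c(0.25, 0.75)                  # arbitrary reference points
Bstar  <- splineDesign(knots, x_star, ord = deg + 1)
stopifnot(abs(det(Bstar %*% G)) > 1e-8)  # condition |B* G| != 0

theta0 <- 25                             # residual precision, R^{-1} = theta0 I_n
theta1 <- 1                              # penalty precision
Rinv <- theta0 * Diagonal(length(y))
Sinv <- theta1 * (crossprod(D) + crossprod(Bstar))   # Sigma^{-1}, sparse

## mixed model coefficient matrix C and right-hand side of Eq. (3); Z = B
C   <- rbind(cbind(t(X) %*% Rinv %*% X, t(X) %*% Rinv %*% B),
             cbind(t(B) %*% Rinv %*% X, t(B) %*% Rinv %*% B + Sinv))
rhs <- rbind(t(X) %*% Rinv %*% y, t(B) %*% Rinv %*% y)
sol <- solve(C, rhs)
beta_hat <- as.numeric(sol[1:k])         # fixed effects
u_hat    <- as.numeric(sol[-(1:k)])      # spline random effects
```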
The extension to two-dimensional P-splines is relatively straightforward; for details and the extension to higher dimensions see [1]. First we define the covariates $x_1$ and $x_2$, with corresponding $n \times q_i$ matrices $B_i$ ($i = 1, 2$). The $n \times q$ matrix $B$ is defined by $B = B_1 \otimes_r B_2$, where $\otimes_r$ denotes the row-wise Kronecker product, and with $q = q_1 q_2$. The matrix $G = G_1 \otimes G_2$ has dimension $q \times k$, with $k = k_1 k_2$. The matrix $B_* = B_{*,1} \otimes B_{*,2}$ has dimension $k \times q$. Finally, for $\Sigma$ we have
\[
\Sigma^{-1} = \theta_1 D_1^\top D_1 \otimes I_{q_2} + \theta_2 I_{q_1} \otimes D_2^\top D_2 + (\theta_1 + \theta_2) B_*^\top B_*, \qquad (4)
\]
where $D_1$ and $D_2$ are difference penalty matrices.
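The row-wise Kronecker product is easy to implement directly; the helper below is our own sketch, not the LMMsolver internals. Because row $i$ of $B_1$ and $B_2$ contain only $d_1 + 1$ and $d_2 + 1$ non-zero B-spline values, respectively, each row of $B$ has at most $(d_1 + 1)(d_2 + 1)$ non-zeros, so the local character is preserved.

```r
## Row-wise Kronecker product B = B1 (x)_r B2: row i of the result is the
## Kronecker product of row i of B1 with row i of B2, giving n x (q1 * q2).
rowwise_kron <- function(B1, B2) {
  stopifnot(nrow(B1) == nrow(B2))
  (B1 %x% matrix(1, 1, ncol(B2))) * (matrix(1, 1, ncol(B1)) %x% B2)
}
```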
with determinant
\[
\begin{vmatrix} 0_k & B_* \\ G & I_q \end{vmatrix} = (-1)^k \, |B_* G| \neq 0. \qquad (6)
\]
where
\[
P = \theta_1 D_1^\top D_1 \otimes I_{q_2} + \theta_2 I_{q_1} \otimes D_2^\top D_2. \qquad (8)
\]
Using Eqs. (3), (5), and (7) it can be derived that the estimates $\hat{\eta}$ and $\hat{a}$ are given by
\[
\begin{pmatrix} (\theta_1 + \theta_2) I_k & 0 \\ 0 & B^\top R^{-1} B + P \end{pmatrix}
\begin{pmatrix} \hat{\eta} \\ \hat{a} \end{pmatrix} =
\begin{pmatrix} 0 \\ B^\top R^{-1} y \end{pmatrix}. \qquad (9)
\]
The restricted log-likelihood of this formulation is given by
\[
2 \log L = \log|R^{-1}| + \log|\Sigma^{-1}| - \log|C| - \hat{e}^\top R^{-1} \hat{e} - \hat{a}^\top P \hat{a}, \qquad (10)
\]
where $\hat{e} = y - B\hat{a}$.
From Eq. (7) it follows that $\log|C|$ can be decomposed as
\[
\log|C| = k \log(\theta_1 + \theta_2) + 2 \log|B_* G| + \log|C_*|, \qquad (11)
\]
with $C_* = B^\top R^{-1} B + P$ the lower right block in Eq. (9).
Using some linear algebra it can be shown that $\log|\Sigma^{-1}|$ can be decomposed as
\[
\log|\Sigma^{-1}| = \log|P|_+ - \log|G^\top G| + 2 \log|B_* G| + k \log(\theta_1 + \theta_2), \qquad (12)
\]
where the pseudo-determinant $|P|_+$ is defined as the product of the $q - k$ non-zero eigenvalues of $P$. The pseudo-determinant $|P|_+$ can be obtained in an efficient way by using the spectral decompositions $D_i^\top D_i = U_i \Lambda_i U_i^{-1}$, where $\Lambda_i$ is a diagonal matrix with the eigenvalues of $D_i^\top D_i$ ($i = 1, 2$). The two-dimensional penalty matrix $P$ defined by Eq. (8) can be written as [2]
\[
P = (\theta_1 U_1 \Lambda_1 U_1^{-1}) \otimes (U_2 I_{q_2} U_2^{-1}) + (\theta_2 U_1 I_{q_1} U_1^{-1}) \otimes (U_2 \Lambda_2 U_2^{-1})
= (U_1 \otimes U_2) \, (\theta_1 \Lambda_1 \otimes I_{q_2} + \theta_2 I_{q_1} \otimes \Lambda_2) \, (U_1 \otimes U_2)^{-1}.
\]
From this we obtain that $|P|_+$ can be calculated as the product of the non-zero elements of the diagonal matrix $\theta_1 \Lambda_1 \otimes I_{q_2} + \theta_2 I_{q_1} \otimes \Lambda_2$:
\[
|P|_+ = \prod_{\theta_1 \lambda_{1,i} + \theta_2 \lambda_{2,j} > 0} \left( \theta_1 \lambda_{1,i} + \theta_2 \lambda_{2,j} \right), \qquad (13)
\]
where $\lambda_{1,i}$ and $\lambda_{2,j}$ denote the diagonal elements of $\Lambda_1$ and $\Lambda_2$, respectively.
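In R, Eq. (13) translates directly into a few lines; this is our own sketch, and the dense eigendecomposition is adequate here because $D_i^\top D_i$ is only $q_i \times q_i$.

```r
## log|P|_+ via Eqs. (8) and (13): the eigenvalues of P are the sums
## theta1 * lambda_{1,i} + theta2 * lambda_{2,j}; the pseudo-determinant
## is the product over the q - k non-zero ones.
log_pdet_P <- function(D1, D2, theta1, theta2, tol = 1e-10) {
  lam1 <- eigen(crossprod(D1), symmetric = TRUE, only.values = TRUE)$values
  lam2 <- eigen(crossprod(D2), symmetric = TRUE, only.values = TRUE)$values
  ev <- outer(theta1 * lam1, theta2 * lam2, `+`)   # all eigenvalues of P
  sum(log(ev[ev > tol]))
}
```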
[Figure 1 near here: colour map of the fitted surface over longitude and latitude, with colour scale ypred.]
Fig. 1. Fitted surface for monthly precipitation anomalies in the USA for April 1948, using LMMsolver with 40 segments in both directions. Computation time is less than one second, about 250 times faster than the SOP package, which gives the same fit.
Substituting Eqs. (11) and (12) into Eq. (10) gives the following expression for the REML log-likelihood:
\[
2 \log L = \log|R^{-1}| + \log|P|_+ - \log|G^\top G| - \log|C_*| - \hat{e}^\top R^{-1} \hat{e} - \hat{a}^\top P \hat{a}, \qquad (14)
\]
where $C_* = B^\top R^{-1} B + P$. An efficient way to obtain $\log|C_*|$ and to solve Eq. (9) is by using a sparse Cholesky decomposition of $C_*$.
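In R, the sparse Cholesky machinery of the Matrix package can play this role; the sketch below uses a small stand-in matrix for $C_*$, whereas the paper's implementation relies on its own supernodal C++ code [7].

```r
## log|C*| from the diagonal of a sparse Cholesky factor, plus the solve
## of Eq. (9); Cstar below is a stand-in for B' R^{-1} B + P.
library(Matrix)

nq <- 50
Cstar <- bandSparse(nq, k = 0:1,
                    diagonals = list(rep(4, nq), rep(-1, nq - 1)),
                    symmetric = TRUE)
rhs <- rnorm(nq)                     # stand-in for B' R^{-1} y

L <- chol(Cstar)                     # sparse upper-triangular factor
log_det_Cstar <- 2 * sum(log(diag(L)))
a_hat <- solve(Cstar, rhs)           # spline coefficients of Eq. (9)
```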
The expressions for the REML log-likelihood for P-splines and for standard mixed models have a similar structure, as can be seen by comparing Eqs. (2) and (14). There are two main differences. First, whereas the precision matrix $\Sigma^{-1}$ in Eq. (2) is positive definite, $P$ in Eq. (14) is singular; however, $\log|P|_+$ can be calculated in an efficient way using Eq. (13). Second, the REML P-splines formulation in Eq. (14) contains an extra constant $-\log|G^\top G|$, which can be easily calculated or simply ignored.
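Continuing the one-dimensional sketches above, where $P = \theta_1 D^\top D$ and the $k \log(\theta_1 + \theta_2)$ terms reduce to $k \log \theta_1$, the equivalence of the two expressions can be checked numerically:

```r
## Check that Eqs. (2) and (14) give the same value (1D analogue), reusing
## B, D, G, Bstar, X, C, Rinv, Sinv, theta1, y, q, k, beta_hat, u_hat.
P     <- theta1 * crossprod(D)              # singular precision matrix
Cstar <- t(B) %*% Rinv %*% B + P
e_hat <- y - X %*% beta_hat - B %*% u_hat
a_hat <- solve(Cstar, t(B) %*% Rinv %*% y)

ldet <- function(M) as.numeric(determinant(M, logarithm = TRUE)$modulus)
lam  <- eigen(P, symmetric = TRUE, only.values = TRUE)$values
log_P_plus <- sum(log(lam[lam > 1e-8]))     # log pseudo-determinant

## Eq. (2): REML log-likelihood of the mixed model formulation
ll2  <- ldet(Rinv) + ldet(Sinv) - ldet(C) -
  as.numeric(t(e_hat) %*% Rinv %*% e_hat) - sum(u_hat * (Sinv %*% u_hat))
## Eq. (14): direct REML P-splines log-likelihood
ll14 <- ldet(Rinv) + log_P_plus - ldet(crossprod(G)) - ldet(Cstar) -
  as.numeric(t(e_hat) %*% Rinv %*% e_hat) - as.numeric(t(a_hat) %*% P %*% a_hat)
all.equal(ll2, ll14)                        # TRUE, up to rounding
```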
An important element in calculating Eq. (14) and its partial derivatives with respect to the precision parameters $\theta_m$ in an efficient way is to avoid calculating the inverses of the precision matrices, which are not sparse. One way to do this is to calculate the so-called sparse inverse. In LMMsolver [1], automatic differentiation of the Cholesky algorithm [11] was implemented. Backward differentiation was used, which calculates the partial derivatives of the likelihood efficiently [11]. A detailed example for one-dimensional P-splines is given by Eilers and Boer [4]. The automatic differentiation was implemented in LMMsolver using supernodal Cholesky factorization [7]. The implementation was written in C++ using the Rcpp package.
[Figure 2 near here: computation time in seconds (log scale, 10 to 1000) against the number of segments (20 to 50), for the methods LMMsolver, LMMsolver2, mgcv, and SOP.]
[9] for further details. Cubic B-splines with second-order differences were used
for both latitude and longitude. The result is shown in Fig. 1, using 40 segments
in both directions. The models defined in [1] and [9] are both equivalent to the
new formulation presented in this article.
We compared computation times with other methods for different numbers of segments, using the same number of segments in both dimensions. All computations were performed in R 4.4.0 (R Core Team 2024) on a 2.90 GHz Intel Core i5-9400 CPU with 24 GB of RAM, running the Windows 10 operating system. Version 1.9-1 of mgcv [12], version 1.0.1 of SOP [10], and version 1.0.7 of LMMsolver [1] were used. For mgcv we used the bam() function with method="fREML", as in the sketch below.
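For reference, the model calls for the two main competitors look roughly as follows. This is a sketch based on the packages' documented interfaces; the data frame dat (with columns y, lon and lat) and the basis dimension passed to te() are our own illustrative choices, not the exact benchmark configuration.

```r
## Sketch of the benchmarked model calls.
library(LMMsolver)
library(mgcv)

nseg <- 40

## LMMsolver: sparse mixed model P-splines with a 2D spline term
fit_lmm <- LMMsolve(fixed  = y ~ 1,
                    spline = ~spl2D(x1 = lon, x2 = lat, nseg = c(nseg, nseg)),
                    data   = dat)

## mgcv: bam() with a tensor-product P-spline smooth and fast REML
fit_bam <- bam(y ~ te(lon, lat, bs = "ps", k = c(20, 20)),
               method = "fREML", data = dat)
```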
Figure 2 compares the computation times, showing that the sparse mixed model formulation in [1] and the new REML P-splines model are several orders of magnitude faster than SOP and mgcv. For example, for 50 segments in both directions, the computation times for the two LMMsolver methods are both less than 2 s, whereas SOP takes 11 min and mgcv needs 49 min. The new REML P-splines model is a bit faster than the original sparse mixed P-splines model in [1], but the differences are marginal.
5 Discussion
In this article we have shown that there is a direct connection between REML and P-splines; a transformation of P-splines to a mixed model is therefore not needed. The REML P-splines model and the sparse mixed model formulation in [1] keep the sparse structure of the B-splines, which makes them fast compared to other methods, where the sparse structure is lost in the transformation to mixed models.

Here we showed results for two-dimensional P-splines, but the same idea can be extended to other dimensions. For Generalized Additive Models the sparse mixed model formulation by Boer [1] has the advantage of modelling an explicit term for the intercept, which makes the system identifiable. The sparse mixed model P-splines in [1] and the new REML P-splines are closely connected, and therefore the combination of the two formulations looks promising.
References
1. Boer, M.P.: Tensor product P-splines using a sparse mixed model formulation.
Stat. Model. 23, 465–479 (2023)
2. Currie, I.D., Durban, M., Eilers, P.H.C.: Generalized linear array models with
applications to multidimensional smoothing. J. R. Stat. Soc. Ser. B Stat. Methodol.
68(2), 259–280 (2006)
3. Eilers, P.H., Marx, B.D.: Flexible smoothing with B-splines and penalties. Stat.
Sci. 11(2), 89–121 (1996)
4. Eilers, P.H.C., Boer, M.P.: Derivatives of the log of a determinant. In: Developments in Statistical Modelling, 38th International Workshop on Statistical Modelling (2024)
5. Furrer, R., Sain, S.R.: A sparse matrix R package with emphasis on MCMC meth-
ods for Gaussian Markov random fields. J. Stat. Softw. 36, 1–25 (2010)
6. Lyche, T., Manni, C., Speleers, H.: Foundations of spline theory: B-splines, spline
approximation, and hierarchical refinement. Lect. Notes Math. 2219, 1–76 (2018)
7. Ng, E.G., Peyton, B.W.: Block sparse Cholesky algorithms on advanced unipro-
cessor computers. SIAM J. Sci. Comput. 14, 1034–1056 (1993)
8. Patterson, H.D., Thompson, R.: Recovery of inter-block information when block
sizes are unequal. Biometrika 58, 545–554 (1971)
9. Rodríguez-Alvarez, M.X., Lee, D.J., Kneib, T., Durban, M., Eilers, P.H.: Fast
smoothing parameter separation in multidimensional generalized P-splines: the
SAP algorithm. Stat. Comput. 25, 941–957 (2015)
10. Rodríguez-Alvarez, M.X., Durban, M., Lee, D.J., Eilers, P.H.: On the estimation of
variance parameters in non-standard generalised linear mixed models: application
to penalised smoothing. Stat. Comput. 29, 483–500 (2019)
11. Smith, S.P.: Differentiation of the Cholesky algorithm. J. Comput. Graph. Stat. 4,
134 (1995)
12. Wood, S.N.: Generalized Additive Models: An Introduction with R. Chapman and
Hall/CRC (2017)
Learning Bayesian Networks from Ordinal
Data - The Bayesian Way
Marco Grzegorczyk
1 Introduction
Bayesian networks (BNs) make use of directed acyclic graphs (DAGs) to describe
the conditional dependencies among random variables $X_1, \ldots, X_n$. The large majority of BN models assume either that the $n$ random variables have a joint multivariate Gaussian distribution (see, e.g., [1,2]) or that each of the $n$ variables has a nominal (categorical) distribution (see, e.g., [3]). BNs for variables with ordinal (categorical) distributions have scarcely been explored in the literature. Recently, Luo et al. [4] proposed the so-called OSEM ('ordinal structural expectation maximization') method for BN learning from ordinal data. Luo et al. assume
that there is a Gaussian Bayesian network (DAG) among continuous variables
but that the continuous variables cannot be observed directly. The continuous
variables can only be observed in discretized form; i.e. each Gaussian variable
is in one-to-one correspondence with an ordinal (categorical) variable, obtained
through discretization of the corresponding Gaussian variable. Figure 1 provides
a graphical illustration of the relationships between the unobserved latent Gaus-
sian variables and the observable ordinal variables. BN structure learning then
aims at learning the DAG among the non-observable latent Gaussian variables
from the observable discretized (ordinal) variables.
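As a toy illustration of this setup (our own example, not the OSEM or BoB implementation), one can simulate a two-node latent Gaussian DAG and observe only thresholded versions of its variables:

```r
## Latent Gaussian DAG X1 -> X2; only the discretized ordinal versions
## Y1, Y2 are observed. Coefficients and thresholds are arbitrary choices.
set.seed(1)
n  <- 500
x1 <- rnorm(n)                        # latent Gaussian parent
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)   # latent Gaussian child

y1 <- cut(x1, breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE)  # 3 levels
y2 <- cut(x2, breaks = c(-Inf, 0, Inf), labels = FALSE)          # 2 levels

table(y1, y2)   # structure learning sees only (y1, y2), never (x1, x2)
```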
We propose a Bayesian variant of OSEM, and we refer to it as the BoB
method (‘Bayesian way of modelling ordinal data in form of latent Bayesian