Evaluation of Four Different Standard Addition Approaches With Respect To Trueness and Precision

This research paper evaluates four standard addition methods for estimating unknown concentrations: conventional extrapolation, interpolation, inverse regression, and normalization. The study compares these methods based on trueness and precision, concluding that the extrapolation method remains the most reliable under ideal conditions, while other methods may be beneficial in specific scenarios, such as dealing with outliers. Mathematical formulas for bias and variance are provided, alongside real-world data applications to illustrate the performance of each method.


Analytical and Bioanalytical Chemistry (2025) 417:1187–1197

https://doi.org/10.1007/s00216-024-05725-8

RESEARCH PAPER

Evaluation of four different standard addition approaches with respect to trueness and precision

Gerhard Gössler1 · Vera Hofer2 · Walter Goessler1

Received: 4 November 2024 / Revised: 13 December 2024 / Accepted: 17 December 2024 / Published online: 10 January 2025
© The Author(s) 2025

Abstract
This work provides a statistical analysis of four different approaches suggested in the literature for the estimation of an
unknown concentration based on data collected using the standard addition method. These approaches are the conventional
extrapolation approach, the interpolation approach, inverse regression, and the normalization approach. These methods are
compared under the assumption that the measurement errors are normally distributed and homoscedastic. Comparison is done
with respect to the two most important characteristics of every estimator, namely trueness (bias) and precision (variability).
In addition, where they are not already available, the authors supply mathematical formulas to approximate both quantities.
A real-world data set is also used to illustrate the performance of all four methods. It turns out that, provided all assumptions
underlying the use of the standard addition method hold, the common extrapolation method is still the most recommendable
method with respect to bias and variability. Nonetheless, if additional concerns come into play, other methods might also be of
interest to the practitioner, for example the normalization approach when outliers are a pronounced problem.

Keywords Standard addition · Trueness and precision · Bias and variability · Extrapolation · Interpolation · Normalization ·
Inverse regression · Approximation formulas

Correspondence: Gerhard Gössler, [email protected]
1 Institute of Chemistry, Analytical Chemistry, University of Graz, Graz, Austria
2 Institute of Operations and Information Systems, University of Graz, Graz, Austria

Introduction

In this work, four approaches proposed for the evaluation of standard addition results are statistically scrutinized. These approaches are the well-established extrapolation approach and three alternative approaches, which are intended to improve the quality of the outcome of standard addition in terms of the trueness and precision of the estimator and/or the simplicity of the statistical analysis of the results. All of these approaches were evaluated in detail on the basis of extensive simulation studies and, primarily for illustrative purposes, by application to a real data set. The supplementary material contains additional mathematical considerations, namely the derivation of some of the approximation formulas for bias and variance of the different estimators for the unknown concentration.

To recap, the well-known standard addition method is intended as a remedy in cases in which a matrix has a “rotational” but no “translational” effect [1] (the matrix affects only the slope of the calibration function but not its intercept) and, due to a lack of pure matrix, no external calibration can be performed. The essential assumptions of this approach are that the relationship between analyte concentration and the measurement signal is linear and that the blank value is not significantly different from zero. If these assumptions are fulfilled, the relationship between measurement signal y (response) and concentration C (independent variable, denoted x for spiked concentrations in the following) can, in an ideal setting where no measurement error is present either, be stated as follows (see Fig. 1a):

$$y = y(C) = \beta C.$$

Therefore, if an unknown concentration C0 is present in a sample, the observed signal is y(C0) = βC0. When applying standard addition, the sample is subsequently spiked with additional amounts of the analyte such that the observed signal after spiking, y(C0 + x), equals

$$\beta(C_0 + x) = \beta C_0 + \beta x =: \beta_0 + \beta_1 x.$$

Therefore, the unknown concentration C0 equals

$$\frac{y(C_0)}{\beta} = \frac{\beta C_0}{\beta} = \frac{\beta_0}{\beta_1},$$

which results in the so-called extrapolation approach.

Unfortunately, even if the measurement system is unbiased, the measurements are overlain with measurement error, i.e., when assuming normally distributed errors,

$$Y(C_0 + x) = \beta_0 + \beta_1 x + \varepsilon, \quad \varepsilon \sim N(0, \sigma). \tag{1}$$

Throughout this work, the errors are considered to be homoscedastic, i.e., σ does not depend on the measured concentrations.

As a consequence of these errors, only the estimates β̂0 and β̂1 of the true parameters β0 and β1 are available, which are, though still unbiased, themselves subject to error (see Fig. 1b). Depending on the concentrations at which the measurements are taken (the experimental design) and the extent of the measurement error, these parameter estimates are distributed as follows:

$$\hat{\beta}_0 \sim N\left(\beta_0,\ \sigma\left(\frac{\sum_{i=1}^{n} x_i^2}{n\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)^{1/2}\right), \quad \hat{\beta}_1 \sim N\left(\beta_1,\ \sigma\left(\frac{1}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)^{1/2}\right),$$

with xi, i = 1, ..., n, denoting the values of the independent variable.

Due to this variability, the estimator Ĉ0 of the unknown concentration also comes with some amount of uncertainty, since it is the quotient of two random variables: Ĉ0 = β̂0/β̂1.

Fig. 1 a Basic principle of standard addition. Without error, the unknown concentration C0 can be determined exactly by computing β0/β1. b Due to measurement error, the measurements (within the ranges given by the black vertical bars) and therefore also the estimated regression lines (green lines) show considerable variation when the process of standard addition is repeatedly and independently applied. The possible estimates Ĉ0 are therefore scattered over the range indicated by R

What is of course of interest now is the distribution of Ĉ0, which is, under our assumptions, the quotient of two normally distributed quantities. This question can be answered only approximately, and there are several approaches that can be applied to approximate the bias and variance of Ĉ0. For example, [2] present two approaches that come to the same result with respect to the variability of the estimator of the extrapolation method. The first is the method of propagation of error [3], which also works well in all other cases if applied properly. Problems arise if, for example, the correlation between intercept and slope is not correctly taken into account, with the consequence that the variance of the estimator is underestimated (an error that, according to [2], can even be found in textbooks). A second approach to determine the variance of Ĉ0 is the “extrapolation method”, which is based on the so-called “inverse regression problem” (see [4, 5]). In the respective literature, the delta method [6] is also mentioned, e.g., in [7]; it is derived from the method of propagation of error to obtain the asymptotic distribution of a random variable. In both cases, the approximation of the mean and variance of a random variable that is given as a function of random variables with known means and variances is found by applying Taylor series expansion. The general application of the Taylor series expansion for the approximate determination of mean and variance can, for example, be found in [8], p. 161ff. Therefore, depending on the approach and the quantity in question (i.e., bias or variance), Taylor series expansion is applied to Ĉ0 either as a function of β̂0, β̂1 or as a function of Y1, ..., Yn, which in particular means that Ĉ0 is differentiated with respect to either β̂0, β̂1 or Y1, ..., Yn. For the extrapolation approach, the


variance of Ĉ0, $\sigma^2_{\hat{C}_0}$, is now approximately given by the following:

$$\sigma^2_{\hat{C}_0} \approx \left[\left(\frac{\sigma_{\hat{\beta}_0}}{\beta_0}\right)^2 + \left(\frac{\sigma_{\hat{\beta}_1}}{\beta_1}\right)^2 - 2\rho_{\hat{\beta}_0\hat{\beta}_1}\frac{\sigma_{\hat{\beta}_0}\sigma_{\hat{\beta}_1}}{\beta_0\beta_1}\right] C_0^2 \tag{2}$$

with $\rho_{\hat{\beta}_0\hat{\beta}_1}$ denoting the correlation ρ of β̂0 and β̂1, which is given by

$$\rho = \rho_{\hat{\beta}_0\hat{\beta}_1} = \frac{-\sum_{i=1}^{n} x_i}{\sqrt{n\sum_{i=1}^{n} x_i^2}}.$$

By plugging in the estimators for all unknown quantities, one gets an estimator for $\sigma^2_{\hat{C}_0}$ based on measured data:

$$s^2_{\hat{C}_0} = \left[\left(\frac{s_{\hat{\beta}_0}}{\hat{\beta}_0}\right)^2 + \left(\frac{s_{\hat{\beta}_1}}{\hat{\beta}_1}\right)^2 - 2\rho_{\hat{\beta}_0\hat{\beta}_1}\frac{s_{\hat{\beta}_0}s_{\hat{\beta}_1}}{\hat{\beta}_0\hat{\beta}_1}\right]\hat{C}_0^2.$$

This approximation works well if the ratio β1/σ is above a certain limit, i.e., as long as it is very unlikely that the denominator takes on values close to zero. Therefore, as a rule of thumb, we suggest $\beta_1/\left(\sigma\left(\sum_{i=1}^{n}(x_i-\bar{x})^2\right)^{-0.5}\right) > 12$, but different ratios are also reported in the literature [9]. Especially if this rule of thumb is observed, the distribution of Ĉ0 is considered to be approximately normal. Therefore, the formula for $\sigma^2_{\hat{C}_0}$ can be used not only for judging the performance of the estimator but also for constructing a confidence interval (CI) for C0. For more information on the distribution of the ratio of two normally distributed random variables, the interested reader is referred to [10, 11] and [12].

The variability of the estimator for C0 derived by the extrapolation approach, as approximated by formula 2, gave rise to the interpolation approach in an attempt to reduce the estimator variability. Further efforts to improve the handling and robustness of the extrapolation estimator resulted in reversed regression and the normalization approach. To our knowledge, no thorough comparison of these different approaches regarding the goodness of the respective estimator, as expressed by trueness and precision, is available in the literature. We bridge this gap by deriving approximation formulas for these quantities and providing the results of simulation studies.

In order to facilitate the understanding of the different approaches, this work is organized as follows: The “Methods” section presents all four approaches to be compared in this work. One of the main aims of the “Methods” section is to present all approaches in a unified notation, as many different notations can be found in the literature. Although all these notations are equivalent, we have chosen the standard notation used in regression analysis, which, at least in our opinion, promotes understanding of the topic in general and also with regard to the details in which the approaches differ. The “Results” section provides the approximation formulas and shows the application of the four approaches and the respective approximation formulas to the data set given by [13]. In addition, some simulation results combined with the corresponding results gained by applying the approximation formulas are provided. These results are discussed in detail in the “Discussion” section, highlighting in particular the differences between the approaches with respect to bias and variance. The “Summary” section contains a summary, and last but not least, the derivation of the approximation formulas can be found in the supplementary material.

Methods

The extrapolation approach, as described in the “Introduction” section, estimates the unknown concentration C0 as $\hat{C}_0^e = \hat{\beta}_0/\hat{\beta}_1$, with β̂0 and β̂1 denoting the estimates for intercept and slope gained by simple linear regression applied to the data generated by standard addition. Graphically, the intersection of the regression line and the x-axis is equal to $-\hat{C}_0^e$. This is illustrated in Fig. 2a. Due to the measurement error, repeated application of standard addition yields differing regression lines. In Fig. 2, 1000 such regression lines are depicted as green (a, c, and d) or gray (b) lines. Due to this variability, the resulting estimates also vary, which is indicated by the respective ranges. In the case of the extrapolation approach, the corresponding range is denoted as R.

The interpolation approach is intended to improve the precision of the estimate, i.e., to supply an estimator $\hat{C}_0^i$ with reduced variance compared to $\hat{C}_0^e$. The idea is as follows ([14, 15] and [16]): The estimation is shifted to the region covered by spiking by calculating

$$\hat{C}_0^i = (2y_1 - \hat{\beta}_0)/\hat{\beta}_1,$$

with y1 denoting the observed (averaged) signal for the unspiked sample. The aim is a concentration estimator with lower variance, because the variance of the regression line is lower within the range of the available data than outside of this range; hence the designation “interpolation.” Graphically (see Fig. 2b), $\hat{C}_0^i$ is given by the x-value (curly bracket 2) of the intersection of the random horizontal line (y = 2y1, curly bracket 1) and the random regression line (y = β̂0 + β̂1 x, curly bracket R). Contrary to the basic idea of the interpolation approach, the range indicated by curly bracket R ($\hat{C}_0^e$) is narrower than the range indicated by curly bracket 2 ($\hat{C}_0^i$).
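
The extrapolation and interpolation estimators, together with the plug-in variance estimate that follows from formula 2, can be sketched in a few lines of Python. This is only a minimal illustration under the stated model assumptions (linear response, homoscedastic errors), not the authors' implementation; the function names `fit_line`, `extrapolation` and `interpolation` and the data in the usage lines are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x.

    Returns the estimates b0, b1, their standard errors and the
    correlation between intercept and slope estimate.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    sum_x2 = sum(xi ** 2 for xi in x)
    s_b0 = math.sqrt(s2 * sum_x2 / (n * s_xx))
    s_b1 = math.sqrt(s2 / s_xx)
    rho = -sum(x) / math.sqrt(n * sum_x2)
    return b0, b1, s_b0, s_b1, rho


def extrapolation(x, y):
    """Extrapolation estimate b0 / b1 and its standard error (plug-in form of formula 2)."""
    b0, b1, s_b0, s_b1, rho = fit_line(x, y)
    c0 = b0 / b1
    var = ((s_b0 / b0) ** 2 + (s_b1 / b1) ** 2
           - 2 * rho * s_b0 * s_b1 / (b0 * b1)) * c0 ** 2
    return c0, math.sqrt(var)


def interpolation(x, y):
    """Interpolation estimate (2 * y1 - b0) / b1, y1 = mean unspiked signal."""
    b0, b1, *_ = fit_line(x, y)
    unspiked = [yi for xi, yi in zip(x, y) if xi == 0]
    y1 = sum(unspiked) / len(unspiked)
    return (2 * y1 - b0) / b1


if __name__ == "__main__":
    # Hypothetical spiked concentrations and signals, two series.
    x = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
    y = [2.1, 1.9, 2.6, 2.4, 3.1, 2.9]
    print(extrapolation(x, y), interpolation(x, y))
```

Keeping the correlation term in `extrapolation` matters: dropping it would understate the variance, which is the textbook error mentioned in the Introduction.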


Fig. 2 Graphical comparison of the different standard addition approaches
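
The repeated-sampling picture behind Fig. 2 can be mimicked with a small Monte Carlo sketch: standard addition is simulated many times from the linear model with normal, homoscedastic errors, and the spread of the extrapolation and interpolation estimates is compared. The function name `simulate_standard_addition`, the parameter values and the design in the usage lines are hypothetical and are not taken from the paper's data.

```python
import random
import statistics


def simulate_standard_addition(beta0, beta1, sigma, x, runs=1000, seed=1):
    """Simulate `runs` standard additions and return two lists of estimates:
    extrapolation (b0 / b1) and interpolation ((2 * y1 - b0) / b1)."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    extrap, interp = [], []
    for _ in range(runs):
        # Signals generated from y = beta0 + beta1 * x + eps, eps ~ N(0, sigma).
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = sum(y) / n
        b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
        b0 = y_bar - b1 * x_bar
        unspiked = [yi for xi, yi in zip(x, y) if xi == 0]
        y1 = sum(unspiked) / len(unspiked)
        extrap.append(b0 / b1)
        interp.append((2 * y1 - b0) / b1)
    return extrap, interp


if __name__ == "__main__":
    # Hypothetical setting: C0 = beta0 / beta1 = 4.
    extrap, interp = simulate_standard_addition(
        2.0, 0.5, 0.02, [0.0, 1.0, 2.0, 3.0, 4.0], runs=2000)
    print("extrapolation: mean, sd =", statistics.mean(extrap), statistics.stdev(extrap))
    print("interpolation: mean, sd =", statistics.mean(interp), statistics.stdev(interp))
```

Both estimators are computed from the same simulated measurements in each run, so the comparison uses an identical design, as the text requires for a fair comparison.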

The reversed-axis approach ([7, 13]) is intended to reduce the effort for the estimation of the precision, i.e., of $s_{\hat{C}_0}$. This is done by reversing the roles of the analyte concentration and the instrument response in linear regression. The former now serves as the dependent variable and the latter as the independent variable. Therefore, by rearranging the assumed relationship

$$y = \beta_0 + \beta_1 x + \varepsilon \quad \text{to} \quad x = -\beta_0/\beta_1 + y/\beta_1 - \varepsilon/\beta_1 =: \beta_0^r + \beta_1^r y + \varepsilon^r$$

it can be seen that the estimator $\hat{\beta}_0^r$ of $\beta_0^r$ is the negative value of the desired estimate for C0, i.e.,

$$\hat{C}_0^r = -\hat{\beta}_0^r.$$

The variance of $\hat{\beta}_0^r$ can be easily estimated by applying the formula for the variance of the estimator of the y-intercept of simple linear regression to the reversed data, i.e.,

$$s^2_{\hat{C}_0^r} = s_r^2\,\frac{\sum_{i=1}^{n} y_i^2}{n\sum_{i=1}^{n}(y_i-\bar{y})^2},$$

with $s_r^2$ denoting the estimate of the variance of $\varepsilon^r$.

Graphically (see Fig. 2c), $-\hat{C}_0^r$ is given by the intersection of the regression line and the y-axis. Due to the measurement error, the negative estimate can be found within the range bounded by the two vertical lines below the x-axis (due to different axis scaling, this range cannot be directly compared to the other estimator ranges indicated in Fig. 2).

The last approach considered is the so-called normalization approach [16], which is intended to mitigate the effects of outliers. The idea is to normalize all observations yi, i = 1, ..., ne, with ne denoting the number of different spiked concentrations, with respect to y1, the observation for the unspiked sample. Normalization is carried out by dividing all observations by y1, i.e., $y_i^n = y_i/y_1$. Therefore, it follows from y = β0 + β1 x + ε that

$$y_i^n = \beta_0/y_1 + (\beta_1/y_1)x_i + \varepsilon_i/y_1 =: \beta_0^n + \beta_1^n x_i + \varepsilon_i^n.$$

If more than one series is measured, each series has to be normalized with respect to its unspiked measurement. It has to be pointed out that normalization has two important consequences:

Firstly, all normalized observations for the unspiked samples are equal to 1, which means that in the subsequent regression analysis the regression line has to be forced through 1. This is equivalent to a fixed y-intercept, i.e., $\hat{\beta}_0^n$ is always equal to 1. Therefore, the estimator $\hat{C}_0^n$ for the unknown concentration C0 is now given by

$$\hat{C}_0^n = 1/\hat{\beta}_1^n$$

since $\beta_0/\beta_1 = (\beta_0/y_1)/(\beta_1/y_1) = \beta_0^n/\beta_1^n$ and $\hat{\beta}_0^n = 1$.

This result is also reached by the following consideration: Since y1 ∼ N(β0, σ), the expectation of $\hat{\beta}_1^n$ is approximately β1/β0 = 1/C0 (there will be some bias), i.e., $\hat{C}_0^n = 1/\hat{\beta}_1^n$. Graphically (see Fig. 2d), the intersection of the regression line and the x-axis is equal to $-\hat{C}_0^n$, which can, due to the variability of the measurements, be found in the range bounded by the two vertical lines on the left-hand side of the y-axis. The figure also shows the fixed intercept, which is always equal to 1, and the heteroscedasticity brought about by normalization.

Secondly, the quotient of random variables is a random variable whose variance depends not only on the variance, but also on the expectation of the random variables involved in the quotient [12]. The latter has the consequence that the error term is now heteroscedastic (see Fig. 2d) in a way that makes a proper estimation of σ, the variance of the measurement


error, based on the regression with respect to the normalized data impossible. Therefore, the variance of $\hat{\beta}_1^n$ and thus the variance of $\hat{C}_0^n$, $\sigma^2_{\hat{C}_0^n}$, cannot be determined easily either. The respective approach according to the supplementary material of [16] therefore estimates the variance of the slope estimator, $\sigma^2_{\hat{\beta}_1^n}$, not by applying an approximation formula (like that given in Table 2), but alternatively by calculating the slope for each of the independently taken series of measurements separately. This set of different estimates for $\beta_1^n$ is then used to estimate the variability of $\hat{\beta}_1^n$. Let us denote the number of the different series measured by nr, nr ≥ 1; obviously, to make this approach work, nr needs to be greater than 1. In order to remain consistent in the simulations, we adopted this approach and also used this estimate for $\sigma^2_{\hat{\beta}_1^n}$, $s^2_{\hat{\beta}_1^n}$, subsequently to estimate σ by rearranging the approximation formula for $\sigma^2_{\hat{\beta}_1^n}$. This estimate for σ can then be used to calculate an estimate for the bias by applying the respective approximation formula in Table 1. Unfortunately, this approach overestimates $\sigma_{\hat{\beta}_1^n}$ on average by a factor of roughly $\sqrt{n_r - 1}$. This can be regarded as similar to the relationship between the standard deviation of a random variable X with variance $\sigma_X^2$ and the standard deviation of the corresponding sample mean $\bar{X}_n$ based on n observations, for which $\sigma_{\bar{X}_n} = \sigma_X/\sqrt{n}$. Therefore, correcting $s_{\hat{\beta}_1^n}$ by multiplying it with a factor of around $1/\sqrt{n_r - 1}$ is necessary to avoid overestimation, which subsequently also leads to a correction of $s_{\hat{C}_0^n}$ by the same factor. A further problem is to determine the proper degrees of freedom, df, for calculating a reasonable confidence interval for C0. Simulations indicate that especially the number of different spiked concentrations, ne, seems to have a rather negligible influence on the adequate choice of df. For example, simulations indicate that $\hat{C}_0^n \pm \frac{s_{\hat{C}_0^n}}{\sqrt{n_r - 1}}\, t_{df,0.975}$ yields a 95% CI for C0 when choosing df around 4 (depending on nr), irrespective of ne. However, in our opinion, applying this correction in practice would require basing this choice for df on a solid theoretical argument that needs to be further investigated and is not yet available. However, overestimation of $\sigma^2_{\hat{\beta}_1^n}$ obviously has the consequence that the bias is on average overestimated, as is the variance of Ĉ0. The latter yields confidence intervals which are, on average, too wide and therefore exceed the chosen confidence level. If the estimate for σ gained by regression analysis applied to the unnormalized data is used to calculate bias, variance and the CI for the normalization approach, the results are in line with the simulation results and the chosen confidence level (using this approach df = n − 2), but in our opinion, it would be inconsistent to extend the normalization approach by an additional analysis of the unnormalized data.

As also pointed out in [16], when applying the normalization method, proper measurement of the unspiked sample is crucial. Outliers in the case of the unspiked measurement would severely worsen the result of the normalization method compared to the extrapolation method, but if outliers that cannot be removed before regression analysis occur with respect to spiked samples, it can outperform the extrapolation approach. The unaffectedness of the unspiked samples is of utmost importance for the normalization approach because the gain in robustness with respect to outliers is achieved by fixing the y-intercept after normalization ($\hat{\beta}_0^n = 1$). Therefore, it is essential to be able to rely on the measured values for the unspiked samples.

Before discussing the performance of the different approaches in more detail in the next section, we want to point out the following important facts to avoid misunderstandings:

1. The quotient of two random variables has the following property: Since, in general, for a random variable X, the expectation of its reciprocal value E(1/X) is not equal to the reciprocal value of its expectation, i.e., 1/E(X), we have that the estimator Ĉ0 = β̂0/β̂1 is biased, i.e., E(Ĉ0) ≠ E(β̂0)/E(β̂1) = β0/β1. This means that the estimator for the unknown concentration is on average not identical to the true value, but more or less off [17].

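Table 1 Approximation formulas for the bias of Ĉ0, bias = E(Ĉ0) − C0

Approach: Bias

Extrapolation (a): $\sigma^2\,\frac{S_{xx}}{S_{xy}^2}\,(C_0 + \bar{x})$

Interpolation (b): $\sigma^2\,\frac{S_{xx}}{S_{xy}^2}\,(C_0 + \bar{x})$ or, alternatively, $\frac{\sigma^2 n_r S_{xx}}{S_{xy}^3}\left[\bar{x}\left((2/n_r - 1/n)S_{xy} + \bar{x}(\beta_0 - \beta_1\bar{x})\right) + (\beta_0 - \beta_1\bar{x})S_{xx}/n_r\right]$

Reversed axis (c): $-\frac{\bar{y}(n-3)\sigma^2}{\beta_1 S_{yy}}$

Normalization (d): $\frac{n_r\sigma^2\beta_0}{\beta_1^3\left(\sum_{i=1}^{n} x_i^2\right)^2}\left[\frac{\left(\sum_{i=1}^{n_e} y_{1i}^0 x_i\right)^2}{\beta_0^2} - \frac{\beta_1\sum_{i=1}^{n} x_i^2\,\sum_{i=1}^{n_e} y_{1i}^0 x_i}{\beta_0^2} + \sum_{i=1}^{n_e} x_i^2\right]$

(a), (b), (d): derived by authors (see supplementary material); (c): [7]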
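
As a rough numerical illustration of Table 1, the following sketch evaluates the approximate bias of the extrapolation and the normalization estimator from assumed true parameters. The function names, the argument layout (one series `x_series` of spiked concentrations that is repeated `n_r` times) and the numbers in the usage lines are hypothetical. The sketch also assumes that S_xy is evaluated at the expected signals, so that S_xy = β1 S_xx; that is a simplification made here, not a statement taken from the paper.

```python
def bias_extrapolation(beta0, beta1, sigma, x_series, n_r=1):
    """Approximate bias sigma^2 * S_xx / S_xy^2 * (C0 + x_bar)."""
    x = list(x_series) * n_r              # full design: n = n_r * n_e values
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = beta1 * s_xx                   # assumption: expected signals
    c0 = beta0 / beta1
    return sigma ** 2 * s_xx / s_xy ** 2 * (c0 + x_bar)


def bias_normalization(beta0, beta1, sigma, x_series, n_r=1):
    """Approximate bias of the normalization estimator (Table 1, last row)."""
    sum_x2 = sum(xi ** 2 for xi in x_series)        # sum over one series (n_e terms)
    sum_x2_all = n_r * sum_x2                       # sum over all n observations
    # Expected signals y0 = beta0 + beta1 * x for one series.
    sum_yx = sum((beta0 + beta1 * xi) * xi for xi in x_series)
    prefactor = n_r * sigma ** 2 * beta0 / (beta1 ** 3 * sum_x2_all ** 2)
    bracket = (sum_yx ** 2 / beta0 ** 2
               - beta1 * sum_x2_all * sum_yx / beta0 ** 2
               + sum_x2)
    return prefactor * bracket


if __name__ == "__main__":
    # Hypothetical parameters: C0 = 4, five spiking levels, one series.
    design = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(bias_extrapolation(2.0, 0.5, 0.02, design))
    print(bias_normalization(2.0, 0.5, 0.02, design))
```

In practical use the true parameters would be replaced by their estimates from the regression.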


2. The chosen number of measurements and spiked concentrations (i.e., the x-values) represent the design of the standard addition approach. The design has a significant impact on the variance (i.e., precision) of the estimator for the unknown concentration [18]. Therefore, comparing different approaches is only reasonable if the designs do not differ.
3. Looking only at the confidence band for the regression line can be misleading, as the interpolation method will show. With this method, the estimate is determined by the intersection of the regression line with a horizontal line which is not fixed (as the x- or y-axis), but is itself subject to chance.

Results

As already mentioned, bias and variance are the most important performance criteria of a statistical estimator. Both quantities should of course be as small as possible, although there might be some trade-off in the sense that a larger bias might be acceptable when it comes along with a correspondingly smaller variance.

The aim of this work is to deliver results based on theoretical considerations and extensive simulations. One of the goals is to provide approximation formulas for bias and variance, as these formulas can be used not only to better understand the behavior of the different approaches and to judge their performance, but are also useful in practice, i.e., to calculate confidence intervals for the unknown concentration C0. To the best of our knowledge, our publication contains for the first time a complete set of formulas for all four investigated standard addition approaches for both bias and variance which are ready to use for the practitioner, since all necessary components are known or can be estimated using linear regression. In addition, a standardized and widely used notation is applied to improve the accessibility of the formulas.

Besides the use of formulas, bias and variance of the estimator can be determined by applying Monte Carlo methods, in the following denoted simulations. Such simulations utilize the generation of a huge number of synthetic random samples to yield numerical results (see [19]). They do not rely on the approximation formulas, but are based solely on the relationship given in formula (1) to randomly generate observations (“measurements”) used for calculating estimates and CIs for C0 for all four approaches in parallel. By iterating this process t times, t random estimates and CIs for each approach are generated, which can subsequently be used to calculate bias and variance of $\hat{C}_0^x$, x = e, i, r, n, and coverage probabilities and average widths of the respective CIs. These simulation results can be used not only to judge and compare the performance of the several approaches, but can also be used to validate the approximation formulas, since the results of simulation and proper approximation have to be very close.

We would like to point out that although [16] analyze the same standard addition approaches for bias and variance, there are significant differences between our work and theirs. In contrast to our work, [16] do not provide formulas for the bias, and the formulas provided for the variances in the case of the interpolation (Eq. (7)) and the normalization approach (Eq. (12)) contain quantities which are not easy to compute. In particular, these quantities are the following: in Eq. (7), it is S_yb, S_ym, and S_bm, and in Eq. (12), it is S_m1, which would have to be stated explicitly so that Eq. (7) and Eq. (12) can be applied without further elaboration. [16] leave it open how they are to be calculated. In addition, we based our simulations on the usually applied basic assumptions of standard addition (homoscedastic and normally distributed errors, blank value not significantly different from zero, linearity). Due to these standardized conditions, deviations from the basic assumptions are excluded, which allows a clear statement to be made. In contrast to our approach, real-world data are often overlain by the characteristics of the different instrumental methods used, which can obscure the underlying relationship. If discrepancies between experiment and theory occur, this indicates that additional factors play a role that lead to deviations from the basic assumptions. This is also argued by [16], who extensively discuss the potential influences of the different instrumental techniques on the outcome of their comparisons. However, neither [16] nor this work makes the other work superfluous, as theory and experiment are always complementary methods of investigation, both of which are indispensable. This means that theory must be tested with experiments and, conversely, theoretical considerations are necessary to understand and model the observations from the experiments and make them accessible for general application. The discrepancies between experiment and theory should provide the impetus for further research and improvement of the method (e.g., applying weighted regression in the case of heteroscedasticity).

In the following, consider that nr different series are measured and that each series r consists of ne single observations $Y_{ri}$, r = 1, ..., nr, i = 1, ..., ne, i.e., the total number of observations is n = nr ne. Therefore, the used spiked concentrations in vector notation are $x = (x_{11}, ..., x_{1n_e}, x_{21}, ..., x_{2n_e}, x_{31}, ..., x_{n_r n_e})$, and the vector of the measured responses is $Y = (Y_{11}, ..., Y_{1n_e}, Y_{21}, ..., Y_{2n_e}, Y_{31}, ..., Y_{n_r n_e})$. Keep in mind that $x_{r1} = 0$ for all r = 1, ..., nr and that for all i = 1, ..., ne, $x_{ji} = x_{li}$ for all j, l = 1, ..., nr.

Approximation formulas

Since no closed forms for bias and variance of the estimators for C0 exist, we need to resort to approximation formulas


to enable the calculation of approximate values for these quantities. To derive these formulas, let the errors be normally distributed and homoscedastic and let $y_{ri}^0 := \beta_0 + \beta_1 x_{ri}$ denote the expected value of a measurement given spiked concentration $x_{ri}$. Furthermore, define

$$S_{xx} := \sum_{i=1}^{n}(x_i-\bar{x})^2, \quad S_{yy} := \sum_{i=1}^{n}(y_i-\bar{y})^2, \quad S_{xy} := \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}).$$

By making use of Taylor expansions (propagation of error), we get the approximation formulas for bias and variance of the different estimators for the unknown concentration that can be found in Tables 1 and 2. Some of these results can already be found in the literature or have, where not available, been derived by the authors. The derivation can be found in the supplementary material.

Note that these formulas contain the true parameters of the underlying relationship, e.g., β0 is the true but normally unknown y-intercept, and $\sigma^2_{\hat{\beta}_0}$ is the true variance of its estimator. Of course, the true parameters are known in theoretical considerations and simulations, but when these formulas are used in practical applications, these unknown parameters have to be replaced by proper estimates.

Keep in mind that a thorough mathematical analysis with respect to the evaluation of the goodness of the approximation formulas as well as of the performance of the CIs based on these formulas would be extremely difficult or perhaps even impossible. Also, a respective evaluation based on just one dataset is not possible. Therefore, extensive simulations have been employed to investigate the performance of the approximation formulas and of the respective CIs. These simulations have been performed by utilizing the programming language R [20] (R version 4.2.3), which has also been used to create all figures shown in this work. There has been good agreement between the results gained by the simulations and the approximation formulas, indicating the validity of the derived formulas.

Real-world example and simulations

In addition to the analysis of a real data set, this subsection provides the results of some simulations based on this data set. The simulations serve two different purposes: firstly, to validate the approximation formulas by showing that the approximations and the simulations yield reasonably close results, and secondly, to enable the comparison of the different standard addition approaches.

The real-world example is taken from the paper of Gonçalves et al. [13]. They compared the extrapolation approach and reverse regression for Na and K determination in biodiesel based on measurements generated by applying FAES. The results of the analysis of these FAES data with respect to all four approaches can be found in Table 3, which shows estimates for C0, $\sigma^2_{\hat{C}_0}$ and the bias and also the lower and upper bounds of the CIs, as well as their width. These estimates are denoted Ĉ0, $s_{\hat{C}_0}$, $\widehat{bias}$, $CI_l$, $CI_u$ and CI width.

In the case of the FAES data, the assumption of homoscedastic errors seems to apply. Therefore, the derived formulas have been used to estimate bias and variance for all considered methods by replacing the true (unknown) values of the parameters of the underlying relationship by the respective estimates. The variance estimates $s^2_{\hat{C}_0}$ have further been used to calculate the confidence intervals, which has been done based on the following assumptions: Since the estimator Ĉ0 can be assumed to be approximately normally distributed (see Fig. 3), we assume that $\frac{\hat{C}_0 - C_0}{s_{\hat{C}_0}} \sim t_{n-n_p}$, i.e., that this fraction is distributed according to Student’s t-distribution with n − np degrees of freedom (df). np equals the number of estimated parameters, i.e., np = 2 for all approaches (but see the discussion on the normalization method). Therefore, the proper confidence interval should be

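Table 2 Approximation formulas for $\sigma^2_{\hat C_0}$, the variance of the estimator $\hat C_0$

Approach: Variance

Extrapolation (a): $\left[\left(\frac{\sigma_{\hat\beta_0}}{\beta_0}\right)^2 + \left(\frac{\sigma_{\hat\beta_1}}{\beta_1}\right)^2 - 2\rho_{\hat\beta_0\hat\beta_1}\frac{\sigma_{\hat\beta_0}\sigma_{\hat\beta_1}}{\beta_0\beta_1}\right] C_0^2$

Interpolation (b): $\left[\left(\frac{\sigma_{\hat\beta_0^i}}{\beta_0}\right)^2 + \left(\frac{\sigma_{\hat\beta_1}}{\beta_1}\right)^2 - 2\rho_{\hat\beta_0\hat\beta_1}\frac{\sigma_{\hat\beta_0}\sigma_{\hat\beta_1}}{\beta_0\beta_1}\right] C_0^2$ with $\sigma^2_{\hat\beta_0^i} = \sigma^2_{\hat\beta_0} + 4\sigma^2\left(\frac{1}{n_r} - \frac{1}{n} - \frac{\bar x^2}{n s_x^2}\right)$

Reversed axis (c, d): $\frac{\sigma^2}{\beta_1^2}\,\frac{\sum_{i=1}^{n} y_i^2}{n S_{yy}}$ or $\frac{\sigma^2}{\beta_1^2}\left(\frac{1}{n} + \frac{\bar y^2 S_{xx}}{S_{xy}^2}\right)$

Normalization (e): $\frac{C_0^2\, n_r\, \sigma^2}{\left(\beta_1 \sum_{i=1}^{n} x_i^2\right)^2}\left[\left(\frac{1}{\beta_0}\sum_{i=1}^{n_e} y_{1i}^0 x_i\right)^2 + \sum_{i=1}^{n_e} x_i^2\right] = C_0^4\, \sigma^2_{\hat\beta_1^{norm}}$

a, c: various sources; b, e: derived by the authors (see supplementary material); d: [7]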
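As a worked step, the extrapolation formula in Table 2 can be written out for ordinary least squares. This is a sketch under assumptions that are not stated in the table itself: it uses the textbook OLS results $\sigma^2_{\hat\beta_0} = \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)$, $\sigma^2_{\hat\beta_1} = \frac{\sigma^2}{S_{xx}}$ and $\rho_{\hat\beta_0\hat\beta_1}\sigma_{\hat\beta_0}\sigma_{\hat\beta_1} = -\frac{\sigma^2 \bar x}{S_{xx}}$ with $S_{xx} = \sum_{i=1}^{n}(x_i - \bar x)^2$, together with $C_0 = \beta_0/\beta_1$.

```latex
\begin{aligned}
\sigma^2_{\hat C_0}
 &\approx C_0^2\left[\frac{\sigma^2_{\hat\beta_0}}{\beta_0^2}
   + \frac{\sigma^2_{\hat\beta_1}}{\beta_1^2}
   - \frac{2\,\rho_{\hat\beta_0\hat\beta_1}\sigma_{\hat\beta_0}\sigma_{\hat\beta_1}}{\beta_0\beta_1}\right] \\
 &= \frac{\sigma^2}{\beta_1^2}\left[\frac{1}{n} + \frac{\bar x^2}{S_{xx}}
   + \frac{C_0^2}{S_{xx}} + \frac{2\,C_0\,\bar x}{S_{xx}}\right] \\
 &= \frac{\sigma^2}{\beta_1^2}\left[\frac{1}{n} + \frac{(\bar x + C_0)^2}{S_{xx}}\right]
\end{aligned}
```

If $\bar y$ and $S_{xy}$ in the second reversed-axis expression are replaced by their noise-free counterparts $\beta_0 + \beta_1 \bar x$ and $\beta_1 S_{xx}$ (again an assumption made only for this illustration), that expression reduces to the same last line.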


Fig. 3 Histograms showing $10^4$ simulated estimates for $C_0$ for all four approaches. Estimates given in μg/g. The simulation is based on the parameters which are deduced from the FAES dataset for Na given in Table 4 (for more information on the simulations, see text below)

given by $\hat C_0 \pm s_{\hat C_0}\, t_{n-n_p,\,1-\alpha/2}$ with $1 - \alpha$ denoting the chosen confidence level of the CI and $t_{n-n_p,\,1-\alpha/2}$ denoting the $1 - \alpha/2$ quantile of the t-distribution with $df = n - n_p$.

Furthermore, the FAES data are also used to deduce the parameters for the simulations whose results are shown in Table 5. These parameters are estimated by applying linear regression to the FAES data and are shown in Table 4. For these simulations, the spiked concentrations $x$ chosen by Gonçalves et al. are also used, which are as follows: $n_e = n_r = 5$ and thus $n = n_e n_r = 25$ with $(x_{r1}, ..., x_{r5}) = (0, 11.4, 23, 34.5, 45.9)$ and $r = 1, ..., 5$.

All simulations (as already stated, many more than those whose results are shown in this work have been performed) are based on the assumption that the true parameters (y-intercept $\beta_0$, slope $\beta_1$ and measurement error $\sigma$), and therefore the true relationship and especially $C_0$, are known. With respect to the simulation results in Table 5, this means that $\beta_0$, $\beta_1$ and $\sigma$ have been chosen to be $\hat\beta_0$, $\hat\beta_1$ and $\hat\sigma$ from Table 4. Therefore, $bias_{appr}$ and $\sigma_{\hat C_0\,appr}$ shown in Table 5, approximating the true values of bias and $\sigma^2_{\hat C_0}$, can be calculated just by plugging in the known parameters together with the chosen spiked concentrations given by $x$ into the respective formulas in Tables 1 and 2.

To get robust simulation results, each of these results is based on the outcomes of a large number $K$ of iterations, which has been chosen to be $10^4$ in this case. In each such iteration $k$, $k = 1, ..., K$, the spiked concentrations $x$ together with the parameters are used to generate $n = n_r n_e$ new synthetic random measurements by applying the relationship $Y_{ri} = \beta_0 + \beta_1 x_{ri} + \varepsilon$ with $\varepsilon \sim N(0, \sigma)$, $r = 1, ..., n_r$, $i = 1, ..., n_e$. Each of these newly generated sets of $n$ measurements $y_k = (y_{11}, ..., y_{n_r n_e})$ is subsequently analyzed using

Table 3 Application of the different approaches and the respective approximation formulas to the FAES dataset [13]

Element      Approach        $\hat C_0$  $s_{\hat C_0}$  $\widehat{bias}$  $CI_l$  $CI_u$  CI width
Na589.0 nm   Extrapolation   19.72       0.63            0.008             18.41   21.03   2.63
Na589.0 nm   Interpolation   21.34       0.92            0.008             19.45   23.23   3.79
Na589.0 nm   Reversed axis   19.53       0.63            −0.181            18.22   20.84   2.61
Na589.0 nm   Normalization   21.02       0.34            −0.011            20.32   21.73   1.41
K766.5 nm    Extrapolation   24.02       0.46            0.004             23.06   24.98   1.91
K766.5 nm    Interpolation   24.19       0.63            0.004             22.88   25.50   2.61
K766.5 nm    Reversed axis   23.92       0.46            −0.089            22.97   24.88   1.91
K766.5 nm    Normalization   24.16       0.48            −0.018            23.16   25.16   2.01
K769.9 nm    Extrapolation   22.65       0.63            0.008             21.34   23.95   2.60
K769.9 nm    Interpolation   23.30       0.87            0.007             21.50   25.11   3.61
K769.9 nm    Reversed axis   22.47       0.63            −0.169            21.17   23.77   2.59
K769.9 nm    Normalization   23.19       0.52            −0.021            22.12   24.26   2.14

Values given in μg/g
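The confidence interval construction described in the text can be illustrated with the Na extrapolation row of Table 3. The following is a minimal Python sketch, not the authors' R code; the function name and the hard-coded t quantile (the tabulated 0.975 quantile for 23 degrees of freedom, i.e., n = 25 and two estimated parameters) are assumptions made for this illustration.

```python
# Tabulated 0.975 quantile of Student's t-distribution with 23 degrees of
# freedom (n - n_p = 25 - 2); hard-coded so that only plain Python is needed.
T_QUANTILE_DF23 = 2.0687


def confidence_interval(c0_hat, s_c0_hat, t_quantile):
    """Return (lower, upper) bounds of the two-sided CI c0_hat +/- s * t."""
    half_width = s_c0_hat * t_quantile
    return c0_hat - half_width, c0_hat + half_width


# Rounded estimate and standard error for Na, extrapolation approach (Table 3).
lower, upper = confidence_interval(19.72, 0.63, T_QUANTILE_DF23)
print(f"CI = ({lower:.2f}, {upper:.2f}), width = {upper - lower:.2f}")
```

Because the inputs are the rounded table values, the bounds should agree with the tabulated $CI_l$ and $CI_u$ only to within rounding.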


Table 4 Elementwise estimated parameters for the FAES dataset [13]

Element      $\hat\beta_0$  $\hat\beta_1$  $\hat\sigma$  $\hat\beta_0/\hat\beta_1$ (true $C_0$ in simulations, given in μg/g)
Na589.0 nm   4016.98        203.71         229.96        19.72
K766.5 nm    3603.00        150.01         113.49        24.02
K769.9 nm    1348.53        59.55          62.86         22.65
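To make the plug-in step concrete, the sketch below evaluates the extrapolation formula from Table 2 for the parameters in Table 4 and the spiked concentrations stated in the text. It is written in Python rather than the authors' R, and it assumes the textbook OLS expressions for the variances and the covariance of intercept and slope, which this section does not spell out; all function and variable names are hypothetical.

```python
import math

# Spiked concentrations in μg/g: 5 levels, each measured in 5 replicates (n = 25).
X_LEVELS = (0.0, 11.4, 23.0, 34.5, 45.9)
N_REPLICATES = 5
X = [x for _ in range(N_REPLICATES) for x in X_LEVELS]

# (beta0, beta1, sigma) per element, copied from Table 4.
PARAMETERS = {
    "Na589.0 nm": (4016.98, 203.71, 229.96),
    "K766.5 nm": (3603.00, 150.01, 113.49),
    "K769.9 nm": (1348.53, 59.55, 62.86),
}


def extrapolation_sd(beta0, beta1, sigma, x):
    """Return (C0, approximate standard deviation of the extrapolation estimator)."""
    n = len(x)
    x_mean = sum(x) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    # Textbook OLS variances and covariance of intercept and slope (assumed here).
    var_b0 = sigma ** 2 * (1.0 / n + x_mean ** 2 / s_xx)
    var_b1 = sigma ** 2 / s_xx
    cov_b0_b1 = -sigma ** 2 * x_mean / s_xx
    c0 = beta0 / beta1
    # Extrapolation row of Table 2, with rho * sd(b0) * sd(b1) written as the covariance.
    var_c0 = c0 ** 2 * (
        var_b0 / beta0 ** 2 + var_b1 / beta1 ** 2 - 2.0 * cov_b0_b1 / (beta0 * beta1)
    )
    return c0, math.sqrt(var_c0)


for element, (b0, b1, s) in PARAMETERS.items():
    c0, sd = extrapolation_sd(b0, b1, s, X)
    print(f"{element}: C0 = {c0:.2f} μg/g, approx. sd = {sd:.2f} μg/g")
```

Under these assumptions, the printed values can be compared with the last column of Table 4 and with the $\sigma_{\hat C_0\,appr}$ column for the extrapolation approach in Table 5.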

the four standard addition approaches to estimate $C_0$ and also plugging in the proper estimates into the formulas given in Table 2 to calculate CIs, i.e., in each iteration $k$, a new estimate $\hat C_{0k}$ and also a new CI, $CI_k$, for $C_0$ are calculated. Therefore, the quantities found in Table 5 are calculated as follows:

$$\bar{\hat C}_0^{sim} = \frac{1}{K}\sum_{k=1}^{K} \hat C_{0k}, \qquad bias_{sim} = \bar{\hat C}_0^{sim} - C_0, \qquad \sigma^2_{\hat C_0\,sim} = \frac{1}{K-1}\sum_{k=1}^{K}\left(\hat C_{0k} - \bar{\hat C}_0^{sim}\right)^2,$$

$CIcov_{sim}$ is the fraction of all $CI_k$ covering $C_0$, and $\overline{CI}_{sim}$ is the mean width of all $CI_k$.

Discussion

Analysis of the FAES data

The results with respect to the analysis of the FAES data shown in Table 3 vary considerably with respect to the estimated concentration $\hat C_0$ (e.g., it can be shown that $\hat C_0^e - \hat C_0^r \approx \widehat{bias}^e - \widehat{bias}^r$). Also, the results with respect to the performance criteria, i.e., bias and variance of $\hat C_0$, show significant differences between the approaches. The estimated variances $s^2_{\hat C_0}$ and therefore the width of the respective CIs behave as anticipated except for the normalization approach, but that $s^2_{\hat C_0^n}$ is close to or even smaller than $s^2_{\hat C_0^e}$ is only due to chance. As already stated, assessment of an approach is not possible based on just one data set, i.e., to get a clear picture, the approximation formulas or simulations are needed.

Validation of the approximation formulas and comparison of the approaches

Numerous simulations have been applied to validate the approximation formulas, which in parallel also allows the performance of the approaches to be investigated. For validation of the formulas, the values calculated by using these formulas have to be compared to the respective simulation results, which should be very close to the true values. Hence, for the values given in Table 5, the values for $bias_{appr}$ and $\sigma_{\hat C_0\,appr}$ have to be compared to the values for $bias_{sim}$ and $\sigma_{\hat C_0\,sim}$. Since the approximations are very close to the simulation results for bias and variance of the different approaches, all four approaches can be judged either based on the approximations or the simulations. Therefore, the conclusions with respect to all considered parameter settings (not just the settings shown in Table 4) are the same whether they are based on the approximation formulas or the simulations.

The following observations were made with regard to bias: First of all, the extra- and interpolation approach

Table 5 Results of applying the approximation formulas and simulations ($10^4$ iterations) to the parameters (Table 4) deduced from the FAES dataset [13]

Element      Approach        $\bar{\hat C}_0^{sim}$  $bias_{appr}$  $\sigma_{\hat C_0\,appr}$  $bias_{sim}$  $\sigma_{\hat C_0\,sim}$  $CIcov_{sim}$  $\overline{CI}_{sim}$
Na589.0 nm   Extrapolation   19.73                   0.008          0.63                       0.006         0.63                      0.952          2.60
Na589.0 nm   Interpolation   19.72                   0.007          0.90                       0.003         0.90                      0.959          3.69
Na589.0 nm   Reversed axis   19.54                   −0.181         0.63                       −0.176        0.63                      0.940          2.59
Na589.0 nm   Normalization   19.65                   −0.068         0.81                       −0.066        0.81                      0.984          6.16
K766.5 nm    Extrapolation   24.03                   0.004          0.46                       0.013         0.46                      0.949          1.90
K766.5 nm    Interpolation   24.02                   0.004          0.63                       0.003         0.64                      0.955          2.58
K766.5 nm    Reversed axis   23.93                   −0.090         0.46                       −0.083        0.46                      0.946          1.89
K766.5 nm    Normalization   23.99                   −0.026         0.59                       −0.028        0.58                      0.985          4.42
K769.9 nm    Extrapolation   22.65                   0.008          0.63                       0.008         0.63                      0.948          2.57
K769.9 nm    Interpolation   22.65                   0.007          0.87                       0.001         0.87                      0.956          3.55
K769.9 nm    Reversed axis   22.49                   −0.169         0.63                       −0.153        0.63                      0.942          2.57
K769.9 nm    Normalization   22.60                   −0.053         0.80                       −0.050        0.80                      0.986          6.06

Values given in μg/g
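The simulation procedure described in the text can be sketched for the extrapolation approach as follows. This is an illustrative Python version, not the authors' R code. It assumes ordinary least squares, a plug-in standard error of the form $s^2/\hat\beta_1^2 \cdot (1/n + \bar y^2/(\hat\beta_1^2 S_{xx}))$, which is the extrapolation formula of Table 2 with estimates inserted, and a hard-coded t quantile for 23 degrees of freedom. The function name, the seed and the returned dictionary keys are hypothetical, and the bias is computed as mean estimate minus true value so that overestimation is positive, as in Table 5.

```python
import math
import random

# Spiked concentrations in μg/g: 5 levels, each in 5 replicates (n = 25).
X_LEVELS = (0.0, 11.4, 23.0, 34.5, 45.9)
X = [x for _ in range(5) for x in X_LEVELS]
T_QUANTILE_DF23 = 2.0687  # tabulated 0.975 quantile of t with 23 df


def simulate_extrapolation(beta0, beta1, sigma, x, n_iter, t_quantile, seed=1):
    """Monte Carlo summary of the extrapolation estimator b0 / b1."""
    rng = random.Random(seed)
    n = len(x)
    x_mean = sum(x) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    c0_true = beta0 / beta1
    estimates, covered, width_sum = [], 0, 0.0
    for _ in range(n_iter):
        # Synthetic measurements from the assumed true linear relationship.
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_mean = sum(y) / n
        b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / s_xx
        b0 = y_mean - b1 * x_mean
        rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
        s2 = rss / (n - 2)  # residual variance estimate
        c0_hat = b0 / b1
        # Plug-in standard error of c0_hat (assumed form, see lead-in).
        se = math.sqrt(s2 / b1 ** 2 * (1.0 / n + y_mean ** 2 / (b1 ** 2 * s_xx)))
        half = t_quantile * se
        covered += c0_hat - half <= c0_true <= c0_hat + half
        width_sum += 2.0 * half
        estimates.append(c0_hat)
    mean = sum(estimates) / n_iter
    var = sum((c - mean) ** 2 for c in estimates) / (n_iter - 1)
    return {
        "mean": mean,
        "bias": mean - c0_true,
        "sd": math.sqrt(var),
        "coverage": covered / n_iter,
        "mean_width": width_sum / n_iter,
    }


# Parameters for Na taken from Table 4.
result = simulate_extrapolation(4016.98, 203.71, 229.96, X,
                                n_iter=10_000, t_quantile=T_QUANTILE_DF23)
print(result)
```

The exact numbers depend on the random seed, so only approximate agreement with the Na extrapolation row of Table 5 should be expected.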


show a positive bias (systematic overestimation of $C_0$), and the reversed-axis and the normalization approach show a negative bias (systematic underestimation of $C_0$). The extrapolation approach together with the interpolation approach shows, in absolute terms, the smallest bias, the reversed-axis approach shows the largest, and the normalization approach is somewhere in between. The approximation formulas are consistent with the simulation results (see again Table 5 for some of these results) for the reversed-axis and the normalization approach, and the respective simulation results fluctuate in a range that is in line with the order of magnitude indicated by the approximation formulas for the extrapolation and the interpolation approach. We think that the fluctuations in the simulations for the latter two cases are due to the relatively small bias compared to the variance of the estimator.

The following observations were made with regard to variance: The extrapolation approach has the smallest variance, which is equal to that of the reversed-axis approach, whereas the interpolation approach, in contradiction to its intention, shows the largest variance. This effect with respect to $\sigma^2_{\hat C_0^i}$ is due to the additional variance introduced by $y_1$, i.e., this approach has an adverse effect due to an additional source of variability (see Fig. 2b and also the description of this approach in the "Methods" section). The normalization approach shows a variance which is also significantly larger than that of the extrapolation approach and can be found somewhere in between that of the extra- and the interpolation approach. The approximation formulas work well for all approaches, i.e., they are close to the results gained by simulations.

The following observations were made with regard to the confidence intervals: In the case of Table 5, the respective results given by simulations are contained in the columns $CIcov_{sim}$ (coverage probability) and $\overline{CI}_{sim}$ (average width). It is of interest whether these intervals cover the true concentration with a probability determined beforehand by choosing the desired level of confidence. Another important feature is the width of the CIs since one, of course, wants them to be as narrow as possible. Since the CI width is a direct consequence of $s^2_{\hat C_0}$, the CIs are narrowest for the extrapolation and the reversed-axis approach. Different confidence levels have been investigated by using simulations. In Table 5, results are shown for a chosen confidence level of 95%. The CIs covered the true concentration for all approaches, except for the normalization approach, approximately with the desired probability. Of course, the larger the bias, the less accurate are the respective CIs. Especially in the case of the reversed-axis approach, this might cause a (slightly) decreased probability of covering the true value $C_0$ due to the increased (negative) bias of this approach (correcting for the bias by incorporating $\widehat{bias}$ in the CI calculation might be of interest). In the normalization case, the coverage probability is higher than the desired confidence level due to the already stated problem of overestimating the variance of $\hat C_0$, which has the consequence of overly wide CIs (see discussion above).

Therefore, taking all of the above together, the most recommendable approach with respect to bias and variance is still the common extrapolation approach. The results with respect to the comparison of the different approaches are briefly summarized in Table 6.

Summary

Simulations show that, in comparison to the common extrapolation approach, all the other approaches show a decreased performance with respect to bias and/or variance of the estimator for the unknown concentration when the measurement errors are normally distributed and homoscedastic. In particular, the interpolation approach does not seem to be recommendable: although it does not increase the bias of the respective estimator, it significantly increases its variability compared to the estimators resulting from the extrapolation and the reversed-axis approach, in contradiction to the intention of reducing its variance. For the latter two approaches, the variance of the estimators is very similar. Therefore, the reversed-axis approach works well with regard to the intended simplification of the determination of the variance of $\hat C_0$. The disadvantage of this approach is that it has the greatest bias of all approaches in absolute terms. Since this bias is of negative sign, the reversed-axis approach underestimates the true concentration on average more than the extrapolation approach, whose bias is positive, overestimates it. In addition, the increased bias also has a slight impact on the CI calculated with respect to reverse regression, i.e., the achieved confidence level is slightly lower than the chosen one. The normalization approach has an exceptional position since performance with respect to bias and variance is

Table 6 Quick overview of the results of the comparison of the four approaches in terms of bias and precision for the case of normally distributed homoscedastic errors

Approach     Extrapolation   Interpolation   Reversed axis   Normalization
Bias
Precision


weaker than that of the extrapolation approach, but it can handle outliers better. Therefore, a trade-off must be found between greater robustness to outliers (in cases where the unspiked measurements are not affected and the outliers cannot be removed before analysis) and the price of greater bias and variance. An additional problem when applying the normalization approach is the estimation of the variance of the estimator. Depending on the method applied, this can lead to CIs which are wastefully wide, i.e., reaching a confidence level significantly higher than chosen beforehand.

The derived approximation formulas, which are an additional outcome of this investigation, proved to be valid, i.e., they are in very good agreement with the simulation results. This allows the calculation of CIs also in the case of inverse regression and the normalization approach and can replace the use of simulations when investigating the performance of the different standard addition approaches.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s00216-024-05725-8.

Acknowledgements This article was published Open Access with financial support from the University of Graz. The authors want to thank Gonçalves, Bradley, and Donati for their kind permission to use the data published in [13] for this work.

Author contribution Conceptualization: G. Gössler. Formal analysis: G. Gössler and V. Hofer. Validation: G. Gössler, V. Hofer and W. Goessler. Visualization: G. Gössler. Writing – original draft: G. Gössler and V. Hofer. Writing – review and editing: G. Gössler, V. Hofer, and W. Goessler. Supervision: V. Hofer and W. Goessler. Resources: W. Goessler.

Funding Open access funding provided by University of Graz.

Declarations

Conflict of Interest The authors declare no competing interests.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Ellison SLR, Thompson M. Standard additions: myth and reality. The Analyst. 2008;133(8):992–7. https://doi.org/10.1039/b717660k.
2. Bruce GR, Gill PS. Estimates of precision in a standard additions analysis. J Chem Educ. 1999;76(6):805. ISSN 0021-9584. https://doi.org/10.1021/ed076p805.
3. Ku HH. Notes on the use of propagation of error formulas. J Res NBS C Eng Inst. 1966;70C(4):263. ISSN 0022-4316. https://doi.org/10.6028/jres.070c.025.
4. Williams EJ. Regression analysis. Wiley, 1959.
5. Draper NR, Smith H. Applied regression analysis: Wiley; 1998. ISBN 9780471170822. https://doi.org/10.1002/9781118625590.
6. Portnoy S. Letter to the editor. Am Stat. 2013;67(3):190. ISSN 0003-1305. https://doi.org/10.1080/00031305.2013.820668.
7. Kang P, Koo C, Roh H. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology. Int J Metrol Qual Eng. 2017;8:28. https://doi.org/10.1051/ijmqe/2017021.
8. Rice JA. Mathematical statistics and data analysis. Thomson/Brooks/Cole, Belmont, Calif., 3. ed., internat. ed. edition; 2007. ISBN 0-534-39942-8.
9. Hayya J, Armstrong D, Gressis N. A note on the ratio of two normally distributed variables. Manag Sci. 1975;21(11):1338–41. ISSN 0025-1909. https://doi.org/10.1287/mnsc.21.11.1338.
10. Geary RC. The frequency distribution of the quotient of two normal variates. J R Stat Soc. 1930;93(3):442. ISSN 09528385. https://doi.org/10.2307/2342070.
11. Hinkley DV. On the ratio of two correlated normal random variables. Biometrika. 1969;56(3):635. ISSN 00063444. https://doi.org/10.2307/2334671.
12. Díaz-Francés Eloísa, Rubio Francisco J. On the existence of a normal approximation to the distribution of the ratio of two independent normal random variables. Stat Pap. 2013;54(2):309–23. ISSN 0932-5026. https://doi.org/10.1007/s00362-012-0429-2.
13. Goncalves DA, Jones BT, Donati GL. The reversed-axis method to estimate precision in standard additions analysis. Microchem J. 2016;124:155–8. ISSN 0026265X. https://doi.org/10.1016/j.microc.2015.08.006.
14. Meier PC. Statistical methods in analytical chemistry, volume v. 153 of Chemical analysis. Wiley, New York, 2nd ed. edition, 2000. ISBN 978-0-471-72611-1. https://doi.org/10.1002/0471728411. https://onlinelibrary.wiley.com/doi/book/10.1002/0471728411.
15. Andrade JM, Terán-Baamonde J, Soto-Ferreiro RM, Carlosena A. Interpolation in the standard additions method. Anal Chim Acta. 2013;780:13–9. https://doi.org/10.1016/j.aca.2013.04.015.
16. Sloop John T, Gonçalves Daniel A, O'Brien Logan M, Carter Jake A, Jones Bradley T, Donati George L. Evaluation of different approaches to applying the standard additions calibration method. Anal Bioanal Chem. 2021;413(5):1293–302. https://doi.org/10.1007/s00216-020-03092-8.
17. Cochran WG. Sampling techniques. A Wiley publication in applied statistics. Wiley, New York, NY, 3. ed. edition, 1977. ISBN 0-471-16240-X. http://www.loc.gov/catdir/description/wiley037/77000728.html.
18. Franke JP, de Zeeuw RA, Hakkert R. Evaluation and optimization of the standard addition method for absorption spectrometry and anodic stripping voltammetry. Anal Chem. 1978;50(9):1374–80. ISSN 0003-2700. https://doi.org/10.1021/ac50031a045.
19. Harrison RL. Introduction to Monte Carlo simulation. AIP Conf Proc. 2010;1204:17–21. ISSN 0094-243X. https://doi.org/10.1063/1.3295638.
20. R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2024. https://www.R-project.org/.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
