Efficiently Computing The Inverse Square Root Using Integer Operations
Contents

1 Introduction
2 Background
  2.1 Floating Point Representation
  2.2 Integer Representation and Operations
3 The Algorithm
  3.1 Newton's Method
  3.2 Computing the Initial Guess
    3.2.1 Idea Behind the Initial Guess
    3.2.2 Detailed Analysis
    3.2.3 Error of the Initial Guess
    3.2.4 Finding the Optimal Magic Number
  3.3 Results
4 Conclusion
1 Introduction
In the field of computer science, clarity of code is usually valued over efficiency.
However, sometimes this rule is broken, such as in situations where a small
piece of code must be run millions of times per second. One famous example
of this can be found in the rendering-related source code of the game Quake III
Arena. In computer graphics, vector normalization is a heavily used operation
when computing lighting and shading. For example, a renderer will commonly
need to do a shading computation at least once for each pixel on the screen. In
the modest case where a game is running at 30 frames per second on a screen
with a resolution of 800×600 pixels and only a single vector must be normalized
for each shading computation, 14,400,000 vector normalizations will need to
be performed per second! To normalize a vector x, one must multiply each component by
$$\frac{1}{|x|} = \frac{1}{\sqrt{x_1^2 + \cdots + x_n^2}}.$$
Thus it is important that an inverse square root can be computed efficiently. In Quake III Arena, this task is performed by
the following function [2] (complete with the original comments):
1  float Q_rsqrt( float number )
2  {
3      const float threehalfs = 1.5F;
4      float x2 = number * 0.5F;
5      float y = number;
6      long i = * ( long * ) &y;            // evil floating point bit level hacking
7      i = 0x5f3759df - ( i >> 1 );         // what the fuck?
8      y = * ( float * ) &i;
9      y = y * ( threehalfs - ( x2 * y * y ) );
10     return y;
11 }
Unfortunately, while the code performs well and is quite accurate, it is not
at all clear what is actually going on, and the comments don’t provide much
insight. What is the meaning of the constant 0x5f3759df, and why are integer
operations being performed on floating point numbers? This paper will provide
a detailed explanation of the operations being performed and the accuracy of
the resulting value.
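Before diving into the analysis, it may help to restate the routine with fixed-width types and memcpy, which sidesteps the strict-aliasing and long-width portability pitfalls of the original cast-based code. This restatement is our own sketch, not id Software's code:

#include <stdint.h>
#include <string.h>

float q_rsqrt_portable(float number)
{
    uint32_t i;
    float x2 = number * 0.5f;
    float y  = number;

    memcpy(&i, &y, sizeof i);          /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);         /* the magic constant minus half the bits     */
    memcpy(&y, &i, sizeof y);          /* reinterpret the integer bits as a float    */

    y = y * (1.5f - (x2 * y * y));     /* one iteration of Newton's method           */
    return y;
}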
2 Background
2.1 Floating Point Representation
Floating point numbers (defined by float in the code) consist of a sign bit, an
8-bit exponent, and a 23-bit mantissa.
s            E              M
bit 31       bits 30...23   bits 22...0

The sign bit s determines the sign of the value. The exponent E is stored with a bias of 127, allowing it to represent both negative and positive values. The mantissa M represents a real number in the range [0, 1); the leading 1 is implied and thus not explicitly included. The value represented by a floating point number is thus given by
$$(-1)^s (1 + M)\,2^{E - 127}.$$
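To make the layout concrete, here is a small program (our own illustration, using memcpy for the bit-level reinterpretation) that unpacks the three fields:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    float x = 0.15625f;                 /* 1.25 * 2^-3, an example value */
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);     /* reinterpret the float's bits  */

    uint32_t s = bits >> 31;            /* sign:     bit 31      */
    uint32_t E = (bits >> 23) & 0xff;   /* exponent: bits 30..23 */
    uint32_t M = bits & 0x7fffff;       /* mantissa: bits 22..0  */

    /* value = (-1)^s * (1 + M / 2^23) * 2^(E - 127) */
    printf("s = %u, E = %u, M = 0x%06x\n", s, E, M);  /* s = 0, E = 124, M = 0x200000 */
    return 0;
}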
3 The Algorithm
Given a number x > 0, the algorithm uses Newton's method to approximate $\frac{1}{\sqrt{x}}$. Newton's method is an iterative root-finding algorithm which requires an initial guess $y_0$. Line 7 computes the initial guess $y_0$ and line 9 performs a single iteration of Newton's method.
We will begin by assuming that we have a reasonable initial guess and prove
the error bounds of Newton’s method. Then we will address the details of
making the initial guess.
3.1 Newton's Method

Given a differentiable function f and a current approximation $y_n$ to a root of f, Newton's method produces a new approximation
$$y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}.$$
Geometrically, yn+1 can be interpreted as the zero of the tangent line to f (y)
at yn , as shown in Figure 1.
[Figure 1: One iteration of Newton's method: $y_1$ is the zero of the tangent line to f at $y_0$.]

To approximate $\frac{1}{\sqrt{x}}$, we apply Newton's method to $f(y) = \frac{1}{y^2} - x$, whose positive root is exactly $\frac{1}{\sqrt{x}}$. Since $f'(y) = -\frac{2}{y^3}$, each iteration computes
$$y_{n+1} = y_n - \frac{1/y_n^2 - x}{-2/y_n^3} = \frac{3}{2}y_n - \frac{1}{2}x y_n^3,$$
which is exactly the computation performed by line 9 of the code.
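As a quick illustration (this snippet is our own, not part of the original program), the iteration can be run directly in C and watched as it converges:

#include <stdio.h>

/* One Newton iteration for f(y) = 1/y^2 - x, exactly as derived above. */
static double newton_step(double x, double y)
{
    return 1.5 * y - 0.5 * x * y * y * y;
}

int main(void)
{
    double x = 2.0, y = 0.5;            /* crude initial guess for 1/sqrt(2) */
    for (int n = 0; n < 5; n++) {
        y = newton_step(x, y);
        printf("y_%d = %.15f\n", n + 1, y);
    }
    return 0;                           /* converges toward 0.707106781... */
}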
Suppose that our initial guess $y_0$ has relative error at most $\epsilon_0$, so that
$$y(1 - \epsilon_0) \le y_0 \le y(1 + \epsilon_0),$$
where $y = \frac{1}{\sqrt{x}}$ is the true value. Writing $y_0 = y(1 + \delta)$ with $\delta \in [-\epsilon_0, \epsilon_0]$, a single iteration gives
$$y_1 = \frac{3}{2}y(1 + \delta) - \frac{1}{2}x y^3 (1 + \delta)^3 = y\left(1 - \frac{3}{2}\delta^2 - \frac{1}{2}\delta^3\right).$$
Now let $g(\delta) = y_1$. We can find the maximum relative error after applying Newton's method by minimizing and maximizing $g(\delta)$ for $\delta \in [-\epsilon_0, \epsilon_0]$. The roots of $g'(\delta) = -3y\delta(1 + \delta/2)$ occur at $\delta = 0$ and $\delta = -2$, meaning that these are the critical points of $g(\delta)$. However, for small enough $\epsilon_0$ (specifically, for $\epsilon_0 < 2$), we need only consider the point $\delta = 0$ as well as the endpoints $\delta = \pm\epsilon_0$ (we will later see that $\epsilon_0$ is well below 2). Evaluating at these points, we see that $g(0) = y$, $g(-\epsilon_0) = y(1 - \frac{3}{2}\epsilon_0^2 + \frac{1}{2}\epsilon_0^3)$, and $g(\epsilon_0) = y(1 - \frac{3}{2}\epsilon_0^2 - \frac{1}{2}\epsilon_0^3)$. Thus, the minimum occurs at $\delta = \epsilon_0$ and the maximum occurs at $\delta = 0$, so
$$y\left(1 - \frac{3}{2}\epsilon_0^2 - \frac{1}{2}\epsilon_0^3\right) \le y_1 \le y.$$
Therefore, we conclude that if our initial guess $y_0$ has a relative error of at most $\epsilon_0$, then after one iteration of Newton's method the new relative error is at most $\frac{3}{2}\epsilon_0^2 + \frac{1}{2}\epsilon_0^3$; that is,
$$\left|\frac{y - y_1}{y}\right| \le \epsilon_1 = \frac{3}{2}\epsilon_0^2 + \frac{1}{2}\epsilon_0^3.$$
3.2 Computing the Initial Guess

3.2.1 Idea Behind the Initial Guess

For easier analysis, we rewrite the integer operations of line 7 in the following way. Suppose we are trying to find the inverse square root of x. For a floating point number x, let $x_s$, $x_E$, and $x_M$ denote the values of its sign bit, exponent field, and mantissa field, respectively. Let c be the "magic number" (0x5f3759df in the original code), and let $y_0$ be our initial guess. Then our code becomes:
t = x >> 1;
y_0 = c - t;
Although c is defined as an integer, we can interpret its bits as a floating point
number, and so our notation also applies to c.
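As a concrete illustration (a sketch of our own, not from the original source), these two lines can be executed at the bit level and compared against the true value:

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const uint32_t c = 0x5f3759df;      /* the magic number */
    float x = 3.0f;                     /* example input    */

    uint32_t xi;
    memcpy(&xi, &x, sizeof xi);
    uint32_t t   = xi >> 1;             /* t   = x >> 1 */
    uint32_t y0i = c - t;               /* y_0 = c - t  */

    float y0;
    memcpy(&y0, &y0i, sizeof y0);
    printf("y0 = %f, true value = %f\n", y0, 1.0 / sqrt((double)x));
    return 0;
}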
Since x is assumed to be positive, $x_s$ is 0. Thus the desired value we are trying to approximate with $y_0$ is
$$\frac{1}{\sqrt{x}} = \frac{1}{\sqrt{(1 + x_M)2^{x_E - 127}}} = \frac{1}{\sqrt{1 + x_M}}\,2^{-(x_E - 127)/2}.$$
Recall that performing a shift right operation by 1 divides an integer by 2.
Ignoring the issue of a bit being shifted from xE into xM , we consider the
exponent and mantissa fields to be divided by 2 separately so that tE = xE /2
and $t_M = x_M/2$. If we now treat the subtraction operation c − t separately for the exponent and mantissa fields (meaning that we ignore the possibility that the mantissa "borrows" a bit from the exponent), then $y_{0E} = c_E - x_E/2$ and $y_{0M} = c_M - x_M/2$. Thus,
$$y_0 = (1 + c_M - x_M/2)\,2^{c_E - x_E/2 - 127}.$$
The value of the exponent field in c = 0x5f3759df is 0xbe = 190. Substituting this value for $c_E$, we see that $y_0$ becomes $\frac{\sqrt{2}}{2}(1 + c_M - x_M/2)\,2^{-(x_E - 127)/2}$. By selecting $c_E$ to be 190, the factor $2^{-(x_E - 127)/2}$ is the same in $y_0$ as it is in $\frac{1}{\sqrt{x}}$. Therefore, $c_M$ should be picked appropriately so that $\frac{\sqrt{2}}{2}(1 + c_M - x_M/2)$ is a linear approximation of $\frac{1}{\sqrt{1 + x_M}}$.
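The field values quoted here are easy to check; the following snippet (our own) extracts them from the magic constant:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint32_t c = 0x5f3759df;
    uint32_t cE = (c >> 23) & 0xff;            /* exponent field */
    double   cM = (c & 0x7fffff) / 8388608.0;  /* mantissa field as a fraction; 2^23 = 8388608 */
    printf("cE = %u, cM = %.15f\n", cE, cM);   /* prints cE = 190, cM = 0.432430148124695 */
    return 0;
}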
In practice it is not as straightforward, as the integer shift right and sub-
traction operations on the exponent and mantissa are not performed separately,
meaning that bits can be shifted or borrowed from one field to the other. How-
ever, the high level idea of the approximation remains the same.
3.2.2 Detailed Analysis

We now carry out the analysis while accounting for the interactions between the fields. First suppose that $x_E$ is even, so that the shift right moves no bit from the exponent field into the mantissa field. For the result to be meaningful, two conditions must hold:

1. $c_E - x_E/2$ must not be negative, so that the resulting exponent field represents a valid positive value.

2. $c_E - x_E/2$ must not be zero. If $c_M - x_M/2$ is negative, a bit will be borrowed from $c_E - x_E/2$, and if $c_E - x_E/2$ is 0, the resulting exponent will become negative, breaking our first condition.
Thus we require that cE − xE /2 ≥ 1. Recall that xE is an 8-bit unsigned integer
(before bias), meaning that it is in the range [0..255]. The values 0 and 255 are
reserved for special cases (0, denormalization, ∞, and NaN), so a valid xE falls
into [1..254]. Recalling that we are only dealing with an even xE , we are further
restricted to even integers in [2..254]. Thus, xE /2 ∈ [1..127], meaning that cE
must be at least 128.
We now consider the result of subtracting the mantissas. If cM ≥ xM /2,
then no borrowing from the exponent field occurs, and we immediately obtain
the result
$$y_0 = (1 + c_M - x_M/2)\,2^{c_E - x_E/2 - 127}.$$
However, if cM < xM /2, then subtracting the mantissas will cause a bit to be
borrowed from the exponent field (as shown above, at least one bit is guaranteed
to be available for this purpose as cE − xE /2 ≥ 1), reducing the resulting
exponent by 1 and thus dividing the result by 2. Because $c_M - x_M/2 < 0$ but the bits in the mantissa still represent a value in the range [0, 1), this will cause the resulting mantissa's value to "wrap around", effectively adding 1 to what would have been a negative result. The resulting mantissa is therefore $1 + c_M - x_M/2$ and we obtain the result
$$y_0 = (2 + c_M - x_M/2)\,2^{c_E - x_E/2 - 128}.$$
Rewriting the exponents to be the same in each case, we summarize so far with the following:
$$y_0 = \begin{cases} (2 + 2c_M - x_M)\,2^{c_E - x_E/2 - 128}, & \text{if } c_M \ge x_M/2 \\ (2 + c_M - x_M/2)\,2^{c_E - x_E/2 - 128}, & \text{if } c_M < x_M/2 \end{cases}.$$
Now suppose that $x_E$ is odd. In this case the shift right moves the lowest bit of the exponent field into the top of the mantissa field, so that $t_E = (x_E - 1)/2$ and $t_M = (x_M + 1)/2$. If $c_M \ge (x_M + 1)/2$, no bit is borrowed and
$$y_0 = (1 + c_M - (x_M + 1)/2)\,2^{c_E - (x_E - 1)/2 - 127}.$$
If $c_M < (x_M + 1)/2$, we must account for the bit borrowed from the exponent field. As before, the exponent is reduced by 1, dividing the result by 2, and the value of the mantissa wraps around to fall in the range [0, 1), equivalent to adding 1. Therefore in this case,
$$y_0 = (2 + c_M - (x_M + 1)/2)\,2^{c_E - (x_E - 1)/2 - 128}.$$
We again summarize our results up to this point, rewriting the exponents to be more consistent:
$$y_0 = \begin{cases} (2 + 2c_M - x_M)\,2^{c_E - x_E/2 - 128}, & \text{if } x_E \text{ is even and } c_M \ge x_M/2 \\ (2 + c_M - x_M/2)\,2^{c_E - x_E/2 - 128}, & \text{if } x_E \text{ is even and } c_M < x_M/2 \\ \frac{\sqrt{2}}{2}(2 + 4c_M - 2x_M)\,2^{c_E - x_E/2 - 128}, & \text{if } x_E \text{ is odd and } c_M \ge (x_M + 1)/2 \\ \frac{\sqrt{2}}{2}(3 + 2c_M - x_M)\,2^{c_E - x_E/2 - 128}, & \text{if } x_E \text{ is odd and } c_M < (x_M + 1)/2 \end{cases}$$
(Here $x_E/2$ denotes exact division, so for odd $x_E$ the exponent $c_E - x_E/2 - 128$ is a half-integer; the factor $\frac{\sqrt{2}}{2}$ absorbs the difference.)
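This closed form can be checked numerically. The sketch below (our own, not from the original paper) compares it against the actual integer operations; tiny discrepancies on the order of $2^{-23}$ can appear because the shift discards the lowest mantissa bit, which the analysis ignores.

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* y0 from the actual integer operations */
static float y0_bits(float x, uint32_t c)
{
    uint32_t xi, yi;
    float y0;
    memcpy(&xi, &x, sizeof xi);
    yi = c - (xi >> 1);
    memcpy(&y0, &yi, sizeof y0);
    return y0;
}

/* y0 from the four-case closed form above */
static double y0_formula(float x, uint32_t c)
{
    uint32_t xi;
    memcpy(&xi, &x, sizeof xi);
    uint32_t xE = (xi >> 23) & 0xff;
    double   xM = (xi & 0x7fffff) / 8388608.0;
    uint32_t cE = (c >> 23) & 0xff;
    double   cM = (c & 0x7fffff) / 8388608.0;
    double   p  = pow(2.0, cE - xE / 2.0 - 128.0);   /* exact (possibly half-integer) exponent */

    if (xE % 2 == 0)
        return (cM >= xM / 2) ? (2 + 2 * cM - xM) * p
                              : (2 + cM - xM / 2) * p;
    else
        return (cM >= (xM + 1) / 2)
                 ? (sqrt(2.0) / 2) * (2 + 4 * cM - 2 * xM) * p
                 : (sqrt(2.0) / 2) * (3 + 2 * cM - xM) * p;
}

int main(void)
{
    const uint32_t c = 0x5f3759df;
    float xs[] = { 0.3f, 1.0f, 2.5f, 7.0f, 100.0f };
    for (int i = 0; i < 5; i++)
        printf("x = %8.3f  bits: %.9f  formula: %.9f\n",
               xs[i], y0_bits(xs[i], c), y0_formula(xs[i], c));
    return 0;
}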
3.2.3 Error of the Initial Guess

We now compare $y_0$ to the true value
$$y = \frac{1}{\sqrt{x}} = \frac{1}{\sqrt{(1 + x_M)2^{x_E - 127}}} = \frac{2^{-(x_E - 127)/2}}{\sqrt{1 + x_M}}.$$
The relative error of $y_0$ to $y$ is $\epsilon_0 = \left|\frac{y - y_0}{y}\right|$. Define $\epsilon$ to be $\frac{y - y_0}{y}$. As we have not yet determined the optimal value of c, $\epsilon$ depends on $c_E$, $c_M$, $x_E$, and $x_M$. We again proceed by considering the cases where $x_E$ is even and odd separately.
First suppose $x_E$ is even. Let
$$f_e(x_M, c_M) = \begin{cases} 2c_M - x_M, & \text{if } c_M \ge x_M/2 \\ c_M - x_M/2, & \text{if } c_M < x_M/2 \end{cases}.$$
Then we can write $y_0$ as
$$y_0 = (2 + f_e(x_M, c_M))\,2^{c_E - x_E/2 - 128}.$$
In this case we see that
$$\epsilon = \frac{y - y_0}{y} = 1 - \frac{y_0}{y} = 1 - \frac{(2 + f_e(x_M, c_M))\,2^{c_E - x_E/2 - 128}}{\frac{2^{-(x_E - 127)/2}}{\sqrt{1 + x_M}}} = 1 - \sqrt{2}\sqrt{1 + x_M}\,(2 + f_e(x_M, c_M))\,2^{c_E - 192}.$$
Now suppose $x_E$ is odd. Let
$$f_o(x_M, c_M) = \begin{cases} 4c_M - 2x_M, & \text{if } c_M \ge (x_M + 1)/2 \\ 1 + 2c_M - x_M, & \text{if } c_M < (x_M + 1)/2 \end{cases}.$$
We can again write $y_0$ as
$$y_0 = \frac{\sqrt{2}}{2}\,(2 + f_o(x_M, c_M))\,2^{c_E - x_E/2 - 128}.$$
As before, we simplify to see that
$$\epsilon = 1 - \sqrt{1 + x_M}\,(2 + f_o(x_M, c_M))\,2^{c_E - 192}.$$
Combining our results to cover all cases,
$$\epsilon = \begin{cases} 1 - \sqrt{2}\sqrt{1 + x_M}\,(2 + f_e(x_M, c_M))\,2^{c_E - 192}, & \text{if } x_E \text{ is even} \\ 1 - \sqrt{1 + x_M}\,(2 + f_o(x_M, c_M))\,2^{c_E - 192}, & \text{if } x_E \text{ is odd} \end{cases}.$$
3.2.4 Finding the Optimal Magic Number

We determine $c_E$ first, by finding the exponent field $y_E$ of the true value $y = \frac{2^{-(x_E - 127)/2}}{\sqrt{1 + x_M}}$. This result almost looks to be in the form $(1 + y_M)2^{y_E - 127}$. However, the mantissa (including the implied leading 1) must lie in the range [1, 2), and since $x_M$ lies in [0, 1) (as it does not include the implied leading 1), $\frac{1}{\sqrt{1 + x_M}}$ lies between $\frac{\sqrt{2}}{2}$ and 1. To put y into the proper form, a power of 2 must be moved from the exponent into the mantissa, and how much is moved depends on whether the exponent $-(x_E - 127)/2$ is an integer, that is, on the parity of $x_E$. Carrying this out, we find that if $x_E$ is even, $y_E = 190 - x_E/2$, and if $x_E$ is odd, $y_E = 189 - (x_E - 1)/2$. Using right shift notation, we can rewrite these results: if $x_E$ is even, then $y_E = 190 - (x_E >> 1)$, and if $x_E$ is odd, then $y_E = 189 - (x_E >> 1)$. The
expressions for yE are nearly identical to the line of code to compute y0 except
that the code works with all of y0 , c, and x, rather than just the exponent fields.
However, as we want y0E to be close to yE , we have found appropriate values for
cE : 189 or 190. We cannot use both of them, as we must use the same constant
for both even and odd cases, so we decide to let cE = 190 as in the original code
and continue. This meets our previous requirement that cE be at least 128.
Having picked $c_E$, we can now simplify our expression for $\epsilon$:
$$\epsilon = \begin{cases} 1 - \frac{\sqrt{2}}{4}\sqrt{1 + x_M}\,(2 + f_e(x_M, c_M)), & \text{if } x_E \text{ is even} \\ 1 - \frac{1}{4}\sqrt{1 + x_M}\,(2 + f_o(x_M, c_M)), & \text{if } x_E \text{ is odd} \end{cases}.$$
Our task is now to find a value for $c_M \in [0, 1)$ such that $\max_{x_M \in [0,1)} |\epsilon|$ is minimized. To do this, we first fix $c_M$ and determine the value (or values) of $x_M$ which maximize $|\epsilon|$ for that fixed $c_M$. As usual, we examine the cases for $x_E$ even and $x_E$ odd separately.
First suppose $x_E$ is even. Recall the definition of $f_e(x_M, c_M)$,
$$f_e(x_M, c_M) = \begin{cases} 2c_M - x_M, & \text{if } c_M \ge x_M/2 \\ c_M - x_M/2, & \text{if } c_M < x_M/2 \end{cases}.$$
Since $\epsilon$ is continuous and piecewise differentiable in $x_M$, its extrema over $x_M \in [0, 1)$ occur at the endpoints $x_M = 0$ and $x_M = 1$, at the boundary $x_M = 2c_M$ between the two pieces, or at critical points. We define $g_1(c_M)$, $g_2(c_M)$, and $g_3(c_M)$ to be the value of $\epsilon$ at the first three of these points, respectively:
$$g_1(c_M) = 1 - \frac{\sqrt{2}}{2}(1 + c_M),$$
$$g_2(c_M) = \begin{cases} \frac{1}{2} - c_M, & \text{if } c_M \ge \frac{1}{2} \\ \frac{1}{4} - \frac{1}{2}c_M, & \text{if } c_M < \frac{1}{2} \end{cases},$$
$$g_3(c_M) = \begin{cases} 0, & \text{if } c_M \ge \frac{1}{2} \\ 1 - \frac{\sqrt{2}}{2}\sqrt{1 + 2c_M}, & \text{if } c_M < \frac{1}{2} \end{cases}.$$
For the critical points, we set the derivative of $\epsilon$ with respect to $x_M$ to zero in each piece. The derivative in the first case is 0 when $x_M = \frac{2}{3}c_M$, which also satisfies the condition that $c_M > x_M/2$. The derivative in the second case is 0 when $x_M = \frac{2}{3}(1 + c_M)$, and the condition $c_M < x_M/2$ can only be satisfied when $c_M < \frac{1}{2}$. Thus, there is a critical point at $x_M = \frac{2}{3}c_M$ and another at $x_M = \frac{2}{3}(1 + c_M)$ if $c_M < \frac{1}{2}$, so we define $g_4(c_M)$ and $g_5(c_M)$, corresponding respectively to these critical points, as
$$g_4(c_M) = 1 - \frac{\sqrt{2}}{2}\left(1 + \frac{2}{3}c_M\right)^{3/2},$$
$$g_5(c_M) = \begin{cases} 0, & \text{if } c_M \ge \frac{1}{2} \\ 1 - \frac{5\sqrt{30}}{36}\left(1 + \frac{2}{5}c_M\right)^{3/2}, & \text{if } c_M < \frac{1}{2} \end{cases}.$$
Now suppose $x_E$ is odd. $f_o(x_M, c_M)$ is continuous and differentiable for $c_M > (x_M + 1)/2$ and for $c_M < (x_M + 1)/2$, and if $c_M = (x_M + 1)/2$, then $4c_M - 2x_M = 1 + 2c_M - x_M$. Therefore, $f_o(x_M, c_M)$ is continuous and piecewise differentiable on [0, 1], as is $\epsilon$, so the maxima and minima of $\epsilon$ will occur at critical points or endpoints of its pieces.
We first consider the endpoints, where $x_M$ is 0, 1, or $2c_M - 1$, and define $g_6(c_M)$, $g_7(c_M)$, and $g_8(c_M)$ to be $\epsilon$ corresponding to these values of $x_M$, respectively. Then
$$g_6(c_M) = \begin{cases} \frac{1}{2} - c_M, & \text{if } c_M \ge \frac{1}{2} \\ \frac{1}{4} - \frac{1}{2}c_M, & \text{if } c_M < \frac{1}{2} \end{cases},$$
$$g_7(c_M) = 1 - \frac{\sqrt{2}}{2}(1 + c_M).$$
The point $x_M = 2c_M - 1$ can only occur if $c_M \ge \frac{1}{2}$, so we define $g_8(c_M)$ as
$$g_8(c_M) = \begin{cases} 1 - \sqrt{2c_M}, & \text{if } c_M \ge \frac{1}{2} \\ 0, & \text{if } c_M < \frac{1}{2} \end{cases}.$$
Next we find the critical points. The derivative of $\epsilon$ in the first case is 0 when $x_M = (2c_M - 1)/3$, which satisfies the condition $c_M > (x_M + 1)/2$ when $c_M > \frac{1}{2}$. The derivative in the second case is 0 when $x_M = (2c_M + 1)/3$, which always satisfies the condition $c_M < (x_M + 1)/2$ (when $c_M < 1$). We define $g_9(c_M)$ and $g_{10}(c_M)$ corresponding respectively to these critical points as
$$g_9(c_M) = \begin{cases} 1 - \left(\frac{2}{3}(1 + c_M)\right)^{3/2}, & \text{if } c_M > \frac{1}{2} \\ 0, & \text{if } c_M \le \frac{1}{2} \end{cases},$$
$$g_{10}(c_M) = 1 - \frac{1}{2}\left(\frac{2}{3}(2 + c_M)\right)^{3/2}.$$
We now have a $g_j(c_M)$ covering each possible case where $\epsilon$ could be minimized or maximized over all $x_M \in [0, 1)$. Let $h(c_M) = \max_{1 \le j \le 10} |g_j(c_M)|$. Then for fixed $c_M$,
$$\max_{x_M \in [0,1)} |\epsilon| = h(c_M).$$
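Rather than coding all ten $g_j(c_M)$, a brute-force sketch (our own) can grid-search the inner maximum of $|\epsilon|$ directly from the simplified expressions above; with a fine enough grid it reproduces the optimum quoted below:

#include <math.h>
#include <stdio.h>

/* |eps| for given xM and cM in the even and odd cases, with cE = 190. */
static double eps_even(double xM, double cM)
{
    double f = (cM >= xM / 2) ? 2 * cM - xM : cM - xM / 2;
    return fabs(1 - sqrt(2.0) / 4 * sqrt(1 + xM) * (2 + f));
}

static double eps_odd(double xM, double cM)
{
    double f = (cM >= (xM + 1) / 2) ? 4 * cM - 2 * xM : 1 + 2 * cM - xM;
    return fabs(1 - 0.25 * sqrt(1 + xM) * (2 + f));
}

int main(void)
{
    const int N = 4096;                 /* grid resolution */
    double best_c = 0, best_h = 1e9;

    for (int i = 0; i < N; i++) {
        double cM = (double)i / N;
        double h = 0;                   /* approximate max over xM of |eps| */
        for (int j = 0; j <= N; j++) {
            double xM = (double)j / (N + 1);
            double e  = fmax(eps_even(xM, cM), eps_odd(xM, cM));
            if (e > h) h = e;
        }
        if (h < best_h) { best_h = h; best_c = cM; }
    }
    /* a fine grid approaches cM ~ 0.4327449 with max|eps| ~ 0.0342128 */
    printf("cM ~ %.7f, max|eps| ~ %.7f\n", best_c, best_h);
    return 0;
}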
Our last step is to determine the value for $c_M \in [0, 1)$ such that $h(c_M)$ is minimized. Using a numerical minimizer for this task, we find the optimal value to be
$$c_M \approx 0.4327448899640689,$$
giving a maximum error of
$$\epsilon_0 = |\epsilon| \approx 0.03421281.$$
This choice for cM corresponds to the value 0x37642f for the right half of
the magic number. This differs from the choice of cM for the original magic
number, which was 0x3759df. However, the original choice of cM corresponds
to a mantissa field of 0.432430148124695, which is not far off from our result.
3.3 Results
We have determined that the maximum relative error of the initial guess $y_0$ is approximately $\epsilon_0 = 0.03421281$ and that by applying one iteration of Newton's method, the maximum relative error becomes $\epsilon_1 = \frac{3}{2}\epsilon_0^2 + \frac{1}{2}\epsilon_0^3$. Combining these results, we obtain the maximum relative error of our final result,
$$\epsilon_1 \approx 0.0017758.$$
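This combination is a one-line computation; the tiny check below (our own) confirms the arithmetic:

#include <stdio.h>

int main(void)
{
    /* combine the two error bounds derived above */
    double e0 = 0.03421281;
    double e1 = 1.5 * e0 * e0 + 0.5 * e0 * e0 * e0;
    printf("eps1 = %.7f\n", e1);        /* prints eps1 = 0.0017758 */
    return 0;
}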
Figure 2 shows a plot of the initial guess compared to the actual value and
Figure 3 shows the final result after an iteration of Newton’s method.
[Figure 2: The initial guess (red) compared to $\frac{1}{\sqrt{x}}$ (blue).]

[Figure 3: The final result (red) compared to $\frac{1}{\sqrt{x}}$ (blue).]
It is worth noting that the original magic number used differs slightly from
the value we derived. It is possible that the original value was determined using
a criterion other than the minimum of the maximum error. Eberly [1] optimized
a variety of different criteria to obtain several other choices for cM , though none
of them were exactly the same as the original. It also may be the case that
the optimization to determine the original value for cM was performed after the
iteration of Newton’s method, as the true value requiring optimization is the
final result, not the initial guess. Although in theory this should produce the
same value for cM , it is easy to overlook the details of floating point math in
hardware which may introduce small amounts of error in practice. Notably, by
testing all possible floating point values using each choice for cM , Lomont [4]
found that while the initial guess using the value for cM derived here performed
better than the original choice, the new value actually produced a slightly higher
maximum error after an iteration of Newton's method. This indicates that the floating point arithmetic in the Newton's method step is in fact a source of error that the analysis above overlooks. Further analysis would need to be performed to determine why this is the case.
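An exhaustive test in the spirit of Lomont's experiment is straightforward to sketch (this is our own setup; Lomont's actual harness may differ in details). It measures the worst relative error after one Newton iteration over every positive normal float:

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static float rsqrt_guess(float x, uint32_t c)
{
    uint32_t i;
    float y;
    memcpy(&i, &x, sizeof i);
    i = c - (i >> 1);
    memcpy(&y, &i, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);   /* one Newton iteration */
}

int main(void)
{
    const uint32_t c = 0x5f3759df;          /* try 0x5f37642f here as well */
    double worst = 0;
    /* all positive normal floats: exponents 1..254, every mantissa (~2 billion values; takes a while) */
    for (uint32_t bits = 0x00800000u; bits < 0x7f800000u; bits++) {
        float x;
        memcpy(&x, &bits, sizeof x);
        double y    = rsqrt_guess(x, c);
        double yref = 1.0 / sqrt((double)x);
        double err  = fabs((yref - y) / yref);
        if (err > worst) worst = err;
    }
    printf("max relative error: %.9f\n", worst);
    return 0;
}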
4 Conclusion
Although the fast inverse square root function is difficult to decode at first
glance, after a detailed analysis we have a thorough understanding of the moti-
vation behind the method and its inner workings. After analyzing the specific
case of computing $x^{-1/2}$ for 32-bit floating point numbers, several questions
naturally arise. Is it possible to extend the algorithm to work using 64-bit dou-
ble precision floats? The answer, fortunately, is yes, as the algorithm is not
dependent on the length of the bit strings being manipulated. Furthermore, our
optimal value for cM remains the same as long as we have computed enough
digits. Code for a 64-bit version is provided by McEniry in [5]. Another question
that arises is whether it is possible to derive similar algorithms which approxi-
mate x to powers other than −1/2. McEniry [5] briefly discusses this in his conclusion, providing some equations as a starting point for $\frac{1}{\sqrt[n]{x}}$, $\sqrt[n]{x}$, and $x^a$. Notably, we can easily approximate $\sqrt{x}$, since $\sqrt{x} = x \cdot \frac{1}{\sqrt{x}}$.
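For instance, a one-line sketch of this last idea in code:

float Q_rsqrt( float number );   /* as listed in the introduction */

/* Approximate sqrt(x) as x * (1/sqrt(x)). */
float fast_sqrt( float x )
{
    return x * Q_rsqrt( x );
}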
References
[1] David Eberly, Fast Inverse Square Root (Revisited), 2010.
[2] id Software, quake3-1.32b/code/game/q_math.c, Quake III Arena, 1999.
[3] Lee W. Johnson and R. Dean Riess, Newton’s Method, Numerical Analysis, 1982, pp. 160–
161.
[4] Chris Lomont, Fast Inverse Square Root, 2003.
[5] Charles McEniry, The Mathematics Behind the Fast Inverse Square Root Function Code,
2007.