Timing Attacks on Implementations of
Diffie-Hellman, RSA, DSS, and Other Systems
Paul C. Kocher
Cryptography Research, Inc.
607 Market Street, 5th Floor, San Francisco, CA 94105, USA.
E-mail: paul@[Link].
Abstract. By carefully measuring the amount of time required to perform
private key operations, attackers may be able to find fixed Diffie-Hellman
exponents, factor RSA keys, and break other cryptosystems.
Against a vulnerable system, the attack is computationally inexpensive
and often requires only known ciphertext. Actual systems are potentially
at risk, including cryptographic tokens, network-based cryptosystems,
and other applications where attackers can make reasonably accurate
timing measurements. Techniques for preventing the attack for RSA and
Diffie-Hellman are presented. Some cryptosystems will need to be revised
to protect against the attack, and new protocols and algorithms
may need to incorporate measures to prevent timing attacks.
Keywords: timing attack, cryptanalysis, RSA, Diffie-Hellman, DSS.
1 Introduction
Cryptosystems often take slightly different amounts of time to process different
inputs. Reasons include performance optimizations to bypass unnecessary
operations, branching and conditional statements, RAM cache hits, processor
instructions (such as multiplication and division) that run in non-fixed time, and
a wide variety of other causes. Performance characteristics typically depend on
both the encryption key and the input data (e.g., plaintext or ciphertext). While
it is known that timing channels can leak data or keys across a controlled perime-
ter, intuition might suggest that unintentional timing characteristics would only
reveal a small amount of information from a cryptosystem (such as the Ham-
ming weight of the key). However, attacks are presented which can exploit timing
measurements from vulnerable systems to find the entire secret key.
2 Cryptanalysis of a Simple Modular Exponentiator
Diffie-Hellman[2] and RSA[8] private-key operations consist of computing
$R = y^x \bmod n$, where n is public and y can be found by an eavesdropper. The
attacker's goal is to find x, the secret key. For the attack, the victim must
compute $y^x \bmod n$ for several values of y, where y, n, and the computation time are
known to the attacker. (If a new secret exponent x is chosen for each operation,
the attack does not work.) The necessary information and timing measurements
might be obtained by passively eavesdropping on an interactive protocol, since
an attacker could record the messages received by the target and measure the
amount of time taken to respond to each y. The attack assumes that the attacker
knows the design of the target system, although in practice this could probably
be inferred from timing information.
The attack can be tailored to work with virtually any implementation that
does not run in fixed time, but is first outlined using the simple modular
exponentiation algorithm below, which computes $R = y^x \bmod n$, where x is w bits
long:
Let s_0 = 1.
For k = 0 upto w - 1:
    If (bit k of x) is 1 then
        Let R_k = (s_k * y) mod n.
    Else
        Let R_k = s_k.
    Let s_{k+1} = R_k^2 mod n.
EndFor.
Return (R_{w-1}).
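The loop above can be transcribed almost line for line into Python. This is a minimal sketch rather than code from any real library: the name `simple_modexp` is made up for illustration, and it assumes that "bit 0" of x means the most significant of the w exponent bits, which is the reading under which the loop returns y^x mod n.

```python
def simple_modexp(y, x, n, w):
    """Square-and-multiply exponentiation following the loop above.

    Assumption: bit k of x is counted from the most significant end,
    so bit 0 is the top bit of the w-bit exponent.
    """
    s = 1                        # s_0
    r = 1                        # returned unchanged only if w == 0
    for k in range(w):
        if (x >> (w - 1 - k)) & 1:
            r = (s * y) % n      # R_k = (s_k * y) mod n: the key-dependent multiply
        else:
            r = s                # R_k = s_k: multiply skipped
        s = (r * r) % n          # s_{k+1} = R_k^2 mod n
    return r                     # R_{w-1}
```

The conditional multiply is the step whose presence or absence the attack detects. Python's big-integer arithmetic is itself not constant time, so the sketch mirrors only the control flow, not realistic timings.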
The attack allows someone who knows exponent bits 0..(b-1) to find bit b. To
obtain the entire exponent, start with b equal to 0 and repeat the attack until
the entire exponent is known.
Because the first b exponent bits are known, the attacker can compute the
first b iterations of the For loop to find the value of $s_b$. The next iteration requires
the first unknown exponent bit. If this bit is set, $R_b = (s_b \cdot y) \bmod n$ will be
computed. If it is zero, the operation will be skipped.
The attack will be described first in an extreme hypothetical case. Suppose
the target system uses a modular multiplication function that is normally
extremely fast but occasionally takes much more time than an entire
normal modular exponentiation. For a few $s_b$ and y values the calculation of
$R_b = (s_b \cdot y) \bmod n$ will be extremely slow, and by using knowledge about the
target system's design the attacker can determine which these are. If the total
modular exponentiation time is ever fast when $R_b = (s_b \cdot y) \bmod n$ is slow,
exponent bit b must be zero. Conversely, if slow $R_b = (s_b \cdot y) \bmod n$ operations always
result in slow total modular exponentiation times, the exponent bit is probably
set. Once exponent bit b is known, the attacker can verify that the overall
operation time is slow whenever $s_{b+1} = R_b^2 \bmod n$ is expected to be slow. The same
set of timing measurements can then be reused to find the following exponent
bits.
3 Error Correction
If exponent bit b is guessed incorrectly, the values computed for $R_{k \ge b}$ will be
incorrect and, so far as the attack is concerned, essentially random. The time
required for multiplies following the error will not be reflected in the overall
exponentiation time. The attack thus has an error-detection property; after an
incorrect exponent bit guess, no more meaningful correlations are observed.
The error detection property can be used for error correction. For example,
the attacker can maintain a list of the most likely exponent intermediates along
with a value corresponding to the probability each is correct. The attack is
continued for only the most likely candidate. If the currently-favored value is
incorrect, it will tend to fall in ranking, while correct values will tend to rise.
Error correction techniques increase the memory and processing requirements
for the attack, but can greatly reduce the number of samples required.
4 The General Attack
The attack can be treated as a signal detection problem. The "signal" consists
of the timing variation due to the target exponent bit, and "noise" results from
measurement inaccuracies and timing variations due to unknown exponent bits.
The properties of the signal and noise determine the number of timing
measurements required for the attack.
Given j messages $y_0, y_1, \ldots, y_{j-1}$ with corresponding timing measurements
$T_0, T_1, \ldots, T_{j-1}$, the probability that a guess $x_b$ for the first b exponent bits is
correct is proportional to

$$P(x_b) \propto \prod_{i=0}^{j-1} F(T_i - t(y_i, x_b))$$

where $t(y_i, x_b)$ is the amount of time required for the first b iterations of the
$y_i^x \bmod n$ computation using exponent bits $x_b$, and F is the expected probability
distribution function of $T - t(y, x_b)$ over all y values and correct $x_b$. Because F
is defined as the probability distribution of $T_i - t(y_i, x_b)$ if $x_b$ is correct, it is the
best function for predicting $T_i - t(y_i, x_b)$. Note that the timing measurements
and intermediate s values can be used to improve the estimate of F.
Given a correct guess for $x_{b-1}$, there are two possible values for $x_b$. The
probability that $x_b$ is correct and $x'_b$ is incorrect can be found as

$$\frac{\prod_{i=0}^{j-1} F(T_i - t(y_i, x_b))}{\prod_{i=0}^{j-1} F(T_i - t(y_i, x_b)) + \prod_{i=0}^{j-1} F(T_i - t(y_i, x'_b))}.$$

In practice, this formula is not very useful because finding F would require
extraordinary effort.
5 Simplifying the Attack
Fortunately it is generally not necessary to compute F. Each timing observation
consists of $T = e + \sum_{i=0}^{w-1} t_i$, where $t_i$ is the time required for the multiplication
and squaring steps for bit i and e includes measurement error, loop overhead,
etc. Given guess $x_b$, the attacker can find $\sum_{i=0}^{b-1} t_i$ for each sample y. If $x_b$ is
correct, subtracting from T yields
$e + \sum_{i=0}^{w-1} t_i - \sum_{i=0}^{b-1} t_i = e + \sum_{i=b}^{w-1} t_i$. Since
the modular multiplication times are effectively independent from each other
and from the measurement error, the variance of $e + \sum_{i=b}^{w-1} t_i$ over all observed
samples is expected to be $\mathrm{Var}(e) + (w - b)\mathrm{Var}(t)$. However if only the first c < b
bits of the exponent guess are correct, the expected variance will be
$\mathrm{Var}(e) + (w + b - 2c)\mathrm{Var}(t)$. Correctly-emulated iterations decrease the expected variance
by $\mathrm{Var}(t)$, while iterations following an incorrect exponent bit each increase the
variance by $\mathrm{Var}(t)$. Computing the variances is easy and provides a good way to
identify correct exponent bit guesses.
It is now possible to estimate the number of samples required for the attack.
Suppose an attacker has j accurate timing measurements and has two guesses
for the first b bits of a w-bit exponent, one correct and the other incorrect with
the first error at bit c. For each guess the timing measurements can be adjusted
by $\sum_{i=0}^{b-1} t_i$. The correct guess will be identified successfully if its adjusted values
have the smaller variance.
It is possible to approximate $t_i$ using independent standard normal variables.
If Var(e) is negligible, the expected probability of a correct guess is

$$P\left(\sum_{i=0}^{j-1}\left(\sqrt{w-b}\,X_i + \sqrt{2(b-c)}\,Y_i\right)^2 > \sum_{i=0}^{j-1}\left(\sqrt{w-b}\,X_i\right)^2\right)$$

$$= P\left(2\sqrt{2(b-c)(w-b)}\sum_{i=0}^{j-1} X_i Y_i + 2(b-c)\sum_{i=0}^{j-1} Y_i^2 > 0\right)$$

where X and Y are normal random variables with $\mu = 0$ and $\sigma = 1$. Because j
is relatively large, $\sum_{i=0}^{j-1} Y_i^2 \approx j$ and $\sum_{i=0}^{j-1} X_i Y_i$ is approximately normal with
$\mu = 0$ and $\sigma = \sqrt{j}$, yielding

$$P\left(2\sqrt{2(b-c)(w-b)}\,(\sqrt{j}\,Z) + 2(b-c)\,j > 0\right) = P\left(Z > -\frac{\sqrt{j(b-c)}}{\sqrt{2(w-b)}}\right)$$

where Z is a standard normal random variable. Finally, integrating to find the
probability of a correct guess yields $\Phi\left(\sqrt{\frac{j(b-c)}{2(w-b)}}\right)$, where $\Phi(x)$ is the area under
the standard normal curve from $-\infty$ to x. The required number of samples (j) is
thus proportional to the exponent size (w). The number of measurements might
be reduced if attackers choose inputs known to have extreme timing
characteristics at exponent locations of interest.
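The closing estimate, Phi(sqrt(j(b - c) / (2(w - b)))), is easy to evaluate numerically. The helper names below are made up for illustration; the sketch uses only `math.erf` from the Python standard library and inherits the derivation's assumptions (normally distributed t_i, negligible Var(e)).

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_correct_guess(j, b, c, w):
    """Estimated probability that the correct guess has the smaller variance."""
    return normal_cdf(math.sqrt(j * (b - c) / (2.0 * (w - b))))


def samples_for_z(z, b, c, w):
    """Smallest j for which sqrt(j(b - c) / (2(w - b))) reaches z."""
    return math.ceil(z * z * 2 * (w - b) / (b - c))
```

For example, `p_correct_guess(250, 1, 0, 127)` evaluates to roughly 0.84, and `samples_for_z` grows linearly in w - b, which is the proportionality between sample count and exponent size stated above.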
6 Experimental Results
Figure 1 shows the distribution of $10^6$ modular multiplication times observed
using the RSAREF toolkit[10] on a 120-MHz Pentium™ computer running
MSDOS™. The distribution was prepared by timing one million ($a \cdot b \bmod n$)
calculations using a and b values from actual modular exponentiation operations
with random inputs. The 512-bit sample prime #1 from the RSAREF Diffie-Hellman
demonstration program was used for n. A few wildly aberrant samples
(which took over 1300 μs) were discarded. The Figure 1 distribution has mean
μ = 1167.8 μs and standard deviation σ = 12.01 μs. The measurement error is small;
the tests were run twice and the average measurement difference was found to
be under 1 μs. RSAREF uses the same function for squaring and multiplication,
so squaring and multiplication times have identical distributions.
RSAREF precomputes $y^2$ and $y^3 \bmod n$ and processes two exponent bits
at a time. In total, a 512-bit modular exponentiation with a random 256-bit
exponent requires 128 iterations of the modular exponentiation loop and a total
of about 352 modular multiplication and squaring operations. Each iteration
of the modular exponentiation loop does two squaring operations and, if either
exponent bit is nonzero, one multiply. The attack can be adjusted to append
pairs of exponent bits and to evaluate four candidate values at each exponent
position instead of two.
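The operation count quoted above can be checked with a small two-bits-at-a-time exponentiator. This is an illustrative sketch of the technique as described in the text, not RSAREF's actual code; the function names and the order of operations inside the loop are assumptions.

```python
def modexp_two_bit(y, x, n, w):
    """Exponentiate two exponent bits per iteration; returns (result, op count).

    Assumes w is even and that bit pairs are consumed from the most
    significant end. Precomputed powers are not counted as loop operations.
    """
    if w % 2:
        raise ValueError("w must be even")
    table = [1, y % n, (y * y) % n, (y * y * y) % n]   # y^0 .. y^3 mod n
    result, ops = 1, 0
    for shift in range(w - 2, -1, -2):
        result = (result * result) % n                 # first squaring
        result = (result * result) % n                 # second squaring
        ops += 2
        pair = (x >> shift) & 3
        if pair:                                       # multiply unless both bits are 0
            result = (result * table[pair]) % n
            ops += 1
    return result, ops


def expected_ops(w):
    """Average loop operations for a random w-bit exponent."""
    iterations = w // 2
    return 2 * iterations + 0.75 * iterations          # 3 of 4 bit pairs are nonzero
```

For w = 256, `expected_ops` gives 2(128) + 0.75(128) = 352, in line with the figure of about 352 operations.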
Since modular multiplications consume most of the total modular exponentiation
time, it is expected that the distribution of modular exponentiation
times will be approximately normal with μ ≈ (1167.8)(352) = 411,065.6 μs and
σ ≈ 12.01√352 = 225.3 μs. Figure 2 shows measurements from 5000 actual
modular exponentiation operations using the same computer and modulus, which
yielded μ = 419,901 μs and σ = 235 μs.
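The prediction is a direct application of the rule that means of independent terms add while standard deviations add in quadrature. A short check, using only the figures quoted in the text (the variable names are illustrative):

```python
import math

mult_mean_us = 1167.8    # mean time of one modular multiplication (Figure 1)
mult_sd_us = 12.01       # standard deviation of one modular multiplication
operations = 352         # multiplications and squarings per exponentiation

# Sum of 352 independent, identically distributed operation times.
predicted_mean_us = mult_mean_us * operations
predicted_sd_us = mult_sd_us * math.sqrt(operations)

# Relative gap between the measured mean (Figure 2) and the prediction.
measured_mean_us = 419901
relative_gap = (measured_mean_us - predicted_mean_us) / predicted_mean_us
```

The measured mean lies about 2% above the predicted value, and the measured standard deviation of 235 μs is close to the predicted 225.3 μs.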
With 250 timing measurements, the probability that subtracting the time for
a correct modexp loop iteration from each sample will reduce the total variance
more than subtracting an incorrect iteration is estimated to be
$\Phi\left(\sqrt{\frac{j(b-c)}{2(w-b)}}\right)$, where j = 250, b = 1, c = 0, and w = 127. (There are 128 iterations of the
RSAREF modexp loop for a 256-bit exponent, but the first iteration is ignored.)
Correct guesses are thus expected with probability
$\Phi\left(\sqrt{\frac{250(1-0)}{2(126)}}\right) \approx 0.84$. The
5000 samples from Figure 2 were divided into 20 groups of 250 samples each,
and variances from subtracting the time for incorrect and correct modexp loop
iterations were compared at each of the 127 exponent bit pairs. Of the 2450
trials, 2168 produced a larger variance after subtracting an incorrect modexp
loop time than after subtracting the time for a correct modexp loop, yielding a
probability of 0.885. The first exponent bits are most difficult, since b becomes
larger as more exponent bits become known and the probabilities should improve.
(The test above did not take advantage of this property.) It is important to note
that accurate timing measurements were used; measurement errors which are
large relative to the total modular exponentiation time standard deviation will
increase the number of samples needed.
The attack is computationally quite easy. With RSAREF, the attacker has
to evaluate four choices per pair of bits. Thus the attacker only has to do four
times the number of operations done by the victim, not counting effort wasted
by incorrect guesses.
7 Montgomery Multiplication and the CRT
Modular reduction steps usually cause most of the timing variation in a modular
multiplication operation. Montgomery multiplication[6] eliminates the mod n
reduction steps and, as a result, tends to reduce the size of the timing
characteristics. However, some variation usually remains. If the remaining "signal" is not
dwarfed by measurement errors, the variance in $t_b$ and the variance of $\sum_{i=b+1}^{w-1} t_i$
would be reduced proportionally and the attack would still work. However if the
measurement error e is large, the required number of samples will increase in
proportion to $\frac{1}{\mathrm{Var}(t_i)}$.
The Chinese Remainder Theorem (CRT) is also often used to optimize RSA
private key operations. With CRT, (y mod p) and (y mod q) are computed first,
where y is the message. These initial modular reduction steps can be vulnerable
to timing attacks. The simplest such attack is to choose values of y that are
close to p or q, then use timing measurements to determine whether the guessed
value is larger or smaller than the actual value of p (or q). If y is less than
p, computing y mod p has no effect, while if y is larger than p, it is necessary
to subtract p from y at least once. Also, if the message is very slightly larger
than p, y mod p will have leading zero digits, which may reduce the amount of
time required for the first multiplication step. The specific timing characteristics
depend on the implementation. RSAREF's modular reduction function with a
512-bit modulus on the Pentium computer, with y chosen randomly between 0 and
2p, takes an average of 42.1 μs if y < p, as opposed to 73.9 μs if y > p. Timing
measurements from many y could be combined to successively approximate p.
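The successive approximation of p can be pictured as a binary search driven by a timing oracle. The sketch below is idealized: `make_reduction_timer` is a hypothetical, noise-free stand-in for the victim that simply returns the two average times quoted above, whereas a real attack would have to average many noisy measurements for each comparison.

```python
def make_reduction_timer(p, fast_us=42.1, slow_us=73.9):
    """Hypothetical victim: y mod p is slower when a subtraction is needed."""
    def reduction_time(y):
        return slow_us if y >= p else fast_us
    return reduction_time


def approximate_p(reduction_time, bits, threshold_us):
    """Binary search for p using only timing observations.

    Invariant: reduction is fast at lo (lo < p) and slow at hi (hi >= p).
    Assumes 0 < p < 2**bits.
    """
    lo, hi = 0, 1 << bits
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reduction_time(mid) > threshold_us:   # slow, so mid >= p
            hi = mid
        else:                                    # fast, so mid < p
            lo = mid
    return hi


secret_p = 2**31 - 1                             # toy prime for illustration
timer = make_reduction_timer(secret_p)
recovered_p = approximate_p(timer, 32, threshold_us=58.0)
```

With a noise-free oracle, each query fixes one bit of p, so a 32-bit toy modulus falls after 32 chosen values of y.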
In some cases it may be possible to improve the Chinese Remainder Theorem
RSA attack to use known (not chosen) ciphertexts, reducing the number of messages
required and making it possible to attack RSA digital signatures. Modular
reduction is done by subtracting multiples of the modulus, and exploitable timing
variations can be caused by variations in the number of compare-and-subtract
steps. For example, RSAREF's division loop integer-divides the uppermost two
digits of y by one more than the upper digit of p, multiplies p by the quotient,
shifts left the appropriate number of digits, then subtracts the result from y. If
the result is larger than p (shifted left), an extra subtraction is performed. The
decision whether to perform an extra subtraction step in the first loop of the
division algorithm usually depends only on y (which is known) and the upper
two digits of p. A timing attack could be used to determine the upper digits
of p. For example, an exhaustive search over all possible values for the upper
two digits of p (or more efficient techniques) could identify the value for which the
observed times correlate most closely with the expected number of subtraction
operations. As with the Diffie-Hellman/non-CRT attack, once one digit of p has
been found, the timing measurements could be reused to find subsequent digits.
It is not yet known whether timing attacks can be adapted to directly attack
the mod p and mod q modular exponentiations performed with the Chinese
Remainder Theorem.
8 Timing Cryptanalysis of DSS
The Digital Signature Standard[5] computes $s = (k^{-1}(H(m) + x \cdot r)) \bmod q$,
where r and q are known to attackers, $k^{-1}$ is usually precomputed, $H(m)$ is the
hash of the message, and x is the private key. In practice, $(H(m) + x \cdot r) \bmod q$
would normally be computed first, then multiplied by $k^{-1}$ (mod q).
If the modular reduction function runs in non-fixed time, the overall signature
time should be correlated with the time for the ($x \cdot r \bmod q$) computation.
The attacker can calculate and compensate for the time required to compute
$H(m)$. Since $H(m)$ is of approximately the same size as q, its addition has little
effect on the reduction time. The most significant bits of $x \cdot r$ are typically the
first used in the modular reduction. These depend on r, which is known, and
the most significant bits of the secret value x. There would thus be a correlation
between values of the upper bits of x and the total time for the modular
reduction. By looking for the strongest probabilities over the samples, the
attacker would try to identify the upper bits of x. As more upper bits of x become
known, more of $x \cdot r$ becomes known, allowing the attacker to proceed through
more iterations of the modular reduction loop to attack new bits of x. If $k^{-1}$ is
precomputed, DSS signatures require just two modular multiplication operations,
potentially making the amount of additional timing noise which must be
filtered out relatively small.
9 Masking Timing Characteristics
The most obvious way to prevent timing attacks is to make all operations take
exactly the same amount of time. Unfortunately this is often difficult. Making
software run in fixed time, especially in a platform-independent manner, is hard
because compiler optimizations, RAM cache hits, instruction timings, and other
factors can introduce unexpected timing variations. If a timer is used to delay
returning results until a pre-specified time, factors such as the system
responsiveness or power consumption may still change detectably when the operation
finishes. Some operating systems also reveal processes' CPU usage. Fixed time
implementations are also likely to be slow; many performance optimizations
cannot be used since all operations must take as long as the slowest operation.
(Note: Always performing the optional $R_i = (s_i \cdot y) \bmod n$ step does not make
an implementation run in constant time, since timing characteristics from the
squaring operation and subsequent loop iterations can be exploited.)
Another approach is to make timing measurements so inaccurate that the
attack becomes unfeasible. Random delays added to the processing time do
increase the number of ciphertexts required, but attackers can compensate by
collecting more measurements. The number of samples required increases roughly
as the square of the timing noise. For example, if a modular exponentiator whose
timing characteristics have a standard deviation of 10 ms can be broken
successfully with 1000 timing measurements, adding a random normally distributed
delay with 1 second standard deviation will make the attack require approximately
$\left(\frac{1000\ \mathrm{ms}}{10\ \mathrm{ms}}\right)^2 (1000) = 10^7$ samples. (Note: The mean delay would have to
be several seconds to get a standard deviation of 1 second.) While $10^7$ samples
is probably more than most attackers can gather, a security factor of $10^7$ is not
usually considered adequate.
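The square-law scaling in this example can be written as a one-line estimate. The function name is illustrative, and the formula mirrors the text's approximation, in which the added delay dominates the original timing variation.

```python
def samples_with_added_noise(base_samples, base_sd, noise_sd):
    """Approximate samples needed once random delay with noise_sd is added.

    Follows the rough rule that sample count grows as the square of the
    timing noise; base_sd and noise_sd must be in the same units.
    """
    return base_samples * (noise_sd / base_sd) ** 2


# 1000 samples suffice at 10 ms; add a delay with 1000 ms standard deviation.
needed = samples_with_added_noise(1000, base_sd=10, noise_sd=1000)
```

Here `needed` comes to 10^7, the figure given above. Under this rule, cutting the added noise by a factor of ten would cut the attacker's required sample count by a factor of one hundred.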
10 Preventing the Attack
Fortunately there is a better solution. Techniques used for blinding signatures[1]
can be adapted to prevent attackers from knowing the input to the modular
exponentiation function. Before computing the modular exponentiation operation,
choose a random pair ($v_i$, $v_f$) such that $v_f^{-1} = v_i^x \bmod n$. For Diffie-Hellman,
it is simplest to choose a random $v_i$ then compute $v_f = (v_i^{-1})^x \bmod n$. For
RSA it is faster to choose a random $v_f$ relatively prime to n then compute
$v_i = (v_f^{-1})^e \bmod n$, where e is the public exponent. Before the modular
exponentiation operation, the input message should be multiplied by $v_i$ (mod n), and
afterward the result is corrected by multiplying with $v_f$ (mod n). The system
should reject messages equal to 0 (mod n).
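For RSA, the blinding steps can be sketched as follows. This only demonstrates the algebra with a toy key: the function names are hypothetical, Python's built-in `pow` is not a constant-time exponentiator, and `pow(v, -1, n)` for modular inverses needs Python 3.8 or later.

```python
import math
import secrets


def make_rsa_blinding_pair(n, e):
    """Pick v_f coprime to n, then v_i = (v_f^-1)^e mod n."""
    while True:
        v_f = secrets.randbelow(n - 2) + 2       # uniform in 2..n-1
        if math.gcd(v_f, n) == 1:
            break
    v_i = pow(pow(v_f, -1, n), e, n)
    return v_i, v_f


def blinded_private_op(y, d, n, pair):
    """Compute y^d mod n without exponentiating y itself."""
    v_i, v_f = pair
    if y % n == 0:
        raise ValueError("messages equal to 0 (mod n) are rejected")
    blinded = (y * v_i) % n                      # attacker no longer knows this input
    return (pow(blinded, d, n) * v_f) % n        # v_f cancels the blinding factor


# Toy textbook key: n = 61 * 53, e * d = 1 (mod 3120).
n, e, d = 3233, 17, 2753
pair = make_rsa_blinding_pair(n, e)
signature = blinded_private_op(65, d, n, pair)
```

The correction works because $(y \cdot v_i)^d \cdot v_f = y^d \cdot (v_f^{-1})^{ed} \cdot v_f = y^d \pmod n$, so `signature` equals the unblinded result.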
Computing inverses mod n is slow, so it is often not practical to generate a
new random ($v_i$, $v_f$) pair for each new exponentiation. The $v_f = (v_i^{-1})^x \bmod n$
calculation itself might even be subject to timing attacks. However ($v_i$, $v_f$) pairs
should not be reused, since they themselves might be compromised by timing
attacks, leaving the secret exponent vulnerable. An efficient solution to this
problem is to update $v_i$ and $v_f$ before each modular exponentiation step by
computing $v_i' = v_i^2$ and $v_f' = v_f^2$. The total performance cost is small (2 modular
squarings, which can be precomputed, plus 2 modular multiplications). More
sophisticated update operations using exponents other than 2, multiplication
with other ($v_i$, $v_f$) pairs, etc. can also be used, but do not appear to offer any
advantages.
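The squaring update preserves the blinding relation, since $(v_f^2)^{-1} = (v_i^2)^x \bmod n$ whenever $v_f^{-1} = v_i^x \bmod n$. A minimal Diffie-Hellman style sketch follows; the class name `BlindingPair` and the toy prime modulus are assumptions made for illustration, and Python's `pow` is used only to show the algebra, not as a timing-safe primitive.

```python
import secrets


class BlindingPair:
    """Blinding pair (v_i, v_f) that is refreshed by squaring before each use."""

    def __init__(self, x, n):
        # n is assumed prime here so that every v_i in 2..n-1 is invertible.
        self.x, self.n = x, n
        self.v_i = secrets.randbelow(n - 2) + 2
        self.v_f = pow(pow(self.v_i, -1, n), x, n)    # v_f = (v_i^-1)^x mod n

    def update(self):
        """Two modular squarings keep v_f^-1 = v_i^x (mod n) true."""
        self.v_i = (self.v_i * self.v_i) % self.n
        self.v_f = (self.v_f * self.v_f) % self.n

    def exponentiate(self, y):
        """Return y^x mod n using a freshly updated pair."""
        self.update()
        blinded = (y * self.v_i) % self.n             # first extra multiplication
        return (pow(blinded, self.x, self.n) * self.v_f) % self.n   # second one


modulus = 2**61 - 1                                   # toy prime modulus
exponent = 0x123456789ABCDE
blinder = BlindingPair(exponent, modulus)
results = [blinder.exponentiate(y) for y in (2, 3, 12345678901234567)]
```

Each call costs two squarings for the update plus the two multiplications that apply and remove the blinding, matching the cost estimate above.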
If ($v_i$, $v_f$) is secret, attackers have no useful knowledge about the input to
the modular exponentiator. Consequently the most an attacker can learn is the
general timing distribution for exponentiation operations. In practice,
distributions are close to normal and the $2^w$ exponents cannot possibly be distinguished.
However, a maliciously-designed modular exponentiator could theoretically have
a distribution with sharp spikes corresponding to exponent bits, so blinding does
not provably prevent timing attacks.
Even with blinding, the distribution will reveal the average time per
operation, which can be used to infer the Hamming weight of the exponent. If
anonymity is important or if further masking is required, a random multiple of
$\varphi(n)$ can be added to the exponent before each modular exponentiation. If this is
done, care must be taken to ensure that the addition process itself does not have
timing characteristics which reveal $\varphi(n)$. This technique may be helpful in
preventing attacks that gain information leaked during the modular exponentiation
operation due to electromagnetic radiation, system performance fluctuations,
changes in power consumption, etc. since the exponent bits change with each
operation.
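Adding a multiple of $\varphi(n)$ leaves the result unchanged because $y^{\varphi(n)} \equiv 1 \pmod n$ for y relatively prime to n. A small sketch with a toy RSA key follows; the function name and the 32-bit size of the random multiplier are arbitrary choices for illustration, not recommendations.

```python
import secrets


def randomized_exponent(x, phi_n, extra_bits=32):
    """Return x + k * phi(n) for a fresh random k, so exponent bits vary per call."""
    k = secrets.randbits(extra_bits)
    return x + k * phi_n


# Toy textbook key: n = 61 * 53, phi(n) = 60 * 52.
n, phi_n, d = 3233, 3120, 2753
masked_results = [pow(y, randomized_exponent(d, phi_n), n) for y in (2, 65, 1000)]
```

The masked exponent is longer than x, so each operation costs proportionally more, and as noted above the addition itself must not leak $\varphi(n)$.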
11 Further Work
Timing attacks can potentially be used against other cryptosystems, including
symmetric functions. For example, in software the 28-bit C and D values
in the DES[4] key schedule are often rotated using a conditional which tests
whether a one-bit must be wrapped around. The additional time required to
move nonzero bits could slightly degrade the cipher's throughput or key setup
time. The cipher's performance can thus reveal the Hamming weight of the key,
which provides an average of
$\sum_{n=0}^{56} \frac{\binom{56}{n}}{2^{56}} \log_2\left(\frac{2^{56}}{\binom{56}{n}}\right) \approx 3.95$
bits of key information. IDEA[3] uses an f() function with a modulo ($2^{16} + 1$) multiplication
operation, which will usually run in non-constant time. RC5[7] is at risk on
platforms where rotates run in non-constant time. RAM cache hits can produce
timing characteristics in implementations of Blowfish[11], SEAL[9], DES, and
other ciphers if tables in memory are not used identically in every encryption.
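The figure of about 3.95 bits for the DES Hamming weight is the entropy of the weight of a uniformly random 56-bit key, and it can be recomputed directly. The function name below is illustrative; `math.comb` requires Python 3.8 or later.

```python
import math


def hamming_weight_information(bits=56):
    """Average information (in bits) revealed by the Hamming weight of a random key."""
    total_keys = 2 ** bits
    info = 0.0
    for weight in range(bits + 1):
        keys_with_weight = math.comb(bits, weight)
        probability = keys_with_weight / total_keys
        # Learning this weight narrows 2^bits keys down to keys_with_weight.
        info += probability * math.log2(total_keys / keys_with_weight)
    return info


leak_bits = hamming_weight_information(56)
```

The result is small compared with the 56-bit key, which illustrates the earlier remark that intuition would expect timing to reveal only a little information such as the Hamming weight.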
Additional research is needed to determine whether specific implementations
are at risk and, if so, the degree of their vulnerability. So far, only a few specific
systems have been studied in detail and the attacks against CRT/Montgomery
RSA and DSS are currently theoretical.
Further refinements to the attack may also be possible. A direct attack
against p and q in RSA with the Chinese Remainder Theorem would be partic-
ularly important.
12 Conclusions
In general, any channel which can carry information from a secure area to the
outside should be studied as a potential risk. Implementation-specific timing
characteristics provide one such channel and can sometimes be used to
compromise secret keys. Vulnerable algorithms, protocols, and systems need to be
revised to incorporate measures to resist timing cryptanalysis and related
attacks.
13 Acknowledgements
I am grateful to Matt Blaze, Joan Feigenbaum, Martin Hellman, Phil Karn,
Ron Rivest, and Bruce Schneier for their encouragement, helpful comments, and
suggestions for improving the manuscript.
References
1. D. Chaum, "Blind Signatures for Untraceable Payments," Advances in Cryptology:
Proceedings of Crypto 82, Plenum Press, 1983, pp. 199-203.
2. W. Diffie and M.E. Hellman, "New Directions in Cryptography," IEEE
Transactions on Information Theory, IT-22, n. 6, Nov 1976, pp. 644-654.
3. X. Lai, On the Design and Security of Block Ciphers, ETH Series in Information
Processing, v. 1, Konstanz: Hartung-Gorre Verlag, 1992.
4. National Bureau of Standards, "Data Encryption Standard," Federal Information
Processing Standards Publication 46, January 1977.
5. National Institute of Standards and Technology, "Digital Signature Standard,"
Federal Information Processing Standards Publication 186, May 1994.
6. P.L. Montgomery, "Modular Multiplication without Trial Division," Mathematics
of Computation, v. 44, n. 170, 1985, pp. 519-521.
7. R.L. Rivest, "The RC5 Encryption Algorithm," Fast Software Encryption: Second
International Workshop, Leuven, Belgium, December 1994, Proceedings,
Springer-Verlag, 1994, pp. 86-96.
8. R.L. Rivest, A. Shamir, and L.M. Adleman, "A method for obtaining digital
signatures and public-key cryptosystems," Communications of the ACM, 21, 1978,
pp. 120-126.
9. P.R. Rogaway and D. Coppersmith, "A Software-Optimized Encryption
Algorithm," Fast Software Encryption: Cambridge Security Workshop, Cambridge,
U.K., December 1993, Proceedings, Springer-Verlag, 1993, pp. 56-63.
10. RSA Laboratories, "RSAREF: A Cryptographic Toolkit," Version 2.0, 1994,
available via FTP from [Link].
11. B. Schneier, "Description of a New Variable-Length Key, 64-bit Block Cipher
(Blowfish)," Fast Software Encryption: Second International Workshop, Leuven,
Belgium, December 1994, Proceedings, Springer-Verlag, 1994, pp. 191-204.