ISBN 978-3-642-28496-0
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbruecken, Germany
Ali Miri · Serge Vaudenay (Eds.)
Selected Areas
in Cryptography
18th International Workshop, SAC 2011
Toronto, ON, Canada, August 11-12, 2011
Revised Selected Papers
Volume Editors
Ali Miri
Ryerson University
Department of Computer Science
Toronto, ON, Canada
E-mail: [email protected]
Serge Vaudenay
Ecole Polytechnique Fédérale de Lausanne
Lausanne, Switzerland
E-mail: [email protected]
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
We would like to thank the authors of all submitted papers, whether their
submission was accepted or could not be accommodated. Moreover, Program
Committee members and external sub-reviewers put a tremendous amount of
work into the review process and we would like to thank them for their time
and effort. We would also like to acknowledge and thank the work done by the
conference Publicity and Publication Chair, Atefeh Mashatan.
Ryerson University
Toronto, Ontario, Canada
August 11-12, 2011
Organized by
In Cooperation with
Program Committee
Carlisle Adams University of Ottawa, Canada
Mikhail J. Atallah Purdue University, USA
Thomas Baignères CryptoExperts, France
Feng Bao Institute for Infocomm Research, Singapore
Lejla Batina Radboud University Nijmegen,
The Netherlands and K.U. Leuven, Belgium
Alex Biryukov University of Luxembourg, Luxembourg
Ian Blake University of British Columbia, Canada
Anne Canteaut INRIA, France
Christophe Doche Macquarie University, Australia
Orr Dunkelman Weizmann Institute of Science, Israel
Pierre-Alain Fouque Ecole Normale Supérieure, France
Steven Galbraith University of Auckland, New Zealand
Catherine H. Gebotys University of Waterloo, Canada
Guang Gong University of Waterloo, Canada
Anwar Hasan University of Waterloo, Canada
Howard Heys Memorial University, Canada
Thomas Johansson Lund University, Sweden
Antoine Joux University of Versailles, France
Pascal Junod HEIG-VD, Switzerland
Seny Kamara Microsoft Research, USA
Liam Keliher Mount Allison University, Canada
Stefan Lucks Bauhaus Universität Weimar, Germany
Atefeh Mashatan EPFL, Switzerland
Barbara Masucci Università di Salerno, Italy
Mitsuru Matsui Mitsubishi Electric Corporation, Japan
Kanta Matsuura University of Tokyo, Japan
Willi Meier FHNW, Switzerland
Kaisa Nyberg Aalto University, Finland
Thomas Peyrin NTU, Singapore
Vincent Rijmen K.U. Leuven, Belgium, and TU Graz, Austria
Greg Rose Qualcomm, Australia
Rei Safavi-Naini University of Calgary, Canada
Taizo Shirai Sony Corporation, Japan
Doug Stinson University of Waterloo, Canada
Willy Susilo University of Wollongong, Australia
Nicolas Thériault Universidad del Bío-Bío, Chile
Ruizhong Wei Lakehead University, Canada
Michael Wiener Irdeto, Canada
Adam Young MITRE Corp, USA
Amr Youssef Concordia University, Canada
External Reviewers
Rodrigo Abarzúa
Zahra Aghazadeh
Hadi Ahmadi
Kazumaro Aoki
Roberto Avanzi
Masoud Barati
Aslı Bay
Murat Cenk
Sherman S.M. Chow
Stelvio Cimato
Paolo D'Arco
Vanesa Daza
Giancarlo De Maio
Junfeng Fan
Xinxin Fan
Anna Lisa Ferrara
Matthieu Finiasz
Ewan Fleischmann
Christian Forler
Clemente Galdi
Benoît Gérard
Michael Gorski
Robert Granger
Matthew Green
Johann Groszschaedl
Jian Guo
Jason Hinek
Man Ho Au
Honggang Hu
Xinyi Huang
Sebastiaan Indesteege
Takanori Isobe
Kimmo Järvinen
Jeremy Jean
Marcio Juliato
Aleksandar Kircanski
Simon Knellwolf
Miroslav Knezevic
Gaëtan Leurent
Julio López
Alexander May
Carlos Moreno
Shiho Moriai
James Muir
Ashkan Namin
Maria Naya-Plasencia
Christophe Negre
Kenji Ohkuma
Roger Oyono
Pascal Paillier
Chris Peikert
Mohammad Reza Reyhanitabar
Arnab Roy
Sumanta Sarkar
Pouyan Sepehrdad
Kyoji Shibutani
Claudio Soriente
Martijn Stam
Petr Sušil
Tomoyasu Suzaki
Ashraful Tuhin
Jalaj Upadhyay
Yongge Wang
Gaven Watson
Ralf-Philipp Weinmann
Yanjiang Yang
Jingwei Zhang
Chang-An Zhao
Sponsoring Institutions
Faculty of Engineering, Architecture, and Science, Ryerson University
Department of Computer Science, Ryerson University
Fields Institute
Certicom
Table of Contents
Security in Clouds
On CCA-Secure Somewhat Homomorphic Encryption . . . . . . . . . . . . . . . . 55
Jake Loftus, Alexander May, Nigel P. Smart, and
Frederik Vercauteren
Invited Paper I
Group Law Computations on Jacobians of Hyperelliptic Curves . . . . . . . . 92
Craig Costello and Kristin Lauter
Cryptanalysis of Ciphers I
Combined Differential and Linear Cryptanalysis of Reduced-Round
PRINTcipher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Ferhat Karakoç, Hüseyin Demirci, and A. Emre Harmancı
Cryptanalysis of Ciphers II
Some Instant- and Practical-Time Related-Key Attacks on
KTANTAN32/48/64 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Martin Ågren
Cipher Implementation
Very Compact Hardware Implementations of the Blockcipher
CLEFIA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Toru Akishita and Harunaga Hiwatari
Invited Paper II
Another Look at Tightness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
Sanjit Chatterjee, Alfred Menezes, and Palash Sarkar
New Designs
Duplexing the Sponge: Single-Pass Authenticated Encryption and
Other Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Guido Bertoni, Joan Daemen, Michaël Peeters, and
Gilles Van Assche
Boomerang Distinguishers on MD4-Family
Yu Sasaki
1 Introduction
Hash functions play important roles in many aspects of cryptography.
After the breakthrough by Wang et al. [26,27] and through the SHA-3 competition [20],
the cryptanalysis of hash functions has improved significantly.
The boomerang attack, proposed by Wagner [22], is a tool for the
cryptanalysis of block ciphers. At FSE 2011, Biryukov et al. applied the
boomerang attack to hash functions and showed that a zero-sum distinguisher
could be constructed on them [3], where a zero-sum is a set of messages whose
XOR is 0 and whose corresponding outputs also XOR to 0. Lamberger and
Mendel independently applied the boomerang attack to SHA-2 and obtained
a significant improvement on the 4-sum distinguisher against its reduced-step
compression function [10], where a k-sum is a set of k paired initial values and
messages such that the XOR of their outputs is 0. The boomerang
attack is potentially very powerful against hash functions, and thus more investigation
is required to understand its impact. Note that at CRYPTO 2007,
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 1–18, 2012.
© Springer-Verlag Berlin Heidelberg 2012
Joux and Peyrin proposed an (amplified) boomerang attack for SHA-1 [7]. They
used the idea of the boomerang attack as a message modification technique
in a collision attack, which is a purpose different from ours.
The boomerang attack on hash functions does not always address the security
of the hash function itself. As in [10], it often addresses the security of the
compression function or the internal block cipher. Although such results do not
immediately affect the security of the hash function, these analyses are useful from
several viewpoints: 1) the progress of cryptanalysis, in other words the
security margin, can be measured; 2) the attack could be used as a tool for
different purposes in the future, e.g., a pseudo-collision attack on MD5 [4];
3) an attack on a building block may invalidate the security proof of the hash
function. Specifically, hash functions using the PGV modes tend to have a
security reduction that assumes the ideal behavior of the internal block cipher.
MD4, proposed by Rivest in 1990 [13], is a hash function that serves as
the basis of various other hash functions. MD4 has an interesting property in its
message expansion, which can be sketched as follows:
– Divide an input message block M into several message words m_0, m_1, ..., m_{NS−1}.
– Iteratively apply a round function NR times, where the round function consists of NS steps.
– In the NS steps of each round, each of m_0 to m_{NS−1} is used exactly once.
– The order of the message words, in other words the permutation of the
message-word indices, may change from round to round.
We call this type of message expansion a message-word permutation. MD4,
MD5 [14], and HAVAL [32] are examples that use a message-word permutation.
MD4, MD5, and HAVAL are now known to be vulnerable to various
attacks. For example, Van Rompay et al. found collisions of 3-pass HAVAL in
2003 [21], and Wang et al. found collisions of MD4, MD5, and 3-pass HAVAL in
2004 [25,27]. The complexity of the collision attacks was later optimized to 2 for MD4
[18], 2^10 for MD5 [29], 2^7 for 3-pass HAVAL [19,24], 2^36 for 4-pass HAVAL
[28,31], and 2^123 for 5-pass HAVAL [31], where the unit of complexity is one
computation of the compression function. Note that only a theoretical result
is known for 5-pass HAVAL; real collisions have not been found yet.
Theoretical preimage attacks have also been presented, e.g., [1,6,11] for
MD4, [17] for MD5, [2,16] for 3-pass HAVAL, and [16] for 4-pass HAVAL. For
5-pass HAVAL, only an attack on 158 steps out of 160 is known [15].
Several researchers have evaluated the security of the building blocks of these hash
functions. Examples that analyzed the full number of steps are [4,5] for MD5 and [8,9,30] for
HAVAL. Among them, the work by Kim et al. [8,9], which applied the boomerang
attack to distinguish the encryption modes of these functions from a random permutation in
the related-key setting, is very powerful. They successfully distinguished the
encryption modes with 2^6 queries for MD4, 2^11.6 queries for MD5, and 2^9.6
queries for 4-pass HAVAL. These attacks were implemented, and an example of
a boomerang quartet was presented for MD5. In addition, Kim et al. claimed
that 5-pass HAVAL could also be distinguished with 2^61 queries and the attack
Boomerang Distinguishers on MD4-Family
Our Contributions
In this paper, we study the boomerang attack approach on MD4-based hash
functions. We use the differential path for the boomerang attack to construct
a 4-sum distinguisher on the compression function, whereas Kim et al. [9] used the
boomerang path to distinguish its encryption mode from a random permutation.
For both our approach and that of [9], the core of the attack is the existence
of a differential path suitable for the boomerang attack. However, because the
attack scenarios differ, the procedures to optimize the attacks are quite different.
We first collect various techniques for the boomerang attack on hash functions
from several papers (mainly [3,9,10]) and summarize the attack framework.
We then revisit the differential path for the boomerang attack against
5-pass HAVAL in [9]. Contrary to the authors' claim, we prove that the
differential path in [9] contains a critical flaw and thus the attack cannot work.
We then search for new differential paths for the boomerang attack and construct
an attack procedure optimized for attacking the compression function.
Finally, using the new paths, we mount a distinguisher on the full compression
function of 5-pass HAVAL that generates a 4-sum quartet with a
complexity of 2^11 compression function computations. The attack complexity is
summarized in Table 1. As far as we know, this is the first result on the full
5-pass HAVAL that can be computed in practice. The attack is implemented on
a PC and we present a generated 4-sum quartet.
Note that as long as a good boomerang differential path is available, 4-sum
distinguishers can be constructed on the compression function. With the
differential paths in [9], we thus attack MD5, 3-pass HAVAL, and 4-pass HAVAL with
complexities of 2^10, 2^4, and 2^11 compression function computations, respectively.
We present generated 4-sums in Appendix B.
Paper Outline
We describe the specification of HAVAL and clarify the terminology in Sect. 2.
We summarize previous work in Sect. 3 and give a summary of techniques for
the boomerang attack on hash functions in Sect. 4. We demonstrate a dedicated
attack on 5-pass HAVAL in Sect. 5. Finally, we conclude this paper in Sect. 6.
Table 3. Message-words permutation. The first column shows the pass and round numbers; each row lists, under the input positions x6 x5 x4 x3 x2 x1 x0, the variable actually used.

        x6 x5 x4 x3 x2 x1 x0
φ3,1:   x1 x0 x3 x5 x6 x2 x4
φ3,2:   x4 x2 x1 x0 x5 x3 x6
φ3,3:   x6 x1 x2 x3 x4 x5 x0
φ4,1:   x2 x6 x1 x4 x5 x3 x0
φ4,2:   x3 x5 x2 x0 x1 x6 x4
φ4,3:   x1 x4 x3 x6 x0 x2 x5
φ4,4:   x6 x4 x0 x5 x2 x1 x3
φ5,1:   x3 x4 x1 x0 x5 x2 x6
φ5,2:   x6 x2 x1 x0 x3 x4 x5
φ5,3:   x2 x6 x0 x4 x3 x1 x5
φ5,4:   x1 x5 x3 x2 x0 x4 x6
φ5,5:   x2 x5 x0 x6 x4 x3 x1
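As a sanity check on the transcription, the orders of Table 3 can be written down directly and verified to be permutations of the seven inputs. The dictionary below is a transcription of the table, keyed by (pass, round):

```python
# Word orders of Table 3: entry (p, r) lists the indices i of the
# variables x_i, read left to right under the header x6 x5 x4 x3 x2 x1 x0.
PHI_ORDERS = {
    (3, 1): [1, 0, 3, 5, 6, 2, 4], (3, 2): [4, 2, 1, 0, 5, 3, 6],
    (3, 3): [6, 1, 2, 3, 4, 5, 0],
    (4, 1): [2, 6, 1, 4, 5, 3, 0], (4, 2): [3, 5, 2, 0, 1, 6, 4],
    (4, 3): [1, 4, 3, 6, 0, 2, 5], (4, 4): [6, 4, 0, 5, 2, 1, 3],
    (5, 1): [3, 4, 1, 0, 5, 2, 6], (5, 2): [6, 2, 1, 0, 3, 4, 5],
    (5, 3): [2, 6, 0, 4, 3, 1, 5], (5, 4): [1, 5, 3, 2, 0, 4, 6],
    (5, 5): [2, 5, 0, 6, 4, 3, 1],
}

# every row is a permutation of the seven inputs x_0..x_6
for key, order in PHI_ORDERS.items():
    assert sorted(order) == list(range(7)), key
```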
2 Preliminaries
2.1 Specification of HAVAL
HAVAL [32] uses a narrow-pipe Merkle-Damgård structure. An input message
M is padded to a multiple of the block size (1024 bits) and then divided
into message blocks (M_0, M_1, ..., M_{L−1}). The chaining variable H_i, starting
from the pre-specified initial value H_0, is iteratively updated by the compression
function CF: H_{i+1} ← CF(H_i, M_i), for i = 0, 1, ..., L−1. Finally, H_L is the
hash value of M. HAVAL can produce hash values of smaller sizes by using an
output tailoring function. Because our attack target is the compression function,
we omit the description of the padding and the output tailoring function.
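The narrow-pipe iteration described above can be sketched as follows. The compression function here is a stand-in built from SHA-256 for illustration, not HAVAL's CF, and the padding rule is simplified:

```python
import hashlib

def merkle_damgard(message: bytes, cf, iv: bytes, block_size: int = 128) -> bytes:
    """Narrow-pipe Merkle-Damgard: H_{i+1} = CF(H_i, M_i), output H_L.

    `cf` is a placeholder compression function; the padding below
    (a 1 bit, then zeros) is simplified and is NOT HAVAL's exact rule.
    block_size = 128 bytes matches HAVAL's 1024-bit message blocks."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % block_size)
    h = iv
    for i in range(0, len(padded), block_size):
        h = cf(h, padded[i:i + block_size])   # H_{i+1} <- CF(H_i, M_i)
    return h

# stand-in CF with a 256-bit chaining value, as in HAVAL
toy_cf = lambda h, m: hashlib.sha256(h + m).digest()
h_out = merkle_damgard(b"some message", toy_cf, iv=b"\x00" * 32)
```

The point of the sketch is only the iteration structure: the whole output is carried through the single 256-bit chaining value, which is why an attack on CF is of direct interest.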
The size of the chaining variable is 256 bits. Inside the compression function,
M_i is divided into thirty-two 32-bit message words (m_0, m_1, ..., m_31). Three
algorithms are specified for HAVAL: 3-pass, 4-pass, and 5-pass, with 3 rounds
(96 steps), 4 rounds (128 steps), and 5 rounds (160 steps), respectively.
Let us denote the 256-bit state before step j by p_j, and write p_j as eight 32-bit
variables Q_{j−7} Q_{j−6} Q_{j−5} Q_{j−4} Q_{j−3} Q_{j−2} Q_{j−1} Q_j. The step function R_j
computes Q_{j+1} as follows:
3 Related Work
The boomerang attack was proposed by Wagner [22] as a tool for attacking
block-ciphers. The attack is a chosen-plaintext and adaptively chosen-ciphertext
attack. It can be regarded as a type of the second-order differential attack. In
this attack, the attacker divides the target cipher E into two parts E1 and E2
such that E(·) = E2 ◦ E1 (·). Let us denote the differential for E1 by Δ → Δ∗
and for E2 by ∇∗ → ∇. The differences Δ, Δ∗ , ∇∗ , and ∇ are chosen by the
attacker at offline. The attack procedure is as follows;
In 2011, Biryukov et al. pointed out that a zero-sum distinguisher can be
constructed by applying the boomerang attack to hash functions [3]. In the
boomerang attack, P^1 ⊕ P^2 = Δ and P^3 ⊕ P^4 = Δ. Therefore, P^1 ⊕ P^2 ⊕ P^3 ⊕ P^4 =
Δ ⊕ Δ = 0. Similarly, C^1 ⊕ C^2 ⊕ C^3 ⊕ C^4 = ∇ ⊕ ∇ = 0. Hence, by starting
[Figure: schematic of the boomerang differential paths over the five rounds (message words m_a and m_b, differences Δ and ∇, the inside and outside paths), and of the message modification around round j acting on the state quartet p^1_j, p^2_j, p^3_j, p^4_j.]
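The cancellation used by the zero-sum distinguisher above, P^1 ⊕ P^2 ⊕ P^3 ⊕ P^4 = Δ ⊕ Δ = 0, can be checked directly on random values:

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Build a quartet with P1 ^ P2 = P3 ^ P4 = D; the four plaintexts then
# XOR to zero regardless of the values of P1, P3 and D. The same
# identity holds on the ciphertext side with the difference nabla.
P1, P3, D = os.urandom(8), os.urandom(8), os.urandom(8)
P2, P4 = xor(P1, D), xor(P3, D)
four_sum = xor(xor(P1, P2), xor(P3, P4))
assert four_sum == bytes(8)   # zero-sum on the input side
```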
For E1, we search for a message word that appears in an early step of the
first round and in a late step of the second round. The message difference
is then propagated to the beginning and end of E1. The same strategy is applied
to E2. Because the differential paths for both E1 and E2 are short, they are
satisfied with high probability even without the message modification technique.
For NR = 5, we extend the differential paths of the 4-round attack by half
a round more. As shown in Fig. 1, the paths become longer and hard to satisfy by
a naive search. Wang et al. showed that the differential path for one round
can be satisfied for free with the message modification technique [26,27]. Hence,
with these techniques, 5 rounds can be attacked. In this paper, we call the
differential path between the end of round 2 and the beginning of round 4 the
inside path, and the differential paths in rounds 1 and 5 the outside paths.
1: Choose the value of p^1_{j+1} to satisfy the conditions (∇-conditions) on p^1_{j+1}. Then,
compute m^1_{π(j)} by solving R_j.
2: Compute m^2_{π(j)}, m^3_{π(j)}, and m^4_{π(j)} with the specified differences Δ_M and ∇_M.
3: Compute p^2_{j+1} and check whether all conditions (∇-conditions) on p^2_{j+1} are satisfied. If
so, compute p^3_{j+1} and p^4_{j+1}. If not, repeat the procedure with a different p^1_{j+1}.
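The three steps above amount to rejection sampling on p^1_{j+1}. A skeletal sketch follows; all three callables are hypothetical placeholders for the HAVAL-specific computations (the real R_j is not implemented here):

```python
import os

def sample_step(solve_step, apply_diffs, check_conditions, state_bytes=32):
    """Steps 1-3 above: sample p1 until the nabla-conditions also hold.

    solve_step(p1)              -> m1 (solve R_j for the message word)
    apply_diffs(m1)             -> (m2, m3, m4) (apply Delta_M, nabla_M)
    check_conditions(p1, m1, m2)-> True iff the nabla-conditions hold
    All three are placeholder callables, not HAVAL's actual functions."""
    while True:
        p1 = os.urandom(state_bytes)          # step 1: candidate state
        m1 = solve_step(p1)
        m2, m3, m4 = apply_diffs(m1)          # step 2
        if check_conditions(p1, m1, m2):      # step 3: keep or retry
            return p1, (m1, m2, m3, m4)

# trivial placeholders: accept when the first byte of p1 is even,
# i.e. the loop succeeds after an expected two tries
result = sample_step(lambda p: p[:4],
                     lambda m: (m, m, m),
                     lambda p, a, b: p[0] % 2 == 0)
```

The structure makes explicit why the cost of this phase is governed by the number of ∇-conditions on p^2_{j+1}: each extra bit condition halves the acceptance probability of the loop.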
Table 4. Differential path and conditions for 5-pass HAVAL [9]. e_z denotes that
only the z-th bit has a difference.
Φj (x6 , x5 , x4 , x3 , x2 , x1 , x0 ) = x1 x2 x3 ⊕ x1 x4 ⊕ x2 x5 ⊕ x3 x6 ⊕ x0 x3 ⊕ x0 .
Conditions to achieve the path were not explained in [9]. We first derive the
necessary and sufficient conditions to achieve the path, shown in Table 4.
Proof. For steps 70, 75, and 76, we have the conditions Q69 = 0, Q71 = 1, and
Q73 = 0. Then the left-hand side of the condition for step 74 becomes 0 ⊕ 1 ⊕
(Q72 · 0) = 1, which contradicts the condition that this value must be 0. □
2: Apply the backward computation in Alg. 1 to obtain p^1_65, p^2_65, p^3_65 and p^4_65. This
fixes the chaining variables up to Q^x_58 and the message words from m_{π(79)} to m_{π(65)}.
3: Apply the forward computation in Alg. 2 to obtain p^1_93, p^2_93, p^3_93 and p^4_93. This fixes
the chaining variables up to Q^x_93 and the message words from m_{π(80)} to m_{π(92)}.
//End of the message modification for the inside path
4: while a 4-sum quartet of the compression function output is not found do
5:   Randomly choose the values of the message-word quartet for m_{π(93)} = m_11,
m_{π(94)} = m_5, and m_{π(95)} = m_2 with the message difference on m_2, and compute
the chaining-variable quartet up to p^1_98, p^2_98, p^3_98 and p^4_98.
6:   Randomly choose the values of the message-word quartet for m_{π(64)} = m_19, and
compute the chaining-variable quartet back to p^1_60, p^2_60, p^3_60 and p^4_60.
7:   Compute the chaining-variable quartet up to p^1_0, p^2_0, p^3_0 and p^4_0 in the backward
direction and p^1_160, p^2_160, p^3_160 and p^4_160 in the forward direction.
8:   if (p^1_0 + p^1_160) ⊕ (p^2_0 + p^2_160) ⊕ (p^3_0 + p^3_160) ⊕ (p^4_0 + p^4_160) = 0 then
9:     return (p^1_0, p^2_0, p^3_0, p^4_0) and (M^1, M^2, M^3, M^4)
10:  end if
11: end while
Table 6. New differential paths and conditions for 5-pass HAVAL. [z] = 0 and [z] = 1 are
conditions on the value of the z-th bit of the chaining variable. For the first and last several
steps, we do not fix a particular difference, in order to use the amplified probability. Differences
are measured with respect to XOR. In some cases, we need conditions on the sign of the
difference: [z] = 0+ (resp. [z] = 1−) means the value is first fixed to 0 (resp. 1) and changes
to 1 (resp. 0) after the difference is inserted.
Path for E1 with Δm_2 = 0x80000000 (left columns) | Path for E2 with ∇m_4 = 0x80000000 (right columns)
j | ΔQ_j | conditions on Q_j | Δm || j | ∇Q_j | conditions on Q_j | ∇m | m_{π(j)}
-7 AP AP -7 m1
-6 AP AP -6 m0
-5 AP AP 0x80000000 -5 m2
-4 -4 m3
··· ··· ··· ··· ··· ··· ··· ··· ···
52 52 m13
53 0x80000000 53 m2
54 54 m25
55 55 m31
56 56 m27
57 57 m19
58 [31]=0 58 m9
59 [31]=0 59 m4
60 [31]=0 60 m20
61 0x80000000 61 m28
62 [31]=0 62 m17
63 [31]=0 63 m8
64 [31,24]=0 64 m22
65 [24]=0 65 m29
66 [24,20]=0 66 m14
67 0x01000000 [20]=0 67 m25
68 [24,20]=0 68 m12
69 0x00100000 [24]=0 69 m24
70 [24,20,17]=0 70 m30
71 [20,17]=0 71 m16
72 [24,20,17]=0 72 m26
73 0x00020000 [17]=1− 73
74 [20,17]=0 74 0x00000001 [0]=1−
75 [17,9]=0 75 [18]=0
76 [17,9]=0 76 [18]=0 start
77 0x00000200 [9]=0+ 77 [18,0]=0 step
78 [17,10,9]=0 78 0x00040000 [21]=0,[18]=0+
79 0x00000400 [10]=1− 79 [21]=0,[18]=1
80 80 [21,18]=0
81 81 [21,18,14]=0 m31
82 82 0x00200000 [14]=0 m15
83 83 [21]=1,[14]=0 m7
84 84 0x00004000 [21]=0 m3
85 85 [21]=0,[14]=1 m1
86 86 [14]=0 m0
87 87 [14,10]=0 m18
88 88 [10]=0 m27
89 89 [10]=0 m13
90 90 0x00000400 m6
91 91 [10]=1 m21
92 92 [10]=1 m10
93 93 [10]=1 m23
94 94 m11
95 95 m5
96 96 m2
97 97 m24
98 98 0x80000000 m4
99 99 m0
··· ··· ··· ··· ··· ··· ··· ··· ···
156 156 m22
157 157 0x80000000 AP 0x80000000 m4
158 158 AP AP m1
159 159 AP AP m25
160 160 AP AP m15
Table 8. An example of the boomerang quartet for the full 5-pass HAVAL
6 Concluding Remarks
References
1. Aoki, K., Sasaki, Y.: Preimage Attacks on One-Block MD4, 63-Step MD5 and
More. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381,
pp. 103–119. Springer, Heidelberg (2009)
2. Aumasson, J.-P., Meier, W., Mendel, F.: Preimage Attacks on 3-Pass HAVAL and
Step-Reduced MD5. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS,
vol. 5381, pp. 120–135. Springer, Heidelberg (2009)
3. Biryukov, A., Nikolić, I., Roy, A.: Boomerang Attacks on BLAKE-32. In: Joux, A.
(ed.) FSE 2011. LNCS, vol. 6733, pp. 218–237. Springer, Heidelberg (2011)
4. den Boer, B., Bosselaers, A.: Collisions for the Compression Function of MD-5. In:
Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 293–304. Springer,
Heidelberg (1994)
5. Dobbertin, H.: The Status of MD5 after a Recent Attack. CryptoBytes The tech-
nical newsletter of RSA Laboratories, a division of RSA Data Security, Inc. 2(2)
(Summer 1996)
6. Guo, J., Ling, S., Rechberger, C., Wang, H.: Advanced Meet-in-the-Middle Preim-
age Attacks: First Results on Full Tiger, and Improved Results on MD4 and SHA-2.
In: Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 56–75. Springer, Hei-
delberg (2010)
7. Joux, A., Peyrin, T.: Hash Functions and the (Amplified) Boomerang Attack. In:
Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 244–263. Springer, Heidel-
berg (2007)
8. Kim, J.-S., Biryukov, A., Preneel, B., Lee, S.-J.: On the Security of Encryption
Modes of MD4, MD5 and HAVAL. In: Qing, S., Mao, W., López, J., Wang, G.
(eds.) ICICS 2005. LNCS, vol. 3783, pp. 147–158. Springer, Heidelberg (2005)
9. Kim, J., Biryukov, A., Preneel, B., Lee, S.: On the Security of Encryption Modes
of MD4, MD5 and HAVAL. Cryptology ePrint Archive, Report 2005/327 (2005);
In: Qing, S., Mao, W., López, J., Wang, G. (eds.) ICICS 2005. LNCS, vol. 3783,
pp. 147–158. Springer, Heidelberg (2005)
10. Lamberger, M., Mendel, F.: Higher-Order Differential Attack on Reduced SHA-
256. Cryptology ePrint Archive, Report 2011/037 (2011),
http://eprint.iacr.org/2011/037
11. Leurent, G.: MD4 is Not One-Way. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086,
pp. 412–428. Springer, Heidelberg (2008)
12. Murphy, S.: The Return of the Cryptographic Boomerang. IEEE Transactions on
Information Theory 57(4), 2517–2521 (2011)
13. Rivest, R.L.: The MD4 Message Digest Algorithm. In: Menezes, A., Vanstone, S.A.
(eds.) CRYPTO 1990. LNCS, vol. 537, pp. 303–311. Springer, Heidelberg (1991),
also appeared in RFC 1320 http://www.ietf.org/rfc/rfc1320.txt
14. Rivest, R.L.: Request for Comments 1321: The MD5 Message Digest Algorithm.
The Internet Engineering Task Force (1992)
15. Sakai, Y., Sasaki, Y., Wang, L., Ohta, K., Sakiyama, K.: Preimage Attacks on
5-Pass HAVAL Reduced to 158 Steps and One-Block 3-Pass HAVAL. Industrial
Track of ACNS 2011 (2011)
16. Sasaki, Y., Aoki, K.: Preimage Attacks on 3, 4, and 5-Pass HAVAL. In: Pieprzyk,
J.P. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 253–271. Springer, Heidelberg
(2008)
17. Sasaki, Y., Aoki, K.: Finding Preimages in Full MD5 Faster than Exhaustive
Search. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 134–152.
Springer, Heidelberg (2009)
18. Sasaki, Y., Wang, L., Ohta, K., Kunihiro, N.: New Message Difference for MD4. In:
Biryukov, A. (ed.) FSE 2007. LNCS, vol. 4593, pp. 329–348. Springer, Heidelberg
(2007)
19. Suzuki, K., Kurosawa, K.: How to Find Many Collisions of 3-Pass HAVAL. In:
Miyaji, A., Kikuchi, H., Rannenberg, K. (eds.) IWSEC 2007. LNCS, vol. 4752, pp.
428–443. Springer, Heidelberg (2007)
20. U.S. Department of Commerce, National Institute of Standards and Technology:
Federal Register Vol. 72, No. 212/Friday, November 2, 2007/Notices (2007)
21. Van Rompay, B., Biryukov, A., Preneel, B., Vandewalle, J.: Cryptanalysis of 3-Pass
HAVAL. In: Laih, C.-S. (ed.) ASIACRYPT 2003. LNCS, vol. 2894, pp. 228–245.
Springer, Heidelberg (2003)
22. Wagner, D.: The Boomerang Attack. In: Knudsen, L.R. (ed.) FSE 1999. LNCS,
vol. 1636, pp. 156–170. Springer, Heidelberg (1999)
23. Wagner, D.: A Generalized Birthday Problem. In: Yung, M. (ed.) CRYPTO 2002.
LNCS, vol. 2442, pp. 288–303. Springer, Heidelberg (2002)
24. Wang, X., Feng, D., Yu, X.: An Attack on Hash Function HAVAL-128. Science in
China (Information Sciences) 48(5), 545–556 (2005)
25. Wang, X., Lai, X., Feng, D., Chen, H., Yu, X.: Cryptanalysis of the Hash Functions
MD4 and RIPEMD. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494,
pp. 1–18. Springer, Heidelberg (2005)
26. Wang, X., Yin, Y.L., Yu, H.: Finding Collisions in the Full SHA-1. In: Shoup, V.
(ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 17–36. Springer, Heidelberg (2005)
27. Wang, X., Yu, H.: How to Break MD5 and Other Hash Functions. In: Cramer, R.
(ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 19–35. Springer, Heidelberg (2005)
28. Wang, Z., Zhang, H., Qin, Z., Meng, Q.: Cryptanalysis of 4-Pass HAVAL. Cryptology
ePrint Archive, Report 2006/161 (2006)
29. Xie, T., Liu, F., Feng, D.: Could the 1-MSB Input Difference be the Fastest Colli-
sion Attack for MD5? Cryptology ePrint Archive, Report 2008/391 (2008)
30. Yoshida, H., Biryukov, A., De Cannière, C., Lano, J., Preneel, B.: Non-Randomness
of the Full 4 and 5-Pass HAVAL. In: Blundo, C., Cimato, S. (eds.) SCN 2004. LNCS,
vol. 3352, pp. 324–336. Springer, Heidelberg (2005)
31. Yu, H., Wang, X., Yun, A., Park, S.: Cryptanalysis of the Full HAVAL with 4
and 5 Passes. In: Robshaw, M.J.B. (ed.) FSE 2006. LNCS, vol. 4047, pp. 89–110.
Springer, Heidelberg (2006)
32. Zheng, Y., Pieprzyk, J., Seberry, J.: HAVAL — One-Way Hashing Algorithm with
Variable Length of Output. In: Zheng, Y., Seberry, J. (eds.) AUSCRYPT 1992.
LNCS, vol. 718, pp. 83–104. Springer, Heidelberg (1993)
The differential paths in [9] can be used to construct a 4-sum on the compression
function. We show the generated 4-sums for MD5, 3-pass HAVAL, and 4-pass
HAVAL. The amplified probability of satisfying the entire path is approximately
2^-8 for MD5, 2^-2 for 3-pass HAVAL, and 2^-9 for 4-pass HAVAL.
Improved Analysis of ECHO-256
1 Introduction
ECHO-256 [1] is the 256-bit version of one of the second-round candidates of
the SHA-3 competition. It is an AES-based hash function that has been the
subject of many studies. Currently, the best known analyses of ECHO-256 are
a distinguisher on the full 8-round internal permutation proposed in [13] and
improved in [10]. Furthermore, a 4-round collision attack on the compression
function has been presented in [4]. A previous analysis due to Schläffer [14]
has been shown to be incorrect in [4], but it introduced an alternative description
of the ECHO round function, which has since been reused in several analyses,
including this paper. The best results of this paper are a collision attack on
the hash function reduced to 5 rounds and a distinguisher of the compression
function on 7 rounds. Additionally, we cover two more attacks in the Appendix.
The complexities of previous results and of our proposed attacks are reported in
Table 1.
Apart from the improved attacks on ECHO-256, this paper also covers a num-
ber of new techniques. The merging process of multiple inbound phases has been
improved to find solutions also for the hash function, where much less freedom
is available in the chaining input.
This work was supported in part by ANR/SAPHIR II, by the French DGA, by the NCCR-MICS under grant number 5005-67322, by the ECRYPT II contract ICT-2007-216646, by the Austrian FWF project P21936, and by the IAP Programme P6/26 BCRYPT of the Belgian State.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 19–36, 2012.
© Springer-Verlag Berlin Heidelberg 2012
J. Jean, M. Naya-Plasencia, and M. Schläffer
For the hash function collision attack on 5 rounds, we use subspace differences which collide with a high probability at the
output of the hash function. Additionally, we use multiple phases also in the
outbound part to reduce the overall complexity of the attacks. For the 7-round
compression function distinguisher, we use the new techniques and algorithms
introduced in [10, 11].
2 ECHO-256 Description
Fig. 1. One round of the ECHO permutation (BigSB, BigSR, BigMC). Each of the 16 cells is an AES state.
Since we are not using any property relying on symmetry, and adding constants does not change
differences, we omit these steps in the following.
Two versions of the hash function ECHO were submitted to the SHA-3
contest: ECHO-256 and ECHO-512. They share the same state size and round
function, but inject messages of 1536 and 1024 bits, respectively, into the compression
function. The message is padded by adding a single 1 bit followed
by zeros to fill up the last message block; the last 18 bytes of the last message
block always contain the 2-byte hash output size, followed by the 16-byte message
length. Focusing on ECHO-256 and denoting by f its compression function, by H_i
the i-th output chaining value, by M_i = M_i^0 || M_i^1 || M_i^2 the i-th message block
composed of three 512-bit chunks M_i^j, and by S = [C0 C1 C2 C3] the four
512-bit ECHO columns constituting a state S, we have (H_0 = IV):
AES. We recall that one round (among the ten) of the AES-128 permutation
is the succession of four transformations: SubBytes (SB), ShiftRows (SR),
MixColumns (MC), and AddRoundKey (AK). We refer to the original publication
[15] for further details.
Reordering these operations does not change the computational result of ECHO, and similar
alternative descriptions have already been used in the analysis of AES. Hence,
one round of ECHO results in the two transformations SuperSBox (SB-MC-SB)
and SuperMixColumns (MC-BMC), which are separated only by byte-shuffling
operations. The SuperSBox was first analyzed by Daemen and Rijmen in [2] to
study two rounds of AES and has been independently used by Lamberger et al.
in [6] and by Gilbert and Peyrin in [12] to analyze AES-based hash functions.
SuperMixColumns was first introduced by Schläffer in [14] and reused in [4].
We refer to those articles for further details as well.
Improved Analysis of ECHO-256
Fig. 2. The truncated differential path to get a collision for 5 rounds of ECHO-256. Black bytes are active, blue and cyan bytes are determined by the chaining input and padding, red bytes are values computed in the red inbound phase, yellow bytes in the yellow inbound phase.
Analyzing the resulting matrix Mcomb for all four active column-slices shows that in each case, the rank of Mcomb is two, and not four. This reduces the dimension of the vector space in each active column-slice from 32 to 16. Since we have four active columns, the total dimension of the vector space at the output of the hash function is 64. Furthermore, column i ∈ {0, 1, 2, 3} of the output hash value depends only on columns i, i + 4, i + 8 and i + 12 of state S38. It follows that the output difference in the first column i = 0 of the output hash value depends only on the four active differences in columns 0, 4, 8 and 12 of state S38, which we denote by a, b, c and d. To get a collision in the first column of the hash function output, the following linear system of equations must hold:
Mcomb · [a b c d]^T = [0 0 0 0 0 0 0 0]^T.
Since we cannot control the differences a, b, c and d in the following attack, we need to find a solution for this system of equations by brute force. However, the brute-force complexity is less than expected due to the reduced rank of the given matrix. Since the rank is two, 2^16 solutions exist and a random difference results in a collision with a probability of 2^-16 instead of 2^-32 for the first output column. Since the rank of all four output column matrices is two, we get a collision at the output of the hash function with a probability of 2^(-16·4) = 2^-64 for the given truncated differential path.
First Inbound between S16 and S24 . We first search for internal state
pairs conforming to the truncated differential path in round 3 (yellow and black
bytes). We start the attack by choosing differences for the active bytes in state
S16 such that the truncated differential path of SuperMixColumns between state
S14 and S16 is fulfilled (Section 2.1). We compute this difference forward to state
S17 through the linear layers.
We continue with randomly chosen differences of state S24 and compute backwards to state S20, the output of the SuperSBoxes. Since we have 64 active S-boxes in this state, the probability of a differential is about 2^(-1·64). Hence, we need 2^64 starting differences but get 2^64 solutions for the inbound phase in round 3 (see [9]). We determine the right pairs for each of the 16 SuperSBoxes between states S17 and S20 independently. Using the Differential Distribution Table of the SuperSBoxes, we can find one right pair with average complexity one. In total, we compute 2^96 solutions for this inbound phase with time complexity 2^96 and memory complexity of at most 2^64. For each of these pairs, the differences and values of all yellow and black bytes in round 3 are determined.
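The way a differential distribution table (DDT) yields right pairs with average complexity one can be illustrated on a toy 4-bit S-box. The S-box below is a hypothetical stand-in chosen by us; the actual SuperSBoxes act on 32-bit values, and their DDT is built analogously.

```python
from collections import defaultdict

# Toy 4-bit S-box standing in for a SuperSBox (the real ones act on 32 bits).
SBOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]

# One-time precomputation: for every (input diff a, output diff b), store
# the inputs x with S(x) ^ S(x ^ a) == b. The list lengths are the DDT.
solutions = defaultdict(list)
for a in range(16):
    for x in range(16):
        b = SBOX[x] ^ SBOX[x ^ a]
        solutions[(a, b)].append(x)

def right_pairs(din, dout):
    """All pairs (x, x ^ din) following the differential din -> dout.
    After the precomputation, each lookup costs amortized constant time."""
    return [(x, x ^ din) for x in solutions.get((din, dout), [])]
```

For instance, `right_pairs(1, SBOX[0] ^ SBOX[1])` contains the pair (0, 1) by construction.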
Second Outbound between S24 and S31. In the outbound phase, we ensure the propagation in round 4 of the truncated differential path by propagating the right pairs of the previous inbound phase forwards to state S31. With a probability of 2^-96, we get four active bytes after MixColumns in state S31 (green) conforming to the truncated path. Hence, among the 2^96 right pairs of the inbound phase between S16 and S24 we expect to find one such right pair.
The total complexity to find this partial pair between S16 and S31 is then 2^96. Note that for this pair, the values and differences of the yellow, green and black bytes between states S16 and S31 can be determined. Furthermore, note that for any choice of the remaining bytes, the truncated differential path between state S31 and state S40 is fulfilled.
Second Inbound between S7 and S14 . Here, we search for many pairs of
internal states conforming to the truncated differential path between states S7
and S14 . Note that we can independently search for pairs of each BigColumn
of state S7, since the four BigColumns stay independent until they are mixed by the following BigMixColumns transformation between states S15 and S16. For each BigColumn, four SuperSBoxes are active and we need at least 2^16 starting differentials for each one to find the first right pair.
The difference in S14 is already fixed due to the yellow inbound phase, but we can still choose at least 2^32 differences for each active AES state in S7. Using the rebound technique, we can find one pair on average for each starting difference in the inbound phase. Then, we independently iterate through all 2^32 starting differences for the first, second and third column and through all 2^64 starting differences for the fourth column of state S7. We get 2^32 right pairs for each of the first three columns and 2^64 pairs for the fourth column. The complexity to find all these pairs is 2^64 in time and memory.
For each resulting right pair, the values and differences of the red and black
bytes between states S7 and S14 can be computed. Furthermore, the truncated
differential path in backward direction, except for two cyan bytes in the first
states, is fulfilled. In the next phase, we partially merge the right pairs of the
yellow and red inbound phase. But first, we recall the conditions for this merge.
First Part in Merging the Inbound Phases. For each pair of the previ-
ous two phases, the values of the red, yellow and black bytes of state S14 and
S16 are fixed. These two states are separated by the linear SuperMixColumns
transformation: taking the first column-slice as an example, we get
2 · A0 + 3 · A1 + A2 + A3 = 14 · B0 + 11 · B1 + 13 · B2 + 9 · B3 . (1)
Similar 8-bit conditions exist for all 16 column-slices. In total, each right pair
of the two (independent) inbound phases results in a 128-bit condition on the
whole SuperMixColumns transformation between states S14 and S16 .
Remember that we have constructed one pair for the yellow inbound phase and, in total, 2^32 × 2^32 × 2^32 × 2^64 = 2^160 pairs for the red inbound phase. Among these 2^160 pairs, we expect to find 2^32 right pairs which also satisfy the 128-bit condition of the SuperMixColumns between states S14 and S16. In the following, we show how to find all these 2^32 pairs with a complexity of 2^96.
First, we combine the 2^32 × 2^32 = 2^64 pairs determined by the first two BigColumns of state S7 in a list L1 and the 2^32 × 2^64 = 2^96 pairs determined by
28 J. Jean, M. Naya-Plasencia, and M. Schläffer
the last two BigColumns of state S7 in a list L2 . Note that the pairs in these two
lists are independent. Then, we separate Equation (1) into terms determined by
L1 and terms determined by L2 :
2 · A0 + 3 · A1 = A2 + A3 + 14 · B0 + 11 · B1 + 13 · B2 + 9 · B3 . (2)
We apply the left-hand side to the elements of L1 and the right-hand side to
elements of L2 and sort L1 according to the bytes to be matched.
Then, we can simply merge (join) these lists to find those pairs which satisfy the 128-bit condition imposed by the SuperMixColumns, and store the results in the list L12 = L1 ⋈128 L2. This way, we get 2^64 × 2^96 × 2^-128 = 2^32 right pairs with a total complexity of 2^96. We note that the memory requirements can be reduced to 2^64 if we do not store the elements of L2 but compute them online. The resulting 2^32 solutions are partial right pairs for the black, red, yellow and green bytes between states S7 and S31.
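The merge of the two lists against Equation (2) is a standard hash join: evaluate the left-hand side on the elements of L1 and the right-hand side on the elements of L2, then match on the result. A minimal sketch with toy stand-in functions for the two sides (the real condition is the 128-bit SuperMixColumns relation):

```python
from collections import defaultdict

def merge_lists(L1, L2, lhs, rhs):
    """Join L1 and L2 on lhs(e1) == rhs(e2), as in L12 = L1 |x| L2.
    Cost is O(|L1| + |L2| + #matches) instead of O(|L1| * |L2|)."""
    buckets = defaultdict(list)
    for e1 in L1:
        buckets[lhs(e1)].append(e1)
    return [(e1, e2) for e2 in L2 for e1 in buckets.get(rhs(e2), [])]

# Toy stand-ins for the two sides of the condition: a 16-bit sum relation
# rather than the 128-bit GF(2^8) relation of Equation (2).
L1 = list(range(1000))
L2 = list(range(1000))
pairs = merge_lists(L1, L2,
                    lhs=lambda a: (3 * a) % 2**16,
                    rhs=lambda b: (b + 6) % 2**16)
```

The same pattern reappears later when the combined pairs are merged with the chaining-input list L3 on a 160-bit condition.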
Merge Chaining Input. Next, we need to merge the 2^32 results of the previous phases with the chaining input (blue) and the bytes fixed by the padding (cyan). The chaining input and padding overlap with the red inbound phase in state S7 on 5 × 4 = 20 bytes. This results in a 160-bit condition on the overlapping blue/cyan/red bytes. To find a pair verifying this condition, we first generate 2^112 random first message blocks, compute the blue bytes of state S7 and store the results in a list L3.
Additionally, we repeat the yellow inbound phase 2^16 times with other starting points in state S24. This way, we get 2^16 × 2^32 = 2^48 right pairs for the combined yellow and red inbound phases, which also satisfy the 128-bit condition of SuperMixColumns between states S14 and S16. The complexity is 2^16 × 2^96 = 2^112. We store the resulting 2^48 pairs in list L12.
Next, we merge the lists according to the overlapping 160 bits (L12 ⋈160 L3) and get 2^48 × 2^112 × 2^-160 = 1 right pair. If we compute the 2^112 message blocks of list L3 online, the time complexity of this merging step is 2^112 with memory requirements of 2^48. For the resulting pair, all differences between states S4 and S33 and all colored byte values (blue, cyan, red, yellow, green and black) between states S0 and S31 can be determined.
Fig. 3. States used to merge the two inbound phases with the chaining values. The
merge inbound phase consists of three parts. Brown bytes show values already deter-
mined (first part) and gray values are chosen at random (second part). Green, blue,
yellow and red bytes show independent values used in the generalized birthday attack
(third part) and cyan bytes represent values with the target conditions.
to state S14. Note that we need to try 2^(2·8+1) values for AES state S7[2, 1] to also match the 2-byte (cyan) and 1-bit padding at the input in AES state S0[2, 3]. Then, all gray, light-gray, cyan and brown bytes have already been determined either by an inbound phase, chaining value, padding, or just by choosing random values for the remaining free bytes of the first two columns of S7. However, all white, red, green, yellow and blue bytes are still free to choose.
By considering the linear SuperMixColumns transformation, we observe that in each column-slice, 14 out of 32 input/output bytes are already fixed and 2 bytes are still free to choose. Hence, we expect to get 2^16 solutions for this linear system of equations. Unfortunately, also for the given positions of the 14 already determined bytes, the linear system of equations does not have full rank. Again, we can determine the resulting system using the matrix MSMC of SuperMixColumns. As an example, for the first column-slice, the system is given as follows:
MSMC · [A0 L0 L1 L2 A1 L0′ L1′ L2′ A2 x6 x7 x8 A3 x9 x10 x11]^T
= [B0 B1 B2 B3 y0 y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11]^T.
The free variables in this system are x6, ..., x11 (green). The values A0, A1, A2, A3, B0, B1, B2, B3 (brown) have been determined by the first or second inbound phase, and the values L0, L1, L2 (light-gray) and L0′, L1′, L2′ (gray) are determined by the choice of arbitrary values in state S7. We proceed as before and determine the linear system of equations which needs to have a solution:
[3 1 1 3 1 1]
[2 3 1 2 3 1] · [x6 x7 x8 x9 x10 x11]^T = [c0 c1 c2 c3]^T.    (3)
[1 2 3 1 2 3]
[1 1 2 1 1 2]
The resulting linear 8-bit equation to get a solution for this system can be separated into terms depending on the values Li and on Li′, and we get f1(Li) + f2(Li′) + f3(ai, bi) = 0, where f1, f2 and f3 are linear functions. For the other column-slices and fixed positions of gray bytes, we get matrices of rank three as well. In total, we get 16 8-bit conditions, and the probability to find a solution for a given choice of gray and light-gray values in states S14 and S16 is 2^-128.
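The rank-three claim can be checked directly by Gaussian elimination over GF(2^8) with the AES reduction polynomial; the 4 × 6 matrix below is our reading of Equation (3).

```python
def gf_mul(a, b):
    """Multiplication in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r

def gf_inv(a):
    # Brute-force inverse; fine for a one-off rank computation.
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def gf_rank(mat):
    """Rank of a matrix over GF(2^8) via Gaussian elimination."""
    m = [row[:] for row in mat]
    rank, col = 0, 0
    while rank < len(m) and col < len(m[0]):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            col += 1
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = gf_inv(m[rank][col])
        m[rank] = [gf_mul(inv, v) for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [v ^ gf_mul(f, w) for v, w in zip(m[r], m[rank])]
        rank, col = rank + 1, col + 1
    return rank

# The matrix of Equation (3); gf_rank(M) confirms rank three.
M = [[3, 1, 1, 3, 1, 1],
     [2, 3, 1, 2, 3, 1],
     [1, 2, 3, 1, 2, 3],
     [1, 1, 2, 1, 1, 2]]
```

Note that the columns repeat in groups of three, so the rank is at most three; the elimination shows it is exactly three.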
However, we can find a solution to these linear equations using the birthday effect and a meet-in-the-middle attack with a complexity of 2^64 in time and memory.
We start by choosing 2^64 values for each of the first (gray) and second (light-gray) BigColumns in state S7. We compute these values independently forward to state S14 and store them in two lists L and L′. We also separate all equations of the 128-bit condition into parts depending only on values of L and of L′. We apply the resulting functions f1, f2, f3 to the elements of the lists L and L′, and merge the two lists L ⋈128 L′ using the birthday effect.
– The associated differences in the same two diagonals in the four active AES states of S16 can be computed. Consequently, we can check in the previously computed ordered lists L^i_{j,0} with j ∈ [0, 3] and i ∈ [0, 1] where we find this difference (i is either 0 or 1 because we are only considering the first two diagonals). For j ∈ {0, 3}, on average, we obtain one match in each of the lists L^0_{0,0}, L^1_{0,0}, L^0_{3,0} and L^1_{3,0}. For j ∈ {1, 2}, we obtain 2^32 matches, one for each of the 2^32 possible differences in the associated diagonals in S23. That is, 2^32 matches for L^0_{1,0} and L^1_{1,0}, where a pair of values formed
by one element of each list is only valid if they were generated from the same difference in S23. Consequently, we can construct the list L^{0,1}_{1,0} of size 2^32 where we store the values and differences of those two diagonals in the AES state S16[1, 0], as well as the difference in S23 from which they were generated. Repeating the process for L^0_{2,0} and L^1_{2,0}, we construct the list L^{0,1}_{2,0} of size 2^32. We can merge the lists L^{0,1}_{1,0}, L^{0,1}_{2,0} and the four fixed values for differences and values obtained from the matches in the lists L^0_{0,0}, L^1_{0,0}, L^0_{3,0} and L^1_{3,0}, corresponding to the AES states S16[0, 0] and S16[3, 0]. This generates the list L^{0,1} of size 2^64. Each element of this list contains the values and differences of the two diagonals 0 and 1 of the four active AES states in S16. As we have all the values for the first two diagonals in the four AES states, for each one of these elements, we compute the values in the first two diagonals of the active state in S15 by applying the inverse of BigMixColumns. We order them according to these values.
– Next, we go through the 2^64 possible differences of the next two diagonals (diagonals 2 and 3) of the active AES state in S15. For each one of these 2^64 possible differences:
• All the differences in the AES state S13[0, 0] are determined. We check in the lists L^0_A, L^1_A, L^2_A and L^3_A if we find a match for the differences. We expect to find one in each list, and this determines the values for the whole state S15[0, 0] (as the elements in these lists are formed by differences and values). This means that the value of the active AES state in S15 is also completely determined. This way, we can check in the previously generated list L^{0,1} if the correct value for the two diagonals 0 and 1 appears. We expect to find it once.
• As we have just found a valid element from L^{0,1}, it determines the differences in the AES states S23[1, 0] and S23[2, 0] that were not fixed yet. Now, we need to check if, for those differences in S23, the corresponding elements in the four lists L^i_{1,0}, L^i_{2,0} for i ∈ [2, 3] that match the differences fixed in diagonals 2 and 3 of S15 also satisfy the values in S15 that were determined by the lists L^i_A. This occurs with probability 2^-64.
All in all, the time complexity of this algorithm is 2^64 · (2^64 + 2^64) = 2^129 with a memory requirement of 2^64. The resulting expected number of valid pairs is 2^64 · 2^64 · 2^64 · 2^-64 · 2^-64 = 2^64.
S32[1, 1], and we still have 2^32 possible differences for each of the two remaining active AES states in S32. Then, the lists L^i_A are generated from the end and contain values and differences from S40. Similarly, the lists L^i_{j,j} contain values and differences from S38. We can apply the same algorithm as before and obtain 2^64 solutions with a cost of 2^128 in time and 2^64 in memory.
of S23. Since for each of the 2^64 differences we have 2^32 possible associated values in S32, the probability of finding a good pair is 2^(96−128−64+32) = 2^-64.
If we repeat this merging procedure 2^64 times, namely for 2^32 differences in columns 0 and 1 of S23 and for 2^32 differences in columns 0 and 1 of S32, we should find a solution. We then repeat the procedure for the cross product of the 2^32 solutions for each side. As we do not want to recompute them each time they are used, which would increase the time complexity, we can store the 2^(64+32+32) = 2^128 solutions for the first part and use the corresponding ones when needed, while the second part is computed in sequence. The complexity is then 2^192 + 2^192 + 2^(96+64) in time and 2^128 in memory. So far, we have found a partial solution for the differential part for the rounds from S6 to S48. We still have to determine the passive bytes and to verify the condition to pass from S50 to S51. This can be done exactly as in the second and third part of the merge inbound phase of Section 3.4 with no additional cost.
Moreover, since we can find x solutions with complexity max{x, 2^96} in time and 2^96 memory for the (independent) merge inbound phase, we can get x < 2^193 solutions with time complexity 2^193 + max{x, 2^96} ∼ 2^193 and 2^128 memory. We need only 2^96 of these solutions to pass the probabilistic propagation in the last round from S50 to S51. Hence, we can find a complete solution for the whole path with a cost of about 2^193 in time and 2^128 in memory. Furthermore, with a probability of 2^-128, the input and output differences in S0 and S48 collide in the feed-forward and BigFinal transformation. Therefore, we can also generate free-start collisions for 6 rounds of the compression function with a time complexity of 2^193 + 2^128 ∼ 2^193 and 2^128 memory.
5 Conclusions
In this work, we have presented new results on ECHO-256, a second-round candidate of the SHA-3 competition, that considerably improve the previously published cryptanalysis. Our analysis is based on multi-inbound rebound attacks and is summarized in Table 1. The main results are a 5-round collision of the hash function and a 7-round distinguisher of its compression function. All of our results take into account, and satisfy, the condition observed in [4], which is needed to merge the results of multiple inbound phases. The 7-round distinguisher on the compression function uses the stop-in-the-middle algorithms proposed in [10].
References
1. Benadjila, R., Billet, O., Gilbert, H., Macario-Rat, G., Peyrin, T., Robshaw, M.,
Seurin, Y.: SHA-3 proposal: ECHO. Submission to NIST (updated) (2009),
http://crypto.rd.francetelecom.com/echo/doc/
echo_description_1-5.pdf
2. Daemen, J., Rijmen, V.: Understanding Two-Round Differentials in AES. In: De
Prisco, R., Yung, M. (eds.) SCN 2006. LNCS, vol. 4116, pp. 78–94. Springer,
Heidelberg (2006)
3. Gilbert, H., Peyrin, T.: Super-Sbox Cryptanalysis: Improved Attacks for AES-
Like Permutations. In: Hong, S., Iwata, T. (eds.) FSE 2010. LNCS, vol. 6147,
pp. 365–383. Springer, Heidelberg (2010)
4. Jean, J., Fouque, P.-A.: Practical Near-Collisions and Collisions on Round-Reduced
ECHO-256 Compression Function. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733,
pp. 107–127. Springer, Heidelberg (2011)
5. Jean, J., Naya-Plasencia, M., Schläffer, M.: Improved Analysis of ECHO-256. Cryp-
tology ePrint Archive, Report 2011/422 (2011), http://eprint.iacr.org/
6. Lamberger, M., Mendel, F., Rechberger, C., Rijmen, V., Schläffer, M.: Rebound
Distinguishers: Results on the Full Whirlpool Compression Function. In: Matsui,
M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 126–143. Springer, Heidelberg
(2009)
7. Van Le, T., Sparr, R., Wernsdorf, R., Desmedt, Y.G.: Complementation-Like and
Cyclic Properties of AES Round Functions. In: Dobbertin, H., Rijmen, V., Sowa,
A. (eds.) AES 2005. LNCS, vol. 3373, pp. 128–141. Springer, Heidelberg (2005)
8. Matusiewicz, K., Naya-Plasencia, M., Nikolić, I., Sasaki, Y., Schläffer, M.: Rebound
Attack on the Full Lane Compression Function. In: Matsui, M. (ed.) ASIACRYPT
2009. LNCS, vol. 5912, pp. 106–125. Springer, Heidelberg (2009)
9. Mendel, F., Rechberger, C., Schläffer, M., Thomsen, S.S.: The Rebound Attack:
Cryptanalysis of Reduced Whirlpool and Grøstl. In: Dunkelman, O. (ed.) FSE
2009. LNCS, vol. 5665, pp. 260–276. Springer, Heidelberg (2009)
10. Naya-Plasencia, M.: How to Improve Rebound Attacks. In: Rogaway, P. (ed.)
CRYPTO 2011. LNCS, vol. 6841, pp. 188–205. Springer, Heidelberg (2011)
11. Naya-Plasencia, M.: How to Improve Rebound Attacks. Cryptology ePrint Archive,
Report 2010/607 (2010) (extended version), http://eprint.iacr.org/
12. Peyrin, T.: Improved Differential Attacks for ECHO and Grøstl. In: Rabin, T. (ed.)
CRYPTO 2010. LNCS, vol. 6223, pp. 370–392. Springer, Heidelberg (2010)
13. Sasaki, Y., Li, Y., Wang, L., Sakiyama, K., Ohta, K.: Non-Full-Active Super-Sbox
Analysis: Applications to ECHO and Grøstl. In: Abe, M. (ed.) ASIACRYPT 2010.
LNCS, vol. 6477, pp. 38–55. Springer, Heidelberg (2010)
14. Schläffer, M.: Subspace Distinguisher for 5/8 Rounds of the ECHO-256 Hash Func-
tion. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544,
pp. 369–387. Springer, Heidelberg (2011)
15. National Institute of Standards and Technology (NIST): Advanced Encryption Standard (FIPS PUB 197) (November 2001),
http://www.csrc.nist.gov/publications/fips/fips197/fips-197.pdf
16. Wagner, D.: A Generalized Birthday Problem. In: Yung, M. (ed.) CRYPTO 2002.
LNCS, vol. 2442, pp. 288–303. Springer, Heidelberg (2002)
Provable Chosen-Target-Forced-Midfix
Preimage Resistance
Abstract. This paper deals with definitional aspects of the herding at-
tack of Kelsey and Kohno, and investigates the provable security of sev-
eral hash functions against herding attacks.
Firstly, we define the notion of chosen-target-forced-midfix (CTFM)
as a generalization of the classical herding (chosen-target-forced-prefix)
attack to the cases where the challenge message is not only a prefix
but may appear at any place in the preimage. Additionally, we identify
four variants of the CTFM notion in the setting where salts are explicit
input parameters to the hash function. Our results show that including
salts without weakening the compression function does not add to
the CTFM security of the hash function.
Our second and main technical result is a proof of CTFM security of
the classical Merkle-Damgård construction. The proof demonstrates in
the ideal model that the herding attack of Kelsey and Kohno is optimal
(asymptotically) and no attack with lower complexity exists. Our security
analysis applies to a wide class of narrow-pipe Merkle-Damgård based
iterative hash functions, including enveloped Merkle-Damgård, Merkle-
Damgård with permutation, HAIFA, zipper hash and hash-twice hash
functions. To our knowledge, this is the first positive result in this field.
Finally, having excluded salts from the possible tool set for improving
the CTFM resistance of narrow-pipe designs, we resort to various message
modification techniques. Our findings, however, are negative:
we demonstrate CTFM attacks with complexity of the same order
as the Merkle-Damgård herding attack on a broad class of narrow-pipe
schemes with specific message modifications.
1 Introduction
Hash functions are an important cryptographic primitive finding numerous ap-
plications. Most commonly, hash functions are designed from a fixed input length
compression function to accommodate messages of arbitrary length. The most
common domain extender is the Merkle-Damgård (MD) iteration [8,16], which
has long been believed to be a secure design choice due to its collision secu-
rity reduction. Recently, however, several results cast doubt on its security with
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 37–54, 2012.
c Springer-Verlag Berlin Heidelberg 2012
38 E. Andreeva and B. Mennink
respect to other properties. The MD design was shown not to preserve second preimage or preimage properties [3]. Moreover, the indifferentiability attack of Coron et al. [7], the multicollision attack of Joux [12] and the herding attack of Kelsey and Kohno [13] exposed various weaknesses of the MD design.
The herding attack of Kelsey and Kohno, also known as the chosen-target-
forced-prefix (CTFP) attack, considers an adversary that commits to a hash
value y for a message that is not entirely under his control. The adversary then
demonstrates the ability to incorporate an unknown challenge prefix as part of the original preimage corresponding to the committed value y. While for a random oracle the complexity of such an attack is Θ(2^n) compression function calls for y of length n bits, the herding attack on Merkle-Damgård takes about √n · 2^(2n/3) compression function executions for a preimage message of length O(n), as demonstrated by Kelsey and Kohno [13]. A more precise attack bound was obtained by Blackburn et al. [6].
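The herding attack builds a "diamond structure": a complete binary tree of 2^k chaining values that all converge to the committed digest y. On receiving the challenge prefix, the attacker searches a single linking block into one of the 2^k leaves and then follows the tree to y. The toy demonstration below uses a 16-bit truncated compression function and k = 3; the compression function and all parameters are illustrative choices of ours, not from the paper.

```python
import hashlib
from itertools import count

def f(h: int, m: int) -> int:
    """Toy 16-bit compression function: SHA-256 truncated to two bytes."""
    data = h.to_bytes(2, "big") + m.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

def collide(h1: int, h2: int):
    """Birthday search for blocks m1, m2 with f(h1, m1) == f(h2, m2)."""
    seen = {f(h1, m1): m1 for m1 in range(2 ** 10)}
    for m2 in count():
        v = f(h2, m2)
        if v in seen:
            return seen[v], m2, v

# Phase 1 (offline): build a diamond with 2^3 leaves, commit to its root y.
leaves = list(range(8))          # arbitrary starting chaining values
tree = {}                        # chaining value -> (next block, parent value)
level = leaves
while len(level) > 1:
    nxt = []
    for h1, h2 in zip(level[::2], level[1::2]):
        m1, m2, v = collide(h1, h2)
        tree[h1], tree[h2] = (m1, v), (m2, v)
        nxt.append(v)
    level = nxt
y = level[0]                     # the committed hash value

# Phase 2 (online): herd the challenge prefix P into the committed value y.
P = [1234, 5678]                 # challenge prefix (two message blocks)
h = 0                            # toy IV
for m in P:
    h = f(h, m)
link = next(m for m in count() if f(h, m) in leaves)   # ~2^16 / 2^3 trials
suffix, node = [link], f(h, link)
while node != y:                 # walk up the diamond to the root
    m, node = tree[node]
    suffix.append(m)
forged = P + suffix              # hashes to y under the toy iteration
```

The offline tree costs roughly 2^k collision searches of about 2^(n/2) work each, while the online link search costs about 2^(n-k) calls, which is the trade-off behind the 2^(2n/3) figure above.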
Several other hash functions have been analyzed with respect to resistance to
herding attacks. In [2], Andreeva et al. showed applications of the herding attack
to dither hash functions, and in [1] they generalized the herding attack of [13] to
several multi-pass domain extenders, such as the zipper hash and the hash twice
design. Gauravaram et al. [9,10] showed that the MD designs where an XOR tweak is used in the final message block are insecure against herding attacks.
While this topic has generated significant interest, many important questions remain unanswered. All research so far has focused on negative results, namely generalized herding attacks on hash function designs, and it is not known whether one can launch herding attacks with further improved complexity against the MD design. The task is additionally complicated by the lack of formal security definitions for herding that accommodate the objectives of a proof-based approach. Apart from wide-pipe designs, no known scheme is secure against herding attacks, nor is it clear how to improve the MD design without enlarging its state size. Some possible directions are to either use randomization (salts) or to apply certain message modification techniques.
value for an unknown salt, where the salt is defined as an input parameter to the hash function. Although the other three variants are plausible from a practical perspective, we show that they do not contribute to improved CTFM resistance. This holds whenever the salt is added in such a way that the cryptographic strength of the compression function is not compromised on some other level, i.e. collision or preimage resistance.
Our main technical contribution is to exhibit a CTFM security proof for
the MD domain extender. While until now the research in this area has been
focusing on finding herding attacks against hash function designs, we are the first
to provide a security upper bound for a hash function design. In more detail,
we upper bound the strength of a CTFM attacker in finding in the ideal model
a preimage for the MD design, and show that the herding attack described by
Kelsey and Kohno is optimal (asymptotically). Using new proof techniques we
prove that at least approximately 2^(2n/3)/L^(1/3) compression function queries are needed for a CTFM attack, where n is the size of the commitment y and L is the maximal allowed length of the preimage in blocks. To the best of our knowledge, there has not been a positive result of this form before. Due to their generic nature, the new security techniques introduced in this work not only apply to the MD
design, but directly carry over to a broad spectrum of domain extenders derived
from MD, including strengthened MD, MD with a distinct final transformation
and HAIFA [5]. Additionally, the bound implies optimality of the attacks on
hash twice and the zipper hash function performed by Andreeva et al. [1].
We explore further the question of whether a simple tweak on the narrow-pipe
MD construction would allow us to prove optimal CTFM security. Excluding
randomness or salting from the set of available tools, we investigate tweaks that
modify the message inputs by simple message modification techniques like the
XOR operation. These schemes can be viewed as MD type domain extenders
with a more sophisticated padding. Our findings, however, are negative: we demonstrate CTFM attacks on a class of schemes of this form. In particular, the attack also applies to the MD with checksum design, thereby providing a simple and elegant alternative to the attack by Gauravaram et al. [10].
2 Preliminaries
By x ←$ X we denote the uniformly random sampling of an element from a set X.
midfix, and denote by L ≥ 1 the maximal length of the forged preimage in blocks.
Let A be a chosen-target-forced-midfix (CTFM) finding adversary for this hash
function. The advantage of A is defined as
Adv^ctfm_H(A) = Pr[ f ←$ Func(n+m, n), (y, st) ← A^f, P ←$ {0,1}^p, (g, R) ← A^f(P, st) : H^f(g(P, R)) = y ∧ |rng(g)| ≤ 2^(Lm) ].
By Adv^ctfm_H(q) we denote the maximum advantage, taken over all adversaries making q queries to their oracle.
The function g can technically be any function as long as its range has size at most 2^(Lm), but for some choices of g the definition becomes irrelevant. For instance, if the mutual information between P and g(P, R) is 0, the CTFM attack is trivial. More generally, the attack becomes easier if the function g is allowed to split P into parts. However, this type of function does not correspond to any practically relevant CTFM attack. Therefore, in the remainder, we restrict g to satisfy g(P, R1 || R2) = R1 || P || R2, where R1, R2 are of arbitrary length.
The chosen-target-forced-prefix attack of Kelsey and Kohno is covered for g
restricted to R1 being the empty string. The variant of the herding attack by
Andreeva et al. [1] on the zipper hash function can be seen as a chosen-target-
forced-suffix attack, with R2 being the empty string.
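Concretely, with R split into R1 and R2, the restricted g and its prefix/suffix special cases look as follows (all names and byte values are illustrative):

```python
def g(P: bytes, R1: bytes, R2: bytes) -> bytes:
    """Midfix embedding: the challenge P may appear anywhere in the preimage."""
    return R1 + P + R2

challenge = b"CHALLENGE"
ctfm = g(challenge, b"head-", b"-tail")   # general midfix
ctfp = g(challenge, b"", b"-tail")        # forced prefix (classical herding)
ctfs = g(challenge, b"head-", b"")        # forced suffix (zipper-hash variant)
```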
The value p defines the size of the challenge P and plays an important role in the security results. A smaller value of p allows for a higher success probability in guessing P in the first phase of the attack. A larger value of p limits the number of "free" queries of the adversary: the adversary needs to make at least p/m compression function queries to incorporate the challenge P, where m is the message block size.
4 Salted-Chosen-Target-Forced-Midfix Preimage
Resistance
We can view the functions enci as keyed permutations on the chaining value and
message block inputs to the compression function, where the key is the salt S.
Our choice of this encoding function is guided by a simple security objective.
Let us define fi as fi (S, hi−1 , Mi ) = f (enci (S, hi−1 , Mi )) for i = 1, . . . , L. We
choose enci to be a bijection on hi−1 and Mi to provide the full set of valid
input points for the function f. Any deviation from this would weaken the cryptographic strength of f, e.g. by allowing an adversary to easily launch collision attacks on the encoding function as a way to circumvent collision computations
on f . If enci is not a bijection on its variable inputs (notice that once chosen the
salt is fixed), then the function f would be working only with a restricted set of
its domain points.
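The bijectivity requirement on enc_i can be checked exhaustively on a toy domain. The XOR-based encoding below is our own illustrative instantiation, not one mandated by the text:

```python
from itertools import product

def enc_i(salt: int, h: int, m: int, i: int):
    """Hypothetical encoding: XOR a salt- and index-dependent mask into the
    chaining value and the message block. For fixed (salt, i) this is an
    involution, hence a bijection on (h, m), as required above."""
    mask = (salt * 0x9E3779B1 + i) & 0xFF   # illustrative 8-bit mask
    return (h ^ mask, m ^ mask)

# Exhaustive bijectivity check on a toy 8-bit chaining value and message.
salt, i = 0x5A, 3
images = {enc_i(salt, h, m, i) for h, m in product(range(256), repeat=2)}
```

Since the image set has the full size of the domain, enc_i is a bijection for this fixed salt, so f still receives its full set of valid input points.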
We provide the definitions of SCTFM security for the four variants indexed
by j = 1, 2, 3, 4.
Definition 2. Let H : {0, 1}* → {0, 1}^n be a salted hash function design and p, L be as in Def. 1. Let s denote the size of the salt and let enc = {enc_i}, i = 1, ..., L, be the family of encoding functions as described above. Let B be a salted-chosen-target-forced-midfix (SCTFM) finding adversary for this hash function. For j ∈ {1, 2, 3, 4} the advantage of an adversary B is defined as
Adv^sctfm_H(B) = Pr[ E_j : H^{f,enc}(g(P, R), S) = y ∧ |rng(g)| ≤ 2^(Lm) ∧ |S| = s ].
By Adv^sctfm_H(q) we denote the maximum advantage, taken over all adversaries making q queries to their oracle. The events E_j (j = 1, 2, 3, 4) are given by the following four game experiments:
j = 1: f ←$ Func(n+m, n), (y, S, st) ← B^f, P ←$ {0,1}^p, (g, R) ← B^f(P, st)
j = 2: f ←$ Func(n+m, n), S ←$ {0,1}^s, (y, st) ← B^f(S), P ←$ {0,1}^p, (g, R) ← B^f(P, st)
j = 3: f ←$ Func(n+m, n), (y, st) ← B^f, P ←$ {0,1}^p, S ←$ {0,1}^s, (g, R) ← B^f(P, S, st)
j = 4: f ←$ Func(n+m, n), (y, st) ← B^f, P ←$ {0,1}^p, (g, R, S) ← B^f(P, st)
We now discuss the adversarial abilities for the four SCTFM security notions
in comparison with the standard CTFM definition, as well as the relevance of
salted definitions in practice.
problem that he has to handle the encoding functions enc_i, which may differ for each
message block. With respect to the advantage of A, B's advantage would differ
only in the case where B loses control over the outputs of the enc_i functions (which
are inputs to f), i.e. in the case when he does not know the salt value.
But in each of these three variants the SCTFM adversary B knows the salt and
has control over the inputs to f (as is the case with A) before his commitment
to y, and thus his advantage will be the same as the advantage of A. In variant
1, the SCTFM adversary is in full control of the salt value, and in variant 2 he
knows the salt before committing to y; therefore he can prepare the respective
computations for f. Notice that, although in variant 4 the salt value is revealed
by B only in the second stage of the game, he is still in full control of the salt
value, and his advantage is optimal when he chooses the salt S in the first phase
of the attack, does the respective computations for f, and then reveals S only
in the second phase.
This analysis shows that the SCTFM adversary has the same computational
advantage as a CTFM adversary in variants 1, 2 and 4. The conclusion is
that salting in these variants does not help build more secure CTFM hash functions
H, and one can achieve the same security without the additional cost in efficiency
and complexity of H.
to draw here is that salts do not help prevent CTFM attacks, and one should aim
at non-salted solutions. We stress that this conclusion is drawn for a reasonable
encoding function: a different encoding function might weaken the cryptographic
strength of the compression function.
split into blocks of length m. As demonstrated by Kelsey and Kohno [13] and
Blackburn et al. [6], one can obtain a CTFM preimage of length O(n) in about
√n · 2^{2n/3} compression function executions. When larger preimages are allowed,
the elongated herding attack of [13] results in faster attacks: for 0 ≤ r ≤ n/2, one
can find a CTFP preimage of length L = O(n + 2^r) in approximately √n · 2^{(2n-r)/3}
queries. As we will demonstrate, this is (asymptotically) the best possible result.
In Thm. 1 we derive an upper bound on Adv^{ctfm}_{MD}(q) that holds for any q, and
we consider the limiting behavior in Cor. 1, in which we show that at least
2^{2n/3}/L^{1/3} queries are needed for an attack to succeed. After Cor. 1 we explain
why the same or similar analysis applies to a wide class of MD based functions.
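For reference, the plain MD iteration under analysis can be sketched as follows (the SHA-256-based compression function is only a stand-in for the random f, and all parameter choices here are illustrative):

```python
import hashlib

BLOCK = 16  # message block size m, in bytes (illustrative)

def f(h, m):
    # Stand-in for a random compression function f: {0,1}^(n+m) -> {0,1}^n.
    return hashlib.sha256(h + m).digest()[:16]  # n = 128 bits

def md(message, iv=bytes(16)):
    # Plain Merkle-Damgard: zero-pad to a block multiple, then iterate f.
    if len(message) % BLOCK:
        message += bytes(BLOCK - len(message) % BLOCK)
    h = iv
    for i in range(0, len(message), BLOCK):
        h = f(h, message[i:i + BLOCK])
    return h
```

The herding attacks discussed here exploit exactly this structure: any chaining value reached while hashing the forced midfix P can be connected onward to a previously committed target y.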
Theorem 1. For any integral threshold t > 0, we have

Adv^{ctfm}_{MD}(q) ≤ (L - 1)tq / 2^n + m·2^{p/m}·q / 2^p + (q²e / (t·2^n))^t + q³ / 2^{2n}.
Proof. See Sect. 5.1.
The bound of Thm. 1 includes a parameter t used to bound multiple events in the
security analysis, and the bound holds for any integral t. Notice that for larger
values of t, the first term of the bound becomes large, while for small values of t the
third term becomes large. Therefore, it is important to find a balanced value of t.
Recall that, as explained at the beginning of this section, an adversary
has a higher success probability if larger preimages are allowed. Consequently,
the optimum for t partially depends on the allowed length L. In Cor. 1 we analyze
the limiting behavior of the bound of Thm. 1.
Notice that the bound of Thm. 1 contains a term that does not directly depend on
n: the second term. This term essentially represents the "guessing probability"
of the attacker: A may succeed in guessing P in advance. If p = |P| is very small,
this term dominates the bound. Therefore, it only makes sense to evaluate
this bound for p "large enough", and we have to put a requirement on
p. Although the requirement looks complicated at first sight, it is satisfied by
any relevant value of p. In particular, it is satisfied for p ≥ 2n/3 for L = O(n)
preimages, and even for lower values of p when L becomes larger.
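To get a feel for the balancing of t, one can evaluate the base-2 logarithm of each term of the Thm. 1 bound numerically (the parameter values below are hypothetical, chosen only for illustration):

```python
import math

def log2_terms(q, n, m, p, L, t):
    """log2 of the four terms of the Thm. 1 bound:
    (L-1)tq/2^n, m*2^(p/m)*q/2^p, (q^2*e/(t*2^n))^t, q^3/2^(2n)."""
    t1 = math.log2((L - 1) * t) + math.log2(q) - n
    t2 = math.log2(m) + math.log2(q) + p / m - p
    t3 = t * (2 * math.log2(q) + math.log2(math.e) - math.log2(t) - n)
    t4 = 3 * math.log2(q) - 2 * n
    return t1, t2, t3, t4

# Hypothetical example: n = 128, m = 512, p = 128, L = 128, t ~ 2^(n/3).
terms = log2_terms(q=2**70, n=128, m=512, p=128, L=128, t=2**42)
```

With q = 2^70 all four logarithms are negative, so the bound is meaningful, while pushing q towards 2^{2n/3}/L^{1/3} drives the first term, and hence the bound, towards 1, matching the attack complexity.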
Provable Chosen-Target-Forced-Midfix Preimage Resistance 45
Corollary 1. Let L = O(2^{n/2}) and let p be such that 2^{p/m}·2^{2n/3} / (L^{1/3}·2^p) = O(1) for
n → ∞. For any ε > 0, we obtain lim_{n→∞} Adv^{ctfm}_{MD}(2^{n(2/3-ε)}/L^{1/3}) = 0.
Proof. The bound of Thm. 1 holds for any t ≥ 1. As L = O(2^{n/2}), there exists
a constant c such that L ≤ c·2^{n/2}. We put t = 2^{n/3}/(L/c)^{2/3} ≥ 1. Without loss of
generality, t is integral (one can tweak c a little bit to get integral t). From
Thm. 1:

Adv^{ctfm}_{MD}(q) ≤ L^{1/3}c^{2/3}q / 2^{2n/3} + m·2^{p/m}·q / 2^p + ((L/c)^{2/3}·q²e / 2^{4n/3})^{2^{n/3}/(L/c)^{2/3}} + q³ / 2^{2n}.

For any ε > 0, substituting q = 2^{n(2/3-ε)}/L^{1/3} we obtain:

Adv^{ctfm}_{MD}(2^{n(2/3-ε)}/L^{1/3}) ≤ c^{2/3} / 2^{nε} + (m / 2^{nε}) · (2^{p/m}·2^{2n/3} / (L^{1/3}·2^p)) + (e / (c^{2/3}·2^{2nε}))^{2^{n/3}/(L/c)^{2/3}} + 1 / (L·2^{3nε}).

All terms approach 0 for n → ∞ (notice that for the second term we have
m = Θ(n), and for the third term its exponent is ≥ 1).
Although the security analysis of Thm. 1 and Cor. 1 focuses on the original
Merkle-Damgård (MD) hash function, a very similar analysis can be directly
derived for a broad class of MD based iterative hash functions, including MD
with length strengthening [14], enveloped MD [4], MD with permutation [11] and
HAIFA [5]. Indeed, a CTFP attack against strengthened MD is provably harder
than an attack against plain MD due to the presence of the length encoding at
the end, and a similar remark applies to HAIFA. For enveloped MD and MD
with permutation, and in general for any MD based function with final trans-
formation, one can use security properties of the final transformation to show
the adversary knows only a limited amount of state values y which propagate
to the commitment y through the final transformation, and we can analyze the
success probability with respect to each of these possible commitments y .1 The
security analysis furthermore applies to the hash twice hash function (where
the padded message is simply hashed twice) and the zipper hash function (where
the padded message is hashed once forward, and once in the opposite direction)
[15], therewith demonstrating the (asymptotic) optimality of the attacks de-
ployed by Andreeva et al. [1]. Indeed, a CTFM attack for zipper or hash twice is
provably harder than an attack for MD, but the attacks of Andreeva et al. are
of similar complexity as the attack of Kelsey and Kohno on MD.
following event E0 :
E2 considers the event that Q contains a multi-collision of more than two com-
pression function executions. By basic probability theory, we have
Pr(suc_A(q2)) ≤ Pr(suc_A(q2) | ¬E0 ∧ ¬E1) + Pr(E0 ∨ E1)
            ≤ Pr(suc_A(q2) | ¬E0 ∧ ¬E1) + Pr(E0 ∨ E1 | ¬E2) + Pr(E2)
            ≤ Pr(suc_A(q2) | ¬E0 ∧ ¬E1) + Pr(E0 | ¬E2) + Pr(E1 | ¬E2) + Pr(E2),   (3)
and we consider the probabilities on the right hand side separately.
– Pr(suc_A(q2) | ¬E0 ∧ ¬E1). By ¬E0, P is not contained in (V(0), A(0)) yet,
but it may be contained partially, and hence the adversary will at least
need to make one compression function call. It may be the case that
the adversary makes calls to the compression function for multiple strings
P, and it may be the case that after he queried for P, he knows multiple
paths of different lengths including P, but this does not violate the analysis.
In general, the suffix R2 of the attack covers at most L - 1 message blocks.
At any time in the attack, there are at most

|{h ∈ V(q2) : dist_{h→y}(q2) ∩ {0, . . . , L - 1} ≠ ∅}|

possible nodes for which a hit results in a collision. By ¬E1, this set is upper
bounded by (L - 1)t. As the adversary makes at most q2 ≤ q compression
function calls that may result in success, the total probability is upper
bounded by (L - 1)tq / 2^n;
– Pr(E0 | ¬E2). Notice that ¬E2 implies that all nodes in (V(q2), A(q2)), as
well as all nodes in (V(0), A(0)), have at most 2 incoming arcs. We consider
the probability that there exists a P-comprising path. The existence of such
a path implies the existence of an arc that supplies the last bit of P. Consider
any arc h_{j-1} -M_j→ h_j, and let M_j^{(i)} for i = 1, . . . , m denote the i-th bit of M_j. Now,
we can analyze the probability that P ←$ {0,1}^p is included as a substring of
a path in (V(0), A(0)), with M_j^{(i)} corresponding to the last bit of P. Then,
Pr(E0 | ¬E2) is upper bounded by this probability summed over all i and
the number of arcs. We consider the probability for different values of i:

• i ≥ p. P is integrally captured in M_j, as M_j^{(i-p+1)} . . . M_j^{(i)} = P. This
happens with probability 1/2^p for predetermined M_j and random P;
• i < p. The first i bits of M_j correspond to the last i bits of P, and
the first p - i bits of P are a suffix of a path ending in h_{j-1}.
Let β = (p - i)/m. As by ¬E2 there are at most 2^β paths of length β
blocks to h_{j-1}, we can upper bound the probability by (1/2^i) · (2^β/2^{p-i}) = 2^β/2^p.

Now, we can sum over all possible values of i and the number of queries q1.
We obtain

Pr(E0 | ¬E2) ≤ (m + p - 1)·q1 / 2^p   if p ≤ m,
Pr(E0 | ¬E2) ≤ m·2^{p/m}·q1 / 2^p     if p > m.

In both cases, we derive the upper bound m·2^{p/m}·q / 2^p, given q1 ≤ q;
48 E. Andreeva and B. Mennink
– Pr(E1 | ¬E2). Let k* be minimal such that the maximum is achieved, and
let h_1, . . . , h_{α1} be all nodes at distance k* from y. Consider the subgraph
(V′, A′) of (V(q2), A(q2)) consisting of all² paths h_i → y of length k* edges
(for i = 1, . . . , α1). By way of an elaborate case distinction (see App. A), one
can show that for each node h in (V′, A′), all paths to y are of the same length.
This in particular implies that the h_i's (i = 1, . . . , α1) have no ingoing edge,
and that y has no outgoing edge. Therefore, we can classify the nodes in the
subgraph into sets: α_1^{k*} = α1 nodes at distance k* from y, α_1^{k*-1} at distance k* - 1,
etc., α_1^0 = 1 at distance 0. Notice that α_1^0, . . . , α_1^{k*-1} < α_1^{k*} by definition,
but it can be the case that α_1^{i-1} > α_1^i (for 1 < i < k*) for technical reasons.
By ¬E2, Q does not contain any 3-way collisions, but only 2-way collisions.
The number of 2-way collisions between the nodes at distances i and i - 1
equals max{α_1^i - α_1^{i-1}, 0}. Consequently, the described subgraph, and hence
(V(q2), A(q2)) itself, contains at least

Σ_{i=1}^{k*} max{α_1^i - α_1^{i-1}, 0} ≥ α_1^{k*} - α_1^0 = α1 - 1 ≥ t

2-way collisions. Thus, the probability is upper bounded by (q choose t)·(q/2^n)^t ≤
(q²e / (t·2^n))^t, where the inequality holds due to Stirling's approximation (x! ≥
(x/e)^x for any x);
– Pr(E2). The occurrence of E2 implies the presence of a 3-way collision in Q,
which exists with probability at most q³/2^{2n} [18].
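The q³/2^{2n} bound on 3-way collisions [18] is easy to sanity-check empirically for toy sizes (the simulation parameters below are hypothetical):

```python
import random

def three_collision_rate(n_bits, q, trials, seed=1):
    """Monte Carlo estimate of Pr[some value occurs >= 3 times]
    when q uniform n_bits-bit values are drawn, i.e. a 3-way collision
    among q queries to a random function."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        counts = {}
        for _ in range(q):
            v = rng.randrange(2 ** n_bits)
            counts[v] = counts.get(v, 0) + 1
        if max(counts.values()) >= 3:
            hits += 1
    return hits / trials
```

For example, with n = 8 and q = 30 the empirical rate stays comfortably below the bound q³/2^{2n} ≈ 0.41.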
From equation (3) and the above upper bounds, we obtain:

Adv^{ctfm}_{MD}(A) = Pr(suc_A(q2)) ≤ (L - 1)tq / 2^n + m·2^{p/m}·q / 2^p + (q²e / (t·2^n))^t + q³ / 2^{2n}.
As this holds for any adversary making q queries, this completes the proof.
s-pad1(M) = M1 ‖ (⊕_{i=1}^{1} M_i) ‖ M2 ‖ (⊕_{i=1}^{2} M_i) ‖ · · · ‖ Ml ‖ (⊕_{i=1}^{l} M_i).
For simplicity, denote by N_i for i = 1, . . . , 2l the i-th block of s-pad1(M). Let IH1
be defined as the iterated hash function of (1) accommodated with the advanced
padding function s-pad1. We will describe a CTFM attack against IH1, but
before that we briefly recall the attack of Kelsey and Kohno against the MD
hash function. Denote by κ ≥ 1 the size of the diamond we will use.
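A minimal sketch of s-pad1 over byte-string blocks (M is assumed to be already split into equal-length blocks; the length padding of the last block is omitted here):

```python
def xor_blocks(blocks):
    # Bitwise XOR of equal-length byte-string blocks.
    out = bytes(len(blocks[0]))
    for b in blocks:
        out = bytes(x ^ y for x, y in zip(out, b))
    return out

def s_pad1(blocks):
    """Interleave each message block M_i with the running checksum
    XOR(M_1, ..., M_i), producing blocks N_1, ..., N_2l."""
    out = []
    for i in range(len(blocks)):
        out.append(blocks[i])
        out.append(xor_blocks(blocks[:i + 1]))
    return out
```

Changing a single M_i perturbs every later checksum block, which is exactly the obstacle to the diamond construction; the "stabilization" trick compensates by XORing the same difference into M_{i+1}, so all checksums from index i + 1 onward are restored.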
one can arbitrarily change one block while still having a valid string of message
blocks. Thus, when constructing the diamond, one can vary the message blocks
independently to obtain collisions. For the sophisticated padding function
s-pad1 this is not possible: if for a given padded message one changes N_{2i-1} for
i ∈ {1, . . . , l - 1}, the values taken by the checksum blocks N_{2i}, N_{2i+2}, . . . , N_{2l}
change as well. At first sight, this makes the construction of the diamond
impossible, but by additionally changing N_{2i+1}, one can "stabilize" the values
N_{2i+2}, . . . , N_{2l} so that only the blocks N_{2i-1}, N_{2i}, N_{2i+1} are affected (in case i = l
only N_{2i-1}, N_{2i} are affected). Based on this observation, the attack is defined as
follows. Notice that the adversary decides on the length of the forgery in advance:
p′ + 2κ + 2.
1. The adversary constructs a diamond of κ levels.
– He fixes constants C0, C1, . . . , Cκ ∈ {0, 1}^m in advance. These constants represent

C0 = ⊕_{i=1}^{p′+2} M_i,   C_i = M_{p′+2i+1} ⊕ M_{p′+2i+2} for i = 1, . . . , κ.   (4)

The adversary does not know the blocks M_i yet, but will choose them
so as to comply with (4);
– He randomly generates 2^κ state values h_0^{(1)}, . . . , h_0^{(2^κ)}, and dynamically
finds collisions of the following form for j = 1, . . . , 2^{κ-1}:

h_0^{(2j-1)} -C0→ -M_{p′+3}^{(2j-1)}→ -C0⊕M_{p′+3}^{(2j-1)}→ -C1⊕M_{p′+3}^{(2j-1)}→ h_1^{(j)},
h_0^{(2j)} -C0→ -M_{p′+3}^{(2j)}→ -C0⊕M_{p′+3}^{(2j)}→ -C1⊕M_{p′+3}^{(2j)}→ h_1^{(j)}.
. . . -M_{p′}→ -C_{-1}→ h_{2p′};
3. The adversary finds a message M_{p′+1} such that

h_{2p′} -M_{p′+1}→ -C_{-1}⊕M_{p′+1}→ -C_{-1}⊕M_{p′+1}⊕C0→ h_0^{(j)}
s-pad2(M) = M1 ‖ M2 ‖ M1⊕M2 ‖ M3 ‖ M2⊕M3 ‖ · · · ‖ Ml ‖ M_{l-1}⊕Ml,
s-pad3(M) = rotate_{m/2}(pad(M)),
s-pad4(M) = pad(M) ‖ (⊕_{j=1}^{l} M_j),
where the function rotate_{m/2} rotates the bit string by m/2 positions (half a message
block). We denote by IH_k for k = 2, 3, 4 the standard iterated hash function of
(1) accommodated with the advanced padding function s-pad_k. Notice that for
IH4, any change of M_i can be corrected by M_{i+1} to keep the final checksum
invariant. Now, the attacks are described in a similar manner. For IH2, the
complexity is the same as for IH1. The complexities for IH3 and IH4 are half as
large. For each of the functions IH_k the optimum is achieved for κ = n/3.
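The correction property of IH4 noted above is easy to see in code: a change in M_i can be cancelled in the final checksum by XORing the same difference into M_{i+1} (a minimal sketch; pad(M) is taken here to be the message blocks themselves):

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def s_pad4(blocks):
    # pad(M) followed by the XOR checksum of all message blocks.
    checksum = blocks[0]
    for blk in blocks[1:]:
        checksum = xor(checksum, blk)
    return blocks + [checksum]
```

If M_i is replaced by M_i ⊕ Δ and M_{i+1} by M_{i+1} ⊕ Δ, the final checksum block is unchanged, so such message modifications are invisible to the checksum.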
By tweaking the proof of Thm. 1, asymptotic tightness of this bound can be
proven. We note that Gauravaram et al. [10] describe a generalized herding
attack against a class of MD based hash functions using checksums at the end
(such as IH4). The attack described in this section carries over to many of these
designs, thereby providing an elegant alternative. These attacks are of the same
complexity, although our attack yields shorter messages in case n/3 < m. The
cause of this difference is that the attack of Gauravaram et al. sets the
value of the final checksum at the end, while in our attack it is essentially fixed
by the adversary in advance.
We leave the existence of narrow-pipe hash functions that achieve optimal
security against the CTFM attack as an open problem.
7 Conclusions
In this paper we investigated the CTFM security of salted hash functions, showing that adding
a salt value without weakening the compression function does not improve the
CTFM security of the hash function.
As a main technical contribution of the paper we provided a formal security
proof of the MD design against the CTFM attack, and showed that the attack of
Kelsey and Kohno [13] is (asymptotically) the best possible. This proof directly
applies to a wide class of MD based domain extenders, and implies optimality
of other herding attacks, such as those of Andreeva et al. [1] and Gauravaram
et al. [10].
In the quest for optimally CTFM secure narrow-pipe MD designs, we analyzed
the possibility of message modification as a tool to increase CTFM security. Our
results show, however, that such techniques applied to a wide class of narrow-pipe
iterated hash function designs do not block CTFM attacks. An open research
question that emerges from these observations is to construct a narrow-pipe
iterated hash function that achieves optimal security against CTFM attacks.
Acknowledgments. This work has been funded in part by the IAP Program
P6/26 BCRYPT of the Belgian State (Belgian Science Policy), in part by the
European Commission through the ICT program under contract ICT-2007-
216676 ECRYPT II, and in part by the Research Council K.U.Leuven: GOA
TENSE. The first author is supported by a Ph.D. Fellowship from the Flemish
Research Foundation (FWO-Vlaanderen). The second author is supported by a
Ph.D. Fellowship from the Institute for the Promotion of Innovation through
Science and Technology in Flanders (IWT-Vlaanderen).
References
1. Andreeva, E., Bouillaguet, C., Dunkelman, O., Kelsey, J.: Herding, Second Preim-
age and Trojan Message Attacks Beyond Merkle-Damgård. In: Jacobson Jr., M.J.,
Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 393–414.
Springer, Heidelberg (2009)
2. Andreeva, E., Bouillaguet, C., Fouque, P.-A., Hoch, J., Kelsey, J., Shamir, A.,
Zimmer, S.: Second Preimage Attacks on Dithered Hash Functions. In: Smart,
N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 270–288. Springer, Heidelberg
(2008)
3. Andreeva, E., Neven, G., Preneel, B., Shrimpton, T.: Seven-Property-Preserving It-
erated Hashing: ROX. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833,
pp. 130–146. Springer, Heidelberg (2007)
4. Bellare, M., Ristenpart, T.: Multi-Property-Preserving Hash Domain Extension
and the EMD Transform. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS,
vol. 4284, pp. 299–314. Springer, Heidelberg (2006)
5. Biham, E., Dunkelman, O.: A framework for iterative hash functions – HAIFA.
Cryptology ePrint Archive, Report 2007/278 (2007)
6. Blackburn, S., Stinson, D., Upadhyay, J.: On the complexity of the herding
attack and some related attacks on hash functions. Des. Codes Cryptography
(to appear, 2011)
7. Coron, J.-S., Dodis, Y., Malinaud, C., Puniya, P.: Merkle-Damgård Revisited: How
to Construct a Hash Function. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621,
pp. 430–448. Springer, Heidelberg (2005)
8. Damgård, I.: A Design Principle for Hash Functions. In: Brassard, G. (ed.)
CRYPTO 1989. LNCS, vol. 435, pp. 416–427. Springer, Heidelberg (1990)
9. Gauravaram, P., Kelsey, J.: Linear-XOR and Additive Checksums Don’t Protect
Damgård-Merkle Hashes from Generic Attacks. In: Malkin, T. (ed.) CT-RSA 2008.
LNCS, vol. 4964, pp. 36–51. Springer, Heidelberg (2008)
10. Gauravaram, P., Kelsey, J., Knudsen, L., Thomsen, S.: On hash functions using
checksums. International Journal of Information Security 9(2), 137–151 (2010)
11. Hirose, S., Park, J.H., Yun, A.: A Simple Variant of the Merkle-Damgård Scheme
with a Permutation. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833,
pp. 113–129. Springer, Heidelberg (2007)
12. Joux, A.: Multicollisions in Iterated Hash Functions. Application to Cascaded Con-
structions. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 306–316.
Springer, Heidelberg (2004)
13. Kelsey, J., Kohno, T.: Herding Hash Functions and the Nostradamus Attack. In:
Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 183–200. Springer,
Heidelberg (2006)
14. Lai, X., Massey, J.L.: Hash Functions Based on Block Ciphers. In: Rueppel, R.A.
(ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 55–70. Springer, Heidelberg (1993)
15. Liskov, M.: Constructing an Ideal Hash Function from Weak Ideal Compression
Functions. In: Biham, E., Youssef, A.M. (eds.) SAC 2006. LNCS, vol. 4356, pp.
358–375. Springer, Heidelberg (2007)
16. Merkle, R.C.: One Way Hash Functions and DES. In: Brassard, G. (ed.) CRYPTO
1989. LNCS, vol. 435, pp. 428–446. Springer, Heidelberg (1990)
17. Neven, G., Smart, N., Warinschi, B.: Hash function requirements for Schnorr sig-
natures. Journal of Mathematical Cryptology 3(1), 69–87 (2009)
18. Suzuki, K., Tonien, D., Kurosawa, K., Toyota, K.: Birthday Paradox for Multi-
Collisions. In: Rhee, M.S., Lee, B. (eds.) ICISC 2006. LNCS, vol. 4296, pp. 29–40.
Springer, Heidelberg (2006)
h -M2→ y is a subpath of any h_i → y for some i, one can replace this subpath
by h -M1→ y to obtain a path h_i → y of length strictly shorter than k* arcs,
rendering a contradiction. Thus, we assume that h -M2→ y is not integrally included
as a subpath of any h_i → y. We split up the path h -M2→ y into three parts.
Let i ∈ {1, . . . , α1} be such that the first edge of h -M2→ y is included in the
path h_i → y. Let M2^{(1)} be the maximal prefix of M2 such that h -M2^{(1)}→ h^{(1)} (for
some h^{(1)}) is a subpath of h_i → y. Secondly, identify the edge leaving³ h^{(1)}
in the path h -M2→ y, and let i′ be such that this edge is included in the path
h_{i′} → y. Let M2^{(2)} be of maximal length such that M2^{(1)}M2^{(2)} is a prefix of M2
and h^{(1)} -M2^{(2)}→ h^{(2)} (for some h^{(2)}) is a subpath of h_{i′} → y. Thus, we split
h -M2→ y into

h -M2^{(1)}→ h^{(1)} -M2^{(2)}→ h^{(2)} -M2^{(3)}→ y,   (5)

where |M2^{(1)}|, |M2^{(2)}| > 0 and |M2^{(3)}| ≥ 0, and

h_i -M3→ h -M2^{(1)}→ h^{(1)} -M4→ y,   h_{i′} -M5→ h^{(1)} -M2^{(2)}→ h^{(2)} -M6→ y,   (6)

for some M3, M4, M5, M6 ∈ ({0, 1}^m)*. Here, M2^{(1)} and M2^{(2)} are of maximal
possible length, i.e. the first arcs of h^{(1)} -M4→ y and h^{(1)} -M2^{(2)}→ h^{(2)} are different,
and the first arcs of h^{(2)} -M6→ y and h^{(2)} -M2^{(3)}→ y are different.
If h^{(1)} = h, the path h_i -M3→ h -M4→ y is in (V(q2), A(q2)) and of length shorter
than k* blocks, rendering a contradiction. Similarly, if h^{(2)} = h^{(1)}, a shorter path
h_{i′} → y can be found. Hence, we consider the case h ≠ h^{(1)} and h^{(1)} ≠ h^{(2)}, and make
the following case distinction:

1. |M4| ≠ |M2^{(2)}M6|. One can combine the two paths described in (6) to obtain
either a path h_i → y or h_{i′} → y of length strictly shorter than k* arcs;
2. |M4| = |M2^{(2)}M6|. We make the following case distinction:
a. |M6| ≥ |M2^{(3)}|. This means that |M4| ≥ |M2^{(2)}M2^{(3)}| and hence |M2^{(1)}M4| ≥
|M2| > |M1|. The path h_i -M3→ h -M1→ y is thus strictly shorter than k* arcs;
b. |M6| < |M2^{(3)}|. One can do the same analysis with the paths h^{(2)} -M6→ y and
h^{(2)} -M2^{(3)}→ y. But by construction |M2^{(3)}| < |M2| - 2m, so one will eventually
end up with the same problem with |M2^{(3)}| = 0, in which case one will not
arrive in case 2b.
Concluding, there does not exist any node in (V′, A′) which has two paths of
different lengths to y.
³ This edge exists, as h -M2→ y is not an integral subpath of any path h_i → y.
On CCA-Secure Somewhat Homomorphic Encryption
Abstract. It is well known that any encryption scheme which supports any form
of homomorphic operation cannot be secure against adaptive chosen ciphertext
attacks. The question then arises as to what is the most stringent security definition
which is achievable by homomorphic encryption schemes. Prior work has
shown that various schemes which support a single homomorphic operation
can be shown to be IND-CCA1, i.e. secure against lunchtime attacks.
In this paper we extend this analysis to the recent fully homomorphic encryp-
tion scheme proposed by Gentry, as refined by Gentry, Halevi, Smart and Ver-
cauteren. We show that the basic Gentry scheme is not IND-CCA1; indeed a
trivial lunchtime attack allows one to recover the secret key. We then show that
a minor modification to the variant of the somewhat homomorphic encryption
scheme of Smart and Vercauteren will allow one to achieve IND-CCA1, indeed
PA-1, in the standard model assuming a lattice based knowledge assumption. We
also examine the security of the scheme against another security notion, namely
security in the presence of ciphertext validity checking oracles; and show why
CCA-like notions are important in applications in which multiple parties submit
encrypted data to the “cloud” for secure processing.
1 Introduction
That some encryption schemes allow homomorphic operations, or exhibit so-called
privacy homomorphisms in the language of Rivest et al. [24], has often been considered
a weakness. This is because any scheme which supports homomorphic operations
is malleable, and hence is unable to achieve the de facto security definition for
encryption, namely IND-CCA2. However, homomorphic encryption schemes do present
a number of functional benefits. For example, schemes which support a single additive
homomorphic operation have been used to construct secure electronic voting schemes,
e.g. [9,12].
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 55–72, 2012.
© Springer-Verlag Berlin Heidelberg 2012
56 J. Loftus et al.
The usefulness of schemes supporting a single homomorphic operation has led some
authors to consider what security definition existing homomorphic encryption schemes
meet. A natural notion to try to achieve is that of IND-CCA1, i.e. security in the presence
of a lunchtime attack. Lipmaa [20] shows that the ElGamal encryption scheme is
IND-CCA1 secure with respect to a hard problem which is essentially the same as the
IND-CCA1 security of the ElGamal scheme; a line of work recently extended in [2] to
other schemes.
A different line of work has been to examine security in the context of Plaintext
Awareness, introduced by Bellare and Rogaway [5] in the random oracle model and
later refined into a hierarchy of security notions (PA-0, -1 and -2) by Bellare and Palacio
[4]. Intuitively a scheme is said to be PA if the only way an adversary can create a valid
ciphertext is by applying encryption to a public key and a valid message. Bellare and
Palacio prove that a scheme which possesses both PA-1 (resp. PA-2) and is IND-CPA,
is in fact secure against IND-CCA1 (resp. IND-CCA2) attacks.
The advantage of Bellare and Palacio’s work is that one works in the standard model
to prove security of a scheme; the disadvantage appears to be that one needs to make
a strong assumption to prove a scheme is PA-1 or PA-2. The assumption required is a
so-called knowledge assumption. That such a strong assumption is needed should not
be surprising as the PA security notions are themselves very strong. In the context of
encryption schemes supporting a single homomorphic operation, Bellare and Palacio
show that the Cramer-Shoup Lite scheme [10] and an ElGamal variant introduced by
Damgård [11] are both PA-1, and hence IND-CCA1, assuming the standard DDH (to
obtain IND-CPA security) and a Diffie–Hellman knowledge assumption (to obtain PA-
1 security). Informally, the Diffie–Hellman knowledge assumption is the assumption
that an algorithm can only output a Diffie–Hellman tuple if the algorithm “knows” the
discrete logarithm of one tuple member with respect to another.
Rivest et al. originally proposed homomorphic encryption schemes so as to enable
arbitrary computation on encrypted data. To perform such operations one would require
an encryption scheme which supports two homomorphic operations, which are “com-
plete” in the sense of allowing arbitrary computations. Such schemes are called fully
homomorphic encryption (FHE) schemes, and it was not until Gentry’s breakthrough
construction in 2009 [15,16] that such schemes could be constructed. Since Gentry’s
construction appeared a number of variants have been proposed, such as [14], as well
as various simplifications [27] and improvements thereof [17]. All such schemes have
been proved to be IND-CPA, i.e. secure under chosen plaintext attack.
At a high level all these constructions work in three stages: an initial somewhat ho-
momorphic encryption (SHE) scheme which supports homomorphic evaluation of low
degree polynomials, a process of squashing the decryption circuit and finally a boot-
strapping procedure which will give fully homomorphic encryption and the evaluation
of arbitrary functions on ciphertexts. In this paper we focus solely on the basic some-
what homomorphic scheme, but our attacks and analysis apply also to the extension
using the bootstrapping process. Our construction of an IND-CCA1 scheme however
only applies to the SHE constructions as all existing FHE constructions require public
keys which already contain ciphertexts; thus with existing FHE constructions the notion
Adv^{IND-atk}_{A,E,λ} = | Pr(b = b′) - 1/2 |.
L = L(B) = {v · B | v ∈ Z^n}
to be the resulting lattice. An integer lattice is a lattice in which all the basis vectors
have integer coordinates.
For any basis there is an associated fundamental parallelepiped, which can be taken
as P(B) = {Σ_{i=1}^{n} x_i · b_i | x_i ∈ [-1/2, 1/2)}. The volume of this fundamental
parallelepiped is given by the absolute value of the determinant of the basis matrix,
Δ = |det(B)|. We denote by λ∞(L) the ∞-norm of a shortest vector (for the ∞-norm) in L.
by a tuple of integers (N, t, μ); we assume there is a function mapping security
parameters λ into tuples (N, t, μ). In practice N will be a power of two, t will be
greater than 2^{√N}, and μ will be a small integer, perhaps one.
KeyGen(1λ )
– Pick an irreducible polynomial F ∈ Z[X] of degree N .
– Pick a polynomial G(X) ∈ Z[X] of degree at most N − 1, with coefficients
bounded by t.
– d ← resultant(F, G).
– G is chosen such that G(X) has a single unique root in common with F (X) modulo
d. Let α denote this root.
– Z(X) ← d/G(X) (mod F (X)).
– pk ← (α, d, μ, F (X)), sk ← (Z(X), G(X), d, F (X)).
In [17] Gentry and Halevi show how to compute, for the polynomial F(X) = X^{2^n} + 1,
the root α and the polynomial Z(X) using a method based on the Fast Fourier Transform.
In particular, they show how this can be done for non-prime values of d (removing
one of the main restrictions in the key generation method proposed in [27]).
By construction, the principal ideal g generated by G(X) in the number field K =
Z[X]/(F (X)) is equal to the ideal with OK basis (d, X − α). In particular, the ideal
g precisely consists of all elements in Z[X]/(F (X)) that are zero when evaluated at
α modulo d. The Hermite-Normal-Form of a basis matrix of the lattice defined by the
coefficient vectors of g is given by
B =
⎛  d                       ⎞
⎜ -α         1             ⎟
⎜ -α²             1        ⎟ ,   (1)
⎜  ⋮                 ⋱     ⎟
⎝ -α^{N-1}              1  ⎠
where the elements in the first column are reduced modulo d.
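The basis (1) is easy to build and check in code; every row, read as the coefficient vector of a polynomial, evaluates to 0 at α modulo d (the toy values of α and d in the test below are hypothetical, not a real key):

```python
def hnf_basis(alpha, d, N):
    """N x N basis of the form (1): d in the top-left corner, first column
    entries (-alpha^i mod d) for rows i >= 1, ones on the remaining diagonal."""
    B = [[0] * N for _ in range(N)]
    B[0][0] = d
    for i in range(1, N):
        B[i][0] = (-pow(alpha, i, d)) % d
        B[i][i] = 1
    return B
```

Row i corresponds to the polynomial X^i - α^i (reduced modulo d), which indeed vanishes at α modulo d, so the rows span a sublattice of the ideal lattice (d, X - α).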
To aid what follows we write Z(X) = z0 + z1·X + . . . + z_{N-1}·X^{N-1} and define

δ∞ = sup { ‖g(X)·h(X) (mod F(X))‖∞ / (‖g(X)‖∞ · ‖h(X)‖∞) : g, h ∈ Z[X], deg(g), deg(h) < N }.

For the choice F(X) = X^N + 1, we have δ∞ = N. The key result for understanding how
the simplification of Smart and Vercauteren to Gentry's scheme works is the following
lemma, adapted from [27].
Lemma 1. Let Z(X), G(X), α and d be as defined in the above key generation
procedure. If C(X) ∈ Z[X]/(F(X)) is a polynomial with ‖C(X)‖∞ < U and c = C(α)
(mod d), then

C(X) = c - ⌊c · Z(X)/d⌉ · G(X) (mod F(X))

for

U = d / (2 · δ∞ · ‖Z(X)‖∞).
On CCA-Secure Somewhat Homomorphic Encryption 61
q(X) = c·Z(X)/d - C(X)·Z(X)/d.

Since q(X) has integer coefficients, we can recover it by rounding the coefficients of
the first term if the coefficients of the second term are strictly bounded by 1/2. This
shows that C(X) can be recovered from c for ‖C(X)‖∞ < d/(2 · δ∞ · ‖Z(X)‖∞).
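The rounding step can be exercised on a hand-sized toy instance: take N = 2, F(X) = X² + 1, G(X) = 10 + X, so d = resultant(F, G) = 101, α = 91 and Z(X) = 10 - X (these hypothetical toy values satisfy G(α) ≡ 0 and α² ≡ -1 modulo d):

```python
from fractions import Fraction

def recover(c, Z, G, d):
    """Lemma-1 recovery C(X) = c - round(c*Z(X)/d) * G(X) (mod X^N + 1),
    with polynomials given as coefficient lists (lowest degree first)."""
    N = len(G)

    def rnd(fr):  # nearest integer of a Fraction
        return (2 * fr.numerator + fr.denominator) // (2 * fr.denominator)

    q = [rnd(Fraction(c * z, d)) for z in Z]
    C = [0] * N
    C[0] = c
    for i in range(N):
        for j in range(N):
            if i + j < N:
                C[i + j] -= q[i] * G[j]
            else:
                C[i + j - N] += q[i] * G[j]  # reduction via X^N = -1
    return C
```

For C(X) = 2 + X, so that ‖C‖∞ = 2 < U = 101/(2·2·10) ≈ 2.5, the evaluation is c = C(α) mod d = 93, and the rounding procedure recovers the coefficient list [2, 1] exactly.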
Note that the above lemma essentially states that if ‖C(X)‖∞ < U, then C(X) is
uniquely determined by its evaluation at α modulo d. Recall that any polynomial H(X)
of degree less than N, whose coefficient vector is in the lattice defined in equation (1),
satisfies H(α) = 0 (mod d). Therefore, if H(X) ≠ 0, the lemma implies, for such an
H, that ‖H(X)‖∞ ≥ U, and thus we conclude that U ≤ λ∞(L). Since the coefficient
vector of G(X) is clearly in the lattice L, we conclude that

U ≤ λ∞(L) ≤ ‖G(X)‖∞.
Although Lemma 1 provides the maximum value of U for which ciphertexts are de-
cryptable, we will only allow a quarter of this maximum value, i.e. T = U/4. As such
we are guaranteed that T ≤ λ∞ (L)/4. We note that T defines the size of the circuit
that the somewhat homomorphic encryption scheme can deal with. Our choice of T will
become clear in Section 5.
Using the above key generation method we can define three variants of the Smart–
Vercauteren variant of Gentry’s scheme. The first variant is the one used in the Gen-
try/Halevi implementation of [17], the second is the general variant proposed by Smart
and Vercauteren, whereas the third divides the decryption procedure into two steps and
provides a ciphertext validity check. In later sections we shall show that the first variant
is not IND-CCA1 secure, and by extension neither is the second variant. However, we
will show that the third variant is indeed IND-CCA1. We will then show that the third
variant is not IND-CVA secure.
Each of the following variants is only a somewhat homomorphic scheme; extending
it to a fully homomorphic scheme can be done using the methods of [15,16,17].
GENTRY–HALEVI VARIANT: The plaintext space is the field F2. The above KeyGen
algorithm is modified to only output keys for which d ≡ 1 (mod 2). This implies that
at least one coefficient of Z(X), say z_{i0}, will be odd. We replace Z(X) in the private
key with z_{i0}, and can drop the values G(X) and F(X) entirely from the private key.
Encryption and decryption can now be defined via the functions:
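Since the functions are easiest to follow on a concrete instance, here is a minimal executable sketch. All parameters below are hand-picked toy values (F(X) = X² + 1, v(X) = 9 + 2X, hence d = 85, α = 38 and odd coefficient z_{i0} = 9), far too small to be secure and not taken from the paper; decryption computes [c · z_{i0}]_d mod 2, as recalled in the analysis later in this section.

```python
# Toy sketch of the Gentry-Halevi variant (hand-picked insecure parameters):
# F(X) = X^2 + 1, v(X) = 9 + 2X, so d = resultant(F, v) = 85, alpha = 38
# (a common root of F and v modulo d), and z_i0 = 9 is the odd coefficient
# of Z(X) = 9 - 2X.
d, alpha, z = 85, 38, 9

def centered_mod(x, n):
    """[x]_n: representative of x mod n in the interval (-n/2, n/2)."""
    r = x % n
    return r - n if r > n // 2 else r

def encrypt(m, R):
    """m in {0, 1}; R = (r0, r1) is the noise polynomial R(X) = r0 + r1*X."""
    C = [m + 2 * R[0], 2 * R[1]]                 # C(X) = m + 2*R(X)
    return (C[0] + C[1] * alpha) % d             # c = [C(alpha)]_d

def decrypt(c):
    """Decrypt via the single odd secret coefficient: [c * z]_d mod 2."""
    return centered_mod(c * z, d) % 2
```

The sketch only decrypts correctly while the noise keeps ‖C(X)‖∞ small, as Lemma 1 requires; with these toy parameters that means coefficients of magnitude at most about 3.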
FULL-SPACE SMART–VERCAUTEREN: In this variant the plaintext space is the algebra
F2[X]/(F(X)), where messages are given by binary polynomials of degree less than
N. As such we call this the Full-Space Smart–Vercauteren system, as the plaintext space
is the full set of binary polynomials, with multiplication and addition defined modulo
F(X). We modify the above key generation algorithm so that it only outputs keys for
which the polynomial G(X) satisfies G(X) ≡ 1 (mod 2). This results in algorithms
defined by:
Encrypt(M(X), pk; r):
– R(X) ← Z[X] s.t. ‖R(X)‖∞ ≤ μ.
– C(X) ← M(X) + 2 · R(X).
– c ← [C(α)]_d.
– Return c.

Decrypt(c, sk):
– C(X) ← c − ⌈c · Z(X)/d⌋.
– M(X) ← C(X) (mod 2).
– Return M(X).
That decryption works, assuming the input ciphertext corresponds to the evaluation of
a polynomial with coefficients bounded by T , follows from Lemma 1 and the fact that
G(X) ≡ 1 (mod 2).
ccSHE: This is our ciphertext-checking SHE scheme (or ccSHE scheme for short).
It is exactly like the above Full-Space Smart–Vercauteren variant in terms of key
generation, but we now check the ciphertext before we output the message. Thus
encryption/decryption become:
Encrypt(M(X), pk; r):
– R(X) ← Z[X] s.t. ‖R(X)‖∞ ≤ μ.
– C(X) ← M(X) + 2 · R(X).
– c ← [C(α)]_d.
– Return c.

Decrypt(c, sk):
– C(X) ← c − ⌈c · Z(X)/d⌋ · G(X).
– C(X) ← C(X) (mod F(X)).
– c′ ← [C(α)]_d.
– If c′ ≠ c or ‖C(X)‖∞ > T, return ⊥.
– M(X) ← C(X) (mod 2).
– Return M(X).
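The checked decryption can likewise be exercised on a toy instance. The key below is hand-picked for illustration and insecure: F(X) = X² + 1, d = 85, α = 38, Z(X) = 9 − 2X and G(X) = 9 + 2X, so that Z(X)·G(X) ≡ d (mod F(X)) and G(X) ≡ 1 (mod 2); the bound T = 3 is likewise a toy choice standing in for T = U/4.

```python
import math
from fractions import Fraction

# Hand-picked toy key (insecure): F(X) = X^2 + 1, d = 85, alpha = 38,
# Z(X) = 9 - 2X, G(X) = 9 + 2X, so that Z*G = 85 = d (mod F) and G = 1 (mod 2).
d, alpha = 85, 38
Z, G = [9, -2], [9, 2]       # coefficient lists, lowest degree first
T = 3                        # toy size bound standing in for T = U/4

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def reduce_mod_F(p):
    """Reduce modulo F(X) = X^2 + 1, i.e. substitute X^2 = -1."""
    out = [0, 0]
    for i, coef in enumerate(p):
        out[i % 2] += coef * (-1) ** (i // 2)
    return out

def encrypt(M, R):
    C = [m + 2 * r for m, r in zip(M, R)]        # C(X) = M(X) + 2*R(X)
    return (C[0] + C[1] * alpha) % d             # c = [C(alpha)]_d

def decrypt(c):
    # C(X) <- c - round(c*Z(X)/d) * G(X)  (mod F(X)), rounded per coefficient
    q = [math.floor(Fraction(c * zi, d) + Fraction(1, 2)) for zi in Z]
    qG = reduce_mod_F(poly_mul(q, G))
    C = [c - qG[0], -qG[1]]
    # ciphertext validity check: re-evaluate at alpha, enforce the size bound T
    if (C[0] + C[1] * alpha) % d != c or max(abs(ci) for ci in C) > T:
        return None                              # the reject symbol ⊥
    return [ci % 2 for ci in C]                  # M(X) = C(X) mod 2
```

For instance decrypt(21) recovers the small representative [3, −4], whose norm exceeds the toy bound T, so the validity check rejects.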
ANALYSIS: The core idea of the algorithm is simple: in each step we choose a “ciphertext”
c such that the length of the interval for the quantity c · z is bounded by d. Since in
each step z ∈ [L, U], we need to take c = ⌊d/(U − L)⌋. As such it is easy to see that
c(U − L) ≤ d.
To reduce cL, we need to subtract kd such that −d/2 ≤ cL − kd < d/2, which
shows that k = ⌊cL/d + 1/2⌋. Furthermore, since the length of the interval for c · z
is bounded by d, there will be exactly one number of the form d/2 + id in [cL, cU],
namely d/2 + kd. This means that there is exactly one boundary B = (k + 1/2)d/c in
the interval for z.
Define q as the unique integer such that −d/2 ≤ cz − qd < d/2; then, since the
length of the interval for c · z is bounded by d, we either have q = k or q = k + 1.
To distinguish between the two cases, we simply look at the output of the decryption
oracle: recall that the oracle outputs [c · z]_d (mod 2), i.e. the bit output by the oracle is
(cz − qd) mod 2; since both z and d are odd, this equals (c + q) mod 2, which, as k is
known, reveals whether q = k or q = k + 1.
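Concretely, the interval-bisection steps above can be run against a simulated decryption oracle. The sketch below assumes an odd modulus d and an odd secret z, as the Gentry–Halevi-style key generation guarantees, and uses the fact that the oracle bit then equals (c + q) mod 2; the concrete numbers in the usage are toy values.

```python
def recover_secret(d, oracle):
    """Binary search for an odd secret z in [1, d] given the parity oracle
    c -> [c*z]_d mod 2, following the interval analysis above."""
    L, U = 1, d
    while L < U:
        c = d // (U - L)                      # ensures c*(U - L) <= d
        k = (2 * c * L + d) // (2 * d)        # k = floor(c*L/d + 1/2)
        # q is k or k+1; the oracle bit equals (c + q) mod 2 since z, d odd
        q = k if oracle(c) == (c + k) % 2 else k + 1
        if q == k:   # c*z - k*d < d/2, i.e. 2*c*z < (2k+1)*d
            U = ((2 * k + 1) * d - 1) // (2 * c)
        else:        # c*z - (k+1)*d >= -d/2, i.e. 2*c*z >= (2k+1)*d
            L = -((-(2 * k + 1) * d) // (2 * c))   # ceiling division
    return L

def make_oracle(z, d):
    """Simulate the decryption oracle c -> [c*z]_d mod 2."""
    def oracle(c):
        r = (c * z) % d
        if r > d // 2:
            r -= d        # centered representative in (-d/2, d/2)
        return r % 2
    return oracle
```

Each oracle query roughly halves the interval, so the secret is recovered with on the order of log₂ d queries, e.g. recover_secret(101, make_oracle(37, 101)) returns 37.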
5 ccSHE is PA-1
In this section we prove that the ccSHE encryption scheme given earlier is PA-1,
assuming a lattice knowledge assumption holds. We first recap on the definition of PA-1
in the standard model, and then we introduce our lattice knowledge assumption. Once
this is done we present the proof.
PLAINTEXT AWARENESS – PA-1: The original intuition for the introduction of plaintext
awareness was as follows: if an adversary knows the plaintext corresponding to
every ciphertext it produces, then the adversary has no need for a decryption oracle and
hence PA+IND-CPA must imply IND-CCA. Unfortunately, there are subtleties in the
definition of plaintext awareness, leading to three definitions: PA-0, PA-1 and PA-2.
However, after suitably formalizing the definitions, PA-x plus IND-CPA implies IND-CCAx,
for x = 1 and 2. In our context we are only interested in IND-CCA1 security, so
we will only discuss the notion of PA-1 in this paper.
Before formalizing PA-1 it is worth outlining some of the terminology. We have a
polynomial time adversary A called a ciphertext creator, that takes as input a public key
and can query ciphertexts to an oracle. An algorithm A∗ is called a successful extractor
for A if it can provide responses to A which are computationally indistinguishable from
those provided by a decryption oracle. In particular a scheme is said to be PA-1 if there
exists a successful extractor for any ciphertext creator that makes a polynomial number
of queries. The extractor gets the same public key as A and also has access to the
random coins used by algorithm A. Following [4] we define PA-1 formally as follows:
Definition 1 (PA1). Let E be a public key encryption scheme and A be an algorithm
with access to an oracle O taking input pk and returning a string. Let D be an algorithm
that takes as input a string and returns a single bit and let A∗ be an algorithm which
takes as input a string and some state information and returns either a string or the
symbol ⊥, plus a new state. We call A a ciphertext creator, A∗ a PA-1-extractor, and D
a distinguisher. For security parameter λ we define the (distinguishing and extracting)
experiments in Figure 1, and then define the PA-1 advantage to be
    Adv^{PA-1}_{E,A,D,A*}(λ) = Pr(Exp^{PA-1-d}_{E,A,D}(λ) = 1) − Pr(Exp^{PA-1-x}_{E,A,D,A*}(λ) = 1).

Fig. 1. The experiments Exp^{PA-1-d}_{E,A,D}(λ) and Exp^{PA-1-x}_{E,A,A*}(λ)
In the extracting experiment, A's oracle queries are responded to by the PA-1-extractor. If A* did not receive the coins coins[A] from A
then it would need to be functionally equivalent to the real decryption oracle, thus the fact that
A* gets access to the coins in the second experiment is crucial. Also note that the distinguisher
acts independently of A*, and thus this is strictly stronger than having A decide
as to whether it is interacting with an extractor or a real decryption oracle.
The intuition is that A∗ acts as the unknowing subconscious of A, and is able to
extract knowledge about A’s queries to its oracle. That A∗ can obtain the underlying
message captures the notion that A needs to know the message before it can output a
valid ciphertext.
The following lemma is taken from [4] and will be used in the proof of the main
theorem.

Lemma 2. Let E be a public key encryption scheme. Let A be a polynomial-time
ciphertext creator attacking E, D a polynomial-time distinguisher, and A* a polynomial-time
PA-1-extractor. Let DecOK denote the event that all of A*'s answers to A's queries
are correct in experiment Exp^{PA-1-x}_{E,A,D,A*}(λ). Then,

    Pr(Exp^{PA-1-x}_{E,A,D,A*}(λ) = 1) ≥ Pr(Exp^{PA-1-d}_{E,A,D}(λ) = 1) − Pr(¬DecOK).
Definition 2 (LK-ε). Let ε be a fixed constant in the interval (0, 1/2). Let G denote an
algorithm which on input of a security parameter 1^λ outputs a lattice L given by a basis
B of dimension n = n(λ) and volume Δ = Δ(λ). Let C be an algorithm that takes
a lattice basis B as input, has access to an oracle O, and returns nothing. Let C*
denote an algorithm which takes as input a vector c ∈ R^n and some state information,
and returns another vector p ∈ R^n plus a new state. Consider the experiment in Figure
2. The LK-ε advantage of C relative to C* is defined by

    Adv^{LK-ε}_{G,C,C*}(λ) = Pr[Exp^{LK-ε}_{G,C,C*}(λ) = 1].

We say G satisfies the LK-ε assumption, for a fixed ε, if for every polynomial time C
there exists a polynomial time C* such that Adv^{LK-ε}_{G,C,C*}(λ) is a negligible function of λ.
The algorithm C is called an LK-ε adversary and C* an LK-ε extractor. We now discuss
this assumption in more detail. Notice that, for all lattices, if ε < 1/4 then the probability
of a random vector being within ε · λ∞(L) of the lattice is bounded from above
by 1/2^n, and for lattices which are not highly orthogonal this is likely to hold for all
ε up to 1/2.

Exp^{LK-ε}_{G,C,C*}(λ):
– B ← G(1^λ).
– Choose coins coins[C] (resp. coins[C*]) for C (resp. C*).
– St ← (B, coins[C]).
– Run C^O(B; coins[C]) until it halts, replying to the oracle queries O(c) as follows:
  • (p, St) ← C*(c, St; coins[C*]).
  • If p ∉ L(B), return 1.
  • If ‖p − c‖∞ > ε · λ∞(L), return 1.
  • Return p to C.
– Return 0.

Our choice of T in the ccSHE scheme as U/4 is to guarantee that our lattice
knowledge assumption is applied with ε = 1/4, and hence is more likely to hold.
If the query c which C asks of its oracle is within ε · λ∞(L) of a lattice point then we
require that C* finds such a close lattice point. If it does not then the experiment will
output 1; the assumption is that this happens with negligible probability.
Notice that if C asks its oracle a query of a vector which is not within ε · λ∞(L) of a
lattice point then the algorithm C* may do whatever it wants. However, to determine this
condition within the experiment we require that the environment running the experiment
is all powerful, in particular, that it can compute λ∞ (L) and decide whether a vector
is close enough to the lattice. Thus our experiment, but not algorithms C and C ∗ , is
assumed to be information theoretic. This might seem strange at first sight but is akin
to a similarly powerful game experiment in the strong security model for certificateless
encryption [1], or the definition of insider unforgeable signcryption in [3].
For certain input bases, e.g. reduced ones or ones of small dimension, an algorithm
C* can be constructed from standard algorithms that solve the CVP problem. This does not
contradict our assumption, since C would also be able to apply such an algorithm and
hence “know” the close lattice point. Our assumption is that when this is not true, the
only way C could generate a close lattice point (for small enough values of ε) is by
computing x ∈ Z^n and perturbing the vector x · B.
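For intuition, consider the degenerate case L = Z^n with the identity basis, where λ∞(L) = 1 and CVP is solved exactly by coordinate-wise rounding; the extractor then answers every close query correctly and the experiment outputs 0. The query values and the choice ε = 1/4 below are illustrative assumptions, not the paper's generator.

```python
# Toy LK-eps experiment for L = Z^n (identity basis): lambda_inf(L) = 1,
# and coordinate-wise rounding solves CVP exactly.
EPS = 0.25   # eps * lambda_inf(L) with eps = 1/4

def extractor(c):
    return [round(x) for x in c]                 # nearest point of Z^n

def lk_experiment(queries):
    for c in queries:
        p = extractor(c)
        if any(pi != int(pi) for pi in p):       # p not in the lattice Z^n
            return 1
        if max(abs(pi - ci) for pi, ci in zip(p, c)) > EPS:
            return 1                             # p too far from the query
    return 0                                     # the extractor never failed
```

Queries within EPS of a lattice point are answered correctly, so lk_experiment([[3.2, -1.8, 0.1]]) returns 0, whereas a far query such as [0.5, 0.0, 0.0] makes the experiment return 1.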
MAIN THEOREM:
Theorem 1. Let G denote the lattice basis generator induced from the KeyGen algorithm
of the ccSHE scheme, i.e. for a given security parameter 1^λ, run KeyGen(1^λ)
to obtain pk = (α, d, μ, F(X)) and sk = (Z(X), G(X), d, F(X)), and generate the
lattice basis B as in equation (1). Then, if G satisfies the LK-ε assumption for ε = 1/4,
the ccSHE scheme is PA-1.
Proof. Let A be a polynomial-time ciphertext creator attacking the ccSHE scheme,
then we show how to construct a polynomial time PA1-extractor A∗ . The creator A
takes as input the public key pk = (α, d, μ, F (X)) and random coins coins[A] and
returns an integer as the candidate ciphertext. To define A*, we will exploit A to build a
polynomial-time LK-ε adversary C attacking the generator G. By the LK-ε assumption
there exists a polynomial-time LK-ε extractor C*, which will serve as the main building
block for the PA-1-extractor A*. The description of the LK-ε adversary C is given in
Figure 3 and the description of the PA-1-extractor A* is given in Figure 4.
• Let c′ = [C(α)]_d.
• If c′ ≠ c or ‖C(X)‖∞ ≥ T, then M(X) ← ⊥, else M(X) ← C(X) (mod 2).
• Return M(X) to A as the oracle response.
– Halt.

Fig. 3. LK-ε adversary

Fig. 4. PA-1-extractor
When c is an invalid ciphertext the real decryption oracle will always output
⊥, and it is easily seen that our PA-1-extractor A* will also output ⊥. Thus in
the case of an invalid ciphertext the adversary A cannot tell the two oracles apart. The
theorem now follows from combining the inequality Pr(¬DecOK) ≤ Adv^{LK-ε}_{G,C,C*}(λ) with
Lemma 2 as follows:

    Adv^{PA-1}_{E,A,D,A*}(λ) ≤ Pr(Exp^{PA-1-d}_{E,A,D}(λ) = 1) − Pr(Exp^{PA-1-d}_{E,A,D}(λ) = 1) + Pr(¬DecOK)
                            ≤ Adv^{LK-ε}_{G,C,C*}(λ).
We now show that our ccSHE scheme is not secure when the attacker, after being given
the target ciphertext c∗ , is given access to an oracle OCVA (c) which returns 1 if c is
a valid ciphertext (i.e. the decryption algorithm would output a message), and which
returns 0 if it is invalid (i.e. the decryption algorithm would output ⊥). Such an “oracle”
can often be obtained in the real world by the attacker observing the behaviour of a party
who is fed ciphertexts of the attackers choosing. Since a CVA attack is strictly weaker
than a IND-CCA2 attack it is an interesting open (and practical) question as to whether
an FHE scheme can be CVA secure.
We now show that the ccSHE scheme is not CVA secure, by presenting a relatively
trivial attack: Suppose the adversary is given a target ciphertext c∗ associated with a
hidden message m*. Using the method in Algorithm 2 it is easy to determine the message
using access to O_CVA(c). Basically, we add multiples of α^i to the ciphertext
until it does not decrypt; this allows us to perform a binary search on the i-th coefficient
of C(X), since we know the bound T on the coefficients of C(X).
If c_i is the i-th coefficient of the actual C(X) underlying the target ciphertext c*, then
the i-th coefficient of the polynomial underlying the ciphertext c being passed to the O_CVA
– The ciphertext validity check for E is computable using only the secret key sk′.
– CCA1 security of E is not compromised by leakage of sk′.
Prover                                              Verifier
c1 = Encrypt(m, pk1; r1)
c2 = Encrypt(m, pk2; r2)
                        —— c1, c2 ——→
y ← {0, 1}
a1 ← Encrypt(y, pk1; s1)
a2 ← Encrypt(y, pk2; s2)
                        —— a1, a2 ——→
                                                    e ← {0, 1}
                        ←——— e ———
z ← y ⊕ e · m
t1 ← s1 + e · r1 + e · y · m
t2 ← s2 + e · r2 + e · y · m
                        —— z, t1, t2 ——→
                                                    Accept if and only if
                                                    Encrypt(z, pk1; t1) = a1 + e · c1 and
                                                    Encrypt(z, pk2; t2) = a2 + e · c2.
References
1. Al-Riyami, S.S., Paterson, K.G.: Certificateless Public Key Cryptography. In: Laih, C.-S.
(ed.) ASIACRYPT 2003. LNCS, vol. 2894, pp. 452–473. Springer, Heidelberg (2003)
2. Armknecht, F., Peter, A., Katzenbeisser, S.: A cleaner view on IND-CCA1 secure homomor-
phic encryption using SOAP. IACR e-print 2010/501 (2010),
http://eprint.iacr.org/2010/501
3. Baek, J., Steinfeld, R., Zheng, Y.: Formal proofs for the security of signcryption. Journal of
Cryptology 20(2), 203–235 (2007)
4. Bellare, M., Palacio, A.: Towards Plaintext-Aware Public-Key Encryption Without Random
Oracles. In: Lee, P.J. (ed.) ASIACRYPT 2004. LNCS, vol. 3329, pp. 48–62. Springer, Hei-
delberg (2004)
5. Bellare, M., Rogaway, P.: Optimal Asymmetric Encryption. In: De Santis, A. (ed.) EURO-
CRYPT 1994. LNCS, vol. 950, pp. 92–111. Springer, Heidelberg (1995)
6. Bendlin, R., Damgård, I., Orlandi, C., Zakarias, S.: Semi-Homomorphic Encryption and
Multiparty Computation. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632,
pp. 169–188. Springer, Heidelberg (2011)
7. Bernhard, D., Cortier, V., Pereira, O., Smyth, B., Warinschi, B.: Adapting Helios for Provable
Ballot Privacy. In: Atluri, V., Diaz, C. (eds.) ESORICS 2011. LNCS, vol. 6879, pp. 335–354.
Springer, Heidelberg (2011)
8. Bleichenbacher, D.: Chosen Ciphertext Attacks Against Protocols based on the RSA Encryp-
tion Standard PKCS #1. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 1–12.
Springer, Heidelberg (1998)
9. Cramer, R., Gennaro, R., Schoenmakers, B.: A Secure and Optimally Efficient Multi-
Authority Election Scheme. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233,
pp. 103–118. Springer, Heidelberg (1997)
10. Cramer, R., Shoup, V.: A Practical Public Key Cryptosystem Provably Secure Against Adap-
tive Chosen Ciphertext Attack. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462,
pp. 13–25. Springer, Heidelberg (1998)
11. Damgård, I.B.: Towards Practical Public Key Systems Secure against Chosen Ciphertext
Attacks. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 445–456. Springer,
Heidelberg (1992)
12. Damgård, I., Groth, J., Salomonsen, G.: The theory and implementation of an electronic
voting system. In: Secure Electronic Voting, pp. 77–99. Kluwer Academic Publishers (2002)
13. Dent, A.: A Designer’s Guide to KEMs. In: Paterson, K.G. (ed.) Cryptography and Coding
2003. LNCS, vol. 2898, pp. 133–151. Springer, Heidelberg (2003)
14. van Dijk, M., Gentry, C., Halevi, S., Vaikuntanathan, V.: Fully Homomorphic Encryption
Over the Integers. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 24–43.
Springer, Heidelberg (2010)
15. Gentry, C.: Fully homomorphic encryption using ideal lattices. In: Symposium on Theory of
Computing – STOC 2009, pp. 169–178. ACM (2009)
16. Gentry, C.: A fully homomorphic encryption scheme. PhD, Stanford University (2009)
17. Gentry, C., Halevi, S.: Implementing Gentry’s Fully-Homomorphic Encryption Scheme. In:
Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 129–148. Springer, Heidel-
berg (2011)
18. Hu, Z.-Y., Sun, F.-C., Jiang, J.-C.: Ciphertext verification security of symmetric encryption
schemes. Science in China Series F 52(9), 1617–1631 (2009)
19. Joye, M., Quisquater, J., Yung, M.: On the Power of Misbehaving Adversaries and Security
Analysis of the Original EPOC. In: Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp.
208–222. Springer, Heidelberg (2001)
20. Lipmaa, H.: On the CCA1-security of ElGamal and Damgård’s ElGamal. In: Lai, X., Yung,
M., Lin, D. (eds.) Inscrypt 2010. LNCS, vol. 6584, pp. 18–35. Springer, Heidelberg (2011)
21. Manger, J.: A Chosen Ciphertext Attack on RSA Optimal Asymmetric Encryption Padding
(OAEP) as Standardized in PKCS #1 v2.0. In: Kilian, J. (ed.) CRYPTO 2001. LNCS,
vol. 2139, pp. 230–238. Springer, Heidelberg (2001)
22. Naor, M., Yung, M.: Public-key cryptosystems provably secure against chosen ciphertext
attacks. In: Symposium on Theory of Computing – STOC 1990, pp. 427–437. ACM (1990)
23. Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. In: Sym-
posium on Theory of Computing – STOC 2005, pp. 84–93. ACM (2005)
24. Rivest, R.L., Adleman, L., Dertouzos, M.L.: On data banks and privacy homomorphisms. In:
Foundations of Secure Computation, pp. 169–177 (1978)
25. Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. Journal
ACM 56(6), 1–40 (2009)
26. Smart, N.P.: Errors Matter: Breaking RSA-Based PIN Encryption with Thirty Ciphertext
Validity Queries. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 15–25. Springer,
Heidelberg (2010)
27. Smart, N.P., Vercauteren, F.: Fully Homomorphic Encryption with Relatively Small Key
and Ciphertext Sizes. In: Nguyen, P.Q., Pointcheval, D. (eds.) PKC 2010. LNCS, vol. 6056,
pp. 420–443. Springer, Heidelberg (2010)
28. Smyth, B., Cortier, V.: Attacking and fixing Helios: An analysis of ballot secrecy. In: IEEE
Computer Security Foundations Symposium – CSF 2011 (to appear, 2011)
Efficient Schemes for Anonymous Yet Authorized
and Bounded Use of Cloud Resources
Daniel Slamanig
1 Introduction
Cloud computing is an emerging paradigm, but significant attention remains
justifiably focused on security and privacy concerns. One reason, among others,
is that customers have to trust both the security mechanisms and configuration
of the cloud provider and the cloud provider itself. Recently, different
cryptographic solutions to improve privacy, mainly focusing on private storage,
private computations and private service usage, have been proposed; they are
briefly discussed below.
Storing data encrypted seems to be a sine qua non in many cloud storage
settings, since cloud providers, having access to the storage infrastructure, can
neither be considered fully trustworthy nor resistant to attacks. Kamara
and Lauter [25] propose several architectures for cryptographic cloud storage and
provide a sound overview of recent non-standard cryptographic primitives like
searchable encryption and attribute-based encryption, which are valuable tools
in this context. Other issues are data privacy and verifiability when outsourcing
data and performing computations on these data using the cloud as computa-
tion infrastructure. The recent introduction of fully homomorphic encryption
[24] is a promising concept for performing arbitrary computation on encrypted
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 73–91, 2012.
© Springer-Verlag Berlin Heidelberg 2012
74 D. Slamanig
data. Up to now these concepts are far from being practical, although for some
practical applications somewhat homomorphic schemes seem to be promising
[26]. Another interesting issue from a privacy perspective is to hide users'
usage behavior (access patterns and frequencies) when accessing cloud services.
More precisely, users may not want the cloud provider to learn how often they
use a service or which resources they access. Nevertheless, cloud providers can
be assumed to restrict access to authorized users, and additionally users
may want to enforce (attribute-based) access control policies. Some approaches
to realize this are anonymous credential systems [3], oblivious transfer [6,7] or
oblivious RAM [23].
In this paper we discuss an additional aspect, which may be valuable when
moving towards privacy-friendly cloud computing, especially when used in
conjunction with the aforementioned approaches. In particular, we focus
on the anonymous yet authorized and bounded use of cloud resources like CPU
time (e.g. CPU hours) or storage space. We note that in this paper
we illustrate our concept by means of the resource storage space. Think for
instance of anonymous document publishing services provided by organizations
like WikiLeaks or the American Civil Liberties Union (ACLU) who may use
third-party cloud storage services like Amazon's S3¹ as their document stores.
In this example, WikiLeaks or the ACLU may wish to store documents in the cloud,
but may not want the cloud provider, e.g. Amazon, to learn how much storage
they (and their users, respectively) consume. These organizations may also require
their users to respect storage limits, since they will have to pay for the storage,
but at the same time provide their users with anonymity. Another example is
clients who outsource computations to the cloud and want to hide their usage patterns.
Our Contribution. We consider a setting where users should be able to register
and obtain a resource bound (limit) from a cloud provider (CP) in form of a
“partially blindly signed” token. This token includes an identifier, the already
consumed resources and the limit, whereas the limit is in fact the only value signed
in the clear. This limit determines how much of a resource, e.g. CPU time or storage
space, a user is allowed to consume. Then, users should be able to consume their
resources in an anonymous and unlinkable yet authorized fashion. For instance, if
a user wants to consume l resources, he has to convince the CP that he possesses
a signed token with a valid identifier (double-spending protection) and that his
consumed resources (including l) do not exceed his bound. If this holds, the
anonymous user is allowed to consume the resources and obtains an updated
signature for a token corresponding to a new identifier and updated consumed
resources. Note that, due to the anonymity and unlinkability properties, the CP is
unable to track how much a user has already consumed; however, it can be sure
that the user solely consumes what he has been granted. Furthermore, a user may also
reclaim resources, e.g. when deleting data or when computations did not require
the preassigned time, while still hiding his usage pattern.
We for the first time consider this problem and provide a definition for the
concept of anonymous yet authorized and bounded cloud resource schemes along
¹ http://aws.amazon.com/s3/
Efficient Anonymous Yet Authorized and Bounded Use of Clouds 75
fit our scenario but could be mapped to it. However, [7] do not provide an
efficiency analysis in their work and their protocols seem to be quite costly. Their
rechargeable wallets are an interesting feature, and recharging is also supported
by our second scheme in Section 4.4.
2 Definition
2.1 Problem Description and Motivation
In our setting we have a cloud provider (CP) and a set of users U . Our main goal
is that users are able to purchase a contingent of resources (we focus on storage
space here) and CP does not learn anything about the resource consumption
behavior of users. In particular, users can store data at the CP as long as there
are still resources from their contingent available. In any interaction with a user,
the CP is convinced that the user is allowed to consume (or reclaim) resources,
but can neither identify the user nor link any of the user's actions. Clearly, if the
resource is storage space and the data objects contain information on the user,
then this may break the anonymity property. Nevertheless, we can then assume
that data is encrypted, which seems to be a sine qua non in many cloud storage
settings.
Our main motivation is that it is very likely that only a few large cloud
providers will own large portions of the infrastructure of the future Internet.
Thus, these cloud providers will eventually be able to link data and information
about resource consumption behavior of their consumers (users) allowing them
to build extensive dossiers. Since for many enterprises such transparency can
be too intrusive, or problematic if this information is available to their competitors,
we want to hide this information from cloud providers. As for instance
argued in [18], activity patterns may constitute confidential business information
and if divulged could lead to reverse-engineering of customer base, revenue size,
and the like.
– Consume. In this protocol user u wants to consume l units from his remaining
resources. The user shows value t.id of a token t and convinces the CP that
he holds a valid signature σ_t for token t. If the token was not already spent
(t.id is not contained in BL), the signature is valid and there are still enough
resources left, i.e. s + l ≤ L (or L − l ≥ 0), then the user's output is accept
and an updated token t′ for resource limit L and actually consumed resources
s + l (or L − l) with an updated signature σ_{t′} from CP. Otherwise the user's
output is reject. The output of CP is a transcript T_C.
– Reclaim. In this protocol user u wants to reclaim l units, e.g. he wants to
delete some data of size l. The protocol is exactly the same as the Consume
protocol, except that in the accept case the updated token t′ contains s − l (or
L + l) as the actually consumed resources, and the transcript is denoted as
T_R. We emphasize that u needs to prove by some means that he is allowed
to reclaim l resources, e.g. when deleting some data the user needs to prove
knowledge of some secret associated with the data during the integration.
Otherwise, users could simply run arbitrarily many Reclaim protocols to
illicitly reclaim resources and indirectly improve their actual resource limit
(see the end of Section 4.3 for a discussion).
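Stripped of the cryptography (blind signatures, ZK proofs, anonymity), the bookkeeping that Consume and Reclaim perform amounts to simple state transitions on tokens plus the blacklist BL; the names and the blacklist handling below are an illustrative sketch, not the paper's protocol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    id: int      # one-show identifier
    L: int       # resource limit
    s: int       # resources consumed so far

blacklist = set()   # ids of already-spent tokens (BL)

def consume(t, l, fresh_id):
    """Spend l units; returns the updated token or None (reject)."""
    if t.id in blacklist or t.s + l > t.L:
        return None
    blacklist.add(t.id)                  # double-spending protection
    return Token(fresh_id, t.L, t.s + l)

def reclaim(t, l, fresh_id):
    """Give back l units (e.g. after deleting data of size l)."""
    if t.id in blacklist or t.s - l < 0:
        return None
    blacklist.add(t.id)
    return Token(fresh_id, t.L, t.s - l)
```

In the real scheme the CP never sees L or s in the clear; it only verifies, in zero knowledge, that these transitions were applied correctly.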
during all Consume protocols is at most L − v. Then, a bit b is secretly and randomly
chosen and A runs a Consume protocol with value at most v (or a Reclaim protocol)
with the user who was issued his initial token during the ObtainLimit protocol
corresponding to T_{OL_b}. Finally, A outputs a bit b′ and we say that A has won the
game if b = b′ holds. We require that for every efficient adversary A the probability
of winning the game differs from 1/2 at most by a negligible fraction (the
intuition why we require the sum to be L − v and the last Consume is performed
with respect to value at most v is to rule out trivial attacks³).
Unforgeability. It is required that no collusion of users can spend more tokens
(which will be accepted in Consume or Reclaim protocols) than they have been
issued. Furthermore, no collusion of users must be able to consume more re-
sources than they have obtained. Formally, we consider a game and provide the
adversary A with a public key pk generated by the ProviderSetup algorithm.
Then, during the game A can
– execute ObtainLimit protocols with an honest CP and
– execute Consume and Reclaim protocols with an honest CP.
At some point, A specifies a sequence t = (t_1, . . . , t_n) of valid tokens (which
were not already shown) and at the end of the game the adversary
– outputs a token t′ either not contained in t,
– or a modified token t′ corresponding to a token t_i in t, whereas it holds that
t′.id ≠ t_i.id and/or t′.L ≠ t_i.L and/or t′.s ≠ t_i.s (or, if L is not explicitly
included, t′.L′ ≠ t_i.L′),
– and conducts a Consume or Reclaim protocol with an honest CP.
We require that for every efficient adversary A the probability that the Consume
or Reclaim protocol in the last step terminates with accept is negligible.
3 Preliminaries
Essential ingredients for our construction are honest-verifier zero-knowledge
proofs of knowledge (Σ-protocols). We use the notation from [13], i.e. a proof
of knowledge of a discrete logarithm x = log_g y to the base g will be denoted
as PK{(α) : y = g^α}, whereas Greek letters always denote values whose knowledge
will be proven. We note that compositions of single Σ-protocols using
conjunctions and disjunctions can be efficiently realized [20]. Furthermore, the
non-interactive version of a (composed) proof, obtained by applying the Fiat–Shamir
transform [22], is denoted as a signature of knowledge, or SPK for short.
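A minimal instance of such an SPK is the Fiat–Shamir-transformed Schnorr proof for PK{(α) : y = g^α}. The sketch assumes a toy order-11 subgroup of Z*_23 and a fixed prover nonce for reproducibility; a real prover draws k at random and the group is large.

```python
import hashlib

q, p, g = 23, 11, 2      # toy: <g> has prime order p = 11 in Z_23^*

def challenge(y, t):
    """Fiat-Shamir challenge derived by hashing the statement and commitment."""
    h = hashlib.sha256(f"{g},{y},{t}".encode()).hexdigest()
    return int(h, 16) % p

def spk_prove(x, k=5):
    """SPK{(alpha): y = g^alpha} for witness x; k is the prover's nonce."""
    y = pow(g, x, q)
    t = pow(g, k, q)               # commitment
    c = challenge(y, t)
    s = (k + c * x) % p            # response
    return y, (t, s)

def spk_verify(y, proof):
    t, s = proof
    return pow(g, s, q) == (t * pow(y, challenge(y, t), q)) % q
```

Verification checks g^s = t · y^c, which holds exactly when the prover knew x = log_g y (or broke the hash).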
³ The adversary could run one single ObtainLimit protocol and run Consume till the
user has no more available resources, i.e. Consume protocols will terminate with
reject. Then, before going into the challenge phase, the adversary can run another
ObtainLimit protocol and output those two transcripts. Obviously, he will be able to
assign a Consume protocol to the correct user, since one will terminate with reject
and the other one with accept.
Bilinear Maps. Let G and G_t be two groups of prime order p, let g be a generator
of G and e : G × G → G_t a bilinear map between these two groups. The
map e must satisfy the following properties:
1. Bilinear: for all u, v ∈ G and a, b ∈ Z_p we have e(u^a, v^b) = e(u, v)^{ab}.
2. Non-degenerate: e(g, g) ≠ 1.
3. Computable: there is an efficient algorithm to compute e(u, v) for any u, v ∈ G.
Though the group operation in G is in general an additive one, we express both
groups using multiplicative notation. This notation is commonly used, since G_t is
always multiplicative, and it makes cryptographic protocols easier to follow.
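The bilinearity property can be checked numerically in a toy "exponent model" of a symmetric pairing, in which an element g^a of G is represented by its (known) exponent a; this is purely a device for verifying pairing equations and has no cryptographic meaning.

```python
p = 11        # common prime order of G and G_t (toy size)
q = 23        # G_t realized as the order-11 subgroup of Z_23^*
gt = 2        # fixed generator of G_t, standing in for e(g, g)

def e(a, b):
    """Pairing in the exponent model: e(g^a, g^b) = e(g, g)^(a*b)."""
    return pow(gt, (a * b) % p, q)

# Bilinearity: e(u^a, v^b) = e(u, v)^(a*b) for u = g^xu, v = g^xv.
xu, xv, a, b = 3, 5, 4, 7
lhs = e((a * xu) % p, (b * xv) % p)
rhs = pow(e(xu, xv), a * b, q)
```

Here lhs == rhs, and e(1, 1) = gt ≠ 1 mirrors the non-degeneracy requirement.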
Pedersen Commitments. Pedersen commitments [30] represent a widely used
commitment scheme working in any group G of prime order p. Let g, h be random
generators of G, where log_g h is unknown. To commit to a value s ∈ Z_p, one
chooses r ∈_R Z_p and computes C(s, r) = g^s h^r, which unconditionally hides
s as long as r is unknown. To open the commitment, one simply publishes
(s, r, C(s, r)) and one verifies whether g^s h^r = C(s, r) holds. For simplicity, we
often write C(s) for a commitment to s instead of C(s, r). We note that the
Pedersen commitment has an additively homomorphic property, i.e. given two
commitments C(s1, r1) = g^{s1} h^{r1} and C(s2, r2) = g^{s2} h^{r2}, one is able to
compute C(s1 + s2, r1 + r2) = C(s1, r1) · C(s2, r2) without knowing any
of the hidden values s1 or s2. Furthermore, note that a proof of knowledge
PK{(α, β) : C = g^α h^β} of the ability to open a Pedersen commitment can be
realized using a proof of knowledge of a DL representation of C with respect to
the elements g and h [28].
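A toy instantiation of Pedersen commitments and their homomorphic property; the parameters are assumptions for illustration (the order-11 subgroup of Z*_23 with generators g = 2 and h = 3), whereas in practice the group is large and log_g h must be unknown.

```python
q, p = 23, 11            # toy: subgroup of prime order p = 11 in Z_23^*
g, h = 2, 3              # two generators of that subgroup

def commit(s, r):
    """Pedersen commitment C(s, r) = g^s * h^r."""
    return (pow(g, s % p, q) * pow(h, r % p, q)) % q

def open_check(C, s, r):
    """Verify an opening (s, r) of commitment C."""
    return C == commit(s, r)
```

The additive homomorphism shows up as (commit(3, 5) * commit(4, 2)) % q == commit(7, 7).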
Range Proofs. An elegant proof that a number hidden within a Pedersen
commitment lies in an interval [a, b] in the setting of prime order groups was
presented in [27]. Although this proof might be impractical in general, since it
requires O(log b) single-bit proofs, it is efficient for the application that we have
in mind due to the relatively small values of b. The basic idea is to consider for a
number x ∈ [0, b] its binary representation x = x0·2^0 + x1·2^1 + … + x_{k−1}·2^{k−1},
where xi ∈ {0, 1} for 0 ≤ i < k. Thereby, k = ⌊log2 b⌋ + 1 is the number
of digits necessary to represent every number within [0, b]. Now, in
essence, one proves that the binary representation of x lies within the interval
[0, 2^k − 1]. This can be done by committing to each xi using an Okamoto com-
mitment [29] (essentially a Pedersen bit commitment), along with a proof that
this commitment hides either 0 or 1, and by demonstrating that the commitments
to x and to all the xi's satisfy x = x0·2^0 + x1·2^1 + … + x_{k−1}·2^{k−1}. The concrete
range proof is a Σ-protocol for a proof of knowledge
PK{(α0, …, α_{k−1}) : ⋀_{i=0}^{k−1} (Ci = h^{αi} ∨ Ci·g^{−1} = h^{αi})}
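The bit-decomposition underlying this range proof can be sketched as follows. We only show the commitment bookkeeping — committing to each bit and checking that the bit commitments recombine to a commitment to x — and omit the Σ-protocol OR-proofs that each Ci hides 0 or 1. Parameters and names are illustrative choices of ours.

```python
# Structural sketch of the bit decomposition behind the range proof: commit to
# each bit x_i of x; the product of the C_i^(2^i) is then a Pedersen commitment
# to x with aggregated randomizer sum(r_i * 2^i). The zero-knowledge OR-proofs
# that each C_i hides 0 or 1 are omitted. Toy parameters as before.
import random

P, p, g, h = 2039, 1019, 4, 9

def commit(s, r):
    return pow(g, s % p, P) * pow(h, r % p, P) % P

x, b = 300, 400                       # claim: x lies in [0, b]
k = b.bit_length()                    # number of digits needed for [0, b]
bits = [(x >> i) & 1 for i in range(k)]
rs = [random.randrange(p) for _ in range(k)]
Cs = [commit(bi, ri) for bi, ri in zip(bits, rs)]
r_agg = sum(ri << i for i, ri in enumerate(rs)) % p

# verifier-side check: the bit commitments recombine to a commitment to x
prod = 1
for i, Ci in enumerate(Cs):
    prod = prod * pow(Ci, 1 << i, P) % P
assert prod == commit(x, r_agg)
```

The final check is exactly the "demonstrating that ... x = Σ xi·2^i" step: the exponents of g and h in the product collapse to x and the aggregated randomizer.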
4 Scheme
In this section we present our scheme, along with an optional modification that
increases the privacy in some settings even further. We start with the
presentation of an important observation about CL signatures which is central to
our constructions. Then we give a high-level description followed by a
detailed description of the schemes. Additionally, we present a performance
evaluation of a prototypical implementation which supports the efficiency of the
schemes. Finally, we present some extensions as well as system issues, and provide
a security analysis of the protocols.
Assume that the signer holds a private key sk = (x, y, z) and publishes the cor-
responding public key pk = (X, Y, Z, e, g, G, G_t, p).
Blind Signing. If a receiver wants to obtain a blind signature for a message m,
he chooses r ∈_R Z_p, computes a commitment C = g^m Z^r and sends C along with
a signature of knowledge SPK{(α, β) : C = g^α Z^β} to the signer (proving the ability to
open the commitment is necessary for the security of the scheme, cf. [12]). If the
verification of the proof holds, the signer computes a signature σ = (a, A, b, B, c)
for the commitment C by choosing k ∈_R Z_p, setting a = g^k, computing
σ = (a, a^z, a^y, a^{yz}, a^x C^{kxy}), and sending σ to the receiver.
Verification. In order to show the signature to a verifier, the receiver random-
izes the signature by choosing r, r' ∈_R Z_p and computing σ' = (a', A', b', B', c')
as σ' = (a^r, A^r, b^r, B^r, c^{rr'}), and sends σ' with the message m along with a sig-
nature of knowledge SPK{(γ, δ) : v_σ^γ = v·v_r^δ} to the verifier. Therefore, both
need to compute v_σ = e(c', g), v = e(X, a') · e(X, b')^m and v_r = e(X, B'). The
verifier checks the proof and checks whether A' as well as b' and B' were cor-
rectly formed. Note that the proof can be conducted by means of a standard
DL-representation proof [16], which can easily be seen by rewriting the proof as
SPK{(γ, δ) : v = v_σ^γ (v_r^{−1})^δ}.
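A quick numeric check of this rewriting, with toy values in the order-1019 subgroup modulo 2039 (all names and constants are ours):

```python
# The relation v_sigma^gamma = v * v_r^delta is equivalent to the standard
# DL-representation statement v = v_sigma^gamma * (v_r^{-1})^delta.
# Toy group: order-1019 subgroup of squares mod 2039, generated by 4.
P, p = 2039, 1019
v_sigma, v_r = pow(4, 123, P), pow(4, 456, P)
gamma, delta = 11, 22

# define v via the DL-representation form ...
v = pow(v_sigma, gamma, P) * pow(v_r, (-delta) % p, P) % P
# ... and check that the original relation then holds
assert pow(v_sigma, gamma, P) == v * pow(v_r, delta, P) % P
```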
Remark. Observe that we can realize a concept which is similar to partially
blind signatures. However, in contrast to existing partially blind signature schemes
[1], where the signer can integrate some commonly agreed-upon information in the
signature, here the signer arithmetically adds a message to the "blinded mes-
sage" (hidden in the commitment). Therefore, during the signing, the signer
simply updates the commitment to C' = C·g^{m_S} and uses C' instead of C for
signing. The receiver then obtains a signature for the message m + m_S, where m_S
is determined by the signer and m is hidden from the signer.
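The commitment arithmetic behind this remark can be checked directly: multiplying C by g^{m_S} turns a commitment to m into a commitment to m + m_S under the same randomizer. A toy sketch (the CL blind signing around it is omitted; parameters are illustrative):

```python
# The signer shifts the committed message by m_S simply by multiplying the
# commitment by g^{m_S}; the receiver can then open the updated commitment to
# m + m_S with the *same* randomizer. Toy Pedersen-style parameters.
P, p, g, Z = 2039, 1019, 4, 9

def commit(m, r):
    return pow(g, m % p, P) * pow(Z, r % p, P) % P

m, r = 42, 77                      # receiver's hidden message and randomizer
C = commit(m, r)                   # sent to the signer (with a proof of opening)
m_S = 5                            # signer-chosen increment
C_prime = C * pow(g, m_S, P) % P   # signer updates the commitment
assert C_prime == commit(m + m_S, r)
```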
Update. The interesting, and from our point of view novel, part is that a signer
can use a somewhat related idea to "update" a randomized signature without
showing the message. Assume that a receiver holds a randomized signature σ'
for message (m', r), where m' = m + m_S, and wants the signer to update
the signature such that it represents a signature for message (m' + m_S, r + 1).
Since showing m', as within the verification above, would destroy the unlinka-
bility (because both messages would then be known), the receiver can instead prove in
zero knowledge that he knows the message, and both can then interactively update
the signature. Therefore, in the verification the receiver provides a signature of
knowledge SPK{(α, β, γ) : v_σ^α = v·v_m^β·v_r^γ} to the verifier, where v_σ = e(g, c'),
v = e(g, a'), v_m = e(g, b') and v_r = e(g, B'), which convinces the signer that the
receiver possesses a valid signature for the unknown message (m', r). Then, for the
update, i.e., to add m_S, it is sufficient for the signer to compute C̃_{m'+m_S} = a'^{m_S} A'
and send it to the receiver. The receiver computes C_{m'+m_S} = (C̃_{m'+m_S})^{r'} and
provides a signature of knowledge SPK{(α, β, γ) : v_σ^α = v·v_m^β·v_r^γ ∧
C̃_{m'+m_S} = (C_{m'+m_S})^α}. Note that this proof convinces the signer that the receiver has
randomized the commitment of the signer using the same random factor (r')
82 D. Slamanig
as within the randomization of the signature. Then, the signer computes the
updated signature σ'' = (a'^r̃, A'^r̃, b'^r̃, B'^r̃, (c'·(C_{m'+m_S})^{xy})^r̃) for r̃ ∈_R Z_p and gives
σ'' = (a'', A'', b'', B'', c̃'') to the receiver. The receiver sets c'' = (c̃'')^{r'^{−1}} and now
holds a valid signature for message (m' + m_S, r + 1), which he can in turn random-
ize. Therefore, observe that in the signature tuple only the last element actually
includes the messages: we have c' = c^{rr'} = (a^x C^{kxy})^{rr'} = (a^{x+xy(m'+zr)})^{rr'}
and (C_{m'+m_S})^{xy} = (a^{xy(m_S+z)})^{rr'}. Taking these results together, we have a
well-formed signature component c'·(C_{m'+m_S})^{xy} = (a^{x+xy(m'+m_S+z(r+1))})^{rr'}. The remaining
elements of the signature are easy to verify for correctness.
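The exponent bookkeeping in this update can be verified numerically. The sketch below plugs toy values into the two components and checks that their product is the well-formed component for (m' + m_S, r + 1); all constants are illustrative stand-ins, not real CL parameters:

```python
# Numeric sanity check of the exponent arithmetic in the signature update:
# c' * (C_{m'+m_S})^{xy} should carry exponent x + xy(m' + m_S + z(r+1)).
# Toy stand-ins in the order-1019 subgroup of squares mod 2039.
P, p = 2039, 1019
a = 4                                   # toy base (plays the role of a = g^k)
x, y, z = 13, 17, 19                    # toy secret key components
m1, mS, r = 201, 6, 3                   # m' (hidden), signer increment, counter
rr = 55 * 91 % p                        # combined randomization factor r * r'

c_prime = pow(a, (x + x * y * (m1 + z * r)) * rr, P)
C_xy = pow(a, (x * y * (mS + z)) * rr, P)
expected = pow(a, (x + x * y * (m1 + mS + z * (r + 1))) * rr, P)
assert c_prime * C_xy % P == expected
```

The assertion holds because the exponents satisfy (x + xy(m' + zr)) + xy(m_S + z) = x + xy(m' + m_S + z(r + 1)) as an integer identity.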
Remark. This functionality can easily be extended to signatures on arbitrary
tuples of messages; it will be a building block for our scheme and may also be of
independent interest. Note that issuing a new signature in every step without
revealing the hidden messages would not work, and thus we use this "update
functionality".
Reclaim. Assume that the user holds a token t = (C(id), C(s), L) and a corre-
sponding signature σ_t. When a user wants to delete a data object d, as above, the
user computes C(id') for the new token, randomizes the signature σ_t to σ_t' and
"proves" that he is allowed to delete d and that σ_t' is a valid signature for id and L
(by revealing these two elements). If id is not contained in BL and the signature
is valid, the user obtains a signature for a token t' = (C(id + id'), C(s − |d|), L).
Otherwise, the cloud provider will refuse to delete d.
1. The user chooses a token-identifier id ∈_R {0, 1}^{l_id} and randomizers r_id, r_s ∈_R Z_p for the
commitments, and we let the user start with value s = 1. Then, he computes the commitments
C_id = g^{id} Z_1^{r_id} and C_s = Z_2^s Z_3^{r_s} and sends them along with a signature of knowledge
to prove the ability to open the commitments, whereas the second part of the proof also con-
vinces the cloud provider that s = 1.
2. If the verification of the signature of knowledge in (1) holds, the cloud provider computes a CL
signature for (C_id, C_s, L) as follows: he chooses k ∈_R Z_p, computes a = g^k, b = a^y, A_i = a^{z_i},
B_i = A_i^y for 1 ≤ i ≤ 4 and c = a^x (C_id C_s Z_4^L)^{kxy}, and sends σ = (a, {A_i}, b, {B_i}, c) to the
user.
3. The user verifies whether the signature is valid; if this holds, the user is in possession
of a valid signature σ for a token t = (id, s, L), whereas the cloud provider is not aware
of id and knows that s = 1. Furthermore, the user locally randomizes the signature σ to
σ' = (a', {A_i'}, b', {B_i'}, c') by choosing r, r' ∈_R Z_p and computing σ' = (a^r, {A_i^r}, b^r, {B_i^r}, c^{rr'}).
Remark. All further actions are fully anonymous and in practice also unlinkable, since we can
assume that one limit will be issued to a quite large number of users (and the limit is the only
information that could potentially be used for linking)!
1. The user sends the randomized signature σ', the "visible part" (id, L) of the token t, and a data
object d, along with a signature of knowledge

SPK{(α, β, γ, δ) : v_σ^α = v·v_{r_id}^β·v_s^γ·v_{r_s}^δ ∧ (0 ≤ γ ≤ 2^{l_L−l_{|d|}} − 1)}   (2)

for the validity of the randomized signature, containing a proof that enough space is still
available, to the cloud provider. It must be noted that the presentation of the proof in (2) is a
shorthand notation for the signature of knowledge
SPK{(α, β, γ, δ, ε_1, …, ε_k, ζ, ζ_1, …, ζ_k) :
    v = v_σ^α (v_{r_id}^{−1})^β (v_s^{−1})^γ (v_{r_s}^{−1})^δ ∧
    C = g^γ Z_1^ζ ∧
    C = ∏_{i=1}^{k} (g^{ε_i} Z_1^{ζ_i})^{2^{i−1}} ∧
    ⋀_{i=1}^{k} (C_i = Z_1^{ζ_i} ∨ C_i·g^{−1} = Z_1^{ζ_i})}
Essentially, besides the DL-representation proof for the validity of the randomized signature,
we use an additional commitment C = g^s Z_1^{r'} to the value s, with a new randomizer r' computed
as

r' = r_1·2^0 + r_2·2^1 + … + r_k·2^{k−1} mod p

for r_i's chosen uniformly at random from Z_p, and the single commitments for the range proof
are C_i = g^{s_i} Z_1^{r_i}. It must also be mentioned that k equals l_L − l_{|d|}, the binary length
of L − |d|. Furthermore, note that in the case of s = 1, i.e., in the first execution of the Consume
protocol, it would not be necessary to provide a range proof. However, when performing a range
proof, the initial Consume protocol is indistinguishable from other protocol executions and thus
provides stronger privacy guarantees.
2. The cloud provider checks whether id ∈ BL. If id is not blacklisted, the cloud provider verifies
the validity of the signature for the part (id, L) of the token t. Therefore, the cloud provider
locally computes the values v_σ = e(g, c'), v_{r_id} = e(X, B_1'), v_s = e(X, B_2'), v_{r_s} = e(X, B_3'),
and in order to update a signature for the id-part (to construct a new id for the new token) it
is sufficient to update a' and A_1'. To update the s-part, which amounts to updating the currently
consumed space, it is sufficient to update A_2' and A_3'. The latter update needs to be computed
by the cloud provider to be sure that the correct value |d| is integrated, and the former one
needs to be computed by the user to prevent the cloud provider from learning the new token
identifier. Hence, the cloud provider computes C̃_{s+|d|} = A_2'^{|d|} A_3' and sends C̃_{s+|d|} to the user,
who verifies whether |d| has been used to update the commitment. The user in turn chooses a
new identifier and randomizer id', r_{id'} ∈_R Z_p, computes C_{id+id'} = (a'^{id'} A_1'^{r_{id'}})^{r'} and C_{s+|d|} =
(C̃_{s+|d|})^{r'} = (A_2'^{|d|} A_3')^{r'}, and sends (C_{id+id'}, C_{s+|d|}) along with a signature of knowledge:
SPK{(η, φ, ι, κ) : C̃_{s+|d|} = (C_{s+|d|})^η ∧ v = v_σ^η (v_{r_id}^{−1})^φ (v_s^{−1})^ι (v_{r_s}^{−1})^κ}
Note that, in addition to proving the ability to open the commitments, the user
proves that he has randomized the commitment C̃_{s+|d|} to a commitment C_{s+|d|} using the same
randomization factor (r') as used to randomize the signature σ, without revealing this value.
After positive verification of this signature of knowledge, the cloud provider chooses r̃ ∈_R Z_p
and computes an updated signature
the ObtainLimit, the Consume and the Reclaim protocols from a computational
and bandwidth perspective, where point compression for elements in G is used
to reduce the bandwidth consumption. As one can see, all protocols are highly
[Figure: performance of the Consume and Reclaim protocols plotted against the bit length of the bound ("Boundbit", 15–30).]
efficient from the user's as well as the cloud provider's perspective, both in
computational effort and in bandwidth consumption. This holds although the
code has not been optimized for performance and pre-computations have not
been used. Hence, our evaluation shows that, from the efficiency point of view,
our protocols are entirely practical.
be done analogously): The keys of the CP are adapted such that the public
key is pk = (X, Y, Z_1, Z_2, Z_3, Z_4, Z_5, Z_6, e, g, G, G_t, p). Tokens are augmented by
elements (V, r_V ), where the former represents the validity period, e.g. a hash
computed from an encoding in Unix time. In the ObtainLimit protocol the user
additionally computes Z_6^{r_V} (and proves knowledge of this DL), and the c part
of the signature is adapted to c = a^x (C_id C_s Z_4^L Z_5^V Z_6^{r_V})^{kxy}, whereas the CP here
integrates the validity V. The remaining ideas stay the same, with the exception that
in the Consume protocol the SPK needs to be adapted to
SPK{(α, β, γ, δ, ε, ζ) : v_σ^α = v·v_{r_id}^β·v_s^γ·v_{r_s}^δ·v_V^ε·v_{r_V}^ζ ∧
(0 ≤ γ ≤ 2^{l_L−l_{|d|}} − 1) ∧ (2^{l_time} − 1 ≤ ε ≤ 2^{l_p} − 1)},

where l_p represents the maximum validity period and l_time the representation
of the actual date and time (in the Reclaim protocol we only need the second
range proof). For the update of the signature and the token, respectively, the
user has to additionally compute C_V = (A_5' A_6'^{r_V})^{r'} and augment the proof of
knowledge in step 3 of Protocol 2 to

SPK{(ζ, η, φ, ι, κ, λ, μ, ν, ξ) : C_{id+id'} = a'^ζ A_1'^η ∧ C_V = A_5'^φ A_6'^ι ∧
5 Conclusion
In this paper we have investigated the problem of anonymous yet authorized and
bounded use of cloud resources. We have presented a scheme and a modification of it
providing even more privacy, have presented extensions valuable for practical
application, and have supported the efficiency of the proposed scheme by a per-
formance analysis based on a prototypical implementation.
Concluding, we present anonymity revocation as an open problem. It is not
clear to us how anonymity revocation could be suitably realized in this setting.
We argue that it does not seem meaningful to use identity escrow within
every transaction, i.e. to verifiably encrypt the user's identity: it is absolutely not
clear who would have the power to perform anonymity revocation. In contrast,
if at all, it seems more suitable to employ techniques like those used within e-cash
[9] or (n-times) anonymous authentication [9,21]. However, it is not clear to us
how to achieve this, since in the aforementioned approaches spend protocols or
authentications are atomic, and in our setting we do not know in advance how
often a user will consume or reclaim resources. We leave this functionality as an
open problem for future work.
Acknowledgements. The author would like to thank the anonymous referees
for providing valuable and helpful comments on this work as well as Gregory
Zaverucha for pointing out prior independent work [10] on signature updates.
References
1. Abe, M., Okamoto, T.: Provably Secure Partially Blind Signatures. In: Bellare, M.
(ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 271–286. Springer, Heidelberg (2000)
2. Au, M.H., Susilo, W., Mu, Y.: Practical Anonymous Divisible E-Cash from Bounded
Accumulators. In: Tsudik, G. (ed.) FC 2008. LNCS, vol. 5143, pp. 287–301. Springer,
Heidelberg (2008)
3. Backes, M., Camenisch, J., Sommer, D.: Anonymous Yet Accountable Access Con-
trol. In: WPES, pp. 40–46. ACM (2005)
4. Balasch, J., Rial, A., Troncoso, C., Preneel, B., Verbauwhede, I., Geuens, C.:
PrETP: Privacy-Preserving Electronic Toll Pricing. In: 19th USENIX Security
Symposium, pp. 63–78. USENIX Association (2010)
5. Blanton, M.: Online Subscriptions with Anonymous Access. In: ASIACCS, pp.
217–227. ACM (2008)
6. Camenisch, J., Dubovitskaya, M., Neven, G.: Oblivious Transfer with Access Con-
trol. In: CCS, pp. 131–140. ACM (2009)
7. Camenisch, J., Dubovitskaya, M., Neven, G.: Unlinkable Priced Oblivious Transfer
with Rechargeable Wallets. In: Sion, R. (ed.) FC 2010. LNCS, vol. 6052, pp. 66–81.
Springer, Heidelberg (2010)
8. Camenisch, J., Hohenberger, S., Kohlweiss, M., Lysyanskaya, A., Meyerovich, M.:
How to Win the Clone Wars: Efficient Periodic n-Times Anonymous Authentica-
tion. In: CCS, pp. 201–210. ACM (2006)
9. Camenisch, J.L., Hohenberger, S., Lysyanskaya, A.: Compact E-Cash. In: Cramer,
R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 302–321. Springer, Heidelberg
(2005)
10. Camenisch, J., Kohlweiss, M., Soriente, C.: An Accumulator Based on Bilin-
ear Maps and Efficient Revocation for Anonymous Credentials. In: Jarecki, S.,
Tsudik, G. (eds.) PKC 2009. LNCS, vol. 5443, pp. 481–500. Springer, Heidelberg
(2009)
11. Camenisch, J.L., Lysyanskaya, A.: A Signature Scheme with Efficient Proto-
cols. In: Cimato, S., Galdi, C., Persiano, G. (eds.) SCN 2002. LNCS, vol. 2576,
pp. 268–289. Springer, Heidelberg (2003)
12. Camenisch, J.L., Lysyanskaya, A.: Signature Schemes and Anonymous Creden-
tials from Bilinear Maps. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152,
pp. 56–72. Springer, Heidelberg (2004)
13. Camenisch, J.L., Stadler, M.A.: Efficient Group Signature Schemes for Large
Groups. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 410–424.
Springer, Heidelberg (1997)
14. Canard, S., Gouget, A.: Divisible E-Cash Systems Can Be Truly Anonymous.
In: Naor, M. (ed.) EUROCRYPT 2007. LNCS, vol. 4515, pp. 482–497. Springer,
Heidelberg (2007)
15. Canard, S., Gouget, A., Hufschmitt, E.: A Handy Multi-Coupon System. In: Zhou,
J., Yung, M., Bao, F. (eds.) ACNS 2006. LNCS, vol. 3989, pp. 66–81. Springer,
Heidelberg (2006)
16. Chaum, D., Evertse, J.-H., van de Graaf, J.: An Improved Protocol for Demonstrat-
ing Possession of Discrete Logarithms and Some Generalizations. In: Price, W.L.,
Chaum, D. (eds.) EUROCRYPT 1987. LNCS, vol. 304, pp. 127–141. Springer,
Heidelberg (1988)
17. Chen, L., Escalante B., A.N., Löhr, H., Manulis, M., Sadeghi, A.-R.: A Privacy-
Protecting Multi-Coupon Scheme with Stronger Protection Against Splitting. In:
Dietrich, S., Dhamija, R. (eds.) FC 2007 and USEC 2007. LNCS, vol. 4886,
pp. 29–44. Springer, Heidelberg (2007)
18. Chen, Y., Paxson, V., Katz, R.H.: What’s New About Cloud Computing Security?
Tech. Rep. UCB/EECS-2010-5, University of California, Berkeley (2010)
19. Coull, S., Green, M., Hohenberger, S.: Controlling Access to an Oblivious Database
Using Stateful Anonymous Credentials. In: Jarecki, S., Tsudik, G. (eds.) PKC 2009.
LNCS, vol. 5443, pp. 501–520. Springer, Heidelberg (2009)
20. Cramer, R., Damgård, I.B., Schoenmakers, B.: Proof of Partial Knowledge and Sim-
plified Design of Witness Hiding Protocols. In: Desmedt, Y.G. (ed.) CRYPTO 1994.
LNCS, vol. 839, pp. 174–187. Springer, Heidelberg (1994)
21. Damgård, I.B., Dupont, K., Pedersen, M.Ø.: Unclonable Group Identification. In:
Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 555–572. Springer,
Heidelberg (2006)
22. Fiat, A., Shamir, A.: How to Prove Yourself: Practical Solutions to Identification
and Signature Problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263,
pp. 186–194. Springer, Heidelberg (1987)
23. Franz, M., Williams, P., Carbunar, B., Katzenbeisser, S., Peter, A., Sion, R.,
Sotakova, M.: Oblivious Outsourced Storage with Delegation. In: Financial Cryp-
tography and Data Security. LNCS, Springer, Heidelberg (2011)
24. Gentry, C.: Fully Homomorphic Encryption using Ideal Lattices. In: STOC,
pp. 169–178 (2009)
25. Kamara, S., Lauter, K.: Cryptographic Cloud Storage. In: Sion, R., Curtmola, R.,
Dietrich, S., Kiayias, A., Miret, J.M., Sako, K., Sebé, F. (eds.) RLCPS, WECSR,
and WLC 2010. LNCS, vol. 6054, pp. 136–149. Springer, Heidelberg (2010)
26. Lauter, K., Naehrig, M., Vaikuntanathan, V.: Can Homomorphic Encryption be
Practical? Tech. Rep. MSR-TR-2011-58, Microsoft Research (2011)
Efficient Anonymous Yet Authorized and Bounded Use of Clouds 91
27. Mao, W.: Guaranteed Correct Sharing of Integer Factorization with Off-Line Share-
holders. In: Imai, H., Zheng, Y. (eds.) PKC 1998. LNCS, vol. 1431, pp. 60–71.
Springer, Heidelberg (1998)
28. Okamoto, T.: Provably Secure and Practical Identification Schemes and Cor-
responding Signature Schemes. In: Brickell, E.F. (ed.) CRYPTO 1992. LNCS,
vol. 740, pp. 31–53. Springer, Heidelberg (1993)
29. Okamoto, T.: An Efficient Divisible Electronic Cash Scheme. In: Coppersmith, D.
(ed.) CRYPTO 1995. LNCS, vol. 963, pp. 438–451. Springer, Heidelberg (1995)
30. Pedersen, T.P.: Non-Interactive and Information-Theoretic Secure Verifiable Secret
Sharing. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 129–140.
Springer, Heidelberg (1992)
Group Law Computations
on Jacobians of Hyperelliptic Curves
1 Introduction
The field of curve-based cryptography has flourished for the last quarter century
after Koblitz [31] and Miller [44] independently proposed the use of elliptic curves
in public-key cryptosystems in the mid-1980s. Compared with traditional group
structures like F_p^*, elliptic curve cryptography (ECC) offers the powerful advan-
tage of achieving the same level of conjectured security with a much smaller
elliptic curve group. In 1989, Koblitz [32] generalized this idea by proposing
Jacobians of hyperelliptic curves of arbitrary genus as a way to construct Abelian
groups suitable for cryptography. Roughly speaking, hyperelliptic curves of genus
g can achieve groups of the same size and security as elliptic curves, whilst being
defined over finite fields with g times fewer bits.¹

⋆ This author acknowledges funding from the Australian-American Fulbright Commis-
sion, the Gregory Schwartz Enrichment Grant, the Queensland Government Smart
State Ph.D. Fellowship, and an Australian Postgraduate Award.

A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 92–117, 2012.
© Springer-Verlag Berlin Heidelberg 2012

At the same time, however,
increasing the genus of a hyperelliptic curve significantly increases the computa-
tional cost of performing a group operation in the corresponding Jacobian group.
Thus, the question that remains of great interest to the public-key cryptography
community is under which circumstances elliptic curves are preferable, and vice
versa. At present, elliptic curves continue to stand as the front-runner
in most practical scenarios, but whilst both ECC and hyperelliptic curve cryp-
tography (HECC) continue to enjoy a wide range of improvements, this question
remains open in general. For a nice overview of the progress in this race and of
the state-of-the-art in both cases, the reader is referred to the talks by Bernstein
[4], and by Lange [39].
Cantor [6] was the first to give a concrete algorithm for performing computa-
tions in Jacobian groups of hyperelliptic curves over fields of odd characteristic.
Shortly after, Koblitz [32] modified this algorithm to apply to fields of any charac-
teristic. Cantor’s algorithm makes use of the polynomial representation of group
elements proposed by Mumford [46], and consists of two stages: (i) the compo-
sition stage, based on Gauss’s classical composition of binary quadratic forms,
which generally outputs an unreduced divisor, and (ii) the reduction stage, which
transforms the unreduced divisor into the unique reduced divisor that is equiv-
alent to the sum, whose existence is guaranteed by the Riemann-Roch theorem
[33]. Cantor’s algorithm has since been substantially optimized in work initiated
by Harley [24], who was the first to obtain practical explicit formulas in genus
2, and extended by Lange [34,38], who, among several others [43,50,45,49], gen-
eralized and significantly improved Harley’s original approach. Essentially, all of
these improvements involve unrolling the polynomial arithmetic implied by Can-
tor’s algorithm into operations in the underlying field, and finding specialized
shortcuts dedicated to each of the separate cases of input (see [35, §4]).
In this paper we propose an explicit alternative to unrolling Cantor’s polyno-
mial arithmetic in the composition phase. Our method is inspired by considering
the geometric description of the group law and applies to hyperelliptic curves
of any genus. The equivalence of the geometric group law and Cantor’s algo-
rithm was proven by Lauter [40] in the case of genus 2, but since then there have
been almost no reported improvements in explicit formulas that benefit from
this depiction. The notable exception is the work of Leitenberger [42], who
used Gröbner basis reduction to show that in the addition of two distinct di-
visors on the Jacobian of a genus 2 curve, one can obtain explicit formulas to
compute the required geometric function directly from the Mumford coordinates
without (unrolling) polynomial arithmetic. Leitenberger’s idea of obtaining the
necessary geometric functions in a simple and elementary way is central to the
theme of this paper, although we note that the affine addition formulas that
result from our description (which do not rely on any Gröbner basis reduction)
are significantly faster than the direct translation of those given in [42].
¹ The security argument becomes more complicated once venturing beyond genus 2,
where the attacks by Gaudry [17] and others [8,21,48] overtake the Pollard rho
method [47].
94 C. Costello and K. Lauter
We use the geometric description of the group law to prove that the inter-
polating functions for the composition step can be found by writing down a
linear system in the ground field to be solved in terms of the Mumford coordi-
nates of the divisors. Therefore, the composition algorithm for arbitrary genera
proposed in this work is immediately explicit in terms of arithmetic in Fq , in
contrast to Cantor’s composition which operates in the polynomial ring Fq [x],
the optimization of which calls for ad-hoc attention in each genus to unravel the
Fq [x] operations into explicit formulas in Fq .
To illustrate the value of our approach, we show that, for group operations
on Jacobians of general genus 2 curves over large prime fields, the (affine and
projective) formulas that result from this description are more efficient than
their predecessors. Also, when applying this approach back to the case of genus
1, we are able to recover several of the tricks previously explored for merging
simultaneous group operations to optimize elliptic curve computations.
The rest of this paper is organized as follows. We briefly touch on some more
related work, before moving to Section 2 where we give a short background on
hyperelliptic curves and the Mumford representation of Jacobian elements. Sec-
tion 3 discusses the geometry of Jacobian arithmetic on hyperelliptic curves, and
shows that we can use simple linear algebra to compute the required geometric
functions from the Mumford coordinates. Section 4 is dedicated to illustrating
how this technique results in fast explicit formulas in genus 2, whilst Section 5
generalizes the algorithm for all g ≥ 2. As we hope this work will influence fur-
ther progress in higher genus arithmetic, in Section 6 we highlight some further
implications of adopting this geometrically inspired approach, before concluding
in Section 7. MAGMA scripts that verify our proposed algorithms and formulas
can be found in the full version of this paper.
Related Work. There are several high-level papers (e.g. [27,25]) which discuss
general methods for computing in Jacobians of arbitrary algebraic curves. In
addition, there has also been work which specifically addresses arithmetic on
non-hyperelliptic Jacobians from a geometric perspective (e.g. [13,14]).
Khuri-Makdisi treated divisor composition on arbitrary algebraic curves with
linear algebra techniques in [29] and [30]. In contrast to Khuri-Makdisi’s deep
and more general approach, our paper specifically aims to present an explicit
algorithm in an implementation-ready format that is specific to hyperelliptic
curves, much like his joint work with Abu Salem which applied his earlier tech-
niques to present explicit formulas for arithmetic on C3,4 curves [1]. Some other
authors have also applied techniques from the realm of linear algebra to Jaco-
bian operations: two notable examples being the work of Guyot et al. [23] and
Avanzi et al. [2] who both used matrix methods to compute the resultant of two
polynomials in the composition stage.
Since we have focused on general hyperelliptic curves, our comparison in genus
2 does not include the record-holding work by Gaudry [19], which exploits the
Kummer surface associated with curves of a special form to achieve the current
outright fastest genus 2 arithmetic for those curve models. Gaudry and Harley’s
second exposition [20] further describes the results in [24]. Finally, we do not
draw comparisons with any work on real models of hyperelliptic curves, which
usually result in slightly slower formulas than imaginary hyperelliptic curves,
but we note that both Galbraith et al. [16] and Erickson et al. [11] achieve
very competitive formulas for group law computations on real models of genus
2 hyperelliptic curves.
2 Background
We give some brief background on hyperelliptic curves and the Mumford repre-
sentation of points in the Jacobian. For a more in depth discussion, the reader
is referred to [3, §4] and [15, §11]. Over the field K, we use C_g to denote the
general ("imaginary quadratic") hyperelliptic curve of genus g given by

C_g : y² + h(x)y = f(x),
h(x), f(x) ∈ K[x], deg(f) = 2g + 1, deg(h) ≤ g, f monic,   (1)
with the added stipulation that no point (x, y) ∈ K̄ × K̄ simultaneously sends both
partial derivatives 2y + h(x) and f'(x) − h'(x)y to zero [3, §14.1]. As long as
char(K) does not divide 2g + 1, we can isomorphically transform C_g into Ĉ_g, given as
Ĉ_g : y² + h(x)y = x^{2g+1} + f̂_{2g−1}·x^{2g−1} + … + f̂_1·x + f̂_0, so that the coefficient of x^{2g}
is zero [3, §14.13]. In the case of odd characteristic fields, it is standard to also
annihilate the presence of h(x) completely under a suitable transformation, in
order to obtain a simpler model (we will make use of this in §4). We abuse
notation and use Cg from hereon to refer to the simplified version of the curve
equation in each context. Although the proofs in §3 apply to any K, it better
places the intention of the discussion to henceforth regard K as a finite field Fq .
We work in the Jacobian group Jac(Cg ) of Cg , where the elements are equiv-
alence classes of degree zero divisors on Cg . Divisors are formal sums of points
on the curve, and the degree of a divisor is the sum of the multiplicities of points
in the support of the divisor. Two divisors are equivalent if their difference is
a principal divisor, i.e. equal to the divisor of zeros and poles of a function. It
follows from the Riemann-Roch Theorem that for hyperelliptic curves, each class
D has a unique reduced representative of the form D = (u(x), v(x)), where u, v ∈ K[x],
u is monic, deg(v) < deg(u) ≤ g, and u divides v² + hv − f.

Solving

ℓ_5² · ∏_{i=1}^{3}(x − x_i) · ∏_{i=1}^{3}(x − x_i') · ∏_{i=1}^{4}(x − x̄_i) = (∑_{i=0}^{5} ℓ_i·x^i)² − f(x)

for x̄_1, x̄_2, x̄_3 and x̄_4 would necessitate multiple root extractions. On the
other hand, the exact division

∏_{i=1}^{4}(x − x̄_i) = ((∑_{i=0}^{5} ℓ_i·x^i)² − f(x)) / (ℓ_5² · ∏_{i=1}^{3}(x − x_i) · ∏_{i=1}^{3}(x − x_i'))

can be computed very efficiently (and entirely over F_q) by equating coefficients of x.
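Such an exact division by equating coefficients is elementary to implement. Below is a sketch over a toy prime field: `pdiv_exact` (our name) peels off a known factor coefficient by coefficient; the prime and sample polynomials are illustrative.

```python
# Exact polynomial division over F_q by equating coefficients, as used to peel
# off the known roots of l(x)^2 - f(x). Coefficient lists are lowest-degree
# first; q and the sample polynomials are illustrative.
q = 1009

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def pdiv_exact(num, den):
    num, inv_lead = num[:], pow(den[-1], q - 2, q)
    out = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(out))):
        c = num[i + len(den) - 1] * inv_lead % q
        out[i] = c
        for j, dj in enumerate(den):
            num[i + j] = (num[i + j] - c * dj) % q
    assert not any(num), "division was not exact"
    return out

known = pmul([1, 1], [2, 0, 1])   # (x + 1)(x^2 + 2): factor with known roots
hidden = [5, 0, 3, 1]             # x^3 + 3x^2 + 5: the sought factor
assert pdiv_exact(pmul(known, hidden), known) == hidden
```

No root finding is involved: each quotient coefficient is determined by one field multiplication with the inverse of the divisor's leading coefficient.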
Fig. 1. The composition stage of a general addition on the Jacobian of a genus 3
curve C3 over the reals R: the 6 points in the combined supports of D and D' are
interpolated by a quintic polynomial which intersects C in 4 more places to form the
unreduced divisor D̃ = P̃1 + P̃2 + P̃3 + P̃4.

Fig. 2. The reduction stage: a (vertically) magnified view of the cubic function which
interpolates the points in the support of D̃ and intersects C3 in three more places
to form D̄ = (P1' + P2' + P3') ∼ D̃, the reduced equivalent of D̃.
Definition 1 (Mumford ideals). We call the g ideals Ψi arising from the g
hypersurfaces Ψi = 0 in Proposition 1 the Mumford ideals.
which we call the Mumford function fields and denote by K^Mum_DBL = K(Ĵac(C_g))
and K^Mum_ADD = K(Ĵac(C_g) × Ĵac(C_g)) respectively. We abbreviate and use
Ψ_i, Ψ_i' to differentiate between Ψ_i = Ψ_i(u_0, …, u_{g−1}, v_0, …, v_{g−1}) and
Ψ_i' = Ψ_i(u_0', …, u_{g−1}', v_0', …, v_{g−1}') when working in K^Mum_ADD.
Solving Ω_{0≤i≤2} = 0 and Ω'_{0≤i≤2} = 0 simultaneously for ℓ_0, …, ℓ_5 gives
ℓ(x) = 84x^5 + 213x^3 + 78x^2 + 252x + 165.
This section showed that divisor composition on hyperelliptic curves can be
achieved via linear operations in the Mumford function fields.
where

Ψ_0 = v_0² − f_0 + f_2·u_0 − v_1²·u_0 + 2u_0²·u_1 − u_1·f_3·u_0 − u_1³·u_0,
Ψ_1 = 2v_0·v_1 − f_1 − v_1²·u_1 + f_2·u_1 − f_3(u_1² − u_0) + 3u_0·u_1² − u_1⁴ − u_0².   (2)

/⟨Ψ_0, Ψ_1, Ψ_0', Ψ_1'⟩). In §4.2 particularly, we will see how the ideal ⟨Ψ_0, Ψ_1⟩ is useful
in simplifying the formulas that arise.
0 ≡ Ω_1·x + Ω_0 ≡ ℓ_3·x³ + ℓ_2·x² + ℓ_1·x + ℓ_0 − (v_1·x + v_0)
  ≡ (ℓ_3(u_1² − u_0) − ℓ_2·u_1 + ℓ_1 − v_1)·x + (ℓ_3·u_1·u_0 − ℓ_2·u_0 + ℓ_0 − v_0)
      mod x² + u_1·x + u_0,   (3)
Fig. 3. The group law (general addition) on the Jacobian of the genus 2 curve C2
over the reals R, for (P1 + P2) ⊕ (P1' + P2') = P1'' + P2''.

Fig. 4. A general point doubling on the Jacobian of a genus 2 curve C2 over the
reals R, for [2](P1 + P2) = P1' + P2'.
which provides two equations (Ω_1 = 0 and Ω_0 = 0) relating the four coefficients
of the interpolating polynomial linearly inside K^Mum_ADD. Identically, interpolating
the support of D' produces two more linear equations, which allow us to solve
for the four ℓ_i as
⎛ 1  0  −u_0    u_1·u_0        ⎞   ⎛ ℓ_0 ⎞   ⎛ v_0  ⎞
⎜ 0  1  −u_1    u_1² − u_0     ⎟   ⎜ ℓ_1 ⎟   ⎜ v_1  ⎟
⎜ 1  0  −u_0'   u_1'·u_0'      ⎟ · ⎜ ℓ_2 ⎟ = ⎜ v_0' ⎟ .
⎝ 0  1  −u_1'   (u_1')² − u_0' ⎠   ⎝ ℓ_3 ⎠   ⎝ v_1' ⎠
Observe that the respective subtraction of rows 1 and 2 from rows 3 and 4 gives
rise to a smaller system that can be solved for ℓ2 and ℓ3, as

⎛ u0 − u0'   u1' u0' − u1 u0             ⎞   ⎛ ℓ2 ⎞   ⎛ v0' − v0 ⎞
⎝ u1 − u1'   (u1'^2 − u0') − (u1^2 − u0) ⎠ · ⎝ ℓ3 ⎠ = ⎝ v1' − v1 ⎠ .    (4)
Remark 1. We will see in Section 5.1 that for all g ≥ 2, the linear system that
arises in the computation of ℓ(x) can always be trivially reduced to be of di-
mension g, but for now it is useful to observe that once we solve the dimension
g = 2 matrix system for the ℓi with i ≥ g, calculating the remaining ℓi where i < g
is computationally straightforward.
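As an illustration (not part of the original paper), the reduced system (4) can be exercised in a few lines of Python over a toy prime field. The prime p = 97, the two sample quadratics, and the helper names `interpolate_l` and `poly_mod` are arbitrary choices for this sketch; the code solves the 2 × 2 system for ℓ2, ℓ3, back-substitutes ℓ0, ℓ1 via rows 1 and 2, and then checks the two congruence conditions by polynomial division.

```python
p = 97  # toy prime field; the algebra is identical for any Fq

def inv(a):
    return pow(a, p - 2, p)   # Fermat inversion mod prime p

def poly_mod(num, den):
    # Remainder of num(x) mod den(x); coefficient lists, lowest degree first.
    num = num[:]
    while len(num) >= len(den):
        c = num[-1] * inv(den[-1]) % p
        for i in range(len(den)):
            num[len(num) - len(den) + i] = (num[len(num) - len(den) + i] - c * den[i]) % p
        num.pop()
    return num

def interpolate_l(u1, u0, v1, v0, w1, w0, s1, s0):
    # Cubic l(x) = l0 + l1 x + l2 x^2 + l3 x^3 with
    # l ≡ v1 x + v0 (mod x^2 + u1 x + u0) and l ≡ s1 x + s0 (mod x^2 + w1 x + w0).
    # First the 2x2 system of (4) for l2, l3 ...
    a, b = (u0 - w0) % p, (w1 * w0 - u1 * u0) % p
    c, d = (u1 - w1) % p, ((w1 * w1 - w0) - (u1 * u1 - u0)) % p
    r0, r1 = (s0 - v0) % p, (s1 - v1) % p
    det_inv = inv((a * d - b * c) % p)
    l2 = (d * r0 - b * r1) * det_inv % p
    l3 = (a * r1 - c * r0) * det_inv % p
    # ... then l0, l1 by back-substitution into rows 1 and 2.
    l1 = (v1 + u1 * l2 - (u1 * u1 - u0) * l3) % p
    l0 = (v0 + u0 * l2 - u1 * u0 * l3) % p
    return [l0, l1, l2, l3]

# D = (x^2 + 3x + 5, 2x + 7), D' = (x^2 + x + 4, 5x + 9)
l = interpolate_l(3, 5, 2, 7, 1, 4, 5, 9)
```

Reducing `l` modulo either quadratic recovers the corresponding v-polynomial, which is exactly the interpolation condition the system encodes.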
The next step is to determine the remaining intersection points of y = ℓ(x)
on C2. Since y = ℓ(x) is cubic, its substitution into C2 will give a degree six
Group Law Computations on Jacobians of Hyperelliptic Curves 103
equation in x. Four of the roots will correspond to the four non-trivial points
in supp(D) ∪ supp(D'), whilst the remaining two will correspond to the two
x coordinates of the non-trivial elements in supp(D̄''), which are the same as
the x coordinates in supp(D'') (see the intersection points in Figure 3). Let the
Mumford representation of D̄'' be D̄'' = (x^2 + u1'' x + u0'', −v1'' x − v0''); we then
have

(x^2 + u1 x + u0) · (x^2 + u1' x + u0') · (x^2 + u1'' x + u0'') = ( (Σ_{i=0}^{3} ℓi x^i)^2 − f(x) ) / ℓ3^2 .
It remains to compute v1'' and v0''. Namely, we wish to compute the linear function
that interpolates the points in supp(D''). Observe that reducing ℓ(x) modulo
x^2 + u1'' x + u0'' gives the linear polynomial −v1'' x − v0'' which interpolates the
points in supp(D̄''), i.e. those points which are the involutions of the points in
supp(D''). Thus, the computation of v1'' and v0'' amounts to negating the result
of ℓ(x) mod x^2 + u1'' x + u0''. From equation (3) then, it follows that

v1'' = −(ℓ3 (u1''^2 − u0'') − ℓ2 u1'' + ℓ1),    v0'' = −(ℓ3 u1'' u0'' − ℓ2 u0'' + ℓ0).    (6)
Remark 2. The formulas for computing v0'' and v1'' in (6) include operations in-
volving u1''^2 and u1'' u0''. Since those quantities are also needed in the first step
of the addition formulas (see the first line of Table 1) for any subsequent addi-
tions involving the divisor D'', it makes sense to carry those quantities along as
extra coordinates to exploit these overlapping computations. It turns out that
an analogous overlap arises in group operations for all g ≥ 2, but for now we
remark that both additions and doublings on genus 2 curves will benefit from
extending the generic affine coordinate system to include two extra coordinates
u1^2 and u1 u0.
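As a further illustration (not from the paper), the degree-six identity and equation (6) can be checked end-to-end on a toy example. In the Python sketch below the prime p = 97, the cubic ℓ(x), and the three quadratics are arbitrary choices; the curve polynomial f is constructed backwards from them (with ℓ3 = 1) so that the identity holds exactly, and the Mumford condition v''^2 ≡ f (mod u'') is then verified for the recovered (u'', v'').

```python
p = 97  # toy prime field

def pmul(a, b):          # polynomial product mod p, coefficients low-degree first
    r = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] = (r[i + j] + x * y) % p
    return r

def psub(a, b):          # polynomial difference mod p, trailing zeros trimmed
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    r = [(x - y) % p for x, y in zip(a, b)]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def pmod(num, den):      # remainder of num(x) mod den(x), den monic
    num = num[:]
    while len(num) >= len(den):
        c = num[-1]
        for i in range(len(den)):
            num[len(num) - len(den) + i] = (num[len(num) - len(den) + i] - c * den[i]) % p
        num.pop()
    return num

l = [1, 2, 3, 1]                                # interpolating cubic l(x), l3 = 1
u, up, upp = [5, 3, 1], [4, 1, 1], [2, 6, 1]    # x^2+u1x+u0 for D, D', D-bar''
f = psub(pmul(l, l), pmul(pmul(u, up), upp))    # y^2 = f(x); the x^6 terms cancel
vpp = [(-c) % p for c in pmod(l, upp)]          # v'' = -(l mod u''), as in (6)
```

The final assertion below is exactly the Mumford membership condition for the reduced divisor D'' = (u''(x), v''(x)).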
Table 1. Explicit formulas for a divisor addition D'' = D ⊕ D' involving two distinct
degree 2 divisors on Jac(C2), and for divisor doubling D'' = [2]D of a degree 2 divisor
on Jac(C2)
AFFINE ADDITION
Input: D = (u1, u0, v1, v0, U1 = u1^2, U0 = u1 u0), D' = (u1', u0', v1', v0', U1' = u1'^2, U0' = u1' u0')    Operations in Fq
σ1 ← u1 + u1', Δ0 ← v0 − v0', Δ1 ← v1 − v1', M1 ← U1' − u0' − U1 + u0, M2 ← U0 − U0',
M3 ← u1 − u1', M4 ← u0 − u0', t1 ← (M2 − Δ0)·(Δ1 − M1), t2 ← (−Δ0 − M2)·(Δ1 + M1),    2M
t3 ← (−Δ0 + M4)·(Δ1 − M3), t4 ← (−Δ0 − M4)·(Δ1 + M3),    2M
ℓ2 ← t1 − t2, ℓ3 ← t3 − t4, d ← t3 + t4 − t1 − t2 − 2(M2 − M4)·(M1 + M3),    1M
A ← 1/(d·ℓ3), B ← d·A, C ← d·B, D ← ℓ2·B, E ← ℓ3^2·A, CC ← C^2,    I + 5M + 2S
u1'' ← 2D − CC − σ1, u0'' ← D^2 + C·(v1 + v1') − ((u1'' − CC)·σ1 + (U1 + U1'))/2,    2M + 1S
U1'' ← u1''^2, U0'' ← u1''·u0'', v1'' ← D·(u1 − u1'') + U1'' − u0'' − U1 + u0,    2M + 1S
v0'' ← D·(u0 − u0'') + U0'' − U0, v1'' ← E·v1'' + v1, v0'' ← E·v0'' + v0.    3M
Output: D'' = D ⊕ D' = (u1'', u0'', v1'', v0'', U1'', U0'').    Total I + 17M + 4S
PROJECTIVE ADDITION
ZZ ← Z1·Z2, U1Z ← U1·Z2, U1Z' ← U1'·Z1, U1ZS ← U1Z^2, U1ZS' ← U1Z'^2,    3M + 2S
U0Z ← U0·Z2, U0Z' ← U0'·Z1, V1Z ← V1·Z2, V1Z' ← V1'·Z1,    4M
M1 ← U1ZS' − U1ZS + ZZ·(U0Z − U0Z'), M2 ← U1Z·U0Z − U1Z'·U0Z',    3M
M3 ← U1Z − U1Z', M4 ← U0Z − U0Z', z1 ← V0·Z2 − V0'·Z1, z2 ← V1Z − V1Z',    2M
t1 ← (M2 − z1)·(z2 − M1), t2 ← (−z1 − M2)·(z2 + M1),    2M
t3 ← (−z1 + M4)·(z2 − M3), t4 ← (−z1 − M4)·(z2 + M3),    2M
ℓ2 ← t1 − t2, ℓ3 ← t3 − t4, d ← t3 + t4 − t1 − t2 − 2·(M2 − M4)·(M1 + M3),    1M
A ← d^2, B ← ℓ3·ZZ, C ← ℓ2·B, D ← d·B, E ← ℓ3·B, F ← U1Z·E, G ← ZZ·E,    6M + 1S
H ← U0Z·G, J ← D·G, K ← Z2·J, U1'' ← 2·C − A − E·(U1Z + U1Z'),    4M
U0'' ← ℓ2^2·ZZ + D·(V1Z + V1Z') − ((U1'' − A)·(U1Z + U1Z') + E·(U1ZS + U1ZS'))/2,    4M + 1S
V1'' ← U1''·(U1'' − C) + F·(C − F) + E·(H − U0''),    3M
V0'' ← H·(C − F) + U0''·(U1'' − C), V1'' ← V1''·ZZ + K·V1, V0'' ← V0'' + K·V0,    5M
U1'' ← U1''·D·ZZ, U0'' ← U0''·D, Z'' ← ZZ·J.    4M
Output: D'' = ρ(D ⊕ D') = (U1'', U0'', V1'', V0'', Z'').    Total 43M + 4S
AFFINE DOUBLING
Input: D = (u1, u0, v1, v0, U1 = u1^2, U0 = u1 u0), with curve constants f2, f3    Operations in Fq
vv ← v1^2, vu ← (v1 + u1)^2 − vv − U1, M1 ← 2v0 − 2vu, M2 ← 2v1·(u0 + 2U1),    1M + 2S
M3 ← −2v1, M4 ← vu + 2v0, z1 ← f2 + 2U1·u1 + 2U0 − vv, z2 ← f3 − 2u0 + 3U1,    1M
t1 ← (M2 − z1)·(z2 − M1), t2 ← (−z1 − M2)·(z2 + M1),    2M
t3 ← (M4 − z1)·(z2 − M3), t4 ← (−z1 − M4)·(z2 + M3),    2M
ℓ2 ← t1 − t2, ℓ3 ← t3 − t4, d ← t3 + t4 − t1 − t2 − 2(M2 − M4)·(M1 + M3),    1M
A ← 1/(d·ℓ3), B ← d·A, C ← d·B, D ← ℓ2·B, E ← ℓ3^2·A,    I + 5M + 1S
u1'' ← 2D − C^2 − 2u1, u0'' ← (D − u1)^2 + 2C·(v1 + C·u1), U1'' ← u1''^2, U0'' ← u1''·u0'',    3M + 3S
v1'' ← D·(u1 − u1'') + U1'' − U1 − u0'' + u0, v0'' ← D·(u0 − u0'') + U0'' − U0,    2M
v1'' ← E·v1'' + v1, v0'' ← E·v0'' + v0.    2M
Output: D'' = [2]D = (u1'', u0'', v1'', v0'', U1'', U0'').    Total I + 19M + 6S
PROJECTIVE DOUBLING
UU ← U1·U0, U1S ← U1^2, ZS ← Z^2, V0Z ← V0·Z, U0Z ← U0·Z, V1S ← V1^2,    3M + 3S
UV ← (V1 + U1)^2 − V1S − U1S, M1 ← 2·V0Z − 2·UV, M2 ← 2·V1·(U0Z + 2·U1S),    1M + 1S
M3 ← −2·V1, M4 ← UV + 2·V0Z, z1 ← Z·(f2·ZS − V1S) + 2·U1·(U1S + U0Z),    2M
z2 ← f3·ZS − 2·U0Z + 3·U1S, t1 ← (M2 − z1)·(z2 − M1), t2 ← (−z1 − M2)·(z2 + M1),    2M
t3 ← (−z1 + M4)·(z2 − M3), t4 ← (−z1 − M4)·(z2 + M3),    2M
ℓ2 ← t1 − t2, ℓ3 ← t3 − t4, d ← t3 + t4 − t1 − t2 − 2·(M2 − M4)·(M1 + M3),    1M
A ← ℓ2^2, B ← ℓ3^2, C ← ((ℓ2 + ℓ3)^2 − A − B)/2, D ← B·Z, E ← B·U1,    2M + 3S
F ← d^2, G ← F·Z, H ← ((d + ℓ3)^2 − F − B)/2, J ← H·Z, K ← V1·J, L ← U0Z·B,    4M + 2S
U1'' ← 2·C − 2·E − G,
U0'' ← A + U1''·(E − 2·C + 2·G) + 2·K,    1M
V1'' ← (C − E − U1'')·(E − U1'') + B·(L − U0''),
V0'' ← L·(C − E) + (U1'' − C)·U0''.    4M
V1'' ← V1''·Z + K·D, V0'' ← V0'' + V0Z·H·D, M ← J·Z, U1'' ← U1''·M, U0'' ← U0''·J,    7M
Z'' ← M·D.    1M
Output: D'' = ρ([2]D) = (U1'', U0'', V1'', V0'', Z'').    Total 30M + 9S
where
Ω0 = ℓ3^2 (2u0^3 − 3u1^2 u0^2) + 4ℓ3 ℓ2 u1 u0^2 − 2ℓ3 ℓ1 u0^2 + ℓ0^2 − ℓ2^2 u0^2 − 2u1 u0^2 − f0 ,
Ω1 = 6ℓ3^2 (u1 u0^2 − u1^3 u0) + 2ℓ3 ℓ2 (4u1^2 u0 − u0^2) + 2ℓ1 ℓ0 − 4ℓ3 ℓ1 u0 u1
     − 2ℓ2^2 u0 u1 − 4u1^2 u0 + u0^2 − f1 ,
Ω2 = 3ℓ3^2 (u0^2 − u1^4) + ℓ1^2 − ℓ2^2 (u1^2 + 2u0) − 2u0 u1 − 2u1^3 + 4ℓ3 ℓ2 (u1^3 + u0 u1)
     − 2ℓ3 ℓ1 (2u0 + u1^2) + 2ℓ2 ℓ0 − f2 ,
Ω3 = 2ℓ3^2 (3u1 u0 − 2u1^3) + 2ℓ2 ℓ1 + 2ℓ3 ℓ2 (3u1^2 − 2u0) − 2ℓ2^2 u1 − 4ℓ3 ℓ1 u1 + 2ℓ3 ℓ0
     − 3u1^2 + 2u0 − f3 ,
which combine with the linear interpolating equations (in (3)) to give rise to the
linear system
⎛ −1   0     u0             −u1 u0          ⎞   ⎛ ℓ0 ⎞   ⎛ −v0                          ⎞
⎜ 0    −1    u1             −u1^2 + u0      ⎟   ⎜ ℓ1 ⎟   ⎜ −v1                          ⎟
⎜ 0    4v1   2v0 − 2v1 u1   −6u0 v1         ⎟ · ⎜ ℓ2 ⎟ = ⎜ f2 + 2u1 u0 + 2u1^3 + 3v1^2 ⎟
⎝ 0    0     2v1            −4v1 u1 + 2v0   ⎠   ⎝ ℓ3 ⎠   ⎝ f3 − 2u0 + 3u1^2            ⎠ .
As was the case with the divisor addition in the previous section, we can first
solve a smaller system for ℓ2 and ℓ3, by adding the appropriate multiple of the
second row to the third row above, to give

⎛ 2v1 u1 + 2v0   −2u0 v1 − 4v1 u1^2 ⎞   ⎛ ℓ2 ⎞   ⎛ f2 + 2u1 u0 + 2u1^3 − v1^2 ⎞
⎝ 2v1            −4v1 u1 + 2v0      ⎠ · ⎝ ℓ3 ⎠ = ⎝ f3 − 2u0 + 3u1^2           ⎠ .
After solving the above system for ℓ2 and ℓ3, the process of obtaining D'' =
[2]D = (x^2 + u1'' x + u0'', v1'' x + v0'') is identical to the case of addition in the
previous section, giving rise to the analogous explicit formulas in Table 1.
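The elimination step can be sanity-checked numerically. The short Python sketch below (not from the paper; random values over a toy prime field, with rows written in the unknowns (ℓ0, ℓ1, ℓ2, ℓ3)) adds 4v1 times row 2 of the 4 × 4 doubling system to row 3 and confirms the resulting 2 × 2 entries:

```python
import random

p = 10007  # toy prime field
u1, u0, v1, v0, f2 = [random.randrange(1, p) for _ in range(5)]

# Rows 2 and 3 of the 4x4 doubling system, with their right-hand sides:
row2 = [0, -1 % p, u1, (u0 - u1 * u1) % p]
rhs2 = -v1 % p
row3 = [0, 4 * v1 % p, (2 * v0 - 2 * v1 * u1) % p, (-6 * u0 * v1) % p]
rhs3 = (f2 + 2 * u1 * u0 + 2 * u1 ** 3 + 3 * v1 * v1) % p

# row3 + 4*v1*row2 eliminates the l1 column:
new_row = [(a + 4 * v1 * b) % p for a, b in zip(row3, row2)]
new_rhs = (rhs3 + 4 * v1 * rhs2) % p
```

Since the values are random, the assertions below hold identically in u1, u0, v1, v0, f2, which is exactly the claimed row reduction.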
Table 2. Comparisons between our explicit formulas for genus 2 curves over prime
fields and previous formulas using CRT based composition
the same in all related publications following Cantor’s original paper (at least
in the case of low genera), but we give a simple geometric interpretation of the
number of reduction rounds required in Section 5.3 below.
1: U, U', M ← {0}^{g×g} ∈ Fq^{g×g}, z ← {0}^g ∈ Fq^g.
2: for i from 1 to g do
3:    U_{g+1−i,1} ← −u_{g−i};  U'_{g+1−i,1} ← −u'_{g−i}
4: end for
5: for j from 2 to g do
6:    U_{1,j} ← U_{g,j−1} · U_{1,1};  U'_{1,j} ← U'_{g,j−1} · U'_{1,1}
7:    for i from 2 to g do
8:       U_{i,j} ← U_{g,j−1} · U_{i,1} + U_{i−1,j−1};  U'_{i,j} ← U'_{g,j−1} · U'_{i,1} + U'_{i−1,j−1}
9:    end for
10: end for
11: M ← U − U'
12: for i from 1 to g do
13:    z_i ← v'_{i−1} − v_{i−1}
14: end for
15: Solve M · x = z
16: Compute x̃ = U · x
17: for i from 1 to g do
18:    x̃_i ← v_{g−i} − x̃_i
19: end for
20: return ℓ(x) (from x̃ = {ℓ0, ..., ℓ_{g−1}} and x = {ℓ_g, ..., ℓ_{2g−1}})
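For low genus one can check that column j of U is exactly the coefficient vector of x^(g+j−1) mod u(x). The Python sketch below (not from the paper; p = 97 and the sample u(x) are arbitrary choices, and U' is built identically from u'(x)) implements the U-recurrence from the algorithm above and tests it against a direct reduction of powers of x:

```python
p = 97  # toy prime field

def mumford_U(u, g):
    # u = [u0, ..., u_{g-1}]: coefficients of monic u(x) = x^g + u_{g-1} x^{g-1} + ... + u0.
    # Column j of U holds the coefficients (constant term first) of x^(g+j-1) mod u(x),
    # built with the same recurrence as the algorithm above.
    U = [[0] * g for _ in range(g)]
    for i in range(g):                     # column 1: x^g ≡ -u_{g-1} x^{g-1} - ... - u0
        U[i][0] = (-u[i]) % p
    for j in range(1, g):                  # column j+1: multiply column j by x, reduce
        top = U[g - 1][j - 1]
        U[0][j] = top * U[0][0] % p
        for i in range(1, g):
            U[i][j] = (top * U[i][0] + U[i - 1][j - 1]) % p
    return U

def xpow_mod(k, u, g):
    # Independent check: coefficients of x^k mod u(x), constant term first.
    r = [1] + [0] * (g - 1)
    for _ in range(k):
        top = r[g - 1]
        r = [(-top * u[0]) % p] + [(r[i - 1] - top * u[i]) % p for i in range(1, g)]
    return r
```

Running this for a genus 3 example reproduces the shape of the matrix in Example 4 below.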
Example 4. Assume a general genus 3 curve and let the Mumford representations
of the divisors D and D' be as usual. The matrix U is given as

    ⎛ −u0   u2 u0        (−u2^2 + u1) u0              ⎞
U = ⎜ −u1   u2 u1 − u0   (−u2^2 + u1) u1 + u2 u0      ⎟ ,
    ⎝ −u2   u2^2 − u1    (−u2^2 + u1) u2 + u2 u1 − u0 ⎠
The immediate observation in general is that the outer product c c̃^T is the only one
that requires computation in order to determine U entirely.
For general divisor doublings the description of the linear system is much
longer; this is because the right hand side vector z is slightly more complicated
than in the case of addition: as is the case with general Weierstrass elliptic
curves, additions tend to be independent of the curve constants whilst doublings
do not. We reiterate that, for low genus implementations at least, Algorithm 2 is
intended to obtain the algebraic expressions for each element in M; as was the
case with genus 2, a faster computational route to determining the composition
function will probably arise from genus specific attention that derives tailor-
made explicit formulas. Besides, the general consequence of Remark 2 is that
many (if not all) of the values constituting U will have already been computed
in the previous point operation, and can therefore be temporarily stored and
reused.
Proposition 4. In the addition of any two reduced divisor classes on the Jaco-
bian of a genus g hyperelliptic curve, the number of rounds of further reduction
required to form the reduced divisor is at most ⌊(g − 1)/2⌋, with equality occurring
in the general case.
Proof. For completeness note that addition on elliptic curves in Weierstrass form
needs no reduction, so take g ≥ 2. The composition polynomial y = (x) with
the 2g prescribed zeros (including multiplicities) has degree 2g − 1. Substituting
y = (x) into Cg : y 2 + h(x)y = f (x) gives an equation of degree max{2g +
1, 3g − 1, 2(2g − 1)} = 2(2g − 1) in x, for which there are at most 2(2g − 1) − 2g =
2g − 2 new roots. Let nt be the maximum number of new roots after t rounds
of reduction, so that n0 = 2g − 2. While nt > g, reduction is not complete, so
continue by interpolating the nt new points with a polynomial of degree nt − 1,
producing at most 2(nt −1)−nt = nt −2 new roots. It follows that nt = 2g−2t−2,
and since t, g ∈ Z, the result follows.
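The count in the proof is easy to simulate. A small Python sketch (not from the paper) iterates n_t = n_{t−1} − 2 from n_0 = 2g − 2 and confirms that the number of rounds equals ⌊(g − 1)/2⌋:

```python
def reduction_rounds(g):
    # Simulate the proof: start from n0 = 2g - 2 new roots and, while more than
    # g remain, interpolate the n points with a degree n-1 polynomial, which
    # yields at most 2(n - 1) - n = n - 2 new roots.
    n, rounds = 2 * g - 2, 0
    while n > g:
        n, rounds = n - 2, rounds + 1
    return rounds
```

For g = 2 this gives zero further rounds, matching the fact that the first interpolation already yields a reduced divisor.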
Fig. 5. Computing [2]P + P' by prescribing a parabola which intersects E at P, P'
with multiplicities two and one respectively.
Fig. 6. Tripling the point P ∈ E by prescribing a parabola which intersects E at P
with multiplicity three.
Fig. 7. Quadrupling the point P ∈ E by prescribing a cubic which intersects E at P
with multiplicity four.
as D_P̂ = (û(x), v̂(x)) (as if it were a reduced divisor in genus 2). Our experi-
ments agree with prior evidence that it is unlikely that point quadruplings will
outperform consecutive doublings in the preferred projective cases, although we
believe that one application which could benefit from this description is pairing
computations, where interpolating functions are necessary in the computations.
To reduce D_P̂, we need the line y = ℓ̂(x) joining P̂1 with P̂2, which can be com-
puted via ℓ̂(x) ≡ ℓ(x) mod û(x). The update to the pairing function requires
both ℓ(x) and ℓ̂(x), as f_upd = ℓ(x)/ℓ̂(x). We claim that it may be attractive
to compute a quadrupling in this fashion and only update the pairing function
once, rather than two doublings which update the pairing functions twice, par-
ticularly in implementations where inversions do not compare so badly against
multiplications [41]. It is also worth pointing out that in a quadruple-and-add
computation, the unreduced divisor D_P̂ need not be reduced before adding an
additional point P. Rather, it could be advantageous to immediately interpolate
P̂1, P̂2 and P with a parabola instead.
7 Conclusion
This paper presents a new and explicit method of divisor composition for hyper-
elliptic curves. The method is based on using simple linear algebra to derive the
required geometric functions directly from the Mumford coordinates of Jacobian
elements. In contrast to Cantor’s composition which operates in the polynomial
ring Fq [x], the algorithm we propose is immediately explicit in terms of Fq op-
erations. We showed that this achieves the current fastest general group law
formulas in genus 2, and pointed out several other potential improvements that
could arise from this exposition.
References
1. Abu Salem, F.K., Khuri-Makdisi, K.: Fast Jacobian group operations for C3,4
curves over a large finite field. CoRR, abs/math/0610121 (2006)
2. Avanzi, R., Thériault, N., Wang, Z.: Rethinking low genus hyperelliptic Jacobian
arithmetic over binary fields: interplay of field arithmetic and explicit formulæ. J.
Math. Crypt. 2(3), 227–255 (2008)
3. Avanzi, R.M., Cohen, H., Doche, C., Frey, G., Lange, T., Nguyen, K., Vercauteren,
F.: The Handbook of Elliptic and Hyperelliptic Curve Cryptography. CRC (2005)
4. Bernstein, D.J.: Elliptic vs. hyperelliptic, part I. Talk at ECC (September 2006)
5. Bernstein, D.J., Lange, T.: Explicit-formulas database,
http://www.hyperelliptic.org/EFD
6. Cantor, D.G.: Computing in the Jacobian of a hyperelliptic curve. Math.
Comp. 48(177), 95–101 (1987)
7. Ciet, M., Joye, M., Lauter, K., Montgomery, P.L.: Trading inversions for multipli-
cations in elliptic curve cryptography. Designs, Codes and Cryptography 39(2),
189–206 (2006)
8. Diem, C.: An Index Calculus Algorithm for Plane Curves of Small Degree. In:
Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 543–557.
Springer, Heidelberg (2006)
9. Doche, C., Icart, T., Kohel, D.R.: Efficient scalar multiplication by isogeny de-
compositions. In: PKC 2006 [54], pp. 191–206 (2006)
10. Eisenträger, K., Lauter, K., Montgomery, P.L.: Fast Elliptic Curve Arithmetic
and Improved Weil Pairing Evaluation. In: Joye, M. (ed.) CT-RSA 2003. LNCS,
vol. 2612, pp. 343–354. Springer, Heidelberg (2003)
11. Erickson, S., Jacobson Jr., M.J., Shang, N., Shen, S., Stein, A.: Explicit Formulas
for Real Hyperelliptic Curves of Genus 2 in Affine Representation. In: Carlet, C.,
Sunar, B. (eds.) WAIFI 2007. LNCS, vol. 4547, pp. 202–218. Springer, Heidelberg
(2007)
12. Fan, X., Gong, G., Jao, D.: Efficient Pairing Computation on Genus 2 Curves in
Projective Coordinates. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008.
LNCS, vol. 5381, pp. 18–34. Springer, Heidelberg (2009)
13. Flon, S., Oyono, R., Ritzenthaler, C.: Fast addition on non-hyperelliptic genus
3 curves. Algebraic geometry and its applications 5(3), 227–256 (2008)
14. Flon, S., Oyono, R.: Fast Arithmetic on Jacobians of Picard Curves. In: Bao,
F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 55–68. Springer,
Heidelberg (2004)
15. Galbraith, S.D.: Mathematics of Public Key Cryptography, 0.9 edition (February
11, 2011),
http://www.math.auckland.ac.nz/~sgal018/crypto-book/crypto-book.html
16. Galbraith, S.D., Harrison, M., Mireles Morales, D.J.: Efficient Hyperelliptic Arith-
metic using Balanced Representation for Divisors. In: van der Poorten, A.J., Stein,
A. (eds.) ANTS-VIII 2008. LNCS, vol. 5011, pp. 342–356. Springer, Heidelberg
(2008)
17. Gaudry, P.: An Algorithm for Solving the Discrete Log Problem on Hyperelliptic
Curves. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 19–34.
Springer, Heidelberg (2000)
18. Gaudry, P.: Hyperelliptic curves and the HCDLP. London Mathematical Society
Lecture Notes, vol. 317, ch.VII, pp. 133–150. Cambridge University Press (2005)
19. Gaudry, P.: Fast genus 2 arithmetic based on Theta functions. J. Math.
Crypt. 1(3), 243–265 (2007)
20. Gaudry, P., Harley, R.: Counting Points on Hyperelliptic Curves Over Finite
Fields. In: Bosma, W. (ed.) ANTS 2000. LNCS, vol. 1838, pp. 313–332. Springer,
Heidelberg (2000)
21. Gaudry, P., Thomé, E., Thériault, N., Diem, C.: A double large prime variation
for small genus hyperelliptic index calculus. Math. Comp. 76(257), 475–492 (2007)
22. Gonda, M., Matsuo, K., Aoki, K., Chao, J., Tsujii, S.: Improvements of addition
algorithm on genus 3 hyperelliptic curves and their implementation. IEICE Trans-
actions on Fundamentals of Electronics Communications and Computer Sciences,
89–96 (2005)
23. Guyot, C., Kaveh, K., Patankar, V.M.: Explicit algorithm for the arithmetic on
the hyperelliptic Jacobians of genus 3. Journal of the Ramanujan Mathematical
Society 19, 75–115 (2004)
24. Harley, R.: Fast arithmetic on genus 2 curves; C source code and further
explanations at http://cristal.inria.fr/~harley/hyper
25. Hess, F.: Computing Riemann-Roch spaces in algebraic function fields and related
topics. J. Symb. Comput. 33(4), 425–445 (2002)
26. Hisil, H.: Elliptic curves, group law, and efficient computation. PhD thesis,
Queensland University of Technology (2010)
27. Huang, M.A., Ierardi, D.: Efficient algorithms for the Riemann-Roch problem and
for addition in the Jacobian of a curve. J. Symb. Comput. 18(6), 519–539 (1994)
28. Katagi, M., Kitamura, I., Akishita, T., Takagi, T.: Novel Efficient Implementa-
tions of Hyperelliptic Curve Cryptosystems using Degenerate Divisors. In: Lim,
C.H., Yung, M. (eds.) WISA 2004. LNCS, vol. 3325, pp. 345–359. Springer,
Heidelberg (2005)
29. Khuri-Makdisi, K.: Linear algebra algorithms for divisors on an algebraic curve.
Math. Comp. 73(245), 333–357 (2004)
30. Khuri-Makdisi, K.: Asymptotically fast group operations on jacobians of general
curves. Math. Comp. 76(260), 2213–2239 (2007)
31. Koblitz, N.: Elliptic curve cryptosystems. Math. Comp. 48(177), 203–209 (1987)
32. Koblitz, N.: Hyperelliptic cryptosystems. J. Cryptology 1(3), 139–150 (1989)
33. Lang, S.: Introduction to algebraic geometry. Addison-Wesley (1972)
34. Lange, T.: Efficient arithmetic on hyperelliptic curves. PhD thesis, Universität-
Gesamthochschule Essen (2001)
35. Lange, T.: Efficient arithmetic on genus 2 hyperelliptic curves over finite fields
via explicit formulae. Cryptology ePrint Archive, Report 2002/121 (2002),
http://eprint.iacr.org/
36. Lange, T.: Inversion-free arithmetic on genus 2 hyperelliptic curves. Cryptology
ePrint Archive, Report 2002/147 (2002), http://eprint.iacr.org/
37. Lange, T.: Weighted coordinates on genus 2 hyperelliptic curves. Cryptology
ePrint Archive, Report 2002/153 (2002), http://eprint.iacr.org/
38. Lange, T.: Formulae for arithmetic on genus 2 hyperelliptic curves. Appl. Algebra
Eng. Commun. Comput. 15(5), 295–328 (2005)
39. Lange, T.: Elliptic vs. hyperelliptic, part II. Talk at ECC (September 2006)
40. Lauter, K.: The equivalence of the geometric and algebraic group laws for Jaco-
bians of genus 2 curves. Topics in Algebraic and Noncommutative Geometry 324,
165–171 (2003)
41. Lauter, K., Montgomery, P.L., Naehrig, M.: An Analysis of Affine Coordinates
for Pairing Computation. In: Joye, M., Miyaji, A., Otsuka, A. (eds.) Pairing 2010.
LNCS, vol. 6487, pp. 1–20. Springer, Heidelberg (2010)
42. Leitenberger, F.: About the group law for the Jacobi variety of a hyperelliptic
curve. Contributions to Algebra and Geometry 46(1), 125–130 (2005)
43. Matsuo, K., Chao, J., Tsujii, S.: Fast genus two hyperelliptic curve cryptosystems.
Technical Report 214, IEIC (2001)
44. Miller, V.S.: Use of Elliptic Curves in Cryptography. In: Williams, H.C. (ed.)
CRYPTO 1985. LNCS, vol. 218, pp. 417–426. Springer, Heidelberg (1986)
45. Miyamoto, Y., Doi, H., Matsuo, K., Chao, J., Tsujii, S.: A fast addition algorithm
of genus two hyperelliptic curve. In: Symposium on Cryptography and Information
Security - SCIS (2002) (in Japanese)
46. Mumford, D.: Tata lectures on theta II. In: Progress in Mathematics, vol. 43.
Birkhäuser Boston Inc., Boston (1984)
47. Pollard, J.M.: Monte Carlo methods for index computation (mod p). Math.
Comp. 32(143), 918–924 (1978)
48. Smith, B.: Isogenies and the discrete logarithm problem in Jacobians of genus 3
hyperelliptic curves. Journal of Cryptology 22(4), 505–529 (2009)
49. Sugizaki, H., Matsuo, K., Chao, J., Tsujii, S.: An extension of Harley addition
algorithm for hyperelliptic curves over finite fields of characteristic two. Technical
Report ISEC2002-9(2002-5), IEICE (2002)
50. Takahashi, M.: Improving Harley algorithms for Jacobians of genus 2 hyperelliptic
curves. In: Symposium on Cryptography and Information Security - SCIS (2002)
(in Japanese)
51. Wollinger, T.: Software and hardware implementation of hyperelliptic curve cryp-
tosystems. PhD thesis, Ruhr-University of Bochum (2004)
52. Wollinger, T., Kovtun, V.: Fast explicit formulae for genus 2 hyperelliptic curves
using projective coordinates. In: Fourth International Conference on Information
Technology, pp. 893–897 (2007)
53. Wollinger, T., Pelzl, J., Paar, C.: Cantor versus Harley: optimization and analysis
of explicit formulae for hyperelliptic curve cryptosystems. IEEE Transactions on
Computers, 861–872 (2005)
54. Yung, M., Dodis, Y., Kiayias, A., Malkin, T. (eds.): PKC 2006. LNCS, vol. 3958.
Springer, Heidelberg (2006)
Cryptographic Analysis of All 4 × 4-Bit S-Boxes
Markku-Juhani O. Saarinen
Revere Security
4500 Westgrove Drive, Suite 335, Addison, TX 75001, USA
[email protected]
1 Introduction
Horst Feistel introduced the Lucifer cipher, which can be considered to be the first
modern block cipher, some 40 years ago. Feistel followed closely the principles outlined
by Claude Shannon in 1949 [36] when designing Lucifer. We quote from Feistel’s 1971
patent text [20]:
Shannon, in his paper, presents further developments in the art of cryptog-
raphy by introducing the product cipher. That is, the successive application
of two or more distinctly different kinds of message symbol transformations.
One example of a product cipher consists of symbol substitution (nonlinear
transformation) followed by a symbol transposition (linear transformation).
Cryptographic algorithms are still designed in 2011 according to these same principles.
A key element of Lucifer’s symbol substitution layer was a pair of 4 × 4-bit substitution
boxes (S-Boxes).
Much research effort has been dedicated to the analysis of 4-bit S-Boxes in subsequent
encryption algorithms during the last four decades. In this paper we present an
analysis of all bijective 4-bit S-Boxes in the light of modern cryptanalytic techniques,
together with comparison tables of 4-bit S-Boxes found in a dozen different published
encryption algorithm proposals.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 118–133, 2012.
© Springer-Verlag Berlin Heidelberg 2012
2 S-Box Properties
In the context of cryptographic operations, arithmetic is assumed to be performed on
variables, vectors, or matrices whose individual elements belong to the finite field F2.
Vectors are indexed from 0. We write wt(x) = Σ_i x_i to denote the Hamming weight of
the bit vector (word) x.
We will first give definitions related to Differential Cryptanalysis (DC) [4,5], Linear
Cryptanalysis (LC) [32], and various forms of Algebraic / Cube Cryptanalysis (AC) [16,17].
Definition 1. Let S be an S-Box with |S| input values. Let n be the number of elements
x that satisfy S(x ⊕ Δi ) = S(x) ⊕ Δo . Then n/|S| is the differential probability p of
the characteristic SD (Δi → Δo ).
For 4 × 4 bijective S-Boxes the optimal differential bound (maximum of all differentials
in an individual S-Box) is p = 1/4.
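Definition 1 translates directly into code. A short Python sketch (not part of the paper; the PRESENT S-Box is used here purely as a well-known example of an optimal S-Box, and `max_diff_prob` is a name chosen for this sketch):

```python
def max_diff_prob(S):
    # Largest differential probability p over all characteristics with a
    # nonzero input difference di (Definition 1).
    best = 0
    for di in range(1, len(S)):
        for do in range(len(S)):
            n = sum(1 for x in range(len(S)) if S[x ^ di] == S[x] ^ do)
            best = max(best, n)
    return best / len(S)

PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD, 0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
```

For an optimal S-Box such as PRESENT's, the computed bound is the stated p = 1/4; for a structurally weak map such as the identity permutation it is 1.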
Definition 2. Let S be an S-Box with |S| input values. Let n be the number of elements
x that satisfy wt(βi · x ⊕ βo · S(x)) mod 2 = 1 for two bit-mask vectors βi and βo.
Then |n/|S| − 1/2| is the bias ε of the linear approximation SL(βi → βo).
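Definition 2 can likewise be computed exhaustively. In the Python sketch below (not from the paper; `max_bias` is a name chosen for this sketch), the x where the masked parities agree are counted instead of those where they disagree, which yields the same |n − |S|/2|:

```python
def max_bias(S):
    # Largest |n/|S| - 1/2| over all nonzero mask pairs (beta_i, beta_o).
    parity = lambda v: bin(v).count('1') & 1
    half = len(S) // 2
    best = 0
    for bi in range(1, len(S)):
        for bo in range(1, len(S)):
            n = sum(1 for x in range(len(S)) if parity(bi & x) == parity(bo & S[x]))
            best = max(best, abs(n - half))
    return best / len(S)

PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD, 0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
```

The optimal PRESENT S-Box attains the best possible bias bound of 1/4, while the identity permutation has perfect linear approximations of bias 1/2.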
It is well known that all 2^(2^n) functions f from n bits to a single bit can be uniquely
expressed by a polynomial function with coefficients drawn from the Algebraic Normal
Form fˆ, which has the same domain as f:

    f(x) = Σ_{y ∈ F2^n} fˆ(y) · x0^{y0} x1^{y1} · · · x_{n−1}^{y_{n−1}} .

This transformation from f to fˆ can also be seen to be equivalent to the Walsh transform
[35].
Definition 3. The algebraic degree deg(f) of a function f : F2^n → F2 is the maximal
weight wt(x) that satisfies fˆ(x) ≠ 0.
In other words, the degree of f is the number of variables in the biggest monomial in the
polynomial representation of f. Naturally the maximum degree for a 4-bit function is
4. This monomial exists in the polynomial representation exactly when f(0) ⊕ f(1) ⊕ · · · ⊕ f(15) = 1.
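The transformation from f to fˆ is a binary Möbius transform and can be computed in place. A Python sketch (not from the paper; `anf` and `degree` are names chosen for this sketch):

```python
def anf(tt):
    # Binary Moebius transform: tt is the truth table of f: F2^n -> F2
    # (index = input x); returns the ANF coefficient list (index y = monomial mask).
    f = tt[:]
    n = (len(tt) - 1).bit_length()
    for i in range(n):
        for x in range(len(f)):
            if x & (1 << i):
                f[x] ^= f[x ^ (1 << i)]
    return f

def degree(tt):
    # Algebraic degree: maximal wt(y) with a nonzero ANF coefficient (Definition 3)
    return max((bin(y).count('1') for y, c in enumerate(anf(tt)) if c), default=0)

PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD, 0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
```

As a check, the AND of the two low input bits has the single ANF monomial x0·x1 and degree 2, and every coordinate of a bijective 4-bit S-Box (being balanced) has degree at most 3.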
We define S-Box branch number similarly to the way it is defined in [39].
Fig. 1. Linear Equivalence (LE) and Permutation-XOR equivalence (PE). Mi and Mo boxes de-
note multiplication by an invertible matrix for LE and by a permutation matrix for PE.
BN(S) = min_{a ≠ b} ( wt(a ⊕ b) + wt(S(a) ⊕ S(b)) ),  where a, b ∈ F2^n.
It is clear that for a bijective S-Box the branch number is at least 2.
Table 1. The canonical representatives of the 16 “optimal” linear equivalence classes. The Gi and
Gi^−1 identifier references are to Table 6 of [31]. We also give the DC and LC bounds, together
with the number nd of characteristics at the differential bound and the number nl of approxima-
tions at the linear bound. The branch number BN given is the maximal branch number among all
members of the given LE class.
Algebraic Properties. While the maximal algebraic degree of all output bits may be
preserved in LE [31], some of the output bits may still be almost linear. It is notewor-
thy that despite belonging to LE(G1 ), one of the PRESENT output bits only has one
nonlinear monomial (of degree 2) and therefore this output bit depends only linearly on
2 of the input bits. This can be crucial when determining the number of secure rounds;
final rounds can be peeled off using such properties.
Historical Developments. The original DES S-Box design principles are described
in [10]. In hindsight it can be seen that the criteria given in that 1976 document al-
ready offer significantly better resistance against primitive DC and LC than what can
be achieved with entirely random S-Boxes [11]. For a perspective on the development
of DES and the evaluation of its S-Boxes between the years 1975 and 1990 we refer to
[13]. We may compare our current view on the topic of “good” S-Boxes to that given
by Adams and Tavares in 1990 [2]. Four evaluation criteria for S-Boxes were given in
that work: bijectivity, nonlinearity, strict avalanche, and independence of output bits. In
current terminology nonlinearity would map to the algebraic degree, strict avalanche to
the branch number, and independence of output bits roughly to both DC and LC. Note
that modern DC, LC, and AC were (re)discovered after 1990.
Bit # 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 Hex
Word W0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0x00FF
Word W1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0x0F0F
Word W2 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0x3333
Word W3 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0x5555
Fig. 2. Our internal 4×16-bit representation of the identity permutation (0, 1, . . . , 15). The words
are always stored in increasing order and the highest bit is normalized to zero.
for some bit permutation P of the numbers (0, 1, 2, 3), a vector c ∈ F2^4 and words
Wi = Σ_{j=0}^{15} 2^j Wi,j satisfying 0 < W0 < W1 < W2 < W3 < 2^15.
Proof. Output bits can be permuted in 4! = 24 different ways (as each Wi must be dif-
ferent from the others) and each one of the 2^4 = 16 masks c creates a different permutation
due to the limit Wi < 2^15. P and c uniquely define the 4!·2^4 = 384 outer transforma-
tions while the Wi uniquely define the rest.
This representation offers a natural and quick way to normalize an S-Box with respect to
the outer permutation Po and mask co by sorting the four words and inverting all bits of
a word if its highest bit is set. Figure 2 illustrates this ordering.
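This outer normalization is a few lines of Python (not from the paper; `outer_normalize` is a name chosen for this sketch, and word b collects output bit b of S(x) at bit position x, matching Figure 2):

```python
def outer_normalize(S):
    # Pack output bit b of S(x) into bit x of word b, then normalize w.r.t. the
    # outer mask co (invert any word whose bit 15 is set) and the outer
    # permutation Po (sort the words into increasing order).
    words = []
    for b in range(4):
        w = sum(((S[x] >> b) & 1) << x for x in range(16))
        if w >> 15:
            w ^= 0xFFFF
        words.append(w)
    return sorted(words)
```

The identity permutation normalizes to exactly the four words of Figure 2, and XOR-ing an output constant onto S leaves the normal form unchanged.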
From the fact that S is bijective it follows that wt(Wi) = 8 for all Wi. There are
C(16, 8) = 12,870 16-bit words of weight 8, of which we may remove half due to the
co normalization limit Wi < 2^15, yielding 6,435 candidates. Furthermore, each word
has a minimal equivalent up to permutation among all input permutations Pi and input
constants ci. We call this minimal word mw(x). At program start, a table is initialized
that contains mw(x) for each 16-bit word by trying all 24 permutations of input bits
and 16 values of ci on the 4 × 1-bit Boolean function that the word x represents. If the
resulting word is greater than or equal to 2^15 (indicating that the highest bit is set), all
bits of the word are inverted, normalizing the constant. Each one of the weight-8
candidates maps to a set of just 58 different mw(x) values.
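The counting argument can be reproduced directly (a short Python sketch, not from the paper):

```python
from math import comb

# All 16-bit words of Hamming weight 8, and the half that survive the
# co normalization limit Wi < 2^15 (for each word, exactly one of the word
# and its complement has bit 15 clear).
weight8 = [w for w in range(1 << 16) if bin(w).count('1') == 8]
below_limit = sum(1 for w in weight8 if w < (1 << 15))
```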
Table 2. Distribution of PE classes. The first column gives the number of elements in each class.
The second column |Cn | gives the number of such classes, followed by their product, which sums
to 16! = 20, 922, 789, 888, 000 as expected.
n/(4!·2^4)    |Cn|    n·|Cn|    Representative
1 2 768 0123456789ABCDEF
4 4 6144 01234567FEDCBA98
6 1 2304 01237654BA98CDEF
8 4 12288 0123456879ABCDEF
12 30 138240 0123456798BADCFE
16 18 110592 0123457689BADCFE
24 192 1769472 0123456789ABFEDC
32 104 1277952 0123456789ABCDFE
48 1736 31997952 0123456789ABCEDF
64 264 6488064 012345678ACD9EBF
96 13422 494788608 0123456789ABDEFC
128 324 15925248 0123456789ADCEBF
192 373192 27514699776 0123456789ABCEFD
384 141701407 20894722670592 0123456789ACBEFD
1–384 142090700 20922789888000
Table 3. Distribution of the 16! permutations in relation to Differential Cryptanalysis (rows) and
Linear Cryptanalysis (columns)
now left with exactly 2^8 = 256 options for the last word W3. In lines 10–12 we store
in vector vi the lesser bit from the two-bit mask ui. In lines 13–20 we loop through the
remaining W3 possibilities. In line 14 we use bit i of the loop index b to select which
one of the two bits in ui is used as part of W3. Note that this part may be implemented
a bit faster with a Gray-code sequence.
The unique permutation is then tested by the subroutine on line 19 to see if it is the
least member of its class (here an early exit strategy will usually exit the exhaustive loop
early). If (W0 , W1 , W2 , W3 ) is indeed the canonical member in the special ordering that
we’re using, it is stored on on disk together with the size of the class. The entire process
of creating the 1.4 GB file takes about half an hour with a 2011 consumer laptop.
Table 4. Golden S-Boxes with ideal properties are all members of these four PE classes. Both
the S-Boxes and their inverses satisfy the bounds p ≤ 1/4 and ε ≤ 1/4, have branch number 3,
and all output bits have algebraic degree 3 and are dependent on all input bits in nonlinear
fashion. n gives the total size of the class and n′ the number of members which additionally
have a perfect cycle structure.
PE Representative LE n n’
035869C7DAE41FB2 G9 147456 19584
03586CB79EADF214 G9 147456 19584
03586AF4ED9217CB G10 147456 22656
03586CB7A49EF12D G10 147456 22656
S-Boxes. Each class size is divisible by 4!·2^4 = 384 due to the fact that the output bits can be permuted in 4! = 24 ways and the output constant co can take 2^4 = 16 different values. However, it is less obvious how the inner transform defined by Pi and ci affects the size of the class together with S. For example, for the identity permutation (0123456789ABCDEF) the bit shuffles Pi and Po and the constant additions ci and co can be represented by a single bit permutation and a single constant addition, and hence n = 384. It is interesting to note that there is one other class with this size, the one with the largest canonical representative, (07BCDA61E952348F).
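The 384-element count is easy to confirm by brute force. The sketch below (our own verification, using one common parameterization S′(x) = Po(S(Pi(x) ⊕ ci)) ⊕ co of permutation-XOR equivalence, not the paper's tooling) enumerates all 24·16·24·16 parameter choices for the identity S-Box and counts the distinct resulting permutations:

```python
from itertools import permutations

def perm_table(p):
    # apply the 4-bit wire permutation p to a nibble: output bit j is input bit p[j]
    return [sum(((x >> p[j]) & 1) << j for j in range(4)) for x in range(16)]

tables = [perm_table(p) for p in permutations(range(4))]  # all 24 bit shuffles
S = list(range(16))  # the identity S-Box

distinct = set()
for Pi in tables:
    for Po in tables:
        for ci in range(16):
            for co in range(16):
                # one common parameterization of permutation-XOR equivalence
                distinct.add(tuple(Po[S[Pi[x] ^ ci]] ^ co for x in range(16)))
print(len(distinct))  # 384
```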
Table 3 gives the distribution of differential and linear properties among the 16! S-Boxes examined. It can be seen that a majority (54.7155%) of all S-Boxes have a differential bound p ≤ 3/4 and a linear bound of at most 3/4. There are no bijective S-Boxes with differential bound p = 7/8. Appendix A gives results on some well-known 4-bit S-Boxes.
5 Golden S-Boxes
Based on our exhaustive search, we may describe golden S-Boxes that have ideal properties. From Table 1 we see that the most tempting candidates belong to the LE sets of G9, G10, G14, and G15, as they have the smallest nd and nl numbers among those S-Boxes that have branch number 3. Note that LE(G14) = LE(G15⁻¹) and vice versa.
The only problem with G14 and G15 in comparison to G9 and G10 is that if we want the branch number to be larger than 2, there are no S-Boxes in these classes that have the desired property that all output bits are nonlinearly dependent on all input bits and have degree 3. Either the permutation or its inverse will fail this condition. This has been verified by exhaustive search. All golden S-Boxes belong to the four PE classes given in Table 4.
The Serpent [1] S-Box S3, the Hummingbird-1 [18] S-Boxes S1, S2, and S3, and the Hummingbird-2 [19] S-Boxes¹ S0 and S1 are the only known examples of “golden” S-Boxes in the literature. Note that cipher designers may want to avoid re-using the same LE class in multiple S-Boxes, and hence not all of them can be “golden”. Please see Appendix A for a more detailed comparison.

¹ Hummingbird-2 was tweaked in May 2011 to use these S-Boxes, and they are also contained in [19]. Some early prototypes used S-Boxes from Serpent.
Cryptographic Analysis of All 4 × 4-Bit S-Boxes 127
6 Conclusions
We have analyzed all 16! bijective 4 × 4-bit S-Boxes and classified them into linear equivalence (LE) and permutation equivalence (PE) classes. Members of an LE class have equivalent differential and linear bounds, but not necessarily the same branch number, algebraic properties, and circuit complexity. Members of a PE class share all of these properties. Each equivalence class can be uniquely identified with the use of a canonical representative, which we define to be the member which comes first in the lexicographic ordering of the class members.
There are 142,090,700 different PE classes, the vast majority (99.7260%) of which have (4!·2^4)^2 = 147456 elements. We classify the S-Boxes according to their differential and linear properties. It turns out that a majority (54.7155%) of S-Boxes have differential bound p ≤ 3/4 and linear bound ≤ 3/4.
Furthermore, we have discovered that not all of the “optimal” S-Boxes described
in [31] are equal if we take the branch number and multiple differential and linear
cryptanalysis into account.
In an appendix we give comparison tables of the S-Boxes from Lucifer [37], PRESENT [9], JH [41], ICEBERG [38], LUFFA [15], NOEKEON [12], HAMSI [30], Serpent [1], Hummingbird-1 [18], Hummingbird-2 [19], GOST [14,23,24] and DES [33].
Acknowledgements. The author wishes to thank Whitfield Diffie and numerous other
commentators for their input. This work is still ongoing.
References
1. Anderson, R., Biham, E., Knudsen, L.: Serpent: A Proposal for the Advanced Encryption
Standard (1999), http://www.cl.cam.ac.uk/~rja14/Papers/serpent.pdf
2. Adams, C., Tavares, S.: The Structured Design of Cryptographically Good S-Boxes. Journal
of Cryptology 3(1), 27–41 (1990)
3. Biham, E.: A Fast New DES Implementation in Software. In: Biham, E. (ed.) FSE 1997.
LNCS, vol. 1267, pp. 260–272. Springer, Heidelberg (1997)
4. Biham, E., Shamir, A.: Differential Cryptanalysis of DES-Like Cryptosystems. In:
Menezes, A., Vanstone, S.A. (eds.) CRYPTO 1990. LNCS, vol. 537, pp. 2–21. Springer,
Heidelberg (1991)
5. Biham, E., Shamir, A.: Differential Cryptanalysis of the Data Encryption Standard.
Springer, Heidelberg (1993)
6. Biryukov, A., De Cannière, C., Braeken, A., Preneel, B.: A Toolbox for Cryptanalysis:
Linear and Affine Equivalence Algorithms. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS,
vol. 2656, pp. 33–50. Springer, Heidelberg (2003)
7. Biryukov, A., De Cannière, C., Quisquater, M.: On Multiple Linear Approximations. In:
Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 1–22. Springer, Heidelberg (2004)
8. Blondeau, C., Gérard, B.: Multiple Differential Cryptanalysis: Theory and Practice. In:
Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 35–54. Springer, Heidelberg (2011)
9. Bogdanov, A.A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw, M.J.B.,
Seurin, Y., Vikkelsoe, C.: PRESENT: An Ultra-Lightweight Block Cipher. In: Paillier, P.,
Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466. Springer, Heidelberg
(2007)
128 M.-J.O. Saarinen
10. Branstad, D.K., Gait, J., Katzke, S.: Report of the Workshop on Cryptography in Support of
Computer Security. Tech. Rep. NBSIR 77-1291, National Bureau of Standards (September
1976)
11. Coppersmith, D.: The Data Encryption Standard (DES) and its strength against attacks.
IBM Journal of Research and Development Archive 38(3) (May 1994)
12. Daemen, J., Peeters, M., Van Assche, G., Rijmen, V.: Nessie Proposal: NOEKEON.
NESSIE Proposal (October 27, 2000)
13. Denning, D.: The Data Encryption Standard – Fifteen Years of Public Scrutiny. In: Dis-
tinguished Lecture in Computer Security, Sixth Annual Computer Security Applications
Conference, Tucson, December 3-7 (1990)
14. Dolmatov, V. (ed.): GOST 28147-89: Encryption, Decryption, and Message Authentication
Code (MAC) Algorithms. Internet Engineering Task Force RFC 5830 (March 2010)
15. De Cannière, C., Sato, H., Watanabe, D.: Hash Function Luffa - Specification Ver. 2.0.1.
NIST SHA-3 Submission, Round 2 document (October 2, 2009)
16. Courtois, N.T., Pieprzyk, J.: Cryptanalysis of Block Ciphers with Overdefined Systems of
Equations. In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 267–287. Springer,
Heidelberg (2002)
17. Dinur, I., Shamir, A.: Cube Attacks on Tweakable Black Box Polynomials. In: Joux, A.
(ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 278–299. Springer, Heidelberg (2009)
18. Engels, D., Fan, X., Gong, G., Hu, H., Smith, E.M.: Hummingbird: Ultra-Lightweight
Cryptography for Resource-Constrained Devices. In: Sion, R., Curtmola, R., Dietrich, S.,
Kiayias, A., Miret, J.M., Sako, K., Sebé, F. (eds.) RLCPS, WECSR, and WLC 2010. LNCS,
vol. 6054, pp. 3–18. Springer, Heidelberg (2010)
19. Engels, D., Saarinen, M.-J.O., Schweitzer, P., Smith, E.M.: The Hummingbird-2
Lightweight Authenticated Encryption Algorithm. In: RFIDSec 2011, The 7th Workshop
on RFID Security and Privacy, Amherst, Massachusetts, USA, June 26-28 (2011)
20. Feistel, H.: Block Cipher Cryptographic System. U.S. Patent 3,798,359 (filed June 30, 1971)
21. Hermelin, M., Nyberg, K.: Dependent Linear Approximations: The Algorithm of Biryukov
and Others Revisited. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 318–333.
Springer, Heidelberg (2010)
22. Golomb, S.: On the classification of Boolean functions. IEEE Transactions on Information
Theory 5(5), 176–186 (1959)
23. Government Committee of the USSR for Standards. Cryptographic Protection for Data Pro-
cessing System. GOST 28147-89, Gosudarstvennyi Standard of USSR (1989) (in Russian)
24. Government Committee of the Russia for Standards. Information technology. Crypto-
graphic Data Security. Hashing function. GOST R 34.11-94, Gosudarstvennyi Standard of
Russian Federation (1994) (in Russian)
25. Hiltgen, A.P.: Constructions of Feebly-One-Way Families of Permutations. In: Zheng, Y.,
Seberry, J. (eds.) AUSCRYPT 1992. LNCS, vol. 718, pp. 422–434. Springer, Heidelberg
(1993)
26. Hiltgen, A.P.: Towards a Better Understanding of One-Wayness: Facing Linear Permuta-
tions. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS, vol. 1403, pp. 319–333. Springer,
Heidelberg (1998)
27. Hirsch, E.A., Nikolenko, S.I.: A Feebly Secure Trapdoor Function. In: Frid, A., Moro-
zov, A., Rybalchenko, A., Wagner, K.W. (eds.) CSR 2009. LNCS, vol. 5675, pp. 129–142.
Springer, Heidelberg (2009)
28. Intel: Intel Advanced Vector Extensions Programming Reference. Publication 319433-010,
Intel (April 2011)
29. Kaliski Jr., B.S., Robshaw, M.J.B.: Linear Cryptanalysis Using Multiple Approximations.
In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 26–39. Springer, Heidelberg
(1994)
30. Küçük, Ö.: The Hash Function Hamsi. NIST SHA-3 Submission, Round 2 document
(September 14, 2009)
31. Leander, G., Poschmann, A.: On the Classification of 4 Bit S-Boxes. In: Carlet, C., Sunar,
B. (eds.) WAIFI 2007. LNCS, vol. 4547, pp. 159–176. Springer, Heidelberg (2007)
32. Matsui, M.: Linear Cryptanalysis Method for DES Cipher. In: Helleseth, T. (ed.) EURO-
CRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994)
33. National Bureau of Standards: Data Encryption Standard. FIPS PUB 46. National Bureau
of Standards, U.S. Department of Commerce, Washington D.C. (January 15, 1977)
34. Poschmann, A.: Lightweight Cryptography - Cryptographic Engineering for a Pervasive
World. Doktor-Ingenieur Thesis, Ruhr-University Bochum, Germany. Also available as
Cryptology ePrint Report 2009/516 (2009)
35. Saarinen, M.-J.O.: Chosen-IV Statistical Attacks Against eSTREAM CIPHERS. In: Proc.
SECRYPT 2006, International Conference on Security and Cryptography, Setubal, Portu-
gal, August 7-10 (2006)
36. Shannon, C.E.: Communication Theory of Secrecy Systems. Bell System Technical Jour-
nal 28, 656–717 (1949)
37. Sorkin, A.: Lucifer: A cryptographic algorithm. Cryptologia 8(1), 22–42 (1984)
38. Standaert, F.-X., Piret, G., Rouvroy, G., Quisquater, J.-J., Legat, J.-D.: ICEBERG: An In-
volutional Cipher Efficient for Block Encryption in Reconfigurable Hardware. In: Roy, B.,
Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 279–299. Springer, Heidelberg (2004)
39. Ullrich, M., De Cannière, C., Indesteege, S., Küçük, Ö., Mouha, N., Preneel, B.: Finding Optimal Bitsliced Implementations of 4 × 4-bit S-Boxes. In: SKEW 2011 Symmetric Key Encryption Workshop, Copenhagen, Denmark, February 16-17 (2011)
40. Wegener, I.: The Complexity of Boolean Functions. Wiley-Teubner Series in Computer Science. Wiley-Teubner (1987)
41. Wu, H.: The Hash Function JH. NIST SHA-3 Submission, Round 3 document (January 16,
2011)
A Cryptographic Analysis of Some Well-Known 4 × 4-Bit S-Boxes

Algorithm & Source: A normative identifier for the S-Box in question, together with a literary reference.
S-Box: The S-Box permutation S(x) in hex.
Canonical PE: The lexicographically smallest member of the Permutation-XOR equivalence class PE(S).
Lin. Eqv.: The linear equivalence class LE(S).
One Δ: Number of instances where flipping a single input bit will cause a single output bit to change (out of 64).
BN #: Branch number.
DC: Differential bound p and the number nd of characteristics at that bound.
LC: Linear bias and the number nl of linear approximations at that bound.
Bit n: The linear set LS of input bits that only have a linear effect on this output bit, together with its degree.
Algorithm & Source   S-Box              Canonical PE       Lin.Eqv.  One Δ  BN #  DC p nd   LC nl    Bit 0 LS deg  Bit 1    Bit 2    Bit 3
?                    D8A13F42B67C05E9   0358BDC6E92F741A   -         -      -     /8 6      /8 4     -        -        -        -
DES S3-0 [33]        7DE3069A1285BC4F   035869BECFA412D7   -         0      3     3/8 6     3/8 4    {} 3     {} 3     {} 3     {} 3
DES S3-1 [33]        D8B56F03472C1AE9   035869BECFA412D7   -         0      3     3/8 6     3/8 4    {} 3     {} 3     {} 3     {} 3
DES S3-2 [33]        A690CB7DF13E5284   035869BECFA412D7   -         0      3     3/8 6     3/8 4    {} 3     {} 3     {} 3     {} 3
DES S3-3 [33]        3F06A1D8945BC72E   035869BECFA412D7   -         0      3     3/8 6     3/8 4    {} 3     {} 3     {} 3     {} 3
DES S4-0 [33]        2C417AB6853FD0E9   03586DF47EA1CB92   -         0      3     1/2 1     3/8 3    {} 3     {} 3     {0,2} 2  {} 3
DES S4-1 [33]        EB2C47D150FA3986   035869BECF241AD7   -         0      3     3/8 5     3/8 3    {} 3     {} 3     {} 3     {} 3
DES S4-2 [33]        421BAD78F9C5630E   03586DF2A49E1BC7   -         0      3     3/8 2     3/8 2    {} 3     {} 3     {} 3     {} 3
DES S4-3 [33]        B8C71E2D6F09A453   03586AB79CE2F14D   -         0      3     3/8 5     3/8 3    {} 3     {1} 3    {} 3     {} 3
DES S5-0 [33]        C1AF92680D34E75B   03586DF29EA4CB17   -         0      3     1/4 24    3/8 1    {} 3     {} 3     {} 3     {} 3
DES S5-1 [33]        AF427C9561DE0B38   0358749FDAB6E12C   -         0      3     1/2 1     3/8 3    {} 3     {} 3     {} 3     {} 3
DES S5-2 [33]        9EF528C3704A1DB6   035869BEA4CFD721   -         0      3     3/8 6     3/8 4    {} 3     {1} 3    {} 3     {} 3
DES S5-3 [33]        432C95FABE17608D   035874BEF6ADC912   -         0      3     3/8 3     3/8 2    {3} 2    {} 3     {0} 3    {} 3
DES S6-0 [33]        4B2EF08D3C975A61   03586CB79EF2A14D   -         0      3     1/2 1     3/8 4    {} 3     {2} 3    {0} 2    {} 3
DES S6-1 [33]        D0B7491AE35C2F86   03586DF47ECBA192   -         0      3     1/2 1     3/8 2    {} 3     {} 3     {} 3     {0,2} 2
DES S6-2 [33]        14BDC37EAF680592   035869BECFA412D7   -         0      3     3/8 6     3/8 4    {} 3     {} 3     {} 3     {} 3
DES S6-3 [33]        6BD814A7950FE23C   035869B7F4AD1EC2   -         0      3     1/2 2     3/8 5    {2} 3    {2} 3    {0} 3    {} 3
DES S7-0 [33]        D2846FB1A93E50C7   03589CE2F6AD4B71   -         0      3     3/8 4     3/8 1    {2} 3    {3} 2    {} 3     {} 3
DES S7-1 [33]        1FD8A374C56B0E92   03587ACF96EB4D21   -         0      3     5/8 1     3/8 5    {} 3     {} 3     {} 3     {2} 3
DES S7-2 [33]        7B419CE206ADF358   035869BEF4ADC217   -         0      3     3/8 5     3/8 3    {} 3     {1} 3    {0} 3    {} 3
DES S7-3 [33]        21E74A8DFC90356B   035678EB9F2CA4D1   -         0      3     1/2 1     3/8 4    {} 3     {} 3     {1,3} 2  {} 3
Serpent S0 [1]       38F1A65BED42709C   0358749EF62BADC1   G2        0      3     1/4 24    1/4 36   {} 3     {} 3     {} 3     {1,2} 2
Serpent S1 [1]       FC27905A1BE86D34   035A7CB6D429E18F   G0        0      3     1/4 24    1/4 36   {} 3     {} 3     {2,3} 2  {} 3
Serpent S2 [1]       86793CAFD1E40B52   035869A7BCE21FD4   G1        0      3     1/4 24    1/4 36   {1,3} 2  {} 3     {} 3     {} 3
Serpent S3 [1]       0FB8C963D124A75E   03586CB79EADF214   G9        0      3     1/4 18    1/4 32   {} 3     {} 3     {} 3     {} 3
Serpent S4 [1]       1F83C0B6254A9E7D   035879BEADF4C261   G14       0      3     1/4 18    1/4 32   {2} 2    {} 3     {} 3     {} 3
Serpent S5 [1]       F52B4A9C03E8D671   035879BEADF4C261   G14       0      3     1/4 18    1/4 32   {2} 2    {} 3     {} 3     {} 3
Serpent S6 [1]       72C5846BE91FD3A0   0358BC6FE9274AD1   G1        0      3     1/4 24    1/4 36   {} 3     {1,2} 2  {} 3     {} 3
Serpent S7 [1]       1DF0E82B74CA9356   035869C7DAE41FB2   G9        0      3     1/4 18    1/4 32   {} 3     {} 3     {} 3     {} 3
Serpent−1 S0 [1]     D3B0A65C1E47F982   035A7CB6D429E18F   G0        0      3     1/4 24    1/4 36   {} 3     {} 3     {2,3} 2  {} 3
Serpent−1 S1 [1]     582EF6C3B4791DA0   0358749EF62BADC1   G2        0      3     1/4 24    1/4 36   {} 3     {} 3     {} 3     {0,2} 2
Serpent−1 S2 [1]     C9F4BE12036D58A7   03586CB7AD9EF124   G1        0      3     1/4 24    1/4 36   {0} 2    {} 3     {} 3     {} 3
Serpent−1 S3 [1]     09A7BE6D35C248F1   035869C7DAE41FB2   G9        0      3     1/4 18    1/4 32   {} 3     {} 3     {} 3     {} 3
Serpent−1 S4 [1]     5083A97E2CB64FD1   03586CF1A49EDB27   G15       0      3     1/4 18    1/4 32   {} 3     {} 3     {} 3     {} 3
Serpent−1 S5 [1]     8F2941DEB6537CA0   03586CF1A49EDB27   G15       0      3     1/4 18    1/4 32   {} 3     {} 3     {} 3     {} 3
Serpent−1 S6 [1]     FA1D536049E72C8B   03567ABCD4E9812F   G1        0      3     1/4 24    1/4 36   {} 3     {1,3} 2  {} 3     {} 3
Serpent−1 S7 [1]     306D9EF85CB7A142   03586CB79EADF214   G9        0      3     1/4 18    1/4 32   {} 3     {} 3     {} 3     {} 3
GOST K1 [14]         4A92D80E6B1C7F53   01243DFA856B97EC   -         14     2     3/8 2     1/4 36   {} 3     {} 3     {} 3     {} 3
GOST K2 [14]         EB4C6DFA23810759   01254DC68BE3F79A   -         14     2     3/8 3     3/8 2    {} 3     {} 3     {} 3     {} 3
GOST K3 [14]         581DA342EFC7609B   01254EB97AF38D6C   -         14     2     3/8 5     3/8 3    {} 3     {} 3     {} 3     {} 3
GOST K4 [14]         7DA1089FE46CB253   0132586FC79DBEA4   -         8      2     3/8 5     3/8 3    {} 3     {} 3     {} 3     {3} 3
GOST K5 [14]         6C715FD84A9E03B2   0124B78EDF6CA359   -         12     2     1/4 21    3/8 1    {} 3     {} 3     {} 3     {} 3
GOST K6 [14]         4BA0721D36859CFE   01273CFAB85ED649   -         4      2     3/8 2     3/8 2    {} 3     {1} 3    {} 3     {} 3
GOST K7 [14]         DB413F590AE7682C   01256D8BCA47F3E9   -         12     2     1/2 1     3/8 2    {} 3     {} 3     {} 3     {} 3
GOST K8 [14]         1FD057A4923E6B8C   012546F8EB7A39CD   -         12     2     1/2 1     3/8 4    {} 3     {} 3     {} 3     {} 3
The Cryptographic Power of Random Selection
1 Introduction
The very limited computational resources available in technical devices like RFID (radio frequency identification) tags have prompted an intensive search for lightweight authentication protocols in recent years. Standard block encryption functions like Triple-DES or AES seem unsuited for such protocols, largely because the amount of hardware needed to implement them and the energy consumed in performing these operations are too high (see, e.g., [7] or [17] for more information on this topic).
This situation initiated two lines of research. The first resulted in proposals for new lightweight block encryption functions like PRESENT [4], KATAN and KTANTAN [10], by use of which standard block-cipher-based authentication protocols can be made lightweight, too. A second line, which we follow in this paper, is to look for new cryptographic paradigms which allow for designing new symmetric lightweight authentication protocols. The two main suggestions discussed so far in the relevant literature are the principle of random selection and the principle of adding biased noise.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 134–150, 2012.
© Springer-Verlag Berlin Heidelberg 2012
The Power of Random Selection 135
The principle of adding biased noise to the output of a linear basis function
underlies the HB-protocol, originally proposed by Hopper and Blum [16] and
later improved to HB+ by Juels and Weis [17], as well as its variants HB# and
Trusted-HB (see [13] and [6], respectively). The protocols of the HB family are provably secure against passive attacks under the Learning Parity with Noise conjecture, but the problem of designing HB-like protocols which are secure against active adversaries seems to be still unsolved (see, e.g., [14], [21], [12]).
The principle of random selection underlies, e.g., the CKK-protocols of Ci-
choń, Klonowski, and Kutylowski [7] as well as the Ff -protocols in [3] and the
Linear Protocols in [18]. It can be described as follows.
Suppose that the verifier Alice and the prover Bob run a challenge-response authentication protocol which uses a lightweight symmetric encryption operation E : {0,1}^n × K → {0,1}^m of block length n, where K denotes an appropriate key space. Suppose further that E is weak in the sense that a passive adversary can efficiently compute the secret key K ∈ K from samples of the form (u, E_K(u)). This is obviously the case if E is linear.
Random selection denotes a method for compensating the weakness of E by using the following mode of operation. Instead of holding a single K ∈ K, Alice and Bob share a collection K_1, . . . , K_L of keys from K as their common secret information, where L > 1 is a small constant. Upon receiving a challenge u ∈ {0,1}^n from Alice, Bob chooses a random index l ∈ {1, . . . , L} and outputs the response y = E(u, K_l). The verification of y with respect to u can be done efficiently by computing E_{K_l}^{-1}(y) for all l = 1, . . . , L.
The main problem this paper is devoted to is to determine the level of security which can be reached by applying this principle of random selection.
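The mode of operation just described can be sketched as follows (a toy illustration with parameters and names of our own choosing; E is instantiated with GF(2)-linear maps given by random bit matrices, and the verifier simply recomputes E(u, K_l) for every l, which is equivalent to the inversion-based check):

```python
import random

n, L = 8, 4          # toy parameters
rng = random.Random(2011)

def apply_key(M, u):
    # GF(2) matrix-vector product; row i of M is stored as an n-bit integer
    return sum((bin(M[i] & u).count("1") & 1) << i for i in range(n))

# shared secret: L random n x n bit matrices K_1, ..., K_L
keys = [[rng.getrandbits(n) for _ in range(n)] for _ in range(L)]

def respond(u):
    # prover Bob: pick a random index l and answer y = E(u, K_l)
    return apply_key(rng.choice(keys), u)

def verify(u, y):
    # verifier Alice: accept iff some shared key explains the response
    return any(apply_key(K, u) == y for K in keys)

u = rng.getrandbits(n)          # Alice's challenge
assert verify(u, respond(u))    # Bob's response is accepted
```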
Note that the protocols introduced in [7], [3], and [18] are based on random
selection of GF (2)-linear functions. The choice of linear basis functions is moti-
vated by the fact that they can be implemented efficiently in hardware and have
desirable pseudo-random properties with respect to a wide range of important
statistical tests.
It is quite obvious that, with respect to passive adversaries, the security of protocols which use random selection of linear functions can be bounded from above by the complexity of the following learning problem, referred to as RandomSelect(L, n, a): learn GF(2)-linear functions f_1, . . . , f_L : {0,1}^n → {0,1}^a from values (u, f_l(u)), where the secret indices l ∈ {1, . . . , L} and the inputs u ∈ {0,1}^n are randomly chosen by an oracle. In order to illustrate this notion, we sketch in Appendix B how an efficient learning algorithm for RandomSelect(L, n, a) can be used for attacking the linear (n, k, L)+ protocol described by Krause and Stegemann [18].
In this paper, we present an algebraic attack approach for solving the above learning problem RandomSelect(L, n, a). The running time of our algorithm is dominated by the effort necessary to solve a full-rank system of linear equations of O(n^L) unknowns over the field GF(2^a). Note that trivial approaches for solving RandomSelect(L, n, a) lead to a running time exponential in n.
136 M. Krause and M. Hamann
2 The Approach
We fix positive integers n, a, L and secret GF(2)-linear functions f_1, . . . , f_L : {0,1}^n → {0,1}^a. The learner seeks to deduce specifications of f_1, . . . , f_L from an oracle which outputs in each round an example (u, w) ∈ {0,1}^n × {0,1}^a in the following way. The oracle chooses independently and uniformly a random input u ∈_U {0,1}^n, then secretly chooses a random index l ∈_U [L]¹, computes w = f_l(u) and outputs (u, w).
It is easy to see that RandomSelect can be efficiently solved in the case L = 1 by collecting examples (u_1, w_1), . . . , (u_m, w_m) until u_1, . . . , u_m contains a basis of GF(2)^n. The expected number of iterations until the above goal is reached can be approximated by n + 1.61 (see, e.g., the appendix in [11]).
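The L = 1 case is easy to simulate. The sketch below (our own illustration, not code from the paper) maintains a GF(2) echelon basis keyed by pivot position and empirically reproduces the n + 1.61 estimate:

```python
import random

def samples_until_basis(n, rng):
    # draw uniform u in {0,1}^n until the collected inputs span GF(2)^n
    pivots = {}  # highest set bit -> reduced basis vector (as an int)
    count = 0
    while len(pivots) < n:
        v = rng.getrandbits(n)
        count += 1
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v   # v is independent of everything seen so far
                break
            v ^= pivots[h]      # eliminate the leading bit and continue
    return count

rng = random.Random(0)
n, trials = 16, 2000
avg = sum(samples_until_basis(n, rng) for _ in range(trials)) / trials
print(round(avg, 2))  # empirically close to n + 1.61
```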
We will now treat the case L > 1, which immediately yields a sharp rise in difficulty. First we need to introduce the notion of a pure basis.
Definition 1. Let us call a set V = {(u_1, w_1), . . . , (u_n, w_n)} of n examples a pure basis if u_1, . . . , u_n is a basis of GF(2)^n and there exists an index l ∈ [L] such that w_i = f_l(u_i) is satisfied for all i = 1, . . . , n.
Recalling our preliminary findings, we can easily infer that for m ∈ Ln + Ω(1), a set of m random examples contains such a pure basis with high probability. Moreover, note that for a given set Ṽ = {(ũ_1, w̃_1), . . . , (ũ_n, w̃_n)} the pure basis property can be tested efficiently. The respective strategy makes use of the fact that in case of a random example (u, w), where u = ⊕_{i∈I} ũ_i and I ⊆ [n]², the probability p that w = ⊕_{i∈I} w̃_i holds is approximately L^{-1} if Ṽ is pure and at most (2·L)^{-1} otherwise. The latter estimate is based on the trivial observation that if Ṽ is not a pure basis, it contains at least one tuple (ũ_j, w̃_j), j ∈ [n], which would have to be exchanged to make the set pure. As j ∈ I holds true for half of all possible (but valid) examples, the probability that w = ⊕_{i∈I} w̃_i is fulfilled although Ṽ is not pure can be bounded from above by (2·L)^{-1}.
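This purity test can be simulated directly. In the toy sketch below (our own construction, with a = n and the oracle's input generated as a subset sum over the candidate basis), a pure candidate is confirmed at a rate close to 1/L, while an impure one stays near 1/(2L):

```python
import random

n, L, trials = 12, 4, 20000
rng = random.Random(7)

def apply_f(M, u):
    # GF(2) matrix-vector product; row i of M is an n-bit integer
    return sum((bin(M[i] & u).count("1") & 1) << i for i in range(n))

fs = [[rng.getrandbits(n) for _ in range(n)] for _ in range(L)]  # secret f_1..f_L

# candidate bases over the unit vectors e_i: one pure (all answers from f_1),
# one impure (the first tuple answered by f_2 instead)
pure = [(1 << i, apply_f(fs[0], 1 << i)) for i in range(n)]
impure = pure.copy()
impure[0] = (1, apply_f(fs[1], 1))

def consistency_rate(cand):
    hits = 0
    for _ in range(trials):
        I = [i for i in range(n) if rng.getrandbits(1)]
        u = w_pred = 0
        for i in I:
            u ^= cand[i][0]
            w_pred ^= cand[i][1]
        w = apply_f(fs[rng.randrange(L)], u)  # the oracle's answer for u
        hits += (w == w_pred)
    return hits / trials

print(consistency_rate(pure), consistency_rate(impure))  # roughly 1/L vs 1/(2L)
```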
However, it seems to be nontrivial to extract a pure basis from a set of m ∈ Ln + Ω(1) examples. Exhaustive search among all subsets of size n yields a running time exponential in n. This can be shown easily by applying Stirling's formula³ to the corresponding binomial coefficient C(m, n).
¹ For a positive integer N, we denote by [N] the set {1, . . . , N}.
² Let B = {v_1, . . . , v_n} denote a basis spanning the vector space V. It is a simple algebraic fact that every vector v ∈ V has a unique representation I ⊆ [n] over B, i.e., v = ⊕_{i∈I} v_i.
³ Stirling's formula is an approximation for large factorials and is commonly written n! ≈ √(2πn) · (n/e)^n.
(⊕_{i∈I} x_i^1 ⊕ w) · . . . · (⊕_{i∈I} x_i^L ⊕ w) = 0.    (1)

Σ_{J⊆I, 1≤|J|≤L′}  Σ_{j=|J|}^{L}  w^{L−j} · t_{J,j} = w^L,    (2)

where L′ = min{L, |I|} and the basis polynomials t_{J,j} are defined as

t_{J,j} = Σ_{g : |dom(g)|=j, im(g)=J} m_g

for all J ⊆ [n], 1 ≤ |J| ≤ L, and all j, |J| ≤ j ≤ L. The corresponding monomials m_g are in turn defined as

m_g = Π_{l∈dom(g)} x_{g(l)}^l

for all partial mappings g from [L] to [n], where dom(g) denotes the domain of g and im(g) denotes its image.
Let T(n, L) = {t_{J,j} | J ⊆ [n], 1 ≤ |J| ≤ L, |J| ≤ j ≤ L} denote the set of all basis polynomials t_{J,j} which may appear as part of equation (2). Moreover, we define

Φ(a, b) = Σ_{i=0}^{b} C(a, i)

and obtain

|T(n, L)| = Σ_{j=1}^{L} (L − j + 1) · C(n, j)
          = (L + 1) · (Φ(n, L) − 1) − n · Σ_{j=1}^{L} C(n−1, j−1)
          = (L + 1) · (Φ(n, L) − 1) − n · Φ(n−1, L−1).    (3)
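As a quick sanity check on formula (3) (our own verification sketch, not part of the paper), the direct count of index pairs (J, j) can be compared with the closed form:

```python
from math import comb

def phi(a, b):
    # Phi(a, b) = sum of binomial coefficients C(a, i) for i = 0..b
    return sum(comb(a, i) for i in range(b + 1))

def t_size_direct(n, L):
    # count index pairs (J, j) with J a subset of [n], 1 <= |J| <= L, |J| <= j <= L
    return sum(comb(n, s) * (L - s + 1) for s in range(1, L + 1))

def t_size_closed(n, L):
    # closed form (3): (L+1)(Phi(n,L) - 1) - n * Phi(n-1, L-1)
    return (L + 1) * (phi(n, L) - 1) - n * phi(n - 1, L - 1)

for n in range(2, 12):
    for L in range(1, n + 1):
        assert t_size_direct(n, L) == t_size_closed(n, L)
print(t_size_direct(10, 5))  # 1262
```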
Consequently, each set of examples V = {(u_1, w_1), . . . , (u_m, w_m)} yields a system of m degree-L equations in the x_i^l-variables, which can be written as m K-linear equations in the t_{J,j}-variables. In particular, the strong solution A ∈ K^{n×L} satisfies the relation

M(V) · t(A) = W(V),    (4)

where
– K^{n×L} denotes the set of all (n × L)-matrices with coefficients from K,
– M(V) is an (m × |T(n, L)|)-matrix built from the m linear equations of type (2) corresponding to the examples in V,
– W(V) ∈ K^m is defined by W(V)_i = w_i^L for all i = 1, . . . , m,
– t(A) ∈ K^{T(n,L)} is defined by t(A) = (t_{J,j}(A))_{J⊆[n], 1≤|J|≤L, |J|≤j≤L}.
Note that in section 3, we will treat the special structure of M (V) in further
detail. Independently, it is a basic fact from linear algebra that if M (V) has full
column rank, then the linear system (4) has the unique solution t (A), which we
will call the weak solution.
Our learning algorithm proceeds as follows:
(1) Grow a set of examples V until M (V) has full column rank |T (n, L)|.
(2) Compute the unique solution t (A) of system (4), i.e., the weak solution of our
learning problem, by using an appropriate algorithm which solves systems
of linear equations over K.
(3) Compute the strong solution A from t (A).
We discuss the correctness and running time of steps (1) and (2) in section 3
and an approach for step (3) in section 4.
holds for an example (u, w), then in the corresponding row of M(V) the two coefficients indexed by t_{{1}, L−1} and t_{{1}, L−|K^min|} are always equal.
If 1 ∉ I holds for an example (u, w), then the coefficient with coordinates ((u, w), t_{{1}, L−1}) in M(V) as well as the coefficient with coordinates ((u, w), t_{{1}, L−|K^min|}) in M(V) equals 0.
Consequently, if |K^min| < L holds, then the column of M(V) indexed by t_{{1}, L−1} equals the column indexed by t_{{1}, L−|K^min|} for any set V of examples, i.e., M(V) can never achieve full column rank.
Corollary 1. If K is chosen such that |K| < L, then the columns of M (V) are
linearly dependent for any set V of examples, i.e., a unique weak solution does
not exist.
While we are now aware of a lower bound for the size of K, it still remains to prove that step (1) of our learning algorithm is, in fact, correct. This will be achieved by introducing the ((2^n · |K|) × |T(n, L)|)-matrix M* = M({0,1}^n × K), which clearly corresponds to the set of all possible examples, and showing that M* has full column rank |T(n, L)| if L ≤ |K| holds.
Lemma 2.1. For all J ⊆ [n], 1 ≤ |J| ≤ L, and j, |J| ≤ j ≤ L, and examples (u, w) ∈ {0,1}^n × K, it holds that μ_{J,j}(w, u) equals the coefficient in M* which is indexed by the row (u, w) and the column t_{J,j}.
For i = 1, . . . , |K|, we denote by k_i the i-th element of the finite field K. Moreover, we adopt the convention that 0^0 = 1 in K. Let (u, w) be an example defined as above and keep in mind that we are treating the case L ≤ |K|. It should be observed that the coefficients in the corresponding equation of type (2) are given by w^{L−j}, where 1 ≤ j ≤ L. Thus, the set of possible exponents {L − j | 1 ≤ j ≤ L} is bounded from above by (L − 1) < L ≤ |K|. It follows straightforwardly from Lemma 2.1 that the (distinct) columns of M* are columns of the matrix W ⊗ B^{⊗n}, where

W = (k_i^j), i = 1, . . . , |K|, j = 0, . . . , |K| − 1,   and   B = ( 1 0 ; 1 1 ).
is the unique weak solution determined previously. But before we lay out how
(and under which conditions) a strong solution A can be found, we need to
introduce the following two definitions along with an important theorem linking
them:
Definition 2. For all vectors x ∈ K^L, let the signature sgt(x) of x be defined as sgt(x) = (|x|_k)_{k∈K}, where |x|_k denotes the number of components of x which equal k.
Furthermore, consider the following new family of polynomials:
Definition 3. a) For all L ≥ 1 and j ≥ 0, let the simple symmetric polynomial s_j over the variables x_1, . . . , x_L be defined by s_0 = 1 and

s_j = Σ_{S⊆[L], |S|=j} m_S,

where m_S = Π_{i∈S} x_i for all S ⊆ [L]. Moreover, we denote s(x) = (s_1(x), . . . , s_L(x)) for all x ∈ K^L.
b) Let n, L, 1 ≤ L ≤ n, as well as j, 0 ≤ j ≤ L, and J ⊆ [n] be given. The symmetric polynomial s_{J,j} : K^{n×L} → K is defined by

s_{J,j}(A) = s_j(⊕_{i∈J} A_{i,·}).
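To make Definitions 2 and 3 concrete, the sketch below (our own illustration; the modulus x^3 + x + 1 for K = GF(2^3) is an arbitrary choice) computes s(x) and sgt(x) and checks for L = 2 that the relation between them is one-to-one, the property the lookup database DB1 relies on later:

```python
from itertools import product

A_BITS, MOD = 3, 0b1011   # K = GF(2^3) with modulus x^3 + x + 1 (our choice)

def gmul(a, b):
    # carry-less multiplication in K, reduced modulo x^3 + x + 1
    r = 0
    for _ in range(A_BITS):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << A_BITS):
            a ^= MOD
    return r

def s(xs):
    # elementary symmetric polynomials s_1..s_L of xs (addition in K is XOR)
    L = len(xs)
    coeff = [1] + [0] * L
    for x in xs:
        for j in range(L, 0, -1):
            coeff[j] ^= gmul(x, coeff[j - 1])
    return tuple(coeff[1:])

def sgt(xs):
    # signature as the sorted multiset, equivalent to the count vector (|x|_k)_k
    return tuple(sorted(xs))

# for L = 2 every value of s(x) corresponds to exactly one signature sgt(x)
by_s = {}
for xs in product(range(8), repeat=2):
    by_s.setdefault(s(xs), set()).add(sgt(xs))
assert all(len(v) == 1 for v in by_s.values())
print(len(by_s))  # 36 distinct signatures (multisets of size 2 over 8 elements)
```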
Building on this result, we can then prove the following proposition, which is
of vital importance for computing the strong solution A on the basis of the
corresponding weak solution t (A):
Theorem 4. Let A ∈ K^{n×L} and t(A) be defined as before. For each subset I ⊆ [n] of rows of A, the signature of the sum of these rows, i.e., sgt(⊕_{i∈I} A_{i,·}), can be computed by solely using information derived from t(A), in particular, without knowing the underlying matrix A itself.
Proof: We first observe that the s-polynomials can be written as linear combinations of the t-polynomials. Trivially, the relation t_{{i},j} = s_{{i},j} holds for all i ∈ [n] and j, 1 ≤ j ≤ L. Moreover, for all I ⊆ [n], |I| > 1, it holds that

s_{I,j} = Σ_{Q⊆I, 1≤|Q|≤j} ( Σ_{g:[L]→[n], |dom(g)|=j, im(g)=Q} m_g ) = Σ_{Q⊆I, 1≤|Q|≤j} t_{Q,j}.    (5)

Note that for all J ⊆ [n] and j, |J| ≤ j ≤ L, relation (5) implies

t_{J,j} = s_{J,j} ⊕ Σ_{Q⊂J} t_{Q,j}.    (6)
Having seen that, under certain conditions, it is possible to fully reconstruct the strong solution A by solely resorting to information obtained from the weak solution t(A), we can proceed to describe a conceivable approach for step (3) of the learning algorithm:
We choose a constant error parameter d and an exponent a, i.e., K = GF(2^a), in such a way that Theorem 5 can be applied. Note that L ≤ n and |K| ∈ n^{O(1)}. In a pre-computation, we generate two databases DB1 and DB2 of size n^{O(L)}. While DB1 acts as a lookup table with regard to the one-to-one relation between s(x) and sgt(x) for all x ∈ K^L, we use DB2 to store all triples of signatures (S, S′, S̃) for which there is exactly one solution pair x, y ∈ K^L fulfilling sgt(x) = S and sgt(y) = S′ as well as sgt(x ⊕ y) = S̃.
Given t(A), i.e., the previously determined weak solution, we then compute sgt(A_{i,·}) for all i, 1 ≤ i ≤ n, and sgt(A_{i,·} ⊕ A_{j,·}) for all i, j, 1 ≤ i < j ≤ n, in time n^{O(1)} by using DB1 and relation (5), which can be found in the proof of Theorem 4. According to Theorem 5, it is now possible to reconstruct A with the help of database DB2 with probability at least 1 − 1/d.
5 Experimental Results
To showcase the detailed workings of our learning algorithm as well as to evaluate its efficiency at a practical level, we created a complete implementation using the computer algebra system Magma. In case of success, it takes approximately 90 seconds on standard PC hardware (Intel i7, 2.66 GHz, with 6 GB RAM) to compute the unique strong solution on the basis of a set of 10,000 randomly generated examples for n = 10, a = 3 (i.e., K = GF(2^a)), and L = 5. In this context, we performed various simulations in order to assess the corresponding probabilities, which were already discussed in sections 3 and 4 from a theoretical point of view.
The experimental results summarized in Table 1 clearly suggest that if |K| is only slightly larger than the number L of secret linear functions, then in all likelihood M(V) will eventually reach full (column) rank |T(n, L)|, thus allowing for the computation of a unique weak solution. Moreover, in accordance with
Table 1. An estimate of the rank of M(V) on the basis of all possible valid observations for up to 10,000 randomly generated instances of RandomSelect(L, n, a). For each choice of parameters, |T(n, L)| denotes the number of columns of M(V) as defined in section 2 and listed in Table 2.
Table 2. An estimate of the number of randomly generated examples (u, f_l(u)) which have to be processed (in case of success) until the matrix M(V) reaches full column rank |T(n, L)|. Given a probability p, we denote by Q_p the p-quantile of the respective sample.
Parameters        Performed iterations, i.e., randomly chosen A ∈_U K^{n×L}
                  A not sgt(2)-identifiable   A was sgt(2)-identifiable   Total
n   K        L    Number    Ratio             Number    Ratio             Number
4   GF(2^2)  2    0         0.00 %            10,000    100.00 %          10,000
4   GF(2^2)  3    69        0.69 %            9,931     99.31 %           10,000
4   GF(2^2)  4    343       3.43 %            9,657     96.57 %           10,000
6   GF(2^3)  4    0         0.00 %            10,000    100.00 %          10,000
8   GF(2^3)  4    0         0.00 %            10,000    100.00 %          10,000
8   GF(2^3)  6    0         0.00 %            1,000     100.00 %          1,000
8   GF(2^3)  7    0         0.00 %            1,000     100.00 %          1,000
8   GF(2^3)  8    0         0.00 %            100       100.00 %          100
9   GF(2^3)  8    0         0.00 %            100       100.00 %          100
6 Discussion
The running time of our learning algorithm for RandomSelect (L, n, a) is dom-
inated by the complexity of solving a system of linear equations with |T (n, L)|
unknowns. Our hardness conjecture is that this complexity also constitutes a
lower bound to the complexity of RandomSelect (L, n, a) itself, which would im-
ply acceptable cryptographic security for parameter choices like n = 128 and
L = 8 or n = 256 and L = 6. The experimental results summarized in the previ-
ous section clearly support this view. Consequently, employing the principle of
random selection to design new symmetric lightweight authentication protocols
might result in feasible alternatives to current HB-based cryptographic schemes.
A problem of independent interest is to determine the complexity of reconstructing
an sgt(r)-identifiable matrix A from the signatures of all sums of at
most r rows of A. Note that this problem is closely tied to determining the complexity
of RandomSelect(L, n, a) with respect to an active learner, who is able to
receive examples (u, w) for inputs u of his choice, where w = fl(u) and l ∈U [L] is
randomly chosen by the oracle. It is easy to see that such learners can efficiently
compute sgt(f1(u), ..., fL(u)) by repeatedly asking for u. As the approach for
reconstructing A which was outlined in section 4 needs a data structure of size
exponential in L, it would be interesting to know if there are corresponding
algorithms of time and space costs polynomial in L.
From a theoretical point of view, another open problem is to determine the
probability that a random (n × L)-matrix over K is sgt (r)-identifiable for some
r, 2 ≤ r ≤ L. Based on the results of our computer experiments, it appears more
than likely that the lower bound derived in Theorem 5 is far from being in line
with reality and that identifiable matrices occur with much higher probability
for fields K of significantly smaller size.
References
1. Armknecht, F., Krause, M.: Algebraic Attacks on Combiners with Memory. In:
Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 162–175. Springer, Heidelberg
(2003)
2. Arora, S., Ge, R.: New algorithms for learning in presence of errors (submitted,
2010), http://www.cs.princeton.edu/~rongge/LPSN.pdf
3. Blass, E.-O., Kurmus, A., Molva, R., Noubir, G., Shikfa, A.: The Ff -family of pro-
tocols for RFID-privacy and authentication. In: 5th Workshop on RFID Security,
RFIDSec 2009 (2009)
4. Bogdanov, A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw,
M.J.B., Seurin, Y., Vikkelsoe, C.: PRESENT: An Ultra-Lightweight Block Cipher.
In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466.
Springer, Heidelberg (2007)
5. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24(3-4), 235–265 (1997)
6. Bringer, J., Chabanne, H.: Trusted-HB: A low cost version of HB+ secure against
a man-in-the-middle attack. IEEE Trans. Inform. Theor. 54, 4339–4342 (2008)
7. Cichoń, J., Klonowski, M., Kutylowski, M.: Privacy Protection for RFID with Hid-
den Subset Identifiers. In: Indulska, J., Patterson, D.J., Rodden, T., Ott, M. (eds.)
PERVASIVE 2008. LNCS, vol. 5013, pp. 298–314. Springer, Heidelberg (2008)
8. Courtois, N.: Fast Algebraic Attacks on Stream Ciphers with Linear Feedback. In:
Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 176–194. Springer, Heidelberg
(2003)
148 M. Krause and M. Hamann
9. Courtois, N., Meier, W.: Algebraic Attacks on Stream Ciphers with Linear Feed-
back. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 345–359.
Springer, Heidelberg (2003)
10. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — A
Family of Small and Efficient Hardware-Oriented Block Ciphers. In: Clavier, C.,
Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg
(2009)
11. Golebiewski, Z., Majcher, K., Zagórski, F.: Attacks on CKK Family of RFID Au-
thentication Protocols. In: Coudert, D., Simplot-Ryl, D., Stojmenovic, I. (eds.)
ADHOC-NOW 2008. LNCS, vol. 5198, pp. 241–250. Springer, Heidelberg (2008)
12. Frumkin, D., Shamir, A.: Untrusted-HB: Security vulnerabilities of Trusted-HB.
Cryptology ePrint Archive, Report 2009/044 (2009), http://eprint.iacr.org
13. Gilbert, H., Robshaw, M.J.B., Seurin, Y.: HB# : Increasing the security and effi-
ciency of HB+ . In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp.
361–378. Springer, Heidelberg (2008)
14. Gilbert, H., Robshaw, M.J.B., Sibert, H.: Active attack against HB+ : A provable
secure lightweight authentication protocol. Electronic Letters 41, 1169–1170 (2005)
15. Goldreich, O., Levin, L.A.: A hard-core predicate for all one-way functions. In: Pro-
ceedings of the Twenty-First Annual ACM Symposium on Theory of Computing
(STOC), pp. 25–32. ACM Press (1989)
16. Hopper, N.J., Blum, M.: Secure Human Identification Protocols. In: Boyd, C. (ed.)
ASIACRYPT 2001. LNCS, vol. 2248, pp. 52–66. Springer, Heidelberg (2001)
17. Juels, A., Weis, S.A.: Authenticating Pervasive Devices with Human Protocols. In:
Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 293–308. Springer, Heidelberg
(2005)
18. Krause, M., Stegemann, D.: More on the Security of Linear RFID Authentication
Protocols. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009.
LNCS, vol. 5867, pp. 182–196. Springer, Heidelberg (2009)
19. Krause, M., Hamann, M.: The cryptographic power of random selection. Cryptol-
ogy ePrint Archive, Report 2011/511 (2011), http://eprint.iacr.org/
20. Meier, W., Pasalic, E., Carlet, C.: Algebraic Attacks and Decomposition of Boolean
Functions. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS,
vol. 3027, pp. 474–491. Springer, Heidelberg (2004)
21. Ouafi, K., Overbeck, R., Vaudenay, S.: On the Security of HB# against a Man-in-
the-Middle Attack. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp.
108–124. Springer, Heidelberg (2008)
22. Regev, O.: On lattices, learning with errors, random linear codes, and cryptogra-
phy. In: Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of
Computing (STOC), pp. 84–93. ACM Press (2005)
23. Courtois, N.T., Klimov, A.B., Patarin, J., Shamir, A.: Efficient Algorithms for
Solving Overdefined Systems of Multivariate Polynomial Equations. In: Preneel,
B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 392–407. Springer, Heidelberg
(2000)
Please refer to [19] for the full version of this paper including the proofs of
Theorems 3 and 5.
The following outline of an attack on the (n, k, L)+ -protocol by Krause and
Stegemann [18] is meant to exemplify the immediate connection between the
previously introduced learning problem RandomSelect (L, n, a) and the security
of this whole new class of lightweight authentication protocols. Similar to the
basic communication mode described in the introduction, the (n, k, L)+ -protocol
is based on L n-dimensional, injective linear functions F1, ..., FL : GF(2)^n → GF(2)^{n+k}
(i.e., the secret key) and works as follows.
Each instance is initiated by the verifier Alice, who chooses a random vector
a ∈U GF(2)^{n/2} and sends it to Bob, who then randomly (i.e., independently and
uniformly) chooses l ∈U [L] along with an additional value b ∈U GF(2)^{n/2}, in
order to compute his response w = Fl(a, b). Finally, Alice accepts w ∈ GF(2)^{n+k}
if there is some l ∈ [L] with w ∈ Vl and the prefix of length n/2 of Fl^{-1}(w)
equals a, where Vl denotes the n-dimensional linear subspace of GF(2)^{n+k}
corresponding to the image of Fl.
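A round of this protocol can be sketched in a few lines. The Python model below is illustrative only and uses our own simplifications (GF(2) vectors encoded as integers, and Alice's acceptance check done by brute force over the 2^{n/2} candidate values of b, which is feasible only for toy parameters); it is not the reference construction of [18].

```python
import random

def rank_gf2(rows):
    # Rank of a list of GF(2) row vectors, each encoded as an integer.
    rows, r = list(rows), 0
    for bit in reversed(range(max(x.bit_length() for x in rows))):
        piv = next((i for i in range(r, len(rows)) if rows[i] >> bit & 1), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i] >> bit & 1:
                rows[i] ^= rows[r]
        r += 1
    return r

def sample_injective(n, k, rng):
    # An injective linear map GF(2)^n -> GF(2)^(n+k): n basis images of full rank.
    while True:
        F = [rng.getrandbits(n + k) for _ in range(n)]
        if rank_gf2(F) == n:
            return F

def apply_map(F, u):
    # Image of the n-bit input u under the linear map given by basis images F.
    w = 0
    for i, col in enumerate(F):
        if u >> i & 1:
            w ^= col
    return w

def protocol_round(key, n, rng):
    half = n // 2
    a = rng.randrange(1 << half)              # Alice's random challenge
    l = rng.randrange(len(key))               # Bob's hidden choice of F_l
    b = rng.randrange(1 << half)              # Bob's random padding
    w = apply_map(key[l], a | (b << half))    # Bob's response w = F_l(a, b)
    # Alice accepts iff some F_l has a preimage of w whose n/2-bit prefix is a.
    return any(apply_map(F, a | (b2 << half)) == w
               for F in key for b2 in range(1 << half))

rng = random.Random(7)
n, k, L = 8, 4, 3
key = [sample_injective(n, k, rng) for _ in range(L)]
ok = all(protocol_round(key, n, rng) for _ in range(25))   # honest runs accept
```

Completeness is immediate: an honest response always passes Alice's check, since the true pair (l, b) is among the candidates she enumerates.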
This leads straightforwardly to a problem called Learning Unions of L Linear
Subspaces (LULS), where an oracle holds the specifications of L secret n-
dimensional linear subspaces V1, ..., VL of GF(2)^{n+k}, from which it randomly
chooses examples v ∈U Vl for l ∈U [L] and sends them to the learner. Knowing
only n and k, he seeks to deduce the specifications of V1, ..., VL from a
sufficiently large set {w1, ..., ws} ⊆ V1 ∪ ... ∪ VL of such observations. It is easy to see
that this corresponds to a passive key recovery attack against (n, k, L)-type pro-
tocols. Note that there is a number of exhaustive search strategies to solve this
problem, e.g., the generic exponential time algorithm called search-for-a-basis
heuristic, which was presented in the appendix of [18].
It should be noted that an attacker who is able to solve the LULS problem
needs to perform additional steps to fully break the (n, k, L)+ -protocol, as
impersonating the prover requires sending responses w ∈ GF(2)^{n+k} which not only
fulfill w ∈ V1 ∪ ... ∪ VL but also depend on some random nonce a ∈ GF(2)^{n/2}
provided by the verifier. However, having successfully obtained the specifications of
the secret subspaces V1 , . . . , VL allows in turn for generating a specification of
the image of Fl (a, ·) for each l ∈ [L] by repeatedly sending an arbitrary but fixed
(i.e., selected by the attacker) a ∈ GF (2)n/2 to the prover. Remember that, al-
though the prover chooses a random l ∈U [L] each time he computes a response
w based on some fixed a, an attacker who has determined V1 , . . . , VL will know
which subspace the vector w actually belongs to. Krause and Stegemann pointed
out that this strategy allows for efficiently constructing specifications of linear
for all l ∈ [L] and a, b ∈ GF(2)^{n/2} [18]. Hence, the efficiently obtained
specifications of the functions ((G1, ..., GL), (g1, ..., gL)) are equivalent to the actual
secret key (F1 , . . . , FL ). However, keep in mind that the running time of this
attack is dominated by the effort needed to solve the LULS problem first and
that RandomSelect (L, n, a) in fact refers to a special case of the LULS problem,
which assumes that the secret subspaces have the form
Proof of Empirical RC4 Biases and New Key Correlations

Sourav Sen Gupta1, Subhamoy Maitra1, Goutam Paul2, and Santanu Sarkar1

1 Applied Statistics Unit, Indian Statistical Institute, Kolkata 700 108, India
2 Dept. of Computer Science and Engg., Jadavpur University, Kolkata 700 032, India
{sg.sourav,sarkar.santanu.bir}@gmail.com, [email protected], [email protected]
1 Introduction
RC4 is one of the most popular stream ciphers for software applications. Designed
by Ron Rivest in 1987, the RC4 algorithm has two parts: the Key Scheduling
Algorithm (KSA) and the Pseudo-Random Generation Algorithm (PRGA), presented in Table 1.
Given a secret key k of length l bytes, an array K of size N bytes is created
to hold the key such that K[y] = k[y mod l] for all y ∈ [0, N − 1]. Generally, N is
chosen to be 256. The first part of the cipher, KSA, uses this K to scramble an
initial identity permutation {0, 1, . . . , N − 1} to obtain a ‘secret’ state S. Then
the PRGA generates keystream bytes from this initial state S, which are used for
encrypting the plaintext. Two indices i (deterministic) and j (pseudo-random)
are used in KSA as well as PRGA to point to the locations of S. All additions
in the RC4 algorithm are performed modulo N .
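The description above transcribes directly into code; the following Python sketch of the standard KSA and PRGA is given for reference (all index arithmetic modulo N).

```python
def ksa(key, N=256):
    # Key Scheduling: scrambles the identity permutation using K[y] = key[y mod l].
    S = list(range(N))
    j = 0
    for i in range(N):
        j = (j + S[i] + key[i % len(key)]) % N
        S[i], S[j] = S[j], S[i]
    return S

def prga(S, nbytes, N=256):
    # Pseudo-Random Generation: emits z_r = S_r[t_r], where t_r = S_r[i_r] + S_r[j_r].
    S = list(S)
    i = j = 0
    out = []
    for _ in range(nbytes):
        i = (i + 1) % N
        j = (j + S[i]) % N
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % N])
    return out
```

As a sanity check, the key "Key" yields a keystream beginning EB 9F 77 81, a commonly quoted RC4 test vector.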
After r (≥ 1) rounds of RC4 PRGA, we denote the variables by Sr, ir, jr, zr
and the output index Sr[ir] + Sr[jr] by tr. After r rounds of KSA, we denote the
same by adding a superscript K to each variable. By S0^K and S0, we denote the
initial permutations before KSA and PRGA respectively. Note that S0^K is the
identity permutation and S0 = S_N^K.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 151–168, 2012.
© Springer-Verlag Berlin Heidelberg 2012
152 S. Sen Gupta et al.
Existing Results. In SAC 2010, Sepehrdad, Vaudenay and Vuagnoux [12] have
reported experimental results of an exhaustive search for biases in all possible
linear combinations of the state variables and the keystream bytes of RC4. In
the process, they have discovered many new biases that are significantly high
compared to random association. Some of these biases were further shown to be
useful for key recovery in WEP [3] mode. In a recent work [13] at Eurocrypt 2011,
the same authors have utilized the pool of all existing biases of RC4, including
a few reported in [12], to mount a distinguishing attack on WPA [4].
In the above approach, RC4 is treated as a black box, where the secret key
bytes are the inputs, the permutation and the index j are internal state vari-
ables and the keystream bytes are the outputs. The goal of [12] was to find out
empirical correlations between the inputs, internal state and the outputs and no
attempt was made to theoretically prove these biases. Finding empirical biases
without any justification or proof may be useful from an application point of view.
However, cryptanalysis is a disciplined branch of science and a natural quest in
RC4 cryptanalysis should be: Where do all these biases come from?
Motivation. We see three primary reasons for a theoretical investigation
into the source and nature of these biases.
– We attempt to build a framework to analyze the biases and their origin.
– In the process of proving the existing biases, one may need to consider some
additional events and thus may end up discovering new biases, leading to
further insight into the cipher. We have observed some interesting events
with strong biases, which have not yet been reported in the literature.
– When there is a conditional bias in the event ‘A given B’, there may be
three reasons behind it: either some subset of A directly causes B, or some
subset of B directly causes A, or another set C of different events causes
both A and B. Just from empirical observation, it is impossible to infer
which of these is the actual reason behind the bias. Only a theoretical study can shed
light upon the interplay between the events. Our observations and analysis
suggest that some conditional biases reported in [12] are possibly of the third
kind discussed above and this provides us with some interesting new results
depending on the length of the RC4 secret key.
Proof of Empirical RC4 Biases and New Key Correlations 153
In this target list, general biases refer to the ones occurring in all initial rounds
of PRGA (1 ≤ r ≤ N − 1), whereas the specific ones have been reported only
for rounds 1 and 2 of PRGA. We do not consider the biases reported for rounds
r ≡ 0 (mod 16) in this section, as they are of order 1/N^2 or less.
For the proofs and numeric probability calculations in this paper, we re-
quire [6, Theorem 6.3.1], restated as Proposition 1 below.
Proposition 1. At the end of RC4 KSA, for 0 ≤ u ≤ N − 1, 0 ≤ v ≤ N − 1,

Pr(S0[u] = v) =
  (1/N) · [ ((N−1)/N)^v + (1 − ((N−1)/N)^v) · ((N−1)/N)^{N−u−1} ]   if v ≤ u;
  (1/N) · [ ((N−1)/N)^{N−u−1} + ((N−1)/N)^v ]                       if v > u.
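Proposition 1 is easy to evaluate numerically. The sketch below implements the two branches and checks that, for each fixed u, the approximate distribution over v sums to nearly 1 (it is an approximation, so the sum is only close to 1, not exactly 1).

```python
def p_s0(u, v, N=256):
    # Approximate Pr(S_0[u] = v) at the start of PRGA (Proposition 1).
    q = (N - 1) / N
    if v <= u:
        return (1 / N) * (q ** v + (1 - q ** v) * q ** (N - u - 1))
    return (1 / N) * (q ** (N - u - 1) + q ** v)

# Near-normalization over v for a few choices of u.
totals = {u: sum(p_s0(u, v) for v in range(256)) for u in (0, 64, 128, 255)}
```

Note that for u = N − 1 both branches collapse to exactly 1/N, i.e., the last position is uniform.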
Proof. Note that j1 = S0[1] and S1[i1] = S0[j1]. So, in the case j1 = S0[1] = 1,
we will have j1 + S0[j1] = S0[1] + S0[1] = 2 with probability 1. Otherwise, the
probability turns out to be Pr(j1 + S0[j1] = 2 & j1 = S0[1] ≠ 1) = Σ_{X≠1} Pr(X +
S0[X] = 2 & S0[1] = X). Thus, the probability Pr(j1 + S1[i1] = 2) can be written
as Pr(j1 + S1[i1] = 2) = Pr(S0[1] = 1) + Σ_{X≠1} Pr(S0[X] = 2 − X) · Pr(S0[1] = X),
as desired. Hence the claimed result.
Numerical Values. If we consider the practical RC4 scheme, the probabilities
involving S0 in the expression for Pr(j1 + S1 [i1 ] = 2) should be evaluated using
Proposition 1, giving a total probability of approximately 1.937/N for N = 256.
This closely matches the observed value 1.94/N . If we assume that RC4 PRGA
starts with a truly pseudorandom initial state S0 , the probability turns out to
be approximately 2/N − 1/N 2 ≈ 1.996/N for N = 256, i.e., almost twice that
of a random occurrence.
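A quick Monte Carlo check of this bias is straightforward. The sketch below is ours (the 20,000-key sample size, the fixed seed, and the wide acceptance window are our own choices); it runs the KSA on random 16-byte keys, performs the first PRGA swap by hand, and counts the event j1 + S1[i1] = 2.

```python
import random

def ksa(key, N=256):
    # Standard RC4 Key Scheduling, as described earlier.
    S = list(range(N))
    j = 0
    for i in range(N):
        j = (j + S[i] + key[i % len(key)]) % N
        S[i], S[j] = S[j], S[i]
    return S

rng = random.Random(2011)
N, trials, hits = 256, 20000, 0
for _ in range(trials):
    S = ksa([rng.randrange(N) for _ in range(16)])  # S_0 of PRGA
    j1 = S[1]                                       # j_1 = S_0[1] (since i_1 = 1)
    S[1], S[j1] = S[j1], S[1]                       # first PRGA swap
    if (j1 + S[1]) % N == 2:                        # event j_1 + S_1[i_1] = 2
        hits += 1
estimate = hits / trials * N                        # theory predicts around 1.94
```

With this sample size the estimate fluctuates by a few hundredths around the theoretical value of roughly 1.94/N.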
Theorem 2. After the second round (r = 2) of RC4 PRGA, the following prob-
ability relations hold between the index j2 and the state variables S2 [i2 ], S2 [j2 ].
Proof. In Equation (1), we have j2 + S2[j2] = (j1 + S1[2]) + S1[i2] = S0[1] + 2 · S1[2].
In this expression, note that if S0[1] = 2, then one must have the positions 1
and 2 swapped in the first round of PRGA, and thus S1[2] = S0[1] = 2 as well.
This provides one path for j2 + S2[j2] = S0[1] + 2 · S1[2] = 2 + 2 × 2 = 6, with
probability Pr(S0[1] = 2) · 1 ≈ 1/N. If on the other hand S0[1] = X ≠ 2, we have
Pr(j2 + S2[j2] = 6 & S0[1] ≠ 2) = Σ_{X≠2} Pr(X + 2 · S1[2] = 6 & S0[1] = X).
Note that the value of X is bound to be even, and for each such value of X, the
variable S1[2] can take 2 different values to satisfy the equation 2 · S1[2] = 6 − X.
Thus, we have Σ_{X≠2} Pr(2 · S1[2] = 6 − X & S0[1] = X) ≈ Σ_{X even, X≠2} (2/N) ·
Pr(S0[1] = X). Combining the two disjoint cases S0[1] = 2 and S0[1] ≠ 2, we get
Equation (1).
In case of Equation (2), we have a slightly different condition S0[1] + 2 · S1[2] =
S2[i2] = S1[j2] = S1[S0[1] + S1[2]]. In this expression, if we have S1[2] = 0, then
the left hand side reduces to S0[1] and the right hand side becomes S1[S0[1] +
S1[2]] = S1[S0[1]] = S1[j1] = S0[i1] = S0[1] as well. This provides a probability
1/N path for the condition to be true. In all other cases with S1[2] ≠ 0, we can
approximate the probability for the condition as 1/N, and hence approximate the
total probability Pr(j2 + S2[j2] = S2[i2]) as Pr(j2 + S2[j2] = S2[i2] & S1[2] =
0) + Pr(j2 + S2[j2] = S2[i2] & S1[2] ≠ 0) ≈ 1/N + (1 − 1/N) · (1/N) = 2/N − 1/N^2.
Finally, for Equation (3), the main observation is that this is almost identical
to the condition of Equation (2) apart from the inclusion of z2. But our
first path of S1[2] = 0 in the previous case also provides us with z2 = 0 with
probability 1 (this path was first observed by Mantin and Shamir [7]). Thus,
we have Pr(j2 + S2[j2] = S2[i2] + z2 & S1[2] = 0) ≈ (1/N) · 1. In all other cases
with S1[2] ≠ 0, we assume the conditions to match uniformly at random, and
therefore have Pr(j2 + S2[j2] = S2[i2] + z2) ≈ (1/N) · 1 + (1 − 1/N) · (1/N) = 2/N − 1/N^2.
Hence the desired results of Equations (1), (2) and (3).
Proof. For both the events mentioned above, we shall take the path ir = jr.
Notice that ir = jr occurs with probability 1/N, and in that case both the events
mentioned above hold with probability 1. In the case where ir ≠ jr, we rewrite
the events as Sr[jr] = (ir − jr) + Sr[ir] and Sr[jr] = (jr − ir) + Sr[ir]. Here
we already know that Sr[jr] ≠ Sr[ir], as jr ≠ ir and Sr is a permutation. Thus
in case ir ≠ jr, the values of Sr[ir] and Sr[jr] can be chosen in N(N − 1)
ways (drawing from a permutation without replacement) to satisfy the relations
stated above. This gives the total probability for each event approximately as
Pr(jr = ir) · 1 + Σ_{jr≠ir} 1/(N(N − 1)) = 1/N + (N − 1) · 1/(N(N − 1)) = 2/N.
Hence the claimed result for Equations (4) and (5).
The probabilities for New noz 001 and New noz 002 proved in Theorem 3 do
not vary with change in r (i.e., they continue to persist at the same order of
2/N at any arbitrary round of PRGA), and our theoretical results match the
probabilities reported in [12, Fig. 2].
Lemma 1. After the first round of RC4 PRGA, the probability Pr(S1[t] = r) is

Pr(S1[t] = r) =
  Σ_{X=0}^{N−1} Pr(S0[1] = X) · Pr(S0[X] = r),             t = 1;
  Pr(S0[1] = r) + (1 − Pr(S0[1] = r)) · Pr(S0[r] = r),     t = r;
  (1 − Pr(S0[1] = t)) · Pr(S0[t] = r),                     t ≠ 1, r.
Proof. After the first round of RC4 PRGA, we obtain the state S1 from the
initial state S0 through a single swap operation between the positions i1 = 1
and j1 = S0 [i1 ] = S0 [1]. Thus, all other positions of S0 remain the same apart
from these two. This gives us the value of S1 [t] as follows: S1 [t] = S0 [S0 [1]] if
t = 1, S1 [t] = S0 [1] if t = S0 [1], and S1 [t] = S0 [t] in all other cases. Now, we can
compute the probabilities Pr(S1 [t] = r) based on the probabilities for S0 , which
are in turn derived from Proposition 1. We have three cases:
– Case t = 1. In this case, using the recurrence relation S1[1] = S0[S0[1]], we
can write Pr(S1[1] = r) = Σ_{X=0}^{N−1} Pr(S0[1] = X) · Pr(S0[X] = r).
– Case t = r. In this situation, if S0[1] = r, we will surely have S1[r] = r
as these are the positions swapped in the first round, and if S0[1] ≠ r, the
position t = r remains untouched and S1[r] = r is only possible if S0[r] = r.
Thus, Pr(S1[r] = r) = Pr(S0[1] = r) + (1 − Pr(S0[1] = r)) · Pr(S0[r] = r).
– Case t ≠ 1, r. In all other cases where t ≠ 1, r, the position t can either be
equal to j1 = S0[1], which happens with probability Pr(S0[1] = t), or not. If t = S0[1],
the value S0[t] will get swapped with S0[1] = t itself, i.e., we will get S1[t] = t ≠ r for sure.
Otherwise, the value S1[t] remains the same as S0[t]. Hence, Pr(S1[t] = r) =
(1 − Pr(S0[1] = t)) · Pr(S0[t] = r).
Combining all the above cases together, we obtain the desired result.
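Lemma 1 composes directly with Proposition 1. A small sketch (re-stating the Proposition 1 approximation `p_s0` so the block is self-contained) that also checks near-normalization over r and the positive bias of the event (S1[r] = r) used in the sequel:

```python
def p_s0(u, v, N=256):
    # Approximate Pr(S_0[u] = v) from Proposition 1.
    q = (N - 1) / N
    if v <= u:
        return (1 / N) * (q ** v + (1 - q ** v) * q ** (N - u - 1))
    return (1 / N) * (q ** (N - u - 1) + q ** v)

def p_s1(t, r, N=256):
    # Approximate Pr(S_1[t] = r) from Lemma 1, case by case.
    if t == 1:
        return sum(p_s0(1, X, N) * p_s0(X, r, N) for X in range(N))
    if t == r:
        return p_s0(1, r, N) + (1 - p_s0(1, r, N)) * p_s0(r, r, N)
    return (1 - p_s0(1, t, N)) * p_s0(t, r, N)
```

For every r, the diagonal value p_s1(r, r) comes out noticeably above the uniform 1/N, which is the bias exploited in Theorem 4.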
Proof of Theorem 4. Let us start from the PRGA state S1 , that is, the state
that has been updated once in the PRGA (we refer to the state after KSA by
S0). We know that the event (S1[r] = r) is positively biased for all r, and
hence the natural path for investigation is the effect of the event (S1[r] = r) on
(Sr−1[r] = r), i.e., on (Sr[jr] = ir). Notice that there can be two cases, as follows.
Case I. In the first case, suppose that S1[r] = r after the first round, and the
r-th index is not disturbed for the next r − 2 state updates. Notice that index
i varies from 2 to r − 1 during this period, and hence never touches the r-th
index. Thus, the index r will retain its state value r if index j does not touch
it. The probability of this event is (1 − 1/N)^{r−2} over all the intermediate rounds.
Hence the first part of the probability is Pr(S1[r] = r) · (1 − 1/N)^{r−2}.
Case II. In the second case, suppose that S1[r] ≠ r and S1[t] = r for some
t ≠ r. In such a case, only a swap between the positions r and t during rounds
2 to r − 1 of PRGA can make the event (Sr−1[r] = r) possible. Notice that if
t does not fall in the path of i, that is, if the index i does not touch the t-th
location, then the value at S1[t] can only go to some position behind i, and this
can never reach Sr−1[r], as i can only go up to (r − 1) during this period. Thus
we must have 2 ≤ t ≤ r − 1 for S1[t] to reach Sr−1[r]. Note that the way S1[t]
can move to the r-th position may be either a one-hop or a multi-hop route.
– In the easiest case of a single hop, we require j not to touch t until i touches t,
and j = r when i = t, and j not to touch r for the next r − t − 1 state updates.
The total probability comes to be Pr(S1[t] = r) · (1 − 1/N)^{t−2} · (1/N) · (1 − 1/N)^{r−t−1} =
Pr(S1[t] = r) · (1/N) · (1 − 1/N)^{r−3}.
– Suppose that it requires (k + 1) hops to reach from S1[t] to Sr−1[r]. Then
the main issue to note is that the transfer will never happen if the position t
swaps with any index which does not lie in the future path of i. Again, this
path of i starts from (r − t − 1)/N for the first hop and decreases approximately to
(r − t − 1)/(lN) at the l-th hop. We would also require j not to touch the position r
for the remaining (r − 3 − k) number of rounds. Combining all, we get the
second part of the probability as
Pr(S1[t] = r) · (1/N) · [Π_{l=1}^{k} (r − t − 1)/(lN)] · (1 − 1/N)^{r−3−k} =
(Pr(S1[t] = r)/(k! · N)) · ((r − t − 1)/N)^k · (1 − 1/N)^{r−3−k}.

Finally, note that the number of hops (k + 1) is bounded from below by 1 and from
above by (r − t + 1), depending on the initial gap between the t-th and r-th positions.
Summing over t and k under this constraint, we get the desired
expression for Pr(Sr−1[r] = r).
Fig. 1 illustrates the experimental observations (averages taken over 100 mil-
lion runs with 16-byte key) and the theoretical values for the distribution of
Pr(Sr [jr ] = ir ) over the initial rounds 3 ≤ r ≤ 255 of RC4 PRGA. It is evident
that our theoretical formula matches the experimental observations in this case.
Fig. 1. Distribution of Pr(Sr[jr] = ir) for initial rounds 3 ≤ r ≤ 255 of RC4 PRGA
Now let us take a look at the other two round-dependent biases of RC4,
observed in [12]. We can state the related result in Theorem 5 (corresponding
to observations New noz 004 and New 000).
The proof of this result is omitted for brevity, as it follows the same logic as in
the proof of Theorem 4. A brief proof sketch is presented as follows. For this
proof sketch, we consider the variables jr and tr to be pseudorandom variables
that can take any value between 0 and 255 with probability 1/N. The reader may
note that this is a crude approximation, especially for small values of r, and
causes minor mismatch with the experimental observations in the final result.
Proof-sketch for Pr(Sr [ir ] = jr ). For this probability computation, we first
rewrite the event as (Sr−1 [jr ] = jr ) to make it look similar to Sr−1 [r] = r,
as in Theorem 4. The only difference is that we were concentrating on a fixed
index r in Theorem 4 instead of a variable index jr . This produces two cases.
Case I. First, suppose that jr assumes a value X ≥ r. In this case, the probability
calculation can be split in two paths, one in which S1[X] = X is assumed,
and the other in which S1[X] ≠ X. If we assume S1[X] = X, the probability
of (Sr−1[X] = X) becomes Pr(S1[X] = X) · (1 − 1/N)^{r−2}, similar to the logic in
Theorem 4. If we suppose that S1[t] = X was the initial state, then one may
notice the following two sub-cases:
– The probability for this path is identical to that in Theorem 4 if 2 ≤ t ≤ r − 1.
– The probability is 0 in case t ≥ r, as in this case the value X will always be
behind the position of ir = r, whereas X ≥ r as per assumption. That is,
the value X can never reach index X from t.
Assuming Pr(jr = X) = 1/N, this gives

Σ_{X=r}^{N−1} (1/N) · [ Pr(S1[X] = X) · (1 − 1/N)^{r−2}
  + Σ_{t=2}^{r−1} Σ_{k=0}^{r−t} (Pr(S1[t] = r)/(k! · N)) · ((r − t − 1)/N)^k · (1 − 1/N)^{r−3−k} ].
Case II. In the second case, we assume that jr takes a value X between 0 and r − 1.
Approximately this complete range is touched by index i for sure, and may also
be touched by index j. Thus, with probability approximately 1, the index jr = X
is touched by either of the indices. Simplifying all complicated computations
involving the initial position of value X and the exact location of index X in
this case, we shall assume that the approximate value of Pr(Sr−1[X] = X) is
1/N. Thus, the total contribution of Case II, assuming Pr(jr = X) = 1/N, is
given by Σ_{X=0}^{r−1} Pr(jr = X) · Pr(Sr−1[X] = X) ≈ Σ_{X=0}^{r−1} (1/N) · (1/N) = r/N^2.
Adding the contributions of the two disjoint cases I and II, we obtain the
total probability for (Sr[ir] = jr) as desired. One may investigate Case II in
more detail to incorporate all intertwined sub-cases, and obtain a better closed-form
expression for the probability.
Proof-sketch for Pr(Sr [tr ] = tr ). In this case, notice that tr is just another
random variable like jr , and may assume all values from 0 to 255 with approxi-
mately the same probability 1/N . Thus we can approximate Pr(Sr [tr ] = tr ) by
Pr(Sr−1 [jr ] = jr ) with a high confidence margin to obtain the desired expression.
This approximation is particularly close for higher values of r because the
effect of a single state change Sr−1 → Sr is low in such a case. For smaller values
of r, one may approximate Pr(Sr−1 [tr ] = tr ) by Pr(Sr−1 [jr ] = jr ) and critically
analyze the effect of the r-th round of PRGA thereafter. However, in spite of
the approximations we made, one may note that the theoretical values closely
match the experimental observations (averages taken over 100 million runs of
RC4 with 16-byte key), as shown in Fig. 2.
Fig. 2 illustrates the experimental observations (averages taken over 100 mil-
lion runs with 16-byte key) and the theoretical values for the distributions of
Pr(Sr [ir ] = jr ) and Pr(Sr [tr ] = tr ) over the initial rounds 3 ≤ r ≤ 255 of RC4
PRGA. It is evident that our theoretical formulas approximately match the ex-
perimental observations in both the cases; the cause of the little deviation is
explained in the proof sketch above.
Fig. 2. Distributions of Pr(Sr [ir ] = jr ) and Pr(Sr [tr ] = tr ) for initial rounds 3 ≤ r ≤
255 of RC4 PRGA
Apart from the biases proved so far, all other unconditional biases reported
in [12] are of order 1/N 2 or less, and we omit their analysis in this paper. The
next most significant bias reported in [12] was a new conditional bias arising
from a set of correlations in RC4 PRGA. A careful study of this new bias gives
rise to several related observations and results related to the KSA as well, as
presented in the next section.
In our notation, the above event is denoted as Pr(S16 [j16 ] = 0 | z16 = −16).
While exploring this conditional bias and related parameters of RC4 PRGA, we
could immediately observe two things:
1. The number 16 in the result comes from the keylength that is consistently
chosen to be 16 in [12] for most of the experimentation. In its general form,
the conditional bias should be stated as (crude approximation):

Pr(Sl[jl] = 0 | zl = −l) ≈ 10/N.    (6)

It is surprising that this natural observation was not identified earlier.
2. Along the same line of investigation, we could find a family of related condi-
tional biases, stated in their general form as follows (crude approximations):
We shall henceforth denote the above event by (fl−1 = −l), following the
notation of Paul and Maitra [9], and this event is going to constitute the base
for most of the conditional probabilities we consider hereafter. We consider
Pr(fl−1 = −l) ≈ 1/N, assuming that fl−1 can take any value modulo N uniformly
at random.
Extensive experimentation with different keylengths (100 million runs for each
keylength 1 ≤ l ≤ 256) revealed strong bias in all of the following events:
Each of the correlations (6), (7), (8), (9), and (10) is an artifact of these com-
mon keylength-based correlations in RC4 PRGA. In this section, we discuss and
justify all these conditional biases.
To prove our observations in this paper, we shall require the following existing
results from the literature of key-correlation in RC4. These are the correlations
observed by Roos [11] in 1995, which were later proved by Paul and Maitra [9].
Proposition 2. [9, Lemma 1] If index j is pseudorandom at each KSA round,
we have Pr(j_{y+1}^K = f_y) ≈ (1 − 1/N)^{1 + y(y+1)/2} + 1/N.
Theorem 6. Suppose that l is the length of the secret key used in the RC4
algorithm. Given fl−1 = Σ_{i=0}^{l−1} K[i] + l(l − 1)/2 = −l, we have

Pr(Sl[jl] = 0) ≈ 1/N + (1 − l/N) · (1 − 1/N)^{N+l−2} · [ (1 − 1/N)^{1 + l(l+1)/2} + 1/N ],

Pr(Sl−2[l − 1] = −l) ≈ 1/N + (1 − 1/N)^{l−1} · [ (1 − (l − 1)/N) · (1 − 1/N)^{N + l(l−1)/2} + 1/N ].
Proof. For proving the first conditional bias, we need to trace the value 0 over
KSA and the first l rounds of PRGA. We start from S0^K[0] = 0, as the initial
state S0^K of KSA is the identity permutation in RC4. The following gives the
trace pattern for 0 through the complete KSA and l initial rounds of PRGA. We
shall discuss some of the transitions in detail.

0 : S0^K[0] −(1)→ S1^K[K[0]] −(p1)→ Sl^K[K[0]] −(p2)→ S_{l+1}^K[l] −(p3)→ S_{l−1}[l] −(1)→ Sl[jl]

Here p1 = (1 − l/N) · (1 − 1/N)^{l−1} denotes the probability that index K[0] is not
touched by i^K and j^K in the first l rounds of KSA, p2 = (1 − 1/N)^{1 + l(l+1)/2} + 1/N
denotes the probability Pr(j_{l+1}^K = f_l = K[0]) (using Proposition 2) such that 0 is
swapped from Sl^K[K[0]] to S_{l+1}^K[l], and p3 = (1 − 1/N)^{N−2} denotes the probability
that the location S_{l+1}^K[l] containing 0 is not touched by i^K, j^K in the remaining
N − l − 1 rounds of KSA or by i, j in the first l − 1 rounds of PRGA. So, this
path gives a total probability of p1 p2 p3. If this path does not hold, we assume
that the event (Sl[jl] = 0) still holds at random, with probability 1/N. Thus,
the total probability is obtained as

Pr(Sl[jl] = 0) = p1 p2 p3 + (1 − p1 p2 p3) · (1/N) = 1/N + (1 − 1/N) · p1 p2 p3.
We perform a similar propagation tracking for the value f_{l−1} = −l to prove the second result, and the main path for this tracking looks as follows.

    −l : S_0^K[−l] --p4--> S_0[l − 1] --p5--> S_{l−2}[l − 1]
Here we get p4 = Pr(S_0[l − 1] = f_{l−1}) = (1 − (l−1)/N)(1 − 1/N)^(N + l(l−1)/2) + 1/N using Proposition 3 directly, and p5 = (1 − 1/N)^(l−2) denotes the probability that the index (l − 1), containing −l, is not touched by i, j in the first l − 2 rounds of the PRGA.
PRGA. Similar to the previous proof, the total probability can be calculated as
1 1 1
Pr(Sl−2 [l − 1] = −l) = p4 p5 + (1 − p4 p5 ) · = + 1− p4 p5 .
N N N
We get the claimed results by substituting p1 , p2 , p3 and p4 , p5 appropriately.
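The closed-form expressions of Theorem 6 are straightforward to evaluate numerically. The following sketch (illustrative code written for this discussion, not part of the original work) computes both probabilities for the standard parameters N = 256 and l = 16:

```python
# Numerical evaluation of the two conditional probabilities of Theorem 6.
# Illustrative sketch; the formulas are transcribed from the theorem statement.

def pr_Sl_jl_zero(N, l):
    """Pr(S_l[j_l] = 0 | f_{l-1} = -l), first formula of Theorem 6."""
    inner = (1 - 1 / N) ** (1 + l * (l + 1) // 2) + 1 / N
    return 1 / N + (1 - l / N) * (1 - 1 / N) ** (N + l - 2) * inner

def pr_Slm2_minus_l(N, l):
    """Pr(S_{l-2}[l-1] = -l | f_{l-1} = -l), second formula of Theorem 6."""
    inner = (1 - (l - 1) / N) * (1 - 1 / N) ** (N + l * (l - 1) // 2) + 1 / N
    return 1 / N + (1 - 1 / N) ** (l - 1) * inner

N, l = 256, 16
# Both values are roughly 50 times the random value 1/N.
print(pr_Sl_jl_zero(N, l) * N, pr_Slm2_minus_l(N, l) * N)
```

Both probabilities come out at roughly 50 times the random value 1/N, in line with the magnitude reported below.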
Numerical Values. If we substitute l = 16, the most common key length for RC4, and N = 256, the probabilities of Theorem 6 evaluate to

    Pr(S_l[j_l] = 0 | f_{l−1} = −l) ≈ Pr(S_{l−2}[l − 1] = −l | f_{l−1} = −l) ≈ 50/N.

These are, to the best of our knowledge, the best known key-dependent conditional biases in RC4 PRGA to date. The estimates closely match the experiments we performed over 100 million runs with 16-byte keys. In the next theorem, we look at a few natural consequences of these biases.
Theorem 7. Suppose that l is the length of the RC4 secret key. Given that f_{l−1} = Σ_{i=0}^{l−1} K[i] + l(l − 1)/2 = −l, the probabilities Pr(S_l[l] = −l | f_{l−1} = −l) and Pr(t_l = −l | f_{l−1} = −l) are both approximately

    1/N + (1 − 1/N) · [ 1/N + (1 − l/N)(1 − 1/N)^(N+l−2) ( (1 − 1/N)^(1 + l(l+1)/2) + 1/N ) ]
                    · [ 1/N + (1 − 1/N)^(l−1) ( (1 − 1/N)^(N−l) + 1/N ) ].
Proof. Before proving the path for the target events, let us take a look at rounds l − 1 and l of the RC4 PRGA when S_{l−2}[l − 1] = −l and S_{l−1}[l] = 0. In this situation, we have the following propagation for the value −l.

    −l : S_{l−2}[l − 1] --1--> S_{l−1}[j_{l−1}] = S_{l−1}[j_l] --1--> S_l[l]

In the above path, the equality holds because j_l = j_{l−1} + S_{l−1}[l] = j_{l−1} + 0 as per the conditions. Again, we have S_l[j_l] = S_{l−1}[l] = 0, implying t_l = S_l[l] + S_l[j_l] = −l + 0 = −l as well. This explains why the two events in the statement share the same probability expression.
Note that we require both the events (Sl [jl ] = 0 | fl−1 = −l) and (Sl−2 [l −
1] = −l | fl−1 = −l) to occur simultaneously, and need to calculate the joint
probability. Also note that there is a significant overlap between the tracking
paths of these two events, as they both assume that the first l positions of
the state S0K are not touched by j K in the first l rounds of KSA (refer to
the proof of Theorem 6 of this paper and proofs of [9, Theorem 1, Corollary
1] for details). In other words, if we assume the occurrence of the event (S_l[j_l] = 0 | f_{l−1} = −l) (with probability p6, say, as derived in Theorem 6), then the precondition for (S_{l−2}[l − 1] = −l | f_{l−1} = −l) will be satisfied, and thus the modified conditional probability is

    Pr(S_{l−2}[l − 1] = −l | S_l[j_l] = 0 & f_{l−1} = −l) = 1/N + (1 − 1/N)^(l−1) [ (1 − 1/N)^(N−l) + 1/N ] = p7,

say. Now, we can compute the joint probability of the two events as
    Pr(S_l[l] = −l | f_{l−1} = −l) = p6 p7 + (1 − p6 p7) · 1/N = 1/N + (1 − 1/N) · p6 p7.
Substituting the values of p6 and p7 , we obtain the desired result. Event (tl = −l)
follows immediately from (Sl [l] = −l), with the same conditional probability.
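The joint expression of Theorem 7 can be evaluated the same way; in this sketch (our illustrative code), p6 and p7 are named as in the proof above:

```python
# Evaluate the probability of Theorem 7 for N = 256, l = 16 (illustrative sketch).
N, l = 256, 16
p6 = 1 / N + (1 - l / N) * (1 - 1 / N) ** (N + l - 2) * \
     ((1 - 1 / N) ** (1 + l * (l + 1) // 2) + 1 / N)
p7 = 1 / N + (1 - 1 / N) ** (l - 1) * ((1 - 1 / N) ** (N - l) + 1 / N)
prob = 1 / N + (1 - 1 / N) * p6 * p7
print(prob * N)  # roughly 20, i.e. about 20 times the random value 1/N
```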
Conditional Bias in Output. We find that the bias in (z_l = −l) is also caused by the event (f_{l−1} = −l), but through a different path than the one discussed so far. We prove the formal statement next as Theorem 8.
Theorem 8. Suppose that l is the length of the secret key of RC4. Given that f_{l−1} = Σ_{i=0}^{l−1} K[i] + l(l − 1)/2 = −l, the probability Pr(z_l = −l) is approximately

    1/N + (1 − 1/N) · [ 1/N + (1 − l/N)(1 − 1/N)^(N+l−2) ( (1 − 1/N)^(1+l) + 1/N ) ]
                    · [ 1/N + (1 − 1/N)^(l+1) · Pr(S_0[S_0[l − 1]] = f_{l−1}) ].
Proof. The proof is similar to that of Theorem 7, as both require S_l[j_l] = S_{l−1}[l] = 0 to occur first. Note that if S_l[j_l] = S_{l−1}[l] = 0, we will always have t_l = S_l[l] + S_l[j_l] = S_l[l]. Thus the basic intuition is to use the path S_0[S_0[l − 1]] = f_{l−1} = −l to get

    −l : S_0[S_0[l − 1]] --p8--> S_{l−2}[S_{l−2}[l − 1]] --p9--> S_l[S_{l−2}[l − 1]]

In the above expression, p8 = (1 − 1/N)^(l−2) and p9 = (1 − 1/N)^2 denote the probabilities of j not touching the state index that stores the value −l. This introduces a probability of (1 − 1/N)^l. Thus Pr(S_l[S_{l−2}[l − 1]] = −l | f_{l−1} = −l) is cumulatively given by 1/N + (1 − 1/N)^(l+1) · Pr(S_0[S_0[l − 1]] = f_{l−1}) = p10, say. Note that one of the preconditions to prove [9, Theorem 4] is that the first (l − 1) places of the state S_0^K remain untouched by j^K for the first l − 1 rounds of the KSA. This partially matches the precondition to prove Pr(S_l[j_l] = 0 | f_{l−1} = −l) (see Theorem 6), where we require the same for the first l places over the first l rounds of the KSA. Thus we derive the formula for Pr(S_l[j_l] = 0 | S_0[S_0[l − 1]] = −l & f_{l−1} = −l) by modifying the result of Theorem 6 as

    1/N + (1 − l/N)(1 − 1/N)^(N+l−2) [ (1 − 1/N)^(1+l) + 1/N ] = p11,

say. The final probability for (z_l = −l | f_{l−1} = −l) can now be computed as

    Pr(z_l = −l | f_{l−1} = −l) = p10 p11 + (1 − p10 p11) · 1/N = 1/N + (1 − 1/N) · p10 p11.
Substituting appropriate values for p10 and p11, we get the desired result.
Here p12 depends on the probability Pr(S_0[S_0[l − 1]] = f_{l−1}) from Proposition 4. Using these two paths, one may obtain the value of Pr(z_l = −l & S_l[j_l] = 0).
As before, Pr(fl−1 = −l) can be taken as 1/N . If one assumes that the aforemen-
tioned two paths are independent, the probabilities from Theorems 6 and 8 can
be substituted in the above expression. If one further assumes that the events
occur uniformly at random if fl−1 = −l, the values of Pr(Sl [jl ] = 0 | zl = −l)
and Pr(zl = −l | Sl [jl ] = 0) turn out to be approximately 5/N each (for l = 16).
However, our experiments show that the two paths mentioned earlier are not
entirely independent, and we obtain Pr(zl = −l & Sl [jl ] = 0 | fl−1 = −l) ≈ 5/N .
Moreover, the events are not uniformly random if fl−1 = −l; rather they are
considerably biased for a range of values of fl−1 around −l (e.g., for values like
−l + 1, −l + 2 etc.). These hidden paths contribute towards the probability
Pr(fl−1 = −l) Pr(zl = −l & Sl [jl ] = 0 | fl−1 = −l) ≈ 5/N 2 . Through a careful
treatment of the dependences and all the hidden paths, one would be able to
justify the above observations, and obtain
Pr(Sl [jl ] = 0 | zl = −l) ≈ Pr(zl = −l | Sl [jl ] = 0) ≈ 10/N.
Similar techniques for analyzing dependences and hidden paths would work for all correlations reported in Equations 6, 7, 8, 9, and 10.
We now shift our focus to Pr(zl = −l | fl−1 = −l) and its implications.
Numerical Values. First of all, notice that the value of Pr(z_l = −l | f_{l−1} = −l) depends on the value of Pr(S_0[S_0[l − 1]] = f_{l−1}). Proposition 4 gives an explicit formula for this probability for l up to 32. As l increases beyond 32, one may check by experimentation that this probability converges approximately to 1/N. Thus, for 1 ≤ l ≤ 32, one can use the formula from Proposition 4, and for l > 32, one may replace Pr(S_0[S_0[l − 1]] = f_{l−1}) by 1/N, to compute the distribution of (z_l = −l | f_{l−1} = −l) completely. In fact, after the state recovery attack by Maximov and Khovratovich [8], which has time complexity around 2^241, choosing a secret key of length l > 30 is not meaningful.
The values of Pr(z_l = −l | f_{l−1} = −l) for some typical values of l are

    12/N for l = 5,   11/N for l = 8,   7/N for l = 16,   2/N for l = 30.
In the list above, each conditional probability is quite high in magnitude compared to the natural probability of random occurrence. We try to exploit this bias in the next section to predict the length of the RC4 secret key.
References
1. Fluhrer, S.R., Mantin, I., Shamir, A.: Weaknesses in the Key Scheduling Algorithm
of RC4. In: Vaudenay, S., Youssef, A.M. (eds.) SAC 2001. LNCS, vol. 2259, pp.
1–24. Springer, Heidelberg (2001)
2. Klein, A.: Attacks on the RC4 stream cipher. Designs, Codes and Cryptogra-
phy 48(3), 269–286 (2008)
3. LAN/MAN Standard Committee. ANSI/IEEE standard 802.11b: Wireless LAN
Medium Access Control (MAC) and Physical Layer (phy) Specifications (1999)
4. LAN/MAN Standard Committee. ANSI/IEEE standard 802.11i: Amendment 6:
Wireless LAN Medium Access Control (MAC) and Physical Layer (phy) Specifi-
cations. Draft 3 (2003)
5. Maitra, S., Paul, G., Sen Gupta, S.: Attack on Broadcast RC4 Revisited. In: Joux,
A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 199–217. Springer, Heidelberg (2011)
6. Mantin, I.: Analysis of the stream cipher RC4. Master’s Thesis, The Weizmann
Institute of Science, Israel (2001),
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Mantin1.zip
7. Mantin, I., Shamir, A.: A Practical Attack on Broadcast RC4. In: Matsui, M. (ed.)
FSE 2001. LNCS, vol. 2355, pp. 152–164. Springer, Heidelberg (2002)
8. Maximov, A., Khovratovich, D.: New State Recovery Attack on RC4. In: Wagner,
D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 297–316. Springer, Heidelberg (2008)
9. Paul, G., Maitra, S.: On biases of permutation and keystream bytes of RC4 towards
the secret key. Cryptography and Communications 1, 225–268 (2009)
10. Paul, G., Rathi, S., Maitra, S.: On Non-negligible bias of the first output byte of
RC4 towards the first three bytes of the secret key. Designs, Codes and Cryptog-
raphy 49(1-3), 123–134 (2008)
11. Roos, A.: A class of weak keys in the RC4 stream cipher. Two posts in sci.crypt,
message-id [email protected], [email protected] (1995),
http://marcel.wanda.ch/Archive/WeakKeys
12. Sepehrdad, P., Vaudenay, S., Vuagnoux, M.: Discovery and Exploitation of New
Biases in RC4. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS,
vol. 6544, pp. 74–91. Springer, Heidelberg (2011)
13. Sepehrdad, P., Vaudenay, S., Vuagnoux, M.: Statistical Attack on RC4. In: Pa-
terson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 343–363. Springer,
Heidelberg (2011)
14. Vaudenay, S., Vuagnoux, M.: Passive–Only Key Recovery Attacks on RC4. In:
Adams, C., Miri, A., Wiener, M. (eds.) SAC 2007. LNCS, vol. 4876, pp. 344–359.
Springer, Heidelberg (2007)
15. Wagner, D.: My RC4 weak keys. Post in sci.crypt, message-id
[email protected]. (September 26, 1995),
http://www.cs.berkeley.edu/~daw/my-posts/my-rc4-weak-keys
Combined Differential and Linear Cryptanalysis
of Reduced-Round PRINTcipher
1 Introduction
Security and privacy in constrained environments such as RFID tags and sensor networks are a challenging subject in cryptography, and lightweight cryptographic algorithms and protocols are required for this reason. Several block ciphers, stream ciphers, and hash functions have been proposed to meet this requirement [2, 3, 8, 10–12, 17, 18, 22, 27, 28]. The encryption algorithm PRINTcipher was introduced at CHES 2010 as a lightweight block cipher by Knudsen et al. [19]. The authors aimed to build an algorithm especially suitable for integrated-circuit printing.
At FSE 2011, Abdelraheem et al. [1] applied a differential attack to reduced-round PRINTcipher. Their attack can break half of the rounds of the cipher. The authors observed that the differential distribution has a key-dependent structure, and they exploited this fact to obtain information about the key bits. Their attack uses the whole codebook and has a complexity of about 2^48 computational steps for the 48-bit version of the algorithm. The authors use the roots of permutations to deduce the key bits which affect the key-dependent permutations. There are also algebraic cryptanalysis and side-channel analyses of PRINTcipher [9, 31]; however, the designers noted that side-channel and related-key attacks were not a major concern in the design of PRINTcipher.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 169–184, 2012.
© Springer-Verlag Berlin Heidelberg 2012
170 F. Karakoç, H. Demirci, and A.E. Harmancı
Recently, Leander et al. [21] have announced an attack on full-round PRINTcipher-48 for a class of 2^52 keys. Also, Ågren and Johansson [15] have applied a linear attack on 28-round PRINTcipher-48 which works for half of the keys.
Differential [7] and linear [25] cryptanalysis are the most widely used cryptanalysis techniques for block ciphers. There are also attacks which use combinations of the classical techniques, such as the impossible-differential [4], boomerang [30], and differential-linear [20] attacks. In the differential-linear method, the attacker divides the cipher into two parts, where a differential and a linear approximation are constructed for the first and second parts, respectively. This combined attack method was enhanced by other works [5, 23, 32] and applied to ciphers such as IDEA and Serpent [6, 14, 16]. There are also key-dependent attacks [13, 16, 26, 29], of which [16] uses a differential-linear technique.
In this work, we combine differential and linear cryptanalysis on PRINTcipher using a different technique: we construct linear approximations to increase the probability of differential characteristics. Using this method, we have found that for some of the keys, the probability of an r-round differential characteristic is significantly higher than the maximum probability of an r-round characteristic claimed by the designers. We point out the special key values which induce this weakness and explain the mechanism behind this observation. We show that 4.54% and 0.036% of the keys are weak for 29 and 31 rounds, respectively.
This paper proceeds as follows. In Section 2, we briefly introduce the notation we use and PRINTcipher itself. In Section 3, we explain the weak-key mechanism of the cipher. Section 4 presents cryptanalytic attacks using the observations of the previous section. Finally, we conclude the paper in Section 5.
2.2 PRINTcipher
The PRINTcipher encryption algorithm has two versions: PRINTcipher-48 has a block size of 48 bits, consists of 48 rounds, and uses an 80-bit key, whereas PRINTcipher-96 has a block size of 96 bits, consists of 96 rounds, and uses a 160-bit key.
PRINTcipher has an SP-network structure where the S-box is chosen to have the best differential and linear distributions among 3-bit functions. Each round function consists of a key xoring, a bitwise permutation over 48 (resp. 96) bits, a round-constant xoring to the least significant bits, a bitwise permutation on 3-bit groups, and an S-box layer.
Note that the round key k = sk1||sk2 is identical at each round. The first b bits of the key, sk1, are xored to the state at the beginning of each round. After that, the following bit permutation is applied:

    P(i) = 3i mod (b − 1)  for 0 ≤ i ≤ b − 2,
    P(i) = b − 1           for i = b − 1,

where b ∈ {48, 96} is the block size. Then, a 6-bit or a 7-bit round constant is added to the least significant bits of the state, according to the block size. We would like to point out that the most significant bits are not affected by this addition. This is followed by a key-dependent permutation. In this layer, sk2 is divided into 2-bit chunks, and each chunk determines the permutation applied to the corresponding 3-bit group of the state. This permutation is defined as follows, where a1||a0 are the bits of sk2 and c2||c1||c0 are the state bits:

    a1||a0   output
    00       c2||c1||c0
    01       c1||c2||c0
    10       c2||c0||c1
    11       c0||c1||c2
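The two permutation layers above are small enough to sketch directly. The following illustrative code (ours, not the reference implementation) implements P and the sk2-controlled 3-bit permutation, and checks that P is a bijection for both block sizes:

```python
# Illustrative sketch of PRINTcipher's permutation layers (not reference code).

def P(i, b=48):
    """Bitwise permutation: P(i) = 3i mod (b - 1) for i < b - 1, P(b-1) = b - 1."""
    return (3 * i) % (b - 1) if i < b - 1 else b - 1

def keyed_perm(a1, a0, c2, c1, c0):
    """Key-dependent permutation of a 3-bit group, per the table above."""
    table = {(0, 0): (c2, c1, c0), (0, 1): (c1, c2, c0),
             (1, 0): (c2, c0, c1), (1, 1): (c0, c1, c2)}
    return table[(a1, a0)]

# P must be a bijection on {0, ..., b-1} for both versions of the cipher.
for b in (48, 96):
    assert sorted(P(i, b) for i in range(b)) == list(range(b))
```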
Finally, the same S-box is applied in parallel to every 3-bit group of the state. The unique S-box used in PRINTcipher is the following:

    x      0 1 2 3 4 5 6 7
    S[x]   0 1 3 6 7 4 5 2
The S-box preserves a one-bit input difference in the same output position with probability 2^−2; see Table 1.
Table 1. Differential distribution table of the PRINTcipher S-box

                        Output Difference
                000 001 010 011 100 101 110 111
          000    8   0   0   0   0   0   0   0
          001    0   2   0   2   0   2   0   2
          010    0   0   2   2   0   0   2   2
    Input 011    0   2   2   0   0   2   2   0
    Diff. 100    0   0   0   0   2   2   2   2
          101    0   2   0   2   2   0   2   0
          110    0   0   2   2   2   2   0   0
          111    0   2   2   0   2   0   0   2
Similarly, it can be seen in Table 2 that the i-th bit of the input of the S-box equals the i-th bit of the output with a bias of 2^−2 or −2^−2.
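Both tables can be recomputed from the S-box in a few lines. The sketch below (our illustrative code) derives the differential distribution and linear approximation tables of S = [0, 1, 3, 6, 7, 4, 5, 2], where a linear-table entry is the match count minus 4 (i.e. 8 times the bias), and checks the two properties used in the sequel:

```python
# Recompute the DDT and LAT of the PRINTcipher S-box (illustrative sketch).
S = [0, 1, 3, 6, 7, 4, 5, 2]

# DDT[a][b] = number of x with S[x] ^ S[x ^ a] == b (Table 1).
DDT = [[sum(S[x] ^ S[x ^ a] == b for x in range(8)) for b in range(8)]
       for a in range(8)]

def parity(v):
    return bin(v).count("1") & 1

# LAT[a][b] = #{x : <a,x> = <b,S[x]>} - 4, so the bias is LAT[a][b] / 8 (Table 2).
LAT = [[sum(parity(a & x) == parity(b & S[x]) for x in range(8)) - 4
        for b in range(8)] for a in range(8)]

for bit in (1, 2, 4):
    assert DDT[bit][bit] == 2        # one-bit difference kept in place: prob 2^-2
    assert abs(LAT[bit][bit]) == 2   # same-position single-bit mask: bias +/- 2^-2
```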
Using the properties of the S-box mentioned in the previous section and putting some conditions on the key bits, we are able to combine a differential and a linear characteristic, resulting in a differential characteristic whose one-round probability is higher than 2^−2.

Table 2. Linear approximation table of the PRINTcipher S-box

                        Output Mask
                000 001 010 011 100 101 110 111
          000    4   0   0   0   0   0   0   0
          001    0  −2   0   2   0   2   0   2
          010    0   0   2   2   0   0   2  −2
    Input 011    0   2  −2   0   0   2   2   0
    Mask  100    0   0   0   0   2  −2   2   2
          101    0   2   0   2   2   0  −2   0
          110    0   0   2  −2   2   2   0   0
          111    0   2   2   0  −2   0   0   2

We have found 3 different combined characteristics, each of which puts 6-bit conditions on the key bits of PRINTcipher-48. One of the characteristics is shown in Figure 2, and the other characteristics are given in Figure 6 and Figure 7 in Appendix A. To explain the reason for the probability increase, we focus on the characteristic shown in Figure 2; the probability increase for the other characteristics has a similar explanation.
In Figure 2, the dotted line and the solid line show the differential path and the linear path, respectively. The following lemma uses the linear path in Figure 2 to show the correlation between the input and output bits of the active S-boxes in the differential path in consecutive rounds.
Lemma 1. Let the key bits of PRINTcipher-48 satisfy the following equations:

    sk2[29] = 1, sk2[28] = 1, sk2[21] = 0, sk2[20] = 1.

Then the bias of the equation x^i[46] ⊕ sk1[46] ⊕ sk1[42] ⊕ sk1[31] = z^{i+2}[46] is −2^−3, where x^i is the input of the i-th round and z^{i+2} is the input of the key-dependent permutation in the (i+2)-th round.
Proof. Let the three input bits of the S-box be i2 i1 i0 and the three output bits be o2 o1 o0. From the linear approximation table of the S-box, the biases of the equations i0 ⊕ o0 = 0 and i1 ⊕ o1 = 0 are −2^−2 and 2^−2, respectively. Using this information, we can write the following equations with the corresponding biases:

    t^i[42] = x^{i+1}[42],      with bias −2^−2,
    t^{i+1}[31] = x^{i+2}[31],  with bias 2^−2.

We also use the following deterministic relations:

    x^i[46] ⊕ sk1[46] = t^i[42]          (since sk2[29] = 1 and sk2[28] = 1),
    x^{i+1}[42] ⊕ sk1[42] = t^{i+1}[31]  (since sk2[21] = 0 and sk2[20] = 1),

together with x^{i+2}[31] ⊕ sk1[31] = z^{i+2}[46], which holds deterministically because P(31) = 46 and the round-constant addition does not affect bit 46. Combining the two biased approximations by the piling-up lemma, the bias of the equation in the statement is 2 × (−2^−2) × 2^−2 = −2^−3.
The correlation between the input and output bits of the active S-boxes in consecutive rounds allows us to give one of our main statements on the probability of the differential characteristic shown in Figure 2.
Theorem 1. Let the key bits of PRINTcipher-48 satisfy the following equations:

    sk2[30] = 0, sk2[29] = 1, sk2[28] = 1, sk2[21] = 0, sk2[20] = 1,
    sk1[46] ⊕ sk1[42] ⊕ sk1[31] = 1.

Then the probability of the differential characteristic (100...00) → (100...00) → ... → (100...00) for r rounds is 2^−(6+1.68×(r−3)).
Proof. Since sk2[30] = 0, the key-dependent permutation layer keeps the difference in the leftmost bit. In the first three rounds, the probability of the differential characteristic is 2^−6, because there is no linear relation between the input-output bits of the active S-boxes. In the fourth round, while z^4[45] is distributed uniformly, z^4[46] equals x^2[46] ⊕ sk1[46] ⊕ sk1[42] ⊕ sk1[31] with bias −2^−3, putting i = 2 in Lemma 1. We know that x^2[46] = 1, because only the pair (011, 111) preserves the difference in the leftmost bit of the S-box, and the corresponding output pair is (110, 010). Since sk1[46] ⊕ sk1[42] ⊕ sk1[31] = 1, we have z^4[46] = 1 with bias 2^−3, that is, with probability 10/16. Thus, for the fourth round, the input pair of the S-box is (011, 111) with probability 2^−1 × 10/16 = 2^−1.68. That means the difference in the leftmost bit of the inputs of the S-box stays in the same position in the output of the S-box with probability 2^−1.68. For the later rounds, z^i[46] equals x^{i−2}[46] ⊕ sk1[46] ⊕ sk1[42] ⊕ sk1[31] = 1 with probability 10/16, and we may assume that z^i[45] has a uniform distribution. That is, the probability for each round after the fourth is also 2^−1.68. Thus the probability of the r-round differential characteristic is 2^−(6+1.68×(r−3)).
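The arithmetic behind Theorem 1 can be checked directly; in this sketch (our illustrative code), the biased rounds contribute (1/2) · (10/16) = 2^−1.68 each:

```python
import math

# Per-round probability of the biased rounds in Theorem 1: the S-box input
# pair is (011, 111) with probability (1/2) * (10/16) = 2^-1.68.
p_round = 0.5 * (10 / 16)
assert abs(-math.log2(p_round) - 1.68) < 0.01

def char_log2_prob(r):
    """log2 of the r-round characteristic probability of Theorem 1."""
    return -(6 + 1.68 * (r - 3))

print(char_log2_prob(26))  # about -44.64, the 26-round value used in Section 4
```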
In Table 3, the key constraints for the combined characteristics are shown. We use the notation KS_i to denote the key subset which satisfies the i-th combined characteristic.

Analyzing the above conditions, we have observed that the keys satisfying at least one of the constraints in Table 3 form 4.54% of the key space of PRINTcipher-48.
Lemma 2. Let the key bits of PRINTcipher-48 satisfy the following equations:

    sk2[10] = 0, sk2[27] = sk2[26] = sk2[15] = sk2[14] = sk2[11] = 1.

Then the bias of the equation x^i[45] ⊕ sk1[45] ⊕ sk1[39] ⊕ sk1[21] ⊕ sk1[15] = z^{i+3}[45] is −2^−4.
Proof. We can write the following equations with the corresponding biases, using the linear approximation table of the S-box:

    t^i[39] = x^{i+1}[39],      with bias −2^−2,
    t^{i+1}[21] = x^{i+2}[21],  with bias −2^−2.
4 Key Recovery
4.1 An Attack on 31-Round PRINTcipher-48 for KS 4
Assume that PRINTcipher-48 uses a key from KS_4. Then the key bits satisfy the conditions sk2[30] = 0, sk2[29] = 1, sk2[28] = 1, sk2[27] = 1, sk2[26] = 1, sk2[21] = 0, sk2[20] = 1, sk2[15] = 1, sk2[14] = 1, sk2[11] = 1, sk2[10] = 0, sk1[46] ⊕ sk1[42] ⊕ sk1[31] = 1, and sk1[45] ⊕ sk1[39] ⊕ sk1[21] ⊕ sk1[15] = 0. Using the 28-round differential characteristic powered by linear characteristics from Section 3.3, we have been able to attack the 31-round version of the cipher and recover the key bits sk2[25−22], sk2[19−16], and sk1[47−39]. For 28 rounds, the probability of the differential characteristic is 2^−43.92. The propagation of the active bit in the output of the 28-th round through 3 rounds is shown in Figure 4; in the figure, the difference in the bits in the dotted line is 0 or 1. We apply the attack using Algorithm 1.
We have calculated the signal-to-noise ratio as 2^4.08 for the attack on the reduced 31-round PRINTcipher-48 using the formula

    S/N = (2^k × p) / (α × β) = (2^17 × 2^−43.92) / (2^4 × 2^−35) = 2^4.08,

where k denotes the number of guessed key bits, p is the probability of the characteristic, α is the average count of subkeys per counted plaintext pair, and β is the ratio of the counted pairs to all pairs. Since S/N is bigger than 2, according to [7], about 4 right pairs are enough to determine the key bits. Thus we need about 4 × 2^43.92 = 2^45.92 pairs.
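The signal-to-noise computation is plain exponent arithmetic; a quick sketch (our illustrative code) with the exponents from the text:

```python
# S/N = (2^k * p) / (alpha * beta), computed in log2 with k = 17,
# p = 2^-43.92, alpha = 2^4, beta = 2^-35 (illustrative check).
log2_sn = 17 + (-43.92) - (4 + (-35))
assert abs(log2_sn - 4.08) < 1e-6   # S/N = 2^4.08 > 2

# With S/N > 2, about 4 right pairs suffice [7], so we need
log2_pairs = 2 + 43.92              # 4 * 2^43.92 = 2^45.92 pairs
assert abs(log2_pairs - 45.92) < 1e-6
```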
The complexity of the attack is as follows. We use 2^46.92 chosen plaintext-ciphertext data. The number of pairs used in the attack is 2^45.92 × 2^−35 = 2^10.92 because of the eliminations in steps 3 and 5 of the algorithm. Note that in step 4 we make 2^25.92 inverse S-box operations. For the counted pairs, we make 2^19.92 key-dependent permutations, guessing the 8 bits (sk2[25−22] and sk2[19−16]) of the key. We decrease the search space to 2^15.92 using the elimination in step 9. Then we make 2^24.92 two-round decryptions, guessing the 9 bits (sk1[47−39]) of the key. In total, the number of operations in the attack is approximately equivalent to 2^24.92 × 2 = 2^25.92 one-round encryptions of PRINTcipher-48.
To verify the attack algorithm and the effect of the linear approximations, we have implemented the attack on 20-round PRINTcipher-48, where the 17-round characteristic probability is 2^−27.31 by Theorem 2. We have run Algorithm 1 using 2^31 plaintext-ciphertext data 8 different times. In each of these experiments, the correct key is found among the highest-counted candidates. If the linear approximations had no effect on the probability of the differential characteristic, the probability would be 2^−34 for 17 rounds, and 2^31 data would not be sufficient to recover the key bits.
The propagation of the active bit in the output of the 26-th round through 3 rounds is shown in Figure 5. We use an algorithm similar to Algorithm 1 to recover the key bits. The differences between the attack algorithms for KS_1 and KS_4 are the following:

– The condition in step 5 becomes Δt^r[46−45] = (0, 0), Δt^r[44−43] = (0, 0), hw(Δt^r[41−39]) ≤ 1, hw(Δt^r[38−36]) ≤ 1, hw(Δt^r[35−33]) ≤ 1, Δt^r[32] = (0), Δt^r[30] = (0), hw(Δt^r[29−27]) ≤ 1, hw(Δt^r[26−24]) ≤ 1, hw(Δt^r[23−21]) ≤ 1;
– The guessed key bits in step 6 are sk2[27−22] and sk2[19−14];
– The condition in step 9 becomes Δz^r[40−39] = (0, 0), Δz^r[37−36] = (0, 0), Δz^r[34−33] = (0, 0), Δz^r[28−27] = (0, 0), Δz^r[25−24] = (0, 0), Δz^r[22−21] = (0, 0).
The signal-to-noise ratio in this case is

    S/N = (2^k × p) / (α × β) = (2^21 × 2^−44.64) / (2^6 × 2^−33) = 2^3.59.
5 Conclusion
In this paper, we have used differential and linear cryptanalysis techniques to-
gether to analyze the security of PRINTcipher. This combined usage is different
from differential-linear cryptanalysis [20]. In differential-linear cryptanalysis, a
cipher is divided into two parts where differentials and linear approximations
are constructed for the first and second parts respectively. In this work, we have
used linear approximations to increase the probability of the differentials. Using
this method, we have found that for some of the keys, the probability of
an r-round differential characteristic is significantly higher than the designers’
expected values. With the help of linear approximations we have constructed
r-round differential characteristics with probability 2^−(6+1.68×(r−3)) for 4.54%
of the keys and with probability 2^−(7.68+1.51×(r−4)) for 0.036% of the keys of
PRINTcipher-48. These observations enable us to develop cryptanalytic at-
tacks on 29 and 31 rounds of PRINTcipher-48 for these key subsets.
References
1. Abdelraheem, M.A., Leander, G., Zenner, E.: Differential Cryptanalysis of Round-
Reduced PRINTcipher: Computing Roots of Permutations. In: Joux, A. (ed.) FSE
2011. LNCS, vol. 6733, pp. 1–17. Springer, Heidelberg (2011)
2. Aumasson, J.-P., Henzen, L., Meier, W., Naya-Plasencia, M.: QUARK: A
lightweight hash. In: Mangard and Standaert [24], pp. 1–15 (2010)
3. Badel, S., Dagtekin, N., Nakahara, J., Ouafi, K., Reffé, N., Sepehrdad, P., Susil, P.,
Vaudenay, S.: ARMADILLO: A Multi-Purpose Cryptographic Primitive Dedicated
to Hardware. In: Mangard and Standaert [24], pp. 398–412 (2010)
4. Biham, E., Biryukov, A., Shamir, A.: Cryptanalysis of Skipjack Reduced to 31
Rounds using Impossible Differentials. In: Stern, J. (ed.) EUROCRYPT 1999.
LNCS, vol. 1592, pp. 12–23. Springer, Heidelberg (1999)
5. Biham, E., Dunkelman, O., Keller, N.: Enhancing Differential-Linear Cryptanaly-
sis. In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 254–266. Springer,
Heidelberg (2002)
6. Biham, E., Dunkelman, O., Keller, N.: Differential-Linear Cryptanalysis of Serpent.
In: Johansson, T. (ed.) FSE 2003. LNCS, vol. 2887, pp. 9–21. Springer, Heidelberg
(2003)
7. Biham, E., Shamir, A.: Differential Cryptanalysis of DES-Like Cryptosystems. In:
Menezes, A., Vanstone, S.A. (eds.) CRYPTO 1990. LNCS, vol. 537, pp. 2–21.
Springer, Heidelberg (1991)
8. Bogdanov, A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw,
M.J.B., Seurin, Y., Vikkelsoe, C.: PRESENT: An Ultra-Lightweight Block Cipher.
In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466.
Springer, Heidelberg (2007)
9. Bulygin, S.: Algebraic Cryptanalysis of the Round-Reduced and Side Channel
Analysis of the Full PRINTcipher-48. Cryptology ePrint Archive, Report 2011/287
(2011), http://eprint.iacr.org/
10. De Cannière, C.: trivium: A Stream Cipher Construction Inspired by Block Cipher
Design Principles. In: Katsikas, S.K., López, J., Backes, M., Gritzalis, S., Preneel,
B. (eds.) ISC 2006. LNCS, vol. 4176, pp. 171–186. Springer, Heidelberg (2006)
11. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — A
Family of Small and Efficient Hardware-Oriented Block Ciphers. In: Clavier, C.,
Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg
(2009)
12. Cheng, H., Heys, H.M., Wang, C.: PUFFIN: A Novel Compact Block Cipher Tar-
geted to Embedded Digital Systems. In: Fanucci, L. (ed.) DSD, pp. 383–390. IEEE
(2008)
13. Daemen, J., Govaerts, R., Vandewalle, J.: Weak Keys for IDEA. In: Stinson, D.R.
(ed.) CRYPTO 1993. LNCS, vol. 773, pp. 224–231. Springer, Heidelberg (1994)
14. Dunkelman, O., Indesteege, S., Keller, N.: A Differential-Linear Attack on 12-
Round Serpent. In: Chowdhury, D.R., Rijmen, V., Das, A. (eds.) INDOCRYPT
2008. LNCS, vol. 5365, pp. 308–321. Springer, Heidelberg (2008)
15. Ågren, M., Johansson, T.: Linear Cryptanalysis of PRINTcipher — Trails and
Samples Everywhere. Cryptology ePrint Archive, Report 2011/423 (2011),
http://eprint.iacr.org/
16. Hawkes, P.: Differential-Linear Weak Key Classes of IDEA. In: Nyberg, K. (ed.)
EUROCRYPT 1998. LNCS, vol. 1403, pp. 112–126. Springer, Heidelberg (1998)
17. Hong, D., Sung, J., Hong, S., Lim, J., Lee, S., Koo, B., Lee, C., Chang, D., Lee,
J., Jeong, K., Kim, H., Kim, J., Chee, S.: HIGHT: A New Block Cipher Suitable
for Low-Resource Device. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS,
vol. 4249, pp. 46–59. Springer, Heidelberg (2006)
18. Izadi, M., Sadeghiyan, B., Sadeghian, S.S., Khanooki, H.A.: MIBS: A New
Lightweight Block Cipher. In: Garay, J.A., Miyaji, A., Otsuka, A. (eds.) CANS 2009.
LNCS, vol. 5888, pp. 334–348. Springer, Heidelberg (2009)
19. Knudsen, L.R., Leander, G., Poschmann, A., Robshaw, M.J.B.: PRINTcipher: A
Block Cipher for IC-Printing. In: Mangard and Standaert [24], pp. 16–32
20. Langford, S.K., Hellman, M.E.: Differential-Linear Cryptanalysis. In: Desmedt,
Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 17–25. Springer, Heidelberg (1994)
21. Leander, G., Abdelraheem, M.A., AlKhzaimi, H., Zenner, E.: A cryptanalysis of
PRINTcipher: The Invariant Subspace Attack. In: Rogaway, P. (ed.) CRYPTO
2011. LNCS, vol. 6841, pp. 206–221. Springer, Heidelberg (2011)
22. Lim, C.H., Korkishko, T.: mCrypton – A Lightweight Block Cipher for Security
of Low-Cost RFID Tags and Sensors. In: Song, J.-S., Kwon, T., Yung, M. (eds.)
WISA 2005. LNCS, vol. 3786, pp. 243–258. Springer, Heidelberg (2006)
23. Liu, Z., Gu, D., Zhang, J., Li, W.: Differential-Multiple Linear Cryptanalysis. In:
Bao, F., Yung, M., Lin, D., Jing, J. (eds.) Inscrypt 2009. LNCS, vol. 6151, pp.
35–49. Springer, Heidelberg (2010)
24. Mangard, S., Standaert, F.-X. (eds.): CHES 2010. LNCS, vol. 6225. Springer,
Heidelberg (2010)
25. Matsui, M.: Linear Cryptanalysis Method for DES Cipher. In: Helleseth, T. (ed.)
EUROCRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994)
26. Ohkuma, K.: Weak Keys of Reduced-Round Present for Linear Cryptanalysis. In:
Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867,
pp. 249–265. Springer, Heidelberg (2009)
27. Ojha, S.K., Kumar, N., Jain, K., Sangeeta, L.: TWIS – A Lightweight Block Cipher.
In: Prakash, A., Sen Gupta, I. (eds.) ICISS 2009. LNCS, vol. 5905, pp. 280–291.
Springer, Heidelberg (2009)
28. Standaert, F.-X., Piret, G., Gershenfeld, N., Quisquater, J.-J.: SEA: A Scalable
Encryption Algorithm for Small Embedded Applications. In: Domingo-Ferrer, J.,
Posegga, J., Schreckling, D. (eds.) CARDIS 2006. LNCS, vol. 3928, pp. 222–236.
Springer, Heidelberg (2006)
29. Sun, X., Lai, X.: The Key-Dependent Attack on Block Ciphers. In: Matsui, M.
(ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 19–36. Springer, Heidelberg (2009)
30. Wagner, D.: The Boomerang Attack. In: Knudsen, L.R. (ed.) FSE 1999. LNCS,
vol. 1636, pp. 156–170. Springer, Heidelberg (1999)
31. Guo, S.-Z., Zhao, X.-J., Wang, T.: Fault-Propagation Pattern Based DFA on SPN
Structure Block Ciphers using Bitwise Permutation, with Application to PRESENT
and PRINTcipher. Cryptology ePrint Archive, Report 2011/086 (2011),
http://eprint.iacr.org/
32. Zhang, W., Zhang, L., Wu, W., Feng, D.: Related-Key Differential-Linear At-
tacks on Reduced AES-192. In: Srinathan, K., Rangan, C.P., Yung, M. (eds.)
INDOCRYPT 2007. LNCS, vol. 4859, pp. 73–85. Springer, Heidelberg (2007)
1 Introduction
Modular Multiplication based Block Cipher (MMB) [7] was designed as an alternative
to the IDEA block cipher [9] by Daemen, Govaerts and Vandewalle in
1993. It has 6 rounds, and both the block size and the key size are 128 bits. In
[13], Wang et al. proposed a differential attack on the full 6-round MMB with
2^118 chosen plaintexts, 2^95.61 encryptions and 2^66 bytes of memory. They also
presented linear and square attacks on reduced-round MMB.
Our main contribution in this paper is to introduce a fast sandwich attack
on MMB. The sandwich attack, recently formalized by Dunkelman et al. [11], is
aimed at improving the former theoretical related-key rectangle attack on the full
KASUMI block cipher [3] into a practical attack. The sandwich attack is an
extension of the boomerang attack, which was introduced by Wagner [14]. Similar
cryptanalysis techniques were also used in [4,5,14]. Usually, the boomerang
attack is an adaptively chosen plaintext and ciphertext attack. It was further
developed by Kelsey et al. [6] into a chosen-plaintext attack called the amplified
boomerang attack, which was independently introduced by Biham et al. under
the name of the rectangle attack [2]. In [10], the sandwich attack is also converted
into a chosen-plaintext attack, called the rectangle-like sandwich attack.

Supported by 973 Project (No. 2007CB807902), the National Natural Science
Foundation of China (Grant No. 60931160442), Tsinghua University Initiative
Scientific Research Program (2009THZ01002) and China Postdoctoral Science
Foundation (20110490442).
Corresponding author.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 185–199, 2012.
© Springer-Verlag Berlin Heidelberg 2012
186 K. Jia et al.
In this paper, we construct an interesting sandwich distinguisher of 5-round
MMB with probability 1. Using the distinguisher, we present an adaptively chosen
texts attack on MMB whose complexity is 2^40 texts and 2^40 MMB
encryptions. We also give a rectangle-like sandwich attack on MMB with 2^66.5
chosen plaintexts, 2^66.5 encryptions and 2^70.5 bytes of memory.
Furthermore, we introduce a 6-round differential with probability 2^−94. Utilizing
a 5-round differential obtained by truncating the given 6-round differential, we show
an improved differential attack on MMB with 2^96 chosen plaintexts, 2^96 MMB
encryptions and 2^66 bytes of memory. Interestingly, even if the MMB block
cipher is extended to 7 rounds, it is still vulnerable to the differential attack
with the same complexity.
The rest of this paper is organized as follows. A brief description of MMB is
given in Sect. 2. We recall the sandwich attack in Sect. 3. The fast sandwich
attack on MMB is introduced in Sect. 4. Section 5 describes the rectangle-like
attack on MMB. And Section 6 shows the improved differential attack. Finally,
we conclude the paper in Sect. 7.
Key Schedule. Let the 128-bit key of MMB be K = (k0, k1, k2, k3). The subkeys
are computed as
k_i^j = k_{(i+j) mod 4} ⊕ (B ≪ j),
where B = 0x0dae, k^j = (k0^j, k1^j, k2^j, k3^j) is the (j + 1)-th round subkey,
the k_i^j (i = 0, . . . , 3) are 32-bit words, and j = 0, . . . , 6.
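The subkey computation above is just a word selection plus a rotated-constant XOR; the following C sketch illustrates it, where the 32-bit left rotation of B and the word indexing are our reading of the formula rather than the reference implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the MMB subkey derivation k_i^j = k_((i+j) mod 4) XOR (B <<< j).
   The rotation width (32 bits) and the word layout are assumptions of this sketch. */
static uint32_t rotl32(uint32_t x, unsigned r) {
    r &= 31;
    return r ? (x << r) | (x >> (32 - r)) : x;
}

/* k: master key words (k0,k1,k2,k3); j: round index 0..6; i: word index 0..3 */
static uint32_t mmb_subkey(const uint32_t k[4], unsigned j, unsigned i) {
    const uint32_t B = 0x0dae;
    return k[(i + j) % 4] ^ rotl32(B, j);
}
```

For j = 0 the subkey of word i is simply k_i ⊕ B, which is the relation used later in the key-recovery step.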
The round function is X_{j+1} = θ ∘ η ∘ γ ∘ σ[k^j](X_j), where X_j is the 128-bit
input to the (j + 1)-th round, and X_0 is the plaintext. The ciphertext is denoted
as C = σ[k^6](X_6). The details of the four functions σ, γ, η, θ are given as follows.
1. σ[k^j] is a bitwise XOR operation with the round subkey.
2. γ is a word-wise modular multiplication:
γ(a0, a1, a2, a3) = (a0 ⊗ G0, a1 ⊗ G1, a2 ⊗ G2, a3 ⊗ G3),
where ⊗ denotes multiplication modulo 2^32 − 1. The factors G_i and their
inverses G_i^−1 = (G_i)^−1 mod (2^32 − 1), i = 0, 1, 2, 3, are fixed constants.
3. η is a data-dependent transformation of the outer words a0 and a3 involving
the constant δ.
4. θ is the linear mixing layer
θ(a0, a1, a2, a3) = (a3 ⊕ a0 ⊕ a1, a0 ⊕ a1 ⊕ a2, a1 ⊕ a2 ⊕ a3, a2 ⊕ a3 ⊕ a0).
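A minimal C sketch of the γ and θ layers follows, assuming ⊗ is multiplication modulo 2^32 − 1; the cipher's special-case treatment of the values 0 and 2^32 − 1 is deliberately omitted, so this illustrates the arithmetic only.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplication modulo 2^32 - 1, the operation written as an encircled times
   in the text. The folding below reduces the 64-bit product; the handling of
   the representative 2^32 - 1 (congruent to 0) is glossed over in this sketch. */
static uint32_t mul_mod32m1(uint32_t a, uint32_t b) {
    uint64_t p = (uint64_t)a * b;
    p = (p & 0xffffffffu) + (p >> 32);   /* fold high word back in */
    p = (p & 0xffffffffu) + (p >> 32);   /* at most one more carry */
    return (uint32_t)p;
}

/* theta: XOR each word with its two neighbours, exactly as in the formula. */
static void theta(uint32_t a[4]) {
    uint32_t t0 = a[3] ^ a[0] ^ a[1];
    uint32_t t1 = a[0] ^ a[1] ^ a[2];
    uint32_t t2 = a[1] ^ a[2] ^ a[3];
    uint32_t t3 = a[2] ^ a[3] ^ a[0];
    a[0] = t0; a[1] = t1; a[2] = t2; a[3] = t3;
}
```

Note that 2^32 ≡ 1 (mod 2^32 − 1), which is why folding the high word back into the low word performs the reduction.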
3 Sandwich Attack
The sandwich attack derives from the boomerang attack, and was used to break
the block cipher KASUMI efficiently in the related-key setting [10]. We give a
brief description of the boomerang attack and the sandwich attack.
E0(P) ⊕ E0(P′) = β, E0(P̂) ⊕ E0(P̂′) = β,
E0(P) ⊕ E0(P̂) = γ.
[Figure: the sandwich quartet. E = E1 ∘ EM ∘ E0; plaintexts P, P′, P̂, P̂′ with
differences α, intermediate values X, X̂ around EM with differences β (top
differential, probability p) and γ, values Y, Ŷ, and ciphertexts C, C′, Ĉ, Ĉ′
with differences ζ (bottom differential, probability q).]
The middle layer is a transition differential connecting the top and bottom
differentials. The probability of the transition differential is computed as
r = Pr((X̂ ⊕ X̂′ = β) | (Y ⊕ Ŷ = γ) ∧ (Y′ ⊕ Ŷ′ = γ) ∧ (X ⊕ X′ = β)),
or, equivalently,
r = Pr((Y′ ⊕ Ŷ′ = γ) | (X̂ ⊕ X̂′ = β) ∧ (X ⊕ X′ = β) ∧ (Y ⊕ Ŷ = γ)).
where ‘0’ denotes a 32-bit zero difference word, and 0̄ = 2^32 − 1 = 0xffffffff.
So α = γ = (0, 0̄, 0̄, 0), β = ζ = (0, 0̄ ⊕ δ, 0̄ ⊕ δ, 0), and Pr(α →(E0) β) = 1,
Pr(γ →(E1) ζ) = 1.
It remains to prove that the probability of the transition differential is 1, i.e.,
Pr((X̂ ⊕ X̂′ = β) | (Y ⊕ Ŷ = γ) ∧ (Y′ ⊕ Ŷ′ = γ) ∧ (X ⊕ X′ = β)) = 1.
Let X_i, X′_i, X̂_i and X̂′_i denote the i-th words of X, X′, X̂ and X̂′, for
i = 0, 1, 2, 3. The subkey of the third round is denoted as k̄ = (k̄0, k̄1, k̄2, k̄3).
Since θ and η are linear, we get
Y = θ ∘ η ∘ γ ∘ σ[k̄](X),
Y′ = θ ∘ η ∘ γ ∘ σ[k̄](X′),
Ŷ = θ ∘ η ∘ γ ∘ σ[k̄](X̂),
Ŷ′ = θ ∘ η ∘ γ ∘ σ[k̄](X̂′). (4)
((X1 ⊕ k̄1) ⊗ G1) ⊕ ((X̂1 ⊕ k̄1) ⊗ G1) = 0, (5)
((X2 ⊕ k̄2) ⊗ G2) ⊕ ((X̂2 ⊕ k̄2) ⊗ G2) = 0, (6)
((X′1 ⊕ k̄1) ⊗ G1) ⊕ ((X̂′1 ⊕ k̄1) ⊗ G1) = 0, (7)
((X′2 ⊕ k̄2) ⊗ G2) ⊕ ((X̂′2 ⊕ k̄2) ⊗ G2) = 0. (8)
((X0 ⊕ k̄0) ⊗ G0) ⊕ ((X̂0 ⊕ k̄0) ⊗ G0) = 0̄ ⊕ δ, (9)
((X3 ⊕ k̄3) ⊗ G3) ⊕ ((X̂3 ⊕ k̄3) ⊗ G3) = 0̄ ⊕ δ, (10)
((X′0 ⊕ k̄0) ⊗ G0) ⊕ ((X̂′0 ⊕ k̄0) ⊗ G0) = 0̄ ⊕ δ, (11)
((X′3 ⊕ k̄3) ⊗ G3) ⊕ ((X̂′3 ⊕ k̄3) ⊗ G3) = 0̄ ⊕ δ. (12)
X1 = X̂1, X2 = X̂2, X′1 = X̂′1, X′2 = X̂′2.
Practical Attack on the Full MMB Block Cipher 191
((X̂0 ⊕ k̄0) ⊗ G0) = ((X̂′0 ⊕ k̄0) ⊗ G0),
((X̂3 ⊕ k̄3) ⊗ G3) = ((X̂′3 ⊕ k̄3) ⊗ G3).
Then,
X̂0 ⊕ X̂′0 = 0,
X̂3 ⊕ X̂′3 = 0. (16)
Thus,
X̂ ⊕ X̂′ = (0, 0̄ ⊕ δ, 0̄ ⊕ δ, 0) = β.
Therefore,
r = Pr((X̂ ⊕ X̂′ = β) | (Y ⊕ Ŷ = γ) ∧ (Y′ ⊕ Ŷ′ = γ) ∧ (X ⊕ X′ = β)) = 1.
By computer search, both 0xfdff77ef →(G0) 0̄ ⊕ δ and 0xdffbfeef →(G3) 0̄ ⊕ δ
occur with probability about 2^−18, so the probability of the differential is about
2^−36.
We collect 2^38 plaintext pairs (P, P′) and their corresponding ciphertext pairs
(C, C′), where P and P′ satisfy P ⊕ P′ = α.
For each pair, we construct the quartet, and detect whether the quartet satisfies
the differentials. The details are as follows.
– For each collected plaintext-ciphertext pair ((P, C), (P′, C′)), calculate
Ĉ = C ⊕ (0, 0̄ ⊕ δ, 0̄ ⊕ δ, 0),
Ĉ′ = C′ ⊕ (0, 0̄ ⊕ δ, 0̄ ⊕ δ, 0).
– Query the decryption oracle to obtain P̂ = E^−1(Ĉ), P̂′ = E^−1(Ĉ′), and get the
quartet (P, P′, P̂, P̂′).
– For the constructed quartet (P, P′, P̂, P̂′), check whether P̂ ⊕ P̂′ = (∗, 0, 0, ∗)
holds, where ‘∗’ stands for any non-zero 32-bit value. If P̂ ⊕ P̂′ equals
(∗, 0, 0, ∗), output the quartet.
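The quartet construction reduces to XORs with the fixed ciphertext difference and a word-wise pattern check. The C sketch below illustrates this bookkeeping; the decryption queries themselves are oracle calls to the cipher and are not modeled, and the concrete value 0̄ ⊕ δ is passed in as a parameter rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>

/* Difference bookkeeping around the (not modeled) decryption oracle. */
typedef struct { uint32_t w[4]; } block_t;

/* Build the shifted ciphertext C ^ (0, d, d, 0), with d standing for 0bar^delta. */
static block_t shift_ct(block_t c, uint32_t d) {
    c.w[1] ^= d;
    c.w[2] ^= d;
    return c;
}

/* Check whether a decrypted pair has difference (*, 0, 0, *), with '*' non-zero. */
static int quartet_ok(block_t p, block_t q) {
    uint32_t d0 = p.w[0] ^ q.w[0], d1 = p.w[1] ^ q.w[1];
    uint32_t d2 = p.w[2] ^ q.w[2], d3 = p.w[3] ^ q.w[3];
    return d1 == 0 && d2 == 0 && d0 != 0 && d3 != 0;
}
```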
Partial Key Recovery. For the right quartet (P, P′, P̂, P̂′), we search for the
right subkey k0^0 among 2^32 candidates by the following equations:
C ⊕ Ĉ = (V1, V1 ⊕ V2, V1 ⊕ V2, V2),
where V1, V2 are non-zero 32-bit words. If the equation holds, output the
quartet.
Partial Key Recovery. We first recover 64 bits of the equivalent key k̄^6 of
k^6, i.e.,
k̄1^6 = k0^6 ⊕ k1^6 ⊕ k2^6,
k̄2^6 = k1^6 ⊕ k2^6 ⊕ k3^6.
We find the right subkey k̄1^6 by searching 2^32 candidates with the verification
of the equations
(G1^−1 ⊗ (C0 ⊕ C1 ⊕ C2 ⊕ k̄1^6)) ⊕ (G1^−1 ⊗ (C′0 ⊕ C′1 ⊕ C′2 ⊕ k̄1^6)) = 0̄ ⊕ δ,
(G1^−1 ⊗ (Ĉ0 ⊕ Ĉ1 ⊕ Ĉ2 ⊕ k̄1^6)) ⊕ (G1^−1 ⊗ (Ĉ′0 ⊕ Ĉ′1 ⊕ Ĉ′2 ⊕ k̄1^6)) = 0̄ ⊕ δ.
In a similar way, we search for the right subkey k̄2^6 among 2^32 candidates by
the following equations:
(G2^−1 ⊗ (C1 ⊕ C2 ⊕ C3 ⊕ k̄2^6)) ⊕ (G2^−1 ⊗ (C′1 ⊕ C′2 ⊕ C′3 ⊕ k̄2^6)) = 0̄ ⊕ δ,
(G2^−1 ⊗ (Ĉ1 ⊕ Ĉ2 ⊕ Ĉ3 ⊕ k̄2^6)) ⊕ (G2^−1 ⊗ (Ĉ′1 ⊕ Ĉ′2 ⊕ Ĉ′3 ⊕ k̄2^6)) = 0̄ ⊕ δ.
From the key schedule algorithm, we know that k0^0 = k0 ⊕ B, k3^0 = k3 ⊕ B,
k̄1^6 = k0 ⊕ k2 ⊕ k3 ⊕ (B ≪ 6), and k̄2^6 = k0 ⊕ k1 ⊕ k3 ⊕ (B ≪ 6). As a
result, we can compute the whole 128-bit key. Since each recovered subkey word
has 2 candidate values, 2^4 = 16 key candidates are obtained. We filter out the
right key with one known plaintext and its corresponding ciphertext.
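Once k0^0, k3^0, k̄1^6 and k̄2^6 are known, the master key words follow by XOR from these four relations; a C sketch of the inversion, in which the rotation convention is an assumption of the sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Invert the four key-schedule relations quoted above:
   k0^0 = k0 ^ B, k3^0 = k3 ^ B,
   kbar1^6 = k0 ^ k2 ^ k3 ^ (B <<< 6), kbar2^6 = k0 ^ k1 ^ k3 ^ (B <<< 6). */
static uint32_t rotl32(uint32_t x, unsigned r) {
    r &= 31;
    return r ? (x << r) | (x >> (32 - r)) : x;
}

static void recover_key(uint32_t k00, uint32_t k30, uint32_t kb16,
                        uint32_t kb26, uint32_t k[4]) {
    const uint32_t B = 0x0dae, B6 = rotl32(0x0dae, 6);
    k[0] = k00 ^ B;
    k[3] = k30 ^ B;
    k[2] = kb16 ^ k[0] ^ k[3] ^ B6;
    k[1] = kb26 ^ k[0] ^ k[3] ^ B6;
}
```

Each of the four input words has two candidate values in the attack, which is where the 2^4 = 16 key candidates come from.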
Complexity. The data complexity is 2^39 adaptively chosen plaintexts and ciphertexts.
The collection of the pairs dominates the time complexity, which is 2^40
MMB encryptions. Once a right quartet is obtained, the right subkey can be
computed, so the success rate is (0.98)^2 ≈ 0.96.
#Right Quartets    0     1     2     3     4     5     6    7    8    9   10   11   12
Experiment        23   106   202   252   273   185   137   86   30   17    5    4    0
Theory          24.1  96.7 193.4 257.8 257.8 206.3 137.5 78.5 39.2 17.4  6.9  2.5  0.8
Fig. 3. The Number of Right Quartets in Our Experiment and the Theory
Our experiment was carried out on an IBM X3950 M2 server with 64 Intel
Xeon E7330 2.4 GHz cores. The operating system is Red Hat 4.1.2-46 with
Linux kernel 2.6.18. The compiler is gcc 4.1.2, and we use the standard optimization
flags, with one thread per core. It takes about 1 hour to identify a right quartet
and recover the master key of MMB.
[Figure: the rectangle-like sandwich construction. Both pairs (P, P′) and
(P̂, P̂′) have plaintext difference α; β and γ are the intermediate differences
around EM, and ζ is the ciphertext difference for both pairs.]
0̄ ⊕ δ →(G1) τ →(G0) 0̄, (19)
0̄ ⊕ δ →(G2) τ →(G3) 0̄. (20)
By searching over all τ, the 5-round differential holds with probability
p1 · p2 = 2^−94. Because there are 16862718720 pairs that make the differential
characteristics (19) and (20) hold together, the probability is
16862718720/2^128 ≈ 2^−94.
We mount the 5-round differential path on rounds 1–5 of the 6 rounds. In the
rest of the section, we give the attack algorithm.
The Key Recovery Attack. We choose 2^96 pairs of plaintexts with difference
(0̄, 0, 0, 0̄); then there are 4 right pairs. The output difference of the 5-th round
for a right pair is (0, 0̄ ⊕ δ, 0̄ ⊕ δ, 0), so the difference of the ciphertexts should
be (V1, V1 ⊕ V2, V1 ⊕ V2, V2), where V1, V2 are non-zero 32-bit words. We use
this to sieve the ciphertext pairs, and there will be 2^96 · 2^−64 = 2^32 pairs left.
Furthermore, the input difference of the 6-th round is (0, 0̄ ⊕ δ, 0̄ ⊕ δ, 0), and the
number of possible output difference values given the input difference 0̄ ⊕ δ for
G1 or G2 is about 2^28.56. So there are 2^32 · 2^((28.56−32)×2) = 2^25.12 pairs satisfying
the output difference.
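The sieve amounts to two 32-bit word equalities plus two non-zero checks, which is exactly the 64-bit filter 2^96 · 2^−64 = 2^32; a C sketch of the structural test:

```c
#include <assert.h>
#include <stdint.h>

/* Test whether a ciphertext pair has difference (V1, V1^V2, V1^V2, V2)
   with V1, V2 non-zero: the middle words must agree, and they must equal
   the XOR of the two outer words. */
static int sieve_ok(const uint32_t c1[4], const uint32_t c2[4]) {
    uint32_t d[4];
    for (int i = 0; i < 4; i++) d[i] = c1[i] ^ c2[i];
    return d[0] != 0 && d[3] != 0 && d[1] == d[2] && d[1] == (d[0] ^ d[3]);
}
```

The equality d[1] == d[2] and the equality d[1] == d[0] ^ d[3] are the two 32-bit conditions; the non-zero checks only discard a negligible extra fraction.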
For each of the 2^25.12 pairs, we recover the key as in Subsect. 4.2. We calculate
the 32-bit words k̄1^6, k̄2^6 respectively, and increase the counter corresponding
to (k̄1^6, k̄2^6) by 1. For G1 and G2, the number of pairs with input difference
0̄ ⊕ δ and any given output difference is at most 2^14.28, so the maximum count
per counted pair of the wrong subkey words will be 2^14.28 · 2^14.28 = 2^28.56.
The signal-to-noise ratio is
S/N = (p · 2^k)/(α · β) = (2^−94 × 2^64)/(2^−64−6.88 × 2^28.56) = 2^10.32.
According to [12], the success probability is
Ps = ∫_t^∞ φ(x) dx = 0.9686, with t = −(√(μ · S/N) − Φ^−1(1 − 2^−a))/√(S/N + 1),
where φ denotes the density of the standard normal distribution, Φ its cumulative
distribution function, μ the number of right pairs and a the advantage in bits [12].
7 Conclusion
In this paper, we construct a 5-round sandwich distinguisher for MMB that holds
with probability 1. With the distinguisher, we recover the 128-bit key of MMB with
2^39 adaptively chosen plaintexts and ciphertexts and 2^40 MMB encryptions. On this
basis, we present a rectangle-like sandwich attack on MMB, with 2^66.5 chosen
plaintexts, 2^66.5 MMB encryptions and 2^70.5 bytes of memory. Besides, we improve
the differential attack on MMB in [13]: the data complexity is 2^96 chosen plaintexts,
the time complexity is 2^96 MMB encryptions and the memory complexity
is 2^66 bytes. We summarize the results on MMB in Table 2.
References
1. Biham, E., Shamir, A.: Differential Cryptanalysis of the Data Encryption Standard.
Springer, London (1993)
2. Biham, E., Dunkelman, O., Keller, N.: The Rectangle Attack - Rectangling
the Serpent. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045,
pp. 340–357. Springer, Heidelberg (2001)
3. Biham, E., Dunkelman, O., Keller, N.: A Related-Key Rectangle Attack on the
Full KASUMI. In: Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 443–461.
Springer, Heidelberg (2005)
4. Biryukov, A., De Cannière, C., Dellkrantz, G.: Cryptanalysis of SAFER++. In:
Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 195–211. Springer, Heidelberg
(2003)
5. Biryukov, A., Khovratovich, D.: Related-Key Cryptanalysis of the Full AES-192
and AES-256. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 1–18.
Springer, Heidelberg (2009)
6. Kelsey, J., Kohno, T., Schneier, B.: Amplified Boomerang Attacks Against
Reduced-Round MARS and Serpent. In: Schneier, B. (ed.) FSE 2000. LNCS,
vol. 1978, pp. 75–93. Springer, Heidelberg (2001)
7. Daemen, J., Govaerts, R., Vandewalle, J.: Block Ciphers Based on Modular Mul-
tiplication. In: Wolfowicz, W. (ed.) Proceedings of 3rd Symposium on State and
Progress of Research in Cryptography, Fondazione Ugo Bordoni, pp. 80–89 (1993)
8. Daemen, J.: Cipher and Hash Function Design Strategies based on Linear and
Differential Cryptanalysis. PhD Thesis, Dept. Elektrotechniek, Katholieke Univer-
siteit Leuven, Belgium (1995)
9. Lai, X., Massey, J.: A Proposal for a New Block Encryption Standard. In: Damgård,
I.B. (ed.) EUROCRYPT 1990. LNCS, vol. 473, pp. 389–404. Springer, Heidelberg
(1991)
10. Dunkelman, O., Keller, N., Shamir, A.: A Practical-Time Attack on the A5/3
Cryptosystem Used in Third Generation GSM Telephony,
http://eprint.iacr.org/2010/013
11. Dunkelman, O., Keller, N., Shamir, A.: A Practical-Time Related-Key Attack on
the KASUMI Cryptosystem Used in GSM and 3G Telephony. In: Rabin, T. (ed.)
CRYPTO 2010. LNCS, vol. 6223, pp. 393–410. Springer, Heidelberg (2010)
12. Selçuk, A.A., Biçak, A.: On Probability of Success in Linear and Differential Crypt-
analysis. In: Cimato, S., Galdi, C., Persiano, G. (eds.) SCN 2002. LNCS, vol. 2576,
pp. 174–185. Springer, Heidelberg (2003)
13. Wang, M., Nakahara Jr., J., Sun, Y.: Cryptanalysis of the Full MMB Block Ci-
pher. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS,
vol. 5867, pp. 231–248. Springer, Heidelberg (2009)
14. Wagner, D.: The Boomerang Attack. In: Knudsen, L.R. (ed.) FSE 1999. LNCS,
vol. 1636, pp. 156–170. Springer, Heidelberg (1999)
Conditional Differential Cryptanalysis
of Trivium and KATAN
S. Knellwolf, W. Meier, and M. Naya-Plasencia
FHNW, Switzerland
1 Introduction
The stream cipher Trivium and the KATAN family of block ciphers are
lightweight cryptographic primitives dedicated to hardware implementation.
They share a very similar structure based on non-linear feedback shift registers
(NLFSR). In [12], conditional differential cryptanalysis, first introduced in [3],
has been applied to such constructions. The idea is to control the propagation
of differences by imposing conditions on the public variables of the cipher.
Depending on whether these conditions involve secret variables or not, key-recovery
or distinguishing attacks can be mounted. The technique extends to higher order
differential cryptanalysis. A similar concept is the dynamic cube attack pre-
sented in [9]. Deriving the conditions by hand is a time consuming and error
prone task. In this paper we use automatic tools to find and simplify these con-
ditions. The method is applied to KATAN and Trivium. In both cases we obtain
new cryptanalytic results.
In the single-key scenario, the KATAN family was already analyzed with respect
to conditional differential cryptanalysis in [12]. Table 1 summarizes the results
and compares them to the results in the related-key scenario presented in this
paper.

Supported by the Hasler Foundation www.haslerfoundation.ch under project
number 08065.
Supported by the National Competence Center in Research on Mobile Information
and Communication Systems (NCCR-MICS), a center of the Swiss National Science
Foundation under grant number 5005-67322.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 200–212, 2012.
© Springer-Verlag Berlin Heidelberg 2012
Conditional Differential Cryptanalysis of Trivium and KATAN 201
Table 1. Cryptanalytic results for KATAN. All attacks have practical complexity and
recover parts of the key. The results in the single-key scenario also apply to KTANTAN.

The question about the related-key security of KATAN was raised
by very efficient such attacks on KTANTAN [1]. The KTANTAN family of
block ciphers differs from KATAN only by its key scheduling. The latter has
shown some vulnerability which was also exploited for a meet-in-the-middle
attack [4].
The most relevant cryptanalytic results on Trivium are obtained by cube
attacks [8] and cube testers [2,15]. Our analysis can be seen as a refinement
of cube testers. Exploiting these refinements for Trivium is the subject of the
second part of this paper. Table 2 summarizes the results and compares them
to existing analysis.
The γi are either 0 or 1 and describe the differential characteristic. Values x that
satisfy all conditions are called valid. The goal is to find a large sample of valid
inputs X , such that a bias can be detected in the output of Δa f on X . The
conditions may also involve variables of the key. This allows for key recovery or
classification of weak keys.
The technique extends to higher order derivatives (corresponding to higher order
differential cryptanalysis). The d-th derivative of f with respect to a1, . . . , ad
is defined as
Δ^(d)_{a1,...,ad} f(k, x) = ⊕_{c ∈ L(a1,...,ad)} f(k, x ⊕ c),
where L(a1, . . . , ad) denotes the linear span of a1, . . . , ad.
¹ The algebraic variety of J is the set {(k, x) | f(k, x) = 0 for all f ∈ J}.
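The definition can be evaluated directly by XOR-summing f over the 2^d points x ⊕ c with c ranging over the span of the differences. A toy C sketch with the key fixed inside f (the function f_and is an illustrative stand-in, not a cipher):

```c
#include <assert.h>
#include <stdint.h>

/* d-th derivative of f at x with respect to differences a[0..d-1]:
   XOR of f over all 2^d linear combinations of the differences. */
static int deriv(int (*f)(uint32_t), uint32_t x, const uint32_t *a, int d) {
    int acc = 0;
    for (uint32_t m = 0; m < (1u << d); m++) {
        uint32_t c = 0;
        for (int i = 0; i < d; i++)
            if (m & (1u << i)) c ^= a[i];
        acc ^= f(x ^ c);
    }
    return acc;
}

/* Example function: the product (AND) of the two lowest input bits. */
static int f_and(uint32_t x) { return (x & 1) & ((x >> 1) & 1); }
```

As expected from the theory, the second derivative of the degree-2 function f_and with respect to the unit vectors e0, e1 is the constant 1, and any derivative with respect to a variable it does not depend on is 0.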
204 S. Knellwolf, W. Meier, and M. Naya-Plasencia
where k0 , . . . , k79 are the bits of k. At each round of the encryption process
two consecutive bits of the expanded key are used. The round updates further
depend on a bit ci . The sequence of ci is produced by an 8-bit linear feedback shift
register which is used as a counter. It is initialized by (c0, . . . , c7) = (1, . . . , 1, 0)
and expanded according to ci+8 = ci + ci+1 + ci+3 + ci+5. Round i, for 0 ≤ i < 254,
corresponds to the following transformation of the state:
After 254 rounds, the state is output as the ciphertext. All three members of
the KATAN family use the same key expansion and the same sequence of ci .
The algebraic structure of the non-linear update functions is the same. They
differ in the length of the non-linear registers and the tap positions for the non-
linear update functions. All members perform 254 rounds, but for KATAN48 the
non-linear registers are updated twice per round and for KATAN64 even thrice
(using the same ci and ki for all updates at the same round).
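The 8-bit counter can be sketched in C as follows; the recurrence c_{i+8} = c_i ⊕ c_{i+1} ⊕ c_{i+3} ⊕ c_{i+5} used here is our reading of the (damaged) formula in the text and should be checked against the KATAN specification.

```c
#include <assert.h>
#include <stdint.h>

/* Generate the first n bits of the round-constant sequence c_i:
   state starts as (1,...,1,0) and steps by
   c_{i+8} = c_i ^ c_{i+1} ^ c_{i+3} ^ c_{i+5} (an assumption of this sketch). */
static void ir_sequence(int *c, int n) {
    for (int i = 0; i < 8 && i < n; i++) c[i] = (i < 7) ? 1 : 0;
    for (int i = 8; i < n; i++)
        c[i] = c[i - 8] ^ c[i - 7] ^ c[i - 5] ^ c[i - 3];
}
```

With this recurrence, the run of seven leading ones is followed by 0 0 1 1 0 1 0 1, so the same c_i values can be replayed for the (two or three) register updates per round in KATAN48 and KATAN64.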
As in the analysis of KATAN in [12] we use first order differentials. The basic
strategy is as follows:
1. Find a key difference b whose expansion does not introduce differences for
many rounds after some round r. The idea is to cancel all differences intro-
duced by b up to round r and to maximize the number of rounds, where no
differences are introduced again.
2. Compute backwards from round r in order to find a plaintext difference a
that cancels the differences introduced by b. This fixes a differential
characteristic.
3. Use Algorithm 1 to compute the ideal J, describing the conditions for the
characteristic to be followed.
4. Derive a sample of valid plaintexts and empirically find the maximal number
of rounds for which a bias can be detected in the ciphertext differences.
The automated techniques for condition analysis allow us to test many configurations
for a and b. The maximal number of consecutive rounds for which b does not
introduce differences is 39 (the key expansion is an 80-bit linear feedback shift
register with maximum period, and two bits are used per round). It is easy to compute
differences which have this maximal run of zeros at any desired round r, and
the choice of b essentially reduces to a choice of r. We try to find the largest r
that can be controlled by conditions. If key bits are involved in the conditions,
several samples will be derived and tested for the correct guess.
Table 4. Differential characteristic for a = [6, 9, 19] and b = [6, 14, 25, 44]
All pairs (k, x) in the algebraic variety of J will follow the characteristic given
in Table 4. The conditions involve 10 bits of the key which cannot be chosen.
However, we can guess them and adjust x accordingly. It is not difficult to derive
a sample of 2^20 valid inputs for each guess. One adjusts a linear variable of each
condition in order to nullify the expression. The remaining variables can be freely
chosen. The correct guess is detected by a significant bias in the difference of
state bit 18 after 120 rounds. Testing one sample costs 2^21 queries and at most
2^10 samples have to be tested. Hence, the attack needs no more than 2^31 queries
to the cipher. The number of distinct queries can be even smaller, since the
samples for the different guesses may overlap. The attack recovers 10 bits of the
key, and we note that the recovered bits are essentially those of the first few
rounds. This enables us to mount the same procedure starting at a later round,
and finally to recover the full key at essentially the same cost.
No output is produced during the first 1152 rounds. After this initialization
phase the value of z is output as the key stream at each round.
It turns out that differences of Hamming weight one give the best results. That
is, the a1, . . . , ad are unit vectors in F_2^n. We note that this special case of a higher
order derivative is called a superpoly in [2]. Some heuristic techniques for choosing
the differences have been proposed. We use none of them, but briefly explain
our choice. The propagation of the single differences should be as independent
as possible. This excludes, for example, choosing two differences at a distance of
one. Such neighboring differences influence each other in the very early rounds
due to the quadratic monomials in the update functions. Further, the regular
structure of Trivium suggests a regular choice of the differences. Motivated by an
observation in [14] we chose the differences at a distance of three. Empirical tests
confirmed that this choice indeed outperforms all other choices. Specifically, we
choose a_i = e_{3(i−1)} for 1 ≤ i ≤ 24, where (e0, . . . , e_{n−1}) is the standard basis of
F_2^n. In the following we use the shorthand Δz_j = Δ^(24)_{a1,...,a24} z_j, where z_j is the
keystream produced in round j. (In the terminology of [2], Δz_j corresponds to
the superpoly of {x0, x3, . . . , x69}.)
For the condition analysis we use Algorithm 3 with r = 200, that is, each dif-
ference is controlled for the first 200 rounds. After processing the first difference
(the difference in x0 ) we obtain
At this stage, J has the following interpretation: all pairs (k, x) in the algebraic
variety of J follow the same differential characteristic up to round r = 200 with
respect to a1. We already note that two conditions cannot be satisfied by the
attacker, since they only involve bits of the key. After processing the remaining
differences we have
All conditions collapse to conditions on single bits. From x, only the bits x72 , x75
and x78 are not fixed by conditions and not touched by the differences. This
makes them candidate neutral bits for Δzj , when all other variables xi are set
to zero. Empirical results confirm that they are probabilistically neutral up to
round 798. Table 6 shows the neutrality which we obtained in an experiment
with 100 random keys. Note that a neutrality of zero means that Δzj is linear in
the corresponding variable (which can be exploited as a distinguishing property
in the same way as neutrality).
Table 6. Neutrality of the bits x72, x75 and x78 (100 random keys)

round j   x72    x75    x78
  772     1.00   1.00   1.00
  782     0.05   0.10   0.05
  789     0.30   0.20   0.25
  798     0.40   0.40   0.30
Table 7. Neutrality of the bits x72 , x75 and x78 for weak keys

round j   x72    x75    x78
  953     1.00   1.00   1.00
  961     0.00   0.50   1.00
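The neutrality figures in Tables 6 and 7 are estimated as the fraction of sampled inputs on which flipping the candidate bit leaves the output unchanged. A generic toy estimator in C follows; the functions under test are stand-ins for Δz_j and the PRNG is an arbitrary xorshift, both assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate the neutrality of input bit 'bit' for the 0/1-valued function dz:
   the fraction of sampled inputs on which flipping the bit does not change
   the output. dz stands in for the real output difference, not implemented here. */
static double neutrality(int (*dz)(uint64_t), int bit, int trials) {
    uint64_t s = 0x123456789abcdef1ull;  /* xorshift64 PRNG state */
    int same = 0;
    for (int t = 0; t < trials; t++) {
        s ^= s << 13; s ^= s >> 7; s ^= s << 17;
        if (dz(s) == dz(s ^ (1ull << bit))) same++;
    }
    return (double)same / trials;
}

/* Two toy functions: one fully neutral in bit 0, one fully sensitive to it. */
static int ignores_bit0(uint64_t x) { return (int)((x >> 1) & 1); }
static int equals_bit0(uint64_t x)  { return (int)(x & 1); }
```

A neutrality of 1.00 (as at rounds 772 and 953) means the bit never affects the output on the sample; a value of 0.00 means the output is linear in that bit, which, as noted above, is equally usable as a distinguishing property.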
5 Conclusion
We evaluated the security of Trivium and KATAN with respect to conditional
differential cryptanalysis. We used an automatic approach to find and analyze
the conditions in terms of polynomial ideals. For reduced Trivium we identified
a class of 2^26 keys that can be distinguished for 961 of 1152 rounds. For reduced
KATAN we presented a key-recovery attack up to 120 of 254 rounds in a related-key
scenario. KATAN seems to have a comfortable security margin with respect
to the approach described in this paper.
References
1. Ågren, M.: Some Instant- and Practical-Time Related-Key Attacks on KTAN-
TAN32/48/64. In: Miri, A., Vaudenay, S. (eds.) SAC 2011. LNCS, vol. 7118,
pp. 217–233. Springer, Heidelberg (2011)
2. Aumasson, J.-P., Dinur, I., Meier, W., Shamir, A.: Cube Testers and Key Recovery
Attacks on Reduced-Round MD6 and Trivium. In: Dunkelman, O. (ed.) FSE 2009.
LNCS, vol. 5665, pp. 1–22. Springer, Heidelberg (2009)
3. Ben-Aroya, I., Biham, E.: Differential Cryptanalysis of Lucifer. In: Stinson, D.R.
(ed.) CRYPTO 1993. LNCS, vol. 773, pp. 187–199. Springer, Heidelberg (1994)
4. Bogdanov, A., Rechberger, C.: A 3-Subset Meet-in-the-Middle Attack: Cryptanal-
ysis of the Lightweight Block Cipher KTANTAN. In: Biryukov, A., Gong, G.,
Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 229–240. Springer, Heidel-
berg (2011)
5. Brickenstein, M., Dreyer, A.: PolyBoRi: A framework for Groebner-basis com-
putations with Boolean polynomials. Journal of Symbolic Computation 44(9),
1326–1345 (2009)
6. De Cannière, C.: trivium: A Stream Cipher Construction Inspired by Block Cipher
Design Principles. In: Katsikas, S.K., López, J., Backes, M., Gritzalis, S., Preneel,
B. (eds.) ISC 2006. LNCS, vol. 4176, pp. 171–186. Springer, Heidelberg (2006)
7. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — A
Family of Small and Efficient Hardware-Oriented Block Ciphers. In: Clavier, C.,
Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg
(2009)
8. Dinur, I., Shamir, A.: Cube Attacks on Tweakable Black Box Polynomials. In: Joux,
A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 278–299. Springer, Heidelberg
(2009)
9. Dinur, I., Shamir, A.: Breaking Grain-128 with Dynamic Cube Attacks. In: Joux,
A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 167–187. Springer, Heidelberg (2011)
10. ECRYPT: The eSTREAM project, http://www.ecrypt.eu.org/stream/
11. Fischer, S., Khazaei, S., Meier, W.: Chosen IV Statistical Analysis for Key Recovery
Attacks on Stream Ciphers. In: Vaudenay, S. (ed.) AFRICACRYPT 2008. LNCS,
vol. 5023, pp. 236–245. Springer, Heidelberg (2008)
12. Knellwolf, S., Meier, W., Naya-Plasencia, M.: Conditional Differential Cryptanaly-
sis of NLFSR-Based Cryptosystems. In: Abe, M. (ed.) ASIACRYPT 2010. LNCS,
vol. 6477, pp. 130–145. Springer, Heidelberg (2010)
13. Lai, X.: Higher order derivatives and differential cryptanalysis. In: Blahut, R.E.,
Costello, D.J., Maurer, U., Mittelholzer, T. (eds.) Communicationis and Cryp-
tography: Two Sides of one Tapestry, pp. 227–233. Kluwer Academic Publishers
(1994)
14. Maximov, A., Biryukov, A.: Two Trivial Attacks on Trivium. In: Adams, C., Miri,
A., Wiener, M. (eds.) SAC 2007. LNCS, vol. 4876, pp. 36–55. Springer, Heidelberg
(2007)
15. Stankovski, P.: Greedy Distinguishers and Nonrandomness Detectors. In: Gong, G.,
Gupta, K.C. (eds.) INDOCRYPT 2010. LNCS, vol. 6498, pp. 210–226. Springer,
Heidelberg (2010)
Some Instant- and Practical-Time Related-Key
Attacks on KTANTAN32/48/64
Martin Ågren
1 Introduction
KTANTAN is a hardware-oriented block cipher designed by De Cannière,
Dunkelman and Knežević. It is part of the KATAN family [4] of six block ci-
phers. There are three variants KTANTANn where n ∈ {32, 48, 64}. All ciphers
consist of 254 very simple, hardware-efficient rounds.
The only difference between KATAN and KTANTAN is the key schedule. The
goal with KTANTAN is to allow an implementation to use a burnt-in key, which
rules out loading the key into a register and applying some state updates to it in
order to produce subkeys. Instead, subkeys are chosen as original key bits, selected
according to a fixed schedule. This schedule is the same for all three variants.
Aiming for a lightweight cipher, the designers of KTANTAN did not pro-
vide the key schedule as a large table of how to select the key bits. Rather,
a small state machine generates numbers between 0 and 79. In this way, key
bits can hopefully be picked in an irregular fashion. As shown by Bogdanov and
Rechberger [3], the sequence in which the key bits are used has some unwanted
properties.

A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 213–229, 2012.
© Springer-Verlag Berlin Heidelberg 2012
214 M. Ågren
We will revisit the result of Bogdanov and Rechberger. We adjust the pre-
sentation slightly, before using their observation to launch a related-key attack.
Bogdanov and Rechberger noted this as a possible direction of research, but did
not look into it further.
Related-key attacks have been known for almost twenty years [5,1]. Like most
other related-key attacks, the ones presented in this paper are quite academic
in their nature. They are still a good measurement of the security of the cipher,
which should appear as an ideal permutation, and several notable properties
make the attacks in this paper very interesting:
1. They are minimal: they only require flipping one bit in the key and in several
cases, it is enough for the attacker to use only one triplet: one plaintext and
two ciphertexts.
2. They are extreme: we find a large number of key bits in time equivalent to
just a few encryptions. For KTANTAN32, the entire key can be found in
half a minute on a current CPU.
3. They never fail: All the properties exploited in this paper have probability
one, meaning the correct (partial) key always shows the property we look
for.
4. They directly contradict the designers’ claims. We will discuss why this is,
and what can be learnt from this.
The remainder of this paper is organized as follows: In Section 2 we describe the
cipher KTANTAN, and Section 3 introduces (truncated) differentials. Section 4
discusses the result by Bogdanov and Rechberger [3]. Section 5 develops our
attacks on KTANTAN32, while we summarize our results on KTANTAN48 and
KTANTAN64 in Section 6. In Section 7 we compare our results to the designers’
original claims on related-key security before concluding the paper in Section 8.
2 KTANTAN
The n-bit plaintext P = pn−1 . . . p0 is loaded into the state of the cipher, which
consists of two shift registers, L1 and L2 , see Fig. 1. For KTANTAN32, these
are of lengths |L1 | = 13 and |L2 | = 19. The other variants use longer registers.
The 254 rounds are denoted as round 0, 1, . . . , 253. Each round uses two key
bits, kar and kbr , which are picked straight from the 80-bit master key. The key
schedule is provided in Appendix A.
The contents of the registers are shifted, and the new bit in each register
(L1 /L2 ) is created from five or six bits from the other register (L2 /L1 ), through
some simple functions of degree two. For all versions of KTANTAN, the update
is specified by
fa (L1 ) = L1 [x1 ] ⊕ L1 [x2 ] ⊕ (L1 [x3 ] · L1 [x4 ]) ⊕ (L1 [x5 ] · IRr ) ⊕ kar
fb (L2 ) = L2 [y1 ] ⊕ L2 [y2 ] ⊕ (L2 [y3 ] · L2 [y4 ]) ⊕ (L2 [y5 ] · L2 [y6 ]) ⊕ kbr .
Related-Key Attacks on KTANTAN 215
[Fig. 1 schematic: register L1 occupies bit positions 31..19 and L2 positions
18..0; fa takes its taps from L1 (together with IRr and kar), and fb takes its
taps from L2 (together with kbr).]
Fig. 1. An overview of KTANTAN32. In each clocking, one shift is made and two
key bits, kar and kbr , are added to the state. IRr is a round constant which decides
whether or not L1 [3] is used in the state update. Indices denote how bits in the
plaintext/ciphertext are identified. L1 is shifted to the right and L2 to the left.
n |L1 | |L2 | x1 x2 x3 x4 x5 y1 y2 y3 y4 y5 y6
32 13 19 12 7 8 5 3 18 7 12 10 8 3
48 19 29 18 12 15 7 6 28 19 21 13 15 6
64 25 39 24 15 20 11 9 38 25 33 21 14 9
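Using the n = 32 row of the table, the two feedback functions and one round can be sketched in C; the shift directions and the convention that fa(L1) enters L2 while fb(L2) enters L1 are our reading of Fig. 1, so this is an illustrative sketch rather than reference code.

```c
#include <assert.h>
#include <stdint.h>

/* Registers held as plain bit arrays: bit i of the word is position i of the
   register, matching the tap indices in the table (n = 32 row). */
#define BIT(r, i) (((r) >> (i)) & 1u)

static uint32_t fa(uint32_t L1, int ir, int ka) {
    return BIT(L1, 12) ^ BIT(L1, 7) ^ (BIT(L1, 8) & BIT(L1, 5))
         ^ (BIT(L1, 3) & (uint32_t)ir) ^ (uint32_t)ka;
}

static uint32_t fb(uint32_t L2, int kb) {
    return BIT(L2, 18) ^ BIT(L2, 7) ^ (BIT(L2, 12) & BIT(L2, 10))
         ^ (BIT(L2, 8) & BIT(L2, 3)) ^ (uint32_t)kb;
}

/* One round of KTANTAN32 under the conventions stated above. */
static void round32(uint32_t *L1, uint32_t *L2, int ir, int ka, int kb) {
    uint32_t na = fa(*L1, ir, ka);       /* feeds the other register, L2 */
    uint32_t nb = fb(*L2, kb);           /* feeds L1 */
    *L1 = ((*L1 << 1) | nb) & 0x1fffu;   /* 13-bit register */
    *L2 = ((*L2 << 1) | na) & 0x7ffffu;  /* 19-bit register */
}
```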
We denote the key K = k79 . . . k0 as in [3]. Test vectors for KTANTAN can
be produced by the reference code. As an example, the all-ones key and the
all-zeros plaintext produce the ciphertext 0x22ea3988. Unfortunately, this does
not highlight the bit order in the plaintext and, more importantly, the key. For
completeness and using the reference code given by the designers, we thus pro-
vide the key 0xfffffffffffffffffffe, plaintext 0x00000001, and ciphertext
0x8b4f0824 to indicate the bit orders involved.
3 (Truncated) Differentials
Differential cryptanalysis was publicly introduced by Biham and Shamir [2] in
1990. The idea is to study how a difference in the plaintext propagates through
the state of the encryption. If a partial key is correctly guessed, this property
should show up with some probability — ideally one but often very close to one
half — while a bad guess should lead to a more random behaviour.
Knudsen [6] extended the technique to truncated differentials, where similar
properties are studied only in some part of the state.
In [3], a differential is denoted by (ΔP, ΔK) → ΔS, where a difference in
the plaintext and key gives a difference in the state some number of rounds
into the encryption. We adopt and extend this notation. To denote truncated
differentials, i.e., differentials where we only know the differences in certain bit
positions, we will use a mask and a value denoted [mask : value]. As an example,
[00010a00:00010800] denotes a known difference in bits 16, 11, and 9. In bits
16 and 11, there is a difference, while there is a bit-equality in bit 9. For the
other bits, we do not know or care about the difference. In pseudo-C code, such
a mask-value pair could be used to identify a match by
if ( ((s1^s2)&mask) == value ) { ... }.
In this paper, ΔK always involves only a single bit, so we will name this bit
specifically, e.g., as in (0, k32 ) → [08075080 : 00000080].
With each (truncated) differential, there is also a probability that it holds. In
this paper, we only use differentials with probability one, which means there are
only false positives, which can be ruled out by repeated filtering, and no false
negatives. As a result, all attacks given in this paper have probability one of
succeeding. When we give data complexities, these will be the expected number
of samples needed to obtain a unique solution. Similarly, time complexities will
account for the work needed to rule out false alarms. We assume that an alarm
is raised with probability 2^-b for a differential that involves b bits.
Due to the unicity distance, we will always need some extra material in order
to find a unique key. This is a fundamental property of KTANTAN as we can
only access plaintexts and ciphertexts of 32 to 64 bits, but want to find a key
consisting of 80 bits.
Table 2. The nine most extreme key bits in both directions during encryption. Six
bits do not appear before round 111, while six others are not used after round 131.
Key bit Used first in round Key bit Used last in round
k13 109 k38 164
k27 110 k46 158
k59 110 k15 157
k39 111 k20 131
k66 123 k74 130
k75 127 k41 122
k44 136 k3 106
k61 140 k47 80
k32 218 k63 79
Differential Probability
(0, k32 ) → [00020000 : 00020000] .687 = .5 + .187
(0, k32 ) → [40000000 : 00000000] .640 = .5 + .140
(0, k32 ) → [40020000 : 00020000] .453 = .25 + .203
5. For the 2^12 combinations, check twelve specific bits for equality:
   if ( ((m_i ^ m'_j) & 0x2a03cd44) == 0 ) { ... }.
   Alarms will be raised with probability 2^-12, so we expect one alarm.
6. Use some additional plaintext–ciphertext pairs to rule out false alarms.
An implementation improvement is to only calculate those 12 bits that we actu-
ally need. We have then reached something similar to the original formulation of
the attack, with the notable difference that we only perform the computations
involved in matching (φ_{111,127}, φ^{-1}_{127,132}) once, during the 2^6-parts. (We can split
at any round between and including 123 and 127, and still get twelve known
(but different) bit positions to look at, but opted for 127 as it makes both halves
equally expensive to calculate.)
We attack these subsets one after another, i.e., when attempting to find the
correct bit assignments for Aj , we assume that we already know the correct bit
assignments for Ai , i = 0, . . . , j − 1. We then follow this simple outline:
1. Guess the bit assignments for Aj .
2. If the (truncated) differential matches, we have a candidate subkey.
3. If the (truncated) differential does not match, we discard the candidate sub-
key.
In the first step, we can make 2^{s_j} guesses for the subkey. Note that the last step
can be performed without risk, since all our differentials have probability one.
Due to this, we can immediately discard large numbers of guesses.
The second step of the attack can however give false positives. As already
noted, we assume that an alarm is raised with probability 2−b for a differential
that involves b bits. To discard the false alarms, we can recheck the differential
on more material.
After finding the key bits specified by ∪_i A_i, we can conclude by a brute force
for the remaining 80−s key bits. The total complexity would be 2^{s_0} + · · · + 2^{s_{l−1}} +
2^{80−s}. However, the different operations in these terms have different costs. All
time complexities in this paper will be normalized to KTANTAN calls, and also
incorporate the expected increase of calculations due to false positives. We will
denote this time measurement t and it will, depending on context, refer to the
time required to recover either the full key or only some part of it.
Table 4. Key bits recovered in Sections 5.3 and 5.5. In the second set, the 11
reappearing key bits have been underlined.
The 28 key bits guessed and found in Section 5.3, exploiting k32 .
{k0 , k1 , k2 , k4 , k5 , k7 , k8 , k11 , k12 , k14 , k16 , k17 , k22 , k27 , k29 ,
k32 , k34 , k55 , k56 , k60 , k62 , k64 , k66 , k68 , k69 , k71 , k73 , k75 }
The 40 key bits guessed and found in Section 5.5, exploiting k63 .
{k7 , k10 , k11 , k14 , k15 , k17 , k19 , k21 , k22 , k25 , k26 , k28 , k30 , k31 ,
k34 , k35 , k37 , k38 , k40 , k41 , k43 , k45 , k47 , k49 , k52 , k53 , k54 ,
k58 , k60 , k62 , k63 , k67 , k68 , k69 , k70 , k71 , k74 , k76 , k77 , k79 }
probability. However, the very last bit we guess, k37 , will only enter linearly, and
if the other 41 key bits are correct, we will have S0 = S1 no matter how we guess
k37 .
Generalizing, we realize that the bits which enter the partial decryption “late”
will not affect the comparison of S0 and S1 at all as they enter only linearly. We
have found that there are only 28 key bits that affect the equality between S0
and S1 . These bits are listed in Table 4.
We thus need to guess 28 bits and for each guess perform two partial
decryptions of 32 out of 254 rounds. The total number of round function calls is
expected to be 2^28 · 2 · 32 = 2^34, which corresponds to 2^34/254 ≈ 2^26.01 full
KTANTAN evaluations. Thus the total time complexity of finding 28 bits is
t ≈ 2^26. All time complexities in the remainder of the paper will be calculated
in this way.
By using brute-force for the remaining key bits, the entire key can be found
in time t ≈ 2^26 + 2^62 ≈ 2^62.
k32 appeared very late in the encryption, and we exploited this above. Similarly,
k63 is only used in the first 80 rounds, meaning that during decryption it shows
similar properties. With one ciphertext and two plaintexts, corresponding to a
secret key with a flipped k63 , we can launch an attack similar to that above, with
a truncated differential involving a single bit. With A0 and using φ0,43 , we guess
and obtain 40 bits, listed in Table 4, using 40 data and t ≈ 2^39.44. We can then
exploit k63 for more subsets A1, . . . , A15 and partial encryptions φ0,45, . . . , φ0,71,
finding in total 65 bits of the key still with t ≈ 2^39.44. Concluding with a brute
force for the remaining bits, we can find the entire key in t ≈ 2^39.44 + 2^15 ≈ 2^39.44.
All subsets, truncated differentials, etc. can be found in Table 6.
Table 5. The differentials used on KTANTAN32 in this paper. PCC means that
the differential is of type (ΔP, ΔK) → ΔS, where S is the state some rounds into
the encryption. Similarly, CPP means a differential (ΔC, ΔK) → ΔS, extending some
rounds into the decryption. (The 'Rounds' column then denotes the round into which
we decrypt, not the number of decryption rounds.) The '#Key bits' column counts
how many key bits need to be guessed. We also give the reduced number of guessed
key bits in Aj when we have already acquired a part of the key, ∪i<j Ai , by using the
differentials found earlier in the table.
Table 6. The attack parameters for finding 65 key bits with t ≈ 2^39.44, exploiting k63
We have used a greedy approach for finding the differentials used in this paper. As
an example, on φ0,248 , there is the truncated differential (0, k32 ) → [00021000 :
00001000], but due to the slow diffusion we cannot find any key bits using it
with probability one. This forces us to use the differential (0, k32 ) → [80050800 :
00000800] on φ0,247 , where three key bits affect the differential so all three bits
8 Conclusion
We have presented several weaknesses related to the key schedule of KTANTAN.
We first noted how the exceptionally weak key bit k32 allowed for a nonrandom-
ness result on KTANTAN32.
As the main result, we then derived several related-key attacks allowing for
(partial-)key recovery: with a single triplet, 3 bits can be found in time 2^-0.90
and 28 bits can be obtained in time 2^3.0. Using one triplet in the backward
and 29 in the forward direction, the full 80-bit key is recovered in time 2^28.47.
226 M. Ågren
Requiring only three triplets, the full key is instead recovered in time 2^39.97. Our
implementation of one of the attacks verifies the general attack idea and the
specific results.
Finally, note that none of these attacks are directly applicable to KATAN.
The slow diffusion, which allowed for, e.g., the 2^3.0-attack on 28 bits, is present
also in KATAN, but one needs a weak key bit in order to exploit this.
For the design of future primitives with a bitwise key schedule such as the
one in KTANTAN, we encourage designers to carefully study how individual key
bits are used, either by specifically ensuring that they are used both early and
late in the key schedule, or by investigating all differentials of modest weight.
References
1. Biham, E.: New Types of Cryptanalytic Attacks using Related Keys. Journal of
Cryptology 7(4), 229–246 (1994)
2. Biham, E., Shamir, A.: Differential Cryptanalysis of the Data Encryption Standard.
Springer, Heidelberg (1993)
3. Bogdanov, A., Rechberger, C.: A 3-Subset Meet-in-the-Middle Attack: Cryptanaly-
sis of the Lightweight Block Cipher KTANTAN. In: Biryukov, A., Gong, G., Stinson,
D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 229–240. Springer, Heidelberg (2011)
4. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — A
Family of Small and Efficient Hardware-Oriented Block Ciphers. In: Clavier, C.,
Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg
(2009)
5. Knudsen, L.R.: Cryptanalysis of LOKI 91. In: Zheng, Y., Seberry, J. (eds.)
AUSCRYPT 1992. LNCS, vol. 718, pp. 196–208. Springer, Heidelberg (1993)
6. Knudsen, L.R.: Truncated and Higher Order Differentials. In: Preneel, B. (ed.) FSE
1994. LNCS, vol. 1008, pp. 196–211. Springer, Heidelberg (1995)
The key schedule of KTANTAN (cf. Appendix A): for each round r, the indices of the
two master key bits used as k_a^r and k_b^r.
r k_a^r k_b^r  r k_a^r k_b^r  r k_a^r k_b^r  r k_a^r k_b^r  r k_a^r k_b^r  r k_a^r k_b^r  r k_a^r k_b^r  r k_a^r k_b^r
0 63 31 1 31 63 2 31 63 3 15 47 4 14 14 5 60 76 6 40 40 7 49 17
8 35 67 9 54 22 10 45 77 11 58 26 12 37 69 13 74 10 14 69 69 15 74 10
16 53 21 17 43 43 18 71 7 19 63 79 20 30 62 21 45 45 22 11 11 23 54 70
24 28 60 25 41 41 26 3 19 27 38 70 28 60 28 29 25 73 30 34 34 31 5 21
32 26 74 33 20 52 34 9 41 35 2 18 36 20 68 37 24 56 38 1 33 39 2 2
40 52 68 41 24 56 42 17 49 43 3 35 44 6 6 45 76 76 46 72 8 47 49 17
48 19 51 49 23 55 50 15 63 51 14 46 52 12 28 53 24 72 54 16 48 55 1 49
56 2 34 57 4 20 58 40 72 59 48 16 60 17 65 61 18 50 62 5 53 63 10 58
64 4 36 65 8 8 66 64 64 67 64 0 68 65 1 69 51 19 70 23 55 71 47 47
72 15 15 73 78 78 74 76 12 75 73 9 76 67 3 77 55 23 78 47 47 79 63 31
80 47 79 81 62 30 82 29 77 83 26 58 84 5 37 85 10 26 86 36 68 87 56 24
88 33 65 89 50 18 90 21 69 91 42 42 92 5 5 93 58 74 94 20 52 95 25 57
96 3 51 97 6 38 98 12 12 99 56 72 100 16 48 101 33 33 102 3 3 103 70 70
104 60 28 105 41 41 106 67 3 107 71 71 108 78 14 109 77 13 110 59 27 111 39 39
112 79 15 113 79 79 114 62 30 115 45 45 116 59 27 117 23 71 118 46 46 119 13 29
120 42 74 121 52 20 122 41 73 123 66 2 124 53 69 125 42 42 126 53 21 127 27 75
128 38 38 129 13 13 130 74 74 131 52 20 132 25 57 133 35 35 134 7 7 135 62 78
136 44 44 137 73 9 138 51 67 139 22 54 140 29 61 141 11 43 142 6 22 143 44 76
144 72 8 145 65 65 146 50 18 147 37 37 148 75 11 149 55 71 150 46 46 151 77 13
152 75 75 153 70 6 154 61 29 155 27 59 156 39 39 157 15 31 158 46 78 159 76 12
160 57 73 161 34 34 162 69 5 163 59 75 164 38 38 165 61 29 166 43 75 167 70 6
168 77 77 169 58 26 170 21 53 171 43 43 172 7 23 173 30 78 174 44 44 175 9 25
176 18 66 177 36 36 178 9 9 179 50 66 180 36 36 181 57 25 182 19 67 183 22 54
184 13 45 185 10 10 186 68 68 187 56 24 188 17 49 189 19 51 190 7 39 191 14 30
192 28 76 193 40 40 194 1 1 195 66 66 196 68 4 197 57 25 198 35 35 199 55 23
200 31 79 201 30 62 202 13 61 203 10 42 204 4 4 205 72 72 206 48 16 207 33 33
208 51 19 209 39 71 210 78 14 211 61 77 212 26 58 213 21 53 214 11 59 215 6 54
216 12 44 217 8 24 218 32 64 219 64 0 220 49 65 221 18 50 222 37 37 223 11 27
224 22 70 225 28 60 226 9 57 227 2 50 228 4 52 229 8 40 230 0 0 231 48 64
232 32 32 233 65 1 234 67 67 235 54 22 236 29 61 237 27 59 238 7 55 239 14 62
240 12 60 241 8 56 242 0 32 243 0 16 244 16 64 245 32 32 246 1 17 247 34 66
248 68 4 249 73 73 250 66 2 251 69 5 252 75 11 253 71 7
Table 10. Similar to Table 5, this table gives the truncated differentials used on
KTANTAN48
Table 11. Similar to Table 5, this table gives the truncated differentials used on
KTANTAN64
ANSSI, France
{thomas.fuhr,henri.gilbert,jean-rene.reinhard,
marion.videau}@ssi.gouv.fr
1 Introduction
A set of two cryptographic algorithms is currently considered for inclusion in the
emerging mobile communications standard LTE of the 3rd Generation Partner-
ship Project (3GPP). It consists of an encryption algorithm named 128-EEA3 and
an integrity algorithm named 128-EIA3¹, both derived from a core
stream cipher named ZUC. The algorithms ZUC, 128-EEA3, and 128-EIA3 were
designed by the Data Assurance and Communication Security Research Center
(DACAS) of the Chinese Academy of Sciences.
An initial version of the specifications of 128-EEA3/EIA3 and ZUC, referred
to in the sequel as v1.4, was produced in June 2010 and published on the
* Also with Université Henri Poincaré-Nancy 1 / LORIA, France.
¹ EEA stands for "EPS Encryption Algorithm" and EIA stands for "EPS Integrity
Algorithm". EPS (Evolved Packet System) is an evolution of the third-generation
system UMTS that consists of a new radio access system named LTE (Long Term
Evolution) and a new core network named SAE (System Architecture Evolution).
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 230–242, 2012.
c Springer-Verlag Berlin Heidelberg 2012
Analysis of the Candidate 3GPP Integrity Algorithm 128-EIA3 231
GSMA web site for an initial public evaluation [5,6]. Following the discovery of
some cryptographic weaknesses in the ZUC v1.4 initialisation [20,18] and of the
forgery attack on 128-EIA3 v1.4 reported in this paper, tweaks to the specifica-
tions of ZUC and EIA3 were introduced by the designers and a modified version
of the specifications referred to in the sequel as v1.5 was published in January
2011 for a second public evaluation period [8,9]. After its adoption by 3GPP, 128-
EEA3/EIA3 will represent the third LTE encryption and integrity algorithm set,
in addition to the already adopted sets 128-EEA1/EIA1 [4] based on the stream
cipher SNOW 3G and 128-EEA2/EIA2 [1, Annex B] based on AES.
The integrity algorithm 128-EIA3 is an IV-dependent MAC that takes as
input (1) a 128-bit key, (2) various public parameters that together determine a
128-bit initial vector, (3) an input message of length between 1 and 20000 bits,
and produces a 32-bit MAC value. It uses a universal hash function-based
construction and has therefore many features in common with the algorithms of
the well known Wegman-Carter family of message authentication codes [3,19].
As already mentioned, we denote by 128-EIA3 v1.4 (resp. 128-EIA3 v1.5)
the initial version specified in [5] (resp. the modified version specified in [8]). In
this paper we analyse the security of both versions. We first show that 128-
EIA3 v1.4 is vulnerable to a simple existential forgery attack. Given any known
message M , any known or unknown initial vector, and the associated MAC un-
der an unknown key, it is possible to predict the MAC value associated with
a new message M' ≠ M derived from M under the same initial vector and
the same unknown key, with a success probability of 1/2. This attack is generic:
it does not rely on any specific feature of ZUC and works with any under-
lying stream cipher. It exploits a subtle deviation of 128-EIA3 v1.4 from the
requirements of the Wegman-Carter paradigm. The latter requirements can be
informally summarized by saying that mask values must behave as one-time
masks, which is not the case for 128-EIA3 v1.4. As will be shown in the sequel,
distinct 128-EIA3 v1.4 mask values are not necessarily independent. Indeed, in
128-EIA3 v1.4, the mechanism used to generate the masking values applied to
the output of the universal hash function does not match the model used in the
proof. Consequently, the arguments from [12] and [16] that are invoked in the
design and evaluation report [7] to infer bounds on the success probability of
forgery attacks on 128-EIA3 v1.4 are not applicable.
In [8], a tweak leading to 128-EIA3 v1.5 has been proposed to circumvent this
attack. Through an improved generation procedure, masking values are either
equal or independent. However, it can be observed that for distinct messages,
no separation between the ZUC keystream bits involved in the universal hash
function computation and those involved in the generation of the masking values
is ensured.
While this represents a deviation from the requirements on masking values
used in the Wegman-Carter paradigm, the security consequences are much less
dramatic than for the initial MAC (v1.4), since an ad hoc proof given in [10]
shows that the modified MAC offers provable resistance against existential
forgery attacks under the assumption that the same (key, IV) pair can never
232 T. Fuhr et al.
be re-used, neither by the MAC issuer nor by the MAC verifier. We show that
this property however affects the resilience of 128-EIA3 v1.5 against forgery
attacks if IV repetitions occur. We further observe that independently of this
property, the universal hash function structure also results in some limitations
of this resilience. This leads us to investigate the resistance of 128-EIA3 v1.5
and one natural variant of this MAC against forgery attacks involving three
pairwise distinct messages and the same IV value. We make no claims regarding
the practical applicability of the identified nonce repetition attacks to the LTE
system.
In Section 3, we give a short description of the 128-EIA3 algorithms. We
then describe the attack on v1.4 in Section 4 and discuss the reasons why the
security proofs for related constructions by Krawczyk [12] and Shoup [16] do not
guarantee the security of 128-EIA3 v1.4. In Section 5, we state a property which,
although it may not be considered as an attack in standard security models,
underscores the lack of robustness of 128-EIA3 v1.5 against nonce repetition. We
also explain why a simple modification of 128-EIA3 fails to completely suppress
such properties, because of the underlying universal hashing structure.
2 Notation
Throughout the paper, we use the following notation.
– S is a stream cipher.
– For two finite bitstrings A = (a_0, . . . , a_{ℓ−1}) and B = (b_0, . . . , b_{m−1}), A ∥ B
denotes the concatenation of A and B, i.e. the bitstring
(a_0, . . . , a_{ℓ−1}, b_0, . . . , b_{m−1}).
– For a bitstring A = (a_0, . . .) of length ≥ j + 1, A|_i^j, 0 ≤ i ≤ j, denotes the
(j − i + 1)-bit string obtained from the consecutive bits of A between indices
i and j, i.e. A|_i^j = (a_i, . . . , a_j).
– 0^ℓ denotes the bitstring of length ℓ whose bits are all zero.
– W^(i) denotes the i-th bit of a 32-bit word W.
– Consider a 32-bit word W = (W^(0), . . . , W^(31)) and an integer a between
1 and 31. Then W ≪ a denotes the (32−a)-bit word resulting from a left shift
of W by a positions and a truncation of the a rightmost bits. More precisely,
W ≪ a = (W^(a), . . . , W^(31)). The (32 − b)-bit word, W ≫ b, resulting from
the right shift of W by b positions and a truncation of the b leftmost bits, is
defined in the same way. We have W ≫ b = (W^(0), . . . , W^(31−b)).²
where W_mask = W_{L−32}, with the value L being different between v1.4 and v1.5,
i.e. W_mask = W_{ℓ+32} for v1.4 and W_mask = W_{⌈ℓ/32⌉×32+32} for v1.5. The parameter
lengths used in 128-EIA3 are k = n = 128 and 1 ≤ ℓ ≤ 20000.
In fact, the MAC of a message M is computed as

    T = H_z(M) ⊕ W_mask,

where H_·(.) is a family of universal hash functions based on Toeplitz matrices
with pseudorandom coefficients taken from a stream cipher output. We have:
    H_{(z_0,...,z_{ℓ+31})}(m_0, . . . , m_{ℓ−1}) = [m_0, m_1, . . . , m_{ℓ−1}, 1] ·

        | z_0     z_1       . . .   z_31     |
        | z_1     z_2       . . .   z_32     |
        | z_2     z_3       . . .   z_33     |
        |  .       .        . . .    .       |
        | z_ℓ     z_{ℓ+1}   . . .   z_{ℓ+31} |
We can notice that the words W_i derived from the keystream and corresponding
to message bits m_i are not independent from each other. More precisely, we have:

    W_{i+1} = ((W_i ≪ 1), z_{i+32}).
Moreover, the "one-time masks" W_mask associated with identical values of IV
but different message lengths are related. In v1.4 we have

    W_mask = (z_{ℓ(M)+32}, . . . , z_{ℓ(M)+63}),

where ℓ(M) denotes the length of the message M. Let us suppose that W_mask is
the one-time mask generated for the input (IK, IV, M) and W'_mask is the one-
time mask generated for the input (IK, IV, M'). If ℓ(M') − ℓ(M) = Δ with
0 < Δ < 32, we have:

    W'_mask = (W_mask ≪ Δ, β_0, . . . , β_{Δ−1}),

for some bit values β_i. We can use these relations in a substitution attack.
Let us suppose that the adversary knows a valid MAC value T for a given
message M = (m_0, . . . , m_{ℓ−1}) of length ℓ bits under a given IV value IV and
a key IK. This MAC can be transformed with probability 1/2 into a valid
MAC, T', for the (ℓ + 1)-bit message M' = (0, m_0, . . . , m_{ℓ−1}) under the same IV
value IV and the same key IK.
Let us analyse what happens during the computation of the MAC for M'
(under the same IV value IV and the same key IK). The generated keystream
z_0, . . . , z_{ℓ+64} is the same as the keystream that was used to compute T, with
one extra bit: z_{ℓ+64}. As a consequence, the words W_i, 0 ≤ i ≤ ℓ, are identical.
The one-time mask used is W'_mask = (z_{ℓ+33}, . . . , z_{ℓ+64}) = ((W_mask ≪ 1), z_{ℓ+64}).
Then, the MAC value T' is given by the following formula:
    T' = (⊕_{i=0}^{ℓ} m'_i W_i) ⊕ W_{ℓ+1} ⊕ W'_mask
       = (⊕_{i=0}^{ℓ−1} m_i W_{i+1}) ⊕ W_{ℓ+1} ⊕ W'_mask
       = (⊕_{i=0}^{ℓ−1} m_i ((W_i ≪ 1), z_{i+32})) ⊕ ((W_ℓ ≪ 1), z_{ℓ+32}) ⊕ ((W_mask ≪ 1), z_{ℓ+64})
       = (((⊕_{i=0}^{ℓ−1} m_i W_i) ⊕ W_ℓ ⊕ W_mask) ≪ 1, β)
       = ((T ≪ 1), β),   with β = (⊕_{i=0}^{ℓ−1} m_i z_{i+32}) ⊕ z_{ℓ+32} ⊕ z_{ℓ+64}.
The Design and Evaluation Report [7] that accompanied version 1.4 erroneously
invokes the security proofs of [16] to infer that in the case of 128-EIA3 v1.4, no
forgery of a new message can succeed with probability higher than 2^-32. The
argument comes from the fact that the algorithm makes use of an ε-almost XOR
universal (ε-AXU) family of hash functions with ε = 2^-32.
In [7], a proof is given that for any value of IV, the family of hash functions
used in 128-EIA3, i.e. the intermediate value obtained in the MAC computation
associated with key K just before the exclusive or with W_mask, is ε-AXU
with ε = 2^-32.
As far as we know, the first construction of a secure MAC using ε-AXU hash
functions was given by Krawczyk [12], who proved that given H_K(M) ⊕ r
All the models used for the proofs assume that the hash function and the pseu-
dorandom function are randomly chosen and in particular that they are inde-
pendent from each other. In the case of 128-EIA3 v1.4, the construction does
not fit the model as the two are related. Moreover, what makes our attack work
is that the one-time masks used for messages M and M' of distinct lengths are
different but related. In fact, we have:

    MAC(M) = (cnt, S(IK, cnt)|_{ℓ(M)+32}^{ℓ(M)+63} ⊕ H_{S(IK, cnt)|_0^{ℓ(M)+31}}(M)).
We see that the mask computation also involves the message length and leads to
distinct, but related mask values, for identical IVs and different message lengths.
Therefore no existing proof applies and we manage to derive an attack against
v1.4.
In the following we consider tags generated using the same key/IV pair. We
remark that in the case of 128-EIA3 v1.5, even though masking values for two
distinct messages are either equal or independent, the independence of the uni-
versal hash function keys (i.e. the keystream bits used in the computation of the
hash value) and the masking values is not guaranteed. Parts of the keystream
(zi ) used as masking values for a message can be used during the universal hash
function computation for a longer message, and conversely. This represents a de-
viation of the mask value generation of 128-EIA3 v1.5 from the Wegman-Carter
paradigm. We show that consequently, while the proof of [10] guarantees that the
MACs associated with two distinct messages and the same IV value are indepen-
dent and uniformly distributed, the knowledge of the tags of two related messages
under the same (key, IV) pair may allow one to compute the tag of a third message
under the same key and IV³. Consider any message M1 of arbitrary length ℓ1,
any message M2 of length ℓ2 ≥ ℓ1 + 32(⌈ℓ1/32⌉ + 1), and the message M3 = M2 ⊕ δ
of length ℓ3 = ℓ2, where δ is the bitstring of length ℓ2 whose prefix of length ℓ1
is M1 and whose other bits are zero except for the two bits at positions ℓ1 and
32(⌈ℓ1/32⌉ + 1). Then we have MAC(M1) ⊕ MAC(M2) ⊕ MAC(M3) = 0. Indeed,

    MAC(M1) = (⊕_{i=0}^{ℓ1−1} m1_i W_i) ⊕ W_{ℓ1} ⊕ W_{32(⌈ℓ1/32⌉+1)},

    MAC(M2) = (⊕_{i=0}^{ℓ2−1} m2_i W_i) ⊕ W_{ℓ2} ⊕ W_{32(⌈ℓ2/32⌉+1)},

    MAC(M3) = (⊕_{i=0}^{ℓ1−1} m1_i W_i) ⊕ (⊕_{i=0}^{ℓ2−1} m2_i W_i) ⊕ W_{ℓ1} ⊕ W_{ℓ2} ⊕ W_{32(⌈ℓ1/32⌉+1)} ⊕ W_{32(⌈ℓ2/32⌉+1)}.

³ Whether this third message and the associated tag can be successfully submitted to
the verifying entity depends on whether the IV repetition detection of this entity is
effective or not.
Consequently, for any such triplet of pairwise distinct messages, the authentica-
tion codes of two of the messages give a forgery for the third one.
The above 3-message forgery can be avoided by making the masking values
and the universal hashing keys independent, for example by following the slightly
modified MAC described in Algorithm 2.
This algorithm is quite similar to 128-EIA3 and requires the same number of
keystream bits and the same amount of computation as 128-EIA3 v1.4 — the
single difference being that the mask value consists of the first keystream bits
and the universal hash function output value is derived from the subsequent
keystream bits. This scheme ensures the equality or independence of keystream
bits used as masking values or universal hashing key when tagging two different
messages. It is also closer to the Wegman-Carter paradigm in that the masking
value computation does not depend on the message being tagged — which is
not the case in 128-EIA3 v1.4 and v1.5, where the length of the tagged message
impacts the masking value. Unfortunately, some non-generic properties remain
that are related to the Toeplitz matrix structure underlying the universal hash
function construction rather than to the masking values generation method and
hold for both 128-EIA3 v1.5 and Algorithm 2.
Let us now consider two messages M and M' = 0 ∥ M and assume that we got
their tags T and T' under the same key/IV pair. Assume furthermore that these
tag computations involve the same masking value W_mask. This is always the
case in Algorithm 2 and is true in 128-EIA3 v1.5 under some mild assumption
on the length ℓ of M (namely that ℓ (mod 32) ≠ 0). Thus we get

    H_z(M) ⊕ W_mask = T ,    H_z(M') ⊕ W_mask = T' ,

and hence H_z(M) ⊕ H_z(M') = T ⊕ T'. Now consider the further message
M'' = 0 ∥ M'. The Toeplitz structure of H gives

    (H_z(M') ⊕ H_z(M'')) ≫ 1 = (H_z(0 ∥ M) ⊕ H_z(0 ∥ M')) ≫ 1
                              = (H_z(0 ∥ M) ≫ 1) ⊕ (H_z(0 ∥ M') ≫ 1)
                              = (H_z(M) ≪ 1) ⊕ (H_z(M') ≪ 1)
                              = (H_z(M) ⊕ H_z(M')) ≪ 1
                              = (T ⊕ T') ≪ 1.
    IV0 = COUNT
    IV1 = BEARER ∥ 0^27
    IV2 = IV0 ⊕ (DIRECTION ∥ 0^31)
    IV3 = IV1 ⊕ (0^16 ∥ DIRECTION ∥ 0^15)
We notice that while in UMTS two distinct values managed by the sending
and receiving parties ensure the non-repetition of IVs, one single 32-bit counter
is used for this purpose in LTE. Enforcing the use of fresh IVs by both the
MAC issuer and the MAC verifier might therefore be more complex, and we
have some concerns about whether LTE implementations will always satisfy the
strong security requirement that a (key, IV) pair is never reused on either side.
6 Conclusion
The existential forgery attack presented in Section 4 was forwarded to the de-
signers of 128-EIA3 v1.4, who produced the modified version 128-EIA3 v1.5 to
address the issue. While our analysis of 128-EIA3 v1.5 did not reveal any security
issue of similar significance and the new MAC offers a provable resistance (under
some assumptions) against a large class of forgery attacks, we have highlighted
some structural properties of the mask values computation and the universal
family of hash functions underlying 128-EIA3 v1.5, and shown that these may
lead to limitations of its resilience against nonce reuse. None of the security prop-
erties we have investigated here relates to the specific features of the underlying
IV-dependent stream cipher ZUC.
References
1. 3GPP Technical Specification Group Services and System Aspects: 3GPP System
Architecture Evolution (SAE); Security architecture (Release 9). Tech. Rep. 3G
TS 33.401 V 9.3.1, 3rd Generation Partnership Project (2010-04)
2. Bellare, M., Goldreich, O., Mityagin, A.: The Power of Verification Queries in Mes-
sage Authentication and Authenticated Encryption. Tech. Rep. 2004/309, Cryp-
tology ePrint Archive (2004)
3. Carter, J., Wegman, M.: Universal Classes of Hash Functions. Journal of Computer
and System Sciences 18, 143–154 (1979)
4. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
UEA2 & UIA2. Document 1: UEA2 and UIA2 Specification. Version 2.1. Tech.
rep., ETSI (March 16, 2009),
http://www.gsmworld.com/documents/uea2_uia2_d1_v2_1.pdf
5. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
128-EEA3 & 128-EIA3. Document 1: 128-EEA3 and 128-EIA3 Specification. Ver-
sion 1.4. Tech. rep., ETSI (July 30, 2010)
6. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
128-EEA3 & 128-EIA3. Document 2: ZUC Specification. Version 1.4. Tech. rep.,
ETSI (July 30, 2010)
7. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
128-EEA3 & 128-EIA3. Document 4: Design and Evaluation Report. Version 1.1.
Tech. rep., ETSI (August 11, 2010)
8. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
128-EEA3 & 128-EIA3. Document 1: 128-EEA3 and 128-EIA3 Specification. Ver-
sion 1.5. Tech. rep., ETSI (January 4, 2011),
http://www.gsmworld.com/documents/EEA3_EIA3_specification_v1_5.pdf
9. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
128-EEA3 & 128-EIA3. Document 2: ZUC Specification. Version 1.5. Tech. rep.,
ETSI (January 4, 2011),
http://www.gsmworld.com/documents/EEA3_EIA3_ZUC_v1_5.pdf
10. ETSI/SAGE: Specification of the 3GPP Confidentiality and Integrity Algorithms
128-EEA3 & 128-EIA3. Document 4: Design and Evaluation Report. Version 1.3,
Tech. rep., ETSI (January 18, 2011),
http://www.gsmworld.com/documents/EEA3_EIA3_Design_Evaluation_v1_3.pdf
11. Handschuh, H., Preneel, B.: Key-Recovery Attacks on Universal Hash Function
Based MAC Algorithms. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157,
pp. 144–161. Springer, Heidelberg (2008)
12. Krawczyk, H.: LFSR-Based Hashing and Authentication. In: Desmedt, Y.G. (ed.)
CRYPTO 1994. LNCS, vol. 839, pp. 129–139. Springer, Heidelberg (1994)
13. Martin Albrecht, K.P., Watson, G.: Plaintext Recovery Attacks Against SSH. In:
Proceedings of IEEE Symposium on Security and Privacy 2009, pp. 16–26. IEEE
Computer Society (2009)
14. Rogaway, P.: Bucket Hashing and Its Application to Fast Message Authentication.
In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 29–42. Springer,
Heidelberg (1995)
15. Rogaway, P.: Bucket Hashing and its Application to Fast Message Authentication.
Journal of Cryptology 12(2), 91–115 (1999)
16. Shoup, V.: On Fast and Provably Secure Message Authentication Based on Uni-
versal Hashing. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 313–328.
Springer, Heidelberg (1996)
242 T. Fuhr et al.
17. Stinson, D.: Universal Hashing and Authentication Codes. Design, Codes and
Cryptography 4, 369–380 (1994)
18. Sun, B., Tang, X., Li, C.: Preliminary Cryptanalysis Results of ZUC. Presented at
the First International Workshop on ZUC Algorithm, vol. 12 (2010)
19. Wegman, M., Carter, J.: New Hash Functions and Their Use in Authentication
and Set Equality. Journal of Computer and System Science 22, 265–279 (1981)
20. Wu, H.: Cryptanalysis of the Stream Cipher ZUC in the 3GPP Confidentiality &
Integrity Algorithms 128-EEA3 & 128-EIA3. Presented at the ASIACRYPT 2010
rump session (2010), http://www.spms.ntu.edu.sg/Asiacrypt2010/
Rump%20Session-%207%20Dec%202010/wu_rump_zuc.pdf
New Insights on Impossible Differential Cryptanalysis
1 Introduction
The impossible differential attack [3] is a method of using differential concepts in
cryptanalytic attacks. While regular differential cryptanalysis [5] exploits differentials
with as high a probability as possible, impossible differential cryptanalysis
exploits differentials that cannot happen, i.e., that have probability zero. The actual
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 243–259, 2012.
c Springer-Verlag Berlin Heidelberg 2012
244 C. Bouillaguet et al.
use of an impossible differential resembles that of a high-probability differential:
given a pair that may “satisfy” the differential, the adversary obtains the
subkey(s) suggested by the pair. Unlike differential cryptanalysis, where such a
subkey is more likely to be the right subkey, in impossible differential cryptanalysis
a subkey suggested by a candidate pair is necessarily a wrong
one (and is thus discarded).
To start an impossible differential attack, the adversary has to identify such
impossible differentials. Most of these differentials are constructed in a miss-
in-the-middle approach [4]. The approach is based on combining two probabil-
ity 1 truncated differentials that cannot coexist. For example, there is a generic
5-round impossible differential for Feistel constructions with a bijective round
function (first identified in [12]) of the form (0, α) → (0, α) (depicted in Figure 2).
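As a sanity check, this generic 5-round impossible differential can be verified exhaustively on a toy cipher. The sketch below is our own illustration (not from the paper): a 5-round Feistel cipher on 4-bit halves with a random bijective S-box as round function and, as is customary, no swap after the last round. An input difference (α, 0) then never yields the output difference (α, 0):

```python
import random

random.seed(1)
SBOX = list(range(16))
random.shuffle(SBOX)  # a random 4-bit permutation: the bijective round function


def encrypt(l, r, keys):
    # 5-round Feistel on 4-bit halves; the swap after the last round is omitted
    for i, k in enumerate(keys):
        fo = SBOX[r ^ k]
        if i < len(keys) - 1:
            l, r = r, l ^ fo
        else:
            l = l ^ fo
    return l, r


def check(trials=10):
    for _ in range(trials):
        keys = [random.randrange(16) for _ in range(5)]
        for alpha in range(1, 16):
            for pl in range(16):
                for pr in range(16):
                    c1 = encrypt(pl, pr, keys)
                    c2 = encrypt(pl ^ alpha, pr, keys)
                    # the differential (alpha, 0) -> (alpha, 0) must never occur
                    assert (c1[0] ^ c2[0], c1[1] ^ c2[1]) != (alpha, 0)
    return True


print(check())  # True: no pair ever follows the impossible differential
```

With a non-bijective round function the miss-in-the-middle argument breaks down, and the assertion can indeed fail for some keys.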
A method for finding such impossible differentials is presented in [11] under
the name U-method. In this method, one can construct probability-1 truncated
differentials, which in turn lead to contradictions. An automated version
of the method, a tool called the matrix method, is presented in [10].
The automated analysis shows several results for generalizations of the Feistel
cipher (the Generalized Feistel Network of [14], MARS-like constructions [6], and
CAST-like constructions [1]).
As an example, consider a CAST-like construction (depicted in Figure 1). The
matrix method suggests an impossible differential of n² − 1 rounds for n ≥ 3
threads, assuming that the round function is bijective. The impossible differential
has the form (0, 0, . . . , 0, α) → (0, 0, . . . , 0, ω) for any non-zero α and ω, and
is based on the fact that the (n − 1)-round truncated differential starting at
(0, 0, . . . , 0, α) predicts a zero difference in the next-to-last word, while the
n(n − 1)-round truncated differential ending at (0, 0, . . . , 0, ω) predicts that the
same word has a non-zero difference.
The U-method was later improved in [13] to incorporate a much larger set
of contradictions. This larger set may include the use of specific
differences in the input and the output (rather than truncated differences) or
conditions on the XOR of a few words.
In this paper we take a deeper look into the construction of impossible differentials.
We start by considering a slightly different approach to the analysis,
one which does not classify the state of a word into a small
set of values.¹ Instead, we look at the specific differences that may form a
contradiction, taking the structure of the round function into account. The main
property we use is the existence of impossible differentials in the round function.
This allows us to extend the impossible differentials by an additional round,
leading to improved attacks on some block cipher structures. Moreover, following
the new point of view, one can even reduce the requirements on the round
function. For example, as part of our analysis, we offer n²-round impossible
differentials for CAST-like ciphers, even if their round function is not bijective. We
note that our results contradict a claim made in [16] that “generic”
impossible differentials for this structure exist only for up to n² − 1 rounds. We
compare the previously known results with our new results in Table 1.
We continue by defining the differential expansion rate of a round function for
a (set of) input difference(s). The rate measures the speed at which the
set of possible differences evolves through invocations of the round function. To
some extent, it is the analogue of the expansion rate of a graph.
We then study how to use our new impossible differentials in an actual attack,
and how useful they are. We describe attacks using
our new extended impossible differentials that have the same time complexity as
previous attacks (under some natural conditions on the round function) while
covering more rounds.
The structure of this paper is as follows: In Section 2 we cover the basics
of differential cryptanalysis and impossible differential cryptanalysis. Section 3
discusses the previous results and the matrix method. In Section 4 we suggest a
new approach for constructing impossible differentials, and in Section 5 we show
that impossible differential attacks that use the previous impossible differentials
can be extended to more rounds when instantiated with our newly found im-
possible differentials (almost with no additional complexity). Finally, Section 6
concludes this paper.
¹ The matrix method classifies the state of a word as one of five states: zero difference,
fixed difference, unknown non-zero difference, the XOR of a fixed difference with an
unknown non-zero difference, or unknown.
2 Preliminaries
[Figure 2 diagram: the five Feistel rounds with the word differences at each step; the forward and backward truncated differentials meet in a contradiction at the middle round.]
The miss-in-the-middle follows from the fact that the input and output differences force the
output difference of the round function in the third round to be 0, while, due to the
bijectiveness of the round function, its input difference is necessarily non-zero.
The two cannot coexist.
Fig. 2. A Generic 5-Round Impossible Differential for Feistel Ciphers with a Bijective
Round Function
The most successful method for constructing impossible differentials is the miss-in-the-middle
method. In this method, a probability-1 truncated differential
ΔIN → ΔA and a probability-1 truncated differential in the backward direction
ΔB ← ΔOUT are identified, such that ΔA and ΔB cannot coexist.
For example, Figure 2 describes a 5-round Feistel construction with a bijective
round function, for which (α, 0) → (α, 0) is an impossible differential.
2.1 Notations
In this paper we use the following notations:
– n — denotes the number of threads in a given structure.
– w — denotes the size (in bits) of a given thread.
– α, β, . . . — denote non-zero differences.
– 0 — denotes a zero difference (in a thread).
– ? — denotes an unknown difference.
– →i , ←i — denote the propagation of a (truncated) difference for i rounds
in the encryption/decryption direction, respectively.
– α ⇝ β — denotes the event that an input difference α to a round function
F may result in an output difference β, i.e., Pr_x[F(x) ⊕ F(x ⊕ α) = β] > 0.
Observation 1. We note that the above two truncated differentials can coexist if
and only if β = ω. Hence, if an input difference α to the round function cannot
cause an ω difference at the output, i.e., if α → ω is an impossible differential
for F, then these two differentials cannot coexist, and we obtain a 16-round
impossible differential of the form (0, 0, 0, α) →16 (ω, 0, 0, 0) for the cipher.
Table 2. The Two Truncated Differentials Used in Our New 16-Round Impossible
Differential on 4-Thread CAST-like Ciphers
the 8 S-boxes separately. We note that in DES’s round function, given a pair of
random input/output differences for an S-box, there is an 80% chance of the
transition being possible. Hence, for random α and ω, the probability that α ⇝ ω
holds is only about 0.8^8 ≈ 0.17.³
In the more general case, where the round function has the form Fk(x) =
G(x ⊕ k), one can exhaustively try all possible pairs with input difference α
and see whether any of them leads to an ω output difference. For a w-bit G(·) this
takes 2^w invocations of G(·), even if we only have black-box access to G(·) (but
not to Fk(·)). Of course, when the description of G(·) is known, this verification
is expected to be significantly faster. As we show in Section 5, even under the
worst-case assumption, i.e., when G(·) is unknown, this has no real effect on the
actual attack that uses this impossible differential.
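The exhaustive verification just described can be sketched as follows. This is our own toy illustration with w = 4 and a randomly chosen, not necessarily bijective, stand-in for G:

```python
import random

random.seed(7)
w = 4
G = [random.randrange(2 ** w) for _ in range(2 ** w)]  # black-box G; need not be bijective


def leads_to(alpha, omega):
    # alpha ~> omega iff some x satisfies G(x) ^ G(x ^ alpha) = omega: 2^w calls to G
    return any(G[x] ^ G[x ^ alpha] == omega for x in range(2 ** w))


# Fraction of (alpha, omega) pairs, alpha != 0, that give an impossible
# differential. Since x and x ^ alpha produce the same XOR value, each row of
# the difference distribution table has at most 2^(w-1) = 8 non-zero entries
# (and at least one), so the fraction always lies in [1/2, 15/16].
pairs = [(a, o) for a in range(1, 2 ** w) for o in range(2 ** w)]
frac = sum(not leads_to(a, o) for a, o in pairs) / len(pairs)
print(0.5 <= frac <= 0.9375)  # True, by the counting argument above
```

For a random G the fraction lands near the 60.6% figure cited from [15].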
Moreover, we note that for a function G(·) of this form, the probability that
α ⇝ ω holds is at most 0.5 for random⁴ α and ω (following the fact that the row
corresponding to α in the difference distribution table has at most half of its
entries non-zero). If we assume that G(·) is a random function, then according
to [15] about 60.6% of the possible (α, ω) pairs yield an
impossible differential.
An interesting point concerning the truncated differentials suggested above is
that their existence is independent of the actual differential properties
of the round functions. Namely, even if Fk(·) is not bijective, the above
truncated differentials still hold, and thus so does the impossible differential. More
³ Even though the actual inputs to the different S-boxes are independent of each other,
assuming the key is chosen randomly, the differences are not. Hence, the actual value
for the full round function may be slightly different.
⁴ Most impossible differential attacks face a large number of (α, ω) pairs which are
generated in a random manner.
precisely, even if different round functions are used, the only one of interest is
the one of round 4.
Now one can easily generalize the above impossible differential: for an n-thread
CAST-like block cipher, (0, 0, . . . , 0, α) →n² (ω, 0, . . . , 0) is an impossible
differential whenever α ⇝ ω does not hold, following the n-round
truncated differential (0, 0, . . . , 0, α) →n (β, 0, 0, . . . , 0, α) and the n(n − 1)-round
truncated differential (ω, ?, . . . , ?, φ) ←n(n−1) (φ, 0, . . . , 0).
Table 3. The Two Truncated Differentials Used in Our New 8-Round Impossible
Differential on 4-Thread MARS-like Ciphers
We can of course generalize the above truncated differentials to the case
of an n-thread MARS-like cipher. The backwards differential is the same, i.e.,
⁵ We note that the differentials presented in Table 3 assume that the differences
XORed into each of the three threads are different (as in the real MARS there
are three different functions). When the same output is XORed into all three
threads (in the real MARS, additions and subtractions are also used), one can
construct a longer impossible differential, for 9 rounds. In the forward direction we
use the following 5-round differential: [omitted],
and it is easy to see that the two cannot coexist, as the XOR of the two intermediate
words cannot be the same.
max_{|ΔS|>0} |ΔF(ΔS)| / |ΔS|,
i.e., the maximal increase in the size of a difference set through the round
function.
We first note that the above definitions are given for unkeyed functions. However,
for round functions of the form Fk(x) = G(x ⊕ k), one can disregard the key
addition and use the same results. Moreover, once the key is fixed, this is the
case for any round function. For the following discussion, we shall assume that
F(·) is indeed of that form.
Now, if the differential expansion rate of a function is small, then ΔF(ΔF({α}))
for a fixed input difference α may not be large enough to cover all possible
differences. Assume that this is indeed the case for a round function F(·) (we later
describe an example of such a round function); then one can easily extend the
16-round impossible differential for the CAST-like structure with 4 threads by one
round, using the following truncated differentials: (0, 0, 0, α) →5 (γ, 0, α, β)
and (ω, ?, ?, φ) ←12 (ω, 0, 0, 0). If ω ∉ ΔF(ΔF({α})), one can easily see that
(0, 0, 0, α) →17 (ω, 0, 0, 0) is an impossible differential.
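For a small round function, the set ΔF(ΔF({α})) can simply be computed. The sketch below is our own toy (a random bijective 4-bit S-box; the names delta_F, once, twice, missing are ours); any ω left in `missing` would yield such an extended impossible differential:

```python
import random

random.seed(3)
SBOX = list(range(16))
random.shuffle(SBOX)  # a random bijective 4-bit round function


def delta_F(diffs):
    # dF(dS): every output difference reachable from some input difference in dS
    return {SBOX[x] ^ SBOX[x ^ a] for a in diffs for x in range(16)}


alpha = 1
once = delta_F({alpha})    # differences possible after one application of F
twice = delta_F(once)      # after two applications
missing = sorted(set(range(16)) - twice)
# a bijective F never maps a non-zero input difference to 0, and each
# difference-distribution-table row has at most 8 non-zero entries
print(0 not in once, len(once) <= 8)  # True True
```

For a typical random S-box `missing` may well be empty; a round function with small differential expansion rate is precisely one for which it is not.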
More generally, if the differential expansion rate is c < 2^{w/2}, then
|ΔF(ΔF({α}))| < 2^w, which means that there are many ω values for which
ω ∉ ΔF(ΔF({α})). The smaller the value of c, the larger the set of differences
which are not in ΔF(ΔF({α})), which allows for longer impossible
differentials.
These results can be generalized. If c < 2^{w/3}, then the above arguments
can be repeated and the forward differential can be extended to a 6-round
To show the contrary, we consider an attack which uses the 16-round impossible
differential on the 4-thread CAST-like structure, and compare it to a similar
attack that uses the original 15-round impossible differential. As before, for simplicity,
we shall consider round functions of the form Fk(x) = G(x ⊕ k), which
are very common in block ciphers.
We first note that both the 16-round impossible differential and the 15-round
impossible differential share the same structure, i.e., (0, 0, 0, α) → (ω, 0, 0, 0).
Hence, the use of structures and early abort in the attacks is (almost) the same.
In Figure 6 we compare the 16-round attacks using the 15-round impossible
differential (there are two variants, in one the additional round is before the
impossible differential, and in the second it is after the impossible differential)
with the 17-round attacks using the 16-round impossible differential. As can be
seen, the attack algorithms are very similar, and so is their analysis. For
example, the data complexity of the 16-round attack with an
additional round after the impossible differential is 2w · 2^{2w} chosen plaintexts,
while for the equivalent 17-round attack the data complexity is 4w · 2^{2w} chosen
plaintexts.⁶ We compare the complexities of these attacks in Table 4.
to show that performing this check only when ω and α are both known again
increases the data and time by a factor of two at most.
We conclude that the new attacks are indeed one round longer (when the
impossible differential is one round longer), and can be made even longer, depending
on the exact impossible differential. At the same time, the data and time
complexities of the attacks increase by at most a factor of two (the accurate increase
is 1/p, where p is the ratio of non-zero entries in the difference distribution table
of G(·)).
Finally, we note that when more complex impossible differentials are used,
the same results apply, as long as the differential expansion rate of G(·) is small
enough, or in the cases where the structure of G(·) allows quick verification of
the existence of contradiction.
In this paper we show how to extend several impossible differentials for generalized
Feistel schemes by one or more rounds, using a more subtle analysis of the
round function. We then show that attacks which are based on these new
impossible differentials require almost the same data and time complexity as the
previous attacks, which proves that these impossible differentials are not only of
theoretical interest, but can also be used in the analysis of block ciphers.
The new measure we introduced, the differential expansion rate of a round
function, is expected to motivate block cipher designers to re-think some of the
basic approaches in block cipher design. For example, it is commonly believed
that even if only a small amount of nonlinearity is used in the round function,
the cipher can still be secure. While this belief is not necessarily contradicted
by our findings, we do show that it is possible to exploit this small nonlinearity
in more complex attacks, such as impossible differential attacks, a combination
that was not suggested before.
Additionally, our results suggest that constructions which take the opposite
approach to MARS, i.e., strong outer rounds with weaker inner rounds,
may be susceptible to impossible differential attacks. This follows from the fact that
the development of the difference sets of interest happens not in the outer
rounds, but in the inner rounds.
References
1. Adams, C., Heys, H., Tavares, S., Wiener, M.: The CAST-256 Encryption Algo-
rithm (1998); AES Submission
2. Aoki, K., Ichikawa, T., Kanda, M., Matsui, M., Moriai, S., Nakajima, J., Tokita,
T.: Camellia: A 128-Bit Block Cipher Suitable for Multiple Platforms - Design
and Analysis. In: Stinson, D.R., Tavares, S. (eds.) SAC 2000. LNCS, vol. 2012,
pp. 39–56. Springer, Heidelberg (2001)
3. Biham, E., Biryukov, A., Shamir, A.: Cryptanalysis of Skipjack Reduced to 31
Rounds Using Impossible Differentials. In: Stern, J. (ed.) EUROCRYPT 1999.
LNCS, vol. 1592, pp. 12–23. Springer, Heidelberg (1999)
4. Biham, E., Biryukov, A., Shamir, A.: Miss in the Middle Attacks on IDEA and
Khufu. In: Knudsen, L.R. (ed.) FSE 1999. LNCS, vol. 1636, pp. 124–138. Springer,
Heidelberg (1999)
5. Biham, E., Shamir, A.: Differential Cryptanalysis of the Data Encryption Standard.
Springer, Heidelberg (1993)
6. Burwick, C., Coppersmith, D., D’Avignon, E., Gennaro, R., Halevi, S., Jutla, C.,
Matyas Jr., S.M., O’Connor, L., Peyravian, M., Safford, D., Zunic, N.: MARS - a
candidate cipher for AES (1998); AES submission
7. Choy, J., Yap, H.: Impossible Boomerang Attack for Block Cipher Structures. In:
Takagi, T., Mambo, M. (eds.) IWSEC 2009. LNCS, vol. 5824, pp. 22–37. Springer,
Heidelberg (2009)
8. Daemen, J., Rijmen, V.: AES Proposal: Rijndael (1998); NIST AES proposal
9. Keliher, L., Sui, J.: Exact Maximum Expected Differential and Linear Probability
for 2-Round Advanced Encryption Standard (AES) (2005); IACR ePrint report
2005/321
10. Kim, J., Hong, S., Lim, J.: Impossible differential cryptanalysis using matrix
method. Discrete Mathematics 310(5), 988–1002 (2010)
11. Kim, J., Hong, S., Sung, J., Lee, S., Lim, J., Sung, S.: Impossible Differential
Cryptanalysis for Block Cipher Structures. In: Johansson, T., Maitra, S. (eds.)
INDOCRYPT 2003. LNCS, vol. 2904, pp. 82–96. Springer, Heidelberg (2003)
12. Knudsen, L.R.: Deal — A 128-bit Block Cipher (1998); AES submission
13. Luo, Y., Wu, Z., Lai, X., Gong, G.: A Unified Method for Finding Impossible
Differentials of Block Cipher Structures (2009); IACR ePrint report 2009/627
14. Nyberg, K.: Generalized Feistel Networks. In: Kim, K., Matsumoto, T. (eds.) ASI-
ACRYPT 1996. LNCS, vol. 1163, pp. 91–104. Springer, Heidelberg (1996)
15. O’Connor, L.: On the Distribution of Characteristics in Bijective Mappings. In:
Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 360–370. Springer,
Heidelberg (1994)
16. Pudovkina, M.: On Impossible Truncated Differentials of Generalized Feistel and
Skipjack Ciphers. Presented at the Rump Session of the FSE 2009 Workshop
(2009),
http://fse2009rump.cr.yp.to/e31bba5d1227eac5ef0daa6bcbf66f27.pdf
17. Rivest, R.L., Robshaw, M.J., Sidney, R., Yin, Y.L.: The RC6 Block Cipher (1998);
AES submission
18. US Government: SKIPJACK and KEA Algorithm Specification (1998)
19. US National Institute of Standards and Technology: Advanced Encryption
Standard (2001); Federal Information Processing Standards Publications No. 197
A Unified Framework
for Small Secret Exponent Attack on RSA
1 Introduction
The RSA cryptosystem [12] is the most widely used public-key cryptosystem. Let N be an RSA modulus
and d be an RSA secret key. A small secret exponent d is often used to speed
up decryption or signature generation in some cryptographic applications.
However, it is well known that the RSA scheme is easily broken if the secret exponent d
is small.
In 1990, Wiener [14] showed that the RSA scheme can be broken by using continued
fraction expansion when d < (1/3)N^{1/4}. In 1999, Boneh and Durfee reduced the
small secret exponent attack to finding small roots of a bivariate modular
equation, x(A + y) ≡ 1 (mod e), and then proposed two algorithms for solving
this problem [2]. They referred to the problem as the small inverse problem.
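For concreteness, Wiener's continued-fraction attack can be demonstrated in a few lines. The sketch below uses our own toy (hopelessly insecure) parameters; it recovers d from the public pair (e, N) by testing the convergents of e/N:

```python
def convergents(num, den):
    # continued-fraction convergents h/k of num/den
    h0, h1, k0, k1 = 0, 1, 1, 0
    while den:
        q = num // den
        num, den = den, num - q * den
        h0, h1 = h1, q * h1 + h0
        k0, k1 = k1, q * k1 + k0
        yield h1, k1


def wiener_attack(e, N):
    # e*d = 1 + k*phi(N), so for small d the fraction k/d is a convergent of e/N
    for k, d in convergents(e, N):
        if k and pow(pow(2, e, N), d, N) == 2:  # cheap consistency test for d
            return d
    return None


p, q = 10007, 10009                    # toy primes, far too small for real use
N, phi = p * q, (p - 1) * (q - 1)
d = 31                                 # small secret exponent, below N**0.25 / 3
e = pow(d, -1, phi)                    # matching public exponent
print(wiener_attack(e, N) == d)
```

Once d exceeds roughly N^{1/4}, the convergents no longer reveal it, which is exactly the gap the lattice-based methods below address.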
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 260–277, 2012.
c Springer-Verlag Berlin Heidelberg 2012
2 Preliminaries
First, we briefly recall the LLL algorithm and Howgrave-Graham's lemma. Then,
we review the small secret exponent attack on the RSA cryptosystem [2] and
introduce the “small inverse problem.” Finally, we explain previous algorithms for
solving the small inverse problem.
262 N. Kunihiro, N. Shinohara, and T. Izu
where |x̄| < e^δ and |ȳ| < e^{1/2}. They referred to this as the small inverse problem.
polynomials are referred to as y-shifts. Let FBD(m; τ) be a set of shift-polynomials. The
set FBD(m; τ) is given by FBD(m; τ) := GBD(m) ∪ HBD(m; τ), where
They achieved the bound d ≤ N^{0.284} using FBD(m; τ). We refer to this method as
Boneh–Durfee's weaker method. Then, Boneh and Durfee improved the bound
to d ≤ N^{0.292} by removing the y-shift polynomials whose leading-term coefficient
exceeds e^m. The resulting lattice is not full-rank, and computing its volume is not
easy. To overcome this difficulty, they introduced the concept of a “Geometrically
Progressive Matrix” and succeeded in obtaining an upper bound on the volume. The
analysis of the resulting bound, especially the volume evaluation, is rather complicated.
Blömer and May [1] presented another algorithm. Although its bound, d ≤
N^{0.290}, is worse than Boneh–Durfee's, their method has several interesting
features. The first is that it requires a smaller lattice dimension for solving the
problem. The second is that the involved lattice is full-rank, and the analysis of
the bound is simpler than Boneh–Durfee's. However, the evaluation of the bound is
still complicated.
Herrmann and May [7] proposed a novel method which achieves the bound
d ≤ N^{0.292} by employing the unravelled linearization technique. We briefly recall
Herrmann–May's method. Note that we use different notation from the original
description of [7]. First, f(x, y) is transformed into f(x, y) = x(A + y) + 1 = (xy +
1) + Ax. The first step of their method is to perform a linearization of f(x, y) into
f̄(x, z) := z + Ax by setting xy + 1 = z. In the second step of the analysis, xy is back-substituted
by xy = z − 1 for each occurrence of xy. They define the polynomials
as ḡ[i,j](x, z) := x^i f̄(x, z)^j e^{m−j} and h̄[i,u](x, y, z) := y^i f̄(x, z)^u e^{m−u}. Let τ be
an optimization parameter with 0 < τ ≤ 1. Let FHM(m; τ) be a set of shift-polynomials.
The set FHM(m; τ) is given by FHM(m; τ) := GHM(m) ∪ HHM(m; τ),
where
They achieved the bound d ≤ N^{0.292} using FHM(m; τ). Note that its lattice is
also full-rank.
Let z̄ = x̄ȳ + 1. It is easy to see that ḡ[i,k](x̄, z̄) ≡ 0 (mod e^m) for any non-negative
integers i and k. The upper bound on |z̄| is given by XY + 1, and we
therefore define Z = XY + 1.
For y-shifts, we set
It is easy to see that h̄[i,k](x̄, ȳ, z̄) ≡ 0 (mod e^m) for any non-negative integers i
and k.
Remark 1. From the definition, it holds that ḡ[0,u] (x, z) = h̄[0,u] (x, y, z).
Next, we fix a set of indexes for shift-polynomials. Let t be a parameter which
is optimized later with 0 ≤ t ≤ m. Let FBM (m; t) be a set of shift-polynomials.
The set FBM (m; t) is given by FBM (m; t) := GBM (m; t) ∪ HBM (m; t), where
Lemma 3. If h̄[j,u] ∈ FBM(m; t), then h̄[j−1,u] and h̄[j−1,u−1] ∈ FBM(m; t).
Furthermore, it holds that h̄[j−1,u] ≺ h̄[j,u] and h̄[j−1,u−1] ≺ h̄[j,u].
Proof of Lemma 3. It is clear that h̄[j−1,u] ∈ FBM(m; t). Note that we can use
ḡ[0,u] instead of h̄[0,u], since h̄[0,u] and ḡ[0,u] are identical by Remark 1. Since
h̄[j,u] ∈ FBM(m; t), it holds that 1 ≤ j ≤ u + t − m. Then, 0 ≤ j − 1 ≤ (u − 1) + t − m.
Hence, it holds that h̄[j−1,u−1] ∈ FBM(m; t).
Theorem 1. Let m and t be integers with t ≤ m. A lattice basis matrix BBM (m; t)
is triangular for any m and t.
Before giving a proof, we give three lemmas, whose proofs are given in Ap-
pendix A.1.
Lemma 4. If 0 ≤ u ≤ m, then S(ḡ[u,0] − e^m x^u) = ∅.
From Lemmas 2–3 and 4–6, for any f ∈ F, there exists a monomial m_f such that S(f − m_f) ⊆
S(F^{≺f}). We can easily verify that m_f ∉ S(F^{≺f}). Then, the lattice basis matrix is
triangular.
A lattice basis is constructed using the coefficient vectors of the x-shifts ḡ[i,k](xX, zZ)
in GBM(m; t) and the y-shifts h̄[j,u](xX, yY, zZ) in HBM(m; t). We denote the numbers
of shift-polynomials used in x-shifts and y-shifts by wx and wy, respectively. We
also denote the contributions of the x-shifts and y-shifts to the lattice volume by vol(LX)
and vol(LY), respectively. The total number of shift-polynomials w is given by
w = wx + wy, and the lattice volume vol(L) is given by vol(L) = vol(LX)vol(LY).
First, we derive wx and vol(LX). The lattice dimension wx is given by wx =
Σ_{l=m−t}^{m} Σ_{k=0}^{l} 1. The volume vol(LX) is given by

vol(LX) = Π_{l=m−t}^{m} Π_{k=0}^{l} X^{l−k} Z^k e^{m−k} = e^{m·wx} Π_{l=m−t}^{m} Π_{k=0}^{l} X^{l−k} (Z/e)^k.
Let vol(LX) = e^{m·wx} X^{sXX} (Z/e)^{sXZ}. Each of sXX and sXZ is explicitly given as
follows:

sXX = Σ_{l=m−t}^{m} Σ_{k=0}^{l} (l − k) = (m³ − (m − t)³)/6 + o(m³) = ((1 − (1 − η)³)/6) m³ + o(m³),
sXZ = Σ_{l=m−t}^{m} Σ_{k=0}^{l} k = (m³ − (m − t)³)/6 + o(m³) = ((1 − (1 − η)³)/6) m³ + o(m³),

where η := t/m. Then, we have vol(LX) = e^{m·wx} X^{(1−(1−η)³)m³/6} (Z/e)^{(1−(1−η)³)m³/6}.
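The leading-order count can be sanity-checked numerically. This is our own check, not part of the paper's argument; it compares the exact double sum for sXX against the derived leading term:

```python
def s_xx(m, t):
    # the exact double sum: sum over l = m-t..m and k = 0..l of (l - k)
    return sum(l - k for l in range(m - t, m + 1) for k in range(l + 1))


m, t = 300, 150
approx = (1 - (1 - t / m) ** 3) * m ** 3 / 6  # leading term (1 - (1-eta)^3) m^3 / 6
print(s_xx(m, t), abs(s_xx(m, t) / approx - 1) < 0.02)  # 3982625 True
```

The relative error shrinks like 1/m, matching the o(m³) error term.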
Second, we derive wy and vol(LY). The lattice dimension wy is given by
wy = Σ_{l=0}^{t} Σ_{j=1}^{l} 1. The volume vol(LY) is given by

vol(LY) = Π_{l=0}^{t} Π_{j=1}^{l} Y^j Z^{l+m−t} e^{m−l−(m−t)} = e^{m·wy} Π_{l=0}^{t} Π_{j=1}^{l} Y^j (Z/e)^{l+m−t}.

Then, we have vol(LY) = e^{m·wy} Y^{η³m³/6} (Z/e)^{η²(3−η)m³/6}.
Summing up the above discussion, we have

vol(L) = vol(LX)vol(LY) = e^{mw} X^{(1−(1−η)³)m³/6} Y^{η³m³/6} (Z/e)^{ηm³/2}.   (1)

By combining Proposition 1 and Lemma 1, the condition that the problem can be
solved in polynomial time is given by 2^{w/4} vol(L)^{1/(w−1)} ≤ e^m/√w. By ignoring
small terms, we have the condition vol(L) ≤ e^{mw}. From Eq. (1), we have the
condition:

X^{3−3η+η²} Y^{η²} Z³ ≤ e³.   (2)
By substituting Z = XY + 1 ≤ 2XY and Y = e^{1/2} into Eq. (2) and neglecting
small terms which do not depend on e, we have the following inequality for X:

X < e^{(3−η²)/(2(6−3η+η²))}.

The maximum value of the exponent on the right-hand side is
(√6 − 1)/5 ≈ 0.290, attained at η = 3 − √6 ≈ 0.55. This is exactly the same as the
bound of Blömer–May [1].
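The maximization above is easy to cross-check numerically (our own sanity check, not part of the paper's argument):

```python
from math import sqrt


def exponent(eta):
    # the exponent of e in the bound X < e^((3 - eta^2) / (2(6 - 3*eta + eta^2)))
    return (3 - eta ** 2) / (2 * (6 - 3 * eta + eta ** 2))


# grid search over 0 <= eta <= 1 in steps of 1e-5
best_val, best_eta = max((exponent(i / 10 ** 5), i / 10 ** 5) for i in range(10 ** 5 + 1))
print(abs(best_val - (sqrt(6) - 1) / 5) < 1e-8,
      abs(best_eta - (3 - sqrt(6))) < 1e-4)  # True True
```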
Lemma 8. Suppose that 0 < τ ≤ 1. If h̄[j,u] ∈ F (m; τ, η), then h̄[j−1,u] and
h̄[j−1,u−1] ∈ F (m; τ, η). Furthermore, it holds that h̄[j−1,u] ≺ h̄[j,u] and h̄[j−1,u−1]
≺ h̄[j,u] .
We show that our framework includes the previous works as special cases. First, we
show that our framework includes Herrmann–May's work [7] as a special case.
We gave the set of shift-polynomials FHM(m; τ) for Herrmann–May's method in
Section 2.3. From the definition, it holds that FHM(m; τ) = F(m; τ, 1); that is,
Herrmann–May's method is obtained by setting η = 1 in our unified framework.
Note that t/m ≤ 1 from the definition. Then, Blömer–May's method is obtained
by setting τ = 1 in our unified framework.
From Lemmas 4–6 and 7–8, for any f ∈ F, there exists a monomial m_f such that S(f − m_f) ⊆
S(F^{≺f}). We can easily verify that m_f ∉ S(F^{≺f}). Then, the lattice basis matrix is
triangular.
We show a small example for m = 3, τ = 1/2 and η = 1/3. Explicitly, we have
G(3; 1/3) = {ḡ[2,0], ḡ[1,1], ḡ[0,2], ḡ[3,0], ḡ[2,1], ḡ[1,2], ḡ[0,3]} and H(3; 1/2, 1/3) = {h̄[1,3]}.
vol(LX) = Π_{l=m(1−η)}^{m} Π_{k=0}^{l} X^{l−k} Z^k e^{m−k} = e^{m·wx} Π_{l=m(1−η)}^{m} Π_{k=0}^{l} X^{l−k} (Z/e)^k.
Let vol(LX) = e^{m·wx} X^{sXX} (Z/e)^{sXZ}. Each of sXX and sXZ is explicitly given as
follows:

sXX = Σ_{l=m(1−η)}^{m} Σ_{k=0}^{l} (l − k) = ((1 − (1 − η)³)/6) m³ + o(m³) and
sXZ = Σ_{l=m(1−η)}^{m} Σ_{k=0}^{l} k = ((1 − (1 − η)³)/6) m³ + o(m³).

Then, we have

vol(LX) = e^{m·wx} X^{(1−(1−η)³)m³/6} (Z/e)^{(1−(1−η)³)m³/6}.
vol(LY) = Π_{l=0}^{ηm} Π_{j=1}^{τl} Y^j Z^{l+m(1−η)} e^{m−l−m(1−η)} = e^{m·wy} Π_{l=0}^{ηm} Π_{j=1}^{τl} Y^j (Z/e)^{l+m(1−η)}.

Then, we have

vol(LY) = e^{m·wy} Y^{η³τ²m³/6} (Z/e)^{τη²(3−η)m³/6}.
Remember that, ignoring small terms, the condition that the problem can be solved
in polynomial time is given by vol(L) ≤ e^{mw}. From Eq. (3), we have the
condition:

X^{3−3η+η²} Y^{τ²η²} (Z/e)^{(3−3η+η²)+τ(3η−η²)} ≤ 1.   (4)
As described in the previous subsection, we obtain the same set as that of Herrmann–May
or Blömer–May if we set η = 1 or τ = 1, respectively. The derivation of the bounds for each
case is given in the full version [9].
Let P and P̄ be the sets P = {(τ, η) | 0 < τ < 1, 0 < η < 1} and
P̄ = {(τ, η) | 0 < τ ≤ 1, 0 < η ≤ 1}. In order to obtain the maximal value of the
right side of Eq. (5) in P̄, we first consider the extremal values of the following
function Ψ(τ, η) in P:

Ψ(τ, η) := ((3 − 3η + η²) + (3η − η²)τ − η²τ²) / (2(3 − 3η + η²) + (3η − η²)τ).
Let Num(τ, η) and Den(τ, η) be the numerator and denominator of Ψ(τ, η),
respectively. Here, we show that Den(τ, η) ≠ 0 in P. If Den(τ, η) = 0, then we
have

0 < τ = 2(3 − 3η + η²)/(η² − 3η) = 2((η − 3/2)² + 3/4)/((η − 3)η).
However, this contradicts the condition 0 < η < 1. Therefore, the rational function
Ψ(τ, η) ∈ Q(τ, η) is obviously differentiable in P. By solving the algebraic
equation ∂Ψ/∂τ = ∂Ψ/∂η = 0, we show that there are no extremal values of Ψ(τ, η) in
P. Let Φτ(τ, η), Φη(τ, η) be the polynomials

Φτ(τ, η) := (∂Ψ/∂τ) · Den(τ, η)²,  Φη(τ, η) := (∂Ψ/∂η) · Den(τ, η)².
Note that both Φτ and Φη are in Z[τ, η], and we solve the algebraic equation
Φτ = Φη = 0 by computing a Gröbner basis. Let G be the Gröbner basis of
the ideal generated by Φτ, Φη with respect to the lexicographic order ≺LEX such
that η ≺LEX τ. Then G contains three polynomials in Z[τ, η], and one of them
is m(η) such that
This fact implies that, for every extremal value Ψ(τ0, η0) with (τ0, η0) ∈ R², η0
is a root of m(η) over R. Since m(η) does not have a root in the real interval
(0, 1), there are no extremal values of Ψ(τ, η) in P.
Hence, we only have to check the maximal values of Ψ(0, η) and Ψ(1, η) for 0 ≤
η ≤ 1 and of Ψ(τ, 0) and Ψ(τ, 1) for 0 ≤ τ ≤ 1; furthermore, the two cases τ = 1
and η = 1 are discussed above. The maximal value of the right side of Eq. (5)
for τ = 0 or η = 0 is e^{1/4}, since Ψ(0, η) = Ψ(τ, 0) = 1/2, and thus the maximal
value of the right side of Eq. (5) in P̄ is e^{1−1/√2}.
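This boundary analysis can be cross-checked numerically. The check below is our own; on our reading, the right side of Eq. (5) has the form e^{Ψ(τ,η)/2}, so its maximum e^{1−1/√2} corresponds to Ψ = 2 − √2, attained on the boundary at (τ, η) = (√2 − 1, 1):

```python
from math import sqrt


def psi(tau, eta):
    num = (3 - 3 * eta + eta ** 2) + (3 * eta - eta ** 2) * tau - eta ** 2 * tau ** 2
    den = 2 * (3 - 3 * eta + eta ** 2) + (3 * eta - eta ** 2) * tau
    return num / den


grid = [i / 400 for i in range(1, 401)]  # (0, 1] in steps of 1/400
m_best = max(psi(t, h) for t in grid for h in grid)
print(m_best <= 2 - sqrt(2) + 1e-12)  # True: Psi never exceeds 2 - sqrt(2)
print(abs(psi(sqrt(2) - 1, 1.0) - (2 - sqrt(2))) < 1e-12)  # True: attained at the boundary
```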
We will present a hybrid method for arbitrary α in the full version [9]. Theorem 4
shows that the Blömer–May-like method (τ = 1) is superior to the Herrmann–May-like
method (η = 1) if α < 1/4. Interestingly, if α is extremely small (α < 3/35),
the Herrmann–May-like and Blömer–May-like methods are not the best-known
algorithms. We show the details in the full version [9]. We also show another extension
in Appendix B.
6 Concluding Remarks
We should point out the relationship between our results and the discussion
in May's PhD thesis [11]. He presented the interpolation between the results
of Blömer–May and Boneh–Durfee by using a concept called a strictly decreasing
pattern in Section 7 of [11]. He also argued that Boneh–Durfee's stronger bound is
optimal over all decreasing patterns. However, no formal proof of its optimality
was given in [11]. In contrast to [11], we give a strict proof of the
optimality within our framework in Section 4. Furthermore, we extend our results
to arbitrary Y = e^α, which was not discussed in [11] and is a further advantage
over [11].
It is known that the Blömer–May method has the advantage of requiring a smaller
lattice dimension than the Boneh–Durfee lattice. Theorem 4 gives another view of
their algorithm: the Blömer–May method has a second advantage in that it achieves
a better bound in addition to the smaller lattice dimension; it achieves a higher
bound than the Herrmann–May method (and Boneh–Durfee's method) if α ≤ 1/4.
For the usual small secret exponent attack on RSA, we showed that
d ≤ N^0.292 is the optimal bound in our framework. Hence, the bound might be
improved only by developing a method outside our framework, which is an
open problem.
References
1. Blömer, J., May, A.: Low Secret Exponent RSA Revisited. In: Silverman, J.H. (ed.)
CaLC 2001. LNCS, vol. 2146, pp. 4–19. Springer, Heidelberg (2001)
2. Boneh, D., Durfee, G.: Cryptanalysis of RSA with private key d less than N^0.292.
IEEE Transactions on Information Theory 46(4), 1339–1349 (2000); (first ap-
peared at Eurocrypt 1999)
3. Coppersmith, D.: Finding a Small Root of a Univariate Modular Equation. In:
Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 155–165. Springer,
Heidelberg (1996)
4. Coppersmith, D.: Finding a Small Root of a Bivariate Integer Equation; Factor-
ing with High Bits Known. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS,
vol. 1070, pp. 178–189. Springer, Heidelberg (1996)
5. Coppersmith, D.: Small Solutions to Polynomial Equations, and Low Exponent
RSA Vulnerabilities. J. Cryptology 10(4), 233–260 (1997)
A Unified Framework for Small Secret Exponent Attack on RSA 275
6. Herrmann, M., May, A.: Attacking Power Generators Using Unravelled Lineariza-
tion: When Do We Output Too Much? In: Matsui, M. (ed.) ASIACRYPT 2009.
LNCS, vol. 5912, pp. 487–504. Springer, Heidelberg (2009)
7. Herrmann, M., May, A.: Maximizing Small Root Bounds by Linearization and
Applications to Small Secret Exponent RSA. In: Nguyen, P.Q., Pointcheval, D.
(eds.) PKC 2010. LNCS, vol. 6056, pp. 53–69. Springer, Heidelberg (2010)
8. Howgrave-Graham, N.: Finding Small Roots of Univariate Modular Equations Re-
visited. In: IMA Int. Conf., pp. 131–142 (1997)
9. Kunihiro, N., Shinohara, N., Izu, T.: A Unified Framework for Small Secret Expo-
nent Attack on RSA. IACR ePrint Archive
10. Lenstra, A.K., Lenstra, H.W., Lovász, L.: Factoring polynomials with rational co-
efficients. Mathematische Annalen 261, 515–534 (1982)
11. May, A.: New RSA Vulnerabilities Using Lattice Reduction Methods. PhD thesis,
University of Paderborn (2003)
12. Rivest, R., Shamir, A., Adleman, L.: A Method for Obtaining Digital Signa-
tures and Public-Key Cryptosystems. Communications of the ACM 21(2), 120–126
(1978)
13. Sarkar, S., Maitra, S., Sarkar, S.: RSA Cryptanalysis with Increased Bounds on
the Secret Exponent using Less Lattice Dimension. IACR ePrint Archive: Report
2008/315 (2008)
14. Wiener, M.: Cryptanalysis of Short RSA Secret Exponents. IEEE Transactions on
Information Theory 36, 553–558 (1990)
A Proofs
ḡ_[u−j,j](x, z) = e^{m−j} x^{u−j} (z + Ax)^j = e^{m−j} x^{u−j} Σ_{i=0}^{j} C(j, i) z^i x^{j−i} A^{j−i}

= e^{m−j} x^{u−j} z^j + Σ_{i=0}^{j−1} e^{m−j} C(j, i) A^{j−i} x^{u−i} z^i,

where C(j, i) denotes the binomial coefficient. Then, we have

ḡ_[u−j,j](x, z) − e^{m−j} x^{u−j} z^j ≅ Σ_{i=0}^{j−1} x^{u−i} z^i = x^{u−j+1} Σ_{i=0}^{j−1} x^{(j−1)−i} z^i

≅ x^{u−j+1} (z + Ax)^{j−1} ≅ ḡ_[u−j+1,j−1].
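As I read it, the relation ≅ identifies polynomials whose monomial sets in x and z coincide, coefficients (including powers of A and e) ignored. A small self-contained check of the first equivalence (my sketch; polynomials are dictionaries from (deg_x, deg_z, deg_A) to coefficients, the constant factor e^{m−j} is dropped, and u, j are sample values of my choosing):

```python
# Check that x^(u-j) (z + Ax)^j - x^(u-j) z^j has the same (x, z)-monomial set
# as x^(u-j+1) (z + Ax)^(j-1), i.e. the step  g[u-j,j] - e^(m-j) x^(u-j) z^j
# ~= g[u-j+1,j-1]  in the proof above.
def pmul(p, q):
    r = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            k = tuple(a + b for a, b in zip(m1, m2))
            r[k] = r.get(k, 0) + c1 * c2
    return {k: c for k, c in r.items() if c}

def padd(p, q):
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + c
    return {k: c for k, c in r.items() if c}

def ppow(p, n):
    r = {(0, 0, 0): 1}
    for _ in range(n):
        r = pmul(r, p)
    return r

X, Z, AX = {(1, 0, 0): 1}, {(0, 1, 0): 1}, {(1, 0, 1): 1}
u, j = 7, 4
g = pmul(ppow(X, u - j), ppow(padd(Z, AX), j))            # x^(u-j) (z + Ax)^j
lhs = padd(g, {(u - j, j, 0): -1})                        # subtract x^(u-j) z^j
rhs = pmul(ppow(X, u - j + 1), ppow(padd(Z, AX), j - 1))
xz = lambda p: {(dx, dz) for dx, dz, dA in p}
assert xz(lhs) == xz(rhs)
```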
Then, we have

h̄_[j,u](x, y, z) − e^{m−u} y^j z^u ≅ Σ_{i=0}^{u−1} x^{u−i} y^j z^i = y^{j−1} · xy · Σ_{i=0}^{u−1} x^{(u−1)−i} z^i

≅ y^{j−1} (z − 1)(z + Ax)^{u−1} ≅ y^{j−1} (z + Ax)^{u−1} z + y^{j−1} (z + Ax)^{u−1}

≅ h̄_[j−1,u−1] z + h̄_[j−1,u−1].
Hence, we have
Let P and P̄ be the sets defined in the proof of Theorem 3. In order to obtain
the maximal value of the right side of (6) in P̄, we first consider the extremal
values of the following function Ψα(τ, η) in P:

Ψα(τ, η) = [(1 − α)((3 − 3η + η²) + (3η − η²)τ) − αη²τ²] / [2(3 − 3η + η²) + (3η − η²)τ].

Notice that the denominator of Ψα(τ, η) is Den(τ, η) given in the proof of The-
orem 3, and so Ψα(τ, η) is also differentiable in P.
In the same manner as the proof of Theorem 3, we show that there are no
extremal values of Ψα(τ, η) in P for any α ∈ (0, 1). Let Φτ^(α)(τ, η), Φη^(α)(τ, η) be
the polynomials

Φτ^(α)(τ, η) = (∂Ψα/∂τ) · Den(τ, η)²,   Φη^(α)(τ, η) = (∂Ψα/∂η) · Den(τ, η)².

We solve the algebraic equation Φτ^(α) = Φη^(α) = 0 by introducing a Gröbner basis.
Let Gα be the Gröbner basis under 0 < α < 1 for the ideal generated by
Φτ^(α), Φη^(α) with respect to the lexicographic order ≺LEX such that η ≺LEX τ.
One of the polynomials in Gα is mα(η), namely

mα(η) = η(η − 1)(η − 3)(η² − 3η + 3){3α(η − 1)² + (η − 3)²}.
This fact implies that, for every extremal value Ψα (τ0 , η0 ) where (τ0 , η0 ) ∈ R2 ,
η0 is a root of mα (η) over R. Since mα (η) does not have its root in the real
interval (0, 1), there are no extremal values of Ψα (τ, η) in P.
Hence, we only have to check the maximal values of Ψα(0, η), Ψα(1, η) for
0 ≤ η ≤ 1 and Ψα(τ, 0), Ψα(τ, 1) for 0 ≤ τ ≤ 1. If η = 0 or τ = 0, then
Ψα(τ, 0) = Ψα(0, η) = (1 − α)/2, and so the maximal value for η = 0 or τ = 0 is
(1 − α)/2.
For η = 1, we have that

Ψα(τ, 1) = [−ατ² + (1 − α)(1 + 2τ)] / [2(τ + 1)],

and so the maximal value for η = 1 is

3/4 − α   (τ = 1, 0 < α < 1/4),
1 − √α   (τ = 1/√α − 1, 1/4 ≤ α < 1).   (7)
For τ = 1, we have that

Ψα(1, η) = [3 − α(η² + 3)] / [6 − 3η + η²],

and so the maximal value for τ = 1 is

(2/5)(√(4α² − α + 1) − 3α + 1).   (8)
By comparing with the above values, we have the theorem.
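To double-check the reconstruction of the boundary maxima, a numeric sketch (mine, not the paper's): the value (2/5)(√(4α² − α + 1) − 3α + 1) in (8) equals the maximum of Ψα(1, η) over η ∈ [0, 1] whenever the interior critical point lies in [0, 1], i.e. for α ≥ 1/4; for smaller α the maximum over [0, 1] is Ψα(1, 1) = 3/4 − α, the τ = 1 branch of (7):

```python
import math

def psi_tau1(alpha, eta):
    # Psi_alpha(1, eta) = (3 - alpha*(eta^2 + 3)) / (6 - 3*eta + eta^2)
    return (3 - alpha * (eta**2 + 3)) / (6 - 3*eta + eta**2)

def closed_form(alpha):
    # (2/5)(sqrt(4*alpha^2 - alpha + 1) - 3*alpha + 1), the value in (8)
    return 0.4 * (math.sqrt(4*alpha**2 - alpha + 1) - 3*alpha + 1)

N = 10**4
for alpha in (0.25, 0.5, 0.75, 0.9):
    grid_max = max(psi_tau1(alpha, e / N) for e in range(N + 1))
    assert abs(grid_max - closed_form(alpha)) < 1e-6

# For alpha < 1/4 the critical point eta_0 = ((alpha+1) - sqrt(4a^2-a+1))/alpha
# exceeds 1, and the maximum over [0, 1] is Psi_alpha(1, 1) = 3/4 - alpha.
grid_max = max(psi_tau1(0.1, e / N) for e in range(N + 1))
assert abs(grid_max - (0.75 - 0.1)) < 1e-6
```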
Very Compact Hardware Implementations of the Blockcipher CLEFIA

Toru Akishita and Harunaga Hiwatari

Sony Corporation
5-1-12 Kitashinagawa, Shinagawa-ku, Tokyo 141-0001, Japan
{Toru.Akishita,Harunaga.Hiwatari}@jp.sony.com
1 Introduction
CLEFIA [9,11] is a 128-bit blockcipher supporting key lengths of 128, 192 and
256 bits, compatible with AES [2]. CLEFIA achieves sufficient immunity
against known attacks and flexibility for efficient implementation in both hard-
ware and software. It is reported that CLEFIA is highly efficient particularly in
hardware implementations [12,10,13].
Compact hardware implementations are particularly important for small embedded
devices such as RFID tags and wireless sensor nodes because of their limited
hardware resources. For AES with 128-bit keys, low-area hardware implemen-
tations have been reported in [3] and [4]. The former uses a RAM-based archi-
tecture supporting both encryption and decryption with area requirements
of 3,400 GE, while the latter uses a shift-register-based architecture supporting
encryption only with area requirements of 3,100 GE. Both implementations
use an 8-bit serialized data path and implement only a fraction of the MixColumns
operation with three additional 8-bit registers, where it takes several
clock cycles to calculate one column. Very recently, another low-area hardware
implementation of AES was proposed in [5], requiring 2,400 GE for encryption
only. Unlike the previous two implementations, it does not implement MixColumns
in a serialized way: one column of MixColumns is processed in one clock
cycle. Thus it requires four times more XOR gates for MixColumns, but needs
no additional register and can reduce the gate requirements for control logic.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 278–292, 2012.
c Springer-Verlag Berlin Heidelberg 2012
Very Compact Hardware Implementations of CLEFIA 279
The diffusion matrices used in the F-functions are the Hadamard-type matrices

M0 = ( 01 02 04 06        M1 = ( 01 08 02 0A
       02 01 06 04               08 01 0A 02
       04 06 01 02               02 0A 01 08
       06 04 02 01 ),            0A 02 08 01 ).

The multiplications between these matrices and vectors are performed in GF(2^8)
defined by the primitive polynomial z^8 + z^4 + z^3 + z^2 + 1.
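For illustration (my sketch, not the paper's circuit), multiplication in this field is a carry-less multiply with reduction modulo 0x11D, the bit pattern of z^8 + z^4 + z^3 + z^2 + 1; the matrix constants all chain from the {02} multiplier, which is why a serialized data path needs only a single {02} circuit plus XORs:

```python
def gf_mult(a: int, b: int) -> int:
    """Multiply a, b in GF(2^8) defined by z^8 + z^4 + z^3 + z^2 + 1 (0x11D)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:      # a degree-8 term appeared: reduce by the polynomial
            a ^= 0x11D
    return r

# The constants chain from {02}:
#   {04} = {02}*{02},  {06} = {04} xor {02},
#   {08} = {02}*{04},  {0A} = {08} xor {02}.
for a in range(256):
    assert gf_mult(0x04, a) == gf_mult(0x02, gf_mult(0x02, a))
    assert gf_mult(0x06, a) == gf_mult(0x04, a) ^ gf_mult(0x02, a)
    assert gf_mult(0x08, a) == gf_mult(0x02, gf_mult(0x04, a))
    assert gf_mult(0x0A, a) == gf_mult(0x08, a) ^ gf_mult(0x02, a)
```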
280 T. Akishita and H. Hiwatari
[Figures omitted: the encryption data path (plaintext words P0–P3, round keys RKi, whitening keys WKi, and alternating F0/F1 rounds) and the two F-functions. In F0 the round-key bytes k0–k3 are XORed into the input bytes x0–x3, which pass through the S-boxes in the order S0, S1, S0, S1 and then the matrix M0 to produce y0–y3; F1 uses the S-box order S1, S0, S1, S0 and the matrix M1.]

Fig. 2. F-functions F0, F1
[Figure omitted: the final rounds with round keys RK2r−2, RK2r−1, whitening keys WK2, WK3, and ciphertext output C0–C3; the bit-group widths (7, 57, 57, 7) appear in the 128-bit key-register update.]

Ω : X ↦ Y,  Y = X[64-127] | X[0-63]
Ψ : X ↦ Y,  Y = X[71-127] | X[57-70] | X[0-56]
X[a-b] denotes the bit string from the a-th bit through the b-th bit of X. Please note
that Ω and Ψ are both involutive. The 128-bit key register for the intermediate
key L is updated by applying Ω and Ψ alternately. Round keys are always
generated from the most significant 64 bits of the key register. After the final
round of encryption, L is re-stored into the key register by applying the following
FinalSwap function Φ.

Φ : X ↦ Y
Y = X[49-55] | X[42-48] | X[35-41] | X[28-34] | X[21-27] | X[14-20] |
X[7-13] | X[0-6] | X[64-71] | X[56-63] | X[121-127] | X[114-120] |
X[107-113] | X[100-106] | X[93-99] | X[86-92] | X[79-85] | X[72-78]

Please note that Φ is also involutive. In the case of decryption, round keys are always
generated from the most significant 64 bits of the key register by applying the
inverse functions of Ω, Ψ and Φ in the reverse order of encryption. Due to their
involutive property, only the three functions Ω, Ψ and Φ are required for both encryption
and decryption.
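The involutive property is easy to check; a sketch (my illustration) treats X as a 128-character bit string with bit 0 leftmost, so that X[a-b] is the slice x[a:b+1]:

```python
import random

def omega(x):   # Y = X[64-127] | X[0-63]
    return x[64:128] + x[0:64]

def psi(x):     # Y = X[71-127] | X[57-70] | X[0-56]
    return x[71:128] + x[57:71] + x[0:57]

def phi(x):     # FinalSwap: Y = X[49-55] | X[42-48] | ... | X[72-78]
    spans = [(49, 55), (42, 48), (35, 41), (28, 34), (21, 27), (14, 20),
             (7, 13), (0, 6), (64, 71), (56, 63), (121, 127), (114, 120),
             (107, 113), (100, 106), (93, 99), (86, 92), (79, 85), (72, 78)]
    return ''.join(x[a:b + 1] for a, b in spans)

x = ''.join(random.choice('01') for _ in range(128))
for f in (omega, psi, phi):
    assert len(f(x)) == 128
    assert f(f(x)) == x          # all three functions are involutive
```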
In the second technique, the XOR operations with the parts of round keys re-
lated to the secret key K are moved by an equivalent transformation onto the two
data lines where the key whitening operations are processed. Therefore, these XOR
operations and the key whitening operations can be shared.
In [13], five types of hardware architectures were designed and fairly compared
to the ISO 18033-3 standard blockciphers under the same conditions. In their
results, the highest efficiency of 400.96 Kbps/gates was achieved, which is at
least 2.2 times higher than that of the ISO 18033-3 standard blockciphers.
3 Proposed Architectures
In this section we propose three types of hardware architectures. First, we
propose a compact matrix multiplier for CLEFIA-128. Next, as the Type-I architec-
ture, we propose a novel serialized architecture for the data processing block of
CLEFIA-128. By adaptively applying clock gating logic to the Type-I architecture,
282 T. Akishita and H. Hiwatari
[Figure omitted: (a) the F-function F0 and (b) the data path of the proposed matrix multiplier. Four 8-bit registers R0–R3 accumulate the products of the S-box outputs a0–a3 with the matrix constants, which are generated from a {02} multiplier; MUX1 switches between the constant sets of M0 and M1, and MUX2 selects the intermediate results.]
l 1 2 3 4
R0 z3 ⊕{06}a0 z2 ⊕{04}a0 ⊕{06}a1 z1 ⊕{02}a0 ⊕a1 ⊕{06}a2 z0 ⊕a0 ⊕{02}a1 ⊕{04}a2 ⊕{06}a3
R1 z2 ⊕{04}a0 z3 ⊕{06}a0 ⊕{04}a1 z0 ⊕a0 ⊕{02}a1 ⊕{04}a2 z1 ⊕{02}a0 ⊕a1 ⊕{06}a2 ⊕{04}a3
R2 z1 ⊕{02}a0 z0 ⊕a0 ⊕{02}a1 z3 ⊕{06}a0 ⊕{04}a1 ⊕{02}a2 z2 ⊕{04}a0 ⊕{06}a1 ⊕a2 ⊕{02}a3
R3 z0 ⊕a0 z1 ⊕{02}a0 ⊕a1 z2 ⊕{04}a0 ⊕{06}a1 ⊕a2 z3 ⊕{06}a0 ⊕{04}a1 ⊕{02}a2 ⊕a3
(c)
Fig. 4. Matrix multiplier: (a) F -function F0 , (b) Data path, (c) Contents of registers
Rj (0 ≤ j < 4) at the l-th cycle
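Abstracting away the per-cycle register rotation that Fig. 4(c) tracks explicitly, the schedule is a column-serial multiply-accumulate: at cycle l the byte a_{l−1} enters and each register absorbs its product with one constant of column l−1, so after 4 cycles register j holds z_j XOR (M0·a)_j. A sketch of this claim (my illustration; a software GF(2^8) multiplier stands in for the {02}-based circuit):

```python
import random

def gf_mult(a: int, b: int) -> int:
    # GF(2^8) multiplication modulo z^8 + z^4 + z^3 + z^2 + 1 (0x11D)
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
    return r

# Hadamard-type matrix M0; rows read off the final column of Fig. 4(c).
M0 = [[0x01, 0x02, 0x04, 0x06],
      [0x02, 0x01, 0x06, 0x04],
      [0x04, 0x06, 0x01, 0x02],
      [0x06, 0x04, 0x02, 0x01]]

def matvec(M, v):
    # Reference row-by-row computation of M*v over GF(2^8)
    out = []
    for row in M:
        acc = 0
        for c, x in zip(row, v):
            acc ^= gf_mult(c, x)
        out.append(acc)
    return out

def serialized(M, z, a):
    R = list(z)                       # the registers are preloaded with z_j
    for l in range(4):                # cycle l+1: byte a[l] enters
        for j in range(4):
            R[j] ^= gf_mult(M[j][l], a[l])
    return R

z = [random.randrange(256) for _ in range(4)]
a = [random.randrange(256) for _ in range(4)]
assert serialized(M0, z, a) == [zj ^ wj for zj, wj in zip(z, matvec(M0, a))]
```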
[Fig. 5 omitted: Type-I architecture. The data processing block consists of the sixteen 8-bit registers R00–R33 arranged as shift registers around the matrix multiplier and the S-boxes S0, S1, with data_in and data_out ports; the key scheduling block consists of the sixteen 8-bit key registers L00–L33 fed from key_in.]
performed by selecting the intermediate results through MUX2 and XORing the
products of ai (i = 1, 2, 3) with them at the (i + 1)-th cycle. After 4 clock cycles,
wi (0 ≤ i < 4) are stored in Ri. The multiplication by M1 can be performed by
switching MUX1.
In [4], three 8-bit registers are required to construct a parallel-to-serial
converter, in order to avoid register contention with the next matrix calculation.
In contrast, no contention occurs in our architecture because zi is input at the
first cycle of a matrix multiplication: wi can be moved into the register where
zi for the next F-function to be processed is stored.
for a round of the encryption processing. As described in Sect. 3.1, at the 1st
and the 5th cycles of the 8 cycles, the data stored in R20–R23 are moved into
R03–R12, and simultaneously the data stored in R10–R13 are input to the matrix
multiplier. Therefore, no register other than the 128-bit data register exists
in the data processing block. Please note that R30–R33 hold the current state
during the 5th–8th cycles by clock gating.
At the start of encryption, a 128-bit plaintext is loaded into Rij in 16 clock
cycles by inputting it byte by byte from data_in. After 18 rounds of the encryp-
tion function, which require 144 cycles, the 128-bit ciphertext is output byte by
byte from data_out in 16 clock cycles. Therefore, encryption takes 176 cycles.
The reason why data_out is connected to R30 is that no word rotation is
necessary in the final round of encryption. At the start of key setup, a 128-bit
secret key K input from key_in is loaded into Rij in 16 clock cycles. After 12
rounds of the encryption function, which require 96 cycles, the 128-bit intermediate
key L is stored into the key registers Lij (0 ≤ i, j < 4) by shifting Rij and Lij
in 16 clock cycles. Therefore, key setup takes 128 cycles.
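The cycle counts follow from 8 cycles per round plus the 16-cycle byte-serial load and unload; quick bookkeeping (my check):

```python
# Cycle-count bookkeeping for the Type-I architecture described above.
CYCLES_PER_ROUND, LOAD, UNLOAD = 8, 16, 16
assert LOAD + 18 * CYCLES_PER_ROUND + UNLOAD == 176   # encryption: 18 rounds
assert LOAD + 12 * CYCLES_PER_ROUND + UNLOAD == 128   # key setup: 12 rounds
```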
The two S-box circuits S0 and S1 are located in the data processing block,
and one of their outputs is selected by a 2-to-1 MUX (8-bit width) and input to
the matrix multiplier. The encryption processing of CLEFIA-128 is modified by
an equivalent transformation as shown in Fig. 7 (a). The 32-bit XOR operation
with the 32-bit chunks Ki is reduced to an 8-bit XOR operation by locating it in
the matrix multiplier. The 32-bit chunk Ki selected by a 32-bit 4-to-1 MUX is
divided into four 8-bit values, and one of them is selected by an 8-bit 4-to-1
MUX and fed into the matrix multiplier byte by byte over 4 clock cycles.
In the key scheduling block, the intermediate key L stored in Lij is cyclically
shifted by one byte, and the 8-bit chunk in L00 is fed into the data processing
block after being XORed with the 8-bit chunk of CONi. At the end of even-numbered
rounds, Lij is updated by the (8-bit shift + Σ) operation; at the end of encryption, Lij
is updated by the (8-bit shift + Σ−8) operation in order to recover the intermediate
key L. After restoring the intermediate key L, Lij holds it by clock gating until
the next start of encryption.
[Figure omitted: Type-II/III architecture. It differs from the Type-I data path in reduced MUX fan-ins, and in the key scheduling block the Σ function is placed on the 128-bit key register path.]
[Figure residue omitted: the three subfigures show the rounds with the modified round keys RKj* and the secret-key chunks K0–K3 XORed on the data lines.]

Fig. 7. (a) Encryption processing, (b) Decryption processing, (c) Optimized decryption
processing. XOR operations with the parts of round keys related to the secret key K are
moved by an equivalent transformation; RKj* (0 ≤ j < 36) denote the
remaining parts of the round keys.
those stored in R20–R23 are swapped by cyclically shifting these registers in 4
clock cycles, while the other data registers and the key registers hold the current
state by clock gating. Simultaneously, the XOR operation with a 32-bit chunk
Ki is done by the XOR gates in the matrix multiplier, which leads to savings
of 8 XOR gates. These data swaps are required twice per round of the en-
cryption processing. Therefore, it takes 16 cycles per round of the encryption
processing; in total, 328 and 224 clock cycles are required for encryption and key
setup, respectively. In compensation for the increased cycle count, several
8-bit MUX inputs together with 8 XOR gates for the secret key chunk can be
discarded.
4 Supporting Decryption
Any encryption-only implementation can support decryption by using the CTR
mode. Yet, if the implementation itself supports decryption, it can be used for
more applications, e.g., applications requiring the CBC mode. Accordingly, we
consider three types of hardware architectures supporting decryption.
Since the data processing part of CLEFIA employs a 4-branch Type-2 gen-
eralized Feistel network [14], the directions of word rotation differ be-
tween the encryption function and the decryption function. The encryption and
[Gate diagrams omitted.]

Fig. 8. 4-input AND-NOR and 4-input OR-NAND gates with 2 inputs inverted, which
correspond to XOR and XNOR gates
5 Implementation Results
One Gate Equivalent (GE) is equivalent to the area of a 2-way NAND with the
lowest drive strength. For synthesis, we use a clock frequency of 100 KHz, which
is a widely used operating frequency for RFID applications.
Recently, scan flip-flops have been used instead of combinations of D flip-flops
and 2-to-1 MUXes in low-area implementations of blockciphers [8,1,5] to
reduce area requirements. In our evaluation, a D flip-flop and a 2-to-1 MUX cost
4.5 and 2.0 GE, respectively, while a scan flip-flop costs 6.25 GE. Thus, we can
save 0.25 GE per bit of storage. Moreover, the library we used has the 4-input
AND-NOR and 4-input OR-NAND gates with 2 inputs inverted described in
Fig. 8. The outputs of these cells correspond to those of XOR or XNOR
gates when the inputs X, Y are set as shown in Fig. 8. Since these cells cost 2
GE instead of the 2.25 GE required for an XOR or XNOR cell, we can save 0.25 GE per
XOR or XNOR gate. Clock gating logic is inserted into the design manually
by instantiating Integrated Clock Gating (ICG) cells to gate the clocks of specific
registers.
Table 1 shows the detailed implementation figures of the three types of hard-
ware architectures presented in Sect. 3. The CON generator and selector, ICG cells,
and buffers are included in the controller.
The area savings for the key scheduling block of Type-II/III implementation
over Type-I implementation are 128 GE. In the library we used, a register with
a 3-to-1 MUX costs 7.25 GE per bit; a register with a 4-to-1 MUX costs 8.25 GE
per bit. The key register of Type-I implementation consists of 120 registers with
a 3-to-1 MUX (870 GE) and 8 registers with a 4-to-1 MUX (66 GE), while the
key register of Type-II/III implementation consists of 120 scan flip-flops (750
GE) and 8 registers with a 3-to-1 MUX (58 GE). Thus, the area savings of 128
GE are achieved.
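The 128 GE figure is plain per-bit cost arithmetic; a quick check (my bookkeeping):

```python
# Key-register area in GE, from the per-bit costs quoted above.
type_i  = 120 * 7.25 + 8 * 8.25   # 3-to-1 MUX registers + 4-to-1 MUX registers
type_ii = 120 * 6.25 + 8 * 7.25   # scan flip-flops + 3-to-1 MUX registers
assert type_i == 936.0 and type_ii == 808.0   # 870 + 66 and 750 + 58
assert type_i - type_ii == 128.0              # area savings of 128 GE
```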
The area savings for the data processing block of the Type-III implementation
over the Type-I/II implementations are 78 GE. In the data register of the Type-III
implementation, 32 scan flip-flops (200 GE) are replaced with 32 D flip-flops
(144 GE), which leads to savings of 56 GE. 24 3-to-1 MUXes with inverted output
(54 GE) can be replaced with 24 2-to-1 MUXes with inverted output (42 GE)
6 Conclusion
In this paper, we have proposed very compact hardware architectures for CLE-
FIA with 128-bit keys based on 8-bit shift registers. We showed that the data
processing part of CLEFIA-128 can be implemented in a serialized way without
any additional registers. Three types of hardware architectures were proposed,
differing in the number of cycles required to process one block, by adaptively
applying the clock gating technique. These architectures were implemented and
synthesized using a 0.13 μm standard cell library. In our smallest implementation,
the area requirements are only 2,488 GE, which is 50% smaller than the previous
smallest implementation of CLEFIA-128 and competitive with the smallest AES-128
implementation. Moreover, the area requirements of the version supporting both
encryption and decryption are only 2,604 GE, a 23% reduction compared to the
smallest encryption/decryption implementation of AES-128. Future work will
include the application of side-channel countermeasures such as threshold
implementations [6,7] to the proposed architectures.
References
1. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — A
Family of Small and Efficient Hardware-Oriented Block Ciphers. In: Clavier, C.,
Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg
(2009)
2. Daemen, J., Rijmen, V.: The Design of Rijndael: AES – The Advanced Encryption
Standard (Information Security and Cryptography). Springer, Heidelberg (2002)
3. Feldhofer, M., Wolkerstorfer, J., Rijmen, V.: AES Implementation on a Grain of
Sand. In: IEE Proceedings Information Security, vol. 152, pp. 13–20 (2005)
4. Hämäläinen, P., Alho, T., Hännikäinen, M., Hämäläinen, T.: Design and Imple-
mentation of Low-Area and Low-Power AES Encryption Hardware Core. In: DSD
2006, pp. 577–583. IEEE Computer Society (2006)
5. Moradi, A., Poschmann, A., Ling, S., Paar, C., Wang, H.: Pushing the Limits: A
Very Compact and a Threshold Implementation of AES. In: Paterson, K.G. (ed.)
EUROCRYPT 2011. LNCS, vol. 6632, pp. 69–88. Springer, Heidelberg (2011)
6. Nikova, S., Rechberger, C., Rijmen, V.: Threshold Implementations against Side-
Channel Attacks and Glitches. In: Ning, P., Qing, S., Li, N. (eds.) ICICS 2006.
LNCS, vol. 4307, pp. 529–545. Springer, Heidelberg (2006)
7. Nikova, S., Rijmen, V., Schläffer, M.: Secure Hardware Implementation of Non-
linear Functions in the Presence of Glitches. In: Lee, P.J., Cheon, J.H. (eds.) ICISC
2008. LNCS, vol. 5461, pp. 218–234. Springer, Heidelberg (2009)
8. Rolfes, C., Poschmann, A., Leander, G., Paar, C.: Ultra-Lightweight Implementa-
tions for Smart Devices – Security for 1000 Gate Equivalents. In: Grimaud, G.,
Standaert, F.-X. (eds.) CARDIS 2008. LNCS, vol. 5189, pp. 89–103. Springer, Hei-
delberg (2008)
9. Shirai, T., Shibutani, K., Akishita, T., Moriai, S., Iwata, T.: The 128-Bit Block-
cipher CLEFIA (Extended Abstract). In: Biryukov, A. (ed.) FSE 2007. LNCS,
vol. 4593, pp. 181–195. Springer, Heidelberg (2007)
10. Shirai, T., Shibutani, K., Akishita, T., Moriai, S., Iwata, T.: Hardware Implemen-
tations of the 128-bit Blockcipher CLEFIA, Technical Report of IEICE, 107(141),
ISEC2007–49, 29–36 (2007) (in Japanese)
11. The 128-bit Blockcipher CLEFIA: Algorithm Specification, Revision 1.0 (2007),
Sony Corporation,
http://www.sony.net/Products/cryptography/clefia/download/
data/clefia-spec-1.0.pdf
12. The 128-bit Blockcipher CLEFIA: Security and Performance Evaluations, Revision
1.0 (2007), Sony Corporation,
http://www.sony.net/Products/cryptography/clefia/download/
data/clefia-eval-1.0.pdf
13. Sugawara, T., Homma, N., Aoki, T., Satoh, A.: High-Performance ASIC Imple-
mentations of the 128-bit Block Cipher CLEFIA. In: ISCAS 2008, pp. 2925–2928
(2008)
14. Zheng, Y., Matsumoto, T., Imai, H.: On the Construction of Block Ciphers Prov-
ably Secure and not Relying on Any Unproved Hypotheses. In: Brassard, G. (ed.)
CRYPTO 1989. LNCS, vol. 435, pp. 461–480. Springer, Heidelberg (1990)
Appendix
In this appendix, we show the detailed data flow of the registers Rij in Fig. 5
during a round of the encryption processing for the Type-I architecture. Fig. 9 defines
the data structure of a round of the encryption processing. The contents of the
registers Rij (0 ≤ i, j < 4) are given in Table 3.
[Fig. 9 omitted: one round of the encryption processing. The bytes a0–a3 are the M0 inputs produced by the S-boxes S0/S1 and the round-key bytes Ks0–Ks3; the bytes b0–b3 are the M1 inputs produced by S1/S0 and the round-key bytes Kt0–Kt3.]
Table 3. Contents of the registers Rij at the l-th cycle (l = 0, . . . , 8)

l 0 1 2 3 4
R00 x00 x01 x02 x03 x20
R01 x01 x02 x03 x20 x21
R02 x02 x03 x20 x21 x22
R03 x03 x20 x21 x22 x23
R10 x10 x21 x22 x23 x30
R11 x11 x22 x23 x30 x31
R12 x12 x23 x30 x31 x32
R13 x13 x30 x31 x32 x33
R20 x20 x13 ⊕{06}a0 x12 ⊕{04}a0 ⊕{06}a1 x11 ⊕{02}a0 ⊕a1 ⊕{06}a2 ⊕Ks1 y00
R21 x21 x12 ⊕{04}a0 x13 ⊕{06}a0 ⊕{04}a1 x10 ⊕a0 ⊕{02}a1 ⊕{04}a2 ⊕Ks0 y01
R22 x22 x11 ⊕{02}a0 x10 ⊕a0 ⊕{02}a1 ⊕Ks0 x13 ⊕{06}a0 ⊕{04}a1 ⊕{02}a2 y02
R23 x23 x10 ⊕a0 ⊕Ks0 x11 ⊕{02}a0 ⊕a1 ⊕Ks1 x12 ⊕{04}a0 ⊕{06}a1 ⊕a2 ⊕Ks2 y03
R30 x30 x31 x32 x33 x00 (= y30 )
R31 x31 x32 x33 x00 x01 (= y31 )
R32 x32 x33 x00 x01 x02 (= y32 )
R33 x33 x00 x01 x02 x03 (= y33 )
l 4 5 6 7 8
R00 x20 x21 x22 x23 y00
R01 x21 x22 x23 y00 y01
R02 x22 x23 y00 y01 y02
R03 x23 y00 y01 y02 y03
R10 x30 y01 y02 y03 x20 (= y10 )
R11 x31 y02 y03 x20 x21 (= y11 )
R12 x32 y03 x20 x21 x22 (= y12 )
R13 x33 x20 x21 x22 x23 (= y13 )
R20 y00 x33 ⊕{0A}b0 x32 ⊕{02}b0 ⊕{0A}b1 x31 ⊕{08}b0 ⊕b1 ⊕{0A}b2 ⊕Kt1 y20
R21 y01 x32 ⊕{02}b0 x33 ⊕{0A}b0 ⊕{02}b1 x30 ⊕b0 ⊕{08}b1 ⊕{02}b2 ⊕Kt0 y21
R22 y02 x31 ⊕{08}b0 x30 ⊕b0 ⊕{08}b1 ⊕Kt0 x33 ⊕{0A}b0 ⊕{02}b1 ⊕{08}b2 y22
R23 y03 x30 ⊕b0 ⊕Kt0 x31 ⊕{08}b0 ⊕b1 ⊕Kt1 x32 ⊕{02}b0 ⊕{0A}b1 ⊕b2 ⊕Kt2 y23
R30 y30 y30 y30 y30 y30
R31 y31 y31 y31 y31 y31
R32 y32 y32 y32 y32 y32
R33 y33 y33 y33 y33 y33
Another Look at Tightness

Sanjit Chatterjee, Alfred Menezes, and Palash Sarkar

1 Introduction
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 293–319, 2012.
c Springer-Verlag Berlin Heidelberg 2012
294 S. Chatterjee, A. Menezes, and P. Sarkar
the protocol’s security. On the other hand, Damgård [25] asserts that a non-
tight polynomial-time reduction is useful because it rules out all polynomial-
time attacks. However, such an assurance is not very comforting since proofs
are meant to guarantee resistance to all attacks, and moreover there are many
examples of practical cryptographic schemes that have succumbed to attacks
that are deemed to be effective in practice even though in asymptotic terms
they require super-polynomial time.
Considerable effort has been expended on devising tighter security proofs for
existing protocols, and on designing new protocols with tighter security proofs.
For example, the first security proof [5] for the traditional hash-then-sign RSA
signature scheme (called RSA-FDH) was highly non-tight. Subsequently, Coron
[23] found an alternate proof that is significantly tighter (although still consid-
ered non-tight), and proved that no tighter reduction exists [24]. Meanwhile,
Katz and Wang [43] showed that a small modification of RSA-FDH yields a sig-
nature scheme that has a tight security proof, arguably increasing confidence in
RSA-FDH itself. Nonetheless, another variant of RSA-FDH, called RSA-PSS, is
commonly recommended in practice because it has a tight security proof [5]. As
another example, Gentry and Halevi [35] designed a hierarchical identity-based
encryption (HIBE) scheme that has a security proof whose tightness gap depends
only linearly on the number of levels, in contrast to all previous HIBE schemes
whose tightness gaps depend exponentially on the number of levels. Finally, we
mention Bernstein’s [7] tight proof in the random oracle model for the Rabin-
Williams signature scheme, and Schäge’s [65] tight proofs without the random
oracle assumption for the Cramer-Shoup and Camenisch-Lysyanskaya signature
schemes.
Despite ongoing efforts by some to tighten security proofs of existing protocols
and to develop new protocols with tighter proofs, it is fair to say that, for the
most part, the tightness gaps in security proofs are not viewed as a major concern
in practice. Researchers who design protocols with non-tight proofs typically give
arguments in favour of their protocol’s efficiency by using parameters that would
make sense if the proof had been tight. For example, the Schnorr signature
scheme [66] is widely regarded as being secure, although its known security
proofs are highly non-tight [58]. In fact, there are arguments which suggest that
a tighter proof is not even possible [57]. Nevertheless, the Schnorr signature
scheme is widely used in the cryptographic literature without any suggestion to
use larger key sizes to account for the tightness gap in the proof.
Other examples of well-known protocols with highly non-tight proofs include
the Boneh-Franklin (BF) [16,34], Sakai-Kasahara (SK) [22] and Boneh-Boyen
(BB1) [15] identity-based encryption schemes, the Lu et al. aggregate signature
scheme [51], and the HMQV key agreement protocol [48]. In [18], Boyen com-
pares the tightness of the reductions for BB1, BF and SK. The reduction for
BB1 is significantly tighter than the reduction for BF, which in turn is signifi-
cantly tighter than that for SK. However, all three reductions are in fact highly
non-tight, the tightness gap being (at least) linear, quadratic and cubic in the
number of random oracle queries made by the adversary for BB1, BF and SK,
respectively. Although all these proofs have large tightness gaps, Boyen recom-
mends that SK should “generally be avoided as a rule of thumb”, BF is “safe
to use”, and BB1 “appears to be the smartest choice” in part due to the “fairly
efficient security reduction” of the latter. Despite the importance Boyen attaches
to tightness as a reason for avoiding SK, a recent IETF standard co-authored
by Boyen that describes BB1 and BF [19] does not recommend larger security
parameters to account for tightness gaps in their security proofs.
Our Work. In §2, we examine a natural, but non-tight, reductionist security
proof for MAC schemes in the multi-user setting. If parameters are selected
without accounting for the tightness gap in the reduction, then the MAC scheme
is shown to provide a level of security that is less than what one would desire
in the multi-user setting. In particular, the attacks we describe are effective on
HMAC as standardized in [33,26] and CMAC as standardized in [28,69]. In §3,
we show that this deficiency in the security assurances provided by the non-
tight proof appears in a network authentication protocol [20], and in §4 we
obtain analogous results for aggregate MACs and aggregate designated verifier
signatures. In §5, we exhibit attacks on some authenticated encryption schemes,
disk encryption schemes, and stream ciphers in the multi-user setting. We draw
our conclusions in §6.
and Hk (m) = τ . We will henceforth denote B’s task by MAC1 (breaking a MAC
scheme in the single-user setting). An adversary B is said to (T, )-break MAC1
if its running time is bounded by T and it produces a forgery with probability
at least ; the probability is assessed over the choice of k and B’s coin tosses.
MAC1 is said to be (T, )-secure if there does not exist an adversary B that
(T, )-breaks it.
Our definition of MAC security in the multi-user setting is the following.
An adversary A has complete knowledge of the MAC scheme. First, n keys
k1, k2, . . . , kn corresponding to users 1, 2, . . . , n are chosen independently and
uniformly at random from K and kept secret from A; n is an upper bound on
the total number of users in the system. The adversary A has access to MAC
oracles indexed by k1, . . . , kn in the following way: for any (i, m) of A's choosing,
where i ∈ [1, n] and m ∈ D, A is given Hki(m). Furthermore, A is allowed to
corrupt any oracle (or user); i.e., for any i ∈ [1, n] of its choosing, A is given ki.
The adversary's goal is to produce a forgery, i.e., find a triple (i, m, τ) such that:
about 2^60 MAC operations. Since the work can be easily and effectively paral-
lelized, the attack should be considered feasible today (cf. Remark 7).
The FIPS 198-1 standard allows 80-bit keys and 160-bit tags, i.e., r = 80 and
t = 160. Attack 1 also applies to this choice of parameters. In fact, since t > r, a
collision in the first phase of the attack will most likely be due to a key collision.
In general, having the tag length greater than the key length does not provide
any additional resistance to Attack 1.
Remark 6. (number of users) The 2^20 users in the attack described above need
not be distinct pairs of entities. What is needed is 2^20 keys. An entity might
be engaged in multiple sessions with other entities, and might even have several
active sessions with the same entity. Thus, the attacks could be mounted with
far fewer than 2^20 different entities.
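The attack can be simulated with toy parameters. In the sketch below, a truncated HMAC-SHA256 stands in for the MAC schemes discussed in the text (an assumption for self-containedness), and the key length r is shrunk so the search finishes instantly; the structure — collect tags on one fixed message from n users, then hash trial keys until a tag collides — is as described above.

```python
import hmac
import hashlib
import secrets

R_BITS = 16    # toy key length r (the text uses r = 80; shrunk so the search is instant)
T_BYTES = 8    # tag length t = 64 bits, so t >> r and tag collisions are key collisions

def tag(key: int, msg: bytes) -> bytes:
    # Truncated HMAC-SHA256 stands in for the MAC schemes in the text (assumption)
    k = key.to_bytes(R_BITS // 8, "big")
    return hmac.new(k, msg, hashlib.sha256).digest()[:T_BYTES]

n = 64  # number of users/keys (the text uses n = 2^20)
user_keys = [secrets.randbelow(1 << R_BITS) for _ in range(n)]

# Phase 1: obtain each user's tag on one fixed chosen message m
m = b"fixed chosen message"
tag_to_user = {tag(k, m): i for i, k in enumerate(user_keys)}

# Phase 2: hash m under trial keys until some user's tag is hit.
# Expected cost is about 2^r / n trial MACs, versus 2^r against a single user.
i = guess = None
for g in range(1 << R_BITS):
    if tag(g, m) in tag_to_user:
        i, guess = tag_to_user[tag(g, m)], g
        break

# Because t >= r, the tag collision is almost certainly a key collision,
# so the adversary can now forge user i's tag on any message.
assert guess == user_keys[i]
assert tag(guess, b"forged message") == tag(user_keys[i], b"forged message")
```

Scaling r back up to 80 and n to 2^20 gives the 2^60 workload discussed above.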
CMAC. CMAC is a block cipher-based MAC scheme that has been standardized
in [28] and [69]. Let E denote a block cipher with key length r bits and block
length b bits. The r-bit key k is first used to generate two b-bit subkeys, k1 and
k2. The message m is divided into blocks m1, m2, . . . , mh, where each mi is b bits
in length, with the possible exception of mh, which might be less than b bits long.
Now, if mh is b bits in length, then it is updated as follows: mh ← mh ⊕ k1 .
Otherwise, mh is padded on its right with a single 1 bit followed by 0 bits until the
length of the padded mh is b bits; then mh is updated as follows: mh ← mh ⊕ k2 .
Finally, one sets c0 = 0 and computes ci = Ek (ci−1 ⊕ mi ) for 1 ≤ i ≤ h. The tag
of m is defined to be CMACk (m) = Trunct (ch ).
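The computation just described can be sketched as follows. The block cipher E is replaced by a hash-based stand-in so the sketch is self-contained (an assumption — a real instantiation uses AES), and the subkey derivation shown is the doubling method of NIST SP 800-38B; the padding and CBC-style chaining follow the description above.

```python
import hashlib

B = 16  # block length b in bytes (b = 128 bits)

def E(k: bytes, x: bytes) -> bytes:
    # Stand-in for a real block cipher such as AES: a hash truncation is not a
    # permutation, but the CMAC data flow below is unaffected (assumption).
    return hashlib.sha256(k + x).digest()[:B]

def dbl(s: bytes) -> bytes:
    # Multiplication by x in GF(2^128) -- the SP 800-38B subkey derivation step
    n = int.from_bytes(s, "big") << 1
    if n >> (8 * B):
        n ^= 0x87
    return (n & ((1 << (8 * B)) - 1)).to_bytes(B, "big")

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cmac(k: bytes, m: bytes, t_bytes: int = B) -> bytes:
    k1 = dbl(E(k, bytes(B)))        # subkeys k1, k2 derived from E_k(0^b)
    k2 = dbl(k1)
    blocks = [m[i:i + B] for i in range(0, len(m), B)] or [b""]
    last = blocks[-1]
    if len(last) == B:
        last = xor(last, k1)        # complete final block: mask with k1
    else:                           # pad with 10...0 bits, then mask with k2
        last = xor(last + b"\x80" + bytes(B - len(last) - 1), k2)
    c = bytes(B)                    # c_0 = 0
    for blk in blocks[:-1]:
        c = E(k, xor(c, blk))
    return E(k, xor(c, last))[:t_bytes]   # tag = Trunc_t(c_h)
```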
The standards [28] and [69] both use the AES block cipher (with r=b=128)
and do not mandate truncation, so we can take t=128. With these parameters,
CMAC in the multi-user setting is vulnerable to Attack 1. Indeed, after querying
n = 2^32 users for the MAC of a fixed message m, the adversary is able to
compute the secret key of one of the users after performing about 2^96 MAC
operations. Although this workload is considered infeasible today, the attack does
demonstrate that CMAC-AES does not attain the 128-bit security level in the
multi-user setting.
Remark 7. (reducing the on-line running time) Hellman [39] introduced the idea
of time/memory trade-offs (TMTO) to search for a preimage of a target point in
the range of a one-way function. The idea is to perform a one-time precomputa-
tion and store some of the results, subsequent to which the on-line search phase
can be significantly sped up. Biryukov and Shamir [11] later applied TMTO
to stream ciphers. They considered the problem of inverting any one out of D
possible targets. Let N denote the size of the search space, M the amount of
memory required, and T the on-line time, and suppose that 1 ≤ D ≤ T^2. Then the
Biryukov-Shamir TMTO can be implemented with these parameters provided
that they satisfy the so-called multiple-data trade-off curve TM^2D^2 = N^2; the
precomputation time P is N/D. The multiple-data trade-off curve has natural
interpretations in other contexts. Biryukov et al. [10] considered the problem of
finding any one of D keys for a block cipher. An extensive analysis of TMTO
with multiple data in different cryptographic settings was carried out in [40].
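As a numeric illustration of the trade-off curve (the parameter values below are ours, chosen only to make the arithmetic concrete):

```python
# Sanity check of the Biryukov-Shamir multiple-data trade-off curve
# T * M^2 * D^2 = N^2 with precomputation P = N/D (illustrative parameters).
N = 2**80   # size of the search space, e.g. 80-bit keys
D = 2**20   # number of available targets, e.g. tags collected from 2^20 users
T = 2**40   # chosen on-line time
M = 2**40   # memory forced by the curve: M = sqrt(N^2 / (T * D^2))

assert 1 <= D <= T**2            # validity condition from the text
assert T * M**2 * D**2 == N**2   # the trade-off curve holds
P = N // D                       # one-time precomputation: 2^60, vs N = 2^80 naively
assert P == 2**60
```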
302 S. Chatterjee, A. Menezes, and P. Sarkar
arbitrary element of D. Then, for i ≠ j, we will need to consider the event that
Hki(m) = Hkj(m). For the probability analysis, it will be useful to analyze this
event in terms of the following three events, the last two of which are conditional
events: (i) ki = kj; (ii) Hki = Hkj given that ki ≠ kj; and (iii) Hki(m) = Hkj(m)
given that ki ≠ kj and Hki ≠ Hkj. Clearly Pr[ki = kj] = 2^(-r) and Pr[Hki =
Hkj | ki ≠ kj] = 1/#F = (2^(-t))^#D. In practical applications, the maximum length
L of messages can be expected to be at least around 2^20, and so the probability
that Hki = Hkj given ki ≠ kj is negligible. Furthermore, for 1 ≤ s ≤ 2^t,
the quantity s/#F is also negligible. We will use these approximations in the
remainder of the analysis.
The analysis of Attack 1 is done in two stages. In the first stage, we determine
values for n and w for which there is a significant probability of detecting a
collision. The second stage of the analysis considers the probability of the keys
ℓj and ki being equal once a collision Hℓj(m) = Hki(m) is detected.
Let W = {ℓ1, . . . , ℓw} and consider the functions Hℓ1, . . . , Hℓw. Let A be the
event that these functions are distinct. Then

  Pr[A] = (1 − 1/#F)(1 − 2/#F) · · · (1 − (w−1)/#F) ≈ 1.

The approximation is based on the fact that w^2 is negligible in comparison
to #F = (2^t)^#D. Let C be the event that a collision occurs. Let Lst1 =
{Hk1(m), . . . , Hkn(m)} and Lst2 = {Hℓ1(m), . . . , Hℓw(m)}. The event C is the
event Lst1 ∩ Lst2 ≠ ∅. Now,

  Pr[C] = Pr[C|A] · Pr[A] + Pr[C|Ā] · Pr[Ā] ≈ Pr[C|A].
Let B1 be the event that the keys k1, . . . , kn are pairwise distinct. Then

  Pr[B1] = (1 − 1/2^r) · · · (1 − (n−1)/2^r) ≈ exp(−(1 + 2 + · · · + (n−1))/2^r)
         ≈ exp(−n^2/2^(r+1)) ≈ 1 − n^2/2^(r+1).

As long as n^2 ≪ 2^(r+1), the probability of event B1 occurring will be almost equal
to 1. For the remainder of the analysis, we will assume that this condition holds.
Let B2 be the event that the functions Hk1 , . . . , Hkn are pairwise distinct.
Conditioned on the event B1, the probability of B2 occurring is almost equal to 1.
This follows from an argument similar to the one which shows that Pr[A] ≈ 1.
We introduce three more approximations:

  Pr[C] ≈ Pr[C|A] = Pr[C|A, B1] · Pr[B1] + Pr[C|A, B̄1] · Pr[B̄1]
                  ≈ Pr[C|A, B1]              (using Pr[B1] ≈ 1)
                  = Pr[C|A, B1, B2] · Pr[B2] + Pr[C|A, B1, B̄2] · Pr[B̄2]
                  ≈ Pr[C|A, B1, B2]          (using Pr[B2] ≈ 1).

Let xi = Hki(m) for 1 ≤ i ≤ n and yj = Hℓj(m) for 1 ≤ j ≤ w. Conditioned
on the conjunction of B1 and B2, the values x1, . . . , xn are independent and
  Pr[Hki(m) = Hℓj(m)]
    = Pr[ki = ℓj] + Pr[Hki(m) = Hℓj(m) | ki ≠ ℓj] · Pr[ki ≠ ℓj]
    = 1/2^r + (1 − 1/2^r) · ( Pr[Hki = Hℓj | ki ≠ ℓj]
        + Pr[Hki(m) = Hℓj(m) | ki ≠ ℓj, Hki ≠ Hℓj] · Pr[Hki ≠ Hℓj | ki ≠ ℓj] )
    = 1/2^r + (1 − 1/2^r) · ( 1/#F + (1/2^t)(1 − 1/#F) )
    ≈ 1/2^r + 1/2^t − 1/2^(t+r) = δ,

and hence

  Pr[ki = ℓj | Hki(m) = Hℓj(m)] = Pr[ki = ℓj, Hki(m) = Hℓj(m)] / Pr[Hki(m) = Hℓj(m)]
    = Pr[ki = ℓj] / Pr[Hki(m) = Hℓj(m)] ≈ 1/(2^r · δ)
    = 2^(t+r) / (2^r (2^t + 2^r − 1)) = 2^t / (2^t + 2^r − 1).
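The final ratio can be checked numerically. The sketch below uses the same idealization as the analysis — each Hk(m) is modeled as a fresh random t-bit value per key — with toy parameters r = 8 and t = 6 chosen by us so that collisions are frequent:

```python
import random

# Monte Carlo check of Pr[ki = lj | Hki(m) = Hlj(m)] ≈ 2^t / (2^t + 2^r − 1)
r, t = 8, 6
rng = random.Random(1)
hits = key_eq = 0
for _ in range(400_000):
    ki, lj = rng.randrange(1 << r), rng.randrange(1 << r)
    xi = rng.randrange(1 << t)                      # Hki(m): fresh random t-bit tag
    xj = xi if ki == lj else rng.randrange(1 << t)  # equal keys imply equal tags
    if xi == xj:
        hits += 1
        key_eq += (ki == lj)

predicted = (1 << t) / ((1 << t) + (1 << r) - 1)    # 64/319, about 0.20
assert abs(key_eq / hits - predicted) < 0.05
```

With r = t the ratio is about 1/2, matching the remark that a detected collision is a key collision with significant probability; with t ≫ r it approaches 1.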
2.5 Fixes
We propose two generic countermeasures to Attack 1 on MAC schemes in the
multi-user setting.
Remark 10. (preventing replay attacks) Some MAC standards make provisions
for protecting against the replay of message-tag pairs. For example, NIST’s SP
800-38B [28] suggests that replay can be prevented by “incorporating certain
identifying information bits into the initial bits of every message. Examples
of such information include a sequential message number, a timestamp, or a
nonce.” We note that sequential message numbers and timestamps do not nec-
essarily circumvent Attack 1 because it is possible that each user selects the
same sequential message number or timestamp when authenticating the chosen
message m. Nonces can be an effective countermeasure provided that there is
sufficient uncertainty in their selection.
computing τ = Hk(s, m) where s ∈R {0, 1}^r; the resulting tag is (s, τ). The
verifier confirms that τ = Hk (s, m). This modified MAC scheme is called rMAC
(randomized MAC).
Security of rMAC in the multi-user setting is defined analogously to security
of MAC*: The adversary A is given access to n rMAC oracles with secret keys
k1 , k2 , . . . , kn ∈R K and can corrupt any oracle (i.e., obtain its secret key).
Its goal is to produce a triple (i, m, (s, τ )) such that the ith oracle was not
corrupted, (m, (s, τ )) is a valid message-tag pair with respect to the ith oracle
(i.e., Hki(s, m) = τ), and m was not queried to the ith oracle. We denote A’s
task by rMAC*. When n = 1, then rMAC* is called rMAC1 (security of rMAC
in the single-user setting).
It is easy to verify that rMAC* resists Attack 1. Let us denote by P1 ≤b P2 a
reduction from problem P1 to problem P2 that has a tightness gap of b; if b = 1
then the reduction is tight. In §2.2 we showed that MAC1 ≤n MAC*, i.e., the
problem of breaking a MAC scheme in the single-user setting can be reduced to
breaking the same MAC scheme in the multi-user setting, but the reduction has a
tightness gap of n. Trivially, we have MAC* ≤1 MAC1. The reductionist security
proof in §2.2 can be adapted to show that rMAC1 ≤n rMAC*, and we trivially
have rMAC* ≤1 rMAC1. Moreover, it is easy to see that MAC1 ≤1 rMAC1
and hence MAC1 ≤n rMAC*. However, it is unlikely that a generic reduction of
rMAC1 to MAC1 exists because a MAC scheme {Hk }k∈K having the property
that there exists a (known) pair (s, τ) with s ∈ {0, 1}^r, τ ∈ {0, 1}^t, and Hk(s) = τ
for all k ∈ K would be considered insecure whereas the corresponding rMAC
scheme could well be secure.
We do not know a tighter security reduction from MAC1 to rMAC*, nor
do we know whether a tighter reduction is even possible (in general). However,
we would expect that rMAC* and MAC1 are tightly related in practice. One
approach to increasing confidence in rMAC* would be to derive tight lower
bounds for MAC1 and rMAC* in the ideal MAC model, and hope that these
lower bounds coincide.
fMAC. One drawback of rMAC is that tags are longer than before. An alter-
native countermeasure is to prepend all messages with a string that is fixed
and unique to every pair of users (and every session between them). That is, a
user with secret key k would authenticate a message m by computing the tag
τ = Hk (f, m), where f is the fixed and unique string that the user shares with
the intended recipient (for that session). All such strings are assumed to have
the same length, and this length is at least r. The strings are assumed to be
understood from context, so do not need to be transmitted. (For an example of
such strings, see §3.3.) The verifier confirms that τ = Hk (f, m). This modified
MAC scheme is called fMAC (fixed-string MAC).
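Both countermeasures can be sketched together. This is a minimal sketch: truncated HMAC-SHA256 stands in for the underlying deterministic MAC, and all names, lengths, and encodings are illustrative assumptions.

```python
import hmac
import hashlib
import secrets

R_BYTES = 10  # r = 80 bits

def mac(k: bytes, m: bytes) -> bytes:
    # Truncated HMAC-SHA256 stands in for the underlying MAC (assumption)
    return hmac.new(k, m, hashlib.sha256).digest()[:R_BYTES]

# rMAC: sample a fresh random r-bit string s per tag; the tag is the pair (s, tau).
# Tags grow by r bits, but every MAC invocation hashes a distinct input.
def rmac_tag(k: bytes, m: bytes):
    s = secrets.token_bytes(R_BYTES)
    return s, mac(k, s + m)

def rmac_verify(k: bytes, m: bytes, s: bytes, tau: bytes) -> bool:
    return hmac.compare_digest(mac(k, s + m), tau)

# fMAC: prepend a string f that is fixed and unique per user pair (and session).
# f is understood from context, so it is never transmitted and tags stay short.
def fmac_tag(k: bytes, f: bytes, m: bytes) -> bytes:
    return mac(k, f + m)

k = secrets.token_bytes(R_BYTES)
s, tau = rmac_tag(k, b"message")
assert rmac_verify(k, b"message", s, tau)

# Two sessions tagging the same message under fMAC hash different inputs,
# so Attack 1's fixed-message collision table no longer applies.
assert fmac_tag(k, b"session-1", b"m") != fmac_tag(k, b"session-2", b"m")
```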
Security of fMAC in the multi-user setting is defined analogously to security
of MAC*: The adversary A is given access to n fMAC oracles with secret keys
k1 , . . . , kn ∈R K and fixed strings f1 , . . . , fn , and can corrupt any oracle. Its goal
is to produce a triple (i, m, τ ) such that the ith oracle was not corrupted, (m, τ ) is
a valid message-tag pair with respect to the ith oracle (i.e., Hki(fi, m) = τ), and
m was not queried to the ith oracle. We denote A’s task by fMAC*. When n = 1,
then fMAC* is called fMAC1 (security of fMAC in the single-user setting).
As was the case with rMAC*, it is easy to verify that fMAC* resists Attack 1.
Furthermore, one can show that fMAC* ≤1 fMAC1 and MAC1 ≤1 fMAC1 ≤n
fMAC*, while we do not expect there to be a generic reduction from fMAC1 to
MAC1. We do not know a tighter security reduction from MAC1 to fMAC*, nor
do we know whether a tighter reduction is even possible (in general). However,
we would expect that fMAC* and MAC1 are tightly related in practice. An
intuitive reason for why fMAC* can be expected to be more secure than MAC*
is that for fMAC* each of the n oracles available to the adversary can be viewed
as having been chosen from an independent family of MAC functions, whereas
in MAC* each of the n oracles available to the adversary is chosen from a single
family of MAC functions.
Remark 11. (use of MAC schemes) Higher-level protocols that use MAC schemes
for authentication generally include various data fields with the messages being
MAC’ed, thus providing adequate defenses against Attack 1. For example, IPsec
has an authentication-only mode [45] where a MAC scheme is used to authen-
ticate the data in an IP packet. Among these data fields are the source and
destination IP addresses, and a 32-bit “Security Parameter Index” (SPI) which
identifies the “Security Association” (SA) of the sending party.
3 NetAut
NetAut is a network authentication protocol proposed by Canetti and Krawczyk
[20] which combines a key establishment scheme with a conventional MAC
scheme in a natural way. In [20], a security model and definition for key estab-
lishment are proposed. Then, NetAut is proved to be a secure network authen-
tication protocol under the assumption that the underlying key establishment
and MAC schemes are secure. We describe several shortcomings in the analy-
sis of NetAut. The most serious of these shortcomings is the tightness gap in
the security proof, which we exploit to formulate concrete attacks on plausible
instantiations of NetAut.
In the SIG-DH key agreement scheme, sigÂ and sigB̂ denote the signing algorithms
of parties Â and B̂, respectively. It is assumed that each party has
an authenticated copy of the other party’s public verification key. The SIG-DH
scheme proceeds as follows. The initiator  selects x ∈R [0, N − 1] and sends
(Â, s, X=xG) to party B̂. In response, B̂ selects y ∈R [0, N − 1] and sends
(B̂, s, Y =yG, sigB̂(B̂, s, Y, X, Â)) to Â and computes κ = yX. Upon receipt of
B̂’s message, Â verifies the signature, sends the message (Â, s, sigÂ(Â, s, X, Y, B̂))
to B̂, and computes the session key κ = xY associated with session s. Finally,
upon receipt of Â’s message, B̂ verifies the signature and accepts κ as the session
key associated with session s.
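A toy run of the message flow, under two simplifying assumptions of ours: the group is the multiplicative group modulo a Mersenne prime rather than the elliptic-curve group E(Fp) used in the text, and the signatures sigÂ, sigB̂ are simulated by HMACs under per-party keys that the peer is assumed to hold authentically (standing in for real signatures with authenticated verification keys).

```python
import hmac
import hashlib
import secrets

p = 2**127 - 1          # toy prime modulus (assumption; the text uses E(F_p))
g = 3

sk_A, sk_B = secrets.token_bytes(16), secrets.token_bytes(16)
def sig(sk: bytes, *parts) -> bytes:
    # HMAC stands in for a signature; the peer holds sk authentically (assumption)
    data = "|".join(str(part) for part in parts).encode()
    return hmac.new(sk, data, hashlib.sha256).digest()

s = "session-1"
# A -> B: (A, s, X) with X = g^x
x = secrets.randbelow(p - 1)
X = pow(g, x, p)
# B -> A: (B, s, Y, sig_B(B, s, Y, X, A)); B computes kappa = X^y
y = secrets.randbelow(p - 1)
Y = pow(g, y, p)
sigma_B = sig(sk_B, "B", s, Y, X, "A")
kappa_B = pow(X, y, p)
# A verifies B's signature, then sends (A, s, sig_A(A, s, X, Y, B))
assert hmac.compare_digest(sigma_B, sig(sk_B, "B", s, Y, X, "A"))
sigma_A = sig(sk_A, "A", s, X, Y, "B")
kappa_A = pow(Y, x, p)
# Both ends accept the same session key g^(xy) for session s
assert kappa_A == kappa_B
```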
Canetti and Krawczyk proved that SIG-DH is secure in the CK model under
the assumption that the decisional Diffie-Hellman problem⁴ in E(Fp) is
intractable (and the signature scheme is secure). The proof proceeds in two stages.
In the first stage, the basic Diffie-Hellman protocol is proven secure in the au-
thenticated links model under the assumption that DDH is intractable; this proof
has a tightness gap of n, the total number of sessions. In the second stage, SIG-
DH is proven secure (in the unauthenticated links model) under the assumption
that the basic Diffie-Hellman protocol is secure in the authenticated links model;
this proof has a tightness gap of 2n. However, these tightness gaps do not seem
to have any negative security consequences for SIG-DH.
Key Type Mismatch. The first problem encountered when using SIG-DH and
CMAC as the ingredients of NetAut is that the SIG-DH session keys are points
in E(Fp ) whereas the CMAC secret keys are bit strings. This key type mismatch
can be rectified by the commonly-used method of using a key derivation function
KDF to derive a bit-string session key from the SIG-DH session key, i.e., the
session key is now KDF(xyG). We refer to the modified key agreement scheme
as hashed SIG-DH (HSIG-DH).
The KDF is generally modeled as a random oracle in security proofs. Using
standard techniques, HSIG-DH can then be proven secure under the assumption
that the gap Diffie-Hellman (GDH) problem⁵ is hard.
Keysize Mismatch. Security proofs for Diffie-Hellman key agreement proto-
cols in the random oracle model sometimes make the assumption that the prob-
ability of a KDF collision during the adversary’s operation is negligible (e.g.,
see [48,50,55]). If this probability were not negligible, then the adversary could
conceivably force two non-related sessions (called ‘non-matching’ sessions in the
literature) to compute the same session key — in that event, the adversary could
learn the session key from one session by asking for it and thereby obtain the
session key for the other session. Thus, because of the birthday paradox, at the
80-bit security level the assumption that the adversary has negligible probability
⁴ The decisional Diffie-Hellman (DDH) problem in E(Fp) is the problem of determining
whether Z = xyG, given G, X = xG, Y = yG, and Z ∈ E(Fp).
⁵ The gap Diffie-Hellman (GDH) problem in E(Fp) is the problem of solving the
computational Diffie-Hellman (CDH) problem in E(Fp) given an oracle for the DDH
problem in E(Fp).
Another Look at Tightness 309
of obtaining a KDF collision requires that the KDF for HSIG-DH with our choice
of elliptic curve parameters should have 160-bit outputs. However, we then have a
keysize mismatch since CMAC uses 80-bit keys. If the KDF is restricted to 80-bit
outputs, then the aforementioned proofs have a logical gap since the probability
of a KDF collision now becomes non-negligible.
One simple way to remove this gap is to include the identities of the commu-
nicating parties and the session identifier as input to the key derivation func-
tion (as is done in [70], for example), i.e., the HSIG-DH session key is now
KDF(Â, B̂, s, xyG). One can then argue that since the KDF is modeled as a
random oracle, the adversary must know the inputs to the KDF for the two
non-matching sessions (since the triples (Â, B̂, s) for the non-matching sessions
must be distinct) in order to detect the collision. In particular, the adversary
must know xyG — and such an adversary can be used to solve a CDH instance.
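The fixed derivation can be sketched as follows. The KDF is a toy random-oracle stand-in (SHA-256 truncated to 80 bits), and the encodings of the identities, the session identifier, and the point xyG are illustrative assumptions.

```python
import hashlib

def kdf(*parts: bytes, out_bytes: int = 10) -> bytes:
    # Random-oracle-style KDF stand-in (assumption: toy construction)
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)  # length-prefix each field
    return h.digest()[:out_bytes]

xyG = b"\x02" * 32   # placeholder encoding of the Diffie-Hellman point xyG

# Plain HSIG-DH derives KDF(xyG); the fixed variant derives KDF(A, B, s, xyG).
# Two non-matching sessions then have distinct (A, B, s) and hence distinct KDF
# inputs, so detecting a collision forces the adversary to know xyG itself.
k1 = kdf(b"Alice", b"Bob", b"session-17", xyG)
k2 = kdf(b"Alice", b"Bob", b"session-18", xyG)
assert k1 != k2 and len(k1) == 10
```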
The Insecurity of NetAut. Attack 1 is applicable to our instantiation of Net-
Aut with HSIG-DH (with 80-bit session keys) and CMAC at the 80-bit security
level. Namely, the adversary monitors n = 2^20 NetAut sessions, each of which
is induced to transmit some fixed message m. Then, as explained in §2.3, the
adversary is able to deduce one of the 2^20 session keys and thereafter use it to
forge message-MAC pairs for that session.
We emphasize that the mechanisms of the attack are within the scope of the
security model for NetAut considered in [20]. However, the attack does not con-
tradict the security proof for NetAut given in [20, Theorem 12] for the following
reason. At one point in the proof it is shown that the probability that an adver-
sary succeeds in convincing a party  that a message m was sent by party B̂ in
a particular session s even though B̂ did not send that message in that session is
negligible provided that the underlying MAC scheme is secure. The reductionist
proof for this claim (Lemma 13 of [20]) is analogous to the security proof for
MAC* given in §2.2, and hence has a tightness gap equal to the total number n
of sessions — this tightness gap is precisely what the attack exploits.
3.3 A Fix
One method for preventing the attack on NetAut described above is to use the
fMAC variant of the MAC scheme. Here, a natural candidate for the unique
fixed string f is the session identifier s and the identifiers of the communicating
parties, i.e., after parties  and B̂ complete session s of HSIG-DH and estab-
lish a session key κ, the authentication tag for a message m is computed as
τ = MACκ (s, Â, B̂, m). This modification of NetAut resists Attack 1. However,
even with this modification we do not know a tight security reduction, so the
possibility of another attack that exploits the tightness gap cannot be ruled out.
Katz and Lindell [42] provided a formal security definition for the task of aggre-
gating MACs, proposed an aggregate MAC scheme, and gave a security proof
for their construction.
In the Katz-Lindell scheme, there are z parties, each of which randomly se-
lects an r-bit key ki for a deterministic MAC scheme; these keys are shared
with a central authority. When parties⁶ i1, i2, . . . , in wish to authenticate
messages m1, m2, . . . , mn, respectively, for the authority, they each compute τi =
MACki(mi). The aggregate tag is τ = τ1 ⊕ τ2 ⊕ · · · ⊕ τn. The authority verifies
the aggregate tag by computing the individual tags and checking that their xor
is equal to τ .
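The scheme is short enough to sketch directly. As before, a truncated HMAC-SHA256 stands in for the underlying deterministic MAC, and the tag length and names are illustrative assumptions.

```python
import hmac
import hashlib

def mac(k: bytes, m: bytes) -> bytes:
    # Deterministic MAC with 80-bit tags (truncated HMAC-SHA256; toy stand-in)
    return hmac.new(k, m, hashlib.sha256).digest()[:10]

def aggregate(tags):
    # Katz-Lindell aggregation: the XOR of the individual tags
    agg = bytes(10)
    for t in tags:
        agg = bytes(x ^ y for x, y in zip(agg, t))
    return agg

def authority_verify(key_msg_pairs, agg_tag) -> bool:
    # The authority recomputes every individual tag and checks the XOR
    return aggregate(mac(k, m) for k, m in key_msg_pairs) == agg_tag

keys = [b"node-key-%d" % i for i in range(4)]
msgs = [b"reading-%d" % i for i in range(4)]
tau = aggregate(mac(k, m) for k, m in zip(keys, msgs))
assert authority_verify(list(zip(keys, msgs)), tau)
```

Note that the aggregate tag is no longer than a single tag, which is the point of the construction — and also why randomizers cannot be absorbed into it.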
In the security model of [42], the adversary can corrupt any party, and in
addition can obtain the tag of any message from any party. The adversary’s goal
is to produce a set of party-message pairs (i1 , m1 ), (i2 , m2 ), . . . , (in , mn ) (for any
n ≤ z) and an aggregate tag τ such that the tag passes the verification check
and there is at least one party-message pair (ij , mj ) for which party ij has not
been corrupted and was never queried for the MAC of mj .
Katz and Lindell prove that their aggregate MAC scheme is secure provided
that the underlying MAC scheme is secure in the single-user setting. Their proof
is very similar to the one given for MAC* in §2.2, but is described asymptotically.
The total number of parties is z = p(r) for some unspecified polynomial p, and
the adversary A of the aggregate MAC scheme is assumed to be polynomially
bounded. The simulator B of A’s environment makes a guess for the index j,
and is successful in producing a forgery for the underlying MAC scheme provided
that its guess is correct. Since n ≤ z, the proof has a tightness gap of p(r).
It is easy to see that the Katz-Lindell aggregate MAC scheme succumbs to At-
tack 1. This security flaw in their scheme is a direct consequence of the tightness
gap in their proof.
As with rMAC, randomizing the MACs will prevent the attack. However,
since the randomizers would also have to be sent, this countermeasure defeats
the primary objective of the aggregate MAC scheme — a small aggregate tag.
A better solution would be to deploy fMAC as the underlying MAC scheme.
Hierarchical In-Network Data Aggregation. Chan, Perrig and Song [21]
presented the first provably secure hierarchical in-network data aggregation al-
gorithm. Such an algorithm can be used to securely perform queries on sensor
network data. A crucial component of the algorithm is the (independently discov-
ered) Katz-Lindell aggregate MAC scheme. In the data aggregation application,
each sensor node shares a secret key ki with the querier. At one stage of the
application, each node computes the tag τi = MACki (N, OK), where MAC is a
conventional MAC scheme, N is a nonce sent by the querier, and OK is a unique
message identifier. The aggregate tag is τ = τ1 ⊕ τ2 ⊕ · · · ⊕ τn . We emphasize
that the same nonce N and message identifier OK are used by each node. It
follows that the MAC scheme is vulnerable to Attack 1. In fact, the attack is
⁶ For simplicity, we assume the parties are distinct and hence n ≤ z.
easier to mount in this setting because the application itself requires each node
to compute its tag on a fixed message. The security proof for the aggregate MAC
scheme given in [21, Lemma 11] is very informal and assumes “that each of the
distinct MACs are unforgeable (and not correlated with each other)”, and then
concludes that “the adversary has no information about this [aggregate tag].”
History-Free Aggregate MACs. Eikemeier et al. [31] presented and analyzed
a MAC aggregation algorithm where the aggregation of individual tags must be
carried out in a sequential manner, and where the aggregation algorithm de-
pends only on the current message being MAC’ed and on the previous aggregate
tag. They provided an elaborate security definition and a security proof for their
scheme. We note that their security model allows the adversary to query individ-
ual parties for tags of messages of the adversary’s choosing. Consequently, their
history-free aggregate MAC scheme succumbs to Attack 1. Not surprisingly, the
security reduction in [31] is non-tight, with a tightness gap of at least z (the
total number of parties).
An Attack. For concreteness, suppose that SIV uses a block cipher with 80-bit
keys (such as SKIPJACK) as the underlying block cipher for CTR mode encryption
as well as for CMAC. Our attack on SIV is a chosen-plaintext attack in the
multi-user setting. The adversary selects an arbitrary message m and obtains
the ciphertext (IVi , ci ) from 220 parties i with secret key pairs ki = (ki , ki ).
As in Attack 1, the adversary then finds kj for some user j in about 260 steps.
Next, the adversary finds two equal-length messages m1 and m2 with m1 =
m2 and CMACkj (m1 ) = CMACkj (m2 ); this can be accomplished in about 240
steps using the van Oorschot-Wiener collision finding algorithm [71]. Finally, the
adversary requests the encryption of m1 from party j, receiving the ciphertext
(IV1 , c1 ). The adversary then computes the encryption of m2 as (IV1 , c1 ⊕ m1 ⊕
m2 ) as its forgery. It can easily be checked that this ciphertext will decrypt to
m2 and pass the verification check.
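The final step of the forgery relies only on the malleability of CTR mode: under a fixed IV the keystream is fixed, so c1 ⊕ m1 ⊕ m2 is a valid encryption of m2. A sketch with a toy keystream (the hash-counter construction and all names are illustrative assumptions):

```python
import hashlib
import secrets

def keystream(k: bytes, iv: bytes, n: int) -> bytes:
    # Toy CTR-style keystream: hash of (key, IV, counter), truncated (assumption)
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(k + iv + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def xor(*parts: bytes) -> bytes:
    out = bytes(len(parts[0]))
    for p in parts:
        out = bytes(a ^ b for a, b in zip(out, p))
    return out

k2 = secrets.token_bytes(10)     # the user's CTR-mode key (unknown to the adversary)
iv1 = secrets.token_bytes(16)    # IV of the queried ciphertext (IV1, c1)
m1 = b"equal-length message number 1"
m2 = b"equal-length message number 2"
c1 = xor(keystream(k2, iv1, len(m1)), m1)

# Adversary's forgery, computed without knowledge of k2:
c2 = xor(c1, m1, m2)

# The recipient's CTR decryption of (IV1, c2) yields m2; the SIV integrity check
# then reduces to CMAC(m1) = CMAC(m2) -- exactly the collision the adversary
# arranged in the 2^40-step search.
assert xor(keystream(k2, iv1, len(c2)), c2) == m2
```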
Our attack shows that, despite the provable security guarantees of SIV in
[62], this particular implementation of SIV does not achieve the desired 80-bit
security level in the multi-user setting. Note, however, that the attack may not
be relevant in the context of the key-wrap problem. Since “the plaintext carries
a key”, it will not be possible for the adversary to obtain 2^20 (IVi, ci) pairs on
the same message m.
⁷ In the interest of simplicity, our description of SIV omits some details from [62]. In
particular, we omit the header, which in any case “may be absent”, and use CMAC
instead of CMAC*. These omissions do not have any bearing on our attack.
will typically be much longer than k, a collision encountered during the attack
will most likely be due to a key collision. In the context of disk encryption, there
is no notion of session keys — the different keys would correspond to different
users. The encryption of a fixed tweak-message pair can be obtained by inducing
the users to encrypt the chosen message for the chosen disk sector.
Fixes for AE and Disk Encryption Schemes. In the multi-user setting,
one way to ensure that an r-bit security level is achieved against our attacks
(without changing the underlying block cipher) is to use multiple keys that
together are longer than r bits. Examples of such schemes are Poly1305-AES
[6] and the disk encryption schemes in [64]. The use of multiple keys, however,
does not immediately guarantee resistance to Attack 1 — as we have seen, SIV
is vulnerable to the attack since the first ciphertext component depends only on
the first SIV key — and hence the modification of a mode of operation to resist
Attack 1 should be done with care.
6 Concluding Remarks
We showed that ignoring the tightness gaps in reductionist security proofs can
have damaging consequences in practice. Our examples involve MAC schemes in
the multi-user setting. In particular, the tightness gap in the natural reduction
from MAC1 to MAC* indicates a real security weakness, whereas the tightness
gaps in the natural reductions from MAC1 to rMAC* and fMAC* do not seem to
matter in practice. Our examples illustrate the difficulty of interpreting a non-
tight security proof in practice. Although our examples all involve the multi-user
setting, we feel that they call into question the practical value of all non-tight
security proofs. We also demonstrated potential security weaknesses of provably-
secure authenticated encryption schemes in the multi-user setting.
Practitioners who use security proofs as a tool to assess the security of a cryp-
tographic system, but rely more heavily on extensive cryptanalysis and sound
engineering principles, should not be alarmed by our observations. On the other
hand, theoreticians who believe that a security proof is the essential, and per-
haps the only, way to gain confidence in the security of a protocol should be
much more skeptical of non-tight proofs (unless, of course, the proof is accom-
panied by a clearly-stated requirement that security parameters be increased to
accommodate the tightness gap) and perhaps even reject these proofs as mere
heuristic arguments for the protocol’s security.
References
1. Alexi, W., Chor, B., Goldreich, O., Schnorr, C.P.: RSA and Rabin functions: Cer-
tain parts are as hard as the whole. SIAM J. Computing 17, 194–209 (1988)
2. Bellare, M., Boldyreva, A., Micali, S.: Public-Key Encryption in a Multi-User Set-
ting: Security Proofs and Improvements. In: Preneel, B. (ed.) EUROCRYPT 2000.
LNCS, vol. 1807, pp. 259–274. Springer, Heidelberg (2000)
3. Bellare, M., Canetti, R., Krawczyk, H.: Keying Hash Functions for Message Au-
thentication. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 1–15.
Springer, Heidelberg (1996)
4. Bellare, M., Rogaway, P.: Entity Authentication and Key Distribution. In: Stinson,
D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 232–249. Springer, Heidelberg
(1994)
5. Bellare, M., Rogaway, P.: The Exact Security of Digital Signatures - How to Sign
with RSA and Rabin. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070,
pp. 399–416. Springer, Heidelberg (1996)
6. Bernstein, D.: The Poly1305-AES Message-Authentication Code. In: Gilbert, H.,
Handschuh, H. (eds.) FSE 2005. LNCS, vol. 3557, pp. 32–49. Springer, Heidelberg
(2005)
7. Bernstein, D.: Proving Tight Security for Rabin-Williams Signatures. In: Smart,
N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 70–87. Springer, Heidelberg
(2008)
8. Bhaskar, R., Herranz, J., Laguillaumie, F.: Aggregate designated verifier signatures
and application to secure routing. Int. J. Security and Networks 2, 192–201 (2007)
9. Biham, E.: How to decrypt or even substitute DES-encrypted messages in 2^28
steps. Information Processing Letters 84, 117–124 (2002)
10. Biryukov, A., Mukhopadhyay, S., Sarkar, P.: Improved Time-Memory Trade-Offs
with Multiple Data. In: Preneel, B., Tavares, S. (eds.) SAC 2005. LNCS, vol. 3897,
pp. 110–127. Springer, Heidelberg (2006)
11. Biryukov, A., Shamir, A.: Cryptanalytic Time/Memory/Data Tradeoffs for Stream
Ciphers. In: Okamoto, T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 1–13.
Springer, Heidelberg (2000)
12. Black, J.A., Rogaway, P.: CBC MACs for Arbitrary-Length Messages: The Three-
Key Constructions. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp.
197–215. Springer, Heidelberg (2000)
13. Blake-Wilson, S., Johnson, D., Menezes, A.: Key Agreement Protocols and Their
Security Analysis. In: Darnell, M.J. (ed.) Cryptography and Coding 1997. LNCS,
vol. 1355, pp. 30–45. Springer, Heidelberg (1997),
http://www.cacr.math.uwaterloo.ca/techreports/1997/corr97-17.ps
14. Blum, L., Blum, M., Shub, M.: A simple unpredictable pseudo-random number
generator. SIAM J. Computing 15, 364–383 (1986)
15. Boneh, D., Boyen, X.: Efficient Selective-ID Secure Identity-Based Encryption
without Random Oracles. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT
2004. LNCS, vol. 3027, pp. 223–238. Springer, Heidelberg (2004)
16. Boneh, D., Franklin, M.: Identity-based encryption from the Weil pairing. SIAM
J. Computing 32, 586–615 (2003)
17. Boneh, D., Gentry, C., Lynn, B., Shacham, H.: Aggregate and Verifiably Encrypted
Signatures from Bilinear Maps. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS,
vol. 2656, pp. 416–432. Springer, Heidelberg (2003)
18. Boyen, X.: A tapestry of identity-based encryption: practical frameworks com-
pared. Int. J. Applied Cryptography 1, 3–21 (2008)
19. Boyen, X., Martin, L.: Identity-based cryptography standard (IBCS) #1: Supersin-
gular curve implementations of the BF and BB1 cryptosystems. IETF RFC 5091
(2007)
20. Canetti, R., Krawczyk, H.: Analysis of Key-Exchange Protocols and their Use
for Building Secure Channels. In: Pfitzmann, B. (ed.) EUROCRYPT 2001.
LNCS, vol. 2045, pp. 453–474. Springer, Heidelberg (2001), Full version at
http://eprint.iacr.org/2001/040
21. Chan, H., Perrig, A., Song, D.: Secure hierarchical in-network aggregation in sensor
networks. In: CCS 2006, pp. 278–287 (2006)
22. Chen, L., Cheng, Z.: Security Proof of Sakai-Kasahara’s Identity-Based Encryption
Scheme. In: Smart, N.P. (ed.) Cryptography and Coding 2005. LNCS, vol. 3796,
pp. 442–459. Springer, Heidelberg (2005)
23. Coron, J.-S.: On the Exact Security of Full Domain Hash. In: Bellare, M. (ed.)
CRYPTO 2000. LNCS, vol. 1880, pp. 229–235. Springer, Heidelberg (2000)
24. Coron, J.-S.: Optimal Security Proofs for PSS and Other Signature Schemes. In:
Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332, pp. 272–287. Springer,
Heidelberg (2002)
25. Damgård, I.: A “Proof-Reading” of Some Issues in Cryptography. In: Arge, L.,
Cachin, C., Jurdziński, T., Tarlecki, A. (eds.) ICALP 2007. LNCS, vol. 4596, pp.
2–11. Springer, Heidelberg (2007)
26. Dang, Q.: Recommendation for applications using approved hash algorithms. NIST
Special Publication 800-107 (2009)
27. Dworkin, M.: Recommendation for block cipher modes of operation: Methods and
techniques. NIST Special Publication 800-38A (2001)
28. Dworkin, M.: Recommendation for block cipher modes of operation: The CMAC
mode for authentication. NIST Special Publication 800-38B (2005)
29. Eastlake, D., Crocker, S., Schiller, J.: Randomness recommendations for security.
IETF RFC 1750 (1994)
30. The eSTREAM project, http://www.ecrypt.eu.org/stream/
31. Eikemeier, O., Fischlin, M., Götzmann, J.-F., Lehmann, A., Schröder, D., Schröder,
P., Wagner, D.: History-Free Aggregate Message Authentication Codes. In: Garay,
J.A., De Prisco, R. (eds.) SCN 2010. LNCS, vol. 6280, pp. 309–328. Springer,
Heidelberg (2010)
32. FIPS 180-3, Secure Hash Standard (SHS), Federal Information Processing Stan-
dards Publication 180-3, National Institute of Standards and Technology (2008)
33. FIPS 198-1, The Keyed-Hash Message Authentication Code (HMAC), Federal In-
formation Processing Standards Publication 198, National Institute of Standards
and Technology (2008)
34. Galindo, D.: Boneh-Franklin Identity Based Encryption Revisited. In: Caires, L.,
Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS,
vol. 3580, pp. 791–802. Springer, Heidelberg (2005)
35. Gentry, C., Halevi, S.: Hierarchical Identity Based Encryption with Polynomially
Many Levels. In: Reingold, O. (ed.) TCC 2009. LNCS, vol. 5444, pp. 437–456.
Springer, Heidelberg (2009)
36. Goldreich, O.: On the Foundations of Modern Cryptography. In: Kaliski Jr., B.S.
(ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 46–74. Springer, Heidelberg (1997)
37. Halevi, S., Rogaway, P.: A Tweakable Enciphering Mode. In: Boneh, D. (ed.)
CRYPTO 2003. LNCS, vol. 2729, pp. 482–499. Springer, Heidelberg (2003)
38. Halevi, S., Rogaway, P.: A Parallelizable Enciphering Mode. In: Okamoto, T. (ed.)
CT-RSA 2004. LNCS, vol. 2964, pp. 292–304. Springer, Heidelberg (2004)
39. Hellman, M.: A cryptanalytic time-memory trade-off. IEEE Trans. Info. Th. 26,
401–406 (1980)
40. Hong, J., Sarkar, P.: New Applications of Time Memory Data Tradeoffs. In: Roy,
B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 353–372. Springer, Heidelberg
(2005)
41. Jakobsson, M., Sako, K., Impagliazzo, R.: Designated Verifier Proofs and their
Applications. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp.
143–154. Springer, Heidelberg (1996)
42. Katz, J., Lindell, A.: Aggregate Message Authentication Codes. In: Malkin, T.
(ed.) CT-RSA 2008. LNCS, vol. 4964, pp. 155–169. Springer, Heidelberg (2008)
43. Katz, J., Wang, N.: Efficiency improvements for signature schemes with tight se-
curity reductions. In: CCS 2003, pp. 155–164 (2003)
44. Kelly, S., Frankel, S.: Using HMAC-SHA-256, HMAC-SHA-384, and HMAC-SHA-
512 with IPsec. IETF RFC 4868 (2007)
45. Kent, S., Atkinson, R.: IP authentication header. IETF RFC 4302 (2005)
46. Koblitz, N., Menezes, A.: Another look at “provable security”. J. Cryptology 20,
3–37 (2007)
318 S. Chatterjee, A. Menezes, and P. Sarkar
47. Koblitz, N., Menezes, A.: Another Look at “Provable Security”. II. In: Barua,
R., Lange, T. (eds.) INDOCRYPT 2006. LNCS, vol. 4329, pp. 148–175. Springer,
Heidelberg (2006)
48. Krawczyk, H.: HMQV: A High-Performance Secure Diffie-Hellman Protocol. In:
Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 546–566. Springer, Heidelberg
(2005), Full version at http://eprint.iacr.org/2005/176
49. Kurosawa, K., Iwata, T.: TMAC: Two-Key CBC MAC. In: Joye, M. (ed.) CT-RSA
2003. LNCS, vol. 2612, pp. 33–49. Springer, Heidelberg (2003)
50. LaMacchia, B., Lauter, K., Mityagin, A.: Stronger Security of Authenticated Key
Exchange. In: Susilo, W., Liu, J.K., Mu, Y. (eds.) ProvSec 2007. LNCS, vol. 4784,
pp. 1–16. Springer, Heidelberg (2007)
51. Lu, S., Ostrovsky, R., Sahai, A., Shacham, H., Waters, B.: Sequential Aggregate
Signatures and Multisignatures without Random Oracles. In: Vaudenay, S. (ed.)
EUROCRYPT 2006. LNCS, vol. 4004, pp. 465–485. Springer, Heidelberg (2006)
52. Luby, M.: Pseudorandomness and Cryptographic Applications. Princeton
University Press (1996)
53. McGrew, D.A., Viega, J.: The Security and Performance of the Galois/Counter
Mode (GCM) of Operation. In: Canteaut, A., Viswanathan, K. (eds.)
INDOCRYPT 2004. LNCS, vol. 3348, pp. 343–355. Springer, Heidelberg (2004)
54. Menezes, A., Smart, N.: Security of signature schemes in the multi-user setting.
Designs, Codes and Cryptography 33, 261–274 (2004)
55. Menezes, A., Ustaoglu, B.: Security arguments for the UM key agreement protocol
in the NIST SP 800-56A standard. In: ASIACCS 2008, pp. 261–270 (2008)
56. National Security Agency, SKIPJACK and KEA algorithm specification, Version
2.0 (May 29, 1998)
57. Paillier, P., Vergnaud, D.: Discrete-Log-Based Signatures May Not Be Equiva-
lent to Discrete Log. In: Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp.
1–20. Springer, Heidelberg (2005)
58. Pointcheval, D., Stern, J.: Security arguments for digital signatures and blind sig-
natures. J. Cryptology 13, 361–396 (2000)
59. Rogaway, P.: Efficient Instantiations of Tweakable Blockciphers and Refinements
to Modes OCB and PMAC. In: Lee, P.J. (ed.) ASIACRYPT 2004. LNCS, vol. 3329,
pp. 16–31. Springer, Heidelberg (2004)
60. Rogaway, P.: OCB: Background,
http://www.cs.ucdavis.edu/~rogaway/ocb/ocb-faq.htm
61. Rogaway, P., Bellare, M., Black, J.: OCB: A block-cipher mode of operation for
efficient authenticated encryption. ACM Trans. Information and System Security 6,
365–403 (2003)
62. Rogaway, P., Shrimpton, T.: A Provable-Security Treatment of the Key-Wrap
Problem. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 373–
390. Springer, Heidelberg (2006), Full version at
http://eprint.iacr.org/2006/221
63. Sarkar, P.: Pseudo-random functions and parallelizable modes of operations of a
block cipher. IEEE Trans. Info. Th. 56, 4025–4037 (2010)
64. Sarkar, P.: Tweakable enciphering schemes using only the encryption function of a
block cipher. Inf. Process. Lett. 111, 945–955 (2011)
65. Schäge, S.: Tight Proofs for Signature Schemes without Random Oracles. In: Pa-
terson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 189–206. Springer,
Heidelberg (2011)
66. Schnorr, C.: Efficient signature generation for smart cards. J. Cryptology 4,
161–174 (1991)
67. Shin, J.: Enhancing privacy in cryptographic protocols, Ph.D. thesis, University of
Maryland (2009)
68. Sidorenko, A., Schoenmakers, B.: Concrete Security of the Blum-Blum-Shub Pseu-
dorandom Generator. In: Smart, N.P. (ed.) Cryptography and Coding 2005. LNCS,
vol. 3796, pp. 355–375. Springer, Heidelberg (2005)
69. Song, J.H., Poovendran, R., Lee, J., Iwata, T.: The AES-CMAC algorithm. IETF
RFC 4493 (2006)
70. Ustaoglu, B.: Obtaining a secure and efficient key agreement protocol from
(H)MQV and NAXOS. Designs, Codes and Cryptography 46, 329–342 (2008)
71. van Oorschot, P., Wiener, M.: Parallel collision search with cryptanalytic applica-
tions. J. Cryptology 12, 1–28 (1999)
72. Young, A., Yung, M.: Malicious Cryptography: Exposing Cryptovirology. Wiley
(2004)
Duplexing the Sponge:
Single-Pass Authenticated Encryption
and Other Applications
Guido Bertoni¹, Joan Daemen¹, Michaël Peeters², and Gilles Van Assche¹
¹ STMicroelectronics
² NXP Semiconductors
1 Introduction
While most symmetric-key modes of operations are based on a block cipher
or a stream cipher, there exist modes using a fixed permutation as underlying
primitive. Designing a cryptographically strong permutation suitable for such
purposes is similar to designing a block cipher without a key schedule and this
design approach was followed for several recent hash functions, see, e.g., [15].
The sponge construction is an example of such a mode. With its arbitrarily
long input and output sizes, it allows building various primitives such as a stream
cipher or a hash function [5]. In the former, the input is short (typically the key
and a nonce) while the output is as long as the message to encrypt. In contrast,
the latter takes a message of any length at input and produces a digest of small
length.
Some applications can take advantage of both a long input and a long out-
put size. For instance, authenticated encryption combines the encryption of a
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 320–337, 2012.
c Springer-Verlag Berlin Heidelberg 2012
These modes can readily be used by the concrete sponge function Keccak
[10] and the members of a recent wave of lightweight hash functions that are
in fact sponge functions: Quark [1], Photon [18] and Spongent [12]. For these,
and for the small-width instances of Keccak, our security bound against generic
attacks beyond the birthday bound published in [9] allows constructing solutions
that are at the same time compact, efficient and potentially secure.
The remainder of this paper is organized as follows. First, we propose a model
for authenticated encryption in Section 2. Then in Section 3, we review the
sponge construction. The core concept of this paper, namely the duplex con-
struction, is defined in Section 4. Its use for authenticated encryption is given
in Section 5 and for other applications in Section 6. Finally, Section 7 discusses
the use of a flexible and compact padding. For compactness reasons, the proofs
are omitted in this version and can be found in [8].
W : Z_2^k × (Z_2^*)^{2+} → Z_2^* × Z_2^t : (K, A, B) ↦ (C^(last), T^(last)) = W(K, A, B), and
U : Z_2^k × (Z_2^*)^{2+} × Z_2^t → Z_2^* ∪ {error} : (K, A, C, T^(last)) ↦ B^(last) or error.
Here, (Z_2^*)^{2+} means any sequence of binary strings, with an even number of such
strings and at least two. To wrap a sequence of header-body pairs, the sender
calls W(K, A^(1), B^(1)) with the first header-body pair to get (C^(1), T^(1)), then
W(K, A^(1), B^(1), A^(2), B^(2)) with the second one to get (C^(2), T^(2)), and so on.
To unwrap, the receiver first calls U(K, A^(1), C^(1), T^(1)) to retrieve the first body
B^(1), then U(K, A^(1), C^(1), A^(2), C^(2), T^(2)) to retrieve the second body, and so
on. As we consider only the case of non-expanding encryption, we assume that
|C^(i)| = |B^(i)| for all i.
with R(A, B) = ⌊RO(A, B)⌋_{|B^(n)|+t}, where B^(n) is the last body in (A, B), |x| is
the bitlength of string x, ⌊·⌋_ℓ indicates truncation to ℓ bits and K ←$ Z_2^k means
that K is chosen randomly and uniformly from the set Z_2^k. In this definition, we
use a random oracle RO as defined in [3], but allowing sequences of one or more
binary strings as input (instead of a single binary string). Here, a random oracle
is a map from (Z_2^*)^+ to Z_2^∞, chosen by selecting each bit of RO(x) uniformly
and independently, for every input. The original definition can still be used by
defining an injective mapping from (Z_2^*)^+ to Z_2^*.
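Such a random oracle over sequences of binary strings can be modeled by lazy sampling; the sketch below (class and method names are ours, purely illustrative) uses a length-prefix encoding as the injective mapping just mentioned.

```python
import os

class SeqRO:
    """Lazily sampled random oracle on sequences of byte strings."""
    def __init__(self):
        self.table = {}  # encoded input -> output stream, sampled on demand

    @staticmethod
    def encode(strings):
        # Injective mapping from a sequence of strings to a single string:
        # length-prefix each component so boundaries are recoverable.
        return b"".join(len(s).to_bytes(4, "big") + s for s in strings)

    def query(self, strings, nbytes: int) -> bytes:
        key = self.encode(strings)
        out = self.table.get(key, b"")
        while len(out) < nbytes:
            out += os.urandom(16)           # sample fresh output bits lazily
        self.table[key] = out
        return out[:nbytes]
```

Note that repeated queries on the same sequence return consistent (prefix-compatible) answers, which is exactly the property the security definitions below rely on.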
For privacy, we consider only adversaries who respect the nonce requirement.
For a single header-body pair, it means that, for any two queries (A, B) and
(A', B'), we have A = A' ⇒ B = B'. In general, the nonce requirement
specifies that for any two queries (A, B) and (A', B') of equal length n, we
have

pre(A, B) = pre(A', B') ⇒ B^(n) = B'^(n),

with pre(A, B) = (A^(1), B^(1), A^(2), . . . , B^(n−1), A^(n)) the sequence with the last
body omitted. As for a stream cipher, not respecting the nonce requirement
means that the adversary can learn the bitwise difference between two plaintext
bodies.
Authenticity is defined in Eq. (2) below. Informally, it quantifies the proba-
bility of the adversary successfully generating a forged ciphertext-tag pair.
Adv^auth(A) = Pr[K ←$ Z_2^k : A[W(K, ·, ·)] outputs a forgery].   (2)
Here a forgery is a sequence (A, C, T) such that U(K, A, C, T) ≠ error and such that
the adversary made no query to W with input (A, B) returning (C^(n), T), with
C^(n) the last ciphertext body of (A, C). Note that authenticity does not need the
nonce requirement.
The unwrapping algorithm U first checks that T^(n) = ⌊RO_T(K, A, B)⌋_t and if
so decrypts each body B^(i) = ⌊RO_C(K, A^(1), B^(1), A^(2), . . . , A^(i))⌋_{|C^(i)|} ⊕ C^(i)
from the first one to the last one and finally returns the last one, B^(n) =
⌊RO_C(K, pre(A, B))⌋_{|C^(n)|} ⊕ C^(n).
The security of ROwrap is captured by Lemmas 1 and 2.
The sponge construction [5] builds a function sponge[f, pad, r] with variable-
length input and arbitrary output length using a fixed-length permutation (or
transformation) f , a padding rule “pad” and a parameter bitrate r.
For the padding rule we use the following notation: the padding of a message
M to a sequence of x-bit blocks is denoted by M ||pad[x](|M |), where |M | is the
length of M . This notation highlights that we only consider padding rules that
append a bitstring that is fully determined by the length of M and the block
length x. We may omit [x], |M | or both if their value is clear from the context.
For the sponge construction to be secure (see Section 3.2), the padding rule
pad must be sponge-compliant. As a sufficient condition, a padding rule that is
reversible, non-empty and whose last block is non-zero is sponge-compliant [5].
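As an example, a pad10*1-style rule (append a 1 bit, zero or more 0 bits, and a final 1 bit) meets this sufficient condition; the bit-level sketch below is illustrative and not the padding of any specific standard.

```python
def pad10star1(msg_bits, x):
    # The appended string depends only on |M| and the block length x, as required.
    p = list(msg_bits) + [1]
    p += [0] * ((-len(p) - 1) % x)  # zeros so that one final bit completes a block
    p += [1]                        # last block ends in 1, hence is non-zero
    return p

def unpad10star1(p):
    # Reversibility: strip the final 1, the run of zeros, then the leading 1.
    assert p[-1] == 1
    i = len(p) - 2
    while p[i] == 0:
        i -= 1
    return p[:i]                    # p[i] == 1 is the first appended bit
```

The padding is non-empty (at least two bits are always appended), the last x-bit block ends in a 1 and is therefore non-zero, and `unpad10star1` shows it is reversible.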
3.1 Definition
The permutation f operates on a fixed number of bits, the width b. The sponge
construction has a state of b bits. First, all the bits of the state are initialized
to zero. The input message is padded with the function pad[r] and cut into
r-bit blocks. Then it proceeds in two phases: the absorbing phase followed by
the squeezing phase. In the absorbing phase, the r-bit input message blocks are
XORed into the first r bits of the state, interleaved with applications of the
function f . When all message blocks are processed, the sponge construction
switches to the squeezing phase. In the squeezing phase, the first r bits of the
state are returned as output blocks, interleaved with applications of the function
f . The number of iterations is determined by the requested number of bits.
Finally the output is truncated to the requested length. Algorithm 1 provides a
formal definition.
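The two phases can be sketched in a few lines of Python; the 16-byte toy permutation, the rate r = 8 bytes, and the byte-granular padding below are all hypothetical stand-ins for illustration, not Keccak or any real sponge instance.

```python
def toy_f(state: bytes) -> bytes:
    # Hypothetical toy permutation on b = 16 bytes (invertible but NOT
    # cryptographically strong); it only stands in for f.
    out = bytearray(state)
    for _ in range(4):
        out = bytearray(out[-1:] + out[:-1])      # rotate the bytes
        for i in range(len(out)):
            out[i] = (out[i] * 5 + i + 1) % 256   # invertible byte mixing
    return bytes(out)

def pad_r(msg: bytes, r: int) -> bytes:
    # Byte-granular sponge-compliant padding (pad10*1 style, simplified).
    return msg + b"\x01" + b"\x00" * ((-len(msg) - 2) % r) + b"\x80"

def sponge(f, msg: bytes, out_len: int, b: int = 16, r: int = 8) -> bytes:
    state = bytes(b)                              # state initialized to zero
    p = pad_r(msg, r)
    # Absorbing phase: XOR r-byte blocks into the first r bytes, apply f.
    for i in range(0, len(p), r):
        block = p[i:i + r] + bytes(b - r)
        state = f(bytes(x ^ y for x, y in zip(block, state)))
    # Squeezing phase: return the first r bytes, interleaved with f.
    out = b""
    while len(out) < out_len:
        out += state[:r]
        state = f(state)
    return out[:out_len]                          # truncate to the request
```

Note that the last c = b − r bytes of the state are never directly touched by input blocks nor output, matching the capacity discussion that follows.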
The value c = b − r is called the capacity. The last c bits of the state are never
directly affected by the input blocks and are never output during the squeezing
phase. The capacity c actually determines the attainable security level of the
construction [6,9].
3.2 Security
Cryptographic functions are often designed in two steps. In the first step, one
chooses a construction that uses a cryptographic primitive with fixed input and
output size (e.g., a compression function or a permutation) and builds a function
that can take inputs and/or generate outputs of arbitrary size. If the security
of this construction can be proven, for instance as in this case using the in-
differentiability framework, it reduces the scope of cryptanalysis to that of the
underlying primitive and guarantees the absence of single-stage generic attacks
(e.g., preimage, second preimage and collision attacks) [21]. However, generic
security in the multi-stage setting using the indifferentiability framework is cur-
rently an open problem [25].
It is shown in [6] that the success probability of any single-stage generic attack
for differentiating the sponge construction calling a random permutation or
transformation from a random oracle is upper bounded by 2^{−(c+1)}N^2. Here N
is the number of calls to the underlying permutation or its inverse. This implies
that any single-stage generic attack on a sponge function has success probability
of at most 2^{−(c+1)}N^2 plus the success probability of this attack on a random
oracle.
In [9], we address the security of the sponge construction when the message
is prefixed with a key, as it will be done in the mode of Section 5. In this specific
case, the security proof goes beyond the 2^{c/2} complexity if the number of input
or output blocks for which the key is used (data complexity) is upper bounded
by M < 2^{c/2−1}. In that case, distinguishing the keyed sponge from a random
oracle has time complexity of at least 2^{c−1}/M > 2^{c/2}. Hence, for keyed modes,
one can reduce the capacity c for the same targeted security level.
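As a numeric illustration of this trade-off (the parameter values are hypothetical, not recommendations): with capacity c = 160 and data complexity capped at M = 2^40 blocks, the bound gives a time complexity of 2^{c−1}/M = 2^119, well beyond the generic 2^{c/2} = 2^80.

```python
from math import log2

c = 160                 # capacity in bits (hypothetical choice)
M = 2 ** 40             # data complexity, must satisfy M < 2**(c/2 - 1)
assert M < 2 ** (c / 2 - 1)

time_log2 = (c - 1) - log2(M)   # log2 of the bound 2**(c-1) / M
print(time_log2)                # 119.0, i.e., above the 2**(c/2) = 2**80 level
```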
of the tag, while we aim for a single-pass solution. To achieve this, we define
a variant where the key stream blocks and tag are the responses of a sponge
function to input sequences that are each other’s prefix. This introduces a new
construction that is closely related to the sponge construction: the duplex con-
struction. Subsequently, we build an authenticated encryption mode on top
of that.
Upon receipt of a D.duplexing(σ, ℓ) call, the duplex object pads the input string
σ and XORs it into the first r bits of the state. Then it applies f to the state
Algorithm 2. The duplex construction duplex[f, pad, r]

Interface: D.initialize()
  s = 0^b
Interface: Z = D.duplexing(σ, ℓ) with ℓ ≤ r, σ ∈ ⋃_{n=0}^{ρ_max(pad,r)} Z_2^n, and Z ∈ Z_2^ℓ
  P = σ||pad[r](|σ|)
  s = s ⊕ (P||0^{b−r})
  s = f(s)
  return ⌊s⌋_ℓ
and returns the first ℓ bits of the state at the output. We call a blank call a
call with σ the empty string, and a mute call a call without output, ℓ = 0. The
duplex construction is illustrated in Figure 1, and Algorithm 2 provides a formal
definition.
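The formal definition translates directly into code; the sketch below uses a hypothetical 16-byte toy permutation and byte-granular padding (so ρ_max is r − 2 bytes here, since the padding always adds two bytes), purely for illustration.

```python
def toy_f(state: bytes) -> bytes:
    # Hypothetical toy permutation on b = 16 bytes; NOT cryptographically strong.
    out = bytearray(state)
    for _ in range(4):
        out = bytearray(out[-1:] + out[:-1])
        for i in range(len(out)):
            out[i] = (out[i] * 5 + i + 1) % 256
    return bytes(out)

class Duplex:
    """Toy duplex object: s = 0^b, then one f call per duplexing call."""
    def __init__(self, f, b: int = 16, r: int = 8):
        self.f, self.b, self.r = f, b, r
        self.state = bytes(b)                    # s = 0^b

    def duplexing(self, sigma: bytes, ell: int) -> bytes:
        assert ell <= self.r and len(sigma) <= self.r - 2
        # P = sigma || pad[r](|sigma|): pad10*1-style, byte granular.
        pad = b"\x01" + b"\x00" * (self.r - len(sigma) - 2) + b"\x80"
        block = sigma + pad + bytes(self.b - self.r)
        self.state = self.f(bytes(x ^ y for x, y in zip(block, self.state)))
        return self.state[:ell]                  # first ell bytes of the state
```

Unlike a sponge call, a duplexing call keeps the state alive between calls, so each response depends on all inputs absorbed so far.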
The following lemma links the security of the duplex construction to that of
the sponge construction with the same parameters, i.e., duplex[f, pad, r] and
sponge[f, pad, r]. Generating the output of a D.duplexing() call using a sponge
function is illustrated in Figure 2.
The output of a duplexing call is thus the output of a sponge function with
an input σ_0||pad_0||σ_1||pad_1|| · · · ||σ_i, and from this input the exact sequence
σ_0, σ_1, . . . , σ_i can be recovered, as shown in Lemma 4 below. As such, the duplex
construction is as secure as the sponge construction with the same parameters.
In particular, it inherits its resistance against (single-stage) generic attacks. The
reference point in this case is a random oracle whose input is the sequence of
inputs to the duplexing calls since the initialization.
Lemma 4. Let pad and r be fixed. Then, the mapping from a sequence of binary
strings (σ_0, σ_1, . . . , σ_n) with |σ_i| ≤ ρ_max(pad, r) ∀i to the binary string s =
σ_0||pad_0||σ_1||pad_1|| · · · ||pad_{n−1}||σ_n is injective.
In the following sections we will show that the duplex construction is a powerful
tool for building modes of use.
5.1 Definition
A SpongeWrap object W internally uses a duplex object D with parameters
f, pad and r. Upon initialization of a SpongeWrap object, it initializes D and
forwards the (padded) key blocks K to D using mute D.duplexing() calls.
When receiving a W.wrap(A, B, ℓ) request, it forwards the blocks of the
(padded) header A and the (padded) body B to D. It generates the cryptogram
C block by block as C_i = B_i ⊕ Z_i, with Z_i the response of D to the previous
D.duplexing() call. The ℓ-bit tag T is the response of D to the last body block
(possibly extended with the response to additional blank D.duplexing() calls in
case ℓ > ρ). Finally it returns the cryptogram C and the tag T.
When receiving a W.unwrap(A, C, T) request, it forwards the blocks of the
(padded) header A to D. It decrypts the data body B block by block as B_i =
C_i ⊕ Z_i, with Z_i the response of D to the previous D.duplexing() call. The response of D
to the last body block (possibly extended) is compared with the tag T received
as input. If the tag is valid, it returns the data body B; otherwise, it returns
an error. Note that in implementations one may impose additional constraints,
such as SpongeWrap objects dedicated to either wrapping or unwrapping.
Additionally, the SpongeWrap object should impose a minimum length t for
the tag received before unwrapping and could break the entire session as soon
as an incorrect tag is received.
Before being forwarded to D, every key, header, data or cryptogram block
is extended with a so-called frame bit. The rate ρ of the SpongeWrap mode
determines the size of the blocks and hence the maximum number of bits pro-
cessed per call to f . Its upper bound is ρmax (pad, r) − 1 due to the inclusion
of one frame bit per block. A formal definition of SpongeWrap is given in
Algorithm 3.
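A simplified transcription of this flow, on top of a toy duplex object defined inline so the sketch is self-contained: the one-byte frame marker below replaces the single frame bit of the real mode, the block size ρ and all parameters are toy values, and no claim is made that this matches Algorithm 3 bit for bit.

```python
def toy_f(state: bytes) -> bytes:
    # Hypothetical toy permutation (b = 16 bytes); NOT cryptographically strong.
    out = bytearray(state)
    for _ in range(4):
        out = bytearray(out[-1:] + out[:-1])
        for i in range(len(out)):
            out[i] = (out[i] * 5 + i + 1) % 256
    return bytes(out)

class Duplex:
    def __init__(self, f, b=16, r=8):
        self.f, self.b, self.r, self.state = f, b, r, bytes(b)
    def duplexing(self, sigma, ell):
        pad = b"\x01" + b"\x00" * (self.r - len(sigma) - 2) + b"\x80"
        block = sigma + pad + bytes(self.b - self.r)
        self.state = self.f(bytes(x ^ y for x, y in zip(block, self.state)))
        return self.state[:ell]

RHO = 5  # toy block size in bytes, leaving room for the frame marker

def blocks(data):
    return [data[i:i + RHO] for i in range(0, len(data), RHO)] or [b""]

class SpongeWrap:
    def __init__(self, key: bytes):
        self.D = Duplex(toy_f)
        for blk in blocks(key):
            self.D.duplexing(blk + b"\x01", 0)          # mute calls absorb the key

    def _absorb_header(self, header):
        hb = blocks(header)
        for blk in hb[:-1]:
            self.D.duplexing(blk + b"\x00", 0)
        return self.D.duplexing(hb[-1] + b"\x01", RHO)  # keystream for B_0

    def wrap(self, header, body, taglen):
        z, C = self._absorb_header(header), b""
        bb = blocks(body)
        for i, blk in enumerate(bb):
            C += bytes(x ^ y for x, y in zip(blk, z))   # C_i = B_i xor Z_i
            frame = b"\x00" if i == len(bb) - 1 else b"\x01"
            z = self.D.duplexing(blk + frame, RHO)
        return C, z[:taglen]                            # tag from the last response

    def unwrap(self, header, cipher, tag):
        z, B = self._absorb_header(header), b""
        cb = blocks(cipher)
        for i, blk in enumerate(cb):
            p = bytes(x ^ y for x, y in zip(blk, z))    # B_i = C_i xor Z_i
            B += p
            frame = b"\x00" if i == len(cb) - 1 else b"\x01"
            z = self.D.duplexing(p + frame, RHO)
        return B if z[:len(tag)] == tag else None       # None signals an error
```

Because the unwrapper absorbs the decrypted (plaintext) blocks, its duplex state follows the wrapper's exactly, so the tags match if and only if key, header and ciphertext are untampered.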
5.2 Security
i.e., the last body B^(n) is truncated to its first i blocks of ρ bits. We define
ROwrap[ρ] identically to ROwrap, except that in the wrapping algorithm, we
have

C^(n) = (⌊RO_C(K, pre_0(A, B))⌋_{|B_0^(n)|} ⊕ B_0^(n))
     || (⌊RO_C(K, pre_1(A, B))⌋_{|B_1^(n)|} ⊕ B_1^(n))
     || · · ·
     || (⌊RO_C(K, pre_w(A, B))⌋_{|B_w^(n)|} ⊕ B_w^(n))
Adv^priv_{SpongeWrap[f,pad,r,ρ]}(A) < q·2^{−k} + N(N+1)/2^{c+1}   and

Adv^auth_{SpongeWrap[f,pad,r,ρ]}(A) < q·2^{−k} + 2^{−t} + N(N+1)/2^{c+1},

against any single adversary A if K ←$ Z_2^k, tags of ℓ ≥ t bits are used, f is a
randomly chosen permutation, q is the number of queries and N is the number
of times f is called.
Note that all the outputs of SpongeWrap are equivalent to calls to a sponge
function with the secret key blocks as a prefix. So the results of [9] can also be
applied to SpongeWrap as explained in Section 3.2.
then expands the seed into a sequence of bits. For cryptographic purposes, it is
required that the generated bits cannot be predicted, even if subsets of the se-
quence are revealed. In this context, a PRG is similar to a stream cipher. A PRG
is also similar to a cryptographic hash function when gathering entropy coming
from different sources. Finally, some applications require a pseudo-random bit
sequence generator to support forward security: The compromise of the cur-
rent state does not enable the attacker to determine the previously generated
pseudo-random bits [4,13].
Conveniently, a pseudo-random bit sequence generator can be reseedable, i.e.,
one can bring an additional source of entropy after pseudo-random bits have
been generated. Instead of throwing away the current state of the PRG, reseeding
combines the current state of the generator with the new seed material. In [7] a
reseedable PRG was defined based on the sponge construction that implements
the required functionality. The ideas behind that PRG are very similar to the
duplex construction. We however show that such a PRG can be defined on top
of the duplex construction.
A duplex object can readily be used as a reseedable PRG. Seed material can
be fed via the σ inputs in D.duplexing() call and the responses can be used as
pseudo-random bits. If pseudo-random bits are required and there is no seed
available, one can simply send blank D.duplexing() calls. The only limitation
is that the user must split the seed material into strings of at most ρ_max
bits and that at most r bits can be requested in a single call. This limitation
is removed in a more elaborate generator called SpongePRG presented in [8].
This mode is similar to the one proposed in [7] in that it minimizes the number
of calls to f, although it is explicitly based on the duplex construction.
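In code, this direct use of a duplex object as a reseedable PRG looks as follows (over a hypothetical toy duplex defined inline; SpongePRG itself is more elaborate, and the forward-security hardening is omitted here).

```python
def toy_f(state: bytes) -> bytes:
    # Hypothetical toy permutation; NOT cryptographically strong.
    out = bytearray(state)
    for _ in range(4):
        out = bytearray(out[-1:] + out[:-1])
        for i in range(len(out)):
            out[i] = (out[i] * 5 + i + 1) % 256
    return bytes(out)

class Duplex:
    def __init__(self, f, b=16, r=8):
        self.f, self.b, self.r, self.state = f, b, r, bytes(b)
    def duplexing(self, sigma, ell):
        pad = b"\x01" + b"\x00" * (self.r - len(sigma) - 2) + b"\x80"
        block = sigma + pad + bytes(self.b - self.r)
        self.state = self.f(bytes(x ^ y for x, y in zip(block, self.state)))
        return self.state[:ell]

class DuplexPRG:
    """Reseedable PRG: seed via the sigma inputs, bits via blank calls."""
    def __init__(self):
        self.D = Duplex(toy_f)

    def feed(self, seed: bytes):
        step = self.D.r - 2                        # rho_max in this toy setting
        for i in range(0, max(len(seed), 1), step):
            self.D.duplexing(seed[i:i + step], 0)  # mute calls absorb the seed

    def fetch(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            out += self.D.duplexing(b"", self.D.r)  # blank calls squeeze bits
        return out[:n]
```

Reseeding with `feed` at any point mixes the new entropy into the running state rather than replacing it, as described above.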
always just try to differentiate the weakest construction from a random oracle.
The next theorem states that we achieve this bound by using the multi-rate
padding.
References
1. Aumasson, J.-P., Henzen, L., Meier, W., Naya-Plasencia, M.: Quark: A lightweight
hash. In: Mangard and Standaert [20], pp. 1–15
2. Bellare, M., Namprempre, C.: Authenticated Encryption: Relations among No-
tions and Analysis of the Generic Composition Paradigm. In: Okamoto, T. (ed.)
ASIACRYPT 2000. LNCS, vol. 1976, pp. 531–545. Springer, Heidelberg (2000)
3. Bellare, M., Rogaway, P.: Random oracles are practical: A paradigm for designing
efficient protocols. In: ACM (ed.) ACM Conference on Computer and Communi-
cations Security 1993, pp. 62–73 (1993)
4. Bellare, M., Yee, B.: Forward-security in private-key cryptography. Cryptology
ePrint Archive, Report 2001/035 (2001), http://eprint.iacr.org/
5. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: Sponge functions. In: Ecrypt
Hash Workshop (May 2007), public comment to NIST, from
http://www.csrc.nist.gov/pki/HashWorkshop/Public_Comments/2007_May.html
6. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: On the Indifferentiability
of the Sponge Construction. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS,
vol. 4965, pp. 181–197. Springer, Heidelberg (2008),
http://sponge.noekeon.org/
7. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: Sponge-based pseudo-
random number generators. In: Mangard and Standaert [20], pp. 33–47
8. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: Duplexing the sponge: single-
pass authenticated encryption and other applications. Cryptology ePrint Archive,
Report 2011/499 (2011), http://eprint.iacr.org/
9. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: On the security of the keyed
sponge construction. In: Symmetric Key Encryption Workshop (SKEW) (February
2011)
10. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: The Keccak reference
(January 2011), http://keccak.noekeon.org/
11. Biryukov, A. (ed.): FSE 2007. LNCS, vol. 4593. Springer, Heidelberg (2007)
12. Bogdanov, A., Knežević, M., Leander, G., Toz, D., Varıcı, K., Verbauwhede, I.:
spongent: A Lightweight Hash Function. In: Preneel, B., Takagi, T. (eds.) CHES
2011. LNCS, vol. 6917, pp. 312–325. Springer, Heidelberg (2011)
13. Desai, A., Hevia, A., Yin, Y.L.: A Practice-Oriented Treatment of Pseudorandom
Number Generators. In: Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332,
pp. 368–383. Springer, Heidelberg (2002)
14. Dworkin, M.: Request for review of key wrap algorithms. Cryptology ePrint
Archive, Report 2004/340 (2004), http://eprint.iacr.org/
Yusuke Naito
1 Introduction
The blockcipher-based design (e.g. [19,26]) is the most popular method for con-
structing a cryptographic hash function. A hash function is designed by the
following two steps: (1) designing a blockcipher and (2) designing a mode of
operation. MD-family [28,29], SHA-family [23] and the SHA-3 candidates follow
this design method. Another approach is to utilize a practical blockcipher such
as AES. Such hash functions are useful in size restricted devices such as RFID
tags and smart cards: when implementing both a hash function and a blockci-
pher, one has only to implement a blockcipher. However, the output length of
practical blockciphers is far too short for a collision resistant hash function, e.g.,
128 bits for AES. Thus designing a collision resistant double length hash func-
tion (CR-DLHF) is an interesting topic. The core of the design of the CR-DLHF
is to design a collision resistant double-length compression function (CR-DLCF)
which maps an input of fixed length (more than 2n bits) to an output of 2n bits
when using an n-bit output length blockcipher. Then the hash function obtained
by combining a domain extension (e.g., strengthened Merkle-Damgård (SMD) [5,21]),
which preserves CR security, with a CR-DLCF yields a CR-DLHF. Many DLCFs,
e.g., [2,22,11,14,24,16,18], have been designed, and their security is proven in
the ideal cipher (IC) model [8,11,17,9,24,15,30].
The indifferentiability framework was introduced by Maurer et al. [20], which
considers the reducibility of one system to another system. Roughly speaking,
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 338–355, 2012.
c Springer-Verlag Berlin Heidelberg 2012
Blockcipher-Based Double-Length Hash Functions 339
In this paper, we propose DLHFs using a single (2n,n)-BC which are PROs up
to O(2^n) query complexity in the IC model. Since many blockciphers support 2n-
bit key length, e.g., AES supports 256-bit key length, and the existing DLCFs
(e.g., Hirose's compression function [11], Tandem-DM [14], Abreast-DM [14],
and the generalized DLCF [24]) use a (2n,n)-BC, the assumption of using a (2n,n)-
BC is acceptable. To our knowledge, our hash functions are the first DLHFs
based on a practical size blockcipher with birthday PRO security.3 When n = 128,
which is supported by AES, our hash functions have 2^128 security. Since our hash
functions use only a single blockcipher, they are useful on size restricted devices when
implementing both a hash function and a blockcipher (the hybrid encryption
1 “Practical size” is the size supported by practical blockciphers such as AES.
2 Since PRO security is stronger than CR security, CR security does not guarantee
PRO security.
3 Our hash functions do not satisfy the stronger notion called reset indifferentiability
from RO, which ensures security in multi-stage games [27]. Note that there is no
hash function satisfying this notion; thus proposing such a hash function is an open
problem.
schemes use both a blockcipher and a hash function in a key derivation
function, for example.)
Our DLHF. Our DLHFs, each of which uses one of Hirose's compression function,
Tandem-DM and Abreast-DM, iterate the compression function and use a new
post-processing function f at the last iteration which calls a (2n,n)-BC twice.
Our DLHFs are slightly slower than existing CR-DLHFs but have higher
security (birthday PRO security).
Let BC_{2n,n} = (E, D) be a (2n,n)-BC where E is an encryption function and
D is a decryption function. Let DLCF^{BC_{2n,n}} be a DLCF: Hirose's compression
function, Tandem-DM, or Abreast-DM. Let SMD^{DLCF^{BC_{2n,n}}} : {0,1}^* → {0,1}^{2n}
be the SMD hash function using the compression function DLCF^{BC_{2n,n}}. Our
DLHF is defined as follows:

F^{BC_{2n,n}}(M) = f^{BC_{2n,n}}(SMD^{DLCF^{BC_{2n,n}}}(M))

where f^{BC_{2n,n}}(x) = E(x, c_1)||E(x, c_2) and c_1 and c_2 are n-bit constant values.
Note that the first argument of the encryption function is the key and the
second argument is the plaintext. The DLHF using Hirose's compression
function is illustrated in Fig. 1, where each line is n bits and IV[0], IV[1], C, c_1
and c_2 are constant values. Note that in this figure we omit the suffix-free padding
function sfpad. So the hash function takes as its input a message M, sfpad(M) =
M_1||M_2|| · · · ||M_l with each block of n bits, and outputs the final value rv_1||rv_2.
We use the DLHF SMD^{DLCF^{BC_{2n,n}}} to compress an arbitrary length input into
a fixed length value. Since SMD hash functions cannot be used as ROs
[4], the post-processing function f^{BC_{2n,n}} is used to guarantee PRO security.
The use of the constant values c_1 and c_2 in the post-processing function is
inspired by the design technique of EMD proposed by Bellare and Ristenpart
[1]. This realizes the fact that we can treat our hash function as an NMAC-like
hash function. Note that the security of EMD is proven when the compression
function is a FILRO, while the security of our hash functions is proven when the
compression function is the DLCF in the IC model. So additional analyses are
needed due to the invertible property of the IC and the structures of the DLCFs.
We thus prove the PRO security of F^{BC_{2n,n}} by using three techniques: the PrA
(Preimage Aware) design framework of Dodis et al. [6], PRO for a small function [4], and
indifferentiability from a hash function. The first two techniques are existing
techniques and the last technique is a new application of the indifferentiability
framework [20].
First, we prove that the DLCFs are PrA up to O(2^n) query complexity. The
PrA design framework yields hash functions which are PROs up to O(2^n)
query complexity where a FILRO is used as the post-processing function. Second,
we convert the FILRO into the blockcipher-based post-processing function. We prove
that the post-processing function is a PRO up to O(2^n) query complexity in the
IC model (PRO for a small function). Then, we prove that the PRO security of
the post-processing function and the first PRO result ensure that the converted
hash functions are PROs up to O(2^n) query complexity. We note that these hash
functions use two blockciphers.4 Finally, we consider the single-blockcipher-based
hash functions F^{BC_{2n,n}}. We prove that the single-blockcipher-based hash func-
tions are indifferentiable from the two-blockcipher-based hash functions in the
IC model up to O(2^n) query complexity (indifferentiability from a hash func-
tion). Then we show that this indifferentiability result and the second PRO
result ensure that our hash functions are PROs up to O(2^n) query complexity
in the IC model.
2 Preliminaries
Notation. For two values x, y, x||y is the concatenation of x and y. For
some value y, x ← y means assigning y to x. ⊕ is bitwise exclusive or. |x| is
the bit length of x. For a set (list) T and an element W, T ←∪ W means to
insert W into T, i.e., T ← T ∪ {W}. For some 2n-bit value x, x[0] is the first
n-bit value and x[1] is the last n-bit value. Let BC_{d,n} = (E, D) be
a blockcipher, where E : {0,1}^d × {0,1}^n → {0,1}^n is an encryption function,
D : {0,1}^d × {0,1}^n → {0,1}^n is a decryption function, the key size is d bits
and the ciphertext size is n bits. Let C_{d,n} = (E_I, D_I) be an ideal cipher (IC), where
E_I : {0,1}^d × {0,1}^n → {0,1}^n is an encryption oracle, D_I : {0,1}^d × {0,1}^n →
{0,1}^n is a decryption oracle, the key size is d bits and the ciphertext size is
n bits. F_{a,b} : {0,1}^a → {0,1}^b is a random oracle (RO). An arbitrary-input-
length random oracle is denoted by F_b : {0,1}^* → {0,1}^b. For any algorithm A,
we write Time(A) to mean the sum of its description length and the worst-case
number of steps.
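The splitting and concatenation conventions above are easy to mirror in code; the following is a minimal sketch (all function names are mine, not the paper's), operating on byte strings rather than bit strings:

```python
# Toy helpers mirroring the paper's notation on byte strings.
# Illustrative only; the paper works on bit strings.

def concat(x: bytes, y: bytes) -> bytes:
    """x || y: concatenation."""
    return x + y

def xor(x: bytes, y: bytes) -> bytes:
    """Bitwise exclusive-or of equal-length strings."""
    assert len(x) == len(y)
    return bytes(a ^ b for a, b in zip(x, y))

def halves(x: bytes):
    """For a 2n-bit value x, return (x[0], x[1]): the first and last n bits."""
    assert len(x) % 2 == 0
    n = len(x) // 2
    return x[:n], x[n:]
```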
Merkle-Damgård [5,21]. Let h : {0,1}^{2n} × {0,1}^d → {0,1}^{2n} be a com-
pression function using a primitive P (more strictly, h^P) and pad : {0,1}^* →
({0,1}^d)^* be a padding function. The Merkle-Damgård hash function MD^h is
described as follows, where IV is a 2n-bit initial value.
^4 Two independent ideal ciphers can be obtained from a single ideal cipher by sacrificing
one bit of the key space. So, using a blockcipher with a (2n+1)-bit key and an n-bit
block, hash functions which use a single blockcipher can be realized. But the size of
such a blockcipher is not a practical size.
342 Y. Naito
MD^h(M)
  z ← IV;
  Break pad(M) into d-bit blocks, pad(M) = M_1 || · · · || M_l;
  for i = 1, . . . , l do z ← h(z, M_i);
  Ret z;
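The MD^h iteration can be sketched in a few lines. The compression function and padding below are illustrative stand-ins (the paper's h is blockcipher-based; this placeholder only reproduces the chaining structure of the pseudocode above):

```python
import hashlib

D = 4          # block size d in bytes (toy parameter)
STATE = 8      # 2n-bit chaining value, here 8 bytes
IV = b"\x00" * STATE

def h(z: bytes, m: bytes) -> bytes:
    """Stand-in compression function {0,1}^{2n} x {0,1}^d -> {0,1}^{2n}."""
    return hashlib.sha256(z + m).digest()[:STATE]

def pad(msg: bytes) -> bytes:
    """Toy 10*-style padding to a multiple of d bytes."""
    msg = msg + b"\x80"
    while len(msg) % D != 0:
        msg += b"\x00"
    return msg

def md_hash(msg: bytes) -> bytes:
    """The Merkle-Damgard iteration MD^h from the pseudocode above."""
    z = IV
    p = pad(msg)
    for i in range(0, len(p), D):
        z = h(z, p[i:i + D])
    return z
```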
Pseudorandom Oracle [20]. Let H^P : {0,1}^* → {0,1}^n be a hash function
that utilizes an ideal primitive P. We say that H^P is PRO if there exists an effi-
cient simulator S that simulates P such that for any distinguisher A outputting
a bit it is the case that

  Adv^pro_{H^P,S}(A) = |Pr[A^{H^P,P} ⇒ 1] − Pr[A^{F_n,S} ⇒ 1]|

is small, where the probabilities are taken over the coins used in the experiments. S
can make queries to F_n. S's task is to simulate P such that relations among
responses of (H^P, P) hold in responses of (F_n, S) as well.
Exp^pra_{F^P,P,E,A}:
  x ←$ A^{P,Ex};  z ← F^P(x);  Ret (x ≠ V[z] ∧ Q[z] = 1);

oracle P(m):
  c ← P(m);  α ←∪ (m, c);  Ret c;

oracle Ex(z):
  Q[z] ← 1;  V[z] ← E(z, α);  Ret V[z];
Here an adversary A is provided two oracles P and Ex. The oracle P provides
access to the ideal primitive P and records a query history α. The extraction
oracle Ex provides an interface to an extractor E, which is a deterministic al-
gorithm that takes z and the query history α of P, and returns either ⊥ or an
element x′ such that F^P(x′) = z. If such an x′ can be constructed from α, it returns x′,
and otherwise it returns ⊥. In this experiment, the (initially everywhere ⊥) arrays
Q and V are used. When z is queried to Ex, Q[z] ← 1
and then the output of E(z, α) is assigned to V[z]. For the hash function F^P,
the adversary A, and the extractor E, we define the advantage relation

  Adv^pra_{F^P,P,E}(A) = Pr[Exp^pra_{F^P,P,E,A} ⇒ true]
where the probabilities are over the coins used in running the experiments. When
there exists an efficient extractor E such that for any adversary A the above
advantage is small, we say that F P is preimage aware (PrA).
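The PrA experiment can be instantiated with toy components to make the bookkeeping concrete. In this sketch (my own simplification, with F^P(x) = P(x) for a lazily sampled random function P, and an extractor that simply searches the query history), an honest party that only replays an extracted preimage never wins:

```python
import os

class Primitive:
    """Lazily sampled random function P recording the query history alpha."""
    def __init__(self):
        self.table = {}
        self.alpha = []          # query history: list of (m, c) pairs
    def __call__(self, m: bytes) -> bytes:
        if m not in self.table:
            self.table[m] = os.urandom(8)
        c = self.table[m]
        self.alpha.append((m, c))
        return c

def F(P, x: bytes) -> bytes:
    """Toy hash F^P(x) = P(x); the paper's F is the iterated construction."""
    return P(x)

def extractor(z, alpha):
    """E(z, alpha): recover a preimage of z from the history, else None (⊥)."""
    for m, c in alpha:
        if c == z:
            return m
    return None

def pra_experiment(adversary) -> bool:
    """Exp^pra: the adversary wins if it outputs x with F^P(x) = z for some
    z it queried to Ex, yet x differs from the extracted value V[z]."""
    P = Primitive()
    Q, V = {}, {}
    def Ex(z):
        Q[z] = 1
        V[z] = extractor(z, P.alpha)
        return V[z]
    x = adversary(P, Ex)
    z = F(P, x)
    return Q.get(z) == 1 and V.get(z) != x
```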
The PrA advantage can be bounded by the cr advantage (collision-resistance
advantage) and the 1-WPrA (1-weak PrA) advantage [7]. The 1-WPrA
experiment is described as follows.
Exp^1wpra_{F^P,P,E^+,A}:
  x ←$ A^{P,Ex^+};  z ← F^P(x);  Ret (x ∉ L ∧ Q[z] = 1);

oracle P(m):
  c ← P(m);  α ←∪ (m, c);  Ret c;

oracle Ex^+(z):
  Q[z] ← 1;  L ← E^+(z, α);  Ret L;
The difference between the 1-WPrA experiment and the PrA experiment is the
extraction oracle. In the 1-WPrA experiment, a multi-point extractor oracle
Ex^+ is used. Ex^+ provides an interface to a multi-point extractor E^+, which is a
deterministic algorithm that takes z and α, and returns either ⊥ or a set of
elements in the domain of F^P. The output (set) of E^+ is stored in the list L. Thus,
if L ≠ ⊥, then F^P(x′) = z for any x′ ∈ L. In this experiment, an adversary A can
make only a single query to Ex^+. For a hash function F^P, an adversary A, and
a multi-point extractor E^+, we define the advantage relation

  Adv^1wpra_{F^P,P,E^+}(A) = Pr[Exp^1wpra_{F^P,P,E^+,A} ⇒ true]

where the probabilities are over the coins used in running the experiments. When
there exists an efficient multi-point extractor E^+ such that the above advantage
is small for any adversary A, we say that F^P is 1-WPrA.
The definition of the cr advantage is as follows. Let A be an adversary that
outputs a pair of values x and x′. To a hash function F^P using primitive P and
an adversary A we associate the advantage relation

  Adv^cr_{F^P,P}(A) = Pr[(x, x′) ←$ A^P : F^P(x) = F^P(x′) ∧ x ≠ x′].
  Adv^pra_{F^P,P,E}(A^pra) ≤ q_e · Adv^1wpra_{F^P,P,E^+}(A^1wpra) + Adv^cr_{F^P,P}(A^cr).

A^1wpra runs in time at most O(q_e · Time(E^+)) and makes the same number of P
queries as A^pra. A^cr asks q_P queries and runs in time O(q_e · Time(E^+)). E runs
in the same time as E^+.
NMAC Hash Function. Let g : {0,1}^n → {0,1}^n be a function and H^P :
{0,1}^* → {0,1}^n be a hash function using a primitive P such that g is not used in
H^P. Dodis et al. [7] proved that the PRO security of the NMAC hash function
g ∘ H^P can be reduced to the PrA security of H^P.
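The NMAC composition itself is just function composition; a sketch with stand-in components (in the paper g is modelled as a random oracle and H^P is the PrA construction, so both functions below are placeholders):

```python
import hashlib

def H(msg: bytes) -> bytes:
    """Stand-in for the inner hash H^P (the paper's PrA construction)."""
    return hashlib.sha256(b"inner:" + msg).digest()

def g(z: bytes) -> bytes:
    """Stand-in for the post-processing function g (a random oracle in [7])."""
    return hashlib.sha256(b"outer:" + z).digest()

def nmac(msg: bytes) -> bytes:
    """The NMAC hash function g o H^P: hash, then post-process."""
    return g(H(msg))
```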
Lemma 2 (Theorem 4.1 of [7]). Let P be an ideal primitive, g be a random
oracle and E be any extractor for H^P. Then there exists a simulator S = (S_P, S_g)
such that for any PRO adversary A making at most q_F, q_P, q_g queries to its three
oracles (O_F, O_P, O_g), where (O_F, O_P, O_g) = (g ∘ H^P, P, g) or (O_F, O_P, O_g) =
(F_n, S_P, S_g), there exists a PrA adversary B such that

  Adv^pro_{g∘H^P,S}(A) ≤ Adv^pra_{H^P,P,E}(B).

S runs in time O(q_P + q_g · Time(E)). Let l be the length of the longest query made
by A to O_F. B runs in time O(Time(A) + q_F·t_H + q_P + q_g), makes q_P + q_H·q_F
primitive queries and q_g extraction queries, and outputs a preimage of length at most l, where
for any input M to H^P the output of H^P(M) can be calculated within at most
t_H time and q_H queries to P.
Dodis et al. proved that the SMD construction preserves PrA security.
Therefore, the PRO security of the NMAC hash function using the SMD
hash function can be reduced to the PrA security of the compression function.

where f^{BC_{2n,n}}(x) = E(x, c_1)||E(x, c_2) such that c_1 and c_2 are different n-bit
constant values and are different from the values which are defined by the compres-
sion function (see Subsection 3.3). The hash functions use Hirose's compression
function, Tandem-DM, and Abreast-DM as the underlying DLCF, respectively.
We prove the PRO security in three steps. The steps use the PrA design
framework, PRO for a small function, and indifferentiability from a hash function,
respectively.
– Step 3. This is the final step. We use indifferentiability from a hash
function: we prove that F^{BC_{2n,n}} is indifferentiable from F_2^{C^2_{2n,n},C^3_{2n,n}} up to
O(2^n) query complexity in the IC model. Then, we prove that the indiffer-
entiability result and the PRO security of F_2 ensure that F^{BC_{2n,n}} is PRO up
to O(2^n) query complexity in the IC model.
3.1 Step 1
We call the procedure 1 “first block” and the procedure 2 “second block”.
Proof. We prove that Hirose's compression function is 1-WPrA, and then Lemma
1 gives the final bound. We note that Theorem 3 of [9] upper-bounds the cr
advantage of A by 2q_P^2/(2^n − 2q_P)^2 + 2q_P/(2^n − 2q_P), yielding the first two
terms.

Intuitively, the 1-WPrA game for the compression function is that A declares
a value z, then an extractor outputs the preimages of z, stored in L, which can be
constructed from the input-output values of A's queries to C^1_{2n,n}. Then A outputs a
new preimage of z which is not stored in L. Note that A can query C^1_{2n,n}
adaptively. We define the multi-point extractor to utilize the preimage-resistance
bound of Hirose's compression function, proven in [9], as follows.
algorithm E^+(z, α)
  Let L be an empty list;
  Parse (k_1, x_1, y_1), . . . , (k_i, x_i, y_i) ← α; // E1(k_j, x_j) = y_j
  For j = 1 to i do
    If z[0] = x_j ⊕ y_j then
      y ← E1_I(k_j, x_j ⊕ C);
      If z[1] = C ⊕ x_j ⊕ y then L ←∪ (x_j||k_j[0], k_j[1]);
    If z[1] = x_j ⊕ y_j then
      y ← E1_I(k_j, x_j ⊕ C);
      If z[0] = C ⊕ x_j ⊕ y then L ←∪ ((x_j ⊕ C)||k_j[0], k_j[1]);
  If L is not an empty list then return L, otherwise return ⊥;
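The extractor's logic can be exercised on a toy instantiation. The sketch below uses 8-bit blocks, a lazily sampled ideal cipher, and Hirose's compression function as reconstructed from the checks above (z[0] = E(k, x) ⊕ x and z[1] = E(k, x ⊕ C) ⊕ x ⊕ C); for simplicity the recovered preimage is encoded as a (key, first-block input) pair rather than the paper's (x||k[0], k[1]) encoding:

```python
import random

N = 256                     # toy block space: n = 8 bits
C = 0x55                    # a nonzero constant of Hirose's construction
rng = random.Random(1)
_perm = {}                  # lazily sampled ideal cipher tables

def E1(k, x):
    """Toy ideal cipher: an independent random permutation per key k."""
    if k not in _perm:
        p = list(range(N))
        rng.shuffle(p)
        _perm[k] = p
    return _perm[k][x]

def hirose(k, x):
    """Hirose's compression function as used by the extractor above."""
    return (E1(k, x) ^ x, E1(k, x ^ C) ^ x ^ C)

def multi_point_extractor(z, alpha):
    """E+(z, alpha): rebuild preimages of z = (z0, z1) from the query
    history alpha of triples (k, x, y) with E1(k, x) = y."""
    z0, z1 = z
    L = []
    for (k, x, y) in alpha:
        if z0 == x ^ y:                  # (k, x, y) fits the first block
            y2 = E1(k, x ^ C)
            if z1 == C ^ x ^ y2:
                L.append((k, x))
        if z1 == x ^ y:                  # (k, x, y) fits the second block
            y2 = E1(k, x ^ C)
            if z0 == C ^ x ^ y2:
                L.append((k, x ^ C))
    return L or None                     # None plays the role of ⊥
```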
If an input-output triple of the first block is defined, the input of
the second block is automatically defined, and vice versa, from the definition of the compression
function. For a query z to Ex^+, when there is an input-output triple (k, x, y)
such that x ⊕ y = z[0], E^+ checks whether the output of the second block is equal
to z[1] or not, and if this holds the multi-point extractor stores the corresponding preimage in the return
list L, and vice versa. Therefore, A must find a new preimage of z to win the 1-
WPrA experiment. Thus one can straightforwardly adapt the preimage-resistance
advantage of the compression function (described in Theorem 5 of [9])^5, because
the proof of Theorem 5 of [9] can be applied to the case where an adversary selects
an image z of the compression function and then finds a preimage of z. The
advantage is at most 2q_P/(2^n − q_P)^2.
5
Note that while the 1-WPrA bound is equal to the preimage bound, this is not trivial
because one needs to construct the extractor that converts the preimage bound into
the 1-WPrA bound.
Lemma 4 ensures the following theorem via Lemma 2 and Lemma 3, where F_1 is
PRO up to O(2^n) query complexity.

Theorem 1. There exists a simulator S1 = (S1_g, S1_C) where S1_C = (S1_E, S1_D)
such that for any distinguisher A1 making at most (q_H, q_g, q_E, q_D) queries to four
oracles which are (F_1, g, E1, D1) or (F_{2n}, S1_g, S1_E, S1_D), we have

where S1 works in time O(q_E + q_D + l·q_g·Q_1) + l·q_g × Time(unpad) and S1_g makes
q_g queries to F_{2n}, where l is the maximum number of n-bit blocks of a query to
F_1/F_{2n} and Q_1 = 2l(q_H + 1) + q_E + q_D. S1_g simulates g, making one query
to F_{2n} per S1_g query, and S1_C, which makes no query, simulates the ideal
cipher.
3.2 Step 2

Lemma 5 (f^{C^3_{2n,n}} is PRO). Let C^3_{2n,n} = (E3_I, D3_I) be an ideal cipher. Let
g = F_{2n,2n}. There exists a simulator S = (S_E, S_D) such that for any distin-
guisher A2 making at most q_f, q_E and q_D queries to oracles (O_f, O_E, O_D), where
(O_f, O_E, O_D) = (f^{C^3_{2n,n}}, E3_I, D3_I) or (O_f, O_E, O_D) = (g, S_E, S_D), we have

  Adv^pro_{f^{C^3_{2n,n}},S}(A2) ≤ (q_f + q_E + q_D) / 2^n

where S works in time O(q_E + q_D) and makes at most q_E + q_D queries. S simulates
the ideal cipher.
We explain the intuition of the result of Lemma 5. The proof is given in the full
version. An ideal cipher where the plaintext is fixed to a constant value is an RO. So
the first half y_1 of an output of f is randomly chosen from {0,1}^n and the
last half is chosen from {0,1}^n \ {y_1}, while an output of an RO is randomly
chosen from {0,1}^{2n}. This statistical distance appears in the PRO bound.
Theorem 1 and Lemma 5 ensure the following theorem, where F_2 using Hirose's
compression function is PRO up to O(2^n) query complexity in the IC model. We
prove the theorem in the full version. Similarly, we can prove the PRO security
of the hash functions using Tandem-DM and Abreast-DM, respectively.

Theorem 2 (F_2 is PRO). There exists a simulator S = (S2, S3) where S2 =
(S2_E, S2_D) and S3 = (S3_E, S3_D) such that for any distinguisher A3 making at
most (q_H, q_{E2}, q_{D2}, q_{E3}, q_{D3}) queries to five oracles which are (F_2, E2, D2, E3, D3)
or (F_{2n}, S2_E, S2_D, S3_E, S3_D), we have
3.3 Step 3

In this section, we consider the hash function using Hirose's compression func-
tion. The same discussion can be applied to the hash functions using Tandem-DM
and Abreast-DM, respectively. The discussions are given in the full version.

When using Hirose's compression function, we choose the constant values c_1 and
c_2 of the post-processing function f such that c_1 and c_2 are not equal to C ⊕ IV[0]
and IV[0], where IV is the initial value of SMD^{DLCF^{BC_{2n,n}}} and C is the constant
value used in Hirose's compression function. If values c_1 and c_2 equal to
C ⊕ IV[0] or IV[0] are used, we cannot prove the security of the hash function:
in this case, we fail to construct a simulator.
First, we define indifferentiability from a hash function as follows.

Definition 2. Let H_1^{P_1} : {0,1}^* → {0,1}^{2n} and H_2^{P_2} : {0,1}^* → {0,1}^{2n} be
hash functions using ideal primitives P_1 and P_2, respectively. H_1^{P_1} is indifferen-
tiable from H_2^{P_2} if there exists a simulator S such that for any distinguisher A4
outputting a bit it is the case that

  Adv^indif_{H_1^{P_1},H_2^{P_2},S}(A4) = |Pr[A4^{H_1^{P_1},P_1} ⇒ 1] − Pr[A4^{H_2^{P_2},S^{P_2}} ⇒ 1]|

is small, where the probabilities are taken over the coins used in the experiments.
The following lemma states that F is indifferentiable from F_2 up to O(2^n) query
complexity in the IC model.

Lemma 6. Let C_{2n,n} = (E_I, D_I) be an ideal cipher. Let C^2_{2n,n} = (E2_I, D2_I)
and C^3_{2n,n} = (E3_I, D3_I) be different ideal ciphers. There exists a simulator S such that

where S works in time O(3(q_E + q_D)) and makes at most q_E + q_D ideal-cipher
queries. l is the maximum number of n-bit blocks of a query to O_F.
Proof. Without loss of generality, we omit the padding function of our hash
function, which is a more general case than including the padding function. In
Fig. 2, we define a simulator S = (S_E, S_D) such that it simulates the ideal cipher
C_{2n,n} = (E_I, D_I) and the relation among responses of (F^{C_{2n,n}}, E_I, D_I) holds in
responses of (F_2^{C^2_{2n,n},C^3_{2n,n}}, S_E, S_D) as well, namely, F^S(M) = F_2^{C^2_{2n,n},C^3_{2n,n}}(M).
Fig. 2. Simulator
Since E2_I is used in the inner calculations and E3_I is used in the post-processing
calculation, if a query (k, x) to S_E is used in the post-processing
calculation, it returns the output of E3_I(k, x), and otherwise it returns the
output of E2_I(k, x). Since in the post-processing calculation the second value x of
an E query is c_1 or c_2, we define S such that S_E(k, x) is defined by E3_I(k, x) if
x = c_1 or x = c_2, and is defined by E2_I(k, x) otherwise.^6 E and D are (initially
everywhere ⊥) arrays.
We give the proof via a game-playing argument on the game sequence Game
0, Game 1, and Game 2. Game 0 is the F scenario and Game 2 is the F_2
scenario. In each game, A4 can make queries to three oracles (O_F, O_E, O_D).
Let G_j be the event that in Game j the distinguisher A4 outputs 1. Therefore,

  Pr[A4^{F^{C_{2n,n}},E_I,D_I} ⇒ 1] = Pr[G0] and Pr[A4^{F_2^{C^2_{2n,n},C^3_{2n,n}},S_E,S_D} ⇒ 1] = Pr[G2]. Thus
In the following, we use the lazily-sampled ideal cipher in Fig. 4. E and D are
(initially everywhere ⊥) arrays and T_E and T_D are (initially empty) tables. For any
(k, x) such that E[k, x] ≠ ⊥, E[k, x] is stored in T_E[k], and for any (k, y) such
that D[k, y] ≠ ⊥, D[k, y] is stored in T_D[k]. On a query whose key element
is k, first the output of E_I(k, c_1) is determined (steps 03-04 or steps 13-14) and
second the output of E_I(k, c_2) is determined (steps 05-06 or steps 15-16). Then
the outputs of E_I(k, x) such that x ≠ c_1 and x ≠ c_2 are determined. Since
no adversary (distinguisher) learns E_I(k, c_1) and E_I(k, c_2) until querying the
corresponding value, the procedures of steps 03-06 and 13-16 do not affect
the lazily-sampled ideal-cipher simulation.
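Lazy sampling of an ideal cipher, without the paper's extra eager steps for E_I(k, c_1) and E_I(k, c_2), can be sketched as follows (toy 4-bit blocks; the point is only that the two directions stay consistent and each key induces a permutation):

```python
import random

N = 16  # toy block space: n = 4 bits
rng = random.Random(7)
E, D = {}, {}   # (initially everywhere ⊥) forward and inverse tables

def EI(k, x):
    """Lazily sampled ideal cipher, encryption direction: pick the image
    uniformly among values not yet used under key k."""
    if (k, x) not in E:
        used = {E[kk, xx] for (kk, xx) in E if kk == k}
        y = rng.choice([v for v in range(N) if v not in used])
        E[k, x], D[k, y] = y, x
    return E[k, x]

def DI(k, y):
    """Decryption direction, kept consistent with E."""
    if (k, y) not in D:
        used = {D[kk, yy] for (kk, yy) in D if kk == k}
        x = rng.choice([v for v in range(N) if v not in used])
        D[k, y], E[k, x] = x, y
    return D[k, y]
```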
We compare the simulator with the lazily-sampled ideal cipher. In the simulator
and the ideal cipher, E[k, c_1] and E[k, c_2] (and also D[k, E[k, c_1]] and D[k, E[k, c_2]])
are chosen from the same distribution, while E[k, x] (and D[k, E[k, x]]) where
x ≠ c_1 and x ≠ c_2 is chosen from a different distribution. If in step e10 y were randomly
chosen outside T_{E2}[k] ∪ {E[k, c_1], E[k, c_2]} and in step d10 x were randomly chosen
outside T_{D2}[k] ∪ {c_1, c_2}, then the output distributions of the simulator and the ideal
cipher would be the same. That is, if any value y randomly chosen from {0,1}^n \ T_{E2}[k]
does not collide with E[k, c_1] and E[k, c_2], and any value x randomly chosen from
{0,1}^n \ T_{D2}[k] does not collide with c_1 and c_2, then the output distributions
are the same. Since for any k the number of values in T_{E2}[k] and T_{D2}[k]
is at most 2lq_F + q_E + q_D, the statistical distance of E[k, x] (and D[k, E[k, x]])
where x ≠ c_1 and x ≠ c_2 is at most 2/(2^n − (2lq_F + q_E + q_D)). So the statistical
distance of the simulator and the ideal cipher is at most (2lq_F + q_E + q_D) ×
2/(2^n − (2lq_F + q_E + q_D)). We thus have that

  |Pr[G1] − Pr[G0]| ≤ 2(2lq_F + q_E + q_D) / (2^n − (2lq_F + q_E + q_D)).
Game 2: We modify O_F from F^S to F_2^{C^2_{2n,n},C^3_{2n,n}}. So (O_F, O_E, O_D) = (F_2^{C^2_{2n,n},C^3_{2n,n}},
S_E, S_D) and this is the F_2 scenario.

We show that unless the following bad events occur, A4's view of Game
1 and Game 2 is the same.
– Event B1: On some query (k, x) to SE , the output y is such that y ⊕ x is
equal to c1 or c2 .
– Event B2: On some query (k, x) to SE , the output y is such that y ⊕ x ⊕ C
is equal to c1 or c2 .
– Event B3: On some query (k, y) to SD , the output x is equal to c1 or c2 such
that x is defined in the step D08.
To prove this, we use the proof method of [4,13]. Specifically, we prove the
following two points.

1. In Game 1, unless the bad events occur, for any query M the output of
O_F(M) is equal to that of F_2^{C^2_{2n,n},C^3_{2n,n}}(M). If this holds, the output distri-
bution of O_F in Game 1 and Game 2 is equivalent.
2. In Game 2, unless the bad events occur, O_E and O_D are consistent with
O_F as in Game 1. O_F uses O_E in Game 1 but not in Game 2 (note
that in both games (O_E, O_D) = (S_E, S_D)). So if this holds, the difference
does not affect the output distribution of O_E and O_D, namely, the output
distribution of O_E and O_D in Game 1 and Game 2 is the same.
In the following, for an input-output triple (k, x, y) of S we denote x ⊕ y by w,
namely, w = x ⊕ y. Before proving the above two points, we define chain triples
and give a useful lemma.

Definition 3. Triples (k_1, x_1, y_1), . . . , (k_i, x_i, y_i), (k′_1, x′_1, y′_1), . . . , (k′_i, x′_i, y′_i), (k, x, y),
(k′, x′, y′) stored in the simulator's tables E, D are chain triples if for some M the
output of F^S(M) can be obtained from the triples. That is, x_1 = IV[0], k_1[0] =
IV[1], k′_j = k_j (j = 1, . . . , i), w_j = x_{j+1} (j = 1, . . . , i − 1), w_j ⊕ C = x′_{j+1} (j =
1, . . . , i − 1), w′_j = k_{j+1}[0] (j = 1, . . . , i − 1), x = c_1, x′ = c_2, k = k′, k[0] = w_i,
k[1] = w′_i, M = k_1[1]|| · · · ||k_i[1], and y||y′ = F^S(M).
Lemma 7. For any chain triples (k_1, x_1, y_1), . . . , (k_i, x_i, y_i), (k′_1, x′_1, y′_1), . . . , (k′_i,
x′_i, y′_i), (k, x, y), (k′, x′, y′), unless the bad events occur, F^S(M) = F_2^{C^2_{2n,n},C^3_{2n,n}}(M)
where M = k_1[1]|| · · · ||k_i[1].
Proof. To the contrary, assume that there exist chain triples (k_1, x_1, y_1), . . . , (k_i, x_i, y_i),
(k′_1, x′_1, y′_1), . . . , (k′_i, x′_i, y′_i), (k, x, y), (k′, x′, y′) such that F^S(M) ≠ F_2^{C^2_{2n,n},C^3_{2n,n}}(M)
where M = k_1[1]|| · · · ||k_i[1]. Then, since the output of S is defined by E2_I or E3_I,
one of the following events occurs.
Consider Event 1. First consider the case that (k_j, x_j, y_j) is defined by E3_I. Since
x_1 = IV[0], j ≠ 1. When the output of S_E(k_j, x_j) is defined by E3_I, x_j = c_1
or x_j = c_2, which means that w_{j−1} = c_1 or w_{j−1} = c_2. So bad event B1
occurs. Second, consider the case that (k′_j, x′_j, y′_j) is defined by E3_I. Similarly,
since x′_1 = IV[0] ⊕ C, j ≠ 1. When the output of S_E(k′_j, x′_j) is defined by E3_I,
x′_j = c_1 or x′_j = c_2, which means that w_{j−1} ⊕ C = c_1 or w_{j−1} ⊕ C = c_2. So
bad event B2 occurs.
Next consider Event 2. First consider the case that (k, x, y) is defined by E2_I.
Then the triple is defined in S_D, because x = c_1 (if the triple were defined in S_E, it
would be defined by E3_I due to the condition of step E07). So the triple is defined in
step D08, and bad event B3 occurs. Finally, consider the case that (k′, x′, y′)
is defined by E2_I. Then the triple is defined in S_D because x′ = c_2. So the triple
is defined in step D08, and bad event B3 occurs.
Proof of Point 1. From the above lemma, unless the bad events occur, the
output of O_F(M) = F^S(M) = F_2^{C^2_{2n,n},C^3_{2n,n}}(M).
The Bound of |Pr[G2] − Pr[G1]|. The above two points imply that unless the
bad events occur, A4's view of Game 1 and Game 2 is the same, and so we
have that

  |Pr[G2] − Pr[G1]| ≤ 2 × max{Pr[B1_1] + Pr[B2_1] + Pr[B3_1], Pr[B1_2] + Pr[B2_2] + Pr[B3_2]}

where Bi_j is the event Bi in Game j. Since the number of queries to S in Game
1 is more than that in Game 2,
4 Conclusion
References
1. Bellare, M., Ristenpart, T.: Multi-Property-Preserving Hash Domain Extension
and the EMD Transform. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS,
vol. 4284, pp. 299–314. Springer, Heidelberg (2006)
2. Brachtl, B.O., Coppersmith, D., Hyden, M.M., Matyas Jr., S.M., Meyer, C.H.W.,
Oseas, J., Pilpel, S., Schilling, M.: Data authentication using modification detection
codes based on a public one way encryption function. US Patent No. 4,908,861
(1990) (filed August 28, 1987)
3. Chang, D., Lee, S., Nandi, M., Yung, M.: Indifferentiable Security Analysis of
Popular Hash Functions with Prefix-Free Padding. In: Lai, X., Chen, K. (eds.)
ASIACRYPT 2006. LNCS, vol. 4284, pp. 283–298. Springer, Heidelberg (2006)
4. Coron, J.-S., Dodis, Y., Malinaud, C., Puniya, P.: Merkle-Damgård Revisited: How
to Construct a Hash Function. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621,
pp. 430–448. Springer, Heidelberg (2005)
5. Damgård, I.B.: A Design Principle for Hash Functions. In: Brassard, G. (ed.)
CRYPTO 1989. LNCS, vol. 435, pp. 416–427. Springer, Heidelberg (1990)
6. Dodis, Y., Ristenpart, T., Shrimpton, T.: Salvaging Merkle-Damgård for Practical
Applications. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 371–388.
Springer, Heidelberg (2009)
7. Dodis, Y., Ristenpart, T., Shrimpton, T.: Salvaging Merkle-Damgård for Practical
Applications. ePrint 2009/177 (2009)
8. Fleischmann, E., Forler, C., Gorski, M., Lucks, S.: Collision Resistant Double-
Length Hashing. In: Heng, S.-H., Kurosawa, K. (eds.) ProvSec 2010. LNCS,
vol. 6402, pp. 102–118. Springer, Heidelberg (2010)
9. Fleischmann, E., Gorski, M., Lucks, S.: Security of Cyclic Double Block Length
Hash Functions. In: Parker, M.G. (ed.) Cryptography and Coding 2009. LNCS,
vol. 5921, pp. 153–175. Springer, Heidelberg (2009)
10. Gong, Z., Lai, X., Chen, K.: A synthetic indifferentiability analysis of some
blockcipher-based hash functions. Des. Codes Cryptography 48, 293–305 (2008)
11. Hirose, S.: Some Plausible Constructions of Double-Block-Length Hash Functions.
In: Robshaw, M.J.B. (ed.) FSE 2006. LNCS, vol. 4047, pp. 210–225. Springer,
Heidelberg (2006)
12. Hirose, S., Park, J.H., Yun, A.: A Simple Variant of the Merkle-Damgård Scheme
with a Permutation. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833,
pp. 113–129. Springer, Heidelberg (2007)
13. Hoch, J.J., Shamir, A.: On the Strength of the Concatenated Hash Combiner When
All the Hash Functions Are Weak. In: Aceto, L., Damgård, I., Goldberg, L.A.,
Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part II.
LNCS, vol. 5126, pp. 616–630. Springer, Heidelberg (2008)
14. Lai, X., Massey, J.L.: Hash Functions Based on Block Ciphers. In: Rueppel, R.A.
(ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 55–70. Springer, Heidelberg (1993)
15. Lee, J., Kwon, D.: The Security of Abreast-DM in the Ideal Cipher Model. IEICE
Transactions 94-A(1), 104–109 (2011)
16. Lee, J., Stam, M.: MJH: A Faster Alternative to MDC-2. In: Kiayias, A. (ed.)
CT-RSA 2011. LNCS, vol. 6558, pp. 213–236. Springer, Heidelberg (2011)
17. Lee, J., Stam, M., Steinberger, J.: The collision security of Tandem-DM in the
ideal cipher model. ePrint 2010/409 (2010)
18. Lucks, S.: A collision-resistant rate-1 double-block-length hash function. In: Sym-
metric Cryptography, Dagstuhl Seminar Proceedings 07021 (2007)
19. Matyas, S., Meyer, C., Oseas, J.: Generating strong one-way functions with crypto-
graphic algorithms. IBM Technical Disclosure Bulletin 27(10a), 5658–5659 (1985)
20. Maurer, U.M., Renner, R.S., Holenstein, C.: Indifferentiability, Impossibility Re-
sults on Reductions, and Applications to the Random Oracle Methodology. In:
Naor, M. (ed.) TCC 2004. LNCS, vol. 2951, pp. 21–39. Springer, Heidelberg (2004)
21. Merkle, R.C.: One Way Hash Functions and DES. In: Brassard, G. (ed.) CRYPTO
1989. LNCS, vol. 435, pp. 428–446. Springer, Heidelberg (1990)
22. Meyer, C.H.W., Schilling, M.: Chargement sécurisé d'un programme avec code de
détection (1987)
23. National Institute of Standards and Technology: FIPS PUB 180-3, Secure Hash
Standard (2008)
24. Özen, O., Stam, M.: Another Glance at Double-Length Hashing. In: Parker, M.G.
(ed.) Cryptography and Coding 2009. LNCS, vol. 5921, pp. 176–201. Springer,
Heidelberg (2009)
25. Preneel, B., Bosselaers, A., Govaerts, R., Vandewalle, J.: Collision-free Hashfunc-
tions Based on Blockcipher Algorithms. In: Proceedings of the 1989 International
Carnahan Conference on Security Technology, pp. 203–210 (1989)
26. Preneel, B., Govaerts, R., Vandewalle, J.: Hash Functions Based on Block Ciphers:
A Synthetic Approach. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773,
pp. 368–378. Springer, Heidelberg (1994)
27. Ristenpart, T., Shacham, H., Shrimpton, T.: Careful with Composition: Limita-
tions of the Indifferentiability Framework. In: Paterson, K.G. (ed.) EUROCRYPT
2011. LNCS, vol. 6632, pp. 487–506. Springer, Heidelberg (2011)
28. Rivest, R.L.: The MD4 Message Digest Algorithm. In: Menezes, A., Vanstone, S.A.
(eds.) CRYPTO 1990. LNCS, vol. 537, pp. 303–311. Springer, Heidelberg (1991)
29. Rivest, R.L.: The MD5 Message Digest Algorithm. In: RFC 1321 (1992)
30. Steinberger, J.P.: The Collision Intractability of MDC-2 in the Ideal-Cipher Model.
In: Naor, M. (ed.) EUROCRYPT 2007. LNCS, vol. 4515, pp. 34–51. Springer,
Heidelberg (2007)
ASC-1: An Authenticated Encryption
Stream Cipher
1 Introduction
Confidentiality and message authentication are two fundamental information-
security goals. Confidentiality addresses the issue of keeping the information secret
from unauthorized users. Often, this is achieved by encrypting the data using
a symmetric-key encryption scheme. Message authentication addresses the issues
of source corroboration and improper or unauthorized modification of data.
To protect the message authenticity, the sender usually appends an authentication
tag that is generated by the signing (tagging) algorithm of some message
authentication scheme.
Although symmetric-key encryption and message authentication have mainly
been studied in separate contexts, there are many applications where both
are needed. The cryptographic schemes that provide both confidentiality and
The research was supported in part by the Center for Wireless Systems and
Applications - CTIF Copenhagen.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 356–372, 2012.
c Springer-Verlag Berlin Heidelberg 2012
2 ASC-1 Specification
ASC-1 is an authenticated encryption scheme. Its key size can vary depending
on the block cipher that is used. Our block cipher suggestion is AES with a 128-bit
key. The encryption and decryption algorithms for a message M = m_1||m_2||m_3
consisting of three 128-bit blocks are depicted in Figure 1.
The scheme uses a 56-bit representation of a counter that provides a unique
initialization vector for each encrypted message. The encryption algorithm de-
rives an initial state X_0 and three keys K_{1,0}, K_{2,0} and K_{3,0} by applying a block
cipher to 0^{70}||00||Cntr, 0^{70}||01||Cntr, 0^{70}||10||Cntr and l(M)||00000011||Cntr,
respectively, where l(M) is a 64-bit representation of the bit length of the mes-
sage M. The message is then processed in a CFB-like mode using the 4R-AES
transformation. The 4R-AES transformation takes as input a 128-bit input state
358 G. Jakimoski and S. Khajuria
and outputs a 128-bit "random" leak r_i and a 128-bit output state. The first
leak r_1 is used to encrypt the first message block m_1. The resulting ciphertext
block c_1 is XOR-ed with the output state to give the input state for the second
4R-AES transformation. This process is repeated for all message blocks. The
leak of the last 4R-AES application is ignored, and its output h is encrypted
with K_{3,0} to give the authentication tag. The ciphertext consists of the counter
value, the ciphertext blocks and the authentication tag.
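The data flow just described can be sketched as follows. The 4R-AES transformation and the tag encryption under K_{3,0} are replaced by stand-ins (a hash-based leak function and an XOR), so this only mirrors the mode, not the real cipher:

```python
import hashlib

def four_r(state: bytes):
    """Stand-in for the 4R-AES transformation: maps a 16-byte input state
    to a (leak, output-state) pair. Real ASC-1 uses four key-dependent AES
    rounds; this placeholder only mirrors the data flow."""
    d = hashlib.sha256(state).digest()
    return d[:16], d[16:32]

def xor16(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def asc1_encrypt(x0: bytes, k3: bytes, blocks):
    """CFB-like mode of Fig. 1: leak r_i encrypts m_i, and c_i XOR the
    output state feeds the next 4R transformation."""
    state, cts = x0, []
    for m in blocks:
        leak, out = four_r(state)
        c = xor16(m, leak)
        cts.append(c)
        state = xor16(c, out)
    _, h = four_r(state)          # last leak is ignored
    tag = xor16(h, k3)            # stand-in for encrypting h under K3,0
    return cts, tag

def asc1_decrypt(x0: bytes, k3: bytes, cts):
    """Mirror of the encryption pass; recomputes the tag for verification."""
    state, msgs = x0, []
    for c in cts:
        leak, out = four_r(state)
        msgs.append(xor16(c, leak))
        state = xor16(c, out)
    _, h = four_r(state)
    return msgs, xor16(h, k3)
```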
Fig. 1. The encryption and decryption algorithms of ASC-1. The message consists of
three blocks. The ciphertext consists of the counter value, three ciphertext blocks and an
authentication tag. The receiver recovers the original message and verifies its validity
by checking whether the re-computed authentication tag is equal to the received one.
The decryption algorithm uses the same secret key and the received counter
value to compute X_0, K_{1,0}, K_{2,0} and K_{3,0}. The leak r_1 derived by applying 4R-
AES to X_0 is used to decrypt c_1 into the original message block m_1. The output
of the first 4R-AES is XOR-ed with the first ciphertext block to give the next
input state, and the process is repeated until all message blocks are recovered
and an authentication tag of the message is computed. If the computed tag is the
same as the one that was received, then the decrypted message is accepted as
valid.
Although we use 64-bit and 56-bit representations for the message length
and the counter, we assume that both the maximum message length and the
The 4R-AES transformation uses five 128-bit keys: four round keys and one
whitening key. These keys are derived from the 256-bit key K_{1,0}||K_{2,0} as follows.
The AES-256 key scheduling algorithm is applied to K_{1,0}||K_{2,0} to derive 14
round keys K_1, K_2, . . . , K_{14}. The keys K_2, K_3, K_4 and K_5 are used as round
keys in the first 4R-AES transformation. The keys K_7, K_8, K_9 and K_{10} are used
as round keys in the second 4R-AES transformation. The key K_1 is used as a
whitening key in the second 4R-AES transformation, and the key K_{11} is used as
a whitening key in the first 4R-AES transformation. The AES-256 key scheduling
algorithm is again applied to K_{13}||K_{14} to derive 14 keys that are used by the
third and the fourth 4R-AES transformations, and the process is repeated as long
as new keys are needed.
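The round-key assignment pattern can be sketched as a generator. The AES-256 key schedule is replaced by a hash-based stand-in (derive_14 is my own placeholder), so only the indexing pattern is modelled: rounds K2–K5 and K7–K10, whitening K11 and K1, re-keying from K13||K14:

```python
import hashlib

def derive_14(seed: bytes):
    """Stand-in for the AES-256 key schedule: expands a 32-byte seed into
    14 round keys K1..K14 (purely illustrative, via hashing)."""
    return [hashlib.sha256(seed + bytes([i])).digest()[:16]
            for i in range(1, 15)]

def round_key_stream(k10: bytes, k20: bytes):
    """Yields (whitening_key, [four round keys]) per 4R-AES application,
    following the assignment pattern described above."""
    seed = k10 + k20
    while True:
        K = derive_14(seed)       # K[0] is K1, ..., K[13] is K14
        yield K[10], K[1:5]       # 1st application: whitening K11, rounds K2..K5
        yield K[0], K[6:10]       # 2nd application: whitening K1, rounds K7..K10
        seed = K[12] + K[13]      # next seed is K13 || K14
```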
One can use an LAXU hash-function family as a building block to construct an
unconditionally secure authenticated encryption scheme, as shown in Figure 3.
We assume that the message M consists of d n-bit blocks. Some techniques that
deal with arbitrary-length messages are discussed later on. The ciphertext blocks
are computed as follows. A hash function h_{K_1} is selected randomly from H and
applied to an initial value IV to get a leak l_1 and hash value h_1. The leak l_1 is
used to encrypt the message block m_1 into a ciphertext block c_1 = m_1 ⊕ l_1. A new
hash function h_{K_2} is randomly drawn from H. It is applied to i_2 = h_1 ⊕ c_1 ⊕ k_1,
where k_1 is a random key, to get a leak l_2 and hash value h_2. The leak l_2 is used
to encrypt the message block m_2 into a ciphertext block c_2, and the process is
repeated until the encryption of the last message block m_d. The authentication
tag τ is computed as τ = K_T ⊕ h_{d+1}, where K_T is a random n-bit key and h_{d+1}
is the hash value obtained by applying a randomly drawn hash function
h_{K_{d+1}} to c_d ⊕ h_d. The ciphertext C = IV||c_1||c_2|| . . . ||c_d||τ is the concatenation of
the initial value, the ciphertext blocks, and the authentication tag.
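The generic scheme just described can be sketched with a stand-in hash family (h_K below is illustrative only; an actual instantiation must be drawn from an LAXU family for the security claims to apply):

```python
import hashlib

N = 16  # block size in bytes

def h_K(key: bytes, x: bytes):
    """Stand-in for a hash function drawn from the family H: returns a
    (leak, hash-value) pair. Not actually LAXU; illustrative only."""
    d = hashlib.sha256(key + x).digest()
    return d[:N], d[N:2 * N]

def xorb(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(keys, chain_keys, kT, iv, blocks):
    """The generic scheme of Fig. 3: leak l_i encrypts m_i; the next input
    is h_i ^ c_i ^ k_i (the last input is c_d ^ h_d); the tag is KT ^ h_{d+1}."""
    x, cts = iv, []
    for i, m in enumerate(blocks):
        l, h = h_K(keys[i], x)
        c = xorb(m, l)
        cts.append(c)
        x = xorb(xorb(h, c), chain_keys[i]) if i < len(blocks) - 1 else xorb(h, c)
    _, hd1 = h_K(keys[len(blocks)], x)
    return cts, xorb(kT, hd1)

def decrypt(keys, chain_keys, kT, iv, cts):
    """Mirror pass: decrypt each block and recompute the tag."""
    x, msgs = iv, []
    for i, c in enumerate(cts):
        l, h = h_K(keys[i], x)
        msgs.append(xorb(c, l))
        x = xorb(xorb(h, c), chain_keys[i]) if i < len(cts) - 1 else xorb(h, c)
    _, hd1 = h_K(keys[len(cts)], x)
    return msgs, xorb(kT, hd1)
```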
We assume that the recipient has knowledge of the secret keys that were
used to encrypt the message. The decryption and verification of the ciphertext
proceed as follows. First, h_{K_1} is applied to IV to get a leak l_1 and hash value
h_1. The leak l_1 is used to decrypt the ciphertext block c_1 into a message block
m_1 = c_1 ⊕ l_1. Then, the hash function h_{K_2} is applied to i_2 = h_1 ⊕ c_1 ⊕ k_1
to get a leak l_2 and hash value h_2. The second message block is obtained as
m_2 = c_2 ⊕ l_2, and the process is repeated until all message blocks m_1, m_2, . . . , m_d
are decrypted. To verify the authenticity of the received ciphertext, the recipient
In the previous analysis, we used the fact that for any message M :
There are two possible types of attacks when considering the authenticity of
the ciphertext: an impersonation attack and a substitution attack.
In the case of an impersonation attack, the attacker constructs and sends a
ciphertext to the receiver before he sees the encryption of the message. Due to
the fact that the key KT is uniformly random, the probability of success of an
impersonation attack is at most 2⁻ⁿ. If the adversary makes qI impersonation
attempts, then the probability that at least one of these attempts will be successful
is 1 − (1 − 2⁻ⁿ)^qI ≤ qI × 2⁻ⁿ.
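The bound above is the usual union-bound estimate; a quick numeric sanity check (the values of n and qI are arbitrary):

```python
# 1 − (1 − p)^q ≤ q·p for per-query success probability p = 2^{-n} and q attempts
p = 2.0 ** -20
for q in (1, 10, 1000, 10**6):
    assert 1 - (1 - p) ** q <= q * p
```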
In the case of a substitution attack, the adversary has intercepted the ciphertext
of a given message and tries to replace it with a different ciphertext that will be
accepted as valid by the receiver. We will show that the probability of success
in this case is at most qS × ε, where qS is the number of substitution attempts
made by the adversary.
Suppose that C = IV||c1||...||cd||τ is the ciphertext of a chosen message M
and C′ = IV′||c′1||...||c′d||τ′ is the substitution ciphertext. If the two ciphertexts
C and C′ differ only in their authentication tags (i.e., τ′ ≠ τ, IV′ = IV and
c′j = cj, 1 ≤ j ≤ d), then the probability of successful substitution is zero.
Therefore, the only interesting case is when the substitution ciphertext C′ and
the original ciphertext C differ in at least one block that is different from the
tag block.
Let 0 ≤ j ≤ d be the index of the first block where C and C′ differ, and let
Δij+1 = cj ⊕ c′j be the difference at the input of hKj+1, with c0 = IV and c′0 =
IV′. Then, due to the ε-LAXU and invertibility properties of H, we have that
Pr[Δhj+1 = 0|M, C, C′] = 0 and, for all Δ ∈ {0,1}ⁿ with Δ ≠ 0, Pr[Δhj+1 = Δ|M, C, C′] ≤ ε,
where Δhj+1 is the difference at the output of hKj+1. Hence, for the difference
Δij+2 = Δhj+1 ⊕ Δcj+1, we get that, for all Δ ∈ {0,1}ⁿ, Pr[Δij+2 = Δ|M, C, C′] ≤ ε. The
probability Pr[Δhj+2 = 0|M, C, C′] is equal to the probability Pr[Δij+2 =
0|M, C, C′], and is at most ε. When the input difference Δij+2 is nonzero, we get
that, for all nonzero Δ ∈ {0,1}ⁿ, Pr[Δhj+2 = Δ|M, C, C′] ≤ ε. If we continue in this manner,
we get that, for all Δ ∈ {0,1}ⁿ, Pr[Δhd+1 = Δ|M, C, C′] ≤ ε. The substitution ciphertext
will be accepted as valid only if h′d+1 ⊕ KT = τ′, i.e., only if Δhd+1 = Δτ, where
Δτ = τ ⊕ τ′. Given the previous analysis, this will happen with probability no
larger than ε.
The probability that at least one out of qS substitution queries will be success-
ful is at most qS·ε. The probability of success when making at most qv = qI + qS
verification queries is at most qv·ε due to the fact that 2⁻ⁿ ≤ ε.
To deal with messages of arbitrary length, one can generate uniformly at random
a key KT for each possible message length. Now, if one substitutes a ciphertext
4 Security of ASC-1
In this section, we show that if the block cipher used in ASC-1 is secure and one
cannot tell apart the case when ASC-1 uses random round keys from the case
when it uses round keys derived by a key scheduling algorithm, then ASC-1 is
a secure authenticated encryption scheme.
Here, we establish the unconditional security of ASC-1 with random keys. First,
we consider the two round SPN structure of Figure 4. The input x = x1 || . . . ||xn
is an n × m-bit string. The key addition operator is the bitwise XOR operator.
Fig. 4. A two round SPN structure with a leak. Each of the n S-boxes is a non-linear
permutation on {0, 1}m , and the branch number of the linear mixing layer is n + 1.
Without loss of generality, we assume that the leak positions are the first s positions
of v (i.e., l = v1 ||v2 || . . . ||vs )
364 G. Jakimoski and S. Khajuria
= Σ_{Δu} Pr[Δy|Δu, l] Pr[Δu|Δx]

= Σ_{Δu} Pr[Δu|Δx] × Π_{i=1}^{s} Pr[Δyi|Δui, vi] × Π_{i=s+1}^{n} Pr[Δyi|Δui]

= Σ_{Δu} Pr[Δu|Δx] × Π_{i=1}^{s} Pr[Δyi|Δui] × Π_{i=s+1}^{n} Pr[Δyi|Δui]

= Σ_{Δu} Pr[Δu|Δx] × Π_{i=1}^{n} Pr[Δyi|Δui]

= Σ_{Δu} Pr[Δy|Δu] × Pr[Δu|Δx]

= Pr[Δy|Δx].
The previous results refer to two round SPN structures. In order to show that one
can use four AES rounds to construct a LAXU hash function, we will first con-
sider the composition of transformations depicted in Figure 5. The next lemma
establishes independence of the differential probability of F1 (resp., F2 ) from the
leak value l2 (resp., l1 ). This is due to the key addition operation that follows
F1 and precedes F2 .
Fig. 5. A composition of two keyed transformations with leaks: F1 maps the input x1 under the key k1 to y1 with leak l1; a key k is then added; F2 maps x2 under the key k2 to y2 with leak l2
Lemma 2. The following holds for the differential probabilities of the transfor-
mations F1 and F2 depicted in Figure 5:

Pr[Δy1 = Δ | Δx1, l2] = Pr[Δy1 = Δ | Δx1]

and

Pr[Δy2 = Δ | Δx2, l1] = Pr[Δy2 = Δ | Δx2].
Proof.
This concludes the first part of the proof. The second equation of the lemma can
be proved in a similar fashion, and we omit its proof.
Let us look now at the situation depicted in Figure 6. A keyed non-linear function
F is applied to a vector x of n input values (x1 , . . . , xn ) to produce a vector
y = (y1 , . . . , yn ) of n output values. Without loss of generality, we assume that
the first s output values are leaked after a uniformly random key is added to
them. The knowledge of the leak l = (l1 , . . . , ls ) does not change the output
differential probabilities of F .
Fig. 6. The first s output values of a non-linear function F are “leaked” after a uni-
formly random key is added to them
Proof. Since the output values are leaked after the random key is added, they tell
nothing about the values y1 , . . . , ys and do not affect the probability of having
output difference Δy.
Suppose that x0 and x′0 are two distinct input values, and let us consider the
output difference Δz3 given the input difference Δx0 = x0 ⊕ x′0. By applying
the previously presented lemmas, we get:

Having the previous inequality in mind, we get that the family of functions
defined by four rounds of AES with leak extraction is an ε-LAXU2 hash function
family with ε = DP4rAES ≤ 2⁻¹¹³ [18].
In the previous subsection, we showed that if all the keys and the initial state are
random, then ASC-1 is an unconditionally secure authenticated encryption scheme.
However, the keys and the initial state of ASC-1 are derived by combining a block
cipher in a counter mode and a key scheduling algorithm. The security of the
scheme in this case is based on two assumptions:
a stream cipher that is secure when the random round keys assumption is made,
but is trivial to break otherwise. Since the design of ASC-1 was inspired by the
LEX stream cipher, we are going to address the known attacks on LEX:
It is not hard to show that given an adversary AROR that can distinguish the
ciphertext generated by ASC-1 from a random string, one can construct two
adversaries APRP , which can tell apart the block cipher from a PRP, and AKSOR ,
which can distinguish the case when the round keys are random from the case
when the round keys are derived by a key scheduling algorithm, such that at
least one of these adversaries wins with significant probability. Namely, APRP
and AKSOR will use their oracles to simulate ASC-1 and answer AROR's queries.
The output of APRP and AKSOR will be the same as AROR's output. If the advantage
of AROR is non-negligible, then at least one of APRP and AKSOR will have non-
negligible advantage. A similar result will hold in the case of a forging adversary
AF . So, we have the following informal theorems.
5 Conclusions
References
13. Jakimoski, G., Subbalakshmi, K.P.: On Efficient Message Authentication Via Block
Cipher Design Techniques. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS,
vol. 4833, pp. 232–248. Springer, Heidelberg (2007)
14. Jutla, C.S.: Encryption Modes with Almost Free Message Integrity. In: Pfitzmann,
B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 529–544. Springer, Heidelberg
(2001)
15. Kang, J.-S., Hong, S., Lee, S., Yi, O., Park, C., Lim, J.: Practical and Provable Se-
curity Against Differential and Linear Cryptanalysis for Substitution-Permutation
Networks. ETRI Journal 23(4), 158–167 (2001)
16. Keliher, L., Meijer, H., Tavares, S.: New Method for Upper Bounding the Maximum
Average Linear Hull Probability for SPNs. In: Pfitzmann, B. (ed.) EUROCRYPT
2001. LNCS, vol. 2045, pp. 420–436. Springer, Heidelberg (2001)
17. Keliher, L., Meijer, H., Tavares, S.: Improving the Upper Bound on the Maximum
Average Linear Hull Probability for Rijndael. In: Vaudenay, S., Youssef, A.M. (eds.)
SAC 2001. LNCS, vol. 2259, pp. 112–128. Springer, Heidelberg (2001)
18. Keliher, L., Sui, J.: Exact Maximum Expected Differential and Linear Probabil-
ity for 2-Round Advanced Encryption Standard (AES). IACR ePrint Archive,
2005/321
19. Matsui, M.: New Structure of Block Ciphers with Provable Security against Differ-
ential and Linear Cryptanalysis. In: Gollmann, D. (ed.) FSE 1996. LNCS, vol. 1039,
pp. 205–218. Springer, Heidelberg (1996)
20. Minematsu, K., Tsunoo, Y.: Provably Secure MACs from Differentially-Uniform
Permutations and AES-Based Implementations. In: Robshaw, M.J.B. (ed.) FSE
2006. LNCS, vol. 4047, pp. 226–241. Springer, Heidelberg (2006)
21. Park, S., Sung, S.H., Chee, S., Yoon, E.-J., Lim, J.: On the Security of Rijndael-
Like Structures against Differential and Linear Cryptanalysis. In: Zheng, Y. (ed.)
ASIACRYPT 2002. LNCS, vol. 2501, pp. 176–191. Springer, Heidelberg (2002)
22. Park, S., Sung, S.H., Lee, S., Lim, J.: Improving the Upper Bound on the Maximum
Differential and the Maximum Linear Hull Probability for SPN Structures and
AES. In: Johansson, T. (ed.) FSE 2003. LNCS, vol. 2887, pp. 247–260. Springer,
Heidelberg (2003)
23. Rogaway, P.: Bucket Hashing and Its Application to Fast Message Authentication.
In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 29–42. Springer,
Heidelberg (1995)
24. Rogaway, P., Bellare, M., Black, J., Krovetz, T.: OCB: A block-cipher mode of
operation for efficient authenticated encryption. In: Proc. 8th ACM Conf. Comp.
and Comm. Security, CCS (2001)
25. Wu, H., Preneel, B.: Resynchronization Attacks on WG and LEX. In: Robshaw,
M.J.B. (ed.) FSE 2006. LNCS, vol. 4047, pp. 422–432. Springer, Heidelberg (2006)
On Various Families of Twisted Jacobi Quartics
Jérôme Plût
1 Introduction
The interest in elliptic curves in cryptography arises from the fact that,
given suitable parameter choices, they provide an efficient representation of the
“generic group” model. However, the need for separate formulæ for point ad-
dition and doubling in Weierstraß coordinates critically exposes elliptic curve
arithmetic to side-channel analysis.
One family of countermeasures protecting against these attacks is the use of
a coordinate system that allows point additions and doublings to be performed
with the same formulæ. Namely, addition formulæ are said to be unified if they
also allow doubling of non-zero points, and complete if they allow addition of any
pair of points, identical or not, zero or not.
Some curve models with such properties, over a field of odd characteristic,
are:
– twisted Edwards curves [Edw07, BL07, BBJ+ 08, HWCD08], with equa-
tion ax2 + y 2 = 1 + dx2 y 2 , have a unified addition formula, that is complete
in some cases, costing 9M + Da + Dd [HWCD08];
– Jacobi quartics, with equation y 2 = x4 + 2a x2 + 1, are unified [BJ03], and
have an addition formula costing 7M + 3S + Da [HWCD09];
– Huff cubics [JTV10], with equation ax(y 2 − 1) = by(x2 − 1), have a unified
addition formula costing 11M.
This work was supported by the French Agence Nationale de la Recherche through
the ECLIPSES project under Contract ANR-09-VERS-018.
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 373–383, 2012.
c Springer-Verlag Berlin Heidelberg 2012
Not all elliptic curves transform to the Edwards or Jacobi quartic forms: only the
curves with a rational point of order four transform to Edwards curves [BBJ+ 08,
Theorem 3.3][Mor09], whereas the condition for Jacobi quartics is examined in
more detail in section 2.2 of this document. Since it is preferred that elliptic
curves used in cryptography have a group of prime order (as is the case, for
example, of NIST-recommended curves [Nat00]), they are not actually amenable
to Edwards or Jacobi quartic form.
Recent research activity has focused on counting elliptic curves in various
families using explicit computation of the j-invariant, for example in the fami-
lies of Doche-Icart-Kohel [DIK06] and Edwards [RFS10], Legendre [FW10], and
complete Edwards curves [AG11].
We count the Jacobi quartics using a direct method, relying on the action of
the Frobenius on the 4-torsion points of elliptic curves. Throughout this docu-
ment, k is a finite field of odd characteristic. Let E be an elliptic curve defined
over k. The 4-torsion subgroup E[4] of E has coordinates in the algebraic closure
of k, and is thus equipped with an action of the Frobenius endomorphism ϕ of k.
Since k has odd characteristic, by [Sil86, 6.4(b)], the group E[4] is isomorphic
to (Z/4Z)2 , and the action ϕE (mod 4) of ϕ is given by a matrix in GL2 (Z/4Z).
Finally, a change of basis of E[4] conjugates the matrix of ϕE (mod 4). There-
fore, to the curve E, one may canonically attach the conjugacy class of ϕE
(mod 4) in GL2 (Z/4Z).
This work gives an intrinsic characterization of representability of elliptic
curves by the Jacobi quartic model (Theorem 5); this is given by a list of allowed
conjugacy classes for ϕE (mod 4). In particular, this does not depend on the rep-
resentation chosen for the curve. Thus, it allows us to give an asymptotic count
of the elliptic curves that can be represented as Jacobi quartics. This method
generalizes to other quadrics intersection models such as Edwards, Jacobi, and
Huff (Theorem 11).
Billet and Joye [BJ03, §3] also define a twist of the Jacobi model that repre-
sents all curves with at least one rational point of order two, and give unified
addition formulæ for these curves with a cost of 10M + 3S + 2D. We give here
improved addition formulæ for the two following variants of the twisted Jacobi
model:
– A 7M+3S+Da multiplication for the (2,2)-Jacobi quartic, which represents
all curves whose point group has (Z/2Z) × (Z/2Z) as a subgroup (1/6 of all
elliptic curves);
– A 8M + 3S + Da multiplication for the (2)-Jacobi quartic, which represents
all curves whose point group has (Z/2Z) as a subgroup (2/3 of all elliptic
curves).
These formulæ, as well as the Jacobi quartic formula from [HWCD09], are not
unified. They are, however, “complete except at 2”: any points P, Q such that
the formulæ don’t allow the computation of P + Q differ by a point of order two
(Propositions 6 and 8). Thus, these formulæ are usable for all computations in
2 Jacobi Quartics
2.1 Curve Equation
A Jacobi quartic is a projective curve with the quartic equation
y 2 = x4 + 2ax2 + 1, (1)
where a ∈ k is a parameter. The discriminant of the right-hand side polynomial
is 2⁸(a² − 1)²; therefore, if a ∉ {−1, 1}, then the curve has the tacnode at the
point at infinity (0 : 1 : 0) as its only singular point. Resolution of this singularity
yields the intersection of two quadrics
JQa : y 2 = z 2 + 2ax2 + t2 , x2 = z · t, (2)
where the tacnode (0 : 1 : 0) has the two preimages (0 : 1 : ±1 : 0).
The curve JQa contains the four points with coordinates (x : y : z : t) equal
to (0 : 1 : 0 : ±1) and (0 : 1 : ±1 : 0). We fix ε = (0 : 1 : 0 : 1) as the neutral
point; the three others are then the three points of order two.
As JQa is a smooth intersection of two quadrics in the projective space
of dimension three, it is an elliptic curve, and the group law is defined by
coplanarity [Ono94, LS01]. Namely, let ε be the neutral point; then any three
points P1 , P2 , P3 have zero sum if the four points (ε, P1 , P2 , P3 ) are coplanar. Of
course, when two of the points, say P1 and P2 , are identical, we replace P2 by
the direction of the tangent line at P1 to JQa .
We may then check that the addition formulæ for P3 = P1 + P2, where Pi =
(xi : yi : zi : ti), are

x3 = (x1 y2 + y1 x2) · (t1 t2 − z1 z2);
y3 = (y1 y2 + 2a x1 x2)(z1 z2 + t1 t2) + 2 x1 x2 (z1 t2 + t1 z2);      (3)
z3 = (x1 y2 + y1 x2)²;
t3 = (t1 t2 − z1 z2)².
The negative of the point (x : y : z : t) is (−x : y : z : t).
A speed-up of one multiplication is achieved [HWCD09] by observing that
z3 = (z1 z2 + t1 t2 )(z1 t2 + t1 z2 ) + 2x1 x2 (2ax1 x2 + y1 y2 ), (4)
so that y3 + z3 factorizes as
y3 + z3 = (z1 z2 + 2x1 x2 + t1 t2 )(y1 y2 + 2ax1 x2 + z1 t2 + t1 z2 ). (5)
The cost of a point addition using (5) is 7M + 3S + Da .
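These formulæ can be exercised numerically. The sketch below (the prime p = 1019 and the parameter a = 2 are arbitrary illustrative choices) implements (3), checks that adding the neutral point ε = (0 : 1 : 0 : 1) returns the same point, and verifies the factorization (5) on a pair of curve points.

```python
p, a = 1019, 2                     # arbitrary: p prime, p ≡ 3 (mod 4), a ∉ {−1, 1}

def on_curve(P):
    x, y, z, t = P
    return (y*y - (z*z + 2*a*x*x + t*t)) % p == 0 and (x*x - z*t) % p == 0

def add(P, Q):                     # the addition formulas (3)
    x1, y1, z1, t1 = P
    x2, y2, z2, t2 = Q
    x3 = (x1*y2 + y1*x2) * (t1*t2 - z1*z2)
    y3 = (y1*y2 + 2*a*x1*x2) * (z1*z2 + t1*t2) + 2*x1*x2 * (z1*t2 + t1*z2)
    z3 = (x1*y2 + y1*x2) ** 2
    t3 = (t1*t2 - z1*z2) ** 2
    return tuple(v % p for v in (x3, y3, z3, t3))

# Affine points (x : y : x² : 1) with y² = x⁴ + 2ax² + 1 (so x² = zt holds).
pts = []
for x in range(1, p):
    rhs = (x**4 + 2*a*x*x + 1) % p
    y = pow(rhs, (p + 1) // 4, p)  # square-root attempt, valid since p ≡ 3 (mod 4)
    if y * y % p == rhs:
        pts.append((x, y, x*x % p, 1))
    if len(pts) == 2:
        break

P, Q = pts
eps = (0, 1, 0, 1)                 # neutral point
assert on_curve(P) and on_curve(Q)
assert add(P, eps) == P            # adding the neutral point gives P back

R = add(P, Q)
assert R != (0, 0, 0, 0) and on_curve(R)

# factorization (5): y3 + z3 = (z1z2 + 2x1x2 + t1t2)(y1y2 + 2ax1x2 + z1t2 + t1z2)
x1, y1, z1, t1 = P
x2, y2, z2, t2 = Q
lhs = (R[1] + R[2]) % p
rhs = (z1*z2 + 2*x1*x2 + t1*t2) * (y1*y2 + 2*a*x1*x2 + z1*t2 + t1*z2) % p
assert lhs == rhs
```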
Remark 1. The formulæ (3) are not unified. One checks that these formulæ yield
(x : y : z : t) + (−x : y : t : z) = (0 : 0 : 0 : 0). (6)
This situation is examined in more detail in Proposition 6 below.
2.2 Representability
Proof. Let E be such an elliptic curve and assume for example that r2 − r3 is a
square in k. We may then define parameters a, c, d ∈ k by

a = (r2 + r3 − 2r1)/(r2 − r3),  c = (r2 − r3)/2,  4d² = r2 − r3.      (7)
Remark 4. If E has full rational 2-torsion subgroup, then so does its quadratic
twist, and at least one of E or its twist can be represented by a Jacobi quartic.
We see that all ξn are rational, and therefore ϕE(ξn, ηn) = (ξn, ±ηn). Thus,
ϕE (mod 4) is diagonalizable, and thus belongs to {id, −id}. The case ϕE ≡
+id (mod 4) is equivalent to η1 ∈ k and thus to d being a square. Therefore,
in the case (i), E is representable by a Jacobi quartic if, and only if, ϕE ≡ id
(mod 4).

(ii) not all the dn reduce to the same class modulo squares: then E can be
represented by a Jacobi quartic. Moreover, if for example d1 is a square
and d2 is not, then (ξ3 − r3)² = −d1 d2 is not a square, and therefore ξ3 ∉ k.
Thus, ϕE (mod 4) is not diagonalizable and belongs to one of the conjugacy
classes (1 2; 0 1), (−1 2; 0 −1).

This shows that the cases where E transforms to a Jacobi quartic are exactly
the cases where ϕE (mod 4) is one of the five matrices listed above.
It is smooth when (a² − 1)b ≠ 0. We note that for all λ ∈ k×, the curve
JQ^{(2,2)}_{λ²a,λ²b} is isomorphic to JQ^{(2,2)}_{a,b} by the coordinate change
(λx : y : z : t). Therefore, we may choose b to be either one or a (small) preset
quadratic non-residue in k.
This curve has the rational points of order two

ω1 = (0 : 1 : 0 : −1),  ω2 = (0 : 1 : 1 : 0),  ω′2 = ω2 + ω1 = (0 : 1 : −1 : 0).      (15)
The point addition formulæ are deduced from (3):

x3 = (x1 y2 + y1 x2)(z1 z2 − t1 t2)
y3 = (y1 y2 + 2a x1 x2)(z1 z2 + t1 t2) + 2b x1 x2 (z1 t2 + t1 z2)
z3 = (z1 z2 − t1 t2)² = (z1 z2 + t1 t2)² − (2b x1 x2)²      (16)
t3 = b (x1 y2 + y1 x2)²
y3 + t3 = (z1 z2 + 2b x1 x2 + t1 t2)(y1 y2 + 2a x1 x2 + z1 t2 + t1 z2)
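The alternative expression for z3 rests on the quadric relation b·x² = z·t (the relation used in the proof of Proposition 6): then 4·z1z2t1t2 ≡ (2b·x1x2)². A quick modular check, with arbitrary parameter values:

```python
p, b = 10007, 5                       # arbitrary prime and curve parameter

for x1 in range(1, 40):
    for x2 in range(1, 40):
        # points with t_i = 1 and z_i = b·x_i², so that b·x_i² = z_i·t_i holds
        z1, t1 = b * x1 * x1 % p, 1
        z2, t2 = b * x2 * x2 % p, 1
        lhs = (z1*z2 - t1*t2) ** 2 % p
        rhs = ((z1*z2 + t1*t2) ** 2 - (2*b*x1*x2) ** 2) % p
        assert lhs == rhs             # (z1z2−t1t2)² = (z1z2+t1t2)² − (2b·x1x2)²
```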
The full cost for a point addition is seen to be 7M+3S+Da +2Db . The advantage
of choosing a parameter b such that multiplication by b is fast is apparent. The
probability that all b < N are squares modulo p is asymptotically equivalent
to e−N , so that in practice we shall almost always be able to find such a b;
moreover, whether this is the case is easy to check by quadratic reciprocity.
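Finding such a b is a handful of modular exponentiations. The sketch below uses Euler's criterion (the computational counterpart of the quadratic-reciprocity check mentioned in the text):

```python
def smallest_nonresidue(p):
    """Smallest quadratic non-residue modulo an odd prime p.

    Euler's criterion: b is a non-residue mod p iff b^((p-1)/2) ≡ −1 (mod p).
    """
    for b in range(2, p):
        if pow(b, (p - 1) // 2, p) == p - 1:
            return b

assert smallest_nonresidue(7) == 3
assert smallest_nonresidue(23) == 5
```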
Proposition 6. Let P1, P2 be two points of JQ^{(2,2)}_{a,b} such that the addition for-
mulæ (16) yield P3 = P1 + P2 = (0 : 0 : 0 : 0). Then we either have P2 = P1 + ω2
or P2 = P1 + ω′2, where the ωi are the points of order two defined in (15).
Proof. Let (xi : yi : zi : ti) be the coordinates of Pi. If x1 = x2 = 0, then
both points belong to the 2-torsion group and the result follows by enumeration,
so we may assume for example x1 ≠ 0. Since bx1² = z1t1, this implies z1 ≠ 0
and t1 ≠ 0.
The relations t3 = 0 and z3 = 0 then imply that there exist α, β ∈ k such
that P1 = (x1 : αx1 : βt1 : t1) and P2 = (x2 : −αx2 : z2 : βz2). Since bx1² = βt1²,
there exists ξ ∈ k such that β = bξ² and x1 = ξt1. Let η = ξα; then

P1 = (ξ : η : bξ² : 1),  P2 = (σξ : −ση : 1 : bξ²)  for σ = ±1.

We then see that σ = 1 implies P2 = P1 + ω2 whereas σ = −1 implies P2 =
P1 + ω′2.
3.2 Representability
Proposition 7. The (2, 2)-Jacobi quartics represent exactly all elliptic curves E
with rational 2-torsion subgroup.
Proof. Let E be a curve with three rational points of order two and the equation
η² = (ξ − r1)(ξ − r2)(ξ − r3), and define, for any c ∈ k, coordinates (x : y : z : t) by

( x )   (    0        c   0           0           ) ( ξ  )
( y ) = (  −2r1       0   1   r1(r2 + r3) − r2r3  ) ( η  )      (18)
( z )   ( −r2 − r3    0   1          r2r3         ) ( ξ² )
( t )   (  r2 − r3    0   0     −r1(r2 − r3)      ) ( 1  )
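As a consistency check on the coordinate change (18) — the matrix entries used below are read off the partially garbled display, so treat them as an assumption — the z and t rows factor as z = (ξ − r2)(ξ − r3) and t = (r2 − r3)(ξ − r1), so that z·t = (r2 − r3)·η² on E:

```python
p = 10007
r1, r2, r3 = 3, 11, 40                 # arbitrary distinct roots mod p

for xi in range(1, 60):
    eta2 = (xi - r1) * (xi - r2) * (xi - r3) % p       # η² on E
    z = (-(r2 + r3) * xi + xi * xi + r2 * r3) % p      # z row of (18)
    t = ((r2 - r3) * xi - r1 * (r2 - r3)) % p          # t row of (18)
    assert z == (xi - r2) * (xi - r3) % p
    assert t == (r2 - r3) * (xi - r1) % p
    assert z * t % p == (r2 - r3) * eta2 % p
```

Together with x = c·η from the first row, this gives a relation of the form b·x² = z·t with b = (r2 − r3)/c², matching the quadric shape of the (2, 2) model.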
4 (2)-Jacobi Quartics
4.1 Curve Equation
The (2)-Jacobi quartic is the intersection of the two quadrics

JQ^{(2)}_{a,b} : x² = zt,  y² = z² + 2a x² + b t².      (19)

It is smooth (and thus an elliptic curve) whenever (a² − 1)b ≠ 0. For all λ ∈ k×,
JQ^{(2)}_{λ²a,λ⁴b} is isomorphic to JQ^{(2)}_{a,b} by the coordinate change (λx : y : z : λ²t).
The addition formulæ are given by

x3 = (x1 y2 + y1 x2)(z1 z2 − b t1 t2)
y3 = (y1 y2 + 2a x1 x2)(z1 z2 + b t1 t2) + 2b x1 x2 (z1 t2 + t1 z2)      (20)
z3 = (z1 z2 − b t1 t2)²
t3 = (x1 y2 + y1 x2)²

The factorisation trick from [HWCD09] does not apply here, thus the total point
addition cost is 8M + 3S + Da + 2Db.
The point ω1 = (0 : 1 : 0 : −1) is of order two.
Proposition 8. Let P1, P2 be two points of JQ^{(2)}_{a,b} such that the addition for-
mulæ (20) yield P3 = (0 : 0 : 0 : 0). Then P1 − P2 is a point of order two, distinct
from ω1.

Proof. After extending the scalars to k(√b), the curve JQ^{(2)}_{a,b} becomes isomorphic
to JQ^{(2,2)}_{a,√b}. The result follows from Proposition 6 on that curve.
4.2 Representability
Proposition 9. The (2)-Jacobi quartics represent exactly all elliptic curves E
with at least one rational point of order two.
Proof. Let E be a curve with one rational point of order two; then there ex-
ist r, s, p ∈ k such that E has the equation
We then define

a = s − 2r,  b = s² − 4p,      (22)

and coordinates (x : y : z : t) by

( x )   (   0    1   0     0    ) ( ξ  )
( y ) = ( −2r    0   1   rs − p ) ( η  )      (23)
( z )   (  −s    0   1     p    ) ( ξ² )
( t )   (   1    0   0    −r    ) ( 1  )
the set of elliptic curves over Fq with Frobenius class equal to S has a Dirichlet
density equal to #S/#GL2 (Z/4Z). Finally, the group GL2 (Z/4Z) is an extension
of GL2 (F2 ), which is isomorphic to the symmetric group S3 , by the group of
matrices ≡ id (mod 2), which is isomorphic to (Z/2Z)⁴; thus, it is a group of
order 96.
Proof. These elliptic curve models are all characterized by a set of conjugacy
classes of the Frobenius in GL2 (Z/4Z); namely:
(i) the Jacobi quartics are characterized by the list of conjugacy classes of
Theorem 5;
(ii) the (2, 2)-Jacobi quartics are exactly the curves E satisfying ϕE ≡ id
(mod 2) (by Proposition 7);
(iii) the (2)-Jacobi quartics are exactly the curves such that ϕE has a fixed point
modulo 2 (by Proposition 9);
(iv) the twisted Edwards curves are exactly the curves with a rational 4-torsion
point [BBJ+ 08, Theorem 3.3], which means that ϕE has a fixed point mod-
ulo 4;
(v) the Huff curves are the curves that contain (Z/2Z) × (Z/4Z) as a sub-
group [JTV10, Theorem 2] and are thus the intersection of (2, 2)-Jacobi
quartics and Edwards curves.
In each case, the results follow by counting the number of such matrices
in GL2 (Z/4Z). By the Hasse-Weil theorem, q = det(ϕE ); consequently, the
conditional results on q (mod 4) are derived by counting the number of such
matrices with the suitable determinant.
For instance, the Jacobi quartics in the case where q ≡ +1 (mod 4) correspond
to the conjugacy classes of (1 0; 0 1), (1 2; 0 1) and (−1 2; 0 −1), with respective
cardinalities 1, 3 and 3. Therefore, asymptotically, 7/48 of all elliptic curves
with q ≡ +1 (mod 4) are isomorphic to a Jacobi quartic.
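These counts can be verified by brute force over GL2(Z/4Z); a small self-contained sketch (matrices stored row-major as 4-tuples; the class representatives transcribe the ones in the text, with −1 ≡ 3 mod 4):

```python
from itertools import product

M = 4  # arithmetic modulo 4

def mul(A, B):
    a, b, c, d = A
    e, f, g, h = B
    return ((a*e + b*g) % M, (a*f + b*h) % M, (c*e + d*g) % M, (c*f + d*h) % M)

def det(A):
    return (A[0]*A[3] - A[1]*A[2]) % M

def inv(A):
    d = det(A)          # d ∈ {1, 3}, and each is its own inverse mod 4
    return ((A[3]*d) % M, (-A[1]*d) % M, (-A[2]*d) % M, (A[0]*d) % M)

G = [A for A in product(range(M), repeat=4) if det(A) % 2 == 1]
assert len(G) == 96                                    # |GL2(Z/4Z)| = 96

def conj_class(X):
    return {mul(mul(g, X), inv(g)) for g in G}

reps = [(1, 0, 0, 1), (1, 2, 0, 1), (3, 2, 0, 3)]      # id, (1 2;0 1), (−1 2;0 −1)
sizes = [len(conj_class(X)) for X in reps]
assert sizes == [1, 3, 3]
assert sum(sizes) == 7                                 # the 7 in 7/48
assert sum(1 for A in G if det(A) == 1) == 48          # matrices with det ≡ 1 (mod 4)
```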
Remark 12. If the field k is a prime field then the results about some of these
families of curves may also be derived, in a similar way, from statistics about
the group structure of elliptic curves [Gek06, 2.18].
All coordinate systems in the following list may be represented as the smooth
intersection of two three-dimensional quadrics. For each, we list the cost for
a point addition according to literature, the condition for representability of a
curve by such a model, and the asymptotic probability that this model represents
a random curve, in the sense of Theorem 11.
References
[AG11] Ahmadi, O., Granger, R.: On isogeny classes of Edwards curves over finite
fields. ArXiv preprint arXiv:1103.3381 (2011)
[BBJ+ 08] Bernstein, D.J., Birkner, P., Joye, M., Lange, T., Peters, C.: Twisted
Edwards Curves. In: Vaudenay, S. (ed.) AFRICACRYPT 2008. LNCS,
vol. 5023, pp. 389–405. Springer, Heidelberg (2008), doi:10.1007/978-3-540-
68164-9_26
[BJ03] Billet, O., Joye, M.: The Jacobi Model of an Elliptic Curve and
Side-Channel Analysis. In: Fossorier, M., Høholdt, T., Poli, A. (eds.)
AAECC 2003. LNCS, vol. 2643, pp. 34–42. Springer, Heidelberg (2003),
doi:10.1007/3-540-44828-4_5
[BL07] Bernstein, D.J., Lange, T.: Inverted Edwards Coordinates. In: Boztaş,
S., Lu, H.-F. (eds.) AAECC 2007. LNCS, vol. 4851, pp. 20–27. Springer,
Heidelberg (2007)
[CH11] Castryck, W., Hubrechts, H.: The distribution of the number of points
modulo an integer on elliptic curves over finite fields (Preprint, 2011)
[DIK06] Doche, C., Icart, T., Kohel, D.R.: Efficient Scalar Multiplication by Isogeny
Decompositions. In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T. (eds.)
PKC 2006. LNCS, vol. 3958, pp. 191–206. Springer, Heidelberg (2006)
[Edw07] Edwards, H.M.: A normal form for elliptic curves. Bulletin-American Math-
ematical Society 44(3), 393–422 (2007)
[FW10] Feng, R., Wu, H.: On the isomorphism classes of Legendre elliptic curves
over finite fields. ArXiv preprint arXiv:1001.2871 (2010)
[Gek06] Gekeler, E.-U.: The distribution of group structures on elliptic curves over
finite prime fields. Documenta Mathematica 11, 119–142 (2006)
[HWCD08] Hisil, H., Wong, K.K.-H., Carter, G., Dawson, E.: Twisted Edwards Curves
Revisited. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp.
326–343. Springer, Heidelberg (2008)
[HWCD09] Hisil, H., Wong, K.K.-H., Carter, G., Dawson, E.: Faster group operations
on elliptic curves. In: Proceedings of the Seventh Australasian Conference
on Information Security, AISC 2009, vol. 98, pp. 7–20. Australian Com-
puter Society, Inc., Darlinghurst (2009)
[JTV10] Joye, M., Tibouchi, M., Vergnaud, D.: Huff’s Model for Elliptic Curves.
In: Hanrot, G., Morain, F., Thomé, E. (eds.) ANTS-IX. LNCS, vol. 6197,
pp. 234–250. Springer, Heidelberg (2010)
[LS01] Liardet, P.-Y., Smart, N.P.: Preventing SPA/DPA in ECC Systems using
the Jacobi Form. In: Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001.
LNCS, vol. 2162, pp. 391–401. Springer, Heidelberg (2001), doi:10.1007/3-
540-44709-1_32
[Mor09] Morain, F.: Edwards curves and CM curves. ArXiv preprint arXiv:0904.2243
(2009)
[Nat00] National Institute of Standards and Technology. FIPS PUB 186-2: Digital
Signature Standard (DSS) (January 2000)
[Ono94] Ono, T.: Variations on a theme of Euler: quadratic forms, elliptic curves,
and Hopf maps. Plenum. Pub. Corp. (1994)
[RFS10] Farashahi, R.R., Shparlinski, I.: On the number of distinct elliptic curves
in some families. Designs, Codes and Cryptography 54, 83–99 (2010),
doi:10.1007/s10623-009-9310-2
[Ser65] Serre, J.-P.: Zeta and L functions. In: Proc. Conf. on Arithmetical Al-
gebraic Geometry, Purdue Univ., pp. 82–92. Harper & Row, New York
(1965)
[Sil86] Silverman, J.H.: The arithmetic of elliptic curves. Springer, Heidelberg
(1986)
Improved Three-Way Split Formulas
for Binary Polynomial Multiplication
Abstract. In this paper we deal with 3-way split formulas for binary
field multiplication with five recursive multiplications of smaller sizes.
We first recall the formula proposed by Bernstein at CRYPTO 2009
and derive the complexity of a parallel multiplier based on this formula.
We then propose a new set of 3-way split formulas with five recursive
multiplications based on field extension. We evaluate their complexities
and provide a comparison.
1 Introduction
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 384–398, 2012.
c Springer-Verlag Berlin Heidelberg 2012
In this paper we also deal with the 3-way splits and propose new formulas
for binary polynomial multiplication with five recursive multiplications. We use
the extension field F4 to obtain a sufficient number of elements to be able to
apply the multi-evaluation (i.e., evaluation at multiple elements) and interpo-
lation method. This leads to Toom-Cook like formulas. We study the recursive
complexity of the proposed formulas and evaluate the delay of the corresponding
parallel multiplier.
The remainder of this paper is organized as follows: in Section 2 we review
the general method based on multi-evaluation and interpolation to obtain 3-way
split formulas. We then review Bernstein’s formula and evaluate a non-recursive
form of its complexity. In Section 3 we present a new set of 3-way formulas
based on field extension. We evaluate the complexity and the delay of a parallel
multiplier based on these formulas. Complexity comparison and some concluding
remarks are given in Section 4.
In this subsection, we first review the 3-way split formula with five recursive
multiplications presented by Bernstein in [1]. We then derive its complexity
results. We consider two degree-(n − 1) polynomials A and B in F2[X], where n
is a power of 3. We split these two polynomials in three parts, replace
X^{n/3} by Y, and consider them as polynomials in R[Y], where R = F2[X]:

A = A0 + A1 Y + A2 Y²  and  B = B0 + B1 Y + B2 Y²,

with degX Ai, degX Bi < n/3. Bernstein uses a multi-evaluation and interpola-
tion approach, evaluating the polynomials at the five elements 0, 1, X, X + 1
and ∞ of R ∪ {∞}. We denote by C the product of A and B. We then define
the pairwise products of the evaluations of A(Y) and B(Y) at 0, 1, X, X + 1 and
∞ as follows
∞ as follows
P0 = A0 B0 (eval. at 0),
P1 = (A0 + A1 + A2 )(B0 + B1 + B2 ) (eval. at 1),
2 2
P2 = (A
0 + A1 X + A2 X )(B0 + B1 X + B2 X ) (eval. at X),
2
P3 = (A0 +A1 + A2 ) + (A1 X + A2 X )
× (B0 + B1 + B2 ) + (B1 X + B2 X 2 ) (eval. at X + 1),
P4 = A2 B2 (eval. at ∞).
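These five products can be checked against direct multiplication using integers as F2[X] coefficient bit-vectors; `clmul` is a straightforward carry-less multiply, and the sample coefficients are arbitrary.

```python
def clmul(a, b):
    """Carry-less product of two F2[X] polynomials stored as int bit-vectors."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

A = [0b1011, 0b0110, 0b1101]   # A0, A1, A2 (arbitrary)
B = [0b0101, 0b1110, 0b1001]   # B0, B1, B2

# C(Y) = A(Y)·B(Y) in R[Y]: C_k = xor of A_i·B_j over i + j = k
C = [0] * 5
for i in range(3):
    for j in range(3):
        C[i + j] ^= clmul(A[i], B[j])

P0 = clmul(A[0], B[0])                                     # eval. at 0
P1 = clmul(A[0] ^ A[1] ^ A[2], B[0] ^ B[1] ^ B[2])         # eval. at 1
P2 = clmul(A[0] ^ (A[1] << 1) ^ (A[2] << 2),
           B[0] ^ (B[1] << 1) ^ (B[2] << 2))               # eval. at X
P3 = clmul(A[0] ^ A[1] ^ A[2] ^ (A[1] << 1) ^ (A[2] << 2),
           B[0] ^ B[1] ^ B[2] ^ (B[1] << 1) ^ (B[2] << 2)) # eval. at X + 1
P4 = clmul(A[2], B[2])                                     # eval. at ∞

# The evaluations of C(Y) at the same five points must match.
assert P0 == C[0]
assert P1 == C[0] ^ C[1] ^ C[2] ^ C[3] ^ C[4]              # C(1)
assert P2 == C[0] ^ (C[1] << 1) ^ (C[2] << 2) ^ (C[3] << 3) ^ (C[4] << 4)  # C(X)
# C(X+1): the powers (X+1)^k for k = 0..4 are 1, X+1, X²+1, X³+X²+X+1, X⁴+1
pw = [0b1, 0b11, 0b101, 0b1111, 0b10001]
assert P3 == C[0] ^ clmul(C[1], pw[1]) ^ clmul(C[2], pw[2]) \
           ^ clmul(C[3], pw[3]) ^ clmul(C[4], pw[4])
assert P4 == C[4]
```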
We evaluate the complexity of Bernstein's formula when it is applied
recursively. This complexity will be expressed in terms of the number of bit
additions, denoted S⊕(n), and the number of bit multiplications, denoted S⊗(n).
The complexities of the computation of the five products P0, P1, P2, P3 and P4
are given in Table 7 in Appendix B. Note that the degrees of R3, R′3, R4 and R′4
are all equal to n/3 + 1, while the degrees of A0, B0, A2, B2, R1 and R′1 are equal
to n/3 − 1. Consequently, the products P0, P1 and P4 have their degree equal to
(2n/3 − 2) and the degrees of P2 and P3 are equal to (2n/3 + 2).
The formulas in Table 7 can be applied only once, since the five products involve
polynomials of degrees n/3 − 1 and n/3 + 1. In order to have a fully recursive method,
we express the product of degree-(n/3 + 1) polynomials in terms of one product of
degree-(n/3 − 1) polynomials plus some additional non-recursive computations.

For this purpose, we consider P = Σ_{i=0}^{n/3+1} p_i X^i and Q = Σ_{i=0}^{n/3+1} q_i X^i. We
first rewrite P = P′ + (p_{n/3} X^{n/3} + p_{n/3+1} X^{n/3+1}) and Q = Q′ + (q_{n/3} X^{n/3} +
q_{n/3+1} X^{n/3+1}), and then if we expand the product P Q we obtain
and this requires one bit addition and four bit multiplications. Finally, the complexities S⊕(n/3 + 2) and S⊗(n/3 + 2) consist of the complexity of each product Mi plus 2n/3 + 1 bit additions for the sum of these five products. This results in the following complexity:
S⊕(n/3 + 2) = S⊕(n/3) + 4n/3,
S⊗(n/3 + 2) = S⊗(n/3) + 4n/3 + 4.        (3)
[Figure: data flow of Bernstein's 3-way split multiplier. The inputs A0, A1, A2 are mapped through the multi-evaluation (with intermediate values R1, R3, R4) to the five products P0, P1, P2, P3, P4, and the reconstruction performs a division by X^2 + X.]
Recursive products. In Table 2 we give the cost of the five recursive products.
In the case of a multiplication in F4 [X], all the polynomials are in F4 [X] and
Reconstruction in F4
Computations                                              #⊕
U1 = P2 + P3                                              4n/3 − 2
U2 = αU1 (= α(P2 + P3))                                   2n/3 − 1
U3 = (1 + α)U1 (= (1 + α)(P2 + P3))                       0
U4 = P1 + U3 (= P1 + (1 + α)(P2 + P3))                    4n/3 − 2
U5 = U4(X^{n/3} + X^{2n/3} + X^{3n/3})                    4n/3 − 4
U6 = P0 + X^{n/3} P4                                      2n/3 − 2
U7 = U6(1 + X^n) (= (P0 + X^{n/3} P4)(1 + X^n))           0
C = U7 + U5 + X^n U2 + P2 X^{2n/3} + P3 X^{n/3}           20n/3 − 10
Total                                                     36n/3 − 21

Reconstruction in F2
Computations                                              #⊕
U1 = P2 + P3                                              4n/3 − 2
U2 = [αU1]cte                                             0
U3 = [(1 + α)U1]cte                                       2n/3 − 1
U4 = [P1 + U3]cte                                         2n/3 − 1
U5 = [U4(X^{n/3} + X^{2n/3} + X^{3n/3})]cte               2n/3 − 2
U6 = [P0 + X^{n/3} P4]cte                                 n/3 − 1
U7 = [U6(1 + X^n)]cte                                     0
C = [U7 + U5 + X^n U2 + P2 X^{2n/3} + P3 X^{n/3}]cte      10n/3 − 5
Total                                                     21n/3 − 12
thus the costs of the recursive products are SF4,⊕(n/3) and SF4,⊗(n/3). For the multiplication in F2[X], there are three products which involve polynomials in F2[X], incurring a cost of SF2,⊕(n/3) and SF2,⊗(n/3); the two other products are in F4[X] and thus the corresponding cost is SF4,⊕(n/3) and SF4,⊗(n/3).
We then apply the same method to other complexities. Below we list the final
non-recursive expression for each case.
For the multiplication C in F4[X]:

SF4,⊕(n) = (107/4) n^{log3(5)} − (29/3) n log3(n) − (55/2) n + 34,
SF4,⊗(n) = 4 n^{log3(5)} − 3n,

and for the multiplication C′ in F2[X]:

SF2,⊕(n) = (111/4) n^{log3(5)} − (29/3) n log3(n) − (57/2) n + 34,        (11)
SF2,⊗(n) = 3 n^{log3(5)} − 2n.
We evaluate the delay of the 3-way split multiplier by drawing the data flow of the 3-way multiplier in F4[X] and in F2[X]. The sequence of operations for these two cases (F4 and F2) is essentially the same: their only difference is in the reconstruction: in the F2[X] multiplication the operations are restricted to F2. The data flow shown in Figure 2 is valid for both cases. We now evaluate the critical path delay for the multiplication in F4[X] and then for the multiplication in F2[X].
[Figure 2: data flow of the 3-way split multiplier. The inputs A0, A1, A2 are mapped (via intermediate values R4, R5, R6) to the five products P0, ..., P4; the reconstruction includes multiplications by α and 1 + α.]
Delay of the multiplier in F4 [X]. The critical path is made of the following three
parts:
Delay of the multiplier in F2[X]. The critical path is the same as the critical path for the multiplication in F4[X]. The only difference is that the multiplications by α and (1 + α) do not add any delay, since they consist of a permutation of the coefficients. Consequently, the recursive expression of the delay is DF2(n) = 8DX + DF4(n/3), and this yields the corresponding non-recursive expression to be
consequently, if the cost of a bit multiplication is twice the cost of a bit addition, then the complexity of the proposed method becomes smaller than that of Bernstein's approach. On the other hand, when these formulas are applied to parallel implementations of polynomial multipliers, Bernstein's formula leads to a time delay linear in n, while the proposed ones are logarithmic.
to the combination of Karatsuba (cf. [1]) and either the formula of Bernstein or the proposed formulas. To get the complexity based on the proposed formulas in the special case where i ≥ 1 and j ≥ 1, we apply Karatsuba recursively until we obtain polynomials of size 2·3^j, and then we apply the strategy presented above to multiply such polynomials. The resulting complexity shows that, for j ≥ 1 in n = 2^i × 3^j, our approach yields better space and time complexities for the considered fields. The fact that our space complexity is better is due to the terms −9.67 n log3(n) in C and C′ in (11), which are non-negligible for the above-mentioned sizes of polynomials.
References
1. Bernstein, D.J.: Batch Binary Edwards. In: Halevi, S. (ed.) CRYPTO 2009. LNCS,
vol. 5677, pp. 317–336. Springer, Heidelberg (2009)
2. Cenk, M., Koç, Ç., Özbudak, F.: Polynomial Multiplication over Finite Fields
Using Field Extensions and Interpolation. In: 19th IEEE Symposium on Computer
Arithmetic, ARITH 2009, pp. 84–91 (2009)
3. ElGamal, T.: A Public-Key Cryptosystem and a Signature Scheme Based on Dis-
crete Logarithms. IEEE Transactions on Information Theory 31(4), 469–472 (1985)
4. Fan, H., Hasan, M.A.: A New Approach to Subquadratic Space Complexity Parallel
Multipliers for Extended Binary Fields. IEEE Transactions on Computers 56(2),
224–233 (2007)
5. Fan, H., Sun, J., Gu, M., Lam, K.-Y.: Overlap-free Karatsuba-Ofman Polynomial
Multiplication Algorithm (May 2007)
6. Karatsuba, A.A.: The Complexity of Computations. In: Proceedings of the Steklov
Institute of Mathematics, vol. 211, pp. 169–183 (1995)
7. Koblitz, N.: Elliptic curve cryptosystems. Mathematics of Computation 48, 203–
209 (1987)
8. McGrew, D.A., Viega, J.: The Security and Performance of the Galois/Counter Mode
(GCM) of Operation. In: Canteaut, A., Viswanathan, K. (eds.) INDOCRYPT 2004.
LNCS, vol. 3348, pp. 343–355. Springer, Heidelberg (2004)
9. Miller, V.: Use of Elliptic Curves in Cryptography. In: Williams, H.C. (ed.)
CRYPTO 1985. LNCS, vol. 218, pp. 417–426. Springer, Heidelberg (1986)
10. Sunar, B.: A generalized method for constructing subquadratic complexity GF(2k )
multipliers. IEEE Transactions on Computers 53, 1097–1105 (2004)
11. Toom, A.L.: The Complexity of a Scheme of Functional Elements Realizing the
Multiplication of Integers. Soviet Mathematics 3, 714–716 (1963)
12. Winograd, S.: Arithmetic Complexity of Computations. Society For Industrial &
Applied Mathematics, U.S. (1980)
In this section we provide two lemmas which give the non-recursive solutions to inductive expressions. These solutions are required to obtain a non-recursive expression of the complexity of the formulas presented in the paper. The proof of Lemma 1 can be found in [4].
rn = ( e + bc/(a − b) + d/(a − 1) ) n^{logb(a)} − ( bc/(a − b) ) n − d/(a − 1).        (14)
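As a quick numerical sanity check (ours, not from the paper), the closed form (14) solves the recurrence rn = a·r_{n/b} + cn + d with r1 = e; this recurrence shape is our reading of the lemma, inferred from the base-case check of (15)-(16), and the parameter values below are arbitrary:

```python
# Verify that (14), rn = (e + bc/(a-b) + d/(a-1)) * n^{log_b(a)}
#                        - (bc/(a-b)) * n - d/(a-1),
# satisfies rn = a*r(n/b) + c*n + d with r(1) = e.
from math import log, isclose

a, b, e, c, d = 5.0, 3.0, 1.0, 2.0, 7.0   # arbitrary test values

def r(n):
    K = e + b * c / (a - b) + d / (a - 1)
    return K * n ** (log(a) / log(b)) - (b * c / (a - b)) * n - d / (a - 1)

assert isclose(r(1), e)                    # base case
for k in range(1, 8):
    n = b ** k
    assert isclose(r(n), a * r(n / b) + c * n + d)   # inductive step
```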
rb = a r1 + f b^δ + cb + d = ae + f b^δ + cb + d.        (16)
Now we compare this value of rb to the value given by formula (15):
rb = ae + f b^δ (b^δ − a)/(b^δ − a) + cb logb(b) + d (a − 1)/(a − 1)
   = ae + f b^δ + cb + d,
as required.
Table 7. Cost of multi-evaluation and products for Bernstein's 3-way split formula

                Computations                                #⊕
                S = P2 + P3                                 2n/3 + 1
                U = P0 + (P0 + P1) X^{n/3}                  n − 2
Reconstruction  V = P2 + S(X^{n/3} + X)                     n + 4
                W = U + V + P4(X^4 + X)                     7n/3 − 3
                W′ = W/(X^2 + X)                            n − 2
                W″ = W′(X^{2n/3} + X^{n/3})                 2n/3 − 1
                C = U + P4(X^{4n/3} + X^{n/3}) + W″         5n/3 − 3
                Total                                       25n/3 − 6
Abstract. Recently, Dimitrov et al. [5] proposed a novel algorithm for scalar multiplication of points on elliptic Koblitz curves that requires a provably sublinear number of point additions in the size of the scalar. Following some ideas used in this article, most notably double-base expansions for integers, we generalize their methods to hyperelliptic Koblitz curves of arbitrary genus over any finite field, obtaining a scalar multiplication algorithm requiring a sublinear number of divisor additions.
1 Introduction
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 399–411, 2012.
© Springer-Verlag Berlin Heidelberg 2012
400 H. Labrande and M.J. Jacobson Jr.
curves can be used for instance in embedded systems, where the computing
power and the memory are limited.
The problem we deal with in this article is scalar multiplication, comput-
ing m-folds of elements of the group associated to the curve. This problem has
practical implications: cryptographic schemes in the group of an elliptic or hy-
perelliptic curve often require computing m-folds. Thus, by making scalar mul-
tiplication more efficient, we improve the speed of curve-based cryptosystems,
possibly making them more practical and applicable in a broader set of systems.
In [5] the authors present a method to compute m-folds of points (m ∈ Z) on elliptic Koblitz curves requiring a sublinear (in the size of m) number of point additions. The method involves finding a suitable triple-base τ-adic expansion for a particular complex number τ. They also design an algorithm using double-base expansions that, while not proved to require a sublinear number of point additions (although benchmarks and experimental results suggest that it does), is efficient in practice. These represent the first results that break the barrier of linearity in the number of point additions; all previous algorithms used a linear number of them and aimed at improving the O-constant.
In this paper, we generalize the methods of [5] to the case of scalar multiplication on hyperelliptic Koblitz curves of all characteristics and all genera.
We present a scalar multiplication algorithm with sublinear complexity (in the
number of divisor additions) using triple-base expansions, with carefully cho-
sen bases. Although mostly of theoretical interest due to large constants in the
O-notation, our algorithm does prove for the first time the existence of sublin-
ear divisor multiplication on hyperelliptic Koblitz curves. We also present an
algorithm using double-base expansions that is conjecturally sublinear and more
likely to perform well in practice.
The next two sections provide background on hyperelliptic Koblitz curves
and multi-base number systems. The sections that follow contain our results.
We first present our algorithm that uses triple-base expansions to achieve a
sublinear complexity in number of divisor additions, followed by our practical
double-base algorithm.
and u | f − v^2 − hv. Thus, each divisor can be represented by at most 2g elements of Fq. The divisor corresponding to the principal ideal class is denoted div[1, 0]. The inverse of [u, v] is [u, −h − v], where the second entry is reduced modulo u, and is thus efficiently computable. The group operation can be done using Cantor's algorithm for any genus; for genus up to 4, more efficient explicit formulas exist.
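For illustration only (our own toy, not the setting of this paper): the inverse formula [u, −h − v mod u] can be exercised on a small odd-characteristic genus-2 curve y^2 = x^5 + x + 1 over F7, where h = 0 and the inverse is therefore [u, −v mod u]; the curve, the brute-force search, and all helper names are ours:

```python
p = 7
f = [1, 1, 0, 0, 0, 1]                   # f(x) = x^5 + x + 1, ascending coefficients

def pmul(a, b):                          # product in F7[x], coefficient lists
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % p
    return r

def pmod(a, m):                          # remainder of a modulo monic m
    a = a[:]
    while len(a) >= len(m):
        c, d = a[-1], len(a) - len(m)
        for i, mi in enumerate(m):
            a[d + i] = (a[d + i] - c * mi) % p
        while a and a[-1] == 0:
            a.pop()
    return a

def is_divisor(u, v):                    # Mumford condition: u | f - v^2 (h = 0)
    v2 = pmul(v, v)
    diff = [(x - y) % p for x, y in zip(f, v2 + [0] * (len(f) - len(v2)))]
    return pmod(diff, u) == []

# Brute-force one degree-2 divisor [u, v] with deg v < 2.
u, v = next(([b0, a1, 1], [v0, v1])
            for a1 in range(p) for b0 in range(p)
            for v0 in range(p) for v1 in range(p)
            if is_divisor([b0, a1, 1], [v0, v1]))
v_inv = pmod([(-x) % p for x in v], u)   # second entry of the inverse [u, -v mod u]
assert is_divisor(u, v_inv)              # the inverse is again a valid divisor
```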
Hyperelliptic Koblitz curves are hyperelliptic curves defined over Fq but the
group Pic0 (C(Fqn )) is considered over Fqn where n is prime. For example, one
of the hyperelliptic curves that is studied in [8] is the curve of genus 2
C : v 2 + uv = u5 + u2 + 1
considered over F2n . Such curves are a generalization of the approach developed
in the elliptic case by Koblitz [10]. In a string of articles, some authors suc-
cessfully generalized Solinas’s scalar multiplication method [13] to hyperelliptic
curves to get fast algorithms for divisor multiplication on this type of Koblitz
curve. An article [8] describes the method for hyperelliptic curves of genus 2,
and subsequent work by Lange describes a generalization of the method for
hyperelliptic curves of all genera and for every characteristic [11] (see also [2,
Sections 15.1.2 and 15.1.3].
As in the elliptic case, one main interest in hyperelliptic Koblitz curves is that
scalar multiplication can be sped up by making use of the action of the Frobenius
endomorphism on elements in Pic0 (C(Fq )). The Frobenius endomorphism τ over
Fqn is defined as x → xq . This operation is inherited by points on the curve
and by Pic0 (C(Fq )). It operates on elements of Pic0 (C(Fq )) given in Mumford
representation by

τ([u(x), v(x)]) = [τ(u(x)), τ(v(x))], where τ( Σ_{i=0}^{d} u_i x^i ) = Σ_{i=0}^{d} u_i^q x^i.
Let τ be a complex root of P ; since the Frobenius endomorphism and τ are both
roots of this polynomial, we can consider the Frobenius endomorphism as the
complex number τ . For example, in the case of the genus 2 Koblitz curve
C1 : v 2 + uv = u5 + u2 + 1 ,
we may take τ = (μ ± i√(8 − μ^2))/2, where μ = (1 ± √17)/2.
The idea to improve scalar multiplication is to compute a base-τ expansion
of the scalar, enabling a version of binary exponentiation based on repeated
proved [11, Theorem 5.5] that, unlike the elliptic case, the expansions produced
by this algorithm are not necessarily finite. Some criteria for non-finiteness are
provided in particular cases, and in general it is possible to check for a particular
curve whether periodic expansions occur by testing a set of elements in Z[τ ] of
bounded norm. These results use the following norm, which we also use in this
paper:

N(ζ) = √( Σ_{i=1}^{g} | Σ_{j=0}^{2g−1} b_j τ_i^j |^2 ),
additions on average.
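For the genus-1 case the expand-and-evaluate cycle is easy to sketch. The following Python (ours, a Solinas-style TNAF for an elliptic Koblitz curve; it is not the hyperelliptic Algorithm 5.19 of [11]) expands a + bτ in base τ with digits {0, ±1}, using τ^2 = μτ − 2, and checks the round trip:

```python
mu = 1   # curve parameter; tau satisfies tau^2 = mu*tau - 2

def tnaf(a, b):
    """tau-adic NAF digits of a + b*tau, digits in {0, +1, -1}."""
    digits = []
    while a or b:
        if a % 2:
            u = 2 - ((a - 2 * b) % 4)        # +1 or -1; makes a - u even
            a -= u
        else:
            u = 0
        digits.append(u)
        a, b = b + mu * (a // 2), -(a // 2)  # exact division of a + b*tau by tau
    return digits

def from_digits(digits):
    """Evaluate sum(d_i tau^i) back into Z[tau], returned as a pair (a, b)."""
    a = b = 0
    for d in reversed(digits):
        a, b = -2 * b + d, a + mu * b        # Horner step: (a + b*tau)*tau + d
    return a, b

for a0, b0 in [(9, 0), (-7, 5), (123456, -789)]:
    assert from_digits(tnaf(a0, b0)) == (a0, b0)
```

In the genus-1 case these expansions are always finite; as noted above, finiteness can fail in the hyperelliptic generalization, which is exactly why the norm-based test is needed.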
When compared to standard methods, Lange’s algorithm leads to a speed-up
of 1.5g as compared to the binary method and 1.3g compared to NAF. When
compared to a binary window method of width 2 (assuming q = 2 and g = 2),
the speed-up is 11/3. However, we note that the asymptotic complexity (in number of divisor additions) is still linear in the size of the scalar m, assuming, as is usually the case in cryptographic applications, that m ∈ O(q^{ng}). In the next
sections, we give two algorithms that achieve sublinear complexity, one provably
so and the other conjecturally.
Our definitions are adapted from [3,5], where integer scalars are considered, and
the digit set R = {±1}.
The motivation of applying multi-base expansions to the scalar multiplication
problem is that the number of non-zero terms in such an expansion, when using
appropriate bases, is sublinear in the size of the scalar. In the case of integer
bases and scalars, we have the following theorem.
Theorem 1. Given two primes p, q, every integer m has a {p, q}-representation with a sublinear number of summands, i.e., it can be represented as the sum or difference of at most O(log m / log log m) integers of the form p^a q^b for a, b ∈ N with a, b ∈ O(log m).
Theorem 1 first appeared with proof in [3, Theorem 1] for bases p = 2 and q = 3, but generalizes to any number of arbitrary distinct prime bases. The representation can be computed using a greedy algorithm, namely computing the largest integer of the form p^a q^b less than or equal to m and repeating with m − p^a q^b. A result of Tijdeman [14] implies that there exists p^a q^b with m − m/(log m)^C < p^a q^b < m for some absolute constant C > 0; this implies the running time and the bound on the exponents a and b occurring in the representation of m.
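The greedy algorithm just described is short enough to sketch in Python (ours; p = 2, q = 3, and this positive-term variant uses sums only, which suffices for the illustration):

```python
def greedy_23(m):
    """Greedy {2,3}-representation of m >= 0: largest 2^a * 3^b <= m, repeat."""
    terms = []
    while m:
        best, b3 = 1, 1
        while b3 <= m:            # for each power 3^b <= m, pack in factors of 2
            t = b3
            while 2 * t <= m:
                t *= 2
            best = max(best, t)
            b3 *= 3
        terms.append(best)
        m -= best
    return terms

def is_23_smooth(t):
    for f in (2, 3):
        while t % f == 0:
            t //= f
    return t == 1

rep = greedy_23(10**6)
assert sum(rep) == 10**6 and all(is_23_smooth(t) for t in rep)
```

By Tijdeman's result, each greedy step removes all but a 1/(log m)^C fraction of the remainder, which is what gives the O(log m / log log m) bound on the number of terms.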
Tijdeman's result also holds for complex bases provided that the norm of
one base is strictly greater than the other. For elliptic and hyperelliptic Koblitz
curves, we would like to use bases that are simple functions of τ, ideally τ
and τ − 1, so that the resulting scalar multiplication algorithm requires as few
divisor additions as possible. Unfortunately these bases have the same norm,
and the theoretical result does not apply. In [5], the authors get around this
problem by using a combination of triple-base representations in Z[τ ] and {2, 3}-
representations of integers, yielding an algorithm requiring only o(log m) point
additions. As that algorithm does not appear to be efficient in practice, an
algorithm using {τ, τ − 1}-representations is also presented that works well in
practice, despite having only conjectural sublinearity.
integer. Our asymptotic statements in this section assume that the field size q
and genus g are fixed, so that the norm of the scalar tends to infinity, although
we also give the dominant terms in q and g as well.
In [5], the authors achieve this for elliptic curves by using {2, 3}-expansions
of integers and then replacing the 2s and 3s by expressions involving τ using
2 = τ(μ − τ),
3 = 1 + μτ − τ^2,
where μ ∈ {±1} is a parameter indicating which of the two elliptic Koblitz curves
over F2 is used. We use a similar approach. We first select suitable bases for our
representation, that is two prime numbers that we can replace by polynomial
identities involving τ . Considering the characteristic polynomial of the Frobenius
endomorphism τ , we have the following identity:
q^g = −τ^{2g} − a1 τ^{2g−1} − ... − a1 q^{g−1} τ = Q(τ).
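The elliptic-case substitutions quoted above can be checked directly in Z[τ] (our own sketch), representing a + bτ as a pair and reducing products with τ^2 = μτ − 2:

```python
def tmul(x, y, mu):
    """Product in Z[tau] with tau^2 = mu*tau - 2; elements are pairs (a, b)."""
    (a, b), (c, d) = x, y
    # (a + b*t)(c + d*t) = ac + (ad + bc)t + bd*t^2, with t^2 = mu*t - 2
    return (a * c - 2 * b * d, a * d + b * c + mu * b * d)

for mu in (1, -1):
    tau = (0, 1)
    assert tmul(tau, (mu, -1), mu) == (2, 0)     # 2 = tau*(mu - tau)
    t2 = tmul(tau, tau, mu)                      # tau^2 as a pair
    assert (1 - t2[0], mu - t2[1]) == (3, 0)     # 3 = 1 + mu*tau - tau^2
```

The hyperelliptic identity q^g = Q(τ) plays the same role, with Q determined by the characteristic polynomial of the curve at hand.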
Consider i, j ∈ Z such that q^g + i and q^g + j are prime. By Theorem 1, any integer can be represented with a sublinear number of summands of the form ±(Q(τ) + i)^x (Q(τ) + j)^y. For convenience we call {τ, Q(τ) + i, Q(τ) + j}-integers terms of the form ±τ^x (Q(τ) + i)^y (Q(τ) + j)^z, x, y, z ∈ Z.
First, note that the straightforward approach of computing a {q^g + i, q^g + j}-representation of the integer scalar m and performing the substitution does not yield a sublinear algorithm. Although the number of terms in the expansion is indeed sublinear, the number of required divisor additions may not be, because the required powers of q^g + i and q^g + j may be as large as log m. Instead, we
model our approach on that of [5] and obtain the following theorem.
Theorem 2. Let ζ ∈ Z[τ], and assume that the τ-adic representation of ζ with coefficients in R = {0, ±1, . . . , ±(q^g − 1)/2} is finite. Then, for g and q fixed, ζ can
Using the fact that P(τ) = 0, we see that for i ∈ {0, . . . , M}, the ith block corresponds to an element of the form

Ci = Σ_{j=0}^{2g−1} c_{ij} τ^j.
We note that, since the xi are in R (and thus bounded by O(q^g)) and there are O(N0^α) digits in each block, log c_{ij} is in O(q^g N0^α).
We represent each c_{ij} in double-base representation using the prime integer bases q^g + i and q^g + j. According to Theorem 1, and since both of our bases are prime, these representations can be computed using the greedy algorithm of [3], and have at most

O( q^g N0^α / log(q^g N0^α) )

summands of the form (q^g + i)^x (q^g + j)^y where x, y ∈ O(q^g N0^α). Then, since q^g = Q(τ), we substitute q^g + i by Q(τ) + i and q^g + j by Q(τ) + j to obtain a {Q(τ) + i, Q(τ) + j}-expansion of each c_{ij}.
Next, we compute the {τ, Q(τ) + i, Q(τ) + j}-expansions of Ci by multiplying the expansion of c_{ij} for each j ∈ {0, . . . , 2g − 1} by τ^j, and adding the results. Thus, the expansion of Ci has O(g q^g N0^α / log(q^g N0^α)) summands of the form ±τ^x (Q(τ) + i)^y (Q(τ) + j)^z, with x ∈ {0, . . . , 2g − 1} and y, z ∈ O(q^g N0^α).
The last step is to compute the expansion of ζ from the expansions of the M blocks Ci, by multiplying each Ci by τ^{il}. We obtain a {τ, Q(τ) + i, Q(τ) + j}-expansion of ζ that has

O( g q^g (N0^α / log(q^g N0^α)) N0^{1−α} ) = O( g q^g N0 / log(q^g N0^α) )

terms, and in which the exponents of Q(τ) + i and Q(τ) + j are O(q^g N0^α).
Now, since
assuming that g and q are fixed, we get that our number of terms in the end is indeed

O( g q^g (g + log N(ζ)) / (α log log N(ζ)) ) = O( g q^g log N(ζ) / log log N(ζ) )

as required.
ζ from Theorem 2, to compute (Q(τ) + i)^a (Q(τ) + j)^b D for all powers a, b in the representation, use these to compute each term in the representation multiplied by D, and then to add these together. We will prove that the sublinearity holds for fixed g and q as long as α is selected satisfying 0 < α < 1/2.
Theorem 3. Algorithm 1 requires o(g q^g log N(ζ)) divisor additions for fixed g and q, i.e., the required number of divisor additions is sublinear in log N(ζ).
1. Step 1: this step does not require any divisor additions. We give here an order of magnitude of i and j. By Chebyshev's theorem, there is a prime number between n and 2n for any integer n. Thus, we know there is a prime number between q^g/2 and q^g, and between q^g and 2q^g, and we can bound |i| and |j| by q^g.
2. Steps 2-3: these steps also require no divisor additions. Note that the greedy algorithm of [3] can be used to compute the double-base representations of the c_{ij}, and that consequently A, B ∈ O(log^α N(ζ)).
3. Step 4: we compute A divisors, each one being derived from the previous one by applying Q(τ) + i to it. Applying Q(τ) + i to a divisor D can be done as follows:
(a) Compute rD for every r ≤ 2q^g. Since the absolute value of every coefficient in Q(τ) is smaller than 2^{2g} q^{g/2} (see [15, p. 378]) and |i| ≤ q^g, every coefficient of Q(τ) + i is bounded by 2q^g. This step requires O(q^g) divisor additions.
Sublinear Scalar Multiplication on Hyperelliptic Koblitz Curves 407
Now, since α < 1/2, the total number of divisor additions in Steps 4 and 5 is o(q^g log N(ζ)). Thus, the number of divisor additions for the entire algorithm is o(q^g log N(ζ)).
Note that, although the number of divisor additions required is sublinear, the overall bit complexity of the algorithm is linear in q^g log N(ζ). This is due to the cost of computing the representation in Step 2. The first step of Theorem 2 is to compute the τ-adic expansion of ζ, which has complexity O(log N(ζ)). In addition, double-base expansions of the c_{ij} must be computed. From [4], the bit complexity of each of these operations is in O(log c_{ij}) = O(q^g N0^α). There are 2g × M = 2g N0^{1−α} of the c_{ij}, so the total bit complexity is in O(g q^g N0) = O(g q^g log N(ζ)).
A straightforward application of this algorithm to ζ = m ∈ Z allows one to compute mP in o(g q^g log m) divisor additions, as log N(m) = log √(g m^2) ∈ O(log m). However, in the case that m is of the usual size used for cryptographic applications, namely O((q^n)^g), we can do better by first reducing it modulo τ^n − 1, as τ^n(D) = D in Pic0(C(F_{q^n})). If, as is also usual in cryptographic applications, arithmetic is restricted to a large prime order subgroup of Pic0(C(F_{q^n})), we can reduce the scalar by (τ^n − 1)/(τ − 1) (see [11, p. 65]). For M ≡ m (mod (τ^n − 1)/(τ − 1)), we get log N(M) = O(n + 2g) = O(log m/(g log q) + 2g), as we are assuming
In [5], the authors also devise a scalar multiplication algorithm for elliptic Koblitz
curves using {τ, τ − 1}-expansions that is designed to work well in practice. Even
though these bases have the same norm, and thus cannot be proved to yield
sublinear length representations using the results of [14], they were selected
because they are as cheap as possible to apply to a given point (0 or 1 addition
required). A greedy algorithm to compute representations is too expensive, as
it is not known how to efficiently compute the closest {τ, τ − 1} number to a
given element in Z[τ ]. Hence, a blocking strategy is used, in which each short
block of a τ -adic representation is replaced by a pre-computed optimal {τ, τ − 1}
representation. Assuming that these bases do yield sublinear representations, it
is proved that the strategy yields a sublinear algorithm, and numerical results
were presented demonstrating its efficiency in practice.
We attempt to follow the same strategy with hyperelliptic Koblitz curves,
using bases τ and τ − 1, and terms of the form r_i τ^{a_i} (τ − 1)^{b_i}, with r_i ∈ R = {0, ±1, . . . , ±(q^g − 1)/2}. Our algorithm computes the τ-adic expansion of a given scalar ζ ∈ Z[τ] using Algorithm 5.19 of [11] and cuts this representation into d blocks of fixed size w. Each block corresponds to an element of Z[τ], and we can write:

ζ = Σ_{i=0}^{d−1} N_i τ^{iw}.
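The cutting step can be illustrated (our own sketch) with an ordinary integer base q standing in for τ, since only the digit bookkeeping matters here:

```python
def digits_base(m, q):
    """Base-q digits of m, least significant first."""
    ds = []
    while m:
        ds.append(m % q)
        m //= q
    return ds

def blocks(ds, w, q):
    """Cut a digit list into blocks of width w; block i has value N_i."""
    return [sum(d * q**j for j, d in enumerate(ds[i:i + w]))
            for i in range(0, len(ds), w)]

q, w, m = 5, 3, 123456789
Ns = blocks(digits_base(m, q), w, q)
assert sum(N * q**(i * w) for i, N in enumerate(Ns)) == m   # m = sum N_i q^{iw}
```

In the algorithm proper each block N_i is then replaced by its precomputed optimal {τ, τ − 1}-representation rather than left as a digit string.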
in the general conjecture for hyperelliptic curves is based on this evidence. Work
is underway to produce supporting numerical evidence for genus 2.
Assuming Conjecture 1, we obtain the following theorem.
Theorem 4. For fixed g and q, and assuming Conjecture 1, every ζ ∈ Z[τ ] with
a finite τ -adic expansion using digits in R can be represented as the sum of at
most O(log N (ζ)/ log log N (ζ)) {τ, τ − 1}-integers such that the largest power of
τ − 1 is O(log N (ζ)/ log log N (ζ)).
Proof. The method used here is exactly the same as the proof of Theorem 5 of [5]:
cut the τ -adic representation of ζ in log log N (ζ) blocks, each of which is of length
O(log N (ζ)/ log log N (ζ)). If each block is replaced by a representation with a
sublinear number of terms, which is possible by Conjecture 1, then both the total
number of terms and the highest power of τ − 1 are in O(log N (ζ)/ log log N (ζ)).
The resulting algorithm is presented in Algorithm 2. We note that the bigger the block size, the better our algorithm performs, thanks to the optimality of the precomputed table. However, this has to be balanced against the increased storage cost of the larger table.
Our algorithm can be used for scalar multiplication as follows. To simplify the analysis we assume that the {τ, τ − 1}-expansion is of the form

ζ = Σ_l (τ − 1)^l Σ_i r_{i,l} τ^{a_{i,l}},

where max(a_{i,l}) is the maximal power of τ that is multiplied by (τ − 1)^l in the expansion and r_{i,l} ∈ R. We then denote r_l(ζ) = Σ_i r_{i,l} τ^{a_{i,l}}, and thus ζ = Σ_l (τ − 1)^l r_l(ζ). The algorithm is presented in Algorithm 3.
The number of divisor additions required to compute ζD is equal to the number of terms in the expansion plus max(a_i) + (q^g − 1)/2. Provided Conjecture 1 holds, Theorem 4 implies that the total number of divisor additions is sublinear in log ζ. As before, if we assume that integer scalars of size O(q^{ng}) are used, then reducing the scalar modulo (τ^n − 1)/(τ − 1) and applying the algorithm also requires a sublinear number of divisor additions.
This paper successfully generalizes ideas taken from the elliptic case to achieve
improved algorithms for computing m-folds of divisor classes on hyperelliptic
Koblitz curves. However, there are still a number of ways this work could be
expanded.
As indicated earlier, Algorithm 1 requires a provably (and unconditionally) sublinear number of divisor additions, but only in the asymptotic sense. The O-constants involved are almost certainly too large for it to be efficient in practice.
The blocking algorithm, Algorithm 3, is more promising for practical applica-
tions, but numerical experiments are required in order to determine its com-
plexity in practice. Most importantly, we need to determine whether sufficiently
short (τ, τ − 1)-representations of all length w τ -adic numbers can be found,
i.e., providing numerical evidence in support of Conjecture 1. If such short rep-
resentations can be found, then the algorithm should compare favorably to the
methods in [11], and a careful implementation, possibly generalizing ideas in [6]
to compute (τ − 1)D efficiently, will be required. This is work in progress.
We also base our result on the hypothesis that all τ -adic expansions that we
consider are finite. Although this is the case for many hyperelliptic Koblitz curves,
the possibility of periods arising can be a concern in practice. We still have
to come up with ways to understand those periods better, and devise efficient
methods to deal with them or to avoid them completely.
Finally, we note that our double-base algorithm requires a modest amount
of storage in order to achieve computational improvements. Although the precomputed table used can be viewed as part of the domain parameters, since it does not depend on m or on the divisor class D, an efficient memory-free divisor multiplication algorithm, such as that of [1] in the case of elliptic Koblitz curves, remains to be found; it would allow the most memory-constrained systems to enjoy this speedup as well.
References
1. Avanzi, R., Dimitrov, V., Doche, C., Sica, F.: Extending Scalar Multiplication Us-
ing Double Bases. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284,
pp. 130–144. Springer, Heidelberg (2006)
2. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.
(eds.): Handbook of elliptic and hyperelliptic curve cryptography. Discrete Math-
ematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton
(2006); MR2162716 (2007f:14020)
3. Dimitrov, V.S., Jullien, G.A., Miller, W.C.: An algorithm for modular exponenti-
ation. Inform. Process. Lett. 66(3), 155–159 (1998); MR 1627991 (99d:94023)
4. Dimitrov, V.S., Imbert, L., Zakaluzny, A.: Multiplication by a constant is sublinear.
In: IEEE Symposium on Computer Arithmetic 2007, pp. 261–268 (2007)
5. Dimitrov, V.S., Järvinen, K.U., Jacobson Jr., M.J., Chan, W.F., Huang, Z.: Prov-
ably sublinear point multiplication on Koblitz curves and its hardware implemen-
tation. IEEE Trans. Comput. 57(11), 1469–1481 (2008); MR2464687 (2009j:68053)
6. Doche, C., Kohel, D.R., Sica, F.: Double-Base Number System for Multi-scalar
Multiplications. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 502–
517. Springer, Heidelberg (2009)
7. Enge, A.: Computing discrete logarithms in high-genus hyperelliptic Jacobians in
provably subexponential time. Math. Comp. 71(238), 729–742 (2002); (electronic).
MR 1885624 (2003b:68083)
8. Günther, C., Lange, T., Stein, A.: Speeding up the Arithmetic on Koblitz Curves
of Genus Two. In: Stinson, D.R., Tavares, S. (eds.) SAC 2000. LNCS, vol. 2012,
pp. 106–117. Springer, Heidelberg (2001); MR 1895585 (2003c:94024)
9. Koblitz, N.: Elliptic curve cryptosystems. Math. Comp. 48(177), 203–209 (1987);
MR 866109 (88b:94017)
10. Koblitz, N.: CM-Curves with Good Cryptographic Properties. In: Feigenbaum, J.
(ed.) CRYPTO 1991. LNCS, vol. 576, pp. 279–287. Springer, Heidelberg (1992)
11. Lange, T.: Efficient arithmetic on hyperelliptic curves, Ph.D. thesis, Universität-
Gesamthochschule Essen, Essen, Germany (2001)
12. Miller, V.S.: Use of Elliptic Curves in Cryptography. In: Williams, H.C. (ed.)
CRYPTO 1985. LNCS, vol. 218, pp. 417–426. Springer, Heidelberg (1986)
13. Solinas, J.A.: Efficient arithmetic on Koblitz curves. Des. Codes Cryptogr. 19(2-3),
195–249 (2000)
14. Tijdeman, R.: On the maximal distance between integers composed of small primes.
Compositio. Math. 28, 159–162 (1974); MR 0345917 (49 #10646)
15. Vercauteren, F.: Computing Zeta Functions of Hyperelliptic Curves Over Finite
Fields of Characteristic 2. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442,
pp. 369–384. Springer, Heidelberg (2002)
Faster Hashing to G2
1 Introduction
Let E be an elliptic curve defined over Fp and let r be a large prime divisor
of #E(Fp ). The embedding degree of E (with respect to r, p) is the smallest
positive integer k such that r | p^k − 1. The Tate pairing on ordinary elliptic curves maps two linearly independent rational points from the order-r groups G1, G2 ⊆ E(F_{p^k}) to the group of r-th roots of unity of the finite field F_{p^k}. In practice, the Tate pairing is computed using variations of an iterative algorithm that was proposed by Victor Miller in 1986 [21]. The result lies in the quotient group F*_{p^k}/(F*_{p^k})^r, and a final exponentiation follows in order to obtain a unique representative.
Efficient realizations of the Tate pairing have been intensively pursued in
recent years. Using different strategies, that research effort has produced sev-
eral remarkable algorithm improvements that include: construction of pairing-
friendly elliptic curves with prescribed embedding degree [4,8,23], decreases of
the Miller loop length [3,13,14,29], and reductions in the associated towering
field arithmetic costs [6,11,15,17].
A. Miri and S. Vaudenay (Eds.): SAC 2011, LNCS 7118, pp. 412–430, 2012.
© Springer-Verlag Berlin Heidelberg 2012
With the increase in efficiency of the Miller loop calculation, the final expo-
nentiation step has become more of a computational bottleneck. Several research
works have reported more refined methods for computing the final exponentia-
tion on pairings defined over ordinary elliptic curves [6,12,26]. In particular, the
results by Scott et al. [26] represent the current state-of-the-art in this topic, as
can be verified from the fact that most recent implementations of pairings (see
for example [1,5]) have obtained significant accelerations by computing the final
exponentiation according to the vectorial addition chain based method described
in that work.
Another important task related to pairing computation that has been less
studied is the problem of generating random points in G1 and G2 , known in the
literature as hashing to G1 and hashing to G2 , respectively. The group G1 is
defined as E(Fp )[r]. Hashing to G1 is normally seen as a straightforward task,
whereas hashing to G2 is considered more challenging.
The customary method for representing G2 is as the order-r subgroup of Ẽ(F_{p^{k/d}}), where Ẽ is the degree-d twist of E over F_{p^{k/d}} with r | #Ẽ(F_{p^{k/d}}); here #S denotes the cardinality of S. Hashing to G2 can be accomplished by finding a random point Q ∈ Ẽ(F_{p^{k/d}}) followed by a multiplication by c = #Ẽ(F_{p^{k/d}})/r.
The main difficulty of this hashing is that c is normally a relatively large scalar
(for example, larger than p). Galbraith and Scott [10] reduce the computational
cost of this task by means of an endomorphism of Ẽ. This idea was further
exploited by Scott et al. [27], where explicit formulae for hashing to G2 were
given for several pairing-friendly curves.
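As an illustration of how large the scalar c is, the BN family can be checked directly, using the BN polynomials that appear later in Section 6.1 (where ñ(x) = r(x)·c(x) with c = p + t − 1) and the example parameter x from Section 4. This is an illustrative sketch, not production code:

```python
# Size of the cofactor c = #E~(F_{p^2})/r for hashing to G2 on a BN curve,
# using the BN polynomials given in Section 6.1 and the twist order
# n~ = (p + t - 1) * r.  The parameter x is the example value from Section 4.
x = -(2**62) - 2**54 + 2**44
p = 36*x**4 + 36*x**3 + 24*x**2 + 6*x + 1
r = 36*x**4 + 36*x**3 + 18*x**2 + 6*x + 1
t = 6*x**2 + 1
assert p + 1 - t == r            # BN curves have prime order: #E(Fp) = r
c = p + t - 1                    # the G2 hashing cofactor
assert c > p                     # c is indeed larger than p ...
assert c.bit_length() == p.bit_length()   # ... with the same bit length
```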
In this work, we offer improvements in both the final exponentiation and hash-
ing to G2 . We draw on the methods that Vercauteren [29] employed to reduce the
cost of the Miller function. Our results for the final exponentiation reduce the
cost by a fixed number of operations in several curves, a modest but measurable
improvement. Nonetheless, the techniques we use can be applied to increase the
speed of hashing as well, saving a fixed number of point additions and doublings.
Our framework for fast hashing produces more substantial gains. For example,
we estimate that for BN curves [4] at the 128-bit security level, our results
yield a hashing algorithm that is at least twice as fast as the previous
fastest-known algorithm. For curves with higher embedding degrees, the savings
are even larger.
The rest of this paper is organized as follows. In Section 2 we review
Vercauteren's "optimal" pairings. Sections 3 and 4 present our lattice-based
method for computing the final exponentiation and exponentiation examples for
several emblematic pairing-friendly elliptic curves, respectively. Sections 5
and 6 give our lattice-based approach for hashing to G2 and hashing examples
for several families of elliptic curves, respectively.
2 Background
The Tate pairing is computed in two steps. First, the Miller function value f =
fr,P (Q) ∈ F∗pk is computed. This gives a value in the quotient group F∗pk /(F∗pk )r .
414 L. Fuentes-Castañeda, E. Knapp, and F. Rodrı́guez-Henrı́quez
f_{r,P}^{h(p)/r} = g_P · ∏_{i=0}^{s} f_{h_i, p^i P} · f_{p^i, P}^{h_i} ,
inversions, and very cheap p-th powerings in Fpk . Raising to the power d =
Φk (p)/r is considered more difficult.
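The split between the "easy" and the "hard" parts of the final exponentiation can be made explicit for k = 12: (p^12 − 1)/r = (p^6 − 1)·(p^2 + 1)·d with d = Φ12(p)/r = (p^4 − p^2 + 1)/r. A minimal numeric check of this factorization, using the BN polynomials of Section 6.1 at the toy value x = 1 (not a cryptographic parameter):

```python
# Split of the full exponent (p^12 - 1)/r into the "easy" part
# (p^6 - 1)(p^2 + 1), computed with Frobenius maps and one inversion, and the
# "hard" part d = Phi_12(p)/r.  Toy BN parameter x = 1 (p = 103, r = 97).
x = 1
p = 36*x**4 + 36*x**3 + 24*x**2 + 6*x + 1
r = 36*x**4 + 36*x**3 + 18*x**2 + 6*x + 1
phi12 = p**4 - p**2 + 1
assert phi12 % r == 0                    # r | Phi_12(p), so d is an integer
d = phi12 // r
assert (p**12 - 1) % r == 0
assert (p**12 - 1) // r == (p**6 - 1) * (p**2 + 1) * d
```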
Observing that p-th powering is much less expensive than multiplication, Scott
et al. [26] give a systematic method for reducing the expense of exponentiating
by d. They showed that by writing d = Φk(p)/r in base p as d = d_0 + d_1·p +
· · · + d_{ϕ(k)−1}·p^{ϕ(k)−1}, one can find short vectorial addition chains to
compute f ↦ f^d much more efficiently than the naive method. For parameterized curves,
more concrete results can be given. For instance, Barreto–Naehrig curves [4]
are constructed over a prime field Fp, where p is a large prime number that
can be parameterized as a fourth-degree polynomial p = p(x), x ∈ Z. The
result of Scott et al. gives an algorithm to compute f ↦ f^d by calculating
three intermediate exponentiations, namely f^x, (f^x)^x = f^{x^2}, and
(f^{x^2})^x = f^{x^3}, along with a short sequence of products. By choosing the
parameter x ∈ Z to have low Hamming weight, the total cost of computing
f ↦ f^d is approximately (3/4)·log p field squarings plus a small fixed number
of field multiplications and squarings.
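The base-p decomposition underlying this method can be seen numerically with plain divmod (which yields non-negative digits; Scott et al. actually work with signed polynomial digits, but the expansion principle is the same). A sketch at the toy BN parameter x = 1, so the numbers stay small:

```python
# Base-p expansion of the hard exponent d = Phi_12(p)/r for a toy BN
# parameter x = 1 (p = 103, r = 97): d = d0 + d1*p + d2*p^2 + d3*p^3.
x = 1
p = 36*x**4 + 36*x**3 + 24*x**2 + 6*x + 1
r = 36*x**4 + 36*x**3 + 18*x**2 + 6*x + 1
d = (p**4 - p**2 + 1) // r
digits, m = [], d
for _ in range(4):                  # phi(12) = 4 digits suffice since d < p^4
    m, di = divmod(m, p)
    digits.append(di)
assert m == 0
assert d == sum(di * p**i for i, di in enumerate(digits))
```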
Using the fact that a fixed power of a pairing is also a pairing, it suffices
to raise to the power of any multiple d′ of d, with r not dividing d′. Based
on this observation, we present a lattice-based method for determining d′ such
that f ↦ f^{d′} can be computed at least as efficiently as f ↦ f^d. For
Barreto–Naehrig and several other curves, explicit d′ polynomials yielding
more-efficient final exponentiation computations are reported. However, it is
noted that the main bottleneck remains, namely the exponentiation by powers of x.
In the case of parameterized curves, the key to finding suitable polynomials d′
is to consider Q[x]-linear combinations of d(x). Specifically, we consider
Q-linear combinations of d(x), x·d(x), . . . , x^{deg r−1}·d(x). To see why
this set of multiples of d(x) suffices, consider f ∈ Fpk with order dividing
Φk(p). Since r(x)d(x) = Φk(p(x)), it follows that f^{r(x)d(x)} = 1, and so
f^{x^{deg r}·d(x)} is a product of Q-powers of f^{d(x)}, f^{x·d(x)}, . . . ,
f^{x^{deg r−1}·d(x)}.
Now, consider an arbitrary Q-linear combination d′(x) of the elements d(x),
x·d(x), . . . , x^{deg r−1}·d(x). Following the method of Scott et al. [26],
d′(x) can be written in base p(x) as d′(x) = d_0(x) + d_1(x)·p(x) + · · · +
d_{ϕ(k)−1}(x)·p(x)^{ϕ(k)−1}, where each d_i has degree less than the degree of
p. Set d_i = d_{i,0} + x·d_{i,1} + · · · + x^{deg p−1}·d_{i,deg p−1} and assume
that d_{i,j} ∈ Z for 0 ≤ i ≤ ϕ(k) − 1, 0 ≤ j ≤ deg p − 1. Then f^{d′(x)} can be
computed in two steps, as explained next.
First, the exponentiations f^x, . . . , f^{x^{deg p−1}} are performed. From
these intermediate exponentiations, terms of the form f^{x^j p^i} can be easily
calculated. Second, a vectorial addition chain containing the d_{i,j}'s is
found. This allows one to compute f^{d′(x)} from terms of the form f^{x^j p^i}
using the work of Olivos [24]. The advantage of allowing multiples of d(x) for
this computation is to provide more flexibility in the choices of the exponents
d′(x) = Σ_{i,j} d_{i,j}·x^j·p^i with d_{i,j} ∈ Z, which can potentially yield
shorter addition chains and, in turn, a more efficient final exponentiation
calculation. However, the savings are necessarily modest since, as in the
method of Scott et al. [26], the main expense in this exponentiation process
comes from computing the terms f^x, . . . , f^{x^{deg p−1}}.
In order to find efficient polynomials d′(x), let us construct a rational
matrix M with dimensions deg r × ϕ(k)·deg p such that

( d(x), x·d(x), . . . , x^{deg r−1}·d(x) )ᵀ = M · ( (1, p(x), . . . , p(x)^{ϕ(k)−1})ᵀ ⊗ (1, x, . . . , x^{deg p−1})ᵀ ).
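One way to realize this construction programmatically is by successive polynomial division by p(x). The sketch below builds a rational M for the BN case (deg r = ϕ(12) = deg p = 4) with hand-rolled exact arithmetic over Q; its rows correspond to x^i·d(x), whereas Section 4.1 scales some rows to clear denominators:

```python
# Sketch: build the rational matrix M for BN curves by expanding x^i * d(x)
# in base p(x) via successive polynomial division.  Coefficient lists are
# ascending (constant term first); all arithmetic is exact over Q.
from fractions import Fraction

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += Fraction(ai) * bj
    return out

def pdivmod(num, den):
    num = [Fraction(c) for c in num]
    quo = [Fraction(0)] * max(1, len(num) - len(den) + 1)
    while len(num) >= len(den) and any(num):
        if num[-1] == 0:
            num.pop()
            continue
        shift = len(num) - len(den)
        coef = num[-1] / Fraction(den[-1])
        quo[shift] = coef
        for i, dc in enumerate(den):
            num[shift + i] -= coef * Fraction(dc)
        while num and num[-1] == 0:
            num.pop()
    return quo, num                      # quotient and remainder

p = [1, 6, 24, 36, 36]                   # p(x) = 36x^4 + 36x^3 + 24x^2 + 6x + 1
r = [1, 6, 18, 36, 36]                   # r(x) = 36x^4 + 36x^3 + 18x^2 + 6x + 1
p2 = pmul(p, p)
phi12 = pmul(p2, p2)                     # Phi_12(p(x)) = p^4 - p^2 + 1
for i, c in enumerate(p2):
    phi12[i] -= c
phi12[0] += 1
d, rem = pdivmod(phi12, r)
assert not any(rem)                      # r(x) divides Phi_12(p(x)) exactly

M = []
for i in range(4):                       # rows: x^i * d(x), i = 0 .. deg r - 1
    m, row = [Fraction(0)] * i + list(d), []
    for _ in range(4):                   # phi(12) = 4 base-p(x) digits
        m, digit = pdivmod(m, p)
        digit += [Fraction(0)] * (4 - len(digit))
        row += digit[:4]
    M.append(row)
    assert not any(m)                    # x^i * d(x) has exactly 4 digits
assert all(len(row) == 16 for row in M)
```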
4 Exponentiation Examples
4.1 BN Curves
BN curves [4] have embedding degree 12 and are parameterized by x such that

p(x) = 36x^4 + 36x^3 + 24x^2 + 6x + 1
r(x) = 36x^4 + 36x^3 + 18x^2 + 6x + 1

are both prime. The hard exponent d = Φ12(p)/r can be written in base p as
d = d_0 + d_1·p + d_2·p^2 + d_3·p^3, where

d_0 = −36x^3 − 30x^2 − 18x − 2
d_1 = −36x^3 − 18x^2 − 12x + 1
d_2 = 6x^2 + 1
d_3 = 1.
Scott et al. [26] applied the work of Olivos [24] to compute the map f ↦ f^d
using vectorial addition chains. From the above representation for d, vectorial
addition chains can be used to compute f ↦ f^d using 3 exponentiations by x,
13 multiplications, and 4 squarings.
For the method described in Section 3, consider multiples of d represented in
the base p with coefficients in Q[x]/(p(x)).
A 4 × 16 integer matrix M is found such that
( d(x), x·d(x), 6x^2·d(x), 6x^3·d(x) )ᵀ = M · ( (1, p(x), p(x)^2, p(x)^3)ᵀ ⊗ (1, x, x^2, x^3)ᵀ ).
The first row in M corresponds to the final exponentiation given by Scott et al.
[26]. Any non-trivial integer linear combination of the rows corresponds to an ex-
ponentiation. For computational efficiency, a linear combination with coefficients
as small as possible is desired.
None of the basis vectors returned by the LLL algorithm has an advantage
over [26]. However, if small integer linear combinations of the short vectors re-
turned by the LLL algorithm are considered, a multiple of d which corresponds
to a shorter addition chain could potentially be found. A brute force search of
linear combinations of the LLL basis yields 18 non-zero vectors with maximal
entry 12. Among these vectors we consider the vector

(1, 6, 12, 12, 0, 4, 6, 12, 0, 6, 6, 12, −1, 4, 6, 12),

which corresponds to the multiple d′ = (12x^3 + 12x^2 + 6x + 1) +
(12x^3 + 6x^2 + 4x)·p + (12x^3 + 6x^2 + 6x)·p^2 + (12x^3 + 6x^2 + 4x − 1)·p^3
= 2x(6x^2 + 3x + 1)·d. The final exponentiation which results can be computed
more efficiently without using addition chains.
First, the following exponentiations are computed:

f ↦ f^x ↦ f^{2x} ↦ f^{4x} ↦ f^{6x} ↦ f^{6x^2} ↦ f^{12x^2} ↦ f^{12x^3},

at the cost of 3 exponentiations by x, 3 squarings, and 1 multiplication.
Next, the terms a = f^{12x^3} · f^{6x^2} · f^{6x} and b = a · (f^{2x})^{−1} are
obtained using 3 more multiplications. Finally, the result f^{d′} is obtained as

[a · f^{6x^2} · f] · [b]^p · [a]^{p^2} · [b · f^{−1}]^{p^3},

for a total cost of 3 exponentiations by x, 10 multiplications, and 3 squarings.
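The exponent realized by this schedule can be checked against d = Φ12(p)/r numerically. Assuming the readings a = f^{12x^3 + 6x^2 + 6x} and b = a·f^{−2x}, and collecting terms, the schedule raises f to e = (12x^3 + 12x^2 + 6x + 1) + (12x^3 + 6x^2 + 4x)·p + (12x^3 + 6x^2 + 6x)·p^2 + (12x^3 + 6x^2 + 4x − 1)·p^3; a sketch at the toy value x = 1:

```python
# Check that the exponent realized by the final-exponentiation schedule is a
# valid multiple d' of d = Phi_12(p)/r (a power of a pairing is a pairing as
# long as r does not divide the multiplier).  Assumes the readings
# a = f^(12x^3 + 6x^2 + 6x) and b = a * f^(-2x); toy BN parameter x = 1.
x = 1
p = 36*x**4 + 36*x**3 + 24*x**2 + 6*x + 1
r = 36*x**4 + 36*x**3 + 18*x**2 + 6*x + 1
d = (p**4 - p**2 + 1) // r
A = 12*x**3 + 6*x**2 + 6*x               # exponent of a
B = A - 2*x                              # exponent of b
e = (A + 6*x**2 + 1) + B*p + A*p**2 + (B - 1)*p**3
assert e % d == 0                        # e is an exact multiple of d
assert e // d == 2*x*(6*x**2 + 3*x + 1)  # the multiplier 2x(6x^2 + 3x + 1)
assert (e // d) % r != 0                 # and it is nonzero modulo r
```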
are both prime. For d = Φk (p)/r, we compute an integer matrix M such that
( 3d(x), (3/7)·x·d(x), (3/49)·x^2·d(x), (3/49)·x^3·d(x), (3/49)·x^4·d(x), (3/49)·x^5·d(x) )ᵀ
    = M · ( (1, p(x), p(x)^2, p(x)^3, p(x)^4, p(x)^5)ᵀ ⊗ (1, x, x^2, x^3, x^4, x^5, x^6, x^7)ᵀ ).
{1, 2, 3, 5, 6, 7, 10, 14, 15, 19, 21, 25, 35, 38, 49, 73,
98, 108, 133, 147, 181, 245, 254, 343, 490, 505, 686}.
Once again, applying Olivos' method for computing a short vectorial addition
chain, we can compute the map f ↦ f^{d′} using 7 exponentiations by x, 52
multiplications, and 8 squarings.
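The set above is an addition sequence: after 1, every element is the sum of two (not necessarily distinct) earlier elements, which is exactly what Olivos' vectorial addition-chain machinery needs. A quick programmatic check, assuming the set as printed:

```python
# Verify the addition-sequence property of the set used for this
# final exponentiation: each element after 1 is a sum of two earlier ones.
seq = [1, 2, 3, 5, 6, 7, 10, 14, 15, 19, 21, 25, 35, 38, 49, 73,
       98, 108, 133, 147, 181, 245, 254, 343, 490, 505, 686]
for k, v in enumerate(seq[1:], start=1):
    earlier = set(seq[:k])
    assert any(v - a in earlier for a in earlier), v
```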
Table 1. A comparison of our final exponentiation method with the method of Scott et
al. [26]. ‘M’ denotes a multiplication and ‘S’ denotes a squaring. Both methods require
the same number of exponentiations by x, determined by the curve.
the exponentiation, the total cost can easily be computed from their work. The
operation counts are given for field multiplications and squarings only, since
the number of exponentiations by x is fixed for each curve and computing p-th
power maps is comparatively inexpensive.
For example, let us consider the case of BN curves parameterized with x =
−2^62 − 2^54 + 2^44, which yields a 127-bit security level [5]. Further, assume
that the relative cost of a field multiplication compared to a cyclotomic
squaring in Fpk is given as M ≈ 4.5S [1,15]. Then, the total cost to perform
the exponentiations f^x, (f^x)^x, ((f^x)^x)^x is of around 3 · log2 x ≈ 183
cyclotomic squarings. Using the results reported in Table 1, this gives an
approximate cost for the hard part of the final exponentiation of 187S + 13M ≈
245S for the method of Scott et al. and 186S + 10M ≈ 231S using our method.
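The arithmetic behind these estimates is easy to reproduce; the figure of 183 squarings for the three exponentiations by x is taken as given from the text:

```python
# Cost comparison for the hard part of the BN final exponentiation, with one
# field multiplication costed as M = 4.5 cyclotomic squarings S [1,15].
M = 4.5
scott = 187 + 13 * M     # 187S + 13M, method of Scott et al. [26]
ours = 186 + 10 * M      # 186S + 10M, this paper's method
assert scott == 245.5    # reported as approximately 245S
assert ours == 231.0
assert scott - ours == 14.5   # roughly 14 squarings saved per pairing
```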
Table 2. Possible values for the trace t̃ of the q-power Frobenius of a degree-d twist
Ẽ of E
d    2     3               4     6
t̃   −t̂   (±3f̂ − t̂)/2   ±f̂   (±3f̂ + t̂)/2
ψ 2 P − tψP + pP = ∞ (2)
for all P ∈ Ẽ(Fq ) [9, Theorem 1]. The cofactor c can be written as a polynomial
in p with coefficients less than p. Scott et al. use this representation of c and
reduce using (2) so that c is expressed as a polynomial in ψ with coefficients less
than p. For parameterized curves, the speedup in the cost of computing Q ↦ cQ
can become quite dramatic. For example, MNT curves have embedding degree
k = 6 and are parameterized by x such that
p(x) = x^2 + 1
r(x) = x^2 − x + 1
Proof. Since Ẽ(Fq) is cyclic and ψ fixes Ẽ(Fq), there exists an integer a such
that ψP = aP for all P ∈ Ẽ(Fq). By solving for a in (2) and using the fact that
t^2 − 4p = Df^2, we obtain

a ≡ (1/2)·(t ± √(t^2 − 4p)) ≡ (1/2)·(t ± √(Df^2)) ≡ (1/2)·(t ± f·√D)  (mod ñ).

Working modulo ñ, we observe that D·f̃^2 = t̃^2 − 4q ≡ t̃^2 − 4t̃ + 4 = (t̃ − 2)^2
and so √D ≡ ±(t̃ − 2)/f̃ (mod ñ). Without loss of generality, let f, f̃ be such
that a = (1/2)·(t + f·√D) and √D ≡ (t̃ − 2)/f̃ (mod ñ). Then, since P ∈ Ẽ(Fq) has
where a is the column vector with i-th entry −a^i. Consider the convex set
C ⊆ R^{ϕ(k)} generated by all vectors of the form (±|c|^{1/ϕ(k)}, . . . , ±|c|^{1/ϕ(k)}). The
volume of C is 2^{ϕ(k)}·|c| and the lattice above has volume |c|. By Minkowski's
Theorem [22], the region C contains a non-zero lattice point. Hence, there
exists a non-zero polynomial h with coefficients of absolute value at most
|c|^{1/ϕ(k)} such that h(a) ≡ 0 (mod c). This concludes the proof of Theorem 1.
6 Hashing Examples
6.1 BN Curves
BN curves are parameterized by
p(x) = 36x^4 + 36x^3 + 24x^2 + 6x + 1
r(x) = 36x^4 + 36x^3 + 18x^2 + 6x + 1
t(x) = 6x^2 + 1
f(x) = 6x^2 + 4x + 1

where

t(x)^2 − 4p(x) = −3f(x)^2
r(x) + t(x) = p(x) + 1
q(x) = p(x)^2.
We have that c(x) = p(x) + t(x) − 1 is such that ñ(x) = r(x)·c(x). Using
Lemma 2, we obtain

a(x) = (1/2)·(t + f·(t̃ − 2)/f̃)
     = −(1/5)·(3456x^7 + 6696x^6 + 7488x^5 + 4932x^4 + 2112x^3 + 588x^2 + 106x + 6).

As a sanity check, note that a(x) ≡ p(x) (mod r) and thus ψQ = a(x)Q =
p(x)Q for all Q ∈ Ẽ(Fq)[r].
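The sanity check can be carried out with exact rational polynomial arithmetic: a(x) − p(x) should be divisible by r(x) in Q[x]. A self-contained sketch, with coefficient lists stored ascending:

```python
# Verify a(x) = p(x) (mod r(x)) in Q[x] for the BN hashing endomorphism
# eigenvalue a(x), using exact Fraction arithmetic; coefficients ascending.
from fractions import Fraction as F

a = [F(c, -5) for c in [6, 106, 588, 2112, 4932, 7488, 6696, 3456]]
p = [F(c) for c in [1, 6, 24, 36, 36]]
r = [F(c) for c in [1, 6, 18, 36, 36]]

num = [ai - (p[i] if i < len(p) else 0) for i, ai in enumerate(a)]  # a - p
# reduce num modulo r by long division; the remainder must vanish
while len(num) >= len(r) and any(num):
    shift, coef = len(num) - len(r), num[-1] / r[-1]
    for i, rc in enumerate(r):
        num[shift + i] -= coef * rc
    while num and num[-1] == 0:
        num.pop()
assert not any(num)   # a(x) - p(x) is an exact multiple of r(x)
```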
We construct the following lattice and reduce the −a(x)^i entries modulo c(x):

⎡ c(x)     0 0 0 ⎤    ⎡ 36x^4 + 36x^3 + 30x^2 + 6x + 1   0 0 0 ⎤
⎢ −a(x)    1 0 0 ⎥ →  ⎢ (48/5)x^3 + 6x^2 + 4x − 2/5      1 0 0 ⎥ .
⎢ −a(x)^2  0 1 0 ⎥    ⎢ (36/5)x^3 + 6x^2 + 6x + 1/5      0 1 0 ⎥
⎣ −a(x)^3  0 0 1 ⎦    ⎣ 12x^3 + 12x^2 + 8x + 1           0 0 1 ⎦
6.3 KSS-8
KSS-8 curves [16] have embedding degree k = 8 and are parameterized by x
such that
r = r(x) = (1/450)·(x^4 − 8x^2 + 25)
p = p(x) = (1/180)·(x^6 + 2x^5 − 3x^4 + 8x^3 − 15x^2 − 82x + 125)
are both prime. Set q = p^{k/d} = p^2. There exists a degree-4 twist Ẽ(Fq) of order

ñ(x) = (1/72)·(x^8 + 4x^7 + 6x^6 + 36x^5 + 34x^4 − 84x^3 + 486x^2 + 620x + 193)·r(x).
Set c(x) = ñ(x)/r(x). After some work, we discover that ψ is such that ψQ = aQ
for all Q ∈ Ẽ(Fq), where

a = (1/184258800)·(−52523x^11 − 174115x^10 + 267585x^9 − 193271x^8
    − 325290x^7 + 15093190x^6 − 29000446x^5 − 108207518x^4
    + 235138881x^3 + 284917001x^2 − 811361295x − 362511175).
As we have done previously, we find a short basis for the lattice generated by
the matrix

⎡ c(x)     0 0 0 ⎤
⎢ −a(x)    1 0 0 ⎥ .
⎢ −a(x)^2  0 1 0 ⎥
⎣ −a(x)^3  0 0 1 ⎦
h(a) = (1/75)·(x^2 − 25)·c(x) = λ0 + λ1·a + λ2·a^2 + λ3·a^3
λ0·Q = −(x^2 + x)Q
λ1·Q = (x + 1)Q − 4Q
λ2·Q = 2(x + 1)Q + 4Q
λ3·Q = −2(x + 1)Q − 2Q
which requires three more additions and another doubling. Finally, we compute
6.4 KSS-18
λ0 = 5x + 18
λ1 = x^3 + 3x^2 + 1
λ2 = −3x^2 − 8x
λ3 = 3x + 1
λ4 = −x^2 − 2
λ5 = x^2 + 5x.
We construct the addition chain {1, 2, 3, 5, 8, 10, 18}, from which we can
compute Q ↦ h(a)Q using sixteen additions, two doublings, three
multiplications by x, and five ψ maps.
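This chain is small enough to check by hand, but a programmatic check is still useful: it should be a valid addition chain, and it should contain every coefficient magnitude appearing in the λ_i above (1, 2, 3, 5, 8, 18):

```python
# The KSS-18 hashing chain {1,2,3,5,8,10,18}: verify it is an addition chain
# and that it covers every coefficient magnitude of lambda_0 .. lambda_5.
chain = [1, 2, 3, 5, 8, 10, 18]
coeff_magnitudes = {1, 2, 3, 5, 8, 18}   # from the lambda_i listed above
assert coeff_magnitudes <= set(chain)
for k, v in enumerate(chain[1:], start=1):
    earlier = set(chain[:k])
    assert any(v - a in earlier for a in earlier), v
```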
Table 3. A comparison of our hashing algorithm with the hashing algorithm of Scott
et al. ‘A’ denotes a point addition, ‘D’ denotes a point doubling, ‘X’ denotes a multi-
plication by x, and ‘ψ’ denotes an application of the map ψ.
other hand, the identity ψ^{k/2}P = −P is critically used in the eprint version
[28] of their work. Fortunately, all curves except the MNT curve can be
explicitly shown to satisfy the identity ψ^{k/2}P = −P. In practice, we have
found that MNT curves also satisfy this property. More work needs to be done to
determine the structure of the twist and the action of ψ on various subgroups
of the twist.
We use the eprint version [28] to represent Scott et al.'s operation counts
on Freeman curves. We have verified that the identity Φk(ψ)P = ∞ holds for
BN, KSS-8, and KSS-18 curves, and use the counts from the proceedings version
[27] of their work for those curves in Table 3. Since the multiplications by x
dominate the other operations, it can be seen that our hashing algorithm is
approximately twice as fast as that of Scott et al. for BN curves. For the
KSS-8 curve we see a 5/2-fold improvement, and for the KSS-18 curves a
7/3-fold improvement.
7 Conclusion
We have shown that both the final exponentiation and the hashing to G2 tasks
can be efficiently performed by adapting the lattice-based framework that
Vercauteren utilized in [29] for finding optimal pairings. Let us recall that
an optimal pairing as defined in [29] computes the Miller loop in just
log2 r/ϕ(k) iterations.
Scott et al. [26] showed that by writing d = Φk(p)/r in base p as d =
d_0 + d_1·p + · · · + d_{ϕ(k)−1}·p^{ϕ(k)−1}, one can find short vectorial
addition chains to efficiently compute the hard part of the final
exponentiation f ↦ f^d. This work presents a lattice-based method for
determining a multiple d′ of d, with r not dividing d′, such that f ↦ f^{d′}
can be computed at least as efficiently as f ↦ f^d, where d′(x) is written in
base p(x) as d′(x) = d_0(x) + d_1(x)·p(x) + · · · + d_{ϕ(k)−1}(x)·p(x)^{ϕ(k)−1}.
In Theorem 1 it was proved that there exists a polynomial h(z) = h_0 + h_1·z +
· · · + h_{ϕ(k)−1}·z^{ϕ(k)−1} ∈ Z[z] such that every point P ∈ Ẽ(Fq) can be
hashed to G2 by computing h(ψ)P, where |h_i|^{ϕ(k)} ≤ #Ẽ(Fq)/r for all i.
Vercauteren's lattice-based framework reveals the crucial role that ϕ(k) plays
in defining upper bounds on the optimal length of the Miller loop, and on the
computational effort of the final exponentiation and of hashing to G2. This
leads us to conclude that the optimal solutions of these three problems are
tightly related on an eternal golden braid.
References
1. Aranha, D.F., Karabina, K., Longa, P., Gebotys, C.H., López, J.: Faster Explicit
Formulas for Computing Pairings over Ordinary Curves. In: Paterson, K.G. (ed.)
EUROCRYPT 2011. LNCS, vol. 6632, pp. 48–68. Springer, Heidelberg (2011)
2. Barreto, P.S.L.M., Galbraith, S., ÓhÉigeartaigh, C., Scott, M.: Efficient pairing
computation on supersingular Abelian varieties. Designs, Codes and Cryptogra-
phy 42(3), 239–271 (2007)
3. Barreto, P.S.L.M., Kim, H.Y., Lynn, B., Scott, M.: Efficient Algorithms for Pairing-
Based Cryptosystems. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp.
354–368. Springer, Heidelberg (2002)
4. Barreto, P.S.L.M., Naehrig, M.: Pairing-Friendly Elliptic Curves of Prime Or-
der. In: Preneel, B., Tavares, S. (eds.) SAC 2005. LNCS, vol. 3897, pp. 319–331.
Springer, Heidelberg (2006)
5. Beuchat, J.-L., González-Dı́az, J.E., Mitsunari, S., Okamoto, E., Rodrı́guez-
Henrı́quez, F., Teruya, T.: High-speed Software Implementation of the Optimal
Ate Pairing over Barreto–Naehrig Curves. In: Joye, M., Miyaji, A., Otsuka, A.
(eds.) Pairing 2010. LNCS, vol. 6487, pp. 21–39. Springer, Heidelberg (2010)
6. Devegili, A.J., Scott, M., Dahab, R.: Implementing Cryptographic Pairings over
Barreto-Naehrig Curves. In: Takagi, T., Okamoto, T., Okamoto, E., Okamoto, T.
(eds.) Pairing 2007. LNCS, vol. 4575, pp. 197–207. Springer, Heidelberg (2007)
7. Freeman, D.: Constructing Pairing-Friendly Elliptic Curves with Embedding De-
gree 10. In: Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp.
452–465. Springer, Heidelberg (2006)
8. Freeman, D., Scott, M., Teske, E.: A Taxonomy of Pairing-Friendly Elliptic Curves.
Journal of Cryptology 23(2), 224–280 (2010)
9. Galbraith, S.D., Lin, X., Scott, M.: Endomorphisms for Faster Elliptic Curve Cryp-
tography on a Large Class of Curves. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS,
vol. 5479, pp. 518–535. Springer, Heidelberg (2009)
10. Galbraith, S.D., Scott, M.: Exponentiation in Pairing-Friendly Groups Using
Homomorphisms. In: Galbraith, S.D., Paterson, K.G. (eds.) Pairing 2008. LNCS,
vol. 5209, pp. 211–224. Springer, Heidelberg (2008)
11. Granger, R., Scott, M.: Faster Squaring in the Cyclotomic Subgroup of Sixth
Degree Extensions. In: Nguyen, P.Q., Pointcheval, D. (eds.) PKC 2010. LNCS,
vol. 6056, pp. 209–223. Springer, Heidelberg (2010)
12. Hankerson, D., Menezes, A., Scott, M.: Software Implementation of Pairings. In:
Identity-Based Cryptography, ch.12, pp. 188–206 (2009)
13. Hess, F.: Pairing Lattices. In: Galbraith, S.D., Paterson, K.G. (eds.) Pairing 2008.
LNCS, vol. 5209, pp. 18–38. Springer, Heidelberg (2008)
14. Hess, F., Smart, N., Vercauteren, F.: The Eta Pairing Revisited. IEEE Transactions
on Information Theory 52(10), 4595–4602 (2006)
15. Karabina, K.: Squaring in Cyclotomic Subgroups (2010) (manuscript),
http://eprint.iacr.org/2010/542
16. Kachisa, E.J., Schaefer, E.F., Scott, M.: Constructing Brezing-Weng Pairing-
Friendly Elliptic Curves Using Elements in the Cyclotomic Field. In: Galbraith,
S.D., Paterson, K.G. (eds.) Pairing 2008. LNCS, vol. 5209, pp. 126–135. Springer,
Heidelberg (2008)
17. Koblitz, N., Menezes, A.: Pairing-Based Cryptography at High Security Levels. In:
Smart, N.P. (ed.) Cryptography and Coding 2005. LNCS, vol. 3796, pp. 13–36.
Springer, Heidelberg (2005)
18. Lee, E., Lee, H.-S., Park, C.-M.: Efficient and Generalized Pairing Computation
on Abelian Varieties. IEEE Transactions on Information Theory 55(4), 1793–1803
(2009)
19. Lenstra, A.K., Lenstra Jr., H.W., Lovász, L.: Factoring Polynomials with Rational
Coefficients. Mathematische Annalen 261(4), 515–534 (1982)
20. Menezes, A.: Elliptic Curve Public Key Cryptosystems. Kluwer Academic Publish-
ers (1993)
21. Miller, V.S.: The Weil Pairing, and Its Efficient Calculation. Journal of Cryptol-
ogy 17(4), 235–261 (2004)
22. Minkowski, H.: Geometrie der Zahlen. Druck und Verlag von B.G. Teubner,
Leipzig und Berlin (1910)
23. Miyaji, A., Nakabayashi, M., Takano, S.: New Explicit Conditions of Elliptic-Curve
Traces for FR-reduction. IEICE Trans. Fundamentals E84, 1234–1243 (2001)
24. Olivos, J.: On Vectorial Addition Chains. Journal of Algorithms 2(1), 13–21 (1981)
25. Pereira, G.C.C.F., Simplício Jr., M.A., Naehrig, M., Barreto, P.S.L.M.: A Family
of Implementation-Friendly BN Elliptic Curves. Journal of Systems and Software
(to appear, 2011)
26. Scott, M., Benger, N., Charlemagne, M., Dominguez Perez, L.J., Kachisa, E.J.: On
the Final Exponentiation for Calculating Pairings on Ordinary Elliptic Curves. In:
Shacham, H., Waters, B. (eds.) Pairing 2009. LNCS, vol. 5671, pp. 78–88. Springer,
Heidelberg (2009)
27. Scott, M., Benger, N., Charlemagne, M., Dominguez Perez, L.J., Kachisa, E.J.:
Fast Hashing to G2 on Pairing-Friendly Curves. In: Shacham, H., Waters, B. (eds.)
Pairing 2009. LNCS, vol. 5671, pp. 102–113. Springer, Heidelberg (2009)
28. Scott, M., Benger, N., Charlemagne, M., Dominguez Perez, L.J., Kachisa, E.J.: Fast
Hashing to G2 on Pairing-Friendly Curves, http://eprint.iacr.org/2008/530
29. Vercauteren, F.: Optimal Pairings. IEEE Transactions on Information Theory 56(1),
455–461 (2010)