0% found this document useful (0 votes)
25 views

Cybersecurity Crypto

Uploaded by

Stefan Etienne
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
25 views

Cybersecurity Crypto

Uploaded by

Stefan Etienne
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 18

Cryptology: From Caesar Ciphers to Public-Key Cryptosystems

Dennis Luciano
Gordon Prichett

The College Mathematics Journal, January 1987, Volume 18,


Number 1, pp. 2–17.
Dennis Luciano is currently Associate Professor and Chairman of the Department of
Mathematics and Computer Science of Western New England College in Springfield,
Massachusetts, where he has been since 1977. Prior to this, he had appointments at
Saint Joseph’s University (Philadelphia, Pa) and LeMoyne College (Syracuse, NY).
He received an M.A. and Ph.D. (in commutative algebra under the direction of David
Lissner) from Syracuse University. Professor Luciano is a referee for the College
Mathematics Journal, a member of the Executive Committee of the Northeastern Section
of the MAA, and an evaluator for the New England Association of Schools and
Colleges. His current mathematical interests include number theory, graph theory, and
applications of mathematics at the undergraduate level. His nonmathematical interests
include running and basketball.
Gordon Prichett is Professor of Mathematics at Babson College. He received his B.A.
from Williams College in 1963 and his Ph.D. from the University of Wisconsin,
Madison in 1970. After teaching mathematics and mathematics education at San Diego
State University, he moved to Hamilton College where he served as chairman. He has
since taught at Wellesley College and Babson. He has published a variety of papers on
number theory and computer applications. Dr. Prichett has spent one year as a visiting
professor in the Biology Department at the University of York, England, studying
population biology. He has an active interest in curriculum development and alternative
approaches to instruction, and is coauthor of A Supplement to Calculus: One and
Several Variables, which adapts introductory calculus to the self-paced mode of
instruction.
Few persons can be made to believe that it is not quite an easy thing to invent a
method of secret writing that shall baffle investigation. Yet it may be roundly
asserted that human ingenuity cannot concoct a cipher which human ingenuity
cannot resolve.
Edgar Allen Poe [7]

C
ryptography, from the Greek ‘kryptos’ (hidden) and ‘graphein’ (to write), is the
art and science of making communications unintelligible to all except the
intended recipients. Its existence through the ages was usually confined to
diplomatic and military circumstances, where it was used for the concealment of
information communicated over secure and insecure lines. Today, there is an urgent need
to provide cost-effective, efficient, and secure systems to protect the vast quantity of
digital data stored and communicated by electronic data-processing systems. With the
growth in electronic fund transfers, instant electronic mail, point-of-sale terminals, home
banking, and conferencing through computers, the threat of unauthorized accessibility to
this data becomes a pressing concern of our society.
The science of reading secret messages and uncovering the cryptographic system
utilized is called cryptanalysis. This science played a vital role in the conclusion of the
Second World War. The early breaking of the Purple Code of Japan allowed the Allied
forces to continually read secret messages pertaining to strategic movements in the
Pacific; the recovery of an Enigma Machine from a sunken German submarine did the
same for the Allied command in Europe [9], [11].
Edgar Allen Poe fancied himself a skilled cryptanalyst. But there is some doubt about
the defensability of his quoted dictum. Cryptanalysis reveals deficiencies in existing
cryptographic systems. Improved cryptographic systems then pose new problems for the
cryptanalyst. The intent of this survey is to discuss the mutually reactive relationship
between the two areas, with an emphasis on the underlying mathematics.
Linear Ciphers
A cipher is a system which transforms plaintext into ciphertext by applying a set of
transformations to each character (or letter ) in the plaintext. The particular
transformations employed at any time are controlled by a “key” used at that time.
Security of the ciphertext rests heavily on the secrecy of the key; it is the objective of
the cryptanalyst to find the key and consequently break the system.
Caesar Ciphers. One of the earliest known cryptographic systems was used by Julius
Caesar and is appropriately referred to as a Caesar Cipher [26]. Around 50 B.C., Julius
Caesar wrote to Marcus Cicero, using a cipher that shifts the alphabet three places to the
right and wraps the last three letters X, Y, Z back onto the first three letters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
D E F G H I J K L M N O P Q R S T U V W XY Z A B C
Thus, the plaintext message
MEET YOU IN ORLANDO
is transformed into the ciphertext
PHHW BRX LQ RUODQGR.
Such a transformation, using modular arithmetic, can also be performed by a computer.
Any message can be expressed digitally based on the one-to-one correspondence,
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
To encipher the plaintext message, we use the transformation
EsMd ; sM 1 3dsmodulo 26d,
where M is the numeric equivalent of a plaintext letter. To decipher, we utilize the
transformation
DsCd ; sC 1 23dsmodulo 26d,
where C is the numeric equivalent of a ciphertext letter. Hence:
O R L A N D O
14 17 11 00 13 03 14
E: D:
2
17 20 14 03 16 06 17
R U O D Q G R
It is not known why Caesar selected 3 as the key to his cipher system; he could have
chosen any integer value. Since we are operating modulo 26, there are 26 distinct keys,
one of which (the identity) offers no secrecy at all. A message enciphered by a Caesar
Cipher is extremely insecure since exhaustive cryptanalysis using the 25 nontrivial keys
is easily performed.
Decimation and Linear Ciphers. A Decimation Cipher is similar to a Caesar Cipher but
it uses multiplication, rather than addition, by a number key. In order to assure a one-to-
one correspondence among the letters of the alphabet, the key number must be relatively
prime to 26. fFor example, the multiplier 2 yields EsAd 5 EsNd 5 A, since
0 ; 2s13d(mod 26d.g Thus, the enciphering transformation must be
EsMd ; kMsmod 26d,
where M is the numeric value of a plaintext letter, and the key number k is an odd
integer which is not a multiple of 13. If k 5 3, for example, the cipher alphabet is
obtained by starting with A and selecting every third letter that follows in the plain
alphabet as we cycle through it. The deciphering algorithm is then
DsCd ; 9Csmod 26d,
where C is the numeric equivalent of a ciphertext letter. In general, the deciphering key
is the multiplicative inverse modulo 26 of the enciphering key k (the inverse must exist
since k is relatively prime to 26). A decimation cipher offers no more security than a
Caesar Cipher since there are only twelve distinct keys.
Decimation and Caesar Ciphers are subcases of a more general class of ciphers called
linear or affine ciphers. A linear cipher is defined by the enciphering transformation
EsMd ; skM 1 tdsmod 26d, (1)
where k, t are integers and k is relatively prime to 26. There are (12) (26) 5 312 distinct
linear ciphers.
Suppose the following ciphertext was generated using a linear cipher:
YHTQF SCUFD SBULX IOLBF ALYZT IDSCL YCSDO
FZYCU FAFMF ODITF YCDKV SBICD XBXCF TX.
We can compare the frequency distribution of the letters occurring in this ciphertext
(Table 1) with the expected frequency of occurrence of letters in the English Language
(Table 2).
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
2 4 7 6 0 9 0 1 4 0 1 4 1 0 3 0 1 0 5 4 3 1 0 4 5 2
Table 1. Frequencies of letters in ciphertext.

3
Letter Frequency Probability Letter Frequency Probability
A 321712 .0804 N 283561 .0709
B 61472 .0154 O 303844 .0760
C 122403 .0306 P 79845 .0200
D 159726 .0399 Q 4226 .0011
E 500334 .1251 R 244867 .0612
F 92100 .0230 S 261470 .0654
G 78434 .0196 T 370072 .0925
H 219481 .0549 U 108516 .0271
I 290559 .0726 V 39504 .0099
J 6424 .0016 W 76673 .0192
K 26972 .0067 X 7779 .0019
L 165559 .0414 Y 69334 .0173
M 101339 .0253 Z 3794 .0009

TABLE 2. Individual letter frequencies in 4 million characters of English text [6]


(Based on a sample of 8000 excerpts of 500 letters taken from the Brown University
Corpus of Present-Day American English).

The two letters which occur most frequently in the English Language are E and T, in
that order. Therefore, it is reasonable to guess that F, which occurs most frequently in
the ciphertext (see Table 1), corresponds to E, while the second most frequent letter C in
this ciphertext corresponds to T. Based on the assumption that E(4) 5 5 and E(19) 5 2,
we use (1) to generate the linear congruences
4k 1 t ; 5 smod 26d
19k 1 t ; 2 smod 26d.
The solution to this system is k 5 5 and t 5 11, obtained via elementary techniques
described in most number theory texts. Therefore, the conjectured enciphering
transformation would be
EsMd ; s5M 1 11dsmod 26d.
Since the multiplicative inverse modulo 26 of 5 is 21, and the additive inverse modulo
26 of 11 is 15, the corresponding deciphering transformation would be
DsCd ; 21sC 1 15dsmod 26d.
Based on this, we would decipher our earlier ciphertext as
ciphertext A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

3 24 19 14 9 4 25 20 15 10 5 0 21 16 11 6 1 22 17 12 7 2 23 18 13 8
plaintext D Y T O J E Z U P K F A V Q L G B W R M H C X S N I
NUMBE RTHEO RYHAS PLAYE DANIM PORTA NTROL
EINTH EDEVE LOPME NTOFC RYPTO SYSTE MS
that is,
4
NUMBER THEORY HAS PLAYED AN IMPORTANT ROLE IN THE
DEVELOPMENT OF CRYPTOSYSTEMS.
Substitution Ciphers
Caesar, decimation, and linear ciphers form a small subset of a class of ciphers known
as substitution ciphers. The key to a substitution cipher is a permutation of the twenty-
six letters of the alphabet. For example, such a key may be
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
K Y A X J B Z I N W C M G H V D T L U S P E R F O Q,
which enciphers MICKEY MOUSE as GNACJO GVPUJ.
This type of enciphering increases the number of possible keys to 26! sa number
exceeding 4 3 1026d, and this effectively eliminates exhaustive cryptanalysis. To
decipher a ciphertext encrypted by a general substitution cipher, one must use the
statistical frequency distribution of single-letter occurrences in the English Language. If
this doesn’t suffice, more information can be obtained from language patterns [26]. By
comparing frequency occurrences for letters, initial letters, final letters, digraphs
(combinations of two letters), and trigraphs (combinations of three letters) of a
ciphertext with known characteristic frequencies of the English Language, one can
eventually reconstruct the key and decipher the text.
This technique is especially efficient because each plaintext letter is represented always
by the same ciphertext letter. Consequently, all the properties of the plaintext language
are retained in the ciphertext language. These invariant properties of substitution ciphers
can be utilized to obtain the key and break the cipher. It can be shown that an average of
twenty-eight letters of ciphertext is needed to uniquely solve for the key of a substitution
cipher [11].
Polyalphabetic Ciphers
One way to obtain greater security is to use a cipher which guarantees that a given
plaintext letter is not always represented by the same ciphertext letter. This can be
achieved by using a sequence of n different substitution ciphers, in periodic fashion with
period n, to encipher a message. The use of such substitution ciphers increases the
effective number of possible keys from 26! to s26!dn. For n 5 3, the magnitude of this
number is approximately equal to the total number of atoms in the universe!
A classic polyalphabetic enciphering procedure, devised by the French cryptographer
Vigenére, utilizes both a keyword and the Caesar cipher transformation. Suppose we
wish to encipher the plaintext message
MEET YOU AT SPACE MOUNTAIN,
using the keyword MATH. To encipher the message, one uses the sequence of four
different Caesar ciphers, where the key values 12, 0, 19, 7 are the respective numeric
equivalents of M, A, T, H. This sequence is applied consecutively to groups of four
letters of plaintext, converting each group into ciphertext. For the first group of four
letters, MEET, we obtain YEXA. Specifically, for EksMd denoting sM 1 kd(mod 26d:

5
Keyword M A T H Plaintext M E E T
12 0 19 7 12 4 4 19
E12 E0 E19 E7

24 4 23 0
Ciphertext Y E X A
Proceeding in this manner for each consecutive group of four letters, we obtain
the ciphertext
YEXAKONHFSIHOEFVGNMHUNSG.
Note that the double E in the plaintext MEET no longer appears as a double letter in the
ciphertext. Furthermore, the two F’s in the ciphertext correspond to different plaintext
letters. Since plaintext letters are not always represented by the same ciphertext letters,
the non-uniform frequency distributions of single letters, digraphs, trigraphs, etc. in
plaintext are smoothed out in the ciphertext. This smoothing out improves the security
of the system significantly.
An Unbreakable Cipher. To obtain a truly “unbreakable” system, one can select a
polyalphabetic cipher whose key, consisting of randomly selected numbers, has the
length of the plaintext message [13], [25]. This cipher, called a “one-time pad” or
Vernian cipher, was invented by the U.S. Army Corps in 1917. The one-time pad is
invincible since it is equiprobable that a plaintext character is represented by any
ciphertext character, and revealing patterns no longer exist because each choice of
representation is random. The number of possible keys of length n is s26dn, which for
large n makes exhaustive cryptanalysis infeasible. Note, for example, that s26d56 < 1079.
This cipher is also theoretically unbreakable, as the following example illustrates.
Plaintext S E N D M O R E M O N E Y
18 04 13 03 12 14 17 04 12 14 13 04 24
Key 09 26 01 07 23 15 21 14 11 11 02 08 09
01 04 14 10 09 03 12 18 23 25 15 12 07
Ciphertext B E O K J D M S X Z P M H
Plaintext R U N T O E X E R C I S E
17 20 13 19 14 04 23 04 17 02 08 18 04
Key 10 10 01 17 21 25 15 14 06 23 07 20 03
01 04 14 10 09 03 12 18 23 25 15 12 07
Ciphertext B E O K J D M S X Z P M H
There is no logical basis for determining which of the two plaintext messages above
corresponds to the ciphertext
BEOKJDMSXZPMH.
Using an appropriate sequence of values in the key, any thirteen-letter message can be
enciphered to yield the same ciphertext.
The one-time pad is one method of secret radio communication used by the
U.S.S.R.[10]. It is also employed on the “hot line” between Moscow and Washington.
Even though this system offers absolute secrecy, it poses key-management problems of

6
enormous proportions: both the sender and the receiver of a message must have the
identical sequence of random numbers in order to communicate via such a system.
Although a key could be sent in advance each time a message is to be communicated,
unnoticed interception of the key would jeopardize the secrecy of the forthcoming
communication. Furthermore, the key can only be used once since repeated use would
generate recognizable patterns for the cryptanalyst’s use.
The problem of generating, distributing, and cancelling keys is unmanageable in a
system where there is a high traffic volume. In wartime, millions of key characters
would be needed daily [10]. Even with today’s powerful computers, this system is
expensive and time consuming.
Public-key Cryptosystems
In 1975, a significant new type of cipher system, called a public-key cryptosystem, was
proposed by Whitfield Diffie and Martin Hellman [4]. The security of this new system is
not measured by the complexity of the enciphering algorithm, nor is it measured by
statistical uncertainty. Instead, the system’s security relies on discoveries in a young and
important branch of computer science called computational complexity theory.
Complexity theory primarily deals with the analysis and design of algorithms, and
especially with the number of computational steps needed to complete an algorithm.
The security of any cipher is now measured by the expected amount of time a computer
would expend to break the cipher.
In a “public-key cryptosystem” each user in the communication network places an
encryption algorithm E and cipher key in a public file (analogous to a telephone
directory). The corresponding decryption algorithm D of the user is kept secret. Each
user in the system must select his own encryption-decryption pair so as to satisfy the
following properties:
(i) If 3 and #, respectively, represent all feasible plaintext and ciphertext messages,
then E: 3→# and D: #→3
(ii) E and D are inverse transformations; that is,
EsDsCdd 5 C and DsEsMdd 5 M for all C [ # and M [ 3.
(iii) The pair (E, D) can be easily found (in the computational sense) by the owner, and
E and D are easy to compute.
(iv) It is computationally infeasible for anyone except the owner to determine D, even
if its inverse E is known.
Because of conditions (ii) and (iv), we call E a “trapdoor” or “one-way” function. It is
“one-way” in the sense that for given M it is easy to compute E(M); but given E(M), it is
effectively impossible to compute M unless certain private (trapdoor) information is
known. According to convention in computational complexity theory, a function is
“easy” to compute if there exists an algorithm which computes it using approximately
kda computational steps, where k and a are constants, and d is the “size” of the input. If
no such algorithm exists, it is said to be “hard” to compute.
Let us inspect more closely how the system works. Suppose person A desires to send
message M to person Z. Person A looks up, in the public directory, the public

7
enciphering algorithm Ez of Z and sends the message as EzsMd. sThere is absolutely no
fear of interception—wiretapping and spying—since Z is the only individual that has
access to the deciphering algorithm, Dz.d Then, Z deciphers the message by applying the
algorithm Dz to EzsMd. Specifically, DzsEzsMdd 5 M.
A major advantage of this system is that it avoids the necessity of distributing a new key
before the message is sent. This is obviously important for the success of any electronic
mail system.
The public-key cryptosystem also allows “signatures.” A signature is a guarantee that
the message has been issued by the sender. Signatures are a welcome bonus of the
system. They are used in many contexts (electronic fund transfers, electronic mail, home
banking, conferencing through computers, etc.).
If a person A desires to send person Z a message, signed to insure its origin, then A can
use his private deciphering algorithm DA as a signature. Specifically, A sends EzfDAsMdg
to Z, and Z uses EAHDzsEzfDAsMdgdJ 5 M. Note that only Z can read the encrypted
message EzfDAsMdg since only Z knows the private deciphering algorithm Dz. When Dz
is applied, Z is left with the text DAsMd, which is still unreadable. But then Z can read
this message by applying the public enciphering algorithm EA of A. If the message is
now meaningful, Z has the assurance that it was sent by A, since only A knows the
private deciphering algorithm DA. It should be noted that condition (ii) of the public-key
cryptosystem can be relaxed without any great damage. If DsEsMdd 5 M for all M [ 3
and EsDsCdd 5 C for only a fraction of the possible C [ #, then signatures can still be
obtained [12].
The enormous key management problems of the one-time pad system are virtually
nonexistent in a public-key cryptosystem. Since each user of the system generates only
one pair of keys, and since there is a public directory for the enciphering keys of the
users, there are no distribution problems. Theoretically, this system overcomes the
primary deficiency of the unbreakable one-time pad system, while it apparently
maintains the same level of security.
The RSA Public-key Cryptosystem
In 1977, Ronald L. Rivest, Adi Shamir, and Leonard Adleman, all of MIT, developed an
elegant way to implement the Diffie-Hellmen system, using only elementary ideas from
number theory [21], [22]. A little background is necessary to completely understand the
workings of their system, referred to as the RSA cryptosystem.
In 1640, Pierre Fermat communicated a result to one of his correspondents, Frénicle de
Bessy, an official at the French mint. Frénicle’s unique ability to work successfully with
large numbers represented a challenge to Fermat, and consequently, he was sometimes
the recipient of Fermat’s most guarded results. This communication contained what is
now known as Fermat’s Little Theorem, along with the comment “I would send you the
demonstration, if I did not fear it being too long.”
Fermat’s Little Theorem. If p is a prime and a is a positive integer not divisible by p,
then ap21 ; 1 (mod p).
(Euler published the first proof of this theorem in 1763, although Leibniz had written an
identical argument in an unpublished manuscript before 1683.)
8
Armed with Fermat’s Little Theorem, we can now design our public-key cryptosystem.
To generate the keys, each user in the system must select two large distinct prime
numbers p and q, each about 100 digits long. (As we shall discuss later, there exist
efficient algorithms that can generate a one-hundred digit prime on a high speed
computer in less than two minutes.) Now let n 5 pq and choose an integer e between 3
and n such that e is relatively prime to fsnd 5 sp 2 1dsq 2 1d. sNote that any prime
number e satisfying maxHp,qJ < e < n will suffice.d Since se, fsndd 5 1, we know that
e has a multiplicative inverse modulo fsnd; call it d. Thus, ed ; 1smodfsndd. Moreover,
the Euclidean Algorithm can be used efficiently to find both this positive integer d and a
corresponding negative integer b such that ed 1 bfsnd 5 1. The user then submits e
and n to the public directory, and keeps d, p and q private.
Suppose we desire to encipher a plaintext message. Using the mapping
A ↔ 00, B ↔ 01, . . . , Z ↔ 25, convert the message of t letters to an integer M
consisting of 2t digits. Partition the consecutive digits of M into blocks Mi of equal
length so that each block represents a number less than n 5 pq. The enciphering
transformation is given by
EsMid ; Mei (mod nd with 0 ≤ EsMid < n.
To decipher the numeric ciphertext C generated by enciphered cipher blocks Ci, use the
transformation
DsCid ; Cdi (mod nd with 0 ≤ DsCid < n
on each cipher block Ci.
As an example of an RSA encryption-decryption pair, consider the following small, but
illustrative, example. Let p 5 47 and q 5 61 be the two selected primes. Then
n 5 pq 5 2867, and fsnd 5 sp 2 1dsq 2 1d 5 s46ds60d 5 2760. Selecting e 5 49, we
use the Euclidean Algorithm to verify that it is relatively prime to fsnd:
fsnd 5 56s49d 1 16
49 5 3s16d 1 1
16 5 16s1d 1 0.
The last nonzero remainder 1 is the greatest common divisor of fsnd and e. Since fsnd
and e are relatively prime, e 5 49 has a multiplicative inverse modulo fsnd; call it d.
Using the equalities above in reverse, it is easy to find d:
1 5 49 2 3s16d
1 5 49 2 3ffsnd 2 56s49dg 5 169s49d 1 s23ds2760d.
Consequently, 169s49d ; 1smod 2760d. Therefore, d 5 169.
The resulting encryption algorithm is
EsMd ; M49smod 2867d,
and the decryption algorithm is
DsCd ; C169smod 2867d,
where M and C are numeric blocks less than 2867. To encipher the plaintext message
9
EPCOT CENTER IS SPECTACULAR
translate first into numeric form
04 15 02 14 19 02 04 13 19 04 17 08 18 18 15 04 02 19 00 02 20 11 00 17
and partition into blocks of digit-length 4. Then encrypt and decrypt each block by the
functions E and D, as defined above. The result is:
Original Block Mi Encryption EsMid Decryption DfEsMidg
0415 2261 0415
0214 2536 0214
1902 2329 1902
0413 2243 0413
1904 1416 1904
1708 1464 1708
1818 2688 1818
1504 1504 1504
0219 1544 0219
0002 2264 0002
2011 0786 2011
0017 2328 0017
Translating the right-hand column reveals the correct message.
In general, do E and D satisfy the four conditions of a public-key cryptosystem?
If we define 3 5 # to be the set of all nonnegative integers less than n, then
condition (i) is satisfied. To verify condition (ii), EsDsCdd ; Cde ; C (mod n) and
DsEsMdd ; Med ; M (mod n) for C [ # and M [ 3, it suffices to prove
Med ; M (mod n) for all nonnegative integers M < n 5 pq. (2)
If p doesn’t divide M,
Mp21 ; 1 (mod p)
by Fermat’s Little Theorem. Now recall that b < 0 and ed 1 bfsnd 5 1. Since (p 2 1)
divides fsnd,
M2bfsnd ;1 (mod p)
and it follows that
M2bfsnd 11 ; M (mod p).
Thus, Med ; M (mod p) whenever p is not a factor of M. But clearly this also holds
when p does divide M. In a similar manner,
Med ; M (mod q).
Since p and q are relatively prime, (2) holds for all M [ 3.
Note that our enciphering transformation was defined only on nonnegative integers less
than n in order to guarantee that it is one-to-one.

10
To verify condition (iii), we must first show that the pair sE, Dd can be easily found
from a computational point of view. The algorithm for determining this pair begins with
finding two distinct one-hundred digit primes. There are several primality tests
available—as, for example, Rabin’s test [18] and the Solovay-Strassen test [27].
The former uses a test due to G.L. Miller:
Let m be a positive integer with m 2 1 5 2st, where s is a nonnegative integer and
t is an odd positive integer. The integer m is said to pass Miller’s test for the base
b if either bt ; 1 (mod m) or b2 t ; 21 (mod m) for some integer j [ f0, s 2 1g.
j

To test a number m for primality using Rabin’s test, perform Miller’s test for k bases
each less than m. If the integer m passes all k tests, the probability that it is not prime is
s1y4dk. For k 5 100, this probability is s1y4d100 < 10260. Although this test is not
deterministic in nature, it gives a high degree of certainty.
Recently, a deterministic primality algorithm was announced that has “nearly”
polynomial run time [2], [16]. The algorithm is quite complicated, both in theory and in
practice, since many different cases are covered in computer program implementation.
Motivated by this test, there have appeared other algorithms dependent on the work of
H. Cohen and H. W. Lenstra. One algorithm with particularly impressive run times was
implemented on a CDC Cyber 170-175 computer and can test a 100-digit number in
about 30 seconds, a 200-digit number in 8 minutes, and a 1000-digit number in a week
[5], [19].
How many 100-digit numbers must be tested before we find one which passes Rabin’s
test and is (most likely) a prime? In 1793, Gauss conjectured that if psxd represents the
number of primes not exceeding x, then psxd < xyln x for large x. This was proved
independently by Hadamard and de la Vallée Poussin (1896), and is referred to as the
Prime Number Theorem. The probability that a randomly selected odd integer in the
interval from 10y to 10x is prime is approximately
ps10xd 2 ps10yd
number of primes
number of odd integers
5
s10 2 10 dy2
x y
<
10
2
x2y
10x2y
2
1
1
2 1 x ln 10 y ln 10
. 2
Consequently, the probability that a 100-digit number is prime is approximately
2
1 10
2
1
9 100 ln 10 99 ln 10 2
< .008676.

On the average,
1
< 115
.008676
odd 100-digit numbers would be tested before a prime is found. The entire procedure
outlined above requires only a few minutes of computer time to find a 100-digit prime.
Although this may seem to be complicated, keep in mind that each user needs to
complete this procedure only twice. Once p, q, are obtained, one can find positive
integers d and e, as described earlier. Thus, the pair sE, Dd can be “easily” found by the
owner. Moreover, E and D are “easy” to compute using modular exponentiation [17].
A 200-digit message can be encrypted or decrypted in a few seconds on a high speed
computer.
11
To complete the verification that the RSA cryptosystem is a public-key cryptosystem, it
suffices to show that E is a trapdoor function. Recall that the public-key information is e
and n, whereas d, p, and q are kept private. If d is known, any ciphertext C 5 EsMd can
be read. But how “difficult” is it to find d? By definition, d satisfies ed ; 1 (mod f(n)).
So if fsnd can be found, then d can be found as shown in an earlier example. Suppose
fsnd is known. Then, since n is public information, it is easy to find
p 1 q 5 pq 2 sp 2 1dsq 2 1d 1 1 5 n 2 fsnd 1 1
p 2 q 5 fsp 1 qd2 2 4pqg1y2 5 fsp 1 qd2 2 4ng1y2,
and therefore,
1 1
p 5 fsp 1 qd 1 sp 2 qdg and q 5 fsp 1 qd 2 sp 2 qdg.
2 2
Suppose, conversely, that d is known. Then a multiple of fsnd, namely ed 2 1, can be
found. Given a multiple of fsnd, it is possible to factor n 5 pq (see [14]). Thus, finding
fsnd or d is computationally no easier than factoring n. The fastest factoring algorithm
known [15] uses approximately exps!sln Nd ln sln Ndd bit operations to factor an integer
N. Assuming that each bit operation takes one microsecond s1026 secondd, we obtain the
following factorization times:
Integer N Number of Bit Operations Time
1050 1.4 3 1010 3.9 hours
1075 9.0 3 1012 104 days
10100 2.3 3 1015 74 years
10200 1.2 3 1023 3.8 3 109 years
10300 1.5 3 1029 4.9 3 1015 years
Since the integer n 5 pq has approximately 200 digits, it appears that it is
computationally infeasible to find D, based only on knowing E. Thus, E is a
one-way function and the trapdoor information is the prime factorization pq of n.
Is it possible to decipher a ciphertext message without finding the deciphering algorithm
D, and hence avoiding the factorization of n 5 pq? In 1977, G. L. Simmons and M. J.
Norris proposed a method of cryptanalyzing the RSA cryptosystem by using successive
encryptions by E. They generated a sequence C1, C2, . . . by defining C1 5 EsMd and
Cj11 5 EsCjd, where M is the plaintext message. The sequence is terminated when an
integer t is reached such that Ct 5 C1. Then Ct21 5 M, since EsMd 5 EsCt21d.
However, it has been shown [20] that the probability of this attack being successful is
extremely low if the chosen primes p, q, differ in length by only a few digits,
sp 2 1, q 2 1d is small, and both p 2 1 and q 2 1 have large prime factors. Other
cryptanalytic approaches to the RSA cryptosystem have been suggested, many of which
have been quickly followed by counterarguments.
Since the announcement of the RSA cryptosystem, no generally effective attacks have
surfaced to threaten its security. Ironically, the research activity stimulated by this
system has enhanced its security. In order to protect an RSA encryption cipher from
attack, certain conditions must be satisfied:
12
(i) p and q should each be at least 100 digits
(ii) p and q should differ in length by a few digits
(iii) sp 2 1, q 2 1d should be large
(iv) Each of the integers p ± 1, q ± 1 should have at least one large prime factor.
Property (i) ensures that n is extremely time consuming to factor by trial or by the
Monte Carlo method. Properties (ii)-(iv), in conjunction with (i), guarantee that no
special factorization techniques (such as the difference of squares, and the p 1 1 or
p 2 1 method) or special decryption methods (such as successive encryption) can be
effectively used. Since it is possible to construct a modulus satisfying all four conditions
[28], we can build a secure RSA cryptosystem today.
It has not been proven that the problem of breaking the RSA cryptosystem is equivalent
in difficulty to factoring the modulus n 5 pq. There could be a method that would
decipher messages without identifying the trapdoor information. This appears unlikely
however, since all known deciphering methods that work in the general case are
equivalent to factoring n. Factorization is a computationally “hard” problem; that is,
there is no general-purpose prime factorization algorithm that computes the factors of n
in a number of steps bounded by a polynomial in n. As long as prime factorization
continues to be a “hard” problem and no new cryptanalytic techniques develop, the RSA
system appears secure; it will always be “easier” to find large primes than it will be to
factors numbers of the same magnitude. It certainly is exciting that a conjecture by
Fermat in 1640 plays such a pivotal role in cryptology today.
Trapdoor Knapsacks
In 1977, a public-key cipher was proposed, with its security depending on the difficulty
of solving the classic knapsack problem [12]:
Given positive integers a1, a2, . . ., an and a sum S, solve
S 5 a1x1 1 a2x2 1 . . . 1 anxn
for the xi's s1 ≤ i ≤ nd, where each xi 5 0 or 1.
This is a notoriously “hard” problem [8]. An exhaustive search requires a check of the
2n possibilities for sx1, x2, . . . , xnd. The best known method for finding a solution of the
knapsack problem requires approximately 2ny2 computational steps. So for n 5 100, a
reliable computer solution is infeasible. However, for certain sequences a1, a2, . . ., an,
the solution of the knapsack problem is quite “easy.”
A sequence Ha1, a2, a3, . . . , anJ is said to be superincreasing if it satisfies
j21

oa
i51
i < aj for each j 5 2, 3, . . . , n.

Assume we have such a sequence, and that the preselected sum S can be attained using
some subset of the sequence. To solve S 5 oi51
n
aixi, we proceed as follows: let xn 5 1
if S ≥ an, and xn 5 0 if S < an. Then find xn21, xn22, , , ,. x1, in succession, using

13
5
n
1, if S 2 o
i5j11
xiai ≥ aj
xj 5 n
0, if S 2 o
i5j11
xiai < aj

for j 5 n 2 1, n 2 2, . . ., 2, 1. This can be solved rapidly on a computer, even for


large n.
To build a public-key cryptosystem, assume that each user in the system selects a
superincreasing sequence Ha1, a2, . . ., anJ, an integer m > 2an, and an integer w which
is relatively prime to m. Let w represent the multiplicative inverse modulo m of w (this
can be easily found using the Euclidean Algorithm). Now for the not necessarily
superincreasing sequence by Hb1, b2, . . ., bnJ by
bi ; waismod md
with 0 < bi < m for i 5 1, 2, . . ., n.
The public-key information is the sequence Hb1, b2, . . ., bnJ, while the trapdoor
information is m and w. Suppose someone wants to send this user a message M.
The message is first translated into a string of 0’s and 1’s, using the binary equivalent
to letters:
Letter Binary Equivalent Letter Binary Equivalent
A 00000 N 01101
B 00001 O 01110
C 00010 P 01111
D 00011 Q 10000
E 00100 R 10001
F 00101 S 10010
G 00110 T 10011
H 00111 U 10100
I 01000 V 10101
J 01001 W 10110
K 01010 X 10111
L 01011 Y 11000
M 01100 Z 11001

| |
This string of 0’s and 1’s is then split into blocks of length n (if the length 5 M of a
string is not divisible by n, fill in the last block with 1’s). Each plaintext block
Xi 5 Hxi1, xi2, . . ., xinJ si 5 1, 2, . . ., kd is enciphered as
n
Si 5 obx .
j51
j ij

→ → →
[Note that Si can be interpreted as the dot product b ? Xi for b 5 Hb1, b2, . . ., bnJ.]
The sums HS1, S2, . . ., SkJ corresponding to these enciphered blocks form the ciphertext
message. There is no concern about interception of the ciphertext since deciphering the
text requires solving a group of “hard” knapsack problems of the form Si 5 oj51 n
bj xij.
However, when m and w are known (the private trapdoor information held by the user),
14
these “hard” knapsack problems can be transformed into “easy” knapsack
problems without changing the solution sets. We illustrate this for a single block.
The transformation is as follows. For Si 5 oj51
n
bj xij:

5 o a x 6smod md.
n n
wSi 5 o
j51
wbj xij ;
j51
j ij

Note that m > 2an (by definition) and 2an > oj51 n a (since Ha , a , . . ., a J is
j 1 2 n
~ ~
superincreasing) yields oj51 n
aj xij < m. If S i ; wSismod mdwith 0 ≤ S i < m, then
~ ~
S i 5 oj51
n
aj xij. Solving these easy knapsack problems, S i 5 oj51n
aj xij, yields solutions
to the hard knapsack problems Si 5 oj51 bj xij, and consequently, the original plaintext
n

message M.
Example of a knapsack cipher. Suppose your private-key is the superincreasing
sequence {11, 15, 30, 60} and you’ve selected numbers m 5 150 and w 5 77.
Then your public-key sequence, defined by
bj ; 77aj smod 150d sj 5 1, 2, 3, 4d,
is {97, 105, 60, 120}. To encipher the message
DONALD DUCK
first translate each letter into its binary equivalents
D O N A L D D U C K,
00011 01110 01101 00000 01011 00011 00011 10100 00010 01010
and then partition this sequence of 0’s and 1’s into blocks of length four:
0001 1011 1001 1010 0000 0101 1000 1100 0111 0100 0001 0010 1011.
→ →
For each block Xi, form the corresponding dot product sum Si 5 b ? Xi using

b 5 H97, 105, 60, 120}:
S1 5 0s97d 1 0s105d 1 0s60d 1 1s120d 5 120
S2 5 1s97d 1 0s105d 1 1s60d 1 1s120d 5 277
S3 5 1s97d 1 0s105d 1 0s60d 1 1s120d 5 217
S4 5 1s97d 1 0s105d 1 1s60d 1 0s120d 5 157
S5 5 0s97d 1 0s105d 1 0s60d 1 0s120d 5 0
:
S13 5 1s97d 1 0s105d 1 1s60d 1 1s120d 5 277.
The ciphertext is then 120, 277, 217, 157, 0, . . . , 277. To decipher, find the
multiplicative inverse (modulo 150) of w 5 77. Solving 77w ; 1smod 150d yields
w 5 113. Then multiply each Si by w. Thus, for example, S1 5 120 is deciphered to 60
since
113s120d 5 60 smod 150d.

15
The remaining sums become 101, 71, 41, 0, . . ., 101, respectively. Each of these
~
transformed sums S i corresponds to an “easy” knapsack problem relative to the
superincreasing sequence {11, 15, 30, 60}. Solving the knapsack problems
~
S i 5 11x1 1 15x2 1 30x3 1 60x4
for each sum yields
0001, 1011, 1001, 1010, 0000, . . ., 1011.
Partitioning the string 00011011100110100000 . . . 1011 into blocks of length five
identifies the message:
00011 01110 01101 00000 ... 01010
D O N A ... K
A cryptosystem based on the public-key knapsack cipher does not satisfy the four
conditions in the definition of the public-key cryptosystem. The enciphering algorithm is
clearly not onto, since only certain sums can be generated by the integers b1, b2, . . ., bn.
It can be shown, however, that all of the conditions, except this one, are satisfied.
Without EsDsCdd 5 C holding in general, it is impossible to use the signature scheme
that was described earlier. For a discussion of knapsack-based algorithms and signatures,
see [23].
Knapsack cryptosystems have received their share of attention since 1978. They were
initially favored over the RSA system since encryption-decryption was faster. Special
integrated circuit chips to implement a knapsack cryptosystem have even been
considered by a few companies. Recently, this attention has diminished significantly.
In April 1982, a fundamental cryptanalytic breakthrough was made when a polynomial
time attack on “almost all” (Merkle-Hellman) knapsack cryptosystems was carried
out [24]. The attack, by A. Shamir, demonstrated that certain information about
superincreasing sequences is not well hidden by modular multiplication. Using only the
sequence Hb1, b2, . . ., bnJ and ideas from Diophantine approximation, it is possible to
find enough secret information to solve any knapsack problem involving the weights b1,
b2, . . ., bn. Several knapsack cryptosystems have been proposed and broken. For
example, the Graham-Shamir scheme was subsequently broken by Adleman, using a
polynomial time algorithm [1].
Meanwhile, Shamir collected a prize of $100 from Merkle for breaking his basic
scheme. Merkle, distressed by misleading media reports of this special case solution,
offered a $1,000 prize to anyone who could break the multiply-iterated version of his
basic scheme [12]. Several attacks followed. In the fall on 1984, Ernest Brickell, of
Sandia Labs, collected the $1,000 prize. Brickell’s technique depends on the fact that
modular multiplication is the only method used to keep the private key hidden. He found
a way to use a single function to change the multiple iterations into an easy knapsack
problem. The technique identifies the plaintext message without finding the original
superincreasing sequence. Knapsack ciphers with 100 weights and 20 iterations can be
broken using less than 2 hours of computer time on a Cray 1S computer. This technique
is expected to break a 1000-weights, 40-iterations cipher in 750 hours on the same
computer [3].

16
Does all this rule out the possibility of the existence of a secure knapsack cryptosystem?
Not necessarily. But certainly any new knapsack systems must find other ways, besides
modular arithmetic, to hide private information. There is already a new knapsack cipher,
proposed by B. Chor and R. Rivest, based of arithmetic in finite fields. It attempts to
avoid the flaws identified by successful attacks on prior knapsack systems, and only
time will determine its success of failure. It is interesting to note, in light of Edgar Allen
Poe’s quote, that when Chor-Rivest proposed their system they remarked, “At the
moment, we do not know of any attacks capable of breaking this system in a reasonable
amount of time.”
REFERENCES
1. L. M. Adleman, “On Breaking the Generalized Knapsack Public-Key
Cryptosystem,” Proceedings 15th Annual ACM Symposium on Theory of
Computing (1983) 402–412.
2. L. M. Adleman, G. Pomerance, and R. S. Rumely, “One Distinguishing Prime
Numbers from Composite Numbers,” Annals of Mathematics 117 (1983) 173–206.
3. E. Brickell, Breaking Iterated Knapsacks,” Lecture Notes in CS, 196, Advances
in Cryptology: Proceedings of CRYPTO 84, Springer-Verlag, New York, 1985,
pp. 342–357.
4. W. Diffie and M. E. Hellman, “New Directions on Cryptography,” IEEE
Transactions of Information Theory IT-22 (November 1976) 644–654.
5. J. Dixon, “Factorization and Primality Tests,” American Mathematical Monthly 91
(1984) 333–352.
6. W. Francis, “A Standard Sample of Present-Day Edited American English for Use
with Digital Computers,” Linguistics Department, Brown University, 1964.
7. M. Gardner, “A New Kind of Cipher that Would Take Millions of Years to Break,”
Scientific American 237 (1977) 120–124.
8. M. R. Garey and D. S. Johnson, Computers and Intractability, A Guide to the
Theory of NP-Completeness, W. H. Freeman and Company, San Francisco, 1979.
9. A. Hodges, Alan Turing: The Enigma, Simon & Schuster, New York, 1983.
10. D. Kahn, The Codebreakers, the Story of Secret Writing, MacMillan, New York,
1967.
11. A. G. Konheim, Cryptography: A Primer, Wiley-Interscience, New York, 1981.
12. R. Merkle and M. Hellamn, “Hiding Information and Signatures in Trapdoor
Knapsacks,” IEEE Transactions of Information Theory IT-24, No. 5 (September
1978) 525–530.
13. C. H. Meyer and S. M. Matyas, Cryptography: A New Dimension in Computer Data
Security, Wiley-Interscience, New York, 1982.
14. G. L. Miller, “Riemann’s Hypothesis and Tests for Primality,” Journal of Computer
and System Sciences 13, No. 3 (December 1976) 300–317.
15. C. Pomerance, “Analysis and Comparison of Some Integer Factoring Algorithms,”
17
Computational Methods in Number Theory, Math Centrum Tracts 154, Part I (1982)
89–139.
16. ______, “The Search for Prime Numbers,” Scientific American 247 (December
1982) 122–130.
17. _____, Lecture Notes on Primality Testing and Factoring, MAA Notes, No. 4,
Mathematical Association of America, Washington D.C., 1984.
18. M. O. Rabin, “Probabilistic Algorithm for Testing Primality,” Journal of Number
Theory 12 (1980) 128–138.
19. H. Riesel, Prime Numbers and Computer Methods for Factorization, Progress in
Mathematics 57, Birkhäuser, Boston, Massachusetts, 1985.
20. R. L. Rivest, “Remarks on a Proposed Cryptanalytic Attack on the M.I.T. Public-
Key Cryptosystem,” Cryptologia 2 (1978) 62–65.
21. R. L. Rivest, A. Shamir, and L. Adleman, “A Method for Obtaining Digital
Signatures and Public-Key Cryptosystems,” Laboratory for Computer Science,
M.I.T., LCS/TM-82 (April 1977).
22. _____, “A Method for Obtaining Digital Signatures and Public-Key
Cryptosystems,” Communications of the ACM 21 : 2 (1978) 120–126.
23. A. Shamir, “The Cryptanalytic Security of Compact Knapsacks,” M.I.T., LCS/TM-
164 (1980).
24. _____, “A Polynomial Time Algorithm Testing for Breaking the Basic Merkle-
Hellman Cryptosystems,’ Proceedings of the 23rd Annual Symposium of the
Foundations of Computer Science (1982) 145–152.
25. C. E. Shannon, “Communications Theory of Secrecy Systems,” Bell Sys, Tech.
Journal 28 (1949) 656–715.
26. A. Sinkov, Elementary Cryptanalysis—A Mathematical Approach, New
Mathematical Library, No. 22, Mathematical Association of America, 1966.
27. R. Solovay and V. Strassen, “A Fast Monte-Carlo Test for Primality,’ SIAM Journal
on Computing 6 (March 1977) 84–85.
28. R. C. Williams and B. Schmid, “Some Remarks Concerning the M.I.T. Public-Key
Cryptosystems,” Science Report No. 91 (1979) Department of Computer Science,
University of Manitoba, Winnipeg, Canada.

18

You might also like