Lecture Notes (Syracuse University)
Cryptography Basics
• Secret Key Encryption is also called Symmetric Key Encryption: the key used for encryption is the
same as the key for decryption.
– Drawbacks: a monoalphabetic substitution cipher can be easily broken using frequency analysis.
In any language, different characters occur with different frequencies. For example, in English,
’z’ appears much less frequently than ’t’. Although each character is mapped to another character
in the substitution cipher, the character frequencies carry over to the ciphertext and reveal the mapping.
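The frequency argument above can be illustrated with a minimal Python sketch. The substitution key (a keyboard-order permutation) and the sample sentence are made up for illustration; they do not come from the notes. The sketch only shows that a substitution relabels the letters while leaving the frequency profile intact.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase
# Hypothetical substitution key: a fixed permutation of the alphabet.
KEY = "qwertyuiopasdfghjklzxcvbnm"
ENCRYPT_TABLE = str.maketrans(ALPHABET, KEY)


def substitute(plaintext: str) -> str:
    """Encrypt by replacing each letter with its image under KEY."""
    return plaintext.lower().translate(ENCRYPT_TABLE)


def letter_counts(text: str) -> Counter:
    """Count only the alphabetic characters of the text."""
    return Counter(ch for ch in text if ch in ALPHABET)


plaintext = "the attacker tests the letters that the text repeats most often"
ciphertext = substitute(plaintext)

# The most frequent plaintext letter and the most frequent ciphertext
# letter occur equally often: the cipher only relabels the letters.
plain_top = letter_counts(plaintext).most_common(1)[0]
cipher_top = letter_counts(ciphertext).most_common(1)[0]
print(plain_top, cipher_top)
```

An attacker who sees that one ciphertext letter dominates can guess that it stands for a common letter such as ’t’ or ’e’, and continue from there.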
• Polyalphabetic Substitution Cipher: using multiple substitution alphabets.
– The Enigma machine is more complex but still fundamentally a polyalphabetic substitution
cipher.
– Units of the plaintext are rearranged in a different and usually quite complex order, but the
units themselves are left unchanged. By contrast, in a substitution cipher, the units of the
plaintext are retained in the same sequence in the ciphertext, but the units themselves are
altered.
– Example
– In January 1997, NIST announced that they wished to choose a successor to DES to be known
as AES.
– In September 1997, NIST officially called for new algorithms for AES. The algorithms were all
to be block ciphers, supporting a block size of 128 bits and key sizes of 128, 192, and 256
bits. Such ciphers were rare at the time of the announcement.
– In the nine months that followed, fifteen different designs were created and submitted from
several different countries.
– These algorithms were investigated by cryptographers. The investigation focused on security,
performance, feasibility, and other factors. During the process, some algorithms were
eliminated because of weaknesses in security, and others were eliminated because of
performance or other factors.
– In August 1999, NIST announced that they were narrowing the field from fifteen to five. All
five algorithms, commonly referred to as “AES finalists”, were designed by cryptographers
considered well-known and respected in the community.
∗ MARS.
∗ RC6.
∗ Rijndael (pronunciation: “Rain Doll”).
∗ Serpent.
∗ Twofish.
– These finalists went through a further round of intense analysis and cryptanalysis.
– On October 2, 2000, NIST announced that Rijndael had been selected as the proposed AES.
• Other well-known block ciphers (these were not AES finalists):
– IDEA (International Data Encryption Algorithm), used by PGP.
– Blowfish (Bruce Schneier).
– RC5 (Rivest).
– CAST-128, used by PGP.
• Note: The security of AES and many other well-known encryption algorithms (except the one-time
pad) has never been proven. They are considered secure because they have been thoroughly
investigated by many cryptographers, and so far nobody has been able to break them.
• Known-Plaintext Attack: the attacker has obtained some (plaintext, ciphertext) pairs, and they use
these known pairs to find out the other things that they do not know, such as the key and the
unknown plaintexts.
• Chosen-Plaintext Attack: attackers can select any plaintext, and ask the encryption system to
produce the corresponding ciphertext. The attackers can do this many times. Attackers then try to
use the information to break the encryption scheme.
In real life, all of the above attack scenarios can happen. Therefore, a strong cryptosystem
should resist all of them.
– Parallelization
∗ Encryption cannot be parallelized
∗ Decryption can be parallelized
• Counter (CTR)
Padding. Because a block cipher works on units of a fixed size, but messages come in a variety of
lengths, some modes (mainly CBC) require that the final block be padded before encryption. Several
padding schemes exist.
• Add null bytes to the plaintext. Care must be taken so that the original length of the plaintext
can be recovered.
• The original DES method: add a single one bit, followed by enough zero bits to fill out the block; if
the message ends on a block boundary, a whole padding block will be added.
• PKCS#5 Standard: each padding octet contains the number of octets that are added in the padding.
The following is an example for a 128-bit block cipher that uses the PKCS#5 padding scheme (in
the first example, 0x09 is used in the padding, because 9 octets are added in the padding; in the
second example, 0x10 is used because 16 octets are added):
• CFB, OFB and CTR modes do not require padding. The size of the ciphertext is the same as the size
of the plaintext. This characteristic of stream ciphers makes them suitable for applications that
require the encrypted data to be the same size as the original plaintext data, and for applications
that transmit data in streaming form where it is inconvenient to add padding bytes.
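The PKCS#5 rule described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names and the sample messages are assumptions made here. Strictly speaking, PKCS#5 defines the rule for 8-octet blocks, and the same rule applied to larger blocks is usually called PKCS#7; the notes use the PKCS#5 name for the 128-bit case.

```python
BLOCK_SIZE = 16  # 128-bit block, as in the text


def pkcs_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N octets, each with value N, so the length fills whole blocks.

    If the data already ends on a block boundary, a full block is added.
    """
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Remove the padding, rejecting inputs that are not validly padded."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding value")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("inconsistent padding octets")
    return padded[:-pad_len]


# A 7-octet message needs 9 octets of padding, each with value 0x09.
seven = pkcs_pad(b"A" * 7)
# A 16-octet message gets a whole extra block of 0x10 octets.
sixteen = pkcs_pad(b"B" * 16)
print(seven.hex(), sixteen.hex())
```

Because the last octet always states the padding length, the receiver can recover the original length without any extra bookkeeping.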
• Collision Free Property: computationally infeasible to find two messages that hash to the same hash
value.
– For an m-bit hash, a brute-force attack needs only about $2^{m/2}$ messages, chosen at
random, before one would find two with the same hash value (like the birthday problem).
– MD5 is broken: it has been found not to be collision free!
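The birthday bound can be observed on a deliberately weakened hash. The sketch below is an illustration under assumptions made here: it truncates SHA-256 to its first 24 bits (so a collision is expected after roughly $2^{12}$ messages, within a small constant factor) and uses the decimal strings "0", "1", "2", ... as stand-ins for randomly chosen messages.

```python
import hashlib
from itertools import count


def truncated_hash(message: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256, giving a weak m-bit hash."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int) -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two share the same truncated hash.

    Returns the two colliding messages and the number of messages tried.
    """
    seen: dict[int, bytes] = {}
    for i in count():
        message = str(i).encode()
        value = truncated_hash(message, bits)
        if value in seen:
            return seen[value], message, i + 1
        seen[value] = message


m1, m2, tries = find_collision(24)
print(m1, m2, tries)
```

The number of tries printed is typically a few thousand, far below the $2^{24}$ messages that a search for a match against one fixed hash value would need on average.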
• Why do we want collision-free property?
– Example: We can construct M1 and M2, such that h(M1) = h(M2). The meaning of M1 and
M2 can be exactly the opposite.
– M1 =“Alice owes Kevin $100”. M2 =“Alice owes Kevin $1M”. Alice signs h(M1), but not
h(M2). If h(M1) = h(M2), Alice will be in trouble.
– Even if the hash is not collision free, finding such a meaningful pair M1 and M2 is not easy.
However, it might be possible to find r1 and r2 such that h(M1, r1) = h(M2, r2), where r1 and
r2 are random numbers.
• Hash Algorithms
– MD2 (Message Digest) – by Rivest
– MD3 does exist, but it was superseded by MD4 before it was ever published or used.
– MD4 (Message Digest) – by Rivest. Faster than MD2, but a version of MD4 was found to be weak.
– MD5 (Message Digest) – by Rivest
∗ A little slower than MD4.
∗ 128-bit hash
– SHA (Secure Hash Algorithm)
∗ 1993: NIST published SHA.
∗ 1995: a flaw, never published, was found in SHA.
• Stream Cipher: Use a one-way hash function as a pseudorandom number generator to generate a stream
of pseudorandom numbers. XOR the plaintext with this stream of numbers. In decryption, the same
stream can be reconstructed using the counter and the seed (the seed is the encryption key).
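The construction just described can be sketched as follows. This is only an illustration of the idea, not a vetted cipher: the choice of SHA-256, the 8-byte counter encoding, and the function names are assumptions made here. Note that reusing the same key for two messages reuses the same keystream, which a real design must prevent.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Generate `length` pseudorandom bytes as H(key || 0), H(key || 1), ..."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"hypothetical seed / encryption key"
plaintext = b"attack at dawn, and bring more than thirty-two bytes of text"
ciphertext = xor_stream(key, plaintext)
recovered = xor_stream(key, ciphertext)
print(ciphertext.hex())
print(recovered)
```

The ciphertext has exactly the same length as the plaintext, since no padding is involved.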
• Making Commitment: Alice and Bob play a simple online game. Each person provides an
integer. If the sum of the two integers is odd, Alice wins; otherwise Bob wins. However, whoever
reveals a number first will definitely lose, because the other player can then pick a number with
the parity needed to win. Using a one-way hash function, they can first commit to their numbers,
and then release the numbers to each other. Nobody can change his/her mind after the commitment,
unless he/she can find a collision of the hash function.
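A hash-based commitment for this game might look like the sketch below. One detail is an assumption added here and not stated in the notes: each commitment includes a random nonce. Without it, the other player could simply hash likely integers and compare against the commitment. The function names and the numbers 7 and 4 are illustrative.

```python
import hashlib
import secrets


def commit(number: int) -> tuple[bytes, bytes]:
    """Return (commitment, nonce). Only the commitment is sent at first."""
    nonce = secrets.token_bytes(16)  # assumed addition: hides small numbers
    digest = hashlib.sha256(nonce + str(number).encode()).digest()
    return digest, nonce


def verify(commitment: bytes, nonce: bytes, number: int) -> bool:
    """Check a revealed (nonce, number) pair against the earlier commitment."""
    return hashlib.sha256(nonce + str(number).encode()).digest() == commitment


# Phase 1: both players exchange commitments only.
alice_commitment, alice_nonce = commit(7)
bob_commitment, bob_nonce = commit(4)

# Phase 2: both players reveal; each checks the other's commitment.
alice_ok = verify(alice_commitment, alice_nonce, 7)
bob_ok = verify(bob_commitment, bob_nonce, 4)
winner = "Alice" if (7 + 4) % 2 == 1 else "Bob"
print(alice_ok, bob_ok, winner)
```

A player who tries to reveal a different number than the committed one fails verification unless a hash collision is found.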
• Detecting Changes: To ensure system security, it is necessary to routinely check whether
important files/configurations have been modified. We can use a one-way hash to achieve this. This
is the main idea behind Tripwire.
• One-way hash chain
– The S/KEY one-time password scheme.
– Broadcast authentication
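The one-time password use of a hash chain can be sketched as follows. This is a simplified illustration in the spirit of S/KEY, not the actual S/KEY protocol: SHA-256, the `Server` class, and the seed are assumptions made here. The server stores only the end of the chain; each login reveals the previous link.

```python
import hashlib


def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()


def hash_chain(seed: bytes, n: int) -> list[bytes]:
    """Return [seed, h(seed), h(h(seed)), ...]; chain[i] is h applied i times."""
    chain = [seed]
    for _ in range(n):
        chain.append(h(chain[-1]))
    return chain


class Server:
    """Hypothetical verifier that stores only the most recent chain value."""

    def __init__(self, anchor: bytes) -> None:
        self.current = anchor

    def login(self, otp: bytes) -> bool:
        # A valid password hashes to the stored value; then it replaces it.
        if h(otp) == self.current:
            self.current = otp
            return True
        return False


chain = hash_chain(b"hypothetical secret seed", 3)
server = Server(chain[3])
first = server.login(chain[2])    # accepted
replay = server.login(chain[2])   # rejected: the stored value has moved on
second = server.login(chain[1])   # accepted
print(first, replay, second)
```

An eavesdropper who captures one password cannot derive the next one, because that would require inverting the hash.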
• Merkle Tree
– Timestamping a document
∗ Publish hash in a magazine or newspaper.
∗ One hash per document: expensive.
∗ One hash per 1000 documents: cost saving.
∗ Using Merkle Tree can achieve such a cost saving.
– Broadcast authentication in lossy channels
∗ Signing each packet is expensive
∗ Hashing all of the packets together and then signing the resulting hash cannot tolerate packet loss.
∗ Using Merkle Tree can solve this problem.
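A minimal Merkle tree sketch shows why one published (or signed) root can cover many documents or packets. The conventions used here are assumptions for illustration: SHA-256, duplicating the last node on odd-sized levels, and no separate encoding for leaves versus internal nodes. A production design would need to treat those details more carefully.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Build all tree levels, from hashed leaves up to the single root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # assumed convention: duplicate last
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


documents = [f"document {i}".encode() for i in range(5)]
levels = merkle_levels(documents)
root = levels[-1][0]  # the only value that has to be published or signed
proof = merkle_proof(levels, 4)
print(verify_proof(documents[4], proof, root))
```

Each document (or packet) is checked with only its own proof and the root, so the loss of other packets does not prevent verification.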
4 Public-Key Cryptography
4.1 History
• First asymmetric key algorithm was invented, secretly, by Clifford Cocks (then a recent
mathematics graduate and a new staff member at GCHQ in the UK) early in the 1970s.
• 1976, Diffie and Hellman postulated this system without demonstrating that such algorithms exist.
• 1978, Rivest, Shamir and Adleman, all then at MIT, invented RSA, which is a reinvention of
Cocks’s scheme.
• Since then, several other asymmetric key algorithms have been developed, but the most widely
known remains Cocks/RSA.
• Another algorithm is ElGamal (Taher ElGamal), which relies on the (similar and related) difficulty
of the discrete logarithm problem.
• A third is a group of algorithms based on elliptic curves, first discovered by Neal Koblitz in the mid
’80s.
• NSA has also claimed to have invented public-key cryptography, in the 1960s; however, there is
currently (as of 2004) little supporting evidence for their claims.
• Merkle-Hellman (MH) was one of the earliest public key cryptosystems invented by Ralph Merkle
and Martin Hellman in 1978. Although its ideas are elegant, and far simpler than RSA, it has been
broken. (Merkle-Hellman Knapsacks).
• Stories behind RSA: Steven Levy’s Crypto book
– Encryption:
1. Bob generates y, and generates a key $K = (g^x)^y \bmod p$.
2. Bob encrypts M using the key K and a symmetric key encryption method, such as AES.
3. Bob sends the ciphertext and $g^y \bmod p$ to Alice.
– Decryption:
1. Alice generates $K = (g^y)^x \bmod p$.
2. Alice decrypts the ciphertext using K.
• ElGamal algorithm: this algorithm is similar to the above algorithm, but it does not rely on any
symmetric key encryption scheme.
1. Let $h = g^x \bmod p$.
2. Public key: $(p, g, h)$
3. Private key: $x$
4. Encryption: generate a random $k$, let $c_1 = g^k \bmod p$, $c_2 = m \cdot h^k \bmod p$.
5. Decryption: $c_2 / c_1^x \bmod p$. It should be noted that $c_2 / c_1^x = m \cdot h^k / g^{kx} = m \pmod p$.
5 Digital Signature
5.1 Digital Signature Algorithms
• Motivation of digital signature
– Physical signature
– Properties: Authenticity, unforgeable, not reusable, unalterable, can’t be repudiated.
• RSA Signature Scheme
– Public key (verifying key): (e, n).
– Private key (signing key): (d, n).
– Sign: $M^d \bmod n$.
– Verify: $(M^d)^e \bmod n$. The result should be equal to M if the signature is authentic.
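The sign and verify operations can be checked with a textbook-sized key. The numbers below (n = 61 · 53 = 3233, e = 17, d = 2753) are assumptions chosen here for illustration and offer no security. The sketch signs the message value directly, with no hashing or padding, exactly as in the formulas above.

```python
# Toy RSA key: n = 61 * 53, and e * d = 1 mod lcm-compatible phi(n) = 3120.
n = 61 * 53   # 3233
e = 17        # public (verifying) exponent
d = 2753      # private (signing) exponent; 17 * 2753 = 46801 = 15 * 3120 + 1


def sign(m: int) -> int:
    """Signature: M^d mod n, computed with the private key."""
    return pow(m, d, n)


def verify(m: int, signature: int) -> bool:
    """Check that signature^e mod n gives back the message."""
    return pow(signature, e, n) == m


message = 65
signature = sign(message)
print(verify(message, signature))       # genuine message
print(verify(message + 1, signature))   # altered message
```

Anyone holding (e, n) can run `verify`, but only the holder of d can produce a signature that passes.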
• DSA (Digital Signature Algorithm):
– 1991, Proposed by NIST as a DSS (Digital Signature Standard)
– Criticism from RSA and its supporters
– Developed by NSA
– Royalty-free
– DSA is slower than RSA
• Avoid reusing digital signature: Sign with timestamps.
• Sign the hash. In practice, we only sign the hash of the message, not the message itself, because the
message may be large, and signing is quite slow for large messages.
A Sample X.509
Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number: 7829 (0x1e95)
        Signature Algorithm: md5WithRSAEncryption
        Issuer: C=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting cc,
                OU=Certification Services Division,
                CN=Thawte Server CA/emailAddress=...
        Validity
            Not Before: Jul 9 16:04:02 1998 GMT
            Not After : Jul 9 16:04:02 1999 GMT
        Subject: C=US, ST=Maryland, L=Pasadena, O=Brent Baccala,
                 OU=FreeSoft, CN=www.freesoft.org/emailAddress=...
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:b4:31:98:0a:c4:bc:62:c1:88:aa:dc:b0:c8:bb:
                    33:35:19:d5:0c:64:b9:3d:41:b2:96:fc:f3:31:e1:
                    66:36:d0:8e:56:12:44:ba:75:eb:e8:1c:9c:5b:66:
                    70:33:52:14:c9:ec:4f:91:51:70:39:de:53:85:17:
                    16:94:6e:ee:f4:d5:6f:d5:ca:b3:47:5e:1b:0c:7b:
                    c5:cc:2b:6b:c1:90:c3:16:31:0d:bf:7a:c7:47:77:
                    8f:a0:21:c7:4c:d0:16:65:00:c1:0f:d7:b8:80:e3:
                    d2:75:6b:c1:ea:9e:5c:5c:ea:7d:c1:a1:10:bc:b8:
                    e8:35:1c:9e:27:52:7e:41:8f
                Exponent: 65537 (0x10001)
    Signature Algorithm: md5WithRSAEncryption
        93:5f:8f:5f:c5:af:bf:0a:ab:a5:6d:fb:24:5f:b6:59:5d:9d:
        92:2e:4a:1b:8b:ac:7d:99:17:5d:cd:19:f6:ad:ef:63:2f:92:
        ab:2f:4b:cf:0a:13:90:ee:2c:0e:43:03:be:f6:ea:8e:9c:67:
        d0:a2:40:03:f7:ef:6a:15:09:79:a9:46:ed:b7:16:1b:41:72:
        0d:19:aa:ad:dd:9a:df:ab:97:50:65:f5:5e:85:a6:ef:19:d1:
        5a:de:9d:ea:63:cd:cb:cc:6d:5d:01:85:b5:6d:c8:f3:d9:f7:
        8f:0e:fc:ba:1f:34:e9:96:6e:6c:cf:f2:ef:9b:bf:de:b5:22:
        68:9f
• Key management
JHU Report: In the Diebold code we analyzed, both the keys for the smartcard and the keys used to
encrypt the votes were static entries in the source code. This means that the same keys are used on
every voting device. Thus, an attacker who was able to compromise a single voting device would
have access to the keys for all other voting devices running the same software.
From the CVS logs, we see this particular key has been used without change since December 1998,
when the CVS tree for AccuVote-TS version 3 began, and we assume that the key was in use much
before that. Although Jones reports that the vendor may have been aware of the key management
problems in their code since at least 1997 [16, 17], our findings show that the design flaw was
never addressed.
• Encryption algorithm
JHU Report: A second set of problems has to do with the way that the Diebold code encrypts the
votes and audit logs. The files that hold the votes are encrypted using the Data Encryption Standard
(DES) algorithm in CBC mode. There are problems with the use of both DES and the CBC mode,
as we describe below.
In their response to “allegation 44”, Diebold states that “there are stronger forms of compression
than DES, but the authors’ implication that the keys can be recovered ‘in a short time’ is
deliberately misleading.” We assume that Diebold meant to claim that there are stronger encryption
algorithms available, as DES is not a compression algorithm.
• Integrity
JHU Report: Instead of using such a MAC, the Diebold code uses a non-cryptographic checksum
called a CRC to detect whether a file has been tampered with. This is completely insecure as is
discussed on page 15 of our paper. The use of CRCs instead of MACs has long been documented in
the security literature as a very serious mistake.
In the Diebold system, before being encrypted, a 16-bit cyclic redundancy check (CRC) of the plaintext
data is computed. This CRC is then stored along with the ciphertext in the file and verified
whenever the data is decrypted and read. This process is handled by the ReadRecord and
WriteRecord functions in TSElection/RecordFile.cpp. Since the CRC is an unkeyed, public
function, it does not provide any meaningful integrity protection for the data. In fact, by storing it in
an unencrypted form, the purpose of encrypting the data in the first place (leaking no information
about the contents of the
plaintext) is undermined. Standard industry practice would be to first encrypt the data to be stored
and then to compute a keyed cryptographic checksum (such as HMAC-SHA1) of the ciphertext.
This cryptographic checksum could then be used to detect any tampering with the plaintext. Note
also that each entry has a timestamp, which can be used to detect reordering, although sequence
numbers should also be added to detect record deletion.
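The difference between an unkeyed checksum and a keyed MAC can be shown in a few lines. This sketch is not the Diebold code: it substitutes Python's 32-bit `zlib.crc32` for the 16-bit CRC and HMAC-SHA256 for the HMAC-SHA1 mentioned in the report, and the key and record contents are made up.

```python
import hashlib
import hmac
import zlib


def crc_checksum(data: bytes) -> int:
    """Unkeyed, public function: anyone can compute it for any data."""
    return zlib.crc32(data)


def mac_tag(key: bytes, data: bytes) -> bytes:
    """Keyed checksum: computing a valid tag requires the secret key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key: bytes, data: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mac_tag(key, data), tag)


secret_key = b"hypothetical key known only to the system"
forged_record = b"tampered ciphertext written by an attacker"

# With a CRC, the attacker recomputes the checksum for the forged record,
# so the stored checksum matches and the tampering goes unnoticed.
forged_crc = crc_checksum(forged_record)
crc_accepts_forgery = crc_checksum(forged_record) == forged_crc

# With a MAC, the attacker does not know the key, so the best available
# tag is computed under a guessed key and fails verification.
attacker_tag = mac_tag(b"attacker's guess at the key", forged_record)
mac_accepts_forgery = verify_mac(secret_key, forged_record, attacker_tag)

print(crc_accepts_forgery, mac_accepts_forgery)
```

Computing the tag over the ciphertext, as the report recommends, lets the reader reject a modified file before decrypting anything.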
• Mode
JHU Report: We note that “DES is being used in CBC mode which requires an initialization vector
to ensure its security.” We go on to show that the Diebold code does not provide the necessary
initialization vectors. A detailed explanation of this problem is highly technical; we refer the interested
reader to A Concrete Security Treatment of Symmetric Encryption: Analysis of the DES Modes of
Operation.
Second, DES is being used in CBC mode which requires a random initialization vector to ensure its
security. The implementation here always uses zero for its IV. This is illustrated by the call to
DesCBCEncrypt in TSElection/RecordFile.cpp; since the second to last argument is NULL, DesCBCEncrypt
will use the all-zero IV. To correctly implement CBC mode, a source of strong random numbers must
be used to generate a fresh IV for each encryption. Suitably strong random numbers can be derived
from many different sources, ranging from custom hardware to accumulated observations of user
behavior.
• Random number
JHU Report: While the voter’s identity is not stored with the votes, each vote is given a serial number
... generated by a linear congruential random number generator ... seeded with static information.
Diebold Response: There is no need for “security” here. The only intent of this code is to pseudo-
randomize the order of ballots for purposes of display and reporting, as required in some states.
Jones (Doug Jones from the University of Iowa also responded to Diebold’s response): Diebold is
wrong. There is need for security here. If the sequence of pseudo-random numbers is known, and
the sequence in which voters actually entered the booth has been recorded (as a poll-watcher can
easily do), then we can recover any particular voter’s ballot from the report of individual ballots.
This allows an insider working at election central to check this report (I’d use a pocket camera to take
photos of the report), in cooperation with a poll watcher, to confirm whether the paid voters have
earned their pay by voting the required way. Vote buying schemes that rely on insiders at the vote
count cooperating with poll-watchers date back many years, and therefore, strong randomization
schemes are justified here! I’ve worked as a poll watcher, I know that perfect records are hard to
keep, but I also know that I can correct my records if I can talk a few voters into signing their
ballots with pre-selected write-in votes or funny patterns of yes-no votes on the judicial retention
ballot.
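Jones's point rests on the fact that a linear congruential generator with a static seed is fully reproducible. The sketch below is a generic illustration, not the Diebold generator: the multiplier, increment, modulus, and seed are arbitrary assumptions. Anyone who knows the code and the seed obtains the same "random" serial numbers in the same order.

```python
from itertools import islice
from typing import Iterator


def lcg(seed: int, a: int = 1103515245, c: int = 12345, m: int = 2**31) -> Iterator[int]:
    """Linear congruential generator: state = (a * state + c) mod m."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state


STATIC_SEED = 2004  # hypothetical seed compiled into the software

# Serial numbers as the voting terminal would generate them...
official_serials = list(islice(lcg(STATIC_SEED), 5))
# ...and as an attacker regenerates them from the same code and seed.
attacker_serials = list(islice(lcg(STATIC_SEED), 5))

print(official_serials == attacker_serials)
```

Matching the regenerated sequence against the order in which voters entered the booth is what links a serial number to a particular voter.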
• Smartcards
JHU Report: Upon reviewing the Diebold code, we observed that the smartcards do not perform
any cryptographic operations. This, in and of itself, is an immediate red flag. One of the biggest
advantages of smartcards over classic magnetic-stripe cards is the smartcard’s ability to perform
cryptographic operations internally, and with physically protected keys. Because of a lack of
cryptography, there is no secure authentication of the smartcard to the voting terminal. This means
that nothing prevents an attacker from using his or her own homebrew smartcard in a voting
terminal.
One might naturally wonder how easy it would be for an attacker to make such a homebrew smartcard.
First, we note that user-programmable smartcards and smartcard readers are available commercially
over the Internet in small quantities and at reasonable prices. Second, an attacker who knows the
protocol spoken between voting terminals and legitimate smartcards could easily implement a
homebrew card that speaks the same protocol.
• The process:
1. Each DVD player has a master key, which is unique to the DVD player manufacturer. This
key is called the player key.
2. The player reads an encrypted disk key from the DVD, and uses the player key to decrypt the
disk key. Since there are many player keys out there (each manufacturer has one), the DVD
must contain a copy of the disk key encrypted under each player key.
3. The player reads the encrypted title key for the file to be played. The DVD will likely contain
multiple files, each with its own title key. The player uses the decrypted disk key to decrypt
the title key.
4. The player uses the title key to decrypt the content.
• Encryption algorithm: Content Scrambling System (CSS).
– Its security depends on its secrecy: in 1999, Jon Johansen (with two other anonymous
people) disassembled a software DVD player to uncover the descrambling algorithm. They
then wrote and released a C program called DeCSS.
– The encryption key is only 40 bits long, and not all possible 40-bit numbers can be keys: a
high-end home computer in 1999 running optimized code could brute-force it within 24
hours, and modern computers can now brute-force it in a few seconds or less.
• A software player has its own unlock key. This is where the problem is.
7.4 Mistake in Encryption: Inventing your own encryption algorithm and keeping it secret
• Good encryption algorithms such as DES, AES, and Blowfish took many years to develop by smart
minds who specialize in cryptography, and they were then scrutinized by many other smart minds.
The reason we see these names is that nobody has broken them so far. All the bad ones
have already been eliminated.
• If you ever want to invent your own encryption algorithm for your software within a few days or
even a few months, think about what those algorithms have gone through. If this does not stop you,
then look at the following cases:
• If you still want to try, then you are either a cryptographer who simply wants to develop a better
encryption algorithm, or you are simply crazy.