Computers and Electrical Engineering 31 (2005) 345–360

www.elsevier.com/locate/compeleceng

Hardware implementation analysis of SHA-256 and SHA-512 algorithms on FPGAs ☆
Imtiaz Ahmad *, A. Shoba Das
Department of Computer Engineering, Kuwait University, P.O. Box 5969, Safat 13060, Kuwait

Received 2 May 2005; received in revised form 1 June 2005; accepted 14 July 2005
Available online 18 October 2005

Abstract

Hash functions are common and important cryptographic primitives that are critical for the security services of data integrity assurance and data origin authentication. Field programmable gate arrays (FPGAs), being reconfigurable, flexible and physically secure, are a natural choice for implementing hash functions in a broad range of applications with different area-performance requirements. In this paper, we explore alternative architectures for implementing the hash algorithms of the secure hash standards SHA-256 and SHA-512 on FPGAs and study their area-performance trade-offs. As several 64-bit adders are needed in SHA-512 hash value computation, the new architectures proposed in this paper implement modulo-64 addition as modulo-32, modulo-16 and modulo-8 additions with a view to reducing the chip area. The hash function SHA-512 is implemented in different FPGA families of ALTERA to compare performance metrics such as area, memory, latency, clocking frequency and throughput, and so to guide a designer in selecting the most suitable FPGA for an application. In addition, a common architecture is designed for implementing the SHA-256 and SHA-512 algorithms.
© 2005 Elsevier Ltd. All rights reserved.

☆ This research is funded by Kuwait University Grant EO 04/03.
* Corresponding author. Tel.: +965 4985849; fax: +965 4839461.
E-mail address: [email protected] (I. Ahmad).

0045-7906/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compeleceng.2005.07.001

1. Introduction

Data integrity assurance and data origin authentication are essential security services in financial transactions, electronic commerce, electronic mail, software distribution, data storage and so on. Cryptographic hash functions are used to provide these security services. The purpose of a hash function is to produce a "fingerprint" of a file, message, or other block of data. A hash value h is generated by a function H of the form h = H(M), where M is a variable-length message and H(M) is the fixed-length hash value. In a cryptographic hash function, a message of arbitrary length is padded, broken into blocks and input sequentially to a compression function, which converts a fixed-length input (the current message block) to a fixed-length output (a hash value). The hash values of individual blocks are used iteratively by the compression function to find the final hash value, referred to as the message digest. For practical purposes, a hash function ties each hash value to a single input message and hence represents a longer message in a concise way. Therefore, computing a digital signature over a large document (message) can be replaced by applying cryptographic processing to the document's hash value, which is much smaller than the document [1]. Other popular applications of hash functions include digital signature schemes in public-key cryptosystems recommended in RFC 2437 [19], password storage and verification specified in RFC 2289 [20], and pseudo-random number generation. Hash functions are also the building block of secret-key message authentication codes (MACs) [2] used in two popular security protocols, namely, secure sockets layer (SSL) provided in RFC 2246 [21] and IPSecurity mentioned in RFC 2404 [3].
A cryptographic hash function should be hard to invert, i.e., given a hash value h, it should be computationally infeasible to find some input M such that H(M) = h. It should also be collision-free, i.e., it should be computationally infeasible to find two distinct messages M1 and M2 such that H(M1) = H(M2). Most of the secure hash functions in use today have an iterative structure. The motivation for the iterative structure stems from the fact that the compression function, which generates a hash value for the current message block using the hash value of the preceding block, actually combines two or more inputs to produce an output in which each output bit is a different complex non-linear function of all the input bits. This makes the resultant hash function collision resistant. High performance cryptographic hardware systems typically require an extra module for hash function calculation to reduce the workload of the main microprocessor.
Hash functions which have a "dedicated" design are fast and have a considerable advantage over other algorithms which are based on block ciphers. Dedicated hash functions suitable for both software and hardware implementation have been proposed and are now widely used in real world applications. Some of the most widely used dedicated hash functions are the message-digest algorithm MD5 [4] and the secure hash algorithm SHA-1 [5]. The complexity of the best attack on SHA-1 is $2^{80}$, which no longer matches the security guaranteed by the new secret key encryption standard, AES (Advanced Encryption Standard), which uses one key for both encryption and decryption with key sizes of 128, 192 and 256 bits [6]. Therefore, three new hash functions (SHA-256, SHA-384 and SHA-512), referred to as SHA-2, whose security matches that of AES with best-attack complexities of $2^{128}$, $2^{192}$ and $2^{256}$, respectively, have been announced by the National Institute of Standards and Technology (NIST) [7]. More recently, a new standard, SHA-224, has been announced by NIST [8]. The functional characteristics of the SHA functions differ and are presented in Table 1.

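Table 1
Comparison of functional characteristics of hash functions

| Hash functions | SHA-1 | SHA-224 | SHA-256 | SHA-384 | SHA-512 |
|---|---|---|---|---|---|
| Size of hash value (bits) | 160 | 224 | 256 | 384 | 512 |
| Complexity of the best attack | $2^{80}$ | $2^{112}$ | $2^{128}$ | $2^{192}$ | $2^{256}$ |
| Message size (bits) | $<2^{64}$ | $<2^{64}$ | $<2^{64}$ | $<2^{128}$ | $<2^{128}$ |
| Message block size (bits) | 512 | 512 | 512 | 1024 | 1024 |
| Word size (bits) | 32 | 32 | 32 | 64 | 64 |
| Number of words | 5 | 8 | 8 | 8 | 8 |
| Number of digest rounds | 80 | 64 | 64 | 80 | 80 |
| Number of constants | 4 | 64 | 64 | 80 | 80 |
| Round-dependent operations | $f_t$ | None | None | None | None |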

A hardware implementation of a cryptographic hash function is inherently more physically secure, as it is physically separate from the main processor, and it has higher performance than a software implementation. Moreover, reconfigurable hardware devices such as field programmable gate arrays (FPGAs) are best suited for implementing cryptographic hash functions as they are flexible and easily upgradeable. In the implementation of a hash function on FPGAs, area and performance are two of the most important design criteria. Hash functions have a broad range of applications and hence their area-performance requirements may differ between applications. In some applications, such as smart cards, area is of concern, whereas in storage area networks (SANs) and virtual private networks (VPNs) performance is critical. Some other applications, such as digital video recorders, require optimization of the performance/area ratio. Therefore, different architectures can be used for SHA function implementation, and it is necessary to evaluate alternative architectures on the basis of their area-performance characteristics.
In cryptographic hash functions, a common sequence of operations is called a digest round, and the compression function produces a hash value by subjecting a block of the message to several digest rounds. The number of digest rounds differs among the SHA functions, as shown in Table 1. In many applications, an improvement in the performance of these basic cryptographic primitives is directly reflected in an improvement in overall system performance. Among the operations of a digest round in SHA-2 functions, the addition of several operands occupies a major share of the chip area when implemented in FPGAs. Multi-operand addition also dictates the critical path delay in the computation of the hash value [17]. Hence, we focused on the design of adders while proposing new architectures. This paper deals with three issues, namely, proposing different architectures for implementing a hash function on an FPGA, comparing the performance metrics of different FPGAs that implement a SHA-2 function, and single chip implementation of SHA-2 family hash functions. As the performance metrics of FPGAs of different families, even from the same manufacturer, are not identical, an evaluation of FPGAs on the basis of performance metrics helps in selecting an appropriate FPGA to suit an application. Moreover, since the hash functions of the SHA-2 family have identical operations in a digest round, we were motivated to design a common architecture for these functions.
The remainder of the paper is organized as follows. An overview of previous work is given in Section 2. The prelude in Section 3 discusses the SHA-256 and SHA-512 algorithms. In Section 4, the philosophy behind the reduced word length implementation of the SHA-512 function and the single chip implementation of SHA-256 and SHA-512 are explained, and in Section 5 the implementation is detailed. Results are discussed in Section 6 and the paper is concluded in Section 7.

2. Previous work

Many studies have been done on the implementation of cryptographic hash functions [9–17]. Bosselaers et al. [9,10] reported software performance evaluations of the message-digest algorithm MD5 and the secure hash standard algorithm SHA-1 on a Pentium processor. Nakajima and Matsui [11] reported a software performance analysis of the newly proposed hash function SHA-512 on a Pentium III processor. Hash function applications demand hardware implementations to meet the performance requirements of high-speed networks. Dominikus [12] reported an FPGA implementation of the MD5 hash algorithm. McLoone and McCanny proposed a single-chip FPGA solution for SHA-384 and SHA-512 [13]. Kang et al. [14] reported the implementation of MD5 and SHA-1 on an Altera FPGA. Grembowski et al. [15] recently reported a comparative analysis of the hardware implementation of SHA-1 and the newly proposed hash function SHA-512 on a Xilinx Virtex FPGA. A common architecture for implementing the SHA-2 family is reported in [16]. An elegant application specific integrated circuit (ASIC) implementation of SHA-512 that makes use of delay balancing and pipelining was recently reported by Dadda and Macchetti [17].
To the best of our knowledge, a study of area-performance metrics in the FPGA implementation of SHA-2 functions to suit different applications has not been done so far. In a digest round of SHA-512, several 64-bit operands are added and logic operations are performed on them. This paper explores alternative architectures for SHA-512 implementation on an FPGA using 8, 16 and 32-bit adder/logic circuits and compares their area-performance trade-offs. Performance metrics such as area, memory, latency, clocking frequency and throughput of FPGAs of different ALTERA families implementing SHA-512 are evaluated in this paper. Finally, a single chip implementation of SHA-256 and SHA-512 with 32-bit adders is also presented.

3. Prelude

In this section, the SHA-256 and SHA-512 algorithms are discussed in detail. When a message of any length $< 2^{64}$ bits (for SHA-256) or $< 2^{128}$ bits (for SHA-512) is input, the hash functions SHA-256 and SHA-512 compute a condensed representation of the message, referred to as the message digest. The message digests generated by SHA-256 and SHA-512 are 256 and 512 bits long, respectively. The algorithm for generating the message digest is identical for SHA-256 and SHA-512; only the constants and functions used differ. Hence, in this section SHA-256 and SHA-512 are discussed together. The procedure consists of two stages, namely, preprocessing and hash computation. In the preprocessing stage, the message is padded and parsed into m-bit blocks, and the initialization values to be used in the hash computation are set. A Message Scheduler (MS) divides the m-bit block into 16 words and prepares a message schedule by passing one word at a time. A series of hash values is generated iteratively from functions, constants, and word operations, and the final hash value is the message digest. The message digest generation technique is shown in Fig. 1.

Fig. 1. Message digest generation. [Block diagram: Message, Padder, Message Scheduler ($W_t$), ROM ($K_t$), hash constants ($H_0^i$ to $H_7^i$), Iterative Processing Unit, Modulo Adder ($H_0^{i+1}$ to $H_7^{i+1}$), Message Digest.]

The operations performed in the two stages are listed below:
Preprocessing:

• Padding the message so that its length is a multiple of 512 or 1024 bits.

• Parsing the padded message into N message blocks $B^1, B^2, \ldots, B^N$, where the block size is 512 or 1024 bits.
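The two preprocessing operations can be sketched in software as follows. This is a minimal sketch, not the authors' hardware padder: the text above does not spell out the padding rule, so the sketch assumes the rule of the secure hash standard (a single 1 bit, then zero bits, then the message length in bits as a 64-bit or 128-bit big-endian field), and the function name `pad_and_parse` is hypothetical.

```python
def pad_and_parse(message: bytes, block_bytes: int = 64) -> list:
    """Pad a byte string and split it into fixed-size blocks.

    block_bytes = 64  -> 512-bit blocks (SHA-256)
    block_bytes = 128 -> 1024-bit blocks (SHA-512)
    Assumption: padding follows the secure hash standard, with a length
    field of block_bytes // 8 bytes (8 for SHA-256, 16 for SHA-512).
    """
    length_bytes = block_bytes // 8
    padded = message + b"\x80"                      # the single '1' bit
    # Zero bytes so that only the length field is missing from the last block.
    padded += b"\x00" * ((block_bytes - length_bytes - len(padded)) % block_bytes)
    padded += (8 * len(message)).to_bytes(length_bytes, "big")
    # Parse into N blocks of block_bytes each.
    return [padded[i:i + block_bytes] for i in range(0, len(padded), block_bytes)]


blocks_256 = pad_and_parse(b"abc", 64)    # one 512-bit block
blocks_512 = pad_and_parse(b"abc", 128)   # one 1024-bit block
```

A 56-byte message no longer fits in one 512-bit block together with its length field, so it parses into two blocks.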

Hash computation:

• Each message block $B^i$ is processed in order. A word (32 bits or 64 bits wide) of a message block $B^i$ is referred to as $B^i_t$, and each block contains 16 such words.
• For each message block i in the range 1 to N, starting from the message schedule $W_t$, the following steps (1–4) are repeated to compute the hash values $H_0^i$ to $H_7^i$ for the ith block.

Step 1: $W_t$ is computed by an identical procedure for SHA-256 and SHA-512; only the logic functions $\sigma_0$ and $\sigma_1$ are different.

SHA-256:
$$
W_t = \begin{cases}
B^i_t, & 0 \le t \le 15 \\
\sigma_1^{256}(W_{t-2}) + W_{t-7} + \sigma_0^{256}(W_{t-15}) + W_{t-16}, & 16 \le t \le 63
\end{cases}
$$
where
$$
\sigma_1^{256}(x) = \mathrm{ROTR}^{17}(x) \oplus \mathrm{ROTR}^{19}(x) \oplus \mathrm{SHR}^{10}(x)
$$
$$
\sigma_0^{256}(x) = \mathrm{ROTR}^{7}(x) \oplus \mathrm{ROTR}^{18}(x) \oplus \mathrm{SHR}^{3}(x)
$$

SHA-512:
$$
W_t = \begin{cases}
B^i_t, & 0 \le t \le 15 \\
\sigma_1^{512}(W_{t-2}) + W_{t-7} + \sigma_0^{512}(W_{t-15}) + W_{t-16}, & 16 \le t \le 79
\end{cases}
$$
where
$$
\sigma_1^{512}(x) = \mathrm{ROTR}^{19}(x) \oplus \mathrm{ROTR}^{61}(x) \oplus \mathrm{SHR}^{6}(x)
$$
$$
\sigma_0^{512}(x) = \mathrm{ROTR}^{1}(x) \oplus \mathrm{ROTR}^{8}(x) \oplus \mathrm{SHR}^{7}(x)
$$

$\mathrm{ROTR}^n(x)$ is a circular rotation of a variable x by n positions to the right and $\mathrm{SHR}^n(x)$ is a shift of a variable x by n positions to the right.
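The message schedule of Step 1 can be sketched in software as follows. This is an illustrative model under stated assumptions, not the register-based message scheduler of the paper: the names `rotr`, `small_sigma`, `message_schedule` and the `PARAMS` table are hypothetical, and the word size (32 or 64 bits) selects the rotation amounts and the number of schedule words given in the equations above.

```python
# Rotation/shift amounts (ROTR, ROTR, SHR) for sigma0 and sigma1,
# keyed by word size: 32 bits (SHA-256) or 64 bits (SHA-512).
PARAMS = {
    32: {"sigma0": (7, 18, 3), "sigma1": (17, 19, 10), "rounds": 64},
    64: {"sigma0": (1, 8, 7), "sigma1": (19, 61, 6), "rounds": 80},
}


def rotr(x, n, bits):
    """Circular rotation of a bits-wide word x by n positions to the right."""
    return ((x >> n) | (x << (bits - n))) & ((1 << bits) - 1)


def small_sigma(x, amounts, bits):
    """sigma(x) = ROTR^r1(x) xor ROTR^r2(x) xor SHR^s(x)."""
    r1, r2, s = amounts
    return rotr(x, r1, bits) ^ rotr(x, r2, bits) ^ (x >> s)


def message_schedule(block_words, bits=32):
    """Expand the 16 words of one block into the full schedule W_t."""
    p = PARAMS[bits]
    mask = (1 << bits) - 1
    w = list(block_words)                      # W_t = B_t for 0 <= t <= 15
    for t in range(16, p["rounds"]):
        w.append((small_sigma(w[t - 2], p["sigma1"], bits) + w[t - 7]
                  + small_sigma(w[t - 15], p["sigma0"], bits) + w[t - 16]) & mask)
    return w


# The single padded block of the message "abc": first word 0x61626380,
# last word 0x18 (the message length in bits).
w = message_schedule([0x61626380] + [0] * 14 + [0x18], bits=32)
```

Each new word is a four-operand addition, which is the adder structure discussed in Section 4.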
The block diagram of SHA-256/SHA-512 algorithm is shown in Fig. 2.
Step 2: The hash values $H_0^{i-1}$ to $H_7^{i-1}$ are assigned to the variables a, b, c, d, e, f, g, h. The eight initial hash values, which are 32 or 64 bits wide, are shown in Table 2.

Fig. 2. Block diagram of SHA-256/SHA-512. [Message scheduler: sixteen registers (0 to 15) loaded from the padded message, with $\sigma_0$ and $\sigma_1$ producing $W_t$. Iterative processing unit: registers a to h with $\Sigma_0$, $\Sigma_1$, Maj(a, b, c), Ch(e, f, g), $K_t$, $T_1$ and $T_1 + T_2$.]

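Table 2
Initial hash values of SHA-256 and SHA-512

| Hash value | SHA-256 | SHA-512 |
|---|---|---|
| $H_0^0 \rightarrow a$ | 6a09e667 | 6a09e667 f3bcc908 |
| $H_1^0 \rightarrow b$ | bb67ae85 | bb67ae85 84caa73b |
| $H_2^0 \rightarrow c$ | 3c6ef372 | 3c6ef372 fe94f82b |
| $H_3^0 \rightarrow d$ | a54ff53a | a54ff53a 5f1d36f1 |
| $H_4^0 \rightarrow e$ | 510e527f | 510e527f ade682d1 |
| $H_5^0 \rightarrow f$ | 9b05688c | 9b05688c 2b3e6c1f |
| $H_6^0 \rightarrow g$ | 1f83d9ab | 1f83d9ab fb41bd6b |
| $H_7^0 \rightarrow h$ | 5be0cd19 | 5be0cd19 137e2179 |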

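• A sequence of 64 constant 32-bit words, $K_t^{256}$, or 80 constant 64-bit words, $K_t^{512}$, is used by the processing unit.
• The processing unit uses four logical functions, Ch, Maj, $\Sigma_0$ and $\Sigma_1$. The logic functions Ch and Maj are identical for SHA-256 and SHA-512.

$$
\mathrm{Ch}(x, y, z) = (x \wedge y) \oplus (\neg x \wedge z)
$$
$$
\mathrm{Maj}(x, y, z) = (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z)
$$
SHA-256:
$$
\Sigma_0(x) = \mathrm{ROTR}^{2}(x) \oplus \mathrm{ROTR}^{13}(x) \oplus \mathrm{ROTR}^{22}(x)
$$
$$
\Sigma_1(x) = \mathrm{ROTR}^{6}(x) \oplus \mathrm{ROTR}^{11}(x) \oplus \mathrm{ROTR}^{25}(x)
$$
SHA-512:
$$
\Sigma_0(x) = \mathrm{ROTR}^{28}(x) \oplus \mathrm{ROTR}^{34}(x) \oplus \mathrm{ROTR}^{39}(x)
$$
$$
\Sigma_1(x) = \mathrm{ROTR}^{14}(x) \oplus \mathrm{ROTR}^{18}(x) \oplus \mathrm{ROTR}^{41}(x)
$$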
Step 3: The processing unit performs this step 64 or 80 times on a 512-bit or 1024-bit block.
$$
\begin{aligned}
T_1 &= h + \Sigma_1(e) + \mathrm{Ch}(e, f, g) + K_t + W_t \\
T_2 &= \Sigma_0(a) + \mathrm{Maj}(a, b, c) \\
h &= g \\
g &= f \\
f &= e \\
e &= d + T_1 \\
d &= c \\
c &= b \\
b &= a \\
a &= T_1 + T_2
\end{aligned}
$$
The variables used in the above equations refer to the respective values for SHA-256 and SHA-512.
Step 4: The ith intermediate hash values $H_0^i$ to $H_7^i$ are computed by modulo-32 or modulo-64 bit adders after the iterations.

$$
\begin{aligned}
H_0^i &= a + H_0^{i-1}, & H_1^i &= b + H_1^{i-1}, & H_2^i &= c + H_2^{i-1}, & H_3^i &= d + H_3^{i-1}, \\
H_4^i &= e + H_4^{i-1}, & H_5^i &= f + H_5^{i-1}, & H_6^i &= g + H_6^{i-1}, & H_7^i &= h + H_7^{i-1}
\end{aligned}
$$

• The message digest is computed as $H_0^N \| H_1^N \| H_2^N \| H_3^N \| H_4^N \| H_5^N \| H_6^N \| H_7^N$ after processing all the N blocks in the message.
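Steps 1 to 4 can be put together in a short software model of SHA-512. This is a minimal sketch under stated assumptions, not the authors' hardware design. The 80 constants $K_t$ are not tabulated in this section, so the sketch derives them, together with the initial hash values of Table 2, in the way the secure hash standard defines them: the first 64 bits of the fractional parts of the cube roots of the first 80 primes and of the square roots of the first 8 primes. The padding rule is likewise taken from the standard, and all function names are hypothetical. The result is compared with Python's `hashlib` as a reference.

```python
import hashlib
from math import isqrt

MASK = (1 << 64) - 1


def first_primes(count):
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found


def icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(80)
H_INIT = [isqrt(p << 128) & MASK for p in PRIMES[:8]]   # Table 2, SHA-512 column
K = [icbrt(p << 192) & MASK for p in PRIMES]            # round constants K_t


def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK


def compress(state, block):
    # Step 1: message schedule.
    w = [int.from_bytes(block[8 * t:8 * t + 8], "big") for t in range(16)]
    for t in range(16, 80):
        s0 = rotr(w[t - 15], 1) ^ rotr(w[t - 15], 8) ^ (w[t - 15] >> 7)
        s1 = rotr(w[t - 2], 19) ^ rotr(w[t - 2], 61) ^ (w[t - 2] >> 6)
        w.append((s1 + w[t - 7] + s0 + w[t - 16]) & MASK)
    # Step 2: load the working variables.
    a, b, c, d, e, f, g, h = state
    # Step 3: 80 digest rounds.
    for t in range(80):
        ch = (e & f) ^ (~e & g)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t1 = (h + (rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)) + ch + K[t] + w[t]) & MASK
        t2 = ((rotr(a, 28) ^ rotr(a, 34) ^ rotr(a, 39)) + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    # Step 4: add the previous hash value, modulo 2^64.
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, h), state)]


def sha512(message):
    padded = message + b"\x80"
    padded += b"\x00" * ((112 - len(padded)) % 128)
    padded += (8 * len(message)).to_bytes(16, "big")
    state = H_INIT
    for i in range(0, len(padded), 128):
        state = compress(state, padded[i:i + 128])
    return b"".join(x.to_bytes(8, "big") for x in state)


matches_reference = sha512(b"abc") == hashlib.sha512(b"abc").digest()
```

In this model every `+` is a full 64-bit addition; the architectures of Section 4 replace these with narrower adders used several times per word.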

4. Reduced word length implementations

FPGAs are best suited for implementation of cryptographic hash functions as they meet the speed requirements and are reconfigurable. It is clear from Section 3 that the message schedule $W_t = \sigma_1^{256}(W_{t-2}) + W_{t-7} + \sigma_0^{256}(W_{t-15}) + W_{t-16}$ requires four operand addition, and the intermediate value $a = T_1 + T_2$, where $T_1 = h + \Sigma_1(e) + \mathrm{Ch}(e, f, g) + K_t + W_t$ and $T_2 = \Sigma_0(a) + \mathrm{Maj}(a, b, c)$, requires six operand addition. The ith intermediate hash values $H_0^i$ to $H_7^i$ are computed by modulo-64 bit adders after the iterations and hence, an additional eight modulo-64 adders are required to find the final hash value for the block. The multi-operand addition is the most problematic part in the implementation of hash functions. Hence, reducing the size of these adders will reduce the number of logic elements required in the FPGA, thereby reducing the overall area of the final circuit.
The implementation of multi-operand 64-bit adders on FPGAs demands the selection of a proper scheme for performing the addition, as both speed and area are of concern. It is well known that carry look-ahead adders (CLAs) are faster than conventional carry propagate adders, but carry save adders (CSAs), referred to as redundant adders, are faster and have a smaller area than a CLA. For an n-bit adder with each module handling m bits, the delay is proportional to $\log_m n$ for a one-level CLA, whereas redundant adders have a constant delay [18]. Assuming a complexity of $k_m$ for implementing a 1 bit module, the area of a CLA has a complexity proportional to $k_m n$ and a redundant adder has an area proportional to n. In a CSA, a full adder array (FAA) is used to add three binary vectors without propagating the carries, and two binary vectors, the pseudo-sum and the carry, are generated. As the carry output of the ith full adder has a weight of i + 1, the carry vector has no contribution at bit 0, i.e., $vc_0 = 0$. Hence, the carry-in ($c_{in}$) can be placed in the position of $vc_0$ as shown in Fig. 3.
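The behaviour of a carry save adder described above can be sketched bit-parallel in software. This is an illustrative model only; the function name `carry_save_add` and its argument names are hypothetical.

```python
def carry_save_add(x, y, z, n, cin=0):
    """Add three n-bit vectors without propagating carries.

    Returns (vs, vc, cout): the pseudo-sum, the pseudo-carry with the
    carry-in placed at bit 0, and the carry out of the top full adder.
    """
    mask = (1 << n) - 1
    vs = (x ^ y ^ z) & mask                    # sum output of each full adder
    carries = (x & y) | (x & z) | (y & z)      # carry output of each full adder
    vc = ((carries << 1) | cin) & mask         # carry of bit i has weight i + 1
    cout = (carries >> (n - 1)) & 1            # carry of the top bit
    return vs, vc, cout


# Lower nibbles of the operands used in Fig. 4: 0010, 1001, 1010.
vs, vc, cout = carry_save_add(0b0010, 0b1001, 0b1010, n=4)
# A single carry-propagate addition of vs and vc then gives the result.
total = (vs + vc) & 0xF
```

Only the last stage, which adds the pseudo-sum and the pseudo-carry, needs a carry-propagating adder such as a CLA.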
Several schemes exist for the implementation of multi-operand addition. Using a network of full adders, p operands, each n bits wide, can be added using an array of [p:2] adders, referred to as reduction by rows, or using an array of (p:q] counters, referred to as reduction by columns. The arrays can be linear or tree arrays, and the same number of adders is used by both schemes. The reduction by rows technique involves an array of full adders and a CLA in the last stage to add the final pseudo-sum and pseudo-carry. For large n, the number of groups in a one-level CLA will be large, resulting in slow operation. If multiple levels are used, the maximum number of levels is $L = \log_m n$, the number of modules $N_{max}$ with the maximum number of levels is $(n - 1)/(m - 1)$, and the delay is proportional to $2 \log_m n$ [18]. It can be seen that the selection of the group size m is an important factor, which in turn affects the delay and the number of modules. An optimum size for m is 4, which gives rise to 21 modules for the 3-level implementation of a 64-bit adder.

Fig. 3. Carry save adder. [Inputs X, Y, Z (n bits each) and $c_{in}$; outputs vs and vc; bit-level detail of one full adder (FA) with inputs $x_0$, $y_0$, $z_0$ and outputs $vs_0$, $vc_1$, with $c_{in}$ taking the place of $vc_0$.]
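As a worked check of the figures quoted above, substituting n = 64 and m = 4 into the expressions for the number of levels and modules gives the following. The 16-bit case is added here only for comparison and is not a figure taken from the paper.

```latex
\begin{aligned}
n = 64,\ m = 4:&\quad L = \log_4 64 = 3, \qquad
N_{max} = \frac{64 - 1}{4 - 1} = 21, \qquad \text{delay} \propto 2 \log_4 64 = 6 \\
n = 16,\ m = 4:&\quad L = \log_4 16 = 2, \qquad
N_{max} = \frac{16 - 1}{4 - 1} = 5, \qquad \text{delay} \propto 2 \log_4 16 = 4
\end{aligned}
```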
Reducing the word length of the operands input to the adders from 64 bits to smaller sizes, namely 32, 16 and 8 bits, reduces the value of n, which in turn reduces L, $N_{max}$ and the delay. The size of the adders and the overall size of the hash function circuit are thereby reduced. If a 32-bit adder is used instead of a 64-bit adder, the area of the CSA becomes proportional to n = 32. The addition of three 8-bit operands, P, Q and R, using a 4-bit CSA is illustrated in Fig. 4. The lower nibbles of the operands are added first, and the same adder is then used to add the higher nibbles.
The carry bit $vc_4 = 1$ generated while adding bits (3-0) of the operands in the 4-bit CSA is stored in a flip-flop and $vc_0$ is assigned the value '0'. The pseudo-sum $vs_{3-0}$ and the pseudo-carry $vc_{3-0}$ are added in a 4-bit CLA to get the final sum $VS_{3-0} = 0101$. The higher nibbles of the operands are added by the same CSA; the carry produced, $vc_4 = 0$, is ignored and $vc_0$ is assigned the value '1', which is the carry from the lower nibble addition saved in the flip-flop. In the same 4-bit CLA, the pseudo-sum $vs_{3-0}$ and the pseudo-carry $vc_{3-0}$ are added to get the final sum $VS_{7-4} = 0110$. The same logic is used for the carry generated from the lower nibble addition in the CLA, and the modulo-8 sum is 01100101.
Moreover, the logic functions Ch(e, f, g) and Maj(a, b, c) are such that they can be performed on reduced size operands. Designs of SHA-512 with 64/32/16/8 bit adders and logic circuits will be referred to as SHA(64)-512, SHA(32)-512, SHA(16)-512 and SHA(8)-512, respectively, in the rest of the paper.

Let P = 22, Q = 79, R = CA (hexadecimal)

| | Bits (3-0) | Bits (7-4) |
|---|---|---|
| X3-0 | 0010 | 0010 |
| Y3-0 | 1001 | 0111 |
| Z3-0 | 1010 | 1100 |
| vs3-0 | 0001 | 1001 |
| vc4-1 | 1010 | 0110 |
| vc4 | 1 | ignored |

Fig. 4. Addition of 8-bit operands using 4-bit CSA.
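The slice-by-slice addition illustrated in Fig. 4 can be sketched as follows. This is a software model under stated assumptions, not the authors' circuit: one narrow CSA and one narrow carry-propagate adder are reused for every slice, and two variables stand in for the flip-flops that hold the carries between slices. The function name `sliced_add3` is hypothetical.

```python
def sliced_add3(x, y, z, total_bits, slice_bits):
    """Add three total_bits-wide operands using slice_bits-wide adders.

    The result is taken modulo 2**total_bits: the carries out of the
    top slice are ignored, as in the hash computation.
    """
    mask = (1 << slice_bits) - 1
    csa_carry = 0      # flip-flop: CSA carry-out (vc of the top bit) of the previous slice
    cla_carry = 0      # flip-flop: carry-out of the final adder for the previous slice
    result = 0
    for shift in range(0, total_bits, slice_bits):      # lowest slice first
        xs, ys, zs = ((v >> shift) & mask for v in (x, y, z))
        vs = xs ^ ys ^ zs                                # pseudo-sum
        maj = (xs & ys) | (xs & zs) | (ys & zs)          # full adder carries
        vc = ((maj << 1) | csa_carry) & mask             # saved carry enters at vc0
        csa_carry = maj >> (slice_bits - 1)
        total = vs + vc + cla_carry                      # models the narrow CLA
        cla_carry = total >> slice_bits
        result |= (total & mask) << shift
    return result


# The example of Fig. 4: P = 22, Q = 79, R = CA (hexadecimal), 4-bit slices.
fig4_sum = sliced_add3(0x22, 0x79, 0xCA, total_bits=8, slice_bits=4)
```

The same function with `total_bits=64` and `slice_bits` of 32, 16 or 8 models the additions of the reduced word length designs, at the cost of 2, 4 or 8 passes per word.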

5. Implementation of hash functions with scaled down adders and logic circuits

In the block diagram of SHA(64)-512, which is shown in Fig. 2 in Section 3, the message scheduler is implemented with sixteen 64-bit registers, and in the Iterative Processing Unit (IPU), a to h are 64-bit registers. The additions of operands in the message scheduler and the processing unit are implemented using a network of 64-bit CSAs, and the reduction is done by rows. The final pseudo-sum and carry vectors are added using a 64-bit CLA.

5.1. SHA-512 implementation with 32 bit adders

The message scheduler (MS) and the iterative processing unit (IPU) implemented with 32-bit CSAs are shown in Figs. 5 and 6, respectively. The sixteen registers in the MS and the eight registers in the IPU are 64-bit registers, but each is split and used as two 32-bit registers. The suffixes U and L refer, respectively, to bit vectors (63-32) and (31-0). A selector unit SEL is used to select the U or L register, and DSEL performs the opposite function of SEL. The SEL unit consists of two tristate buffers, which are driven by the select lines S0 and S1. The select signal S1 is generated by inverting S0. A positive edge triggered T flip-flop generates S0. The data is always shifted from one half of a register to the corresponding half of the next register in the path. The transfer of data between the lower halves of registers takes place when S0 is asserted, and between the upper halves when S1 = 1. The shift logic functions $\sigma_0$ and $\sigma_1$ in the MS need both halves of a 64-bit operand together; therefore, the two halves of the operands are latched into these blocks at the negative edge of S1 and the shift logic is performed by combinational circuits. The same technique is used for $\Sigma_0$ and $\Sigma_1$ in the IPU.
The logic functions Ch(e, f, g) and Maj(a, b, c) are such that they can be performed on the two halves of the operands independently; hence, only 32-bit circuits are used for this logic. This in turn reduces the size of the overall circuit.
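The difference between the two kinds of logic can be checked with a short sketch. Ch and Maj are bitwise functions, so computing them on the upper and lower halves separately gives the same result as the 64-bit computation, whereas a rotation moves bits across the boundary between the halves. This is an illustrative check only; the helper names are hypothetical and the operand values are the initial SHA-512 values of e, f and g from Table 2.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def ch(x, y, z, mask):
    return ((x & y) ^ (~x & z)) & mask


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def rotr(x, n, bits):
    return ((x >> n) | (x << (bits - n))) & ((1 << bits) - 1)


def split(v):
    """Return the (U, L) halves: bits (63-32) and bits (31-0)."""
    return v >> 32, v & MASK32


e, f, g = 0x510e527fade682d1, 0x9b05688c2b3e6c1f, 0x1f83d9abfb41bd6b
(eU, eL), (fU, fL), (gU, gL) = split(e), split(f), split(g)

# Bitwise functions: the halves can be processed one after the other.
ch_full = ch(e, f, g, MASK64)
ch_halves = (ch(eU, fU, gU, MASK32) << 32) | ch(eL, fL, gL, MASK32)
maj_full = maj(e, f, g)
maj_halves = (maj(eU, fU, gU) << 32) | maj(eL, fL, gL)

# Rotation: rotating each half on its own is not a 64-bit rotation.
rot_full = rotr(e, 14, 64)
rot_halves = (rotr(eU, 14, 32) << 32) | rotr(eL, 14, 32)
```

This is why the rotation and shift blocks need both halves latched together, while a single 32-bit Ch or Maj circuit can serve both halves.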

Fig. 5. SHA(32)-512: message scheduler. [Sixteen registers split into halves 0L to 15L and 0U to 15U, loaded from the padded message, with SEL and DSEL units, $\sigma_0$ and $\sigma_1$ blocks, two FAAs and a CLA.]

Fig. 6. SHA(32)-512: iterative processing unit. [Registers aU to hU and aL to hL with SEL and DSEL units, $\Sigma_0$ and $\Sigma_1$ blocks, Ch(e, f, g), Maj(a, b, c), the $K_t$ input, FAAs and two CLAs.]

The addition of three 64-bit operands using a 32-bit FAA is shown in Fig. 7. The SEL circuits are included to demonstrate the selection of one half of the operands, and the output is two 32-bit vectors, namely, Sum and Carry. The bit $vc_{32}$, which is shown as $c_{out}$ in Fig. 7, is stored in a D flip-flop when S0 is asserted, as this is the carry generated from the addition of bits (31-0) of the operands. Bit $vc_0 = 0$ when S0 is asserted and $vc_0 = c_{out}$ of the lower half addition when S1 is asserted. The carry from the higher half addition is ignored as hash functions require modulo-64 addition. The 64-bit constant $K_t$ is input 32 bits at a time from a ROM. In order to store the eighty constants, a (160 × 32) ROM is used. Message digest computation is done by 32-bit CLAs, which add the hash values of the preceding iteration to the contents of registers a to h; hence, eight CLAs are used to perform this addition. The carries from the lower order words are stored and used as $c_{in}$ while the higher order words are added, as shown in Fig. 7 for FAAs.

5.2. SHA-512 implementation with 16 and 8 bit adders

The design methodology of the SHA-512 function using 16-bit or 8-bit adders and logic circuits is identical to that of the 32-bit version. The 64-bit registers are split to suit the respective word lengths. The selection is done by a counter-decoder (2 × 4) circuit for the 16-bit version, and a (3 × 8) decoder is used for the 8-bit version. Accordingly, the sizes of the SEL and DSEL circuits, FAAs, Ch and Maj functions and ROM are also changed to operate on 16-bit or 8-bit operands.

Fig. 7. 32-bit FAA. [Operand halves X, Y, Z (31-0) and (63-32) selected by S0 and S1 into a 32-bit adder (FAA) producing Sum ($vs_{31-0}$) and Carry ($vc_{31-1}$, $vc_0$); $c_{out}$ is stored in a D flip-flop clocked by Clk and fed back as $vc_0$ when S1 is asserted, with '0' selected when S0 is asserted.]

5.3. Single chip implementation of SHA-512 and SHA-256

Implementing SHA-512 using 32-bit adders and logic circuits facilitates the implementation of SHA-256 on the same chip, as SHA-256 operates on 32-bit operands. The algorithm is identical for the SHA-512 and SHA-256 functions, and the user can select the algorithm by asserting an input line. One initial hash value and one $K_t$ value are shown in Table 3. It is clear from the table that the initial hash value and the constant $K_t$ of SHA-256 are exactly the upper halves (the most significant 32 bits) of those of SHA-512. Therefore, the ROM is organized as two banks of 80 words, with each bank handling one half of the constants. Combinational logic selects the appropriate bank depending on the algorithm.
The ROM banks and their associated logic are shown in Fig. 8. An 80 × 32 ROM bank (KH32) stores the $K_t$ constants of SHA-256, which are also the higher words of the SHA-512 constants, and another ROM bank (KL32) of the same capacity stores the lower halves of the SHA-512 constants. For SHA-256, the contents of KH32 have to be selected at every clock pulse, whereas for SHA-512, where the same words are the higher halves of the constants, they have to be selected at alternate clock pulses when passed on to a 32-bit adder. The associated logic shown in Fig. 8 selects the contents of KH32 either at every clock when computing SHA-256 or at alternate clocks when computing SHA-512.
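The bank selection can be sketched behaviourally as follows. This is a hypothetical model, not the logic of Fig. 8: the function name `rom_word` is illustrative, the lower half is assumed to be read first for SHA-512 (in line with the lower halves being processed when S0 is asserted), and only two constants are stored. $K_0$ is the value shown in Table 3; $K_1$ is quoted from the secure hash standard.

```python
# 64-bit SHA-512 constants; K0 is from Table 3, K1 from the secure hash standard.
K512 = [0x428a2f98d728ae22, 0x7137449123ef65cd]

KH32 = [k >> 32 for k in K512]           # upper halves, equal to the SHA-256 constants
KL32 = [k & 0xFFFFFFFF for k in K512]    # lower halves, used by SHA-512 only


def rom_word(clock, sha512):
    """Return the 32-bit word presented to the adder at a given clock.

    SHA-256: KH32 is read at every clock.
    SHA-512: each constant takes two clocks; the lower half is assumed
    to be read first, so KH32 is read at alternate clocks.
    """
    if not sha512:
        return KH32[clock]
    t, upper = divmod(clock, 2)
    return KH32[t] if upper else KL32[t]


sha256_stream = [rom_word(c, sha512=False) for c in range(2)]
sha512_stream = [rom_word(c, sha512=True) for c in range(4)]
```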
The logic functions $\sigma_0$, $\sigma_1$, $\Sigma_0$ and $\Sigma_1$ involve rotation and shifting and are different for the two functions, so separate logic circuits are designed for SHA-256 and SHA-512 to handle the respective functions.

Table 3
Relationship between constants of SHA-256 and SHA-512

| Constant | SHA-256 | SHA-512 |
|---|---|---|
| $H_0^0 \rightarrow a$ | 6a09e667 | 6a09e667 f3bcc908 |
| $K_0$ | 428a2f98 | 428a2f98 d728ae22 |

Fig. 8. ROM banks with selection logic. [Two 80 × 32 ROM banks addressed through HAdr6-0 from Adr7-1 or Adr6-0, with select signals S512 and Adr0, producing the 32-bit output $K_t$.]

6. Experimental results

The SHA-512 and SHA-256 algorithms were designed and tested using a comprehensive design software package, Altera Quartus II, version 4.0. Altera is reported to be the programmable logic performance leader across all platforms (http://www.altera.com/products/devices/performance/per-index.html) and provides a complete multi-platform design environment to suit specific design needs. The designs were analyzed and synthesized using Verilog HDL and VHDL, and placed and routed in Altera devices of the APEX II, Stratix and Mercury FPGA families. Five performance metrics, namely the area (α), memory (μ), latency (λ), clocking frequency (f) and throughput (δ), were computed. APEX II FPGAs have up to 67,200 logic elements (LEs) and 1.1 Mbits of embedded RAM, and these devices offer abundant logic resources and high I/O performance. High speed compute-intensive data path functions can be easily implemented with one or multiple APEX II devices. Mercury family FPGAs typically have up to 14,400 LEs with a maximum of 114,688 RAM bits. The FPGAs of the Stratix family contain 10,570 to 79,040 LEs and up to 7,427,520 RAM bits (928,440 bytes) without reducing logic resources. High-speed differential I/O support on up to 116 channels, with up to 80 channels optimized for 840 megabits per second (Mbps), is provided by these FPGAs. The resources used, in terms of the number of logic elements needed to implement the algorithm, are referred to as the area. A memory segment consists of a bit-slice of a memory that is implemented in a single embedded cell. Each embedded cell implements one output of the memory, and multiple memory segments may be needed to create a single memory block. Latency is defined as the number of rounds in a loop, and the clock period is that of the minimum operating clock. The throughput (δ) is computed as δ = message block size/(clock period × latency).
The designs were simulated for a 1024-bit block of padded message. SHA(64)-512 and SHA(32)-512 were designed and placed on the FPGA EP1S10F484C5 of the Stratix family, and their performance metrics are presented in Table 4. The SHA(32)-512 design occupies 2800 logic elements, whereas SHA(64)-512 occupies 4229 logic elements. As SHA(32)-512 occupies only 26% of the chip area to handle one 1024-bit block of padded message, three blocks of message can be authenticated at a time by the chip if the blocks are pipelined. This in turn will occupy only 78% of the chip area, and the rest can be used for implementing encryption logic. SHA(64)-512 can handle only two blocks at a time and leaves only 20% of the chip area for other purposes. Moreover, the area used by one block of the SHA(32)-512 design is only 66% of that of SHA(64)-512, and its throughput for the maximum number of blocks is almost 72% of that of SHA(64)-512.
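The throughput and area figures quoted above follow from the throughput formula and can be reproduced with a short calculation. This is a sketch under stated assumptions: the clock frequencies and logic element counts are those reported in Table 4, the latency of SHA(64)-512 is assumed to be 80 cycles (one per digest round), and the latency of SHA(32)-512 is taken as 160 cycles. The function name `throughput_mbps` is hypothetical.

```python
def throughput_mbps(block_bits, clock_mhz, latency_cycles, parallel_blocks=1):
    """Throughput in Mbits/s: block size / (clock period * latency).

    With the clock given in MHz, block_bits * clock_mhz / latency_cycles
    is already in Mbits/s. parallel_blocks scales the result for designs
    that process several blocks on the chip at the same time.
    """
    return parallel_blocks * block_bits * clock_mhz / latency_cycles


# SHA(64)-512: 47.9 MHz, assumed latency 80, two blocks at a time.
sha64 = throughput_mbps(1024, 47.9, 80, parallel_blocks=2)
# SHA(32)-512: 46.5 MHz, latency 160, three blocks at a time.
sha32 = throughput_mbps(1024, 46.5, 160, parallel_blocks=3)

area_ratio = 2800 / 4229          # area of SHA(32)-512 relative to SHA(64)-512
throughput_ratio = sha32 / sha64  # maximum throughput, same comparison
```

Rounded to one decimal place, the two throughput values agree with the last column of Table 4.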
In Table 5, a similar SHA(64)-512 design implemented on a Xilinx Virtex-E XCV600E-8 [13] is compared with our design implemented on the Mercury family FPGA EPM120F484C5. The operating frequency reported in [13] was 38 MHz, whereas our design has an operating frequency of 43.7 MHz. The throughput of our design is also higher than that of [13].

Table 4
Synthesis results of SHA(64)-512 and SHA(32)-512 on Stratix

| Design | Area (LEs) (α) | Percentage of chip area used (%) | Memory (bits) (μ) | Percentage of memory on chip used (%) | Clock (MHz) (f) | Throughput (Mbits/s) (Max block) |
|---|---|---|---|---|---|---|
| SHA(64)-512 | 4229 | 40 | 9216 | <1 | 47.9 | 1226.2 |
| SHA(32)-512 | 2800 | 26 | 8448 | <1 | 46.5 | 892.8 |

Table 5
Comparison of our SHA(64)-512 design with design of [13]
Design          Clock (MHz) (f)  Throughput (Mbits/s)
Design of [13]  38               479
SHA(64)-512     43.7             560

The four word length versions, namely, SHA(64)-512, SHA(32)-512, SHA(16)-512 and SHA(8)-512,
were synthesized on a Mercury family FPGA, EPM120F484C5, and their areas were compared
on the basis of logic element count. The designs were optimized for area, and their comparison chart
is shown in Fig. 9. Taking SHA(64) as the base, SHA(8) occupies 27.1% less area, followed by
SHA(16) with 24.3% and SHA(32) with 16.5%. It is clear from Fig. 9 that smaller word length
implementations are suitable for applications where area is of concern.
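The area savings can be recomputed from the logic element counts plotted in Fig. 9. The sketch below is only a cross-check; the dictionary le_counts and the helper area_saving_pct are hypothetical names, and the pairing of counts with designs is inferred from the percentages quoted in the text.

```python
# Logic element counts read from Fig. 9 (Mercury EPM120F484C5)
le_counts = {"SHA(8)": 2705, "SHA(16)": 2810, "SHA(32)": 3101, "SHA(64)": 3711}


def area_saving_pct(design: str, base: str = "SHA(64)") -> float:
    """Percentage of logic elements saved relative to the base design."""
    return 100.0 * (le_counts[base] - le_counts[design]) / le_counts[base]


for name in ("SHA(8)", "SHA(16)", "SHA(32)"):
    print(name, round(area_saving_pct(name), 1))
# SHA(8) 27.1
# SHA(16) 24.3
# SHA(32) 16.4
```

The SHA(32) value comes out at about 16.4%, marginally below the 16.5% quoted in the text; the difference is within rounding of the plotted counts.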
In order to evaluate devices belonging to different families of Altera, the SHA(32)-512 design
was synthesized on devices from three different families, and their performance metrics (area,
memory, throughput, and operating frequency) are listed in Table 6. It can
be seen that the hash algorithm synthesized on the Stratix device occupies less area than on the other two
FPGAs listed in Table 6. Moreover, since one block occupies only 26% of the chip area, three
1024-bit blocks of padded message can be handled by the Stratix device, whereas only two blocks
can be handled by the Apex II device and one block by the Mercury device. The maximum possible
throughput is listed in the last column.
Finally, synthesis results of SHA(32)-512 and SHA-256 on a single chip are given in Table 7.
Both algorithms use the same area and memory, but their throughput differs because the block
size and latency are 512 and 64, respectively, for SHA-256, whereas for SHA(32)-512 the block size is
1024 and the latency is 160.
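Because both algorithms share one clock on the common architecture, the throughput gap depends only on the number of message bits processed per clock cycle. A minimal sketch of that reasoning follows; the helper bits_per_cycle is a hypothetical name, and the two throughput values used for comparison are the single-block figures from Table 7.

```python
def bits_per_cycle(block_bits: int, latency: int) -> float:
    """Message bits consumed per clock cycle for one block."""
    return block_bits / latency


sha256 = bits_per_cycle(512, 64)        # 8 bits per cycle
sha512_32 = bits_per_cycle(1024, 160)   # 6.4 bits per cycle

# At a shared clock frequency the throughput ratio equals this ratio.
expected_ratio = sha256 / sha512_32
reported_ratio = 335.9 / 268.7          # single-block throughputs, Table 7

print(round(expected_ratio, 2), round(reported_ratio, 2))  # 1.25 1.25
```

SHA-256 therefore delivers about 25% more throughput than SHA(32)-512 on the shared datapath, whatever the clock frequency.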

Fig. 9. Area comparison: 8/16/32/64 bit versions. Logic element count: SHA(8) 2705, SHA(16) 2810, SHA(32) 3101, SHA(64) 3711.

Table 6
Synthesis results of SHA(32)-512 on different FPGAs
FPGA                      Area (LEs) (a)  Chip area used (%)  Memory (bits) (l)  Memory used (%)  Clock (MHz) (f)  Throughput, 1 block (Mbits/s)  Throughput, max block (Mbits/s)
Stratix EP1S10F484C5      2794            26                  8448               <1               45.8             292.8                          878.4
Apex II EP20K200EFC484-1  2867            34                  15,360             14               24.86            159.1                          318.2
Mercury EPM120F484C5      3775            78                  15,360             31               48.7             311.9                          311.9

Table 7
Single chip implementation of SHA(32)-512 and SHA-256
Design       Area used (%)  Memory used (%)  Clock (MHz) (f)  Throughput, 1 block (Mbits/s)  Throughput, max block (Mbits/s)
SHA-256      32             1                41.97            335.9                          1007.7
SHA(32)-512  32             1                41.97            268.7                          806.1

7. Conclusions

Secure hash algorithms SHA-256 and SHA-512 are versatile algorithms deployed in a broad
range of applications with different area-performance requirements. Implementing a SHA-512
hash function in FPGAs requires several 64-bit adders, which take up the bulk of the chip area. In this
paper, we explored alternative adder architectures for implementing SHA-512 in FPGAs with
reduced-size operands and studied their area-performance trade-offs. Our results showed that the
chip area on the FPGA decreased as the operand size was reduced, but the throughput suffered due
to increased latency. The architectures were synthesized in different FPGA families of ALTERA,
and their performance metrics such as area, memory, latency, clocking frequency and throughput
were compared. SHA-256 and SHA-512 were also implemented using a common architecture.
The performance metrics shed light on the possibility of synthesizing multiple blocks on a
single chip, which in turn would increase the throughput. The Stratix family of FPGAs offered the best
performance metrics. Future work will be directed towards error analysis and error detection
procedures for the hardware implementation of hash functions.

References

[1] Kaufman C, Perlman R, Speciner M. Network security: private communication in a public world. 2nd
ed. Prentice-Hall; 2002.
[2] FIPS Publication 198. The Keyed-hash message authentication code (HMAC). US Doc/NIST, March 6, 2002.
[3] Madson C, Glenn R. The use of HMAC-SHA-1-96 within ESP and AH. RFC 2404, November 1998.
[4] Rivest RL. The MD5 message digest algorithm. RFC 1321, April 1992.
[5] FIPS Publication 180-1. Secure hash standard (SHS). US Doc/NIST, April 17, 1995.
[6] FIPS Publication 197. Advanced encryption standard (AES). US Doc/NIST, November 26, 2001.
[7] FIPS Publication 180-2. Secure hash standard (SHS). US Doc/NIST, May 30, 2001.
[8] FIPS Publication 180-2. Secure hash standard (SHS) change notice 1. US Doc/NIST, February 2004.
[9] Bosselaers A, Govaerts R, Vandewalle J. Fast hashing on Pentium. Proceedings of Crypto'96, LNCS
1109. Springer-Verlag; 1996. p. 298–312.
[10] Bosselaers A, Govaerts R, Vandewalle J. SHA: a design for parallel architectures? Proceedings of the
EUROCRYPT'97, LNCS 1233. Springer-Verlag; 1997. p. 348–62.
[11] Nakajima J, Matsui M. Performance analysis and parallel implementation of dedicated hash functions on Pentium
III. IEICE Transactions on Fundamentals 2003;E86-A(1):54–63.
[12] Dominikus S. A hardware implementation of MD5-family hash algorithm. In: Proceedings of the international
conference on electronics circuits and systems, Dubrovnik, Croatia, September 15–18, 2002. p. 1143–6.
[13] McLoone M, McCanny JV. Efficient single-chip implementation of SHA-384 and SHA-512. In: Proceedings of the
IEEE international conference on field-programmable technology (FPT), Hong Kong, July 2002. p. 311–4.
[14] Kang YK, Kim DW, Kwon TW, Choi JR. An efficient implementation of hash function processor for IPSEC. In:
Proceedings of the Asia–Pacific conference on ASICs, August 2002. p. 93–6.

[15] Grembowski T, Lien R, Gaj K, Nguyen N, Bellows P, Flidr J, et al. Comparative analysis of the hardware
implementation of hash functions SHA-1 and SHA-512. Proceedings of the 5th international conference on
information security (ISC'2002), LNCS 2433. Springer-Verlag; 2002. p. 75–89.
[16] Sklavos N, Koufopavlou O. On the hardware implementation of the SHA-2 (256, 384, 512) hash functions. In:
Proceedings of the IEEE international symposium on circuits and systems, vol. 5, May 2003. p. 153–6.
[17] Dadda L, Macchetti M. The design of a high speed ASIC unit for the hash functions SHA-256 (384, 512). In:
Proceedings of the design, automation and test in Europe conference (DATE'04), February 16–20, 2004.
[18] Ercegovac MD, Lang T. Digital arithmetic. Morgan Kaufmann Publishers; 2004.
[19] Kaliski B, Staddon J. RSA cryptography specifications, Version 2.0. RFC 2437, October 1998.
[20] Haller N, Metz C, Nesser P, Straw M. A one-time password system. RFC 2289, February 1998.
[21] Dierks T, Allen C. The TLS protocol—Version 1.0. RFC 2246, January 1999.

Imtiaz Ahmad received his B.Sc. in Electrical Engineering from University of Engineering and
Technology, Lahore, Pakistan, an M.Sc. in Electrical Engineering from King Fahd University of
Petroleum and Minerals, Dhahran, Saudi Arabia, and a Ph.D. in Computer Engineering from
Syracuse University, Syracuse, New York, in 1984, 1988 and 1992, respectively. Since September
1992, he has been with the Department of Computer Engineering at Kuwait University, Kuwait,
where he is currently a professor. His research interests include design automation of digital
systems, high-level synthesis, and parallel and distributed computing.

A. Shoba Das received the B.E. degree from Guindy College of Engineering, Madras University,
India, and the M.E. degree from PSG College of Technology, Madras University, India. She has
held various teaching assignments in India since 1982 and is presently working as a scientific
assistant at Kuwait University. Her research interests include optimal design of sequential
machines and testing of communication systems.
