CEC370 UNIT 3
Ms. G. VIJAYAKUMARI, AP/ECE
Downloadable at tiny.cc/npsb-elearning
NEW PRINCE DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
E-LEARNING CEC370 LOW POWER IC DESIGN
OUTLINE
Introduction
Standard adder cells are the basic building blocks used in designing and fabricating
different kinds of adder architectures.
Half Adders:
➢ The half adder is the simplest and most fundamental kind of adder.
➢ It takes two single-bit binary operands (A and B) as inputs and produces a sum and a carry-out.
Truth Table of a Half Adder
A B Sum Cout
0 0 0   0
0 1 1   0
1 0 1   0
1 1 0   1
Sum = A xor B
Cout = A.B
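The two equations above can be checked against the truth table with a short sketch. The function name `half_adder` is illustrative only and does not come from the slides.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (Sum, Cout) for single-bit inputs a and b."""
    sum_bit = a ^ b   # Sum = A xor B
    cout = a & b      # Cout = A.B
    return sum_bit, cout


# Print the truth table in the column order A, B, Sum, Cout.
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, s, c)
```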
➢ A full adder is constructed using two half adders and an OR gate. The full adder
has three inputs in total: two for the input numbers A and B, and one for the
carry-in Cin.
➢ The outputs are the sum and carry-out.
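A minimal sketch of the construction described above (two half adders plus an OR gate). The function names are illustrative assumptions, not taken from the slides.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two single-bit inputs."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (Sum, Cout) built from two half adders and an OR gate."""
    s1, c1 = half_adder(a, b)      # first half adder: A and B
    s, c2 = half_adder(s1, cin)    # second half adder: partial sum and Cin
    return s, c1 | c2              # OR gate merges the two carries


for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, *full_adder(a, b, cin))
```

The two carries c1 and c2 can never both be 1, which is why a plain OR gate is enough to combine them.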
➢ Although the modified conventional CMOS full adder configuration has been widely
accepted and used in numerous applications, it often exhibits a critical-path delay
that limits the system's overall performance.
➢ There is an alternative implementation of the full adder cell that does not use
XOR gates but instead uses 28 transistors.
➢ The TFA consists of 16 transistors and dissipates less power than the conventional
CMOS full adder.
➢ Another schematic configuration of the full adder that ensures both low power and
➢ Next, a low-power CMOS full adder cell consisting of 17 transistors is described. It
is based on XOR and XNOR gates and pass transistors. Comparative analysis
has shown that it consumes 10 to 15 percent less power than either the TFA or the
14-transistor full adder.
➢ These power savings arise because this cell has no short-circuit power and
its dynamic power is lower than that of the other two cells.
With a power supply voltage of 3.3 V, the critical path delay of the 10-transistor full
adder is 0.086 ns, while that of the TFA is 0.12 ns. With the same
supply voltage and a clock frequency of 1 GHz, the 10-transistor full adder has
an average power dissipation of 81 µW, whereas the TFA dissipates about 170 µW.
➢ Since all the full adders are connected together by the carry chain, a worst-case
addition requires the carry to ripple from the position of the least significant
bit to that of the most significant bit.
➢ The worst-case delay increases linearly with the length of the carry propagation path,
which depends on the number of bits n in the operands.
➢ Carry propagation can be sped up by exploiting faster logic circuit
technologies and faster full adder designs.
➢ The ripple carry adder (RCA) is also subject to a glitching problem.
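The carry rippling described above can be sketched as a bit-level model. The helper names and the use of Python integers for the operands are assumptions made for illustration. The model shows only the logical behaviour, not delay or glitching.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Single-bit full adder: return (sum, carry-out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a: int, b: int, n: int, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit numbers by chaining n full adders, LSB first."""
    carry = cin
    total = 0
    for i in range(n):
        # Stage i cannot settle until the carry from stage i-1 arrives.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


# Worst case for n = 8: the carry generated at bit 0 ripples through every stage.
print(ripple_carry_add(0b11111111, 0b00000001, 8))  # (0, 1)
```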
➢ Besides the Pi and Gi signals, the Boolean variables for the CLA adder are
Si = Pi xor Ci
Ci+1 = Gi + Pi.Ci
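A minimal sketch of how the carry recurrence can be expanded so that every carry depends only on the P and G signals and the initial carry C0. The slides do not show the definitions of Pi and Gi at this point, so the usual ones (Pi = Ai xor Bi, Gi = Ai.Bi) are assumed here. The function name `cla_add` is illustrative.

```python
def cla_add(a: int, b: int, n: int, c0: int = 0) -> tuple[int, int]:
    """n-bit carry lookahead addition; returns (sum, carry-out)."""
    # Assumed definitions: Pi = Ai xor Bi, Gi = Ai.Bi
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]

    carries = [c0]
    for i in range(n):
        # Expanded form of Ci+1 = Gi + Pi.Ci:
        # Ci+1 = Gi + Pi.Gi-1 + ... + Pi...P1.G0 + Pi...P0.C0
        c = 0
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c |= term
        term = c0
        for k in range(i + 1):
            term &= p[k]
        carries.append(c | term)

    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i   # Si = Pi xor Ci
    return total, carries[n]


print(cla_add(0b1011, 0b0110, 4))  # (1, 1), i.e. 11 + 6 = 17
```

In hardware each expanded carry is a two-level AND-OR expression, which is what removes the ripple; the nested loops here only mirror that expression term by term.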
A variation of the basic CLA addition algorithm, namely the ELM adder, is
analyzed next. The ELM addition algorithm uses a binary tree of simple
processors running in O(log n) time, and it is also based on the concept of carry
propagate and carry generate. The figure shows the block diagram of an 8-bit ELM
adder.
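To illustrate how generate and propagate signals can be combined in O(log n) levels, the sketch below uses a generic Kogge-Stone-style parallel prefix scheme. It is not the ELM tree itself, whose exact structure is given only in the figure; it is offered only as an assumption-laden example of the same logarithmic-depth idea. All names are illustrative.

```python
def prefix_add(a: int, b: int, n: int) -> tuple[int, int, int]:
    """n-bit addition with log-depth carry computation.

    Returns (sum, carry-out, number of prefix levels used).
    """
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # bit propagate

    G, P = g[:], p[:]
    distance, levels = 1, 0
    while distance < n:
        new_g, new_p = G[:], P[:]
        for i in range(distance, n):
            # Merge the group ending at bit i with the group ending at i - distance.
            new_g[i] = G[i] | (P[i] & G[i - distance])
            new_p[i] = P[i] & P[i - distance]
        G, P = new_g, new_p
        distance *= 2
        levels += 1

    carries = [0] + G   # with carry-in 0, G[i] is the carry out of bit i
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n], levels


print(prefix_add(200, 100, 8))  # (44, 1, 3): 300 = 256 + 44, using 3 levels
```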
PERFORMANCE EVALUATION
The 32-bit CLA and ELM adders have been simulated using the static CMOS
circuit design methodology.
The table below compares the relative merits of the two adders. Note
that even though the CLA has more transistors than the ELM adder, it has shorter
interconnects and hence occupies a smaller area.
Adder type | Area (×10^6 λ^2) | No. of transistors | Delay | Avg. power dissipation per addition (mW)
The Manchester adder uses the Manchester carry chain (MCC) as its carry network. The
conceptual representation and CMOS realization of a one-stage MCC are depicted in the
figure. Referring to fig. (a), a one-stage MCC can be viewed as three switches, each
controlled by one of the signals Gi, Pi and ANi from the above equations. At any
time, only one of the three signals Gi, Pi and ANi is at logic 1. The carry-out
signal Ci-1 is connected to 0 if ANi is high, to 1 if Gi is high, and to the incoming
carry Cin if Pi is high.
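The three-switch view of one MCC stage can be sketched as follows. The equations referred to in the text are not reproduced on this slide, so the sketch assumes the usual definitions Gi = Ai.Bi, Pi = Ai xor Bi and ANi = NOT(Ai + Bi). The function name `mcc_stage` is illustrative.

```python
def mcc_stage(a: int, b: int, cin: int) -> int:
    """Return the carry-out of one Manchester carry chain stage."""
    g = a & b          # generate (assumed definition)
    p = a ^ b          # propagate (assumed definition)
    an = 1 - (a | b)   # annihilate (assumed definition)

    # Exactly one of the three switches is closed at any time.
    assert g + p + an == 1

    if an:
        return 0       # carry-out node tied to logic 0
    if g:
        return 1       # carry-out node tied to logic 1
    return cin         # carry-out node connected to the incoming carry


for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, mcc_stage(a, b, cin))
```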
➢ The carry select adder (CSL) provides a compromise between the RCA,
which occupies a small area and has a longer delay, and the CLA, which occupies
a larger area and has a shorter delay.
➢ In the CSL, both n-bit operands, A and B, are divided into k blocks of possibly
different sizes.
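A behavioural sketch of the carry select idea, under the common assumption that each block computes its result for carry-in 0 and carry-in 1 in parallel and the real incoming carry then selects one of them. The function names and the `block_sizes` parameter are illustrative and do not come from the slides.

```python
def add_block(a: int, b: int, width: int, cin: int) -> tuple[int, int]:
    """Add two width-bit values; return (sum, carry-out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a: int, b: int, block_sizes: list[int]) -> tuple[int, int]:
    """Add a and b using blocks of the given sizes, LSB block first."""
    carry, result, shift = 0, 0, 0
    for width in block_sizes:
        mask = (1 << width) - 1
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        # Both candidate results are available before the real carry arrives.
        s0, c0 = add_block(a_blk, b_blk, width, 0)
        s1, c1 = add_block(a_blk, b_blk, width, 1)
        # The incoming carry acts as the select signal of a multiplexer.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << shift
        shift += width
    return result, carry


# 16-bit operands split into k = 3 blocks of sizes 4, 4 and 8.
print(carry_select_add(0xABCD, 0x1234, [4, 4, 8]))
```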
Layouts for the RCA and CSL adders were generated for the following sizes: 8-bit,
16-bit, 32-bit, 64-bit and 128-bit.
The area and delay comparisons for both types of
adders are summarized in the table below.
No. of bits | Area (λ^2) | % change | Delay (ns) | % change
Hybrid adders, which combine two or more pure design methods, aim to reduce
power dissipation, improve cost effectiveness, and achieve other performance
enhancements.
• The RCA uses a row of cascaded binary full adders (FAs) to compute the
sum of two operands. With a slight modification, this row of
FAs can also be viewed as a mechanism that reduces three binary numbers
to two binary numbers in multi-operand addition.
• This method is used in the carry save adder (CSA), which is in effect an RCA with
its carries saved rather than propagated. For this reason, the CSA operator is often
called a 3:2 counter.
T = (K-2).TCSA + TCPA
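A minimal sketch of multi-operand addition with 3:2 carry save stages. It reads K in the formula above as the number of operands, TCSA as the delay of one CSA level and TCPA as the delay of the final carry-propagate adder; this reading is an assumption, since the slide does not define the symbols. Function names are illustrative.

```python
def carry_save(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 counter: reduce three numbers to a sum word and a carry word."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (y & z) | (x & z)) << 1   # carries are saved, shifted left
    return sum_word, carry_word


def multi_operand_add(operands: list[int]) -> tuple[int, int]:
    """Add the operands with a chain of CSA levels and one final adder.

    Returns (total, number of CSA levels used).
    """
    ops = list(operands)
    levels = 0
    while len(ops) > 2:
        x, y, z = ops.pop(), ops.pop(), ops.pop()
        ops.extend(carry_save(x, y, z))   # three numbers in, two numbers out
        levels += 1
    # The last two numbers go through a carry-propagate adder, modelled by sum().
    return sum(ops), levels


total, levels = multi_operand_add([5, 9, 12, 7, 30])
print(total, levels)  # 63 3, i.e. K - 2 = 3 CSA levels for K = 5 operands
```

Each CSA level removes one operand from the list, so K operands need K - 2 levels before the final adder, which matches the (K-2).TCSA term.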
Number of bits n    8      24     40     56     64    |   8     24     40     56     64
Without CSA         3.12   8.46   13.16  17.17  19.33 |   802   1337   2245   3412   3873
With CSA            2.72   8.06   12.77  15.57  18.12 |   364   1186   1993   2934   3341
Reduction (%)       13     5      3      9      6     |   9     11     11     14     14
Most process technology studies for low-voltage and low-power applications
converge on the conclusion that scaled BiCMOS/CMOS technology will remain the
dominant solution in the future. The technology was at 95 nm in 2001 and was reduced to
65 nm in 2003. It is conceivable that, once the problem of manufacturing yield is
overcome, the gate length will reduce to 13 nm by 2016. As for the power supply
voltage, it was at 1.2 V in 2001 and is expected to undergo a ladder-like reduction to
0.9 V by 2007. In the long term, it is predicted to continue reducing to 0.6 V by
2016 due to probability and reliability issues.
A high-speed adder with low power consumption has become a crucial component
of a processor because it is heavily used in the Arithmetic Logic Unit, the Floating Point
Unit, and for address generation during cache or memory access.
The relentless drive for adders with low power dissipation can be addressed at various
design levels, namely
a. Architecture level
b. Circuit level
c. Layout level
d. Device level and
e. Process technology level
• The operation of all dynamic logic gates depends on temporary storage of
charge in parasitic capacitances.
• This operational property necessitates periodic updating of internal node voltage
levels, since charge stored in a capacitor cannot be retained indefinitely.
• Consequently, dynamic logic circuits require periodic clock signals in order to
control charge refreshing.
The serial connection of pMOS or nMOS devices requires increased width in order to
obtain a reasonable conducting current to drive capacitive loads. This is because
pMOS or nMOS devices connected in series can be visualized as a number of
cascaded transistors. The delay time imposed by these devices is defined by
C- capacitance, R- Resistance,
W- Channel Width, L- Channel Length
This logic style eliminates the problem of carefully sizing the series transistors,
and it requires half as many transistors as the static CMOS
XOR gate. When the output of the nMOS pass transistor network at node X is
logically high, it sits at (VDD - Vth), where Vth is the threshold voltage. This causes a
major setback: the pMOS in the inverter does not turn off completely,
which results in a high short-circuit current. To restrain this current, a pMOS
device is coupled across the output of the inverter gate in order to pull
node X up to full VDD.
By using both pMOS and nMOS devices, the DPL avoids the nMOS
threshold voltage drop that occurs in CPL logic design.