Vlsi Pd 21et64d2 Unit 2 (1)
21ET64D2
By,
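[Figure: 16-bit carry-select adder built from 4-bit blocks, each evaluated for carry-in 0 and carry-in 1; carry chain Cin, C4, C8, C12, Cout; sum outputs S4:1, S8:5, S12:9, S16:13]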
Carry-Select Adder
• A carry-select adder duplicates each small adder, one copy for
Cin='0' and one for Cin='1', and then uses a MUX to select the
result that matches the actual carry-in.
• A carry-select adder is often used as the fast adder in a
datapath library because its layout is regular.
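The selection scheme described above can be sketched behaviorally in Python. This is a minimal sketch, assuming a 16-bit adder split into 4-bit blocks and a width that is a multiple of the block size; the names `ripple_add` and `carry_select_add` are illustrative only. In hardware both speculative sums are computed in parallel, whereas the loop here only models the MUX selection.

```python
def ripple_add(a: int, b: int, cin: int, width: int) -> tuple[int, int]:
    """Add two width-bit values plus a carry-in; return (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a: int, b: int, cin: int = 0,
                     width: int = 16, block: int = 4) -> tuple[int, int]:
    """Carry-select addition (assumes width is a multiple of block)."""
    mask = (1 << block) - 1
    result, carry = 0, cin
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        sum0, cout0 = ripple_add(a_blk, b_blk, 0, block)  # case Cin = 0
        sum1, cout1 = ripple_add(a_blk, b_blk, 1, block)  # case Cin = 1
        # 2:1 MUX controlled by the real carry into this block
        s, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= s << lo
    return result, carry


print(carry_select_add(0x1234, 0x0FCD))
```

The carry out of each block is itself one of two precomputed values, so only the MUX delay, not a full ripple, separates consecutive blocks.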
Conditional-sum Adder
• Extending the idea of carry-select adder we can design
Conditional-sum adder.
• An n-bit conditional-sum adder uses n single-bit
conditional adders together with a tree of 2:1 MUXes, as
shown.
• Split the n-bit adder into an i-bit adder for the i LSBs and an
(n - i)-bit adder for the n - i MSBs.
• Both of the smaller adders generate two conditional sums as
well as true and complement carry signals.
• The two (true and complement) carry signals from the LSB
adder are used to select between the two (n - i + 1)-bit
conditional sums from the MSB adder using 2(n - i + 1) two-
input MUXes.
• For example, split a 16-bit adder using i = 8, so that n - i = 8;
then we can split one or both 8-bit adders again, and so on.
• The figure shows the normalized delay and area figures for a set
of predesigned datapath adders.
Area vs. delay of Synthesized Adders
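The recursive split described for the conditional-sum adder can be illustrated with a short behavioral model. This is a sketch under stated assumptions: the split point is always the midpoint (i = width // 2), and the function name `conditional_sum` is hypothetical. Each call returns both conditional results, and the carries from the LSB half select between the MSB half's two results.

```python
def conditional_sum(a: int, b: int, width: int) -> tuple[int, int, int, int]:
    """Return (sum0, cout0, sum1, cout1): results for Cin = 0 and Cin = 1."""
    if width == 1:
        # Single-bit conditional adder: both cases computed directly.
        s0, c0 = a ^ b, a & b
        s1, c1 = 1 - (a ^ b), a | b
        return s0, c0, s1, c1

    i = width // 2                       # i LSBs and (width - i) MSBs
    lo_mask = (1 << i) - 1
    ls0, lc0, ls1, lc1 = conditional_sum(a & lo_mask, b & lo_mask, i)
    hs0, hc0, hs1, hc1 = conditional_sum(a >> i, b >> i, width - i)

    def select(lsb_carry: int) -> tuple[int, int]:
        # MUX: the LSB adder's carry picks one MSB conditional result.
        return (hs1, hc1) if lsb_carry else (hs0, hc0)

    m0, c0 = select(lc0)
    m1, c1 = select(lc1)
    return ls0 | (m0 << i), c0, ls1 | (m1 << i), c1


s0, c0, s1, c1 = conditional_sum(0x00FF, 0x0001, 16)
print(hex(s0), c0, hex(s1), c1)
```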
Agenda
• Booth Encoding
• ASIC Library Design
• Logical effort: Cell delay, Logical effort of Inverter,
NAND and NOR gates
• Predicting delay, Logical paths, Logical area, and logical
efficiency
• Multi-stage cells, Optimum delay
• Optimum number of stages
Multiplier
• A multiplier is a hardware circuit dedicated to
multiplying binary values.
• The basic building blocks are:
• full adders
• half adders
• Multiplication involves
• Partial Product generation
• Accumulating the partial products
Traditional Multiplier
• Multiplication can be viewed as repeated shifts and adds.
• It needs only an adder, a shift register, and a small amount of
control logic, but it is slow because it handles one multiplier bit per step.
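The repeated shift-and-add view of multiplication can be sketched in a few lines of Python. This is a behavioral illustration only, assuming unsigned operands; the name `shift_add_multiply` is illustrative. Each loop iteration stands for one step of the sequential hardware.

```python
def shift_add_multiply(x: int, y: int, n_bits: int) -> int:
    """Multiply unsigned y by the n_bits-wide unsigned multiplier x."""
    product = 0
    shifted_y = y
    for _ in range(n_bits):      # one step per multiplier bit
        if x & 1:                # current multiplier bit is 1: add
            product += shifted_y
        shifted_y <<= 1          # shift multiplicand left
        x >>= 1                  # move to the next multiplier bit
    return product


print(shift_add_multiply(13, 11, 4))
```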
Multiplication: General Form
❑ Multiplicand: $Y = (y_{M-1}, y_{M-2}, \ldots, y_1, y_0)$
❑ Multiplier: $X = (x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$
❑ Product: $P = \left(\sum_{j=0}^{M-1} y_j 2^j\right)\left(\sum_{i=0}^{N-1} x_i 2^i\right) = \sum_{i=0}^{N-1}\sum_{j=0}^{M-1} x_i y_j 2^{i+j}$
                              y5   y4   y3   y2   y1   y0     multiplicand
                              x5   x4   x3   x2   x1   x0     multiplier
                              x0y5 x0y4 x0y3 x0y2 x0y1 x0y0
                         x1y5 x1y4 x1y3 x1y2 x1y1 x1y0
                    x2y5 x2y4 x2y3 x2y2 x2y1 x2y0             partial
               x3y5 x3y4 x3y3 x3y2 x3y1 x3y0                  products
          x4y5 x4y4 x4y3 x4y2 x4y1 x4y0
     x5y5 x5y4 x5y3 x5y2 x5y1 x5y0
p11  p10  p9   p8   p7   p6   p5   p4   p3   p2   p1   p0     product
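The partial-product array can be checked numerically against the general product formula. This is a small sketch, assuming unsigned operands; `partial_products` and `sum_partial_products` are hypothetical helper names. Row i holds the bits x_i·y_j, and each bit carries weight 2^(i+j).

```python
def partial_products(x: int, y: int, n_bits: int, m_bits: int) -> list[list[int]]:
    """Row i, column j holds the AND of multiplier bit x_i and multiplicand bit y_j."""
    return [
        [((x >> i) & 1) & ((y >> j) & 1) for j in range(m_bits)]
        for i in range(n_bits)
    ]


def sum_partial_products(rows: list[list[int]]) -> int:
    """Accumulate the array: each bit x_i*y_j has weight 2**(i + j)."""
    return sum(bit << (i + j)
               for i, row in enumerate(rows)
               for j, bit in enumerate(row))


rows = partial_products(0b101101, 0b110011, 6, 6)
print(sum_partial_products(rows), 0b101101 * 0b110011)
```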
Fewer Partial Products
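[Figure: transistor-level schematics of an inverter, a 2-input NAND, and a 2-input NOR with relative transistor widths; inputs A and B, output Y]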
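[Figure: normalized delay d versus electrical effort h = Cout / Cin (both axes 0 to 5). Two lines are plotted: g = 1, p = 1, d = h + 1; and p = 2, d = (4/3)h + 2. The h-dependent part of each line is the effort delay f; the intercept is the parasitic delay p.]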
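The two plotted lines, d = h + 1 and d = (4/3)h + 2, are both instances of d = gh + p. The following minimal Python sketch evaluates that model; the function name `gate_delay` and the dictionary labels are illustrative, and associating g = 4/3, p = 2 with a 2-input NAND is the usual logical-effort convention rather than something stated on the slide.

```python
def gate_delay(g: float, h: float, p: float) -> float:
    """Normalized delay d = f + p, where the effort delay is f = g * h."""
    return g * h + p


# (g, p) pairs read from the plot; the labels are assumptions.
GATES = {"g=1, p=1": (1.0, 1.0), "g=4/3, p=2": (4.0 / 3.0, 2.0)}

for name, (g, p) in GATES.items():
    delays = [round(gate_delay(g, h, p), 2) for h in range(6)]
    print(name, delays)
```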
Predicting Delay
Naming of Complex CMOS Combinational Logic Cells
Logical Area and Logical Efficiency
Multistage Cells
Multistage Networks
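Stage effort: $f_i = g_i h_i$
Path electrical effort: $H = C_{out}/C_{in}$
Path logical effort: $G = g_1 g_2 \cdots g_N$
Branching effort: $B = b_1 b_2 \cdots b_N$
Path effort: $F = GHB = f_1 f_2 \cdots f_N$
Optimum stage effort: $\hat{f} = g_i h_i = F^{1/N}$
Since $\hat{f} = gh = g\,C_{out}/C_{in}$, the input capacitance of each gate is $C_{in_i} = g_i C_{out_i} / \hat{f}$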
• Working backward, apply capacitance transformation to find
the input capacitance of each gate given the load it drives.
Gate Sizes
• The optimum sizes of the NAND cells are not very different
from 1X in this case because H = 1 and we are only driving a
load no bigger than the input capacitance.
• What is the optimum stage effort if we have to drive a large
load, H >> 1?
• Notice that, so far, we have only calculated the optimum
stage effort when we have a fixed number of stages, N.
• We have said nothing about the situation in which we are free
to choose N, the number of stages.
Example: Optimize Path
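[Figure: four-stage path with input capacitance 1, unknown stage input capacitances a, b, c, and load capacitance 5; stage logical efforts g = 1, g = 5/3, g = 5/3, g = 1]
Effective fanout =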
G=
H=
F=
fi =
a=
b=
c=
Example: Optimize Path
Effective fanout = 5
G = 1 × 5/3 × 5/3 × 1 = 25/9
H = 5
F = GH = 125/9 = 13.9
fi = F^(1/4) = 1.93
c = 5g4/fi = 2.59
b = c·g3/fi = 2.24
a = b·g2/fi = 1.93
Effort delay = 4fi = 7.72 (parasitic delay not included)
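The worked example can be reproduced numerically. This is a minimal sketch, assuming a path with no branching (B = 1); the function name `size_path` is hypothetical. It computes the equal stage effort and then works backward from the load, applying the capacitance transformation at each stage.

```python
from math import prod


def size_path(g: list[float], c_in: float, c_load: float):
    """Return (stage effort, input caps of stages 2..N, path effort delay).

    Assumes no branching, so F = G * H.
    """
    n = len(g)
    path_effort = prod(g) * (c_load / c_in)      # F = G * H
    f_hat = path_effort ** (1.0 / n)             # equal effort per stage
    caps = []
    c_out = c_load
    for g_i in reversed(g[1:]):                  # work backward from the load
        c_out = g_i * c_out / f_hat              # Cin_i = g_i * Cout_i / f_hat
        caps.append(c_out)
    caps.reverse()
    return f_hat, caps, n * f_hat


f_hat, (a, b, c), effort_delay = size_path([1, 5 / 3, 5 / 3, 1], 1.0, 5.0)
print(round(f_hat, 2), round(a, 2), round(b, 2), round(c, 2), round(effort_delay, 2))
```

The value a coming out equal to the stage effort is a useful self-check: the first stage has g = 1 and input capacitance 1, so its load must equal the stage effort.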
Term              Stage          Path
effort delay      $f$            $D_F = \sum f_i$
parasitic delay   $p$            $P = \sum p_i$
delay             $d = f + p$    $D = \sum d_i = D_F + P$
Method of Logical Effort
1) Compute path effort: $F = GBH$
2) Estimate best number of stages: $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = g_i C_{out_i} / \hat{f}$
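Steps 2 and 4 of the method can be tried out numerically. This is a rough sketch under stated assumptions: the path effort F = 64 is a made-up value, and every stage is assumed to contribute a parasitic delay of 1 (as an inverter would), so P = N. The helper names are illustrative.

```python
import math


def estimated_stage_count(path_effort: float) -> int:
    """Step 2: N = log4(F), rounded to a whole number of stages (at least 1)."""
    return max(1, round(math.log(path_effort, 4)))


def least_delay(path_effort: float, n_stages: int, parasitic: float) -> float:
    """Step 4: D = N * F**(1/N) + P."""
    return n_stages * path_effort ** (1.0 / n_stages) + parasitic


F = 64.0  # hypothetical path effort
print("estimated N:", estimated_stage_count(F))
for n in range(1, 7):
    # Assumption: each stage adds parasitic delay 1, so P = n.
    print(n, round(least_delay(F, n, parasitic=float(n)), 2))
```

Under these assumptions the delay is lowest near the estimated N and rises only slowly on either side, which matches the summary remark that path delay is weakly sensitive to the number of stages.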
Summary
• Logical effort is useful for thinking of delays in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn’t mean faster paths
– Delay of path is about $\log_4 F$ FO4 inverter delays
– Inverters and NAND2 best for driving large caps
• Provides language for discussing fast circuits
– But requires practice to master
THANK YOU
Branching Effort
❑ Introduce branching effort
– Accounts for branching between stages in path
$b = \dfrac{C_{\text{on path}} + C_{\text{off path}}}{C_{\text{on path}}}$
$B = \prod b_i$
Note: $\prod h_i = BH$
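The note that the product of the stage electrical efforts equals BH can be checked on a small case. This is a sketch with made-up capacitances: a two-stage path whose first stage (input capacitance 5) drives two identical branches of 15 each, one of which continues to a load of 90. The function name `branching_effort` is illustrative.

```python
def branching_effort(c_on_path: float, c_off_path: float) -> float:
    """b = (C_on_path + C_off_path) / C_on_path."""
    return (c_on_path + c_off_path) / c_on_path


# Hypothetical capacitances for a two-stage path with one fork.
c_in, c_branch, c_load = 5.0, 15.0, 90.0

b1 = branching_effort(c_branch, c_branch)   # stage 1 drives two equal branches
b2 = branching_effort(c_load, 0.0)          # stage 2 has no off-path load
B = b1 * b2
H = c_load / c_in

h1 = (c_branch + c_branch) / c_in           # stage 1 sees both branches
h2 = c_load / c_branch
print(B, H, h1 * h2)
```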
Braun Multiplier
Only for unsigned multiplication
Structure of 4x4 Braun Multiplier
Fewer Partial Products
❑ Array multiplier requires N partial products
❑ If we looked at groups of r bits, we could form N/r partial
products.
– Faster and smaller?
– Called radix-$2^r$ encoding
❑ Ex: r = 2: look at pairs of bits
– Form partial products of 0, Y, 2Y, 3Y
– First three are easy, but 3Y requires adder
Booth Encoding
❑ Instead of 3Y, try –Y, then increment next partial product
to add 4Y
❑ Similarly, for 2Y, try –2Y + 4Y in next partial product
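The substitution described above (replace 3Y with –Y plus 4Y in the next partial product, and 2Y with –2Y plus 4Y) corresponds to recoding the multiplier into radix-4 digits in the range –2 to 2. The following is a minimal sketch of one common formulation, assuming an unsigned multiplier that is zero-extended by two bits; the function names are hypothetical.

```python
def booth_radix4_digits(x: int, n_bits: int) -> list[int]:
    """Recode an unsigned n_bits-wide multiplier into digits in {-2, ..., 2}.

    Digit k has weight 4**k. Zero-extending by two bits lets the last
    digit absorb the +4Y carried over from the pair below it.
    """
    digits = []
    prev = 0                                  # implicit bit below the LSB
    for k in range(0, n_bits + 2, 2):
        b0 = (x >> k) & 1
        b1 = (x >> (k + 1)) & 1
        digits.append(prev + b0 - 2 * b1)     # one of -2, -1, 0, 1, 2
        prev = b1                             # carries +1 into the next digit
    return digits


def booth_multiply(x: int, y: int, n_bits: int) -> int:
    """Sum one partial product (0, ±Y or ±2Y) per radix-4 digit."""
    return sum(d * y * 4 ** k
               for k, d in enumerate(booth_radix4_digits(x, n_bits)))


print(booth_radix4_digits(3, 2), booth_radix4_digits(2, 2))
print(booth_multiply(45, 51, 6))
```

For the bit pair 11 (value 3) the first digit is –1 and the next digit is incremented, and for the pair 10 (value 2) the first digit is –2, matching the two bullets.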
I/O Cells
• Three-state bidirectional output buffer is as shown below:
• The three-state buffer allows us to employ the same pad for
both input and output (bidirectional I/O).
• When we want to use the pad as an input, we set OE low and
take the data from DATAin.
• When the output enable (OE) signal is high, the circuit
functions as a noninverting buffer driving the value of
DATAin onto the I/O pad.
• When OE is low, the output transistors or drivers, M1 and
M2, are disconnected.
• This allows multiple drivers to be connected on a bus.
• It is up to the designer to make sure that a bus never has two
drivers active at the same time, a problem known as contention.
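The bus rule can be illustrated with a tiny behavioral model. This is only a sketch: `resolve_bus` is a hypothetical helper, each driver is modeled as an (OE, data) pair, and any case with more than one enabled driver is treated as contention regardless of the data values.

```python
def resolve_bus(drivers: list[tuple[int, int]]):
    """Return the bus value for a list of (oe, data) three-state drivers.

    Returns "Z" when no driver is enabled, and raises ValueError when
    more than one driver is enabled (contention).
    """
    active = [data for oe, data in drivers if oe]
    if not active:
        return "Z"                     # all drivers disconnected
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0]


print(resolve_bus([(0, 1), (1, 0), (0, 1)]))   # one enabled driver
print(resolve_bus([(0, 1), (0, 0)]))           # no enabled driver
```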