
Vlsi Pd 21et64d2 Unit 2 (1)

Uploaded by

pranavjha.et21
Copyright
© © All Rights Reserved

VLSI Physical Design

21ET64D2

By,

Dr. Premananda B.S.


Books
❑ Michael John Sebastian Smith, “Application-Specific
Integrated Circuits”, Addison-Wesley Professional, 1st
edition, 1997 / Pearson Education, 2002.
❑ N. Weste and D. Harris, “CMOS VLSI Design: A Circuits
and Systems Perspective”, 3rd Edition, Pearson Education,
2006.
❑ Vikram Arkalgud Chandrasetty, “VLSI Design: A
Practical Guide for FPGA and ASIC Implementations”,
Springer, 2011.
❑ Jan M. Rabaey, Anantha Chandrakasan, and Borivoje
Nikolic, “Digital Integrated Circuits: A Design
Perspective”, 2nd Edition, Pearson Education India.
❑ …
Modules
• Types of ASIC, Design Flow, and Datapath Logic Cells
• Datapath Logic Cells and ASIC Library Design
• Programmable ASIC Architectures
• ASIC Construction-I
• ASIC Construction-II
Agenda
• Datapath Logic Cells
• Carry select and Conditional sum adder
• Booth Encoding
• ASIC Library Design
• Logical effort: Cell delay, Logical effort of
Inverter, NAND and NOR gates
• Predicting delay, Logical paths, Logical area,
and logical efficiency
• Multi-stage cells, Optimum delay
• Optimum number of stages
Datapath Logic Cells
• Carry select adder
• Conditional sum adder
Carry-Select Adder
❑ Trick for critical paths dependent on late input X
– Pre-compute two possible outputs for X = 0, 1
– Select proper output when X arrives
❑ Carry-select adder pre-computes n-bit sums
– For both possible carries into the n-bit group
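[Figure: 16-bit carry-select adder built from four 4-bit groups (A4:1/B4:1, A8:5/B8:5, A12:9/B12:9, A16:13/B16:13). The lowest group adds with Cin directly; each upper group computes its sum for carry-in 0 and for carry-in 1, and the group carries C4, C8, and C12 drive multiplexers that select the sums S4:1 through S16:13 and Cout.]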
Carry-Select Adder
• Carry-select adder duplicates two small adders for the cases
Cin='0' and Cin='1' and then uses a MUX to select the case
that we need
• A carry-select adder is often used as the fast adder in a
datapath library because its layout is regular.
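The select-after-precompute idea can be sketched behaviorally in Python. This is only an illustrative model, not a gate-level design: the names `ripple_add` and `carry_select_add` and the default 16-bit width with 4-bit groups are assumptions chosen to match the figure, and for simplicity every group (including the lowest) computes both cases.

```python
def ripple_add(a_bits, b_bits, cin):
    """Ripple-carry add of two equal-length, LSB-first bit lists."""
    sum_bits, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_select_add(a, b, width=16, group=4, cin=0):
    """Add a and b group by group, precomputing each group for carry-in 0 and 1."""
    a_bits = [(a >> k) & 1 for k in range(width)]
    b_bits = [(b >> k) & 1 for k in range(width)]
    total, carry = 0, cin
    for lo in range(0, width, group):
        ga, gb = a_bits[lo:lo + group], b_bits[lo:lo + group]
        s0, c0 = ripple_add(ga, gb, 0)   # case Cin = 0
        s1, c1 = ripple_add(ga, gb, 1)   # case Cin = 1
        # The incoming carry acts as the MUX select line.
        sum_bits, carry = (s1, c1) if carry else (s0, c0)
        for k, bit in enumerate(sum_bits):
            total |= bit << (lo + k)
    return total, carry
```

In hardware the two cases for every group are evaluated in parallel, so only the MUX chain sits on the carry path; the sequential loop here models the result, not the timing.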
Conditional-sum Adder
• Extending the idea of carry-select adder we can design
Conditional-sum adder.
• The n-bit conditional-sum adder that uses n single-bit
conditional adders, together with a tree of 2:1 MUXs is as
shown.
• Split the n-bit adder into an i-bit adder for the i LSBs and an
(n - i)-bit adder for the n - i MSBs.
• Both of the smaller adders generate two conditional sums as
well as true and complement carry signals.
• The two (true and complement) carry signals from the LSB
adder are used to select between the two (n - i + 1)-bit
conditional sums from the MSB adder using 2(n - i + 1) two-
input MUXes.
• For example, split a 16-bit adder (n = 16) using i = 8 into two
8-bit adders; then we can split one or both 8-bit adders again, and so on.
• The figure shows the normalized delay and area figures for a set
of pre-designed datapath adders.
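The recursive split described above can be modeled in a few lines of Python. This is a behavioral sketch under stated assumptions: `cond_sum` and `cond_sum_add` are hypothetical names, the split point is fixed at the midpoint (i = n // 2), and list indexing stands in for the 2:1 MUX tree.

```python
def cond_sum(a_bits, b_bits):
    """Return ((sum0, cout0), (sum1, cout1)) for carry-in 0 and 1.

    Bit lists are LSB-first and of equal length.
    """
    n = len(a_bits)
    if n == 1:
        # Single-bit conditional adder: both results computed up front.
        a, b = a_bits[0], b_bits[0]
        return ([a ^ b], a & b), ([a ^ b ^ 1], a | b)
    i = n // 2
    low = cond_sum(a_bits[:i], b_bits[:i])     # i LSBs
    high = cond_sum(a_bits[i:], b_bits[i:])    # n - i MSBs

    def merge(low_case):
        low_sum, low_carry = low_case
        # The LSB adder's carry selects one of the MSB conditional sums (the MUX).
        high_sum, high_carry = high[low_carry]
        return low_sum + high_sum, high_carry

    return merge(low[0]), merge(low[1])


def cond_sum_add(a, b, width=16, cin=0):
    """Add two integers with the conditional-sum model; returns (sum, carry_out)."""
    a_bits = [(a >> k) & 1 for k in range(width)]
    b_bits = [(b >> k) & 1 for k in range(width)]
    sum_bits, cout = cond_sum(a_bits, b_bits)[cin]
    return sum(bit << k for k, bit in enumerate(sum_bits)), cout
```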
Area vs. delay of Synthesized Adders
Multiplier
• A multiplier is a hardware circuit dedicated to
multiplying binary values.
• The Basic Blocks are the
• full adders
• half adders
• Multiplication involves
• Partial Product generation
• Accumulating the partial products
Traditional Multiplier
• Multiplication can be viewed as repeated shifts and adds.
• It uses an adder, a shift register, and a small amount of
control logic and it is slow.
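A minimal Python sketch of the shift-and-add view, assuming unsigned operands; the function name `shift_add_multiply` and the `n_bits` parameter are illustrative, not taken from the slides.

```python
def shift_add_multiply(x, y, n_bits):
    """Multiply unsigned x (n_bits wide) by y, one multiplier bit per step."""
    product = 0
    for i in range(n_bits):
        if (x >> i) & 1:          # control logic: test multiplier bit i
            product += y << i     # add the shifted multiplicand
    return product
```

The loop takes one iteration per multiplier bit, which mirrors why the sequential hardware version is slow.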
Multiplication: General Form
❑ Multiplicand: $Y = (y_{M-1}, y_{M-2}, \ldots, y_1, y_0)$
❑ Multiplier: $X = (x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$
❑ Product: $P = \left(\sum_{j=0}^{M-1} y_j 2^j\right)\left(\sum_{i=0}^{N-1} x_i 2^i\right) = \sum_{i=0}^{N-1}\sum_{j=0}^{M-1} x_i y_j 2^{i+j}$
                              y5   y4   y3   y2   y1   y0   multiplicand
                              x5   x4   x3   x2   x1   x0   multiplier
                              x0y5 x0y4 x0y3 x0y2 x0y1 x0y0
                         x1y5 x1y4 x1y3 x1y2 x1y1 x1y0
                    x2y5 x2y4 x2y3 x2y2 x2y1 x2y0           partial
               x3y5 x3y4 x3y3 x3y2 x3y1 x3y0                products
          x4y5 x4y4 x4y3 x4y2 x4y1 x4y0
     x5y5 x5y4 x5y3 x5y2 x5y1 x5y0
p11  p10  p9   p8   p7   p6   p5   p4   p3   p2   p1   p0   product
Fewer Partial Products

• Array multiplier requires N partial products
• If we looked at groups of r bits, we could form N/r partial
products.
• Faster and smaller?
• Called radix-2^r encoding
• Ex: r = 2: look at pairs of bits
• Form partial products of 0, Y, 2Y, 3Y
• First three are easy, but 3Y requires an adder
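As a rough illustration of the r = 2 case, the sketch below forms one partial product per pair of multiplier bits. The function name `radix4_partial_products` is hypothetical, operands are assumed unsigned, and the integer multiply by the digit stands in for selecting 0, Y, 2Y, or 3Y.

```python
def radix4_partial_products(x, y, n_bits):
    """Return the N/2 shifted partial products of unsigned x times y."""
    partial_products = []
    for k in range(0, n_bits, 2):
        digit = (x >> k) & 0b11                       # pair of bits: 0, 1, 2, or 3
        # 0, Y, 2Y are a select or a shift; 3Y = Y + 2Y needs an adder in hardware.
        partial_products.append((digit * y) << k)
    return partial_products
```

For a 6-bit multiplier this yields three partial products instead of six, and their sum is the product.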
Booth Encoding
• Refer Class notes for detailed material and problems
Logical Effort
• The logical effort of a logic gate tells how much worse it is at
producing output current than an inverter, given that each of its
inputs may present only the same input capacitance as the
inverter.
• It captures enough information about a logic gate’s topology
(the network of transistors that connects the gate’s output to the
power supply and ground) to determine the delay of the
logic gate.
• It is independent of the actual size of the logic gate, allowing
one to postpone detailed calculations of transistor sizes until
after the logical effort analysis is complete.
• An easy way to estimate delay in an MOS circuit.
Logical Effort Contd..
• A design procedure for achieving the least delay along a path
of a logic network.
• Describes drive capability relative to that of a reference
inverter.
• Number of times worse it is at delivering output current than
would be an inverter with identical input capacitance.
• It combines into one calculation the effort required to drive
large electrical loads and to perform logic functions.
• By comparing delay estimates of different logic structures,
the fastest candidate can be selected.
Logical Effort
• Logical effort is the ratio of the input capacitance of a gate to
the input capacitance of an inverter with the same output
current.
• Logical effort increases with the gate complexity.
• Logical effort is independent of the size of a logic cell.
• Logical effort is a function of topology, independent of
sizing.
• Logical effort achieves an approximate optimum because it
ignores several second-order effects.
Logical Effort Contd..
❑ Logical effort is a method to make these decisions:
– Uses a simple model of delay
– Allows back-of-the-envelope calculations
– Helps make rapid comparisons between alternatives
❑ Calculation of logical effort for a logic gate is straightforward:
– Design the logic gate, picking transistor sizes that make it as good a
driver of output current as the reference inverter.
– Logical effort per input for a particular input is the ratio of the
capacitance of that input to the total input capacitance of the
reference inverter.
– The total logical effort of the gate is the sum of the logical efforts of
all of its inputs.
Logical Effort

tPD = R (Cout + Cp) + tq


tPD = (0.07 + 1.46 Cout + 0.15) ns
Logical Effort
Delay
• The delay equation is the sum of three terms,
d=f+p+q
delay = effort delay + parasitic delay + nonideal delay
• The effort delay f is the product of logical effort, g, and
electrical effort, h:
f = g·h
• Thus, delay = logical effort x electrical effort + parasitic delay
+ nonideal delay.

• R and C will change as we scale a logic cell, but the RC product
stays the same.
Delay in a Logic Gate
❑ Express delays in process-independent unit d
d = d_abs / τ, where d_abs is the absolute delay and τ is the process-dependent unit delay
❑ Delay has two components: d = f + p
❑ f: effort delay = gh
– has two components
❑ g: logical effort
– Measures relative ability of gate to deliver current
– g ≡ 1 for inverter
❑ h: electrical effort = Cout / Cin
– Ratio of output to input capacitance
– Sometimes called fanout
❑ p: parasitic delay
– Represents delay of gate driving no load
– Set by internal parasitic capacitance
Logical Effort
• The logical effort is the ratio of its input capacitance to that of
an inverter that delivers equal output current.
• The logical effort of a gate represents the ratio of its input
capacitance to the inverter capacitance when sized to deliver
the same current.
• We can find logical effort by scaling a logic cell to have the
same drive as a 1X minimum-size inverter.
• Then the logical effort, g, is the ratio of the input capacitance,
Cin, of the 1X logic cell to Cinv.
• Inverter has the smallest logical effort and intrinsic delay of
all static CMOS gates.
Computing Logical Effort
• Logical effort is the ratio of the input capacitance of a gate to
the input capacitance of an inverter delivering the same
output current.
• Measure from delay vs. fanout plots or estimate by counting
transistor widths

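[Figure: transistor widths and input capacitance for three gates, each sized to deliver the same output current as the inverter. Inverter: PMOS 2, NMOS 1, Cin = 3, g = 3/3. 2-input NAND: PMOS 2 and NMOS 2 per input, Cin = 4, g = 4/3. 2-input NOR: PMOS 4 and NMOS 1 per input, Cin = 5, g = 5/3.]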
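Counting transistor widths generalizes to n-input gates. The sketch below assumes the same sizing as the figure (PMOS twice as wide as NMOS for equal drive, so the reference inverter has Cin = 3); the function names are illustrative.

```python
from fractions import Fraction

INV_CIN = 3  # reference inverter: PMOS width 2 + NMOS width 1


def nand_logical_effort(n):
    """n series NMOS are widened to n each; the parallel PMOS stay at 2."""
    return Fraction(n + 2, INV_CIN)


def nor_logical_effort(n):
    """n series PMOS are widened to 2n each; the parallel NMOS stay at 1."""
    return Fraction(2 * n + 1, INV_CIN)
```

With n = 2 these give 4/3 and 5/3, matching the values in the figure, and with n = 1 both reduce to the inverter's g = 1.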
Delay Plots
d = f + p = gh + p
[Figure: normalized delay d (0 to 6) plotted against electrical effort h = Cout / Cin (0 to 5), with the effort delay f and the parasitic delay p marked. Inverter: g = 1, p = 1, d = h + 1. 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2.]
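The two lines in the plot can be tabulated directly from d = gh + p. This is a small sketch using the g and p values quoted on the slide; `gate_delay` is an illustrative name.

```python
def gate_delay(g, h, p):
    """Normalized delay of one gate: effort delay g*h plus parasitic delay p."""
    return g * h + p


for h in range(6):
    inverter = gate_delay(1, h, 1)       # d = h + 1
    nand2 = gate_delay(4 / 3, h, 2)      # d = (4/3)h + 2
    print(f"h={h}  inverter d={inverter:.2f}  NAND2 d={nand2:.2f}")
```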
Predicting Delay
Naming of Complex CMOS Combinational
Logic Cells
Logical Area and Logical Efficiency
Multistage Cells
Multistage Networks
Stage effort: $f_i = g_i h_i$
Path electrical effort: $H = C_{out} / C_{in}$
Path logical effort: $G = g_1 g_2 \cdots g_N$
Branching effort: $B = b_1 b_2 \cdots b_N$
Path effort: $F = GHB = f_1 f_2 \cdots f_N$
Path delay: $D = \sum d_i = \sum p_i + \sum f_i$
Multistage Logic Networks
• Logical effort generalizes to multistage networks
• Path Logical Effort: $G = \prod g_i$
• Path Electrical Effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
• Path Effort: $F = \prod f_i = \prod g_i h_i$
• Can we write F = GH?
Optimum Path Delay
• Refer class notes
Designing Fast Circuits
$D = \sum d_i = D_F + P$
• Delay is smallest when each stage bears the same effort
$\hat{f} = g_i h_i = F^{1/N}$
• Thus minimum delay of the N-stage path is
$D = N F^{1/N} + P$
• This is a key result of logical effort
– Find the fastest possible delay
– Doesn’t require calculating gate sizes
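The equal-effort claim can be checked numerically for a two-stage path. In this sketch the path effort F = 36 and the candidate grid are arbitrary illustrative choices; with F fixed, the second stage's effort is F / f1, so the effort delay is f1 + F / f1.

```python
def two_stage_effort_delay(F, f1):
    """Effort delay of a 2-stage path whose stage efforts multiply to F."""
    return f1 + F / f1


def min_path_delay(F, N, P):
    """Minimum delay of an N-stage path: N * F**(1/N) + P."""
    return N * F ** (1.0 / N) + P


F = 36.0
candidates = [1 + 0.5 * k for k in range(1, 40)]    # f1 from 1.5 to 20.5
best_f1 = min(candidates, key=lambda f1: two_stage_effort_delay(F, f1))
print(best_f1, two_stage_effort_delay(F, best_f1))  # best f1 equals sqrt(F)
```

The sweep settles on f1 = 6, the square root of F, which is the point where both stages bear the same effort.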
Gate Sizes
• Use logical effort in the design of logic cells and in the design
of logic that uses logic cells.
• If we have the flexibility to continuously size each logic cell
(in practice we usually choose from fixed drive strengths such as
1X, 2X, and 4X), each logic stage can be sized using the equation
for the individual stage electrical efforts.
$\hat{f} = gh = g \, C_{out} / C_{in}$
Find $h_i$ from the above equation: $C_{in_i} = g_i C_{out_i} / \hat{f}$

• Consider the example of a multistage gate.
• Work backward starting at the fixed load capacitance at the
input of the last inverter.
Gate Sizes
• How wide should the gates be for the least delay?

$\hat{f} = gh = g \, C_{out} / C_{in}$
$\Rightarrow C_{in_i} = g_i C_{out_i} / \hat{f}$

• Working backward, apply capacitance transformation to find
the input capacitance of each gate given the load it drives.
Gate Sizes
• The optimum sizes of the NAND cells are not very different
from 1X in this case because H = 1 and we are only driving a
load no bigger than the input capacitance.
• What is the optimum stage effort if we have to drive a large
load, H >> 1?
• Notice that, so far, we have only calculated the optimum
stage effort when we have a fixed number of stages, N.
• We have said nothing about the situation in which we are free
to choose, N, the number of stages.
Example: Optimize Path
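[Figure: four-stage path. The first stage has input capacitance 1, the following stages have input capacitances a, b, and c, and the last stage drives a load of 5. Stage logical efforts: g = 1, g = 5/3, g = 5/3, g = 1.]
Effective fanout =
G =
H =
F =
fi =
a =
b =
c =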
Example: Optimize Path
Effective fanout = 5
G = 25/9
H=5
F = 125/9 = 13.9
fi = 1.93
c = 5g4/fi = 2.59
b = 2.59g3/fi = 2.23
a = 2.23g2/fi = 1.93
Delay = 4 fi = 7.72 (effort delay only; parasitic delay not included)
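The worked example can be reproduced with a short backward-sizing routine. This is a sketch assuming no branching (B = 1); `size_path` and its parameter names are illustrative.

```python
import math


def size_path(g, c_in, c_load):
    """Return (F, f_hat, caps): path effort, stage effort, and input capacitance per stage."""
    n_stages = len(g)
    G = math.prod(g)
    H = c_load / c_in
    F = G * H
    f_hat = F ** (1.0 / n_stages)
    caps = [0.0] * n_stages
    c_out = c_load
    # Work backward from the load: Cin_i = g_i * Cout_i / f_hat.
    for i in range(n_stages - 1, -1, -1):
        caps[i] = g[i] * c_out / f_hat
        c_out = caps[i]
    return F, f_hat, caps


F, f_hat, caps = size_path([1, 5 / 3, 5 / 3, 1], c_in=1, c_load=5)
print(F, f_hat, caps)
```

The last three entries of `caps` correspond to a, b, and c, and the first entry comes back as the specified input capacitance of 1, which is a useful consistency check. At full precision b works out to about 2.236, which the slide lists as 2.23.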


Optimum Number of Stages
• Chain of N inverters each with equal stage effort, f = gh.
• Total path delay is Nf = Ngh = Nh, since g = 1 for an inverter.
• To drive a path electrical effort H, $h^N = H$, or $N \ln h = \ln H$.
• Delay, $Nh = h \ln H / \ln h$.
• Since $\ln H$ is fixed, we can only vary $h / \ln h$.
• $h / \ln h$ is a shallow function with a minimum at h = e ≈ 2.718.
• Total delay is $Ne = e \ln H$.
• Figure shows us how to minimize delay regardless of area or
power and neglecting parasitic and non-ideal delays.
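A quick numerical sweep shows both the location of the minimum and how shallow it is. The value H = 1000 and the sweep grid are arbitrary illustrative choices, and N is treated as a continuous quantity, as in the derivation above.

```python
import math


def chain_delay(H, h):
    """Delay N*h of an inverter chain with N = ln(H) / ln(h) stages."""
    return h * math.log(H) / math.log(h)


H = 1000.0
stage_efforts = [1.5 + 0.01 * k for k in range(500)]   # h from 1.50 to 6.49
best_h = min(stage_efforts, key=lambda h: chain_delay(H, h))
print(best_h, chain_delay(H, best_h), chain_delay(H, 4.0))
```

Comparing the delay at the best h with the delay at h = 4 gives a feel for how little is lost by moving away from e.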
Optimum Number of Stages
Review of Definitions
Term                Stage                                                       Path
number of stages    1                                                           N
logical effort      g                                                           $G = \prod g_i$
electrical effort   $h = C_{out} / C_{in}$                                      $H = C_{out\text{-}path} / C_{in\text{-}path}$
branching effort    $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$    $B = \prod b_i$
effort              f = gh                                                      F = GBH
effort delay        f                                                           $D_F = \sum f_i$
parasitic delay     p                                                           $P = \sum p_i$
delay               d = f + p                                                   $D = \sum d_i = D_F + P$
Method of Logical Effort
1) Compute path effort $F = GBH$
2) Estimate best number of stages $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay $D = N F^{1/N} + P$
5) Determine best stage effort $\hat{f} = F^{1/N}$
6) Find gate sizes $C_{in_i} = g_i C_{out_i} / \hat{f}$
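Steps 1, 2, 4, and 5 can be strung together as a small planning helper. This is a sketch only: `plan_path` is a hypothetical name, rounding N to the nearest integer (with a minimum of one stage) is an assumption, and the parasitic term P is left out because it depends on which gates are chosen in step 3.

```python
import math


def plan_path(G, B, H):
    """Return (F, N, f_hat, effort_delay) for a path with the given efforts."""
    F = G * B * H                              # step 1: path effort
    N = max(1, round(math.log(F, 4)))          # step 2: N is about log4(F)
    f_hat = F ** (1.0 / N)                     # step 5: best stage effort
    effort_delay = N * f_hat                   # step 4 without the parasitic term P
    return F, N, f_hat, effort_delay


print(plan_path(G=1, B=1, H=64))
```

For F = 64 this suggests three stages with a stage effort close to 4, consistent with the rule of thumb that paths are fastest when stage efforts are about 4.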

Summary
• Logical effort is useful for thinking of delays in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn’t mean faster paths
– Delay of path is about $\log_4 F$ FO4 inverter delays
– Inverters and NAND2 best for driving large caps
• Provides language for discussing fast circuits
– But requires practice to master
THANK YOU
Branching Effort
❑ Introduce branching effort
– Accounts for branching between stages in path
$b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$
$B = \prod b_i$
Note: $\prod h_i = BH$
❑ Now we compute the path effort
– F = GBH
Branching Effort
Paths that Branch
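❑ No! Consider paths that branch:
[Figure: a gate with input capacitance 5 drives two branches, each a gate with input capacitance 15 driving a load of 90.]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1 g2 h1 h2 = 36 = 2GH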
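The numbers in this example can be checked with a few lines of Python. The helper name `branching_effort` is illustrative; the capacitances are the ones given above, and both gates are taken to have logical effort 1 as in the example.

```python
def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path."""
    return (c_on_path + c_off_path) / c_on_path


g1 = g2 = 1
G = g1 * g2
H = 90 / 5
B = branching_effort(15, 15) * branching_effort(90, 0)   # the branch occurs only after stage 1
F = G * B * H

h1 = (15 + 15) / 5      # stage 1 drives both branches
h2 = 90 / 15
print(F, g1 * h1 * g2 * h2)
```

Both expressions give 36, so including B restores agreement between F = GBH and the product of the stage efforts.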
Residue Number System
Parallel Multiplier
❑ Multiplicand: $Y = (y_{M-1}, y_{M-2}, \ldots, y_1, y_0)$
❑ Multiplier: $X = (x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$
❑ Product: $P = \left(\sum_{j=0}^{M-1} y_j 2^j\right)\left(\sum_{i=0}^{N-1} x_i 2^i\right) = \sum_{i=0}^{N-1}\sum_{j=0}^{M-1} x_i y_j 2^{i+j}$

Braun Multiplier
Only for unsigned multiplication
Structure of 4x4 Braun Multiplier

For n x n multiplier, n(n-1) adders are required


Multiplier for Signed Numbers
❑ Baugh-Wooley Multiplier
Fewer Partial Products
❑ Array multiplier requires N partial products
❑ If we looked at groups of r bits, we could form N/r partial
products.
– Faster and smaller?
– Called radix-2^r encoding
❑ Ex: r = 2: look at pairs of bits
– Form partial products of 0, Y, 2Y, 3Y
– First three are easy, but 3Y requires adder
Booth Encoding
❑ Instead of 3Y, try –Y, then increment next partial product
to add 4Y
❑ Similarly, for 2Y, try –2Y + 4Y in next partial product
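One common way to realize this substitution is radix-4 Booth recoding, in which each digit is computed from a bit pair plus the top bit of the previous pair. The sketch below is an illustration under assumptions, not the course's own formulation: the function names are hypothetical, the multiplier is unsigned and zero-extended, and one extra digit absorbs the final carry.

```python
def booth_radix4_digits(x, n_bits):
    """Recode unsigned n_bits-wide x into digits in {-2, -1, 0, 1, 2}, least significant first."""
    digits = []
    prev = 0                                  # bit to the right of the current pair
    for k in range(0, n_bits + 1, 2):         # extra group handles the last carry
        b0 = (x >> k) & 1
        b1 = (x >> (k + 1)) & 1
        digits.append(b0 + prev - 2 * b1)     # e.g. pair 11 becomes -1, carrying +4 onward
        prev = b1
    return digits


def booth_multiply(x, y, n_bits):
    """Sum the partial products digit * y * 4**i; none of them needs 3Y."""
    return sum(d * y * 4 ** i for i, d in enumerate(booth_radix4_digits(x, n_bits)))
```

For x = 3 the digits are -1 then +1 (that is, -Y + 4Y), and for x = 2 they are -2 then +1 (that is, -2Y + 4Y), matching the two substitutions above.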
I/O Cells
• Three-state bidirectional output buffer is as shown below:
• The three-state buffer allows us to employ the same pad for
input and output (bidirectional I/O).
• When we want to use the pad as an input, we set OE low and
take the data from DATAin.
• When the output enable (OE) signal is high, the circuit
functions as a noninverting buffer driving the value of
DATAin onto the I/O pad.
• When OE is low, the output transistors or drivers, M1 and
M2, are disconnected.
• This allows multiple drivers to be connected on a bus.
• It is up to the designer to make sure that a bus never has two
active drivers at once, a problem known as contention.
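The bus behavior can be modeled in a few lines of Python. This is a behavioral sketch only: `tristate` and `resolve_bus` are hypothetical names, `None` stands for the high-impedance state, and raising an error is simply one way to flag contention in a model.

```python
def tristate(data_in, oe):
    """Noninverting three-state buffer: drive data_in when OE is high, else high impedance."""
    return data_in if oe else None


def resolve_bus(drivers):
    """drivers is a list of (data_in, oe) pairs sharing one bus wire."""
    driven = [tristate(data_in, oe) for data_in, oe in drivers if oe]
    if len(driven) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return driven[0] if driven else None      # None: the bus is floating
```

For example, `resolve_bus([(1, False), (0, True)])` returns the value of the single enabled driver, while enabling both drivers raises the contention error.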
