
Unit 2 Analysis

The document outlines the course file for VLSI Design and Testing at Visvesvaraya Technological University, detailing the vision and mission of the institute and department, along with program educational objectives and outcomes. It covers the course content, including MOS transistor theory, CMOS fabrication, semiconductor memories, and testing of digital circuits. Additionally, it highlights the historical context and advancements in VLSI technology, emphasizing the exponential growth in integration density and its implications for various applications.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

COURSE FILE

VLSI DESIGN AND TESTING


Sub Code: 21EC63

Prepared by: Dr. Kiran Kumar V G, Associate Professor, Dept. of E&CE

Approved by: H.O.D, Dept. of E&CE

Department of Electronics & Communication Engineering


A J Institute of Engineering & Technology
Mangaluru
VLSI DESIGN AND TESTING 21EC63

Accredited By NBA (BE: CV, CSE, ECE, ISE & ME)

Vision of the Institute

To produce top-quality engineers who are groomed for attaining excellence in their profession and
competitive enough to help in the growth of nation and global society.

Mission of the Institute

M1: To offer affordable high-quality graduate program in engineering with value
education and make the students socially responsible.

M2: To support and enhance the institutional environment to attain research
excellence in both faculty and students and to inspire them to push the
boundaries of knowledge base.

M3: To identify the common areas of interest amongst the individuals for the
effective industry-institute partnership in a sustainable way by systematically
working together.

M4: To promote the entrepreneurial attitude and inculcate innovative ideas among
the engineering professionals.



Department of Electronics & Communication Engineering


Vision of the Department
To be recognized as a center of excellence in the area of Electronics and Communication
Engineering by nurturing the young innovative minds into skillful and ethical professionals to
cater the industrial and societal needs.

Mission of the Department


M1. To establish state-of-the-art laboratories to facilitate research and innovation to upgrade
the knowledge and skills in healthcare sector and IoT.
M2. To provide industry interaction for training programs on latest technology.
M3. To provide ethical and value based education by promoting activities addressing the
societal needs.

PROGRAM EDUCATIONAL OBJECTIVES (PEOs)


PEO1. Exhibit a desire for lifelong learning through professional and societal activities.
PEO2. Exhibit and apply their technical skills and knowledge in Electronics and
Communication Engineering for industry and societal needs.
PEO3. Exhibit leadership qualities, professional skills, management skills and ethics
needed for successful career.

PROGRAM OUTCOMES (POs)

1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems and
design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex
engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give
and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member
and leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage
in independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO1. Embedded Systems: Ability to apply the fundamental knowledge of core Electronics
and Communication Engineering courses in the analysis, design, and development of
integrated electronic systems and healthcare devices.
PSO2. Communication Systems: Ability to apply the fundamental knowledge of signal
processing in the analysis, design, and development of communication systems.

PSO3. Simulation: Ability to use modern electronic tools such as MATLAB, Xilinx,
Multisim etc, to design and analyze the complex electronics and communication
systems.

Course Contents

Module | Details | RBT Level | CO Mapped
-------|---------|-----------|----------
1 | Introduction: A Brief History, MOS Transistors, CMOS Logic. MOS Transistor Theory: Introduction, Long-channel I-V Characteristics, Non-ideal I-V Effects, DC Transfer Characteristics. | L1, L2, L3 | CO1
2 | Fabrication: CMOS Fabrication and Layout, Introduction, CMOS Technologies, Layout Design Rules. Delay: Introduction, Transient Response, RC Delay Model, Linear Delay Model, Logical Efforts of Paths. | L1, L2, L3 | CO2
3 | Semiconductor Memories: Introduction, Dynamic Random Access Memory (DRAM) and Static Random Access Memory (SRAM), Nonvolatile Memory, Flash Memory, Ferroelectric Random Access Memory (FRAM). | L1, L2, L3 | CO3
4 | Faults in digital circuits: Failures and faults, Modelling of faults, Temporary faults. Test generation for combinational logic circuits: Fault diagnosis of digital circuits, test generation techniques for combinational circuits, Detection of multiple faults in combinational logic circuits. | L1, L2, L3 | CO4
5 | Test generation for sequential circuits: Testing of sequential circuits as iterative combinational circuits, state table verification, test generation based on circuit structure, functional fault models, test generation based on functional fault models. Design of testable sequential circuits: Controllability and Observability, Ad hoc design rules, design of diagnosable sequential circuits, The scan path technique, LSSD, Random Access scan technique, partial scan. | L1, L2, L3 | CO5
Course Outcome
At the end of the course the student will be able to:

Sl. No. | DESCRIPTION | PO MAPPING | PSO MAPPING
--------|-------------|------------|------------
CO1 | Demonstrate understanding of MOS transistor theory, CMOS fabrication flow and technology scaling. | PO1, PO2, PO5 | PSO1, PSO3
CO2 | Draw the basic gates using the stick and layout diagrams with the knowledge of physical design aspects. | PO1, PO2, PO3, PO4, PO5 | PSO1, PSO3
CO3 | Interpret the memory elements along with timing considerations. | PO1, PO2, PO3, PO4, PO5 | PSO1, PSO3
CO4 | Interpret testing and testability issues in combinational logic design. | PO1, PO2, PO3, PO4 | PSO1
CO5 | Interpret testing and testability issues in sequential logic design. | PO1, PO2, PO3, PO4 | PSO1

TEXT BOOKS & REFERENCE BOOKS:

BOOK TITLE / AUTHORS / PUBLICATION


T-1. “CMOS VLSI Design: A Circuits and Systems Perspective”, Neil H. E. Weste and David Money Harris, 4th Edition, Pearson Education.
T-2. “CMOS Digital Integrated Circuits: Analysis and Design”, Sung-Mo Kang and Yusuf Leblebici, Third Edition, Tata McGraw-Hill.
T-3. “Digital Circuit Testing and Testability”, Parag K. Lala, Academic Press, New York, 1997.
R-1. “Basic VLSI Design”, Douglas A. Pucknell and Kamran Eshraghian, 3rd Edition, Prentice Hall of India, 2005.
R-2. “Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits”, Michael L. Bushnell and Vishwani D. Agrawal, Springer, 2002.

Module 1: Introduction to VLSI

A Brief History
In 1958, Jack Kilby built the first integrated circuit flip-flop with two transistors at Texas
Instruments. In 2008, Intel’s Itanium microprocessor contained more than 2 billion transistors
and a 16 Gb Flash memory contained more than 4 billion transistors. This corresponds to a
compound annual growth rate of 53% over 50 years. No other technology in history has
sustained such a high growth rate lasting for so long. This incredible growth has come from
steady miniaturization of transistors and improvements in manufacturing processes. Most other
fields of engineering involve tradeoffs between performance, power, and price. However, as
transistors become smaller, they also become faster, dissipate less power, and are cheaper to
manufacture. This synergy has not only revolutionized electronics, but also society at large.
The processing performance once dedicated to secret government supercomputers is now
available in disposable cellular telephones. The memory once needed for an entire company’s
accounting system is now carried by a teenager in her iPod. Improvements in integrated circuits
have enabled space exploration, made automobiles safer and more fuel-efficient, revolutionized
the nature of warfare, brought much of mankind’s knowledge to our Web browsers, and made
the world a flatter place. Figure 1.1 shows annual sales in the worldwide semiconductor market.
Integrated circuits became a $100 billion/year business in 1994. In 2007, the industry
manufactured approximately 6 quintillion (6 × 10^18) transistors, or nearly a billion for every
human being on the planet. Thousands of engineers have made their fortunes in the field. New
fortunes lie ahead for those with innovative ideas and the talent to bring those ideas to reality.
During the first half of the twentieth century, electronic circuits used large, expensive, power-
hungry, and unreliable vacuum tubes. In 1947, John Bardeen and Walter Brattain built the first
functioning point contact transistor at Bell Laboratories, shown in Figure 1.2(a) [Riordan97].
It was nearly classified as a military secret, but Bell Labs publicly introduced the device the
following year.
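
The 53% growth figure quoted above can be checked with a short calculation. The sketch below is only an illustration: it assumes the two endpoints given in the text (2 transistors in 1958 and about 4 billion in 2008), and the function name is a hypothetical helper, not something from the course material.

```python
def compound_annual_growth_rate(start_count, end_count, years):
    """Constant yearly growth rate that turns start_count into end_count."""
    return (end_count / start_count) ** (1.0 / years) - 1.0


# Endpoints quoted in the text: Kilby's 2-transistor flip-flop (1958)
# and a Flash memory with roughly 4 billion transistors (2008).
rate = compound_annual_growth_rate(2, 4e9, 2008 - 1958)
print(f"Compound annual growth rate: {rate:.1%}")
```

The result comes out at a little over 53% per year, which agrees with the rate stated in the paragraph.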
Over the last two decades the electronics industry has achieved a phenomenal growth, mainly
due to the rapid advances in integration technologies, very large-scale systems design - in short,
due to the advent of VLSI. VLSI stands for "Very Large Scale Integration". This is the field
which involves packing more and more logic devices into smaller and smaller areas. The
number of applications of integrated circuits in high-performance computing,
telecommunications, consumer electronics and now in the field of mechanical engineering
(MEMS) has been rising steadily, and at a very fast pace. Typically, the required computational
power (or, in other words, the intelligence) of these applications is the driving force for the fast
development of this field. Figure 1.1 gives an overview of the prominent trends in information
technologies over the next few decades. The current leading-edge technologies (such as low
bit-rate video and cellular communications) already provide the end-users a certain amount of
processing power and portability. One of the most important characteristics of information
services is their increasing need for very high processing power and bandwidth (in order to
handle real-time video, for example). The other important characteristic is that the information
services tend to become more and more personalized (as opposed to collective services such as
broadcasting), which means that the devices must be more intelligent to answer individual

Dr. Kiran Kumar V G Department of ECE AJIET Mangaluru 1



demands, and at the same time they must be portable to allow more flexibility/mobility.

Figure-1.1: Prominent trends in information service technologies

As more and more complex functions are required in various data processing and
telecommunications devices, the need to integrate these functions in a small system/package is
also increasing. The level of integration, as measured by the number of logic gates in a
monolithic chip, has been steadily rising for almost three decades, mainly due to the rapid
progress in processing technology and interconnect technology. Table 1.1 shows the evolution
of logic complexity in integrated circuits over the last three decades, and marks the milestones
of each era. Here, the numbers for circuit complexity should be interpreted only as
representative examples to show the order-of-magnitude.

ERA | DATE | COMPLEXITY | Inventions
----|------|------------|-----------
Single transistor | 1959 | less than 1 |
Unit logic (one gate) | 1960 | 1 | Junction diodes and transistors
Multi-function | 1962 | 2 - 4 |
Complex function (SSI) | 1964 | 5 - 20 | Logic gates and flip-flops
Medium Scale Integration (MSI) | 1967 | 20 - 200 | Counters and multiplexers
Large Scale Integration (LSI) | 1972 | 200 - 2000 | 16-bit and 32-bit microprocessors
Very Large Scale Integration (VLSI) | 1978 | 2000 - 20000 | Special processors
Ultra Large Scale Integration (ULSI) | 1989 | 20000 - ? | PIV processors
Table-1.1: Evolution of logic complexity in integrated circuits.

A logic block can contain anywhere from 10 to 100 transistors, depending on the function.
State-of-the-art examples of ULSI chips, such as the DEC Alpha or the INTEL Pentium, contain
3 to 6 million transistors. The most important message here is that the logic complexity per chip
has been (and still is) increasing exponentially. The monolithic integration of a large number
of functions on a single chip usually provides:

• Less area/volume and therefore, compactness
• Less power consumption
• Fewer testing requirements at system level
• Higher reliability, mainly due to improved on-chip interconnects
• Higher speed, due to significantly reduced interconnection length
• Significant cost savings

Figure-1.2: Evolution of integration density and minimum feature size, as seen in the
early 1980s.

Therefore, the current trend of integration will also continue in the foreseeable future.
Advances in device manufacturing technology, and especially the steady reduction of
minimum feature size (minimum length of a transistor or an interconnect realizable on chip)
support this trend. Figure 1.2 shows the history and forecast of chip complexity - and minimum
feature size - over time, as seen in the early 1980s. At that time, a minimum feature size of 0.3
microns was expected around the year 2000. The actual development of the technology,
however, has far exceeded these expectations. A minimum size of 0.25 microns was readily
achievable by the year 1995. As a direct result of this, the integration density has also exceeded
previous expectations - the first 64 Mbit DRAM, and the INTEL Pentium microprocessor chip
containing more than 3 million transistors were already available by 1994, pushing the
envelope of integration density.

When comparing the integration density of integrated circuits, a clear distinction must be made
between the memory chips and logic chips. Figure 1.3 shows the level of integration over time
for memory and logic chips, starting in 1970. It can be observed that in terms of transistor
count, logic chips contain significantly fewer transistors in any given year mainly due to large
consumption of chip area for complex interconnects. Memory circuits are highly regular and
thus more cells can be integrated with much less area for interconnects.


Figure-1.3: Level of integration over time, for memory chips and logic chips.

Generally speaking, logic chips such as microprocessor chips and digital signal
processing (DSP) chips contain not only large arrays of memory (SRAM) cells, but also many
different functional units. As a result, their design complexity is considered much higher than
that of memory chips, although advanced memory chips contain some sophisticated logic
functions. The design complexity of logic chips increases almost exponentially with the
number of transistors to be integrated. This is translated into the increase in the design cycle
time, which is the time period from the start of the chip development until the mask-tape
delivery time. However, in order to make the best use of the current technology, the chip
development time has to be short enough to allow the maturing of chip manufacturing and
timely delivery to customers. As a result, the level of actual logic integration tends to fall short
of the integration level achievable with the current processing technology. Sophisticated
computer-aided design (CAD) tools and methodologies are developed and applied in order to
manage the rapidly increasing design complexity.

1.3 Why CMOS?

A CMOS structure dissipates very little power, although it requires more area than other
structures. Overall it is the preferred structure compared with the alternatives. For
battery-operated devices, the key benefit is low energy consumption per computation. Lower
power means a smaller battery and less heat to dissipate.

1.3.1. Comparison between MOSFET and BJT

No. | MOSFET | BJT
----|--------|----
1 | It has a low switching loss | It has a higher switching loss
2 | It has a high conduction loss | It has a lower conduction loss
3 | Low output drive | Higher output drive
4 | It is a voltage controlled device | It is a current controlled device
5 | It has a positive temperature coefficient, so parallel operation is easier | It has a negative temperature coefficient, so parallel operation requires current sharing resistors
6 | Secondary breakdown doesn’t occur | Secondary breakdown occurs
7 | Finds application in high frequency fields | Used in lower operating frequencies
8 | High noise margin | Lower noise margin
9 | Operation depends on majority carriers only | Operation depends on both electrons and holes (majority and minority carriers)
10 | Power dissipation is low | Power dissipation is high
11 | Unipolar device | Bipolar device
12 | Gate current required is very small; gate impedance is of the order of 10^9 MΩ | Base current required is high
Table-1.2: Comparison between MOSFET and BJT.

1.3.2. Moore’s Law

Gordon Moore (co-founder of Intel) predicted in 1965 that the number of transistors per chip
would keep doubling at regular intervals; the interval is commonly quoted as 18 months. The
predictions have largely come true, except for an increasing divergence between the predicted
and actual curves over the last few years, caused by the complexity involved in designing and
testing large circuits.
To bring the actual curve closer to the prediction, it is necessary to improve the technology in
terms of both scaling and processing, and to incorporate BiCMOS and newer technologies such as
GaAs (Gallium Arsenide).

Figure-1.6: Moore’s first law: transistors integrated on a single chip.
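
To get a feel for what a fixed doubling interval implies, the sketch below projects a transistor count forward and converts the doubling period into an equivalent yearly growth rate. It assumes the 18-month interval quoted above; the function names and the starting count of 1000 are hypothetical, chosen only for illustration.

```python
def projected_transistors(initial_count, years, doubling_period_years=1.5):
    """Transistor count after `years` if the count doubles every doubling period."""
    return initial_count * 2 ** (years / doubling_period_years)


def equivalent_annual_growth(doubling_period_years=1.5):
    """Yearly growth rate that corresponds to the given doubling period."""
    return 2 ** (1.0 / doubling_period_years) - 1.0


# Hypothetical starting point: 1000 transistors, projected 3 years ahead.
# Three years is two doubling periods, so the count quadruples.
print(projected_transistors(1000, 3))
print(f"{equivalent_annual_growth():.1%} per year")
```

A doubling every 18 months works out to a yearly growth of a little under 60%, the same order as the long-run growth rate quoted in the brief history.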

1.4 Basic MOS Transistors.

nMOS devices are formed in a p-type substrate of moderate doping level. The source and drain
regions are formed by diffusing n-type impurities through suitable masks into these areas to
give the desired n-impurity concentration. This gives rise to depletion regions which extend
mainly into the more lightly doped p-region, as shown. Thus source and drain are isolated from
one another by two diodes. Connections to the source and drain are made by a deposited metal
layer.

Figure-1.7 c: Transistor circuit symbols


The MOS transistor

The gate of an MOS transistor controls the flow of current between the source and drain.
Simplifying this to the extreme allows the MOS transistors to be viewed as simple ON/OFF
switches. When the gate of an nMOS transistor is high (1), the transistor is ON and there is a
conducting path from source to drain. When the gate is low, the nMOS transistor is OFF and
almost zero current flows from source to drain. A pMOS transistor is just the opposite, being
ON when the gate is low and OFF when the gate is high. This switch model is illustrated in
Figure 1.10, where g, s, and d indicate gate, source, and drain. This model will be our most
common one when discussing circuit behavior.

CMOS Logic
1.4.1 The Inverter
Figure 1.11 shows the schematic and symbol for a CMOS inverter or NOT gate using one
nMOS transistor and one pMOS transistor. The bar at the top indicates VDD and the triangle
at the bottom indicates GND. When the input A is 0, the nMOS transistor is OFF and the pMOS
transistor is ON. Thus, the output Y is pulled up to 1 because it is connected to VDD but not
to GND. Conversely, when A is 1, the nMOS is ON, the pMOS is OFF, and Y is pulled down
to ‘0.’ This is summarized in Table 1.1.
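
The switch model and the inverter behaviour described above can be written down as a small truth-table check. This is a minimal sketch under the ideal ON/OFF switch assumption; the function names are hypothetical and logic levels are represented as the integers 0 and 1.

```python
def nmos_is_on(gate):
    """Switch model: an nMOS transistor conducts when its gate is 1."""
    return gate == 1


def pmos_is_on(gate):
    """Switch model: a pMOS transistor conducts when its gate is 0."""
    return gate == 0


def inverter(a):
    """CMOS inverter: the pMOS connects Y to VDD, the nMOS connects Y to GND."""
    pulled_up = pmos_is_on(a)
    pulled_down = nmos_is_on(a)
    # In the inverter exactly one of the two paths conducts for any input.
    assert pulled_up != pulled_down
    return 1 if pulled_up else 0


for a in (0, 1):
    print(f"A={a} -> Y={inverter(a)}")
```

The loop prints the two rows of the inverter truth table: Y is 1 when A is 0, and 0 when A is 1.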


1.4.2 The NAND Gate


Figure 1.12(a) shows a 2-input CMOS NAND gate. It consists of two series nMOS transistors
between Y and GND and two parallel pMOS transistors between Y and VDD. If either input
A or B is 0, at least one of the nMOS transistors will be OFF, breaking the path from Y to
GND. But at least one of the pMOS transistors will be ON, creating a path from Y to VDD.
Hence, the output Y will be 1. If both inputs are 1, both of the nMOS transistors will be ON
and both of the pMOS transistors will be OFF. Hence, the output will be 0. The truth table is
given in Table 1.2 and the symbol is shown in Figure 1.12(b). Note that by DeMorgan’s Law,
the inversion bubble may be placed on either side of the gate. In the figures in this book, two
lines intersecting at a T-junction are connected. Two lines crossing are connected if and only
if a dot is shown.
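
The series/parallel argument for the NAND gate can be checked by enumerating all input combinations. The sketch below assumes ideal switches; `nand2` and `truth_table` are hypothetical names used only for this illustration.

```python
from itertools import product


def nand2(a, b):
    """2-input CMOS NAND evaluated from its pull-up and pull-down networks."""
    # Pull-down: two nMOS in series conduct only if both gates are 1.
    pull_down_on = (a == 1) and (b == 1)
    # Pull-up: two pMOS in parallel conduct if either gate is 0.
    pull_up_on = (a == 0) or (b == 0)
    # Exactly one network conducts for every input pattern.
    assert pull_up_on != pull_down_on
    return 1 if pull_up_on else 0


truth_table = {(a, b): nand2(a, b) for a, b in product((0, 1), repeat=2)}
for (a, b), y in truth_table.items():
    print(f"A={a} B={b} -> Y={y}")
```

Only the input pattern A=1, B=1 turns on the series pull-down path, so it is the only row with Y=0.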

The NOR Gate


A 2-input NOR gate is shown in Figure 1.16. The nMOS transistors are in parallel to pull the
output low when either input is high. The pMOS transistors are in series to pull the output high
when both inputs are low, as indicated in Table 1.4. The output is never crowbarred or left
floating.


1.4.3 CMOS Logic Gates


• The inverter and NAND gates are examples of static CMOS logic gates, also called
complementary CMOS gates. In general, a static CMOS gate has an nMOS pull-down
network to connect the output to 0 (GND) and pMOS pull-up network to connect the
output to 1 (VDD), as shown in Figure 1.14. The networks are arranged such that one
is ON and the other OFF for any input pattern.
• The pull-up and pull-down networks in the inverter each consist of a single transistor. The
NAND gate uses a series pull-down network and a parallel pull-up network. More elaborate
networks are used for more complex gates.
• Two or more transistors in series are ON only if all of the
series transistors are ON.
• Two or more transistors in parallel are ON if any of the
parallel transistors are ON.
• This is illustrated in Figure 1.15 for nMOS and pMOS transistor pairs.
• In general, when we join a pull-up network to a pull-down network to form a logic gate
as shown in Figure 1.14, they both will attempt to exert a logic level at the output. The
possible levels at the output are shown in Table 1.3.
• From this table it can be seen that the output of a CMOS logic gate can be in four states.
The 1 and 0 levels have been encountered with the inverter and NAND gates, where
either the pull-up or pull-down is OFF and the other structure is ON. When both pull-up
and pull-down are OFF, the high-impedance or floating Z output state results. This
is of importance in multiplexers, memory elements, and tristate bus drivers. The
crowbarred (or contention) X level exists when both pull-up and pull-down are
simultaneously turned ON. Contention between the two networks results in an
indeterminate output level and dissipates static power. It is usually an unwanted
condition.
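
The four possible output levels described in the list above can be summarised as a small lookup on the state of the two networks. This is a sketch only; the function name and the string labels "0", "1", "Z" and "X" are assumptions chosen to match the notation used in the text.

```python
def resolve_output(pull_up_on, pull_down_on):
    """Output level of a CMOS gate given which of its networks conduct."""
    if pull_up_on and pull_down_on:
        return "X"  # crowbarred (contention): both networks fight over the output
    if pull_up_on:
        return "1"  # output connected to VDD only
    if pull_down_on:
        return "0"  # output connected to GND only
    return "Z"      # neither network conducts: high-impedance (floating) output


for pull_up_on in (False, True):
    for pull_down_on in (False, True):
        level = resolve_output(pull_up_on, pull_down_on)
        print(f"pull-up ON={pull_up_on}, pull-down ON={pull_down_on} -> {level}")
```

A properly designed static CMOS gate only ever reaches the "0" and "1" rows; the "Z" row is used deliberately in tristate structures, and the "X" row is normally a design error.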


Compound Gates
• A compound gate performing a more complex logic function in a single stage of logic
is formed by using a combination of series and parallel switch structures. For example,
the derivation of the circuit for the function $Y = \overline{(A \cdot B) + (C \cdot D)}$ is shown in
Figure 1.18.
• This function is sometimes called AND-OR-INVERT-22, or AOI22 because it
performs the NOR of a pair of 2-input ANDs. For the nMOS pull-down network, take
the uninverted expression ((A · B) + (C · D)) indicating when the output should be
pulled to ‘0.’ The AND expressions (A · B) and (C · D) may be implemented by series
connections of switches, as shown in Figure 1.18(a). Now ORing the result requires the
parallel connection of these two structures, which is shown in Figure 1.18(b).
• For the pMOS pull-up network, we must compute the complementary expression using
switches that turn on with inverted polarity. By DeMorgan’s Law, this is equivalent to
interchanging AND and OR operations. Hence, transistors that appear in series in the
pull-down network must appear in parallel in the pull-up network. Transistors that
appear in parallel in the pull-down network must appear in series in the pull-up network.
• This principle is called conduction complements and has already been used in the design
of the NAND and NOR gates. In the pull-up network, the parallel combination of A and
B is placed in series with the parallel combination of C and D. This progression is
evident in Figure 1.18(c) and Figure 1.18(d). Putting the networks together yields the
full schematic (Figure 1.18(e)). The symbol is shown in Figure 1.18(f ).
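
The conduction-complement principle for the AOI22 gate can be verified exhaustively. The sketch below models each network with ideal switches and checks, for all 16 input patterns, that exactly one network conducts; the function names are hypothetical.

```python
from itertools import product


def aoi22_pull_down_on(a, b, c, d):
    """nMOS network: (A in series with B) in parallel with (C in series with D)."""
    return (a == 1 and b == 1) or (c == 1 and d == 1)


def aoi22_pull_up_on(a, b, c, d):
    """pMOS network: (A parallel B) in series with (C parallel D); pMOS conduct on 0."""
    return (a == 0 or b == 0) and (c == 0 or d == 0)


def aoi22(a, b, c, d):
    """Y = NOT((A AND B) OR (C AND D)), evaluated from the two networks."""
    return 0 if aoi22_pull_down_on(a, b, c, d) else 1


for inputs in product((0, 1), repeat=4):
    # Conduction complements: one network is ON exactly when the other is OFF.
    assert aoi22_pull_up_on(*inputs) != aoi22_pull_down_on(*inputs)
    print(inputs, "->", aoi22(*inputs))
```

Because the pull-up network is the De Morgan dual of the pull-down network, the check never finds a floating or crowbarred output.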


Pass Transistors and Transmission Gates


The strength of a signal is measured by how closely it approximates an ideal voltage
source. In general, the stronger a signal, the more current it can source or sink. The power
supplies, or rails, (VDD and GND) are the source of the strongest 1s and 0s. An nMOS
transistor is an almost perfect switch when passing a 0 and thus we say it passes a strong 0.
However, the nMOS transistor is imperfect at passing a 1. The high voltage level is somewhat
less than VDD, as will be explained in Section 2.5.4. We say it passes a degraded or weak 1.
A pMOS transistor again has the opposite behavior, passing strong 1s but degraded 0s. The
transistor symbols and behaviors are summarized in Figure 1.20 with g, s, and d indicating
gate, source, and drain. When an nMOS or pMOS is used alone as an imperfect switch, we
sometimes call it a pass transistor. By combining an nMOS and a pMOS transistor in parallel
(Figure 1.21(a)), we obtain a switch that turns on when a 1 is applied to g (Figure 1.21(b)) in
which 0s and 1s are both passed in an acceptable fashion (Figure 1.21(c)). We term this a
transmission gate or pass gate. In a circuit where only a 0 or a 1 has to be passed, the appropriate
transistor (n or p) can be deleted, reverting to a single nMOS or pMOS device.
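
The strong and degraded levels described above can be illustrated with a first-order numerical sketch. It assumes the common approximation that an nMOS pass transistor cannot pull its output above its gate voltage minus the threshold, and that a pMOS cannot pull its output below its gate voltage plus the magnitude of its threshold. The supply and threshold values, and all function names, are hypothetical and do not come from the text.

```python
VDD = 1.0    # assumed supply voltage in volts
VTN = 0.3    # assumed nMOS threshold voltage in volts
VTP = -0.3   # assumed pMOS threshold voltage in volts


def nmos_pass(v_in, v_gate=VDD):
    """nMOS pass transistor: strong 0, but the output cannot exceed v_gate - VTN."""
    return min(v_in, v_gate - VTN)


def pmos_pass(v_in, v_gate=0.0):
    """pMOS pass transistor: strong 1, but the output cannot fall below v_gate + |VTP|."""
    return max(v_in, v_gate - VTP)


def transmission_gate(v_in):
    """nMOS and pMOS in parallel, both ON: the less degraded of the two levels wins."""
    candidates = (nmos_pass(v_in), pmos_pass(v_in))
    return min(candidates, key=lambda v_out: abs(v_out - v_in))


for v_in in (0.0, VDD):
    print(f"in={v_in:.2f}  nMOS={nmos_pass(v_in):.2f}  "
          f"pMOS={pmos_pass(v_in):.2f}  TG={transmission_gate(v_in):.2f}")
```

Under these assumptions the nMOS alone delivers a weak 1, the pMOS alone delivers a weak 0, and the transmission gate passes both rails without degradation.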


Principle of operation of nMOS transistor:-


The principle of operation is to control the current conduction between the source and the
drain using the electric field generated by the gate voltage as a control variable.
The simplest bias condition that can be applied to the n-channel enhancement type is as
shown.

Figure-2.2: n-channel enhancement type MOSFET


A positive gate to source voltage VGS is then applied to the gate in order to create a conducting
channel. For a gate voltage far below the threshold voltage Vt (VGS << Vt; in an ideal MOS
structure this corresponds to a negative gate voltage), holes accumulate at the surface under the
gate, as shown in Fig 2.3a. This is termed the accumulation mode.

Figure-2.3: n-channel enhancement type MOSFET action

As VGS is increased toward Vt (0 < VGS < Vt), the holes are repelled back into the substrate and
the surface of the p-type substrate is depleted. With no mobile carriers at the surface, there is
no current conduction between source and drain. This is termed the depletion mode, as in fig 2.3 b.
As the gate to source voltage VGS is increased further, surface inversion occurs, i.e., a
conducting n-type layer forms between the source and drain diffusion regions (the conducting
channel). The gate voltage at which this happens is called the threshold voltage (Vt).
When VGS > Vt, a large number of minority carriers (electrons) are attracted to the surface,
ultimately contributing to the channel formation. This is termed the inversion mode, as shown in
fig 2.3 c and 2.3 d.


Figure-2.3d: n-channel enhancement type MOSFET action (inversion mode).


2.2.2 Influence of drain to source bias VDS:-
Consider the case when VGS > Vt and VDS = 0. Thermal equilibrium exists in the inverted channel
region and the drain current ID = 0. This zero-current condition is shown in fig 2.4 a; note that
the true cut-off region is VGS < Vt, where no channel exists at all.
If a small drain voltage VDS > 0 is applied, a drain current proportional to VDS flows through the
conducting channel, carried by electrons drifting from source to drain in the inversion layer. This
mode of operation is called the linear region (also the non-saturated, unsaturated, or resistive
region) and is shown in fig 2.4b.
In the linear region, the channel acts as a voltage-controlled resistor. The electron velocity in
the channel is usually much lower than the drift velocity limit.
As the drain voltage is increased, the inversion layer charge and the channel depth at the drain
end start to decrease. When VDS = VDSAT = VGS – Vt, the inversion charge at the drain end is reduced
to zero; this is called the pinch-off point. Beyond the pinch-off point (VDS > VGS – Vt), a depleted
surface region forms adjacent to the drain, and this depletion region grows towards the source as
the drain voltage increases. This mode of operation is called the saturation mode or saturation
region.
Figure-2.4: n-channel transistor operation
For a MOS transistor operating in the saturation region, the effective channel length is reduced
because the inversion layer near the drain vanishes. The voltage at the channel end is VDSAT.
The voltage (VDS – VDSAT) appears across the pinched-off section, and a high-field region forms
between the channel end and the drain. Electrons arriving from the source at the channel end are
injected into the drain depletion region and are accelerated towards the drain by this high
electric field.
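
The three operating modes described above can be summarised as a small decision procedure. The sketch below is illustrative only: the function name `nmos_region` and the threshold value of 0.7 V are assumptions, not values taken from these notes, and the device is treated as ideal (no subthreshold conduction).

```python
def nmos_region(vgs: float, vds: float, vt: float) -> str:
    """Classify the operating region of an ideal n-channel enhancement MOSFET."""
    if vgs <= vt:
        return "cutoff"       # no inversion layer, so no channel
    if vds < vgs - vt:
        return "linear"       # channel is continuous from source to drain
    return "saturation"       # channel is pinched off at the drain end


VT = 0.7  # hypothetical threshold voltage in volts

for vgs, vds in [(0.3, 1.0), (1.5, 0.2), (1.5, 1.2)]:
    print(f"VGS={vgs} V, VDS={vds} V -> {nmos_region(vgs, vds, VT)}")
```

The boundary VDS = VGS – Vt is assigned to saturation here, which matches the pinch-off condition VDS = VDSAT.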


Long-Channel I-V Characteristics (derivation of drain current)


As stated previously, MOS transistors have three regions of operation:
Cutoff or subthreshold region
Linear region
Saturation region
Let us derive a model [Shockley52, Cobbold70, Sah64] relating the current and voltage (I-V)
for an nMOS transistor in each of these regions. The model assumes that the channel length is
long enough that the lateral electric field (the field between source and drain) is relatively low,
which is no longer the case in nanometre devices. This model is variously known as the long-
channel, ideal, first-order, or Shockley model. Subsequent sections will refine the model to
reflect high fields, leakage, and other nonidealities. The long-channel model assumes that the
current through an OFF transistor is 0. When a transistor turns ON (Vgs > Vt), the gate attracts
carriers (electrons) to form a channel. The electrons drift from source to drain at a rate
proportional to the electric field between these regions. Thus, we can compute currents if we
know the amount of charge in the channel and the rate at which it moves. We know that the
charge on each plate of a capacitor is Q = CV. Thus, the charge in the channel Qchannel is

$$ Q_{channel} = C_g (V_{gc} - V_t) $$

where Cg is the capacitance of the gate to the channel and Vgc - Vt is the amount of voltage
attracting charge to the channel beyond the minimum required to invert from p to n. The gate
voltage is referenced to the channel, which is not grounded. If the source is at Vs and the drain
is at Vd, the average is Vc = (Vs + Vd)/2 = Vs + Vds/2. Therefore, the mean difference between
the gate and channel potentials Vgc is Vg – Vc = Vgs – Vds/2, as shown in Figure 2.5. We can
model the gate as a parallel plate capacitor with capacitance proportional to area over thickness.
If the gate has length L and width W and the oxide thickness is tox, as shown in Figure 2.6, the
capacitance is

$$ C_g = \epsilon_{ox} \frac{W L}{t_{ox}} $$

where εox is the permittivity of the gate oxide.

Each carrier in the channel is accelerated to an average velocity, v, proportional to the lateral
electric field, i.e., the field between source and drain. The constant of proportionality μ is called
the mobility:

$$ v = \mu E $$

The electric field E is the voltage difference between drain and source Vds divided by the
channel length:

$$ E = \frac{V_{ds}}{L} $$
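
The remaining steps of the derivation are not present in these notes. As a sketch, the standard long-channel argument can be completed as follows. The time for carriers to cross the channel is L/v, and the current is the channel charge divided by that time. The shorthand β, Cox and Vdsat is the usual textbook notation and is an assumption here rather than something defined in the text above.

```latex
v = \mu E, \qquad E = \frac{V_{ds}}{L}, \qquad
Q_{channel} = C_{ox} W L \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right), \qquad
C_{ox} = \frac{\epsilon_{ox}}{t_{ox}}

I_{ds} = \frac{Q_{channel}}{L / v}
       = \mu C_{ox} \frac{W}{L} \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds}
       = \beta \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds},
\qquad \beta = \mu C_{ox} \frac{W}{L}

% The current stops increasing once the channel pinches off at V_{dsat} = V_{gs} - V_t:
I_{ds} =
\begin{cases}
0 & V_{gs} < V_t \quad \text{(cutoff)} \\
\beta \left( V_{gs} - V_t - \dfrac{V_{ds}}{2} \right) V_{ds} & V_{ds} < V_{dsat} \quad \text{(linear)} \\
\dfrac{\beta}{2} \left( V_{gs} - V_t \right)^2 & V_{ds} > V_{dsat} \quad \text{(saturation)}
\end{cases}
```

Substituting Vds = Vdsat = Vgs – Vt into the linear expression gives the saturation expression, so the two branches meet continuously at pinch-off.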


Nonideal I-V Effects


The long-channel I-V model of EQ (2.10) neglects many effects that are important to devices
with channel lengths below 1 micron.
Figure 2.14 compares the simulated I-V characteristics of a 1-micron wide nMOS transistor in
a 65 nm process to the ideal characteristics computed in Section 2.2. The saturation current
increases less than quadratically with increasing Vgs. This is caused by two effects: velocity
saturation and mobility degradation. At high lateral field strengths (Vds/L), carrier velocity
ceases to increase linearly with field strength. This is called velocity saturation and results in
lower Ids than expected at high Vds. At high vertical field strengths (Vgs /tox ), the carriers
scatter off the oxide interface more often, slowing their progress. This mobility degradation
effect also leads to less current than expected at high Vgs. The saturation current of the nonideal
transistor increases somewhat with Vds. This is caused by channel length modulation, in which
higher Vds increases the size of the depletion region around the drain and thus effectively
shortens the channel.
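
To make the velocity saturation effect concrete, the sketch below uses one commonly used piecewise approximation in which velocity follows μE at low fields and is clamped at vsat above a critical field Ec = 2·vsat/μ. This particular formula, the function name `drift_velocity`, and the numeric values are assumptions chosen for illustration; they are not taken from these notes.

```python
def drift_velocity(e_lat: float, mobility: float, v_sat: float) -> float:
    """Carrier velocity in m/s for a lateral field e_lat in V/m.

    Piecewise model: v = mu*E / (1 + E/Ec) below Ec, and v = v_sat at or above Ec,
    with Ec = 2*v_sat/mu so that the two branches meet at v_sat.
    """
    e_crit = 2.0 * v_sat / mobility
    if e_lat >= e_crit:
        return v_sat
    return mobility * e_lat / (1.0 + e_lat / e_crit)


MU = 0.03      # m^2/(V*s), hypothetical low-field mobility
V_SAT = 1.0e5  # m/s, order of magnitude for electrons in silicon

for e_lat in (1e5, 1e6, 1e7):
    ideal = MU * e_lat                       # long-channel assumption v = mu*E
    actual = drift_velocity(e_lat, MU, V_SAT)
    print(f"E = {e_lat:.0e} V/m: ideal {ideal:.3g} m/s, saturating model {actual:.3g} m/s")
```

At low fields the two columns agree closely; at high fields the long-channel value keeps growing while the saturating model levels off, which is why Ids is lower than the ideal model predicts at high Vds.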
The threshold voltage indicates the gate voltage necessary to invert the channel and is primarily
determined by the oxide thickness and channel doping levels. However, other fields in the
transistor have some effect on the channel, effectively modifying the threshold voltage. Increasing
the potential between the source and body raises the threshold through the body effect. Increasing
the drain voltage lowers the threshold through drain-induced barrier lowering. Increasing the
channel length raises the threshold through the short channel effect. Several sources of leakage
result in current flow in nominally OFF transistors. When Vgs < Vt, the current drops off
exponentially rather than abruptly becoming zero. This is called subthreshold conduction. The
current into the gate Ig is ideally 0.

Mobility Degradation and Velocity Saturation


Carrier drift velocity, and hence current, is proportional to the lateral electric field Elat = Vds/L
between source and drain. The constant of proportionality is called the carrier mobility, μ.
The long-channel model assumed that carrier mobility is independent of the applied fields.
This is a good approximation for low fields, but breaks down when strong lateral or vertical
fields are applied.
A high voltage at the gate of the transistor attracts the carriers to the edge of the channel,
causing collisions with the oxide interface that slow the carriers. This is called mobility
degradation.
Carriers approach a maximum velocity vsat when high fields are applied. This phenomenon is
called velocity saturation. Mobility degradation can be modeled by replacing μ with a
smaller μeff that is a function of Vgs. A universal model [Chen96, Chen97] that matches
experimental data from multiple processes reasonably well is


Channel Length Modulation


Ideally, Ids is independent of Vds for a transistor in saturation, making the transistor a
perfect current source. As discussed in Section 2.3.3, the p–n junction between the drain and
body forms a depletion region with a width Ld that increases with Vdb, as shown in Figure
2.18. The depletion region effectively shortens the channel length to

$$ L_{eff} = L - L_d $$

To avoid introducing the body voltage into our calculations, assume the source voltage is close to
the body voltage so Vdb ≈ Vds. Hence, increasing Vds decreases the effective channel length.
Shorter channel length results in higher current; thus, Ids increases with Vds in saturation, as
shown in Figure 2.18. This can be crudely modeled by multiplying EQ (2.10) by a factor of
(1 + Vds / VA), where VA is called the Early voltage [Gray01]. In the saturation region, we find

$$ I_{ds} = \frac{\beta}{2} (V_{gs} - V_t)^2 \left( 1 + \frac{V_{ds}}{V_A} \right) $$

where β is the transistor gain factor of the long-channel model.

As channel length gets shorter, the effect of the channel length modulation becomes relatively
more important. Hence, VA is proportional to channel length. This channel length modulation
model is a gross oversimplification of nonlinear behavior and is more useful for conceptual
understanding than for accurate device modeling. Channel length modulation is very important
to analog designers because it reduces the gain of amplifiers. It is generally unimportant for
qualitatively understanding the behavior of digital circuits.
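
The effect of the (1 + Vds / VA) factor can be seen numerically with a short sketch. The function name `ids_saturation` and all numeric values (β, Vt, VA) are hypothetical and chosen only for illustration; the code assumes the device is already in saturation (Vds ≥ Vgs – Vt).

```python
from typing import Optional


def ids_saturation(vgs: float, vds: float, vt: float, beta: float,
                   v_early: Optional[float] = None) -> float:
    """Saturation current of an nMOS transistor, in amperes.

    With v_early=None this is the ideal long-channel value (beta/2)*(Vgs-Vt)^2.
    With an Early voltage supplied, the value is scaled by (1 + Vds/VA).
    Assumes the caller has checked that vds >= vgs - vt.
    """
    vgt = vgs - vt
    if vgt <= 0:
        return 0.0  # transistor is OFF in the ideal model
    ids = 0.5 * beta * vgt ** 2
    if v_early is not None:
        ids *= 1.0 + vds / v_early
    return ids


BETA = 1e-3   # A/V^2, hypothetical gain factor
VT = 0.4      # V, hypothetical threshold voltage
VA = 10.0     # V, hypothetical Early voltage

for vds in (0.6, 0.8, 1.0):
    ideal = ids_saturation(1.0, vds, VT, BETA)
    clm = ids_saturation(1.0, vds, VT, BETA, v_early=VA)
    print(f"Vds={vds} V: ideal {ideal * 1e6:.1f} uA, with CLM {clm * 1e6:.1f} uA")
```

The ideal column stays flat across Vds, while the channel-length-modulated column rises slightly, which corresponds to a finite output resistance.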


for the two transistors shown in Figure 2.26(a). The plot shows Idsn and Idsp in terms of Vdsn
and Vdsp for various values of Vgsn and Vgsp. Figure 2.26(b) shows the same plot of Idsn and
|Idsp| now in terms of Vout for various values of Vin. The possible operating points of the
inverter, marked with dots, are the values of Vout where Idsn = |Idsp| for a given value of Vin.
These operating points are plotted on Vout vs. Vin axes in Figure 2.26(c) to show the inverter
DC transfer characteristics. The supply current IDD = Idsn = |Idsp| is also plotted against Vin
in Figure 2.26(d) showing that both transistors are momentarily ON as Vin passes through
voltages between GND and VDD, resulting in a pulse of current drawn from the power supply.
The operation of the CMOS inverter can be divided into five regions indicated on Figure
2.26(c). The state of each transistor in each region is shown in Table 2.3. In region A, the
nMOS transistor is OFF so the pMOS transistor pulls the output to VDD. In region B, the
nMOS transistor starts to turn ON, pulling the output down. In region C, both transistors are in
saturation. Notice that ideal transistors are only in region C for Vin = VDD/2 and that the slope
of the transfer curve in this example is -∞ in this region, corresponding to infinite gain. Real
transistors have finite output resistances on account of channel length modulation and thus have
finite slopes over a broader region C. In region D, the pMOS transistor is partially ON and in
region E, it is completely OFF, leaving the nMOS transistor to pull the output down to GND.
Also notice that the inverter’s current consumption is ideally zero, neglecting leakage, when
the input is within a threshold voltage of the VDD or GND rails. This feature is important for
low-power operation.
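
The five regions can be expressed as a simple lookup on Vin. The sketch below assumes an ideal inverter with matched transistors (equal gain factors and Vtn = |Vtp|), so that region C collapses to the single point Vin = VDD/2. The function name `inverter_region`, the voltage values, and the per-region transistor states are the standard ones for an ideal inverter and are assumed to correspond to Table 2.3.

```python
def inverter_region(vin: float, vdd: float, vtn: float, vtp: float):
    """Return (region, nMOS state, pMOS state) for an ideal matched CMOS inverter.

    vtp is the pMOS threshold and is negative, e.g. -0.3.
    """
    v_switch = vdd / 2.0              # switching point for matched devices
    if vin < vtn:
        return "A", "cutoff", "linear"
    if vin < v_switch:
        return "B", "saturated", "linear"
    if vin == v_switch:
        return "C", "saturated", "saturated"
    if vin <= vdd + vtp:
        return "D", "linear", "saturated"
    return "E", "linear", "cutoff"


VDD, VTN, VTP = 1.0, 0.3, -0.3   # hypothetical supply and threshold voltages

for vin in (0.1, 0.4, 0.5, 0.6, 0.9):
    region, n_state, p_state = inverter_region(vin, VDD, VTN, VTP)
    print(f"Vin={vin} V: region {region}, nMOS {n_state}, pMOS {p_state}")
```

In regions A and E one transistor is in cutoff, which is why the supply current is ideally zero there.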


Module 2: Fabrication

Enhancement mode Transistor action


To establish the channel between the source and the drain, a minimum voltage (Vt) must be applied
between gate and source. This minimum voltage is called the “Threshold Voltage”. The complete
working of the enhancement mode transistor can be explained with the help of diagrams a, b and c.
a) Vgs > Vt
Vds = 0
Since Vgs > Vt and Vds = 0, the channel is formed but no current flows between drain and source.

b) Vgs > Vt
Vds < Vgs - Vt
This region is called the non-saturation region or linear region, where the drain current increases
approximately linearly with Vds. As Vds is increased, the drain side becomes more reverse biased
(hence the depletion region grows at the drain end) and the channel starts to pinch. The condition
Vds = Vgs - Vt, at which the channel just closes at the drain end, is called the pinch-off point.

c) Vgs > Vt
Vds > Vgs - Vt
This region is called the saturation region, where the drain current remains almost constant. As the
drain voltage is increased beyond (Vgs - Vt), the pinch-off point moves from the drain end towards
the source end. Even if Vds is increased further, the additional voltage is dropped across the
depletion region, so the current remains constant.
The typical threshold voltage for an enhancement mode transistor is given by Vt = 0.2 × Vdd.

Simplified View of CMOS Fabrication Process


nMOS Fabrication:
1. Processing is carried out on a thin wafer cut from a single crystal of high-purity silicon,
into which the required impurities are introduced as the crystal is grown. Such wafers are
typically 75 to 150 mm in diameter and 0.4 mm thick, and are doped with boron so as to
give a resistivity of 25 ohm cm to 2 ohm cm.
2. A layer of silicon dioxide (SiO2), typically 1 µm thick, is grown all over the surface of the
wafer to protect the surface, act as a barrier to dopants during processing, and provide a generally
insulating substrate onto which other layers may be deposited and patterned.
3. The surface is now covered with a photoresist, which is deposited onto the wafer and spun
to achieve an even distribution of the required thickness.
4. The photoresist layer is then exposed to ultraviolet light through a mask which defines
those regions into which diffusion is to take place, together with the transistor channels.
Assume, for example, that those areas exposed to ultraviolet light are polymerized (hardened),
but that the areas required for diffusion are shielded by the mask and remain unaffected.
5. These unexposed areas are subsequently etched away together with the underlying silicon
dioxide, so that the wafer surface is exposed in the window defined by the mask.
6. The remaining photoresist is removed and a thin layer of SiO2 (0.1 µm typical) is grown
over the entire chip surface, and then polysilicon is deposited by chemical vapour deposition
(CVD). In the fabrication of fine pattern devices, precise control of thickness, impurity
concentration and resistivity is necessary.
7. Further photoresist coating and masking allows the polysilicon to be patterned, and then
the thin oxide is removed to expose areas into which n-type impurities are to be diffused
to form the source and drain as shown. Diffusion is achieved by heating the wafer to a high
temperature and passing a gas containing the desired n-type impurity (e.g., phosphorus) over
the surface, as indicated in fig (1.3b).
8. Thick oxide (SiO2) is grown over all again and is then masked with photoresist and etched
to expose selected areas of the polysilicon gate and the drain and source areas where
connections (i.e., contact cuts) are to be made.
9. The whole chip then has metal (aluminium) deposited over its surface to a thickness
typically of 1 µm. This metal layer is then masked and etched to form the required
interconnection pattern.


CMOS Fabrication:
The four dominant CMOS fabrication processes are
• p-well process
• n-well process
• twin tub process
• silicon on insulator
1.1 The p-well process
The common approach to p-well CMOS fabrication has been to start with a moderately doped n-type
substrate (wafer), create the p-type well for the n-channel devices, and build the p-channel
transistors in the native n-substrate. The well diffusion must be carried out with special care,
since the p-well doping concentration and depth affect the threshold voltages as well as the
breakdown voltages of the n-transistors. To achieve a low threshold voltage (0.6 – 0.7 V) we need
either deep well diffusion or high well resistivity. However, deep wells require larger spacing
between the n- and p-type transistors and wires because of lateral diffusion, and therefore a
larger chip area.
The p-wells act as substrates for the n-devices within the parent n-substrate, and, provided that
voltage polarity restrictions are observed, the two areas are electrically isolated.
The masking, patterning and diffusion process is similar to nMOS fabrication. The typical
processing steps are
• Mask 1 – defines the areas in which the deep p-well diffusions are to take place.
• Mask 2 – defines the thinox regions, namely those areas where the thick oxide is to be
stripped and thin oxide grown to accommodate p- and n- transistors and diffusion wires.
• Mask 3 – used to pattern polysilicon layer which is deposited after the thin oxide.
• Mask 4 – A p-plus mask is now used (to be in effect ‘ANDed’ with Mask 2) to define all
areas where p-diffusion is to take place.
• Mask 5 – This is usually performed using the negative form of the p-plus mask and, with
Mask 2, defines those areas where n-type diffusion is to take place.
• Mask 6 – Contact cuts are now defined.
• Mask 7 – The metal layer pattern is defined by this mask.
• Mask 8 – An overall passivation (over glass) layer is now applied and Mask 8 is needed to
define the openings for access to bonding pads.
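
The way Masks 2, 4 and 5 combine can be illustrated with set operations. In the sketch below the layout is reduced to a hypothetical one-dimensional strip of numbered cells; the cell numbers and the function name `diffusion_areas` are assumptions for illustration. The sketch ignores the fact that polysilicon also blocks diffusion under the gates.

```python
def diffusion_areas(thinox: set, p_plus: set):
    """Derive diffusion areas from the thinox mask and the p-plus mask.

    p-diffusion: thinox AND p-plus (Mask 4 'ANDed' with Mask 2).
    n-diffusion: thinox AND NOT p-plus (negative of the p-plus mask with Mask 2).
    """
    p_diffusion = thinox & p_plus
    n_diffusion = thinox - p_plus
    return p_diffusion, n_diffusion


# Hypothetical strip of layout cells numbered 0..7
thinox = {1, 2, 3, 5, 6}   # Mask 2: thin-oxide regions
p_plus = {0, 1, 2, 3}      # Mask 4: p-plus regions

p_diff, n_diff = diffusion_areas(thinox, p_plus)
print("p-diffusion cells:", sorted(p_diff))
print("n-diffusion cells:", sorted(n_diff))
```

Because the n-type mask is the complement of the p-plus mask, every thinox cell receives exactly one of the two diffusions.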


Figure: CMOS p-well process

Figure : CMOS P-well inverter showing VDD and VSS connections.


1.2 The n-well process


N-well CMOS circuits are superior to p-well circuits because of the lower substrate bias effects on
transistor threshold voltage and the inherently lower parasitic capacitances associated with source
and drain regions. The typical n-well fabrication steps are similar to those of the p-well process,
except that an n-well is formed in a p-type substrate.
The typical processing steps are
• Mask 1 – defines the areas in which the deep n-well diffusions are to take place.
• Mask 2 – defines the nMOS and pMOS active areas.
• Mask 3 – defines the thinox regions, namely those areas where the thick oxide is to be
stripped and thin oxide grown to accommodate p- and n- transistors and diffusion wires.
• Mask 4 – used to pattern polysilicon layer which is deposited after the thin oxide.
• Mask 5 – A p-plus mask is now used to define all areas where p-diffusion is to take place.
• Mask 6 – defines those areas where n-type diffusion is to take place.
• Mask 7 – Contact cuts are now defined.
• Mask 8 – The metal layer pattern is defined by this mask.
• Mask 9 – An overall passivation (over glass) layer is now applied and Mask 9 is needed to
define the openings for access to bonding pads.

Figure : Cross-sectional view of n-well CMOS inverter.


CMOS Fabrication and Layout


• The fabrication sequence consists of a series of steps in which layers of the chip are defined
through a process called photolithography.
• Because a whole wafer full of chips is processed in each step, the cost of the chip is
proportional to the chip area, rather than the number of transistors.
• As manufacturing advances allow engineers to build smaller transistors and thus fit more
in the same area, each transistor gets cheaper.
• Smaller transistors are also faster because electrons don’t have to travel as far to get from
the source to the drain, and they consume less energy because fewer electrons are needed
to charge up the gates! This explains the remarkable trend for computers and electronics to
become cheaper and more capable with each generation.
• The inverter could be defined by a hypothetical set of six masks: n-well, polysilicon, n+
diffusion, p+ diffusion, contacts, and metal.
• Masks specify where the components will be manufactured on the chip. Figure 1.35(a)
shows a top view of the six masks.
• The cross-section of the inverter from Figure 1.34 was taken along the dashed line. Take
some time to convince yourself how the top view and cross-section relate; this is critical
to understanding chip layout.
• Consider a simple fabrication process to illustrate the concept. The process begins with the
creation of an n-well on a bare p-type silicon wafer.
• Figure 1.36 shows cross-sections of the wafer after each processing step involved in
forming the n-well; Figure 1.36(a) illustrates the bare substrate before processing. Forming
the n-well requires adding enough Group V dopants into the silicon substrate to change the
substrate from p-type to n-type in the region of the well. To define what regions receive
n-wells, we grow a protective layer of oxide over the entire wafer, then remove it where
we want the wells. We then add the n-type dopants; the dopants are blocked by the
oxide, but enter the substrate and form the wells where there is no oxide. The next
paragraph describes these steps in detail. The wafer is first oxidized in a high-temperature
(typically 900–1200 °C) furnace that causes Si and O2 to react and become SiO2 on the
wafer surface (Figure 1.36(b)).

• The oxide must be patterned to define the n-well. An organic photoresist that softens
where exposed to light is spun onto the wafer (Figure 1.36(c)). The photoresist is exposed
through the n-well mask (Figure 1.35(b)) that allows light to pass through only where the
well should be. The softened photoresist is removed to expose the oxide (Figure 1.36(d)).


• The oxide is etched with hydrofluoric acid (HF) where it is not protected by the photoresist
(Figure 1.36(e)), then the remaining photoresist is stripped away using a mixture of acids
called piranha etch (Figure 1.36(f)). The well is formed where the substrate is not covered
with oxide. Two ways to add dopants are diffusion and ion implantation. In the diffusion
process, the wafer is placed in a furnace with a gas containing the dopants. When heated,
dopant atoms diffuse into the substrate. Notice how the well is wider than the hole in the
oxide on account of lateral diffusion (Figure 1.36(g)). With ion implantation, dopant ions
are accelerated through an electric field and blasted into the substrate. In either method, the
oxide layer prevents dopant atoms from entering the substrate where no well is intended.
Finally, the remaining oxide is stripped with HF to leave the bare wafer with wells in the
appropriate places.
• The transistor gates are formed next. These consist of polycrystalline silicon, generally
called polysilicon, over a thin layer of oxide. The thin oxide is grown in a furnace. Then
the wafer is placed in a reactor with silane gas (SiH4) and heated again to grow the
polysilicon layer through a process called chemical vapor deposition.
• The polysilicon is heavily doped to form a reasonably good conductor. The resulting
cross-section is shown in Figure 1.37(a).
• As before, the wafer is patterned with photoresist and the polysilicon mask (Figure
1.35(c)), leaving the polysilicon gates atop the thin gate oxide (Figure 1.37(b)).
• The n+ regions are introduced for the transistor active area and the well contact. As with
the well, a protective layer of oxide is formed (Figure 1.37(c)) and patterned with the
n-diffusion mask (Figure 1.35(d)) to expose the areas where the dopants are needed (Figure
1.37(d)). Although the n+ regions in Figure 1.37(e) are typically formed with ion
implantation, they were historically diffused and thus still are often called n-diffusion.
• Notice that the polysilicon gate over the nMOS transistor blocks the diffusion so the source
and drain are separated by a channel under the gate. This is called a self-aligned process
because the source and drain of the transistor are automatically formed adjacent to the gate
without the need to precisely align the masks. Finally, the protective oxide is stripped
(Figure 1.37(f)).
• The process is repeated for the p-diffusion mask (Figure 1.35(e)) to give the structure of
Figure 1.38(a). Oxide is used for masking in the same way, and thus is not shown. The
field oxide is grown to insulate the wafer from metal and patterned with the contact mask
(Figure 1.35(f)) to leave contact cuts where metal should attach to diffusion or polysilicon
(Figure 1.38(b)). Finally, aluminum is sputtered over the entire wafer, filling the contact
cuts as well. Sputtering involves blasting aluminum into a vapor that evenly coats the
wafer. The metal is patterned with the metal mask (Figure 1.35(g)) and plasma etched to
remove metal everywhere except where wires should remain (Figure 1.38(c)). This
completes the simple fabrication process.
• Modern fabrication sequences are more elaborate because they must create complex doping
profiles around the channel of the transistor and print features that are smaller than the
wavelength of the light being used in lithography. However, masks for these elaborations
can be automatically generated from the simple set of masks we have just examined.
• Modern processes also have 5–10+ layers of metal, so the metal and contact steps must be
repeated for each layer. Chip manufacturing has become a commodity, and many different
foundries will build designs from a basic set of masks.


Layout Design Rules


Layout design rules describe how small features can be and how closely they can be reliably
packed in a particular manufacturing process. Industrial design rules are usually specified in
microns. This makes migrating from one process to a more advanced process or a different
foundry’s process difficult because not all rules scale in the same way.
Mead and Conway [Mead80] popularized scalable design rules based on a single parameter, λ,
that characterizes the resolution of the process. λ is generally half of the minimum drawn transistor
channel length.
This length is the distance between the source and drain of a transistor and is set by the minimum
width of a polysilicon wire. For example, a 180 nm process has a minimum polysilicon width (and
hence transistor length) of 0.18 µm and uses design rules with λ = 0.09 µm. Lambda-based rules
are necessarily conservative because they round up dimensions to an integer multiple of λ.
Designers often describe a process by its feature size. Feature size refers to minimum transistor
length, so λ is half the feature size.
MOSIS has developed a set of scalable lambda-based design rules that covers a wide range of
manufacturing processes. The rules describe the minimum width to avoid breaks in a line,
minimum spacing to avoid shorts between lines, and minimum overlap to ensure that two layers
completely overlap.
A conservative but easy-to-use set of design rules for layouts with two metal layers in an n-well
process is as follows:
• Metal and diffusion have minimum width and spacing of 4 λ.
• Contacts are 2 λ × 2 λ and must be surrounded by 1 λ on the layers above and below.
• Polysilicon uses a width of 2 λ.
• Polysilicon overlaps diffusion by 2 λ where a transistor is desired and has a spacing of 1 λ
away where no transistor is desired.
• Polysilicon and contacts have a spacing of 3 λ from other polysilicon or contacts.
• N-well surrounds pMOS transistors by 6 λ and avoids nMOS transistors by 6 λ.
Figure 1.39 shows the basic MOSIS design rules for a process with two metal layers.
In a three-level metal process, the width of the third layer is typically 6 λ and the spacing 4 λ.
In general, processes with more layers often provide thicker and wider top-level metal that
has a lower resistance.
Transistor dimensions are often specified by their Width/Length (W/L) ratio. For example,
the nMOS transistor in Figure 1.39 formed where polysilicon crosses n-diffusion has a W/L
of 4/2. In a 0.6 µm process, this corresponds to an actual width of 1.2 µm and a length of 0.6
µm. Such a minimum-width contacted transistor is often called a unit transistor.
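
The conversion between lambda units and physical dimensions is simple arithmetic, sketched below for the two processes mentioned in the text. The helper names `lambda_um` and `to_microns` are hypothetical.

```python
def lambda_um(feature_size_um: float) -> float:
    """Lambda in microns: half the minimum drawn transistor length."""
    return feature_size_um / 2.0


def to_microns(dimension_in_lambda: float, feature_size_um: float) -> float:
    """Convert a layout dimension given in lambda to microns."""
    return dimension_in_lambda * lambda_um(feature_size_um)


# 180 nm process: lambda should be 0.09 um
print(f"lambda for 0.18 um process: {lambda_um(0.18):.2f} um")

# Unit transistor with W/L = 4/2 (in lambda) in a 0.6 um process
width = to_microns(4, 0.6)
length = to_microns(2, 0.6)
print(f"W = {width:.1f} um, L = {length:.1f} um")
```

The same layout drawn in lambda units can therefore be rescaled to another process simply by changing the feature size passed to the helpers.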


pMOS transistors are often wider than nMOS transistors because holes move more slowly than
electrons so the transistor has to be wider to deliver the same current. Figure 1.40(a) shows a unit
inverter layout with a unit nMOS transistor and a double-sized pMOS transistor. Figure 1.40(b)
shows a schematic for the inverter annotated with Width/Length for each transistor. In digital
systems, transistors are typically chosen to have the minimum possible length because short-
channel transistors are faster, smaller, and consume less power. Figure 1.40(c) shows a shorthand
we will often use, specifying multiples of unit width and assuming minimum length.
Estimation of Area of Layout

Figure 1.46 shows how to count tracks to estimate the size of a 3-input NAND. There are four
vertical wire tracks, multiplied by 8 λ per track to give a cell width of 32 λ. There are
five horizontal tracks, giving a cell height of 40 λ. Even though the horizontal tracks are
not drawn to scale, they are still easy to count.
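
The track-counting estimate can be written as a short calculation. The sketch below reproduces the 3-input NAND numbers from the text; the function name `cell_size_lambda` is hypothetical, and the 0.6 µm process used for the conversion to microns is only an example.

```python
TRACK_PITCH_LAMBDA = 8  # lambda per wire track, as used in the text


def cell_size_lambda(vertical_tracks: int, horizontal_tracks: int,
                     pitch: int = TRACK_PITCH_LAMBDA):
    """Estimate cell (width, height) in lambda from track counts."""
    width = vertical_tracks * pitch      # vertical tracks set the width
    height = horizontal_tracks * pitch   # horizontal tracks set the height
    return width, height


# 3-input NAND: four vertical tracks and five horizontal tracks
width, height = cell_size_lambda(4, 5)
print(f"width = {width} lambda, height = {height} lambda, area = {width * height} lambda^2")

# Example conversion for a 0.6 um process, where lambda = 0.3 um
lam = 0.6 / 2
print(f"approximately {width * lam:.1f} um x {height * lam:.1f} um")
```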


4 Basic Physical Design of Simple logic gates


4.4.1 Simple Layout Guidelines
The goal of the designer should be to simultaneously minimize R and C.
Here are some simple guidelines to keep in mind:
• Keep the transistors minimum-length whenever possible (unless you are designing a weak
transistor on purpose). Making transistors longer than minimum length makes them weaker
(more resistive) and more capacitive (i.e., they slow down the input signal as well).
• Diffusion is rather highly resistive and capacitive, so avoid it as much as possible. In other
words, put only 1λ (the minimum) of diffusion between the transistors and the diffusion contact
whenever possible.
• Because the mobility of holes in silicon is lower than that of electrons, make the p-transistors
wider than the n-transistors. A ratio of 2:1 or 1.6:1 in widths works well. However, when you
have many n-transistors in series, they are also weak, so in that case, you can make the n-
transistors wider as well.
• Polysilicon is also much more resistive than metal. Avoid it for long interconnections.
• Metal 2 is run farther from the substrate than metal 1 is. Thus its capacitance per unit area is
lower, so use metal 2 for long wires if you can. Especially use metal 2 for long wires with weak
drivers (e.g., bus wires that have many drivers connected to them); leave metal 1 for the more
powerful control signals. Since data and control wires are usually run at right angles to each
other, it will probably be useful to run metal 1 vertically across the design (for control signals)
and metal 2 horizontally through the design (for data busses).

Stick Diagrams
B.1 Introduction
One approach to layout is based on the concept of simple stick diagrams, where each layer
is represented by a distinct color, and the routing consists of colored lines that obey the rules of
chip formation.
A stick diagram is a cartoon of a layout that shows all components and their connections. It does
not show exact placement, transistor sizes, wire lengths, wire widths, boundaries, or any other form
of compliance with layout or design rules. It is useful for interconnect visualization, preliminary
layout, layout compaction, power/ground routing, etc.
The basic color coding is given in figure B.1


Figure B.1: Color coding for Stick diagram


B.2 CMOS Stick diagrams for Logic Gates.
A transistor is formed whenever diffusion crosses a poly layer. When n-diffusion crosses
a poly layer, an nFET is formed, as shown in figure B.2(a), and when p-diffusion crosses a poly
layer, a pFET is formed, as shown in figure B.2(b). The difference is that pFETs are embedded
within an n-well boundary.

Figure B.2: Stick diagram (a) nFET (b) pFET.


A metal (blue) may cross over n+/p+ (green) or poly (red) without a connection as shown
in figure B.3

Figure B.3: Stick diagram shows no connection between metal and other layers.
Connections between layers are specified by a contact, as shown in fig B.4.


Figure B.4: Stick diagram showing contacts


Metal to diffusion contacts are called active contacts, while metal to poly contacts are
called poly contacts.

Figure B.5: Stick diagram showing contacts and different metal layers.
Metal lines on different layers can cross one another. Connecting two different metal
layers requires a via (metal to metal contact).
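
The crossing rules for stick diagrams described so far can be collected into a small lookup. This is only an illustrative sketch: the layer names (`ndiff`, `pdiff`, `poly`, `metal1`, `metal2`) and the function name `crossing` are assumptions, not part of any layout tool.

```python
def crossing(layer_a: str, layer_b: str, contact: bool = False) -> str:
    """Describe what two crossing sticks mean in a CMOS stick diagram."""
    layers = frozenset((layer_a, layer_b))
    if layers == {"poly", "ndiff"}:
        return "nFET"                  # poly over n-diffusion forms a transistor
    if layers == {"poly", "pdiff"}:
        return "pFET"                  # poly over p-diffusion (inside an n-well)
    if layers in ({"metal1", "ndiff"}, {"metal1", "pdiff"}):
        return "active contact" if contact else "no connection"
    if layers == {"metal1", "poly"}:
        return "poly contact" if contact else "no connection"
    if layers == {"metal1", "metal2"}:
        return "via" if contact else "no connection"
    raise ValueError(f"crossing not modelled: {layer_a} x {layer_b}")


print(crossing("ndiff", "poly"))                   # transistor is formed
print(crossing("metal1", "poly"))                  # metal passes over poly
print(crossing("metal1", "poly", contact=True))    # explicit contact drawn
print(crossing("metal1", "metal2", contact=True))  # metal-to-metal connection
```

Only the diffusion and poly crossings create a device without any contact being drawn; every metal crossing needs an explicit contact or via to connect.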

Figure B.6:VDD and VSS connections


To create CMOS logic gates, we start with the VDD and VSS lines. We will use horizontal
orientation for the lines. Remember that a stick diagram only deals with routing; widths of the
layers and design rules are not important.

The figures below give stick diagrams for various logic gates.

Figure B.7:Stick diagrams for logic gates.


Figure B.7 shows various logic gates. Figure B.7(c) shows a NOR2 gate and figure B.7(d)
shows an OR2 gate. We can see how the VDD and VSS connections have been made common to the NOR
gate and the inverter to obtain the OR gate.
CMOS Layout Design for Logic Gates.
A transistor is formed whenever diffusion crosses a poly layer.
When an n-diffusion crosses a poly layer, an nFET is formed, as shown in figure A.5(a); when a
p-diffusion crosses a poly layer, a pFET is formed, as shown in figure A.5(b). The difference is that
pFETs are embedded within an n-Well boundary.

Figure A.5 Layout for (a) nFET and (b) pFET


A.4 Designing FET Arrays
CMOS logic gates are switching networks that are controlled by the input variables. These
switching arrays use FETs that are wired together in series and parallel groups in a manner that
allows us to create the desired functions.
Let us start with the simplest case, where two nFETs are in series. Figure A.6(a) shows the
schematic diagram. The signals A and B are applied to the gate terminals of the respective
transistors. To construct the layout pattern, note that only three n+ regions
are needed: one on the left, one in the middle, and one on the right, as shown in figure A.6(b).

Figure A.6 Layout for two series connected FETs


Hence we can conclude that devices can share patterned regions, which may reduce the layout area
or complexity.
In this case it is not necessary to first build individual devices and then wire them together. A more
efficient design results if we combine n+ regions.
This technique can be applied to any group of series-connected FETs. A 3-FET chain is shown
in figure A.7.

Figure A.7 Layout for three series connected FETs


Parallel-connected FETs can be patterned in the same manner. In figure A.8, two nFETs are wired
in parallel using metal patterns. The parallel connection can be understood by noting that the drain and
source regions of both transistors are connected between the nodes labeled x and y, which implies
that they are in parallel. The schematic is shown in figure A.8(a), while figure A.8(b) shows the layout pattern.

Figure A.8 Layout for parallel connected FETs


An alternate layout for parallel FETs is shown in figure A.9. This uses vertical drain-source
orientations for the transistors. In this approach, two FETs are created with separate n+ regions.
The parallel connection is accomplished by using metal interconnects to give the nodes x and y
shown in figure A.9. This type of layout, which uses separated transistors, usually requires more area
than layouts that share drain/source regions, so this scheme is restricted to special situations.

Figure A.9 Alternate Layout for parallel connected FETs

A.5 Basic Gate Designs.


Now that we have seen the basic ideas involved in CMOS layout, let us examine the surface
patterns used for CMOS logic gates in silicon.
A.5.1 Layout design for CMOS NOT gate:
Figure A.10(a) shows how the circuit is wired using transistors Mn and Mp as a complementary pair.
The layout implementation is shown in figure A.10(b). The layout has been structured so that there
is a visual one-to-one correspondence with the circuit. Some of the important aspects are that:
• Both the power supply VDD and ground VSS are routed using the same metal layer.
• n+ and p+ regions are denoted using the same fill pattern. The difference is that pFETs are
embedded within an nWell boundary.

Figure A.10 Translating a NOT gate circuit to layout.

An alternate layout is shown in figure A.11. In this case NOT gate transistors have been laid
horizontally as shown. Thus we can infer that different geometrical layouts can be used to
implement CMOS circuits. Variations in layout strategy are not important until actual sizes of the
patterns are taken into account.

Figure A.11 Alternate Layout for a NOT gate circuit layout.


A.5.2 Layout design for CMOS NAND2 and NOR2 gate :
Figure A.12(a) shows a NAND2 circuit that has been drawn in a manner that leads to the
patterning of figure A.12(b). The two nFETs in series can be laid out using the method shown in
figure A.6. Since the gates (with inputs a and b) run in the vertical direction, the parallel-connected
pFETs can be added using the technique shown in figure A.8, which achieves the parallel
connection by metal wiring. This allows us to maintain simple gate poly lines as shown.

Figure A.12 Layout for a NAND2 gate.


The same approach may be used to construct a NOR2 gate, as shown in figure A.13(a). The FET
arrangement is the opposite of the NAND2 gate, with nFETs in parallel and pFETs in series. The resulting
layout in figure A.13(b) follows the same philosophy as the NAND2 gate wiring.
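The series and parallel FET arrangements described for the NAND2 and NOR2 gates can be checked with a small switch-level sketch. This is only an illustrative model: the helper names `series`, `parallel`, `nand2`, and `nor2` are assumptions made here, and each FET is treated as an ideal switch (an nFET conducts when its gate is 1, a pFET when its gate is 0).

```python
from itertools import product


def series(*switches):
    """A series chain conducts only if every switch conducts."""
    return all(switches)


def parallel(*switches):
    """A parallel group conducts if any switch conducts."""
    return any(switches)


def nand2(a, b):
    pull_down = series(bool(a), bool(b))      # two nFETs in series to VSS
    pull_up = parallel(not a, not b)          # two pFETs in parallel to VDD
    assert pull_up != pull_down               # exactly one network conducts
    return int(pull_up)


def nor2(a, b):
    pull_down = parallel(bool(a), bool(b))    # two nFETs in parallel to VSS
    pull_up = series(not a, not b)            # two pFETs in series to VDD
    assert pull_up != pull_down
    return int(pull_up)


# Truth tables, inputs listed as (a, b) = 00, 01, 10, 11.
nand_table = [nand2(a, b) for a, b in product((0, 1), repeat=2)]
nor_table = [nor2(a, b) for a, b in product((0, 1), repeat=2)]
print(nand_table, nor_table)
```

Swapping `series` and `parallel` between the pull-up and pull-down networks is what turns one gate into the other, which is the circuit-level counterpart of the wiring symmetry seen in the layouts.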

Figure A.13 Layout for a NOR2 gate.


The similarity between the NAND2 and NOR2 layouts can be seen by decomposing the structures into
transistors and wiring. The basic FET arrangement for both gates is shown in figure A.14(a). To
obtain a NAND2 gate, we use the metal wiring pattern provided in figure A.14(b); the NOR2 gate
is obtained by using the wiring in figure A.14(c). If you take a moment to study the metal patterns
for the two gates, you will see they are identical! This can be verified by drawing an imaginary
horizontal line through the center of one, and then rotating the pattern around it. This illustrates
how the AND-OR property of duality translates into a layout symmetry.

Figure A.14 NAND2 and NOR2 layout comparison.

These layout techniques can be extended to gates with 3 or more inputs. A NOR3 gate is shown in
figure A.15(a). This uses 3 series-connected pFETs and 3 parallel-connected nFETs. If we flip the
metal layers, we obtain the NAND3 circuit of figure A.15(b).

Figure A.15 Layout for a NOR3 and NAND3 gates

A.6 Complex Logic Gate.


The layout of complex logic gates can be approached in the same manner. Consider the circuit in
figure A.16(a), which implements the function

The circuit requires that an nFET be placed in parallel with a group of two series-connected nFETs.
The pFET array consists of two parallel-connected transistors that are wired in series with one
other device. The layout in figure A.16(b) provides the correct wiring and uses single poly gate
patterns for each input.

Figure A.16 Layout for a complex logic gate

An interesting variation of the layout demonstrates another important point. Suppose that we flip
the metal wiring pattern around an imaginary horizontal line. The resulting layout pattern is shown
in figure A.17(a). Tracing out the circuit yields the schematic in figure A.17(b). It is seen that the circuit
implements the function

Figure A.17 Layout for a complex logic gate (Logical Dual of A.16)

Consider the general expression for a 4 input gate

This can be implemented using the circuit in figure A.18(a). If we want to maintain the layout
strategy where we use a vertically running poly line for each input, then we start with 4 gate lines
together with the VDD and Gnd lines. To minimize the area, we would like to share n+ and p+ regions. The nFET
patterning is easy since it consists of two groups in parallel, with each group containing two nFETs.
The layout is shown in figure A.18(b).

Figure A.18 Layout for a 4 input complex logic gate.

A.7 General Discussion


In the sections A.1 to A.6 we have discussed some techniques for creating gate level layouts. In
the basic gates examined, it was possible to share n+ and p+ regions among several transistors,
which reduces the area and wiring complexity. This is not always possible, especially in
complicated arrangements. Various approaches to handling FET placement and wiring have been
developed.
Consider the general problem of placing the transistors in a CMOS circuit. Experience has shown
that regular patterns and arrays yield the best packing density, and randomly placed polygons
should be avoided when possible. In general, every logic gate requires a power supply (VDD) and
ground connection, which run as horizontal metal lines in our examples without loss of generality.
This leads to the basic framework illustrated in figure A.19. All FETs are placed between the two
power rails. In the drawing, transistors are shown as individual devices, groups with shared poly
lines, and groups with shared drain/source regions. This latter case is the most area-efficient
placement, but it may not always be possible to link transistors. The drawing also shows that gate
lines can run perpendicular or parallel to the power supply rails. Although not shown explicitly in
the drawing, pFETs will be embedded in n-Wells around VDD, while the nFETs are closer to the
ground rail.

Figure A.19 General gate Layout geometry.

VLSI DESIGN AND TESTING 21EC63

Delay
The two most common metrics for a good chip are speed and power.

Delay and power are influenced as much by the wires as by the transistors. A chip is of no value
if it cannot reliably accomplish its function.

3.1.1 Definitions
A few definitions illustrated in Figure 3.1:
• Propagation delay time, tpd = maximum time from the input crossing 50% to the output
crossing 50%
• Contamination delay time, tcd = minimum time from the input crossing 50% to the output
crossing 50%
• Rise time, tr = time for a waveform to rise from 20% to 80% of its steady-state value
• Fall time, tf = time for a waveform to fall from 80% to 20% of its steady-state value
• Edge rate, trf = (tr + tf )/2
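As an illustration of these definitions, the sketch below measures a propagation delay and a fall time from sampled waveforms. The function names and the piecewise-linear sample data are hypothetical and chosen only for this example; a real measurement would come from a circuit simulator, and tpd would be the maximum over all input transitions rather than the single transition shown here.

```python
def crossing_time(times, values, level, rising=True):
    """Return the first time the sampled waveform crosses `level`.

    Linear interpolation is used between neighbouring samples.
    Returns None if the waveform never crosses the level in that direction.
    """
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None


def rise_time(times, values, vdd):
    """Rise time tr: 20% to 80% of the steady-state value."""
    return (crossing_time(times, values, 0.8 * vdd, rising=True)
            - crossing_time(times, values, 0.2 * vdd, rising=True))


def fall_time(times, values, vdd):
    """Fall time tf: 80% to 20% of the steady-state value."""
    return (crossing_time(times, values, 0.2 * vdd, rising=False)
            - crossing_time(times, values, 0.8 * vdd, rising=False))


def delay_50(times, vin, vout, vdd, input_rising, output_rising):
    """Delay from the input crossing 50% to the output crossing 50%."""
    t_in = crossing_time(times, vin, 0.5 * vdd, rising=input_rising)
    t_out = crossing_time(times, vout, 0.5 * vdd, rising=output_rising)
    return t_out - t_in


# Hypothetical inverter waveforms (time in ps, voltage in V).
VDD = 1.0
times = [0.0, 20.0, 30.0, 70.0, 100.0]
vin = [0.0, 1.0, 1.0, 1.0, 1.0]     # input rises, crossing 50% at 10 ps
vout = [1.0, 1.0, 1.0, 0.0, 0.0]    # output falls, crossing 50% at 50 ps

tpdf = delay_50(times, vin, vout, VDD, input_rising=True, output_rising=False)
tf = fall_time(times, vout, VDD)
print(f"tpdf = {tpdf:.1f} ps, tf = {tf:.1f} ps")
```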

Intuitively, we know that when an input changes, the output will retain its old value for at least
the contamination delay and take on its new value in at most the propagation delay.
We differentiate between the delays for the output rising, tpdr/tcdr, and the output falling,
tpdf/tcdf. Rise/fall times are also sometimes called slopes or edge rates. Propagation and
contamination delay times are also called max-time and min-time, respectively. The gate that
charges or discharges a node is called the driver, and the gates and wire being driven are called
the load. Propagation delay is usually the most relevant value of interest, and is often simply
called delay.

A timing analyzer computes the arrival times, i.e., the latest time at which each node in a block
of logic will switch. The nodes are classified as inputs, outputs, and internal nodes. The user
must specify the arrival time of inputs and the time data is required at the outputs. The arrival
time ai at internal node i depends on the propagation delay of the gate driving i and the arrival
times of the inputs to the gate:
$$a_i = \max_{j \in \text{fanin}(i)} \{a_j\} + t_{pd_i}$$
The timing analyzer computes the arrival times at each node and checks that the outputs arrive by
their required time. The slack is the difference between the required and arrival times. Positive
slack means that the circuit meets timing. Negative slack means that the circuit is not fast
enough. Figure 3.2 shows nodes annotated with arrival times. If the outputs are all required at
200 ps, the circuit has 60 ps of slack.
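The arrival-time and slack computation can be sketched in a few lines. The netlist, gate delays, and input arrival times below are hypothetical and are not the values of Figure 3.2; the sketch assumes each internal node is driven by one gate whose propagation delay is known, and it uses the standard-library `graphlib` module (Python 3.9 or later) to visit nodes after their fan-in.

```python
from graphlib import TopologicalSorter


def arrival_times(fanin, gate_delay, input_arrival):
    """Latest arrival time at every node.

    fanin:         node -> list of nodes feeding the gate that drives it
    gate_delay:    node -> propagation delay of the gate that drives it
    input_arrival: primary input -> user-specified arrival time
    """
    arrival = dict(input_arrival)
    # static_order() yields each node only after all of its fan-in nodes.
    for node in TopologicalSorter(fanin).static_order():
        if node in arrival:          # primary input, already specified
            continue
        latest_input = max(arrival[j] for j in fanin[node])
        arrival[node] = latest_input + gate_delay[node]
    return arrival


# Hypothetical block of logic (all times in ps).
fanin = {"n1": ["a", "b"], "n2": ["b", "c"], "y": ["n1", "n2"]}
gate_delay = {"n1": 30, "n2": 40, "y": 50}
input_arrival = {"a": 0, "b": 20, "c": 0}

arrival = arrival_times(fanin, gate_delay, input_arrival)
required = 200
slack = required - arrival["y"]      # positive slack: timing is met
print(arrival, slack)
```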

3.1.2 Timing Optimization


In most designs there will be many logic paths that do not require any conscious effort when it
comes to speed. These paths are already fast enough for the timing goals of the system. However,
there will be a number of critical paths that limit the operating speed of the system and require
attention to timing details. The critical paths can be affected at four main levels:
• The architectural/microarchitectural level
• The logic level
• The circuit level
• The layout level
The most leverage is achieved with a good microarchitecture. This requires a broad knowledge of
both the algorithms that implement the function and the technology being targeted, such as how
many gate delays fit in a clock cycle, how quickly addition occurs, how fast memories are
accessed, and how long signals take to propagate along a wire.
Trade-offs at the microarchitectural level include the number of pipeline stages, the number of
execution units (parallelism), and the size of memories.
The next level of timing optimization comes at the logic level. Trade-offs include types of
functional blocks (e.g., ripple carry vs. lookahead adders), the number of stages of gates in the
clock cycle, and the fan-in and fan-out of the gates. The transformation from function to gates and
registers can be done by experience, by experimentation, or, most often, by logic synthesis.
Remember, however, that no amount of skillful logic design can overcome a poor
microarchitecture.
Once the logic has been selected, the delay can be tuned at the circuit level by choosing transistor
sizes or using other styles of CMOS logic. Finally, delay is dependent on the layout. The floorplan
(either manually or automatically generated) is of great importance because it determines the wire
lengths that can dominate delay. Good cell layouts can also reduce parasitic capacitance.
Many RTL designers never venture below the microarchitectural level. A common design practice
is to write RTL code, synthesize it (allowing the synthesizer to do the timing optimizations at the
logic, circuit, and placement levels) and check if the results are fast enough. If they are not, the
designer recodes the RTL with more parallelism or pipelining, or changes the algorithm and
repeats until the timing constraints are satisfied. Timing analyzers are used to check timing closure,
i.e., whether the circuit meets all of the timing constraints.
Without an understanding of the lower levels of abstraction where the synthesizer is working, a
designer may have a difficult time achieving timing closure on a challenging system.
The RC delay model approximates a switching transistor with an effective resistance and provides
a way to estimate delay using arithmetic rather than differential equations.
The method of Logical Effort simplifies the model even further and is a powerful way to evaluate
delay in circuits.

3.2 Transient Response


The most fundamental way to compute delay is to develop a physical model of the circuit of
interest, write a differential equation describing the output voltage as a function of input voltage
and time, and solve the equation. The solution of the differential equation is called the transient
response, and the delay is the time when the output reaches VDD/2.
The differential equation is based on charging or discharging of the capacitances in the circuit. The
circuit takes time to switch because the capacitance cannot change its voltage instantaneously. If
capacitance C is charged with a current I, the voltage on the capacitor varies as:
$$I = C \frac{dV}{dt}$$
Figure 3.3(a) shows an inverter X1 driving another inverter X2 at the end of a wire. Suppose a
voltage step from 0 to VDD is applied to node A and we wish to compute the propagation delay,
tpdf, through X1, i.e., the delay from the input step until node B crosses VDD/2.
The capacitances in the circuit are annotated on Figure 4.3(b). There are diffusion capacitances
between the drain and body of each transistor and between the source and body of each
transistor: Cdb and Csb. The gate capacitance Cgs of the transistors in X2 is part of the load. The
wire capacitance is also part of the load. The gate capacitance of the transistors in X1 and the
diffusion capacitance of the transistors in X2 do not matter because they do not connect to node B.
The source-to-body capacitors Csbn1 and Csbp1 have both terminals tied to constant voltages and
thus do not contribute to the switching capacitance. It is also irrelevant whether the second
terminal of each capacitor connects to ground or power because both are constant supplies, so for
the sake of simplicity, we can draw all of the capacitors as if they are connected to ground.
Figure 4.3(c) shows the equivalent circuit diagram in which all the capacitances are lumped into a
single Cout.
3.3 RC Delay Model
RC delay models approximate the nonlinear transistor I-V and C-V characteristics with an average
resistance and capacitance over the switching range of the gate. This approximation works
remarkably well for delay estimation despite its obvious limitations in predicting detailed analog
behavior.
3.3.1 Effective Resistance
The RC delay model treats a transistor as a switch in series with a resistor. The effective resistance
is the ratio of Vds to Ids averaged across the switching interval of interest.
A unit nMOS transistor is defined to have effective resistance R. The size of the unit transistor is
arbitrary but conventionally refers to a transistor with minimum length and minimum contacted
diffusion width (i.e., 4/2 λ). Alternatively, it may refer to the width of the nMOS transistor in a
minimum-sized inverter in a standard cell library. An nMOS transistor of k times unit width has
resistance R/k because it delivers k times as much current.
A unit pMOS transistor has greater resistance, generally in the range of 2R–3R, because of its
lower mobility. Throughout, we will use 2R for examples to keep arithmetic simple. R is typically
on the order of 10 kΩ for a unit transistor.
According to the long-channel model, current decreases linearly with channel length, and hence
resistance is proportional to L.
3.3.2 Gate and Diffusion Capacitance
Each transistor also has gate and diffusion capacitance. We define C to be the gate capacitance
of a unit transistor of either flavor. A transistor of k times unit width has capacitance kC. Diffusion
capacitance depends on the size of the source/drain region.
We assume the contacted source or drain of a unit transistor to also have capacitance of about C.
Wider transistors have proportionally greater diffusion capacitance. Increasing channel length
increases gate capacitance proportionally but does not affect diffusion capacitance. We roughly
estimate C for a minimum-length transistor to be 1 fF/µm of width. In a 65 nm process with a unit
transistor being 0.1 µm wide, C is thus about 0.1 fF.
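The scaling rules above (resistance R/k, capacitance kC, pMOS resistance taken as 2R) can be collected in a short helper. This is a minimal sketch using the rough numbers quoted in the text; the function name `transistor_rc` and the constant names are assumptions made for this example.

```python
import math

R_UNIT = 10e3            # ohms; "on the order of 10 kΩ" for a unit nMOS
C_PER_UM = 1e-15         # farads per micron of width (rough 1 fF/µm estimate)
UNIT_WIDTH_UM = 0.1      # unit transistor width assumed for a 65 nm process

C_UNIT = C_PER_UM * UNIT_WIDTH_UM    # about 0.1 fF


def transistor_rc(k, pmos=False):
    """Effective resistance, gate capacitance, and contacted diffusion
    capacitance of a transistor that is k times unit width."""
    r_unit = 2 * R_UNIT if pmos else R_UNIT   # pMOS taken as 2R, as in the text
    resistance = r_unit / k                   # k times the current -> R/k
    gate_cap = k * C_UNIT                     # kC
    diffusion_cap = k * C_UNIT                # about kC for a contacted diffusion
    return resistance, gate_cap, diffusion_cap


r_n, cg_n, cd_n = transistor_rc(1)               # unit nMOS
r_p, cg_p, cd_p = transistor_rc(2, pmos=True)    # pMOS of twice unit width
print(r_n, r_p, cg_n, cg_p)
```

A pMOS transistor of twice unit width therefore has the same effective resistance as the unit nMOS transistor, at the cost of twice the capacitance.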
3.3.3 Equivalent RC Circuits

Figure 4.5 shows equivalent RC circuit models for nMOS and pMOS transistors of width k with
contacted diffusion on both source and drain. The pMOS transistor has approximately twice the
resistance of the nMOS transistor because holes have lower mobility than electrons. The pMOS
capacitors are shown with VDD as their second terminal because the n-well is usually tied high.
However, the behavior of the capacitor from a delay perspective is independent of the second
terminal voltage so long as it is constant. Hence, we sometimes draw the second terminal as ground
for convenience.
The equivalent circuits for logic gates are assembled from the individual transistors. Figure 4.6
shows the equivalent circuit for a fanout-of-1 inverter with negligible wire capacitance. The unit
inverters of Figure 4.6(a) are composed from an nMOS transistor of unit size and a pMOS
transistor of twice unit width to achieve equal rise and fall resistance. Figure 4.6(b) gives an
equivalent circuit, showing the first inverter driving the second inverter's gate. If the input A
rises, the nMOS transistor will be ON and the pMOS OFF. Figure 4.6(c) illustrates this case with
the switches removed. The capacitors shorted between two constant supplies are also removed
because they are not charged or discharged.
The total capacitance on the output Y is 6C: 3C of diffusion capacitance from the first inverter
plus 3C of gate capacitance from the second.

3.3.4 Transient Response

Now, consider applying the RC model to estimate the step response of the first-order system
shown in Figure 4.8. This system is a good model of an inverter sized for equal rise and fall
delays. The system has a transfer function
$$H(s) = \frac{1}{1 + sRC}$$
and a step response
$$V_{out}(t) = V_{DD}\, e^{-t/\tau}$$
where τ = RC. The propagation delay is the time at which Vout reaches VDD/2, as shown in
Figure 4.9:
$$t_{pd} = RC \ln 2$$

The factor of ln 2 ≈ 0.69 is cumbersome. Because the effective resistance R is an empirical
parameter anyway, the factor can be absorbed into a new effective resistance R′ = R ln 2.
Now the propagation delay is simply R′C. For the sake of convenience, we usually drop the
prime symbols and just write
tpd = RC
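The ln 2 factor in the first-order step response can be confirmed numerically. The sketch below searches for the time at which an exponentially discharging output reaches VDD/2 and compares it with RC ln 2. The values R = 10 kΩ and a load of 6C with C = 0.1 fF are taken from the rough numbers used elsewhere in these notes; the function names are assumptions for this example.

```python
import math


def step_response(t, vdd, r, c):
    """Falling output of a first-order RC system: Vout = VDD * exp(-t / RC)."""
    return vdd * math.exp(-t / (r * c))


def propagation_delay(vdd, r, c, tol=1e-18):
    """Find the time at which Vout crosses VDD/2 by bisection."""
    lo, hi = 0.0, 10.0 * r * c          # the crossing lies well inside 10 time constants
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if step_response(mid, vdd, r, c) > vdd / 2:
            lo = mid                     # output still above VDD/2: crossing is later
        else:
            hi = mid
    return 0.5 * (lo + hi)


VDD = 1.0
R = 10e3                 # ohms
C_LOAD = 6 * 0.1e-15     # farads: 6C on the output of a fanout-of-1 inverter

tpd = propagation_delay(VDD, R, C_LOAD)
print(f"tpd = {tpd * 1e12:.3f} ps, RC ln 2 = {R * C_LOAD * math.log(2) * 1e12:.3f} ps")
```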
Figure 4.10 shows a second-order system. R1 and R2 might model the two series nMOS transistors
in a NAND gate or an inverter driving a long wire with non-negligible resistance. With C1 on the
intermediate node and C2 on the output node, the transfer function is
$$H(s) = \frac{1}{1 + s\left[R_1C_1 + (R_1 + R_2)C_2\right] + s^2 R_1C_1R_2C_2}$$

The exact second-order response, EQ (4.12), is so complicated that it defeats the purpose of
simplifying a CMOS circuit into an equivalent RC network. However, it can be further approximated
as a first-order system with a single time constant:
$$\tau = R_1C_1 + (R_1 + R_2)C_2 \qquad (4.13)$$

3.3.5 Elmore Delay


In general, most circuits of interest can be represented as an RC tree, i.e., an RC circuit with no
loops. The root of the tree is the voltage source and the leaves are the capacitors at the ends of the
branches. The Elmore delay model [Elmore48] estimates the delay from a source switching to one
of the leaf nodes changing as the sum over each node i of the capacitance Ci on the node, multiplied
by the effective resistance Ris on the shared path from the source to the node and the leaf:
$$t_{pd} = \sum_i R_{is} C_i$$
Application of Elmore delay is best illustrated through examples.

It is often helpful to express delay in a process-independent form so that circuits can be compared
based on topology rather than speed of the manufacturing process. Moreover, with a
process-independent measure for delay, knowledge of circuit speeds gained while working in one process
can be carried over to a new process. Observe that the delay of an ideal fanout-of-1 inverter with
no parasitic capacitance is τ = 3RC [Sutherland99]. We denote the normalized delay d relative to
this inverter delay:
$$d = \frac{t_{pd}}{\tau}$$
The delay consists of two components. The parasitic delay is the time for a gate to drive its own
internal diffusion capacitance. Boosting the width of the transistors decreases the resistance but
increases the capacitance, so the parasitic delay is ideally independent of the gate size.
The effort delay depends on the ratio h of external load capacitance to input capacitance and thus
changes with transistor widths. It also depends on the complexity of the gate. The capacitance ratio
is called the fanout or electrical effort and the term indicating gate complexity is called the logical
effort.
For example, an inverter has a delay of d = h + 1, so the parasitic delay is 1 and the logical effort
is also 1. The NAND3 has a worst case delay of d = (5/3)h + 5. Thus, it has a parasitic delay
of 5 and a logical effort of 5/3.

Compute the Elmore delay for Vout in the 2nd order RC system.
SOLUTION: The circuit has a source and two nodes. At node n1, the capacitance is C1 and the
resistance to the source is R1. At node Vout, the capacitance is C2 and the resistance to the source
is (R1 + R2). Hence, the Elmore delay is tpd = R1C1 + (R1 + R2)C2, just as the single time constant
predicted in EQ (4.13). Note that the effective resistances should account for the factor of ln 2.
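The same calculation can be written as a short routine for any RC ladder, where each node's capacitance is multiplied by the total resistance between it and the source. This is a minimal sketch for the ladder case only (a chain with no side branches); the function name `elmore_ladder` and the sample component values are assumptions for this example.

```python
def elmore_ladder(resistances, capacitances):
    """Elmore delay to the last node of an RC ladder.

    resistances[i] connects node i-1 (the source for i = 0) to node i;
    capacitances[i] is the capacitance from node i to ground.
    """
    delay = 0.0
    r_to_source = 0.0
    for r, c in zip(resistances, capacitances):
        r_to_source += r             # resistance from the source to this node
        delay += r_to_source * c     # this node's contribution R_is * C_i
    return delay


# Second-order system: tpd = R1*C1 + (R1 + R2)*C2
R1, R2 = 1.0, 2.0        # arbitrary example values
C1, C2 = 3.0, 4.0
tpd = elmore_ladder([R1, R2], [C1, C2])
print(tpd)               # same value as R1*C1 + (R1 + R2)*C2
```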

Estimate Elmore delay tpd for a unit inverter driving m identical unit inverters.
SOLUTION: Figure 4.12 shows an equivalent circuit for the falling transition. Each load inverter
presents 3C units of gate capacitance, for a total of 3mC. The output node also sees a capacitance
of 3C from the drain diffusions of the driving inverter. This capacitance is called parasitic
because it is an undesired side-effect of the need to make the drain large enough to contact. The
parasitic capacitance is independent of the load that the inverter is driving.
Hence, the total capacitance is (3 + 3m)C. The resistance is R, so the Elmore delay is
tpd = (3 + 3m)RC.

Estimate Elmore delay tpd for an inverter of width w driving m identical unit inverters.
SOLUTION: Figure 4.13 shows the equivalent circuit. The driver transistors are w times as wide,
so the effective resistance decreases by a factor of w. The diffusion capacitance increases by a
factor of w. The Elmore delay is tpd = ((3w + 3m)C)(R/w) = (3 + 3m/w)RC.
Define the fanout of the gate, h, to be the ratio of the load capacitance to the input capacitance.
(Diffusion capacitance is not counted in the fanout.) The load capacitance is 3mC. The input
capacitance is 3wC. Thus, the inverter has a fanout of h = m/w and the delay can be written as
(3 + 3h)RC.

If a unit transistor has R = 10 kΩ and C = 0.1 fF in a 65 nm process, compute the delay, in picoseconds,
of the inverter in Figure 4.14 with a fanout of h = 4.
SOLUTION: tpd = (3 + 3h)RC = (3 + 3 × 4)(10 kΩ)(0.1 fF) = 15 ps
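The arithmetic of this example, and its link to the width-w inverter driving m unit inverters, can be reproduced with a short sketch. The function names are assumptions made for illustration; the formulas are the ones derived in the examples above.

```python
import math


def inverter_delay(h, r, c):
    """Delay of an inverter with fanout h: tpd = (3 + 3h) * R * C."""
    return (3 + 3 * h) * r * c


def sized_inverter_delay(w, m, r, c):
    """Inverter of width w driving m unit inverters:
    total capacitance (3w + 3m)C discharged through resistance R/w."""
    return (3 * w + 3 * m) * c * (r / w)


R = 10e3        # ohms
C = 0.1e-15     # farads

tpd_fo4 = inverter_delay(4, R, C)
print(f"fanout-of-4 inverter delay = {tpd_fo4 * 1e12:.1f} ps")

# A width-2 inverter driving 8 unit inverters also has h = m/w = 4.
tpd_sized = sized_inverter_delay(2, 8, R, C)
```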

3.4 Linear Delay Model


The RC delay model showed that delay is a linear function of the fan-out of a gate. Based on this
observation, designers further simplify delay analysis by characterizing a gate by the slope and
y-intercept of this function. In general, the normalized delay of a gate can be expressed in units of τ
as
$$d = f + p \qquad (4.20)$$
p is the parasitic delay inherent to the gate when no load is attached. f is the effort delay or stage
effort that depends on the complexity and fanout of the gate:
$$f = gh \qquad (4.21)$$
The complexity is represented by the logical effort, g. An inverter is defined to have a logical effort
of 1. More complex gates have greater logical efforts, indicating that they take longer to drive a
given fanout.
For example, the logical effort of the 3-input NAND gate from the previous example is 5/3. A gate
driving h identical copies of itself is said to have a fanout or electrical effort of h. If the load does
not contain identical copies of the gate, the electrical effort can be computed as
$$h = \frac{C_{out}}{C_{in}}$$
where Cout is the capacitance of the external load being driven and Cin is the input capacitance
of the gate.

Figure 4.21 plots normalized delay vs. electrical effort for an idealized inverter and 3-input NAND
gate. The y-intercepts indicate the parasitic delay, i.e., the delay when the gate drives no load.
The slope of the lines is the logical effort. The inverter has a slope of 1 by definition. The NAND
has a slope of 5/3.
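The linear delay model is simple enough to evaluate directly. The sketch below computes d = gh + p using exact fractions. The inverter (g = 1, p = 1) and NAND3 (g = 5/3, p = 5) values are the ones quoted earlier in these notes from the Elmore delay example; the function names are assumptions for this example.

```python
from fractions import Fraction


def electrical_effort(c_out, c_in):
    """h = Cout / Cin."""
    return Fraction(c_out) / Fraction(c_in)


def normalized_delay(g, h, p):
    """Linear delay model: d = g*h + p, in units of tau."""
    return g * h + p


# Inverter driving four copies of itself (h = 4).
d_inv = normalized_delay(Fraction(1), 4, 1)

# NAND3 with input capacitance 5 units driving a load of 15 units (h = 3).
h_nand = electrical_effort(15, 5)
d_nand3 = normalized_delay(Fraction(5, 3), h_nand, 5)

print(d_inv, d_nand3)    # delays in units of tau
```

With no load (h = 0) each expression reduces to the parasitic delay p, which is the y-intercept of the corresponding line in the delay plot.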

The remainder of this section explores how to estimate the logical effort and parasitic delay and
how to use the linear delay model.
Estimate tpdf and tpdr for the 3-input NAND gate shown in the figure if the output is loaded
with h identical NAND gates.
Figure 4.7(a) shows such a gate. The three nMOS transistors are in series so the resistance is three
times that of a single transistor. Therefore, each must be three times unit width to compensate. In
other words, each transistor has resistance R/3 and the series combination has resistance R. The
three pMOS transistors are in parallel. In the worst case (with one of the inputs low), only one of
the pMOS transistors is ON. Therefore, each must be twice unit width to have resistance R.
Figure 4.7(c) redraws the gate with the capacitors that have both terminals tied to supply rails
deleted and the remaining capacitances lumped to ground. Figure 4.7(d) shows the equivalent
circuit for the falling output transition. The output pulls down through the three series nMOS
transistors. Figure 4.7(e) shows the equivalent circuit for the rising output transition. In the worst
case, the upper two inputs are 1 and the bottom one falls to 0. The output pulls up through a single
pMOS transistor. The upper two nMOS transistors are still on, so the diffusion capacitance
between the series nMOS transistors must also be charged.
Each NAND gate load presents 5 units of capacitance on a given input. Figure 4.15(a) shows the
equivalent circuit including the load for the falling transition.
Node n1 has capacitance 3C and resistance of R/3 to ground. Node n2 has capacitance 3C and
resistance (R/3 + R/3) to ground. Node Y has capacitance (9 + 5h)C and resistance
(R/3 + R/3 + R/3) to ground.

The Elmore delay for the falling output is the sum of these RC products:
tpdf = (3C)(R/3) + (3C)(R/3 + R/3) + ((9 + 5h)C)(R/3 + R/3 + R/3) = (12 + 5h)RC.
For the rising transition, node Y has capacitance (9 + 5h)C and resistance R to the VDD supply.
Node n2 has capacitance 3C. The relevant resistance is only R, not (R + R/3), because the output is
being charged only through R. This is what is meant by the resistance on the shared path from the
source (VDD) to the node (n2) and the leaf (Y). Similarly, node n1 has capacitance 3C and
resistance R. Hence, the Elmore delay for the rising output is
tpdr = (9 + 5h)RC + 3RC + 3RC = (15 + 5h)RC.
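Both results can be checked by listing each node's capacitance together with its shared-path resistance and summing the products. The sketch below works in units of C and R with exact fractions; the function names are assumptions for this example, and the node values are the ones stated in the solution above.

```python
from fractions import Fraction


def elmore(terms):
    """Sum of C_i * R_is over (capacitance, shared-path resistance) pairs."""
    return sum(Fraction(c) * Fraction(r) for c, r in terms)


def nand3_falling(h):
    """Falling output: discharge through three series nMOS of R/3 each."""
    return elmore([
        (3, Fraction(1, 3)),      # node n1: 3C through R/3
        (3, Fraction(2, 3)),      # node n2: 3C through 2R/3
        (9 + 5 * h, 1),           # node Y: (9 + 5h)C through R
    ])


def nand3_rising(h):
    """Rising output: every node is charged through the single pMOS (R)."""
    return elmore([
        (9 + 5 * h, 1),           # node Y
        (3, 1),                   # node n2, shared path is only R
        (3, 1),                   # node n1, shared path is only R
    ])


for h in range(4):
    print(h, nand3_falling(h), nand3_rising(h))   # in units of RC
```

Dividing the rising delay by τ = 3RC gives (5/3)h + 5, the normalized NAND3 delay quoted earlier.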
3.4.1 Logical Effort
Logical effort of a gate is defined as the ratio of the input capacitance of the gate to the input
capacitance of an inverter that can deliver the same output current.
Equivalently, logical effort indicates how much worse a gate is at producing output current as
compared to an inverter, given that each input of the gate may only present as much input
capacitance as the inverter.
Logical effort can be measured in simulation from delay vs. fanout plots as the ratio of the slope
of the delay of the gate to the slope of the delay of an inverter.
Figure 4.22 shows inverter, 3-input NAND, and 3-input NOR gates with transistor widths chosen to
achieve unit resistance, assuming pMOS transistors have twice the resistance of nMOS transistors.
The inverter presents three units of input capacitance. The NAND presents five units of
capacitance on each input, so the logical effort is 5/3. Similarly, the NOR presents seven units of
capacitance, so the logical effort is 7/3. This matches our expectation that NANDs are better than
NORs because NORs have slow pMOS transistors in series.
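The counting argument used for the 3-input gates can be written out as a sketch. It assumes the same sizing rule as the text (widths chosen for unit resistance, pMOS twice as resistive as nMOS) and extends it to n inputs; that generalization and the function names are assumptions made here, and the 3-input results match the values quoted above.

```python
from fractions import Fraction

INVERTER_CIN = 1 + 2     # unit nMOS (1) plus pMOS of twice unit width (2)


def nand_logical_effort(n):
    """n series nMOS, each n wide; n parallel pMOS, each 2 wide."""
    c_in = n + 2                       # one nMOS gate plus one pMOS gate per input
    return Fraction(c_in, INVERTER_CIN)


def nor_logical_effort(n):
    """n parallel nMOS, each 1 wide; n series pMOS, each 2n wide."""
    c_in = 1 + 2 * n
    return Fraction(c_in, INVERTER_CIN)


for n in (1, 2, 3, 4):
    print(n, nand_logical_effort(n), nor_logical_effort(n))
```

For every n greater than 1 the NOR value exceeds the NAND value, which reflects the cost of placing the slower pMOS transistors in series.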
Table 4.2 lists the logical effort of common gates. The effort tends to increase with the number of
inputs. NAND gates are better than NOR gates because the series transistors are nMOS rather than
pMOS. Exclusive-OR gates are particularly costly and have different logical efforts for different
inputs. The multiplexers built from ganged tristates, as shown in Figure 1.29(b), have a logical
effort of 2 independent of the number of inputs. This might at first seem to imply that very large
multiplexers are just as fast as small ones. However, the parasitic delay does increase with
multiplexer size; hence, it is generally fastest to construct large multiplexers out of trees of 4-input
multiplexers.

3.4.2 Parasitic Delay


The parasitic delay of a gate is the delay of the gate when it drives zero load. It can be estimated
with RC delay models. A crude method good for hand calculations is to count only diffusion
capacitance on the output node. For example, consider the gates in Figure 4.22, assuming each
transistor on the output node has its own drain diffusion contact. Transistor widths were chosen to
give a resistance of R in each gate. The inverter has three units of diffusion capacitance on the
output, so the parasitic delay is 3RC = τ. In other words, its normalized parasitic delay is 1.
In general, the normalized parasitic delay of an inverter is called pinv. pinv is the ratio of diffusion capacitance to gate
capacitance in a particular process. It is usually close to 1 and will be considered to be 1 in many
examples for simplicity.
The 3-input NAND and NOR each have 9 units of diffusion capacitance on the output, so the
parasitic delay is three times as great (3pinv, or simply 3).
Table 4.3 estimates the parasitic delay of common gates. Increasing transistor sizes reduces
resistance but increases capacitance correspondingly, so parasitic delay is, on first order,
independent of gate size.
However, wider transistors can be folded and often see less than linear increases in internal
wiring parasitic capacitance, so in practice, larger gates tend to have slightly lower parasitic
delay.


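For example, Figure 4.23 shows a model of an n-input NAND gate in which the upper inputs were
all 1 and the bottom input rises. The gate must discharge the diffusion capacitances of all of the
internal nodes as well as the output. Assuming unit-resistance sizing (each series nMOS transistor of width n, each pMOS transistor of width 2), the output node carries 3nC, each internal node carries nC, and the Elmore delay is

$$t_{pd} = R(3nC) + \sum_{i=1}^{n-1} \frac{iR}{n}(nC) = \left(\frac{n^2}{2} + \frac{5n}{2}\right)RC$$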
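
A minimal sketch of this Elmore calculation follows. It assumes the usual unit-resistance sizing for the n-input NAND (series nMOS of width n, pMOS of width 2), so that the output node carries 3n units of capacitance and each internal node carries n units; these sizes and the function names are assumptions made for illustration, since the figure itself is not reproduced here.

```python
def nand_elmore_delay(n: int) -> float:
    """Elmore delay, in units of RC, of an n-input NAND whose bottom input rises."""
    # Output node: 3n units of C discharged through the full stack resistance R.
    delay = 3.0 * n
    # Internal node i (counted from ground) holds n units of C and sees
    # i series transistors of resistance R/n each.
    for i in range(1, n):
        delay += (i / n) * n
    return delay


def nand_elmore_closed_form(n: int) -> float:
    """Closed form of the same sum: (n^2/2 + 5n/2) RC."""
    return n * n / 2 + 5 * n / 2


for n in (1, 2, 3, 4):
    print(n, nand_elmore_delay(n), nand_elmore_closed_form(n))
```

The quadratic term shows why this refined estimate grows faster with the number of inputs than the crude count of output diffusion capacitance alone.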

3.4.4 Drive
A good standard cell library contains multiple sizes of each common gate. The sizes are typically
labeled with their drive. For example, a unit inverter may be called inv_1x. An inverter of eight
times unit size is called inv_8x. A 2-input NAND that delivers the same current as the inverter is
called nand2_1x.
It is often more intuitive to characterize gates by their drive, x, rather than their input capacitance.
If we redefine a unit inverter to have one unit of input capacitance, then the drive of an arbitrary
gate is
$$x = \frac{C_{in}}{g}$$

Delay can be expressed in terms of drive as

$$d = \frac{C_{out}}{x} + p$$

3.5 Logical Effort of Paths

The method of Logical Effort provides a simple method “on the back of an envelope” to choose
the best topology and number of stages of logic for a function. Based on the linear delay model, it
allows the designer to quickly estimate the best number of stages for a path, the minimum possible
delay for the given topology, and the gate sizes that achieve this delay.
3.5.1 Delay in Multistage Logic Networks
Figure 4.29 shows the logical and electrical efforts of each stage in a multistage path as a function
of the sizes of each stage. The path of interest (the only path in this case) is marked with the dashed
blue line. Observe that logical effort is independent of size, while electrical effort depends on sizes.
This section develops some metrics for the path as a whole that are independent of sizing decisions.

The path logical effort G can be expressed as the product of the logical efforts of each stage along
the path.

$$G = \prod g_i$$

The path electrical effort H can be given as the ratio of the output capacitance the path must drive
divided by the input capacitance presented by the path. This is more convenient than defining path
electrical effort as the product of stage electrical efforts because we do not know the individual
stage electrical efforts until gate sizes are selected.

$$H = \frac{C_{out(path)}}{C_{in(path)}}$$


The path effort F is the product of the stage efforts of each stage. Recall that the stage effort of a
single stage is f = gh.

$$F = \prod f_i = \prod g_i h_i$$

In paths that branch, F ≠ GH. This is illustrated in Figure 4.30, a circuit with a two-way branch. Consider a path from the
primary input to one of the outputs. The path logical effort is
G = 1 × 1 = 1. The path electrical effort is H = 90/5 = 18.
Thus, GH = 18. But F = f1 f2 = g1h1g2h2 = 1 × 6 × 1 × 6 =
36. In other words, F = 2GH in this path on account of the
two-way branch.

The branching effort b is the ratio of the total capacitance seen by a stage to the capacitance on
the path; in Figure 4.30 it is (15 + 15)/15 = 2.

$$b = \frac{C_{onpath} + C_{offpath}}{C_{onpath}}$$

The path branching effort B is the product of the branching efforts between stages.

$$B = \prod b_i$$

Now we can define the path effort F as the product of the logical, electrical, and branching efforts
of the path. Note that the product of the electrical efforts of the stages is actually BH, not just H.

𝐹 = 𝐺𝐵𝐻
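
As a quick numerical check of the branching example, the sketch below recomputes G, B, H and F from the capacitances quoted in the text (input 5, two branches of 15 each, load 90). The helper name `path_effort` and its argument layout are illustrative assumptions.

```python
from math import prod


def path_effort(g, b, c_in, c_out):
    """Return (G, B, H, F) for per-stage logical efforts g and branching efforts b."""
    G = prod(g)
    B = prod(b)
    H = c_out / c_in
    return G, B, H, G * B * H


# Two inverters in series; the first drives a two-way branch (15 on path, 15 off path).
G, B, H, F = path_effort(g=[1, 1], b=[(15 + 15) / 15, 1], c_in=5, c_out=90)
print(G, B, H, F)  # 1 2.0 18.0 36.0

# The same F results from multiplying the stage efforts f = g * h directly.
h1 = (15 + 15) / 5
h2 = 90 / 15
print(1 * h1 * 1 * h2)  # 36.0
```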

We can now compute the delay of a multistage network. The path delay D is the sum of the delays
of each stage. It can also be written as the sum of the path effort delay DF and path parasitic delay
P:

$$D = \sum d_i = D_F + P, \qquad D_F = \sum f_i, \qquad P = \sum p_i$$


The product of the stage efforts is F, independent of gate sizes. The path effort delay is the sum of
the stage efforts. The sum of a set of numbers whose product is constant is minimized by choosing
all the numbers to be equal. In other words, the path delay is minimized when each stage bears the
same effort. If a path has N stages and each bears the same effort, that effort must be

$$\hat{f} = g_i h_i = F^{1/N}$$

Thus, the minimum possible delay of an N-stage path with path effort F and path parasitic
delay P is

$$D = N F^{1/N} + P$$

The gate sizes that achieve this delay are found by working backward from the end of the path, applying
the capacitance transformation formula to find the best input capacitance for a gate given the
output capacitance it drives:

$$C_{in_i} = \frac{C_{out_i} \times g_i}{\hat{f}}$$

Estimate the minimum delay of the path from A to B in Figure 4.31 and choose transistor
sizes to achieve this delay. The initial NAND2 gate may present a load of 8 λ of transistor
width on the input and the output load is equivalent to 45 λ of transistor width.

The path logical effort is G = (4/3) × (5/3) × (5/3) = 100/27. The path electrical effort is H = 45/8. The path
branching effort is B = 3 × 2 = 6. The path effort is F = GBH = 125. As there are three stages, the best stage
effort is $\hat{f} = \sqrt[3]{125} = 5$. The path parasitic
delay is P = 2 + 3 + 2 = 7. Hence, the minimum path delay is
D = 3 × 5 + 7 = 22 in units of τ, or 4.4 FO4 inverter delays.

The same result is obtained by summing the individual stage delays, using the gate input capacitances x = 10 and y = 15 given by the capacitance transformation.
The NAND2 gate delay is d1 = g1h1 + p1 = (4/3) × (10 + 10 + 10)/8 + 2 = 7. The NAND3 gate
delay is d2 = g2h2 + p2 = (5/3) × (15 + 15)/10 + 3 = 8. The NOR2 gate delay is d3 = g3h3 + p3
= (5/3) × 45/15 + 2 = 7. Hence, the path delay is 22.


3.5.2 Choosing the Best Number of Stages


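Inverters can be added to the end of a path without changing its function (save possibly for polarity). Let us compute how many inverters should be added for least delay. The logic block shown in Figure 4.34
has n1 stages and a path effort of F.
Consider adding N – n1 inverters to the end
to bring the path to N stages. The extra
inverters do not change the path logical effort
but do add parasitic delay. The delay of the
new path is

$$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\,p_{inv}$$

Differentiating with respect to N and setting to 0 allows us to solve for the best number of
stages, which we will call $\hat{N}$. The result can be expressed more compactly by defining

$$\rho = F^{1/\hat{N}}$$

to be the best stage effort.

$$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0 \;\Rightarrow\; p_{inv} + \rho(1 - \ln \rho) = 0$$

Neglecting parasitics (i.e., assuming pinv = 0), we find the classic result that the best stage effort is ρ =
2.71828 (e).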
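
The condition for the best stage effort, pinv + ρ(1 − ln ρ) = 0, has no closed-form solution when pinv is nonzero, but it is easy to solve numerically. The sketch below uses plain bisection; the function names and the search bracket [1, 20] are assumptions chosen for illustration.

```python
from math import log, e


def best_stage_effort(p_inv: float) -> float:
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection."""
    def residual(rho: float) -> float:
        return p_inv + rho * (1.0 - log(rho))

    lo, hi = 1.0, 20.0  # residual is positive at lo and negative at hi for small p_inv
    for _ in range(100):
        mid = (lo + hi) / 2
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def best_number_of_stages(path_effort: float, p_inv: float) -> float:
    """Real-valued N such that rho = F**(1/N); round to a nearby integer in practice."""
    return log(path_effort) / log(best_stage_effort(p_inv))


print(best_stage_effort(0.0))  # close to e
print(best_stage_effort(1.0))  # about 3.59
```

With pinv = 0 the solver returns e, matching the classic result; with pinv = 1 the best stage effort rises to roughly 3.59, which is why stage efforts near 4 are a common rule of thumb.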


Estimate the minimum delay of the path from A to B in Figure 4.31 and choose transistor
sizes to achieve this delay. The initial NAND2 gate may present a load of 8 λ of transistor
width on the input and the output load is equivalent to 45 λ of transistor width.

The path logical effort is G = (4/3) × (5/3) × (5/3) = 100/27.

The path electrical effort is H = 45/8.

The path branching effort is B = 3 × 2 = 6.

The path effort is F = GBH = 125.

As there are three stages, the best stage effort is $\hat{f} = \sqrt[3]{125} = 5$.

The path parasitic delay is P = 2 + 3 + 2 = 7.

Hence, the minimum path delay is D = 3 × 5 + 7 = 22 in units of τ, or 4.4 FO4 inverter delays.

The gate sizes are computed with the capacitance transformation from EQ (4.41) working
backward along the path: y = 45 × (5/3)/5 = 15 and x = (15 + 15) × (5/3)/5 = 10.

We verify that the initial 2-input NAND gate has the specified size of (10 + 10 + 10) × (4/3)/5 =
8. The transistor sizes in Figure 4.32 are chosen to give the desired amount of input capacitance
while achieving equal rise and fall delays. For example, a 2-input NOR gate should have a 4:1
P/N ratio. If the total input capacitance is 15, the pMOS width must be 12 and the nMOS width
must be 3 to achieve that ratio.

We can also check that our delay was achieved. The NAND2 gate delay is d1 = g1h1 + p1 = (4/3)
× (10 + 10 + 10)/8 + 2 = 7. The NAND3 gate delay is d2 = g2h2 + p2 = (5/3) × (15 + 15)/10 + 3 = 8. The NOR2 gate delay is
d3 = g3h3 + p3 = (5/3) × 45/15 + 2 = 7. Hence, the path delay is 22, as predicted. Recall that delay
is expressed in units of τ. In a 65 nm process with τ = 3 ps, the delay is 66 ps. Alternatively, a
fanout-of-4 inverter delay is 5τ, so the path delay is 4.4 FO4s.
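
The whole sizing procedure of this example can be written as a short routine. The sketch below is illustrative only: the function name `size_path` is hypothetical, and the per-stage branching list `[3, 2, 1]` is read off the worked numbers above (three loads of x, two loads of y, one output load).

```python
from math import prod


def size_path(g, p, b, c_in, c_out):
    """Return (best stage effort, minimum delay, per-stage input capacitances)."""
    n = len(g)
    F = prod(g) * prod(b) * (c_out / c_in)  # F = G * B * H
    f_hat = F ** (1.0 / n)                  # every stage bears the same effort
    d_min = n * f_hat + sum(p)              # D = N * F^(1/N) + P
    sizes = []
    load = c_out
    # Work backward: C_in = (on-path load * branching) * g / f_hat
    for gi, bi in zip(reversed(g), reversed(b)):
        load = load * bi * gi / f_hat
        sizes.append(load)
    return f_hat, d_min, sizes[::-1]


f_hat, d_min, sizes = size_path(
    g=[4 / 3, 5 / 3, 5 / 3],  # NAND2, NAND3, NOR2
    p=[2, 3, 2],
    b=[3, 2, 1],
    c_in=8,
    c_out=45,
)
print(round(f_hat, 6), round(d_min, 6), [round(s, 6) for s in sizes])
# 5.0 22.0 [8.0, 10.0, 15.0]
```

The first entry of `sizes` reproducing the specified input capacitance of 8 is the same consistency check performed by hand in the text.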


3.1 Semiconductor Memories


Semiconductor memory arrays capable of storing large quantities of digital information are
essential to all digital systems. The amount of memory required in a particular system depends
on the type of application, but, in general, the number of transistors utilized for the information
(data) storage function is much larger than the number of transistors used in logic operations
and for other purposes. The ever-increasing demand for larger data storage capacity has driven
the fabrication technology and memory development towards more compact design rules and,
consequently, toward higher data storage densities. Thus, the maximum realizable data storage
capacity of single-chip semiconductor memory arrays approximately doubles every two years.
On-chip memory arrays have become widely used subsystems in many VLSI circuits, and
commercially available single-chip read/write memory capacity has reached 64 megabits. This
trend toward higher memory density and larger storage capacity will continue to push the
leading edge of digital system design.

The area efficiency of the memory array, i.e., the number of stored data bits per unit area, is
one of the key design criteria that determine the overall storage capacity and, hence, the
memory cost per bit. Another important issue is the memory access time, i.e., the time required
to store and/or retrieve a particular data bit in the memory array. The access time determines
the memory speed, which is an important performance criterion of the memory array. Finally,
the static and dynamic power consumption of the memory array is a significant factor to be
considered in the design, because of the increasing importance of low-power applications. In
the following, we will investigate different types of MOS memory arrays and discuss in detail
the issues of area, speed, and power consumption for each circuit type. Memory circuits are
generally classified according to the type of data storage and the type of data access. Read-
Only Memory (ROM) circuits allow, as the name implies, only the retrieval of previously
stored data and do not permit modifications of the stored information contents during normal
operation. ROMs are non-volatile memories, i.e., the data storage function is not lost even
when the power supply voltage is off. Depending on the type of data storage (data write)
method, ROMs are classified as mask-programmed ROMs, Programmable ROMs (PROM),
Erasable PROMs (EPROM), and Electrically Erasable PROMs (EEPROM).


Read-write (R/W) memory circuits, on the other hand, must permit the modification (writing)
of data bits stored in the memory array, as well as their retrieval (reading) on demand. This
requires that the data storage function be volatile, i.e., the stored data are lost when the power
supply voltage is turned off. The read-write memory circuit is commonly called Random
Access Memory (RAM), mostly due to historical reasons. Compared to sequential-access
memories such as magnetic tapes, any cell in the R/W memory array can be accessed with
nearly equal access time. Based on the operation type of individual data storage cells, RAMs
are classified into two main categories: Static RAMs (SRAM) and Dynamic RAMs (DRAM).
Figure 10.1 shows an overview of the different memory types and their classifications.

A typical memory array organization is shown in Fig. 10.2. The data storage structure, or core,
consists of individual memory cells arranged in an array of horizontal rows and vertical
columns. Each cell is capable of storing one bit of binary information. Also, each memory cell
shares a common connection with the other cells in the same row, and another common
connection with the other cells in the same column. In this structure, there are 2^N rows, also
called word lines, and 2^M columns, also called bit lines. Thus, the total number of memory cells
in this array is 2^M × 2^N.


To access a particular memory cell, i.e., a particular data bit in this array, the corresponding bit
line and the corresponding word line must be activated (selected). The row and column
selection operations are accomplished by row and column decoders, respectively. The row
decoder circuit selects one out of 2^N word lines according to an N-bit row address, while the
column decoder circuit selects one out of 2^M bit lines according to an M-bit column address.
Once a memory cell or a group of memory cells is selected in this fashion, a data read and/or
a data write operation may be performed on the selected single bit or multiple bits on a
particular row. The column decoder circuit serves the double duties of selecting the particular
columns and routing the corresponding data content in a selected row to the output.

Individual memory cells can be accessed for data read and/or data write operations in random order,
independent of their physical locations in the memory array. Thus, the array organization
examined here is called a Random Access Memory (RAM) structure.
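
The row/column addressing described above can be sketched as follows. The split of a flat address into an upper N-bit row field and a lower M-bit column field is an assumption made for illustration (the notes do not fix a bit ordering), and both function names are hypothetical.

```python
def split_address(addr: int, n_row_bits: int, m_col_bits: int):
    """Split a flat cell address into (row, column) indices.

    Assumes the row address occupies the upper N bits and the column
    address the lower M bits.
    """
    if not 0 <= addr < (1 << (n_row_bits + m_col_bits)):
        raise ValueError("address out of range")
    row = addr >> m_col_bits
    col = addr & ((1 << m_col_bits) - 1)
    return row, col


def one_hot(index: int, width: int):
    """Decoder output: exactly one of `width` select lines is active."""
    return [1 if i == index else 0 for i in range(width)]


row, col = split_address(0b1011, n_row_bits=2, m_col_bits=2)
print(row, col)             # 2 3
print(one_hot(row, 2 ** 2))  # word lines: [0, 0, 1, 0]
print(one_hot(col, 2 ** 2))  # bit lines:  [0, 0, 0, 1]
```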

Read-Only Memory (ROM) Circuits


The read-only memory array can also be seen as a simple combinational Boolean network
which produces a specified output value for each input combination, i.e., for each address.
Thus, storing binary information at a particular address location can be achieved by the
presence or absence of a data path from the selected row (word line) to the selected column
(bit line), which is equivalent to the presence or absence of a device at that particular location.
In the following, we will examine two different implementations for MOS ROM arrays.
Consider first the 4-bit × 4-bit memory array shown in Fig. 10.3. Here, each column consists
of a pseudo-nMOS NOR gate driven by some of the row signals, i.e., the word lines.

Only one word line is activated (selected) at a time by raising its voltage to VDD, while all
other rows are held at a low voltage level. If an active transistor exists at the cross point of a
column and the selected row, the column voltage is pulled down to the logic-low level by that
transistor. If no active transistor exists at the cross point, the column voltage is pulled high by
the pMOS load device. Thus, a logic "1"-bit is stored as the absence of an active transistor,
while a logic "0"-bit is stored as the presence of an active transistor at the cross point.

To reduce static power consumption, the pMOS load transistors in the ROM array shown in Fig. 10.3 can
also be driven by a periodic precharge signal, resulting in a dynamic ROM.

In actual ROM layout, the array can be initially manufactured with nMOS transistors at every row-column
intersection. The "1"-bits are then realized by omitting the drain or source connection, or the
gate electrode, of the corresponding nMOS transistors in the final metallization step. Figure
10.4 shows four nMOS transistors in a NOR ROM array, forming the intersection of two metal
bit lines and two polysilicon word lines. To save silicon area, the transistors in every two
adjacent rows are arranged to share a common ground line, also routed in n-type diffusion. To
store a "0"-bit at a particular address location, the drain diffusion of the corresponding transistor
must be connected to the metal bit line via a metal-to-diffusion contact. Omission of this
contact, on the other hand, results in a stored "1"-bit.
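
The NOR ROM read behaviour can be captured in a small behavioural model. This is a sketch under stated assumptions: the array contents below are hypothetical (they are not the contents of Fig. 10.3), and `transistors[r][c]` is an encoding invented here to mark where an active nMOS device is present.

```python
def nor_rom_read(transistors, selected_row):
    """Read one word from a behavioural model of a NOR ROM.

    transistors[r][c] is True where an active nMOS transistor sits at the
    crossing of word line r and bit line c.
    """
    n_rows = len(transistors)
    # Only the selected word line is raised to VDD; all others stay low.
    word_lines = [1 if r == selected_row else 0 for r in range(n_rows)]
    word = []
    for c in range(len(transistors[0])):
        # A column is pulled low if any transistor on it has its gate high;
        # otherwise the pMOS load pulls it high.
        pulled_low = any(transistors[r][c] and word_lines[r] for r in range(n_rows))
        word.append(0 if pulled_low else 1)
    return word


# Hypothetical 4 x 4 contents.
rom = [
    [True, False, True, False],
    [False, False, True, True],
    [True, True, False, False],
    [False, True, False, True],
]
print(nor_rom_read(rom, 0))  # [0, 1, 0, 1]
```

Note that the stored word is the complement of the transistor pattern on the selected row: a present transistor reads as "0".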


The second implementation is the NAND ROM (Fig. 10.8). Here, each bit line consists of a depletion-load NAND gate, driven
by some of the row signals, i.e., the word lines. In normal operation, all word lines are held at
the logic-high voltage level except for the selected line, which is pulled down to the logic-low
level. If a transistor exists at the cross point of a column and the selected row, that transistor is
turned off and the column voltage is pulled high by the load device. On the other hand, if no
transistor exists (i.e., the position is shorted) at that particular cross point, the column voltage is pulled low by the
other nMOS transistors in the multi-input NAND structure. Thus, a logic "1"-bit is stored by
the presence of a transistor that can be deactivated, while a logic "0"-bit is stored by a shorted
or normally-on transistor at the cross point.

As in the NOR ROM case, the NAND-based ROM
array can be fabricated initially with a transistor connection present at every row-column
intersection. A "0"-bit is then stored by lowering the threshold voltage of the corresponding
nMOS transistor at the cross point through a channel implant, so that the transistor remains on
regardless of the gate voltage.
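
A behavioural sketch of the NAND ROM read follows. It models each bit line as a series chain that pulls the column low only if every position in the chain conducts. The array contents and the `transistors[r][c]` encoding (True for a switchable transistor, False for a shorted or normally-on position) are assumptions made for illustration.

```python
def nand_rom_read(transistors, selected_row):
    """Read one word from a behavioural model of a NAND ROM."""
    n_rows = len(transistors)
    # All word lines are high except the selected one, which is pulled low.
    word_lines = [0 if r == selected_row else 1 for r in range(n_rows)]
    word = []
    for c in range(len(transistors[0])):
        # A position conducts if it is shorted/normally on, or if its gate is high.
        chain_conducts = all(
            (not transistors[r][c]) or word_lines[r] == 1 for r in range(n_rows)
        )
        # A conducting chain pulls the column low; otherwise the load pulls it high.
        word.append(0 if chain_conducts else 1)
    return word


# Hypothetical 4 x 4 contents.
rom = [
    [True, False, True, False],
    [False, False, True, True],
    [True, True, False, False],
    [False, True, False, True],
]
print(nand_rom_read(rom, 0))  # [1, 0, 1, 0]
```

In contrast to the NOR ROM, the stored word here equals the transistor pattern on the selected row: a switchable transistor reads as "1".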

A row decoder designed to drive a NOR ROM array must, by definition, select one of the 2^N
word lines by raising its voltage to VOH. As an example, consider the simple row address
decoder shown in Fig. 10.10, which decodes a two-bit row address and selects one out of four
word lines by raising its level.

The most straightforward implementation of this decoder is another NOR array, consisting of 4
rows (outputs) and 4 columns (two address bits and their complements). Note that this NOR-
based decoder array can be built just like the NOR ROM array, using the same selective
programming approach (Fig. 10.11). The ROM array and its row decoder can thus be fabricated
as two adjacent NOR arrays, as shown in Fig. 10.12.
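
The NOR-array decoder can be sketched as four NOR gates, each connected to two of the four address columns. The signal names (`A1`, `A1b`, `A0`, `A0b`) and the connection table are assumptions chosen to give the usual one-of-four truth table; they are not copied from Fig. 10.11.

```python
def nor_row_decoder(a1: int, a0: int):
    """Two-bit NOR-based row decoder: returns the four word-line levels."""
    columns = {"A1": a1, "A1b": 1 - a1, "A0": a0, "A0b": 1 - a0}
    # Row k has transistors on the two literals that are 0 only when the
    # address equals k, so its NOR output is high only for that address.
    connections = [
        ("A1", "A0"),    # row 0 selected for address 00
        ("A1", "A0b"),   # row 1 selected for address 01
        ("A1b", "A0"),   # row 2 selected for address 10
        ("A1b", "A0b"),  # row 3 selected for address 11
    ]
    return [0 if any(columns[name] for name in row) else 1 for row in connections]


for address in range(4):
    a1, a0 = (address >> 1) & 1, address & 1
    print(a1, a0, nor_row_decoder(a1, a0))
```

Exactly one word line is high for each address, which is the condition the NOR ROM array requires.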


A row decoder designed to drive a NAND ROM, on the other hand, must lower the voltage level of the selected
row to logic "0" while keeping all other rows at a logic-high level. This function can be
implemented by using an N-input NAND gate for each of
the row outputs. The truth table of a simple address decoder for four rows and the double
NAND-array implementation of the decoder and the ROM are shown in Fig. 10.13. As in the
NOR ROM case, the row address decoder of the NAND ROM array can thus be realized
using the same layout strategy as the memory array itself.

The column decoder circuitry is designed to select one out of 2^M bit lines (columns) of the
ROM array according to an M-bit column address, and to route the data content of the selected
bit line to the data output. A straightforward but costly approach would be to connect an nMOS
pass transistor to each bit-line (column) output, and to selectively drive one out of 2^M pass
transistors by using a NOR-based column address decoder, as shown in Fig. 10.14. In this
arrangement, only one nMOS pass transistor is turned on at a time, depending on the column
address bits applied to the decoder inputs. The conducting pass transistor routes the selected
column signal to the data output. Similarly, a number of columns can be chosen at a time, and
the selected columns can be routed to a parallel data output port.

The example shown in Fig. 10.15 is a column decoder tree for eight bit lines, which requires
three column address bits (and their complements) to select one of the eight columns.
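
A binary-tree column decoder can be modelled as M successive two-way selections. The sketch below assumes the address bits are applied least significant bit first, one bit per tree level; this ordering, the function names, and the transistor-count helper are illustrative assumptions rather than details taken from Fig. 10.15.

```python
def tree_column_select(columns, address_bits):
    """Route one of 2**M column values to the output through an M-level tree.

    address_bits are listed LSB first; at each level the pass transistor
    driven by the bit (or by its complement) conducts, halving the candidates.
    """
    level = list(columns)
    if len(level) != 2 ** len(address_bits):
        raise ValueError("need 2**M columns for M address bits")
    for bit in address_bits:
        level = [level[i + bit] for i in range(0, len(level), 2)]
    return level[0]


def tree_pass_transistor_count(m: int) -> int:
    """Pass transistors in the tree: 2**m + 2**(m-1) + ... + 2."""
    return 2 ** (m + 1) - 2


bit_lines = ["b0", "b1", "b2", "b3", "b4", "b5", "b6", "b7"]
print(tree_column_select(bit_lines, [1, 0, 1]))  # address 5 -> b5
print(tree_pass_transistor_count(3))             # 14
```

The tree needs no separate address decoder, at the cost of M pass transistors in series on every read path.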


Static Read-Write Memory (SRAM) Circuits


Read-write (R/W) memory circuits are designed to permit the modification (writing) of data
bits to be stored in the memory array, as well as their retrieval (reading) on demand. The
memory circuit is said to be static if the stored data can be retained indefinitely (as long as a
sufficient power supply voltage is provided), without any need for a periodic refresh operation.
We will examine the circuit structure and the operation of simple SRAM cells, as well as the
peripheral circuits designed to read and write the data.

The data storage cell, i.e., the 1-bit memory cell in static RAM arrays, invariably consists of a
simple latch circuit with two stable operating points (states). Depending on the preserved state
of the two-inverter latch circuit, the data being held in the memory cell will be interpreted either
as a logic "0" or as a logic "1." To access (read and write) the data contained in the memory
cell via the bit line, we need at least one switch, which is controlled by the corresponding word
line, i.e., the row address selection signal (Fig. 10.21(a)). Usually, two complementary access
switches consisting of nMOS pass transistors are implemented to connect the 1-bit SRAM cell
to the complementary bit lines (columns). This can be likened to turning the car steering wheel
with both left and right hands in complementary directions.


Figure 10.21(b) shows the generic structure of the MOS static RAM cell, consisting of two
cross-coupled inverters and two access transistors. The load devices may be polysilicon
resistors, depletion-type nMOS transistors, or pMOS transistors, depending on the type of the
memory cell. The pass gates acting as data access switches are enhancement-type nMOS
transistors. The use of resistive-load inverter with undoped polysilicon resistors in the latch
structure typically results in a significantly more compact cell size, compared with the other
alternatives (Fig. 10.21(c)). This is true since the resistors can be stacked on top of the cell
(using double-polysilicon technology), thereby reducing the cell size to four transistors, as
opposed to the six-transistor cell topologies. If multiple polysilicon layers are available, one
layer can be used for the gates of the enhancement-type nMOS transistors, while another level
is used for load resistors and interconnects. In order to attain acceptable noise margins and
output pull-up times for the resistive-load inverter, the value of the load resistor has to be kept
relatively low, as already examined in Section 5.2. On the other hand, a high-valued load
resistor is required in order to reduce the amount of standby current being drawn by each
memory cell. Thus, there is a trade-off between the high resistance required for low power and
the requirement to provide wider noise margins and high speed. The power consumption issue
will be addressed later in more detail. The six-transistor depletion-load nMOS SRAM cell
shown in Fig. 10.21(d) can be easily implemented with one polysilicon and one metal layer,
and the cell size tends to be relatively small, especially with the use of buried metal-diffusion
contacts. The static characteristics and the noise margins of this memory cell are typically better
than those of the resistive-load cell. The static power consumption of the depletion-load SRAM
cell, however, makes it an unsuitable candidate for high-density SRAM arrays. The full CMOS
SRAM cell shown in Fig. 10.21(e) achieves the lowest static power dissipation among the
various circuit configurations presented here. In addition, the CMOS cell offers superior noise
margins and switching speed as well. The comparative advantages and disadvantages of the
CMOS static RAM cell will be investigated in depth later in this section.


Full CMOS SRAM Cell:

Figure 10.10: Full CMOS SRAM cell


Consider the SRAM cell shown in Figure 10.10, which uses a pair of cross-coupled inverters.
The two stable operating points of this basic latch are used to store one bit of
information. For read and write operations, the cell is connected to the bit lines through two nMOS pass transistors (M3 and M4),
both of which are driven by the row select signal RS.

The SRAM cell is accessed via two bit lines (columns), C and C̄. When the word line (RS) is not selected,
i.e., when the voltage level of line RS is equal to logic ‘0’, the pass transistors
M3 and M4 are turned OFF. The latch circuit consisting of two cross-connected inverters then preserves one of its
two stable operating points; hence the data is held. The bit lines C and C̄ are
precharged to logic ‘1’ before read and write operations take place.

Write operation

• Both bit lines C and C̄ are precharged to VDD (logic ‘1’) in coincidence with φ1 of an assumed
two-phase clock. Precharging is effected by pMOS transistors M7 and M8.
• The appropriate column select line is activated in coincidence with clock phase φ2, and
either bit line C or C̄ is discharged according to the logic levels present on the I/O bus lines.

The memory cell is selected by raising the word line voltage to logic ‘1’, which turns ON the
pass transistors M3 and M4.

Write ‘1’ operation: The voltage level of column C̄ is forced to logic low by the Data-Write
circuitry. The driver transistor M1 turns off, and the voltage V1 attains a logic-high level while V2
goes low.

Read ‘1’ operation: The voltage of column C retains its precharged level while the voltage of
column C̄ is pulled down by M2 and M4. The Data-Read circuitry detects the small voltage
difference (V(C) > V(C̄)) and amplifies it as a logic ‘1’ data output.


Write ‘0’ operation: The voltage level of column C is forced to logic low by the Data-Write
circuitry. The driver transistor M2 turns off, and the voltage V2 attains a logic-high level while V1
goes low.

Read ‘0’ operation: The voltage of column C̄ retains its precharged level while the voltage
of column C is pulled down by M1 and M3. The Data-Read circuitry detects the small voltage
difference (V(C) < V(C̄)) and amplifies it as a logic ‘0’ data output.

The most important advantage of this circuit topology is that the static power dissipation is very
small; essentially, it is limited by the leakage current of the pMOS transistors. A CMOS
memory cell thus draws current from the power supply only during a switching transition. The
low standby power consumption has certainly been a driving force for the increasing
prominence of high-density CMOS SRAMs. Other advantages of CMOS SRAM cells include
high noise immunity due to larger noise margins, and the ability to operate at lower power
supply voltages than, for example, the resistive-load SRAM cells. The major disadvantages of
CMOS memories historically were larger cell size, the added complexity of the CMOS process,
and the tendency to exhibit "latch-up" phenomena.
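
The read and write sequences of the CMOS SRAM cell can be summarized in a purely logical model. This is a sketch, not a circuit simulation: node voltages are reduced to 0/1, the sense amplifier is reduced to a comparison, and the class and attribute names are invented for illustration.

```python
class SramCell:
    """Logic-level model of the cross-coupled latch with nodes V1 and V2."""

    def __init__(self):
        self.v1, self.v2 = 0, 1  # arbitrary initial state (holds a 0)

    def write(self, bit: int) -> None:
        c, c_bar = 1, 1          # both columns precharged high
        if bit == 1:
            c_bar = 0            # write circuitry forces column C-bar low
        else:
            c = 0                # write circuitry forces column C low
        # Word line high: M3/M4 conduct and the low column flips the latch.
        self.v1, self.v2 = c, c_bar

    def read(self) -> int:
        c, c_bar = 1, 1          # both columns precharged high
        # Word line high: the side holding 0 pulls its column down.
        if self.v1 == 0:
            c = 0                # discharged through M1 and M3
        if self.v2 == 0:
            c_bar = 0            # discharged through M2 and M4
        return 1 if c > c_bar else 0  # sense amplifier decision


cell = SramCell()
cell.write(1)
print(cell.read())  # 1
cell.write(0)
print(cell.read())  # 0
```

Reading does not change `v1` or `v2` in this model, reflecting the non-destructive read of a static cell.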

One transistor DRAM Cell

Figure 10.7: One transistor DRAM cell.


A DRAM memory circuit uses charge storage on a capacitor to represent binary data values.
Dynamic RAM gets its name because the charge stored on the cell capacitor leaks away with time,
so the stored value is dynamic rather than permanent. To prevent loss of data, the voltage on the cell capacitor
must be sampled and restored within a specific time period. This sample-and-restore
operation is called memory refresh. Additional external circuitry is required to ensure that
all memory cells are refreshed periodically. A typical specification is a maximum of 2 ms between refreshes;
the exact interval depends on the device. The one-transistor cell has a single select transistor, a pass transistor that serves
to connect the stored value to a data bus under control of the select line. The select line
simultaneously selects all transistors along the same row, causing data to be placed on the column
lines corresponding to each selected cell.


Three transistor DRAM


The circuit diagram of a typical three-transistor dynamic RAM cell is shown in Fig. 10.37, together with
the column pull-up (precharge) transistors and the column read/write circuitry. Here,
the binary information is stored in the form of charge in the parasitic node capacitance C1. The
storage transistor M2 is turned on or off depending on the charge stored in C1, and the pass
transistors M1 and M3 act as access switches for data read and write operations. The cell has
two separate bit lines for "data read" and "data write," and two separate word lines to control
the access transistors.

The operation of the three-transistor DRAM cell and its peripheral circuitry is based on a two-phase
non-overlapping clock scheme. The precharge events are driven by φ1, whereas the
"read" and "write" events are driven by φ2. Every "data read" and "data write" operation is
preceded by a precharge cycle, which is initiated with the precharge signal PC going high.
During the precharge cycle, the column pull-up transistors are activated, and the corresponding
column capacitances C2 and C3 are charged up to logic-high level.

All "data read" and "data write" operations are performed during the active φ2 phase, i.e., when
PC is low. Figure 10.38 depicts the typical voltage waveforms associated with the 3-T DRAM
cell during a sequence of four consecutive operations: write "1," read "1," write "0," and read
"0." The four precharge cycles shown in Fig. 10.38 are numbered 1, 3, 5, and 7, respectively.
Figure 10.39 illustrates the transient currents charging up the two columns (Din and Dout)
during a precharge cycle. The precharge cycle is effectively completed when both capacitance
voltages reach their steady-state values. Note here that the two column capacitances C2 and C3
are at least one order of magnitude larger than the internal storage capacitance C1.

For the write "1" operation, the inverse data input is at the logic-low level, because the data to
be written onto the DRAM cell is logic "1." Consequently, the "data write" transistor MD is
turned off, and the voltage level on column Din remains high. Now, the "write select" signal WS is
pulled high during the active phase of φ2. As a result, the write access transistor M1 is turned on. With M1
conducting, the charge on C2 is now shared with C1. Since the capacitance C2 is very large compared to C1,
the storage node capacitance C1 attains approximately the same logic-high level as the column capacitance
C2 at the end of the charge-sharing process.
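
The charge-sharing step can be quantified with charge conservation: when two capacitors are connected, the final common voltage is the capacitance-weighted average of the initial voltages. The sketch below assumes an ideal switch (no threshold-voltage drop across M1) and uses illustrative values: C2 = 10 × C1, matching the "order of magnitude" statement, and a 5 V precharge level, which is an assumption.

```python
def charge_share(c_a: float, v_a: float, c_b: float, v_b: float) -> float:
    """Common voltage after two capacitors are connected (charge is conserved)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


C1 = 1.0         # storage capacitance (arbitrary units)
C2 = 10.0 * C1   # column capacitance, one order of magnitude larger
V_HIGH = 5.0     # assumed precharge level in volts

# Write "1" into a cell that previously held 0 V.
v_final = charge_share(C2, V_HIGH, C1, 0.0)
print(round(v_final, 3))  # 4.545
```

Even in the worst case of an initially empty cell, the storage node ends up within about 10 % of the column level, which is why a large C2/C1 ratio is desirable for writing.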


After the write "1" operation is completed, the write access transistor M1 is turned off. With
the storage capacitance C1 charged up to a logic-high level, transistor M2 is now conducting.
In order to read this stored "1," the "read select" signal RS must be pulled high during the active
phase of φ2, following a precharge cycle. As the read access transistor M3 turns on, M2 and
M3 create a conducting path between the "data read" column capacitance C3 and the ground.
The capacitance C3 discharges through M2 and M3, and the falling column voltage is
interpreted by the "data read" circuitry as a stored logic "1."
For the write "0" operation, the inverse data input is at the logic-high level, because the data to
be written onto the DRAM cell is a logic "0." Consequently, the data write transistor is turned
on, and the voltage level on column Din is pulled to logic "0." Now, the "write select" signal
WS is pulled high during the active phase of φ2. As a result, the write access transistor M1 is
turned on. The voltage level on C2, as well as that on the storage node C1, is pulled to logic "0"
through M1 and the data write transistor. Thus, at the end of the write "0" sequence, the storage
capacitance C1 contains a very low charge, and the transistor M2 is turned off since its gate
voltage is approximately equal to zero.
In order to read this stored "0," the "read select" signal RS must be pulled high during the active
phase of φ2, following a precharge cycle. The read access transistor M3 turns on, but since M2
is off, there is no conducting path between the column capacitance C3 and the ground.
Consequently, C3 does not discharge, and the logic-high level on the Dout column is
interpreted by the data read circuitry as a stored "0" bit.
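
The write and read sequences described above can be summarized in a small logic-level model. This is a minimal sketch, not a circuit simulation: the class and method names are hypothetical, timing and leakage are ignored, and each column is reduced to a single high/low flag.

```python
class ThreeTransistorCell:
    """Logic-level sketch of the three-transistor DRAM cell (M1, M2, M3)."""

    def __init__(self):
        self.c1_high = False  # level stored on C1, which is also the gate of M2

    def write(self, data):
        # The inverse data input drives the data write transistor:
        # data = 1 leaves column Din high, data = 0 pulls Din low.
        din_high = bool(data)
        # WS high turns on M1, so C1 takes the level of column Din.
        self.c1_high = din_high

    def read(self):
        dout_high = True      # column Dout (C3) is precharged high
        if self.c1_high:      # M2 conducts; RS turns on M3, so C3 discharges
            dout_high = False
        # A falling column is read as "1"; a column that stays high is read as "0".
        return 0 if dout_high else 1


cell = ThreeTransistorCell()
for bit in (1, 0):
    cell.write(bit)
    print("wrote", bit, "read", cell.read())
```

The model makes the inversion explicit: the Dout column carries the complement of the stored bit, and the data read circuitry restores the original polarity.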


Ferro-electric RAM


Module 4: Faults

FAULTS IN LOGIC CIRCUITS


• A failure is said to have occurred in a logic circuit or system if it deviates from its
specified behavior [1].
• A fault, on the other hand, refers to a physical defect in a circuit. For example, a short
between two signal lines in the circuit or a break in a signal line is a physical defect.
• An error is usually the manifestation of a fault in the circuit; thus a fault may change
the value of a signal in a circuit from 0 (correct) to 1 (erroneous) or vice versa.
However, a fault does not always cause an error; in that case, the fault is considered to be
latent.
• A fault is characterized by its nature, value, extent, and duration [2]. The nature of a
fault can be classified as logical or nonlogical.
• A logical fault causes the logic value at a point in a circuit to become opposite to the
specified value. Nonlogical faults include the rest of the faults such as the malfunction
of the clock signal, power failure, etc.
• The value of a logical fault at a point in the circuit indicates whether the fault creates
fixed or varying erroneous logical values. The extent of a fault specifies whether the
effect of the fault is localized or distributed. A local fault affects only a single
variable, whereas a distributed fault affects more than one. A logical fault, for
example, is a local fault, whereas the malfunction of the clock is a distributed fault.
The duration of a fault refers to whether the fault is permanent or temporary.
Stuck-At Fault
• The most common model used for logical faults is the single stuck-at fault. It assumes
that a fault in a logic gate results in one of its inputs or its output being fixed at either
logic 0 (stuck-at-0) or logic 1 (stuck-at-1).
• Stuck-at-0 and stuck-at-1 faults are often abbreviated to s-a-0 and s-a-1, respectively.
• Let us assume that in Figure 1.1 the A input of the NAND gate is s-a-1.
• The NAND gate perceives the A input as a logic 1 irrespective of the logic value
placed on the input. For example, the output of the NAND gate is 0 for the input
pattern A=0 and B=1 when input A is s-a-1.

• In the absence of the fault, the output will be 1. Thus, AB=01 can be considered as the
test for the A input s-a-1, since there is a difference between the outputs of the fault-free
and faulty gates.
• The single stuck-at fault model is often referred to as the classical fault model and
offers a good representation for the most common types of defects [e.g., shorts and
opens in complementary metal oxide semiconductor (CMOS) technology]. Figure 1.2
illustrates the CMOS realization of the two-input NAND:
• The number 1 in the figure indicates an open, whereas the numbers 2 and 3 identify
the short between the output node and the ground and the short between the output
node and the VDD, respectively.
• A short in a CMOS circuit results if not enough metal is removed by the photolithography,
whereas over-removal of metal results in an open circuit [3]. Fault 1 in Figure 1.2 will
disconnect input A from the gates of transistors T1 and T3. It has been shown that in such a
situation one transistor may conduct and the other remain nonconducting [4]. Thus, the fault
can be represented by a stuck-at value of A; if A is s-a-0, T1 will be ON and T3 OFF, and if
A is s-a-1, T1 will be OFF and T3 ON. Fault 2 forces the output node to be shorted to VDD,
that is, the fault can be considered as an s-a-1 fault. Similarly, fault 3 forces the output
node to be s-a-0.
The stuck-at model is also used to represent multiple faults in circuits. In a multiple stuck-at
fault, it is assumed that more than one signal line in the circuit is stuck at logic 1 or logic 0.
A variation of the multiple fault is the unidirectional fault. A multiple fault is unidirectional
if all of its constituent faults are either s-a-0 or s-a-1 but not both simultaneously. For
example, in Figure 1.2, faults 3 and 4 create stuck-on transistor faults. As a further example,
we consider Figure 1.3, which represents a CMOS implementation of the Boolean function:

Two possible shorts numbered 1 and 2 and two possible opens numbered 3 and 4 are indicated
in the diagram. Short number 1 can be modelled by s-a-1 of input E; open number 3 can be
modelled by s-a-0 of input E, input F, or both. On the other hand, short number 2 and open
number 4 cannot be modelled by any stuck-at fault because they involve a modification of the
network function.

For example, in the presence of short number 2, the network function will change to:

4.1.2 Bridging Faults


Bridging faults form an important class of permanent faults that cannot be modelled as stuck-
at faults. A bridging fault is said to have occurred when two or more signal lines in a circuit
are accidentally connected together. Earlier study of bridging faults concentrated only on the
shorting of signal lines in gate-level circuits. It was shown that the shorting of lines resulted
in wired logic at the connection.
Bridging faults at the gate level have been classified into two types: input bridging and
feedback bridging. An input bridging fault corresponds to the shorting of a certain number of
primary input lines. A feedback bridging fault results if there is a short between an output and
an input line. A feedback bridging fault may cause a circuit to oscillate, or it may convert it
into a sequential circuit.
Bridging faults in a transistor-level circuit may occur between the terminals of a transistor
or between two or more signal lines. Figure 1.5 shows the CMOS logic realization of the
Boolean function:

A short between two lines, as indicated by the dotted line in the diagram will change the
function of the circuit.
The effect of bridging among the terminals of transistors is technology-dependent. For
example, in CMOS circuits, such faults manifest as either stuck-at or stuck-open faults,
depending on the physical location and the value of the bridging resistance.
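
At the gate level, an input bridging fault can be modelled by replacing the shorted lines with their wired-AND or wired-OR value. The sketch below assumes a hypothetical circuit Z = (A AND B) OR C, which is not one of the circuits in the figures, and lists the input patterns that distinguish the bridged circuit from the fault-free one.

```python
from itertools import product

def circuit(a, b, c):
    """Hypothetical fault-free circuit: Z = (A AND B) OR C."""
    return (a & b) | c

def bridged(a, b, c, wired):
    """Inputs A and B shorted; both lines take the wired-AND or wired-OR value."""
    v = (a & b) if wired == "and" else (a | b)
    return circuit(v, v, c)

def bridging_tests(wired):
    """Patterns (A, B, C) for which the bridged and fault-free outputs differ."""
    return [p for p in product((0, 1), repeat=3)
            if circuit(*p) != bridged(*p, wired)]

print("wired-AND:", bridging_tests("and"))   # []
print("wired-OR: ", bridging_tests("or"))    # [(0, 1, 0), (1, 0, 0)]
```

In this example the wired-AND bridge is undetectable because the shorted lines already feed an AND gate, whereas the wired-OR bridge changes the function and is detected by two patterns.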


4.1.3 Delay Faults


The stuck-at fault model cannot represent all manufacturing defects in VLSI circuits. The size
of a defect determines its impact on the circuit's logic function; smaller defects, which arise
from manufacturing process variations, cause partial opens or shorts.
These defects result in the failure of a circuit to meet its timing specifications without any
alteration of the logic function of the circuit. A small defect may delay the transition of a
signal on a line either from 0 to 1, or vice versa. This type of malfunction is modeled by a
delay fault.
Two types of delay faults have been proposed in literature: gate delay fault and path delay
fault.

Gate delay faults have been used to model defects that cause the actual propagation delay of a
faulty gate to exceed its specified worst-case value. For example, if the specified worst case
propagation delay of a gate is x units and the actual delay is x+Δx units, then the gate is said
to have a delay fault of size Δx.
The main deficiency of the gate delay fault model is that it can only be used to model
isolated defects, not distributed defects, for example, several small delay defects. The path
delay fault model can be used to model isolated as well as distributed defects. In this model, a
fault is assumed to have occurred if the propagation delay along a path in the circuit under
test exceeds the specified limit.
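
The difference between the two models can be illustrated with a few numbers. In the sketch below all delay values and the path limit are hypothetical: every gate stays within its own worst-case specification, so no gate delay fault is reported, yet the accumulated delay along the path exceeds the limit.

```python
def gate_delay_fault_size(actual, worst_case):
    """Size of a gate delay fault (the Δx of the text); 0.0 if the gate meets its spec."""
    return max(0.0, actual - worst_case)

def has_path_delay_fault(gate_delays, limit):
    """True if the total propagation delay along the path exceeds the specified limit."""
    return sum(gate_delays) > limit

# Hypothetical four-gate path: each gate is specified at 1.0 ns worst case,
# each gate actually takes 0.95 ns, and the path must settle within 3.6 ns.
worst_case = [1.0, 1.0, 1.0, 1.0]
actual = [0.95, 0.95, 0.95, 0.95]

gate_faults = [gate_delay_fault_size(a, w) for a, w in zip(actual, worst_case)]
print("gate delay fault sizes:", gate_faults)
print("path delay fault:", has_path_delay_fault(actual, limit=3.6))
```

This is the distributed-defect case mentioned above: no single gate is faulty under the gate delay model, but the path delay model flags the path.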


1.2 BREAKS AND TRANSISTORS STUCK-OPEN AND STUCK-ON OR STUCK-OPEN FAULTS IN CMOS
The stuck-at fault model does not accurately represent all defects in CMOS VLSI circuits:
breaks and transistor stuck-on faults, which account for a significant percentage of CMOS
circuit defects, may remain undetected.
1.2.1 Breaks
Breaks or opens in CMOS circuits are caused either by missing conducting material or extra
insulating material. Breaks can be either of the following two types [3]:
1. Intragate breaks;
2. Signal line breaks.
An intragate break occurs internal to a gate. Such a break can disconnect the source, the
drain, or the gate from a transistor, identified by b1, b2, and b3, respectively, in Figure 1.6.
The presence of b3 will have no logical effect on the operation of a circuit, but it will
increase the propagation delay; that is, the break will result in a delay fault. Similarly, the
break at b1 will also produce a delay fault without changing the function of the circuit.
However, the break at b2 will make the p-transistor nonconducting; that is, the transistor can
be assumed to be stuck-open.
An intragate break can also disconnect the p-network, the n-network, or both networks
(b4, b5, and b6 in Figure 1.6) from the circuit. The presence of b4 or b5 will have the same
effect as the output node getting stuck-at-0 or stuck-at-1, respectively. In the presence of b6,
the output voltage may have an intermittent stuck-at-1 or stuck-at-0 value; thus, if the output
node simultaneously drives a p-transistor and an n-transistor, then one of the transistors will
be ON for some unpredictable period of time. Signal line breaks can force the gates of
transistors in static CMOS circuits to float.
As shown in Figure 1.6, such a break can cause the gate of only a p-transistor or only an
n-transistor to float. It is also possible, depending on the position of a break, that the gates
of both transistors may float, in which case one transistor may conduct and the other remain
in a nonconducting state.
Stuck-On and Stuck-Open Faults
A stuck-on transistor fault implies the permanent closing of the path between the source and
the drain of the transistor. Although a stuck-on transistor, in practice, behaves in a similar
way to a stuck-closed transistor, the two differ in their drain-source resistance.
A stuck-on transistor has the same drain-source resistance as the on resistance of a fault-free
transistor, whereas a stuck-closed transistor exhibits a drain-source resistance that is
significantly lower than the normal on-resistance. In other words, in the case of stuck-closed
transistor, the short between the drain and the source is almost perfect, and this is not true for
a stuck-on transistor.
A transistor stuck-on (stuck-closed) fault may be modelled as a bridging fault from the source
to the drain of a transistor.
A stuck-open transistor implies the permanent opening of the connection between the source
and the drain of a transistor. The drain-source resistance of a stuck-open transistor is
significantly higher than the off-resistance of a nonfaulty transistor. If the drain-source
resistance of a faulty transistor is approximately equal to that of a fault-free transistor,
then the transistor is considered to be stuck-off. For all practical purposes, transistor
stuck-off and stuck-open faults are functionally equivalent.

A stuck-open transistor fault, like a feedback bridging fault, can turn a combinational circuit
into a sequential circuit. Figure 1.7 shows a two-input CMOS NOR gate. A stuck-open fault
causes the output to be connected neither to GND nor to VDD. If, for example, transistor T2
is open-circuited, then for input AB=00, the pull-up circuit will not be active and there will be
no change in the output voltage. In fact, the output retains its previous logic state; however,
the length of time the state is retained is determined by the leakage current at the output node.
Table 1.1 shows the truth table for the two-input CMOS NOR gate. The fault-free output is
shown in column Z; the three columns to the right represent the outputs in the presence of the
three stuck-open (s-op) faults. The first, As-op, is caused by a missing gate-input, drain, or
source connection to the pull-down FET T3. The second, Bs-op, is caused by a missing
gate-input, drain, or source connection to the pull-down FET T4. The third, VDDs-op, is caused
by an open anywhere in the series p-channel pull-up connection to VDD. The symbol Zt is
used to indicate that the output state retains the previous logic value.
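
The memory effect of a stuck-open fault can be reproduced with a switch-level model of the NOR gate. The sketch below is an illustration under stated assumptions: the fault labels "A", "B", and "VDD" stand for the As-op, Bs-op, and VDDs-op faults, input A is assumed to drive pull-down T3 and input B pull-down T4, and charge leakage is ignored so a floating output simply keeps its previous value.

```python
def nor_output(a, b, prev, fault=None):
    """Two-input CMOS NOR with an optional stuck-open fault ("A", "B" or "VDD")."""
    # Series p-channel pull-up conducts only for AB=00 and only if it is intact.
    pull_up = a == 0 and b == 0 and fault != "VDD"
    # Parallel n-channel pull-downs: each conducts if its input is 1 and it is intact.
    pull_down = (a == 1 and fault != "A") or (b == 1 and fault != "B")
    if pull_up:
        return 1
    if pull_down:
        return 0
    return prev  # output floats and retains its previous value (Zt)

print("A B | Z As-op Bs-op VDDs-op")
for a, b in ((0, 0), (0, 1), (1, 0), (1, 1)):
    row = [nor_output(a, b, "Zt", f) for f in (None, "A", "B", "VDD")]
    print(a, b, "|", *row)

# Two-pattern test for As-op: AB=00 sets the output to 1, then AB=10 should
# pull it to 0; with the fault present the output stays at 1.
first = nor_output(0, 0, prev=0, fault="A")
second = nor_output(1, 0, prev=first, fault="A")
print("faulty response to 00 -> 10:", first, second)
```

Because the faulty output depends on the previous pattern, detecting a stuck-open fault needs an ordered pair of patterns rather than a single one.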

Temporary Faults
An error is a manifestation of a fault. A temporary fault can result in an intermittent or a
transient error. Transient errors are the major source of failures in VLSI chips. They are
nonrecurring and are not repairable because there is no physical damage to the hardware.
Very deep submicron technology has enabled the packing of millions of transistors on a VLSI
chip by reducing the transistor dimensions. However, the reduction of transistor sizes also
reduces their noise margins. As a result, they become more vulnerable to noise, cross-talk,
etc., which in turn result in transient errors. In addition, small transistors are affected by
terrestrial radiation and suffer temporary malfunction, thereby increasing the rate of transient
errors.
Intermittent faults are recurring faults that reappear on a regular basis. Such faults can occur
due to loose connections, partially defective components, or poor designs. Intermittent faults
occurring due to deteriorating or aging components may eventually become permanent. Some
intermittent faults also occur due to environmental conditions such as temperature, humidity,
vibration, etc. The likelihood of such intermittent faults depends on how well the system is
protected from its physical environment through shielding, filtering, cooling, etc. An
intermittent fault in a circuit causes a malfunction of the circuit only if it is active; if it is
inactive, the circuit operates correctly.
A circuit is said to be in the fault-active state if a fault present in the circuit is active,
and it is said to be in the fault-not-active state if a fault is present but inactive [11].
Because intermittent faults are random, they can be modeled only by using probabilistic methods.


The aim of testing at the gate level is to verify that each logic gate in the circuit is functioning
properly and the interconnections are good. If only a single stuck-at fault is assumed to be
present in the circuit under test, then the problem is to construct a test set that will detect the
fault by utilizing only the inputs and the outputs of the circuit.
One of the main objectives in testing is to minimize the number of test patterns. If the
function of a circuit in the presence of a fault is different from its normal function (i.e., the
circuit is nonredundant), then an n-input combinational circuit can be completely tested by
applying all $2^n$ input combinations to it; however, $2^n$ increases very rapidly as n
increases. For a sequential circuit with n inputs and m flip-flops, the total number of input
combinations necessary to exhaustively test the circuit is $2^n \times 2^m = 2^{m+n}$. If, for
example, n=20 and m=40, there would be $2^{60}$ tests. At a rate of 10,000 tests per second, the
total test time for the circuit would be about 3.65 million years! Fortunately, a complete truth
table exercise of the logic circuit is not necessary; only the input combinations that detect
most of the faults in the circuit are required.
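
The exhaustive-test estimate above can be verified with a few lines of arithmetic. The sketch assumes a year of 365.25 days; the variable names are illustrative.

```python
n_inputs, n_flipflops = 20, 40
tests_per_second = 10_000

total_tests = 2 ** (n_inputs + n_flipflops)    # 2^n * 2^m = 2^(m+n)
seconds = total_tests / tests_per_second
years = seconds / (365.25 * 24 * 3600)         # assumed year length

print(f"{total_tests:.3e} tests")
print(f"about {years / 1e6:.2f} million years")
```

The result agrees with the figure of roughly 3.65 million years quoted in the text.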
The efficiency of a test set is measured by a figure of merit called fault coverage. The term
fault coverage refers to the percentage of the possible single stuck-at faults that a test set will
detect. The computation time needed to generate tests for combinational circuits is
proportional to the square of the number of gates in the circuit. For example, the test
generation time for a 100,000-gate circuit is 100 times that for a 10,000-gate circuit. The task
is even more complicated for sequential circuits because the number of internal states is an
exponential function of the number of memory elements. Thus, sequential circuits have to be
designed so that the fault detection in such circuits becomes easier.
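
Both quantities in this paragraph are simple to compute. The following sketch uses hypothetical fault counts for the coverage figure and takes the quadratic growth of test generation time directly from the text; the function names are illustrative.

```python
def fault_coverage(detected_faults, total_faults):
    """Percentage of the modelled single stuck-at faults detected by a test set."""
    return 100.0 * detected_faults / total_faults

def relative_generation_time(gates, reference_gates):
    """Test generation time relative to a reference circuit, assuming time ~ gates^2."""
    return (gates / reference_gates) ** 2

# Hypothetical test set that detects 950 of 1000 modelled faults.
print(f"fault coverage: {fault_coverage(950, 1000):.1f}%")
# Ten times as many gates gives one hundred times the generation time.
print("time ratio:", relative_generation_time(100_000, 10_000))
```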

TEST GENERATION FOR COMBINATIONAL LOGIC CIRCUITS


Several distinct test generation methods have been developed over the years for
combinational circuits. These methods are based on the assumptions that a circuit is
nonredundant and only a single stuck-at fault is present at any time.
2.1.1 Truth Table and Fault Matrix
The most straightforward method for generating tests for a particular fault is to compare the
responses of the fault-free and the faulty circuit to all possible input combinations. Any input
combination for which the output responses do not match is a test for the given fault.
Let the inputs to a combinational circuit be x1, x2, ..., xn and let Z be the output of the
circuit. Let Zα be the output of the circuit in the presence of the fault α. The test generation
method starts with the construction of the truth tables of Z and Zα. Then for each row of the
truth table, Z⊕Zα is computed; if the result is 1, the input combination corresponding to the
row is a test for the fault.
As an example, let us consider the circuit shown in Figure 2.1a and assume that tests for
faults α s-a-0 and β s-a-1 have to be derived. The truth table for the circuit is shown in
Figure 2.1b, where column Z denotes the fault-free output, and Zα and Zβ correspond to the
circuit output in the presence of faults α s-a-0 and β s-a-1, respectively. The tests for the
faults are indicated as 1's in the columns corresponding to Z⊕Zα and Z⊕Zβ. Thus, the test for
α s-a-0 is x1x2x3=110, and the test for β s-a-1 is x1x2x3=001. For all other input
combinations, the output of the fault-free circuit is the same as the output in the presence of
the fault; consequently, they are not tests for α s-a-0 and β s-a-1.

The minimum number of tests required to detect a set of faults in a combinational circuit can
be obtained from a fault matrix. The columns in a fault matrix list the single faults to be
tested, and the rows indicate the tests. A fault matrix for the circuit of Figure 2.2a is shown in
Figure 2.2b. A 1 at the intersection of the ith row and the jth column indicates that the fault
corresponding to the jth column can be detected by the ith test. As can be seen from Figure
2.2b, a fault matrix is identical to a prime implicant chart used in logic minimization. Thus,
the problem of finding the minimum number of tests is the same as the problem of finding the
minimum number of prime implicants (i.e., rows) so that every column has a 1 in at least one
row. In Figure 2.2b, rows 110, 101, and 111 are equivalent (i.e., each test detects the same
faults as the other two); hence, 101 and 111 can be omitted. Furthermore, row 000 covers row
100 and row 001 covers row 011; thus, rows 100 and 011 can be omitted. Elimination of
rows 100, 101, 011, and 111 yields the minimal test set as shown in Figure 2.2c. These four
tests detect all of the six faults under consideration.
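
Finding a minimal test set from a fault matrix is a covering problem, and for small matrices it can be solved by exhaustive search. The matrix below is hypothetical: it is not the matrix of Figure 2.2b, although it was constructed to have the same equivalence and covering relations between rows that the text describes, and the fault names f1 to f6 are placeholders.

```python
from itertools import combinations

# Hypothetical fault matrix: test pattern -> set of faults it detects.
FAULT_MATRIX = {
    "000": {"f3", "f4"},
    "001": {"f4", "f5"},
    "010": {"f6"},
    "011": {"f5"},
    "100": {"f3"},
    "101": {"f1", "f2"},
    "110": {"f1", "f2"},
    "111": {"f1", "f2"},
}

def minimal_test_set(matrix):
    """Smallest set of rows such that every fault column is covered at least once."""
    all_faults = set().union(*matrix.values())
    rows = sorted(matrix)
    for size in range(1, len(rows) + 1):
        for subset in combinations(rows, size):
            if set().union(*(matrix[r] for r in subset)) == all_faults:
                return subset
    return tuple(rows)

best = minimal_test_set(FAULT_MATRIX)
print(len(best), "tests:", best)
```

The search returns a four-test solution. Since rows 101, 110, and 111 are equivalent in this matrix, any one of them can be used, so the minimal set is not unique.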

Path Sensitization
The basic principle of the path sensitization method is to choose some path from the origin of
the fault to the circuit output.
A path is sensitized if the inputs to the gates along the path are assigned values such that the
effect of the fault can be propagated to the output.
To illustrate, let us consider the circuit shown in Figure 2.3 and assume that line α is s-a-1.
To test for α, both G3 and C must be set at 1. In addition, D and G6 must be set at 1 so that
G7=1 if the fault is absent. To propagate the fault from G7 to the circuit output f via G8
requires the output of G4 to be 1. This is because if G4=0, the output f will be forced to be 1,
independent of the value of gate G7. The process of propagating the effect of the fault from
its original location to the circuit output is known as the forward trace.
The next phase of the method is the backward trace, in which the necessary signal values at
the gate outputs specified in the forward trace phase are established. For example, to set G3 at
1, A must be set at 0, which also sets G4=1. In order for G6 to be at 1, B must be set at 0;
note that G6 cannot be set at 1 by making C=0 because this is inconsistent with the
assignment of C in the forward trace phase. Therefore, the test ABCD=0011 detects the fault
α s-a-1, since the output f will be 0 for the fault-free circuit and 1 in the presence of the
fault.

2.1.3 D-Algorithm
The D-algorithm is guaranteed to find a test if one exists for detecting a fault. It uses a
cubical algebra for automatic generation of tests. Three types of cubes are considered:
1. Singular cube;
2. Propagation D-cube;
3. Primitive D-cube of a fault.
Singular cube. A singular cube corresponds to a prime implicant of a function. Figure 2.6
shows the singular cubes for the two-input NOR function; x's or blanks are used to denote that
the position may be either 0 or 1.

Propagation D-cube. D-cubes represent the input/output behaviour of the good and the faulty
circuit. The symbol D may assume 0 or 1. D takes on the value opposite to D̄ (i.e., if D=1,
D̄=0 and if D=0, D̄=1). The definitions of D and D̄ could be interchanged, but they should be
consistent throughout the circuit. Thus, all D's in a circuit imply the same value (0 or 1) and
all D̄'s will have the opposite value.
The propagation D-cubes of a gate are those that cause the output of the gate to depend only
on one or more of its specified inputs. Thus, a fault on a specified input is propagated to the
output. The propagation D-cubes for a two-input NAND gate are:
The propagation D-cubes 1DD̄ and D1D̄ indicate that if one of the inputs of the NAND gate is
1, the output is the complement of the other. DDD̄ propagates multiple input changes through
the NAND gate. Propagation D-cubes of a gate can be constructed by intersecting its singular
cubes with opposite output values. The intersection rules are as follows:

For example, the propagation D-cube of a three-input NOR gate can be formed as shown in
Figure 2.7.
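
One way to see why 1DD̄, D1D̄, and DDD̄ are propagation D-cubes of the NAND gate is to treat every signal as a pair of values, one for the good circuit and one for the faulty circuit. The sketch below is an illustration, not the cube-intersection procedure itself; it writes D̄ as `D'` and adopts the convention that D means 1 in the good circuit and 0 in the faulty circuit.

```python
# Each signal value is the pair (good-circuit value, faulty-circuit value).
VALUES = {"0": (0, 0), "1": (1, 1), "D": (1, 0), "D'": (0, 1)}
NAMES = {pair: name for name, pair in VALUES.items()}

def nand(a, b):
    """NAND evaluated separately on the good and the faulty component."""
    (a_good, a_bad), (b_good, b_bad) = VALUES[a], VALUES[b]
    return NAMES[(1 - (a_good & b_good), 1 - (a_bad & b_bad))]

for a, b in (("1", "D"), ("D", "1"), ("D", "D"), ("0", "D")):
    print(a, b, "->", nand(a, b))
```

The first three input pairs produce `D'`, matching the three cubes quoted above, while a 0 on the other input forces the output to 1 and blocks the fault effect.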

Primitive D-cube of a fault. The primitive D-cube of a fault (pdcf) is used to specify the
existence of a given fault. It consists of an input pattern which shows the effect of a fault on
the output of the gate. For example, if the output of the NOR gate shown in Figure 2.6 is
s-a-0, the corresponding pdcf is:

Here, D is interpreted as being 1 if the circuit is fault-free and is 0 if the fault is present. The
pdcf ’s for the NOR gate output s-a-1 are:

The pdcf's corresponding to an output s-a-0 fault in a gate can be obtained by intersecting
each singular cube having output 1 in the fault-free gate with each singular cube having
output 0 in the faulty gate. Similarly, the pdcf's corresponding to an output s-a-1 fault can be
obtained by intersecting each singular cube having output 0 in the fault-free gate with each
singular cube having output 1 in the faulty gate. The intersection rules are similar to those
used for propagation D-cubes.
As an example, let us consider a three-input NAND gate with input lines a, b, and c and
output line f. The singular cubes for the fault-free NAND gate are:

Assuming the input line b is s-a-1, the singular cubes for the faulty NAND gate are:

Therefore, the primitive D-cube of the b s-a-1 fault is 101D. The pdcf ’s for all single stuck-at
faults for the three-input NAND gate are:
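
The pdcf derived above for b s-a-1 can be cross-checked by brute force: compare the fault-free and faulty three-input NAND gate on every input pattern and record the patterns on which they differ. This is a sketch under stated assumptions. It lists fully specified input patterns rather than cubes with x entries, writes D̄ as `D'`, and uses D for a line that is 1 in the fault-free gate and 0 in the faulty gate.

```python
from itertools import product

def nand3(a, b, c):
    return 1 - (a & b & c)

def faulty_nand3(a, b, c, line, value):
    """Three-input NAND with `line` ("a", "b", "c" or output "f") stuck at `value`."""
    inputs = {"a": a, "b": b, "c": c}
    if line in inputs:
        inputs[line] = value
    out = nand3(inputs["a"], inputs["b"], inputs["c"])
    return value if line == "f" else out

def pdcf_patterns(line, value):
    """Input patterns that expose the fault, with the output written as D or D'."""
    cubes = []
    for a, b, c in product((0, 1), repeat=3):
        good = nand3(a, b, c)
        bad = faulty_nand3(a, b, c, line, value)
        if good != bad:
            cubes.append(f"{a}{b}{c}" + ("D" if good == 1 else "D'"))
    return cubes

print("b s-a-1:", pdcf_patterns("b", 1))   # ['101D']
print("f s-a-1:", pdcf_patterns("f", 1))   # ["111D'"]
```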

Let us next consider how the various cubes described are used in the D-algorithm method to
generate a test for a given fault. The test generation process consists of three steps:
• Step 1. Select a pdcf for the given fault.
• Step 2. Drive the D (or D̄) from the output of the gate under test to an output of the
circuit by successively intersecting the current test cube with the propagation D-cubes
of successive gates. A test cube represents the signal values at various lines in the
circuit during each step of the test generation process. The intersection of a test cube
with the propagation D-cube of a successor gate results in a test cube.
• Step 3. Justify the internal line values by driving back toward the inputs of the circuit,
assigning input values to the gates so that a consistent set of circuit input values may
be obtained.
Let us demonstrate the application of the D-algorithm by deriving a test for detecting the
α s-a-1 fault in Figure 2.8a. The test generation process is explained in Figure 2.8b. As
can be seen in Figure 2.8b, the consistency operation at step 4 terminates unsuccessfully
because the output of G3 has to be set to 1. This can be done only by making input B=0;
however, B has already been assigned 1 in step 1. A similar problem will arise if D is
propagated to the output via G3 instead of G2. The only way the consistency problem can be
resolved is if the D̄ output of G1 is propagated to the output of the circuit via both G2 and
G3 as shown in Figure 2.8c. No consistency operation is needed in this case, and the test for
the given fault is AB=11. This test also detects the output of G2 s-a-0, the output of G3
s-a-0, and the output of G4 s-a-1.
As a further example of the application of the D-algorithm, let us derive a test for the
s-a-0 fault at the output of gate G2 in the circuit shown in Figure 2.9a. The test derivation is
as shown in Figure 2.9b. The test is ABC=011.

2.1.4 PODEM
PODEM is an enumeration algorithm in which all input patterns are examined as tests for a
given fault. The search for a test continues until the search space is exhausted or a test pattern
is found. If no test pattern is found, the fault is considered to be undetectable. In D-algorithm,
line justification, i.e., line values assigned during the backtracking toward the inputs of the
circuit, allows assignments on any internal lines. In PODEM, backtracking is allowed on
primary inputs only, thus reducing the number of backtracks. PODEM consists of six steps:
Step 1. Assume all primary inputs are x, which are unassigned. Determine an initial objective;
an objective is defined by a logic (0 or 1) value referred to as objective logic level. The initial
objective is to select a logic value so that the fault to be detected is sensitized.
Step 2. Select a primary input and assign a logic value that has good likelihood of satisfying
the initial objective.
Step 3. Propagate forward the value at the selected primary input in conjunction with X's at
the rest of the primary inputs by using the five-valued logic 0, 1, X, D, and D̄.
Step 4. If a D or a D̄ is propagated to the output of the circuit, the current assignment is a
test; exit. Otherwise, assign the complement of the previous value to the primary input and
determine whether it is a test.
Step 5. Assign a 0 or a 1 to one more primary input, and go to step 4 to check whether the
resulting combination is a test.
Step 6. Continue with steps 4 and 5 until a test is found, or the fault is found to be
undetectable.
The main differences between PODEM and D-algorithm are as follows:
In PODEM, backtracking is allowed only on primary inputs, not on any internal line. PODEM
does not require the consistency check operation.
Let us illustrate the application of PODEM by deriving a test for fault l s-a-1 in the circuit
shown in Figure 2.10. Since a test for fault l s-a-1 is to be derived, the initial objective is to
set l to 0. Either B or C can be assigned 1 to satisfy the objective. Assuming we choose B to be
at 1, the result of the forward propagation is:

The next objective is to propagate D̄ (or D) through n to output F. This can be done by
assigning a proper logic value to input C. Suppose we set C to 1; this results in the following:

This will block the propagation of the fault effect because n is forced to 0. However, if C is
assigned 0, the fault effect is propagated through n:

The final objective is to propagate D̄ (or D) to output F. This can be done by assigning the
proper logic value, i.e., 0, to input A.

Thus, ABC=010 is the test for l s-a-1.
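
The search over primary inputs can be sketched in a few lines. The code below is a simplified illustration in the spirit of PODEM, not the full algorithm: it omits the objective and backtrace heuristics, and it uses a hypothetical NOR-only circuit (l = NOR(B, C), n = NOR(C, l), F = NOR(A, n)) chosen to be consistent with the narrative above, since Figure 2.10 is not reproduced here. Unassigned inputs are represented by `None`, which plays the role of X.

```python
def nor(a, b):
    """Three-valued NOR; None stands for the unassigned value X."""
    if a == 1 or b == 1:
        return 0
    if a == 0 and b == 0:
        return 1
    return None

def simulate(A, B, C, l_stuck=None):
    """Hypothetical circuit: l = NOR(B, C), n = NOR(C, l), F = NOR(A, n)."""
    l = nor(B, C)
    if l_stuck is not None:
        l = l_stuck                  # inject the stuck-at fault on line l
    n = nor(C, l)
    return nor(A, n)

def podem_like(order=("B", "C", "A")):
    """Assign primary inputs one at a time, backtracking only on primary inputs."""
    assign = {"A": None, "B": None, "C": None}

    def search(i):
        good = simulate(**assign)
        bad = simulate(**assign, l_stuck=1)
        if good is not None and bad is not None:
            return good != bad       # output determined in both circuits
        if i == len(order):
            return False
        for value in (1, 0):         # try 1 first, then its complement
            assign[order[i]] = value
            if search(i + 1):
                return True
        assign[order[i]] = None      # undo and backtrack
        return False

    return dict(assign) if search(0) else None

print(podem_like())   # {'A': 0, 'B': 1, 'C': 0}
```

On this circuit the search first tries B=1 and C=1, finds that no value of A exposes the fault, backtracks to C=0, and then succeeds with A=0, which mirrors the sequence of decisions in the example.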

2.1.5 FAN
The FAN algorithm is in principle similar to PODEM but is made more efficient by reducing
the number of backtracks. Several terms have to be defined before discussing the test
generation process used by FAN. A bound line is a gate output that is part of a reconvergent
fan-out loop. A line that is not bound is considered to be free. A headline is a free line that
drives a gate that is part of a reconvergent fan-out loop. In Figure 2.12, for example, nodes H,
I, and J are bound lines, A through H are free lines, and G, H, and F are headlines. Because
by definition headlines are free lines, they can be considered as primary input lines and can
always be assigned values arbitrarily. Thus, during a backtrack operation if a headline is
reached, the backtrack stops; it is not necessary to reach a primary input to complete the
backtrack.

FAN uses a technique called multiple backtrack to reduce the number of backtracks that
must be made during the search process. For example, in Figure 2.13, if the objective is to set
H at logic 1, PODEM would backtrack along one of the paths to the primary inputs. Suppose
the backtrack is done via the path H−E−C, which will set E to 1. Because E is at 1, C will be set
to 0. However, a 0 at C sets F to 1, G to 0, and H to 0. Because this assignment fails to
achieve the desired objective, the backtrack process is performed via another path, for
example, H−G−F−C, and the desired goal can be achieved. Thus, in PODEM, several
backtracks may be necessary before the requirement of setting up a particular logic value on a
line is satisfied. FAN avoids this waste of computation time by backtracking along multiple
paths to the fan-out point. For example, if multiple backtrack is done via both H−E−C and
H−G−F−C, the value at C can be set so that the value at H is justified.

In PODEM, a logic value assigned to a primary input in order to achieve one objective may
in turn result in the failure of satisfying another objective, thereby forcing a backtrack.
2.1.6 Delay Fault Detection
A delay fault in a combinational logic circuit can be detected only by applying a sequence of
two test patterns. The first pattern, known as an initialization pattern, sets up the initial
condition in a circuit so that the fault (slow-to-rise or slow-to-fall signal) at the input or
output of a gate affects an output of the circuit. The second pattern, known as a transition or
propagation pattern, propagates the effect of the activated transition to a primary output of
the circuit.
To illustrate, let us consider a delay (slow-to-rise) fault at the input A of the circuit shown in
Figure 2.15. The test for slow-to-rise fault consists of the initialization pattern ABC=001
followed by the transition pattern ABC=101. Similarly, the two-pattern test for a slow-to-fall
delay fault at input A is ABC=101, 001. Note that a slow-to-rise fault and a slow-to-fall
fault correspond to a transient stuck-at-0 and transient stuck-at-1 fault, respectively.
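
The two-pattern idea can be sketched in a few lines of code. Figure 2.15 is not reproduced here, so the function `circuit` below is a hypothetical stand-in (F = (A AND C) OR B), and the fault model is a simplifying assumption: when input A makes the slow transition, the output is sampled while A still holds its old value.

```python
def circuit(a, b, c):
    """Hypothetical circuit used only for illustration: F = (A AND C) OR B."""
    return (a & c) | b


def sampled_output(v1, v2, fault=None):
    """Output sampled after applying v1 (initialization) then v2 (transition).

    fault is None, "slow-to-rise" or "slow-to-fall", and refers to input A.
    A slow transition is modelled by A keeping its old value at sample time.
    """
    a1 = v1[0]
    a2, b2, c2 = v2
    if fault == "slow-to-rise" and (a1, a2) == (0, 1):
        a2 = 0  # behaves like a transient stuck-at-0
    elif fault == "slow-to-fall" and (a1, a2) == (1, 0):
        a2 = 1  # behaves like a transient stuck-at-1
    return circuit(a2, b2, c2)


def detects(v1, v2, fault):
    """True if the pattern pair exposes the delay fault at the output."""
    return sampled_output(v1, v2) != sampled_output(v1, v2, fault)


if __name__ == "__main__":
    print(detects((0, 0, 1), (1, 0, 1), "slow-to-rise"))
    print(detects((1, 0, 1), (0, 0, 1), "slow-to-fall"))
```

With this assumed circuit, the pairs (001, 101) and (101, 001) detect the slow-to-rise and slow-to-fall faults at A, respectively, whereas a pair such as (011, 111) does not, because B=1 masks the transition at the output.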

To identify the presence of a delay fault in a combinational circuit, the hardware model
shown in Figure 2.16 is frequently used in literature. The initialization pattern is loaded into
input latches, followed by the transition pattern and output pattern. The output pattern is
loaded into output latches, and a delay fault is confirmed if the output value differs from the
expected value.

Delay tests can be classified into two groups: nonrobust and robust [6]. A delay test is
nonrobust if it can detect a fault in the path under consideration provided there are no delay
faults along other paths. For example, the input vector pair (111, 101) can detect the slow-to-rise
fault at e in Figure 2.17a as long as the path b−d−f does not have a delay fault. However,
if there is a slow-to-fall fault at d, the output of the circuit will be correct for the input pair,
thereby invalidating the test for the delay fault at e. Therefore, the test (111, 101) is nonrobust.
A delay test is considered to be robust if it detects the fault in a path independent of delay
faults that may exist in other paths of the circuit. For example, let us assume a slow-to-fall
delay fault at d in the path a−c−d−f of the circuit shown in Figure 2.17b. The input vector pair
(01, 11) constitutes a robust test for the delay fault because the output of any gate on the other
paths does not change when the second vector of the input pair is applied to the circuit. Thus,
any possible delay fault in these paths will not affect the circuit output. Robust tests do not
exist for many paths in large circuits [7, 8].


Module 5:

Test generation for sequential circuits is challenging because the behaviour of a sequential
circuit depends both on the present and on the past input values.
The mathematical model of a synchronous sequential circuit is usually referred to as a
sequential machine or a finite state machine.
Henceforth, a synchronous sequential circuit will be referred to as a sequential circuit.

Figure 2.18 shows the general model of a synchronous sequential circuit. As can be seen
from the diagram, sequential circuits are basically combinational circuits with memory to
remember past inputs. The combinational part of the circuit receives two sets of input signals:
primary (coming from the external environment) and secondary (coming from the memory
elements). The particular combination of secondary input variables at a given time is called
the present state of the circuit; the secondary input variables are also known as state
variables. If there are m secondary input variables in a sequential circuit, then the circuit can
be in any one of 2^m different present states. The outputs of the combinational part of the
circuit are divided into two sets. The primary outputs are available to control operations in the
circuit environment, whereas the secondary outputs are used to specify the next state to be
assumed by the memory. It takes an entire sequence of inputs to detect many of the possible
faults in a sequential circuit.
Sequential circuits can be tested by checking that such a circuit functions as specified by its
state table [9, 10]. This is an exhaustive approach and is practical only for small sequential
circuits.
The approach may be summarized as follows: Given the state table of a sequential circuit,
find an input/output sequence pair (X, Z) such that the response of the circuit to X will be Z if
and only if the circuit is operating correctly. The application of this input sequence X and the
observation of the response, to see if it is Z, is called a checking experiment; the sequence
pair (X, Z) is referred to as a checking sequence.
The derivation of checking sequence for a sequential circuit is based on the following
assumptions:
1. The circuit is fully specified and deterministic. In a deterministic circuit the next state
is determined uniquely by the present state and the present input.
2. The circuit is strongly connected; that is, for every pair of states qi and qj of the
circuit, there exists an input sequence that takes the circuit from qi to qj.
3. The circuit in the presence of faults has no more states than those listed in its
specification.
In other words, the presence of a fault will not increase the number of states.
To design checking experiments, it is necessary to know the initial state of the circuit which
is determined by a homing sequence or a distinguishing sequence. An input sequence is said
to be a homing sequence for a sequential circuit if the circuit’s response to the sequence is
always sufficient to determine uniquely its final state. For example, consider the state table
of a circuit shown in Figure 2.19. It has a homing sequence 101, for, as indicated in Figure
2.20, each of the output sequences that might result from the application of 101 is associated
with just one final state. A homing sequence need not always leave a machine in the same
final state; it is only necessary that the final state can be identified from the output sequence.

A distinguishing sequence is an input sequence that, when applied to a sequential circuit, will
produce a different output sequence for each choice of initial state. For example, 101 is also a
distinguishing sequence for the circuit shown in Figure 2.19. As shown in Figure 2.20, the
output sequence that the machine produces in response to 101 uniquely specifies its initial
state. Every distinguishing sequence is also a homing sequence because the knowledge of the
initial state and the input sequence is always sufficient to determine uniquely the final state as
well. On the other hand, not every homing sequence is a distinguishing sequence. For
example, the circuit specified by the state table of Figure 2.21a has a homing sequence 010.
As shown in Figure 2.21b, the output sequence produced in response to 010 uniquely
specifies the final state of the circuit but cannot distinguish between the initial states C and D.
Every reduced sequential circuit possesses a homing sequence, whereas only a limited
number of sequential circuits have distinguishing sequences.
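
The two definitions can be checked mechanically. The sketch below assumes a hypothetical four-state Mealy table (it is not the table of Figure 2.19 or Figure 2.21, which are not reproduced here); the names `TABLE`, `respond`, `is_homing` and `is_distinguishing` are introduced only for illustration.

```python
# Hypothetical state table: state -> {input: (next_state, output)}.
TABLE = {
    "A": {"0": ("B", "0"), "1": ("D", "0")},
    "B": {"0": ("D", "0"), "1": ("A", "0")},
    "C": {"0": ("C", "1"), "1": ("B", "1")},
    "D": {"0": ("C", "0"), "1": ("C", "1")},
}


def respond(table, state, seq):
    """Apply seq starting from state; return (output sequence, final state)."""
    outputs = []
    for x in seq:
        state, z = table[state][x]
        outputs.append(z)
    return "".join(outputs), state


def is_homing(table, seq):
    """True if every observed output sequence identifies one final state."""
    final_by_output = {}
    for start in table:
        out, final = respond(table, start, seq)
        if final_by_output.setdefault(out, final) != final:
            return False
    return True


def is_distinguishing(table, seq):
    """True if every initial state produces a different output sequence."""
    outputs = [respond(table, start, seq)[0] for start in table]
    return len(set(outputs)) == len(outputs)


if __name__ == "__main__":
    for start in TABLE:
        print(start, respond(TABLE, start, "101"))
```

For this assumed table, the input sequence 101 gives the four responses 001, 000, 101 and 111 from initial states A, B, C and D, so it is distinguishing and therefore also homing. The single input 0 is neither, because the output 0 is compatible with more than one final state.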
At the start of an experiment, a circuit can be in any of its n states. In such a case, the initial
uncertainty regarding the state of the circuit is the set that contains all the states of the circuit.


A collection of states of the circuit that is known to contain the present state is referred to as
the uncertainty.
The uncertainty of a circuit is thus any subset of the states of the machine. For example, the
circuit described by the state table of Figure 2.19 can initially be in any of its four states; hence,
the initial uncertainty is (ABCD). If an input 1 is applied to the circuit, the successor uncertainty will be (AD) or
(BC) depending on whether the output is 0 or 1, respectively. The uncertainties (C) and (DBC) are
the 0-successors of (ABCD). A successor tree, which is defined for a specified circuit and a
given initial uncertainty, is a structure that displays graphically the xi-successor uncertainties
for every possible input sequence xi.
A collection of uncertainties is referred to as an uncertainty vector, the individual
uncertainties contained in the vector are called the components of the vector. An uncertainty
vector, the components of which contain a single state each, is said to be a trivial uncertainty
vector. An uncertainty vector, the components of which contain either single states or
identical repeated states, is said to be a homogeneous uncertainty vector. For example, the
vectors (AA)(B)(C) and (A)(B)(A)(C) are homogeneous and trivial, respectively.
A homing sequence is obtained from the homing tree; a homing tree is a successor tree in
which a node becomes terminal if one of the following conditions occurs:
1. The node is associated with an uncertainty vector, the nonhomogeneous components of
which are associated with the same node at a preceding level.
2. The node is associated with a trivial or a homogeneous vector.
The path from the initial uncertainty to a node in which the vector is trivial or homogeneous
defines a homing sequence.
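
The homing tree can be explored level by level with a breadth-first search. The sketch below is a minimal version under stated assumptions: the state table `TABLE` is hypothetical (Figure 2.19 is not reproduced here), and each uncertainty is stored as a set, so a homogeneous component such as (AA) collapses to (A) and the trivial and homogeneous cases are detected by the same test. A vector that has already appeared is not expanded again, which plays the role of the first terminal condition.

```python
from collections import deque

# Hypothetical state table: state -> {input: (next_state, output)}.
TABLE = {
    "A": {"0": ("B", "0"), "1": ("D", "0")},
    "B": {"0": ("D", "0"), "1": ("A", "0")},
    "C": {"0": ("C", "1"), "1": ("B", "1")},
    "D": {"0": ("C", "0"), "1": ("C", "1")},
}


def successor_vector(table, vector, x):
    """Apply input x to every component and split each one by output."""
    result = set()
    for component in vector:
        by_output = {}
        for state in component:
            nxt, out = table[state][x]
            by_output.setdefault(out, set()).add(nxt)
        result.update(frozenset(states) for states in by_output.values())
    return frozenset(result)


def shortest_homing_sequence(table, inputs="01"):
    """Breadth-first search of the homing tree; None if no sequence exists."""
    start = frozenset([frozenset(table)])  # initial uncertainty: all states
    queue = deque([("", start)])
    seen = {start}
    while queue:
        seq, vector = queue.popleft()
        if all(len(component) == 1 for component in vector):
            return seq  # every component is a single state
        for x in inputs:
            nxt = successor_vector(table, vector, x)
            if nxt not in seen:  # repeated vectors are terminal
                seen.add(nxt)
                queue.append((seq + x, nxt))
    return None


if __name__ == "__main__":
    print(shortest_homing_sequence(TABLE))
```

For the assumed table, the 1-successors of (ABCD) are (AD) and (BC), and the search returns 11 as a shortest homing sequence.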


A distinguishing tree is a successor tree in which a node becomes terminal if one of the
following conditions occurs:
1. The node is associated with an uncertainty vector, the nonhomogeneous components of
which are associated with the same node at a preceding level.
2. The node is associated with an uncertainty vector containing a homogeneous nontrivial
component.
3. The node is associated with a trivial uncertainty vector.

During the design of checking experiments, it is often necessary to take the circuit into a
predetermined state, after the homing sequence has been applied. This is done with the help
of a transfer sequence, which is the shortest input sequence that takes a machine from state Si
to state Sj. The procedure is an adaptive one, because the transfer sequence is determined by
the response to the homing sequence. As an example, let us derive a transfer sequence that
will take the circuit described by the state table of Figure 2.19 from state B to state C. To
accomplish this, we assume that the circuit is in state B. We form the transfer tree as shown in Figure 2.24; it can be seen from
the successor tree that the shortest transfer sequence that will take the machine from state B to
state C is 00.
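
Finding a shortest transfer sequence amounts to a breadth-first search over the state table. The sketch below assumes a hypothetical table `TABLE` (not the one in Figure 2.19, which is not reproduced here); outputs are ignored because only the next-state entries matter for a transfer.

```python
from collections import deque

# Hypothetical state table: state -> {input: (next_state, output)}.
TABLE = {
    "A": {"0": ("B", "0"), "1": ("D", "0")},
    "B": {"0": ("D", "0"), "1": ("A", "0")},
    "C": {"0": ("C", "1"), "1": ("B", "1")},
    "D": {"0": ("C", "0"), "1": ("C", "1")},
}


def transfer_sequence(table, src, dst, inputs="01"):
    """Return a shortest input sequence taking src to dst, or None."""
    queue = deque([(src, "")])
    visited = {src}
    while queue:
        state, seq = queue.popleft()
        if state == dst:
            return seq
        for x in inputs:
            nxt = table[state][x][0]  # next state only; output is ignored
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, seq + x))
    return None


if __name__ == "__main__":
    print(transfer_sequence(TABLE, "B", "C"))
```

For the assumed table the search returns 00 for the transfer from B to C (B goes to D on the first 0 and D goes to C on the second).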


The phrase design for testability refers to how a circuit is either designed or modified so that
the testing of the circuit is simplified. Several techniques have been developed over the years
for improving the testability of logic circuits. These fall into two categories:
ad hoc and structured.
AD HOC TECHNIQUES
A simple way to enhance a circuit's testability is to add more test points and control points. A test
point is used to observe the response at a node, while a control point is used to force an internal
node to a desired value. For instance, in Figure 3.1a, fault α s-a-0 is undetectable at the output.

By incorporating a test point at node α as shown in Figure 3.1b, the input combination 010
or 011 can be applied to detect the fault.
The circuit in Figure 3.2a shows the importance of adding a control point to control the EX-
NOR gate's input and operation. If the output is always 1, it's impossible to determine if the
gate is functioning correctly. To test for an s-a-1 fault at the EX-NOR gate's output, the
control point is set at logic 0 and an input combination producing logic 1 at the outputs is
applied.

To enhance a circuit's testability, multiplexers can be added to increase the number of internal
nodes that can be controlled or observed from external points. For instance, a 2-to-1
multiplexer can be used to detect faults such as α s-a-0. When the select (test mode) input of the
multiplexer is at logic 1, the circuit's output is transferred to the multiplexer's output. Conversely,
when the select input is at logic 0 and input combination 010 or 011 is applied, the state of node α
is observed at the multiplexer's output.


Tristate drivers can also be used to access internal nodes, with a test mode signal putting the
driver into the high-impedance state. In this state, the OR gate input can be set to logic 0 or 1
from an external point; when the driver is enabled, the same external point serves as a test point.

Frequently, flip-flops, counters, shift registers, and other memory elements assume
unpredictable states when power is applied, and they must be set to known states before
testing can begin. Ideally, all memory elements should be reset from external points (Figure
3.5a). Alternatively, a power-up reset may be added to provide internal initialization (Figure
3.5b).


A long counter chain presents another test problem. For example, the counter chain in Figure
3.6 requires thousands of clock pulses to go through all the states. A long counter chain can
be tested by breaking it into smaller chains with jumpers, which can be removed during
testing. A tristate driver can function as a jumper, with its input connected to the clock and its
output connected to the clock input of the second counter chain. Disabling the control input
disconnects the clock from the second chain, allowing it to be tested separately from the first chain.
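
The saving from breaking the chain is easy to quantify. The short calculation below assumes a 16-bit chain split into two 8-bit chains; these widths are illustrative only, since the size of the counter in Figure 3.6 is not given here.

```python
def pulses_to_cycle(bits):
    """Clock pulses needed to step a binary counter through all its states."""
    return 2 ** bits


# Assumed widths for illustration: one 16-bit chain versus two 8-bit chains.
full_chain = pulses_to_cycle(16)
split_chains = pulses_to_cycle(8) + pulses_to_cycle(8)  # tested one after the other

if __name__ == "__main__":
    print(full_chain, split_chains)
```

Under these assumptions the unbroken chain needs 65536 pulses, while the two halves tested one after the other need 512 in total.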

A feedback loop is also difficult to test because it hides the source of the fault. The source
can be located by breaking the loop and bringing both lines to external points that are shorted
during normal operation. When not shorted, the separate lines provide a control point and a
test point. An alternative way of breaking a feedback loop is to add to the feedback path a
gate that can be interrupted by a signal from a control point (Figure 3.7).

On-circuit clock oscillators should be disconnected during test and replaced with an external
clock. The external clock can be single-stepped to check the logic values at various nodes in
the circuit during the fault diagnosis phase. Figure 3.8 shows how the onboard clock can be
replaced by an external one.


3.2 SCAN-PATH TECHNIQUE FOR TESTABLE SEQUENTIAL CIRCUIT DESIGN
Testing sequential circuits is challenging because it is difficult to set and to check the states of
the memory elements. These difficulties can be overcome by modifying the design of the general
sequential circuit so that it has the following properties:
1. The circuit can easily be set to any desired internal state.
2. It is easy to find a sequence of input patterns such that the resulting output sequence
will indicate the internal state of the circuit.
The concept involves adding an extra input c to the memory excitation logic to control the
circuit's mode. When c=0, the circuit operates normally, but when c=1, it enters a shift register
mode in which the memory elements are connected to form a shift register. This is achieved by
inserting a double-throw switch, i.e., a 2-to-1 multiplexer, in each input lead of every memory
element, allowing the circuit to operate in either normal or shift register mode. Figure 3.9 shows
a sequential circuit using D flip-flops; the circuit is modified as shown in Figure 3.10 and as
indicated in Figure 3.11. One additional input connection to the modified circuit is required to
supply the signal c to control all the switches.


In the shift register mode, the first flip-flop can be set directly from the primary inputs (scan-in
inputs) and the output of the last flip-flop can be directly monitored on the primary output
(scan-out output). This means that the circuit can be set to any desired state via the scan-in
inputs and that the internal state can be determined via the scan-out output.
The procedure for testing the circuit is as follows:
1. Set c = 1 to switch the circuit to shift register mode.
2. Check operation as a shift register by using scan-in inputs, scan-out output, and the
clock.
3. Set the initial state of the shift register.
4. Set c = 0 to return to normal mode.
5. Apply test input pattern to the combinational logic.
6. Set c = 1 to return to shift register mode.
7. Shift out the final state while setting the starting state for the next test.
8. Go to step 3.
The testing time is significantly impacted by setting the state, which requires a number of
clock pulses equal to the length of the shift register. To reduce this time, several short shift
registers can be formed instead of a single long one, reducing the time needed to set or read
the state. The number of shift registers can be increased based on the available input and
output connections.
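
The test procedure above can be sketched as a small behavioural model. Everything named below is an assumption made for illustration: a three-flip-flop circuit, the `next_state` function standing in for its combinational logic, and the class `ScanCircuit`. The model covers steps 3 to 7 of the procedure: shift a starting state in, apply one normal-mode clock, then shift the captured state out while the next starting state is shifted in.

```python
def next_state(q, x):
    """Hypothetical combinational logic for a three-flip-flop circuit."""
    q0, q1, q2 = q
    return [q1 ^ x, q0 & q2, q0 | x]


class ScanCircuit:
    """D flip-flops with a 2-to-1 multiplexer on each input, selected by c."""

    def __init__(self, n):
        self.q = [0] * n

    def clock(self, c, scan_in=0, x=0):
        """Apply one clock pulse; return the scan-out value seen before it."""
        scan_out = self.q[-1]
        if c == 1:  # shift register mode
            self.q = [scan_in] + self.q[:-1]
        else:       # normal mode
            self.q = next_state(self.q, x)
        return scan_out


def scan_test(circuit, start_state, x, next_start):
    """Run one scan test and return the state captured in normal mode."""
    # Step 3: shift in the starting state (the last flip-flop's bit goes first).
    for bit in reversed(start_state):
        circuit.clock(c=1, scan_in=bit)
    # Steps 4 and 5: one normal-mode clock with primary input x.
    circuit.clock(c=0, x=x)
    # Steps 6 and 7: shift out the result while loading the next start state.
    observed = [circuit.clock(c=1, scan_in=bit) for bit in reversed(next_start)]
    return list(reversed(observed))


if __name__ == "__main__":
    dut = ScanCircuit(3)
    print(scan_test(dut, [1, 0, 1], x=0, next_start=[1, 1, 0]), dut.q)
```

Each test costs two passes of n shift clocks plus one normal clock in this model, which is why splitting one long register into several short ones reduces the test time.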


LEVEL-SENSITIVE SCAN DESIGN


One of the best known and the most widely practiced methods for synthesizing testable
sequential circuits is IBM’s level-sensitive scan design (LSSD) [2–5]. The level-sensitive
aspect of the method means that a sequential circuit is designed so that the steady-state
response to any input state change is independent of the component and wire delays within
the circuit.
Clocked Hazard-Free Latches
In LSSD, all internal storage is implemented in hazard-free polarity-hold latches. The
polarity-hold latch has two input signals as shown in Figure 3.12a. The latch cannot change
state if C=0. If C is set to 1, the internal state of the latch takes the value of the excitation
input D. A flow table for this sequential circuit, along with an excitation table and a logic
implementation, is shown in Figure 3.12b, 3.12c, and 3.12d, respectively.

The clock signal C will normally occur (change from 0 to 1) after the data signal D has
become stable at either 1 or 0. The output of the latch is set to the new value of the data signal
at the time the clock signal occurs. The correct changing of the latch does not depend on the
rise or fall time of the clock signal, but only on the clock signal being 1 for a period equal to or
greater than the time required for the data signal to propagate through the latch and stabilize.
A shift register latch (SRL) can be formed by adding a clocked input to the polarity-hold
latch L1 and including a second latch L2 to act as intermediate storage during shifting (Figure
3.13). As long as the clock signals A and B are both 0, the L1 latch operates exactly like a
polarity-hold latch. Terminal I is the scan-in input for the SRL and +L2 is the output. The
logic implementation of the SRL is shown in Figure 3.14. When the latch is operating as a
shift register, data from the preceding stage are gated into the polarity-hold switch via I,
through a change of the clock A from 0 to 1. After A has changed back to 0, clock B gates the
data in the latch L1 into the output latch L2. Clearly, A and B can never both be 1 at the same
time if the SRL is to operate properly.
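
The two-phase shifting described above can be modelled behaviourally. The sketch below is a simplification under stated assumptions: the class `SRL` and the helper `shift` are hypothetical names, latches are treated as ideal (no delays or hazards), and each SRL's I input is taken from the +L2 output of the preceding stage.

```python
class SRL:
    """Behavioural shift register latch: L1 (inputs D/C and I/A), then L2 (clock B)."""

    def __init__(self):
        self.l1 = 0
        self.l2 = 0

    def pulse(self, d=0, c=0, i=0, a=0, b=0):
        """Apply one clock phase; at most one of A and B may be 1."""
        if a and b:
            raise ValueError("clocks A and B must never be 1 at the same time")
        if c:
            self.l1 = d  # system clock C gates the data input D into L1
        if a:
            self.l1 = i  # scan clock A gates the scan-in input I into L1
        if b:
            self.l2 = self.l1  # clock B copies L1 into L2


def shift(chain, scan_in):
    """Shift the chain by one position using an A pulse followed by a B pulse."""
    sources = [scan_in] + [srl.l2 for srl in chain[:-1]]
    for srl, src in zip(chain, sources):
        srl.pulse(i=src, a=1)
    for srl in chain:
        srl.pulse(b=1)
    return chain[-1].l2


if __name__ == "__main__":
    chain = [SRL() for _ in range(3)]
    for bit in (1, 0, 1):
        shift(chain, bit)
    print([srl.l2 for srl in chain])
```

Because L2 holds its value while A is 1, every L1 reads the old contents of the preceding stage, which is the reason the second latch is needed for shifting.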

The SRLs can be interconnected to form a shift register as shown in Figure 3.15. The input I
and the output +L2 are strung together in a loop, and the clocks A and B are connected in
parallel. A specific set of design rules has been defined to provide level-sensitive logic
subsystems with a scannable design that would aid testing:
Rule 1. Use only hazard-free polarity-hold latches as memory elements.
Rule 2. The latches must be controlled by nonoverlapping clocks.
Rule 3. Clock signals must be applied via primary inputs.
Rule 4. Clocks may not feed the data inputs to memory elements either directly or
through combinational logic.
Rule 5. Test sequences must be applied via a primary input.
3.3.2 Double-Latch and Single-Latch LSSD
A sequential logic circuit that is level-sensitive and also has the scan capability is called a
Level Sensitive Scan Design (LSSD). Figure 3.16 depicts a general structure for an LSSD
system, known as a double-latch design in which all system outputs are taken from the L2
latch. In this configuration, each SRL operates in a master–slave mode. Data transfer occurs
under system clock and scan clock B during normal operation and under scan clock A and
scan clock B during scan-path operation.
Both latches are therefore required during system operation.


In the single-latch configuration, the combinational logic is partitioned into two disjoint sets,
Comb1 and Comb2 (Figure 3.17). The system clocks used for SRLs (shift register latches) in
Comb1 and Comb2 are denoted by Clock 1 and Clock 2, respectively; they are
nonoverlapping. The outputs of the SRLs in Comb1 are fed back as secondary variable inputs
to Comb2, and vice versa. This configuration uses the output of latch L1 as the system output;
the L2 latch is used only for shifting. In other words, the L2 latches are redundant and
represent the overhead for testability.


RANDOM ACCESS SCAN TECHNIQUE


In random access scan, each flip-flop in a logic circuit is selected individually by an address
for control and observation of its state. The basic memory element in a random-access scan-
in/scan-out network is an addressable latch. The circuit diagram of an addressable latch is
shown in Figure 3.18. A latch is selected by X–Y address signals, the state of which can then
be controlled and observed through scan-in/scan-out lines. When a latch is selected and its
scan clock goes from 0 to 1, the scan data input is transferred through the circuit to the scan
data output, where the inverted value of the scan data can be observed. The input on the
DATA line is transferred to the latch output Q during the negative transition (1 to 0) of the
clock. The scan data out lines from all latches are then AND-gated to produce the chip scan-out
signal: the scan-out line of a latch remains at logic 1 unless the latch is selected by the X–Y
signals, as in a random access memory.

A tree of AND gates is used to combine all scan-out signals. The clear inputs of all latches are tied
together to form a master reset signal. The preset inputs of all latches receive the same scan-in
signal gated by the scan clock; however, only the latch accessed by the X–Y address is
affected.
The test procedure of a sequential circuit with random access scan-in/scan-out feature is as
follows:
1. Set test input to all test points.
2. Apply the master reset signal to initialize all memory elements.
3. Set scan-in address and data and then apply the scan clock.
4. Repeat step 3 until all internal test inputs are scanned in.
5. Clock once for normal operation.
6. Check states of the output points.
7. Read the scan-out states of all memory elements by applying appropriate X–Y signals.

The random access scan-in/scan-out technique has several advantages:


1. The observability and controllability of all system latches are allowed.
2. Any point in a combinational circuit can be observed with one additional gate and one
address per observation point.
3. A memory array in a logic circuit can be tested through a scan-in/scan-out circuit. The
scan address inputs are applied directly to the memory array. The data input and the
write-enable input of the array receive the scan data and the scan clock, respectively. The
output of the memory array is AND-gated into the scan-out tree to be observed.

The technique also has a few disadvantages:


1. Extra logic in the form of two address gates for each memory element, plus the
address decoders and output AND trees, results in 3–4 gates of overhead per memory
element.
2. Scan control, data, and address pins add up to 10–20 extra pins. By using a serially
loadable address counter, the number of pins can be reduced to around 6.
3. Some constraints are imposed on the logic design such as the exclusion of
asynchronous latch operation.

PARTIAL SCAN
In full scan, all flip-flops in a circuit are connected into one or more shift registers; thus, the
states of a circuit can be controlled and observed via the primary inputs and outputs,
respectively. In partial scan, only a subset of the circuit flip-flops is included in the scan chain
in order to reduce the overhead associated with full scan design. Figure 3.21 shows a
structure of partial scan design.

This design has two separate clocks: a system clock and a scan clock. The scan clock controls only
the scan flip-flops. Note that the scan clock is derived by gating the system clock with the
scan-enable signal; no external clock is necessary. During the normal mode of operation, i.e.,
when the scan-enable signal is at logic 0, both scan and nonscan flip-flops update their states
when the system clock is applied. In the scan mode of operation, only the state of the shift
register (constructed from the scan flip-flops) is shifted one bit with the application of the
scan clock; the nonscan flip-flops do not change their states.
The disadvantage of two-clock partial scan is that the routing of two separate clocks with
small skews is very difficult to achieve. Also, the use of a separate scan clock does not allow
the testing of the circuit at its normal operating speed.
A partial scan scheme that uses the system clock as the scan clock is shown in Figure 3.22.
Both scan and nonscan flip-flops move to their next states when the system clock is applied.
A test sequence is derived by shifting data into the scan flip-flops. This data together with
the contents of the nonscan flip-flops constitute the starting state of the test sequence. The other
patterns in the sequence are obtained by single-bit shifting of the contents of the scan flip-flops,
which form part of the required circuit states. The remaining bits of the states, i.e., the
contents of the nonscan flip-flops, are determined by the functional logic. Note that this form of partial
scan scheme allows only a limited number of valid next states to be reached from the starting
state of the test sequence. This may limit the fault coverage obtained by using the technique.

The selection of flip-flops to be included in the partial scan is done by heuristic methods. It has
been shown that the fault coverage in a circuit can be significantly increased by including
13–23% of the flip-flops in the partial scan.



VLSI Design AND Testing Module 2 and 3 question bank
1. Explain with a waveform the propagation delay, Rise times and Fall times of a CMOS
inverter
2. Explain the timing optimization at different logic levels
3. Explain the linear delay model. Compare the logical effort of the following gates with the
help of schematic diagrams: i) 3-input NAND gate ii) 3-input NOR gate
4. Write a note on Elmore delay calculations

5. Estimate tpdf and tpdr for the 3-input NAND gate shown in
the figure if the output is loaded with h identical NAND gates.
6. Estimate Elmore delay tpd for a unit inverter driving m identical unit inverters
7. Compute the Elmore delay for Vout in the 2nd order RC system
8. Estimate Elmore delay tpd for an inverter of width w driving m identical unit inverters.
9. Explain briefly linear delay model.
10. If a unit transistor has R = 10 kΩ and C = 0.1 fF in a 65 nm process, compute the delay,
in picoseconds, of the inverter in Figure Q1C with a fanout of h = 4.

11. Define logical effort. Explain the calculation of logical effort for inverter, NAND and NOR gates.
12. Derive an expression for delay in multistage networks.
13. Estimate the minimum delay, in τ, to compute F = AB + CD using the NAND and NOR
gates. Each input can present a maximum of 20 λ of transistor width. The output must
drive a load equivalent to 100 λ of transistor width. Choose transistor sizes to achieve
this delay.
14. Estimate the minimum delay of the path from A to B in Figure 4.31 and choose transistor
sizes to achieve this delay. The initial NAND2 gate may present a load of 8 λ of
transistor width on the input and the output load is equivalent to 45 λ of transistor width.

15. With neat schematic diagram explain the operation of 1T DRAM cell
16. With neat schematic diagram explain the operation of Full CMOS static RAM cell
17. With a neat schematic explain the working of CMOS sense amplifier
18. With neat schematic diagram explain the Data Programming and erasing methods in a
Flash memory cell.
19. Explain the programming, erase and read operation in NOR Flash memory cell
20. Explain the programming, erase and read operation in NAND Flash memory cell
21. With neat Schematic diagram explain the operations of Three Transistor DRAM cell
22. With neat memory structure diagram explain the step sensing scheme of FRAM
23. Explain NAND based ROM Circuit with an example
24. Explain briefly address decoders
25. Explain NOR based ROM Circuit with an example
26. Briefly explain the classification of memory

Module 4 and 5
1 Explain breaks and transistor stuck-ON fault in CMOS circuits with examples.

2 What is temporary fault? Draw state diagram of Markov model and explain

3 Using D-algorithm for Fig.Q1c, find test pattern for line 6, s-a-1.

4 Derive a test for the s-a-0 fault at the output of gate G2 in the circuit shown in Figure

5 Explain various types of bridging faults with examples.


6 How Boolean difference concept is used in VLSI testing and define the rules of
Boolean difference.
7 Find the fault matrix and minimal test set for the circuit under test shown in fig
Q2c for faults α s-a-0 and β s-a-1
8 For state table of sequential machine shown below:
Find :
(i) Response of machine for 010.
(ii) Homing tree
(iii) Distinguishing tree
9 What are clocked hazard-free latches in LSSD? Explain.
10 Demonstrate random access scan technique with a neat diagram
11 For the circuit shown in figure a find the Boolean difference with respect to x3
For the circuit shown in figure b find the Boolean difference with respect to x3

12 To enhance a circuit's testability what are the different AD HOC techniques used?
Explain
13 Demonstrate double-latch design with a neat diagram
14 Illustrate path sensitization
15 What is controllability and observability in testing?
16 Illustrate stuck-at-0 and stuck-at-1 fault
17 With a neat diagram Explain partial scan testing technique.
18 Write a note on delay faults
19 Explain PODEM with an example
20 Write a note on FAN