
A PUBLICATION OF MENTOR GRAPHICS — VOLUME 11, ISSUE 2 — JUNE 2015

WHAT’S ON THE HORIZON?

Verifying Airborne Electronics Hardware—How to maintain requirements traceability when using assertions...page 6
DO-254 Testing of High Speed FPGA Interfaces—Ensure maximum flexibility for “robustness testing”...page 10
Formal and Assertion-Based Verification of MBIST MCPs—Validating MCPs in MBIST controllers...page 15
Starting Formal Right from Formal Test Planning—Employing the ideas used in simulation-based verification...page 20
Reuse MATLAB Functions and Simulink Models in UVM Environments—Improving verification reliability...page 24
Intelligent Testbench Automation with UVM and Questa—For application-specific instruction-set processors...page 31
Unit Testing Your Way to a Reliable Testbench—Avoid iterating between pesky testbench bugs and design bugs...page 35
Hardware Emulation: Three Decades of Evolution, Part 2—The expansion of emulation...page 40
Accelerating RTL Simulation Techniques—Identifying performance-sapping coding styles...page 43
Emulation Based Approach to ISO 26262 Compliant Processors Design—An Automotive example...page 47
Resolving the Limitations of a Traditional VIP for PHY Verification—Let’s finish on time!...page 54

Welcome to our once-again-super-sized DAC summer edition!
By Tom Fitzpatrick, Editor and Verification Technologist

Now that it’s May here in New England, I’m sure you’ll all be happy to know that, even after our record-setting snowfall this past winter, there is, in fact, no more snow on my lawn. When this statement became a fact is still a source of amazement, but I’m just glad the snow is gone. Of course, that means it’s time to get the lawn in shape, and I’m faced with the unfortunate situation that my 17-year-old son is too busy (with homework and standardized tests) to be of much help these days. Still, I’d rather ride the tractor than shovel snow any day, so I can’t really complain. Speaking of greener grass, the lawn is lush on this issue’s side of the fence.

We start out this issue with two articles about verifying airborne electronic hardware under the DO-254 standard. The first article comes from our friends at eInfochips, who focus on an assertion-based approach in “Verifying Airborne Electronics Hardware: Automating the Capture of Assertion Verification Results for DO-254.” In particular, this article deals with how to maintain requirements traceability—a key part of the DO-254 process—when using assertions. The next article, from Verisense, shows how to apply a UVM-based testbench with a hardware tester validation platform to achieve “DO-254 Testing of High Speed FPGA Interfaces.” As you’ll see, the key is to reuse UVM-based simulation stimuli and results while relying on the hardware to ensure maximum flexibility for “robustness testing,” another critical element of DO-254 compliance.

We continue with the theme of assertion-based verification in “Formal and Assertion-Based Verification of MBIST MCPs,” a joint article from FishTail Design Automation and Mentor Graphics. Here we see how to apply formal verification to confirm the validity of multi-cycle paths in MBIST controllers. We will also learn how generating assertions for failing paths can help identify issues and improve the robustness of what is usually a very complex operation.

We follow this up with an article from our friends at Oski who show us the importance of “Starting Formal Right from Formal Test Planning.” Many of us have talked for years about the importance of verification planning, and rightly so. This article does a great job of extending the ideas typically used in a simulation-based verification plan and applying them specifically to the job of planning formal verification. If formal verification is in your future, you’ll find this article very interesting and useful.

In our next article, our friends at MathWorks show us how to “Reuse MATLAB® Functions and Simulink Models in UVM Environments with Automatic SystemVerilog DPI Component Generation.” They also explain how their HDL Verifier facilitates the co-simulation of MATLAB models in UVM environments. The trick is that HDL Verifier™ can now generate SystemVerilog DPI components directly, which are then integrated into UVM components. In effect, it lets you embed the specification (in the form of the algorithmic MATLAB model) directly into your testbench, improving the reliability of verification.

Staying in the UVM arena for a bit, we have an article from our friends at Codasip® who share their thoughts on “Intelligent Testbench Automation with UVM and Questa®.” This involves automatically generating the HDL representation of an application-specific instruction-set processor (ASIP) along with a UVM-based verification environment—including the ASIP reference model. A genetic algorithm is used to evolve the stimulus and optimize coverage.

We’re all familiar with the concept of unit testing, but usually we only think about it when applied to the DUT. Our next article, from Neil Johnson of XtremeEDA and Mark Glasser of NVIDIA, shows how to expand on this by “Unit Testing Your Way to a Reliable Testbench.” As with design-unit testing, the idea of verifying each unit of the eventual testbench by itself—rather than waiting until everything gets thrown together and iterating between “testbench bugs” and “design bugs” (which isn’t always clear)—makes a lot of sense.

Next, we bring back Dr. Lauro Rizzatti with part two of his three-part article on “Hardware Emulation: Three Decades of Evolution.” This section covers the expansion of emulation beyond verifying graphics and processors and the growth of FPGA-based emulators.

We close out our partner articles with “Accelerating RTL Simulation Techniques” from our friends at Marvell Semiconductor. I think you’ll find this an extremely practical “how-to” article to identify some subtle performance-sapping coding styles that unfortunately are all too common. It’s always important to remember that the fastest simulator in the world can be slowed down by poorly-written code, so please pay attention.

Verification Horizons is a publication of Mentor Graphics Corporation, all rights reserved.

Editor: Tom Fitzpatrick
Program Manager: Rebecca Granquist

Wilsonville Worldwide Headquarters
8005 SW Boeckman Rd.
Wilsonville, OR 97070-7777
Phone: 503-685-7000

To subscribe visit: www.mentor.com/horizons
To view our blog visit: VERIFICATIONHORIZONSBLOG.COM

We wrap up this DAC edition of Verification Horizons with two articles from my colleagues here at Mentor Graphics. In “Emulation Based Approach to ISO 26262 Compliant Processors Design,” the author shows us how to apply fault-injection to processor verification in automotive applications—a critical element in any notion of a self-driving car (at least any such car I’d consider buying). And last but not least, we have “Resolving the Limitations of a Traditional VIP for PHY Verification” in which we see how we can assemble a protocol-specific kit of verification components and stimulus to ensure that the PHY verification is self-contained and won’t take away from your system verification when it’s part of your SoC.
As always, if you’re at DAC this year, please stop by the
Verification Academy booth (#2408) to say “hi.” It’s always
gratifying to hear from so many of you about how helpful
you find both the Verification Academy and this newsletter.
I’m proud to be able to help bring both of them to you.

Respectfully submitted,
Tom Fitzpatrick
Editor, Verification Horizons

Table of Contents June 2015 Issue

Page 6
Verifying Airborne Electronics Hardware: Automating
the Capture of Assertion Verification Results for DO-254
by Vipul Patel, ASIC Engineer, eInfochips

Page 10
DO-254 Testing of High Speed FPGA Interfaces
by Nir Weintroub, CEO, and Sani Jabsheh, Verisense

Page 15
Formal and Assertion-Based Verification of MBIST MCPs
by Ajay Daga, CEO, FishTail Design Automation, and Benoit Nadeau-Dostie,
Chief Architect, Mentor Graphics

Page 20
Starting Formal Right from Formal Test Planning
by Jin Zhang, Senior Director of Marketing & GM Asia Pacific, and Vigyan Singhal, President & CEO,
OSKI Technology

Page 24
Reuse MATLAB® Functions and Simulink® Models
in UVM Environments with Automatic SystemVerilog DPI
Component Generation
by Tao Jia, HDL Verifier Development Lead, and Jack Erickson, HDL Product Marketing Manager, MathWorks

Page 31
Intelligent Testbench Automation with UVM and Questa
by Marcela Simkova, Principal Consultant, and Neil Hand, VP of Marketing
and Business Development, Codasip Ltd.

Page 35
Unit Testing Your Way to a Reliable Testbench
by Neil Johnson, Principal Consultant, XtremeEDA, and Mark Glasser, Principal Engineer,
Verification Architect, NVIDIA

Page 40
Hardware Emulation: Three Decades of Evolution – Part II
by Dr. Lauro Rizzatti, Verification Consultant, Rizzatti LLC

Page 43
Accelerating RTL Simulation Techniques
by Lior Grinzaig, Verification Engineer, Marvell Semiconductor Ltd.

Page 47
Emulation Based Approach to
ISO 26262 Compliant Processors Design
by David Kaushinsky, Application Engineer, Mentor Graphics

Page 54
Resolving the Limitations of a Traditional VIP
for PHY Verification
by Amit Tanwar and Manoj Manu, Questa VIP Engineering, Mentor Graphics

Verifying Airborne Electronics Hardware: Automating the Capture
of Assertion Verification Results for DO-254
by Vipul Patel, ASIC Engineer, eInfochips

INTRODUCTION
This article focuses on Assertion-Based Verification (ABV) methodology and discusses automation techniques for capturing verification results to accelerate the verification process. It also showcases how requirement traceability is maintained with the use of assertions to comply with the mandatory DO-254 standards.

WHAT IS REQUIREMENT TRACEABILITY AND ITS IMPORTANCE FOR DO-254 COMPLIANCE?
When developing airborne electronics hardware (AEH), documenting the evidence linking verification results to requirements is an important compliance criterion for satisfying regulatory guidance for certification. Section 6.2 of the RTCA/DO-254 “Design Assurance Guidance for Airborne Electronic Hardware” defines verification processes and the subsequent objectives that are required for safety assurance and correctness, as given below:

• Section 6.2.1(1) – Evidence is provided that the hardware implementation meets the requirements.
• Section 6.2.1(2) – Traceability is established between hardware requirements, the implementation, and the verification procedures and results.
• Section 6.2.1(3) – Acceptance test criteria are identified, can be implemented, and are consistent with the hardware design assurance levels of the hardware functions.

Assertion-Based Verification (ABV) methodology is increasingly used to handle the complexity of present-day AEH designs in the avionics industry. Requirements tracing using ABV methodology can be accomplished by associating targeted functionality from requirements to assertion execution results. The entire process would include simulation log, assertion waveform, and assertion coverage through several intermediate steps as shown below in Figure 1.

However, the effort spent in manual tracing can be quite significant and prone to errors too, though it is a good target for automation. In this article, we describe a script that can automatically capture the results of executing assertions written in SystemVerilog and aligned to the requirements. The simulation results are stored by Mentor’s QuestaSim tool in simulation log files as well as in waveform (.wlf) files; the latter allows issues to be debugged quickly, in visual form, in a waveform window. The automation script identifies the SystemVerilog assertions from the procedure implementation file. It interprets user-defined arguments, such as the assertion checking pass instances and the width and height resolution of the assertion waveform snapshot (image file), and then proceeds to prepare the assertion results. The automation script can also create an assertion snapshot image file and a smaller, targeted version of the assertion wave file along with the required result to pose as evidence for traceability. Compared to the manual capture of a large number of assertion results, the automated script saves a significant amount of time and even reduces the instances of operator errors.

IMPORTANCE OF ASSERTIONS AND THEIR USABILITY AND TRACEABILITY IN AEH DEVICES
Assertion-Based Verification (ABV) is a methodology where the designer and verification engineer can utilize “assertions” to verify design functionality and improve the overall verification productivity.

An assertion is actually a statement about a specific functional characteristic or property that is expected to hold for the design being verified. Generally assertions are developed using SystemVerilog assert, property, and sequence constructs. For example, consider that the device has the following requirement:

A “valid” should be followed by a “ready” signal occurring no more than three clocks after valid is “asserted”.

The following assertion can be developed to verify the above requirement:

// Property Specification
property p_ready_valid;
  @(posedge clk) disable iff (reset)
  valid |-> ##[1:3] ready;
endproperty

// Assertion Directive
a_ready_valid : assert property (p_ready_valid)
  $display("@%0dns Assertion Passed", $time);
else
  $display("@%0dns Assertion Failed", $time);

a_ready_valid is the assertion statement, which uses “assert” to verify the property. The name given before the “assert” keyword is the assertion label. This assertion uses the p_ready_valid property declared before it.

p_ready_valid is a property that checks the valid signal on each positive edge of the clock. Whenever valid is detected as asserted, it will verify the requirement by checking the ready signal for 1 to 3 clock cycles. If the assertion finds the ready signal asserted within three clock cycles, it reports the PASS message in the log file and dumps the assertion checking states in the .wlf file to be viewed in the waveform window. If the assertion doesn’t find the “ready” signal asserted within three clock cycles, it reports the FAIL message in the log file and dumps the assertion checking states in the .wlf file to be viewed in the waveform window later.

The requirement can be verified by declaring the assertion using the “assert”, “property” and “sequence” constructs of the language. A sequence is a statement of the signal assertions and de-assertions that must be followed, depending on the requirement. A property can use multiple sequences, and the “assert” can use the property to verify the requirement and send the report message.

Questa® Tool Outputs (SV Assertions, Log Messages, Waveform Dumps) and Their Usage for Traceability
The Questa tool outputs the assertion procedure results in a simulation log file, a waveform file, and a UCDB file to display the coverage of the assertion. The log message contains the final PASS or FAIL status of the assertion along with the user-defined message. The waveform file contains the assertion state (ACTIVE, INACTIVE, PASS, FAIL, etc.) along with information on the signals used in the assertion. The waveform (.wlf) file is used to load the assertion in the waveform window and helps the verification engineer further analyze and debug device behavior. The UCDB file contains the assertion coverage information to be viewed in the coverage report. The coverage report will have detailed information including the assertion hit count, pass count, etc. These output files can be used to provide traceability of the requirement.
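The sample above uses only a property; the prose also mentions the “sequence” construct. As a minimal illustrative sketch (ours, not from the original article), the same valid/ready requirement could be layered with an explicit sequence. The module and label names here are hypothetical:

// Hypothetical rewrite of the valid/ready check using an explicit
// sequence construct; clk, reset, valid and ready are brought in
// as ports so the checker is self-contained.
module valid_ready_checker (input logic clk, reset, valid, ready);

  // Sequence: ready must assert 1 to 3 clocks later
  sequence s_ready_within_3;
    ##[1:3] ready;
  endsequence

  // Property: reuses the sequence as its consequent
  property p_ready_valid_seq;
    @(posedge clk) disable iff (reset)
    valid |-> s_ready_within_3;
  endproperty

  // Labeled assertion directive, as described in the text
  a_ready_valid_seq : assert property (p_ready_valid_seq)
    else $display("@%0dns Assertion Failed", $time);

endmodule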
AUTO CAPTURE OF ASSERTION RESULTS
Developing and implementing an automated script is a step-by-step process.

Step 1: Develop Script
First of all, a generic assertion capture automation script is created to parse an assertion input file. One can use a scripting language like Perl, shell, Tcl, etc. to develop the script.

Step 2: Prepare Assertion Input File
Create an assertion input file and include information in it like the clock period, reset period, assertion module name, assertion ID, assertion implementation file name (.sv file), simulation log file name, simulation wlf file name (in case the log and wlf files have different names), assertion pass count, etc.

What is assertion pass count? An assertion triggers multiple times during simulation. Hence, users can obtain the snapshot for a specific passed/failed assertion instance using the assertion pass count, as illustrated below in Figure 5.

Step 3: Parsing Input File by Script
The script will use the assertion ID and search for the assertion label (refer to the sample assertion code to understand assertion labels) in the assertion implementation file. This assertion label will be subsequently used for further processing. There is a provision to input single or multiple assertion IDs while developing the script too. In case the assertion ID does not match the assertion label in the implementation file, the user will be notified with an error message while running the script. The implementation of the assertions can then be checked likewise using the script.

Once assertion labels are identified, the simulation log file is checked for pass or fail messages depending on the assertion pass count. The script will grep the simulation time from the message. This time is used for preparing the width and height resolution of the assertion waveform snapshot (image file).

The script will automatically find an appropriate height for the waveform snapshot depending on the signals used in the assertion implementation, and even find an appropriate width from the simulation time grepped from the simulation log file. If the user requires extra width or height, then it can be provided by specifying the same in the assertion input file.

By using information such as the specific width, height, signals, etc., the script will be able to capture an assertion waveform snapshot (image file).

The next step involves the generation of a small and targeted version of the .wlf file of assertion results. The .wlf file is the input, and the assertion start/end times will be used to create the small and targeted version of the assertion .wlf file.

HOW TO APPLY A SCRIPT?
Once regression is completed, an assertion input file is created. The input file will have the required information as specified in Step 2: Prepare Assertion Input File.
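The article does not show the input file format, but as a purely hypothetical sketch, such a file might collect the fields listed in Step 2 as simple key/value pairs (every field name below is invented for illustration):

# Hypothetical assertion input file (field names invented for illustration)
clock_period         = 10ns
reset_period         = 100ns
assertion_module     = valid_ready_checker
assertion_id         = a_ready_valid
assertion_sv_file    = assertions.sv
sim_log_file         = sim.log
sim_wlf_file         = vsim.wlf      # only if it differs from the log name
assertion_pass_count = 3             # capture the third pass of the assertion
extra_width          = 200           # optional snapshot sizing overrides
extra_height         = 50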

On running the script, it will manipulate the input attributes and capture results in an image file (.bmp) and a waveform file (.wlf). The resultant snapshots will be used for creating a verification result document and providing traceability to requirements, as shown in Figure 1: Steps in Requirements Tracing for DO-254 using Assertions.

ADVANTAGES
• Automates the tedious process of capturing assertion results
• Saves time and manual effort
• Drastically reduces the probability of errors; for example, errors in identification of active and pass points in the result, hiding signals, inaccurate size assessment, etc.
• The assertion capture script is reusable in different verification projects

DISADVANTAGES
• Simulation tool version or feature changes will require similar changes in the script; for example, the simulation tool commands used by the script, such as wlfman. If this command is removed or altered, the script needs to accommodate the changes. Similarly, there are many commands in the simulation tool that, if changed, will lead to script alteration.

REFERENCES
• RTCA/DO-254 “Design Assurance Guidance for Airborne Electronic Hardware”
• https://www.doulos.com/knowhow/sysverilog/tutorial/assertions/

DO-254 Testing of High Speed FPGA Interfaces
by Nir Weintroub, CEO, and Sani Jabsheh, Verisense

As the complexity of electronics for airborne applications continues to rise, an increasing number of applications need to comply with the RTCA DO-254/Eurocae ED-80 standard for certification of complex electronic hardware, which includes FPGAs and ASICs.

The DO-254 standard requires that for the most stringent levels of compliance (levels A and B), the verification process for FPGAs and ASICs must measure and record the verification coverage by running tests on the device in its operational environment. What this essentially means is that verification engineers and designers need to compare the behavior of the physical outputs of the device on the hardware device pins with their corresponding RTL model simulation results. In addition, the standard requires running robustness tests on the interface pins. The robustness testing is accomplished by forcing abnormal or slightly out-of-spec behavior on the device and ensuring that it is able to deal with this behavior without catastrophic results.

These requirements become especially challenging for high-speed interfaces, such as DDR3 or PCIe, because it is not possible to create and observe the abnormal behavior when an FPGA is connected to the regular operational interfaces. For example, when a real memory is connected to a DDR model, there is no way to control the DDR behavior and use different kinds of DDR memories. As for robustness testing, there is no way to test incorrect behavior of a memory while connected to a real memory, because the real memory does not allow for the type of error injection desired. It is also impossible to test different timing behavior of the high-speed interface signals.

To overcome these challenges, Verisense developed a new approach, the Advanced Validation Environment (AVE), which makes it possible to comply more completely with the DO-254 requirements for in-hardware testing of high-speed interfaces at the pin level and enables users to easily run an array of robustness tests. Based on the Universal Verification Methodology (UVM), this novel approach involves the migration of a UVM testbench and verification environment into an FPGA operational environment. Applying UVM concepts and components to target-hardware simulation enables reuse of all simulation runs for the in-hardware testing. It also provides the ability to monitor all FPGA interfaces, including high-speed interfaces, at the FPGA pin level.

MEETING DO-254 STANDARDS
The goal of DO-254 certification is safety. No shortcuts can be taken. It is mandatory to prove the design correctness of an FPGA by verifying its entire feature set. High-speed interfaces are complicated interfaces which are usually linked to the main functionality of a specific FPGA. Thus the comprehensive testing of these interfaces must be a basic requirement in the process of verifying the design correctness of any device.

Therefore, an FPGA using a high-speed DDR interface, for example, cannot be properly validated solely by connecting it to a standard DDR device and testing read/write operations, because this does not allow testing and verifying the FPGA behavior in abnormal conditions.

The best way to qualify Level A and Level B DO-254 devices is to compare the physical device outputs to the simulation model results. Doing this comparison for complex FPGAs is a complicated technological challenge. Adding high-speed interfaces makes it monumental. While in simple devices one can use scopes and logic analyzers for monitoring and comparing hardware signals, doing so for complex devices is not viable.

In the avionic industry, the need for completely bug-free designs is similar to other industries; however, such need arises from ensuring there are absolutely no critical safety issues. Since FPGAs in the avionic industry can rise to the level of ASIC complexity, there becomes a necessity to innovate an advanced hardware validation approach to cover all the safety requirements.

The new AVE approach suggested by Verisense for providing a comprehensive solution that can qualify Level A and Level B devices is based on applying the concepts of constrained random verification in the simulation world to the real hardware environment. Based on advanced pre-silicon verification methodologies that will be introduced, the article will present a new approach for hardware testing that makes it not only possible but also straightforward to

comply more completely with the DO-254 requirements for real in-hardware testing, including easily running an array of robustness tests.

ADVANCED HARDWARE VALIDATION ENVIRONMENT
At the most basic level, the AVE methodology implements the UVM verification environment in a unique testing platform. The main idea behind the UVM is that the verification environment is a sophisticated software machine that emulates the FPGA or ASIC real-life system environment.

Figure 1 provides an overview of a UVM environment. In its simplest form, each of the DUT interfaces is connected to an agent. Each agent is made up of a number of blocks, the most important of which is the sequencer, which coordinates the feeding of transactions (stimulus) into the driver. The driver converts these transactions, which are written at a given level of abstraction, to a lower level of abstraction according to the protocol being verified.

The low-level block is the Bus Functional Model (BFM). This block is responsible for toggling the relevant interface according to its protocol with the relevant transactions which were received from the driver. The monitor is the component responsible for monitoring the interface buses and reporting all the transactions which occurred on the interface to the reference model block.

The reference model generates the expected transaction data on the output interfaces as a result of the input stimulus. Both the expected transactions and the monitored output transactions are sent to a scoreboard for comparison.

The UVM provides the ability to randomly generate all the possible scenarios according to constraints which limit the generation to real and possible scenarios; this is why it is known as constrained random verification. Running all random scenarios results in 100% functional coverage.

Although AVE was built from the ground up, as can be seen by comparing Figures 1 and 2, the AVE post-silicon environment is a direct descendant of the UVM environment. This is made possible by the coupling of Questa® with the AVE platform. In this scenario, the UVM agents are swapped out for Verisense emulators.
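As a purely illustrative reference for the agent structure just described, the skeleton below sketches a generic UVM agent in SystemVerilog; the transaction fields and class names are our own placeholders, not Verisense or article code:

import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction for some bus protocol
class bus_txn extends uvm_sequence_item;
  rand bit [31:0] addr;
  rand bit [31:0] data;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

// Driver: converts transactions to pin activity (the BFM role)
class bus_driver extends uvm_driver #(bus_txn);
  `uvm_component_utils(bus_driver)
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  task run_phase(uvm_phase phase);
    forever begin
      seq_item_port.get_next_item(req);  // stimulus from the sequencer
      // protocol-specific interface toggling would go here
      seq_item_port.item_done();
    end
  endtask
endclass

// Monitor: observes the bus and publishes transactions to the
// reference model / scoreboard via an analysis port
class bus_monitor extends uvm_monitor;
  `uvm_component_utils(bus_monitor)
  uvm_analysis_port #(bus_txn) ap;
  function new(string name, uvm_component parent);
    super.new(name, parent);
    ap = new("ap", this);
  endfunction
endclass

// Agent: bundles sequencer, driver and monitor per DUT interface
class bus_agent extends uvm_agent;
  `uvm_component_utils(bus_agent)
  uvm_sequencer #(bus_txn) sqr;
  bus_driver  drv;
  bus_monitor mon;
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  function void build_phase(uvm_phase phase);
    sqr = uvm_sequencer#(bus_txn)::type_id::create("sqr", this);
    drv = bus_driver::type_id::create("drv", this);
    mon = bus_monitor::type_id::create("mon", this);
  endfunction
  function void connect_phase(uvm_phase phase);
    drv.seq_item_port.connect(sqr.seq_item_export);
  endfunction
endclass

In the AVE flow described next, the low-level driver/BFM and monitor roles move into the hardware emulators, while generation, the reference model, and the comparison remain in software.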

The emulators are memory models implemented as Verisense IP, allowing, for example, a real DDR memory to be emulated. Like the UVM agent blocks, the emulator includes sequencers, drivers, and monitors, so AVE can emulate all the interfaces exactly as in a simulation. The same reference model is used, and a scoreboard test-report generator summarizes results. In addition, voltage, clock, and signal control blocks have been added to qualify the DUT against tolerance specifications; this could not be done with even the most advanced UVM verification environment. A further benefit is that the AVE hardware architecture is modular, allowing a high degree of reuse between projects.

The AVE testing platform architecture is divided into two main components: software and hardware.

The main function of the hardware is to implement the low-level components of the verification environment — basically the BFM and the monitor. In addition, through the use of high-speed interface emulators that connect to the DUT, the user is given full control of testing all interfaces with maximal flexibility.

All of the DUT interfaces, including high-speed interfaces, are connected to Verisense emulators, which are able to function as BFMs as well as monitors and emulate the full functionality. The emulator effectively has complete visibility of the DUT high-speed input and output pins, and it provides full testing capabilities, ranging from simple protocol checkers to complicated robustness testing. The hardware emulator uses its own memory to store both input data information (BFM) and the recorded output data (monitors).

The requirement for robustness testing on all interfaces includes controlling the timing of signals and voltage levels, and in the case of a memory, also changing the actual data and data sequences. The AVE solution is able to provide extensive support for robustness testing on all interfaces, including high-speed interfaces, because the DUT pins connect directly to the hardware emulators; thus, the environment not only monitors activity on the pins but also can inject data and create abnormal situations. The emulators can control the timing on all interfaces or any subset thereof. They also have the ability to modify the latency of the protocols, the frequency, and even the voltage level of a given interface. All the interface electrical and logical values can be modified within the interface specification or beyond the specification limits for verifying real abnormal situations.

The software implements the high-level components of the verification environment, including the input generation, reference models, and off-line comparisons between the behavior of the DUT in the verification environment and its behavior on the final target hardware.

The software is preprogrammed via automatically generated configuration files with the relevant settings of the testing environment and the DUT hardware. Based on this information, the software configures the validation environment hardware and sets the clocks and voltages to the DUT as required for each test.

Figure 3: Advanced Validation Environment Software Test Flow

The steps in the software test flow are identical for both regular and high-speed interfaces and are as follows:

1. The software simulation environment generates Value Change Dump (VCD) files from the waveforms of the verification tests. The VCD file is an industry-standard file format containing waveform information that is generated from the simulation environment, regardless of the verification methodology used in the software simulation. The tester software parses the VCD files and generates two sets of vectors. The first is used as input vectors to inject at the DUT pin level on the hardware platform. The second is used as the expected results vectors for comparing with the hardware tester results at the end of the process.
2. The hardware tester injects the input vectors onto the pins of the DUT, running the test that was previously run in the software simulation environment.
3. While the test is running, the tester monitors and records all the DUT output pins.
4. Once the test is completed, the recorded pin behavior on the tester is processed by the software.
5. The actual recorded pin waveform is then automatically compared with the expected results from the software simulation. Any mismatches are flagged and reported. If there are no mismatches, the test has passed successfully.
6. All appropriate log and test result files are then generated automatically for documentation and traceability purposes.

High-Speed Interface Example – DDR3
In 2014, Verisense provided the VS-254 FPGA verification tester tool to multiple avionics customers to assist them in the certification of their DO-254 Level A and B products. The VS-254 is the first, and remains the only, solution that provides the complete required DO-254 functionality for high-speed interfaces, including pin-level verification on all pins and full robustness testing on all high-speed interfaces.

Figure 4: The VS-254 Platform

The VS-254 platform contains a complicated DUT FPGA that has two DDR3 interfaces. The DUT FPGA DDR interfaces are connected directly to three other FPGAs and not to DDR memories. The FPGAs also include the DDR emulators.

The VS-254 includes many enhanced features to reduce the verification effort and to make the engineering effort predictable. These include a high degree of reuse between the simulation environment and the hardware verification, built-in support for regression and waivers, and the VS system simulation tool for easier debug of test failures. The VS-254 FPGA implements the AVE methodology and includes a DDR3 high-speed interface.

DDR3 is a type of double data rate-synchronous dynamic random access memory (DDR-SDRAM). In an SDRAM, the DDR is synchronized with its master (e.g., a processor). Unlike asynchronous memories, which react to changes in the control, synchronous memories can be pipelined, thus achieving high speeds and efficient access to memory. The DDR data is stored in a simple dynamic physical element, which enables higher densities and, thus, a lower cost per bit. To keep the data from getting lost, the information is refreshed from time to time. The refresh mechanism adds complexity to the controller, and although it somewhat reduces the memory throughput, it is considered a good tradeoff. The memory works on both clock edges, which doubles the memory throughput. As memory speeds rise, signal integrity becomes an important issue. Among other things, there is high sensitivity to the signal termination, which makes signal probing and monitoring operations much more complicated and challenging.

The complexity of the transactions and the number of signals in a 40-bit wide high-speed DDR3 interface running at 333 MHz can be seen in Figure 5. Meeting DAL A and B robustness and abnormal testing requirements is next to impossible using a manual or semi-automatic testing environment. And if the FPGA DDR is connected directly to a DDR memory, there is absolutely no control of the pins and virtually no ability for robustness testing on these pins.
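For a rough sense of scale (our arithmetic, not from the article), the double-data-rate operation means the 40-bit, 333 MHz interface described above transfers on both clock edges:

333 MHz clock x 2 edges  = 666 MT/s per data pin
666 MT/s x 40 bits       = 26.64 Gb/s, roughly 3.3 GB/s of raw transfer rate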

Figure 5: High-Speed Interface Using DDR3

The DDR emulators perform all the special testing features required for DO-254 hardware testing requirements, including the complete set of robustness testing. The emulator is programmed through multiple configuration registers, which control all the DDR parameters, including:

• DDR latency
• Control and data line timing
• DDR power level
• DDR frequency shifts
• DDR error injection and correction

The AVE runs the following tests on the high-speed interfaces:

• Validate the physical layer connections and test the normal interface behavior
• Compare the transaction level data with the reference model from the software verification environment
• Modify the input data to the DUT, and test for abnormal situations
• Vary the voltage levels and clock frequencies to understand their impact on the DUT
• Change the physical layer signal timings, including control and data

Providing this level of functionality and features is challenging, especially considering that high-speed interfaces often include complex protocols and frequently interface to multiple clients that require arbitration governed by master/slave relationships.

CONCLUSION
Taking concepts from advanced pre-silicon verification methodologies (e.g., UVM) and applying them to a hardware tester validation platform (i.e., AVE) delivers unprecedented control over all aspects of hardware testing, especially high-speed interfaces. As technology progresses, we expect to see more and more devices in the market with different types of high-speed interfaces.

This breakthrough approach enables, for the first time, complete testing that includes high-speed interfaces. It gives developers and manufacturers a much higher degree of certainty in the correctness and safety of their complex electronic devices.

DO-254 expects developers to use their best effort to validate their designs. There is no absolute definition of what is a “good enough” effort. In the absence of absolute proof, you need to invest much time and energy to show that what you have done is as good as technically possible. As solutions are provided that allow for increased coverage, these solutions eventually become the de-facto DO-254 requirement. Because AVE is currently the most complete technological solution to validate high-speed interfaces, we expect it to become a de-facto requirement.

Formal and Assertion-Based Verification of MBIST MCPs
by Ajay Daga, CEO, FishTail Design Automation, and Benoit Nadeau-Dostie, Chief Architect, Mentor Graphics

Built-In Self-Test (BIST) is widely used to test embedded memories. This is necessary because of the large number of embedded memories in a circuit, which could be in the thousands or even tens of thousands. It is impractical to provide access to all these memories and apply a high quality test. The memory BIST (MBIST) tool reads in user RTL, finds memories and clock sources, generates a test plan that the user can customize if needed, and generates MBIST IP, timing constraints, simulation test benches and manufacturing test patterns adapted to the end-user circuit.

Multi-cycle paths (MCPs) are used to improve the performance of the circuit without having to use expensive pipelining, which would be required when testing memories at Gigahertz frequencies. Most registers of the MBIST controller only update every two clock cycles. Only a few registers need to operate at full speed to perform the test operations. The architecture takes advantage of the fact that most memory test algorithms require complex operations such as Read-Modify-Write.

MBIST IP has a large number of MCP sources with tens or even hundreds of destinations. Some of the MCP sources also have single-cycle path destinations. The identification and classification of MCPs is done by analysis based on experience acquired over the years. Timing constraints are verified in-house using SystemVerilog assertions (SVA) on representative benchmark circuits. The benchmark circuits must be chosen carefully because they correspond to specific instances derived from a highly parameterizable RTL template. The timing constraints are implemented in a way to minimize the number of parameter combinations affecting the constraints. Nevertheless, there is always a small possibility that a combination was missed.

End users currently don’t have a mechanism to run formal verification of all constraints (MBIST and functional). They need to assume that MBIST constraints are correct by construction and waive violations causing a disruption of the design flow. Timing simulations need to be used if desired to validate MBIST constraints. This is very difficult because of the large number of memories in a circuit.

It is well known that simulation-based methods (using either full timing or assertions) are limited by the quality of test benches, which might not exercise all useful signal transitions, and by simulation time, which might be prohibitive. Formal verification of timing constraints is clearly a better alternative, although there are a number of obstacles to overcome before it can be used effectively. There is a learning curve involved in using such verification tools, and new scripts need to be generated and maintained to integrate the tools in the design-for-test flow. Until now, there were also a number of limitations of the formal verification tools that made them difficult to use on our MBIST IP. A first limitation was that these tools generally analyze constraints individually. However, the MBIST constraints are not all self-contained and were causing a large number of false alarms. For example, take the following set of constraints:

set_multicycle_path 2 -setup -from STEP_COUNTER*
set_multicycle_path 1 -setup -from STEP_COUNTER* -to STEP_COUNTER*

The first constraint declares MCPs from the counter to all destinations, but the second constraint resets some of the paths to be SCPs. The first constraint is not true when analyzed in isolation.

A second issue is correctly understanding the coded RTL, so that false violations are not flagged by the formal tool. Noisy results are a legitimate concern when deploying a formal solution, so we needed to be convinced that the results from the tool are consistent with the information provided to it. The third challenge is the ability to provide architectural input to the tool in situations where MCPs are not completely supported by the RTL. This architectural input (waivers) needs to be specified in a way that accounts for the parameterizable nature of the design. Common waivers need to be specified for hundreds of MBIST controllers, each using a different set of circuit parameters which might affect the composition of the timing constraints.

FISHTAIL’S MCP VERIFICATION METHODOLOGY
FishTail’s Confirm product performs MCP verification. The tool requires the following information for a design:

• Synthesizable RTL
• Tcl MBIST constraints
• Simulation models for standard cells instantiated in the RTL
• Liberty models for hard macros, memories
Figure 1: Example MCP

With this information the tool formally establishes the correctness of the multi-cycle paths without requiring any stimulus. For any paths that fail verification, the tool generates a waveform that shows why the path fails and provides aids for an engineer to debug the failure. In addition, the tool generates SVA assertions for all failing paths. These assertions may be imported into RTL functional simulation to obtain third-party confirmation of the failure. Engineers can provide architectural input to the tool (information that is not available in the synthesizable RTL) that supports the timing exceptions: for example, the fact that configuration registers are static, or expected to be programmed in a certain way, or that failures to certain endpoints can be ignored, etc. With this additional input the tool is often able to prove the correctness of a timing exception that was earlier flagged as incorrect. Engineers decide on the effort they want to spend in getting the tool to formally prove a timing exception, or whether they want to reduce this effort and instead establish the correctness of the exceptions based on the feedback they get from running RTL simulations on the assertions generated for failing paths.

The formal verification of an MCP requires proving that it is impossible for a change at a startpoint to propagate to an endpoint in a single cycle. Consider the example design in Figure 1, and consider that an engineer has specified a two-cycle MCP from FF1 to FF2. Formally proving this MCP requires establishing that in the clock cycle that FF1 transitions, it is impossible to propagate this transition to FF2. Proving the MCP requires establishing the condition when FF1 can transition. We refer to this as the Startpoint Transition Condition (STC), and it takes into account any enable logic in the clock and data path that controls when the startpoint is allowed to transition. In Figure 1, FF1 receives a clock without gating and the only enable logic is on the datapath. If valid is low the old value is maintained on FF1, and when valid goes high a new value is sampled by FF1. So, the STC is “valid”, i.e. valid must be high for FF1 to transition in the next cycle. Next, we consider the condition necessary for the transition to propagate from startpoint to endpoint. We refer to this as the Path Propagation Condition (PPC). In Figure 1, there is no interesting enable logic on the datapath; the only enable that controls the propagation of a path from FF1 to FF2 is the clock-gate enable for the clock pin on FF2. FF2 only receives a clock when valid is high. As a result the PPC is “valid”. The STC establishes the condition for the startpoint to transition, and the PPC establishes the condition for the transition to propagate to the endpoint in the subsequent cycle. Formally proving an MCP requires proving that if the STC is true in a cycle, then in the next cycle it is impossible for the PPC to be true. Performing this proof requires working logic cones back over as many clock cycles as required to reach a definitive answer. For the circuit in Figure 1, proving the MCP requires proving that it is impossible for valid to be high in two consecutive clock cycles, and this proof passes as valid toggles every clock cycle.
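Figure 1 itself is not reproduced here, so the following is a hypothetical RTL sketch of the circuit as described: FF1 with a datapath enable (the STC), FF2 behind a clock gate (the PPC), and a valid signal that toggles every cycle. All names and widths are our own:

// Hypothetical RTL matching the Figure 1 description.
module mcp_example (
  input  logic       clk,
  input  logic       rst,
  input  logic [7:0] din,
  output logic [7:0] ff2_q
);
  logic       valid;
  logic [7:0] ff1_q;
  logic       en_lat, gclk;

  // 'valid' toggles every clock, so it can never be high in two
  // consecutive cycles -- the basis of the 2-cycle MCP proof
  always_ff @(posedge clk or posedge rst)
    if (rst) valid <= 1'b0;
    else     valid <= ~valid;

  // FF1: ungated clock, datapath enable only => STC = valid
  always_ff @(posedge clk)
    if (valid) ff1_q <= din;

  // Latch-based clock gate so FF2 is clocked only when
  // valid was high => PPC = valid
  always_latch
    if (!clk) en_lat = valid;
  assign gclk = clk & en_lat;

  always_ff @(posedge gclk)
    ff2_q <= ff1_q;
endmodule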

Figure 2: Example report for failing multi-cycle path

FishTail’s strength in MCP verification comes from its ability to prove MCPs as correct when the provided collateral supports the MCP, without requiring an engineer to fiddle with settings that control the sequential depth for the formal proof or the runtime allowed to prove an MCP. Inconclusive or noisy results are the bane of formal tools, and it is important for a user to have the confidence that when an MCP fails FishTail’s formal proof, it is most likely because of missing architectural information and not because the tool ran into complexity issues. The ability to prove MCPs while minimizing complexity issues comes from FishTail’s ability to accurately separate signals on a design into either data or control. Data signals play no role in the STC and PPC established by the tool, allowing the tool to scale and work back more cycles during formal proof than would otherwise be possible.

For every failing timing exception the tool generates a report explaining why the path is not multi-cycle. An example report is shown in Figure 2.

The report in Figure 2 shows the start and endpoint that the failure applies to and the launch and capture clocks associated with the path. Clicking on the PPC/STC shows you these conditions. Clicking on the red path in Figure 2 shows you more detail on the timing path from the startpoint to endpoint, as well as any gating along the path. The path may be viewed in schematic form using FishTail’s integration with Verdi®.

The intent of reviewing the PPC/STC and the path detail is to establish if things are as expected, or if a path is being traversed that is not intended, or if the necessary gating conditions are not in place to control when the startpoint can transition and when this change is allowed to propagate to the endpoint. If everything is as expected, then the next step is to review the stimulus the tool generates showing how the failure happens. An example failure stimulus is shown in Figure 3. This stimulus, similar to a VCD dump from a simulation tool, shows signal values in clock cycles leading up to a failure. The cycle where the failure happens is shown in red. In this cycle, the startpoint changes and the change propagates to the endpoint.

Figure 3: Example failure stimulus

Using FishTail’s integration with Verdi, this stimulus may be viewed as a digital waveform, but the benefit of the tabular display is that you can ask the tool to justify a signal value in a given clock cycle. For example, clicking on the value 1 shown for RESET_REG_DEFAULT_MODE in the failure cycle results in the tool highlighting signal values that cause RESET_REG_DEFAULT_MODE to go high when the startpoint changes. Since RESET_REG_DEFAULT_MODE is a combinational signal, the values that are highlighted are in the current cycle on the registers STATE and BIST_EN_RETIME2. You can then click on values on these registers, for example, the value 1 on BIST_EN_RETIME2, to see what values in the previous cycle cause this signal to be high in the failure cycle. In this manner you can keep working backward from the failure cycle to understand how the failure happens. As part of this process, if you see a transition or a value that is impossible based on architectural considerations (for example, the way configuration registers, such as LVISION_WTAP_IR_REG/INST_INT, are programmed) then this information can be communicated to the tool. At this point, rather than redo the entire MCP verification run, you can reverify just the specific failing path you have been debugging to see if the additional input resolves the failure.

Formal MCP verification debug requires the involvement of a designer who is familiar with the RTL and the motivation for the MCP. It requires time, and the benefit is a 100% result. When time is short, an IP is not well understood, or designer time is scarce, an alternate strategy is to take the assertions generated for failing MCPs and run them through RTL functional simulation. If an assertion fails, then it is a strong indicator that an MCP is incorrect. If an assertion is checked and it always passes, then it builds confidence that the MCP is correct. Figure 4 shows an example assertion generated by FishTail. Essentially, the assertion checks that when a startpoint changes, then in that clock cycle the PPC must be false, or in the next cycle the endpoint must not change.

SUMMARY
In this article we have motivated the need to formally verify MBIST MCPs, discussed FishTail’s methodology for MCP verification, and the approach used to debug formal MCP verification failures or generate assertions for import into RTL functional simulation. The methodology was applied to a small MBIST IP with 59 MCPs. The MCPs applied to a total of 2090 paths. Initially, after providing the tool just RTL and constraints as input, 92% of the constrained paths were confirmed as good and 167 paths failed formal MCP verification. The runtime was less than a minute. We then debugged the failures and provided additional architectural information to the tool regarding the way LVISION_WTAP_INST/LVISION_WTAP_IR_REG is programmed, and that LV_WRCK, LV_SelectWIR and LV_ShiftWR are static. With this additional input the number of failing paths dropped to 15 (so 99% of the paths constrained by MCPs were verified to be good). Legitimate MCP issues were flagged by the tool, some that were expected and some that were surprises.

The formal verification of MBIST MCPs using RTL input guarantees the correctness of MBIST MCPs at the time they are written, and then ensures that while revisions to RTL and constraints are being made, the MBIST MCPs are continually verified to be good. Customers who receive MBIST IP are able to verify the MCPs delivered along with this IP, accounting for any customization they make.
// Path propagation condition:
// ( TARGET1_CK100_MBIST1_MBIST_I1/RESET_REG_DEFAULT_MODE )
// Launch Clock: CK100 (rise)
// Startpoint: TARGET1_CK100_MBIST1_MBIST_I1/ALGO_SEL_CNT_REG_reg
// Capture Clock: CK100 (rise)
// Endpoint: TARGET1_CK100_MBIST1_MBIST_I1/MBISTPG_DATA_GEN/WDATA_REG_reg
// Assertion:
module u_mcp449_1 (input bit clk, input logic from_reg, input logic [1:0]
to_reg, input bit v12122);

wire path_propagation_condition =
( v12122 );

property e_mcp449_1;
@(posedge clk) `FT_DISABLE (`FT_TRANSITIONS_AND_NOT_UNKNOWN(from_reg))
|-> ((##0 (!path_propagation_condition)) or
(##1 (`FT_NO_TRANSITION_OR_UNKNOWN(to_reg[1:0]))));
endproperty

mcp449_1: assert property(e_mcp449_1);

endmodule
bind TARGET1_CK100_MBIST1_LVISION_MBISTPG_CTRL u_mcp449_1
  sva_u_mcp449_1(.clk(BIST_CLK), .from_reg(ALGO_SEL_CNT_REG),
  .to_reg(MBISTPG_DATA_GEN.WDATA_REG), .v12122(RESET_REG_DEFAULT_MODE));

Figure 4: Example assertion for a failing MCP

Starting Formal Right from Formal Test Planning
by Jin Zhang, Senior Director of Marketing & GM Asia Pacific, and Vigyan Singhal, President & CEO, OSKI Technology

“By failing to prepare, you are preparing to fail”
—Benjamin Franklin

“Productivity is never an accident. It is always the result of a commitment to excellence, intelligent planning, and focused effort”
—Paul Meyer

Planning is key to success in any major endeavor, and the same is true for meaningful formal applications. End-to-End formal, with the goal of achieving formal sign-off, is a task that usually takes weeks if not months to complete, depending on the size and complexity of the design under test (DUT). Dedicating time and effort to planning is of utmost importance. While most formal engineers and their managers understand the need for formal planning, they do not know how to conduct thorough planning to arrive at a solid formal test plan for execution.

As the first step of any formal sign-off project that Oski performs for our customers, we routinely dedicate 2 weeks to craft the formal test plan, which includes the following key components:

• How to conduct formal verification on the design, how many formal testbenches we need to build to verify the whole DUT, or which areas of the design are not a good fit for formal
• What End-to-End checkers and corresponding constraints will be implemented
• How many cycles we need to target for the Required Proof Depth
• Where to use complexity solving techniques to overcome inherent design complexity
• How to measure formal progress to achieve sign-off
• How much time it will take to completely verify the design

The goals for creating the formal test plan are not only in creating a blueprint for execution, but also in aligning all the stakeholders on the achievable formal goals for the project. Sometimes a trade-off in scope, schedule and resources has to be made to fit in the realistic limitations. This is a decision that should be made upfront to avoid surprises down the road.

At Oski, we follow a systematic 3-stage approach. The IDENTIFY and EVALUATE stages aim at obtaining a deep understanding of the DUT. The PLAN stage combines both formal and design knowledge to create the final test plan.

This article aims to help formal engineers learn how to do proper formal test planning. Good formal test planning requires lots of formal verification experience; at Oski, it is a task conducted only by our project leads. However, we believe that by knowing what to do during formal planning, and practicing through real formal verification projects, one can acquire such skills.

FORMAL TEST PLANNING STAGE 1 – IDENTIFY
The goal of the IDENTIFY stage is to identify which design block(s) to verify using formal. This decision is based on several factors:

1. What are the verification goals and areas of concern?
2. What blocks are suitable for formal?
3. How many formal engineers are available and what is their level of formal expertise?

The answers to these questions combined determine on which design blocks to apply formal in order to achieve the best return-on-investment (ROI). The process to arrive at these answers often requires discussion with design/verification managers and project team leaders.

Over the years we have heard many different goals shared by companies wanting to adopt formal:

1. More and more, project teams want to shorten the project schedule by reducing their reliance on simulation. They understand applying formal early during RTL development can harden RTL sooner and save verification time at the sub-system or system level using simulation or emulation. Their goal is achieving formal sign-off on suitable blocks. This is the best adoption case because applying formal early results in the best verification ROI.

2. In many instances, because bugs continue to be found close to or even after tapeout, formal is used to find missed corner case bugs; hence the formal goal is bug hunting.
3. Sometimes, project teams have trouble reaching simulation coverage goals. Formal can discover unreachable targets, or generate input vectors to reach simulation cover points, and therefore the formal goal is coverage closure.
4. There are also cases where project teams have specific needs in mind, such as verifying pre- and post-clock gating designs. In this case, some commercial formal apps will be useful.

Different verification goals translate to different strategies and require different levels of planning. For bug hunting, one only needs to focus on blocks or functionalities with the most issues, without having to worry about completely verifying the whole design using formal. On the other hand, achieving formal sign-off requires the most thorough planning. Spending insufficient time on planning may result in the ultimate goals not being reached.

Once the verification goals are aligned, we need to identify blocks suitable for formal. Contrary to the common belief that only control types of blocks are good for formal, in reality data-transport blocks, where data is transported from inputs to outputs with simple or no modifications, are also good candidates. For example, some typical control and data-transport blocks could be arbiters of all kinds, interrupt controllers, power management units, tag generators, schedulers, bus bridges, memory controllers, DMA controllers and standard interfaces such as PCI Express and USB. On the other hand, data-transform designs, where algorithmic operations are often performed, are not good for formal property verification (also called model checking). Instead these should be verified using other techniques such as Sequential Equivalence Checking. To understand the functionality of different design blocks, block diagrams and design specs will be useful. Conversation with designers can also help get the high-level functionality of different design blocks for this assessment.

Not all blocks that are suitable for formal should be verified by formal. Again one needs to consider the ROI. If a block is a legacy design with minor changes and has gone through lots of verification, or if it is not a difficult design to verify with simulation (few corner cases and not many concurrent operations), it is obviously not the best use of formal resources. On the other hand, if the block is brand new, or is being developed as an IP to be used internally or externally with lots of parameters, then formal will bring better ROI.

Last, formal resources and expertise constraints limit what types of formal and how much can be done. End-to-End Formal requires a lot of formal expertise and can only be attempted by our engineers after going through several projects with mentors (usually about 2 years of full-time formal usage). On the other hand, writing local assertions to do bug hunting can be carried out by engineers with much less formal experience.

The output of the IDENTIFY stage is a list of good design blocks to apply formal verification on and the corresponding goals.

FORMAL TEST PLANNING STAGE 2 – EVALUATE
The goal of the EVALUATE stage is to finalize the list of formal testbenches for different design blocks, along with an understanding of the possible formal verification challenges to overcome.

To achieve this goal, we need to carefully consider many factors, such as design interfaces, register transfer level (RTL) metrics and critical design functionalities.

Knowledge of design interfaces is important to decide the best places to partition the formal testbenches. One consideration is whether the block interface is standard and well documented. Designers are usually very busy and are less willing to answer lots of questions about interface behaviors. So it is best to partition at the level where design interfaces are standard, or easy to follow. An ideal situation is when formal verification starts in parallel to RTL development so designers have the freedom of moving logic from one block to another to simplify the interfaces.

Next, we use formal tools to report the following RTL metrics: register counts, RTL lines of code (LOC), the number of inputs and outputs, and parameter variations.
These numbers further assist in deciding formal testbench boundaries and estimating the amount of effort it will take to formally verify each chosen block:

• A block with lots of inputs and outputs means more effort in modeling constraints. On the other hand, a block with a large RTL LOC or register count means more effort in modeling checkers and managing complexity. A balance needs to be struck between simple interfaces and manageable block sizes for formal.
• People often ask what a good design size for formal is. While RTL LOC and register counts help guide the decision on formal testbench boundaries, there are no precise rules to say what the right size of block for formal is, as a small RTL block could be very complex. A rule of thumb for a formally manageable block is a block that can be designed by a single designer. Anything smaller will not bring the best ROI, and anything bigger will pose a challenge for formal tools.
• Understanding the impact of parameters may help reduce formal complexity. If design parameters can be reduced without reducing corner case coverage, the formal testbench should use smaller parameters for better performance.

During the process of working with RTL, one also gains an understanding of the micro-architecture characteristics of the design block to anticipate the kind of formal complexity and complexity resolution techniques one might use. This process also helps decide the Required Proof Depth.

At the end of this stage, there should be a mapping between each candidate design and one or more formal testbenches, along with a determination of who is responsible for developing which testbench. It is common that only a subset of the target blocks is chosen at the end of this process.

FORMAL TEST PLANNING STAGE 3 – PLAN
The goal of the PLAN stage is to put everything together and create the actual implementation plan for execution, as well as the estimated time for the formal project.

With the understanding of design functionality, an English list of End-to-End and interface checkers and the corresponding constraints will be captured in the formal test plan, often in a Word document.

It is worth noting that implementing formal End-to-End checkers often requires building reference models. As a matter of fact, 95% of the effort may be in writing the reference models in Verilog or SystemVerilog, with only 5% of the effort in writing the SystemVerilog Assertions. So one needs to factor in the time it takes to write reference models when estimating the overall effort level. Also, internal assertions might be used during the project, but only when needed for debugging or helping End-to-End checkers reach closure. Therefore internal assertions are not included in the formal test plan.

Formal complexity discussion and resolution should be included in the formal test plan. Because each design is unique, often there is no existing solution to use. We need to estimate the effort in coming up with a solution as well as writing, verifying and using the solution. This is the most unpredictable part of the process. Often when we underestimated the effort level in our projects, it was when we didn't fully comprehend the challenge involved in solving complexity. Careful consideration here will save surprises later.

Exact metrics to measure success need to be established. Once there is a total number of checkers and constraints to implement, a weekly tracking spreadsheet can be used to track the numbers of checkers and constraints written each week, their verification statuses, bugs found, and percentage towards completion. In recent years, formal tools have added formal coverage features, which can effectively measure how much of the DUT the formal testbench is covering. This will be another useful metric to use to decide when a formal testbench is complete and formal sign-off has been reached.

The following list includes typical chapters in our formal test plan:

1. Design overview
2. Formal testbench overview
3. End-to-End Checkers
4. Interface checkers
5. Constraints
6. Required Proof Depth analysis
7. Complexity analysis
8. Metrics to measure formal sign-off
9. Schedule estimation
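As a concrete illustration of the End-to-End checker structure discussed above, the sketch below shows the typical shape of one: a procedural reference model doing most of the work, with a thin assertion layer comparing it against the DUT. The design (a simple in-order transport block) and every name in it are invented for illustration; this is not code from the case study that follows.

    module e2e_checker #(parameter W = 8) (
      input logic         clk,
      input logic         rst_n,
      input logic         push,   // valid data accepted at the input
      input logic [W-1:0] din,
      input logic         pop,    // valid data presented at the output
      input logic [W-1:0] dout
    );
      // Reference model: remember what went in. Most of the
      // checker's effort (and code) lives here. A real formal
      // testbench would bound this structure, since unbounded
      // state is one source of the complexity discussed above.
      logic [W-1:0] expected[$];

      always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
          expected.delete();
        else begin
          if (pop) begin
            // End-to-End check: the thin assertion layer on top
            a_order: assert (expected.size() > 0 && dout == expected[0])
              else $error("unexpected or out-of-order output data");
            if (expected.size() > 0)
              void'(expected.pop_front());
          end
          if (push)
            expected.push_back(din);
        end
      end
    endmodule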
The formal test plan is created for each block designated to have a separate formal testbench. This is a living document and may be updated during the verification process. For example, an End-to-End checker may be too complex, so it needs to be split into two or more checkers, thus affecting the list in the document. At the end of the project, the actual formal testbench implementation should be consistent with the formal test plan. The formal test plan may serve as a user guide for future projects when the formal testbench is reused.

A REAL CASE STUDY
Often people underestimate the amount of time it takes to do formal sign-off projects. At DAC 2014, we conducted a guessing game by providing the following information about a design that we had verified before.

Design Description:
Reorders IP packets that can arrive out of order and dequeues them in order. When an exception occurs, the design flushes the IP packets for which exceptions have occurred. The design supports 36 different inputs that can send the data for one or more ports. Another interface provides dequeue requests for different ports. The design supports 48 different ports.

Design Interface Standard:
Packets arrive with a valid signal. A request/grant mechanism handles requests from 36 different sources; all 36 inputs are independent and can arrive concurrently. All 48 ports can be dequeued in parallel, again using another request/grant mechanism.

Micro Architecture Details:
Supports enqueue and dequeue of IP packets for 36 different inputs and 48 different ports respectively. 48 different queues are used to store IP packets for the different ports. A round robin arbiter resolves contention between enqueue requests from different sources for the same port at the same cycle.

RTL Stats:
RTL lines of code: 10,830
Register count: 84,027
Inputs: 3,404 bits
Outputs: 9,137 bits
Clock domains: 1
Data latency: 6
Required Proof Depth: 28 cycles

When asked "How Long Does It Take to Formally Verify This Design?", over 70% of respondents guessed in the 2-4 month range. In reality, we spent 5 months on the project with the following breakdown:

1. Formal test planning: 0.5 months
2. End-to-End Checkers (including reference models): 1.5 months
3. Constraints: 1 month
4. Interface checkers and internal assertions: 1 month
5. Abstraction Model: 0.5 months
6. Iteration with designers on bug fixes: 0.5 months

As the breakdown shows, significant time is spent in formal testbench development. However, once the block is completely verified with formal and sign-off is achieved, the chance of missing a bug is very low. This level of confidence cannot be provided by simulation. If this design will be used by several projects, then the formal investment is well worth the effort.

CONCLUSION
Due to the complexity of formal sign-off projects, it is important to do thorough planning at the beginning so the formal testbench implementation can be done efficiently. This article provided an overview of the stages involved in formal test planning. Since this is a very important step, and one often ignored by project teams, we can't emphasize enough the value of dedicating time to formal test planning. Oski CEO Vigyan Singhal will give a talk on this subject at the Mentor Verification Academy on Wednesday, June 10th, at 11am. It will be a good opportunity to learn more about the process and ask questions.
Reuse MATLAB® Functions and Simulink® Models in UVM Environments
with Automatic SystemVerilog DPI Component Generation
by Tao Jia, HDL Verifier Development Lead, and Jack Erickson, HDL Product Marketing Manager, MathWorks

BACKGROUND
The growing sophistication of verification environments has increased the amount of infrastructure that verification teams must develop. For instance, UVM environments offer scalability and flexibility at the cost of upfront efforts to create the UVM infrastructure, bus-functional models, coverage models, scoreboard, and test sequences.

Engineers everywhere use MATLAB and Simulink to design systems and algorithms. Math-intensive algorithms, such as signal and image processing, typically begin with MATLAB language-based design. Complex systems, such as control and communications, typically begin with Simulink and Model-Based Design. At this early stage, engineers create behavioral models of the algorithms and subsystems along with system environment models and test sequences to validate the design given the requirements. The rest of this article will refer to this stage as "system design and verification." At the end of this stage, the detailed requirements are captured in a paper specification for the hardware and software teams.

The verification team has to read and interpret this specification in order to create the verification plan, infrastructure, bus-functional models, coverage models, scoreboard, and test sequences. This is a large undertaking, one that is often complicated by vagaries in, or misinterpretation of, the spec. This specification gap creates a gulf between system design and hardware verification. For instance, when a late change is required, the hardware team typically just patches it into the design and verification collateral without going back to update the algorithm where it can be validated.

HDL Verifier™ bridges the specification gap by using cosimulation with HDL simulators such as the Mentor Graphics® Questa® Advanced Simulator. This enables the system models and tests to run in their native MATLAB or Simulink environment, while verifying the design-under-test (DUT) in its native HDL simulator environment. This allows for debugging visibility into both. However, as the project progresses to the stage where more automated regression testing is done, cosimulation has its drawbacks. The handshaking interface adds runtime overhead, especially in cross-platform situations. And managing many parallel HDL simulator sessions that communicate with corresponding MATLAB and Simulink sessions can add an extra layer of complexity.

MATLAB and Simulink have mature and robust C code generation capabilities for production embedded software for automobiles, aircraft, spacecraft, robotics, medical devices, and a host of other applications. Some of our more innovative customers began to use this code generation to re-use some of their system-level models and tests in their verification environments. However, integrating the code into hardware verification using SystemVerilog's Direct Programming Interface (DPI) still requires lots of work. Could there be a way to automatically connect the system intent with hardware verification?

GENERATE DPI COMPONENTS AUTOMATICALLY
HDL Verifier now has the ability to generate SystemVerilog DPI components from MATLAB and Simulink. With this capability, hardware verification teams can run the algorithmic models, system components, and test sequences used by their system design teams directly in their SystemVerilog simulator. This not only saves the team the effort of reading the specifications and writing the tests and models, it also reduces the risk of misinterpreting the specifications. This is because the tests and models are essentially executable specifications generated directly from the MATLAB functions and the Simulink models. Moreover, since the SystemVerilog DPI generation process is automatic, the verification components are available to verification engineers early in the development process. When the team needs to make changes to the specification in the middle of development, the verification components can be automatically regenerated from the changed models.

At a high level, generating a SystemVerilog DPI component is as straightforward as deciding from where you will be generating the model, issuing the command to generate it, and integrating the component into your simulation. The exported DPI component will be in the form of a thin SystemVerilog wrapper file and a shared library built automatically as part of the generation step. If your
simulation runs on a different platform than the one where it's generated, you can use the generated makefile to build the shared library on that platform.

These components run quickly in simulation, as they are behavioral-level C code without the bit-level implementation details. Using C as the description language allows you to export a wide variety of model types. Some example applications include:

• Bus-functional checker algorithm for a UVM scoreboard
• UVM sequence item
• Analog model representation in an SoC UVM environment
• Digital model in an analog circuit simulator

This article uses an example FFT design to demonstrate how to generate a bus-functional checker algorithm written in MATLAB, and it outlines how to integrate it into a UVM scoreboard.

Image 1: Generating a SystemVerilog DPI component from a MATLAB algorithm for use in a UVM scoreboard. ©2015 The MathWorks, Inc.

SETTING UP AND GENERATING THE DPI COMPONENT
The first step is to identify what you will need the component to do in your verification environment. In other words, how will it be used? What will the primary inputs and outputs be? What other signals will you need access to? In our example, we will generate a checker function that will calculate the floating point result of a 64-point radix-2 FFT and compare this to the outputs of a fixed-point RTL implementation.

Image 2: MATLAB code for the checker function. ©2015 The MathWorks, Inc.

This function returns the ratio of the root-mean-square error of the fixed-point implementation versus the floating-point reference outputs, divided by the root-mean-square of the fixed-point implementation outputs (roughly, rms(fixed − reference) / rms(fixed)).

This is a good time to check whether your model is compatible with code generation. SystemVerilog DPI component generation relies on underlying C code generation technology from MATLAB Coder or Simulink Coder, and there are certain requirements models need to comply with for code generation. Running one of the code generation checking utilities helps to identify and remedy many potential issues before you try to generate a DPI component:

• MATLAB: Add the %#codegen directive (or pragma) to your function after the function signature to indicate that you intend to generate code for the MATLAB algorithm. Adding this directive instructs the MATLAB code analyzer to help you diagnose and fix violations that would result in errors during code generation, such as blocks, functions, or features not supported for C code generation.
• MATLAB: Select your file in the current folder and right click Check Code Generation Readiness.
• Simulink: Select your subsystem, right click C/C++ Code > Code Generation Advisor

At this point, you can generate the model using the default settings or customizing the output. This example generates the DPI component from MATLAB, which uses the dpigen command:

dpigen -args {int16(ones(1,64)),int16(ones(1,64)),int16(ones(1,64)),int16(ones(1,64))} fft_checker

We specify the function's input types and sizes (the -args argument), and the name of the MATLAB function from which we're generating (fft_checker). The generated testbench is a SystemVerilog file that reads input vectors and expected output vectors from the MATLAB testbench to verify that the DPI component is functionally equivalent to the original MATLAB function.

To generate the DPI component from Simulink, use a parallel workflow, but with the Simulink UI infrastructure. Set up by selecting the Code > C/C++ Code > Code Generation Options pull down menu. On this form, you first need to specify that it will be a DPI component by browsing in System target file and selecting systemverilog_dpi_grt.tlc.

Once you are satisfied with your settings, click OK to close the form. Then select the subsystem for which you want to generate the DPI component by right clicking C/C++ Code > Build This Subsystem.

Building the subsystem in Simulink, or running dpigen in MATLAB, will generate the collateral necessary to build your DPI component, including the C files for the core function and its DPI interface, the necessary header files, a SystemVerilog wrapper, and a makefile. By default, it will even build the shared library to load into the simulator. Of course, if your simulation runs on a different platform, you will want to use the generated makefile to build the shared library on that platform.

INTEGRATING THE DPI COMPONENT INTO SIMULATION
The shared library encapsulates everything you need for the component to execute; you just need to point your simulator to it at runtime (more on that later). First, you need to make the SystemVerilog side aware of it. This SystemVerilog file was generated as part of the build process:
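Image 4 below shows the actual generated file. As a rough sketch of its general shape (the exact declarations HDL Verifier emits may differ; the argument names here are invented from the fft_checker inputs above), the wrapper essentially declares the two DPI imports discussed next:

    // Sketch only; not the verbatim HDL Verifier output.
    import "DPI-C" function chandle DPI_fft_checker_initialize();
    import "DPI-C" function void DPI_fft_checker(
      input  shortint in_re  [0:63],  // stimulus, real part (int16)
      input  shortint in_im  [0:63],  // stimulus, imaginary part
      input  shortint dut_re [0:63],  // DUT output, real part
      input  shortint dut_im [0:63],  // DUT output, imaginary part
      output real     nmrs            // normalized RMS error ratio
    );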
Image 4: Auto-generated SystemVerilog wrapper for the checker DPI component. ©2015 The MathWorks, Inc.

You can see that it created two functions in the component: an initialization function called at reset (DPI_fft_checker_initialize), and the main operation function (DPI_fft_checker). These are the functions that will be used in SystemVerilog. We could just instantiate this fft_checker_dpi module in a testbench and feed it the inputs. However, since this example uses a UVM environment, we need to work these functions into its phasing. In this case, we will just take the two import "DPI-C" function statements over to our UVM testbench code and declare them there. These will be used in the scoreboard. First, the initialization function will return a handle to this instance during the build phase:

Image 5: Initializing the DPI function by returning a handle to it. ©2015 The MathWorks, Inc.

When a transaction comes through the analysis port from the monitor on the DUT's output, we need to run the checker function. The original stimulus transactions have been stored in a FIFO, so the checker function will pop the next input from this FIFO and send its real and imaginary components to the checker along with the real and imaginary components of the DUT output.
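Image 6 below shows the actual code. As a sketch of the general shape of that check inside the scoreboard (the transaction class, its fields, the FIFO name, and the error threshold are all invented for illustration):

    // Sketch only; Image 6 shows the real implementation.
    class fft_scoreboard extends uvm_scoreboard;
      `uvm_component_utils(fft_scoreboard)

      // filled by the monitor on the DUT input
      uvm_tlm_analysis_fifo #(fft_xaction) stim_fifo;
      // connected to the monitor on the DUT output
      uvm_analysis_imp #(fft_xaction, fft_scoreboard) dut_export;

      real error_threshold = 0.01; // hypothetical tolerance

      function new(string name, uvm_component parent);
        super.new(name, parent);
        stim_fifo  = new("stim_fifo", this);
        dut_export = new("dut_export", this);
        // per Image 5, DPI_fft_checker_initialize() would be
        // called during the build phase to obtain an instance
        // handle (omitted in this sketch)
      endfunction

      function void write(fft_xaction out_tr);
        fft_xaction in_tr;
        real nmrs;
        // pop the stimulus transaction stored when it was driven
        if (!stim_fifo.try_get(in_tr))
          `uvm_error("SCB", "DUT output with no matching stimulus")
        else begin
          // fixed-point DUT output vs. floating-point reference
          DPI_fft_checker(in_tr.re, in_tr.im, out_tr.re, out_tr.im, nmrs);
          if (nmrs > error_threshold)
            `uvm_error("SCB", $sformatf("FFT mismatch, nmrs = %0g", nmrs))
        end
      endfunction
    endclass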
Image 6: Sending the stimulus transaction and implementation output data into the checker, and comparing the result against the defined threshold. © 2015 The MathWorks, Inc.

The return value, nmrs, is the return value from the MATLAB function. HDL Verifier created an output port for it when it created the SystemVerilog wrapper. This is compared against a threshold of error tolerance that we defined in our UVM testbench.

So now our verification environment looks something like this, with the scoreboard receiving transactions from the driver and monitor via analysis ports:

Image 7: High-level block diagram of the UVM environment and where the generated SystemVerilog DPI component is used. © 2015 The MathWorks, Inc.

At this point, we can run simulation. Remember, we need to point the simulator to our shared library. Here's how we do it with the Questa simulator:

Image 8: Questa Advanced Simulator run script, pointing to the generated shared library. © 2015 The MathWorks, Inc.

In the above script, libfft_checker_dpi is the name of the shared library automatically generated by HDL Verifier from the MATLAB function fft_checker.

RESULTS
This simple example shows how to generate SystemVerilog DPI components using HDL Verifier, and it illustrates how to integrate the generated components into a UVM environment. The simulation passes its simple test sequence (see Image 9).

What effect did this have on our verification efforts? The high-level phases of a verification project typically include developing the verification plan, implementing the verification environment, and executing verification. The plan and implementation are driven primarily through reading and interpreting the specification, and that spec is written as a result of the design and verification efforts of the system team.
Image 9: Results from the simulation run. © 2015 The MathWorks, Inc.

In this approach, the spec is the primary – and sometimes the only – means of communication between system design and hardware verification teams. Building the environment, models, and tests by reading and interpreting the spec is labor intensive and error prone. If there is a spec change, it usually just gets addressed in the hardware design and verification environment. It rarely gets propagated back to the system design and verified there – why bother when there is a deadline to meet?

The ability to automatically generate SystemVerilog DPI components directly from the system design and verification environment is akin to directly passing the specification into the hardware verification environment.

Rather than having the verification team spend weeks reading the spec, writing and debugging a floating point reference model for an FFT, and checking functionality in SystemVerilog, we can automatically generate the DPI components from the system-level models. They are immediately available to the hardware verification team.
Image 11: HDL Verifier SystemVerilog DPI component generation automatically generates system-level intent in a
format that is consumable by the verification environment. This reduces manual efforts and possible specification
misinterpretations. © 2015 The MathWorks, Inc.

If the spec changes, that change can be updated and verified in the system-level model, and the DPI component can be automatically regenerated.

This approach can be used for any project the verification team needs a model for, whether the model is digital, analog, or a mixture of the two. This saves the verification team weeks of implementation time and lets them focus on the core task at hand – verifying the design.
Intelligent Testbench Automation with UVM and Questa
by Marcela Simkova and Neil Hand, VP of Marketing and Business Development, Codasip Ltd.

This article describes an automated approach to improve design coverage by utilizing genetic algorithms added to standard UVM verification environments running in Questa® from Mentor Graphics®. To demonstrate the effectiveness of the approach, the article will utilize real-world data from the verification of a 32-bit ASIP processor from Codasip®.

INTRODUCTION
Application-specific instruction-set processors (ASIPs) have become an integral part of embedded systems, as they can be optimized for high performance, small area, and low power consumption by changing the processor's instruction-set to complement the class of target applications. This ability makes them an ideal fit for many applications — including the Internet of Things (IoT) and medical devices — that require advanced data processing with extremely low power.

The ASIP used in this example was modeled using Codasip® Studio, a highly integrated development environment that allows rapid ASIP prototyping and development. Using Codasip Studio, the instruction set of an ASIP — the instruction-accurate model (IA) — and the behavior of that ASIP at the hardware level — the cycle-accurate model (CA) — are described in a processor description language called CodAL™. Using this description, everything needed for implementation and integration of the ASIP is automatically generated, including the HDL representation of the ASIP (in Verilog, VHDL, or SystemC), a complete ASIP toolchain based on LLVM/GNU (compiler, simulator, debugger, profiler, assembler, and libraries), a virtual prototype (SystemC and QEMU), and the UVM-based verification environment.

The power and flexibility of ASIPs, however, raise some important verification issues. One concern is the verification of the generated HDL description of the processor with respect to the reference model (e.g., the pipeline behavior, communication with a memory, etc.). Another is the verification of the ability of the ASIP to execute software applications (sequences of instructions) correctly. In practice, verification is achieved by evaluating a large number of applications (programs) in simulation. These applications originate from various benchmarks/test suites and can be manually written or automatically generated.

UVM GENERATION WITH CODASIP STUDIO
Codasip Studio automates the generation of UVM environments for execution in the Questa® simulator from Mentor Graphics®. In this environment, the HDL representations of either individual ASIPs or complex platforms consisting of several ASIPs, buses, memories, and other IP components act as the design under verification (DUV). The reference model, which is a very important part of the UVM architecture, is automatically generated from the high-level IA model of the ASIP and from C++ models of external IP components. A generator of random applications for the target ASIP is included in the generated ASIP toolchain and will be used to generate stimuli for ASIP verification. For illustration, see Figure 1.

Figure 1. Automated generation of the HDL representation of an ASIP and the UVM environment with the reference model from Codasip Studio.

An important feature of the generated UVM environment is the optimization algorithm that runs in the background and adjusts the constraints of the random application generator in order to automate and speed up coverage closure. This optimization reduces the effort needed to prepare comprehensive verification stimuli and improves verification productivity. The remainder of this article will look at how this UVM environment was structured and how it has been used for real-world ASIP verification.

THE EXPERIMENTAL ASIP
Codix-RISC™ is a 32-bit processor with six pipeline stages. It contains 32 x 32b general purpose registers with up to three read ports and one write port by default. Hazards are handled by the generated LLVM compiler, although hardware hazard management is an option. Codix-RISC is suitable for embedded signal, video, and wireless processing applications that include
some application-specific code, such as video codecs, malware detection, security algorithms, and any other computationally demanding low-power tasks. In the standard verification of Codix-RISC, different test applications are evaluated. Initially, benchmark applications and language test suites are used; then applications from the random application generator are applied.

PROPOSED UVM OPTIMIZATION TO IMPROVE COVERAGE
Coverage metrics tracked for Codix-RISC in the standard UVM-based environment are: code coverage (statement, branch, expression, FSM), functional coverage (requests, status, and responses on the bus interface), and instruction coverage (the complete instruction set and sequences of instructions).

In the previously described standard verification, no feedback is provided to the generator about the achieved coverage. In the random approach, the generator just produces random applications; therefore, to achieve reasonable coverage, a considerable number of applications must be applied to the ASIP. To optimize this approach, we incorporated a genetic algorithm (GA) as the main optimization tool, which provides feedback about achieved coverage and modifies the constraints of the generator. In particular, the additional constraints restrict the size of applications (100–1000 instructions) and define the probability that each instruction from the instruction-set will be generated. The generator works with its original basic set of constraints, plus the probability and size constraints, the values of which are modified by the GA.

A GA employs a population of candidate solutions that evolve through several generations. The quality of candidate solutions is determined by a fitness function. According to the fitness function, the best solutions are selected and serve as parents for the next generation. New candidate solutions are created by the genetic operators mutation and crossover. If the algorithm and its parameters are tuned well, the average fitness of the population improves over time. This means the algorithm is spending effort in exploring profitable parts of the search space. At the same time, genetic operators ensure diversity, so the algorithm is resilient to the problem of local optima. For coverage-driven verification, the GA serves unconventionally as an optimizer that runs in the background of the verification process. This means that, in contrast to the typical application of a GA, it is not intended only for searching for good candidate solutions. Moreover, as profitable values of the GA parameters are found, additional tuning is no longer required. These parameters have been tested during verification of several ASIPs, as well as other IP components, and resulted in a stable level of optimization.

The GA-driven approach is provided as an extension of the basic functional verification environment prepared according to the UVM, and follows the principles of object oriented programming (OOP). Our aim was to integrate the GA components effectively so that interference with the standard architecture of UVM is minimal. Figure 2 highlights the SystemVerilog components/classes that are added to the standard UVM environment.

Figure 2. The UVM environment with GA components.

The GA component represents the core of the algorithm. It produces chromosomes (coding representations of candidate solutions) with encoded constraints that are passed through the Chromosome Sequencer and then propagated to the random application generator. The generator then generates one application per chromosome. For the Codix-RISC processor, generated applications are directly loaded to the program memory of the processor
using the Application Loader component, and the verification in simulation is started immediately.

The following is an example of simplified UVM source code of the CodixRiscChromosome class, and of the run_phase of the CodixRiscTest class.

class CodixRiscChromosome;
  real fitness;                      // Fitness value of chromosome
  rand bit [6:0] chromosome_parts[]; // Constraints for the generator

  function void crossover( inout CodixRiscChromosome chrom );
    // Temporary chromosome
    CodixRiscChromosome tmpChrom = new();
    // Position of crossover
    int main_pos = $urandom_range(CONSTRAINTS_NUMBER);

    tmpChrom = chrom.clone();
    chrom.chromosome_parts[main_pos] = this.chromosome_parts[main_pos];
    this.chromosome_parts[main_pos] = tmpChrom.chromosome_parts[main_pos];
  endfunction : crossover

  function CodixRiscChromosome mutate( int unsigned maxMutations );
    // Position of mutation
    int main_pos;
    // Number of mutations
    int mutationCount = $urandom_range(maxMutations);

    for (int i = 0; i < mutationCount; i++) begin
      // Position of mutation
      main_pos = $urandom_range(CONSTRAINTS_NUMBER);
      // Make bit flips at the position
      bitflips(this.chromosome_parts[main_pos]);
    end
    return this;
  endfunction : mutate
endclass : CodixRiscChromosome

class CodixRiscTest extends CodixRiscTestBase;
  // registration of the component with the factory
  `uvm_component_utils( codix_risc_platform_ca_t_test )

  task run_phase( uvm_phase phase );
    CodixRiscChromosome best_chromosome;
    Population p, new_p;
    int counter = 0;

    p = createOrLoadInitialPopulation();

    // Evaluate population in simulation
    evaluatePopulation(p);

    // Get best chromosome
    best_chromosome = getBestChromosome(p);

    while (counter < NUMBER_OF_GENERATIONS) begin
      // create new population
      new_p = new();

      // check elitism
      if (ELITISM) new_p[0] = best_chromosome;

      // select parents and create new chromosomes
      // by crossover, mutation
      new_p = selectAndReplace(p);

      // Evaluate new chromosomes in simulation
      evaluatePopulation(new_p);

      // Get best chromosome
      best_chromosome = getBestChromosome(new_p);

      // create final population
      p = createNewPopulation(p, new_p);

      counter++;
    end
  endtask : run_phase
endclass : CodixRiscTest
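The article does not show how evaluatePopulation() turns a simulation run into a fitness value. A minimal sketch of one possibility, using the standard SystemVerilog $get_coverage() system function (the helper function is hypothetical, not Codasip's implementation):

// Sketch only. $get_coverage() returns the aggregate covergroup
// coverage as a real number between 0 and 100, so it can serve
// directly as the fitness of the chromosome whose generated
// application has just been simulated.
function void score_chromosome( CodixRiscChromosome chrom );
  chrom.fitness = $get_coverage();
endfunction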
We compared the effectiveness of the GA optimization to two standard approaches. In the first approach, benchmark applications are used for verification of the ASIP. In the second approach, the random application generator is used. The proposed GA-driven approach also uses the random application generator, but adds the additional constraints encoded in the chromosome. The graph in Figure 3 demonstrates the results of these three approaches. It compares average values from 20 different measurements for each approach (using various seeds). The x-axis represents the number of evaluated applications on the Codix-RISC processor, and the y-axis shows the achieved level of total coverage.

The average level of coverage achieved by 1000 benchmark applications was 88 percent. The average level of coverage achieved by 1000 random applications was 97.3 percent. In the GA-driven approach, the random generator produced random applications for 20, 50, and 100 generations (the 100 generations scenario is captured in the graph). The size of the population was always the same: ten chromosomes. For the test with 20 generations, the average level of coverage was 97.78 percent. For the test with 50 generations, the average level of coverage was 98.72 percent. For the test with 100 generations, the average level of coverage was 98.89 percent.

The results show that for all tests utilizing GA optimization, we were able to achieve better coverage than with the standard approaches. It is important to mention another significant advantage of this approach. When selecting the applications generated for the best chromosome in each generation, the result is a set of very few applications with very good coverage. For example, for the GA optimization running 20 generations, we can take the best application from each of the 20 generations, and these 20 applications together achieve 98.1 percent coverage. These applications can be used for very efficient regression testing.

The computational burden of the GA optimization was as follows. Evaluating one application in simulation took an average of 12.626 seconds, generating one application took around one second, and preparing a new population took 0.095 seconds. Experiments ran on a 3.33 GHz Intel® Core™ i5 CPU with 8 GB of RAM using the Questa simulator.

SUMMARY
We have shown that a GA easily integrates into a Questa UVM verification environment, is computationally efficient, and significantly reduces the time and effort required to achieve needed coverage for ASIP-based designs.
Unit Testing Your Way to a Reliable Testbench
by Neil Johnson, Principal Consultant, XtremeEDA, and Mark Glasser, Principal Engineer, Verification Architect, NVIDIA

AN INTRODUCTION TO UNIT TESTING
Writing tests, particularly unit tests, can be a tedious chore. More tedious - not to mention frustrating - is debugging testbench code as project schedules tighten and release pressure builds. With quality being a non-negotiable aspect of hardware development, verification is a pay-me-now or pay-me-later activity that cannot be avoided. Building and running unit tests has a cost, but there is a greater cost to not unit testing. Unit testing is a proactive pay-now technique that helps avoid running up debts that become much more expensive to pay later.

Despite academics and software developers advocating the practice of writing the test suite before you write the code, this is rarely, if ever, done by hardware developers or verification engineers. This applies to design, and it is also typical in verification, where dedicating time to test testbench code is not generally part of a verification project plan. As a result, testbench bugs discovered late in the process can be very expensive to fix and add uncertainty to a project plan. Even worse, they can mask RTL bugs, making it possible for them to reach customers undetected.

Unit testing is a technique borrowed from software development. It is a low-level, yet effective verification activity where developers isolate and test small features in a class or program. By showing that individual features and functions are working correctly, developers reduce the likelihood of bugs infiltrating either subsystem or chip level testbenches. In short, unit testing can greatly reduce testbench bugs, making the entire verification process more reliable and cost effective.

There are several forces driving the need for testbench quality. One is the size of the testbench effort and another is randomized testing methodologies. Another obvious one is that the testbench is the arbiter of design quality. The quality of the product you will ultimately sell to customers is only as good as the quality of its testbench.

Early in the history of verification, testbench code was considered throw-away code. It only had to be good enough to demonstrate that the DUT was working (however "working" was defined) and that was it. Once the design went to fabrication the testbench code became expendable. In recent years, as designs have become orders of magnitude more complex and verification teams have realized the extent of the investment required to build testbenches, the need for reuse began to emerge. Code that will be in your verification library for years and used to verify many designs must be highly reliable.

Randomized testing as a driver of testbench quality is less obvious but no less significant. When much of your stimulus is randomized you cannot tell a priori what will happen in the DUT and thus what exactly will happen in the testbench. You rely heavily on the checkers and scoreboards to give you good information about the correctness of the DUT operation. In general, you are relying on the testbench infrastructure to always do the right thing in the presence of highly randomized stimuli where you are looking for interesting and hard-to-reach corner cases. Since randomized stimulus is, by its very nature, unpredictable, you have to be sure that the testbench does the right thing no matter what.

UNIT TESTING TESTBENCH COMPONENTS WITH SVUNIT
SVUnit, an open-source SystemVerilog-based unit testing framework, provides a lightweight but powerful infrastructure for writing unit-level tests for Verilog testbench code.1 It has been modeled after successful software frameworks, like JUnit, with the intention of providing a similar base level of functionality and support. While relatively new to the hardware community, neither SVUnit nor its application are novel ideas.

The SVUnit release package includes a combination of scripts and a Verilog framework. The usage model is meant to be complete yet simple; developers have everything they need to write and run unit tests with a short ramp-up period.

Code generator scripts written in Perl are used to create Verilog test infrastructure and code templates. The generated infrastructure ensures that tests are coded and reported in a consistent fashion. Users write tests within the generated templates, then use a single command line script - runSVUnit - to execute the tests. The runSVUnit script supports many popular EDA simulators including Mentor Graphics® Questa®.2
Your First SVUnit Project
SVUnit can be used to test any/all of Verilog modules, classes or interfaces. In this example, the unit under test (UUT) is a class called simple_model, which is a UVM functional model that retrieves a simple_xaction from an input channel, performs a simple transformation - multiply 'xaction.field' by 2 - and sends the modified transaction to an output channel.3 The public interface to simple_model is shown in figure 1.

Figure 1 - simple_model public interface

A unit test template can be generated by specifying the filename of the UUT as an input to create_unit_test.pl. If simple_model is defined in a file called simple_model.sv, the corresponding test template is written to simple_model_unit_test.sv.

Figure 2 - Creating a unit test template for simple_model

For unit tests to interact with the UUT, it must first be instantiated and integrated within the generated unit test template. The build() function in the template is used for this purpose. Figure 3 shows the simple_model instantiation being connected to two FIFOs, one for the input and another for the output.

Figure 3 - Instantiating and integrating a UUT

In SVUnit, tests are defined within the `SVTEST and `SVTEST_END macros. The macros are important because they let users focus on test content while forgetting about the mechanics of the underlying framework. A basic test to illustrate macro usage is xformation_test, a test that ensures the simple_model data transformation happens as expected.

Figure 4 - Simple unit test with SVTEST/SVTEST_END and FAIL_IF macros

In xformation_test, an input transaction in_tr has its field set to 2. The in_tr is then applied to the UUT via the put_port. Subsequently, the transaction is retrieved from the get_port and the out_tr.field is expected to be equal to 4, thereby verifying the simple multiply-by-2 transformation.

Notice the exit status of xformation_test is contingent on a macro called `FAIL_IF. `FAIL_IF is one of several assertion macros included with the SVUnit framework and integrated with the reporting mechanism.
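Figure 4 appears as an image in the original article; the sketch below gives the general shape of such a test. The names simple_xaction, field, put_port and get_port come from the description above, but the exact body in the figure may differ:

    `SVTEST(xformation_test)
      simple_xaction in_tr = new();
      simple_xaction out_tr;

      in_tr.field = 2;         // stimulus: field = 2
      put_port.put(in_tr);     // apply the transaction to the UUT
      get_port.get(out_tr);    // retrieve the transformed transaction

      `FAIL_IF(out_tr.field != 4)  // expect the multiply-by-2 result
    `SVTEST_END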
To run the unit test using Questa, runSVUnit is invoked from the command line.

Figure 5 - Running unit tests with Questa

Assuming simple_model performs the data transformation as expected, a passing status is reported in the log file and the simulation exits with a passing status.

Figure 6 - Passing log output

If, however, we have a bug in our simple_model and the data transformation is not happening as expected, an error and failure is reported and the simulation exits with a failing status.

Figure 7 - Failing log output

Any number of tests can be written and run within the unit test template. For example, if we add another_test to the template, both tests are run sequentially and the corresponding test status is included in the output log as shown in figure 8. The overall exit status takes both tests into account. If all tests pass, the exit status is pass. If any test fails, the exit status is fail.

Figure 8 - Passing log output for multiple unit tests

TESTSUITES AND SVUNIT AT SCALE
To enable verification engineers to easily test all the components in a testbench, SVUnit scales such that multiple templates can be run within the same executable using a single call to runSVUnit. For example, when a simple_driver component is added to the testbench, a corresponding simple_driver_unit_test template can be created and run along with the simple_model_unit_test template. With unit tests running against both components, example log output from a single simulation would appear as in figure 9 (NOTE: simple_model unit tests are labelled [simple_model_ut] and simple_driver unit tests are labelled [simple_driver_ut]).

Finally, SVUnit can also be used to simulate unit tests spread through multiple directories, still within the same executable. In this case, all templates within a directory are grouped into a testsuite (NOTE: as of SVUnit v3.6, testsuite names are derived from the directory name). The output log then reports pass/fail status for each testsuite, as well as pass/fail status for the aggregate. For example, in the case where component templates are kept in one directory while coverage classes are kept in another, components are grouped into one testsuite (__components_ts) while coverage classes are grouped in another (__coverage_ts). Testsuite names are reported in the output log. Collecting and grouping unit test templates is handled automatically by runSVUnit; no user intervention and no extra command line switches are required.
Figure 10 - Passing log output for multiple testsuites

Additional features useful as testbenches grow are the ability to specify Verilog file lists and user-specified command line arguments, as well as the ability to specify the run directory.

HOW UNIT TESTING COMPLEMENTS CURRENT BEST PRACTICES
Before you can begin to collect meaningful coverage data, a testbench must be reliable. For example, in a highly randomized environment, scoreboards and checkers are critical to ensuring that incorrect behaviors are observed (coverage) and defects are found. When scoreboards and checkers are not reliable, coverage data is thrown into question, thereby threatening to undermine the quality of the design-under-test (DUT). Testbench reliability, therefore, is critical.

Commonly, our industry produces reliability through application of a testbench in situ. That is, a testbench is completed and integrated with the DUT, then becomes incrementally more reliable as bugs are found and fixed using black-box test scenarios against the DUT. Initially, testbench reliability is extremely low. This is especially true for complex constrained random testbenches. Testbench bugs are discovered frequently; getting through testbench "bring-up" (i.e. reaching the point at which a testbench is reliable enough to properly drive and check test scenarios against the DUT) is time consuming, and reliability improves slowly.

Adding unit testing as a prerequisite to subsystem and chip level verification is a complement - not a replacement - to current best practices like constrained random verification. By first unit testing to reduce defect rates, initial reliability improves significantly. This is particularly valuable when it comes to verifying that scoreboards and checkers respond correctly to the odd corner cases you work so hard to generate in the DUT. In testing them outside of a production environment, verification engineers can artificially drive those corner cases into the checkers to see if they produce the correct response.

With high-reliability components unit tested in isolation, testbench "bring-up" proceeds much more quickly and teams are meaningfully exercising a DUT much earlier than they would be otherwise.

Aside from the addition of an open-source framework like SVUnit and a simulator, there are no other tool or licensing requirements. Nor are teams required to replace existing practices. In short, unit testing is a cheap, low-risk, high-reward complement to existing best practices.

LESSONS LEARNED FOR VERIFICATION ENGINEERS
Shortly after you start coding a project you need to verify your assumptions. You ask "does the code work more or less the way I think it should?" It can be difficult to answer this question until you have a fair amount of code in place. It's important to create some confidence that your code is viable before you write too much. You want to get some feedback on your code and avoid rework. You can compile the code to make sure it is at least self-consistent from the compiler's perspective. But that doesn't tell you if anything actually works. You have to build some sort of working example, which at the early coding stages can be time consuming and seem tangential to the work at hand. The alternative is to build a series of small programs to instantiate and exercise your classes. This can be a tedious exercise because you have to create all of the infrastructure necessary to make a complete, working program for each
test. SVUnit can help with this phase of testing. It will generate the infrastructure code so that all you have to do is supply the interesting parts of the test. You can build a small test or two at first and keep adding on.

The availability of SVUnit becomes an encouragement to write tests as you write the code. As you finish a tricky piece of code and want to know whether the search works correctly or a loop terminates after the proper number of iterations, you can quickly write a few lines or maybe a few tens of lines of code to try it out. This greatly increases confidence in your new code and does not detract from the coding effort.

As coding proceeds and you are getting closer to being done, the little tests that you have created serve as a regression suite to let you know if you've screwed up anything. When you do some refactoring to simplify code you can see if you did it correctly or if you have left anything out. Toward the end of the coding work you already have a pretty good test suite in hand. You can review the tests you've written and fill in any holes you may have missed -- unit-level functionality that has not yet been exercised in your test suite. Later, when you go to build real examples in which the classes will be used in concert to perform complete operations, you have a fairly high degree of confidence that the unit-level functionality is in good shape.

One aspect of SVUnit that can be annoying is that the generated unit test skeletons assume you only need to instantiate a single class -- the one for which you generated the skeleton. If the class has dependencies, if it is part of a library, for example, you will have to hand edit the first few lines of the skeleton to include and/or import the pieces you need to enable the class you are testing to compile and run. In my work (Mark) this editing was fairly simple and took only a few minutes for each unit test.

The real benefit of SVUnit is that it encourages you to write unit tests. Writing unit-level tests is simple and straightforward. The generator scripts create fairly lightweight skeletons that can be easily filled in with arbitrary unit tests. The early feedback about your code that you get is so valuable that you find yourself wanting to write tests instead of avoiding an otherwise tedious task.

SUMMARY
Spending time on testing the units that comprise your testbench is time well spent. SVUnit, a SystemVerilog unit testing framework built in the mold of JUnit, gives you the tools to start writing unit tests fairly quickly and easily. There's no longer an excuse to avoid building unit tests. So what are you waiting for?

You can download the complete open source SVUnit framework, including examples, from SourceForge at http://sourceforge.net/projects/svunit.

ENDNOTES

1 SVUnit download information and examples can be found at http://www.agilesoc.com/svunit
2 Mentor Graphics® Questa®, Cadence® Incisive®, Aldec Riviera™ PRO and Synopsys® VCS® are supported as of SVUnit version 3.6
3 simple_model and other examples are included in the SVUnit release package
Hardware Emulation: Three Decades of Evolution – Part II
by Dr. Lauro Rizzatti, Verification Consultant, Rizzatti LLC

THE SECOND DECADE
In the second decade, the hardware emulation landscape changed considerably, with a few mergers and acquisitions and new players entering the market. The hardware emulators improved notably via new architectures based on custom ASICs. The supporting software improved remarkably and new modes of deployment were devised. The customer base expanded outside the niche of processors and graphics, and hardware emulation slowly attracted more and more attention.

While commercial FPGAs continued to be used in mainstream emulation systems of the time (i.e., Quickturn, Zycad and IKOS), four companies — three startups plus IBM — pioneered different approaches.

IBM continued the experimentation it started a decade earlier with the YSE and EVE. By 1995, it had perfected its technology, based on arrays of simple Boolean processors that processed a design data structure stored in a large memory via a scheduling mechanism. The technology was now applicable to emulation. While IBM never launched a commercial product, in 1995 it signed an exclusive OEM agreement with Quickturn that gave the partner the right to deploy the technology in a new emulation product.

By then, Quickturn had grown disappointed with the difficulties posed by the adoption of commercial FPGAs in an emulation system. To reach adequate design capacity, it was necessary to interconnect many hundreds of FPGAs mounted on several boards. Partitioning and routing such a huge array of FPGAs became a challenging task, with setups on the order of many months. Design visibility had to be implemented through the compilation process, which competed for routing resources with the DUT and killed fast design iterations. Finally, the system did not scale linearly with the increase of design size, suffering significant performance drops.

The IBM technology promised to address all of these shortcomings:

• Very slow setup and compilation time
• Rather poor debugging capabilities
• Significant drop in execution speed at the increase of design size

A drawback of that technology not appreciated at the time was potentially higher power consumption than in the FPGA approach for the same design capacity.

In 1997, Quickturn introduced the Concurrent Broadcast Array Logic Technology (CoBALT) emulator, based on the IBM technology, that became known as a processor-based emulator.

In 1998, Cadence® purchased Quickturn and over time launched five generations of processor-based emulators under the name of Palladium®. Two or so years later, Cadence discontinued the FPGA-based approach, including an experimental custom-FPGA-based emulator called Mercury Plus.

The idea of developing a custom FPGA targeted at emulation came from a French startup by the name of Meta Systems.1 Conceived as a programmable device similar to an FPGA but customized for emulation applications, the Meta custom FPGA would have been a poor choice as a general-purpose FPGA. Its fabric included configurable elements, a brilliant interconnect matrix, embedded multi-port memories, I/O channels, a debug engine with probing circuitry based on on-board memories, and clock generators.

The approach yielded three benefits:

• Easy setup time and fast compilation time
• Total design visibility without compilation
• Scalability with the increase of design size

In fact, the Meta custom FPGA provided the same benefits as the processor-based approach with less power consumption.

The processor-based approach was not unique to IBM. It was also used by Arkos, a startup with the lifespan of a falling star in a clear August night. After being acquired by Synopsys® in 1996, it was sold soon after to Quickturn.

In the course of the second decade, significant progress was made in several aspects of the hardware emulator. For example, by the mid-2000s, design capacity increased more than 10-fold to 20+ million ASIC-equivalent gates in a single
chassis. By then, all vendors supported multi-chassis configurations that expanded the total capacity to well over 100 million gates. Speed approached the threshold of 1MHz. Multiple concurrent user capabilities began to show up in datasheets.

Major enhancements were made in the supporting software. The compiler technology saw progress across the board. The two popular HDL languages, Verilog and VHDL, were supported. Synthesis and partitioning were improved.

New modes of deployment were concocted, in addition to ICE. It was now possible to connect an HDL testbench running on the host PC to a DUT mapped inside the high-speed emulator. This approach leveraged the existing RTL/HDL testbench and eliminated the need for the external rate adapters necessary to support ICE. It became known as simulation acceleration mode. As good as it sounded, it traded speed for flexibility. The weak link was the PLI interface between the simulator in charge of the testbench and the emulator in charge of the DUT. Typically, the acceleration factor was limited to a low single digit.

To address this drawback, IKOS2 pioneered a new approach called transaction-based acceleration or TBX3. TBX raised the abstraction level of the testbench by moving the signal-level interface to the emulated DUT within the emulator and introducing a transaction-level interface in its place. The scheme achieved up to a million times faster execution speed, and simplified the writing of the testbench.

Another mode of deployment, called targetless emulation, consisted of mapping the testbench together with the DUT onto the emulator. By removing the performance dependency on the software-based testbench executed on the host PC, it was possible to achieve the maximum speed of execution allowed by the emulator. The caveat was that the testbench had to be synthesizable, hence the name of Synthesizable Testbench (STB) mode.

Debugging also improved radically. One of the benefits of the processor-based emulators, as well as of the custom FPGA-based emulators, was 100% visibility into the design at run-time without requiring compilation. This led to very fast iteration times.

The cost of emulation decreased on a per-gate basis by 10X.

By the turn of the century, it seemed that emulators built on arrays of commercial FPGAs were destined for the dust bin. But two startups proved that premise to be false.

Although only a few years had passed since Quickturn's dreadful experience with commercial FPGAs, a new breed of FPGAs developed by Xilinx and Altera changed the landscape forever. Fully loaded with programming resources and enriched with extensive routing resources, they boasted high capacity, fast speed of execution and faster place & route time. The Virtex® family from Xilinx also included a read-back mechanism that provided full visibility of all registers and memory banks without requiring compilation. This capability came at the expense of a dramatic drop in speed during the read-back operation. All of the above were a windfall for two new players.

In 1999, Axis4, a startup in Silicon Valley led by entrepreneurs from China, introduced a simulation accelerator based on a patented Re-Configurable Computing (RCC) technology that provided accelerated simulation. The technology was implemented in an array of FPGAs called Excite. This was followed by an emulator built on the same technological foundation with the name Extreme. Extreme became successful for the ability to swap a design from the emulator onto a proprietary simulator to take advantage of the debugging interactivity of the simulator. This feature was called Hot-Swap.

On the other side of the Atlantic, a French startup named Emulation Verification Engineering (EVE), led by four French engineers who left Mentor Graphics in 2000, developed an emulator implemented on a PC card with two of the largest Xilinx Virtex-6000/8000 devices. The product name was ZeBu, for Zero-Bugs. The implementation did not support ICE. Instead, it promoted transaction-based emulation based on a patented technology called "Reconfigurable Testbench" (RTB). The team also harnessed the read-back feature of the Virtex devices to implement 100% design visibility at run-time without compilation. As mentioned, the drawback was a drop in performance during the reading process.

Architectures: Arrays of processors, custom FPGAs, commercial FPGAs
Total Design Capacity: Over 100 million gates (*)
Deployment Modes: ICE, Simulation Acceleration, TBX, STB
Speed of Emulation: Up to 1MHz
Time to Emulation: Up to 30MG/hour (**)
Ease of Use: Medium
Deployment Support: Limited
Concurrent Users: Yes; maximum number dependent on the emulator
Dimensions: Similar to small home refrigerators
Reliability (MTBF): Several weeks
Typical Cost: 10 cents/gate

(*) Based on multi-box configurations
(**) Requires a single PC with a processor-based emulator; PC farms with FPGA-based emulators

By the end of the second decade, for the first time hardware emulation was being considered by companies outside its traditional core use of processors and graphics designs. Now designs in fields as different as embedded processors, networking, storage, video, multimedia, etc., started to adopt hardware emulation.

ENDNOTES
1. Meta Systems was acquired by Mentor Graphics
in 1996.
2. IKOS was acquired by Mentor in 2002.
3. Today, different vendors call it Transaction-based
verification (TBV) or transaction-based acceleration
(TBA).
4. Axis was acquired by Verisity on November 16, 2004.
Three months later, Cadence purchased Verisity.

Accelerating RTL Simulation Techniques
by Lior Grinzaig, Verification Engineer, Marvell Semiconductor Ltd.

Long simulation run times are a bottleneck in the verification process.

A lengthy delay between the start of a simulation run and the availability of simulation results has several implications:

• Code development (design and verification) and the debug process are slow and clumsy. Some scenarios are not feasible to verify on a simulator and must be verified on faster platforms — such as an FPGA or emulator — which have their own weaknesses.
• Long turn-around times.
• Engineers must make frequent context-switches, which can reduce efficiency and lead to mistakes.

Coding style has a significant effect on simulation run times. Therefore it is imperative that the code writer examine his/her code, not only by asking the question "does the code produce the desired output?" but also "is the code economical, and if not, what can be done to improve it?" The following discussion presents some useful methods for analyzing code based on these questions.

MICRO CODE MODIFICATIONS
Sensitivity Lists/Triggering Events
The key thing to remember about a sensitivity list at an always block or a trigger event at a forever block is that when the trigger occurs, the simulator starts to execute some code. This is trivial, of course, but by asking the second question — instead of only the functional one — engineers can make the code more economical. In this sense, it is desirable to determine when a signal can be exempt from the sensitivity list or which event should be chosen for triggering.

Synchronous example:
Consider the following example (counting the number of transactions over a synchronous bus):

always @(posedge clk)
begin
  if (ARVALID == 1)
  begin
    count++;
    wait (ARREADY == 1);
  end
end

The above code is a trivial way to think about how to implement the counting code. Notice, however, that during the time the clock is toggling and no transactions are present on the bus, the if condition is unnecessarily checked, over and over again.

Now consider the following adjusted code:

initial begin
  forever
  begin
    wait (ARVALID == 1)
      count++;
    wait (ARREADY == 1);
    @(posedge clk);
  end
end

You can see that this code, although less trivial, functionally counts the same thing, but much more efficiently with respect to the number of calculations needed.

Asynchronous example:
The following is taken from actual code found inside one of our IPs. It is the BFM code of an internal PHY. For this example, the code has been edited to use only eight phases; the original code included 128 phases.

wire IN0 = IN;
wire #(25) IN1 = IN0;
wire #(25) IN2 = IN1;
wire #(25) IN3 = IN2;
wire #(25) IN4 = IN3;
wire #(25) IN5 = IN4;
wire #(25) IN6 = IN5;
wire #(25) IN7 = IN6;
always @(*)
begin
  case ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0})
    3'd0 : OUT = IN0;
    3'd1 : OUT = IN1;
    3'd2 : OUT = IN2;
    3'd3 : OUT = IN3;
    3'd4 : OUT = IN4;
    3'd5 : OUT = IN5;
    3'd6 : OUT = IN6;
    3'd7 : OUT = IN7;
  endcase
end

Examining this code carefully shows that for each change of IN, the always block is invoked eight times. This is due to the cascading changes of the INx signals: IN0 changes at "t" and invokes the always block, which processes the case logic; then IN1 changes at "t+25" and invokes the always block again, and so on, until IN7 invokes it at "t+175". Remember that the code originally supported 128 phases, so for each change there were 128 invocations. The case itself was composed of 128 options, and this module was implemented on every bit of the PHY's 128-bit bus!

This resulted in a complexity magnitude of ~M*N² (where M is the number of bits in the bus, and N is the number of phases).

Now, consider this adjusted code:

wire IN0 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd0);
wire #(25) IN1 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd1);
wire #(50) IN2 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd2);
wire #(75) IN3 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd3);
wire #(100) IN4 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd4);
wire #(125) IN5 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd5);
wire #(150) IN6 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd6);
wire #(175) IN7 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0} == 'd7);

always @(*)
begin
  case ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0})
    3'd0 : OUT = IN0;
    3'd1 : OUT = IN1;
    3'd2 : OUT = IN2;
    3'd3 : OUT = IN3;
    3'd4 : OUT = IN4;
    3'd5 : OUT = IN5;
    3'd6 : OUT = IN6;
    3'd7 : OUT = IN7;
  endcase
end

Based on the assumption that the delay configuration is not reconfigured simultaneously with the module's functional data flow, we have reduced the code complexity to M*N. Actually, if we do not care in our simulation about the "analog delay" on the bus, we can simply write OUT = IN and reduce the complexity to M only.

This simple code change alone accelerated our full-chip tests (an SoC of ~40M gates) by a factor of two!

Wrong or inefficient modeling:
The following code is a small part of a memory model.

always @(negedge rst_n)
begin
  if (!(rst_n))
    for (i=0; i<(2<<20); i++)
      mem[i] = 0;
end

This example code seems fine, but it can actually be optimized as well.
This is a large array, and looping over its roughly two million entries will take a long time. Fortunately, this time can be saved during the initial reset of the chip (before the memory is filled) by masking the first reset negedge, as the array is already filled with zeros. Beyond that, however, a different approach can be applied: using an associative array type instead of a fixed array enables the array to be nullified with one command, instead of by using a loop.
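A minimal sketch of both optimizations (the names, the 32-bit word size and the assumption that mem_written is set by the write path are illustrative, not taken from the original model):

// Associative array indexed by address: unwritten entries cost nothing,
// and delete() clears the whole array in one call instead of a loop.
logic [31:0] mem [int unsigned];
bit mem_written;                   // set by the write path after the first store

always @(negedge rst_n) begin
  if (!rst_n && mem_written) begin // skip the very first reset: the
    mem.delete();                  // array is still empty anyway
    mem_written = 0;
  end
end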
MACRO CODE MODIFICATIONS
Using Different Code for Development and Debug
In a design or a verification environment that is composed of many different components, where not all of them have to be active all the time, it may be beneficial to eliminate parts of the code that are not essential for the majority of the tests.

If the code is a module in the design itself, a "stub" or a simple BFM can be created to replace that module, and then a generate if block with an else option is added. Depending on the global parameter added at simulation time, it will generate the real code or the economical code. Engineers can decide when to use which option. If they want a specific test to always use the economical code, it is added to the regular run command. Alternatively, if they want to support an economical model during "debug mode" only, the parameter is set only when debugging or developing, but not when running the full regression.
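A minimal sketch of this generate if pattern (the module and parameter names are illustrative); in Questa, for instance, such a parameter can be overridden on the run command with the -g/-G switches:

module phy_wrap #(parameter bit USE_ECONOMY = 0)
                 (input logic clk, input logic d, output logic q);
  generate
    if (USE_ECONOMY) begin : g_econ
      assign q = d;              // economical stub: simple pass-through
    end else begin : g_full
      always_ff @(posedge clk)   // stands in for the real, expensive
        q <= d;                  // RTL implementation
    end
  endgenerate
endmodule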
If the code is not in the design, engineers can use the generate if method as well, or simply add a parameter that interacts with the testbench component directly to disable the component. Alternatively, if the code is of a class type, the parameter should be used to prevent its creation.

For example, in the universal verification methodology (UVM), the configuration object of an agent can hold information indicating whether the agent should skip the creation of a monitor.
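A minimal sketch of that idea (the class names are illustrative; my_monitor is assumed to be defined elsewhere in the environment):

class my_agent_cfg extends uvm_object;
  `uvm_object_utils(my_agent_cfg)
  bit create_monitor = 1;   // cleared for economical "debug mode" runs
  function new(string name = "my_agent_cfg");
    super.new(name);
  endfunction
endclass

class my_agent extends uvm_agent;
  `uvm_component_utils(my_agent)
  my_agent_cfg cfg;
  my_monitor   mon;
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    if (!uvm_config_db#(my_agent_cfg)::get(this, "", "cfg", cfg))
      `uvm_fatal("NOCFG", "agent configuration not set")
    if (cfg.create_monitor)   // skip the monitor when it is not needed
      mon = my_monitor::type_id::create("mon", this);
  endfunction
endclass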
System Modes
In some cases, leaving some modules in a reset state, or with no clock, is a valid design mode. Even when it is not a valid system mode, if the test(s) are not affected by it, forces can be used to override the normal behavior. Again, this can be controlled by a parameter.

In a design with a complex clock scheme, engineers may try to find the best clock ratios that are relevant for that type of test. If the test depends on cores, it may help to increase the core clock frequency. If the test depends on DMA activity, the core frequency can be reduced when the core is idle. It is good practice to choose ratios that are used by default for most of the tests and make adjustments for only specific ones.

PERFORMANCE ANALYZING TOOLS
A problem with trying to optimize the performance of the simulator is that it is often impossible to know exactly where the bottlenecks are and what is slowing down the simulator. As shown in the above code examples, sometimes even slight and seemingly negligible code can have a big effect on the entire simulation.

Performance analyzing tools, provided by the simulator vendors, are used to identify those parts of the code that consume the most cycles of the simulator. As a side note, since there is a correlation between the usage of the simulator calculation resources and the power consumption of the chip, it is sometimes even possible to find design bugs in the early stages of the project.

Simulation Phases
When using analysis tools, it is best to perform different analyses for the different stages of the simulation (i.e., different time-frames). These stages include the out-of-reset phase, the configuration phase, and the run phase (which can be further sub-divided). Using small time frames produces a more accurate analysis per simulation stage, since different parts of the design are active at these different stages, thereby consuming different simulator resources. Conversely, examining the simulation run globally makes it harder to analyze the design for anomalies.

Acceleration Measurements
After finding the critical components in the code that affect simulation time and finding the right solution for them, it is recommended to measure the benefit from those optimizations. Here are some tips regarding those measurements:
• When measuring, know exactly what is being measured.
For example, when measuring simulation run time,
do not include the time required to load the simulator
software and do not include the time taken to load the
design code into the simulator. Those times may be
important, and may be optimized as well, but they are
irrelevant for this type of calculation.
• Make true comparisons. Compare A to A, not A to B.
• Random elements. Usually, engineers disable
random elements by using the same random seed
to simulate the same scenario. However, there
are cases where a change itself is the cause of
a different random generation result. For such
cases, using the same seed is not recommended;
instead, continue using random seeds along with
a statistical analysis method as described below.
• Statistical analysis. The run-time can be affected
by other things beyond the content of the code; for
example, server usage by other processes or data
in the server’s cache. For a good comparison, run
the compared test several times (20–30), with and
without the change, and compare the average
times. Also check that the standard deviation
is reasonable (around 20% of the average). If
there are abnormal measurements, restart the
process or throw away specific “off the chart”
measurements that may be the result of some
specific server problem.

CONCLUSION
Slow simulations are not necessarily decreed by fate. Engineers and managers alike should pay attention to the importance of coding economically for simulation, and to the different ways of analyzing simulations and tackling simulation bottlenecks.

Emulation Based Approach to ISO 26262
Compliant Processors Design
by David Kaushinsky, Application Engineer, Mentor Graphics

I. INTRODUCTION
All types of electronic systems can malfunction due to external factors. The main sources causing faults within electronic components are radiation, electromigration and electromagnetic interference. The evaluation of a fault-tolerant system is a complex task that requires the use of different levels of modeling. Compared with other possible approaches such as proving or analytical modeling, fault injection is particularly attractive.

Fault injection is a key requirement of functional safety standards like the automotive ISO 26262 and is highly recommended during hardware integration tests to verify the completeness and correctness of the safety mechanisms implementation with respect to the hardware safety requirements. 1, 2

This article reviews the use of processors in the automotive industry, the origin of faults in processors, architectures of fault tolerant processors and techniques for processor verification with fault injection.

We then propose an emulation based framework for performing fault-injection experiments on embedded processor architectures.

II. THE USE OF ELECTRONICS IN THE AUTOMOTIVE INDUSTRY
The integration of electronics in automobiles began during the 1970s and became well established by the 1980s. Today's top-of-the-line vehicle uses over 80 microcontrollers, and an even greater number of power semiconductors and smart power ICs. 3, 4

1. Electronic Driver-Assisting Systems
In these cases the existing mechanical systems are supported by electronics. Examples include the antilock braking system, traction control system, electronic stability program and brake assist. This is a fail-safe design, which means that at least a basic part of the system's functionality is provided in case the electronic system fails.

2. Drive-by-Wire Systems with Mechanical Backup
Examples include electric power steering, electronic braking and throttle systems. While these systems are electronically controlled and operated, there is still a mechanical backup system if an electrical problem develops, which makes the system fail-safe.

3. Drive-by-Wire Systems without Mechanical Backup
The successful use of fly-by-wire systems in aviation, along with the positive experience of drive-by-wire systems with mechanical backup for braking and power steering, has led to the development of complete drive-by-wire systems that reduce the cost of a vehicle.

4. Higher-Level Automotive Control
These are systems that may influence several other basic systems on the vehicle. As an example, adaptive cruise control allows a vehicle to maintain a certain gap between itself and the vehicle in front, or to maintain a speed previously set by the driver, by controlling both the engine and the brakes.

III. THE ORIGIN AND MITIGATION OF FAULTS IN PROCESSORS
The main causes of faults within electronic components are radiation, electromigration and electromagnetic interference. 4

1. Radiation Effects on Integrated Circuits
One source of alpha particles is the radioactive impurities found mainly in the package materials, and to a lesser extent in the materials used for the fabrication of the semiconductor device, with uranium and thorium having the highest radioactivity among them. The second source comes from extraterrestrial cosmic rays that bombard the earth's surface. The cosmic rays mainly consist of protons, neutrons, pions and muons with different energies. Cosmic rays also include particles that originate from the sun, with relatively low energies. When they penetrate the Earth's atmosphere, some start reacting with other particles of the atmosphere; thus the levels of radiation depend heavily on
the altitude. Approximately 1% of the cosmic rays' neutrons reach the Earth's surface, and they present a very wide energy spectrum.

2. EMC for Integrated Circuits
There are two main issues concerning electromagnetic compatibility (EMC) and ICs. The first is the electromagnetic energy emission of ICs while they are operating, and the second is the susceptibility of ICs to electromagnetic waves from the operational environment. With the ever increasing use of electronic systems, the electromagnetic environment becomes more complex, making the EMC requirements of systems more challenging to meet.

Crosstalk between the metal lines within the chip is also a significant source of errors, especially in multi-layered chips.

3. Electromigration
Over a period of time, the flow of electrical current through metal tends to displace metal ions. In some places voids open up in the wires, leading to open circuits, and in other places ions are deposited, causing shorts. This phenomenon is known as electromigration.

4. Impact of New CMOS Technologies
The stored charge that is used within digital circuits to represent data has decreased dramatically in recent years as a direct result of the decreased power supply. This means that the critical charge has also decreased, with a negative impact on the error susceptibility to particle hits.

As frequency increases, the errors observed will be dominated by transient faults originating from combinational logic rather than single event upsets (SEU) on sequential logic. The increasing frequency will also tend to increase the occurrence of multiple-bit errors, since the duration of the transient pulse may overlap more than one clock edge.

Higher currents flow through the power supply lines, consequently increasing the susceptibility to electromigration. Furthermore, the increased number of metal layers makes crosstalk between the interconnection lines more probable as the distance between them decreases.

IV. ARCHITECTURE OF FAULT TOLERANT PROCESSORS
Processors for fault tolerant applications are typically required to achieve the following targets: high performance, low cost, low power dissipation, and reliability. The problem is that most available processors and integrated systems-on-chip achieve only some of the targets and fail on others. This is indicated below when relative advantages and disadvantages are listed, and exemplified in later sections. The following sections survey different industry approaches to this tradeoff. 5

1. Radiation Hardened (RH) Processors
In this approach processors are fabricated on dedicated RH processes. Advantages include high tolerance to radiation effects, thanks to the RH process. In some cases, such processors achieve high performance. This can be especially true when using custom design methods similar to those employed for the design of COTS high-performance processors. This approach can offer a high level of integration, including the inclusion of special I/O controllers dedicated to fault tolerant applications.

Disadvantages of using RH processes include: high cost — RH processes have limited use and the high price of a modern fab; not widely available — there are only about two RH fabs in the USA and no similar advanced processes elsewhere, and use of the RH processes in the USA is International Traffic in Arms Regulations (ITAR) controlled and is not widely available to non-USA customers; lags several generations behind commercial off-the-shelf (COTS) processors in terms of performance and power — typical RH processes are based on 150nm CMOS, while high-end COTS processes belong to the 28nm generation, about six processing generations more advanced; SEU rate getting worse — the RH process enables a fixed SEU rate per bit, but as the chips become more advanced and contain more memory and more flip-flops, the total SEU rate per chip is higher.

2. Radiation Hardening by Design (RHBD) Processors
In this approach radiation hardness is achieved by design techniques in the layout, circuit, logic and architecture areas, hence the name. Advantages include high tolerance to radiation effects and medium cost — more expensive
than COTS processors, mostly due to low production quantities and the high cost of qualification, but at the same time less expensive than RH processors thanks to using a regular commercial fabrication process. Finally, RHBD processors can offer high integration, since they are designed as ASICs and since typically the CPU itself takes only a small portion of the silicon die.

The disadvantage is that RHBD processors are usually slower than COTS processors, since they are designed as ASIC chips and not as custom processors.

3. Single COTS Processor with Time Redundancy (SIFT)
In this approach, a single COTS processor is used together with Software Implemented Fault Tolerance (SIFT), which executes the entire software or certain software sections twice or more. There are two levels of granularity: instruction level redundancy, where each instruction is executed twice and additional instructions compare the results, requiring compiler transformation of the code; and procedure level redundancy, where the programmer writes the code to invoke certain procedures twice, compare the results and use software for recovery in case of mismatch. The latter approach may also require some additional hardware to protect the critical data and the critical software. The main advantage of this approach is that it is relatively inexpensive.

The disadvantage is the major performance penalty due to the computational overhead.

4. Duplex COTS: DMR
This architecture employs two equal COTS processors (aka Dual Modular Redundancy, DMR), matching hardware, and software for recovery from mismatches. There is no voting, as there are only two copies of execution. On mismatch, the computation is cancelled and repeated by software control. DMR offers high performance and is relatively inexpensive.

The disadvantages are that DMR requires special hardware and software for matching and recovery, and that modern COTS processors are sometimes unpredictable at the clock cycle level, due to methods of internal branch speculation and other algorithms that are designed to boost performance. Forcing two such processors to execute in lock-step every clock cycle may require significant slowdown of the processors.

5. Triple COTS: TMR at the System Level
Triple Modular Redundancy (TMR) architectures combine three COTS processors and voting logic. The processors do not need to be stopped on an SEU. TMR offers high performance and high SEU tolerance.

The disadvantage of TMR is high cost, requiring large area and power, as well as special hardware for voting and usually additional hardware and software for recovery from internal SEU errors (inside the processors) that cannot be fixed by voting and require scrubbing or reset.
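The voting element itself can be a simple bit-wise two-out-of-three majority function, as in this minimal sketch (the module name and width are illustrative):

// 2-of-3 majority voter: the output follows any two agreeing inputs,
// masking a fault in the third; mismatch flags a disagreement so that
// recovery (scrubbing or reset) can be triggered.
module tmr_voter #(parameter int W = 32)
                  (input  logic [W-1:0] a, b, c,
                   output logic [W-1:0] y,
                   output logic         mismatch);
  assign y        = (a & b) | (b & c) | (a & c);
  assign mismatch = (a != b) || (b != c);
endmodule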


6. TTMR on COTS VLIW Processors
COTS VLIW processors execute multiple instructions in parallel, and the parallel instruction streams are pre-programmed. Each instruction can be executed three times and the results can be compared and voted, all within the same VLIW processor. TTMR offers high performance (in fact, TTMR processors are the fastest available space processors today) and high SEU tolerance, thanks to the embedded TMR mechanism, but it is expensive, is limited to VLIW processors, and is hard to generate code for. The code executes two copies of an instruction and compares the results; on a mismatch it executes the same instruction a third time and compares for majority voting.

V. PROCESSOR VERIFICATION WITH FAULT INJECTION TECHNIQUES
Fault injection (FI) is the deliberate change of the state of an element within a computer system. FI is critical in the development of fault tolerant systems. FI techniques are mainly used to assess the effectiveness of fault and error detection mechanisms and to help predict the system's error rate. 6

1. Software Implemented Verification (SWIFI)
The use of SWIFI tools has been very popular, mainly because they are easy to implement and adapt to a target system. They are also cost-effective since they do not
require extra hardware. Furthermore, they are usually fast, since they do not introduce significant delay in the execution of the target applications.

2. Simulation Based Verification
Simulation is favored since it allows the testing of fault tolerant systems very early in the design stage. If an HDL description of the system is available, testing through simulation can be performed in great detail, and it is potentially very accurate since it gives realistic emulation of faults and detailed monitoring of their consequences on the system.

3. Physical Level Validation
Injection of physical faults on the actual target system hardware can be achieved through pin-level fault injection, heavy-ion radiation, electromagnetic interference and laser fault injection. The major advantage of these approaches is that the environment is realistic and the results obtained can give accurate information on the behavior of the system under such conditions.

4. FPGA Based Verification
This technique can allow the designer to study the actual behavior of the circuit in the application environment, taking into account real-time interactions. However, when an emulator is used, the initial VHDL description must be synthesizable.

5. Emulation Based Verification
Emulation enables pre-silicon fault injection and debug at hardware speeds, using real-world data. Scenarios of real-time software and hardware fault injection debug with simulation-like visibility are achieved.

The advantages and disadvantages of these techniques can be summarized as follows:

Physical Level
Advantages:
• Fast
• Can access locations that are hard to access by other means
• High time-resolution for hardware triggering and monitoring
• Well suited for low-level fault models
• Not intrusive
• No model development or validation required
• Able to model permanent faults at the pin level
Disadvantages:
• Risk of damage to the system under test
• Low portability and observability
• Limited set of injection points and limited set of injectable faults
• Requires special hardware
• Debug is hard
• Limited coverage

Software
Advantages:
• Can be targeted to applications and operating systems
• Experiments can be run in near real-time
• No specific hardware
• No model development or validation required
• Can be expanded for new classes of faults
Disadvantages:
• Limited set of injection instants
• Cannot inject faults into locations that are inaccessible to software
• Requires instrumentation of the source code
• Limited observability and controllability
• Difficult to model permanent faults

Simulation
Advantages:
• Supports all abstraction levels
• Non-intrusive
• Full control of both fault models and injection mechanisms
• Does not require any special-purpose hardware
• Maximum observability and controllability
• Allows performing reliability assessment at different stages in the design process
• Able to model both transient and permanent faults
Disadvantages:
• Slow
• Model is not readily available
• No real-time faults
• Coverage is limited

FPGA Prototype
Advantages:
• Injection time is faster than with simulation based techniques
• The experimentation time can be reduced by implementing the input pattern generation in the FPGA; these patterns are already known when the circuit to analyze is synthesized
Disadvantages:
• High effort of partitioning and synthesis, and limited signal visibility, resulting in long debug cycles
• Intrusive instrumentation techniques
• Tests the functional behavior of the injected fault only
• Unanticipated behavior analysis is hard

Emulation
Advantages:
• Supports most abstraction levels
• Supports netlists
• Real-time faults
• Full coverage is achievable
• Mostly non-intrusive
• Full control of both fault models and injection mechanisms
• Maximum observability and controllability
• Allows performing reliability assessment at different stages in the design process
Disadvantages:
• Requires special hardware and expertise
• Analog design is not directly supported
• Fault injection testbench overhead can hamper acceleration performance
VI. A FRAMEWORK FOR VELOCE® BASED FAULT INJECTION
We suggest the following scheme for implementing a generic fault injection system using the Veloce emulator.

1. Faults, Errors, Failures
a) Fault — A fault is a deviation in a hardware or software component from its intended function. Faults can be categorized into permanent and transient faults by their duration.
b) Error — An error is the manifestation of a fault on the observed interfaces.
c) Failure — A failure is defined as the deviation of the delivered service from the specified service.

Figure 1: Fault, errors, system failures

2. Flow
1. Setup phase
   a. Determine an injection fault distribution function: Uniform Random (UR), Activity Based Random (ABR), or Manual Direct (MD).
   b. For ABR:
      i. Run a test and measure activity with the Switching Activity Interchange Format (SAIF).
      ii. Extract a subset of FFs for fault injection.
   c. Create a fault injection DB.
   d. Instantiate in the top level both a Golden Model (GM) DUT and a Fault Injected Model (FIM) DUT.
2. Emulation phase (the injection step itself is sketched after this flow)
   a. Run the Golden Model (GM) and capture all interface signals to a log.
   b. Run the Fault Injected Model (FIM) and capture interface signals to a log.
   c. Post-process: compare GM vs. FIM.
3. Evaluation phase
   a. Analyze results.
   b. Create reports.
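A minimal sketch of the injection step, assuming a testbench-driven transient fault (the hierarchical paths, clock and flop names are illustrative, and the Veloce-specific run control is not shown):

// Flip one flip-flop in the fault injected model (FIM) for a single
// clock cycle, emulating an SEU, at an instant taken from the fault
// injection DB.
task automatic inject_seu(input int unsigned inject_cycle);
  logic flipped;
  repeat (inject_cycle) @(posedge tb_top.clk);  // injection instant
  flipped = ~tb_top.fim_dut.u_core.state_q;     // sample, then invert
  force tb_top.fim_dut.u_core.state_q = flipped;
  @(posedge tb_top.clk);
  release tb_top.fim_dut.u_core.state_q;        // fault lasts one cycle
endtask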
VII. CONCLUSIONS
Reliability and safety are of major importance to the introduction of automotive drive-by-wire ISO 26262 compliant systems. Their required high safety integrity necessitates that all electronic components be fault tolerant with regard to failures in electronic hardware and software. Fault-tolerant processor properties can be obtained primarily by static or dynamic redundancy, leading to systems that are fail-operational for at least one failure.

The comparison of different fault injection techniques leads to the conclusion that the emulation based approach has key advantages for achieving the goals required for fault tolerance.

VIII. REFERENCES
1. ISO 26262, Road vehicles – Functional safety – Part 5: Product development at the hardware level.
2. ISO 26262, Road vehicles – Functional safety – Part 10: Guideline.
3. E. Touloupis, "A fault tolerant microarchitecture for safety-related automotive control," Doctoral Thesis, 2005. https://dspace.lboro.ac.uk/2134/14402
4. R. Isermann, R. Schwarz, and S. Stolzl, "Fault-tolerant drive-by-wire systems," IEEE Control Systems Magazine, 22(5):64–81, Oct 2002.
5. R. Ginosar, "A survey of processors for space," in Data Systems in Aerospace (DASIA), Eurospace, May 2012.
6. H. Ziade, R. Ayoubi, and R. Velazco, "A Survey on Fault Injection Techniques," The International Arab Journal of Information Technology, Vol. 1, No. 2, pp. 171–186, July 2004.

Resolving the Limitations of a Traditional VIP for PHY Verification
by Amit Tanwar and Manoj Manu, Questa VIP Engineering, Mentor Graphics

Because of the complexities involved in the entire design verification flow, a traditional verification IP (VIP) tends to overlook the subtle aspects of physical layer (PHY) verification, often leading to costly debug phases later in the verification cycle.

In addition, because of the several possible topologies in a PHY implementation, completely exercising the role and related functionality of a PHY becomes challenging for a traditional VIP.

Furthermore, the analog signaling and the homologous functionality of the physical layer in serial protocols led the industry to define a common PHY that multiple protocols could use and that segregates the PHY logic from that of the general ASIC. One such common PHY is used in the PCI Express, USB 3.0 and 3.1, and SATA protocols. Similarly, M-PHY is used in the SSIC, M-PCIe, and LLI protocols, among others.

This article describes the limitations of a traditional VIP for PHY verification, which can typically be resolved using an exclusive PHY verification kit.

The common PHY found in PCI Express, USB 3.0 and 3.1, and SATA devices helps accelerate development of these devices by implementing the physical layer functionality as a discrete IC or macro cell, which can be easily included in ASIC designs.

In bus-based layered protocols, the PHY typically provides the following functionality:

• Various serial data transmission rates
• 8, 16, or 32-bit parallel interface to transmit and receive data
• Recovery of data and clock from the serial stream
• Holding registers to stage transmit and receive data
• Direct disparity control to transmit compliance patterns
• Various encode/decode and error indications
• Receiver detection
• Beacon transmission and reception
• Low Frequency Periodic Signaling (LFPS) transmission
• Selectable Tx margining, Tx de-emphasis, and signal swing values
• COMINIT and COMRESET transmission and reception
• Multi-lane de-skew

A comprehensive PHY verification plan must verify all of the PHY functionality in various conditions. However, a traditional VIP tends to miss out on verifying all of the functionality.

Note: In this article, the PHY features are described in the context of the PCI Express and USB protocols. However, in terms of the PHY verification methodology, this article applies to all serial protocols that use a common PHY.

PHY VERIFICATION ENVIRONMENT
Usually, a PHY verification environment requires one bus functional model (BFM) at the serial interface and another at the PIPE interface.

In the following figure, the VIP acts as the USB host or the PCIe RC at the PIPE interface and as the USB device or the PCIe EP at the serial interface. However, in the figure after it, the connections are flipped.

Note: The VIP refers to a model that has the BFM, stimulus generator, coverage collector, and a protocol checker.

For comprehensive verification of the PHY, the testbench must enable the same stimulus to run in both cases (figure 1 and figure 2) without any required changes in the testbench.
The limitations of a traditional VIP for PHY verification can be resolved with a comprehensive and exclusive PHY verification project comprising the following phases:

1) Initial Phase
   a. Pin Connections
   b. Configuration
   c. Link Up and First Transaction
2) Verification Phase
   a. Test Plan
   b. Test Case
   c. Test Run
   d. Debug
3) Regression Phase
   a. Regression Environment
   b. Coverage Metrics

INITIAL PHASE
A traditional VIP focuses more on the protocol verification and provides connections for the specific configurations of the serial and PIPE modes. Moreover, because of the variety of configurations possible when setting up the pin connections, a traditional VIP tends to ignore the intricacies of an accurate PHY connection. This is a problem in the following integral areas of a design:

• Protocol specification version
• PIPE specification version
• ECNs supported by the PHY
• Signals supported by the PHY
• Width of each signal
• Clock frequency required by the PHY
• Reset mechanism

The PHY verification kit removes these limitations by focusing on PHY connections for all possible configurations. This minimizes the risk of issues that arise out of inaccurate connections.

The following are a few examples of the error-prone scenarios that might be encountered when setting up the pin connections (the first scenario is sketched in code after the note below):

1) The data pin for a 16-bit PIPE with four lanes can be declared as a single pin of 64 bits or as four separate wires of 16 bits each. In the case where the data pin is declared as a single pin of 64 bits, it is confusing to find out whether the [15:8] index represents the first byte of the second lane or the second byte of the first lane. A wrong connection, in this case, might lead the connection to link up on fewer lanes than required. Debugging this issue at a later stage in the verification cycle might cost unnecessary time and effort.
2) The signal width varies across different versions of the PIPE specification. For example, TxDeemph is an 18-bit signal in the latest specification, while it was a 1-bit signal in an earlier specification. A common mistake, in this case, is to leave the signal unconnected or to supply a bit-wise OR of the 18-bit TxDeemph to the PHY.
3) On the PIPE interface, some signals are shared, while some are per-lane. For example, the PowerDown signal width is 3 bits in the latest specification, while it was 2 bits in an earlier specification. In this case, issues in the connection link up may occur if this signal is per-lane in the DUT and is declared as a segregated signal for all lanes. A similar issue could occur for the Rate signal, which is 2 bits in the latest specification, while it was 1 bit in an earlier specification.
4) The PHY can generate the clock or, alternatively, the testbench can supply the clock externally. However, a problem occurs when either of the devices misses the clock connection or there are multiple drivers on the clock. Generally, the GEN1 clock is supplied externally. When the connection speed changes, the clock remains GEN1 and causes problems.
5) Leaving reset unconnected, keeping reset always on, or keeping the reset duration too short or too long are some of the common issues in the reset connection.

Note: A connection file in the PHY verification kit contains special notes for the signals whose width changed across specification versions. These special notes provide guidelines to connect a signal that requires particular attention.
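A minimal sketch of a defensive connection for the first scenario (the signal and generate-block names are illustrative): an indexed part-select keeps the lane/byte ordering explicit when a single 64-bit data pin is used:

// Per-lane view of a single 64-bit PIPE data connection
// (16-bit PIPE, four lanes): lane l occupies RxData[16*l +: 16],
// so RxData[15:8] is the second byte of lane 0, not lane 1.
wire [63:0] RxData;
wire [15:0] rx_data_lane [0:3];

genvar l;
generate
  for (l = 0; l < 4; l++) begin : g_lane
    assign rx_data_lane[l] = RxData[16*l +: 16];
  end
endgenerate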

There can be many other error-prone scenarios depending on the design being verified. Using a PHY verification kit minimizes the occurrence of these error-prone scenarios but cannot eradicate all the issues completely.

The PHY verification kit enables engineers to focus on the main configuration problem in PHY verification: enabling VIP features that are not supported in the PHY.

The following are a few examples of the error-prone scenarios that might be encountered during configuration:

1) The VIP is configured for a clock frequency transition when the PHY supports changing the PIPE width during a speed change. This configuration causes a deadlock.
2) In the PIPE width change configuration, the engineer forgets to set the initial PIPE width at which the link up must occur.
3) Generally, the MAC performs data scrambling on the PIPE interface. However, if the PHY is also configured for scrambling, this configuration leads to unrecognized data at the other end.
4) While configuring the loopback master, the engineer configures both VIPs, at the serial and PIPE interfaces, as loopback masters. Generally, only the VIP at the PHY side is configured as the loopback slave; the VIP at the other side must be configured as the loopback master. A wrong configuration in this case causes problems in the loopback state.
5) Generally, the LTSSM timers are configured according to the protocol specifications. A faster simulation can be achieved by adjusting the timer values according to the PHY requirements.

Configuring or debugging issues because of an incorrect or unset PHY configuration requires considerable time later in the verification cycle. Therefore, it is always recommended to set the correct configuration right at the beginning. The PHY verification kit enables engineers to set all the relevant PHY configurations right at the beginning.

After the engineer finishes connecting the pins and setting up all the relevant configurations, the next objective is to ensure the connection links up and to initiate the first transaction. More often than not, most link up issues occur in the initial PIPE signal handshaking; for example, the receiver detection.

The following are a few examples of the error-prone scenarios that might be encountered during the link up process (an assertion sketch for the first one appears below):

1) No receiver detection initiative from the MAC occurs because the PHY configured the Phystatus signal as low.
2) The PHY requires another receiver detection attempt.
3) The PHY does not respond to the receiver detection attempt because of a reset issue.
4) The PHY does not respond to the change in the PowerDown signal.
5) The timing of the RxElecIdle signal is not correct, which causes even a valid packet to be truncated in the middle.
6) During the first equalization process, the PHY does not respond because of incorrect coefficient values.
7) During the first speed change, the PHY does not respond to the change in the Rate signal from the MAC.
8) The PHY is unable to perform a speed change in the recovery speed because of a short timer value.

The PHY verification kit provides a comprehensive troubleshooting note that helps users debug several issues in the PHY link up process and also provides timing details for each signal.
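Several of these checks lend themselves to assertions. A minimal sketch for the first scenario (the module name, signal names and timeout bound are illustrative, not taken from the PIPE specification):

// Once the MAC requests receiver detection, the PHY must answer
// with a PhyStatus pulse within TIMEOUT PCLK cycles.
module phy_linkup_checks (input logic pclk, reset_n,
                          input logic TxDetectRx, PhyStatus);
  localparam int TIMEOUT = 128;   // illustrative bound
  a_rxdetect_response : assert property (
    @(posedge pclk) disable iff (!reset_n)
      $rose(TxDetectRx) |-> ##[1:TIMEOUT] PhyStatus)
    else $error("PHY did not respond to receiver detection");
endmodule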
VERIFICATION PHASE
In the verification phase, the real verification process begins. Engineers follow a test plan, write the test cases, and finally run the test cases.
A good test plan is required to have robust test cases. The process of writing test cases for a PHY is slightly different than for the physical layer of an SoC.

For example, consider a VIP that provides a test-suite for all layers. Could only the physical layer test-suite be extracted to verify the PHY? Probably yes. However, can the extracted test-suite provide comprehensive PHY verification? Probably not, because the physical-layer test-suite targets the protocol SoC, and, in some specific cases, particular tests might not cover stand-alone PHY verification requirements. In addition, specific test cases might be required to cover stand-alone PHY verification corner cases; for example, a test case to verify whether the PHY performs the frequency compensation. A PHY verification kit removes these limitations by providing a test-suite with a test plan that targets 100 percent PHY verification.

The following are a few more examples of test cases targeting stand-alone PHY verification:

1) To inject a disparity error from the serial side and expect a decode or disparity error code in the RxStatus signal on the PIPE side.
2) To configure the serial side as the loopback master and check if the PHY performs loopback after seeing the asserted TxDetectRxLoopback signal from the MAC side.
3) To create some frequency difference between the serial and PIPE clocks and check whether the PHY performs the SKP addition/deletion.
4) To check for the receiver underflow or overflow situation by creating frequency differences and stopping the SKP signal from the serial side.
5) To configure the serial side to perform polarity inversions on the differential pins so that the MAC asserts the RxPolarity signal, then check whether the PHY performs the polarity inversion on the data received from the serial side.
6) To exercise all the power saving states to check the behavior on electrical idle state exit and entry.

REGRESSION PHASE
A robust regression environment ensures the following:

1) The design can be compiled independently.
2) The testbench can be compiled independently.
3) The individual tests can be run in parallel.
4) Any test failures can be easily checked and reproduced.
5) Log files and waveform file names show an association with the tests.
6) Individual test coverage data is saved in the universal coverage database (UCDB) format for coverage metrics.

The PHY verification kit offers a robust regression environment that ensures that any change in the design or the testbench can be easily validated by running all the tests in one go. This saves time and eliminates manual effort.

Engineers can keep track of the verification objectives with the help of functional and code coverage metrics. They can track code coverage using switches in the simulator. To track functional coverage, they can use covergroups, coverpoints, or crosses in the verification plan.

In the verification plan, which is an XML file, all the available covergroups, coverpoints, and crosses are mapped to individual sections of the protocol specifications.
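A minimal sketch of one such covergroup (the module, signal names and widths are illustrative, not taken from the kit):

// Cross the negotiated rate with the power state so the plan can
// show, per specification section, which combinations a test hit.
module pipe_coverage (input logic pclk,
                      input logic [1:0] Rate,
                      input logic [2:0] PowerDown);
  covergroup cg_pipe_ctrl @(posedge pclk);
    cp_rate      : coverpoint Rate;
    cp_power     : coverpoint PowerDown;
    rate_x_power : cross cp_rate, cp_power;
  endgroup
  cg_pipe_ctrl cg = new();
endmodule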
Once the required covergroups are enabled, the coverage data needs to be saved in a UCDB format to view the coverage metrics in the Questa® simulator. The UCDB is a repository for all the coverage data (including code coverage, cover directives, cover points, and assertion coverage), which is collected during simulation by the Questa platform. The Questa verification platform also enables all the coverage results to be merged in the UCDB format, which is then accessible in the Questa GUI or in the form of a log file.

Engineers can also enhance or modify the verification plan XML file according to their requirements. The PHY verification can then be signed off once the desired coverage goal is achieved.

CONCLUSION
The limitations of a traditional VIP can be resolved using an exclusive PHY verification kit that offers the following advantages:

• Faster link up, with a concurrent focus on accurate connections and correct configurations despite the large number of signals, varying widths across specification versions, and complex timer configurations.
• A specific test plan, which targets all PHY scenarios, to achieve a definitive closure on PHY verification.
• A standard regression environment to run all test cases in parallel and exclusively analyze test results.

REFERENCES
• PHY Interface for the PCI Express, SATA and USB 3.1 Architectures, Version 4.2 specification
• PCI Express specification
• Universal Serial Bus 3.1 specification
• USB 2.0 PHY verification paper: http://www.design-reuse.com/articles/15011/usb-2-0-phy-verification.html

VERIFICATION
ACADEMY
The Most Comprehensive Resource for Verification Training

21 Video Courses Available Covering


• Intelligent Testbench Automation
• Metrics in SoC Verification
• Verification Planning
• Introductory, Basic, and Advanced UVM
• Assertion-Based Verification
• FPGA Verification
• Testbench Acceleration
• Power Aware Verification
• Analog Mixed-Signal Verification
UVM and Coverage Online Methodology Cookbooks
Discussion Forum with more than 5000 topics
UVM Connect and UVM Express Kits

www.verificationacademy.com
Editor: Tom Fitzpatrick
Program Manager: Rebecca Granquist

Wilsonville Worldwide Headquarters


8005 SW Boeckman Rd.
Wilsonville, OR 97070-7777
Phone: 503-685-7000

To subscribe visit:
www.mentor.com/horizons

To view our blog visit:


VERIFICATIONHORIZONSBLOG.COM
