We wrap up this DAC edition of Verification Horizons with two articles from my colleagues here at Mentor Graphics. In “Emulation Based Approach to ISO 26262 Compliant Processors Design,” the author shows us how to apply fault-injection to processor verification in automotive applications—a critical element in any notion of a self-driving car (at least any such car I’d consider buying). And last but not least, we have “Resolving the Limitations of a Traditional VIP for PHY Verification” in which we see how we can assemble a protocol-specific kit of verification components and stimulus to ensure that the PHY verification is self-contained and won’t take away from your system verification when it’s part of your SoC.

As always, if you’re at DAC this year, please stop by the Verification Academy booth (#2408) to say “hi.” It’s always gratifying to hear from so many of you about how helpful you find both the Verification Academy and this newsletter. I’m proud to be able to help bring both of them to you.

Respectfully submitted,
Tom Fitzpatrick
Editor, Verification Horizons

Verification Horizons is a publication of Mentor Graphics Corporation, all rights reserved.

Editor: Tom Fitzpatrick
Program Manager: Rebecca Granquist

Wilsonville Worldwide Headquarters
8005 SW Boeckman Rd.
Wilsonville, OR 97070-7777
Phone: 503-685-7000

To subscribe visit: www.mentor.com/horizons
To view our blog visit: VERIFICATIONHORIZONSBLOG.COM
Table of Contents June 2015 Issue
Page 6
Verifying Airborne Electronics Hardware: Automating
the Capture of Assertion Verification Results for DO-254
by Vipul Patel, ASIC Engineer, eInfochips
Page 10
DO-254 Testing of High Speed FPGA Interfaces
by Nir Weintroub, CEO, and Sani Jabsheh, Verisense
Page 15
Formal and Assertion-Based Verification of MBIST MCPs
by Ajay Daga, CEO, FishTail Design Automation, and Benoit Nadeau-Dostie,
Chief Architect, Mentor Graphics
Page 20
Starting Formal Right from Formal Test Planning
by Jin Zhang, Senior Director of Marketing & GM Asia Pacific, and Vigyan Singhal, President & CEO,
OSKI Technology
Page 24
Reuse MATLAB® Functions and Simulink® Models
in UVM Environments with Automatic SystemVerilog DPI
Component Generation
by Tao Jia, HDL Verifier Development Lead, and Jack Erickson, HDL Product Marketing Manager, MathWorks
Page 31
Intelligent Testbench Automation with UVM and Questa
by Marcela Simkova, Principal Consultant, and Neil Hand, VP of Marketing
and Business Development, Codasip Ltd.
Page 35
Unit Testing Your Way to a Reliable Testbench
by Neil Johnson, Principal Consultant, XtremeEDA, and Mark Glasser, Principal Engineer,
Verification Architect, NVIDIA
Page 40
Hardware Emulation: Three Decades of Evolution – Part II
by Dr. Lauro Rizzatti, Verification Consultant, Rizzatti LLC
Page 43
Accelerating RTL Simulation Techniques
by Lior Grinzaig, Verification Engineer, Marvell Semiconductor Ltd.
Page 47
Emulation Based Approach to
ISO 26262 Compliant Processors Design
by David Kaushinsky, Application Engineer, Mentor Graphics
Page 54
Resolving the Limitations of a Traditional VIP
for PHY Verification
by Amit Tanwar and Manoj Manu, Questa VIP Engineering, Mentor Graphics
Verifying Airborne Electronics Hardware: Automating the Capture
of Assertion Verification Results for DO-254
by Vipul Patel, ASIC Engineer, eInfochips
The following assertion can be developed to verify the above requirement:

… in the log file and dumps the assertion checking states in a .wlf file to be viewed in the waveform window later.
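The requirement and the assertion code itself are not reproduced in this extract. Purely as a hypothetical illustration of the labeled SVA style the rest of the article relies on (the label a_req_001 is what the capture script would later match against the assertion ID; every name here is invented, not taken from the article):

   // Hypothetical labeled assertion (all names invented): after req is
   // asserted, ack must be seen within 1 to 4 clock cycles.
   module req_ack_assertions (input logic clk, rst_n, req, ack);
     a_req_001 : assert property (
       @(posedge clk) disable iff (!rst_n)
         req |-> ##[1:4] ack
     ) else $error("REQ-001: ack not received within 4 cycles of req");
   endmodule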
In case the assertion ID does not match the assertion label in the implementation file, the user will be notified with an error message while running the script.
AUTO CAPTURE OF ASSERTION RESULTS
Developing and implementing an automated script is a step-by-step process.

Step 1: Develop Script
First of all, a generic assertion capture automation script is created to parse an assertion input file. One can use a scripting language like Perl, shell, Tcl, etc. to develop the script.

Step 2: Prepare Assertion Input File
Create an assertion input file and include information in it like clock period, reset period, assertion module name, assertion ID, assertion implementation file name (.sv file), simulation log file name, simulation wlf file name (in case the log and wlf files have different names), assertion pass count, etc.

What is assertion pass count? An assertion triggers multiple times during simulation. Hence, users can obtain the snapshot for a specific passed/failed assertion using the assertion pass count, as illustrated below in figure 5.

Step 3: Parsing Input File by Script
The script will use the assertion ID to search for the assertion label (refer to Sample Assertion Code to understand assertion labels) in the assertion implementation file. This assertion label will be subsequently used for further processing. There is also a provision to input single or multiple assertion IDs while developing the script. The implementation of the assertions can then be checked likewise using the script.

Once assertion labels are identified, the simulation log file is checked for pass or fail messages depending on the assertion pass count. The script will get the simulation time by grep from the message. This time is used for preparing the width and height resolution of the assertion waveform snapshot (image file).

The script will automatically find an appropriate height for the waveform snapshot depending on the signals used in the assertion implementation, and even find an appropriate width from the simulation time grepped from the simulation log file. If the user requires extra width or height, it can be provided by specifying the same in the assertion input file.

By using information such as the specific width, height, signals, etc., the script will be able to capture an assertion waveform snapshot (image file).

The next step involves the generation of a small and targeted version of the .wlf file of assertion results. The .wlf file is the input, and the assertion start/end time will be used to create the small and targeted version of the assertion .wlf file.

HOW TO APPLY THE SCRIPT?
Once regression is completed, an assertion input file is created. The input file will have the required information as specified in Step 2: Prepare Assertion Input File.
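The exact fields and syntax of the assertion input file depend on how the script is written; the article does not show the file itself. The listing below is only a hypothetical sketch of the kind of information named in Step 2 (every field name and value is invented):

   # assertion_input.txt (hypothetical example)
   clock_period         : 10ns
   reset_period         : 100ns
   assertion_module     : req_ack_assertions
   assertion_id         : REQ-001
   assertion_file       : req_ack_assertions.sv
   simulation_log       : regression/top_test.log
   simulation_wlf       : regression/top_test.wlf   # only if different from the log name
   assertion_pass_count : 3
   extra_width          : 200
   extra_height         : 0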
On running the script, it will manipulate the input attributes and capture results in an image file (.bmp) and a waveform file (.wlf). The resultant snapshots will be used for creating a verification result document and provide traceability to requirements, as shown in Figure 1: Steps in Requirements Tracing for DO-254 using Assertions.

ADVANTAGES
• Automates the tedious process of capturing assertion results
• Saves time and manual effort
• Drastically reduces the probability of errors; for example, errors in identification of active and pass points in the result, hiding signals, inaccurate size assessment, etc.
• The assertion capture script is reusable in different verification projects

DISADVANTAGES
• Simulation tool version or feature changes will require similar changes in the script. For example, the script relies on simulation tool commands such as wlfman; if this command is removed or altered, the script needs to accommodate the changes. Similarly, there are many commands in the simulation tool that, if changed, will lead to script alteration.

REFERENCES
• RTCA/DO-254, “Design Assurance Guidance for Airborne Electronic Hardware”
• https://www.doulos.com/knowhow/sysverilog/tutorial/assertions/
DO-254 Testing of High Speed FPGA Interfaces
by Nir Weintroub, CEO, and Sani Jabsheh, Verisense
As the complexity of electronics for airborne applications continues to rise, an increasing number of applications need to comply with the RTCA DO-254 / Eurocae ED-80 standard for certification of complex electronic hardware, which includes FPGAs and ASICs.

The DO-254 standard requires that for the most stringent levels of compliance (levels A and B), the verification process for FPGAs and ASICs must measure and record the verification coverage by running tests on the device in its operational environment. What this essentially means is that verification engineers and designers need to compare the behavior of the physical outputs of the device on the hardware device pins with their corresponding RTL model simulation results. In addition, the standard requires running robustness tests on the interface pins. The robustness testing is accomplished by forcing abnormal or slightly out-of-spec behavior on the device and ensuring that it is able to deal with this behavior without catastrophic results.

These requirements become especially challenging for high-speed interfaces, such as DDR3 or PCIe, because it is not possible to create and observe the abnormal behavior when an FPGA is connected to the regular operational interfaces. For example, when a real memory is connected to a DDR model, there is no way to control the DDR behavior and use different kinds of DDR memories. As for robustness testing, there is no way to test incorrect behavior of a memory while connected to a real memory, because the real memory does not allow for the type of error injection desired. It is also impossible to test different timing behavior of the high-speed interface signals.

To overcome these challenges, Verisense developed a new approach, the Advance Validation Environment (AVE), which makes it possible to comply more completely with the DO-254 requirements for in-hardware testing of high-speed interfaces at the pin level and enables users to easily run an array of robustness tests. Based on the Universal Verification Methodology (UVM), this novel approach involves the migration of a UVM testbench and verification environment into an FPGA operational environment. Applying UVM concepts and components to target-hardware simulation enables reuse of all simulation runs for the in-hardware testing. It also provides the ability to monitor all FPGA interfaces, including high-speed interfaces, at the FPGA pin level.

MEETING DO-254 STANDARDS
The goal of DO-254 certification is safety. No shortcuts can be taken. It is mandatory to prove the design correctness of an FPGA by verifying its entire feature set. High-speed interfaces are complicated interfaces which are usually linked to the main functionality of a specific FPGA. Thus the comprehensive testing of these interfaces must be a basic requirement in the process of verifying the design correctness of any device.

Therefore, an FPGA using a high-speed DDR interface, for example, cannot be properly validated solely by connecting it to a standard DDR device and testing read/write operations, because it does not allow testing and verifying the FPGA behavior in abnormal conditions.

The best way to qualify Level A and Level B DO-254 devices is to compare the physical device outputs to the simulation model results. Doing this comparison for complex FPGAs is a complicated technological challenge. Adding high-speed interfaces makes it monumental. While in simple devices one can use scopes and logic analyzers for monitoring and comparing hardware signals, doing so for complex devices is not viable.

In the avionic industry, the need for completely bug-free designs is similar to other industries; however, such need arises from ensuring there are absolutely no critical safety issues. Since FPGAs in the avionic industry can rise to the level of ASIC complexity, there becomes a necessity to innovate an advanced hardware validation approach to cover all the safety requirements.

The new AVE approach suggested by Verisense for providing a comprehensive solution that can qualify Level A and Level B devices is based on applying the concepts of constrained random verification in the simulation world to the real hardware environment. Based on advanced pre-silicon verification methodologies that will be introduced, the article will present a new approach for hardware testing that makes it not only possible but also straightforward to comply more completely with the DO-254 requirements for real in-hardware testing, including easily running an array of robustness tests.

… component responsible for monitoring the interface buses and reporting all the transactions which occurred on the interface to the reference model block.
… as Verisense IP, allowing, for example, a real DDR memory to be emulated. Like the UVM agent blocks, the emulator includes sequencers, drivers, and monitors, so AVE can emulate all the interfaces exactly as in a simulation. The same reference model is used and a scoreboard test-report generator summarizes results. In addition, voltage, clock, and signal control blocks have been added to qualify the DUT against tolerance specifications; this could not be done with even the most advanced UVM verification environment. A further benefit is that the AVE hardware architecture is modular, allowing a high degree of reuse between projects.

The AVE testing platform architecture is divided into two main components: software and hardware.

The requirement for robustness testing on all interfaces includes controlling timing of signals, voltage levels, and, in the case of a memory, also changing the actual data and data sequences. The AVE solution is able to provide extensive support for robustness testing on all interfaces, including high-speed interfaces, because the DUT pins connect directly to the hardware emulators; thus, the environment not only monitors activity on the pins but also can inject data and create abnormal situations. The emulators can control the timing on all interfaces or any subset thereof. They also have the ability to modify the latency of the protocols, the frequency, and even the voltage level of a given interface. All the interface electrical and logical values can be modified within the interface specification or beyond the specification limits for verifying real abnormal situations.

The software implements the high-level components of the verification environment, including the input generation, reference models, and off-line comparisons between the behavior of the DUT in the verification environment and its behavior on the final target hardware.

The software is preprogrammed via automatically generated configuration files with the relevant settings of the testing environment and the DUT hardware. Based on this information, the software configures the validation environment hardware and sets the clocks and voltages to the DUT as required for each test.

The steps in the software test flow are identical for both regular and high-speed interfaces and are as follows:

1. The software simulation environment generates Value Change Dump (VCD) files from the waveforms of the verification tests. The VCD file is an industry standard file format containing waveform information that is generated from the simulation environment, regardless of the verification methodology used in the software simulation. The tester software parses the VCD files and generates two sets of vectors. The first is used as input vectors to inject at the DUT pin level on the hardware platform. The second is used as the expected results vectors for comparing with the hardware tester results at the end of the process.
2. The hardware tester injects the input vectors onto the pins of the DUT, running the test that was previously run in the software simulation environment.
3. While the test is running, the tester monitors and records all the DUT output pins.
4. Once the test is completed, the recorded pin behavior on the tester is processed by the software.
5. The actual recorded pin waveform is then automatically compared with the expected results from the software simulation. Any mismatches are flagged and reported. If there are no mismatches, the test has passed successfully.
6. All appropriate log and test result files are then generated automatically for documentation and traceability purposes.

High-Speed Interface Example - DDR3
In 2014, Verisense provided the VS-254 FPGA verification tester tool to multiple avionics customers to assist them in the certification of their DO-254 Level A and B products. The VS-254 is the first, and remains the only, solution that provides the complete required DO-254 functionality for high-speed interfaces, including pin-level verification on all pins and full robustness testing on all high-speed interfaces.

The VS-254 platform contains a complicated DUT FPGA that has two DDR3 interfaces. The DUT FPGA DDR interfaces are connected directly to three other FPGAs and not to DDR memories. These FPGAs also include the DDR emulators.

The VS-254 includes many enhanced features to reduce the verification effort and to make the engineering effort predictable. These include a high degree of reuse between the simulation environment and the hardware verification, built-in support for regression and waivers, and the VS system simulation tool for easier debug of test failures. The VS-254 FPGA implements the AVE methodology and includes a DDR3 high-speed interface.

DDR3 is a type of double data rate synchronous dynamic random access memory (DDR SDRAM). In an SDRAM, the DDR is synchronized with its master (e.g., a processor). Unlike asynchronous memories, which react to changes in the control, synchronous memories can be pipelined, thus achieving high speeds and efficient access to memory. The DDR data is stored in a simple dynamic physical element, which enables higher densities and, thus, a lower cost per bit. To keep the data from getting lost, the information is refreshed from time to time. The refresh mechanism adds complexity to the controller, and although it somewhat reduces the memory throughput, it is considered a good tradeoff. The memory works on both clock edges, which doubles the memory throughput. As memory speeds rise, signal integrity becomes an important issue. Among other things, there is high sensitivity to the signal termination, which makes signal probing and monitoring operations much more complicated and challenging.
Figure 5: High-Speed Interface Using DDR3
The DDR emulators perform all the special testing features required for DO-254 hardware testing requirements, including the complete set of robustness testing. The emulator is programmed through multiple configuration registers, which control all the DDR parameters, including:

• DDR latency
• Control and data line timing
• DDR power level
• DDR frequency shifts
• DDR error injection and correction

The AVE runs the following tests on the high-speed interfaces:

• Validate the physical layer connections and test the normal interface behavior
• Compare the transaction-level data with the reference model from the software verification environment
• Modify the input data to the DUT, and test for abnormal situations
• Vary the voltage levels and clock frequencies to understand their impact on the DUT
• Change the physical layer signal timings, including control and data

Providing this level of functionality and features is challenging, especially considering that high-speed interfaces often include complex protocols and frequently interface to multiple clients that require arbitration governed by master/slave relationships.

CONCLUSION
Taking concepts from advanced pre-silicon verification methodologies (e.g., UVM) and applying them to a hardware tester validation platform (i.e., AVE) delivers unprecedented control over all aspects of hardware testing, especially high-speed interfaces. As technology progresses, we expect to see more and more devices in the market with different types of high-speed interfaces.

This breakthrough approach enables, for the first time, complete testing that includes high-speed interfaces. It gives developers and manufacturers a much higher degree of certainty in the correctness and safety of their complex electronic devices.

DO-254 expects developers to use their best effort to validate their designs. There is no absolute definition of what is a “good enough” effort. In the absence of absolute proof, you need to invest much time and energy to show that what you have done is as good as technically possible. As solutions are provided that allow for increased coverage, these solutions eventually become the de-facto DO-254 requirement. Because AVE is currently the most complete technological solution to validate high-speed interfaces, we expect it to become a de-facto requirement.
Formal and Assertion-Based Verification of MBIST MCPs
by Ajay Daga, CEO, FishTail Design Automation, and Benoit Nadeau-Dostie, Chief Architect, Mentor Graphics
Built-In Self-Test (BIST) is widely used to test embedded memories. This is necessary because of the large number of embedded memories in a circuit, which could be in the thousands or even tens of thousands. It is impractical to provide access to all these memories and apply a high quality test. The memory BIST (MBIST) tool reads in user RTL, finds memories and clock sources, generates a test plan that the user can customize if needed, and generates MBIST IP, timing constraints, simulation test benches and manufacturing test patterns adapted to the end-user circuit.

Multi-cycle paths (MCPs) are used to improve the performance of the circuit without having to use expensive pipelining, which would be required when testing memories at gigahertz frequencies. Most registers of the MBIST controller only update every two clock cycles. Only a few registers need to operate at full speed to perform the test operations. The architecture takes advantage of the fact that most memory test algorithms require complex operations such as Read-Modify-Write.

MBIST IP has a large number of MCP sources with tens or even hundreds of destinations. Some of the MCP sources also have single-cycle path destinations. The identification and classification of MCPs is done by analysis based on experience acquired over the years. Timing constraints are verified in-house using SystemVerilog assertions (SVA) on representative benchmark circuits. The benchmark circuits must be chosen carefully because they correspond to specific instances derived from a highly parameterizable RTL template. The timing constraints are implemented in a way to minimize the number of parameter combinations affecting the constraints. Nevertheless, there is always a small possibility that a combination was missed.

End users currently don’t have a mechanism to run formal verification of all constraints (MBIST and functional). They need to assume that MBIST constraints are correct by construction and waive violations causing a disruption of the design flow. Timing simulations need to be used if desired to validate MBIST constraints. This is very difficult because of the large number of memories in a circuit.

It is well known that simulation-based methods (using either full timing or assertions) are limited by the quality of test benches, which might not exercise all useful signal transitions, and by simulation time, which might be prohibitive. Formal verification of timing constraints is clearly a better alternative, although there are a number of obstacles to overcome before it can be used effectively. There is a learning curve involved in using such verification tools, and new scripts need to be generated and maintained to integrate the tools in the design-for-test flow. Until now, there were also a number of limitations of the formal verification tools that made it difficult to use on our MBIST IP. A first limitation was that these tools generally analyze constraints individually. However, the MBIST constraints are not all self-contained and were causing a large number of false alarms. For example, take the following set of constraints:

   set_multicycle_path 2 -setup -from STEP_COUNTER*
   set_multicycle_path 1 -setup -from STEP_COUNTER* -to STEP_COUNTER*

The first constraint declares MCPs from the counter to all destinations, but the second constraint resets some of the paths to be SCPs. The first constraint is not true when analyzed in isolation.

A second issue is correctly understanding the coded RTL, so that false violations are not flagged by the formal tool. Noisy results are a legitimate concern when deploying a formal solution, so we needed to be convinced that the results from the tool are consistent with the information provided to it. The third challenge is the ability to provide architectural input to the tool in situations where MCPs are not completely supported by the RTL. This architectural input (waivers) needs to be specified in a way that accounts for the parameterizable nature of the design. Common waivers need to be specified for hundreds of MBIST controllers, each using a different set of circuit parameters which might affect the composition of the timing constraints.

FISHTAIL’S MCP VERIFICATION METHODOLOGY
FishTail’s Confirm product performs MCP verification. The tool requires the following information for a design:

• Synthesizable RTL
• Tcl MBIST constraints
• Simulation models for standard cells instantiated in the RTL
• Liberty models for hard macros, memories
Figure 1: Example MCP
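Figure 1 itself is not reproduced in this extract. Purely as an illustration of the kind of circuit the following discussion describes, here is a minimal RTL sketch (all names are invented; this is not the MBIST RTL): FF1 is enabled on its datapath by valid, FF2 is effectively gated by valid, and valid toggles every cycle, so a new FF1 value cannot reach FF2 in a single cycle.

   module mcp_example (
     input  logic       clk,
     input  logic       rst_n,
     input  logic [7:0] d,
     output logic [7:0] q      // output of FF2
   );
     logic       valid;
     logic [7:0] ff1;

     // valid toggles every cycle, so it is never high in two consecutive cycles
     always_ff @(posedge clk or negedge rst_n)
       if (!rst_n) valid <= 1'b0;
       else        valid <= ~valid;

     // FF1: ungated clock, enable only on the datapath (STC = valid)
     always_ff @(posedge clk or negedge rst_n)
       if (!rst_n)     ff1 <= '0;
       else if (valid) ff1 <= d;

     // FF2: clock gated by valid in the real design (PPC = valid);
     // modeled here with a behavioral enable instead of a clock-gating cell
     always_ff @(posedge clk or negedge rst_n)
       if (!rst_n)     q <= '0;
       else if (valid) q <= ff1;

     // The proof obligation described in the text, as an SVA property: if the
     // STC (valid) holds in one cycle, the PPC (valid) must not hold in the
     // next cycle, which is what makes the FF1 -> FF2 path multi-cycle.
     assert property (@(posedge clk) disable iff (!rst_n) valid |=> !valid);
   endmodule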
With this information the tool formally establishes the correctness of the multi-cycle paths without requiring any stimulus. For any paths that fail verification the tool generates a waveform that shows why the path fails and provides aids for an engineer to debug the failure. In addition, the tool generates SVA assertions for all failing paths. These assertions may be imported into RTL functional simulation to obtain third-party confirmation of the failure. Engineers can provide architectural input to the tool (information that is not available in the synthesizable RTL) that supports the timing exceptions: for example, the fact that configuration registers are static, or expected to be programmed in a certain way, or that failures to certain endpoints can be ignored, etc. With this additional input the tool is often able to prove the correctness of a timing exception that was earlier flagged as incorrect. Engineers decide on the effort they want to spend in getting the tool to formally prove a timing exception, or whether they want to reduce this effort and instead establish the correctness of the exceptions based on the feedback they get from running RTL simulations on the assertions generated for failing paths.

The formal verification of an MCP requires proving that it is impossible for a change at a startpoint to propagate to an endpoint in a single cycle. Consider the example design in Figure 1, and suppose that an engineer has specified a two-cycle MCP from FF1 to FF2. Formally proving this MCP requires establishing that in the clock cycle that FF1 transitions, it is impossible to propagate this transition to FF2. Proving the MCP requires establishing the condition when FF1 can transition. We refer to this as the Startpoint Transition Condition (STC), and it takes into account any enable logic in the clock and data path that controls when the startpoint is allowed to transition. In Figure 1, FF1 receives a clock without gating and the only enable logic is on the datapath. If valid is low the old value is maintained on FF1, and when valid goes high a new value is sampled by FF1. So the STC is “valid”, i.e., valid must be high for FF1 to transition in the next cycle. Next, we consider the condition necessary for the transition to propagate from startpoint to endpoint. We refer to this as the Path Propagation Condition (PPC). In Figure 1 there is no interesting enable logic on the datapath; the only enable that controls the propagation of a path from FF1 to FF2 is the clock-gate enable for the clock pin on FF2. FF2 only receives a clock when valid is high. As a result the PPC is “valid”. The STC establishes the condition for the startpoint to transition, and the PPC establishes the condition for the transition to propagate to the endpoint in the subsequent cycle. Formally proving an MCP requires proving that if the STC is true in a cycle, then in the next cycle it is impossible for the PPC to be true. Performing this proof requires working logic cones back over as many clock cycles as required to reach a definitive answer. For the circuit in Figure 1, proving the MCP requires proving that it is impossible for valid to be high in two consecutive clock cycles, and this proof passes as valid toggles every clock cycle.

FishTail’s strength in MCP verification comes from its ability to prove MCPs as correct when the provided collateral supports the MCP, without requiring an engineer to fiddle with settings that control the sequential depth for the formal proof or the runtime allowed to prove an MCP. Inconclusive or noisy results are the bane of formal tools, and it is important for a user to have the confidence that when an MCP fails FishTail’s formal proof, it is most likely because of missing architectural information and not because the tool ran into complexity issues. The ability to prove MCPs while minimizing complexity issues comes from FishTail’s ability to accurately separate signals on a design into either data or control. Data signals play no role in the STC and PPC established by the tool, allowing the tool to scale and work back more cycles during formal proof than would otherwise be possible.

For every failing timing exception the tool generates a report explaining why the path is not multi-cycle. An example report is shown in Figure 2.

Figure 2: Example report for failing multi-cycle path

The report in Figure 2 shows the start and endpoint that the failure applies to and the launch and capture clocks associated with the path. Clicking on the PPC/STC shows you these conditions. Clicking on the red path in Figure 2 shows you more detail on the timing path from the startpoint to endpoint, as well as any gating along the path. The path may be viewed in schematic form using FishTail’s integration with Verdi®.

The intent of reviewing the PPC/STC and the path detail is to establish if things are as expected, or if a path is being traversed that is not intended, or if the necessary gating conditions are not in place to control when the startpoint can transition and when this change is allowed to propagate to the endpoint. If everything is as expected, then the next step is to review the stimulus the tool generates showing how the failure happens. An example failure stimulus is shown in Figure 3. This stimulus, similar to a VCD dump from a simulation tool, shows signal values in clock cycles leading up to a failure. The cycle where the failure happens is shown in red. In this cycle, the startpoint changes and the change propagates to the endpoint.
Figure 3: Example failure stimulus
Using FishTail’s integration with Verdi, this stimulus may be viewed as a digital waveform, but the benefit of the tabular display is that you can ask the tool to justify a signal value in a given clock cycle. For example, clicking on the value 1 shown for RESET_REG_DEFAULT_MODE in the failure cycle results in the tool highlighting the signal values that cause RESET_REG_DEFAULT_MODE to go high when the startpoint changes. Since RESET_REG_DEFAULT_MODE is a combinational signal, the values that are highlighted are in the current cycle on the registers STATE and BIST_EN_RETIME2. You can then click on values on these registers, for example, the value 1 on BIST_EN_RETIME2, to see what values in the previous cycle cause this signal to be high in the failure cycle. In this manner you can keep working backward from the failure cycle to understand how the failure happens. As part of this process, if you see a transition or a value that is impossible based on architectural considerations (for example, the way configuration registers, such as LVISION_WTAP_IR_REG/INST_INT, are programmed), then this information can be communicated to the tool. At this point, rather than redo the entire MCP verification run, you can reverify just the specific failing path you have been debugging to see if the additional input resolves the failure.

Formal MCP verification debug requires the involvement of a designer who is familiar with the RTL and the motivation for the MCP. It requires time, and the benefit is a 100% result. When time is short, an IP is not well understood, or designer time is scarce, an alternate strategy is to take the assertions generated for failing MCPs and run them through RTL functional simulation. If an assertion fails, then it is a strong indicator that an MCP is incorrect. If an assertion is checked and it always passes, then it builds confidence that the MCP is correct. Figure 4 shows an example assertion generated by FishTail. Essentially, the assertion checks that when a startpoint changes, then in that clock cycle the PPC must be false, or in the next cycle the endpoint must not change.

SUMMARY
In this article we have motivated the need to formally verify MBIST MCPs, discussed FishTail’s methodology for MCP verification, and described the approach used to debug formal MCP verification failures or generate assertions for import into RTL functional simulation. The methodology was applied to a small MBIST IP with 59 MCPs. The MCPs applied to a total of 2090 paths. Initially, after providing the tool just RTL and constraints as input, 92% of the constrained paths were confirmed as good and 167 paths failed formal MCP verification. The runtime was less than a minute. We then debugged the failures and provided additional architectural information to the tool regarding the way LVISION_WTAP_INST/LVISION_WTAP_IR_REG is programmed, and that LV_WRCK, LV_SelectWIR and LV_ShiftWR are static. With this additional input the number of failing paths dropped to 15 (so 99% of the paths constrained by MCPs were verified to be good). Legitimate MCP issues were flagged by the tool, some that were expected and some that were surprises.

The formal verification of MBIST MCPs using RTL input guarantees the correctness of MBIST MCPs at the time they are written, and then ensures that while revisions to RTL and constraints are being made, the MBIST MCPs are continually verified to be good. Customers who receive MBIST IP are able to verify the MCPs delivered along with this IP, accounting for any customization they make.
Figure 4: Example assertion generated by FishTail

// Path propagation condition:
//   ( TARGET1_CK100_MBIST1_MBIST_I1/RESET_REG_DEFAULT_MODE )
// Launch Clock:  CK100 (rise)
// Startpoint:    TARGET1_CK100_MBIST1_MBIST_I1/ALGO_SEL_CNT_REG_reg
// Capture Clock: CK100 (rise)
// Endpoint:      TARGET1_CK100_MBIST1_MBIST_I1/MBISTPG_DATA_GEN/WDATA_REG_reg
// Assertion:
module u_mcp449_1 (input bit clk, input logic from_reg,
                   input logic [1:0] to_reg, input bit v12122);

  wire path_propagation_condition = ( v12122 );

  property e_mcp449_1;
    @(posedge clk) `FT_DISABLE (`FT_TRANSITIONS_AND_NOT_UNKNOWN(from_reg))
      |-> ((##0 (!path_propagation_condition)) or
           (##1 (`FT_NO_TRANSITION_OR_UNKNOWN(to_reg[1:0]))));
  endproperty

endmodule

bind TARGET1_CK100_MBIST1_LVISION_MBISTPG_CTRL u_mcp449_1 sva_u_mcp449_1 (
  .clk(BIST_CLK),
  .from_reg(ALGO_SEL_CNT_REG),
  .to_reg(MBISTPG_DATA_GEN.WDATA_REG),
  .v12122(RESET_REG_DEFAULT_MODE)
);
Starting Formal Right from Formal Test Planning
by Jin Zhang, Senior Director of Marketing & GM Asia Pacific, and Vigyan Singhal, President & CEO, OSKI Technology
“By failing to prepare, you are preparing to fail”
—Benjamin Franklin

“Productivity is never an accident. It is always the result of a commitment to excellence, intelligent planning, and focused effort”
—Paul Meyer

… decision that should be made upfront to avoid surprises down the road.

At Oski, we follow a systematic 3-stage approach. The IDENTIFY and EVALUATE stages aim at obtaining a deep understanding of the DUT. The PLAN stage combines both formal and design knowledge to create the final test plan.
2. In many instances, because bugs continue to be found close to or even after tapeout, formal is used to find missed corner case bugs; hence the formal goal is bug hunting.
3. Sometimes, project teams have trouble reaching simulation coverage goals. Formal can discover unreachable targets, or generate input vectors to reach simulation cover points, and therefore the formal goal is coverage closure.
4. There are also cases where project teams have specific needs in mind, such as verifying pre- and post-clock gating designs. In this case, some commercial formal apps will be useful.

Different verification goals translate to different strategies and require different levels of planning. For bug hunting, one only needs to focus on blocks or functionalities with the most issues, without having to worry about completely verifying the whole design using formal. On the other hand, achieving formal sign-off requires the most thorough planning. Spending insufficient time on planning may result in the ultimate goals not being reached.

Once the verification goals are aligned, we need to identify blocks suitable for formal. Contrary to the common belief that only control types of blocks are good for formal, in reality data-transport blocks, where data is transported from inputs to outputs with simple or no modifications, are also good candidates. For example, some typical control and data-transport blocks could be arbiters of all kinds, interrupt controllers, power management units, tag generators, schedulers, bus bridges, memory controllers, DMA controllers, and standard interfaces such as PCI Express and USB. On the other hand, data-transform designs, where algorithmic operations are often performed, are not good for formal property verification (also called model checking). Instead these should be verified using other techniques such as Sequential Equivalence Checking. To understand the functionality of different design blocks, block diagrams and design specs will be useful. Conversations with designers can also help get the high-level functionality of different design blocks for this assessment.

Not all blocks that are suitable for formal should be verified by formal. Again one needs to consider the ROI. If a block is a legacy design with minor changes and has gone through lots of verification, or if it is not a difficult design to verify with simulation (few corner cases and not many concurrent operations), it is obviously not the best use of formal resources. On the other hand, if the block is brand new, or is being developed as an IP to be used internally or externally with lots of parameters, then formal will bring better ROI.

Last, formal resources and expertise constraints limit what types of formal and how much can be done. End-to-End Formal requires a lot of formal expertise and can only be attempted by our engineers after going through several projects with mentors (usually about 2 years of full-time formal usage). On the other hand, writing local assertions to do bug hunting can be carried out by engineers with much less formal experience.

The output of the IDENTIFY stage is a list of good design blocks to apply formal verification on and the corresponding goals.

FORMAL TEST PLANNING STAGE 2 – EVALUATE
The goal of the EVALUATE stage is to finalize the list of formal testbenches for different design blocks, along with an understanding of the possible formal verification challenges to overcome.

To achieve this goal, we need to carefully consider many factors, such as design interfaces, register transfer level (RTL) metrics and critical design functionalities.

Knowledge of design interfaces is important to decide the best places to partition the formal testbenches. One consideration is whether the block interface is standard and well documented. Designers are usually very busy and are less willing to answer lots of questions about interface behaviors. So it is best to partition at the level where design interfaces are standard, or easy to follow. An ideal situation is when formal verification starts in parallel to RTL development, so designers have the freedom of moving logic from one block to another to simplify the interfaces.

Next, we use formal tools to report the following RTL metrics: register counts, RTL lines of code (LOC), the number of inputs and outputs, and parameter variations.
These numbers further assist in deciding formal testbench boundaries and in estimating the amount of effort it will take to formally verify each chosen block:

• A block with lots of inputs and outputs means more effort in modeling constraints. On the other hand, a block with a large RTL LOC or register count means more effort in modeling checkers and managing complexity. A balance needs to be made between simple interfaces vs. manageable block sizes for formal.
• People often ask what a good design size for formal is. While RTL LOC and register counts help guide the decision on formal testbench boundaries, there are no precise rules to say what the right size of block for formal is, as a small RTL block could be very complex. A rule of thumb for a formally manageable block is a block that can be designed by a single designer. Anything smaller will not bring the best ROI, and anything bigger will pose a challenge for formal tools.
• Understanding the impact of parameters may help reduce formal complexity. If design parameters can be reduced without reducing corner case coverage, the formal testbench should use smaller parameters for better performance.

During the process of working with the RTL, one also gains an understanding of the micro-architecture characteristics of the design block to anticipate the kind of formal complexity and the complexity resolution techniques one might use. This process also helps decide the Required Proof Depth.

At the end of this stage, there should be a mapping between each candidate design and one or more formal testbenches, along with a determination of who is responsible for developing which testbench. It is common that only a subset of the target blocks is chosen at the end of this process.

It is worth noting that implementing formal End-to-End checkers often requires building reference models. As a matter of fact, 95% of the effort may be in writing the reference models in Verilog or SystemVerilog, with only 5% of the effort in writing the SystemVerilog Assertions. So one needs to factor in the time it takes to write reference models when estimating the overall effort level. Also, internal assertions might be used during the project, but only when needed for debugging or helping End-to-End checkers reach closure. Therefore internal assertions are not included in the formal test plan.

Formal complexity discussion and resolution should be included in the formal test plan. Because each design is unique, often there is no existing solution to use. We need to estimate the effort in coming up with a solution as well as in writing, verifying and using the solution. This is the most unpredictable part of the process. Often when we underestimated the effort level in our projects, it was because we didn’t fully comprehend the challenge involved in solving complexity. Careful consideration here will save surprises later.

Exact metrics to measure success need to be established. Once there is a total number of checkers and constraints to implement, a weekly tracking spreadsheet can be used to track the numbers of checkers and constraints written each week, their verification statuses, bugs found, and the percentage towards completion. In recent years, formal tools have added formal coverage features, which can effectively measure how much of the DUT the formal testbench is covering. This will be another useful metric to use to decide when a formal testbench is complete and formal sign-off has been reached.

The following list includes typical chapters in our formal test plan:
The formal test plan is created for each block designated to have a separate formal testbench. This is a living document and may be updated during the verification process. For example, an End-to-End checker may be too complex, so it needs to be split into two or more checkers, thus affecting the list in the document. At the end of the project, the actual formal testbench implementation should be consistent with the formal test plan. The formal test plan may serve as a user guide for future projects when the formal testbench is reused.

A REAL CASE STUDY
Often people underestimate the amount of time it takes to do formal sign-off projects. At DAC 2014, we conducted a guessing game by providing the following information about a design that we had verified before.

Design Description:
Reorders IP packets that can arrive out of order and dequeues them in order. When an exception occurs, the design flushes the IP packets for which exceptions have occurred. The design supports 36 different inputs that can send the data for one or more ports. Another interface provides dequeue requests for different ports. The design supports 48 different ports.

Design Interface Standard:
Packets arrive with a valid signal. A request/grant mechanism handles requests from 36 different sources; all 36 inputs are independent and can arrive concurrently. All 48 ports can be dequeued in parallel, again using another request/grant mechanism.

Micro Architecture Details:
Supports enqueue and dequeue for IP packets for 32 different inputs and 48 different ports, respectively. 48 different queues are used to store IP packets for different ports. A round robin arbiter resolves contention between enqueue requests from different sources for the same port at the same cycle.

RTL Stats:
RTL lines of code: 10,830
Register count: 84,027
Inputs: 3,404 bits
Outputs: 9,137 bits
Clock domain: 1
Data latency: 6
Required Proof Depth: 28 cycles

When asked “How Long It Takes to Formally Verify This Design”, over 70% of respondents guessed in the 2-4 month range. In reality, we spent 5 months on the project with the following breakdown:

1. Formal test planning: 0.5 months
2. End-to-End Checker (including reference models): 1.5 months
3. Constraints: 1 month
4. Interface checkers and internal assertions: 1 month
5. Abstraction Model: 0.5 months
6. Iteration with designers on bug fixes: 0.5 months

As the breakdown shows, significant time is spent in formal testbench development. However, once the block is completely verified with formal and sign-off is achieved, the chance of missing a bug is very low. This level of confidence cannot be provided by simulation. If this design will be used by several projects, then the formal investment is well worth the effort.

CONCLUSION
Due to the complexity of formal sign-off projects, it is important to do thorough planning at the beginning so the formal testbench implementation can be done efficiently. This article provided an overview of the stages involved in formal test planning. Since this is a very important step, and one often ignored by project teams, we can’t emphasize enough the value of dedicating time to do formal test planning. Oski CEO Vigyan Singhal will give a talk on this subject at the Mentor Verification Academy on Wednesday, June 10th, at 11am. It will be a good opportunity to learn more about the process and ask questions.
Reuse MATLAB® Functions and Simulink® Models in UVM Environments
with Automatic SystemVerilog DPI Component Generation
by Tao Jia, HDL Verifier Development Lead, and Jack Erickson, HDL Product Marketing Manager, MathWorks
… simulation runs on a different platform than from where it’s generated, you can use the generated makefile to build the shared library on that platform.

These components run quickly in simulation, as they are behavioral-level C code without the bit-level implementation details. Using C as the description language allows you to export a wide variety of model types. Some example applications include:

… and outputs be? What other signals will you need access to? In our example, we will generate a checker function that will calculate the floating point result of a 64-point radix-2 FFT and compare this to the outputs of a fixed-point RTL implementation.
Image 4: Auto-generated SystemVerilog wrapper for the checker DPI component. © 2015 The MathWorks, Inc.
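Image 4 itself is not reproduced in this extract. As a rough, hypothetical illustration of the kind of wrapper the text describes (the function name, port list, and types below are invented and are not the actual HDL Verifier output), a DPI wrapper pairs a DPI-C import with SystemVerilog-visible arguments:

   // Hypothetical sketch of a generated checker import (names invented)
   import "DPI-C" function void fft_checker_dpi(
     input  real stim_re [0:63],  // floating point stimulus, real part
     input  real stim_im [0:63],  // floating point stimulus, imaginary part
     input  real dut_re  [0:63],  // fixed-point RTL outputs, converted to real
     input  real dut_im  [0:63],
     output real nmrs             // error metric returned by the MATLAB function
   );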
Image 6: Sending the stimulus transaction and implementation output data into the checker, and comparing the result against the defined threshold. © 2015 The MathWorks, Inc.
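Image 6 is likewise not reproduced here. A minimal sketch of what such a call might look like inside a UVM checker, assuming the hypothetical import above and an error threshold configured in the testbench (none of these names come from the article, and the real environment would receive the data through analysis ports):

   // Sketch of the comparison step (illustrative only); assumes
   // import uvm_pkg::* and `include "uvm_macros.svh" in the enclosing scope.
   class fft_scoreboard extends uvm_scoreboard;
     `uvm_component_utils(fft_scoreboard)
     real error_threshold = 1.0e-3;   // tolerance defined by the testbench

     function new(string name, uvm_component parent);
       super.new(name, parent);
     endfunction

     // Called with one stimulus transaction and the matching DUT output
     function void check_fft(input real stim_re [0:63], input real stim_im [0:63],
                             input real dut_re  [0:63], input real dut_im  [0:63]);
       real nmrs;
       fft_checker_dpi(stim_re, stim_im, dut_re, dut_im, nmrs);
       if (nmrs > error_threshold)
         `uvm_error("FFT_CHK", $sformatf("error metric %0g exceeds threshold %0g",
                                         nmrs, error_threshold))
     endfunction
   endclass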
The return value, nmrs, is the return value from the MATLAB function. HDL Verifier created an output port for it when it created the SystemVerilog wrapper. This is compared against a threshold of error tolerance that we defined in our UVM testbench.

Image 8: Questa Advanced Simulator run script, pointing to the generated shared library. © 2015 The MathWorks, Inc.
RESULTS
This simple example shows how to generate SystemVerilog DPI components using HDL Verifier, and it illustrates how to integrate the generated components into a UVM environment. The simulation passes its simple test sequence (see image 9):
Image 9: Results from the simulation run. © 2015 The MathWorks, Inc.
In this approach, the spec is the primary – and sometimes the only – means of communication between system design and hardware verification teams. Building the environment, models, and tests by reading and interpreting the spec is labor intensive and error prone. If there is a spec change, it usually just gets addressed in the hardware design and verification environment. It rarely gets propagated back to the system design and verified there – why bother when there is a deadline to meet?

The ability to automatically generate SystemVerilog DPI components directly from the system design and verification environment is akin to directly passing the specification into the hardware verification environment.

Rather than having the verification team spend weeks reading the spec, writing and debugging a floating point reference model for an FFT, and checking functionality in SystemVerilog, we can automatically generate the DPI components from the system-level models. They are immediately available to the hardware verification team.
Image 11: HDL Verifier SystemVerilog DPI component generation automatically generates system-level intent in a format that is consumable by the verification environment. This reduces manual efforts and possible specification misinterpretations. © 2015 The MathWorks, Inc.
Intelligent Testbench Automation with UVM and Questa
by Marcela Simkova, Principal Consultant, and Neil Hand, VP of Marketing and Business Development, Codasip Ltd.
This article describes an automated approach to improve design coverage by utilizing genetic algorithms added to standard UVM verification environments running in Questa® from Mentor Graphics®. To demonstrate the effectiveness of the approach, the article will utilize real-world data from the verification of a 32-bit ASIP processor from Codasip®.

INTRODUCTION
Application-specific instruction-set processors (ASIPs) have become an integral part of embedded systems, as they can be optimized for high performance, small area, and low power consumption by changing the processor’s instruction set to complement the class of target applications. This ability makes them an ideal fit for many applications — including the Internet of Things (IoT) and medical devices — that require advanced data processing with extremely low power.

UVM GENERATION WITH CODASIP STUDIO
Codasip Studio automates the generation of UVM environments for execution in the Questa® simulator from Mentor Graphics®. In this environment, the HDL representations of either individual ASIPs or complex platforms consisting of several ASIPs, buses, memories, and other IP components act as the design under verification (DUV). The reference model, which is a very important part of the UVM architecture, is automatically generated from the high-level IA model of an ASIP and from C++ models of external IP components. A generator of random applications for the target ASIP is included in the generated ASIP toolchain and will be used to generate stimuli for ASIP verification. For illustration, see Figure 1.
… using the Application Loader component, and the verification in simulation is started immediately.

The following is an example of simplified UVM source code of the CodixRiscChromosome class, and of the run_phase of the CodixRiscTest class.

class CodixRiscTest extends CodixRiscTestBase;
  // registration of component tools
  `uvm_component_utils( codix_risc_platform_ca_t_test )
  …
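The rest of the Codasip code is elided above. Purely to illustrate the general loop the article describes (evaluate a population of chromosomes, use coverage as the fitness value, keep the best, and generate a new population), a hypothetical test might be organized along these lines; every class, method, and constant here is invented and is not the Codasip Studio code:

   // Hypothetical GA-driven test loop (illustrative only); assumes
   // import uvm_pkg::* and `include "uvm_macros.svh" in the enclosing package.
   class ga_chromosome extends uvm_object;
     `uvm_object_utils(ga_chromosome)
     rand int unsigned knobs[8];   // constraints handed to the application generator
     real fitness;                 // coverage achieved by this chromosome
     function new(string name = "ga_chromosome");
       super.new(name);
     endfunction
   endclass

   class ga_test extends uvm_test;
     `uvm_component_utils(ga_test)
     localparam int POP_SIZE = 10;
     localparam int NUM_GENERATIONS = 20;
     ga_chromosome population[$];

     function new(string name, uvm_component parent);
       super.new(name, parent);
     endfunction

     virtual task run_phase(uvm_phase phase);
       phase.raise_objection(this);
       // initial random population
       for (int i = 0; i < POP_SIZE; i++) begin
         ga_chromosome c = ga_chromosome::type_id::create($sformatf("chrom_%0d", i));
         void'(c.randomize());
         population.push_back(c);
       end
       repeat (NUM_GENERATIONS) begin
         foreach (population[i])
           // generate an application constrained by the chromosome, run it on
           // the DUV and reference model, and use total coverage as fitness
           population[i].fitness = evaluate(population[i]);
         next_generation(population);  // selection, crossover and mutation
       end
       phase.drop_objection(this);
     endtask

     // Stubs standing in for the real generator, simulation and coverage hooks
     virtual function real evaluate(ga_chromosome c);
       return $urandom_range(0, 1000) / 10.0;
     endfunction
     virtual function void next_generation(ref ga_chromosome pop[$]);
     endfunction
   endclass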
We compared the effectiveness of the GA optimization to two standard approaches. In the first approach, the benchmark applications are used for verification of the ASIP. In the second approach, the random application generator is used. The proposed GA-driven approach also uses the random application generator but adds additional constraints to the generator encoded in the chromosome. The graph in Figure 3 demonstrates the results of these three approaches. It compares average values from 20 different measures for each approach (using various seeds). The x-axis represents the number of evaluated applications on the Codix-RISC processor, and the y-axis shows the achieved level of total coverage.

The average level of coverage achieved by 1000 benchmark applications was 88 percent. The average level of coverage achieved by 1000 random applications was 97.3 percent. In the GA-driven approach, the random generator produced random applications for 20, 50, and 100 generations (the 100 generations scenario is captured in the graph). The size of the population was always the same: ten chromosomes. For the test with 20 generations, the average level of coverage was 97.78 percent. For the test with 50 generations, the average level of coverage was 98.72 percent. For the test with 100 generations, the average level of coverage was 98.89 percent.

The results show that for all tests utilizing GA optimization, we were able to achieve better coverage than with the standard approaches. It is important to mention another significant advantage of this approach. When selecting the applications generated for the best chromosome in each generation, the result is a set of very few applications with very good coverage. So, for example, for the GA optimization running 20 generations we can get the best 20 applications from every generation that are able to achieve 98.1 percent coverage. These applications can be used for very efficient regression testing.

The computational burden of the GA optimization was as follows. Evaluating one application in simulation took an average of 12.626 seconds, generating one application around one second, and preparing a new population 0.095 second. Experiments ran on a 3.33 GHz Intel® Core™ i5 CPU with 8 GB of RAM using the Questa simulator.

SUMMARY
We have shown that GA easily integrates into a Questa UVM verification environment, is computationally efficient, and significantly reduces the time and effort required to achieve the needed coverage for ASIP-based designs.
Unit Testing Your Way to a Reliable Testbench
by Neil Johnson, Principal Consultant, XtremeEDA, and Mark Glasser, Principal Engineer, Verification Architect, NVIDIA
AN INTRODUCTION TO UNIT TESTING
Writing tests, particularly unit tests, can be a tedious chore. More tedious - not to mention frustrating - is debugging testbench code as project schedules tighten and release pressure builds. With quality being a non-negotiable aspect of hardware development, verification is a pay-me-now or pay-me-later activity that cannot be avoided. Building and running unit tests has a cost, but there is a greater cost of not unit testing. Unit testing is a proactive pay-now technique that helps avoid running up debts that become much more expensive to pay later.

Despite academics and software developers advocating the practice of writing the test suite before you write the code, this is rarely, if ever, done by hardware developers or verification engineers. This applies to design, and it is also typical in verification, where dedicating time to test testbench code is not generally part of a verification project plan. As a result, testbench bugs discovered late in the process can be very expensive to fix and add uncertainty to a project plan. Even worse, they can mask RTL bugs, making it possible for them to reach customers undetected.

Unit testing is a technique borrowed from software development. It is a low-level, yet effective verification activity where developers isolate and test small features in a class or program. By showing that individual features and functions are working correctly, developers reduce the likelihood of bugs infiltrating either subsystem or chip level testbenches. In short, unit testing can greatly reduce testbench bugs, making the entire verification process more reliable and cost effective.

There are several forces driving the need for testbench quality. One is the size of the testbench effort and another is randomized testing methodologies. Another obvious one is that the testbench is the arbiter of design quality. The quality of the product you will ultimately sell to customers is only as good as the quality of its testbench.

Early in the history of verification, testbench code was considered throw-away code. It only had to be good enough to demonstrate that the DUT is working (however “working” was defined) and that was it. Once the design went to fabrication the testbench code became expendable. In recent years, as designs have become orders of magnitude more complex and verification teams have realized the extent of the investment required to build testbenches, the need for reuse began to emerge. Code that will be in your verification library for years and used to verify many designs must be highly reliable.

Randomized testing as a driver of testbench quality is less obvious but no less significant. When much of your stimulus is randomized you cannot tell a priori what will happen in the DUT and thus what exactly will happen in the testbench. You rely heavily on the checkers and scoreboards to give you good information about the correctness of the DUT operation. In general, you are relying on the testbench infrastructure to always do the right thing in the presence of highly randomized stimuli where you are looking for interesting and hard-to-reach corner cases. Since randomized stimulus is, by its very nature, unpredictable, you have to be sure that the testbench does the right thing no matter what.

UNIT TESTING TESTBENCH COMPONENTS WITH SVUNIT
SVUnit, an open-source SystemVerilog-based unit testing framework, provides a lightweight but powerful infrastructure for writing unit-level tests for Verilog testbench code.1 It has been modeled after successful software frameworks, like JUnit, with the intention of providing a similar base level of functionality and support. While relatively new to the hardware community, neither SVUnit nor its application are novel ideas.

The SVUnit release package includes a combination of scripts and a Verilog framework. The usage model is meant to be complete yet simple; developers have everything they need to write and run unit tests with a short ramp-up period.

Code generator scripts written in Perl are used to create Verilog test infrastructure and code templates. The generated infrastructure ensures that tests are coded and reported in a consistent fashion. Users write tests within the generated templates, then use a single command line script - runSVUnit - to execute the tests. The runSVUnit script supports many popular EDA simulators, including Mentor Graphics® Questa®.2
Your First SVUnit Project
SVUnit can be used to test any or all of Verilog modules, classes, or interfaces. In this example, the unit under test (UUT) is a class called simple_model, which is a UVM functional model that retrieves a simple_xaction from an input channel, performs a simple transformation (multiply 'xaction.field' by 2), and sends the modified transaction to an output channel3. The public interface to simple_model is shown in figure 1.

In SVUnit, tests are defined within the `SVTEST and `SVTEST_END macros. The macros are important because they let users focus on test content while forgetting about the mechanics of the underlying framework. A basic test to illustrate macro usage is xformation_test, a test that ensures the simple_model data transformation happens as expected.
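A minimal sketch of such a test is shown below. It follows the general shape of an SVUnit-generated template, but the details are illustrative only: the put()/get() calls, the constructor arguments, and the field values are assumptions rather than the actual simple_model interface from figure 1, and template specifics vary between SVUnit releases.

    `include "svunit_defines.svh"

    module simple_model_unit_test;
      import svunit_pkg::svunit_testcase;

      string name = "simple_model_ut";
      svunit_testcase svunit_ut;

      // Unit under test; constructor arguments are assumed.
      simple_model   my_simple_model;
      simple_xaction in_tr, out_tr;

      function void build();
        svunit_ut = new(name);
        my_simple_model = new("my_simple_model", null);
      endfunction

      task setup();
        svunit_ut.setup();
      endtask

      task teardown();
        svunit_ut.teardown();
      endtask

      `SVUNIT_TESTS_BEGIN

      // Check that 'field' is multiplied by 2 on its way through the model.
      `SVTEST(xformation_test)
        in_tr = new();
        in_tr.field = 5;
        my_simple_model.put(in_tr);   // hypothetical input API
        my_simple_model.get(out_tr);  // hypothetical output API
        `FAIL_UNLESS(out_tr.field == 10)
      `SVTEST_END

      `SVUNIT_TESTS_END
    endmodule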
Figure 5 - Running unit tests with Questa

Assuming simple_model performs the data transformation as expected, a passing status is reported in the log file and the simulation exits with a passing status.

Figure 6 - Passing log output

TESTSUITES AND SVUNIT AT SCALE
To enable verification engineers to easily test all the components in a testbench, SVUnit scales such that multiple templates can be run within the same executable using a single call to runSVUnit. For example, when a simple_driver component is added to the testbench, a corresponding simple_driver_unit_test template can be created and run along with the simple_model_unit_test template. With unit tests running against both components, example log output from a single simulation would appear as in figure 9 (NOTE: simple_model unit tests are labelled [simple_model_ut] and simple_driver unit tests are labelled [simple_driver_ut]).
HOW UNIT TESTING COMPLEMENTS CURRENT BEST PRACTICES
Before you can begin to collect meaningful coverage data, a testbench must be reliable. For example, in a highly randomized environment, scoreboards and checkers are critical to ensuring that correct behaviors are observed (coverage) and defects are found. When scoreboards and checkers are not reliable, coverage data is thrown into question, thereby threatening to undermine the quality of the design-under-test (DUT). Testbench reliability, therefore, is critical.

Commonly, our industry produces reliability through application of a testbench in situ. That is, a testbench is completed and integrated with the DUT, then becomes incrementally more reliable as bugs are found and fixed using black-box test scenarios against the DUT. Initially, testbench reliability is extremely low. This is especially true for complex constrained-random testbenches. Testbench bugs are discovered frequently; getting through testbench "bring-up" (i.e., reaching the point at which a testbench is reliable enough to properly drive and check test scenarios against the DUT) is time consuming and reliability improves slowly.

Aside from the addition of an open-source framework like SVUnit and a simulator, there are no other tool or licensing requirements. Nor are teams required to replace existing practices. In short, unit testing is a cheap, low-risk, high-reward complement to existing best practices.

LESSONS LEARNED FOR VERIFICATION ENGINEERS
Shortly after you start coding a project you need to verify your assumptions. You ask "does the code work more or less the way I think it should?" It can be difficult to answer this question until you have a fair amount of code in place. It's important to create some confidence that your code is viable before you write too much. You want to get some feedback on your code and avoid rework. You can compile the code to make sure it is at least self-consistent from the compiler's perspective. But that doesn't tell you if anything actually works. You have to build some sort of working example, which at the early coding stages can be time consuming and seem tangential to the work at hand. The alternative is to build a series of small programs to instantiate and exercise your classes. This can be a tedious exercise because you have to create all of the infrastructure necessary to make a complete, working program for each test. SVUnit can help with this phase of testing. It will generate the infrastructure code so that all you have to do is supply the interesting parts of the test. You can build a small test or two at first and keep adding on.

The availability of SVUnit becomes an encouragement to write tests as you write the code. As you finish a tricky piece of code and want to know whether the search works correctly or a loop terminates after the proper number of iterations, you can quickly write a few lines, or maybe a few tens of lines, of code to try it out. This greatly increases confidence in your new code and does not detract from the coding effort.

SUMMARY
Spending time on testing the units that comprise your testbench is time well spent. SVUnit, a SystemVerilog unit testing framework built in the mold of JUnit, gives you the tools to start writing unit tests fairly quickly and easily. There's no longer an excuse to avoid building unit tests. So what are you waiting for?

You can download the complete open source SVUnit framework, including examples, from SourceForge at http://sourceforge.net/projects/svunit.
Hardware Emulation: Three Decades of Evolution – Part II
by Dr. Lauro Rizzatti, Verification Consultant, Rizzatti LLC
THE SECOND DECADE
In the second decade, the hardware emulation landscape changed considerably with a few mergers and acquisitions and new players entering the market. The hardware emulators improved notably via new architectures based on custom ASICs. The supporting software improved remarkably and new modes of deployment were devised. The customer base expanded outside the niche of processors and graphics, and hardware emulation slowly attracted more and more attention.

While commercial FPGAs continued to be used in mainstream emulation systems of the time (i.e., Quickturn, Zycad and IKOS), four companies — three startups plus IBM — pioneered different approaches.

IBM continued the experimentation it started a decade earlier with the YSE and EVE. By 1995, it had perfected its technology, based on arrays of simple Boolean processors that processed a design data structure stored in a large memory via a scheduling mechanism. The technology was now applicable to emulation. While IBM never launched a commercial product, in 1995 it signed an exclusive OEM agreement with Quickturn that gave the partner the right to deploy the technology in a new emulation product.

By then, Quickturn had grown disappointed with the difficulties posed by the adoption of commercial FPGAs in an emulation system. To reach adequate design capacity, it was necessary to interconnect many hundreds of FPGAs mounted on several boards. Partitioning and routing such a huge array of FPGAs became a challenging task, with setups on the order of many months. Design visibility had to be implemented through the compilation process, which competed for routing resources with the DUT and killed fast design iterations. Finally, the system did not scale linearly with the increase of design size, suffering significant performance drops.

The IBM technology promised to address all of these shortcomings:

• Very slow setup and compilation time
• Rather poor debugging capabilities
• Significant drop in execution speed at the increase of design size

A drawback of that technology, not appreciated at the time, was potentially higher power consumption than in the FPGA approach for the same design capacity.

In 1997, Quickturn introduced the Concurrent Broadcast Array Logic Technology (CoBALT) emulator, based on the IBM technology, that became known as a processor-based emulator.

In 1998, Cadence® purchased Quickturn and over time launched five generations of processor-based emulators under the name of Palladium®. Two or so years later, Cadence discontinued the FPGA-based approach, including an experimental custom FPGA-based emulator called Mercury Plus.

The idea of developing a custom FPGA targeted to emulation came from a French startup by the name of Meta Systems1. Conceived as a programmable device similar to an FPGA but customized for emulation applications, the Meta custom FPGA would have been a poor choice as a general-purpose FPGA. Its fabric included configurable elements, a brilliant interconnect matrix, embedded multi-port memories, I/O channels, a debug engine with probing circuitry based on on-board memories, and clock generators.

The approach yielded three benefits:

• Easy setup time and fast compilation time
• Total design visibility without compilation
• Scalability with the increase of design size

In fact, the Meta custom FPGA provided the same benefits as the processor-based approach, with less power consumption.

The processor-based approach was not unique to IBM. It was also used by Arkos, a startup with the lifespan of a falling star in a clear August night. After being acquired by Synopsys® in 1996, it was sold soon after to Quickturn.

In the course of the second decade, significant progress was made in several aspects of the hardware emulator. For example, by the mid-2000s, design capacity increased more than 10-fold to 20+ million ASIC-equivalent gates in a single
chassis. By then, all vendors supported multi-chassis configurations that expanded the total capacity to well over 100 million gates. Speed approached the threshold of 1 MHz. Multiple concurrent-user capabilities began to show up in datasheets.

Major enhancements were made in the supporting software. The compiler technology saw progress across the board. The two popular HDL languages, Verilog and VHDL, were supported. Synthesis and partitioning were improved.

New modes of deployment were concocted, in addition to ICE. It was now possible to connect an HDL testbench running on the host PC to a DUT mapped inside the high-speed emulator. This approach leveraged the existing RTL/HDL testbench and eliminated the need for the external rate adapters necessary to support ICE. It became known as simulation acceleration mode. As good as it sounded, it traded speed for flexibility. The weak link was the PLI interface between the simulator in charge of the testbench and the emulator in charge of the DUT. Typically, the acceleration factor was limited to a low single digit.

To address this drawback, IKOS2 pioneered a new approach called transaction-based acceleration or TBX3. TBX raised the abstraction level of the testbench by moving the signal-level interface to the emulated DUT within the emulator and introducing a transaction-level interface in its place. The scheme achieved up to a million times faster execution speed, and simplified the writing of the testbench.

Another mode of deployment, called targetless emulation, consisted of mapping the testbench together with the DUT onto the emulator. By removing the performance dependency on the software-based testbench executed on the host PC, it was possible to achieve the maximum speed of execution allowed by the emulator. The caveat was that the testbench had to be synthesizable, hence the name of Synthesizable Testbench (STB) mode.

Debugging also improved radically. One of the benefits of the processor-based emulators, as well as of the custom FPGA-based emulators, was 100% visibility into the design at run-time without requiring compilation. This led to very fast iteration times.

The cost of emulation decreased on a per-gate basis by 10X.

By the turn of the century, it seemed that emulators built on arrays of commercial FPGAs were destined for the dust bin. But two startups proved that premise to be false.

Although only a few years had passed since Quickturn's dreadful experience with commercial FPGAs, a new breed of FPGAs developed by Xilinx and Altera changed the landscape forever. Fully loaded with programming resources and enriched with extensive routing resources, they boasted high capacity, fast speed of execution and faster place & route time. The Virtex® family from Xilinx also included a read-back mechanism that provided full visibility of all registers and memory banks without requiring compilation. This capability came at the expense of a dramatic drop in speed during the read-back operation. All of the above were a windfall for two new players.

In 1999, Axis4, a startup in Silicon Valley led by entrepreneurs from China, introduced a simulation accelerator based on a patented Re-Configurable Computing (RCC) technology that provided accelerated simulation. The technology was implemented in an array of FPGAs called Excite. This was followed by an emulator built on the same technological foundation with the name of Extreme. Extreme became successful for its ability to swap a design from the emulator onto a proprietary simulator to take advantage of the debugging interactivity of the simulator. This feature was called Hot-Swap.

On the other side of the Atlantic, a French startup named Emulation Verification Engineering (EVE), led by four French engineers who left Mentor Graphics in 2000, developed an emulator implemented on a PC card with two of the largest Xilinx Virtex-6000/8000 devices. The product name was ZeBu, for Zero-Bugs. The implementation did not support ICE. Instead, it promoted transaction-based emulation based on a patented technology called "Reconfigurable Testbench" (RTB). The team also harnessed the read-back feature
of the Virtex devices to implement 100% design visibility
at run-time without compilation. As mentioned, the
drawback was a drop in performance during the reading
process.
ENDNOTES
1. Meta Systems was acquired by Mentor Graphics
in 1996.
2. IKOS was acquired by Mentor in 2002.
3. Today, different vendors call it Transaction-based
verification (TBV) or transaction-based acceleration
(TBA).
4. Axis was acquired by Verisity on November 16, 2004.
Three months later, Cadence purchased Verisity.
Accelerating RTL Simulation Techniques
by Lior Grinzaig, Verification Engineer, Marvell Semiconductor Ltd.
Coding style has a significant effect on simulation run times. Therefore it is imperative that the code writer examine his/her code, not only by asking the question "does the code produce the desired output?" but also "is the code economical, and if not, what can be done to improve it?" The following discussion presents some useful methods for analyzing code based on these questions.

MICRO CODE MODIFICATIONS
Sensitivity Lists/Triggering Events
The key thing to remember about a sensitivity list at an always block or a trigger event at a forever block is that when the trigger occurs, the simulator starts to execute some code. This is trivial, of course, but by asking the second question — instead of only the functional one — engineers can make the code more economical. In this sense, it is desirable to determine when a signal can be exempt from the sensitivity list or which event should be chosen for triggering.

    initial begin
      forever begin
        wait (ARVALID == 1)
          count++;
        wait (ARREADY == 1);
        @(posedge clk);
      end
    end

You can see that this code, although less trivial, functionally counts the same thing, but much more efficiently with respect to the number of calculations needed.
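For comparison, a hypothetical clock-driven version of the same counter (a sketch for illustration, not the author's original code) might look like the following. Because it is sensitive to the clock, the simulator re-evaluates it on every clock edge, even during long stretches when ARVALID is idle, whereas the wait-based version above only wakes up when the monitored signals actually change.

    // Hypothetical clock-driven equivalent (illustration only):
    // evaluated on every posedge of clk, whether or not a read
    // address handshake is in progress.
    always @(posedge clk)
      if (ARVALID && ARREADY)
        count++;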
Asynchronous example:
The following is taken from actual code found inside one of our IP blocks. It is BFM code for an internal PHY. For this example, the code has been edited to use only eight phases; the original code included 128 phases.
    always @(*)
    begin
      case ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0})
        4'd0 : OUT = IN0 ;
        4'd1 : OUT = IN1 ;
        4'd2 : OUT = IN2 ;
        4'd3 : OUT = IN3 ;
        4'd4 : OUT = IN4 ;
        4'd5 : OUT = IN5 ;
        4'd6 : OUT = IN6 ;
        4'd7 : OUT = IN7 ;
      endcase
    end

Examining this code carefully shows that for each change of IN, the always block is invoked eight times. This is due to the cascading changes of the INx signals: IN0 changes at "t" and invokes the always block, which initially processes the case logic; then IN1 changes at "t+25" and invokes the always block again, and so on, until IN7 invokes it at "t+175". Remember that the code originally supported 128 phases, so for each change there were 128 invocations. The case itself was composed of 128 options, and this module was implemented on every bit of the PHY's 128-bit bus!

This resulted in a complexity magnitude of ~M*N² (where M is the number of bits in the bus, and N is the number of phases).

    wire #(150) IN6 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0}=='d6);
    wire #(175) IN7 = IN && ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0}=='d7);

    always @(*)
    begin
      case ({DELAY_SEL_IN2, DELAY_SEL_IN1, DELAY_SEL_IN0})
        4'd0 : OUT = IN0 ;
        4'd1 : OUT = IN1 ;
        4'd2 : OUT = IN2 ;
        4'd3 : OUT = IN3 ;
        4'd4 : OUT = IN4 ;
        4'd5 : OUT = IN5 ;
        4'd6 : OUT = IN6 ;
        4'd7 : OUT = IN7 ;
      endcase
    end

Based on the assumption that the delay configuration is not reconfigured simultaneously with the module's functional data flow, we have reduced the code complexity to M*N. Actually, if we do not care in our simulation about the "analog delay" on the bus, we can simply write OUT=IN and reduce the complexity to M only.

This simple code change alone accelerated our full-chip tests (an SoC of ~40M gates) by a factor of two!
This is a large array, and looping over one million entries will take a long time. Fortunately, this time can be saved during the initial reset of the chip (before the memory is filled) by masking the first reset negedge, as the array is already filled with zeros. Beyond that, however, a different approach can be applied: using an associative array instead of a fixed array enables the array to be nullified with one command instead of a loop, as sketched below.
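The sketch below illustrates the difference with hypothetical names (fixed_mem and sparse_mem are not the article's memory model): the fixed-size array must be cleared entry by entry, while the associative array is emptied with a single delete() call.

    module mem_clear_example;
      // Fixed-size array: clearing it means touching every entry.
      logic [31:0] fixed_mem [0:1048575];

      // Associative array: only the entries actually written exist,
      // and delete() removes them all in one operation.
      logic [31:0] sparse_mem [int unsigned];

      initial begin
        foreach (fixed_mem[i])
          fixed_mem[i] = '0;   // million-iteration loop
        sparse_mem.delete();   // single call, independent of contents
      end
    endmodule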
MACRO CODE MODIFICATIONS
Forces can be used to override the normal behavior; again, this can be controlled by a parameter.

In a design with a complex clock scheme, engineers may try to find the best clock ratios that are relevant for that type of test. If the test depends on the cores, it may help to increase the core clock frequency. If the test depends on DMA activity, the core frequency can be reduced when the core is idle. It is good practice to choose ratios that are used by default for most of the tests and make adjustments only for specific ones.

CONCLUSION
Slow simulations are not necessarily decreed by fate. Engineers as well as managers should pay attention to the importance of coding economically for simulation, as well as to the different ways to analyze simulations and tackle simulation bottlenecks.
Emulation Based Approach to ISO 26262
Compliant Processors Design
by David Kaushinsky, Application Engineer, Mentor Graphics
the altitude. Approximately 1% of the cosmic rays' neutrons reach the Earth's surface, and they present a very wide energy spectrum.

2. EMC for Integrated Circuits
There are two main issues concerning electromagnetic compatibility (EMC) and ICs. The first is the electromagnetic energy emission of ICs while they are operating, and the second is the susceptibility of ICs to electromagnetic waves from the operational environment. With the ever-increasing use of electronic systems, the electromagnetic environment becomes more complex, making the EMC requirements of systems more challenging to meet.

Crosstalk between the metal lines within the chip is also a significant source of errors, especially in multi-layered chips.

3. Electromigration
Over a period of time, the flow of electrical current through metal tends to displace metal ions. In some places voids open up in the wires, leading to open circuits, and in other places ions are deposited, causing shorts. This phenomenon is known as electromigration.

IV. ARCHITECTURE OF FAULT TOLERANT PROCESSORS
Processors for fault tolerant applications are typically required to achieve the following targets: high performance, low cost, low power dissipation, and reliability. The problem is that most available processors and integrated systems-on-chip achieve only some of the targets and fail on others. This is indicated below, where relative advantages and disadvantages are listed and exemplified in later sections. The following sections survey different industry approaches to this tradeoff.5

1. Radiation Hardened (RH) Processors
In this approach, processors are fabricated on dedicated RH processes. Advantages include high tolerance to radiation effects, thanks to the RH process. In some cases, such processors achieve high performance. This can be especially true when using custom design methods similar to those employed for the design of COTS high-performance processors. This approach can offer a high level of integration, including the inclusion of special I/O controllers dedicated to fault tolerant applications.
than COTS processors, mostly due to low production quantities and the high cost of qualification, but at the same time, they are less expensive than RH processors thanks to using a regular commercial fabrication process. Finally, RHBD processors can offer high integration since they are designed as ASICs and since typically the CPU itself takes only a small portion of the silicon die.

A disadvantage is that RHBD processors are usually slower than COTS processors, since they are designed as ASIC chips and not as custom processors.

3. Single COTS Processor with Time Redundancy (SIFT)
In this approach, a single COTS processor is used together with Software Implemented Fault Tolerance (SIFT), which executes the entire software, or certain software sections, twice or more. There are two levels of granularity: instruction level redundancy, where each instruction is executed twice and additional instructions compare the results, requiring compiler transformation of the code; and procedure level redundancy, where the programmer writes the code to invoke certain procedures twice, compare the results, and use software for recovery in case of mismatch. The latter approach may also require some additional hardware to protect the critical data and the critical software. The main advantage of this approach is that it is relatively inexpensive.

The disadvantage is the major performance penalty due to the computational overhead.

are designed to boost performance. Forcing two such processors to execute in lock-step every clock cycle may require a significant slowdown of the processors.

5. Triple COTS: TMR at the system level
Triple Modular Redundancy (TMR) architectures combine three COTS processors and voting logic. The processors do not need to be stopped on an SEU. TMR offers high performance and high SEU tolerance.

The disadvantages of TMR are its high cost, requiring large area and power, as well as special hardware for voting, and usually additional hardware and software for recovery from internal SEU errors (inside the processors) that cannot be fixed by voting and require scrubbing or reset.

6. TTMR on COTS VLIW processors
COTS VLIW processors execute multiple instructions in parallel, and the parallel instruction streams are pre-programmed. Each instruction can be executed three times and the results can be compared and voted, all within the same VLIW processor. TTMR offers high performance (in fact, TTMR processors are the fastest available space processors today) and high SEU tolerance, thanks to the embedded TMR mechanism, but it is expensive, is limited to VLIW processors, and is hard to generate code for. The code executes two copies of an instruction and compares the results; on a mismatch it executes the same instruction a third time and compares for majority voting.
require extra hardware. Furthermore, they are usually fast, since they do not introduce significant delay in the execution of the target applications.

2. Simulation Based Verification
Simulation is favored since it allows the testing of fault tolerant systems very early in the design stage. If an HDL description of the system is available, testing through simulation can be performed in great detail, and it is potentially very accurate since it gives realistic emulation of faults and detailed monitoring of their consequences on the system.

3. Physical Level Validation
Injection of physical faults on the actual target system hardware can be achieved through pin-level fault injection, heavy-ion radiation, electromagnetic interference, and laser fault injection. The major advantage of these approaches is that the environment is realistic and the results obtained can give accurate information on the behavior of the system under such conditions.

4. FPGA Based Verification
This technique can allow the designer to study the actual behavior of the circuit in the application environment, taking into account real-time interactions. However, when an emulator is used, the initial VHDL description must be synthesizable.

5. Emulation Based Verification
Emulation enables pre-silicon fault injection and debug at hardware speeds, using real-world data. Scenarios of real-time software and hardware fault injection debug with simulation-like visibility are achieved.

The advantages and disadvantages of these techniques are summarized below.

Physical Level
Advantages: Fast; can access locations that are hard to access by other means; high time-resolution for hardware triggering and monitoring; well suited for low-level fault models; not intrusive; no model development or validation required; able to model permanent faults at the pin level.
Disadvantages: Risk of damage to the system under test; low portability and observability; limited set of injection points and limited set of injectable faults; requires special hardware; debug is hard; limited coverage.

Software
Advantages: Can be targeted to applications and operating systems; experiments can be run in near real-time; no specific hardware; no model development or validation required; can be expanded for new classes of faults.
Disadvantages: Limited set of injection instants; cannot inject faults into locations that are inaccessible to software; requires instrumentation of the source code; limited observability and controllability; difficult to model permanent faults.

Simulation
Advantages: Supports all abstraction levels; non-intrusive; full control of both fault models and injection mechanisms; does not require any special-purpose hardware; maximum observability and controllability.
Disadvantages: Slow; the model is not readily available; no real-time faults; coverage is limited.
Simulation (cont.)
Advantages: Allows performing reliability assessment at different stages in the design process; able to model both transient and permanent faults.

FPGA Prototype
Advantages: Injection time is faster than with simulation-based techniques; experimentation time can be reduced by implementing the input pattern generation in the FPGA, since these patterns are already known when the circuit to analyze is synthesized.
Disadvantages: High effort of partitioning and synthesis, and limited signal visibility, resulting in long debug cycles; intrusive instrumentation techniques; only the functional behavior of the injected fault is tested; unanticipated behavior analysis is hard.

VI. A FRAMEWORK FOR VELOCE® BASED FAULT INJECTION
We suggest the following scheme for implementing a generic fault injection system using the Veloce emulator.

1. Faults, Errors, Failures
a) Fault—A fault is a deviation in a hardware or software component from its intended function. Faults can be categorized into permanent and transient faults by their duration.
b) Error—An error is the manifestation of a fault on the observed interfaces.
c) Failure—A failure is defined as the deviation of the delivered service from the specified service.
c. Create a Fault Injection DB.
d. Instantiate in the top with a Golden Model (GM) DUT and a Fault Injected Model (FIM) DUT.

2. Emulation Phase
a. Run the Golden Model (GM) and capture all interface signals to a log.
b. Run the Fault Injected Model (FIM) and capture interface signals to a log.
c. Post-process: compare GM vs. FIM.

3. Evaluation Phase
a. Analyze results.
b. Create reports.
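To make the GM-versus-FIM idea concrete, the following is a minimal simulation-style sketch. The module and signal names (proc_core, status_reg, data_out) are hypothetical, and the article's actual flow runs the instrumented design on the Veloce emulator and compares captured logs in post-processing rather than using an in-simulation checker; this sketch only illustrates the principle of injecting a transient fault into the FIM while the golden model runs untouched.

    module fault_injection_tb;
      logic clk = 0;
      always #5 clk = ~clk;

      // Same design instantiated twice: golden and fault-injected.
      proc_core gm_dut  (.clk(clk) /* other ports omitted */);
      proc_core fim_dut (.clk(clk) /* other ports omitted */);

      // Transient, SEU-like fault: flip one bit of an internal register
      // of the FIM for a short time, then let it run on.
      logic [7:0] corrupted;   // assumes an 8-bit status_reg in the core
      initial begin
        #1000;
        corrupted = fim_dut.status_reg ^ 8'h08;    // bit 3 flipped
        force fim_dut.status_reg = corrupted;
        #10;
        release fim_dut.status_reg;
      end

      // Observe the interfaces: a divergence is an error; one that
      // reaches the delivered service is a failure.
      always @(posedge clk)
        if (fim_dut.data_out !== gm_dut.data_out)
          $display("%0t: GM/FIM mismatch on data_out", $time);
    endmodule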
VII. CONCLUSIONS
Reliability and safety are of major importance to the introduction of automotive drive-by-wire ISO 26262 compliant systems. Their required high safety integrity necessitates that all electronic components be fault tolerant with regard to failures in electronic hardware and software. Fault-tolerant processor properties can be obtained primarily by static or dynamic redundancy, leading to systems that are fail-operational for at least one failure.

The comparison of different fault injection techniques leads to the conclusion that the emulation-based approach has key advantages for achieving the goals required for fault tolerance.
VIII. REFERENCES
1. ISO 26262, Road vehicles – Functional safety – Part 5: Product development at the hardware level.
2. ISO 26262, Road vehicles – Functional safety – Part 10: Guideline.
3. E. Touloupis, "A fault tolerant microarchitecture for safety-related automotive control," Doctoral Thesis, 2005. https://dspace.lboro.ac.uk/2134/14402
4. R. Isermann, R. Schwarz, and S. Stolzl, "Fault-tolerant drive-by-wire systems," IEEE Control Systems Magazine, 22(5):64–81, Oct 2002.
5. R. Ginosar, "A survey of processors for space," in Data Systems in Aerospace (DASIA), Eurospace, May 2012.
6. H. Ziade, R. Ayoubi, and R. Velazco, "A Survey on Fault Injection Techniques," The International Arab Journal of Information Technology, Vol. 1, No. 2, pp. 171-186, July 2004.
Resolving the Limitations of a Traditional VIP for PHY Verification
by Amit Tanwar and Manoj Manu, Questa VIP Engineering, Mentor Graphics
Because of the complexities involved in the entire design verification flow, a traditional verification IP (VIP) tends to overlook the subtle aspects of physical layer (PHY) verification, often leading to costly debug phases later in the verification cycle.

In addition, because of the several possible topologies in a PHY implementation, completely exercising the role and related functionality of a PHY becomes challenging for a traditional VIP.

Furthermore, the analog signaling and the homologous functionality of the physical layer in serial protocols led the industry to define a common PHY that multiple protocols could use and that segregates the PHY logic from that of the general ASIC. One such common PHY is used in PCI Express, USB 3.0 and 3.1, and SATA protocols. Similarly, M-PHY is used in SSIC, M-PCIe, and LLI protocols, among others.

The common PHY found in PCI Express, USB 3.0 and 3.1, and SATA devices helps accelerate development of these devices by implementing the physical layer functionality as a discrete IC or macro cell, which can be easily included in ASIC designs.

In bus-based layered protocols, the PHY typically provides the following functionality:

A comprehensive PHY verification plan must verify all of the PHY functionality in various conditions. However, a traditional VIP tends to miss out on verifying all of the functionality.

Note: In this article, the PHY features are described in the context of PCI Express and USB protocols. However, in terms of the PHY verification methodology, this article applies to all serial protocols that use a common PHY.

PHY VERIFICATION ENVIRONMENT
Usually, a PHY verification environment requires one bus functional model (BFM) at the serial interface and another at the PIPE interface.

In the following figure, the VIP acts as the USB host or the PCIe RC at the PIPE interface and as the USB device or the PCIe EP at the serial interface.

However, in the following figure, the connections are flipped.
The following are a few examples of the error-prone scenarios that might be encountered when setting up the pin connections:

Note: A connection file in the PHY verification kit contains special notes for the signals whose width changed across specification versions. These special notes provide guidelines to connect a signal that requires particular attention.
There can be many other error-prone scenarios depending on the design being verified. Using a PHY verification kit minimizes the occurrence of these error-prone scenarios but cannot eradicate all the issues completely.

The PHY verification kit enables engineers to focus on the main configuration problem in PHY verification, which is the enabling of VIP features that are not supported in the PHY.

The following are a few examples of the error-prone scenarios that might be encountered during configuration:

1) The VIP is configured for a clock frequency transition when the PHY supports changing the PIPE width during a speed change. This configuration causes deadlock.
2) In the PIPE width change configuration, the engineer forgets to set the initial PIPE width at which the link up must occur.
3) Generally, the MAC performs data scrambling on the PIPE interface. However, if the PHY is also configured for scrambling, this configuration leads to unrecognized data at the other end.

Configuring or debugging issues because of an incorrect or unset PHY configuration requires considerable time later in the verification cycle. Therefore, it is always recommended to set the correct configuration right at the beginning. The PHY verification kit enables engineers to set all the relevant PHY configurations right at the beginning.

After the engineer finishes connecting the pins and setting up all the relevant configurations, the next objective is to ensure the connection links up and to initiate the first transaction. More often than not, most link up issues occur in the initial PIPE signal handshaking; for example, the receiver detection.

The following are a few examples of the error-prone scenarios that might be encountered during the link up process:

1) No receiver detection initiative from the MAC occurs because the PHY configured the PhyStatus signal as low.
2) The PHY requires another receiver detection attempt.
3) The PHY does not respond to the receiver detection attempt because of a reset issue.
4) The PHY does not respond to the change in the PowerDown signal.
5) The timing of the RxElecIdle signal is not correct, which causes even a valid packet to be truncated in the middle.
6) During the first equalization process, the PHY does not respond because of incorrect coefficient values.
7) During the first speed change, the PHY does not respond to the change in the Rate signal from the MAC.
8) The PHY is unable to perform a speed change in the recovery speed because of a short timer value.
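Scenarios like 4) above lend themselves to simple protocol assertions in the testbench. The following is a rough sketch only, not part of the PHY verification kit: it assumes PIPE-style PowerDown and PhyStatus signals, an assumed 2-bit PowerDown width, and a hypothetical response window of 1000 cycles, and simply checks that every PowerDown change is eventually acknowledged by a PhyStatus pulse.

    // Hypothetical checker sketch; widths and the timeout value are
    // assumptions, not taken from the PIPE specification or the kit.
    module pipe_powerdown_check (
      input logic       pclk,
      input logic [1:0] PowerDown,
      input logic       PhyStatus
    );
      property p_powerdown_ack;
        @(posedge pclk)
          !$stable(PowerDown) |-> ##[1:1000] PhyStatus;
      endproperty

      assert property (p_powerdown_ack)
        else $error("PHY did not assert PhyStatus after a PowerDown change");
    endmodule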
A good test plan is required to have robust test cases. The process of writing test cases for a PHY is slightly different than for the physical layer of an SoC.

For example, consider a VIP that provides a test-suite for all layers. Could only the physical layer test-suite be extracted to verify the PHY? Probably yes. However, can the extracted test-suite provide comprehensive PHY verification? Probably not, because the physical-layer test-suite targets the protocol SoC, and, in some specific cases, particular tests might not cover stand-alone PHY verification requirements. In addition, specific test cases might be required to cover stand-alone PHY verification corner cases; for example, a test case to verify whether the PHY performs frequency compensation. A PHY verification kit removes these limitations by providing a test-suite with a test plan that targets 100 percent PHY verification.

The following are a few more examples of test cases targeting stand-alone PHY verification:

1) Inject a disparity error from the serial side and expect a decode or disparity error code in the RxStatus signal on the PIPE side.
2) Configure the serial side as the loopback master and check whether the PHY performs loopback after seeing the asserted TxDetectRxLoopback signal from the MAC side.
3) Create some frequency difference between the serial and PIPE clocks and check whether the PHY performs SKP addition and deletion.
4) Check for a receiver underflow or overflow situation by creating frequency differences and stopping the SKP signal from the serial side.
5) Configure the serial side to perform polarity inversions on the differential pin so that the MAC asserts the RxPolarity signal. Then check whether the PHY performs the polarity inversion on the data received from the serial side.
6) Exercise all the power saving states to check the behavior on electrical idle state exit and entry.

REGRESSION PHASE
A regression environment for PHY verification should ensure the following:

1) The design can be compiled independently.
2) The testbench can be compiled independently.
3) The individual tests can be run in parallel.
4) Any test failures can be easily checked and reproduced.
5) Log files and waveform file names show an association with the tests.
6) Individual test coverage data is saved to the universal coverage database (UCDB) format for coverage metrics.

The PHY verification kit offers a robust regression environment that ensures that any change in the design or the testbench can be easily validated by running all the tests in one go. This saves time and eliminates manual effort.

Engineers can keep track of the verification objectives with the help of functional and code coverage metrics. They can track code coverage using switches in the simulator. To track functional coverage, they can use covergroups, coverpoints, or crosses in the verification plan.

In the verification plan, which is an XML file, all the available covergroups, coverpoints, and crosses are mapped to individual sections of the protocol specifications.
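As a generic illustration of that mechanism (a minimal sketch only; the module, covergroup, bin names, and encodings below are hypothetical and are not those of the actual verification plan), a PIPE-side coverage monitor might look like this:

    // Hypothetical functional coverage sketch for a PIPE-side monitor;
    // names and bin values are illustrative assumptions only.
    module pipe_coverage (
      input logic       pclk,
      input logic       Rate,       // assumed 1-bit speed indicator
      input logic [1:0] PowerDown
    );
      covergroup pipe_cg @(posedge pclk);
        cp_rate      : coverpoint Rate;
        cp_powerdown : coverpoint PowerDown {
          bins p0  = {2'b00};
          bins p0s = {2'b01};
          bins p1  = {2'b10};
          bins p2  = {2'b11};
        }
        // Cross link speed against power state so the regression shows
        // whether every combination was exercised.
        rate_x_power : cross cp_rate, cp_powerdown;
      endgroup

      pipe_cg cg_inst = new();
    endmodule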
Once the required covergroups are enabled, the coverage data needs to be saved in the UCDB format to view the coverage metrics in the Questa® simulator. The UCDB is a repository for all the coverage data (including code coverage, cover directives, coverpoints, and assertion coverage) collected during simulation by the Questa platform. The Questa verification platform also enables all the coverage results to be merged in the UCDB format, which is then accessible in the Questa GUI or in the form of a log file.

CONCLUSION
The limitations of a traditional VIP can be resolved using an exclusive PHY verification kit that offers the following advantages:

REFERENCES
• PHY Interface for the PCI Express, SATA and USB 3.1 Architectures, Version 4.2 specification
• PCI Express specification
• Universal Serial Bus 3.1 specification
• USB 2.0 PHY verification paper: http://www.design-reuse.com/articles/15011/usb-2-0-phy-verification.html
VERIFICATION
ACADEMY
The Most Comprehensive Resource for Verification Training
www.verificationacademy.com