Common Path Pessimism Removal:

An industry perspective
Special Session: Common Path Pessimism Removal

Vibhor Garg
Cadence Design Systems, Inc.
2655 Seely Ave, San Jose 95134 U.S.A.
[email protected]

Abstract: Process parameters, e.g., transistor width, may vary greatly not only across multiple manufacturing lots, but also within the same die from the same manufacturing lot. In addition to process variations, different parts of a chip may see different voltages and temperatures. These process-voltage-temperature (PVT) variations are termed On-Chip Variations (OCV) and can unsystematically affect wire and cell delays. This variability is accounted for by adding OCV de-ratings to path delays during static timing analysis (STA), where the original timing values are split into early (lower-bound) and late (upper-bound) quantities. Chip timing is then done against these new delays to ensure safe chip operation. Any unknown or hard-to-model variation effect can also be margined for in these OCV de-ratings. However, this additional pessimism can significantly increase the difficulty of achieving timing closure, thereby elongating the design cycle and time to market. In particular, excess pessimism along the clock network creates the most design-cycle churn, as pessimistic clock delays impact nearly all data paths. This session discusses the overview and challenges of common path pessimism removal (CPPR), the method of safely removing excess pessimism from clock paths, from an industry perspective.

Keywords: static timing analysis, pessimism removal, CPPR.

I. INTRODUCTION

Excess pessimism in the clock tree network skews the actual timing of the circuit. It leads to the conclusion that the design can operate safely only at a lower clock frequency than the actual silicon supports. This pessimism needs to be removed to achieve the true performance of the circuit. The remainder of this paper is organized as follows. First, the artificial pessimism introduced on the clock tree network due to OCV de-ratings is discussed and the term Common Path Pessimism Removal (CPPR) is explained. Next, the effects and significance of CPPR in timing and the challenges in determining CPPR and removing pessimism are discussed. Finally, a brief summary of ongoing industry work in CPPR is presented.

II. PESSIMISM ON CLOCK TREE NETWORK

Consider the circuit in Fig. 1. There is a data path from the register ff1 output pin Q to the register ff2 input pin D, with some combinational delay elements along the path. Both registers ff1 and ff2 are reached by the same clock signal Clock, defined at the port. There exists a simple clock path of buffers, with buffers buf1, buf2, buf3 in the fanin cone of register ff1 clock pin CK and buffers buf1, buf2, buf3, buf4 and buf5 in the fanin cone of register ff2 clock pin CK. Consider the hold timing check performed between pins ff2/D and ff2/CK.

The hold timing check verifies that the earliest data arrival time at ff2/D is greater than or equal to the sum of the latest clock arrival time at ff2/CK and the hold time of the capturing register ff2, where the hold time is the time needed so that the data launched by the launch register ff1 is captured by the capture register ff2 in the next clock cycle. Expressed mathematically, the slack of the hold check can be represented as:

$$\mathrm{slack}_{\mathrm{Hold}} = \mathrm{arrivalTime}_{\mathrm{Early}}(D) - \mathrm{arrivalTime}_{\mathrm{Late}}(CK) - \mathrm{delay}_{\mathrm{Hold}} \tag{1}$$

In Fig. 1, the earliest arrival time at ff2/D comprises the clock path segment (Clock, buf1, buf2, and the ff1/CK pin) and the data segment (ff1/CK, ff1/Q, and the combinational element). Each component along the path is assumed to be operating in the Best Case mode, with OCV de-ratings applied for minimum (early) delays. The latest arrival time at ff2/CK comprises the clock path Clock, buf1, buf2, buf3, buf4, buf5 and ff2/CK. Each component along the path is assumed to be operating in the Worst Case mode, with OCV de-ratings applied for maximum (late) delays.
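The hold check of Equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper gives no delay values for Fig. 1, so the nominal delays, the hold time, the two de-rating factors, and the helper names `arrival_time` and `hold_slack` are all hypothetical.

```python
# All numeric values below are hypothetical; the paper gives none for Fig. 1.
EARLY_DERATE = 0.9   # assumed early (minimum-delay) OCV de-rating factor
LATE_DERATE = 1.1    # assumed late (maximum-delay) OCV de-rating factor


def arrival_time(nominal_delays, derate):
    """Sum the nominal stage delays after scaling each by one de-rating factor."""
    return sum(delay * derate for delay in nominal_delays)


def hold_slack(early_data_arrival, late_clock_arrival, hold_time):
    """Equation (1): hold slack with no pessimism removal."""
    return early_data_arrival - late_clock_arrival - hold_time


# Nominal delays in picoseconds (assumed), following the fan-in cones of Fig. 1.
launch_clock = [100, 100, 100]              # buf1, buf2, buf3 up to ff1/CK
capture_clock = [100, 100, 100, 100, 100]   # buf1 .. buf5 up to ff2/CK
data_path = [150, 200]                      # ff1 CK->Q arc, combinational logic
hold_time = 50

# Launch side uses early de-rating, capture side uses late de-rating.
early_d = arrival_time(launch_clock + data_path, EARLY_DERATE)
late_ck = arrival_time(capture_clock, LATE_DERATE)
slack = hold_slack(early_d, late_ck, hold_time)
print(slack)
```

With these assumed values the slack is negative, even though the same three buffers are scaled by 0.9 on the launch side and by 1.1 on the capture side.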

978-1-4799-6278-5/14/$31.00 2014 IEEE 592


Observe that these OCV de-ratings have introduced excess pessimism in the clock tree network, as both the launch and capture clock paths share a common clock path comprising components buf1, buf2 and buf3. These components will always operate under the same PVT conditions (i.e., either for minimum delays or maximum delays but not both at the same time). Applying different de-ratings to these components therefore creates an artificial pessimism on the clock path, which needs to be removed. The same analysis also applies to setup checks if the PVT conditions are not expected to change significantly between clock cycles.

Common path pessimism removal (CPPR) is the removal of artificially induced pessimism between a launch and capture flip-flop pair during timing analysis by identifying the common clock path between the launch and capture clock paths. The common path pessimism value itself is the sum of delay differences along the common clock path up to the common pin due to different OCV de-ratings, and can be expressed as the difference between the latest and earliest arrival times of the clock signal at the common pin. The common pin (CP) is defined as the last pin on the clock tree before the launch and capture clock paths diverge. The CPPR value is then credited to the slack of the timed path to remove pessimism.

$$\mathrm{CPPR\ credit} = \mathrm{arrivalTime}_{\mathrm{Late}}(CP) - \mathrm{arrivalTime}_{\mathrm{Early}}(CP) \tag{2}$$

Therefore, with CPPR credit applied, the hold slack is more accurately expressed as:

$$\mathrm{slack}_{\mathrm{Hold}} = \mathrm{arrivalTime}_{\mathrm{Early}}(D) - \mathrm{arrivalTime}_{\mathrm{Late}}(CK) - \mathrm{delay}_{\mathrm{Hold}} + \mathrm{arrivalTime}_{\mathrm{Late}}(CP) - \mathrm{arrivalTime}_{\mathrm{Early}}(CP) \tag{3}$$

Similarly, the setup check slack with CPPR credit applied can be expressed as:

$$\mathrm{slack}_{\mathrm{Setup}} = \mathrm{arrivalTime}_{\mathrm{Early}}(CK) + \mathrm{Clock\ Period} - \mathrm{arrivalTime}_{\mathrm{Late}}(D) - \mathrm{delay}_{\mathrm{Setup}} + \mathrm{arrivalTime}_{\mathrm{Late}}(CP) - \mathrm{arrivalTime}_{\mathrm{Early}}(CP) \tag{4}$$

III. SIGNIFICANCE OF CPPR FOR TIMING CLOSURE

Consider the circuit of Fig. 2. Launch register ff1 and two capture registers ff2 and ff3 are reached by the same clock signal Clock. Let the late (early) rise transition delay through each buffer element be 2 (1), denoted (2, 1). Let the late (early) rise transition delay for the timing arc CK to Q on ff1 be 4 (3), denoted (4, 3). Let the hold time for capturing registers ff2 and ff3 be 1, and assume all net delays are 0.

For the hold timing check, using (1), the hold slacks for ff2 and ff3 without CPPR are:

$$\mathrm{slack}_{\mathrm{Hold}}(\mathrm{ff2}) = 8 - 10 - 1 = -3$$
$$\mathrm{slack}_{\mathrm{Hold}}(\mathrm{ff3}) = 9 - 10 - 1 = -2$$

Negative slacks show that both hold checks are failing, where the violation at the ff2/D pin is worse than the violation at ff3/D. Each flip-flop pair, ff1 and ff2, and ff1 and ff3, has a common clock path. The common pin for flop pair ff1 and ff2 is buf4/Y, and the common pin for flop pair ff1 and ff3 is buf1/Y. When the excess pessimism from the common clock path is removed, using (3), the hold slacks for ff2 and ff3 with CPPR are:

$$\mathrm{slack}_{\mathrm{Hold}}(\mathrm{ff2}) = 8 - 10 - 1 + (8 - 4) = 1$$
$$\mathrm{slack}_{\mathrm{Hold}}(\mathrm{ff3}) = 9 - 10 - 1 + (2 - 1) = -1$$

The example above illustrates the following key points about the effect of removing excess pessimism during STA.

i. The slacks at the end points improve, thereby producing design timing that is less violating than before. Before CPPR, the total negative slack (TNS) for the design was -5; after CPPR, it is -1.

ii. True critical paths in the design are exposed (i.e., false positives are eliminated), and an accurate measure of the slack violations of timing checks is determined. Before CPPR, ff2/D was shown as the more critical path; after CPPR, ff3/D is shown to be the more critical path.

Both observations are significant for full chip design implementation flows, as the synthesis and optimization tools can focus their efforts on true timing-critical paths, and optimize these paths only by the amount necessary to meet the target clock frequency of the chip. For instance, without CPPR analysis, an optimization tool may try to meet chip timing by inserting (a minimum of) 3 buffers on the data path to improve its slack. After CPPR, an optimization tool may need to insert only a single buffer to remove timing violations. In addition to the reduced design-cycle time, removing pessimism also improves chip yield (e.g., area) and performance (e.g., power and leakage).

Table 1 shows TNS and number of violating endpoints improvements with CPPR credit applied for setup check.

Table 1
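The arithmetic of the Fig. 2 example can be replayed with a short Python sketch of Equations (1) to (3). The arrival times, hold time, and common-pin times are the values quoted in the example above; the function and variable names are hypothetical and chosen only for illustration.

```python
def hold_slack(early_d, late_ck, hold_time, cp_late=0, cp_early=0):
    """Equation (3). With the default cp_late == cp_early == 0 it reduces to Equation (1)."""
    cppr_credit = cp_late - cp_early          # Equation (2)
    return early_d - late_ck - hold_time + cppr_credit


def total_negative_slack(slacks):
    """TNS: the sum of the negative slacks only."""
    return sum(s for s in slacks if s < 0)


# Values quoted in the Fig. 2 example: early arrival at D, late arrival at CK,
# hold time, and late/early clock arrival at the common pin (CP).
checks = {
    "ff2": dict(early_d=8, late_ck=10, hold_time=1, cp_late=8, cp_early=4),  # CP = buf4/Y
    "ff3": dict(early_d=9, late_ck=10, hold_time=1, cp_late=2, cp_early=1),  # CP = buf1/Y
}

without_cppr = {
    ff: hold_slack(c["early_d"], c["late_ck"], c["hold_time"]) for ff, c in checks.items()
}
with_cppr = {ff: hold_slack(**c) for ff, c in checks.items()}

print(without_cppr, total_negative_slack(without_cppr.values()))  # {'ff2': -3, 'ff3': -2} -5
print(with_cppr, total_negative_slack(with_cppr.values()))        # {'ff2': 1, 'ff3': -1} -1
```

The two printed lines reproduce both key points: TNS improves from -5 to -1, and the most critical endpoint moves from ff2 to ff3.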

Table 2 shows TNS and number of violating endpoints improvements with CPPR credit applied for hold check.

Table 2

IV. CHALLENGES IN COMMON PATH PESSIMISM REMOVAL

The biggest inherent challenge in common path pessimism removal is that the amount of pessimism that needs to be removed is path specific. For a given (i) launch-capture flop pair, (ii) check type and (iii) transition set that launches and captures the data at the registers, STA tools need to traverse the clock tree for the launch and capture register clock pins to determine the common pin in the clock network. Therefore, the runtime and memory required to compute CPPR credit increase exponentially as the design size and complexity grow. Furthermore, the launch-capture flop pair relation is not known a priori, thereby limiting the potential of efficiently pre-computing the common ancestors of register clock pin pairs in the design.

As today's devices grow in size and complexity to handle multiple applications, the clock network is getting increasingly complex, with multiple divergence and re-convergence points. Using a simplistic approach to determine the common path pessimism by considering only the worst and best paths for a timing check can lead to an optimistic analysis, where a sub-critical path may be the active path for clock propagation during chip operation. Therefore, all possible launch and capture path pairs for a given flop pair must be analyzed in order to compute a safe CPPR credit. However, the analysis of these re-convergent clock paths exponentially increases the difficulty in determining a pessimistic CPPR credit. In practice, an exhaustive search for all possible paths is infeasible, especially for realistic turnaround times.

Multiple functional modes, higher clock frequency and low power usage modes also increase complexity for CPPR. With more than one functional mode being simultaneously active in different parts of the chip, or frequent switching from one mode to another, e.g., normal to sleep mode and vice versa, the chip design must be validated in multiple functional modes, and potentially across multiple delay and parasitic corners. Due to time-to-market pressures, designers perform timing analysis on multiple-mode, multiple-corner setups simultaneously within a single analysis run. This requires analysis of the same netlist, but with potentially different clock tree structures for different modes, to be done simultaneously. As the number of combinations of modes and delay corners increases with chip complexity, the runtime and memory required to compute CPPR credit for a given flop pair across all those combinations increase as well.

V. ONGOING INDUSTRY WORK IN CPPR

Removing pessimism from the design during timing analysis is integral to meeting chip timing, area, and power targets. Given the significance and impact of removing pessimism from the design towards meeting chip targets, it becomes imperative that today's STA tools overcome the challenges in determining CPPR credit and develop algorithms that are both efficient and scalable with increasing design complexity. To that end, commercial STA tools continue to invest heavily in research and development on this topic and explore new ideas and concepts to improve CPPR runtime and memory usage. Various complex algorithms and efficient data structures are employed to reduce the complexity by creating a smaller representative graph and then computing CPPR credit quickly and efficiently on that graph.

Within the full digital implementation flow that involves optimization at many stages, from placement to pre-CTS (clock tree synthesis) optimization to post-CTS optimization to routing to post-routing optimization, any pessimism added or left in the design during a previous stage is seen and needs to be removed at the next or later stage(s). In order to reduce the time to remove all violations from the design, CPPR credit needs to be accurately computed and applied at each stage so as to remove pessimism. Since each stage performs a different and specific task, fine-tuned algorithms need to be developed to determine and apply CPPR credit for the best timing, area and power usage results.

When timing violations or functional bugs are discovered late in the chip design cycle, engineering change orders (ECO) are often made on a very small set of design components to rectify the situation. However, the changes need to be fully qualified and their impact on chip timing needs to be determined. If the change happens to be on the data network, it usually affects a small set of components in the local vicinity of the change, and an incremental STA run can be performed quickly. However, if the change, whether a delay or a netlist change, happens to be on a clock network, it can potentially affect a huge set of components, as clock signals fan out to huge data networks. To shorten the time to market, clock network changes must also be handled in an incremental manner. Commercial STA tools today are working actively to create a framework for incremental analysis and update of CPPR credit regardless of the nature of the change. Incremental change to the clock network and its analysis is a complex and challenging problem in itself, and one of the most active areas of work in the industry.
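As a concrete reference point for the common-pin search described in Section IV, the following Python sketch finds the common pin of a launch/capture clock-pin pair and the resulting credit from Equation (2). It assumes a purely tree-shaped clock network stored as a child-to-parent map, uses the Fig. 1 topology with hypothetical pin names, and borrows the (2, 1) late/early buffer delays of Section III as an assumption. It does not handle the re-convergent clock paths discussed in Section IV, and it is not a description of how any commercial tool works.

```python
def path_from_root(parent, pin):
    """Return the list of pins from the clock root down to `pin`."""
    path = [pin]
    while pin in parent:
        pin = parent[pin]
        path.append(pin)
    path.reverse()
    return path


def common_pin(parent, launch_ck, capture_ck):
    """Last pin shared by both clock paths before they diverge (None if no shared root)."""
    common = None
    for a, b in zip(path_from_root(parent, launch_ck), path_from_root(parent, capture_ck)):
        if a != b:
            break
        common = a
    return common


def cppr_credit(parent, late_delay, early_delay, launch_ck, capture_ck):
    """Equation (2): late minus early clock arrival time at the common pin."""
    cp = common_pin(parent, launch_ck, capture_ck)
    if cp is None:
        return 0
    pins = path_from_root(parent, cp)
    late = sum(late_delay.get(p, 0) for p in pins)
    early = sum(early_delay.get(p, 0) for p in pins)
    return late - early


# Fig. 1 topology as a child -> parent map (pin names are hypothetical).
parent = {
    "buf1/Y": "Clock", "buf2/Y": "buf1/Y", "buf3/Y": "buf2/Y",
    "ff1/CK": "buf3/Y",
    "buf4/Y": "buf3/Y", "buf5/Y": "buf4/Y", "ff2/CK": "buf5/Y",
}
buffers = ["buf1/Y", "buf2/Y", "buf3/Y", "buf4/Y", "buf5/Y"]
late_delay = {b: 2 for b in buffers}    # assumed late delay per buffer
early_delay = {b: 1 for b in buffers}   # assumed early delay per buffer

print(common_pin(parent, "ff1/CK", "ff2/CK"))                            # buf3/Y
print(cppr_credit(parent, late_delay, early_delay, "ff1/CK", "ff2/CK"))  # 3
```

Each lookup walks both clock paths, so its cost grows with clock-tree depth and it is repeated per flop pair. A change to any pin on the shared segment invalidates the credit of every pair below that pin.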

The latest advances in static timing analysis of chip designs exploit newer use models that apply variable OCV de-ratings based on the depth of a component in a path and the distance travelled by a signal to complete the path. These are referred to as Advanced On-Chip Variation (AOCV) methods. Simply identifying the common pin for CPPR credit is not enough under these models. For proper pessimism removal, STA must also generate depth and distance information for a timing check at a given flop pair. New algorithms and data structures are being developed in commercial STA tools to remove pessimism accurately and efficiently with AOCV models.

