Fast timing characterization of cells in standard cell library design based on curve fitting


Kenza Charafeddine*, Faissal Ouardi
ANISSE Team, Faculty of Sciences, Mohammed V University in Rabat, Morocco
*corresponding author

Abstract—This paper presents a fast method for timing characterization of standard cell libraries. It is based on curve fitting and addresses the CPU resource and storage issues that arise when generating liberty files at large scale. In our approach, generating a characterized file for a full library takes less than 1 hour instead of 650 hours of simulation time. The paper also presents the method used to qualify and check the accuracy of the generated data on a real circuit design. Implementation results demonstrate that the error of the interpolated timing and power liberty file against the characterized one is less than 3%.

Keywords—Liberty file, Standard cell library, Characterization, Scaling, Timing

I. INTRODUCTION AND PRELIMINARIES

A standard cell library is a package of logic blocks like INVERTER, AND, OR, etc. It also contains a logical library with the timing and power characteristics of these logic cells [1]. Using a standard cell library presents a number of advantages, such as reducing product development time and adding robustness and flexibility to the designs. A number of factors should be taken into account, like area, timing performance, power, testability, etc. The library can be designed either for high performance, targeting high-speed circuits, or for high density, suitable for low-power circuits. A high-quality standard cell library is an important criterion for achieving a good and optimized implementation of RTL [2, 3, 4, 5, 6].

Characterization is a key step in the standard cell design flow. It consists of simulating each cell in the library under different conditions of process, voltage and temperature (PVT), in order to check performance against targets for timing, power consumption, leakage, etc. The results of these simulations are stored in a file format named liberty file. Liberty files also contain general information relative to the cell: input/output pins, cell height, area, details of the characterization corner (voltage, temperature, process…), units, etc. [6].

Liberty files are generally constructed around Look Up Tables (LUTs). In this approach, data are extracted from SPICE simulation. For example, transition time is reported from SPICE simulation for different values of input slew (the input transition time) and output load capacitance [8]. Fig. 1 shows an example of a fall transition timing table where index 1 represents the input slew and index 2 represents the load. The values in the table are the results of SPICE simulations of the transition time.

Fig. 1: Example of timing LUT in the liberty file

The many PVT conditions and combinations to simulate for the different cells of a standard cell library increase the time and resources needed for standard cell characterization. In our case, a full standard cell library characterization of 800 cells in 14nm technology can take up to 650 hours of simulation time in the slowest corners. It requires a large number of CPUs and a large amount of disk space, which makes this step of the standard cell design flow costly.

Many researchers have proposed new characterization methodologies for either timing or power, focusing more on accuracy than on runtime reduction [9-12]. Miryala et al. introduce a new physics-based inverter delay model that saves 51% of SPICE simulations during standard cell characterization while maintaining accuracy [12]. Even if a reduction of 51% of SPICE simulations is substantial, the characterization still needs a large amount of resources. Also, the proposed methodology has been tested on an INVERTER only. The total saving on SPICE simulations for a full library may not reach the same level as for an inverter.

In this work, a new methodology of timing characterization is proposed based on mathematical modelling. Instead of characterizing tens of PVTs, only a reduced set of corners is characterized and the others are interpolated using equations that describe the behaviour of each parameter in the liberty file. Only voltage and temperature are taken into account in our models. For each parameter in the liberty file, a different mathematical model is developed based on its physical variation with voltage and temperature to ensure high accuracy. The aim of this new methodology is to make characterization faster and save costs in this step of the standard cell design flow.

978-1-5090-6681-0/17/$31.00 ©2017 IEEE
The accuracy of the liberty file is related to the models used [5]. The presented methodology is independent of the CMOS technology used; however, all results shown in this paper are based on simulations performed on a 14nm technology library.

The structure of the paper is as follows. In section 2, we detail the specifications and the characterization flow used. The third part of this section is devoted to a fast timing and timing constraints characterization of cells. In section 3, methods used for correlation between a scaled liberty file and a characterized liberty file are presented. Accuracy is also checked on a real design by doing a full implementation and timing analysis on an ARM Cortex-A9.

II. NEW CHARACTERIZATION METHOD

In this section, we detail the fast characterization method used to generate, with high precision, a liberty file from files characterized under different corners. In this work, a corner stands for a different PVT condition. As the scaling is done for a given process in the presented methodology, a corner refers to a change of voltage, temperature, or both in the simulation conditions.

The characterization method uses available characterized liberty files to create interpolated new corners based on curve fitting. The scaling in this methodology concerns only voltage and temperature. The variation of each parameter in the liberty file with voltage and temperature is modelled using a mathematical equation. For model development, more corners were characterized initially, but only eight of them are used to find the equation parameters and to generate the scaled files. A total of 800 cells have been characterized in the present work.

A. Specifications

The choice of the initial PVTs is an important step as it impacts the accuracy. For each cell and each process, a total of eight initial characterized liberty file corners is recommended, and this is what is used in the results shown below: two temperatures and four voltages. The same structure is required for all the input liberty files and they have to cover the whole platform range.

Fig. 2: Variation of delay with voltage for an inverter

For the temperature and the voltage, we consider the minimum and the maximum values allowed by the platform. The two other intermediate values of voltage are determined so that the difference of delay between each pair of consecutive voltages is equal. Referring to Fig. 2, which represents the delay variation with voltage for an inverter, this means that Δt1 is equal to Δt2 and Δt3. It is important that cells be characterized under the same conditions for the different corners: file structure, number of pins, arcs, tool versions, etc.

B. Characterization flow

The characterization flow used is described in Fig. 3.

Fig. 3: Characterization flow

Each cell is processed separately. Input corners are characterized using the tool Liberate 13.1.3 from Cadence Design Systems. After the characterized .lib files are parsed to extract the fitting parameters of each model, interpolated liberty files are generated using these parameters, based on the input list of PVTs the user wants to generate. All these steps are automated by a script developed in Java. The script also ensures that the output files have the same structure as the input files.
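The equal-delay-step rule of Section II-A can be sketched in a few lines. The snippet is only an illustration: it assumes that a coarse delay-versus-voltage sweep of one cell arc is already available, and the function name `equal_delay_voltages` and the sample numbers are hypothetical, not taken from the authors' script. It uses piecewise-linear inverse interpolation to pick the intermediate voltages so that consecutive delay differences (Δt1, Δt2, Δt3 in Fig. 2) are equal.

```python
from typing import List, Sequence


def equal_delay_voltages(voltages: Sequence[float],
                         delays: Sequence[float],
                         n_points: int = 4) -> List[float]:
    """Pick n_points voltages, endpoints included, with equal delay steps.

    Assumes voltages are ascending and delays strictly decreasing.
    """
    d_first, d_last = delays[0], delays[-1]
    step = (d_last - d_first) / (n_points - 1)
    picks = [voltages[0]]
    # Each segment is (v0, d0, v1, d1) for two neighbouring sweep points.
    segments = list(zip(voltages, delays, voltages[1:], delays[1:]))
    for k in range(1, n_points - 1):
        target = d_first + k * step
        for v0, d0, v1, d1 in segments:
            if d1 <= target <= d0:
                # Linear inverse interpolation inside the bracketing segment.
                picks.append(v0 + (target - d0) * (v1 - v0) / (d1 - d0))
                break
    picks.append(voltages[-1])
    return picks


# Hypothetical sweep for one inverter arc (volts, picoseconds).
sweep_v = [0.6, 0.7, 0.8, 0.9, 1.0]
sweep_d = [100.0, 70.0, 50.0, 40.0, 34.0]
corner_voltages = equal_delay_voltages(sweep_v, sweep_d)
print([round(v, 3) for v in corner_voltages])
```

Because delay falls steeply at low voltage, the selected voltages cluster toward the low end of the range, which is where a fit needs the most support.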
Fig. 5: Delay calculation

Fig. 4: Example of Parameters extractions

In order to improve accuracy, the fitting parameters are extracted separately for each value in the LUTs. For example, for a timing look-up table of 64 values, there will be a table of 64 values for each parameter in the timing model equation, as illustrated in Fig. 4. The extracted parameters are then used to generate the files for new PVTs.
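The per-entry extraction described above can be pictured as follows. This is a minimal sketch under stated assumptions: the names `extract_parameter_tables` and `fit_line` are hypothetical, the LUT values are invented, and the straight-line fit is only a stand-in so that the example stays self-contained (it is not the timing model developed in Section II-C). What it shows is the data layout: one fit per LUT position, and one output table per fitting parameter with the same shape as the LUT.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Table = List[List[float]]  # one LUT: rows follow index_1, columns follow index_2
FitFn = Callable[[Sequence[float], Sequence[float]], Tuple[float, ...]]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Stand-in fit (least-squares line y = c0 + c1*x); NOT the paper's model."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    c1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return (mean_y - c1 * mean_x, c1)


def extract_parameter_tables(luts_by_voltage: Dict[float, Table],
                             fit: FitFn) -> List[Table]:
    """Fit every LUT position on its own and return one table per parameter."""
    voltages = sorted(luts_by_voltage)
    first = luts_by_voltage[voltages[0]]
    rows, cols = len(first), len(first[0])
    # fitted[i][j] holds the parameter tuple of LUT position (i, j).
    fitted = [[fit(voltages, [luts_by_voltage[v][i][j] for v in voltages])
               for j in range(cols)] for i in range(rows)]
    n_params = len(fitted[0][0])
    return [[[fitted[i][j][p] for j in range(cols)] for i in range(rows)]
            for p in range(n_params)]


# Hypothetical 2x2 delay LUTs (ps) characterized at three supply voltages (V).
luts = {
    0.6: [[10.0, 20.0], [30.0, 40.0]],
    0.8: [[8.0, 16.0], [24.0, 32.0]],
    1.0: [[6.0, 12.0], [18.0, 24.0]],
}
intercepts, slopes = extract_parameter_tables(luts, fit_line)
```

Passing the fit as a function keeps the table bookkeeping independent of the model, so the same loop could serve delay, transition and constraint tables.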
Timing characterization includes simulating the propagation delay through a cell, rising and falling transition times, tri-state enable and disable times, and timing constraints in sequential cells. The following sections describe the delay and transition time models we have developed. As both are represented with the same models, only the example of the delay is shown below. The mathematical model used for timing constraints fitting is also presented.

C. Timing Model

As a full standard cell library represents a huge amount of data that cannot be handled manually, we worked only on a reduced set of representative cells to develop the mathematical models. This set of cells includes an INVERTER, a flip-flop, and a NAND gate. The models were then deployed on the full library, using the Java script we have developed, in order to check the accuracy of the scaled liberty files.

1) Voltage model

Delay is the time between the input signal voltage rising or falling to the input threshold point and the output signal voltage rising or falling to the output threshold point [7]. In order to illustrate this definition, delay calculation is shown in Fig. 5. In a cell, different delays are calculated for each input-output path.

Fig. 6-a shows delay variation with the voltage for different temperatures based on characterized data.

Fig. 6: a) Delay variation with voltage for different temperatures; b) Delay variation of characterized liberty files data vs modelled data at 125°C – INV_X1N characterized data

The example of an INVERTER has been taken here. The simulation results show that delay decreases exponentially as the voltage increases. Based on this, the delay variation with voltage has been modelled by the following equation (1):

$$T_d = T_0 + \frac{S}{V^{n}} \tag{1}$$

where $T_d$ refers to delay, $V$ refers to voltage, and $T_0$, $S$ and $n$ are constants that represent the fitting parameters to extract from timing data.
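To make the extraction of $T_0$, $S$ and $n$ in equation (1) concrete, here is a minimal sketch. The paper does not say which fitting algorithm its script uses, so the approach below is an assumption: for a fixed exponent $n$, equation (1) is linear in $T_0$ and $S$, so the code scans a grid of candidate exponents, solves the linear least-squares problem for each, and keeps the best one. The function name `fit_voltage_model`, the grid bounds and the synthetic data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def fit_voltage_model(voltages: Sequence[float],
                      delays: Sequence[float],
                      n_grid: Optional[Sequence[float]] = None
                      ) -> Tuple[float, float, float]:
    """Fit Td = T0 + S / V**n and return (T0, S, n).

    Assumed method: grid search on n, closed-form least squares for T0 and S.
    """
    if n_grid is None:
        n_grid = [0.05 * k for k in range(1, 121)]  # candidate n in 0.05 .. 6.0
    best = None
    count = len(voltages)
    for n in n_grid:
        xs = [v ** -n for v in voltages]            # model is linear in x = V**-n
        mean_x, mean_y = sum(xs) / count, sum(delays) / count
        sxx = sum((x - mean_x) ** 2 for x in xs)
        s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, delays)) / sxx
        t0 = mean_y - s * mean_x
        sse = sum((t0 + s * x - y) ** 2 for x, y in zip(xs, delays))
        if best is None or sse < best[0]:
            best = (sse, t0, s, n)
    _, t0, s, n = best
    return t0, s, n


def predict_delay(t0: float, s: float, n: float, voltage: float) -> float:
    """Evaluate equation (1) at a new voltage."""
    return t0 + s / voltage ** n


# Synthetic data generated from known parameters (ps, V), four voltages as in
# Section II-A; a real flow would read these values from characterized LUTs.
true_t0, true_s, true_n = 5.0, 20.0, 1.5
volts = [0.6, 0.7, 0.8, 1.0]
delays = [true_t0 + true_s / v ** true_n for v in volts]
t0, s, n = fit_voltage_model(volts, delays)
print(round(t0, 3), round(s, 3), round(n, 3))
```

A grid search is coarse but has no convergence issues with only four points per entry; a nonlinear least-squares routine could replace it if finer resolution on $n$ is needed.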
Fig. 6-b shows an example of delay variation with voltage for characterized data and for delay data generated with our model at a temperature of -40°C. The figure clearly shows that the modelled curve matches the simulated curve quite well. The voltage model is determined at a fixed temperature, either the minimum or the maximum temperature. The choice here is the minimum temperature (Tmin).

2) Temperature Model

Fig. 7-a shows that delay varies linearly with temperature for different voltages. These data are extracted from additional characterized corners which are not used in the generation of the scaled liberty files. As the temperature model is assumed to be linear, we add to the voltage model a linear function of the temperature reflecting the dependency on temperature:

$$T_d = T_0 + \frac{S}{V^{n}} + a_T\,(T - T_{min}) \tag{2}$$

Fig. 7: a) Delay variation with temperature for different voltages; b) Delay slope variation with temperature based on characterized data vs modelled data

Fig. 7-a shows that $a_T$, the slope of the temperature equation, differs from one voltage to another. It follows an exponential function of the voltage, as shown in Fig. 7-b. The variation of the slope with voltage has been modelled with the following equation (3):

$$a_T = K - e^{\alpha V + \beta} \tag{3}$$

where $K$, $\alpha$ and $\beta$ are the fitting parameters to be extracted.

D. Constraints fitting Model

Sequential cells have other characterization requirements. In addition to cell delay and transition delay, the timing constraints of these cells also need to be determined. These include setup, hold, recovery and removal times.

The setup time represents the minimum time the data should remain stable before the clock's active edge. Any transition during setup time can lead to an incorrect captured value. Hold time represents the minimum amount of time the data should not change after the clock's active edge. As for the setup time, any transition during this time makes latched data unreliable [7].

Setup and hold come in 4x4 LUTs indexed by the slew of the data transition (slew1) and the slew of the clock transition (slew2).

Finding a suitable model for setup and hold was a great challenge, as the dependence of these parameters on voltage doesn't follow the same tendency, as shown in Fig. 8. However, fitting these parameters is important as sequential cells take the longest characterization time.

Fig. 8: Setup variation with voltage for three different values of slew1 and slew2

In order to find an adequate voltage model, two additional parameters have been introduced in the equation. Fig. 9 shows that the variation of setup time with slew1 (Fig. 9-a) and with slew2
(Fig. 9-b) follows the same tendency. The setup time has been modelled by the polynomial equation (4):

$$T_{su} = a_{2s1} S_1^{2} + a_{1s1} S_1 + a_{2s2} S_2^{2} + a_{1s2} S_2 + b \tag{4}$$

where $T_{su}$ represents setup time, $S_1$ is slew1, $S_2$ is slew2, and $a_{2s1}$, $a_{1s1}$, $a_{2s2}$, $a_{1s2}$ and $b$ are the fitting parameters to be extracted using the automated script.

The variation of the parameters $a_{2s1}$, $a_{1s1}$, $a_{2s2}$ and $a_{1s2}$ with the voltage is modelled by polynomials, and that of $b$ by a power model, according to the following equations (5)-(9):

$$a_{2s1} = b_{22} V^{2} + b_{21} V + b_{20} \tag{5}$$
$$a_{1s1} = b_{12} V^{2} + b_{11} V + b_{10} \tag{6}$$
$$a_{2s2} = b'_{22} V^{2} + b'_{21} V + b'_{20} \tag{7}$$
$$a_{1s2} = b'_{12} V^{2} + b'_{11} V + b'_{10} \tag{8}$$
$$b = b_0 + \frac{S}{V^{n}} \tag{9}$$

Fig. 9: (a) Setup variation with slew1, (b) setup variation with slew2

III. RESULTS AND ACCURACY

Characterization of a full library, which contains around 800 cells in 14nm technology, takes around 650 hours of simulation time. With the proposed scaling methodology, generating a liberty file takes about 1 hour. The gain in time is considerable. However, the generated liberty files should be qualified. Highly accurate timing models allow designers to reduce the margins of their designs and to optimize results. This section presents the methods used to validate the timing models at cell level as well as at circuit level. All results have been validated on a slow process and a fast process, but only the results of the slow process are shown in the section below.

A. Validation At Cell Level

The first validation of the timing model is done at cell level. Once the model parameters are extracted, the same input corners are regenerated using the scaling methodology. Then, for each cell in the standard cell library, the timing values of the different tables are compared to the characterized liberty file. Here, two types of errors are checked: absolute error and relative error. We call absolute error the difference between the values we compare, as shown in equation (10). Relative error refers to the ratio of the absolute error to the characterized value, as shown in equation (11), where $V_{scaled}$ refers to the scaled value and $V_{charac}$ refers to the characterized value. Errors are calculated for each timing value in the scaled liberty file and should be less than a predefined tolerance.

$$\mathrm{absolute\_err} = V_{scaled} - V_{charac} \tag{10}$$
$$\mathrm{relative\_err} = \frac{\mathrm{absolute\_err}}{V_{charac}} \tag{11}$$

The absolute error has been introduced in the checks in order to avoid high relative errors, especially for the smallest values of the table. The smallest values usually contain errors which reflect convergence issues of the characterization tools and cannot be predicted.

The tolerance for absolute error has been set to 1ps, whereas the default relative error tolerance is set to 5% for delay values and 10% for transition time. The highest errors are usually reported on the smallest values of the LUT, for the minimum capacitance load.

The check at the cell level is part of our characterization flow. A cell is considered as “PASS” if it meets all the checks above. A report is given at the end of the characterization with PASS and FAIL cells. Only liberty files of PASS cells are generated.

B. Validation at Circuit Level

The accuracy of interpolated liberty files is also checked at circuit level on an ARM Cortex-A9 by comparing results of static timing analysis using the modelled liberty file and the characterized one. ARM Cortex-A9 is a single core processor. It is a high-performance processor suitable for low power devices [13]. Three intermediate corners have been generated both by interpolation and by characterization for use in this correlation. We call intermediate corners the PVTs we want to generate which are not used in the parameter extraction step.

The implementation of a circuit is liberty file dependent. Changing the liberty file would lead to two different implementations, which would make implementation results not comparable. So, in order to compare under the same conditions, implementation and static timing analysis (STA) were done using the characterized liberty file. The same implementation was kept to perform the STA on the scaled liberty file. STA is a method used to estimate the circuit delay and maximum frequency by checking all possible paths for timing violations under worst-case conditions. Corners used in the correlation are intermediate corners not used in the initial parameter extraction. A frequency of 500MHz was targeted.
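The cell-level check of Section III-A, equations (10) and (11), can be sketched as below. The pass rule is an assumption: the text says the absolute tolerance exists to avoid flagging very small table values, so this sketch treats an entry as failing only when it exceeds both the absolute and the relative tolerance. The function name `check_table` and the sample tables are hypothetical.

```python
from typing import List, Sequence, Tuple

Failure = Tuple[int, int, float, float]  # (row, column, absolute_err, relative_err)


def check_table(scaled: Sequence[Sequence[float]],
                charac: Sequence[Sequence[float]],
                abs_tol: float = 1.0,
                rel_tol: float = 0.05) -> List[Failure]:
    """Compare a scaled LUT with the characterized one, entry by entry.

    abs_tol is in the table's time unit (ps here); rel_tol is a fraction.
    Assumed rule: an entry fails only if it violates BOTH tolerances.
    """
    failures: List[Failure] = []
    for i, (row_s, row_c) in enumerate(zip(scaled, charac)):
        for j, (v_scaled, v_charac) in enumerate(zip(row_s, row_c)):
            absolute_err = v_scaled - v_charac                      # eq. (10)
            relative_err = (absolute_err / v_charac                 # eq. (11)
                            if v_charac != 0 else float("inf"))
            if abs(absolute_err) > abs_tol and abs(relative_err) > rel_tol:
                failures.append((i, j, absolute_err, relative_err))
    return failures


# Hypothetical 2x2 delay tables in ps.
charac_lut = [[2.0, 100.0], [200.0, 50.0]]
scaled_lut = [[2.5, 103.0], [215.0, 50.4]]

delay_failures = check_table(scaled_lut, charac_lut, abs_tol=1.0, rel_tol=0.05)
cell_status = "PASS" if not delay_failures else "FAIL"
print(cell_status, delay_failures)
```

With these numbers the 2.0 ps entry passes despite a 25% relative error because its absolute error is below 1 ps, which is the behaviour the absolute tolerance is meant to provide; calling the same function with `rel_tol=0.10` would model the transition-time tolerance.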
Table 1 shows the results of the Cortex-A9 implementation with characterized vs scaled liberty files. The worst-case corner gives less than 3% error for the achieved frequency.

Table 1: Same implementation results of characterized liberty files vs scaled liberty files

IV. CONCLUSION

In this paper, a fast cell characterization approach is proposed for the characterization of standard cell libraries. Using this approach, liberty files can be generated in reduced time. Instead of hundreds of hours of simulation time, less than an hour is needed to generate each corner, which is a huge gain in terms of runtime. Also, the accuracy of the developed models has been demonstrated on a real design implementation. All experiments were done using a 14nm technology. Validation on other technology nodes should be considered in the future.

REFERENCES

[1] B. Joseph Leandro, Peje, L. Hani Hebert, Jr. Ho floro Barot, G. Maria Fe, E. Bautista Carl Christian, E. Misagal John Richard, P. Hizon, Louis Alarcon, “An Ultra Low-Voltage Standard Cell Library in 65-nm CMOS Process Technology”, TENCON 2014, Bangkok, Thailand – IEEE Region 10 Conference, Oct 22-27, 2014.
[2] K. Scott, K. Keutzer, “Improving Cell Library for Synthesis”, Proc. of Custom Integrated Circuit Conference, pp. 128-131, 1994.
[3] S. Gavrilov, A. Glebov, S. Pullela, S. C. Moore, A. Dharchoudhury, R. Panda, G. Vijayan and D. T. Blaauw, “Library-Less Synthesis for Static CMOS Combinational Logic Circuits”, Proc. IEEE Int. Conf. on Computer-Aided Design (ICCAD), pp. 658-662, 1997.
[4] A. Gregory, Northrop and L. Pong-Fei, “A Semi-Custom Design Flow in High-Performance Microprocessor Design”, Proc. of Design Automation Conference (DAC), pp. 426-431, 2001.
[5] M. Vujkovic and C. Sechen, “Optimized Power-Delay Curve Generation for Standard Cell ICs”, Proc. IEEE Int. Conf. on Computer-Aided Design (ICCAD), pp. 387-394, 2002.
[6] K. Keutzer and E. Girczyc, “Panel: Cell libraries - build vs buy; static vs. dynamic”, Proc. of Design Automation Conference (DAC), pp. 341-342, 1999.
[7] Liberty User Guides and Reference Manual Suite, Version 2013.03.
[8] Opensourceliberty website.
[9] J. Jianhua, L. Man, W. Lei, and Z. Yumei, “An effective timing characterization method for an accuracy-proved VLSI standard cell library”, Journal of Semiconductors, Vol. 35, No. 2, 2014.
[10] L. Jiing-Yuan, S. Wen-Zen and J. Jiag-Yang, “A Power Modeling and Characterization Method for the CMOS Standard Cell”, in Proc. IEEE/ACM Int. Conf. Computer-Aided Design, pp. 400-404, 1996.
[11] A. Timár, M. Rencz, “New accurate temperature dependent timing model in digital standard cell designs”.
[12] S. Miryala, B. Kaur, B. Anand and S. Manhas, “Efficient Nanoscale VLSI Standard Cell Library Characterization Using a Novel Delay Model”, Proceedings of the 12th International Symposium on Quality Electronic Design (ISQED ’11), pp. 458-463, 2011.
[13] ARM Inc., “Cortex A9 Technical Reference Manual”.
