CAD for VLSI 1
Abstraction examples:
Gate arrays
Chips that have all their transistors preplaced in regular patterns.
The designer specifies the wiring patterns.
Gate arrays as described above are mask programmable.
There also exist so-called field-programmable gate arrays (FPGAs):
their interconnections can be configured by applying electrical signals
to some inputs.
Standard Cells
Simple logic gates, flip-flops, etc.
They are predesigned and have been made available to the designer in a
library.
Characterization of the cells (determination of their timing
behavior) is done once by the library developer.
Module Generators
Generators exist for those designs that have a regular structure, such
as adders, multipliers, and memories.
Due to the regularity of the structure, the module can be described
by one or two parameters.
Hardware-software co-design
The design of a complex system will consist of several chips, some of
which are programmable.
Part of the specification is realized in hardware and part in
software (hardware-software co-design).
The parts with the highest frequencies are the most likely to be
realized in hardware.
The result of co-design is a pair of descriptions:
one of the hardware (e.g. in VHDL) that will contain programmable
parts, and
the other of the software (e.g. in C).
Code generation:
Mapping the high-level descriptions of the software to the low-level
instructions of the programmable hardware is itself a CAD problem.
Hardware-software co-simulation:
Verification of the correctness of the result of co-design using
simulation.
Logic synthesis:
Generation and optimization of a circuit at the level of logic gates.
There are three different types of problems:
1. Synthesis of two-level combinational logic:
A Boolean function can be written as a sum of products or a product
of sums, which can be directly implemented as programmable logic
arrays (PLAs).
It is, therefore, important to minimize two-level expressions.
2. Synthesis of multilevel combinational logic:
. Some parts of integrated circuits consist of so-called random logic
(circuitry that does not have a regular structure).
. Random logic is often built of standard cells, which means that the
implementation does not restrict the depth of the logic.
3. Synthesis of sequential logic:
. Sequential logic has a state, which is normally stored in memory
elements.
. The problem here is to find the logic necessary to minimize the state
transitions.
Timing constraints:
The designer should be informed about the maximum-delay paths.
The shorter these delays, the faster the operation of the circuit.
One possibility of finding out about these delays is by means of
simulation.
Another is a timing analysis tool, which computes delays through the circuit
without performing any simulation.
Transistor-level Design
Logic gates are composed of transistors
Depending on the accuracy required, transistors can be simulated at
different levels
At the switch level, transistors are modeled as ideal
bidirectional switches and the signals are essentially digital.
At the timing level, analog signals are considered, but the
transistors have simple models (e.g. piecewise linear functions).
At the circuit level, more accurate models of the transistors are
used, which often involve nonlinear differential equations for the
currents and voltages.
The more accurate the model, the more computer time is necessary for
simulation.
Layout Design
Design actions related to layout are very diverse; therefore, there are
different layout tools.
If one has the layout of the subblocks of a design available, together
with the list of interconnections, then:
1. First, a position in the plane is assigned to each subblock,
trying to minimize the area to be occupied by
interconnections (placement problem).
2. The next step is to generate the wiring patterns that realize
the correct interconnections (routing problem).
Goal (1): placement and routing should generate the minimal chip
area.
Timing constraint (2): as the length of a wire affects the
propagation time of a signal along the wire, it is important to keep
specific wires short in order to guarantee an overall execution speed
of the circuit (timing-driven layout).
Terminology:
A graph G(V, E) is characterized by two sets:
1. a vertex set V (node) and
2. an edge set E (branch)
The two vertices that are joined by an edge are called the edge's
endpoints; the notation (u, v) is used.
Vertices u and v such that (u, v) ∈ E are called adjacent vertices.
Selfloop: An edge (u, u), i.e. one starting and finishing at the same
vertex, is called a selfloop.
Parallel edges: Two edges of the form e1= (v1 , v2) and e2 = (v1 ,
v2), i.e. having the same endpoints, are called parallel edges.
Simple graph: A graph without selfloops or parallel edges is called a
simple graph
In depth-first search, each vertex is visited exactly once, and all edges
are also visited exactly once.
Assuming that the generic vertex and edge actions have a constant
time complexity, this leads to a time complexity of O(|V| + |E|).
Depth-first search could be used to find all vertices connected to a
specific vertex u.
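The reachability use of depth-first search mentioned above can be sketched as follows. This is a minimal illustration, assuming the graph is stored as a Python dictionary of adjacency lists; the names dfs_reachable and graph are hypothetical and do not come from the notes.

```python
def dfs_reachable(adj, u):
    """Return the set of vertices reachable from u by depth-first search."""
    visited = set()
    stack = [u]                      # explicit stack instead of recursion
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)               # vertex action: once per vertex
        for w in adj.get(v, []):     # edge action: once per outgoing edge
            if w not in visited:
                stack.append(w)
    return visited


# Hypothetical example graph (directed, adjacency lists).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
print(sorted(dfs_reachable(graph, "a")))
```

Each vertex enters the visited set once and each adjacency list is scanned once, which matches the O(|V| + |E|) bound.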
Breadth-first Search:
The graph is a directed graph represented by adjacency lists.
The central element is the FIFO queue.
The call shift_in(q, o) adds an object o to the queue q; shift_out(q)
removes the oldest object from the queue q.
Adding and removing objects from a FIFO queue can be done in
constant time.
[Figure: status of the queue]
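A minimal sketch of breadth-first search with a FIFO queue is given below. It assumes adjacency lists in a dictionary and uses collections.deque, whose append and popleft play the roles of shift_in and shift_out; the function name bfs_levels is hypothetical.

```python
from collections import deque


def bfs_levels(adj, source):
    """Return the number of edges on a shortest path from source to each reachable vertex."""
    level = {source: 0}
    q = deque()
    q.append(source)                 # corresponds to shift_in(q, source)
    while q:
        v = q.popleft()              # corresponds to shift_out(q)
        for w in adj.get(v, []):
            if w not in level:       # first visit fixes the level
                level[w] = level[v] + 1
                q.append(w)
    return level


# Hypothetical example graph.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
print(bfs_levels(graph, "s"))
```

Both deque operations run in constant time, so the queue does not add to the O(|V| + |E|) cost of the traversal.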
Dijkstra's Shortest-path Algorithm
A weighted directed graph G(V, E) is given, with edge weights w(e), w(e) > 0.
Visited vertices of the set V are transferred one by one to a set T.
The ordering of vertices is done using the vertex attribute distance.
Initially, the distance attribute of a vertex v is equal to the
edge weight w((vs, v)).
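The transfer of vertices to the set T in order of increasing distance can be sketched as below. This is an illustrative variant that uses a binary heap (heapq) to pick the next vertex, which is an implementation choice not stated in the notes; the names dijkstra, done and adj are hypothetical.

```python
import heapq


def dijkstra(adj, vs):
    """Shortest distances from vs; adj maps u to a list of (v, weight) pairs."""
    dist = {vs: 0}
    done = set()                     # plays the role of the set T
    heap = [(0, vs)]
    while heap:
        d, u = heapq.heappop(heap)   # closest vertex not yet in T
        if u in done:
            continue                 # stale heap entry, skip it
        done.add(u)
        for v, w in adj.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd         # update the distance attribute
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical example graph with non-negative weights.
adj = {"s": [("a", 4), ("b", 1)], "b": [("a", 2), ("c", 5)], "a": [("c", 1)], "c": []}
print(dijkstra(adj, "s"))
```

Vertices that cannot be reached from vs simply do not appear in the returned dictionary.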
One gets a spanning tree by removing edges from E until all cycles in
the graph have disappeared while all vertices remain connected.
A graph can have several spanning trees, all of which have the same number
of edges (the number of vertices minus one).
In the case of edge-weighted undirected graphs, the spanning tree with
the least total edge weight, also called the tree length, is to be found
(minimum spanning tree problem).
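One standard way to solve the minimum spanning tree problem is Kruskal's algorithm. The notes do not name a method here, so the sketch below is only one possible choice; vertices are assumed to be numbered 0..n-1 and the names are hypothetical.

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v). Returns (tree_length, tree_edges)."""
    parent = list(range(n))          # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree, length = [], 0
    for w, u, v in sorted(edges):    # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                 # edge does not close a cycle
            parent[ru] = rv
            tree.append((u, v))
            length += w
    return length, tree


# Hypothetical example with 4 vertices.
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 2, 3), (3, 1, 3)]
length, tree = kruskal(4, edges)
print(length, tree)
```

For a connected graph the returned tree has exactly n - 1 edges, in line with the property stated above.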
Another example:
Shortest path (Dijkstra's algorithm):
a graph with given source and target vertices vs and vt defines an instance of the
problem.
One could associate a Boolean variable bi with each edge: bi = 1 means that the edge
is "selected"
and bi = 0 means that it is not.
Solving the shortest-path problem for this graph can then be seen as
assigning Boolean values to the variables bi, which makes the problem
combinatorial.
a combinatorial optimization problem is defined as the set of all
the instances of the problem,
each instance I being defined as a pair (F, c).
F is called the set of feasible solutions (or the search space),
c is a function assigning a cost to each element of F.
Solving a particular instance of a problem consists of finding a
feasible solution
f with minimal cost
[Figure: a nonoptimal solution and an optimal solution]
The decision version of a combinatorial problem can be defined
as the set of its instances (F, c, k).
Note that each instance is now characterized by an extra parameter k;
k is the parameter in the question "Is there a solution with cost less
than or equal to k?".
An interesting subset of instances is formed by those instances for
which the answer to the question is "yes".
This set is called
Complexity Classes:
It is useful to group problems with the same degree of complexity in one complexity
class.
The class of decision problems for which an algorithm is known that
operates in polynomial time is called P (which is an abbreviation of
"polynomial").
Deterministic and nondeterministic computers:
For a common (deterministic) computer it is always clear how a
computation continues at a certain point in the computation. This is also
reflected in the programming languages used for them.
A nondeterministic computer allows for the specification of multiple
computations at a certain point in a program: the computer will make a
nondeterministic choice on which of them is to be performed.
This is not just a random choice, but a choice that will lead to the desired
answer.
Equivalently, the machine splits itself into as many copies as there are choices,
evaluates all choices in parallel, and then merges back to one machine.
NP-completeness:
All decision problems contained in the class of NP-complete problems are
polynomially reducible to each other.
An instance of any NP-complete problem can be expressed as an
instance of any other
NP-complete problem using transformations that have a polynomial time
complexity.
Examples:
HAMILTONIAN CYCLE: the problem of whether a given undirected graph
G(V, E) contains a so-called Hamiltonian cycle, i.e. a simple cycle that
goes through all vertices of V.
TRAVELING SALESMAN: the decision version of TSP amounts to
answering the question of whether there is a tour (simple cycle) through
all vertices, the length of which is less than or equal to k.
A bad placement will have longer connections which normally will lead to
more routing tracks between the cells and therefore to a larger chip area.
Backtracking:
The principle of using backtracking for an exhaustive search of the
solution space is to
start with an initial partial solution in which as many variables as
possible are left unspecified, and
then to systematically assign values to the unspecified variables
until either a single point in the search space is identified or an implicit
constraint makes it impossible to process more unspecified variables.
The cost of a feasible solution can be computed once all variables
have been specified.
The algorithm continues by going back to a partial solution generated
earlier and then assigning a next value to an unspecified variable
(hence the name "backtracking").
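The principle can be sketched on a small, made-up problem: assign each of n tasks to a distinct agent at minimal total cost. The problem, the cost matrix and the name backtrack are assumptions chosen for illustration and are not taken from the notes.

```python
def backtrack(cost):
    """Exhaustive search: cost[k][v] is the cost of giving variable k the value v."""
    n = len(cost)
    best = {"cost": float("inf"), "sol": None}

    def extend(partial, used, acc):
        k = len(partial)                     # index of the next unspecified variable
        if k == n:                           # all variables specified
            if acc < best["cost"]:
                best["cost"], best["sol"] = acc, list(partial)
            return
        for value in range(n):
            if value in used:                # constraint: every value used once
                continue
            partial.append(value)
            used.add(value)
            extend(partial, used, acc + cost[k][value])
            partial.pop()                    # go back to the earlier partial solution
            used.remove(value)

    extend([], set(), 0)
    return best["cost"], best["sol"]


# Hypothetical 3 x 3 cost matrix.
cost = [[9, 2, 7], [6, 4, 3], [5, 8, 1]]
print(backtrack(cost))
```

The search visits every feasible solution, so its run time grows with the size of the whole search space.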
Branch-and-bound:
Information about a certain partial solution f(k), 1 < k < n, at a certain
level can indicate that any fully-specified solution f(n) ∈ D(f(k)) derived
from it can never be the optimal solution.
This requires a function that estimates a lower bound on the cost of all
solutions derived from f(k).
If inspection of this lower bound can guarantee that all of the solutions belonging to
f(k) have a higher cost than some solution already found earlier
during the backtracking, none of the children of f(k) need any further
investigation.
One says that the node in the tree corresponding to f(k) can be killed.
Killing partial solutions is called branch-and-bound.
The procedure Lower_bound_cost is called to get a lower bound on the cost of the partial
solution based on this function.
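A minimal sketch of killing partial solutions is shown below, again on a made-up task-to-agent cost matrix. The bound used (cost so far plus the cheapest entry of every remaining row) is an assumption picked for simplicity; lower_bound_cost here is a hypothetical stand-in for the procedure named in the notes.

```python
def branch_and_bound(cost):
    """Backtracking with pruning; cost[k][v] is the cost of giving variable k the value v."""
    n = len(cost)
    row_min = [min(row) for row in cost]
    best = {"cost": float("inf"), "sol": None, "killed": 0}

    def lower_bound_cost(k, acc):
        # Cost so far plus the cheapest conceivable choice for each open variable.
        return acc + sum(row_min[k:])

    def extend(partial, used, acc):
        k = len(partial)
        if k == n:
            if acc < best["cost"]:
                best["cost"], best["sol"] = acc, list(partial)
            return
        if lower_bound_cost(k, acc) >= best["cost"]:
            best["killed"] += 1              # no descendant can beat the best solution
            return
        for value in range(n):
            if value in used:
                continue
            partial.append(value)
            used.add(value)
            extend(partial, used, acc + cost[k][value])
            partial.pop()
            used.remove(value)

    extend([], set(), 0)
    return best


# Hypothetical 3 x 3 cost matrix.
cost = [[9, 2, 7], [6, 4, 3], [5, 8, 1]]
result = branch_and_bound(cost)
print(result["cost"], result["sol"], result["killed"])
```

The tighter the lower bound, the more nodes are killed early; a bound that overestimates would make the search miss the optimum.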
Dynamic Programming:
Dynamic programming is a technique that systematically constructs the
optimal solution of some problem instance by defining the optimal
solution in terms of optimal solutions of smaller size instances.
Dynamic programming can be applied to such a problem if there is a rule
to construct the optimal solution for p = k (complete solution) from the
optimal solutions of instances for which p < k (set of partial solutions).
The fact that an optimal solution for a specific complexity can be
constructed from the optimal solutions of lower-complexity problems only is essential
for dynamic programming.
This idea is called the principle of optimality.
The goal in the shortest-path problem is to find the shortest path from a
source vertex vs to a destination vertex vt in a directed graph G(V, E), where the
distance between two vertices u, v is given by the edge weight w((u, v)).
If p = k, the optimization goal becomes: find the shortest path from vs to
all other vertices in the graph considering paths that only pass through
the first k closest vertices to vs.
The optimal solution for the instance with p = 1 is found in a trivial way
by assigning the edge weight w((vs, u)) to the distance attribute of all
vertices u.
Suppose that the optimal solution for p = k is known and that the k
closest vertices to vs have been identified and transferred from V to T.
Then, solving the problem for p = k+1 is simple: transfer the vertex u
in V having the lowest value for its distance attribute from V to T and
update the value of the distance attributes for those vertices remaining
in V.
additional parameters may be necessary to distinguish multiple instances
of the problem for the same value of p.
b1 is a slack variable.
It is possible to solve LP problems by a polynomial-time algorithm
called the ellipsoid algorithm
Local Search:
Local search is a general-purpose optimization method that works with
fully specified solutions f of a problem instance (F, c).
It makes use of the notion of a neighbourhood N(f) of a feasible
solution f:
a subset of F that is "close" to f in some sense.
A neighbourhood is a function that assigns a number of feasible
solutions to each feasible solution: N : F → 2^F,
where 2^F denotes the power set of F.
Any g ∈ N(f) is called a neighbour of f.
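A minimal sketch of local search as iterative improvement follows: it keeps moving to the cheapest neighbour until no neighbour is cheaper. The one-dimensional toy instance (F = {0, ..., 20}, neighbours one step away) and all names are assumptions made for illustration.

```python
def local_search(f, cost, neighbours):
    """Move to the cheapest neighbour until no neighbour improves the cost."""
    while True:
        best = min(neighbours(f), key=cost, default=f)
        if cost(best) >= cost(f):
            return f                 # local minimum reached
        f = best


# Hypothetical instance: feasible solutions are the integers 0..20.
def cost(f):
    return (f - 7) ** 2


def neighbours(f):
    return [g for g in (f - 1, f + 1) if 0 <= g <= 20]


print(local_search(0, cost, neighbours))   # prints 7
```

On this toy cost function the local minimum is also the global one; in general local search can get stuck in a local minimum that is not optimal.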
Simulated Annealing:
In physical annealing, a material is first heated up to a temperature that allows all its
molecules to move freely around (the material becomes liquid), and
is then cooled down very slowly.
At the end of the process, the total energy of the material is minimal.
The energy corresponds to the cost function.
The movement of the molecules corresponds to a sequence of moves
in the set of feasible solutions.
The temperature corresponds to a control parameter T which
controls the acceptance probability for a move from a feasible solution f to
one of its neighbours g.
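The analogy can be sketched as follows. The acceptance rule exp(-delta / T) for uphill moves and the geometric cooling schedule are common choices that are assumed here, since the notes do not spell them out; all parameter names and default values are hypothetical.

```python
import math
import random


def simulated_annealing(f, cost, random_neighbour, t_start=10.0, t_end=0.01,
                        alpha=0.9, moves_per_t=50, seed=0):
    """Return the cheapest solution seen while slowly lowering the temperature."""
    rng = random.Random(seed)
    best = f
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            g = random_neighbour(f, rng)
            delta = cost(g) - cost(f)
            # Downhill moves are always accepted, uphill moves with probability exp(-delta/t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                f = g
                if cost(f) < cost(best):
                    best = f
        t *= alpha                   # assumed geometric cooling
    return best


# Hypothetical instance: integers 0..20, neighbours one step away.
def cost(x):
    return (x - 7) ** 2


def random_neighbour(x, rng):
    return min(20, max(0, x + rng.choice((-1, 1))))


result = simulated_annealing(0, cost, random_neighbour)
print(result, cost(result))
```

At high T almost every move is accepted; as T drops, the search behaves more and more like plain iterative improvement.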
Tabu Search
Given a neighbourhood subset G ⊆ N(f) of a feasible solution f, the
principle of tabu search is to move to the cheapest element g ∈ G even
when c(g) > c(f).
The tabu search method does not directly restrict uphill moves
throughout the search process.
In order to avoid a circular search pattern, a so-called tabu list
containing the k last visited feasible solutions is maintained.
This only helps, of course, to avoid cycles of length < k.
Genetic Algorithms:
Instead of repetitively transforming a single current solution into a next
one by the application of a move,
the algorithm simultaneously keeps track of a set P of feasible
solutions, called the population.
In an iterative search process, the current population is replaced by the
next one.
In order to generate a feasible solution for the next population (a child), two feasible
solutions, called the parents of the child, are first selected from the current population.
The child is generated in such a way that it inherits part of its "properties" from
one parent and the other part from the second parent by the application of
an operation called crossover.
First of all, this operation assumes that all feasible solutions can be
encoded by a fixed-length vector f = [f1, f2, ..., fn]^T, as was the case
for the backtracking algorithm.
Bit strings are used to represent feasible solutions.
The number of vector elements n is fixed, and so is the number of bits used to
represent each element.
Suppose that the bit strings of the example represent the coordinates of
the placement problem on a 10 x 10 grid, now with only a single cell to place (an
artificial problem).
The bit string for a feasible solution is then obtained by concatenating
the two 4-bit values of the coordinates of the cell.
So, f(k) is a placement on position (5, 9) and g(k) one on position (8, 6).
The children generated by crossover represent placements at
respectively (5, 14) and (8, 1).
Clearly, a placement at (5, 14) is illegal: it does not represent a feasible
solution as coordinate values cannot exceed 10.
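The placement example can be reproduced with a one-point crossover. The crossover point (after the fifth bit) is an assumption: it is the cut that yields the children (5, 14) and (8, 1) quoted above. The helper names are hypothetical.

```python
def encode(x, y, bits=4):
    """Concatenate the two coordinates as fixed-width bit strings."""
    return format(x, f"0{bits}b") + format(y, f"0{bits}b")


def decode(s, bits=4):
    return int(s[:bits], 2), int(s[bits:], 2)


def crossover(f, g, point):
    """One-point crossover: swap the tails of the two parents."""
    return f[:point] + g[point:], g[:point] + f[point:]


def is_feasible(s, limit=10):
    x, y = decode(s)
    return x <= limit and y <= limit     # coordinates cannot exceed 10


f = encode(5, 9)                         # "01011001"
g = encode(8, 6)                         # "10000110"
c1, c2 = crossover(f, g, 5)
print(decode(c1), is_feasible(c1))       # (5, 14) False
print(decode(c2), is_feasible(c2))       # (8, 1) True
```

The infeasible child shows why a genetic algorithm needs either a repair step or an encoding in which crossover always yields feasible solutions.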
Layout Compaction:
At the lowest level, the level of the mask patterns for the fabrication of
the circuit, a final optimization can be applied to remove redundant
space.
This optimization is called layout compaction
Layout compaction can be applied in four situations,
1. Converting symbolic layout to geometric layout.
2. Removing redundant area from geometric layout.
3. Adapting geometric layout to a new technology.
4. Correcting small design rule errors
A new technology means that the design rules have changed;
as long as the new and old technologies are compatible (e.g. both are
CMOS technologies), this adaptation can be done automatically (e.g. by
means of so-called mask-to-symbolic extraction).
Compaction tools:
Layout is essentially two-dimensional and layout elements can in
principle be moved both horizontally and vertically for the purpose
of compaction.
When one-dimensional compaction tools are used, the layout
elements are only moved along one direction (either vertically or
horizontally).
This means that the tool has to be applied at least twice: once for
horizontal and once for vertical compaction.
Two-dimensional compaction tools move layout elements in
both directions simultaneously.
Theoretically, only two-dimensional compaction can achieve an
optimal result. This type of compaction is NP-complete. On the
other hand, one-dimensional compaction can be solved
optimally in polynomial time.
One partitions the edge set E of the constraint graph G(V, E) into two sets Ef
and Eb.
The edges in Ef have been obtained from the minimum-distance
inequalities and are called forward edges.
The edges in Eb correspond to maximum-distance inequalities and are
called backward edges.
The assignment (also called binding) problem is a task-to-agent
assignment problem, where
a task can be an operation or a value and an agent can be an FU (functional unit) or a
register.
Tasks are called compatible if they can be executed on the same agent.
In the case of values, compatibility means that their lifetimes do not
overlap.
The set of tasks can be used as the vertex set of a so-called
compatibility graph Gc(Vc, Ec).
The graph has an edge (vi, vj) ∈ Ec if and only if the tasks vi and vj are
compatible.
Alternatively, one can say that two tasks are in conflict if they cannot be executed on
the same agent.
The set of tasks is then used as the vertex set of a conflict graph that
has edges for those vertex pairs that are in conflict.
The conflict graph is the complement graph of the compatibility graph.
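For values, both graphs can be derived from the lifetimes. The sketch below assumes each lifetime is a half-open interval (birth, death), so that a value dying at time t is compatible with one born at time t; that convention and all names are assumptions.

```python
from itertools import combinations


def compatibility_graph(lifetimes):
    """Edges between values whose lifetimes do not overlap."""
    edges = set()
    for a, b in combinations(sorted(lifetimes), 2):
        (s1, e1), (s2, e2) = lifetimes[a], lifetimes[b]
        if e1 <= s2 or e2 <= s1:         # one value dies before the other is born
            edges.add((a, b))
    return edges


def conflict_graph(lifetimes):
    """Complement of the compatibility graph on the same vertex set."""
    all_pairs = set(combinations(sorted(lifetimes), 2))
    return all_pairs - compatibility_graph(lifetimes)


# Hypothetical lifetimes of four values.
lifetimes = {"v1": (0, 2), "v2": (1, 3), "v3": (2, 4), "v4": (3, 5)}
print(sorted(compatibility_graph(lifetimes)))
print(sorted(conflict_graph(lifetimes)))
```

Every vertex pair ends up in exactly one of the two edge sets, which is the complement relation stated above.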
The goal of the assignment problem is to minimize the number of agents
for the given set of tasks.
The vertices of any complete subgraph of a compatibility graph
correspond to a set of tasks that can be assigned to the same agent.
The goal of the assignment problem is then to partition the compatibility
graph in such a way that each subset in the partition forms a complete
graph and the number of subsets in the partition is minimal.
The subsets are pairwise disjoint and the union of the subsets forms the
original set by definition of a partition.
In the literature such a partitioning is called a clique partitioning
Combining vertices in the compatibility graph results in a supervertex.
The index I of a supervertex represents the set of indices of the vertices
from which the supervertex was formed.
For example, combining vertices 1, 3 and 7 gives a supervertex v1,3,7.
A supervertex vn is a common neighbor of the supervertices vi, vj ∈ Vk if both
edges (vi, vn) and (vj, vn) are included in the set Ek.
Partitioning Problem
The partitioning problem deals with splitting a network into two or more
parts by cutting connections.
The partitioning problem is treated here together with placement because
solution methods for the partitioning problem can be used as a
subroutine for some types of placement algorithms.
The data model consists of the three structures cell, port and net.
A cell is the basic building block of a circuit. A NAND gate is an example
of a cell.
The point at which a connection between a wire and a cell is established
is called a port.
The wire that electrically connects two or more ports is a net.
A set of ports is associated with each net.
A port can only be part of a single net.
Wire-length Estimation
total wire length is used to evaluate the quality of placement
Estimation:
A wire-length metric is applied to each net, resulting in a length
estimate per net.
The total wire length estimation is then obtained by summing the
individual estimates.
The total wiring area can then be derived from this length by assuming
a certain wire width and a wire separation distance.
All metrics refer to a cell's coordinates.
Common metrics include:
Half perimeter: this metric computes the smallest rectangle that
encloses all terminals of a net and takes the sum of the width and
height of the rectangle as an estimation of the wire length.
The estimation is exact for two- and three-terminal nets and gives a
lower bound for the wire length of nets with four or more terminals.
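The half-perimeter metric and the summation over nets can be sketched as follows. Terminals are assumed to be given as (x, y) pairs, and the names half_perimeter, total_wire_length and nets are hypothetical.

```python
def half_perimeter(terminals):
    """Width plus height of the smallest rectangle enclosing all terminals."""
    xs = [x for x, _ in terminals]
    ys = [y for _, y in terminals]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wire_length(nets):
    """Sum of the per-net estimates."""
    return sum(half_perimeter(t) for t in nets.values())


# Hypothetical nets: one two-terminal net and one three-terminal net.
nets = {
    "n1": [(0, 0), (3, 4)],
    "n2": [(1, 1), (2, 5), (6, 2)],
}
print(half_perimeter(nets["n1"]))    # 7
print(half_perimeter(nets["n2"]))    # 9
print(total_wire_length(nets))       # 16
```

Because the metric needs only the extreme coordinates of each net, it is cheap enough to be re-evaluated after every move of an iterative placer.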
Placement Algorithms:
Placement algorithms can be grouped into two categories:
Constructive placement: the algorithm is such that once the
coordinates of a cell have been fixed they are not modified anymore;
Iterative placement: all cells already have some coordinates and
cells are moved around, their positions are interchanged, etc. in order
to get a new (hopefully better) configuration.
Typically, an initial placement is obtained in a constructive way and attempts are
then made to increase the quality of the placement by iterative
improvement.
Constructive Placement:
Partitioning methods divide the circuit into two or more
subcircuits of a given size while minimizing the number of connections
between the subcircuits:
1. Min-cut partitioning and
2. Clustering
1. Min-cut partitioning
The basic idea of min-cut placement is to split the circuit into two
subcircuits of more or less equal size while minimizing the number of
nets that are connected to both subcircuits.
The two subcircuits obtained will each be placed in separate halves of the
layout.
The number of long wires crossing from one half of the chip to the other
will thus be minimized.
Bipartitioning is applied recursively.
Iterative Improvement:
Iterative improvement is a method that perturbs a given placement by
changing the positions of one or more cells and evaluates the result.
If the new cost is less than the old one, the new placement replaces the
old one and the process continues.
Force-directed placement:
It assumes that cells that share nets, feel an attractive "force" from each
other.
The goal is to reduce the total force in the network.
One can compute the "center of gravity" of a cell, the position where the
cell feels zero force.
The center of gravity (xig, yig) of a cell i is defined as the weighted
average of the positions of the cells with which it shares nets.
The perturbation is then to
move a cell to a legal position close to its center of gravity and,
if there is another cell at that position, to move that cell to some empty
location or to its own center of gravity.
Partitioning
Partitioning is needed when
a large circuit has to be implemented with multiple chips and the
number of pins on the IC packages necessary for interchip
communication should be minimized.
Kernighan-Lin Partitioning Algorithm
There is an edge-weighted undirected graph G(V, E).
The graph has 2n vertices (|V| = 2n); an edge (a, b) ∈ E has a weight, and if (a,
b) ∉ E, the weight is taken to be 0.
The problem is to find two sets A and B, subject to A ∪ B = V, A ∩ B = ∅,
and |A| = |B| = n, which minimize the cut cost, i.e. the sum of the weights
of all edges that have one endpoint in A and the other in B.
The construction of the sets Xm and Ym is based on external and
internal costs for vertices in the sets Am-1 and Bm-1.
The external cost Ea of a ∈ Am-1 is the sum of the weights of the edges
between a and the vertices in Bm-1;
the external cost for vertex a ∈ Am-1 is thus a measure for the pull that the
vertex experiences from the vertices in Bm-1.
The external cost Eb for a vertex b ∈ Bm-1 is defined analogously.
The gain in the cut cost, Δ, resulting from the interchange of two vertices
can be computed from these external and internal costs.
[Figure: the KL algorithm]
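The standard Kernighan-Lin gain computation can be sketched as follows, assuming the usual definitions: the internal cost of a vertex is the sum of the edge weights to vertices in its own set, D = E - I, and the gain of swapping a and b is Da + Db - 2 times the weight of (a, b). The symbol names and the dictionary-based weight table are assumptions, since the formulas themselves are missing from the notes.

```python
def weight(w, a, b):
    """Edge weight, 0 if the edge is absent (undirected)."""
    return w.get((a, b), w.get((b, a), 0))


def external_cost(w, v, other):
    return sum(weight(w, v, u) for u in other)


def internal_cost(w, v, own):
    return sum(weight(w, v, u) for u in own if u != v)


def gain(w, a, b, A, B):
    """Reduction of the cut cost when a in A and b in B are interchanged."""
    d_a = external_cost(w, a, B) - internal_cost(w, a, A)
    d_b = external_cost(w, b, A) - internal_cost(w, b, B)
    return d_a + d_b - 2 * weight(w, a, b)


def cut_cost(w, A, B):
    return sum(weight(w, a, b) for a in A for b in B)


# Hypothetical 4-vertex example.
w = {(1, 2): 1, (1, 3): 3, (1, 4): 2, (2, 4): 1, (3, 4): 2}
A, B = {1, 2}, {3, 4}
print(cut_cost(w, A, B))             # 6
print(gain(w, 1, 4, A, B))           # 1
print(gain(w, 1, 3, A, B))           # -1
```

A negative gain means the swap increases the cut cost; the KL algorithm still records such swaps within a pass and keeps only the best prefix of the swap sequence.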
Floorplanning:
floorplan-based design methodology: This top-down design
methodology advocates that layout aspects should be taken into
account in all design stages.
At higher levels of abstraction, due to the lack of detailed information,
only the relative positions of the subblocks in the structural description
can be fixed.
Taking layout into account in all design stages also gives early
feedback: structural synthesis decisions can immediately be evaluated
for their layout consequences and corrected if necessary.
The presence of (approximate) layout information allows for an
estimation of wire lengths. From these lengths, one can derive
performance properties of the design such as timing and power
consumption.
One can derive new composition operators from the wheel floorplan and
its mirror image and use them in combination with the horizontal and
vertical composition operators in a floorplan tree.
A floorplan that can be described in this way is called a floorplan of order 5.
A slicing floorplan can also be called a floorplan of order 2.
Abutment: when two cells that need to be electrically connected have their
terminals in the right order and separated correctly, the cells can simply
be put against each other without the necessity for a routing channel in
between them. Such cells are said to abut.
Ideally, all composite cells are created by abutment and no routing
channels are used in a floorplan. This requires the existence of flexible
cells.
Flexible cells should be able to accommodate feedthrough wires.
floorplan-based design does not exclude the existence of routing
channels. The channels can be taken care of by incorporating them in
the area estimations for the cells.
a small example where both c1 and c2 are inset cells with respective
sizes of 4 x 2 and 5x3. Clearly, there are four ways to stack the two cells
vertically
Routing
The specification of a routing problem will consist of the
1. position of the terminals,
2. the netlist that indicates which terminals should be interconnected
and
3. the area available for routing in each layer.
Routing is normally performed in two stages:
1. The first stage, global or loose routing, determines through which
wiring channels a connection will run.
2. The second stage, local or detailed routing, fixes the precise paths
that a wire will take (its position inside a channel and its layer).
Area Routing (single wiring layer, a grid, the presence of obstacles, and
fixed terminals in all the routing area).
Routing problems in which terminals are allowed anywhere in the area
available for routing are normally classified as area routing problems
"Path connection" or "maze routing" algorithm:
The basic algorithm is meant to realize a connection between two points
("source" terminal, the "target" terminal) in a plane, in an environment
that may contain obstacles.
If a path exists, the algorithm always finds the shortest connection,
going around obstacles.
Obstacles are grid points through which no wire segments can pass.
The distance between two horizontally or vertically neighboring grid
points corresponds to the shortest possible wire segment.
The algorithm consists of three steps:
wave propagation,
backtracing, and
cleanup.
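The three steps can be sketched on a small grid. The sketch assumes the grid is a list of rows in which 1 marks an obstacle and 0 a free grid point, and that source and target are free points given as (row, column); the function name maze_route is hypothetical.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def maze_route(grid, source, target):
    """Return a shortest path as a list of grid points, or None if no path exists."""
    rows, cols = len(grid), len(grid[0])
    dist = {source: 0}
    q = deque([source])
    # Step 1, wave propagation: label free points with their distance to the source.
    while q and target not in dist:
        r, c = q.popleft()
        for dr, dc in STEPS:
            p = (r + dr, c + dc)
            if (0 <= p[0] < rows and 0 <= p[1] < cols
                    and grid[p[0]][p[1]] == 0 and p not in dist):
                dist[p] = dist[(r, c)] + 1
                q.append(p)
    if target not in dist:
        return None
    # Step 2, backtracing: walk from the target along decreasing labels.
    path = [target]
    while path[-1] != source:
        r, c = path[-1]
        for dr, dc in STEPS:
            p = (r + dr, c + dc)
            if dist.get(p) == dist[(r, c)] - 1:
                path.append(p)
                break
    path.reverse()
    # Step 3, cleanup: the labels in dist are dropped on return; only the path is kept.
    return path


# Hypothetical grid with a wall that forces a detour.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = maze_route(grid, (0, 0), (2, 0))
print(path)
```

Since the wave expands one grid step at a time, the first label that reaches the target is the length of a shortest connection around the obstacles.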
Channel Routing:
Channel routing occurs as a natural problem in standard cell and
building block layout styles, but also in the design of printed circuit
boards (PCBs).
It consists of routing nets across a rectangular channel.
All terminals belonging to the same net have the same number.
Switchbox routing
A routing problem that has some similarity with channel routing is
switchbox routing.
Fixed terminals can be found on all four sides of the rectangular routing
area.
The minimization of the area is not an optimization goal.
Switchbox routing is a decision problem:
the goal is to find out whether a solution exists. When a solution can be
found, a secondary goal is to minimize the total wire length and the number of
vias.
The main problem with the fully merged form is the possible existence of
cycles, in which case the corresponding layout cannot be realized: a
segment cannot be at the same time above and below another one.
In the absence of cycles in the vertical constraint graph (VCG), a solution with a single horizontal
segment per net would amount to finding the longest path in the graph.
The list i_list contains the intervals in order of increasing left coordinate.
Once all nets have received a weight, the robust routing algorithm
finds the maximal-weight subset of nets that can be assigned to the same row.
The nets selected for the subset should not have horizontal
constraints.
For any graph, a set of vertices that does not contain pairs of adjacent
vertices is called an independent set.
The problem of finding the maximal-weight subset of the nets could
therefore be formulated as the maximal-weight independent set
problem of the corresponding interval graph.
In the case of the problem of obtaining the group of nonoverlapping
intervals with maximal total weight, the subinstances can be
identified by a single parameter y, with
1 < y < channel.width.
To obtain the subinstance with y = c, one should remove all intervals
that extend beyond column position c.
The costs of the optimal solutions for the subinstances with y = c are
stored in the array element total[c].
The optimal cost for the subinstance with y = c can be derived from the
optimal costs of the subinstances with y < c and the weights of the nets
that have their right-most terminals at position c (there are at most two
such nets).
Net n is part of the optimal solution if total[c - 1] < wn + total[xnmin - 1].
That is, n is part of the optimal solution if n's weight, added to the optimal
solution for the subinstance that does not include any nets that
overlap with n, is larger than the optimal solution for the subinstance
with y = c - 1.
If a net is selected for some c, the net's identification is stored in the
array selected_net.
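The recurrence can be sketched as follows. Nets are assumed to be given as (name, xmin, xmax, weight) tuples with columns numbered from 1, and total[0] = 0 is used as the base case; the function name and the final backward pass that recovers the selected nets are assumptions.

```python
def max_weight_nets(nets, width):
    """Dynamic programming over column positions c = 1..width."""
    total = [0] * (width + 1)            # total[c]: optimal weight for subinstance y = c
    selected_net = [None] * (width + 1)  # net chosen at column c, if any
    for c in range(1, width + 1):
        total[c] = total[c - 1]          # default: no net ending at c is selected
        for name, xmin, xmax, w in nets:
            if xmax == c and total[c] < w + total[xmin - 1]:
                total[c] = w + total[xmin - 1]
                selected_net[c] = (name, xmin)
    # Recover the chosen nets by walking back from the last column.
    chosen, c = [], width
    while c > 0:
        if selected_net[c] is None:
            c -= 1
        else:
            name, xmin = selected_net[c]
            chosen.append(name)
            c = xmin - 1                 # skip everything overlapping this net
    chosen.reverse()
    return total[width], chosen


# Hypothetical nets on a channel of width 7.
nets = [("a", 1, 3, 2), ("b", 2, 5, 4), ("c", 4, 6, 3), ("d", 6, 7, 2)]
print(max_weight_nets(nets, 7))          # (6, ['b', 'd'])
```

The inner loop is written for any number of nets ending at a column, although in a channel there are at most two.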
The rectilinear Steiner tree contains vertical segments that cross the rows
of standard cells. They can be realized in different ways:
1. By simply using a wiring layer that is not used by the standard cells.
2. By making use of feedthrough wires that may be available within
standard cells
3. By making use of feedthrough cells; these are cells that are inserted
between functional cells in a row of standard cells with the purpose of
realizing vertical connections.
First of all, it may be necessary to slightly shift the segments in order to
align with feedthrough wire positions.
Second, segments at approximately the same location can be permuted
to reduce the densities in the channels above and below the row that
they cross.
If feedthrough resources are scarce, their use can be minimized by
building a Steiner tree for which vertical connections have a higher cost than
horizontal ones.
Given the fact that longer wires roughly correspond to larger delays,
cells connected to critical nets (nets that are part of the critical path)
will receive a higher priority to be placed close to each other during
placement.
A long wire in an IC behaves more like a transmission line; one can partition
the wire into multiple segments, each segment with its own resistance
and capacitance.
A model based on this principle is the Elmore delay model.
The signal flow in a net is unidirectional, starting from a source
terminal and propagating to multiple sink terminals; signal changes
will not arrive simultaneously at all sinks.
It may e.g. be necessary to optimize the length of the connection from
the source to the critical sink (this is a connection that is part of the
critical path) rather than the overall tree length
In standard-cell layout, global routing minimizes the overall area if it
minimizes the sum of all channel widths.
Once Channel (2) has been routed, the floating terminals at its "bottom"
side are fixed by the channel router and become fixed terminals for the top side of
Channel (3).
The floating terminals at the left side of Channel (3) receive a fixed
position after completing the routing of the channel and become fixed
terminals for the right side of Channel (4).
local density
The local vertical density dv(i, j) (1 < i < m; 1 < j < n - 1) is
defined as the number of wires crossing the vertical grid segment
located on vertical grid line j between the horizontal lines i - 1 and i.
The local horizontal density dh(i, j) (1 < i < m - 1; 1 < j < n) is defined
as the number of wires crossing the horizontal grid line i between the
vertical grid lines j - 1 and j.
The density Dv(i) (1 < i < m) of the channel between grid lines i - 1 and
i is then given by the maximum over j of dv(i, j).
The goal of global routing is to minimize the total channel density given
by the sum over i of Dv(i).
Divide-and-conquer algorithm
Instead of using the same grid during the complete routing process, one
could start with a very coarse grid, say a 2 x 2 grid, perform global
routing on this grid by assuming
that all terminals covered by an elementary rectangle are located at
the rectangle's center, and
construct Steiner trees that evenly distribute the wires crossing the
grid segments.
One then gets four smaller routing problems that can be solved
recursively following the same approach.
The recursion stops when a sufficient degree of detail has been reached
for handing the problem over to a local router.
The decision on the ordering of wires crossing a boundary for one
subproblem will constrain the search space of the neighboring one.
All candidate points s are visited and the spanning tree for the points in
$P \cup \{s\}$ is computed each time.
The point that leads to the cheapest tree is then selected.
SELECTING POINT s:
An optimal rectilinear Steiner tree can always be embedded in the grid
composed of only those grid lines that carry points of the set P.
The intersection points of this grid are the candidate points, commonly
called Hanan points.
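One step of this selection can be sketched as follows. The code is an illustrative assumption, not a reference implementation: it builds the Hanan grid of a set of distinct terminals, computes a rectilinear minimum spanning tree cost with Prim's algorithm, and returns the single candidate point that reduces the cost most. All function names are made up for the example.

```python
from itertools import product


def manhattan(a, b):
    """Rectilinear distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_cost(points):
    """Cost of a minimum spanning tree (Prim) under rectilinear distance."""
    points = list(points)
    if len(points) < 2:
        return 0
    best = {p: manhattan(points[0], p) for p in points[1:]}
    cost = 0
    while best:
        nxt = min(best, key=best.get)      # closest point not yet in the tree
        cost += best.pop(nxt)
        for p in best:
            best[p] = min(best[p], manhattan(nxt, p))
    return cost


def hanan_points(points):
    """Grid intersections of lines through the terminals, minus the terminals."""
    xs = {x for x, _ in points}
    ys = {y for _, y in points}
    return set(product(xs, ys)) - set(points)


def best_steiner_point(points):
    """Try every Hanan point s and keep the one giving the cheapest tree."""
    best_point, best_cost = None, mst_cost(points)
    for s in sorted(hanan_points(points)):
        cost = mst_cost(list(points) + [s])
        if cost < best_cost:
            best_point, best_cost = s, cost
    return best_point, best_cost
```

For the terminals (0, 0), (2, 0) and (1, 2), the spanning tree costs 5 and adding the Hanan point (1, 0) lowers it to 4. Repeating the step until no candidate helps gives an iterated heuristic.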
ARCHITECTURAL SYNTHESIS
Architectural synthesis means constructing the macroscopic structure
of a digital circuit, starting from behavioral models that can be
captured by data-flow or sequencing graphs.
Outcome of architectural synthesis:
1. a structural view of the circuit, in particular of its data path, and
2. a logic-level specification of its control unit.
The data path is an interconnection of resources (implementing
arithmetic or logic functions), steering logic circuits (e.g.,
multiplexers and busses) that send data to the appropriate
destination at the appropriate time, and registers or memory arrays
to store data.
(Figure: structural view of the differential equation integrator with one
multiplier and one ALU.)
Scheduling:
We denote the execution delays of the operations by the set
$D = \{d_i; i = 0, 1, \ldots, n\}$.
The delay of the source and sink vertices is zero.
The start times of the operations, represented by the set
$T = \{t_i; i = 0, 1, \ldots, n\}$, are attributes of the vertices of the
sequencing graph.
The latency of a scheduled sequencing graph is denoted by $\lambda$, and it is the
difference between the start time of the sink and the start time of the
source: $\lambda = t_n - t_0$.
A scheduled sequencing graph is a vertex-weighted sequencing
graph, where each vertex is labeled by its start time.
Hierarchical Models:
A hierarchical schedule can be defined by associating a start time with
each vertex in each graph entity.
The start times are now relative to that of the source vertex in the
corresponding graph entity.
The latency computation of a hierarchical sequencing graph, with
bounded delay operations, can be performed by traversing the
hierarchy bottom up.
Delay modeling
1. The delay of a model call vertex is the latency of the corresponding
graph entity.
2. The delay of a branching vertex is the maximum of the latencies of the
corresponding bodies.
3. The delay of an iteration vertex is the latency of its body times the
maximum number of iterations.
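The bottom-up traversal can be sketched as a pair of mutually recursive functions. The data layout is an assumption made for this example: a graph entity is a dict with `"vertices"` and `"edges"`, and each vertex is a tuple tagged `"op"`, `"call"`, `"branch"` or `"loop"`. Explicit zero-delay source and sink vertices are omitted, which does not change the latency.

```python
def vertex_delay(spec):
    """Delay of one vertex, following the three delay-modeling rules."""
    kind = spec[0]
    if kind == "op":                     # plain operation with a known delay
        return spec[1]
    if kind == "call":                   # latency of the called graph entity
        return latency(spec[1])
    if kind == "branch":                 # worst case over the branch bodies
        return max(latency(body) for body in spec[1])
    if kind == "loop":                   # body latency times max iterations
        return latency(spec[1]) * spec[2]
    raise ValueError(f"unknown vertex kind: {kind}")


def latency(entity):
    """Latency of a graph entity: its longest delay-weighted path."""
    delays = {v: vertex_delay(s) for v, s in entity["vertices"].items()}
    finish = {}

    def finish_time(v):
        if v not in finish:
            preds = [u for u, w in entity["edges"] if w == v]
            start = max((finish_time(u) for u in preds), default=0)
            finish[v] = start + delays[v]
        return finish[v]

    return max((finish_time(v) for v in delays), default=0)


body = {"vertices": {"a": ("op", 2), "b": ("op", 1)}, "edges": [("a", "b")]}
other = {"vertices": {"c": ("op", 1)}, "edges": []}
top = {
    "vertices": {
        "x": ("op", 1),
        "it": ("loop", body, 4),          # at most 4 iterations of `body`
        "br": ("branch", [body, other]),
    },
    "edges": [("x", "it"), ("it", "br")],
}
```

In this example `body` has latency 3, so the loop vertex has delay 12, the branch vertex has delay 3, and the top-level latency is 1 + 12 + 3 = 16.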
SCHEDULING ALGORITHMS
A sequencing graph prescribes only dependencies among the
operations, whereas the scheduling of a sequencing graph determines
the precise start time of each task.
Sequencing and concurrency
The start times must satisfy the original dependencies of the sequencing
graph, which limit the amount of parallelism of the operations,
because any pair of operations related by a sequence dependency (or by
a chain of dependencies) may not execute concurrently.
Impact on area:
The maximum number of concurrent operations of any given type at any
step of the schedule is a lower bound on the number of required
hardware resources of that type. Therefore the choice of a schedule
also affects the area of the implementation.
The latency of the schedule equals the weight of the longest path from
source to sink.
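Unconstrained minimum-latency scheduling problem:
Given a set of operations V with integer delays D and a partial order on
the operations E, find an integer labeling of the operations
$\varphi : V \to Z^+$ such that $t_i = \varphi(v_i)$,
$t_i \ge t_j + d_j$ for all i, j with $(v_j, v_i) \in E$, and $t_n$ is minimum.
Resource-constrained scheduling problem:
Given a set of operations V with integer delays D, a partial order on the
operations E and upper bounds $\{a_k; k = 1, 2, \ldots, n_{ops}\}$, find an integer
labeling of the operations $\varphi : V \to Z^+$ such that $t_i = \varphi(v_i)$,
$t_i \ge t_j + d_j$ for all i, j with $(v_j, v_i) \in E$,
$|\{v_i : T(v_i) = k \text{ and } t_i \le l < t_i + d_i\}| \le a_k$ for each operation
type k and each schedule step l, and $t_n$ is minimum.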
Latency-Constrained
Scheduling:
The ALAP Scheduling Algorithm
The latency-constrained problem is given an upper bound on the latency,
denoted by $\bar{\lambda}$.
It can be solved by executing the ASAP scheduling algorithm and verifying
that $t_n^S - t_0^S \le \bar{\lambda}$.
The ASAP scheduling algorithm yields the minimum values of the start
times.
The as late as possible (ALAP) scheduling algorithm provides the
corresponding maximum values.
Mobility (or slack): the difference of the start times computed by the
ALAP and ASAP algorithms, namely $\mu_i = t_i^L - t_i^S$; i = 0, 1, ..., n.
Zero mobility implies that an operation can be started only at one given
time step in order to meet the overall latency constraint.
When the mobility is larger than zero, it measures the span of the time
interval in which the operation may be started.
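A compact sketch of ASAP, ALAP and mobility is given below. It assumes a polar sequencing graph given as a dict of delays plus a list of dependency edges, counts start times from 0 at the source, and uses `graphlib.TopologicalSorter` from the Python standard library (3.9 or later). The function names are chosen for this example only.

```python
from graphlib import TopologicalSorter


def asap(delays, edges):
    """Earliest start times: each operation starts when its last predecessor ends."""
    preds = {v: [] for v in delays}
    for u, v in edges:
        preds[v].append(u)
    start = {}
    for v in TopologicalSorter(preds).static_order():   # predecessors first
        start[v] = max((start[u] + delays[u] for u in preds[v]), default=0)
    return start


def alap(delays, edges, latency_bound):
    """Latest start times that still meet the latency bound."""
    succs = {v: [] for v in delays}
    for u, v in edges:
        succs[u].append(v)
    start = {}
    for v in TopologicalSorter(succs).static_order():   # successors first
        latest_end = min((start[w] for w in succs[v]), default=latency_bound)
        start[v] = latest_end - delays[v]
    return start


def mobility(delays, edges, latency_bound):
    """Slack of each operation: ALAP start minus ASAP start."""
    early = asap(delays, edges)
    late = alap(delays, edges, latency_bound)
    return {v: late[v] - early[v] for v in delays}


delays = {"src": 0, "a": 1, "b": 1, "c": 1, "snk": 0}
edges = [("src", "a"), ("src", "c"), ("a", "b"), ("b", "snk"), ("c", "snk")]
```

Here the ASAP start time of the sink is 2, so a latency bound of 2 is feasible. With that bound, only operation `c` has nonzero mobility (1), while `a` and `b` lie on the critical path.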
Relative Scheduling
We assume that operations issue completion signals when
execution is finished.
A start signal to the source vertex is also its completion signal.
Consider a sequencing graph $G_s(V, E)$ where a subset of the vertices has
unspecified execution delay. Such vertices, as well as the source
vertex, provide a frame of reference for determining the start time of
the operations.
Anchors
The anchors of a constraint graph G(V, E) consist of the source
vertex $v_0$ and of all vertices with unbounded delay.
The start time and stop time of the operations cannot be determined on
an absolute scale.
The schedule of the operations is relative to the anchors.
A defining path $p(a, v_i)$ from anchor a to vertex $v_i$ is a path in G(V,
E) with one and only one unbounded weight: $d_a$.
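The relevant anchor set of a vertex $v_i$ is the subset of anchors $R(v_i)$
such that a belongs to $R(v_i)$ if there exists a defining path $p(a, v_i)$.
When considering one path only and when anchors are cascaded along
the path, only the last one affects the start time of the operation at
the head of the path.
An anchor a is redundant for vertex $v_i$ when there is another relevant
anchor $b \in R(v_i)$ such that $|p(a, v_i)| = |p(a, b)| + |p(b, v_i)|$,
where $|p(\cdot)|$ denotes the weight of the longest defining path,
excluding the unbounded weight.
For any given vertex $v_i$ the irredundant relevant anchor set
represents the smallest subset of anchors that affects the start time
of that vertex.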
Let $t_i^a$ be the schedule of operation $v_i$ with respect to anchor a,
computed on the polar subgraph induced by anchor a and its
successors, assuming that a is the source of the subgraph and that
all anchors have zero execution delay. Then the start time of $v_i$ is
$t_i = \max_{a \in R(v_i)} \{ t_a + d_a + t_i^a \}$
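Let us denote by t the vector whose entries are the start times.
Then, the minimum-latency scheduling problem under resource
constraints can be stated as: minimize $c^T t$ subject to the sequencing
and resource constraints.
(c is a 0/1 weight vector that selects the start times entering the
objective; $c = [0, \ldots, 0, 1]^T$ minimizes the start time of the sink,
i.e. the latency)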
$\gamma'$: the largest integer such that all vertices with labels larger than or
equal to $\alpha + 1 - \gamma'$ have been scheduled up to the critical step and the
following one
Denominator= l
a schedule exists with a resources that satisfy the latency bound
Using upper 2
Recalling that the $\alpha - \gamma'$ schedule steps are used by the algorithm to
schedule the remaining operations after step c + 1, the total number of
steps used by the algorithm is
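Self-forces
Let us consider operation $v_i$ of type $k = T(v_i)$ when scheduled in step l.
The force relating that operation to a step $m \in [t_i^S, t_i^L]$ is equal to the
type distribution $q_k(m)$ times the variation in probability
$\delta_{lm} - p_i(m)$, where $\delta_{lm}$ is 1 if m = l and 0 otherwise (so the
variation is $1 - p_i(l)$ for the chosen step l).
The self-force is the sum of the forces relating that operation to all
schedule steps in its time frame:
$\text{self-force}(i, l) = \sum_{m = t_i^S}^{t_i^L} q_k(m) (\delta_{lm} - p_i(m))$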
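The self-force can be computed directly from the time frames. The sketch below assumes unit-delay operations whose start step is uniformly distributed over the time frame given by the ASAP and ALAP times, so that the probability of an operation in any step of its frame is one over the frame width. The names `probability`, `type_distribution` and `self_force` are hypothetical.

```python
from fractions import Fraction


def probability(frame, m):
    """Probability that an operation with time frame (lo, hi) runs in step m."""
    lo, hi = frame
    return Fraction(1, hi - lo + 1) if lo <= m <= hi else Fraction(0)


def type_distribution(frames, types, kind, m):
    """Sum of the probabilities of all operations of one type in step m."""
    return sum((probability(frames[v], m) for v in frames if types[v] == kind),
               Fraction(0))


def self_force(frames, types, op, step):
    """Self-force of scheduling `op` in `step` (must lie in its time frame)."""
    lo, hi = frames[op]
    if not lo <= step <= hi:
        raise ValueError("step outside the time frame of the operation")
    total = Fraction(0)
    for m in range(lo, hi + 1):
        delta = 1 if m == step else 0          # 1 for the chosen step, else 0
        q = type_distribution(frames, types, types[op], m)
        total += q * (delta - probability(frames[op], m))
    return total


frames = {"m1": (1, 1), "m2": (1, 2), "a1": (2, 3)}
types = {"m1": "mult", "m2": "mult", "a1": "add"}
```

In this example the multiplier distribution is 3/2 in step 1 and 1/2 in step 2. Scheduling `m2` in step 1 gives a self-force of +1/2 and in step 2 gives -1/2, so step 2 is preferred because it lowers the peak demand for multipliers.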
Boolean algebra
Boolean algebra is defined by the set B = {0, 1} and by two operations,
denoted by + and $\cdot$.
The multi-dimensional space spanned by n binary-valued Boolean
variables is denoted by B^n.
It is often referred to as the n-dimensional cube.
A point in B^n is represented by a binary-valued vector of dimension n.
A literal is an instance of a variable or of its complement.
A product of n literals denotes a point in the Boolean space: it is a
zero-dimensional cube.
An n-input, m-output function is a mapping $f : B^n \to B^m$.
The subsets of the domain for which the function takes the values
0, 1 and * (don't care) are called the off set, on set, and dc set,
respectively.
The consensus of a function with respect to a variable represents the
component that is independent of that variable.
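A small sketch of the consensus, assuming the usual definition as the product of the two cofactors of the function with respect to the variable. Functions are modeled as Python callables on 0/1 arguments and variables are addressed by position; these representation choices are for illustration only.

```python
from itertools import product


def cofactor(f, var, value):
    """Return f with input number `var` fixed to `value` (0 or 1)."""
    def g(*bits):
        fixed = list(bits)
        fixed[var] = value
        return f(*fixed)
    return g


def consensus(f, var):
    """Product of the positive and negative cofactors with respect to `var`."""
    pos, neg = cofactor(f, var, 1), cofactor(f, var, 0)
    return lambda *bits: pos(*bits) & neg(*bits)


def on_set(f, n):
    """All points of B^n where the n-input function f evaluates to 1."""
    return {bits for bits in product((0, 1), repeat=n) if f(*bits)}


f = lambda a, b, c: (a & b) | c      # f = ab + c
g = consensus(f, 0)                  # consensus with respect to a
```

For f = ab + c the cofactors with respect to a are b + c and c, so the consensus is c: the part of f that holds regardless of the value of a.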
EXPRESSION FORMS
Scalar Boolean functions can be represented by expressions of literals
linked by the + and $\cdot$ operators.
Single-level forms use only one operator.
Standard two-level forms are sum of products of literals and product of
sums of literals.
BINARY DECISION DIAGRAMS
A binary decision diagram represents a set of binary-valued decisions,
culminating in an overall decision that can be either TRUE or FALSE.
Isomorphic OBDD:
Two OBDDs are isomorphic if there is a one-to-one mapping between
the vertex sets that preserves adjacency, indices and leaf values.
ROBDD
An OBDD is said to be a reduced OBDD (or ROBDD) if it contains no
vertex v with low(v) = high(v), nor any pair {v, u} such that the
subgraphs rooted in v and in u are isomorphic.
(Redundancies have been eliminated from the diagram.)
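The two reduction rules can be enforced while the diagram is being built, using a unique table. The sketch below is a deliberately naive illustration: it expands the function variable by variable (exponential in the number of variables) and is only meant to show the rules at work. The class and method names are assumptions.

```python
class ROBDD:
    """Reduced OBDD with a fixed variable order 0, 1, ..., num_vars - 1."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.nodes = {0: None, 1: None}   # ids 0 and 1 are the leaves
        self.unique = {}                  # (index, low, high) -> node id

    def mk(self, index, low, high):
        """Create or reuse a vertex, applying both reduction rules."""
        if low == high:                   # rule 1: no vertex with low = high
            return low
        key = (index, low, high)
        if key not in self.unique:        # rule 2: no isomorphic subgraphs
            node_id = len(self.nodes)
            self.nodes[node_id] = key
            self.unique[key] = node_id
        return self.unique[key]

    def build(self, f, index=0, assignment=()):
        """Expand f on variable `index`, then recurse on both branches."""
        if index == self.num_vars:
            return int(bool(f(*assignment)))
        low = self.build(f, index + 1, assignment + (0,))
        high = self.build(f, index + 1, assignment + (1,))
        return self.mk(index, low, high)


bdd = ROBDD(3)
root1 = bdd.build(lambda a, b, c: (a & b) | c)
root2 = bdd.build(lambda a, b, c: (a | c) & (b | c))
```

Both expressions denote the same function, so they end up with the same root vertex. This canonicity is what makes equivalence checking with ROBDDs a simple comparison of roots.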
The on set, off set and dc set of a function f can be modeled by covers.
An entry aij is 1 if and only if the jth prime covers the ith minterm.
A minimum cover is a minimum set of columns which covers all rows
Petrick's method:
Write down the covering clauses of the (reduced) implicant table in
product of sums form.
The product of sums form is then transformed into a sum of products
form by carrying out the products of the sums.
The corresponding sum of products expression is satisfied when any of
its product terms is TRUE.
Each product term represents a set of selected primes that forms a cover.
A minimum cover is identified by any product term of the sum of
products form with the fewest literals.
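The multiplication step can be sketched with sets. In this illustration each covering clause is the set of primes that cover one minterm (every clause is assumed non-empty), products are kept as frozensets, and absorption removes any product that strictly contains another. The function name `petrick` and the prime labels are made up.

```python
def petrick(clauses):
    """Return all minimum covers, each as a sorted list of prime names."""
    products = {frozenset()}
    for clause in clauses:
        # Multiply the current sum of products by the next sum (clause).
        expanded = {p | {prime} for p in products for prime in clause}
        # Absorption: drop any product that strictly contains another one.
        products = {p for p in expanded if not any(q < p for q in expanded)}
    fewest = min(len(p) for p in products)
    return sorted(sorted(p) for p in products if len(p) == fewest)


# Cyclic implicant table: each of six minterms is covered by two primes.
clauses = [
    {"P1", "P2"}, {"P1", "P3"}, {"P2", "P4"},
    {"P3", "P5"}, {"P4", "P6"}, {"P5", "P6"},
]
covers = petrick(clauses)
```

For this table the method finds two minimum covers of three primes each, P1 P4 P5 and P2 P3 P6. The number of products can grow exponentially, which is why the method is only practical for small tables.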
ESPRESSO-EXACT algorithm:
The major improvements of the ESPRESSO-EXACT algorithm over
the Quine-McCluskey algorithm consist of the construction of a smaller
reduced prime implicant table and of the use of an efficient
branch-and-bound algorithm for covering.
ESPRESSO-EXACT partitions the prime implicants into three sets:
essentials,
partially redundant and
totally redundant.
Totally redundant primes are those covered by the essentials.
The partially redundant set includes the remaining ones.
The rows of the reduced implicant table correspond to sets of minterms,
rather than to single minterms as in the case of Quine-McCluskey's
algorithm.
Each row corresponds to all minterms which are covered by the same
subset of prime implicants.
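The row construction can be illustrated by grouping minterms according to the set of primes that cover them. This sketch only shows the grouping idea; it does not remove essential or totally redundant primes first, and the input format (a dict from prime name to covered minterms) is an assumption.

```python
from collections import defaultdict


def group_rows(covers):
    """Map each distinct subset of primes to the minterms it covers jointly."""
    covering = defaultdict(set)           # minterm -> primes covering it
    for prime, minterms in covers.items():
        for m in minterms:
            covering[m].add(prime)
    rows = defaultdict(set)               # subset of primes -> minterms
    for m, primes in covering.items():
        rows[frozenset(primes)].add(m)
    return dict(rows)


covers = {"p1": {0, 1, 2}, "p2": {2, 3}, "p3": {3, 4, 5}}
rows = group_rows(covers)
```

Here six minterms collapse into four rows, because minterms 0 and 1 are covered only by p1 and minterms 4 and 5 only by p3.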
ABSTRACT MODELS
Structures
Structural representations can be modeled in terms of incidence
structures.
An incidence structure consists of a set of modules, a set of nets
and an incidence relation among modules and nets.
A simple model for the structure is a hypergraph, where the
vertices correspond to the modules and the edges to the nets.
The incidence relation is then represented by the corresponding
incidence matrix.
An alternative way of specifying a structure is to denote each
module by its terminals, called pins (or ports), and to describe the
incidence among nets and pins.
Logic Network
A generalized logic network is a structure where each leaf module is
associated with a combinational or sequential logic function.
While this concept is general and powerful, we consider here two
restrictions to this model: the combinational logic network and the
synchronous logic network.
The combinational logic network, also called logic network or Boolean
network, is a hierarchical structure where:
Each leaf module is associated with a multiple-input, single-output
combinational logic function, called a local function.
Pins are partitioned into two classes, called inputs and outputs. Pins
that do not belong to submodules are also partitioned into two classes,
called primary inputs and primary outputs.
Each net has a distinguished terminal, called a source, and an
orientation from the source to the other terminals.
State Diagrams:
The behavioral view of sequential circuits at the logic level can
be expressed by finite-state machine transition diagrams.
A finite-state machine can be described by:
A set of primary input patterns, X.
A set of primary output patterns, Y.
A set of states, S.
A state transition function, $\delta : X \times S \to S$.
An output function, $\lambda : X \times S \to Y$ for Mealy models or
$\lambda : S \to Y$ for Moore models.
An initial state.
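The description above maps directly onto two lookup tables. The sketch below simulates a Mealy machine whose transition and output functions are dicts keyed by (input, state); the machine itself, a detector for two consecutive 1s, is a made-up example.

```python
def run_mealy(transition, output, initial_state, inputs):
    """Apply the inputs one by one; return the output sequence and final state."""
    state, outputs = initial_state, []
    for x in inputs:
        outputs.append(output[(x, state)])   # Mealy: output depends on x and state
        state = transition[(x, state)]
    return outputs, state


# States: "s0" = last input was 0 (or none yet), "s1" = last input was 1.
transition = {(0, "s0"): "s0", (1, "s0"): "s1",
              (0, "s1"): "s0", (1, "s1"): "s1"}
output = {(0, "s0"): 0, (1, "s0"): 0,
          (0, "s1"): 0, (1, "s1"): 1}

outs, final = run_mealy(transition, output, "s0", [1, 1, 0, 1, 1, 1])
```

The output is 1 exactly when the current and the previous input are both 1. A Moore model would instead key the output table by state alone.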