Lecture 02 HDL PDF
Mahdi Shabany
Department of Electrical Engineering
Sharif University of Technology
[Figure: ASIC design flow. Specifications → RTL Coding → Simulation (driven by a Test Bench) → Synthesis (using Standard Cells and Timing Constraints) → Pre-Layout Timing Analysis → APR → Back Annotation → Post-Layout Timing Analysis → Logic Verification → Tapeout. The flow contains Pass? checks: Yes continues to the next step, NO loops back.]
1. HDL Coding
2. Simulation
3. Synthesis
4. Placement & routing
5. Timing Analysis & Verification
In this course we learn all the above steps in detail for ASIC
HDL allows us to describe the functionality of a logic circuit in a language that is:
Easy to understand
Easy to share
Hides complicated implementation details
The designer is more concerned with the design functionality than with the detailed circuit design
After HDL coding, the code has to be tested using “testbenches” (Verification).
Simulation tools:
Synopsys VCS (Synopsys)
Modelsim (Mentor Graphics)
NCVerilog (Cadence)
Synthesis tool:
Analyzes a piece of Verilog code and converts it into optimized logic gates
This conversion is done according to the “language semantics”
We have to learn these language semantics, i.e., Verilog code.
If a designer can design 150 gates a day, it will take about 66,667
man-days to design a 10-million-gate design, or almost 20 years for 10
designers! This assumes that complexity grows linearly as the
design gets bigger.
Synthesis tool:
[Figure: the Synthesis Tool produces a Gate-level Netlist]
Logic Verification
Simulate and test the very final netlist after APR
Timing analysis using testbenches
Send the final design (GDS file) for fabrication
Solution:
Describe the design in text Hardware Description Language (HDL)
Just describe the design “behavior” not the detailed gate-level logic
Gate-level logic is generated automatically by a “synthesis” tool
Designers can make decisions about cost, performance, power, and area
earlier in the design process
Verilog:
Developed by Philip Moorby in 1985 as a proprietary language
Opened to the public by Cadence Design Systems in 1990
IEEE standard in 1995 and revised in 2001
Verilog is used in this course!
VHDL | Verilog
Commissioned in 1981 by Department of Defense | Created by Gateway Design Automation in 1985
Initially created for ASIC Synthesis | Initially an interpreted language for gate-level simulation
Strong support for package management and large designs | No special extensions for large designs
ADA-like verbose syntax, lots of redundancy | C-like concise syntax
Design is composed of entities each of which can have multiple architectures | Design is composed of modules which have just one implementation
Gate-level, dataflow, and behavioral modeling. Synthesizable subset. | Gate-level, dataflow, and behavioral modeling. Synthesizable subset.
Harder to learn and use | Easy to learn and use
[Figure: Combinational Logic block with In, Out, and Clk; the critical path is marked through the logic]
input
output module
Signal
  Net
    wire:
      For interconnecting logic elements (LEs)
      Connects an output of one logic element to the input of another LE
    tri:
      Circuit nodes that are connected in a tri-state fashion
  Variable
    reg (unsigned in general):
      Corresponds to a circuit node (not necessarily a register!)
      Allows a circuit to be described in terms of its behavior
      Retains its value until it is overwritten by a subsequent assignment
    integer (signed in general):
      Used for loop counters
The “wire” declarations are not necessary, as Verilog assumes that signals
are nets by default.
The “reg” declaration is required!
Example:
module DUT (A, B, C);    // Don't forget the semicolon
  input [1:0] A;
  output B;
  inout [2:0] C;
  // Body code
endmodule
Example:
module DUT (s, Out);
  // Ports
  input [3:0] s;
  output [2:0] Out;
  // Signals
  wire [2:0] Out;          // wire (for interconnection)
  reg [2:0] Count;
  integer k;               // loop counter
  // Body code
  always @ (s)
  begin
    Count = 0;
    for (k=0; k<4; k=k+1)
      if (s[k])
        Count = Count + 1; // ";" at the end of each statement
  end
  assign Out = Count;
endmodule
The keyword “reg” does NOT necessarily denote a storage element or register.
“reg” only models the behavior of a circuit.
May or may not be synthesized as a register.
[Figure: two circuits with inputs a, b and output C, one of them clocked by Clk]
Logic values:
0
1
Z: High impedance
X: Unknown
Strength levels:
pull: Driving
large: Storage
weak: Driving
medium: Storage
small: Storage
highz: High Impedance (weakest)
Declaration format: Type Range Name Value (Scalar or Vector)
Example:
module DUT (s, Out);
parameter n = 3;
parameter S0 = 4'b1010;
input [n-1:0] s;
output [n:0] Out;
endmodule
Bitwise:
Operation | Result
~1010 | 0101
… | 1000
1101 ^ 0100 | 1001
Logical (a non-zero operand counts as logical “1”):
Operation | Result
1010 && 1100 | 1
2'b11 || 2'b00 | 1
!0010 | 0
X || 1 = 1
X && 0 = 0
Reduction:
Operation | Result
&111 | 1
^0100 | 1
Relational:
Operation | Result
A=2'b10, B=(A == 2'b10) | B=1
Shift (results consistent with A = 6'b001100):
Operation | Result
D = A << 2 | D = 110000
F = A >> 3 | F = 000001
Concatenation (with A=2'b11, B=3'b010):
Operation | Result
{A, B} | 5'b11010
{3{A}} | 6'b111111
{B, B} | 6'b010010
{{3{A}}, {2{B}}} | 12'b111111010010
Be generous with braces {}
D = ({S1,S2}==2'b00)? F:
    ({S1,S2}==2'b01)? E:
    ({S1,S2}==2'b10)? C:B;

With an explicit default:
D = ({S1,S2}==2'b00)? F:
    ({S1,S2}==2'b01)? E:
    ({S1,S2}==2'b10)? C:
    ({S1,S2}==2'b11)? B:B;    // default

[Figure: 4-input multiplexer (MUX) with data inputs F (00), E (01), C (10), B (11), select inputs S1 S2, and output D]
Signals, declared separately:
output [3:0] B;
reg [3:0] B;
Combined:
output reg [3:0] B;
Body-code
endmodule
Outside view of the module:
input port: wire or reg
output port: wire
inout: wire
[Figure: port connection rules. input: reg or net outside, net inside; output: reg or net inside, net outside; inout: net on both sides]
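assign C = x & y;    (AND gate with inputs x, y and output C)

Equivalent bit-by-bit assignment statements for vectors A, B, C:
assign C[1] = A[1]&B[1];
assign C[2] = A[2]&B[2];
assign C[3] = A[3]&B[3];

Full adder, first description:
assign S = x ^ y ^ Cin;
assign Cout = (x & y)|(x & Cin)|(y & Cin);
endmodule

Full adder, second description:
assign {Cout, S} = x + y + Cin;
endmodule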
x y Cin | Cout S
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 0 1
0 1 1 | 1 0
1 0 0 | 0 1
1 0 1 | 1 0
1 1 0 | 1 0
1 1 1 | 1 1
[Figure: full adder with inputs x, y, Cin and outputs Cout, S]
wire #2 S;
assign #5 S = x&y;
Correct Incorrect
A given variable should never be assigned a value in more than one always block.
Because always blocks are concurrent with respect to one another.
Blocking (=) | Non-blocking (<=)
Evaluated and assigned in a single step | Evaluated and assigned in two steps
Sequential nature | 1. All RHSs are evaluated in parallel
Assignment ordering IS important | 2. Assignments to LHSs are performed together
S=4 “blocks” a=S from being evaluated | They are all evaluated at once
a=S has to wait for S=4 to be evaluated first | Assignment ordering is NOT important
 | S<=4 and a<=S are evaluated in parallel
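The difference can be sketched with a two-stage shift of in into y1 and y2. The module names below are illustrative and not taken from the slides; this is a minimal sketch, not the lecture's own code.

```verilog
// Non-blocking: both RHSs are sampled first, then both LHSs are updated.
// y2 receives the OLD value of y1, so this models two flip-flops in series.
module shift_nonblocking (input Clk, input in, output reg y1, output reg y2);
  always @(posedge Clk) begin
    y1 <= in;
    y2 <= y1;
  end
endmodule

// Blocking: y1 is updated immediately, so y2 sees the NEW value of y1.
// Both y1 and y2 load "in" on the same clock edge.
module shift_blocking (input Clk, input in, output reg y1, output reg y2);
  always @(posedge Clk) begin
    y1 = in;
    y2 = y1;
  end
endmodule
```

Swapping the two statements in the blocking version changes its behavior, while the non-blocking version is insensitive to statement order.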
[Figure: circuits and waveforms of Clk, in, y1, and y2 for the two assignment styles]
assign a=b&c;
always @ (c,b)
begin
d = c^b;
end
assign e=b|c;
Assignments:
Procedural (inside an always block):
  Blocking: always @ (*) begin ... = ... end
  Non-blocking: always @ (*) begin ... <= ... end
Continuous (using the assign statement):
  assign a=b;
assign cannot be used inside an always block because assign is used for nets.
Nets cannot be assigned inside an always block (only reg or integer can).
[Figure: a module containing several assign statements and several always blocks side by side]
Critical path of the Comb. Logic determines the max operating frequency
Combinational logic can be realized using assign and always constructs
Sequential logic can only be realized using always blocks.
When using an always block for Comb. Logic, “blocking” assignments are used
An always block is re-evaluated whenever one of the variables in its
sensitivity list changes
Why?
1. Because powerful statements like if-else and loop constructs can only
be used inside an always block
Comes with more clarity and more concise description than assign
2. Multiple outputs can be assigned within a single always block
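As a sketch of both points, the hypothetical module below (names and widths are assumptions, not from the slides) uses an if-else inside one always block to drive two outputs at once.

```verilog
// Hypothetical example: one combinational always block, two outputs.
module sort2 (input [3:0] a, input [3:0] b,
              output reg [3:0] smaller, output reg [3:0] larger);
  always @(*) begin
    if (a < b) begin
      smaller = a;   // blocking assignments for combinational logic
      larger  = b;
    end else begin
      smaller = b;
      larger  = a;
    end
  end
endmodule
```

The same function written with assign would need two separate conditional expressions, one per output.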
[Figure: sequential circuit. Input → Comb. Logic → Next State (NS) → Registers/Flip-Flops (FFs) → Current State (CS) → Comb. Logic → Output; the FFs are driven by Clk and by Reset, a high-fanout reset signal]
Example:
[Figure: waveforms of Clk and Reset with the responses of a D-Latch, a FF with sync Rst, and a FF with async Rst]
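The three storage elements compared above could be described as follows. This is a minimal sketch: the module names, the single-bit width, and the active-high Reset polarity are assumptions.

```verilog
// D latch: transparent while Clk is high, holds its value while Clk is low.
module d_latch (input Clk, input d, output reg q);
  always @(Clk, d)
    if (Clk) q <= d;
endmodule

// FF with synchronous reset: Reset only takes effect at the rising clock edge.
module dff_sync_rst (input Clk, input Reset, input d, output reg q);
  always @(posedge Clk)
    if (Reset) q <= 1'b0;
    else       q <= d;
endmodule

// FF with asynchronous reset: Reset takes effect immediately,
// because it appears in the sensitivity list.
module dff_async_rst (input Clk, input Reset, input d, output reg q);
  always @(posedge Clk or posedge Reset)
    if (Reset) q <= 1'b0;
    else       q <= d;
endmodule
```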
wire [n:0] d;
reg [n:0] q;
...
always @ (posedge Clk)
q<=d;
The keyword “reg” does NOT necessarily denote a storage element or register.
“reg” simply means a variable that can hold a value
May or may not be synthesized as a register.
Race Condition
[Figure: waveforms of Clk, in, y1, and y2 for two codings; in the incorrect one the value of y2 is undetermined (?)]
Incorrect!
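[Figure: path In → Combinational Logic → Out clocked by Clk; the timing diagram marks Tsu and Thold around each clock edge and the delays Tcq, Tlogic, and Tsu within one clock period]
TClk > Tcq + Tlogic + Tsu
Tlogic < TClk - Tsu - Tcq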
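As a worked example of the setup condition, assume the illustrative values Tcq = 0.2 ns, Tlogic = 3.0 ns, and Tsu = 0.3 ns (these numbers are assumptions, not from the slides):

```latex
T_{Clk} > T_{cq} + T_{logic} + T_{su} = 0.2 + 3.0 + 0.3 = 3.5\ \text{ns}
\quad\Rightarrow\quad
f_{Clk} < \frac{1}{3.5\ \text{ns}} \approx 285.7\ \text{MHz}
```

Shortening the critical path (a smaller Tlogic) is therefore the direct way to raise the maximum clock frequency.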
© M. Shabany, ASIC/FPGA Chip Design
System Timing Parameters: Minimum Delay
Hold-time Condition:
If violated, the circuit does not work, even at lower frequencies (why?)
[Figure: path In → Combinational Logic → Out clocked by Clk; the timing diagram marks Tsu and Thold around each clock edge and the minimum delays Tcq,cd and Tlogic,cd after it]
Tcq,cd + Tlogic,cd > Thold
Example (first version):
module Mux21 (in1, in2, s, out);
input in1, in2, s;
output reg out;
always @ (in1, in2, s)
if (s==0)
out = in1;
else
out = in2;
endmodule

Example (second version):
module Mux21 (in1, in2, s, out);
input in1, in2, s;
output reg out;
always @ (in1, in2, s)
begin
out = in1;
if (s==1)
out = in2;
end
endmodule

[Figure: 2-to-1 multiplexer with in1 on input 0, in2 on input 1, select s, and output out]
Procedural Statements
module Mux21 (in1, in2, s, out) module Mux21 (in1, in2, s, out)
Example: input in1, in2, s; input in1, in2, s;
output reg out; output reg out;
When realizing combinational logic with always block using if-else or case
constructs care has to be taken to avoid latch inference after synthesis
The latch is inferred when “incomplete” if-else or case statements are declared
If there is some logic path through the always block that does not assign a value
to the output, a latch is inferred
[Figure: Latch Inference. Synthesized circuits in which a latch (D, Q) holds out; one with inputs A, B and select S, one with selects S[0] and S[1]]
Latch Inference in Combinational Logic
To avoid latch inference, make sure to specify all possible cases “explicitly”
Do NOT leave it up to the synthesis tool to decide what happens in unspecified
cases; specify all cases explicitly.
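A minimal sketch of the difference, using hypothetical module names: the first always block leaves one path unassigned and so infers a latch, the second assigns out on every path.

```verilog
// Incomplete if: when S is 0, out is not assigned, so it must keep its
// previous value. Synthesis infers a latch.
module incomplete_if (input A, input S, output reg out);
  always @(A, S)
    if (S) out = A;
endmodule

// Complete if-else: out is assigned on every path through the block,
// so this is a purely combinational 2-to-1 multiplexer.
module complete_if (input A, input B, input S, output reg out);
  always @(A, B, S)
    if (S) out = B;
    else   out = A;
endmodule
```

An alternative with the same effect is to assign a default value to out at the top of the always block before the if statement.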
[Figure: circuits with inputs A, B, selects S[0] and S[1], a latch (D, Q), and output out]
[Figure: a Top Module with Inputs and Outputs, built from instances of the sub-modules M1, M2, and M3]
3. always block:
reg out;
always @(In1, In2)
out = (In1 & In2);
endmodule
defparam stage1.n = 2;
RippleCarryAdderI stage1 (.Cin(C), .X(X[4:3]), .Y(Y[4:3]), .S(S[4:3]), .Cout(Cout));
endmodule
defparam stage0.length = 6;
M1 stage0 (IN[0], IN[1], w1, w2);
defparam stage1.length = 3;
M1 stage1 (.in1(w1), .in2(IN[2]), .out2(w3), .out1(OUT[2]));
endmodule
A function can have multiple inputs but has no output ports; it returns a single value through its own name
function my4_to_1MUX;
input [0:3] W;
input [1:0] s;
if (s==0) my4_to_1MUX = W[0];
else if (s==1) my4_to_1MUX = W[1];
else if (s==2) my4_to_1MUX = W[2];
else if (s==3) my4_to_1MUX = W[3];
endfunction

always @ (W, S)
begin
M[0] = my4_to_1MUX(W[0:3],S[1:0]);
M[1] = my4_to_1MUX(W[4:7],S[1:0]);
M[2] = my4_to_1MUX(W[8:11],S[1:0]);
M[3] = my4_to_1MUX(W[12:15],S[1:0]);
if (S[3:2]==0) Out = M[0];
else if (S[3:2]==1) Out = M[1];
else if (S[3:2]==2) Out = M[2];
else if (S[3:2]==3) Out = M[3];
end
endmodule
A task can only be called from inside an always (or initial) block
task MUX4_to_1;
input [0:3] W;
input [1:0] s;
output Result;
begin
if (s==0) Result = W[0];
else if (s==1) Result = W[1];
else if (s==2) Result = W[2];
else if (s==3) Result = W[3];
end
endtask
always @ (W, S)
begin
MUX4_to_1(W[0:3], S[1:0], M[0]);
MUX4_to_1(W[4:7], S[1:0], M[1]);
MUX4_to_1(W[8:11], S[1:0], M[2]);
MUX4_to_1(W[12:15], S[1:0], M[3]);
MUX4_to_1(M[0:3], S[3:2], Out);
end
endmodule
always @ (s0, s1, d0, d1)
begin
Q = 0;
if (s1) Q = d1;
else if (s0) Q = d0;
end
[Figure: equivalent circuit of multiplexers selecting among 0, d0, and d1 under s0 and s1 to produce Q]
None of the above infers a latch. Why?
Example: Up & Down Counters
4-Bit unsigned down-counter with synchronous set:
module D_counter (C, S, Q);
input C, S;
output [3:0] Q;
reg [3:0] tmp;
always @(posedge C)
begin
if (S)
tmp <= 4'b1111;
else
tmp <= tmp - 1'b1;
end
assign Q = tmp;
endmodule

4-Bit up-counter with asynchronous reset and modulo maximum:
module U_counter (C, CLR, Q);
parameter
MAX_SQRT = 4,
MAX = (MAX_SQRT*MAX_SQRT);
input C, CLR;
output [MAX_SQRT-1:0] Q;
reg [MAX_SQRT-1:0] cnt;
always @ (posedge C or posedge CLR)
begin
if (CLR)
cnt <= 0;
else
cnt <= (cnt + 1) %MAX;
end
assign Q = cnt;
endmodule
[Figure: two state machine block diagrams, each with Input → Comb. Logic → Next State (NS) → Flip Flops (FFs) → Current State (CS) → Comb. Logic → Output]
Output
assign ……………
Calculation
2. Half-duplex communication:
3. Bus multiplexing:
[Figure: 8-bit buses a and b, select signal s, and 8-bit output Out[7:0]]
endmodule
input Si, L, R;
input [7:0] In;
output [7:0] Out;
The length of WI and WF are calculated based on the dynamic range of variables
Total length: WI + WF
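[Figure: word layout with the Sign Bit as the MSB, an integer part of width WI, and a fractional part of width WF]
Sign Bit = 0: positive, 1: negative
Good to represent quantized numbers in the range: $[-2^{W_I-1},\ 2^{W_I-1} - 2^{-W_F}]$
Resolution: $\frac{1}{2^{W_F}}$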
Example:
in (3,3) 011101 represents 3.625 (smallest number: 0.125)
in (3,5) 10111000 represents -2.25 (smallest number: 0.03125)
Matlab rounding:
round(∙): towards nearest integer
Pos. and neg. numbers are rounded symmetrically about zero
Generally the best possible rounding algorithm
fix(∙): truncates towards zero
Pos. and neg. numbers are rounded symmetrically about zero
floor(∙): rounds towards negative infinity
ceil(∙): rounds towards positive infinity
Signed magnitude
Positive and negative numbers both truncate towards zero
Matlab fix(∙)
[Figure: converting a number A in (WI, WF) format into a number B with a different WI and WF]
Examples (integer bits . fractional bits):
(10,4) 0000110111.0100 → (7,2) 0110111.01
(10,4) 0001110111.0100 → (7,2) 0111111.11 (saturated)
(10,4) 1111110111.0100 → (6,3) 110111.010
(10,4) 1101110111.0100 → (6,3) 100000.000 (saturated)
[Figure: extending A in (WI, WF) format to a longer B by repeating the sign bit on the MSB side and appending zeros on the LSB side]
Example: (6,3) 110111010 → (10,4) 11111101110100
assign B = {{n{A[WI+WF-1]}}, A, 2'b0};
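For the (6,3) to (10,4) example above, the extension needs four extra integer bits and one extra fractional bit. A minimal sketch follows; the module name and port widths are assumptions derived from those two formats.

```verilog
// (6,3) -> (10,4): A is 9 bits wide, B is 14 bits wide.
module ext_6_3_to_10_4 (input [8:0] A, output [13:0] B);
  // 4 copies of the sign bit A[8], then A itself, then one zero LSB
  assign B = {{4{A[8]}}, A, 1'b0};
endmodule
```

With A = 110111010 this gives B = 11111101110100, matching the example.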
Examples: Adding two numbers with different lengths:
wire [2:0] A;
wire [5:0] B;
wire [6:0] C;
assign C = {B[5],B} + {{4{A[2]}},A};
C[n-1:0] = A + B
Overflow happens if:
A[n-1]==1 and B[n-1]==1 and C[n-1]==0, e.g., 1010 + 1001 = 1 0011
A[n-1]==0 and B[n-1]==0 and C[n-1]==1, e.g., 0110 + 0111 = 1101
assign C = B + A;
assign OV = (A[n-1]==1 && B[n-1]==1 && C[n-1]==0)||
(A[n-1]==0 && B[n-1]==0 && C[n-1]==1);
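Wrapped into a complete module, the overflow check could look like the sketch below. The module name and the default width n = 4 are assumptions chosen to match the 4-bit examples.

```verilog
// n-bit two's-complement adder with overflow flag.
module add_ov #(parameter n = 4)
  (input [n-1:0] A, input [n-1:0] B,
   output [n-1:0] C, output OV);
  assign C = A + B;   // the carry out of bit n-1 is discarded
  // Overflow: both operands have the same sign but the result sign differs.
  assign OV = ( A[n-1] &  B[n-1] & ~C[n-1]) |
              (~A[n-1] & ~B[n-1] &  C[n-1]);
endmodule
```

For A = 0110 and B = 0111 the sum is C = 1101 with OV = 1; for A = 1010 and B = 1001 the sum is C = 0011 with OV = 1.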
$N = 2^{Exp-127} \times M$
The most widely used form of floating-point is IEEE Standard for Binary
Floating-Point Arithmetic (IEEE 754) with two major formats:
Single-precision (32-bit)
Double-precision (64-bit)
A = $unsigned(B)
Zero-fills B and assigns it to A
Applies when bit width(B) < bit width(A)
Example:
wire [5:0] A;
assign A = $unsigned(3'b110);    // A = 000110
Wrong otherwise:
wire [2:0] A, B;
wire [3:0] SUM;
assign SUM = B + A;
  110 (-2)
+ 010 (+2)
= 1000 (-8) (Wrong)

wire [2:0] A, B;
wire Cin;
wire [3:0] SUM;
assign SUM = {B[2],B} + {A[2],A} + Cin;
  1110 (-2)
+ 0010 (+2)
+ 0001 Cin
= 10001 (1), discard overflow

2. Using signed signals
wire signed [2:0] A, B;
wire Cin;
wire signed [3:0] SUM;
assign SUM = B + A + $signed({1'b0,Cin});
Same result

Incorrect:
wire signed [2:0] A, B;
wire Cin;
wire signed [3:0] SUM;
assign SUM = B + A + $signed(Cin);
When Cin=1, $signed(Cin) sign-extends it to match the size of A and B, which is incorrect!
  1110 (-2)
+ 0010 (+2)
+ 1111 Cin
= 1111 (-1)
Complicated!
Incorrect:
wire signed [2:0] A;
wire [2:0] B;
wire signed [5:0] PROD;
assign PROD = A*$signed(B);

wire signed [2:0] A;
wire [2:0] B;
wire signed [5:0] PROD;
assign PROD = A*B;
56 = 32 + 16 + 8:
input [7:0] in;
wire [13:0] product;
assign product =
{in[7], in, 5'b00000}
+ {in[7], in[7], in, 4'b0000}
+ {in[7], in[7], in[7], in, 3'b000};

56 = 64 - 8:
input [7:0] in;
wire [13:0] product;
assign product =
{in, 6'b000000}
- {in[7], in[7], in[7], in, 3'b000};
[Figure: Constant Multiplier (C.M.) based computation of P = a×b, using shifts (<<1, <<2, <<3) and multiplexers selected by the bits b[1] to b[4], from LSB to MSB]
[Figure: complex multiplier datapaths forming the products ac and bd and the sums ac+bd and ac-bd (Real part), with a c+d term in the first version]
[Figure: partial-product array for multiplying x3 x2 x1 x0 by itself]
Diagonals (x0 x0, x1 x1, …) can be replaced by the single input bit with
no computation for that bit b/c we have x0 AND x0= x0
[Figure: the same partial-product array for x3 x2 x1 x0 after simplifying the diagonal terms]
General Architecture:
[Figure: a Data Path block with inputs and outputs (bus widths n and m) and a Controller block, both driven by Clk]
Data Path:
Transfers input data signals into outputs
Normally combinational logic or counters
Controller:
Provides the control signals that determine the direction of data flow
Examples: Reset, set, MUX select signals, …
Sequential logic
HDL Code:
Control Path:
1. a_r * b_r → pp1_reg
2. a_i * b_i → pp2_reg
3. pp1 – pp2 → p_r_reg; a_r * b_i → pp1_reg
4. a_i * b_r → pp2_reg
5. pp1 + pp2 → p_i_reg
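The five control steps above can be sketched as a small controller sharing one multiplier. This is only an illustration under stated assumptions: the module name, the start and done handshake, the parameter W, the active-high synchronous Reset, and the requirement that the inputs stay stable during steps 1 to 4 are not from the slides.

```verilog
// Sequential complex multiplier: (a_r + j a_i) * (b_r + j b_i)
// using one shared multiplier and the five steps of the control path.
module cmult_seq #(parameter W = 8)
  (input Clk, input Reset, input start,
   input signed [W-1:0] a_r, a_i, b_r, b_i,
   output reg signed [2*W:0] p_r, p_i,
   output reg done);

  reg [2:0] step;                      // 0 = idle, 1..5 = control steps
  reg signed [2*W-1:0] pp1, pp2;       // partial-product registers
  reg signed [W-1:0] x, y;             // operands of the shared multiplier
  wire signed [2*W-1:0] prod = x * y;

  // Data path: operand selection for the multiplier (combinational)
  always @(*) begin
    case (step)
      3'd1:    begin x = a_r; y = b_r; end
      3'd2:    begin x = a_i; y = b_i; end
      3'd3:    begin x = a_r; y = b_i; end
      3'd4:    begin x = a_i; y = b_r; end
      default: begin x = 0;   y = 0;   end
    endcase
  end

  // Controller: steps through the sequence and loads the registers
  always @(posedge Clk) begin
    if (Reset) begin
      step <= 3'd0;
      done <= 1'b0;
    end else begin
      done <= 1'b0;
      case (step)
        3'd0: if (start) step <= 3'd1;
        3'd1: begin pp1 <= prod; step <= 3'd2; end
        3'd2: begin pp2 <= prod; step <= 3'd3; end
        3'd3: begin p_r <= pp1 - pp2; pp1 <= prod; step <= 3'd4; end
        3'd4: begin pp2 <= prod; step <= 3'd5; end
        3'd5: begin p_i <= pp1 + pp2; done <= 1'b1; step <= 3'd0; end
        default: step <= 3'd0;
      endcase
    end
  end
endmodule
```

In step 3 the subtraction reads the old value of pp1 while pp1 is reloaded, which works because both assignments are non-blocking.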
[Figure: single-port synchronous RAM, mem (SYNC RAM). Signals: Clk, cs, we, oe, address, data (DATAIN[7:0]), data_out (DATAOUT[7:0]), RADDR[7:0], WADDR[7:0], EN. Write is enabled by cs&we; Read is enabled by cs&!we&oe]
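A behavioral description consistent with the signals in the figure might look like the sketch below. The parameter names, the registered read, and the memory depth of 2^AW words are assumptions.

```verilog
// Single-port synchronous RAM: write on cs&we, read on cs&!we&oe.
module mem #(parameter DW = 8, AW = 8)
  (input Clk, input cs, input we, input oe,
   input [AW-1:0] address,
   input [DW-1:0] data,
   output reg [DW-1:0] data_out);

  reg [DW-1:0] ram [0:(1<<AW)-1];   // storage array

  always @(posedge Clk) begin
    if (cs & we)
      ram[address] <= data;          // synchronous write
    if (cs & !we & oe)
      data_out <= ram[address];      // synchronous (registered) read
  end
endmodule
```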
[Figure: dual-port synchronous RAM, mem_dual (SYNC RAM). Port 0 uses cs0, we0, oe0, address0, data0, and data_out0; port 1 uses cs1, we1, oe1, address1, data1, and data_out1; both share Clk. For each port, write is enabled by cs & we and read by cs & !we & oe]
Masked ROM
Data manufactured into the ROM
When there are multiple assignments to the same variable in an always block,
only the last assignment takes effect
Example:
module DUT(Count);
output reg [2:0] Count;
integer k;
always @ (*)
begin
Count <= 0;
for (k=0; k<4; k=k+1)
Count <= Count + k;
end
endmodule

is equivalent to:
module DUT(Count);
output reg [2:0] Count;
integer k;
always @ (*)
Count <= Count + 3;
endmodule
[Figure: Counter with increment 3]
Therefore, to avoid a mismatch between simulation and the synthesized version, the sensitivity
list of an always block should include all the signals on the RHS
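A small sketch of the problem, with hypothetical module names: both blocks synthesize to an AND gate, but only the second one simulates like one.

```verilog
// Incomplete sensitivity list: b is read on the RHS but not listed.
// In simulation y is not updated when only b changes.
module and_incomplete (input a, input b, output reg y);
  always @(a)
    y = a & b;
endmodule

// Complete sensitivity list: simulation and synthesis agree.
module and_complete (input a, input b, output reg y);
  always @(*)          // equivalent here to: always @(a, b)
    y = a & b;
endmodule
```

Using always @(*) lets the tool build the list from the RHS signals automatically.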
HDL:
wire na, nb;
assign y = na|nb;
[Figure: Result: gates with inputs a, b, and s producing na = a&~s and nb, combined into y]

Illegal (only used for tri-state implementation):
wire na;
assign na = b&s;
assign na = a&~s;