Pipeline Hazards
• Introduction
– Defining Pipelining
– Pipelining Instructions
• Hazards
– Structural Hazards
– Data Hazards
– Control Hazards
• Performance
Executing Multiple Instructions
[Figure, repeated for each clock cycle: MIPS datapath (PC, instruction memory, register file, ALU, data memory, sign extend, multiplexers) with the instructions currently in the pipeline labeled above it, newest on the left.]
• Clock Cycle 1: LW
• Clock Cycle 2: SW LW
• Clock Cycle 3: ADD SW LW
• Clock Cycle 4: SUB ADD SW LW
• Clock Cycle 5: SUB ADD SW LW
• Clock Cycle 6: SUB ADD SW
• Clock Cycle 7: SUB ADD
• Clock Cycle 8: SUB
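The cycle-by-cycle walk-through can be reproduced with a small simulation. This is a minimal sketch, assuming the classic five stages (IF, ID, EX, MEM, WB) and an ideal pipeline with no stalls; the function name pipeline_occupancy is hypothetical and used only for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # assumed five-stage pipeline


def pipeline_occupancy(program, stages=STAGES):
    """Return {cycle: {stage: instruction}} for an ideal pipeline (no stalls)."""
    total_cycles = len(program) + len(stages) - 1
    table = {}
    for cycle in range(1, total_cycles + 1):
        row = {}
        for s, stage in enumerate(stages):
            i = cycle - 1 - s  # index of the instruction occupying this stage
            if 0 <= i < len(program):
                row[stage] = program[i]
        table[cycle] = row
    return table


program = ["LW", "SW", "ADD", "SUB"]  # issue order used on the slides
table = pipeline_occupancy(program)
for cycle, row in table.items():
    cells = " ".join(f"{stage}:{row[stage]}" for stage in STAGES if stage in row)
    print(f"Clock Cycle {cycle}: {cells}")
```

Under these assumptions the four instructions finish in 4 + 5 - 1 = 8 cycles, and cycle 4 is the first cycle in which four instructions are in flight at once.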
One Memory Port Structural Hazards
Alternative View - Multicycle Diagram
Instr. order   CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8
Load           Ifetch  Reg     ALU     DMem    Reg
Instr 1                Ifetch  Reg     ALU     DMem    Reg
Instr 2                        Ifetch  Reg     ALU     DMem    Reg
Instr 3                                Ifetch  Reg     ALU     DMem    Reg
With a single memory port, the Load's DMem access and Instr 3's Ifetch need the memory in the same cycle (CC 4).
One Memory Port Structural Hazards
Resolving the conflict with a stall:
Instr. order   CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8    CC 9
Load           Ifetch  Reg     ALU     DMem    Reg
Instr 1                Ifetch  Reg     ALU     DMem    Reg
Instr 2                        Ifetch  Reg     ALU     DMem    Reg
Stall                                  Bubble  Bubble  Bubble  Bubble  Bubble
Instr 3                                        Ifetch  Reg     ALU     DMem    Reg
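The one-port conflict can be checked mechanically. The sketch below assumes a five-stage pipeline in which instruction i (counting from 0) is in Ifetch in cycle i + 1 and in DMem in cycle i + 4, so a memory instruction collides with the fetch of the instruction three places behind it. The function name memory_conflicts is hypothetical, and the check looks only at the ideal schedule before any stall is inserted.

```python
MEMORY_OPS = ("lw", "sw")  # instructions that use the data memory stage


def memory_conflicts(program):
    """Return (cycle, fetching_index, accessing_index) for each cycle in which
    one instruction needs Ifetch while an earlier lw/sw needs DMem."""
    conflicts = []
    for j in range(len(program)):
        fetch_cycle = j + 1  # instruction j is fetched in cycle j + 1
        i = j - 3            # instruction that is in DMem during that cycle
        if i >= 0 and program[i] in MEMORY_OPS:
            conflicts.append((fetch_cycle, j, i))
    return conflicts


# Load followed by three non-memory instructions, as in the diagram.
print(memory_conflicts(["lw", "add", "add", "add"]))
```

For the program above, the only conflict reported is in cycle 4, between the load (index 0) and the fourth instruction (index 3).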
Structural Hazards
We want to compare the performance of two machines. Which machine is faster?
• Machine A: Dual ported memory - so there are no memory stalls
• Machine B: Single ported memory, but its pipelined implementation has a clock
rate that is 1.05 times faster
Assume:
• Ideal CPI = 1 for both
• Loads are 40% of instructions executed
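The comparison can be worked through numerically. This is a sketch under one assumption that the slide does not state: each load on Machine B costs exactly one stall cycle because of the single memory port. The function name avg_instruction_time is hypothetical.

```python
def avg_instruction_time(ideal_cpi, stall_cycles_per_instr, clock_period):
    """Average time per instruction = (ideal CPI + stall CPI) * clock period."""
    return (ideal_cpi + stall_cycles_per_instr) * clock_period


load_fraction = 0.40        # from the slide: loads are 40% of instructions
stall_per_load = 1          # ASSUMPTION: one stall cycle per load on Machine B
period_a = 1.0              # Machine A clock period (arbitrary unit)
period_b = period_a / 1.05  # Machine B clock rate is 1.05 times faster

time_a = avg_instruction_time(1.0, 0.0, period_a)
time_b = avg_instruction_time(1.0, load_fraction * stall_per_load, period_b)
ratio = time_b / time_a

print(f"Machine A: {time_a:.3f}  Machine B: {time_b:.3f}  B/A: {ratio:.2f}")
```

With that assumption, Machine B takes 1.4 / 1.05, or about 1.33, times as long per instruction, so Machine A is faster despite its slower clock.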
Speed Up Equations for Pipelining
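The equations on this slide did not survive extraction. The form below is the one commonly given in architecture textbooks; it is offered as an assumption about what the slide showed, not as a recovery of the original.

```latex
\[
\text{Speedup} =
  \frac{\text{Ideal CPI} \times \text{Pipeline depth}}
       {\text{Ideal CPI} + \text{Pipeline stall CPI}}
  \times
  \frac{\text{Clock cycle}_{\text{unpipelined}}}
       {\text{Clock cycle}_{\text{pipelined}}}
\]
% With Ideal CPI = 1 this reduces to
\[
\text{Speedup} =
  \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}}
  \times
  \frac{\text{Clock cycle}_{\text{unpipelined}}}
       {\text{Clock cycle}_{\text{pipelined}}}
\]
```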
[Figure: pipeline timing diagram, time in clock cycles]
                       CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:  10    10    10    10    10/-20  -20   -20   -20   -20
Program execution order (in instructions):
sub $2, $1, $3         IM    Reg   ALU   DM    Reg
The use of the result of the SUB instruction in the next three instructions causes a
data hazard, since the register $2 is not written until after those instructions read it.
Data Hazards
Execution Order is: InstrI, then InstrJ
Read After Write (RAW)
InstrJ tries to read operand before InstrI writes it
I: add r1,r2,r3
J: sub r4,r1,r3
Data Hazards
Execution Order is: InstrI, then InstrJ
Write After Read (WAR)
InstrJ tries to write operand before InstrI reads it
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
• Called an “anti-dependence” by compiler writers.
This results from reuse of the name “r1”.
Data Hazards
Execution Order is: InstrI, then InstrJ
Write After Write (WAW)
InstrJ tries to write operand before InstrI writes it
– Leaves wrong result (InstrI's value, not InstrJ's)
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7
• Called an “output dependence” by compiler writers.
This also results from the reuse of the name “r1”.
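The three dependence kinds can be told apart by comparing destination and source registers. The following is a minimal sketch; the Instr tuple and the dependences function are hypothetical names introduced for illustration, and the check covers register operands only.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str              # register written
    srcs: Tuple[str, ...]  # registers read


def dependences(first, second):
    """Return the hazard kinds `second` has with respect to the earlier `first`."""
    kinds = set()
    if first.dest in second.srcs:
        kinds.add("RAW")  # second reads what first writes
    if second.dest in first.srcs:
        kinds.add("WAR")  # second writes what first reads (anti-dependence)
    if first.dest == second.dest:
        kinds.add("WAW")  # both write the same register (output dependence)
    return kinds


# The I/J pairs from the slides.
raw = dependences(Instr("add", "r1", ("r2", "r3")), Instr("sub", "r4", ("r1", "r3")))
war = dependences(Instr("sub", "r4", ("r1", "r3")), Instr("add", "r1", ("r2", "r3")))
waw = dependences(Instr("sub", "r1", ("r4", "r3")), Instr("add", "r1", ("r2", "r3")))
print(raw, war, waw)
```

Only RAW reflects a true flow of data; WAR and WAW come from reusing a register name, which is why renaming the register removes them.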
Data Hazard Detection in MIPS (1)
Read after Write
Time (in clock cycles)
                       CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:  10    10    10    10    10/-20  -20   -20   -20   -20
Program execution order (in instructions):
sub $2, $1, $3         IM    Reg   ALU   DM    Reg
Pipeline registers between stages: IF/ID, ID/EX, EX/MEM, MEM/WB
Time               0      2      4      6      8      10     12     14
add $s0,$t0,$t1    IF     ID     EX     MEM    WB
STALL                     BUBBLE BUBBLE BUBBLE BUBBLE BUBBLE
STALL                            BUBBLE BUBBLE BUBBLE BUBBLE BUBBLE
sub $t2,$s0,$t3                         IF     ID     EX     MEM    WB
$s0 is written in the WB stage of add (W) and read in the ID stage of sub (R).
Data Hazards - Forwarding
• Key idea: connect new value directly to next stage
• Still read s0, but ignore in favor of new result
Time               0      2      4      6      8      10
add $s0,$t0,$t1    IF     ID     EX     MEM    WB
sub $t2,$s0,$t3           IF     ID     EX     MEM    WB
The new value of s0 is forwarded from the output of add's EX stage to the input of sub's EX stage.
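A forwarding unit decides, for each ALU source register, whether to use the value read from the register file or a newer result still in the pipeline. The sketch below follows the usual textbook condition; it is an assumption about the hardware rather than something the slides specify. The function name forward_select and the dictionary fields reg_write and rd are hypothetical.

```python
def forward_select(src_reg, ex_mem, mem_wb):
    """Pick the source of an ALU operand for register number `src_reg`.

    ex_mem / mem_wb describe the instructions one and two stages ahead:
    dicts with 'reg_write' (bool) and 'rd' (destination register number).
    Register 0 ($zero) is never forwarded.
    """
    if src_reg != 0 and ex_mem["reg_write"] and ex_mem["rd"] == src_reg:
        return "EX/MEM"   # newest value: result of the previous instruction
    if src_reg != 0 and mem_wb["reg_write"] and mem_wb["rd"] == src_reg:
        return "MEM/WB"   # value from two instructions earlier
    return "REGFILE"      # no hazard: use the value read in ID


S0, T3 = 16, 11  # MIPS register numbers for $s0 and $t3

# sub $t2,$s0,$t3 is in EX while add $s0,$t0,$t1 sits in the EX/MEM register.
add_in_ex_mem = {"reg_write": True, "rd": S0}
idle = {"reg_write": False, "rd": 0}
print(forward_select(S0, add_in_ex_mem, idle))  # operand $s0
print(forward_select(T3, add_in_ex_mem, idle))  # operand $t3
```

The EX/MEM check comes first so that, when both pipeline registers hold a write to the same register, the more recent result wins.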
Data Hazards - Forwarding
• A STALL is still required for a load - the data is available only after MEM
• The MIPS architecture calls this a delayed load; initial
implementations required the compiler to deal with it
Time               0      2      4      6      8      10     12
lw $s0,20($t1)     IF     ID     EX     MEM    WB
STALL                     BUBBLE BUBBLE BUBBLE BUBBLE BUBBLE
sub $t2,$s0,$t3                  IF     ID     EX     MEM    WB
The new value of s0 is forwarded from the output of lw's MEM stage to the input of sub's EX stage.
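The load-use case can be detected in the decode stage by comparing the load's destination with the sources of the instruction behind it. This is a sketch of the usual textbook check, given as an assumption rather than as the slides' own logic; the function and parameter names are hypothetical.

```python
def needs_load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in EX is a load whose destination (rt)
    is a source register of the instruction currently in ID."""
    return bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))


S0, T0, T1, T3 = 16, 8, 9, 11  # MIPS register numbers

# lw $s0,20($t1) in EX, sub $t2,$s0,$t3 in ID: one bubble is required.
print(needs_load_use_stall(True, S0, S0, T3))
# add $s0,$t0,$t1 in EX instead of a load: forwarding alone is enough.
print(needs_load_use_stall(False, S0, S0, T3))
```

When the check is true, the hardware holds the PC and the IF/ID register for one cycle and sends a bubble down the pipeline, which matches the single STALL row in the diagram.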
Control Hazards
A control hazard arises when the pipeline must fetch the next instruction before a
branch has been resolved, so the address of the next instruction is not yet known.
Control Hazards
[Figure: pipeline timing diagram. After beq, one STALL (a bubble in every stage) is inserted; sw $s4,200($t5) then proceeds through IF ID EX MEM WB. Annotations: "beq writes PC here" and "new PC used here".]
Control Hazard - Correct Prediction
[Figure: pipeline timing diagram. The instruction at tgt, sw $s4,200($t5), is fetched assuming the branch is taken and proceeds through IF ID EX MEM WB with no stall.]
Control Hazard - Incorrect Prediction
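[Figure: pipeline timing diagram. The instruction at tgt, sw $s4,200($t5), is fetched (IF) assuming the branch is taken; the prediction is incorrect, so it is "squashed" and its remaining stages become bubbles. or $r8,$r8,$r9 then proceeds through IF ID EX MEM WB.]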
Summary - Control Hazard Solutions
• Stall - stop fetching instructions until the branch
result is available
– Significant performance penalty
– Hardware required to stall
• Predict - assume an outcome and continue
fetching (undo if prediction is wrong)
– Performance penalty only when guess wrong
– Hardware required to "squash" instructions
• Delayed branch - specify in architecture that
following instruction is always executed
– Compiler re-orders instructions into delay slot
– Insert "NOP" (no-op) instructions when the delay slot can't be filled
– This is how original MIPS worked
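The three solutions can be compared with a simple CPI model. Every number below is hypothetical and chosen only to illustrate the arithmetic: the branch frequency, the one-cycle penalty, the misprediction rate, and the fraction of delay slots left as NOPs are all assumptions, as is the function name effective_cpi.

```python
def effective_cpi(base_cpi, branch_fraction, penalty_cycles, penalty_rate):
    """CPI = base CPI + (branches per instruction) * (cycles lost per
    penalized branch) * (fraction of branches that pay the penalty)."""
    return base_cpi + branch_fraction * penalty_cycles * penalty_rate


BRANCH_FRACTION = 0.20  # HYPOTHETICAL: 20% of instructions are branches
PENALTY = 1             # HYPOTHETICAL: one cycle lost per penalized branch

# Stall: every branch pays the penalty.
stall_cpi = effective_cpi(1.0, BRANCH_FRACTION, PENALTY, 1.0)
# Predict: only wrong guesses pay (HYPOTHETICAL 10% misprediction rate).
predict_cpi = effective_cpi(1.0, BRANCH_FRACTION, PENALTY, 0.10)
# Delayed branch: a cycle is wasted only when the slot holds a NOP
# (HYPOTHETICAL: the compiler fails to fill half of the slots).
delayed_cpi = effective_cpi(1.0, BRANCH_FRACTION, PENALTY, 0.50)

print(f"stall {stall_cpi:.2f}  predict {predict_cpi:.2f}  delayed {delayed_cpi:.2f}")
```

The model makes the trade-off in the summary explicit: stalling charges every branch, while prediction and delayed branches charge only the branches that are guessed wrong or whose slot cannot be filled.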