Pipelining Became a Universal Technique in 1985
Techniques for Improving ILP
- Loop unrolling
- Basic pipeline scheduling
- Dynamic scheduling, scoreboarding, register renaming
- Dynamic memory disambiguation
- Dynamic branch prediction
- Multiple instruction issue per cycle
- Software and hardware techniques
Loop-Level Parallelism
• ILP within a basic block is limited!
- Average basic-block size is 6-7 instructions
- These instructions may depend on one another
• Loop-level parallelism (LLP)
- Loops are easy to unroll, statically or dynamically
- Can use SIMD (vector processors and GPUs)
ILP
• ILP is increased by exploiting parallelism among the iterations of a loop
Ex:
for (i = 0; i <= 999; i = i + 1)
    x[i] = x[i] + y[i];
• Each iteration is independent of the others, so the iterations can run in parallel
• Techniques such as loop unrolling convert LLP into ILP
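The conversion from LLP to ILP can be sketched with the loop from this slide. This is a minimal illustration, not a prescribed method: the unroll factor of 4, the function names add_rolled and add_unrolled4, and the assumption that x and y do not overlap are all choices made here for the example.

```c
#include <assert.h>
#include <stddef.h>

#define N 1000  /* matches the slide's loop bounds 0..999 */

/* Original loop: one add per iteration, plus loop overhead every time. */
void add_rolled(double *x, const double *y) {
    for (size_t i = 0; i < N; i = i + 1)
        x[i] = x[i] + y[i];
}

/* Unrolled by 4 (an assumed factor). The four adds in the body do not
   depend on one another, so they can be scheduled to overlap, and the
   loop test and increment run a quarter as often. */
void add_unrolled4(double *x, const double *y) {
    assert(N % 4 == 0);  /* no cleanup loop is needed when 4 divides N */
    for (size_t i = 0; i < N; i += 4) {
        x[i]     = x[i]     + y[i];
        x[i + 1] = x[i + 1] + y[i + 1];
        x[i + 2] = x[i + 2] + y[i + 2];
        x[i + 3] = x[i + 3] + y[i + 3];
    }
}
```

Both functions compute the same result. When the trip count is not a multiple of the unroll factor, a short cleanup loop for the leftover iterations would be needed.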
Hazards & Stalls
Structural Hazards
– Cause: resource contention
– Solution: add more resources & better scheduling
Data Hazards
– Cause: dependences
    • True data dependence (a property of the program): RAW
    • Name dependence (reuse of registers): WAR & WAW
– Solution: loop unrolling, dynamic scheduling, register renaming, hardware speculation
Control Hazards
– Cause: branch instructions, change of program flow
– Solution: loop unrolling, branch prediction, hardware speculation
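The three data-hazard classes named above (RAW, WAR, WAW) can be told apart by comparing the registers two instructions read and write. The sketch below assumes a hypothetical three-register instruction format; the names Instr, classify, and the DEP_* flags are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-register instruction: dst = src1 op src2. */
typedef struct {
    int dst;
    int src1;
    int src2;
} Instr;

enum { DEP_RAW = 1, DEP_WAR = 2, DEP_WAW = 4 };

/* True if the instruction reads the given register. */
static bool reads(Instr in, int reg) {
    return in.src1 == reg || in.src2 == reg;
}

/* Returns a bit mask of the hazards that `later` has with `earlier`,
   where `earlier` comes first in program order. */
int classify(Instr earlier, Instr later) {
    assert(earlier.dst >= 0 && later.dst >= 0);  /* register numbers are non-negative */
    int mask = 0;
    if (reads(later, earlier.dst)) mask |= DEP_RAW;  /* true data dependence */
    if (reads(earlier, later.dst)) mask |= DEP_WAR;  /* antidependence (name) */
    if (earlier.dst == later.dst)  mask |= DEP_WAW;  /* output dependence (name) */
    return mask;
}
```

For example, with earlier = {1, 2, 3} (R1 = R2 op R3) and later = {4, 1, 5} (R4 = R1 op R5), classify reports only DEP_RAW, because the later instruction reads the R1 that the earlier one writes.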
1. Data Dependence
• Loop-Level Parallelism
– Unroll loop statically or dynamically
– Use SIMD (vector processors and GPUs)
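• Challenges:
– Data dependency
• Instruction j is data dependent on instruction i if instruction i produces a result that may be used by instruction j, or if j is data dependent on an instruction k that is itself data dependent on i
• Two ways to handle a dependence:
- Maintain dependence, but avoid stalls
- Eliminate dependence by code transformation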
Example
[Figure: instruction sequence of five instructions, showing the data dependences for floating-point data and for integer data]
2. Name Dependence
• Two instructions use the same name (register or memory location), but no data flows between them
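Because a name dependence carries no data, giving each write a fresh name removes it. The following is a minimal sketch of register renaming under stated assumptions: a hypothetical three-register instruction format, 8 architectural registers, and an unlimited supply of new physical register numbers. The names Instr, NUM_ARCH, and rename_regs are invented for this illustration and do not describe any particular processor.

```c
#include <assert.h>

#define NUM_ARCH 8  /* assumed number of architectural registers */

/* Hypothetical three-register instruction: dst = src1 op src2. */
typedef struct {
    int dst;
    int src1;
    int src2;
} Instr;

/* Renames n instructions in place and returns the next unused
   physical register number. */
int rename_regs(Instr *code, int n) {
    int map[NUM_ARCH];         /* architectural -> current physical name */
    int next_phys = NUM_ARCH;  /* fresh names start after the initial ones */

    for (int r = 0; r < NUM_ARCH; r++)
        map[r] = r;            /* identity mapping at the start */

    for (int k = 0; k < n; k++) {
        assert(code[k].dst >= 0 && code[k].dst < NUM_ARCH);
        assert(code[k].src1 >= 0 && code[k].src1 < NUM_ARCH);
        assert(code[k].src2 >= 0 && code[k].src2 < NUM_ARCH);

        /* Sources are looked up first, so they see the most recent
           earlier write: true (RAW) dependences are preserved. */
        int s1 = map[code[k].src1];
        int s2 = map[code[k].src2];

        /* Every write gets a fresh name, so it can no longer collide
           with earlier reads (WAR) or earlier writes (WAW). */
        map[code[k].dst] = next_phys;
        code[k].src1 = s1;
        code[k].src2 = s2;
        code[k].dst = next_phys++;
    }
    return next_phys;
}
```

After renaming, an instruction that reused R1 only as a name writes a different physical register than the earlier producer of R1, while a consumer of R1 still reads the producer's new name.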
Control Dependence
An example:
T1;
if p1 {
    S1;
}
if p2 {
    S2;
}
● Statement S1 is control-dependent on p1, but T1 is not
● Statement S2 is control-dependent on p2, but not on p1
Example 2:
DADDU R1,R2,R3
BEQZ R4,L
DSUBU R1,R1,R6
L: …
OR R7,R1,R8
• The OR instruction is data dependent on both DADDU and DSUBU; which one supplies R1 depends on the branch outcome
• Data flow must be preserved
Example 3:
DADDU R1,R2,R3
BEQZ R12,skip
DSUBU R4,R5,R6
DADDU R5,R4,R9
skip: OR R7,R8,R9
• Violating control dependence does not affect data flow or exception behavior
• Assume R4 isn't used after skip
– Possible to move DSUBU before the branch
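The claim in Example 3 can be checked with a small register-level model. This is a sketch under assumptions: registers are modeled as a 16-entry array of 64-bit unsigned values, the function names original, hoisted, and same_live_state are invented here, and R4 is treated as dead after skip, as the slide assumes. The unsigned subtract is modeled as wrapping arithmetic that raises no exception.

```c
#include <stdint.h>

/* Example 3 as written on the slide. */
void original(uint64_t r[16]) {
    r[1] = r[2] + r[3];      /* DADDU R1,R2,R3 */
    if (r[12] != 0) {        /* BEQZ R12,skip: jump to skip when R12 == 0 */
        r[4] = r[5] - r[6];  /* DSUBU R4,R5,R6 */
        r[5] = r[4] + r[9];  /* DADDU R5,R4,R9 */
    }
    r[7] = r[8] | r[9];      /* skip: OR R7,R8,R9 */
}

/* Same code with DSUBU moved above the branch. */
void hoisted(uint64_t r[16]) {
    r[1] = r[2] + r[3];      /* DADDU R1,R2,R3 */
    r[4] = r[5] - r[6];      /* DSUBU now runs on both paths */
    if (r[12] != 0) {        /* BEQZ R12,skip */
        r[5] = r[4] + r[9];  /* DADDU R5,R4,R9 */
    }
    r[7] = r[8] | r[9];      /* skip: OR R7,R8,R9 */
}

/* Compares every register except R4, which is assumed dead after skip.
   Returns 1 if the live state matches, 0 otherwise. */
int same_live_state(const uint64_t *a, const uint64_t *b) {
    for (int i = 0; i < 16; i++)
        if (i != 4 && a[i] != b[i])
            return 0;
    return 1;
}
```

Running both versions from the same initial registers gives matching live state whether or not the branch is taken. Only R4 can differ, and only on the taken path, which is why the move is safe under the slide's assumption.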