
Module IV

An overview of pipelining - Pipelined data path and control -
Structural hazards - Data hazards - Control hazards

PIPELINING
Overview of Pipelining
• Pipelining is an implementation technique in which multiple
instructions are overlapped in execution.

• Pipelining can be explained using the laundry analogy. Four steps are involved:
• Washing
• Drying
• Folding
• Putting away the clothes
Overview of Pipelining
[Figure: laundry timing diagrams - non-pipelined vs. pipelined]
Overview of Pipelining
• The washer, dryer, folder, and storer each take 30 minutes for their task.

• Sequential laundry takes 8 hours for 4 loads of wash, while pipelined laundry
takes just 3.5 hours.

• Pipelined laundry is potentially four times faster than non-pipelined laundry.

• If all the stages take about the same amount of time and there is enough work
to do, then the speed-up due to pipelining is equal to the number of stages.
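
A quick check of these figures, worked from the 30-minutes-per-step assumption above:

  Sequential: 4 loads x 4 steps x 30 min = 480 min = 8 hours
  Pipelined:  (4 steps + 3 extra loads) x 30 min = 210 min = 3.5 hours
  Speed-up:   480 / 210 is about 2.3 for 4 loads; it approaches 4 as the number of loads grows.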
Overview of Pipelining
• The same principles apply to processors where we pipeline instruction
execution.

• Pipelining doesn't reduce the time taken to complete a single task.

• But it enables tasks to be executed in a parallel or overlapped manner.

• It reduces the total time taken to complete a set of tasks.

• So the throughput of the system is increased.


Overview of Pipelining
• If all the stages take about the same amount of time and there is enough
work to do, then the speed-up due to pipelining is equal to the number of
stages in the pipeline.

• The same principles apply to processors where we pipeline instruction
execution.

• MIPS instructions classically take five steps:


1. Fetch instruction from memory.
2. Read registers while decoding the instruction. The regular format of MIPS
instructions allows reading and decoding to occur simultaneously.
3. Execute the operation or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.
Single-Cycle versus Pipelined Performance
• MIPS pipeline has five stages.

Example
• We are creating a pipeline for the execution of eight instructions:
• load word (lw), store word (sw), add (add), subtract (sub), AND (and), OR (or), set
less than (slt), and branch on equal (beq).
• Compare the average execution time for a single-cycle implementation to a
pipelined implementation.
• The operation times for the major functional units are:
• 200 ps for memory access,
• 200 ps for ALU operation, and
• 100 ps for register file read or write
Single-Cycle versus Pipelined
Performance

[Table: total time for each instruction, calculated from the time for each component. The calculation assumes
that the multiplexors, control unit, PC accesses, and sign-extension unit have no delay.]
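
The table itself is not reproduced in this text, but the per-instruction totals can be reconstructed
from the component latencies listed above (instruction fetch + register read + ALU + data memory,
where used, + register write, where used):

  lw (load word):                   200 + 100 + 200 + 200 + 100 = 800 ps
  sw (store word):                  200 + 100 + 200 + 200       = 700 ps
  R-type (add, sub, and, or, slt):  200 + 100 + 200 + 100       = 600 ps
  beq (branch on equal):            200 + 100 + 200             = 500 ps

The single-cycle clock must accommodate the slowest instruction, so its clock period is 800 ps.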
Single-Cycle versus Pipelined
• In order to compare the pipelined and non-pipelined approaches, we
consider an example that involves the execution of 3 load word
instructions.

• Each load word instruction takes 800 ps to execute on the single-cycle implementation.

• There is a fourfold speed-up in the average time between instructions,
from 800 ps down to 200 ps.
Pipeline Speedup

• The pipelining speed-up discussion can be converted into a formula.

• If the stages are perfectly balanced, then the time between instructions on the
pipelined processor is:

  Time between instructions (pipelined) =
      Time between instructions (non-pipelined) / Number of pipe stages

• Under ideal conditions and with a large number of instructions, the
speed-up from pipelining is approximately equal to the number of pipe
stages.
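
Working through the formula with the figures from the earlier example: with a 5-stage pipeline and
an 800 ps single-cycle instruction time, perfectly balanced stages would give 800 / 5 = 160 ps
between instructions. Because the stages are not perfectly balanced (the slowest stage takes
200 ps), the pipelined clock is 200 ps, so the actual improvement is 800 / 200 = 4, not 5.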
Pipeline Speedup
• A five-stage pipeline is nearly five times faster than non-pipelined execution,
provided conditions are ideal.

• Here, since the stages are imperfectly balanced, the speed-up will be less than
the number of pipeline stages.

• However, if we compare the total execution time for 3 instructions:

• Non-pipelined: 2400 ps
• Pipelined: 1400 ps

• The fourfold improvement is not reflected in the total execution time because the
number of instructions is small.
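
Worked out from the same figures:

  Non-pipelined: 3 x 800 ps = 2400 ps
  Pipelined:     1000 ps for the first instruction (5 stages x 200 ps), then 200 ps per
                 additional instruction = 1000 + 2 x 200 = 1400 ps
  Speed-up for 3 instructions: 2400 / 1400 is about 1.7

  For a long program, say 1,000,003 load instructions (an illustrative count, not from the slides):
  Non-pipelined: 1,000,003 x 800 ps, about 800,002,400 ps
  Pipelined:     1000 + 1,000,002 x 200 ps, about 200,001,400 ps
  Speed-up: about 4, i.e. the ratio of the clock periods (800 / 200).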
Designing Instruction Sets for
Pipelining
• The MIPS instruction set was designed for pipelined execution:

• All MIPS instructions are the same length (32 bits).

• This restriction makes it much easier to fetch instructions in the first pipeline
stage and to decode them in the second stage.

• In instruction sets where instruction lengths vary, pipelining is more challenging.


Designing Instruction Sets for
Pipelining
• MIPS has only a few, regular instruction formats:
• So the register fields are located in the same place in each instruction.
• The second stage can begin reading the register file at the same time that the
hardware is determining what type of instruction was fetched.

• Third, memory operands appear only in loads and stores in MIPS:


• We can use the execute stage to calculate the memory address and then
access memory in the following stage.
Designing Instruction Sets for
Pipelining
• Fourth, operands must be aligned in memory:
• So a requested data item can be transferred between processor and memory in a single
pipeline stage.
Pipeline Hazards

• There are situations in pipelining when the next instruction cannot execute in
the following clock cycle.

• These events are called hazards, and there are three different types:

• Structural hazards
• Data hazards
• Control hazards
Structural Hazards
• A structural hazard occurs when the hardware cannot support the combination of instructions that we
want to execute in the same clock cycle.

• E.g., using a washer-dryer combination instead of a separate washer
and dryer.

• The MIPS instruction set was designed to be pipelined, making it fairly
easy for designers to avoid structural hazards when designing a
pipeline.
Structural Hazards
• Suppose that we had a single memory.

• If the pipeline had a fourth instruction, we would see that in the same
clock cycle the first instruction is accessing data from memory while the
fourth instruction is fetching an instruction from that same memory.

• Without two memories, our pipeline could have a structural hazard.

Hence, a pipelined data path requires separate instruction and data memories.
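
A minimal cycle-by-cycle sketch of the conflict, assuming the five stages IF, ID, EX, MEM, WB
and one instruction issued per clock cycle:

  Cycle:          1    2    3    4    5    6    7    8
  Instruction 1:  IF   ID   EX   MEM  WB
  Instruction 2:       IF   ID   EX   MEM  WB
  Instruction 3:            IF   ID   EX   MEM  WB
  Instruction 4:                 IF   ID   EX   MEM  WB

In cycle 4, instruction 1 is in MEM (accessing data memory) while instruction 4 is in IF
(fetching from instruction memory). With a single shared memory, both would need it at once.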
Data Hazards
• Also called a pipeline data hazard.

• A data hazard occurs when a planned instruction cannot execute in the proper clock cycle
because data that is needed to execute the instruction is not yet
available.

• In a computer pipeline, data hazards arise from the dependence of one
instruction on an earlier one that is still in the pipeline.
Data Hazards
Solution
• Add extra hardware to retrieve the missing item early from the
internal resources - this is called forwarding or bypassing.
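
For example (the registers here are chosen purely for illustration):

  add $s0, $t0, $t1    # result in $s0 is computed in EX, written back in WB (stage 5)
  sub $t2, $s0, $t3    # needs $s0 at the start of its EX stage, before add's WB has happened

Without forwarding, sub would have to wait for add to write $s0 back to the register file.
With forwarding, the ALU result of add is sent directly from the end of its EX stage to the
ALU input of sub's EX stage, so no stall is needed.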
Data Hazards
Load-use data hazard
• A specific form of data hazard in which the data being loaded by a
load instruction has not yet become available when it is needed by
another instruction.

• lw $s0, 20($t1) - the data is available only in the fourth stage (MEM).
• sub $t2, $s0, $t3 - the data is needed at the third stage (EX).
• Forwarding alone cannot resolve this hazard, since the data would have to be
forwarded backwards in time.
Data Hazards
Solution
• Pipeline stall (or bubble): a stall is initiated to resolve the load-use data
hazard; the dependent instruction is delayed by one clock cycle.
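
A minimal timing sketch of the stall, assuming MEM-to-EX forwarding hardware is also present:

  Cycle:                1    2    3    4    5    6    7
  lw  $s0, 20($t1):     IF   ID   EX   MEM  WB
  sub $t2, $s0, $t3:         IF   ID   **   EX   MEM  WB

The ** marks the one-cycle bubble: sub's EX is delayed to cycle 5 so that the value loaded
in lw's MEM stage (cycle 4) can be forwarded to sub's ALU input.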
Control Hazards
• Also called a branch hazard.

• A control hazard arises from the need to make a decision based on the
results of one instruction while others are still executing.

• Solution 1: stall immediately after we fetch a branch,
waiting until the pipeline determines the outcome of the branch and
knows what instruction address to fetch from.
Control Hazards
• The cost of this option is too high for most computers.

• Solution 2: use prediction to handle branches.

• A simple form of prediction is to always predict that branches are not taken.

• When the prediction is right, the pipeline proceeds at full speed.
• Only when branches are taken does the pipeline stall.
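
A small illustration (the label and registers are made up for this example):

  beq $s0, $s1, Target   # branch outcome is resolved later in the pipeline
  add $t0, $t1, $t2      # fetched in the next cycle, assuming the branch is not taken
  ...
Target:
  sub $t0, $t1, $t2

If the branch turns out to be taken, the speculatively fetched add is discarded (turned into a
bubble) and fetching restarts at Target, costing stall cycles; if it is not taken, no time is lost.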
Control Hazards
More-Realistic Branch Prediction

Static branch prediction

• Based on typical branch behaviour


Example: loop branches
• At the bottom of loops are branches that jump back to the top of the
loop.
• Since they are likely to be taken and branch backward, we could
always predict taken for branches that jump to an earlier address.
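
A sketch of such a loop (registers and the label are illustrative):

Loop:
  lw   $t0, 0($s1)       # load an array element
  add  $s2, $s2, $t0     # accumulate
  addi $s1, $s1, 4       # advance the pointer
  bne  $s1, $s3, Loop    # backward branch to Loop: statically predict "taken"

The bne at the bottom branches backward to an earlier address and is taken on every
iteration except the last, so predicting "taken" for it is right most of the time.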
More-Realistic Branch
Prediction
Dynamic prediction

• Keep a history for each branch, recording whether it was taken or untaken.

• Use this recent past behaviour to predict the future.

• When the guess is wrong, the pipeline control must ensure that the
instructions following the wrongly guessed branch have no effect and
must restart the pipeline from the proper branch address.
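
As a concrete sketch of one common scheme (a 1-bit predictor, not described further in these
slides): the hardware keeps one bit per branch recording whether it was last taken. For a loop
branch that is taken 9 times and then falls through once, the predictor is wrong twice per pass
through the loop - once when the loop exits (it still predicts taken) and once on re-entering the
loop (it now predicts not taken) - giving roughly 80% accuracy on that branch.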
Pipelined Data path and Control
• The division of an instruction into five stages means a five-stage
pipeline.

• Up to five instructions will be in execution during any single clock
cycle.

• We must separate the data path into five pieces, with each piece
named after a stage of instruction execution:
Pipelined Data path and Control

1. IF: Instruction fetch
2. ID: Instruction decode and register file read
3. EX: Execution or address calculation
4. MEM: Data memory access
5. WB: Write back

These five components correspond roughly to the way the data path is drawn;
instructions and data move generally from left to right through the five stages
as they complete execution.
The single-cycle data path
• There are, however, two exceptions to this left-to-right flow of
instructions:

• The write-back stage, which places the result back into the register
file in the middle of the data path

• The selection of the next value of the PC, choosing between the
incremented PC and the branch address from the MEM stage
The pipelined version of the
data path
• Pipeline registers: we need registers between stages to hold information
produced in the previous cycle.

• The pipeline registers, in color, separate each pipeline stage.

• They are labeled by the stages that they separate.

• The registers must be wide enough to store all the data corresponding
to the lines that go through them.
The pipelined version of the
data path
• For example, the IF/ID register must be 64 bits wide, because it must
hold both the 32-bit instruction fetched from memory and the
incremented 32-bit PC address.
Pipelined Control
• Just as we added control to the single-cycle data path, we now add
control to the pipelined data path.
• We can divide the control lines into five groups according to the
pipeline stage:

1. Instruction fetch: The control signals to read instruction memory and
to write the PC are always asserted, so there is nothing special to
control in this pipeline stage.
Pipelined Control
1. Instruction fetch: The control signals to read instruction memory and to write the
PC are always asserted, so there is nothing special to control in this pipeline stage.
2. Instruction decode/register file read: As in the previous stage, the
same thing happens at every clock cycle, so there are no optional
control lines to set.
3. Execution/address calculation: The signals to be set are RegDst, ALUOp, and
ALUSrc
• The signals select the Result register, the ALU operation, and either Read data 2
or a sign-extended immediate for the ALU.
Pipelined Control
4. Memory access: The control lines set in this stage are Branch, MemRead, and
MemWrite. The branch equal, load, and store instructions set these signals,
respectively.
5. Write-back: The two control lines are MemtoReg, which decides between
sending the ALU result or the memory value to the register file, and RegWrite,
which writes the chosen value.
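
As a concrete illustration (the values follow the conventional MIPS single-cycle control table,
which is not reproduced in these slides), the control settings carried down the pipeline for a
lw instruction would be:

  EX stage:  RegDst = 0, ALUOp = 00, ALUSrc = 1     # write the rt register, add for the address, use the immediate
  MEM stage: Branch = 0, MemRead = 1, MemWrite = 0  # read data memory
  WB stage:  MemtoReg = 1, RegWrite = 1             # write the loaded value to the register file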
