UNIT – 5
Pipelining
The term Pipelining refers to a technique of decomposing a sequential process into
sub-operations, with each sub-operation being executed in a dedicated segment that
operates concurrently with all other segments.
The most important characteristic of a pipeline technique is that several computations can
be in progress in distinct segments at the same time. The overlapping of computation is
made possible by associating a register with each segment in the pipeline. The registers
provide isolation between each segment so that each can operate on distinct data
simultaneously.
The structure of a pipeline organization can be represented simply by including an input
register for each segment followed by a combinational circuit.
Let us consider an example of combined multiplication and addition operation to get a better
understanding of the pipeline organization.
The combined multiplication and addition operation is performed on a stream of numbers such as:
Ai * Bi + Ci    for i = 1, 2, 3, ..., 7
Registers R1, R2, R3, and R4 hold the data and the combinational circuits operate in a particular
segment.
The output generated by the combinational circuit in a given segment is applied to the input register
of the next segment. For instance, from the block diagram, we can see that register R3 is used as
one of the input registers for the combinational adder circuit.
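The register transfers above can be sketched in Python. This is a hedged model, not taken from the text; the segment/register assignment (R1 <- Ai, R2 <- Bi in segment 1; R3 <- R1*R2, R4 <- Ci in segment 2; the adder output in segment 3) is an assumption following the usual textbook layout:

```python
# Minimal model of a three-segment pipeline computing Ai*Bi + Ci.
# Register assignment (an assumption, following the usual textbook layout):
#   Segment 1: R1 <- Ai, R2 <- Bi
#   Segment 2: R3 <- R1 * R2, R4 <- Ci
#   Segment 3: result <- R3 + R4
def pipeline_multiply_add(a, b, c):
    n = len(a)
    R1 = R2 = R3 = R4 = None
    results = []
    for t in range(n + 2):                 # n items drain in n + 2 clock pulses
        # All transfers happen on the same clock edge, so next-state values
        # are computed from the current registers before any are overwritten.
        out = R3 + R4 if R3 is not None else None      # segment 3: add
        r3 = R1 * R2 if R1 is not None else None       # segment 2: multiply
        r4 = c[t - 1] if 1 <= t <= n else None         # Ci enters one clock later
        r1 = a[t] if t < n else None                   # segment 1: load inputs
        r2 = b[t] if t < n else None
        R1, R2, R3, R4 = r1, r2, r3, r4
        if out is not None:
            results.append(out)
    return results

print(pipeline_multiply_add([1, 2, 3], [4, 5, 6], [7, 8, 9]))   # -> [11, 18, 27]
```

Note how each result emerges one clock after the previous one once the pipeline is full, even though a single multiply-add spans three segments.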
In general, the pipeline organization is applicable in two areas of computer design:
1. Arithmetic Pipeline
2. Instruction Pipeline
Arithmetic Pipeline
Arithmetic Pipelines are mostly used in high-speed computers. They are used to implement floating-
point operations, multiplication of fixed-point numbers, and similar computations encountered in
scientific problems.
To understand the concepts of arithmetic pipeline in a more convenient way, let us consider an
example of a pipeline unit for floating-point addition and subtraction.
The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers,
defined as:
X = A x 2^a
Y = B x 2^b
where A and B are two fractions that represent the mantissas, and a and b are the exponents.
The combined operation of floating-point addition and subtraction is divided into four segments.
Each segment performs the corresponding sub-operation of the given pipeline. The
sub-operations shown in the four segments are:
1. Compare the exponents by subtraction.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.
We will discuss each sub-operation in more detail later in this section.
The following block diagram represents the sub operations performed in each segment of the
pipeline.
1. Compare the exponents by subtraction:
The exponents are compared by subtracting them to determine their difference. The larger exponent
is chosen as the exponent of the result.
For example, if the two exponents are 3 and 2, their difference, 3 - 2 = 1, determines how many times
the mantissa associated with the smaller exponent must be shifted to the right.
2. Align the mantissas:
The mantissa associated with the smaller exponent is shifted according to the difference of
exponents determined in segment one.
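The four sub-operations can be sketched end to end in Python. This is a hedged illustration using base-10 exponents for readability (the pipeline in the text operates on binary numbers, so a real unit would shift binary mantissas by powers of 2); the 0.9504 x 10^3 + 0.8200 x 10^2 values are the example commonly used with this four-segment pipeline:

```python
# Hedged sketch of the four floating-point-addition segments, in base 10
# for readability (a real unit shifts binary mantissas by powers of 2).
def fp_add(A, a, B, b):
    # Segment 1: compare the exponents by subtraction; keep the larger.
    diff = a - b
    exp = max(a, b)
    # Segment 2: align the mantissa of the smaller exponent by shifting right.
    if diff > 0:
        B = B / 10 ** diff
    elif diff < 0:
        A = A / 10 ** (-diff)
    # Segment 3: add the mantissas.
    M = A + B
    # Segment 4: normalize so the mantissa lies in [0.1, 1).
    while abs(M) >= 1:
        M /= 10
        exp += 1
    while M != 0 and abs(M) < 0.1:
        M *= 10
        exp -= 1
    return M, exp

# 0.9504 x 10^3 + 0.8200 x 10^2 = 1.0324 x 10^3 -> normalized 0.10324 x 10^4
M, e = fp_add(0.9504, 3, 0.8200, 2)
print(round(M, 5), e)   # -> 0.10324 4
```

In the pipelined unit each of these four blocks is a separate segment with its own register, so four different pairs of operands can be in flight at once.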
Instruction pipelining:
Pipeline processing can occur not only in the data stream but in the instruction stream as well.
Most digital computers with complex instructions require an instruction pipeline to carry out
operations such as fetching, decoding, and executing instructions.
In general, the computer needs to process each instruction with the following sequence of steps.
1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
Each step is executed in a particular segment, and there are times when different segments may take
different times to operate on the incoming information. Moreover, there are times when two or more
segments may require memory access at the same time, causing one segment to wait until another is
finished with the memory.
The organization of an instruction pipeline will be more efficient if the instruction cycle is divided
into segments of equal duration. One of the most common examples of this type of organization is a
Four-segment instruction pipeline.
A four-segment instruction pipeline combines two or more of these steps into a single segment. For
instance, the decoding of the instruction can be combined with the calculation of the effective
address into one segment.
The following block diagram shows a typical example of a four-segment instruction pipeline. The
instruction cycle is completed in four segments.
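A quick way to see the four-segment overlap is to print the space-time diagram: instruction i occupies segment s during clock cycle i + s (0-indexed). A small sketch (the `S1`..`S4` labels are just the four segments above):

```python
# Print a space-time (timing) diagram for an n_seg-segment pipeline:
# instruction i occupies segment s during clock cycle i + s.
def timing_diagram(n_instr, n_seg=4):
    rows = []
    for i in range(n_instr):
        row = ["  "] * (n_instr + n_seg - 1)   # one column per clock cycle
        for s in range(n_seg):
            row[i + s] = "S" + str(s + 1)
        rows.append(" ".join(row))
    return rows

for line in timing_diagram(3):
    print(line.rstrip())
# S1 S2 S3 S4
#    S1 S2 S3 S4
#       S1 S2 S3 S4
```

Each row is one instruction; reading a column top to bottom shows that in any given clock cycle every segment is working on a different instruction.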
Segment 1:
The instruction fetch segment can be implemented using a first-in, first-out (FIFO) buffer.
Segment 2:
The instruction fetched from memory is decoded in the second segment, and eventually, the effective
address is calculated in a separate arithmetic circuit.
Segment 3:
An operand from memory is fetched in the third segment.
Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
As computer systems evolve, greater performance can be achieved by taking advantage of
improvements in technology, such as faster circuitry, use of multiple registers rather than a single
accumulator and the use of a cache memory. Another organizational approach is instruction
pipelining in which new inputs are accepted at one end before previously accepted inputs appear as
outputs at the other end.
Figure 3.1a depicts this approach. The pipeline has two independent stages. The first stage fetches
an instruction and buffers it. When the second stage is free, the first stage passes it the buffered
instruction. While the second stage is executing the instruction, the first stage takes advantage of
any unused memory cycles to fetch and buffer the next instruction. This is called instruction prefetch
or fetch overlap. This process will speed up instruction execution: if the fetch and execute
stages were of equal duration, the instruction cycle time would be halved. However, if we look
more closely at this pipeline
(Figure 3.1b), we will see that this doubling of the execution rate is unlikely, for three reasons:
1. The execution time will generally be longer than the fetch time. Thus, the fetch stage may have
to wait for some time before it can empty its buffer.
2. A conditional branch instruction makes the address of the next instruction to be fetched
unknown. Thus, the fetch stage must wait until it receives the next instruction address from the
execute stage. The execute stage may then have to wait while the next instruction is fetched.
3. When a conditional branch instruction is passed on from the fetch to the execute stage, the fetch
stage fetches the next instruction in memory after the branch instruction. Then, if the branch is not
taken, no time is lost. If the branch is taken, the fetched instruction must be discarded and a new
instruction fetched.
To gain further speedup, the pipeline must have more stages. Let us consider the following
decomposition of the instruction processing.
1. Fetch instruction (FI): Read the next expected instruction into a buffer.
2. Decode instruction (DI): Determine the opcode and the operand specifiers.
3. Calculate operands (CO): Calculate the effective address of each source operand. This may
involve displacement, register indirect, indirect, or other forms of address calculation.
4. Fetch operands (FO): Fetch each operand from memory.
5. Execute instruction (EI): Perform the indicated operation and store the result, if any, in the
specified destination operand location.
6. Write operand (WO): Store the result in memory.
Figure 3.2 shows that a six-stage pipeline can reduce the execution time for 9 instructions from 54
time units to 14 time units.
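The 54-versus-14 figure follows directly from the standard pipeline timing formulas: without pipelining, n instructions of k stages take n*k time units; with pipelining, they take k + (n - 1). A minimal check:

```python
# Standard pipeline timing: n instructions through a k-stage pipeline.
def pipeline_cycles(n_instructions, n_stages):
    unpipelined = n_instructions * n_stages        # n * k time units
    pipelined = n_stages + (n_instructions - 1)    # k cycles to fill, then 1 per instr
    return unpipelined, pipelined

print(pipeline_cycles(9, 6))   # -> (54, 14)
```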
3.2 Timing Diagram for Instruction Pipeline Operation
The FO and WO stages each involve a memory access. If the six stages are not of equal duration, there will be
some waiting involved at various pipeline stages. Another difficulty is the conditional branch
instruction, which can invalidate several instruction fetches. A similar unpredictable event is an
interrupt.
3.3 Timing Diagram for Instruction Pipeline Operation with interrupts
Figure 3.3 illustrates the effects of the conditional branch, using the same program as Figure 3.2.
Assume that instruction 3 is a conditional branch to instruction 15. Until the instruction is executed,
there is no way of knowing which instruction will come next. The pipeline, in this example, simply
loads the next instruction in sequence (instruction 4) and proceeds.
In Figure 3.2, the branch is not taken. In Figure 3.3, the branch is taken. This is not determined until
the end of time unit 7. At this point, the pipeline must be cleared of instructions that are not useful.
During time unit 8, instruction 15 enters the pipeline.
No instructions complete during time units 9 through 12; this is the performance penalty incurred
because we could not anticipate the branch. Figure 3.4 indicates the logic needed for pipelining to
account for branches and interrupts.
3.4 Six-stage CPU Instruction Pipeline
Throughput & Speedup
In computer organization, throughput and speedup are two important performance metrics that
describe how efficiently a system processes tasks.
Throughput is the number of tasks (e.g., instructions) completed per unit time. Once a pipeline
with clock period tp is full, it completes one result per clock, so its steady-state throughput
approaches 1/tp.
Speedup is the ratio of the execution time without pipelining to the execution time with
pipelining. For a k-segment pipeline executing n tasks with clock period tp, where a single task
takes time tn without pipelining:
S = (n * tn) / ((k + n - 1) * tp)
As n grows large, S approaches tn / tp; in the common case tn = k * tp, the maximum speedup
approaches k, the number of segments.
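As a small sketch, the standard speedup formula S = (n * tn) / ((k + n - 1) * tp) for a k-segment pipeline can be evaluated directly. The 4-segment / 20 ns / 100-task numbers below are illustrative, not from the text, and the default tn = k * tp is an assumption matching the usual textbook setup:

```python
# Evaluate the standard pipeline speedup formula S = n*tn / ((k + n - 1)*tp).
def speedup(n, k, tp, tn=None):
    # Assumption: a non-pipelined task takes the time of all k segments
    # when tn is not given explicitly.
    if tn is None:
        tn = k * tp
    return (n * tn) / ((k + n - 1) * tp)

# Illustrative numbers (not from the text): 4 segments, 20 ns clock, 100 tasks.
print(round(speedup(100, 4, 20), 2))   # -> 3.88
```

Note that the result (about 3.88) is already close to the theoretical maximum of k = 4, which it approaches as the number of tasks grows.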
PIPELINE HAZARDS
What are pipeline hazards?
Hazards are those situations that prevent the next instruction in the instruction stream from executing
during its designated clock cycle. They reduce the performance from the ideal speedup gained by
pipelining.
Classification of hazards:
Structural Hazards: arise from resource conflicts when the hardware can’t support all possible
combinations in simultaneous overlapped execution.
Data hazards: arise when an instruction depends upon the results of a previous instruction in a
way that is exposed by the overlapping of instructions in the pipeline.
Control Hazards: arise from the pipelining of branches and other instructions that change the PC.
Structural hazards:
For any system to be free from hazards, pipelining of functional units and duplication of
resources is necessary to allow all possible combinations of instructions in the pipeline.
Structural hazards arise due to the following reasons:
When a functional unit is not fully pipelined, the sequence of instructions using that unit
cannot proceed at the rate of one per clock cycle.
When a resource is not duplicated enough to allow all possible combinations of instructions.
For example, a machine may have only one register-file write port but may need to perform two
writes during the same clock cycle.
Another example is a machine with a single memory shared for data and instructions. An instruction
containing a data-memory reference will conflict with the instruction fetch of a later instruction.
This is resolved by stalling the pipeline for one clock cycle when the data-memory access occurs.
Data hazards:
Data hazards occur when an instruction depends on the result of a previous instruction and that
result has not yet been computed. Whenever two different instructions use the same storage
location, that location must appear to be accessed in sequential order.
There are four types of data dependencies: Read after Write (RAW), Write after Read (WAR),
Write after Write (WAW), and Read after Read (RAR). RAR causes no hazard, since neither
instruction modifies the location; the other three are explained below.
Read after Write (RAW) :
It is also known as True dependency or Flow dependency. It occurs when the value produced by
an instruction is required by a subsequent instruction. For example,
ADD R1, --, --;
SUB --, R1, --;
Write after Read (WAR) :
It is also known as anti-dependency. These hazards occur when an instruction writes a register
that a previous instruction still needs to read. For example,
ADD --, R1, --;
SUB R1, --, --;
Write after Write (WAW) :
It is also known as output dependency. These hazards occur when an instruction writes a register
that has already been written by a previous instruction, so the writes must complete in order. For example,
ADD R1, --, --;
SUB R1, --, --;
Data hazards occur when instructions in a pipeline depend on the results of previous
instructions. To ensure smooth execution, various hazard-handling techniques like
forwarding and stalling are used.
Hazards requiring stalls:
Consider the situation where a load and a sub instruction are consecutive, where the
destination register of load is the source register for sub.
This hazard cannot be removed by forwarding. Hence a pipeline interlock is introduced to detect the
hazard and stall the pipeline until it is cleared. The hazard is checked during the ID phase, and the
instruction that wants to use the data is stalled until the source instruction produces it.
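The interlock check described above can be sketched as follows. The instruction representation and the LW/SUB example are invented for illustration; real hardware compares register specifier fields in the ID stage:

```python
# Hedged sketch of the load-use interlock check performed in the ID stage.
def needs_stall(prev, curr):
    """True when curr reads the register a preceding load is still fetching."""
    return prev["op"] == "LW" and prev["dst"] in curr["src"]

# Invented example instructions (dict fields are illustrative, not a real ISA):
lw  = {"op": "LW",  "dst": "R1", "src": ["R2"]}         # R1 <- MEM[R2]
sub = {"op": "SUB", "dst": "R4", "src": ["R1", "R5"]}   # needs R1 immediately
add = {"op": "ADD", "dst": "R6", "src": ["R2", "R3"]}   # independent of R1

print(needs_stall(lw, sub))   # -> True   (one stall cycle must be inserted)
print(needs_stall(lw, add))   # -> False  (no dependence, no stall)
```

Forwarding handles most RAW hazards, but here the loaded value is simply not available yet, so a one-cycle bubble is the only option.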
Control hazards:
Control hazards cause a greater performance loss compared to the losses posed by data
hazards.
The simplest method of dealing with branches is to stall the pipeline as soon as the branch is
detected in the ID phase, until the MEM stage, where the new PC is finally determined.
Each branch then causes a three-cycle stall in the DLX pipeline, which is a significant loss given
that roughly 30% of the instructions executed are branches.
The number of stall cycles per branch is reduced by testing the branch condition in the ID stage
and computing the destination address there as well, using a separate adder.
With this scheme there is only a one-clock-cycle stall on branches.
With this pipelining scheme, then, two types of conflicts can arise: data conflicts and branch
conflicts.