Chapter 6 (Pipelining and Superscalar Techniques)
Chapter SIX
Pipelining and Superscalar Techniques
Linear Pipeline Processors
A linear pipeline processor is a cascade of processing stages connected linearly to perform a fixed function over a data stream flowing from one end to the other.
• A linear pipeline processor has k processing stages, namely 𝑆1 , 𝑆2 , 𝑆3 … 𝑆𝑘 .
• Inputs/operands are fed into the first stage 𝑆1 ; the results are passed to the next
stage 𝑆2 , and so on.
• Depending on how data flow is controlled through the pipeline, linear pipelines
are categorized into two models:
Asynchronous Model: In this model, when stage 𝑆𝑖 is ready to transmit, it sends a ready signal to
stage 𝑆𝑖+1 . Upon receiving the incoming data, 𝑆𝑖+1 returns an acknowledge signal to 𝑆𝑖 . This process
is known as a handshaking protocol.
The delay between any two adjacent stages may differ, so an asynchronous pipeline has a variable
throughput rate.
Synchronous Model: In this model, clocked latches (typically master-slave flip-flops) are used to interface
between stages, each holding its input until it is transmitted on the next clock pulse.
Upon arrival of the clock pulse, all latches transfer data to the next stage simultaneously.
All stages have approximately equal transfer delays. These delays determine the clock period and thus the
speed of the pipeline.
For a k-stage linear pipeline, k clock cycles are needed for the first data item to flow through to the last stage.
Successive tasks or operations are initiated one per cycle to enter the pipeline. Once the pipeline is
filled, one result emerges from the pipeline every cycle.
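The fill behavior described above can be sketched in a few lines; the 4-stage pipeline and task count below are illustrative assumptions, not from the text:

```python
# Minimal sketch of synchronous pipeline timing: with k stages and one
# initiation per cycle, task i (0-based) completes at cycle k + i.
def completion_cycles(k, n):
    """Cycle at which each of n tasks leaves a k-stage pipeline."""
    return [k + i for i in range(n)]

print(completion_cycles(4, 5))  # [4, 5, 6, 7, 8]
```

The first result appears after k = 4 cycles (the fill time); every later result follows one cycle apart, matching the claim above.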
Legend: 𝑆𝑖 = stage i, L = latch, 𝝉 = clock period, 𝝉𝑚 = maximum stage delay, d = latch delay, Ack = Acknowledge signal
• Clocking and Timing control
• Clock cycle: Let 𝝉𝑖 be the time delay of stage 𝑆𝑖 and d the time delay of a latch. The clock period 𝝉
is determined as below:
𝝉 = max{𝝉𝑖 : 1 ≤ 𝑖 ≤ 𝑘} + 𝑑 = 𝝉𝑚 + 𝑑
• Pipeline frequency: Defined as the inverse of the clock period:
𝑓 = 1/𝝉
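A minimal numeric sketch of these two definitions; the stage and latch delays below are made-up illustrative values:

```python
# Sketch: clock period = max stage delay + latch delay, frequency = 1/period.
# Stage delays and latch delay are illustrative assumptions (in ns).
stage_delays_ns = [10, 12, 9, 11]   # tau_i for a 4-stage pipeline
latch_delay_ns = 1                  # d

tau = max(stage_delays_ns) + latch_delay_ns  # tau = tau_m + d = 13 ns
f_ghz = 1 / tau                              # frequency in GHz (since 1/ns = GHz)

print(tau, round(f_ghz, 4))  # 13 0.0769
```

Note that the slowest stage alone sets the clock period: shortening the other stages would not speed up this pipeline.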
• Throughput: The maximum throughput f is reached when one result comes out of the pipeline every
cycle. The actual throughput may be lower than f, depending on the initiation rate of successive tasks
entering the pipeline: if more than one clock cycle elapses between successive task initiations, the
pipeline delivers fewer than one result per cycle.
• Clock skewing: Ideally, the clock pulses arrive at all stages at the same time. In practice, due to
unequal signal propagation delays, the same clock pulse may arrive at different stages with a time
offset s. This problem is known as clock skewing.
• To avoid race conditions, two constraints must be satisfied: 𝝉𝑚 ≥ 𝑡𝑚𝑎𝑥 + 𝑠 and
𝑑 ≤ 𝑡𝑚𝑖𝑛 − 𝑠, where 𝑡𝑚𝑎𝑥 = maximum time delay within a stage and 𝑡𝑚𝑖𝑛 = minimum time delay
within a stage.
Thus, when clock skew takes place, the clock period is bounded by:
𝑑 + 𝑡𝑚𝑎𝑥 + 𝑠 ≤ 𝝉 ≤ 𝝉𝑚 + 𝑡𝑚𝑖𝑛 − 𝑠
When 𝑠 = 0 (with 𝑡𝑚𝑎𝑥 = 𝝉𝑚 and 𝑡𝑚𝑖𝑛 = 𝑑), both bounds coincide and 𝝉 = 𝝉𝑚 + 𝑑.
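The skew bound can be sketched as a small checker; the delay values below are illustrative assumptions:

```python
# Sketch: check whether a candidate clock period tau satisfies the
# skew-constrained bound  d + t_max + s <= tau <= tau_m + t_min - s.
def period_ok(tau, tau_m, t_max, t_min, d, s):
    return d + t_max + s <= tau <= tau_m + t_min - s

# Illustrative values: tau_m = t_max = 12 ns, t_min = d = 1 ns.
print(period_ok(13, tau_m=12, t_max=12, t_min=1, d=1, s=0))  # True: tau = tau_m + d
print(period_ok(13, tau_m=12, t_max=12, t_min=1, d=1, s=1))  # False once skew appears
```

With zero skew the feasible range collapses to the single point 𝝉 = 𝝉𝑚 + 𝑑 = 13 ns; any nonzero skew makes that range empty here, which is why skew must be minimized in practice.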
• Speedup, Efficiency, Throughput
The total time required for n tasks in linear pipeline of k stages is:
𝑇𝑘 = [𝑘 + (𝑛 − 1)] 𝝉
where k cycles are needed to complete the first task and the remaining n − 1 tasks require one
additional cycle each.
For an equivalent non-pipelined processor, every task needs a flow-through delay of 𝑘𝝉, so for n tasks
the total time is 𝑇1 = 𝑛𝑘𝝉.
Speedup factor: The speedup factor of a k-stage pipelined processor over an equivalent non-pipelined
processor is
𝑆𝑘 = 𝑇1 / 𝑇𝑘 = 𝑛𝑘𝝉 / [𝑘 + (𝑛 − 1)]𝝉 = 𝑛𝑘 / [𝑘 + (𝑛 − 1)]
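The speedup formula can be evaluated numerically; the stage and task counts below are illustrative assumptions:

```python
# Sketch: S_k = n*k / (k + n - 1); as n grows, S_k approaches k.
def speedup(k, n):
    return n * k / (k + n - 1)

for n in (1, 4, 64, 1024):
    print(n, round(speedup(4, n), 3))
```

For a single task there is no speedup at all (S = 1), and the ideal speedup k is approached only asymptotically as the number of tasks n grows large.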
Performance/Cost Ratio (PCR): The ratio between the performance and the total pipeline cost is PCR.
Let t be the total flow-through delay of a k-stage pipeline, divided equally so that each stage
contributes 𝑡/𝑘.
With d the latch delay, the clock period is 𝑝 = 𝑡/𝑘 + 𝑑.
So the maximum throughput (performance) is:
𝑓 = 1/𝑝 = 1/(𝑡/𝑘 + 𝑑)
The total pipeline cost is estimated as 𝑐 + 𝑘ℎ, where c is the cost of all logic stages and h is the cost of
each latch.
Finally, the performance/cost ratio is defined as below:
𝑃𝐶𝑅 = 𝑓/(𝑐 + 𝑘ℎ) = 1 / [(𝑡/𝑘 + 𝑑)(𝑐 + 𝑘ℎ)]
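PCR can be explored numerically to find the stage count that maximizes it; setting the derivative of the denominator to zero gives the analytic optimum 𝑘0 = √(𝑡𝑐/𝑑ℎ). The parameter values below are illustrative assumptions, not from the text:

```python
# Sketch: PCR(k) = 1 / ((t/k + d) * (c + k*h)).  Minimizing the denominator
# t*c/k + t*h + d*c + d*h*k over k gives k0 = sqrt(t*c / (d*h)).
t, d, c, h = 64.0, 1.0, 16.0, 1.0   # illustrative flow-through delay, latch delay, costs

def pcr(k):
    return 1.0 / ((t / k + d) * (c + k * h))

best_k = max(range(1, 65), key=pcr)   # brute-force integer optimum
k0 = (t * c / (d * h)) ** 0.5         # analytic optimum
print(best_k, k0)  # 32 32.0
```

Too few stages waste the pipeline's parallelism; too many stages inflate latch cost and latch delay, so PCR peaks at an intermediate stage count.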
• Efficiency and Throughput
Efficiency: The efficiency 𝐸𝑘 of a k-stage linear pipeline is defined as the speedup per stage:
𝐸𝑘 = 𝑆𝑘/𝑘 = 𝑛𝑘 / 𝑘[𝑘 + (𝑛 − 1)] = 𝑛 / [𝑘 + (𝑛 − 1)]
Pipeline throughput (𝐻𝑘 ): It is defined as the number of tasks performed per unit of time:
𝐻𝑘 = 𝑛 / [𝑘 + (𝑛 − 1)]𝝉 = 𝑛𝑓 / [𝑘 + (𝑛 − 1)]
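Both quantities can be sketched together; the stage count, task count, and clock frequency below are illustrative assumptions:

```python
# Sketch: E_k = n / (k + n - 1) and H_k = E_k * f tasks per unit time.
def efficiency(k, n):
    return n / (k + n - 1)

def throughput(k, n, f):
    return efficiency(k, n) * f

# Illustrative: 4-stage pipeline, 64 tasks, 500 MHz clock.
print(round(efficiency(4, 64), 4), throughput(4, 64, 500e6))
```

Note that 𝐻𝑘 = 𝐸𝑘 · 𝑓: throughput reaches the maximum f exactly when the efficiency reaches 1, i.e., in the limit of many tasks.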
Fig 6.3
Fig 6.4(a): Reservation table for function X Fig 6.4(b): Reservation table for function Y
The number of columns in the reservation table is called the evaluation time.
For example, the evaluation time of function X is eight clock cycles and that of function Y is six.
• Latency analysis:
• Latency: The number of clock cycles between two initiations of the pipeline. A latency of k means
that two initiations are separated by k clock cycles.
• Collision: Any attempt by any two or more initiations to use the same pipeline stage at the same time
creates collision. Collision indicates resource conflicts between two initiations in the pipeline.
• Forbidden latency: Latencies that cause collisions are called forbidden latencies. As an example, for
function X from fig 6.4(a), the latency differences per stage are:
For stage 𝑆1 = {(6-1), (8-1), (8-6)} = {5, 7, 2}
For stage 𝑆2 = {(4-2)} = {2}
For stage 𝑆3 = {(5-3), (7-3), (7-5)} = {2, 4}
Then the set of forbidden latencies for function X is: {2, 4, 5, 7}
• Collision vectors: From a given reservation table, it is easy to distinguish the set of permissible
latencies from the set of forbidden latencies. For a reservation table with n columns, the maximum
forbidden latency m satisfies m ≤ n − 1. A permissible latency p lies in the range 1 ≤ p ≤ m − 1, and it
should be as small as possible. The combined set of permissible and forbidden latencies can be
displayed by a collision vector. This is an m-bit vector
𝐶 = (𝐶𝑚 𝐶𝑚−1 …𝐶2 𝐶1 )
• The value of 𝐶𝑖 =1 if latency i causes a collision and 𝐶𝑖 =0 if latency i is permissible. It is
always the case that 𝐶𝑚 =1, since m is the maximum forbidden latency.
• For the function X, we find the collision vector, 𝐶𝑋 = (𝐶7 𝐶6 𝐶5 𝐶4 𝐶3 𝐶2 𝐶1 ) = (1011010)
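The two steps above (stage-time differences, then bit encoding) can be sketched directly from the stage usage times given in the text for function X (𝑆1 = {1, 6, 8}, 𝑆2 = {2, 4}, 𝑆3 = {3, 5, 7}):

```python
# Sketch: derive forbidden latencies and the collision vector from a
# reservation table, encoded as stage -> list of time steps used.
table = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}

def forbidden_latencies(table):
    lats = set()
    for times in table.values():
        # Any pairwise time difference within one stage is forbidden.
        lats |= {b - a for a in times for b in times if b > a}
    return sorted(lats)

def collision_vector(forbidden):
    m = max(forbidden)
    # Bit string C_m ... C_1: bit i is 1 iff latency i is forbidden.
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))

print(forbidden_latencies(table))              # [2, 4, 5, 7]
print(collision_vector({2, 4, 5, 7}))          # 1011010
```

Both outputs reproduce the values derived above for function X.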
• State diagram
Specifies the permissible state transitions among successive initiations based on the collision vector.
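The transition rule behind such a state diagram is standard: from a state, an initiation with permissible latency p leads to the state obtained by right-shifting the current state by p bits and ORing in the initial collision vector. A sketch of this rule, starting from 𝐶𝑋 = (1011010) as given above:

```python
# Sketch: state transitions from a collision vector.  From state S, an
# initiation with permissible latency p leads to (S >> p) | C_initial.
M = 7                       # vector width = maximum forbidden latency
C = int("1011010", 2)       # initial collision vector for function X

def successors(state):
    return {p: (state >> p) | C
            for p in range(1, M + 1)
            if not (state >> (p - 1)) & 1}  # bit C_p == 0 => permissible

print({p: format(s, "07b") for p, s in successors(C).items()})
# {1: '1111111', 3: '1011011', 6: '1011011'}
```

From the initial state, only the permissible latencies 1, 3, and 6 produce transitions (latencies greater than m are always permissible and return to the initial state).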
• Differences between linear and nonlinear pipeline processors:
• Linear pipelines are static pipelines: they perform fixed functions. Nonlinear pipelines are dynamic
pipelines: they can be reconfigured to perform variable functions at different times.
• A linear pipeline has only streamline connections. A nonlinear pipeline has streamline connections as
well as feed-forward and feedback connections.
• It is easy to partition a given function into a sequence of linearly ordered subfunctions. In a nonlinear
pipeline, function partitioning is relatively difficult because the stages are interconnected with loops in
addition to streamline connections.
• The output of a linear pipeline is produced from the last stage. The output of a nonlinear pipeline is
not necessarily produced from the last stage.
• The reservation table of a linear pipeline is trivial, in the sense that data flows in a linear streamline.
The reservation table of a nonlinear pipeline is non-trivial, in the sense that there is no single linear
streamline for data flow.
• A static pipeline is specified by a single reservation table. A dynamic pipeline is specified by more
than one reservation table.