Pipelining Became a Universal Technique in 1985

Pipelining became a universal technique in 1985 by overlapping instruction execution and exploiting instruction level parallelism. There are two main approaches to pipelining - hardware-based dynamic approaches used in servers and desktops, and compiler-based static approaches used for scientific applications. Exploiting instruction level parallelism aims to maximize instructions per cycle by minimizing pipeline stalls from structural hazards, data hazards, and control hazards. Loop unrolling, dynamic scheduling, register renaming, and branch prediction are some techniques used to reduce stalls and improve parallelism.


Introduction

• Pipelining became a universal technique in 1985


– Overlaps execution of instructions
– Exploits “Instruction Level Parallelism”

• There are two main approaches:


– Hardware-based dynamic approaches
• Used in server and desktop processors
• Not used as extensively in PMD (personal mobile device) processors
– Compiler-based static approaches
• Not as successful outside of scientific applications
Instruction-Level Parallelism
• When exploiting instruction-level parallelism, the goal is to minimize CPI (cycles per instruction):

Pipeline CPI = Ideal pipeline CPI + Structural stalls + Data hazard stalls + Control stalls

• Ideal pipeline CPI is a measure of the maximum performance attainable by the implementation.
• The techniques that follow aim to decrease the overall pipeline CPI.

Techniques for Improving ILP

– Loop unrolling
– Basic pipeline scheduling
– Dynamic scheduling: scoreboarding, register renaming
– Dynamic memory disambiguation
– Dynamic branch prediction
– Multiple instruction issue per cycle
– Software and hardware techniques

Loop-Level Parallelism

• Basic block: straight-line code without branches
– Fraction of branches: 0.15 to 0.25

• ILP within a basic block is limited!
– Average basic-block size is 6–7 instructions
– These instructions may be dependent on each other
• LLP
– Easily unroll a loop statically or dynamically
– Can use SIMD (vector processors and GPUs)

ILP
• ILP is increased by exploiting parallelism among the iterations of a loop
Ex:
for (i = 0; i <= 999; i = i + 1)
    x[i] = x[i] + y[i];
• The loop iterations can execute in parallel
• Techniques are used to convert LLP into ILP

Hazards & Stalls
Structural Hazards
– Cause: resource contention
– Solution: add more resources and better scheduling

Data Hazards
– Cause: dependences
• True data dependence: a property of the program (RAW)
• Name dependence: reuse of registers (WAR & WAW)
– Solution: loop unrolling, dynamic scheduling, register renaming, hardware speculation

Control Hazards
– Cause: branch instructions, changes of program flow
– Solution: loop unrolling, branch prediction, hardware speculation
1. Data Dependence
• Loop-Level Parallelism
– Unroll loop statically or dynamically
– Use SIMD (vector processors and GPUs)

• Challenges:
– Data dependence
• Instruction j is data dependent on instruction i if
– instruction i produces a result that may be used by instruction j, or
– instruction j is data dependent on instruction k, and instruction k is data dependent on instruction i (a chain of dependences)

• Dependent instructions cannot be executed simultaneously
Data Dependence
• Dependences are a property of programs
• Stalls are a property of the pipeline
• The pipeline organization determines whether a dependence is detected and whether it causes a stall

• A data dependence conveys:

– the possibility of a hazard
– the order in which results must be calculated
– an upper bound on exploitable ILP

• Dependences that flow through memory locations are difficult to detect
• Data Hazards
– Read after write (RAW)
– Write after write (WAW)
– Write after read (WAR)

• Two possibilities:
- Maintain dependence, but avoid stalls
- Eliminate dependence by code transformation
Example
[Figure: a five-instruction sequence annotated with its data dependences, shown separately for floating-point data (instructions 1–3) and for integer data]
2. Name Dependence
• Two instructions use the same name (register or memory location), but there is no flow of information between them

– Antidependence (WAR): instruction j writes a register or memory location that instruction i reads
• The initial ordering (i before j) must be preserved
– Output dependence (WAW): instructions i and j write the same register or memory location
• Ordering must be preserved

• To resolve, use renaming techniques


3. Control Dependence

Determines the ordering of an instruction i with respect to a branch instruction

• An instruction that is control dependent on a branch cannot be moved before the branch, so that its execution is no longer controlled by the branch

• An instruction that is not control dependent on a branch cannot be moved after the branch, so that its execution is controlled by the branch
Control Dependence
An example:
T1;
if (p1) {
    S1;
}
if (p2) {
    S2;
}
● Statement S1 is control dependent on p1, but T1 is not
● Statement S2 is control dependent on p2, but not on p1

● What this means for execution:

– S1 cannot be moved before p1
– T1 cannot be moved inside p2
Examples

Example 1:
DADDU R2,R3,R4
BEQZ  R2,L1
LW    R1,0(R2)
L1:
• Moving the load instruction before the branch may cause a memory protection exception
• The control dependence must be preserved

Example 2:
DADDU R1,R2,R3
BEQZ  R4,L
DSUBU R1,R1,R6
L:    …
OR    R7,R1,R8
• The OR instruction is data dependent on DADDU and DSUBU
• The data flow must be preserved

Example 3:
DADDU R1,R2,R3
BEQZ  R12,skip
DSUBU R4,R5,R6
DADDU R5,R4,R9
skip: OR R7,R8,R9
• Violating the control dependence here affects neither data flow nor exceptions
• Assume R4 isn't used after skip: it is possible to move DSUBU before the branch
