0% found this document useful (0 votes)

17 views38 pages

L04-Pipelining

The document discusses the concepts of pipelining in computer architecture, focusing on the Intel P5 and P6 processors. It covers the 'Iron Law' of processor performance, types of hazards in pipelining, and strategies for resolving data and control hazards. Additionally, it provides insights into exception handling, including traps and interrupts, and their historical context in computing.

Uploaded by

husongtao10103426

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

17 views38 pages

L04-Pipelining

Uploaded by

husongtao10103426

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 38

http://inst.eecs.berkeley.

edu/~cs152

CS 152/252A Computer
Architecture and Engineering Sophia Shao
Lecture 4 – Pipelining
Intel P5 and P6
CISC, RISC, or CRISC??

https://en.wikipedia.org/wiki/Pentium_(original) A 0.6 µm BiCMOS Processor With

Dynamic Execution, ISSCC 1995
Last Time in Lecture 2
§ Microcoding, an effective technique to manage control unit
complexity, invented in era when logic (tubes), main memory
(magnetic core), and ROM (diodes) used different
technologies
§ Difference between ROM and RAM speed motivated
additional complex instructions
§ Technology advances leading to fast SRAM made technology
assumptions invalid
§ Complex instructions sets impede parallel and pipelined
implementations

2
“Iron Law” of Processor Performance
Time = Instructions Cycles Time
Program Program * Instruction * Cycle
§ Instructions per program depends on source code,
compiler technology, and ISA
§ Cycles per instructions (CPI) depends on ISA and
µarchitecture
§ Time per cycle depends upon the µarchitecture and base
technology

Microarchitecture CPI cycle time

Microcoded >1 short
Single-cycle unpipelined 1 long
Pipelined 1 short

3
Classic 5-Stage RISC Pipeline
Fetch Decode EXecute Memory Writeback

Imm

Store
Inst. Register Data
Instruction Cache

B
Registers

ALU
PC

Cache

This version designed for regfiles/memories with

synchronous writes and asynchronous read.
4
CPI Examples
Microcoded machine Time
7 cycles 5 cycles 10 cycles
Inst 1 Inst 2 Inst 3

3 instructions, 22 cycles, CPI=7.33

Unpipelined machine
Inst 1 Inst 2 Inst 3
3 instructions, 3 cycles, CPI=1
Pipelined machine
Inst 1
Inst 2 3 instructions, 3 cycles, CPI=1
Inst 3 5-stage pipeline CPI≠5!!!
5
Instructions interact with each other in pipeline
§ An instruction in the pipeline may need a
resource being used by another instruction in the
pipeline à structural hazard
§ An instruction may depend on something
produced by an earlier instruction
– Dependence may be for a data value
à data hazard
– Dependence may be for the next instruction’s address
à control hazard (branches, exceptions)

§ Handling hazards generally introduces bubbles into

pipeline and reduces ideal CPI > 1
6
Pipeline CPI Examples
Measure from when first instruction finishes
Time to when last instruction in sequence finishes.
Inst 1
3 instructions finish in 3 cycles
Inst 2
CPI = 3/3 =1
Inst 3
Inst 1
Inst 2 3 instructions finish in 4 cycles
CPI = 4/3 = 1.33
Bubble
Inst 3
Inst 1
Bubble 1
Inst 2 3 instructions finish in 5cycles
CPI = 5/3 = 1.67
Inst
Bubble
3 2
Inst 3
7
Resolving Structural Hazards
§ Structural hazard occurs when two instructions
need same hardware resource at same time
– Can resolve in hardware by stalling newer instruction till older
instruction finished with resource
§ A structural hazard can always be avoided by
adding more hardware to design
– E.g., if two instructions both need a port to memory at same
time, could avoid hazard by adding second port to memory
§ Classic RISC 5-stage integer pipeline has no
structural hazards by design
– Many RISC implementations have structural hazards on multi-
cycle units such as multipliers, dividers, floating-point units, etc.,
and can have on register writeback ports

8
Types of Data Hazards
Consider executing a sequence of register-register
instructions of type:
rk ← ri op rj
Data-dependence
r3 ← r1 op r2 Read-after-Write
r5 ← r3 op r4 (RAW) hazard
Anti-dependence
r3 ← r1 op r2 Write-after-Read
r1 ← r4 op r5 (WAR) hazard
Output-dependence
r3 ← r1 op r2 Write-after-Write
r3 ← r6 op r7 (WAW) hazard

9
Three Strategies for Data Hazards
§ Interlock
– Wait for hazard to clear by holding dependent
instruction in issue stage
§ Bypass
– Resolve hazard earlier by bypassing value as soon as
available
§ Speculate
– Guess on value, correct if wrong

10
Interlocking Versus Bypassing
add x1, x3, x5
sub x2, x1, x4

F D X M W add x1, x3, x5

F D X M W bubble
Instruction interlocked
F D X M W bubble in decode stage
F D X M W sub x2, x1, x4

F D X M W add x1, x3, x5 Bypass around ALU

F D X M W sub x2, x1, x4 with no bubbles

11
Example Bypass Path
Fetch Decode EXecute Memory Writeback

Imm

Store
Inst. Register Data
Instruction Cache

B
Registers

ALU
PC

Cache

12
Fully Bypassed Data Path
Fetch Decode EXecute Memory Writeback

Imm

Store
Inst. Register Data
Instruction Cache

B
Registers

ALU
PC

Cache

F D X M W
F D X M W
F D X M W
F D X M W
13
Value Speculation for RAW Data Hazards
§ Rather than wait for value, can guess value!

§ So far, only effective in certain limited cases:

– Branch prediction
– Stack pointer updates
– Memory address disambiguation

14
Control Hazards
What do we need to calculate next PC?

§ For Unconditional Jumps

– Opcode, PC, and offset
§ For Jump Register
– Opcode, Register value, and offset
§ For Conditional Branches
– Opcode, Register (for condition), PC and offset
§ For all other instructions
– Opcode and PC ( and have to know it’s not one of above )

15
Control flow information in pipeline
Fetch Decode EXecute Memory Writeback
Branch condition,
Opcode,
PC known Jump register
offset known
value known

Imm

Store
Data
Inst. Register

Instruction Cache
B
Registers

ALU
PC

Cache A

16
RISC-V Unconditional PC-Relative Jumps
PCJumpSel FKill Jump?

PC_decode
[ Kill bit turns
instruction

Add
into a bubble ]
+4

Imm
Kill
Inst. Register
PC_fetch

B
Instruction Registers

ALU
Cache
A

Fetch Decode EXecute

17
Pipelining for Unconditional PC-Relative
Jumps

F D X M W j target

F D X M W bubble

F D X M W target: add x1, x2, x3

18
Branch Delay Slots
§ Early RISCs adopted idea from pipelined microcode
engines, and changed ISA semantics so instruction after
branch/jump is always executed before control flow
change occurs:
0x100 j target
0x104 add x1, x2, x3 // Executed before target
…
0x205 target: xori x1, x1, 7
§ Software has to fill delay slot with useful work, or fill with
explicit NOP instruction

F D X M W j target

F D X M W add x1, x2, x3

F D X M W target: xori x1, x1, 7

19
Post-1990 RISC ISAs don’t have delay slots
§ Encodes microarchitectural detail into ISA
– c.f. IBM 650 drum layout
§ Performance issues
– Increased I-cache misses from NOPs in unused delay slots
– I-cache miss on delay slot causes machine to wait, even if delay
slot is a NOP
§ Complicates more advanced microarchitectures
– Consider 30-stage pipeline with four-instruction-per-cycle issue
§ Better branch prediction reduced need
– Branch prediction in later lecture

20
RISC-V Conditional Branches
PCSel Branch? DKill
FKill Cond?

PC_execute
PC_decode

Add
Add
+4

Kill
Kill

Inst.
Inst. Register
PC_fetch

Instruction

B
Registers
Cache

ALU
A

Fetch Decode EXecute

21
Pipelining for Conditional Branches

F D X M W beq x1, x2, target

F D X M W bubble

F D X M W target: add x1, x2, x3

22
Pipelining for Jump Register
§ Register value obtained in execute stage
F D X M W jr x1

F D X M W bubble

F D X M W target: add x5, x6, x7

23
Why instruction may not be dispatched
every cycle in classic 5-stage pipeline (CPI>1)
§ Full bypassing may be too expensive to implement
– typically all frequently used paths are provided
– some infrequently used bypass paths may increase cycle time
and counteract the benefit of reducing CPI
§ Loads have two-cycle latency
– Instruction after load cannot use load result
– MIPS-I ISA defined load delay slots, a software-visible pipeline
hazard (compiler schedules independent instruction or inserts
NOP to avoid hazard). Removed in MIPS-II (pipeline interlocks
added in hardware)
• MIPS:“Microprocessor without Interlocked Pipeline Stages
§ Jumps/Conditional branches may cause bubbles
– kill following instruction(s) if no delay slots

Machines with software-visible delay slots may execute significant

number of NOP instructions inserted by the compiler.
NOPs reduce CPI, but increase instructions/program!
24
CS152 Administrivia
§ HW1 released
– Due Feb 02
§ Lab1 released
– Due Feb 09
§ Lab group matching
– https://docs.google.com/forms/d/e/1FAIpQLSfxXVEwM6a-pR2-
RU_ntjD1zfskipOWf4e-8eCIjxfOHtTiaA/viewform?usp=sf_link
§ Discussions and OHs start this week.
– Check the course calendar for details.
– Wednesday 10am-12pm discussion dropped.
– Discussions will be recorded.

25
CS252 Administrivia
§ CS252 Readings on
– https://ucb-cs252-sp23.hotcrp.com/u/0/
– Use hotcrp to upload reviews before Wednesday:
• Write one paragraph on main content of paper including good/bad
points of paper
• Also, answer/ask 1-3 questions about paper for discussion
• First two “360 Architecture”, “VAX11-780”
– 2-3pm Wednesday, Soda 606/Zoom
§ CS252 Project Timeline
– Proposal Wed Feb 22
– One page in PDF format including:
• project title
• team members (2 per project)
• what problem are you trying to solve?
• what is your approach?
• infrastructure to be used
• timeline/milestones
26
Traps and Interrupts
In class, we’ll use following terminology
§ Exception: An unusual internal event caused by
program during execution
– E.g., page fault, arithmetic underflow
§ Interrupt: An external event outside of running
program
§ Trap: Forced transfer of control to supervisor
caused by exception or interrupt
– Not all exceptions cause traps (c.f. IEEE 754 floating-point
standard)

27
History of Exception Handling
§ Analytical Engine had overflow exceptions
§ First system with traps was Univac-I, 1951
– Arithmetic overflow would either
• 1. trigger the execution a two-instruction fix-up routine at
address 0, or
• 2. at the programmer's option, cause the computer to stop
– Later Univac 1103, 1955, modified to add external interrupts
• Used to gather real-time wind tunnel data
§ First system with I/O interrupts was DYSEAC, 1954
– Had two program counters, and I/O signal caused switch between
two PCs
– Also, first system with DMA (Direct Memory Access by I/O device)
– And, first mobile computer!

28
DYSEAC, first mobile computer!

• Carried in two tractor trailers, 12 tons + 8 tons

• Built for US Army Signal Corps
[Courtesy Mark Smotherman]
29
Asynchronous Interrupts
§ An I/O device requests attention by asserting one
of the prioritized interrupt request lines

§ When the processor decides to process the

interrupt
– It stops the current program at instruction Ii ,
completing all the instructions up to Ii-1 (precise
interrupt)
– It saves the PC of instruction Ii in a special register (EPC)
– It disables interrupts and transfers control to a
designated interrupt handler running in supervisor
mode

30
Trap:
altering the normal flow of control

Ii-1 HI1

trap
program Ii HI2 handler

Ii+1 HIn

An external or internal event that needs to be processed by another (system)

program. The event is usually unexpected or rare from program’s point of view.

31
Trap Handler
§ Saves EPC before enabling interrupts to allow
nested interrupts Þ
– need an instruction to move EPC into GPRs
– need a way to mask further interrupts at least until EPC can be
saved
§ Needs to read a status register that indicates the
cause of the trap
§ Uses a special indirect jump instruction ERET
(return-from-environment) which
– enables interrupts
– restores the processor to the user mode
– restores hardware status and control state

32
Synchronous Trap
§ A synchronous trap is caused by an exception on
a particular instruction

§ In general, the instruction cannot be completed

and needs to be restarted after the exception has
been handled
– requires undoing the effect of one or more partially
executed instructions

§ In the case of a system call trap, the instruction is

considered to have been completed
– a special jump instruction involving a change to a
privileged mode
33
Exception Handling 5-Stage Pipeline

Inst. Data
PC Mem D Decode E + M Mem W

PC address Illegal Data address

Overflow
Exception Opcode Exceptions

Asynchronous Interrupts

§ How to handle multiple simultaneous exceptions in

different pipeline stages?
§ How and where to handle external asynchronous
interrupts?

34
Exception Handling 5-Stage Pipeline
Commit
Point

Inst. Data
PC Mem D Decode E + M Mem W

Illegal Overflow Data address

PC address
Opcode Exceptions
Exception
Exc Exc Exc

Cause
D E M

PC PC PC

EPC
Select
Handler Kill F D Kill D E Kill E M Asynchronous
PC Stage Stage Stage Interrupts
Kill
Writeback

35
Exception Handling 5-Stage Pipeline
§ Hold exception flags in pipeline until commit
point (M stage)

§ Exceptions in earlier pipe stages override later

exceptions for a given instruction

§ Inject external interrupts at commit point

(override others)

§ If trap at commit: update Cause and EPC registers,

kill all stages, inject handler PC into fetch stage

36
Speculating on Exceptions
§ Prediction mechanism
– Exceptions are rare, so simply predicting no exceptions is very
accurate!
§ Check prediction mechanism
– Exceptions detected at end of instruction execution pipeline,
special hardware for various exception types
§ Recovery mechanism
– Only write architectural state at commit point, so can throw away
partially executed instructions after exception
– Launch exception handler after flushing pipeline

§ Bypassing allows use of uncommitted instruction

results by following instructions

37
Acknowledgements
§ These slides contain material developed and copyright by:
– Arvind (MIT)
– Krste Asanovic (MIT/UCB)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)

§ MIT material derived from course 6.823

§ UCB material derived from course CS252

Thesis: Proposed New Public Market With Transport Terminal and Bagsakan Center in Trece Martires, Cavite
89% (9)
Thesis: Proposed New Public Market With Transport Terminal and Bagsakan Center in Trece Martires, Cavite
75 pages
Continuous Integration (CI) and Continuous Delivery (CD)
100% (1)
Continuous Integration (CI) and Continuous Delivery (CD)
429 pages
L03-Pipelining
No ratings yet
L03-Pipelining
45 pages
Processor Organization & Instruction Cycle
No ratings yet
Processor Organization & Instruction Cycle
31 pages
CO Pipelining PDF notes
No ratings yet
CO Pipelining PDF notes
10 pages
Cse410 10 Pipelining A
No ratings yet
Cse410 10 Pipelining A
7 pages
Embedded Computer Architecture 5SAI0
No ratings yet
Embedded Computer Architecture 5SAI0
59 pages
A4 版本1 （未使用）
No ratings yet
A4 版本1 （未使用）
2 pages
Pipe Lining
No ratings yet
Pipe Lining
66 pages
CS530-Fall2015-Lecture9
No ratings yet
CS530-Fall2015-Lecture9
5 pages
Pipelinehazard For Class
No ratings yet
Pipelinehazard For Class
61 pages
Pipelinehazard 160823134502
No ratings yet
Pipelinehazard 160823134502
61 pages
L04 Pipelining
No ratings yet
L04 Pipelining
48 pages
Module 5_Processor Structure and Function
No ratings yet
Module 5_Processor Structure and Function
74 pages
Chapter_04_processor_2
No ratings yet
Chapter_04_processor_2
28 pages
Lec 1
No ratings yet
Lec 1
30 pages
03 Pipeline
0% (1)
03 Pipeline
38 pages
Pipelining Lecture
No ratings yet
Pipelining Lecture
39 pages
SRM Pipelining 05.Pptx
No ratings yet
SRM Pipelining 05.Pptx
42 pages
Computer Architecture and Organization
No ratings yet
Computer Architecture and Organization
49 pages
Week 11
No ratings yet
Week 11
33 pages
16.482 / 16.561 Computer Architecture and Design: Instructor: Dr. Michael Geiger Fall 2013
No ratings yet
16.482 / 16.561 Computer Architecture and Design: Instructor: Dr. Michael Geiger Fall 2013
42 pages
Reduced Instruction Set Computer (Risc) Complex Instruction Set Computer (Cisc)
No ratings yet
Reduced Instruction Set Computer (Risc) Complex Instruction Set Computer (Cisc)
7 pages
Computer Architecture: Pipelining: Dr. Ashok Kumar Turuk
No ratings yet
Computer Architecture: Pipelining: Dr. Ashok Kumar Turuk
136 pages
Chapter 2 Lecture 4 and 5
No ratings yet
Chapter 2 Lecture 4 and 5
56 pages
William Stallings Computer Organization and Architecture 8 Edition Processor Structure and Function
No ratings yet
William Stallings Computer Organization and Architecture 8 Edition Processor Structure and Function
74 pages
CA Final PDF
No ratings yet
CA Final PDF
13 pages
L05-PipeliningII
No ratings yet
L05-PipeliningII
36 pages
Computer Hardware Lecturer - 4
No ratings yet
Computer Hardware Lecturer - 4
9 pages
05-pipelining
No ratings yet
05-pipelining
37 pages
15IF11 Multicore A PDF
No ratings yet
15IF11 Multicore A PDF
64 pages
Unit - 1 Microprocessor Architecture
No ratings yet
Unit - 1 Microprocessor Architecture
52 pages
Questions With Answers
No ratings yet
Questions With Answers
22 pages
HRY-312 Computer Organization Introduction To Pipelining
No ratings yet
HRY-312 Computer Organization Introduction To Pipelining
30 pages
CA Lecture 12
No ratings yet
CA Lecture 12
48 pages
880317
No ratings yet
880317
74 pages
ACA_All Unit
No ratings yet
ACA_All Unit
31 pages
CAO Fall 2024 Lecture 07 RISC v Pipelined Implementation
No ratings yet
CAO Fall 2024 Lecture 07 RISC v Pipelined Implementation
114 pages
Chapter 4 The Processor
No ratings yet
Chapter 4 The Processor
72 pages
Instruction Scheduling
No ratings yet
Instruction Scheduling
17 pages
12 - Processor Structure and Function
No ratings yet
12 - Processor Structure and Function
73 pages
Unit-V: Performance Enhancement Techinques
No ratings yet
Unit-V: Performance Enhancement Techinques
61 pages
ILP - Appendix C PDF
No ratings yet
ILP - Appendix C PDF
52 pages
Giub 20223 62 17427 2024-06-18T10 47 08
No ratings yet
Giub 20223 62 17427 2024-06-18T10 47 08
32 pages
Lec14 Pipeline Riscv - Key
No ratings yet
Lec14 Pipeline Riscv - Key
58 pages
MIPS
No ratings yet
MIPS
70 pages
Review of Pipelines, Performance, Caches, & Virtual Memory
No ratings yet
Review of Pipelines, Performance, Caches, & Virtual Memory
64 pages
Pipelining2019_(1)[1]
No ratings yet
Pipelining2019_(1)[1]
82 pages
02a ILP Pipeline
No ratings yet
02a ILP Pipeline
40 pages
Ca06 2014 PDF
No ratings yet
Ca06 2014 PDF
53 pages
Computer Architecture 1
No ratings yet
Computer Architecture 1
37 pages
CH12 CPU Structure and Function
No ratings yet
CH12 CPU Structure and Function
44 pages
CCS CMCS 611-101 Advanced Computer Architecture Advanced Computer Architecture
100% (2)
CCS CMCS 611-101 Advanced Computer Architecture Advanced Computer Architecture
24 pages
Risc in Pipe Ine
No ratings yet
Risc in Pipe Ine
39 pages
QB106613
No ratings yet
QB106613
5 pages
Chapter_4
No ratings yet
Chapter_4
78 pages
CH14 COA9e Processor Structure and Function
No ratings yet
CH14 COA9e Processor Structure and Function
40 pages
CH10-Processor Structure and Function
No ratings yet
CH10-Processor Structure and Function
14 pages
Accelerated Computing With HIP: Second Edition
From Everand
Accelerated Computing With HIP: Second Edition
Yifan Sun
No ratings yet
Advanced Backend Code Optimization
From Everand
Advanced Backend Code Optimization
Sid Touati
No ratings yet
Comptia Server+ Primer
From Everand
Comptia Server+ Primer
John Greene
5/5 (1)
Basic Information About C language PDF
From Everand
Basic Information About C language PDF
Suraj Das
No ratings yet
Scheme For Igneous Rock Identification
No ratings yet
Scheme For Igneous Rock Identification
1 page
Reading Progressions
No ratings yet
Reading Progressions
24 pages
Present Simple and Present Continuous 150355
No ratings yet
Present Simple and Present Continuous 150355
92 pages
Brochure - Roaming Certificate
No ratings yet
Brochure - Roaming Certificate
3 pages
Bone Studio Maritime Place Furniture Packages
No ratings yet
Bone Studio Maritime Place Furniture Packages
10 pages
AISC Essential Data's
100% (1)
AISC Essential Data's
1 page
Purple Tuesday Attendance Sheet
No ratings yet
Purple Tuesday Attendance Sheet
7 pages
Final Draft of Neelesh
No ratings yet
Final Draft of Neelesh
5 pages
C500-0055 QSL9 C300D6 OPENSET-Rev00
No ratings yet
C500-0055 QSL9 C300D6 OPENSET-Rev00
2 pages
Testbank for Research Methods for the Behavioral Sciences 3rd Edition Privitera Solution Manual
No ratings yet
Testbank for Research Methods for the Behavioral Sciences 3rd Edition Privitera Solution Manual
19 pages
MUCH ADO Study Guide ACTS 1-5 2009
No ratings yet
MUCH ADO Study Guide ACTS 1-5 2009
7 pages
Zoonoses 20220037
No ratings yet
Zoonoses 20220037
11 pages
Diet Plan Gerd
No ratings yet
Diet Plan Gerd
4 pages
HO - Consignment Arrangements Students
No ratings yet
HO - Consignment Arrangements Students
2 pages
Revaluation Summer24 GS
No ratings yet
Revaluation Summer24 GS
1,539 pages
PSM 2 Dump
No ratings yet
PSM 2 Dump
11 pages
Pricing Maternity & Newborn Shoot - DB ARTWORKS 2024
No ratings yet
Pricing Maternity & Newborn Shoot - DB ARTWORKS 2024
3 pages
Thy Will Be Done+ PDF
No ratings yet
Thy Will Be Done+ PDF
1 page
LOGO Design
No ratings yet
LOGO Design
8 pages
Lexico Grammar 3 Key
100% (1)
Lexico Grammar 3 Key
12 pages
Republican Clubs
No ratings yet
Republican Clubs
16 pages
Bimbo Lickleton - DND (Kaizen)
No ratings yet
Bimbo Lickleton - DND (Kaizen)
3 pages
2 Nouns_CP_ 2025 (2)
No ratings yet
2 Nouns_CP_ 2025 (2)
6 pages
Welcome Letter Volunteer
No ratings yet
Welcome Letter Volunteer
3 pages
FILIPINO CLUB PROPOSAL(MAKABANSA) (1)Buwan ng Wika
No ratings yet
FILIPINO CLUB PROPOSAL(MAKABANSA) (1)Buwan ng Wika
3 pages
Carriculum Vitae: Personal Information Akatew Haile Mebrahtu
100% (1)
Carriculum Vitae: Personal Information Akatew Haile Mebrahtu
3 pages
01 Sonova AR 2022 23 Full Report En
No ratings yet
01 Sonova AR 2022 23 Full Report En
305 pages
ADDRESSING-DECARBONISATION-AT-THE-2024-BANGUN-BANGSA-CONFERENCE
No ratings yet
ADDRESSING-DECARBONISATION-AT-THE-2024-BANGUN-BANGSA-CONFERENCE
4 pages

L04-Pipelining

Uploaded by

L04-Pipelining

Uploaded by

http://inst.eecs.berkeley.

https://en.wikipedia.org/wiki/Pentium_(original) A 0.6 µm BiCMOS Processor With

Microarchitecture CPI cycle time

This version designed for regfiles/memories with

3 instructions, 22 cycles, CPI=7.33

§ Handling hazards generally introduces bubbles into

F D X M W add x1, x3, x5

F D X M W add x1, x3, x5 Bypass around ALU

§ So far, only effective in certain limited cases:

§ For Unconditional Jumps

Fetch Decode EXecute

F D X M W target: add x1, x2, x3

F D X M W add x1, x2, x3

F D X M W target: xori x1, x1, 7

Fetch Decode EXecute

F D X M W beq x1, x2, target

F D X M W target: add x1, x2, x3

F D X M W target: add x5, x6, x7

Machines with software-visible delay slots may execute significant

• Carried in two tractor trailers, 12 tons + 8 tons

§ When the processor decides to process the

An external or internal event that needs to be processed by another (system)

§ In general, the instruction cannot be completed

§ In the case of a system call trap, the instruction is

PC address Illegal Data address

§ How to handle multiple simultaneous exceptions in

Illegal Overflow Data address

§ Exceptions in earlier pipe stages override later

§ Inject external interrupts at commit point

§ If trap at commit: update Cause and EPC registers,

§ Bypassing allows use of uncommitted instruction

§ MIT material derived from course 6.823

You might also like