MIPS Instruction Execution and Control
7 Processor
Syllabus
Instruction Execution - Building a Data Path - Designing a Control Unit - Hardwired Control,
Microprogrammed Control - Pipelining - Data Hazards - Control Hazards.
Contents
7.1 Instruction Execution  Dec.-15, 18, May-19, Marks 16
7.2 Basic MIPS Implementation  Dec.-14, May-15, Marks 8
7.3 Building a Data Path
7.4 Designing a Control Unit  Dec.-14, 18, Marks 8
7.5 Hardwired Control
7.6 Microprogrammed Control
7.7 Comparison Between Hardwired and Microprogrammed Control Units  Dec.-18, Marks 6
7.8 Pipelining  May-07, 12, 13, 15, 16, 17, 19, Dec.-06, 08, 09, 14, 15, 16, 18, June-09, Marks 6
7.11 Handling Control Hazards  May-08, 09, 10, 11, 13, 14, 18, Dec.-07, 08, 16, 18, Marks 6
Digital Principles and Computer Organization 7-2
Processor
7.1 Instruction Execution
Let us see how an instruction is executed. The complete instruction cycle involves
three operations : Instruction fetching, opcode decoding and instruction execution.
Fig. 7.1.1 shows the basic instruction cycle. After each instruction cycle, the central
processing unit checks for any valid interrupt request. If there is one, the central
processing unit fetches the instructions from the interrupt service routine and, after
completion of the interrupt service routine, starts the new instruction cycle
from where it was interrupted.
Fig. 7.1.2 shows instruction cycle with interrupt cycle.
[Fig. 7.1.1 Basic instruction cycle: START, fetch the next instruction (fetch cycle), decode instruction (decode cycle), STOP]
[Fig. 7.1.2 Basic instruction cycle with interrupt: as Fig. 7.1.1, with a "process interrupts" step taken on Yes]
Instruction fetch cycle : In this cycle, the instruction is fetched from the memory
location whose address is in the PC. This instruction is placed in the Instruction Register
(IR) in the processor.
Instruction decode cycle : In this cycle, the opcode of the instruction stored in the
instruction register is decoded/examined to determine which operation is to be
performed.
Instruction execution cycle :
In this cycle, the specified operation is performed by the processor. This often
involves fetching operands from the memory or from processor registers, performing an
arithmetic or logical operation and storing the result in the destination location. During
the instruction execution, PC contents are incremented to point to the next instruction.
After completion of execution of the current instruction, the PC contains the address of
the next instruction and a new instruction fetch cycle can begin.
TECHNICAL PUBLICATIONS - an up-thrust for knowledge
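The three cycles described above can be sketched as a small loop. This is only an illustration: the tuple instruction format and the opcode names ADD, SUB and HALT are hypothetical and are not part of MIPS or of the text.

```python
# Toy machine. Assumption: each instruction is a tuple (opcode, dest, src1, src2).
def run(program, registers):
    pc = 0                                   # program counter: index of the next instruction
    while pc < len(program):
        ir = program[pc]                     # fetch: copy the instruction at PC into the IR
        pc += 1                              # PC now points to the next instruction
        opcode, dest, src1, src2 = ir        # decode: examine the opcode and operand fields
        if opcode == "ADD":                  # execute: perform the specified operation
            registers[dest] = registers[src1] + registers[src2]
        elif opcode == "SUB":
            registers[dest] = registers[src1] - registers[src2]
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return registers


regs = run(
    [("ADD", "r3", "r1", "r2"), ("SUB", "r4", "r3", "r1"), ("HALT", None, None, None)],
    {"r1": 5, "r2": 7, "r3": 0, "r4": 0},
)
```

Interrupt checking is left out of the sketch; in the cycle of Fig. 7.1.2 it would be one more test at the end of each loop iteration.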
Review Question
[Fig. 7.2.1 Major functional units and interconnections between them for implementation of MIPS subset: program counter, adders, instruction memory, register file, ALU and data memory]
Operation
The program counter gives the instruction address to the instruction memory.
After the instruction is fetched, the register operands required by an instruction
are specified by fields of that instruction.
Once the register operands have been fetched, they can be used to compute a
memory address (for a load or store), to compute an arithmetic result (for an
integer arithmetic-logical instruction), or a compare (for a branch).
If the instruction is an arithmetic-logical instruction, the result from the ALU must
be written to a register.
If the operation is a load or store, the ALU result is used as an address to either
store a value from the registers or load a value from memory into the registers.
The result from the ALU or memory is written back into the register file.
Branches require the use of the ALU output to determine the next instruction
address, which comes either from the ALU (where the PC and branch offset are
summed) or from an adder that increments the current PC by 4.
Fig. 7.2.1 shows that data going to a particular unit is coming from two different
sources. For example, the value written into the PC can come from one of two
adders, the data written into the register file can come from either the ALU or the
data memory, and the second input to the ALU can come from a register or the
immediate field of the instruction. The selection of the appropriate source is done
using a multiplexer (data selector). The multiplexer selects from among several
inputs based on the setting of its control lines. The control lines are set based
primarily on information taken from the instruction being executed. This is
illustrated in Fig. 7.2.2.
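The role of the multiplexer can be sketched as a small selection function. The function and parameter names below (mux2, next_pc, alu_second_input, alu_src) are illustrative assumptions, not signal names taken from the figures.

```python
def mux2(select, input0, input1):
    """2-to-1 multiplexer: control line 0 selects input0, control line 1 selects input1."""
    return input1 if select else input0


def next_pc(pc, branch_offset_bytes, branch_taken):
    """The value written into the PC comes from one of two adders."""
    pc_plus_4 = pc + 4                               # adder that increments the PC by 4
    branch_target = pc_plus_4 + branch_offset_bytes  # adder that sums PC + 4 and the branch offset
    return mux2(branch_taken, pc_plus_4, branch_target)


def alu_second_input(alu_src, register_value, immediate):
    """The second ALU input comes from a register or from the immediate field."""
    return mux2(alu_src, register_value, immediate)
```

For example, next_pc(100, 16, 0) selects the sequential address 104, while next_pc(100, 16, 1) selects the branch target 120.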
[Fig. 7.2.2 (figure not reproduced): program counter, adders, instruction memory, register file, ALU, data memory and multiplexers, with control lines Branch, ALU operation, Mem R, Mem W and Reg W, and the ALU Zero output]
AU :Dec.-14, May-15
7.3 Building a Data Path
As shown in Fig. 7.3.2, the MIPS implementation includes the datapath
elements (a unit used to operate on or hold data within a processor) such as
the instruction and data memories, the register file, the ALU, and adders.
Fig. 7.3.1 shows the combination of the three elements (instruction
memory, program counter and adder) from Fig. 7.3.2 to form a datapath that
fetches instructions and increments the PC to obtain the address of the next
sequential instruction.
The instruction memory stores the instructions of a program and gives the
instruction as an output corresponding to the address specified by the
program counter. The adder is used to increment the PC by 4 to the address
of the next instruction.
[Fig. 7.3.1 Data path to fetch instruction and increment PC]
Since the instruction memory only reads, the output at any time reflects the
contents of the location specified by the address input, and no read control
signal is needed.
The program counter is a 32-bit register that is written at the end of every clock
cycle and thus does not need a write control signal.
The adder always adds its two 32-bit inputs and places the sum on its output.
Some instructions read two registers, perform an
ALU operation on the contents of the registers, and write the result to a register.
We call these instructions R-type instructions. This instruction class includes
add, sub, AND, OR, and slt. For example, OR $t1, $t2, $t3 reads $t2 and $t3,
performs logical OR operation and saves the result in $t1.
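The R-type behaviour described above can be sketched in code, assuming the standard MIPS R-format field layout (6-bit opcode, 5-bit rs, rt, rd, shamt and 6-bit funct) and the standard funct codes. The helper names are hypothetical, and overflow traps for add and sub are ignored in this sketch.

```python
# Standard MIPS funct codes for the R-type instructions named in the text.
FUNCT_NAMES = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or", 0x2A: "slt"}


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def decode_r_type(word):
    """Split a 32-bit R-format instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


def execute_r_type(word, regs):
    """Read rs and rt, perform the ALU operation, write the result to rd."""
    fields = decode_r_type(word)
    a = regs[fields["rs"]]
    b = regs[fields["rt"]]
    name = FUNCT_NAMES[fields["funct"]]
    if name == "add":
        result = a + b
    elif name == "sub":
        result = a - b
    elif name == "and":
        result = a & b
    elif name == "or":
        result = a | b
    else:  # slt: set on less than, using a signed comparison
        result = 1 if to_signed32(a) < to_signed32(b) else 0
    regs[fields["rd"]] = result & 0xFFFFFFFF
    return name


# OR $t1, $t2, $t3 with $t1 = register 9, $t2 = register 10, $t3 = register 11.
OR_T1_T2_T3 = 0x014B4825
regs = [0] * 32
regs[10] = 0b1100
regs[11] = 0b1010
executed = execute_r_type(OR_T1_T2_T3, regs)
```

After the call, register 9 ($t1) holds 0b1110, the bitwise OR of the two source registers.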
The processor's 32 general-purpose registers are stored in a structure called a
register file. A register file is a collection of registers in which any register can be
read or written by specifying the number of the register in the file. The register
file contains the register state of the computer.
Fig. 7.3.2 shows the multiport register file (two read ports and one write port) and the
ALU, with its ALU operation control input and Zero output.
[...] to be written into memory. Fig. 7.3.3 shows these two elements.
[Fig. 7.3.3 Data memory unit and the sign extension unit]
Sign extension is implemented by replicating the high-order sign bit of the
original data item in the high-order bits of the larger, destination data item.
Therefore, two units needed to implement loads and stores, in addition to the
register file and ALU of Fig. 7.3.2, are the data memory unit and the sign
extension unit.
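Sign extension as described above can be sketched for the 16-bit to 32-bit case. The function name is an illustrative assumption.

```python
def sign_extend_16_to_32(value16):
    """Replicate bit 15 (the sign bit) into the upper 16 bits of a 32-bit result."""
    value16 &= 0xFFFF                      # keep only the original 16-bit data item
    sign_bit = (value16 >> 15) & 1         # high-order bit of the original item
    upper = 0xFFFF0000 if sign_bit else 0  # sixteen copies of the sign bit
    return upper | value16
```

A positive value such as 0x0005 is unchanged (0x00000005), while 0xFFFB, which is -5 in 16 bits, becomes 0xFFFFFFFB, which is -5 in 32 bits.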
[...] operands are equal or not. If two operands are not equal the next
instruction is the instruction that follows sequentially (PC = PC + 4);
in this case, we say that the branch is not taken. On the other
hand, if two operands are equal (i.e., the branch condition is true),
the branch target address becomes the new PC, and we say that the
branch is taken.
Thus, the branch datapath must perform two operations : compute the
branch target address and compare the register contents.
[Fig. 7.3.4 Computation of branch target address: an adder sums PC + 4 and the offset shifted left by 2 bits]
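The two branch operations can be sketched together. This assumes a branch-on-equal instruction with a 16-bit signed word offset that is sign-extended and shifted left by 2 bits before being added to PC + 4; the function names are hypothetical.

```python
def to_signed16(value):
    """Interpret a 16-bit field as a two's complement signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_next_pc(pc, rs_value, rt_value, offset16):
    """Return (next PC, taken) for a branch-on-equal instruction."""
    pc_plus_4 = pc + 4
    # Branch target: PC + 4 plus the sign-extended offset shifted left by 2 bits.
    target = (pc_plus_4 + (to_signed16(offset16) << 2)) & 0xFFFFFFFF
    taken = rs_value == rt_value           # compare the register contents
    return (target if taken else pc_plus_4), taken
```

With pc = 0x1000 and an offset of 3, equal operands give the target 0x1010 (branch taken), and unequal operands give 0x1004 (branch not taken).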
Fig. 7.3.5 shows the structure of the datapath segment that handles branches.
[Fig. 7.3.5: datapath segment for branches, with program counter, PC + 4 adder and branch target address]
[...] condition is true, a delayed branch first executes the instruction
immediately following the branch in sequential instruction order
before jumping to the specified branch target address.
To share a datapath element between two different instruction classes, we have
connected multiple connections to the input of an element and used a multiplexer
and control signal to select among the multiple inputs.
7.7 Comparison Between Hardwired and Microprogrammed Control Units
Attribute | Hardwired Control | Microprogrammed Control
Speed | Fast | Slow
Control functions | Implemented in hardware | Implemented in software
Flexibility | Not flexible; to accommodate new system specifications or new instructions, redesign is required. | More flexible, to accommodate new system specifications or new instructions.
Review Question
Clock cycle    1    2    3    4    5    6    7
I1             F1   D1   E1   S1
I2                  F2   D2   E2   S2
I3                       F3   D3   E3   S3
I4                            F4   D4   E4   S4
Fig. 7.8.1 Four stage instruction pipelining
In this instruction pipelining four instructions are in progress at any given time.
This means that four distinct hardware units are needed, as shown in Fig. 7.8.2.
These units are implemented such that they are capable of performing their tasks
simultaneously and without interfering with one another. Information from one
stage is passed to the next stage with the help of buffers.
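The overlap shown in Fig. 7.8.1 can be reproduced with a short sketch, assuming an ideal pipeline in which every stage takes one clock cycle and there are no stalls. The function names are illustrative.

```python
STAGES = ["F", "D", "E", "S"]  # fetch, decode, execute, store


def pipeline_schedule(num_instructions, stages=STAGES):
    """Map each instruction number to the clock cycle in which it occupies each stage."""
    schedule = {}
    for i in range(1, num_instructions + 1):
        # Instruction i enters the first stage in cycle i and advances one stage per cycle.
        schedule[i] = {stage: i + s for s, stage in enumerate(stages)}
    return schedule


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles to finish n instructions on a k-stage ideal pipeline: k + n - 1."""
    return num_stages + num_instructions - 1


def in_progress(schedule, cycle):
    """Instructions that occupy some stage during the given clock cycle."""
    return sorted(i for i, stages in schedule.items() if cycle in stages.values())


schedule = pipeline_schedule(4)
```

For four instructions the last store completes in cycle 7, and in cycle 4 all four instructions are in progress at once, one in each hardware unit.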
[Fig. 7.8.2 Hardware organization for four-stage instruction pipeline: F (fetch instruction), D (decode instruction and fetch operands), E (execution operation) and S (store result), separated by interstage buffers B1, B2 and B3]
7.8.1 Pipeline Stages in the MIPS Instructions
1. Fetch instruction from memory.
2. Read registers while decoding the instruction. The regular format of MIPS
instructions allows reading and decoding to occur simultaneously.
3. Execute the operation or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.
[Fig. 7.8.4 Effect on pipeline of an execution operation taking more than one clock cycle: timing diagram of instructions I1 to I5 through stages F, D, E and S]
Types of Hazards
1. Structural hazards : These hazards are because of conflicts due to insufficient
resources, when even with all possible combinations it may not be possible to
overlap the operation.
2. Data or data dependent hazards : These result when an instruction in the pipeline
depends on the result of previous instructions which are still in the pipeline and not
completed.
3. Instruction or control hazards : They arise while pipelining branch and other
instructions that change the contents of the program counter. The simplest way to handle
these hazards is to stall the pipeline. Stalling of the pipeline allows a few instructions
to proceed to completion while stopping the execution of those which
result in hazards.
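A data hazard of the kind described in item 2 can be detected by comparing register names. The sketch below is only illustrative: the tuple format (dest, src1, src2) and the window parameter, which stands for how many following instructions still overlap with the producer in the pipeline, are assumptions, and later overwrites of the same register are not considered.

```python
def find_data_hazards(instructions, window=2):
    """Return (producer index, consumer index, register) for each dependence.

    Each instruction is a hypothetical tuple (dest, src1, src2). A hazard is
    reported when an instruction reads a register that one of the previous
    `window` instructions writes, since that result may not be complete yet.
    """
    hazards = []
    for i, (dest, _, _) in enumerate(instructions):
        if dest is None:
            continue
        last = min(i + 1 + window, len(instructions))
        for j in range(i + 1, last):
            _, src1, src2 = instructions[j]
            if dest in (src1, src2):
                hazards.append((i, j, dest))
    return hazards


program = [
    ("$t0", "$t1", "$t2"),  # writes $t0
    ("$t3", "$t0", "$t4"),  # reads $t0 before the previous result is complete
    ("$t5", "$t6", "$t7"),  # independent of both
]
hazards = find_data_hazards(program)
```

Here the only hazard found is between instruction 0 and instruction 1 on register $t0; stalling the second instruction, as the text describes, is the simplest way to resolve it.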
7.8.4 Structural Hazards
The performance of a pipelined processor depends on whether the functional
units are pipelined and whether there are multiple execution units to allow all
possible combinations of instructions in the pipeline. If for some combination the
pipeline has to be stalled to avoid resource conflicts, then there is a structural hazard.
In other words, we can say that when two instructions require the use of a given
hardware resource at the same time, a structural hazard occurs.
The most common case in which this hazard may arise is in access to memory.
One instruction may need to access memory for storage of the result while
another [...]
[Fig. 7.8.5 Effect of branching in two-stage pipelining: I2 is a branch, the fetched instruction I3 is discarded (X) and the execution unit is idle for one cycle before Ik and Ik+1 are fetched and executed]
[...] fetched instruction I3 is discarded and instruction Ik is fetched. During this time
the execution unit is idle and the pipeline is stalled for one clock cycle.
Branch Penalty : The time lost as a result of a branch instruction is often referred to as
the branch penalty.
Factors affecting branch penalty
1. It is more for complex instructions.
2. For a longer pipeline, branch penalty is more.
In case of longer pipelines, the branch penalty can be reduced by computing the
branch address earlier in the pipeline.
14. Explain the hazards caused by unconditional branching statements.
AU : May-16, 19
7.9 Pipelined Datapath and Control
Fig. 7.9.1 shows the general structure of a multistage pipeline. As shown in
Fig. 7.9.1, the pipeline processor usually consists of a sequence of m data
processing circuits, called elements, stages or segments.
[Fig. 7.9.1 Structure of pipeline processor: data in, stages S1 to Sm, each with an input register (R1 to Rm) and a processing circuit (C1 to Cm), data out, all driven by the control unit]
These stages collectively perform a single operation on a stream of data operands
passing through them. The processing is done part by part in each stage, but the
final result is obtained only after an operand set has passed through the entire
pipeline.
Each stage consists of two major blocks : Multiword input register and datapath
circuit.
The multiword input registers R hold partially processed results as they move
through the pipeline and they also serve as buffers that prevent neighbouring
stages from interfering with one another. In each clock period the individual
stage processes its data and transfers its results to the next stage.
The m-stage pipeline processor shown in Fig. 7.9.1 can simultaneously process up
to m independent sets of data operands. Thus when the pipeline is full [...]
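[Figure (caption not recovered): two-stage pipelined microprogram control unit. Stage S1: microprogram counter (µPC) and control memory (CM). Stage S2: microinstruction register µIR, decoders and next address logic, with external conditions and branch address]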
[...] microinstruction addresses, and the control memory CM, which stores the
microinstructions.
The execution stage S2 consists of the microinstruction register µIR, the decoders that
extract control signals from the microinstructions in µIR, and the logic for
determining the next address or the branch address.
The microinstruction register acts as a buffer register for stage S2. With these two
stages it is possible that, while instruction Ij with address Aj is being executed by
stage S2, the instruction Ij+1 with the next consecutive address Aj+1 is fetched
from memory by stage S1. If on executing Ij in S2 it is determined that a branch
must be made to a nonconsecutive address, then the prefetched instruction Ij+1 in
S1 has to be discarded. In such cases the branch address is obtained directly from µIR
itself and fed back to S1. The branch address is then loaded into µPC and the next
instruction is fetched from the branch address.
[Fig. 7.9.3: four-stage pipeline with I-cache, D-cache and register file (RF). S1 : IF (Instruction fetch), S2 : OL (Operand load), S3 : EX (ALU operation), S4 : OS (Operand store)]
The four stages S1 : S4 shown in Fig. 7.9.3 perform the following functions :
S1 : IF : Instruction fetching and decoding using the I-cache.
S2 : OL : Operand loading from the D-cache to register file.
S3 : EX : Data processing using the ALU and register file.
S4 : OS : Operand storing to the D-cache from register file.
Stages S2 and S4 implement memory load and store operations, respectively.
Stages S2, S3 and S4 share the CPU's local Register File (RF). The registers in the
register file act as interstage buffer registers.
Stage S3 implements data transfer and data processing operations of the
register to register type using the ALU of the CPU.
7.9.3 Implementation of MIPS Instruction Pipeline
Fig. 7.9.4 shows the single-cycle datapath with the pipeline stages. The instruction
execution is divided into five stages, which means a five-stage pipeline.
[Fig. 7.9.4 Single-cycle datapath with the pipeline stages, separated by the pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB]
[Fig. 7.10.1 Operand forwarding in a pipelined processor. (b) Forwarding path in the processor pipeline: source registers, execute (ALU) stage, result register and store result in register file, with a forwarding path from the result register back to the ALU inputs]
The data forwarding mechanism is indicated by dashed lines.
The two multiplexers select the data for the ALU either from the destination bus or from
the source 1 and source 2 registers.
When the forwarding logic detects a data dependency, it forwards the ALU output
available in the result register using the data forwarding path to the ALU for the next
operation. Hence the execution of the next (dependent) instruction proceeds without
interruption.
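The selection made by the two multiplexers can be sketched as follows. This is a simplified illustration with hypothetical names: it models a single result register holding the previous ALU output, not the full forwarding unit of a five-stage MIPS pipeline.

```python
def forward_operand(src_reg, regfile, result_reg, result_value):
    """Multiplexer in front of one ALU input.

    If the previous instruction's destination (result_reg) matches this
    source register, take the value from the result register (forwarding
    path); otherwise read the source register from the register file.
    """
    if result_reg is not None and result_reg == src_reg:
        return result_value
    return regfile[src_reg]


def alu_inputs(src1, src2, regfile, result_reg, result_value):
    """Apply the same selection independently to both ALU inputs."""
    return (
        forward_operand(src1, regfile, result_reg, result_value),
        forward_operand(src2, regfile, result_reg, result_value),
    )


# The previous instruction computed r3 = 3, but r3 is not yet written back.
regfile = {"r1": 1, "r2": 2, "r3": 0}
dependent = alu_inputs("r3", "r2", regfile, "r3", 3)
independent = alu_inputs("r1", "r2", regfile, "r3", 3)
```

The dependent instruction receives the forwarded value 3 instead of the stale register file contents 0, so it can proceed without a stall.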
Fig. 7.10.2 shows the hardware necessary to support forwarding for operations that
use results during the EX stage.
[Fig. 7.10.2 Modified datapath to resolve hazards via forwarding: pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB, with a forwarding unit that receives the Rs and Rt fields and the destination register Rd held in EX/MEM and MEM/WB]
Compared to the datapath shown in Fig. 7.10.7, the multiplexers are added to provide
inputs to the ALU along with the forwarding unit.