CA UNIT-2
Instruction classes:
Operation:
Instruction memory: After the instruction is fetched, the register operands required by the
instruction are specified by fields of that instruction.
Register operand: Once the register operands have been fetched, they can be operated on for
the three classes of instruction: to compute a memory address (load or store), an arithmetic
result (arithmetic-logical instruction), or a comparison (branch).
Multiplexor: The data going to a particular unit may come from two different sources. These
data lines cannot simply be wired together; we must add a device that selects among the
multiple sources and steers one of them to the destination. Such a device is called a
multiplexor (many inputs, single output).
The control lines are set based on information taken from the instruction being executed.
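The selection a multiplexor performs can be sketched in a few lines of Python. This is only an illustration: the function name mux2 and the example values are assumptions, not part of the notes.

```python
def mux2(select: int, input0: int, input1: int) -> int:
    """Two-input multiplexor: pass input1 when select is asserted (1), else input0."""
    return input1 if select else input0


# Hypothetical example: choosing the second ALU operand.
register_value = 7
sign_extended_immediate = 100
alu_src = 1  # control line derived from the instruction being executed
alu_operand_b = mux2(alu_src, register_value, sign_extended_immediate)
```

With alu_src set to 0 the same call would pass the register value through instead.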
Control unit:
The control unit, which has the instruction as an input, is used to determine the control
signals for the functional units and two of the multiplexors.
The input to the control unit is the 6-bit opcode field from the instruction.
The outputs of the control unit consist of:
Three 1-bit signals used to control multiplexors.
Three signals for controlling reads and writes in the register file and data memory.
A 1-bit control signal used in determining whether to branch.
A 2-bit control signal for the ALU.
1. Combinational Element
2. State Element
Combinational Element:
An element that operates on data values, such as an AND gate or an ALU. Its outputs
depend only on the current inputs.
State Element:
Clocking Methodology:
It is used to determine when the data is valid and stable relative to the clock.
It specifies the timing of reads and writes.
A clocking methodology is designed to ensure predictability.
Any values stored in a sequential logic element are updated only on a clock edge.
2. BUILDING DATA PATH:
In the MIPS implementation, the datapath elements include the instruction and data memories,
the register file, the ALU, and adders.
Instruction Memory: A memory unit that stores the instructions of a program and supplies
an instruction given an address.
Program counter: The PC is a register containing the address of the next instruction in the
program.
Data segments:
R- format instruction:
R-format instructions have three register operands: two source operands and one destination
operand.
They include add, sub, AND, OR, and slt.
Example: OR $t1, $t2, $t3
Read registers:
To read data words, we need to specify the register number to the register file.
Register write:
The general forms of the load word and store word instructions in the MIPS processor are:
lw $t1, offset_value($t2)
sw $t1, offset_value($t2)
These instructions compute a memory address by adding the base register ($t2) to the
16-bit signed offset field contained in the instruction.
Store value: read from the Register File, written to the Data Memory.
Load value: read from the Data Memory, written to the Register File.
The data memory has read and write control signals, an address input, and an input for the
data to be written into memory.
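The address calculation and the two data movements can be sketched as follows. The dictionaries standing in for the register file and data memory, and the helper names, are assumptions made for illustration; only the rule "base register plus sign-extended 16-bit offset" comes from the instruction definition.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base: int, offset16: int) -> int:
    """Base register contents plus the sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


# Hypothetical machine state for illustration only.
registers = {"$t1": 0, "$t2": 0x1000}
data_memory = {0x1008: 42}


def lw(rt: str, offset16: int, base_reg: str) -> None:
    """Load: read from the data memory, write to the register file."""
    address = effective_address(registers[base_reg], offset16)
    registers[rt] = data_memory.get(address, 0)


def sw(rt: str, offset16: int, base_reg: str) -> None:
    """Store: read from the register file, write to the data memory."""
    address = effective_address(registers[base_reg], offset16)
    data_memory[address] = registers[rt]


lw("$t1", 8, "$t2")       # $t1 <- Memory[0x1000 + 8]
sw("$t1", 0xFFFC, "$t2")  # offset 0xFFFC is -4, so Memory[0x0FFC] <- $t1
```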
If the condition is true, the branch target address becomes the new PC and the branch is
taken.
If the condition is false, the incremented PC replaces the current PC and the branch is
not taken.
1. Compute the branch target address: the branch datapath includes a sign extension
unit, a shifter, and an adder.
2. Compare the register contents: this uses the register file and the ALU.
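The two branch operations can be sketched together. The sketch assumes the usual MIPS convention that the sign-extended offset is shifted left by 2 and added to PC + 4; the function names are illustrative.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, offset16: int) -> int:
    """Sign-extend the offset, shift it left by 2, and add it to PC + 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def next_pc_beq(pc: int, rs_value: int, rt_value: int, offset16: int) -> int:
    """The ALU subtracts the operands; a Zero result means the branch is taken."""
    zero = (rs_value - rt_value) == 0
    return branch_target(pc, offset16) if zero else (pc + 4) & 0xFFFFFFFF


taken_pc = next_pc_beq(0x40, 5, 5, 3)      # equal registers: branch taken
not_taken_pc = next_pc_beq(0x40, 5, 6, 3)  # unequal registers: fall through
```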
Replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted
left by 2 bits.
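A minimal sketch of this jump address calculation follows. It assumes the upper 4 bits are taken from PC + 4, as in the usual MIPS description; the function name is illustrative.

```python
def jump_target(pc_plus_4: int, instruction: int) -> int:
    """Keep the upper 4 bits of PC + 4 and replace the lower 28 bits with
    the 26-bit address field shifted left by 2."""
    address_field = instruction & 0x03FFFFFF
    return (pc_plus_4 & 0xF0000000) | (address_field << 2)


# Hypothetical j instruction: opcode 000010 with address field 0x100.
j_instruction = (0b000010 << 26) | 0x100
target = jump_target(0x10000004, j_instruction)
```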
Having examined the datapath components for the individual instruction classes, we combine
them into a single datapath and add the control to complete the implementation.
The simplest datapath attempts to execute every instruction in one clock cycle
(fetch, decode, and execute each instruction in one clock cycle).
No datapath resource can be used more than once per instruction.
Multiplexors are needed at the inputs of shared elements, with control lines to do the
selection.
Write signals control writing to the Register File and Data Memory.
The cycle time is determined by the length of the longest path.
Problem:
How do we build a datapath for the operational portion of the memory-reference and
arithmetic-logical instructions that uses a single register file and a single ALU to handle
both types of instructions, adding any necessary multiplexors?
Answer: To create a datapath with only a single register file and a single ALU, we use two
multiplexors. One is placed at the ALU input and another at the data input to the register
file.
Show how to build a datapath for arithmetic-logical, memory-reference, and branch
instructions.
We can combine all the pieces to make a simple datapath for the MIPS architecture by
adding the datapath for instruction fetch, the datapath for R-format and memory
instructions, and the datapath for branches.
The ALU defines 6 combinations of its 4 control inputs. Depending on the instruction class,
the ALU will need to perform one of these functions.
Load word and store word instructions: use the ALU to compute the memory address by
addition.
codes:
The 4-bit ALU control input is generated by a small control unit whose inputs are the
function field of the instruction and a 2-bit control field (ALUOp).
Truth table: A representation of a logical operation that lists all the values of the inputs
and the corresponding outputs.
The truth table shows how the 4-bit ALU control is set depending on these 2 input fields.
A don't-care term (X) indicates that the output does not depend on the value of the input
corresponding to that column.
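The mapping from the 2-bit control field and the function field to the 4-bit ALU control can be sketched as a lookup. The encodings below are the usual textbook MIPS values and are stated here as an assumption; the sketch matches the 2-bit field exactly instead of modelling the don't-care entries.

```python
# funct field -> 4-bit ALU control, used only when the 2-bit field is 10.
ALU_CONTROL_BY_FUNCT = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Return the 4-bit ALU control for a 2-bit ALUOp and a 6-bit funct field."""
    if alu_op == 0b00:  # lw / sw: add to form the memory address
        return 0b0010
    if alu_op == 0b01:  # beq: subtract to compare the registers
        return 0b0110
    if alu_op == 0b10:  # R-format: the funct field decides
        return ALU_CONTROL_BY_FUNCT[funct & 0x3F]
    raise ValueError("unsupported ALUOp value")
```

For loads, stores and branches the funct argument is ignored, which is what the don't-care entries in the truth table express.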
The input to the control unit is the 6-bit opcode field from the instruction.
The outputs of the control unit consist of:
Three 1-bit signals used to control multiplexors (RegDst, ALUSrc, and MemtoReg).
Three signals for controlling reads and writes in the register file and data memory
(RegWrite, MemRead, and MemWrite).
A 1-bit control signal used in determining whether to branch (Branch).
A 2-bit control signal for the ALU (ALUOp).
An AND gate combines the Branch control signal and the Zero output from the ALU.
The AND gate output controls the selection of the next PC.
When the control is asserted, the multiplexor selects the input labeled 1.
When the control is deasserted, the multiplexor selects the input labeled 0.
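The control outputs can be tabulated per opcode. The sketch below assumes the usual single-cycle MIPS settings for R-format (opcode 0), lw (35), sw (43) and beq (4); the Control tuple, its field names and the pc_src helper are chosen here for illustration, and None marks a don't-care.

```python
from typing import NamedTuple, Optional


class Control(NamedTuple):
    reg_dst: Optional[int]
    alu_src: int
    mem_to_reg: Optional[int]
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int


MAIN_CONTROL = {
    0:  Control(1,    0, 0,    1, 0, 0, 0, 0b10),  # R-format
    35: Control(0,    1, 1,    1, 1, 0, 0, 0b00),  # lw
    43: Control(None, 1, None, 0, 0, 1, 0, 0b00),  # sw
    4:  Control(None, 0, None, 0, 0, 0, 1, 0b01),  # beq
}


def pc_src(branch: int, zero: int) -> int:
    """AND gate: select the branch target only for a branch whose ALU result is zero."""
    return branch & zero


beq_controls = MAIN_CONTROL[4]
use_branch_target = pc_src(beq_controls.branch, zero=1)
```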
The formats of the 3 instruction classes help us understand how to connect the fields of an
instruction to the datapath.
R-format instruction:
0 rs rt rd shamt funct
31:26 25:21 20:16 15:11 10:6 5:0
Example: add $s1,$s2,$s3
Load or store instruction (rs is the base register):
35 or 43 rs rt address
31:26 25:21 20:16 15:0
Example: lw $s1,100($s2)
Example: sw $s1,100($s2)
Important observation about this instruction format:
Bits 15:0 give the 16-bit offset for branch equal, load, and store.
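Extracting the fields from a 32-bit instruction word can be sketched with shifts and masks. The bit positions are the standard MIPS ones; the decode function and its dictionary keys are illustrative names.

```python
def decode(instruction: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,   # bits 31:26
        "rs": (instruction >> 21) & 0x1F,       # bits 25:21
        "rt": (instruction >> 16) & 0x1F,       # bits 20:16
        "rd": (instruction >> 11) & 0x1F,       # bits 15:11 (R-format only)
        "shamt": (instruction >> 6) & 0x1F,     # bits 10:6  (R-format only)
        "funct": instruction & 0x3F,            # bits 5:0   (R-format only)
        "immediate": instruction & 0xFFFF,      # bits 15:0  (lw, sw, beq)
    }


# lw $s1,100($s2): opcode 35, rs = $s2 (register 18), rt = $s1 (register 17).
lw_word = (35 << 26) | (18 << 21) | (17 << 16) | 100
lw_fields = decode(lw_word)
```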
This figure shows the additions plus the ALU control block, the write signal for state
elements, the read signal for the data memory and the control signals for the multiplexor.
instruction:
beq $t1,$t2,offset
Implementing Jumps:
000010 address
31:26 25:0
4. PIPELINING: MAY/JUNE 2016, NOV/DEC 2014 (16 MARKS)
WHAT IS PIPELINING AND DISCUSS ABOUT PIPELINED DATAPATH AND CONTROL
Example:
Assume the time for the stages is 100 ps for register read or write and 200 ps for the
other stages.
Compare the pipelined datapath with the single-cycle datapath.
Solution:
[Figure: execution timing of the single-cycle and pipelined designs. Horizontal axis:
Time (ps), 0 to 1900. Rows: instructions 1 and 2. Each instruction passes through Fetch
Instruction, Decode/Read Reg, Execute ALU, Memory Read/Write, and Write Reg.]
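If all stages are balanced (i.e., all take the same time), the time between instructions in
the pipelined design equals the time between instructions in the non-pipelined design
divided by the number of stages.
If the stages are not balanced, the speedup is less. The speedup is due to increased
throughput; the latency (time for each instruction) does not decrease.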
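The comparison can be worked through numerically. The sketch assumes the stage times from the example (100 ps for register read or write, 200 ps otherwise), a single-cycle clock long enough for all five stages, and a pipelined clock set by the slowest stage; the names are illustrative.

```python
STAGE_TIMES_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

single_cycle_period = sum(STAGE_TIMES_PS.values())  # 800 ps: one long cycle
pipelined_period = max(STAGE_TIMES_PS.values())     # 200 ps: slowest stage


def single_cycle_time(instruction_count: int) -> int:
    """Each instruction occupies one full single-cycle clock period."""
    return instruction_count * single_cycle_period


def pipelined_time(instruction_count: int) -> int:
    """The first instruction fills the pipeline; each later one adds a cycle."""
    stages = len(STAGE_TIMES_PS)
    return (stages + instruction_count - 1) * pipelined_period


three_single = single_cycle_time(3)   # 2400 ps
three_pipelined = pipelined_time(3)   # 1400 ps
long_run_speedup = single_cycle_time(1_000_000) / pipelined_time(1_000_000)
```

Because the stages are not balanced, the long-run speedup approaches 800 / 200 = 4 instead of the stage count of 5, and a single instruction still takes 1000 ps in the pipeline compared with 800 ps in the single-cycle design.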
The datapath and control unit share similarities with both the single-cycle and multicycle
implementations that we already saw.
PIPELINING DATAPATH:
The single-cycle datapath is shown with the pipeline stages identified. The division of an
instruction into 5 stages means a five-stage pipeline, and the five stages are as follows:
IF: instruction fetch
ID: instruction decode and register file read
EXE: execution or address calculation
MEM: data memory access
WB: write back
The instructions and data move generally from left to right through the five stages.
The register between the IF and ID stages separates instruction fetch and decode.
The register between ID and EXE separates decode and ALU execution.
The register between EXE and MEM separates ALU execution and data memory access.
The register between MEM and WB separates data memory access and writing data to the
register file.
[Figure: multiple-clock-cycle pipeline diagram. Horizontal axis: Time (cycles), 1 to 10.
The instructions lw $s2, 40($0) and or $s7, $t3, $t4 each pass through IM, RF, the ALU (+),
DM, and RF.]
IM represents the instruction memory and the PC in the instruction fetch stage (IF).
Reg stands for the register file in the instruction decode/register file read stage (ID).
Instruction fetch: The instruction is read from instruction memory using the address in the
PC and then stored in the IF/ID pipeline register.
The PC address is incremented by 4 and then written back into the PC to be ready for the
next clock cycle.
The incremented address is also saved in the IF/ID pipeline register.
Instruction decode: The IF/ID pipeline register supplies the 16-bit immediate field, which
is sign-extended, and the register numbers used to read the registers. These values are
stored in the ID/EX pipeline register.
Execute: The load instruction reads the contents of register 1 and the sign-extended
immediate from the ID/EX pipeline register and adds them using the ALU.
The sum is placed in the EX/MEM pipeline register.
Memory access: The load instruction reads the data memory using the address from the
EX/MEM pipeline register and loads the data into the MEM/WB pipeline register.
Write back: The data is read from the MEM/WB pipeline register and written into the
register file.
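The passage of a load through the pipeline registers can be sketched step by step. This is a simplified model under stated assumptions: the instruction is stored pre-decoded as a dictionary, the decode step is included even though only the other steps are described above, and all names are illustrative.

```python
def run_lw(pc, instruction_memory, registers, data_memory):
    """Step one lw through IF, ID, EX, MEM and WB; return the incremented PC."""
    # IF: fetch the instruction and save PC + 4 in the IF/ID register.
    if_id = {"instruction": instruction_memory[pc], "pc_plus_4": pc + 4}

    # ID: read the base register and pass the immediate on in ID/EX.
    fields = if_id["instruction"]
    id_ex = {
        "read_data_1": registers[fields["rs"]],
        "immediate": fields["offset"],
        "rt": fields["rt"],
    }

    # EX: the ALU adds the register contents and the immediate.
    ex_mem = {"alu_result": id_ex["read_data_1"] + id_ex["immediate"],
              "rt": id_ex["rt"]}

    # MEM: read the data memory at the computed address.
    mem_wb = {"read_data": data_memory[ex_mem["alu_result"]], "rt": ex_mem["rt"]}

    # WB: write the loaded value into the register file.
    registers[mem_wb["rt"]] = mem_wb["read_data"]
    return if_id["pc_plus_4"]


# lw $s2, 40($0): $s2 is register 18 and $0 is register 0.
instruction_memory = {0: {"op": "lw", "rs": 0, "rt": 18, "offset": 40}}
registers = [0] * 32
data_memory = {40: 99}  # hypothetical memory contents
next_pc = run_lw(0, instruction_memory, registers, data_memory)
```

Note that the destination register number travels with the instruction through every pipeline register, so that the write-back step updates the correct register.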
PIPELINED CONTROL:
Pipelined control simply adds control to the pipelined datapath.
The pipelined datapath borrows the control logic for PC source, register destination number,
and ALU control from the single-cycle design.
The control signals are generated in the same way as in the single-cycle processor: after an
instruction is fetched, the processor decodes it and produces the appropriate control values.
As before, some of the control signals will not be needed until a later stage
and clock cycle.
These signals must be propagated through the pipeline until they reach the appropriate
stage. We can pass them along in the pipeline registers, together with the other data.
Control signals can be categorized by the pipeline stage that uses them:
WB: RegWrite, MemtoReg
Stage 1: Instruction fetch:
The control signals to read instruction memory and to write the PC are always asserted.
Stage 2: Instruction decode/register file read:
As in the previous stage, the same thing happens at every clock cycle, so there are
no optional control lines to set.
Stage 4: Memory access:
The control lines set in this stage are Branch, MemRead, and MemWrite.
These signals are set by the branch equal, load, and store instructions, respectively.
PCSrc selects the next sequential address unless control asserts Branch and the ALU
result was 0.
Stage 5: Write back:
The two control lines are MemtoReg, which decides between sending the ALU result
or the memory value to the register file, and RegWrite, which writes the chosen
value.
5. PIPELINE HAZARDS:
MAY/JUNE 2016, APR/MAY 2015, NOV/DEC 2014 (16 MARKS)
Structural hazards: two or more instructions attempt to use the same resource
at the same time.
Data hazards: when either the source or destination operands of an instruction are not
available at the time expected in the pipeline, the pipeline is stalled. This situation
is a data hazard.
Consider the sequence:
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
The last four instructions are all dependent on the result in register $2 of the first
instruction.
If register $2 had the value 10 before the subtract instruction and -20 afterwards, the
programmer intends that -20 will be used in the following instructions that refer to
register $2.
Data hazards occur when the pipeline changes the order of read/write accesses to operands
so that it differs from the normal sequential order.
This diagram illustrates the execution of these instructions using a multiple clock cycle
pipeline representation.
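The two pairs of hazard conditions are:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
The first hazard in the sequence is on register $2, between the result of sub $2,$1,$3
and the first read operand of and $12,$2,$5. This hazard can be detected when the and
instruction is in the EX stage and the prior instruction is in the MEM stage, so this is
hazard 1a:
EX/MEM.RegisterRd = ID/EX.RegisterRs = $2.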
The multiplexors have been expanded to add the forwarding paths, and the forwarding unit
is shown.
This is the hardware necessary to support forwarding for operations that use results during
the EX stage.
Note that the EX/MEM.RegisterRd field is the register destination for either an ALU
instruction (which comes from the Rd field of the instruction) or a load (which comes from
the Rt field).
Forwarding can also help with hazards when store instructions are dependent on other
instructions. Since they use just one data value during the MEM stage, forwarding is easy.
Let's now write both the conditions for detecting hazards and the control signals to resolve
them:
EX hazard:
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
MEM hazard:
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
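The forwarding conditions can be sketched as a small function. The field names RegWrite, RegisterRd, RegisterRs and RegisterRt follow the usual textbook naming and are assumed here; the dictionaries standing in for pipeline registers are illustrative. The sketch also assumes that the EX hazard takes priority over the MEM hazard, since the EX/MEM result is the more recent one.

```python
def forwarding_unit(ex_mem: dict, mem_wb: dict, id_ex: dict) -> tuple:
    """Return (ForwardA, ForwardB) as 2-bit values: 00 none, 10 EX/MEM, 01 MEM/WB."""
    forward_a = forward_b = 0b00

    # MEM hazard: the result is two instructions ahead, in MEM/WB.
    if mem_wb["RegWrite"] and mem_wb["RegisterRd"] != 0:
        if mem_wb["RegisterRd"] == id_ex["RegisterRs"]:
            forward_a = 0b01
        if mem_wb["RegisterRd"] == id_ex["RegisterRt"]:
            forward_b = 0b01

    # EX hazard: checked last so that the more recent result wins.
    if ex_mem["RegWrite"] and ex_mem["RegisterRd"] != 0:
        if ex_mem["RegisterRd"] == id_ex["RegisterRs"]:
            forward_a = 0b10
        if ex_mem["RegisterRd"] == id_ex["RegisterRt"]:
            forward_b = 0b10

    return forward_a, forward_b


# sub $2,$1,$3 is in MEM while and $12,$2,$5 is in EX.
ex_mem = {"RegWrite": 1, "RegisterRd": 2}
mem_wb = {"RegWrite": 0, "RegisterRd": 0}
id_ex = {"RegisterRs": 2, "RegisterRt": 5}
forward_a, forward_b = forwarding_unit(ex_mem, mem_wb, id_ex)
```

The test against register 0 prevents forwarding a value "written" to $0, which must always read as zero.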
• So far, we've only addressed "potential" data hazards, where the forwarding unit
was able to detect and resolve them without affecting the performance of the
pipeline.
• There are also "unavoidable" data hazards, which the forwarding unit cannot
resolve, and whose resolution does affect pipeline performance.
• We thus add an (unavoidable) hazard detection unit, which detects them and
introduces stalls to resolve them.
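A stall inserted into the pipeline is also called a bubble.
For a load instruction, the data is still being read from memory in clock cycle 4 while the
ALU is performing the operation for the following instruction. The pipeline must therefore
stall for the combination of a load followed by an instruction that reads its result.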
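The check a hazard detection unit makes for this load-use case can be sketched as follows. The condition is the usual textbook one (a load in EX whose destination matches a source register of the instruction in ID); the function and parameter names are assumptions.

```python
def must_stall(id_ex_mem_read: int, id_ex_rt: int, if_id_rs: int, if_id_rt: int) -> bool:
    """Stall when the instruction in EX is a load and the instruction in ID
    reads the register that the load will write."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


# Hypothetical pair: lw $2,20($1) followed by and $4,$2,$5.
stall_needed = must_stall(id_ex_mem_read=1, id_ex_rt=2, if_id_rs=2, if_id_rt=5)
```

When the function returns True, the pipeline inserts one bubble; after that cycle the loaded value can be forwarded from the MEM/WB register.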
One way to improve branch performance is to reduce the cost of the taken branch.
1. Taken: If a branch changes the PC to its target address, then it is a taken branch.
PC <= PC + 4 + Immediate
2. Not taken: If a branch doesn't change the PC to its target address, then it is a not
taken branch.
PC <= PC + 4
The branch instruction decides whether to branch in the MEM stage, which is clock cycle 4
for the beq instruction.
The 3 sequential instructions that follow the branch will be fetched and begin execution.
Flush: To discard instructions in a pipeline, usually due to an unexpected event.
Handling control branch:
A branch predictor is a digital circuit that tries to guess which way a branch
(e.g., an if-then-else structure) will go before this is known for sure.
The purpose of the branch predictor is to improve the flow in the instruction pipeline.
The behavior of a branch can be predicted both statically at compile time and
dynamically by the hardware at execution time.
Branch-prediction buffer:
The memory contains a bit that says whether the branch was recently taken or not.
2. Two-bit prediction scheme:
With a single prediction bit, even a branch that is almost always taken is predicted
incorrectly twice, rather than once, each time it is not taken. In a two-bit scheme, a
prediction must be wrong twice before it is changed.
In a two-bit prediction scheme, two bits are used to encode the four states in the system.
One bit predicts the direction of the current branch if the previous branch
was not taken (PNT).
One bit predicts the direction of the current branch if the previous branch was
taken (PT).
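One common way to realize a two-bit scheme is a saturating counter, sketched below alongside a one-bit predictor for comparison. This is an illustrative model, not the PNT/PT encoding described above: the class name, the starting state, and the loop-branch pattern (taken 9 times, then not taken, repeated twice) are all assumptions.

```python
class SaturatingPredictor:
    """n-bit saturating counter: the upper half of the states predicts taken."""

    def __init__(self, bits: int = 2):
        self.max_state = (1 << bits) - 1
        self.threshold = 1 << (bits - 1)
        self.state = self.max_state  # start at "strongly taken" (arbitrary choice)

    def predict(self) -> bool:
        return self.state >= self.threshold

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(self.max_state, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(predictor: SaturatingPredictor, outcomes: list) -> int:
    """Predict each branch outcome in turn, then train on the actual outcome."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


loop_branch = ([True] * 9 + [False]) * 2
one_bit_misses = count_mispredictions(SaturatingPredictor(bits=1), loop_branch)
two_bit_misses = count_mispredictions(SaturatingPredictor(bits=2), loop_branch)
```

On this pattern the one-bit predictor misses 3 times (each loop exit, plus the first iteration of the second pass), while the two-bit counter misses only at the 2 loop exits.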
The top box in each pair shows the code before scheduling; the bottom box shows the
scheduled code.
In (a), the delay slot is scheduled with an independent instruction from before the branch. This
is the best choice. Strategies (b) and (c) are used when (a) is not possible.
In the code sequences for (b) and (c), the use of $s1 in the branch condition prevents the add
instruction (whose destination is $s1) from being moved into the branch delay slot.
In (b) the branch delay slot is scheduled from the target of the branch; usually the target
instruction will need to be copied because it can be reached by another path.
Strategy (b) is preferred when the branch is taken with high probability, such as a loop
branch.
Finally, the branch may be scheduled from the not-taken fall-through as in (c).
To make this optimization legal for (b) or (c), it must be OK to execute the sub instruction
when the branch goes in the unexpected direction.
By "OK" we mean that the work is wasted, but the program will still execute correctly.
6. EXCEPTIONS:
Types of exception:
Responses to an exception:
When an exception occurs, the processor saves the address of the offending instruction in
the Exception Program Counter (EPC) and transfers control to the OS at some specified
address.
The OS takes the appropriate action, which may involve providing some service to the user
program, taking some predefined action in response to the exception, or stopping the
execution of the program and reporting an error.
To handle the exception, the OS must know the reason for the exception.
Status register method: the MIPS architecture uses a status register (the Cause register),
which holds a field that indicates the reason for the exception.
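The processor's response can be sketched as three state updates. The handler address and the reason label below are placeholders chosen for illustration, not values taken from these notes.

```python
HANDLER_ADDRESS = 0x80000180  # assumed OS entry point, for illustration only


def take_exception(state: dict, offending_pc: int, reason: str) -> dict:
    """Save the offending address, record the reason, and jump to the OS handler."""
    state["EPC"] = offending_pc    # address of the offending instruction
    state["Cause"] = reason        # field indicating the reason for the exception
    state["PC"] = HANDLER_ADDRESS  # transfer control to the OS
    return state


cpu_state = {"PC": 0x00400024, "EPC": 0, "Cause": None}
take_exception(cpu_state, offending_pc=0x00400020, reason="arithmetic overflow")
```

After handling the exception, the OS can use the saved EPC to decide whether to resume the program or terminate it.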
Interrupts or exceptions in pipelined computers that are not associated with the exact
instruction that was the cause of the interrupt or exception are called imprecise
interrupts or imprecise exceptions.