COA Module 5 BEC306C
MODULE 5
BASIC PROCESSING UNIT
Instruction Set Processor (ISP) – executes machine instructions and coordinates the activities of other
units.
It is also called the Central Processing Unit (CPU).
A typical computing task consists of a series of steps specified by a sequence of machine instructions that
constitute a program.
An instruction is executed by carrying out a sequence of more rudimentary operations.
Some fundamental concepts
Processor fetches one instruction at a time and performs the operation specified.
Instructions are fetched from successive memory locations until a branch or a jump instruction is
encountered.
Processor keeps track of the address of the memory location containing the next instruction to be
fetched using Program Counter (PC).
After fetching an instruction, the contents of the PC are updated to point to the next instruction in
the sequence.
A branch instruction may load a different value into the PC.
When an instruction is fetched, it is placed in the instruction register, IR, from where it is
interpreted, or decoded, by the processor’s control circuitry.
The IR holds the instruction until its execution is completed.
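The fetch behaviour described above can be sketched as a small Python model. This is only an illustration: the memory contents, the instruction strings, and the 4-byte instruction size are assumptions made for the example, not details taken from the notes.

```python
# Hypothetical memory image: byte address -> instruction text.
memory = {0: "Add (R3), R1", 4: "Branch 46", 8: "Move R1, R4"}

class Processor:
    def __init__(self):
        self.pc = 0      # Program Counter: address of the next instruction
        self.ir = None   # Instruction Register: instruction being executed

    def fetch(self):
        """Load the instruction at PC into IR, then point PC at the next one."""
        self.ir = memory[self.pc]
        self.pc += 4     # assumes every instruction occupies 4 bytes
        return self.ir

cpu = Processor()
first = cpu.fetch()
second = cpu.fetch()
```

After two fetches the PC holds 8, the address of the third instruction, while IR still holds the second instruction.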
MDR (Memory Data Register) has two inputs and two outputs.
Data may be loaded into MDR either from the memory bus or from the internal processor bus.
The data stored in MDR may be placed on either bus.
The input of MAR (Memory Address Register) is from the internal bus, and its output is
connected to the external address bus.
The control lines of the memory bus are connected to the instruction decoder and control logic
block.
This unit is responsible for issuing the signals that control the operation of all the units inside the
processor and for interacting with the memory bus.
The number and use of the processor registers R0 through R(n - 1) vary considerably from one
processor to another.
Registers may be provided for general-purpose use by the programmer.
Some may be dedicated as special-purpose registers, such as index registers or stack pointers.
The registers, Y, Z, and TEMP are used by the processor for temporary storage during execution of
some instructions.
These registers are never used for storing data generated by one instruction for later use by
another instruction.
The multiplexer MUX selects either the output of register Y or a constant value 4 to be provided
as input A of the ALU.
The constant 4 is used to increment the contents of the program counter.
We will refer to the two possible values of the MUX control input Select as Select4 and SelectY
for selecting the constant 4 or register Y, respectively.
As instruction execution progresses, data are transferred from one register to another, often
passing through the ALU to perform some arithmetic or logic operation.
The instruction decoder and control logic unit is responsible for implementing the actions
specified by the instruction loaded in the IR register.
The decoder generates the control signals needed to select the registers involved and direct the
transfer of data.
The registers, the ALU, and the interconnecting bus are collectively referred to as the datapath.
1. REGISTER TRANSFERS
Instruction execution involves a sequence of steps in which data are transferred from one register
to another.
For each register, two control signals are used to place the contents of that register on the bus or to
load the data on the bus into the register.
This is represented symbolically in Figure 5.2.
The input and output of register Ri are connected to the bus via switches controlled by the signals
Riin and Riout respectively.
When Riin is set to 1, the data on the bus are loaded into Ri.
Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus.
While Riout is equal to 0, the bus can be used for transferring data from other registers.
Suppose that we wish to transfer the contents of register R1 to register R4. (MOV R1, R4)
This can be accomplished as follows:
Enable the output of register R1 by setting R1out to 1. This places the contents of R1 on the
processor bus.
Enable the input of register R4 by setting R4in to 1. This loads data from the processor bus into
register R4.
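The two-signal transfer above can be modelled in a few lines of Python. This is a rough sketch that treats the bus as a single value per clock cycle; the function name and the dictionary of registers are illustrative assumptions.

```python
def clock_cycle(registers, out_signal, in_signals):
    """One clock cycle on a single bus: one register drives, others may load."""
    bus = registers[out_signal]      # e.g. R1out = 1 places R1 on the bus
    for name in in_signals:          # e.g. R4in = 1 loads the bus into R4
        registers[name] = bus
    return bus

regs = {"R1": 25, "R4": 0}
clock_cycle(regs, out_signal="R1", in_signals=["R4"])
```

Only one register may drive the bus in a cycle, which is why the sketch accepts a single output signal but a list of input signals.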
All operations and data transfers within the processor take place within time periods defined by
the processor clock.
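When Riin is equal to 1, the multiplexer at the input of each register flip-flop selects the data on
the bus, and this data is loaded into the flip-flop at the rising edge of the clock.
When Riin is equal to 0, the multiplexer feeds back the value currently stored in the flip-flop.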
The Q output of the flip-flop is connected to the bus via a tri-state gate.
When Riout is equal to 0, the gate's output is in the high-impedance (electrically disconnected) state.
This corresponds to the open-circuit state of a switch.
When Riout = 1, the gate drives the bus to 0 or 1, depending on the value of Q.
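One bit of such a register can be sketched as follows. The class and method names are hypothetical, and Python's None stands in for the high-impedance state of the tri-state gate.

```python
class RegisterBit:
    """One bit of register Ri: input multiplexer, D flip-flop, tri-state gate."""

    def __init__(self):
        self.q = 0

    def rising_edge(self, bus_value, r_in):
        # The multiplexer passes the bus when Riin = 1,
        # otherwise it feeds back the stored value.
        d = bus_value if r_in else self.q
        self.q = d

    def drive(self, r_out):
        # Tri-state gate: None models the high-impedance (disconnected) state.
        return self.q if r_out else None

bit = RegisterBit()
bit.rising_edge(bus_value=1, r_in=1)   # load a 1 from the bus
bit.rising_edge(bus_value=0, r_in=0)   # Riin = 0: the stored value is held
```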
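Consider adding the contents of registers R1 and R2 and placing the sum in R3, which takes three
control steps.
The signals whose names are given in any step are activated for the duration of the clock cycle
corresponding to that step. (All other signals are inactive.)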
In step 1, the output of register R1 and the input of register Y are enabled, causing the contents of
R1 to be transferred over the bus to Y.
In step 2, the multiplexer's Select signal is set to SelectY, causing the multiplexer to gate the
contents of register Y to input A of the ALU.
At the same time, the contents of register R2 are gated onto the bus and, hence, to input B.
The Add line is set to 1, causing the output of the ALU to be the sum of the two numbers at inputs
A and B.
This sum is loaded into register Z because its input control signal is activated.
In step 3, the contents of register Z are transferred to the destination register, R3.
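The three steps can be traced with a short Python sketch. The signal names in the comments follow the description above; the register values are arbitrary example data.

```python
def run_add(regs):
    """Three control steps that compute R3 <- R1 + R2 on a single bus."""
    # Step 1: R1out, Yin
    regs["Y"] = regs["R1"]
    # Step 2: R2out, SelectY, Add, Zin
    bus = regs["R2"]               # R2 is on the bus, which feeds ALU input B
    a_input = regs["Y"]            # MUX set to SelectY feeds ALU input A
    regs["Z"] = a_input + bus
    # Step 3: Zout, R3in
    regs["R3"] = regs["Z"]
    return regs

regs = run_add({"R1": 7, "R2": 5, "R3": 0, "Y": 0, "Z": 0})
```

Registers Y and Z are needed because the single bus can carry only one operand per cycle.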
Figure 7.6 gives the sequence of control steps required to execute the instruction Add (R3), R1, which
adds the memory operand addressed by R3 to register R1, for the single-bus architecture of Figure 7.1.
Steps 1 through 3 constitute the instruction fetch phase.
This is the same for all instructions.
The instruction decoding circuit interprets the contents of the IR at the beginning of step 4.
This enables the control circuitry to activate the control signals for steps 4 through 7, which
constitute the execution phase.
The contents of register R3 are transferred to MAR in step 4 and a memory read operation is
initiated.
Then the contents of R1 are transferred to register Y in step 5, to prepare for the addition operation.
When the read operation is completed, the memory operand is available in MDR and the addition
operation is performed in step 6.
(The contents of MDR are gated onto the bus and thus also to the B input of the ALU, and register
Y is selected as the second input to the ALU by choosing SelectY.)
The sum is stored in Z and then transferred to R1 in step 7.
End causes a new instruction fetch cycle to begin by returning to step 1.
(The updated PC value is stored in register Y in step 2. This is useful for branch instructions.)
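The seven steps can be followed in a small simulation. The signals listed for steps 1 to 3 are not spelled out in these notes, so the ones shown are an assumption based on the usual single-bus fetch sequence. The memory contents and register values are example data, and the wait for memory (WMFC) is modelled as an immediate read.

```python
def execute_add_indirect(regs, memory):
    """Control steps for Add (R3), R1, assuming 4-byte instructions."""
    # Step 1: PCout, MARin, Read, Select4, Add, Zin
    regs["MAR"] = regs["PC"]
    regs["Z"] = regs["PC"] + 4
    # Step 2: Zout, PCin, Yin, WMFC
    regs["PC"] = regs["Y"] = regs["Z"]
    regs["MDR"] = memory[regs["MAR"]]      # memory read completes
    # Step 3: MDRout, IRin
    regs["IR"] = regs["MDR"]
    # Step 4: R3out, MARin, Read
    regs["MAR"] = regs["R3"]
    # Step 5: R1out, Yin, WMFC
    regs["Y"] = regs["R1"]
    regs["MDR"] = memory[regs["MAR"]]      # operand arrives in MDR
    # Step 6: MDRout, SelectY, Add, Zin
    regs["Z"] = regs["Y"] + regs["MDR"]
    # Step 7: Zout, R1in, End
    regs["R1"] = regs["Z"]
    return regs

memory = {100: "Add (R3), R1", 500: 30}
regs = {"PC": 100, "R1": 12, "R3": 500, "MAR": 0, "MDR": 0,
        "IR": None, "Y": 0, "Z": 0}
execute_add_indirect(regs, memory)
```

With R1 = 12 and the operand 30 at address 500, R1 ends up holding 42 and the PC has advanced to 104.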
Figure 7.7 gives a control sequence that implements an unconditional branch instruction.
Processing starts with the fetch phase.
This phase ends when the instruction is loaded into the IR in step 3.
The offset value is extracted from the IR by the instruction decoding circuit, which will also
perform sign extension if required.
Since the value of the updated PC is already available in register Y, the offset X is gated onto the
bus in step 4, and an addition operation is performed.
The result, which is the branch target address, is loaded into the PC in step 5.
The offset X is usually the difference between the branch target address and the address
immediately following the branch instruction.
For example, if the branch instruction is at location 2000 and if the branch target address is 2050,
the value of X must be 46.
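The offset arithmetic in this example can be checked with a short calculation. The sketch assumes 4-byte instructions, consistent with the constant 4 used to increment the PC; the function name is illustrative.

```python
INSTRUCTION_SIZE = 4  # bytes; assumed, matching the PC increment of 4

def branch_offset(branch_address, target_address):
    """Offset X is measured from the updated PC, not from the branch itself."""
    updated_pc = branch_address + INSTRUCTION_SIZE
    return target_address - updated_pc

offset = branch_offset(2000, 2050)
```

The updated PC is 2004, so X = 2050 - 2004 = 46.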
All general-purpose registers are combined into a single block called the register file.
It is implemented in the form of an array of memory cells.
The register file in Figure 7.8 is said to have three ports.
There are two outputs, allowing the contents of two different registers to be accessed
simultaneously and have their contents placed on buses A and B.
The third port allows the data on bus C to be loaded into a third register during the same clock
cycle.
Buses A and B are used to transfer the source operands to the A and B inputs of the ALU, where
an arithmetic or logic operation may be performed.
The result is transferred to the destination over bus C.
If needed, the ALU may simply pass one of its two input operands unmodified to bus C.
We will call the ALU control signals for such an operation R=A or R=B.
The Incrementer unit is used to increment the PC by 4.
Using the Incrementer eliminates the need to add 4 to the PC using the main ALU.
The source for the constant 4 at the ALU input multiplexer is still useful.
It can be used to increment other addresses, such as the memory addresses in LoadMultiple and
StoreMultiple instructions.
In step 1, the contents of the PC are passed through the ALU, using the R=B control signal, and
loaded into the MAR to start a memory read operation.
At the same time the PC is incremented by 4.
In step 2, the processor waits for MFC and loads the data received into MDR, then transfers them
to IR in step 3.
Finally, the execution phase of the instruction requires only one control step to complete, step 4.
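A single clock cycle of the three-bus datapath can be sketched as below. The registers R4, R5, and R6 and their values are hypothetical, chosen only to show a three-operand addition completing in one step; the wait for memory and the transfer from MDR to IR are omitted.

```python
def three_bus_step(regs, src_a, src_b, dest, op):
    """One cycle: two register reads (buses A, B), ALU, one write (bus C)."""
    bus_a, bus_b = regs[src_a], regs[src_b]
    if op == "Add":
        bus_c = bus_a + bus_b
    elif op == "R=A":
        bus_c = bus_a              # pass input A through unmodified
    elif op == "R=B":
        bus_c = bus_b              # pass input B through unmodified
    else:
        raise ValueError(op)
    regs[dest] = bus_c

regs = {"R4": 3, "R5": 9, "R6": 0, "PC": 200, "MAR": 0}
three_bus_step(regs, "R4", "PC", "MAR", "R=B")   # fetch step 1: PC -> MAR
regs["PC"] += 4                                  # Incrementer, same cycle
three_bus_step(regs, "R4", "R5", "R6", "Add")    # single-step execution
```

Because both source operands and the destination are reachable in the same cycle, the temporary registers Y and Z of the single-bus design are not needed here.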
HARDWIRED CONTROL
To execute instructions, the processor must have some means of generating the control signals needed in
the proper sequence.
Two categories:
Hardwired control
Microprogrammed control
A hardwired system can operate at high speed, but it has little flexibility.
The required control signals are determined by the following information:
Contents of the control step counter
Contents of the instruction register
Contents of the condition code flags
The decoder/encoder block in Figure 7.10 is a combinational circuit that generates the required control
outputs, depending on the state of all its inputs.
By separating the decoding and encoding functions, we obtain the more detailed block diagram in Figure
7.11.
The step decoder provides a separate signal line for each step, or time slot, in the control
sequence.
Similarly, the output of the instruction decoder consists of a separate line for each machine
instruction.
For any instruction loaded in the IR, one of the output lines INS1 through INSm is set to 1, and all
other lines are set to 0.
The input signals to the encoder block are combined to generate the individual control signals Yin,
PCout, Add, End, and so on.
Figure 7.11 contains another control signal called RUN.
When set to 1, RUN causes the counter to be incremented by one at the end of every clock cycle.
When RUN is equal to 0, the counter stops counting.
This is needed whenever the WMFC signal is issued, to cause the processor to wait for the reply
from the memory.
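The effect of RUN on the control step counter can be sketched as follows. The exact logic for RUN is an assumption here: the counter is held while WMFC is issued and the memory has not yet replied with MFC.

```python
class StepCounter:
    """Control step counter that is held while waiting for memory."""

    def __init__(self):
        self.step = 1

    def clock(self, wmfc, mfc, end=False):
        """Update the counter at the end of a clock cycle."""
        run = not (wmfc and not mfc)   # RUN = 0 while waiting for MFC
        if end:
            self.step = 1              # End returns to step 1
        elif run:
            self.step += 1
        return self.step

counter = StepCounter()
trace = [
    counter.clock(wmfc=False, mfc=False),  # step 1 completes
    counter.clock(wmfc=True, mfc=False),   # step 2: waiting for memory
    counter.clock(wmfc=True, mfc=False),   # still waiting
    counter.clock(wmfc=True, mfc=True),    # memory replied, move on
]
```

The counter stays at step 2 for as many cycles as the memory needs, then advances to step 3.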
The control hardware shown in Figure 7.10 or 7.11 can be viewed as a state machine that changes
from one state to another in every clock cycle,
depending on the contents of the instruction register, the condition codes, and the external inputs.
The outputs of the state machine are the control signals.
The sequence of operations carried out by this machine is determined by the wiring of the logic
elements, hence the name "hardwired".
A controller that uses this approach can operate at high speed.
However, it has little flexibility, and the complexity of the instruction set it can implement is
limited.
Consider the execution of the instruction Add (R3), R1 and of an unconditional branch instruction.
The above diagram shows the control signals to be generated in sequence. In the form of an equation, the
control signal Zin is given as follows:
Zin = T1 + T6 · ADD + T4 · BR + ...
This signal goes high during time slot T1 for all instructions, during T6 for an Add instruction, during T4
for an unconditional branch instruction, and so on.
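The encoder logic for Zin can be written as a Boolean function of the step and instruction decoder outputs. Only the three terms named above are included; the instruction labels "ADD" and "BR" are illustrative names for the decoder outputs.

```python
def z_in(step, instruction):
    """Zin = T1 + T6.ADD + T4.BR (terms for other instructions omitted)."""
    def t(n):
        return step == n               # output Tn of the step decoder
    return (t(1)
            or (t(6) and instruction == "ADD")
            or (t(4) and instruction == "BR"))
```

For example, z_in(6, "ADD") is true, while z_in(6, "BR") is false because the branch instruction does not use step 6 to load Z.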
A COMPLETE PROCESSOR
A complete processor can be designed using the structure shown in Figure 7.14.
This structure has an instruction unit that fetches instructions from an instruction cache or from
the main memory when the desired instructions are not already in the cache.
It has separate processing units to deal with integer data and floating-point data.
A data cache is inserted between these units and the main memory.
A single cache can be used to store both instructions and data or separate caches can be used for
instructions and data.
The processor is connected to the system bus and, hence, to the rest of the computer, by means of
a bus interface.
A processor may include several integer or floating-point units to increase the potential for
concurrent operations.
MICROPROGRAMMED CONTROL
Here the control signals are generated by a program similar to a machine language program.
A control word (CW) is a word whose individual bits represent the various control signals.
A sequence of CWs corresponding to the control sequence of a machine instruction constitutes the
microroutine for that instruction.
The individual control words in this microroutine are referred to as microinstructions.
Each of the control steps in the control sequence of an instruction defines a unique combination of
1s and 0s in the CW.
The CWs corresponding to the 7 steps of Figure 7.6 are shown in Figure 7.15.
We have assumed that SelectY is represented by Select=0 and Select4 by Select=1.
The microroutines for all instructions in the instruction set of a computer are stored in a special
memory called the control store.
The control unit can generate the control signals for any instruction by sequentially reading the
CWs of the corresponding microroutine from the control store.
Figure 7.16 shows the basic organization of a microprogrammed control unit.
To read the control words sequentially from the control store, a microprogram counter (µPC) is
used.
Every time a new instruction is loaded into the IR, the output of the block labeled "starting address
generator" is loaded into the µPC.
The µPC is then automatically incremented by the clock, causing successive microinstructions to
be read from the control store.
Hence, the control signals are delivered to various parts of the processor in the correct sequence.
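A control store and µPC can be sketched as below for the Add (R3), R1 microroutine. The bit order in SIGNALS is a made-up layout, not the one in Figure 7.15, and the signals in the first three control words are assumed from the usual fetch sequence. Following the convention above, Select=1 means Select4 and Select=0 means SelectY.

```python
SIGNALS = ["PCin", "PCout", "MARin", "Read", "MDRout", "IRin", "Yin", "Select",
           "Add", "Zin", "Zout", "R1out", "R1in", "R3out", "WMFC", "End"]

def control_word(*active):
    """Build a CW bit string with a 1 for every active signal."""
    return "".join("1" if s in active else "0" for s in SIGNALS)

# Microroutine stored at consecutive control store addresses.
control_store = [
    control_word("PCout", "MARin", "Read", "Select", "Add", "Zin"),
    control_word("Zout", "PCin", "Yin", "WMFC"),
    control_word("MDRout", "IRin"),
    control_word("R3out", "MARin", "Read"),
    control_word("R1out", "Yin", "WMFC"),
    control_word("MDRout", "Add", "Zin"),          # Select=0, i.e. SelectY
    control_word("Zout", "R1in", "End"),
]

def run_microroutine(start):
    """Read CWs sequentially, incrementing the uPC until End is seen."""
    upc = start
    issued = []
    while True:
        cw = control_store[upc]
        issued.append(cw)
        if cw[SIGNALS.index("End")] == "1":
            return issued
        upc += 1

issued = run_microroutine(0)
```

Each entry of issued is one microinstruction, so the list has seven control words for the seven control steps.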
Microprogram model for BRANCH instruction
When the control unit has to check the status of condition codes or external inputs, the simple
organization shown in Figure 7.16 is not sufficient. In this case, conditional branch microinstructions should
be used. In addition to the branch address, these microinstructions specify which external inputs, flag
bits, or registers are to be checked before branching.
The instruction Branch<0 can be implemented by a microroutine such as that shown in Figure 7.17.
After loading this instruction into IR, a branch microinstruction transfers control to the
corresponding microroutine, which is assumed to start at location 25 in the control store.
This address is the output of the starting address generator block in Figure 7.16.
The microinstruction at location 25 tests the N bit of the condition codes.
If this bit is equal to 0, a branch takes place to location 0 to fetch a new machine instruction.
Otherwise, the microinstruction at location 26 is executed to put the branch target address into
register Z.
The microinstruction in location 27 loads this address into the PC.
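The flow through locations 25 to 27 can be traced with a short sketch. It assumes that register Y already holds the updated PC from the fetch phase; the function name and argument values are illustrative.

```python
def branch_if_negative(pc_updated, offset, n_flag):
    """Trace control store addresses for Branch<0; returns (trace, new PC)."""
    trace = [25]
    if n_flag == 0:          # location 25: N = 0, so go fetch the next instruction
        trace.append(0)
        return trace, pc_updated
    trace.append(26)         # location 26: Z <- Y + offset (Y = updated PC)
    z = pc_updated + offset
    trace.append(27)         # location 27: PC <- Z, End
    return trace, z

taken = branch_if_negative(2004, 46, n_flag=1)
not_taken = branch_if_negative(2004, 46, n_flag=0)
```

When N is 1 the PC becomes 2050; when N is 0 the PC keeps its updated value and the fetch microroutine at location 0 runs next.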
To support microprogram branching, the organization of the control unit should be modified as shown in
Figure 7.18.
The starting address generator block of Figure 7.16 becomes the starting and branch address generator.
This block loads a new address into the µPC when a microinstruction instructs it to do so.
To allow implementation of a conditional branch, inputs to this block consist of the external inputs and
condition codes as well as the contents of the instruction register.
In this control unit, the µPC is incremented every time a new microinstruction is fetched from the
microprogram memory, except in the following situations:
When a new instruction is loaded into the IR, the µPC is loaded with the starting address of the
microroutine for that instruction.
When a Branch microinstruction is encountered and the branch condition is satisfied, the µPC is
loaded with the branch address.
When an End microinstruction is encountered, the µPC is loaded with the address of the first CW
in the microroutine for the instruction fetch cycle.
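These µPC update rules can be summarised in one function. The parameter names and the order in which the cases are checked are assumptions made for this sketch. The fetch microroutine is assumed to start at address 0, consistent with the Branch<0 example.

```python
def next_upc(upc, *, new_instruction=False, start_address=None,
             branch_taken=False, branch_address=None,
             end=False, fetch_address=0):
    """Return the next uPC value according to the three exceptions above."""
    if new_instruction:
        return start_address      # start of the microroutine for the new instruction
    if branch_taken:
        return branch_address     # Branch microinstruction with condition satisfied
    if end:
        return fetch_address      # back to the instruction fetch microroutine
    return upc + 1                # default: sequential increment
```

For instance, next_upc(3, new_instruction=True, start_address=25) gives 25, and next_upc(27, end=True) gives 0.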