MULTICYCLE PROCESSOR
1. Introduction:
A single-cycle processor has three main drawbacks:
1. It requires separate instruction and data memories, while most processors use a single
memory for both.
2. The clock cycle must be long enough for the slowest instruction (lw), even if others could
execute faster.
3. It needs three adders (one in the ALU and two for the PC logic), which are costly, especially when speed is crucial.
A multicycle processor overcomes these issues by breaking instructions into multiple
shorter steps. It:
Uses a single memory since instructions and data are accessed in different steps.
Reuses one adder for multiple tasks across steps.
Allows simpler instructions to complete faster than complex ones.
To design a multicycle processor, we:
1. Build a data path connecting state elements and memories with logic.
2. Add registers to store intermediate values between steps.
3. Implement a finite state machine (FSM) to control execution step-by-step.
2. Multicycle Data Path:
2.1. Load word (lw):
Fetch the Instruction
o The processor reads the lw instruction from memory using the program counter
(PC).
Read the Base Address
o The base address is stored in a register specified by rs1 (Instr[19:15]).
o This value is read from the register file and stored in a temporary register A.
Extract and Extend Offset
o The 12-bit offset (Instr[31:20]) is sign-extended to 32 bits (ImmExt) using the
Extend unit.
o ImmExt is a combinational function of Instr, which does not change while the instruction executes, so it does not need a separate register.
Calculate Memory Address
o The ALU adds the base address (A) and offset (ImmExt) to compute the final
memory address.
o The result is stored in a temporary register ALUOut.
Load Data from Memory
o A multiplexer on the memory address input, controlled by AdrSrc, selects the computed address (ALUOut) instead of the PC.
o Data is read from memory and stored in a temporary register Data.
Write Data to Register File
o The loaded data is written to the destination register rd (Instr[11:7]).
o A multiplexer, controlled by ResultSrc, selects whether the result comes from memory (Data) or the ALU
(ALUOut) before writing to the register file.
o The RegWrite signal enables the register update.
Update the Program Counter (PC)
o Instead of a separate adder, the ALU is reused to add 4 to the PC for the next
instruction. This happens during the Fetch step, when the ALU is otherwise idle.
o Multiplexers select the PC and 4 as the ALU inputs, and the PCWrite signal enables
the program counter to take the new value.
This completes the lw instruction execution in a multicycle processor, making efficient use of
resources by reusing components in different steps.
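The lw steps above can be sketched in software as follows. This is only an illustrative model, not the hardware description used in the simulations: the function names `sign_extend` and `execute_lw`, the list used as a register file, and the dictionary used as a word memory keyed by byte address are all assumptions made for the example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def execute_lw(instr, pc, regs, mem):
    """Walk one lw through the steps above and return the next PC.

    regs is a 32-entry list (hypothetical register file model);
    mem is a dict mapping byte addresses to 32-bit words (hypothetical).
    """
    # Fetch: the ALU is reused to compute PC + 4.
    next_pc = (pc + 4) & 0xFFFFFFFF
    # Decode: read the base register into A and extend the offset.
    rs1 = (instr >> 15) & 0x1F               # Instr[19:15]
    rd = (instr >> 7) & 0x1F                 # Instr[11:7]
    a = regs[rs1]
    imm_ext = sign_extend(instr >> 20, 12)   # Instr[31:20]
    # MemAdr: ALUOut = A + ImmExt.
    alu_out = (a + imm_ext) & 0xFFFFFFFF
    # MemRead: Data = Mem[ALUOut].
    data = mem[alu_out]
    # MemWB: write Data to rd (register x0 always stays zero).
    if rd != 0:
        regs[rd] = data
    return next_pc
```

For example, with `regs[9] = 0x2004` and a word stored at address `0x2000`, the encoding `0xFFC4A303` (lw x6, -4(x9)) loads that word into `regs[6]`.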
2.2. Store word (sw)
Fetch the Instruction – The sw instruction is read from memory.
Read Base Address and Register Data
o The base address (from rs1) is read from the register file.
o The immediate offset, formed by concatenating Instr[31:25] and Instr[11:7], is sign-extended.
o The data to be stored (from rs2) is also read from the register file and stored in
the WriteData register.
Calculate Memory Address – The ALU adds the base address and offset to compute
the memory address.
Store Data in Memory – The data in WriteData is written to memory at the computed
address when MemWrite is asserted.
This completes the sw instruction, which stores a register’s value into memory at a calculated
address.
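The main difference from lw is how the 12-bit offset is assembled from two instruction fields. A minimal sketch, assuming the same hypothetical list-based register file and dictionary-based memory as a software stand-in for the hardware (the names `s_type_imm` and `execute_sw` are illustrative only):

```python
def s_type_imm(instr):
    """Rebuild the sign-extended store offset from its two fields."""
    upper = (instr >> 25) & 0x7F    # Instr[31:25] -> offset bits 11:5
    lower = (instr >> 7) & 0x1F     # Instr[11:7]  -> offset bits 4:0
    imm12 = (upper << 5) | lower
    return imm12 - 0x1000 if imm12 & 0x800 else imm12


def execute_sw(instr, regs, mem):
    """Model the sw steps; returns the address that was written."""
    rs1 = (instr >> 15) & 0x1F      # base address register
    rs2 = (instr >> 20) & 0x1F      # register holding the data to store
    a = regs[rs1]
    write_data = regs[rs2]          # value held in the WriteData register
    alu_out = (a + s_type_imm(instr)) & 0xFFFFFFFF
    mem[alu_out] = write_data       # performed when MemWrite is asserted
    return alu_out
```

For the encoding `0x0064A423` (sw x6, 8(x9)), `s_type_imm` returns 8, so the value of x6 is written at the address in x9 plus 8.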
2.3. R-Type Instructions:
R-type instructions read two source registers (rs1 and rs2), perform an ALU operation on
their values, and write the result back to rd in the register file. The data path built
for lw and sw already contains all the connections necessary for these steps.
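As a rough software illustration of the R-type flow (read two registers, operate, write back), the sketch below decodes the standard RV32I fields and supports only add, sub, and, and or. The table `R_TYPE_OPS`, the function `execute_r_type`, and the list-based register file are assumptions made for the example, not part of the design above.

```python
# (funct3, funct7) -> operation, for a small subset of RV32I R-type instructions
R_TYPE_OPS = {
    (0b000, 0b0000000): lambda a, b: a + b,   # add
    (0b000, 0b0100000): lambda a, b: a - b,   # sub
    (0b111, 0b0000000): lambda a, b: a & b,   # and
    (0b110, 0b0000000): lambda a, b: a | b,   # or
}


def execute_r_type(instr, regs):
    """Model the execute and write-back steps; returns the ALU result."""
    rd = (instr >> 7) & 0x1F
    funct3 = (instr >> 12) & 0x7
    rs1 = (instr >> 15) & 0x1F
    rs2 = (instr >> 20) & 0x1F
    funct7 = (instr >> 25) & 0x7F
    a, write_data = regs[rs1], regs[rs2]      # values read in the decode step
    alu_out = R_TYPE_OPS[(funct3, funct7)](a, write_data) & 0xFFFFFFFF
    if rd != 0:                               # x0 is hard-wired to zero
        regs[rd] = alu_out
    return alu_out
```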
2.4. Branch Instruction (beq):
Check Equality – The beq instruction compares two registers. If their values are equal, the
processor branches to a new address.
Calculate Branch Target Address
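The ALU adds the PC of the current instruction (saved as OldPC) and the sign-extended 13-bit offset
(ImmExt) to compute the branch target address (PCTarget). OldPC is used because the PC
itself has already been advanced to PC + 4 during Fetch.
This calculation is done in step 2 (Decode), while the registers are being read and the ALU
would otherwise be idle. The result is stored in ALUOut.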
Perform Subtraction & Update PC if Needed
The ALU subtracts the two register values.
If the result is zero (meaning the values are equal), the PCWrite signal is asserted, and
the PC is updated with PCTarget from ALUOut.
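The branch steps can be sketched as below. The B-type immediate layout is the standard RV32I one; everything else (the function names `b_type_imm` and `execute_beq`, and the list-based register file) is a hypothetical software model rather than the hardware itself.

```python
def b_type_imm(instr):
    """Rebuild the sign-extended 13-bit branch offset (bit 0 is always 0)."""
    imm = (
        (((instr >> 31) & 0x1) << 12)    # Instr[31]    -> offset bit 12
        | (((instr >> 7) & 0x1) << 11)   # Instr[7]     -> offset bit 11
        | (((instr >> 25) & 0x3F) << 5)  # Instr[30:25] -> offset bits 10:5
        | (((instr >> 8) & 0xF) << 1)    # Instr[11:8]  -> offset bits 4:1
    )
    return imm - 0x2000 if imm & 0x1000 else imm


def execute_beq(instr, old_pc, regs):
    """Return the PC value that follows this beq."""
    rs1 = (instr >> 15) & 0x1F
    rs2 = (instr >> 20) & 0x1F
    pc_plus4 = (old_pc + 4) & 0xFFFFFFFF                 # written in Fetch
    alu_out = (old_pc + b_type_imm(instr)) & 0xFFFFFFFF  # PCTarget, from Decode
    zero = ((regs[rs1] - regs[rs2]) & 0xFFFFFFFF) == 0   # ALU subtraction
    return alu_out if zero else pc_plus4                 # PC written only if equal
```

For the encoding `0xFE420AE3` (beq x4, x4 with an offset of -12) at address `0x100C`, the branch is always taken and the next PC is `0x1000`.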
2.5. Multicycle Controls:
1. Main FSM (Finite State Machine)
This replaces the single-cycle Main Decoder.
It controls the sequence of operations over multiple cycles.
It generates control signals based on the current state.
2. ALU Decoder
Determines which operation the ALU should perform.
Uses ALUOp from the Main FSM together with the opcode and function fields (funct3, funct7).
3. Instruction Decoder (Instr Decoder)
Produces the ImmSrc signal, which selects the type of immediate value to be used.
FSM for the Control Unit:
FSM States & Execution Flow
Every instruction passes through Fetch and Decode, and then through only those of the following states that its type requires:
1. Fetch (S0)
o Fetches instruction from memory using the PC.
o Updates PC for the next instruction.
o Control Signals:
AdrSrc = 0, IRWrite = 1, ALUSrcA = 00, ALUSrcB = 10, ALUOp = 00, ResultSrc = 10, PCUpdate = 1
2. Decode (S1)
o Reads registers rs1 and rs2.
o Determines instruction type (R-type, I-type, load/store, branch).
o Computes the branch target (OldPC + ImmExt) with the otherwise idle ALU.
o Control Signals:
ALUSrcA = 01, ALUSrcB = 01, ALUOp = 00
3. MemAdr (S2)
o Used for load (lw) and store (sw) instructions.
o Calculates memory address: ALUOut = rs1 + imm
o Control Signals:
ALUSrcA = 10, ALUSrcB = 01, ALUOp = 00
4. MemRead (S3) / MemWrite (S5)
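o For lw (MemRead): reads memory at address ALUOut and stores the word in the Data register (DataReg).
o For sw (MemWrite): writes the value of rs2 (WriteData) to memory at address ALUOut.
o Control Signals:
AdrSrc = 1 (both states), MemWrite = 1 (for store)
ResultSrc = 00 (for load)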
5. MemWB (S4)
o Writes the data in the Data register (DataReg) to rd in the register file.
o Control Signals:
ResultSrc = 01, RegWrite = 1
6. ExecuteR (S6)
o For R-type: Performs ALU operations (add, sub, and, or, etc.).
o Control Signals:
ALUSrcA = 10, ALUSrcB = 00, ALUOp = 10
7. ALUWB (S7)
o Writes ALU result to rd in the register file.
o Control Signals:
ResultSrc = 00, RegWrite = 1
8. BEQ (S10)
o Computes branch condition (rs1 == rs2).
o Updates PC if the condition is met.
o Control Signals:
ALUSrcA = 10, ALUSrcB = 00, ALUOp = 01, Branch = 1
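The state sequence described above can be checked with a small software model of the FSM. This is a sketch under assumptions rather than the control FSM code that was simulated: the state names follow the list above, the opcodes are the standard RV32I values, only the enable-type outputs are tabulated, and the helper names `next_state` and `trace` are invented for the example.

```python
OP_LW, OP_SW, OP_R, OP_BEQ = 0b0000011, 0b0100011, 0b0110011, 0b1100011

# State entered after Decode, chosen by opcode
DECODE_TARGET = {OP_LW: "MemAdr", OP_SW: "MemAdr", OP_R: "ExecuteR", OP_BEQ: "BEQ"}

# Enable-type signals asserted in each state (other states assert none)
ENABLES = {
    "Fetch": {"IRWrite", "PCUpdate"},
    "MemWrite": {"MemWrite"},
    "MemWB": {"RegWrite"},
    "ALUWB": {"RegWrite"},
    "BEQ": {"Branch"},
}


def next_state(state, opcode):
    """Next-state function of the main FSM for lw, sw, R-type, and beq."""
    if state == "Fetch":
        return "Decode"
    if state == "Decode":
        return DECODE_TARGET[opcode]
    if state == "MemAdr":
        return "MemRead" if opcode == OP_LW else "MemWrite"
    if state == "MemRead":
        return "MemWB"
    if state == "ExecuteR":
        return "ALUWB"
    return "Fetch"    # MemWB, MemWrite, ALUWB, and BEQ all end the instruction


def trace(opcode):
    """List the states one instruction visits before the next Fetch."""
    states = ["Fetch"]
    while True:
        nxt = next_state(states[-1], opcode)
        if nxt == "Fetch":
            return states
        states.append(nxt)
```

Calling `trace` for each opcode shows that, in this model, lw visits five states, sw and R-type instructions visit four, and beq visits three, which is the sense in which simpler instructions complete faster.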
The finite state machine state diagram for the RISC-V multicycle processor:
Control for Load Word (lw)
First, we design the control for load word (lw), which is given below:
The complete architecture for load word is given below:
The simulation results for the load word instruction in Vivado are given below:
Next, we design the control signals for the other instructions:
We then modify our control FSM code according to this state machine and simulate it for the
results:
Store word Instruction Implementation (sw)
R-type Instruction Execution
Complete simulation in the signal simulation window:
The two figures below are snapshots of the simulation window showing the combined
execution of the four instruction types, that is, I-type, R-type, S-type, and B-type
instructions.
For an explanation of the above waveform, see the individual waveforms
given above, where each instruction is labeled.