MIPS RISC Processor Implementation

This paper discusses the implementation of a 32-bit MIPS based RISC processor using Cadence tools, focusing on a 5-stage pipelined architecture. It emphasizes the importance of hazard detection and data forwarding for efficient processing, while analyzing performance issues such as power dissipation and propagation delay. The design aims to achieve a complete ASIC flow from RTL to GDS II, highlighting the trade-offs between power, area, and delay in embedded systems.

2014 IEEE International Conference on Advanced Communication Control and Computing Technologies (ICACCCT)

Implementation of a 32-bit MIPS Based RISC Processor using Cadence

Mohit N. Topiwala¹, N. Saraswathi²
¹[Link] Student, ²Asst. Professor S.G
¹,²Department of Electronics and Communication, SRM University, Chennai, India
¹[Link]@[Link], ²saraswathy.n@[Link]

Abstract- This paper presents the implementation of a 5-stage pipelined, 32-bit, high-performance MIPS based RISC core. MIPS (Microprocessor without Interlocked Pipeline Stages) is a RISC (Reduced Instruction Set Computer) architecture. A RISC is a microprocessor designed to perform a small set of instructions, with the aim of increasing the overall speed of the processor. MIPS has 5 pipeline stages, viz. Instruction Fetch (IF), Instruction Decode (ID), Execution (EX), Memory Access (MEM) and Write Back (WB). The modules used include Instruction Memory, Data Memory, ALU and Registers. The aim of this paper is to include a Hazard detection unit and a Data forwarding unit for efficient implementation of the pipeline. The design is developed using Verilog-HDL. The main goal is to carry out the complete ASIC flow (RTL to GDS II) using Cadence tools. The module functionality and performance issues like area, power dissipation and propagation delay are analyzed using Cadence RTL Compiler with typical libraries of TSMC 0.18 µm technology.

Keywords- MIPS, 5-stage pipeline, ASIC flow.

I. INTRODUCTION

Microprocessors and microcontrollers have been designed around two philosophies: Complex Instruction Set Computer (CISC) and Reduced Instruction Set Computer (RISC). The CISC concept is an approach to Instruction Set Architecture (ISA) design that emphasizes doing more with each instruction, using a wide variety of addressing modes and a variable number of operands in various locations in its instruction set. As a result, the instructions are of widely varying lengths and execution times, demanding a very complex Control Unit that occupies a large area on chip. RISC processors, on the other hand, have a reduced number of instructions, fixed instruction length, more general-purpose registers, a load-store architecture and simplified addressing modes. These make individual instructions execute faster, achieve a net gain in performance and give an overall simpler design with less silicon consumption as compared to CISC. These features make the RISC design ideally suited to participate in a powerful trend in the embedded processor market - the "system-on-a-chip" [9]. The most common RISC microprocessors are ARM, SPARC, MIPS and IBM's PowerPC. A number of semiconductor companies implement RISC processors on "system on chip", such as Atmel AVR and MicroBlaze, which are widely used for embedded & DSP applications.

MIPS processor design is based on the RISC design principle that emphasizes a load/store architecture [1]. Because a register can be accessed much faster than a memory location, it is much faster to perform operations in on-chip registers rather than in memory. The architecture remains the same for all MIPS based processors, while the implementations may differ: single cycle, multi-cycle or pipelined [2].

Nowadays power is the most important performance parameter for embedded and portable applications [8]. But in any integrated circuit there is a tradeoff between power, area and delay. For certain applications low power circuits are needed, and the design engineers have to compromise with more area and delay. In this paper, power reduction is achieved by by-passing pipeline stages that cause unnecessary switching activity. Dynamic power depends upon the switching activity, or in general the number of transitions, and is given by the equation [6],

$$P = \frac{1}{2} C V_{DD}^{2} N_t \qquad (1)$$

Decreasing the number of transitions (N) therefore results in reduced dynamic power consumption [6].

Section II deals with the MIPS instruction set, Section III with the MIPS architecture, and Section IV with the pipelined architecture with improved datapath. Section V gives simulation results with explanations.

II. MIPS INSTRUCTION SET

The MIPS design has three types of instruction: Register type, Immediate type and Jump type. The instruction formats are shown in the figures below [3].

A. Register Type (R-Type):

           (31 - 26)  (25 - 21)  (20 - 16)  (15 - 11)  (10 - 6)  (5 - 0)
R-Format   Opcode     Rs         Rt         Rd         Shift     Function

Figure 1. R-Type Instructions Format
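The bit ranges listed for the R-Type format in Figure 1 can be turned into a small decoder. The Python sketch below is an illustration only and is not part of the paper's Verilog design; the function name `decode_r_type` is hypothetical, and the sample word 0x00221820 is assumed to be the standard MIPS32 encoding of `add $3, $1, $2` (function code 0x20).

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into the six fields of Figure 1."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31-26
        "rs": (word >> 21) & 0x1F,      # bits 25-21, first source register
        "rt": (word >> 16) & 0x1F,      # bits 20-16, second source register
        "rd": (word >> 11) & 0x1F,      # bits 15-11, destination register
        "shamt": (word >> 6) & 0x1F,    # bits 10-6, shift amount
        "funct": word & 0x3F,           # bits 5-0, function field
    }


# Assumed sample: add $3, $1, $2 in the standard MIPS32 encoding.
fields = decode_r_type(0x00221820)
print(fields)
```

Each field is extracted with a shift and a mask whose width matches the column width in Figure 1 (6, 5, 5, 5, 5 and 6 bits), so the six masks together cover all 32 bits exactly once.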

ISBN No. 978-1-4799-3914-5/14/$31.00 ©2014 IEEE


Figure 1 shows the R-Type instruction format. Here the first 6 bits represent the opcode. The next 15 bits represent the 3 registers Rs, Rt and Rd respectively, on which operations are performed. Rs and Rt are the source registers and Rd is the destination register. The next 5 bits are the shift amount, which gives the number of bits to be shifted. The last 6 bits are the function field, which selects the function to be performed on the registers.

[Figure 2: datapath diagram; recoverable labels: Read instruction address [31:0], Instruction memory, Registers, Read data 1, Read data 2, RegWrite, ALUOp]

Figure 2. Data Path for R-Type Instructions

The data path for R-Type instructions is depicted in Figure 2. It is used mainly for ADD, SUB and OR operations, e.g. add Rd, Rs, Rt. Here the signed sum of the contents (Rs) + (Rt) is saved into Rd.

B. Immediate Type (I-Type):

           (31 - 26)  (25 - 21)  (20 - 16)  (15 - 0)
I-Format   Opcode     Rs         Rt         Immediate value

Figure 3. I-Type Instructions Format

Figure 3 shows the I-Type instruction format. As in the R-type, the first 6 bits represent the opcode, and the next 10 bits represent Rs and Rt respectively. Rs is a source register; Rt is a source register for store operations and the destination register for load operations. The last 16 bits represent the immediate value, which is part of the instruction itself and is not read from memory.

[Figure 4: datapath diagram; recoverable labels: RegWrite, Read instruction address [31:0], Instruction memory]

Figure 4. Data Path for I-Type Instructions

The data path for I-Type instructions is depicted in Figure 4. It shows that the Rt register can be used as either source or destination, and that the last 16 bits are the immediate value, which is sent to the sign-extend unit and then to the ALU to perform the required operation. It is used for ADDI, ANDI and ORI operations, e.g. addi Rt, Rs, 5. Here (Rs) + 5 is stored in the destination register Rt; 5 is the immediate value.

C. Jump Type (J-Type):

           (31 - 26)  (25 - 0)
J-Format   Opcode     Branch target address

Figure 5. J-Type Instructions Format

Figure 5 shows the J-Type instruction format. The first 6 bits of this format are the opcode, which represents the type of jump operation to be performed. The remaining 26 bits hold the branch target address, which is combined with the upper bits of PC+4 to obtain the full jump address.

[Figure 6: datapath diagram; recoverable labels: Jump address [31:0], 28, PC+4 [31:28]]

Figure 6. Data Path for J-Type Instructions

Figure 6 shows the functionality of the J-type instruction. The upper four bits of PC+4 (bits [31:28]) are concatenated with the 26-bit field of the instruction, taken from Instruction Memory and shifted left by 2, to give the 32-bit jump address. E.g. j trgt. Here, j is the jump instruction word and trgt is the target. It skips the other instructions and jumps to the target.

III. MIPS ARCHITECTURE

The MIPS based RISC processor is basically a pipelined architecture implementation with 5 pipeline stages. Pipelining means carrying out more than one operation at a time in a single data path, and it is a technique used to improve the overall performance of a RISC processor [7]. In a multicycle CPU, different instructions take different numbers of clock cycles: for example, load might take up to 5 clock cycles, but beq takes only 3 clock cycles. So while one instruction is in progress, instead of waiting for it to complete, a new instruction is started in the same data path without disturbing the previous one. For this to happen, each instruction is divided into pipeline stages. After every clock, the instruction moves into the next pipeline stage, enabling another instruction to enter the stage it has just left without disturbing it. Hence all the stages in the path can be used simultaneously, which in turn increases the throughput of the MIPS design.

The MIPS processor architecture has been implemented using 5 pipeline stages. These pipeline stages are Instruction Fetch (IF), Instruction Decode (ID), Execution (EX), Memory Access (MEM) and Write Back (WB). These stages are


separated by special registers called pipeline registers. The purpose of these registers is to separate the stages of the instructions so that there is no conflicting data due to multiple instructions being executed simultaneously. They are named after the stages that they are placed in-between: IF/ID Register, ID/EX Register, EX/MEM Register, and MEM/WB Register. Figure 7 shows the datapath for the MIPS pipelined processor.

Figure 7. MIPS Pipelined Processor Datapath

A. Instruction Fetch:
The Program Counter (PC) is used to fetch the instruction from the Instruction Memory, and the instruction is stored in the Instruction Register (IF/ID) at the next positive clock edge. This stage has various modules like the Instruction Memory, which holds the instructions needed. The PC holds the address of the current instruction, which is used as the address to the Instruction Memory. The instructions read out from the Instruction Memory are stored in the Instruction Register, which is a part of the IF/ID stage.

B. Instruction Decode:
This stage decodes the instructions sent from the Instruction Register. Based on the instruction, it reads the required operands from the register file. Out of the 32 bits, 16 go to the sign-extend unit, where they are extended to 32 bits. The register file module outputs the values of 2 registers, which are sent to the ALU through the ID/EX stage.

C. Execution:
All the instructions are executed in this stage. All ALU operations, both arithmetic and logical, take place here. The stage performs operations on the data sent from the ID/EX stage. It also has a left shift by 2 and an adder for the beq operation. The result from the ALU is sent to the ALUout register, which is in the EX/MEM stage.

D. Memory Access:
The purpose of the memory access stage is to read from and write to the data memory. The control signals passed in the EX/MEM register determine which of the operations to perform. The output of the memory is written into the MEM/WB register along with the WB control that is passed from the EX/MEM register.

E. Writeback:
This stage writes the result back into the register file. It takes the data that has just been computed or loaded from memory out of the MEM/WB register and writes it to one of the registers in the register file.

IV. PIPELINED ARCHITECTURE WITH IMPROVED DATAPATH

This pipelined architecture contains two extra modules, a Hazard detection unit and a Data forwarding unit, to improve the datapath of the architecture.

Figure 8. Pipelined Architecture with Improved Datapath

A. Hazard Detection Unit:

There are three types of pipeline hazards: structural hazards, data hazards and control hazards. In this paper we concentrate on data hazards only. Data hazards arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of the instructions in the pipeline, thus causing the pipeline to stall until the results are made available.

[Figure 9: pipeline diagram; recoverable labels: Program execution order (in instructions), and $12, $2, $5, or $13, $6, $2]

Figure 9. Pipelined Data Dependencies

Let us take an example of a group of successive instructions to understand the issue of data hazards [4]:

sub $2, $1, $3    # Register $2 written by sub
and $12, $2, $5   # 1st operand ($2) depends on sub
or  $13, $6, $2   # 2nd operand ($2) depends on sub
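The dependencies in the sub/and/or sequence can also be found mechanically. The Python sketch below is a simplified software model, not the paper's hazard detection unit; the name `find_data_hazards`, the tuple layout and the `window` parameter are assumptions introduced for illustration. It assumes a 5-stage pipeline without forwarding in which a register written in WB can be read by an instruction decoded in the same cycle, so only the two instructions immediately after a producer are at risk.

```python
def find_data_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for each read-after-write
    dependency where the consumer follows the producer within `window` instructions.

    Each instruction is a tuple (mnemonic, destination, sources).
    """
    hazards = []
    for i, (_, _, sources) in enumerate(program):
        # Only the previous `window` instructions can still be in flight.
        for j in range(max(0, i - window), i):
            dest = program[j][1]
            if dest is not None and dest in sources:
                hazards.append((j, i, dest))
    return hazards


# The example sequence from the text.
program = [
    ("sub", "$2", ("$1", "$3")),
    ("and", "$12", ("$2", "$5")),
    ("or", "$13", ("$6", "$2")),
]

for producer, consumer, reg in find_data_hazards(program):
    print(f"{program[consumer][0]} reads {reg} before {program[producer][0]} has written it back")
```

For this sequence the function reports two hazards, both on $2: one between sub and and, and one between sub and or. A hardware unit would make the same comparison between register numbers held in the pipeline registers rather than over a list.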


All of the above instructions depend on the sub instruction, which stores the result of subtracting $3 from $1 in $2. Figure 9 shows the dependency of these instructions. It clearly shows that $2 is updated at clock cycle 5 and that the written value is unavailable before then, yet the instructions that follow sub read the value of $2. They need the updated value in the very next clock cycle. This is called a data hazard.

B. Data Forwarding Unit:

One solution to the data hazard is data forwarding, which supplies the resulting operand to the dependent instruction as soon as it has been computed. Figure 10 shows how dependencies are resolved using a Data forwarding unit. It shows that for the sub instruction the result is available at the EXECUTION stage (after the 3rd clock cycle), while the successive instructions read $2 at the end of the execution stage, in the 4th or 5th clock cycle. This means the instructions can be executed without stalls by just forwarding the data. So here the forwarding unit provides the remedy for the data hazard.

[Figure 10: pipeline diagram; recoverable labels: Program execution order (in instructions), sub $2, $1, $3, and $12, $2, $5, or $13, $6, $2]

Figure 10. Pipelined Data Dependencies Resolved with Forwarding

V. SIMULATION RESULTS

The Verilog code for the 32-bit MIPS based RISC processor was compiled using Cadence NC Launch and simulated using the Cadence SimVision tool to check its outputs. The simulation waveform is shown in Figure 11.

Figure 11. Simulation Waveform of MIPS

The test bench is written to test the instructions of the MIPS design. A series of instructions is sent in the test bench, read from a memory into the instruction memory. The registers R0 and R1 are preloaded with the data 32'd42 and 32'd8 respectively. The test bench was simulated successfully in the Cadence SimVision tool and the output is shown in Figure 12.

[Figure 12: console snapshot listing register contents (R3 to R14), PC values and register write-backs after each simulated instruction]

Figure 12. Snapshot of Test Bench Output

The MIPS design was then synthesized in Cadence RTL Compiler using typical libraries of TSMC 0.18 µm technology, and area and power analysis was done. The frequency was varied from 100 MHz to 1000 MHz and the results are shown in Table I.

TABLE I: FREQUENCY VS. POWER

No.  Frequency (MHz)  Leakage Power (mW)  Switching Power (mW)  Total Power (mW)
1    100              0.0315              79.250                79.282
2    250              0.0315              187.653               187.684
3    500              0.0315              356.736               356.768
4    750              0.0315              518.206               518.237
5    1000             0.0315              674.778               674.810

Figure 13 shows the graph of dynamic power against frequency. The graph indicates that dynamic power increases roughly linearly with frequency. Figure 14 shows the area of the MIPS design as reported by the Cadence tool.

[Figure 13: plot of dynamic power (mW, axis 0 to 800) against frequency (MHz; 100, 250, 500, 750, 1000)]
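The figures in Table I can be sanity-checked numerically. The Python sketch below is an illustration added here, not part of the paper's flow; the names `TABLE_I`, `totals_consistent` and `switching_per_mhz` are hypothetical. It checks that total power equals leakage plus switching power to within the rounding of the table, and computes the switching power per MHz as a rough measure of how close the trend is to linear.

```python
# Rows of Table I: (frequency in MHz, leakage mW, switching mW, total mW).
TABLE_I = [
    (100, 0.0315, 79.250, 79.282),
    (250, 0.0315, 187.653, 187.684),
    (500, 0.0315, 356.736, 356.768),
    (750, 0.0315, 518.206, 518.237),
    (1000, 0.0315, 674.778, 674.810),
]


def totals_consistent(rows, tolerance=1e-3):
    """True if leakage + switching matches the reported total within `tolerance` mW."""
    return all(abs(leak + switch - total) <= tolerance
               for _, leak, switch, total in rows)


def switching_per_mhz(rows):
    """Switching power divided by frequency for each row, in mW per MHz."""
    return [switch / freq for freq, _, switch, _ in rows]


print("totals consistent:", totals_consistent(TABLE_I))
for (freq, *_), slope in zip(TABLE_I, switching_per_mhz(TABLE_I)):
    print(f"{freq:5d} MHz: {slope:.4f} mW/MHz")
```

The per-MHz figure falls gently from about 0.79 mW/MHz at 100 MHz to about 0.67 mW/MHz at 1000 MHz, so the growth is close to, but slightly below, a straight line through the origin. Leakage power stays constant because it does not depend on switching activity.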


Figure 13. Dynamic Power Vs. Frequency Graph

[Figure 14: area report generated by Encounter(R) RTL Compiler for module MIPS with technology library tsmc18, listing cell count, cell area and total area per instance (including IFIDreg, EXMEMreg and MEMWBreg)]

Figure 14. Area Report Generated by Cadence Tool

The netlist of the design is generated in the pre-layout steps. This synthesized Verilog netlist and the respective design constraints file (.sdc file) are imported into Cadence SoC Encounter and are used to generate an automated layout from standard cells through placement and routing. After performing the NanoRoute step, the final layout of the design is shown in Figure 15.

Figure 15. Layout of MIPS Design using Cadence SoC Encounter

VI. CONCLUSION

In this paper, the design of a 32-bit MIPS based RISC processor is implemented successfully with pipeline functionalities. With 5-stage pipelining, one instruction completes in every clock cycle once the pipeline is full. This design shows the implementation of a MIPS based CPU capable of handling various R-type, J-type and I-type instructions, each of which has a different format. These instructions are verified successfully through the test bench. Designing the forwarding unit and the hazard detection unit to overcome the data dependencies was a critical task, and it was implemented successfully. The design is implemented using Verilog-HDL and synthesized using Cadence RTL Compiler with typical libraries of TSMC 0.18 µm technology. The design of the MIPS processor is optimized in both timing and area. The complete ASIC flow from RTL to GDS II was also carried out using Cadence SoC Encounter, and the complete physical design flow was analyzed.

ACKNOWLEDGEMENT

The authors would like to acknowledge the facilities provided towards Cadence Encounter 11.13 by the Research Lab, Department of Electronics and Communication Engineering, SRM University, Chennai.

REFERENCES

[1] Preetam Bhosle, Hari Krishna Moorthy, "FPGA Implementation of low power pipelined 32-bit RISC Processor", International Journal of Innovative Technology and Exploring Engineering (IJITEE), August 2012.
[2] Gautham P, Parthasarathy R. Karthi, Balasubramanian, "Low Power Pipelined MIPS Processor Design", in the proceedings of the 2009 12th international symposium, 2009, pp. 462-465.
[3] Neenu Joseph, Sabarinath S., "FPGA based Implementation of High Performance Architectural level Low Power 32-bit RISC Core", 2009 IEEE.
[4] David A. Patterson and John L. Hennessy, "Computer Organization and Design - the hardware/software interface", 3rd edition, pp. 370-412.
[5] Mrs. Rupali Balpande, Mrs. Rashmi Keote, "Design of FPGA based Instruction Fetch & Decode Module of 32-bit RISC (MIPS) Processor", 2011 IEEE.
[6] Harpreet Kaur, Nitika Gulati, "Pipelined MIPS With Improved Datapath", IJERA, Vol. 3, Issue 1, January-February 2013, pp. 762-765.
[7] Sharda P. Katke, G.P. Jain, "Design and Implementation of 5 Stages Pipelined Architecture in 32 Bit RISC Processor", IJETAE, Volume 2, Issue 4, April 2012, pp. 340-346.
[8] Pejman Lotfi-Kamran, Ali-Asghar Salehpour, Amir-Mohammad Rahmani, Ali Afzali-Kusha, and Zainalabedin Navabi, "Dynamic Power Reduction of Stalls in Pipelined Architecture Processors", International Journal Of Design, Analysis And Tools For Circuits And Systems, Vol. 1, No. 1, June 2011.
[9] Neeraj Jain, "VLSI Design and Optimized Implementation of a MIPS RISC Processor using XILINX Tool", International Journal of Advanced Research in Computer Science and Electronics Engineering (IJARCSEE), Volume 1, Issue 10, December 2012.
