MB Ref Guide
Processor
Reference Guide
Embedded
Development Kit
www.xilinx.com
1-800-255-7778
"Xilinx" and the Xilinx logo shown above are registered trademarks of Xilinx, Inc. Any rights not expressly granted herein are reserved.
CoolRunner, RocketChips, Rocket IP, Spartan, StateBENCH, StateCAD, Virtex, XACT, XC2064, XC3090, XC4005, and XC5210 are
registered trademarks of Xilinx, Inc.
1.0
Revision
Xilinx EDK (Embedded Processor Development Kit) release.
Table of Contents
Preface: About This Guide
Manual Contents
Additional Resources
Conventions
Typographical
Online Document
Overview
Features
Bus Configurations
Typical Peripheral Placement
Bit and Byte Labeling
Core I/O
Bus Organization
OPB Bus Configuration
LMB Bus Definition
LMB Bus Operations
Read and Write Data Steering
FSL Bus Operation
Debug Interface
Implementation
Parameterization
BRAM LMB Example
BRAM OPB Example
Memory Model
Small data area
Data area
Common un-initialized area
Literals or constants
Interrupt and Exception Handling
Preface
Manual Contents
This manual discusses the following topics specific to the MicroBlaze soft processor:
Core Architecture
Additional Resources
For additional information, go to http://support.xilinx.com. The following table lists
some of the resources you can access from this website. You can also directly access these
resources using the provided URLs.
Resource | Description/URL
Tutorials | Tutorials covering Xilinx design flows, from design entry to verification and debugging. http://support.xilinx.com/support/techsup/tutorials/index.htm
Answer Browser |
Application Notes |
Data Book |
Problem Solvers |
Tech Tips | Latest news, design tips, and patch information for the Xilinx design environment. http://www.support.xilinx.com/xlnx/xil_tt_home.jsp
GNU Manuals |
Conventions
This document uses the following conventions. An example illustrates each convention.
Typographical
The following typographical conventions are used in this document:
Convention | Meaning or Use | Example
Courier font | |
Courier bold | |
Helvetica bold | | File Open
Helvetica bold | Keyboard shortcuts | Ctrl+C
Italic font | Variables in a syntax statement for which you must supply values | ngdbuild design_name
Italic font | Emphasis in text |
Square brackets [ ] | An optional entry or parameter. However, in bus specifications, such as bus[7:0], they are required. | ngdbuild [option_name] design_name
Braces { } | | lowpwr ={on|off}
Vertical bar | | lowpwr ={on|off}
Vertical ellipsis | |
Horizontal ellipsis . . . | |
Online Document
The following conventions are used in this document:
Convention | Meaning or Use | Example
Blue text | Cross-reference link to a location in the current file or in another file in the current document |
Red text | Cross-reference link to a location in another document |
 | | Go to http://www.xilinx.com for the latest speed files.
Chapter 1
MicroBlaze Architecture
Summary
This document describes the architecture for the MicroBlaze 32-bit soft processor core.
Overview
The MicroBlaze embedded soft core is a reduced instruction set computer (RISC)
optimized for implementation in Xilinx field programmable gate arrays (FPGAs). See
Figure 1-1 for a block diagram depicting the MicroBlaze core.
Features
The MicroBlaze embedded soft core includes the following features:
- 32-bit instruction word with three operands and two addressing modes
- Separate 32-bit instruction and data buses that conform to IBM's OPB (On-chip
  Peripheral Bus) specification
- Separate 32-bit instruction and data buses with direct connection to on-chip block
  RAM through an LMB (Local Memory Bus)
[Figure 1-1: block diagram of the MicroBlaze core. Blocks shown: instruction-side bus
interface (IOPB, ILMB), data-side bus interface (DOPB, DLMB), Program Counter,
Instruction Buffer, Instruction Decode, Register File (32 x 32b), Add/Sub, Shift/Logical,
Multiply, MFSL0..7, SFSL0..7.]
All MicroBlaze instructions are 32 bits and are defined as either Type A or Type B. Type A
instructions have up to two source register operands and one destination register operand.
Type B instructions have one source register and a 16-bit immediate operand (which can be
extended to 32 bits by preceding the Type B instruction with an IMM instruction). Type B
instructions have a single destination register operand. Instructions are provided in the
following functional categories: arithmetic, logical, branch, load/store, and special.
Table 1-2 lists the MicroBlaze instruction set. Refer to the MicroBlaze Instruction Set
Architecture document for more information on these instructions. Table 1-1 describes the
instruction set nomenclature used in the semantics of each instruction.
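The two instruction formats described above can be illustrated with a small decoder. This is a minimal sketch, not part of the MicroBlaze tools: the function names are hypothetical, and the rule used to tell Type A from Type B (opcode bit 2 set means an immediate form) is an assumption read off the encodings listed in Table 1-2. Bits are numbered big-endian, so bits 0-5 are the six most significant bits of the word.

```python
def sign_extend16(value):
    """The s(x) operation: sign-extend a 16-bit field to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode(word):
    """Split a 32-bit instruction word into its fields."""
    opcode = (word >> 26) & 0x3F  # bits 0-5
    rd = (word >> 21) & 0x1F      # bits 6-10
    ra = (word >> 16) & 0x1F      # bits 11-15
    # Assumption: opcode bit 2 (mask 0b001000) marks the immediate encodings.
    if opcode & 0b001000:
        return {"type": "B", "opcode": opcode, "rd": rd, "ra": ra,
                "imm": sign_extend16(word)}   # bits 16-31
    return {"type": "A", "opcode": opcode, "rd": rd, "ra": ra,
            "rb": (word >> 11) & 0x1F,        # bits 16-20
            "func": word & 0x7FF}             # bits 21-31


def immediate_operand(imm16, imm_prefix=None):
    """32-bit operand of a Type B instruction.

    Without a preceding IMM instruction the 16-bit field is sign-extended;
    with one, the IMM value supplies the upper 16 bits.
    """
    if imm_prefix is None:
        return sign_extend16(imm16) & 0xFFFFFFFF
    return ((imm_prefix & 0xFFFF) << 16) | (imm16 & 0xFFFF)
```

For example, `decode(0x00642800)` yields the fields of `ADD r3,r4,r5`, and `immediate_operand(0x5678, 0x1234)` gives the 32-bit constant `0x12345678`.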
Table 1-1: Instruction Set Nomenclature
Symbol
Description
Ra
Rb
Rd
Sa
Sd
s(x)
*Addr
Table 1-2: MicroBlaze Instruction Set
Type A | 0-5 | 6-10 | 11-15 | 16-20 | 21-31 | Semantics
Type B | 0-5 | 6-10 | 11-15 | 16-31 | Semantics
ADD Rd,Ra,Rb | 000000 | Rd | Ra | Rb | 00000000000 | Rd := Rb + Ra
RSUB Rd,Ra,Rb | 000001 | Rd | Ra | Rb | 00000000000 | Rd := Rb + ~Ra + 1
ADDC Rd,Ra,Rb | 000010 | Rd | Ra | Rb | 00000000000 | Rd := Rb + Ra + C
RSUBC Rd,Ra,Rb | 000011 | Rd | Ra | Rb | 00000000000 | Rd := Rb + ~Ra + C
ADDK Rd,Ra,Rb | 000100 | Rd | Ra | Rb | 00000000000 | Rd := Rb + Ra
RSUBK Rd,Ra,Rb | 000101 | Rd | Ra | Rb | 00000000000 | Rd := Rb + ~Ra + 1
ADDKC Rd,Ra,Rb | 000110 | Rd | Ra | Rb | 00000000000 | Rd := Rb + Ra + C
RSUBKC Rd,Ra,Rb | 000111 | Rd | Ra | Rb | 00000000000 | Rd := Rb + ~Ra + C
CMP Rd,Ra,Rb | 000101 | Rd | Ra | Rb | 00000000001 | Rd := Rb cmp Ra (signed)
CMPU Rd,Ra,Rb | 000101 | Rd | Ra | Rb | 00000000011 | Rd := Rb cmp Ra (unsigned)
ADDI Rd,Ra,Imm | 001000 | Rd | Ra | Imm | Rd := s(Imm) + Ra
RSUBI Rd,Ra,Imm | 001001 | Rd | Ra | Imm | Rd := s(Imm) + ~Ra + 1
ADDIC Rd,Ra,Imm | 001010 | Rd | Ra | Imm | Rd := s(Imm) + Ra + C
RSUBIC Rd,Ra,Imm | 001011 | Rd | Ra | Imm | Rd := s(Imm) + ~Ra + C
ADDIK Rd,Ra,Imm | 001100 | Rd | Ra | Imm | Rd := s(Imm) + Ra
RSUBIK Rd,Ra,Imm | 001101 | Rd | Ra | Imm | Rd := s(Imm) + ~Ra + 1
ADDIKC Rd,Ra,Imm | 001110 | Rd | Ra | Imm | Rd := s(Imm) + Ra + C
RSUBIKC Rd,Ra,Imm | 001111 | Rd | Ra | Imm | Rd := s(Imm) + ~Ra + C
MUL Rd,Ra,Rb | 010000 | Rd | Ra | Rb | 00000000000 | Rd := Ra * Rb
BSRL Rd,Ra,Rb | 010001 | Rd | Ra | Rb | 00000000000 | Rd := Ra >> Rb
BSRA Rd,Ra,Rb | 010001 | Rd | Ra | Rb | 01000000000 |
BSLL Rd,Ra,Rb | 010001 | Rd | Ra | Rb | 10000000000 | Rd := Ra << Rb
MULI Rd,Ra,Imm | 011000 | Rd | Ra | Imm | Rd := Ra * s(Imm)
BSRLI Rd,Ra,Imm | 011001 | Rd | Ra | 0000 0000.. Imm | Rd := Ra >> Imm
BSRAI Rd,Ra,Imm | 011001 | Rd | Ra | 0000 0100.. Imm |
BSLLI Rd,Ra,Imm | 011001 | Rd | Ra | 0000 1000.. Imm | Rd := Ra << Imm
IDIV Rd,Ra,Rb | 010010 | Rd | Ra | Rb | 00000000000 | Rd := Rb/Ra, signed
IDIVU Rd,Ra,Rb | 010010 | Rd | Ra | Rb | 00000000001 | Rd := Rb/Ra, unsigned
GET Rd,FSLx | 011011 | Rd | 00000 | 0000 FSLx |
PUT Ra,FSLx | 011011 | 00000 | Ra | 1000 FSLx |
nGET Rd,FSLx | 011011 | Rd | 00000 | 0100 FSLx |
nPUT Ra,FSLx | 011011 | 00000 | Ra | 1100 FSLx |
cGET Rd,FSLx | 011011 | Rd | 00000 | 0010 FSLx |
cPUT Ra,FSLx | 011011 | 00000 | Ra | 1010 FSLx |
ncGET Rd,FSLx | 011011 | Rd | 00000 | 0110 FSLx |
ncPUT Ra,FSLx | 011011 | 00000 | Ra | 1110 FSLx |
OR Rd,Ra,Rb | 100000 | Rd | Ra | Rb | 00000000000 | Rd := Ra or Rb
AND Rd,Ra,Rb | 100001 | Rd | Ra | Rb | 00000000000 | Rd := Ra and Rb
XOR Rd,Ra,Rb | 100010 | Rd | Ra | Rb | 00000000000 | Rd := Ra xor Rb
ANDN Rd,Ra,Rb | 100011 | Rd | Ra | Rb | 00000000000 | Rd := Ra and ~Rb
SRA Rd,Ra | 100100 | Rd | Ra | 0000000000000001 |
SRC Rd,Ra | 100100 | Rd | Ra | 0000000000100001 |
SRL Rd,Ra | 100100 | Rd | Ra | 0000000001000001 |
SEXT8 Rd,Ra | 100100 | Rd | Ra | 0000000001100000 | Rd[0:23] := Ra[24]; Rd[24:31] := Ra[24:31]
SEXT16 Rd,Ra | 100100 | Rd | Ra | 0000000001100001 | Rd[0:15] := Ra[16]; Rd[16:31] := Ra[16:31]
WIC Ra,Rb | 100100 | Ra | Ra | Rb | 01101000 |
WDC Ra,Rb | 100100 | Ra | Ra | Rb | 01100100 |
MTS Sd,Ra | 100101 | 00000 | Ra | 110000000000000d | Sd := Ra, where S1 is MSR
MFS Rd,Sa | 100101 | Rd | 00000 | 100000000000000a |
MSRCLR Rd,Imm | 100101 | Rd | 00001 | 00 Imm14 |
MSRSET Rd,Imm | 100101 | Rd | 00000 | 00 Imm14 |
BR Rb | 100110 | 00000 | 00000 | Rb | 00000000000 | PC := PC + Rb
BRD Rb | 100110 | 00000 | 10000 | Rb | 00000000000 | PC := PC + Rb
BRLD Rd,Rb | 100110 | Rd | 10100 | Rb | 00000000000 | PC := PC + Rb; Rd := PC
BRA Rb | 100110 | 00000 | 01000 | Rb | 00000000000 | PC := Rb
BRAD Rb | 100110 | 00000 | 11000 | Rb | 00000000000 | PC := Rb
BRALD Rd,Rb | 100110 | Rd | 11100 | Rb | 00000000000 | PC := Rb; Rd := PC
BRK Rd,Rb | 100110 | Rd | 01100 | Rb | 00000000000 |
BEQ Ra,Rb | 100111 | 00000 | Ra | Rb | 00000000000 | if Ra = 0: PC := PC + Rb
BNE Ra,Rb | 100111 | 00001 | Ra | Rb | 00000000000 | if Ra /= 0: PC := PC + Rb
BLT Ra,Rb | 100111 | 00010 | Ra | Rb | 00000000000 | if Ra < 0: PC := PC + Rb
BLE Ra,Rb | 100111 | 00011 | Ra | Rb | 00000000000 | if Ra <= 0: PC := PC + Rb
BGT Ra,Rb | 100111 | 00100 | Ra | Rb | 00000000000 | if Ra > 0: PC := PC + Rb
BGE Ra,Rb | 100111 | 00101 | Ra | Rb | 00000000000 | if Ra >= 0: PC := PC + Rb
BEQD Ra,Rb | 100111 | 10000 | Ra | Rb | 00000000000 | if Ra = 0: PC := PC + Rb
BNED Ra,Rb | 100111 | 10001 | Ra | Rb | 00000000000 | if Ra /= 0: PC := PC + Rb
BLTD Ra,Rb | 100111 | 10010 | Ra | Rb | 00000000000 | if Ra < 0: PC := PC + Rb
BLED Ra,Rb | 100111 | 10011 | Ra | Rb | 00000000000 | if Ra <= 0: PC := PC + Rb
BGTD Ra,Rb | 100111 | 10100 | Ra | Rb | 00000000000 | if Ra > 0: PC := PC + Rb
BGED Ra,Rb | 100111 | 10101 | Ra | Rb | 00000000000 | if Ra >= 0: PC := PC + Rb
ORI Rd,Ra,Imm | 101000 | Rd | Ra | Imm | Rd := Ra or s(Imm)
ANDI Rd,Ra,Imm | 101001 | Rd | Ra | Imm | Rd := Ra and s(Imm)
XORI Rd,Ra,Imm | 101010 | Rd | Ra | Imm | Rd := Ra xor s(Imm)
ANDNI Rd,Ra,Imm | 101011 | Rd | Ra | Imm | Rd := Ra and ~s(Imm)
IMM Imm | 101100 | 00000 | 00000 | Imm | Imm[0:15] := Imm
RTSD Ra,Imm | 101101 | 10000 | Ra | Imm | PC := Ra + s(Imm)
RTID Ra,Imm | 101101 | 10001 | Ra | Imm | PC := Ra + s(Imm); MSR[IE] := 1
RTBD Ra,Imm | 101101 | 10010 | Ra | Imm | PC := Ra + s(Imm); MSR[BIP] := 0
BRID Imm | 101110 | 00000 | 10000 | Imm | PC := PC + s(Imm)
BRLID Rd,Imm | 101110 | Rd | 10100 | Imm | PC := PC + s(Imm); Rd := PC
BRAI Imm | 101110 | 00000 | 01000 | Imm | PC := s(Imm)
BRAID Imm | 101110 | 00000 | 11000 | Imm | PC := s(Imm)
BRALID Rd,Imm | 101110 | Rd | 11100 | Imm | PC := s(Imm); Rd := PC
BRKI Rd,Imm | 101110 | Rd | 01100 | Imm |
BEQI Ra,Imm | 101111 | 00000 | Ra | Imm | if Ra = 0: PC := PC + s(Imm)
BNEI Ra,Imm | 101111 | 00001 | Ra | Imm | if Ra /= 0: PC := PC + s(Imm)
BLTI Ra,Imm | 101111 | 00010 | Ra | Imm | if Ra < 0: PC := PC + s(Imm)
BLEI Ra,Imm | 101111 | 00011 | Ra | Imm | if Ra <= 0: PC := PC + s(Imm)
BGTI Ra,Imm | 101111 | 00100 | Ra | Imm | if Ra > 0: PC := PC + s(Imm)
BGEI Ra,Imm | 101111 | 00101 | Ra | Imm | if Ra >= 0: PC := PC + s(Imm)
BEQID Ra,Imm | 101111 | 10000 | Ra | Imm | if Ra = 0: PC := PC + s(Imm)
BNEID Ra,Imm | 101111 | 10001 | Ra | Imm | if Ra /= 0: PC := PC + s(Imm)
BLTID Ra,Imm | 101111 | 10010 | Ra | Imm | if Ra < 0: PC := PC + s(Imm)
BLEID Ra,Imm | 101111 | 10011 | Ra | Imm | if Ra <= 0: PC := PC + s(Imm)
BGTID Ra,Imm | 101111 | 10100 | Ra | Imm | if Ra > 0: PC := PC + s(Imm)
BGEID Ra,Imm | 101111 | 10101 | Ra | Imm | if Ra >= 0: PC := PC + s(Imm)
LBU Rd,Ra,Rb | 110000 | Rd | Ra | Rb | 00000000000 | Addr := Ra + Rb; Rd[0:23] := 0, Rd[24:31] := *Addr
LHU Rd,Ra,Rb | 110001 | Rd | Ra | Rb | 00000000000 | Addr := Ra + Rb; Rd[0:15] := 0, Rd[16:31] := *Addr
LW Rd,Ra,Rb | 110010 | Rd | Ra | Rb | 00000000000 | Addr := Ra + Rb; Rd := *Addr
SB Rd,Ra,Rb | 110100 | Rd | Ra | Rb | 00000000000 | Addr := Ra + Rb; *Addr := Rd[24:31]
SH Rd,Ra,Rb | 110101 | Rd | Ra | Rb | 00000000000 | Addr := Ra + Rb; *Addr := Rd[16:31]
SW Rd,Ra,Rb | 110110 | Rd | Ra | Rb | 00000000000 | Addr := Ra + Rb; *Addr := Rd
LBUI Rd,Ra,Imm | 111000 | Rd | Ra | Imm | Addr := Ra + s(Imm); Rd[0:23] := 0, Rd[24:31] := *Addr
LHUI Rd,Ra,Imm | 111001 | Rd | Ra | Imm | Addr := Ra + s(Imm); Rd[0:15] := 0, Rd[16:31] := *Addr
LWI Rd,Ra,Imm | 111010 | Rd | Ra | Imm | Addr := Ra + s(Imm); Rd := *Addr
SBI Rd,Ra,Imm | 111100 | Rd | Ra | Imm | Addr := Ra + s(Imm); *Addr := Rd[24:31]
SHI Rd,Ra,Imm | 111101 | Rd | Ra | Imm | Addr := Ra + s(Imm); *Addr := Rd[16:31]
SWI Rd,Ra,Imm | 111110 | Rd | Ra | Imm | Addr := Ra + s(Imm); *Addr := Rd
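A few of the semantics in Table 1-2 are easier to check when written out as arithmetic on 32-bit values. The sketch below is illustrative only and uses hypothetical function names. It assumes that the reverse-subtract rows denote Rb + (NOT Ra) + 1, which is two's-complement Rb - Ra, and that bit 24 and bit 16 in the SEXT8 and SEXT16 rows are the sign bits of the low byte and low halfword under the big-endian bit numbering used in this guide.

```python
MASK32 = 0xFFFFFFFF


def rsub(ra, rb):
    """RSUB: Rd := Rb + (NOT Ra) + 1, that is Rb - Ra modulo 2**32."""
    return (rb + (~ra & MASK32) + 1) & MASK32


def sext8(ra):
    """SEXT8: replicate bit 24 (sign of the low byte) into bits 0-23."""
    low = ra & 0xFF
    return (low | 0xFFFFFF00) if low & 0x80 else low


def sext16(ra):
    """SEXT16: replicate bit 16 (sign of the low halfword) into bits 0-15."""
    low = ra & 0xFFFF
    return (low | 0xFFFF0000) if low & 0x8000 else low
```

Note the operand order: `rsub(3, 10)` computes 10 - 3, because Ra is the subtrahend.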
Registers
MicroBlaze is a fully orthogonal architecture. It has thirty-two 32-bit general purpose
registers and two 32-bit special purpose registers.
Figure 1-2: R0-R31
Name | Description | Reset Value
R0 through R31 | General Purpose Register | 0x00000000
Figure 1-3: PC
Name | Description | Reset Value
PC | Program Counter | 0x00000000
Figure 1-4: MSR
Bits | Name | Description | Reset Value
0 | CC | Copy of the Arithmetic Carry (bit 29) | 0
1-23 | Reserved | |
24 | DCE | Data Cache Enable | 0
25 | DZ | Division by Zero | 0
26 | ICE | Instruction Cache Enable | 0
27 | FSL | FSL Error | 0
28 | BIP | Break in Progress. 0 No Break in Progress; 1 Break in Progress. Source of break can be software break instruction or hardware break from Ext_Brk or Ext_NM_Brk pin. | 0
29 | C | Arithmetic Carry. 0 No Carry (Borrow); 1 Carry (No Borrow) | 0
30 | IE | Interrupt Enable. 0 Interrupts disabled; 1 Interrupts enabled | 0
31 | BE | Buslock Enable | 0
Pipeline
This section describes the MicroBlaze pipeline architecture.
Pipeline Architecture
The MicroBlaze pipeline is a parallel pipeline, divided into three stages:
Fetch
Decode
Execute
In general, each stage takes one clock cycle to complete. Consequently, it takes three clock
cycles (ignoring any delays or stalls) for the instruction to complete.
cycle 1 | cycle 2 | cycle 3
Fetch | Decode | Execute
In the MicroBlaze parallel pipeline, each stage is active on each clock cycle. Three
instructions can be executed simultaneously, one at each of the three pipeline stages. Even
though it takes three clock cycles for each instruction to complete, each pipeline stage can
work on other instructions in parallel with and in advance of the instruction that is
completing. Within one clock cycle, one new instruction is fetched, another is decoded, and
a third is completed. The pipeline effectively completes one instruction per clock cycle.
 | cycle 1 | cycle 2 | cycle 3 | cycle 4 | cycle 5
instruction 1 | Fetch | Decode | Execute | |
instruction 2 | | Fetch | Decode | Execute |
instruction 3 | | | Fetch | Decode | Execute
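The overlap described above can be reproduced with a few lines of code. This is a minimal sketch with hypothetical names, assuming an ideal pipeline with no stalls, flushes, or multi-cycle instructions: instruction i enters Fetch in cycle i and leaves Execute in cycle i + 2.

```python
STAGES = ("Fetch", "Decode", "Execute")


def schedule(n_instructions):
    """Return {cycle: {instruction number: stage}} for a stall-free run."""
    table = {}
    for i in range(1, n_instructions + 1):
        for offset, stage in enumerate(STAGES):
            table.setdefault(i + offset, {})[i] = stage
    return table


def total_cycles(n_instructions):
    """Cycles until the last instruction completes (no stalls assumed)."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1
```

In cycle 3 all three stages are busy, and three instructions finish after five cycles, which is why throughput approaches one instruction per clock cycle for long instruction sequences.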
Branches
Similar to other processor pipelines, the MicroBlaze pipeline can originate control hazards
that affect the pipeline execution rate. When an instruction that changes the control flow of
a program (branches) is executed and completed, and eventually changes the program
flow (taken branches), the previous pipeline work becomes useless. When the processor
executes a taken branch, the instructions in the fetch and decode stages are not the correct
ones, and must be discarded or flushed from the pipeline. The processor must refill the
pipeline with the correct instructions, taking three clock cycles for a taken branch, adding
a latency of two cycles for refilling the pipeline.
MicroBlaze uses two techniques to reduce the penalty of taken branches. One technique is
to use delay slots and another is use of a history buffer.
Delay Slots
When the processor executes a taken branch and flushes the pipeline, it takes three clock
cycles to refill the pipeline. By allowing the instruction following a branch to complete, this
penalty is reduced. Instead of flushing the instructions in both the fetch and decode stages,
only the fetch stage is discarded and the instruction in the decode stage is allowed to
complete. This effectively produces a delayed branch or delay slot. Since the work done on
the delay slot instruction is not discarded, this technique effectively reduces the branch
penalty from two clock cycles to one. Branch instructions that allow execution of the
subsequent instruction in the delay slot are denoted by a D in the instruction mnemonic.
For example, the BNE instruction does not execute the subsequent instruction in the delay
slot, whereas BNED does execute the next instruction in the delay slot before control is
transferred to the branch location.
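The branch penalties given above lend themselves to a back-of-the-envelope cycle estimate. The function below is only a rough sketch with a hypothetical name and parameters: it assumes two cycles to fill the pipeline, two penalty cycles per taken branch without a delay slot, one penalty cycle per taken branch with a delay slot, delay slots filled with useful instructions, and no memory wait states.

```python
def estimated_cycles(n_instructions, taken_plain=0, taken_delayed=0):
    """Rough cycle count for a straight run of instructions.

    taken_plain   -- taken branches without a delay slot (for example BNE)
    taken_delayed -- taken branches with a delay slot (for example BNED)
    """
    pipeline_fill = 2          # cycles before the first instruction completes
    penalty_plain = 2          # fetch and decode stages both flushed
    penalty_delayed = 1        # only the fetch stage is flushed
    return (n_instructions + pipeline_fill
            + penalty_plain * taken_plain
            + penalty_delayed * taken_delayed)
```

Under these assumptions, converting ten taken branches in a 100-instruction sequence to their delay-slot forms saves ten cycles.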
Load/Store Architecture
MicroBlaze can access memory in the following three data sizes:
Byte (8 bits)
Halfword (16 bits)
Word (32 bits)
Memory accesses are always data-size aligned. For halfword accesses, the least significant
address bit is forced to 0. Similarly, for word accesses, the two least significant address bits
are forced to 0.
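The forced alignment described above amounts to masking the low address bits. A minimal sketch, with a hypothetical function name:

```python
def aligned_address(addr, size):
    """Return the address actually accessed for a given access size.

    size is the access width in bytes: 1 (byte), 2 (halfword) or 4 (word).
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    # Clearing the low bits forces halfword accesses to even addresses
    # and word accesses to multiples of four.
    return addr & ~(size - 1) & 0xFFFFFFFF
```

A word access to 0x1003 is therefore performed at 0x1000, and a halfword access to the same address at 0x1002.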
MicroBlaze is a Big-Endian processor and uses the Big-Endian address and labeling
conventions shown in Figure 1-5 when accessing memory. The following abbreviations are
used:
[Figure 1-5: big-endian byte and bit labeling for a word (byte addresses n to n+3, bit
labels 0 to 31), a halfword (byte addresses n and n+1, bit labels 0 to 15), and a byte. In
each case the most significant byte (MSByte) is at the lowest address and the most
significant bit (MSBit) has the lowest bit label.]
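The big-endian convention can be checked on any host with Python's standard `struct` module. The helper names below are hypothetical, and `memory` is just a `bytearray` standing in for byte-addressed memory.

```python
import struct


def store_word(memory, addr, value):
    """Store a 32-bit word big-endian: most significant byte at addr."""
    memory[addr:addr + 4] = struct.pack(">I", value)


def load_halfword(memory, addr):
    """Load a 16-bit halfword big-endian from addr and addr + 1."""
    return struct.unpack_from(">H", memory, addr)[0]


memory = bytearray(8)
store_word(memory, 0, 0x11223344)
```

After the store, byte address 0 holds 0x11 (the MSByte) and byte address 3 holds 0x44 (the LSByte); a halfword load from address 2 returns 0x3344.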
Interrupts
When an interrupt occurs, MicroBlaze stops the current execution to handle the interrupt
request. MicroBlaze branches to address 0x00000010 and uses the General Purpose
Register 14 to store the address of the instruction that was to be executed when the
interrupt occurred. It also disables future interrupts by clearing the Interrupt Enable flag in
the Machine Status Register (setting bit 30 to 0 in MSR). The instruction located at the
address where the current PC points to is not executed. Interrupts do not occur if the BIP
bit in the MSR register is active (equal to 1).
Latency
The time it takes MicroBlaze to enter an Interrupt Service Routine (ISR) from the time
an interrupt occurs depends on the configuration of the processor. If MicroBlaze is
configured to have a hardware divider, the largest latency occurs when an interrupt
arrives during the execution of a division instruction.
Table 1-6 shows the different scenarios for interrupts. The cycle count includes the cycles
for completing the current instruction, the cycles to branch, and the cycles to access the
first instruction in the ISR.
Table 1-6: Interrupt latencies
Scenario | ISR in LMB | ISR in OPB
Normally | 4 cycles | 6 cycles
 | 6 cycles | 8 cycles
 | 38 cycles | 40 cycles
Equivalent Pseudocode
r14 ← PC
PC ← 0x00000010
MSR[IE] ← 0
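The interrupt entry sequence can be modeled in software as follows. This is a sketch, not a description of the hardware implementation: the `state` dictionary and function names are hypothetical, and the bit masks assume the big-endian numbering used in this guide, where MSR bit 0 is the most significant bit and bit 31 the least significant.

```python
def msr_mask(bit):
    """Mask for an MSR bit given its big-endian bit number (0 = MSB)."""
    return 1 << (31 - bit)


MSR_IE = msr_mask(30)    # Interrupt Enable
MSR_BIP = msr_mask(28)   # Break In Progress


def take_interrupt(state):
    """Apply the interrupt entry sequence; return True if it was taken.

    state holds 'pc', 'msr' and 'r' (a list of 32 register values).
    """
    # Interrupts are ignored while disabled or while a break is in progress.
    if not (state["msr"] & MSR_IE) or (state["msr"] & MSR_BIP):
        return False
    state["r"][14] = state["pc"]   # r14 <- PC
    state["pc"] = 0x00000010       # PC <- interrupt vector
    state["msr"] &= ~MSR_IE        # MSR[IE] <- 0
    return True
```

Because the handler entry clears IE, a second interrupt arriving before IE is set again is not taken by this model.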
Exceptions
When an exception occurs, MicroBlaze stops the current execution to handle the exception.
MicroBlaze branches to address 0x00000008 and uses the General Purpose Register 17 to
store the address of the instruction that was to be executed when the exception occurred.
The instruction located at the address where the current PC points to is not executed.
Equivalent Pseudocode
r17 ← PC
PC ← 0x00000008
Breaks
There are two kinds of breaks:
Software Breaks
To perform a software break, use the brk and brki instructions. Refer to the Instruction Set
Architecture documentation for more information on software breaks.
Hardware Breaks
Hardware breaks are performed by asserting the external break signal. When a hardware
break occurs, MicroBlaze stops the current execution to handle the break. MicroBlaze
branches to address 0x00000018 and uses the General Purpose Register 16 to store the
address of the instruction that was to be executed when the break occurred. MicroBlaze
also disables future breaks by setting the Break In Progress (BIP) flag in the Machine Status
Register (setting bit 28 to 1 in MSR). The instruction located at the address where the
current PC points to is not executed.
Hardware breaks are only handled when there is no break in progress (the Break In
Progress flag is set to 0). The Break In Progress flag has higher precedence than the
Interrupt Enabled flag. While no interrupts are handled when the Break In Progress flag is
set, breaks that occur when interrupts are disabled are handled immediately. However, it is
important to note that non-maskable hardware breaks are always handled immediately.
Equivalent Pseudocode
r16 ← PC
PC ← 0x00000018
MSR[BIP] ← 1
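The precedence rules for hardware breaks can be sketched the same way. The `state` dictionary, the function name, and the `non_maskable` parameter are hypothetical; the mask assumes MSR bit 28 counted from the most significant bit.

```python
MSR_BIP = 1 << (31 - 28)   # Break In Progress, MSR bit 28 (bit 0 is the MSB)


def take_hardware_break(state, non_maskable=False):
    """Apply the hardware break entry sequence; return True if taken.

    The Interrupt Enable flag is deliberately not consulted: breaks are
    handled even when interrupts are disabled.
    """
    if (state["msr"] & MSR_BIP) and not non_maskable:
        return False               # a break is already in progress
    state["r"][16] = state["pc"]   # r16 <- PC
    state["pc"] = 0x00000018       # PC <- break vector
    state["msr"] |= MSR_BIP        # MSR[BIP] <- 1
    return True
```

In this model a second maskable break is ignored until BIP is cleared, while a non-maskable break is always taken.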
Instruction Cache
Overview
MicroBlaze may be used with an optional instruction cache for improved performance
when executing code that resides outside the LMB address range.
The instruction cache has the following features:
- Cache on and off controlled using a new bit in the MSR register
- Does not require special memory controllers. Will work with existing OPB peripherals
Cache Organization
When the instruction cache is used, the memory address space is split into two segments: a
cacheable segment and a non-cacheable segment. The cacheable segment is determined
by two parameters, C_ICACHE_BASEADDR and C_ICACHE_HIGHADDR. All
addresses within this range correspond to the cacheable address space segment. All other
addresses are non-cacheable.
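[Figure 1-6: Cache Organization. The instruction address (bits 0 to 31) is divided into a
tag address and a cache line. The cache line addresses the Tag BRAM and the Instruction
BRAM; the stored tag is compared with the tag address and combined with the valid bit to
produce Cache_Hit, and the Instruction BRAM supplies Cache_instruction_data.]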
All cacheable instruction addresses are further split into two segments - a cache line
segment and a tag address segment. The size of the two segments can be configured by the
user. The address bits between bit 0 and the first tag address bit are ignored in the cache.
The size of the cache line can be from 9 to 14 bits. This results in cache sizes ranging
from 4 Kbytes to 64 Kbytes. There is no limit on the tag address size.
Cache Operation
In the instruction fetch stage, MicroBlaze writes the instruction address to the instruction
address bus and waits for a ready signal. To reduce wait states, a request is done
simultaneously on the instruction OPB and the instruction LMB. If an acknowledge signal
is received from the LMB in the next cycle, the instruction access from OPB is aborted. For
every instruction fetched, the instruction cache detects if the instruction address belongs to
the cacheable segment. If the address is non-cacheable, the cache ignores the instruction
and allows the LMB or the OPB to fulfill the request. If the address is cacheable, a lookup
is performed on the tag memory to check if the requested instruction is in the cache. The
lookup is successful when both the valid bit is set and the tag address is the same as the tag
address segment of the instruction address.
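The cacheability check and tag lookup described above can be sketched as a small software model. Everything here is illustrative: the class and method names are hypothetical, the two least significant address bits (bits 30 and 31) are assumed to be a byte offset, and the tag is simplified to all address bits above the cache line field, whereas the tag size in the real core is configurable.

```python
class InstructionCacheModel:
    """Direct-mapped lookup: one (valid, tag) entry per cache line."""

    def __init__(self, base, high, line_bits):
        self.base = base            # plays the role of C_ICACHE_BASEADDR
        self.high = high            # plays the role of C_ICACHE_HIGHADDR
        self.line_bits = line_bits
        self.tags = {}              # cache line -> (valid, tag)
        self.data = {}              # cache line -> instruction word

    def cacheable(self, addr):
        return self.base <= addr <= self.high

    def split(self, addr):
        """Return (tag, cache line) for an instruction address."""
        word = addr >> 2            # drop the assumed byte offset
        line = word & ((1 << self.line_bits) - 1)
        return word >> self.line_bits, line

    def fill(self, addr, instruction):
        """Record an instruction fetched from the OPB."""
        if not self.cacheable(addr):
            return False
        tag, line = self.split(addr)
        self.tags[line] = (True, tag)
        self.data[line] = instruction
        return True

    def lookup(self, addr):
        """Return the cached instruction, or None on a miss."""
        if not self.cacheable(addr):
            return None
        tag, line = self.split(addr)
        valid, stored_tag = self.tags.get(line, (False, None))
        return self.data[line] if valid and stored_tag == tag else None
```

Two addresses that map to the same cache line but carry different tags evict each other in this model, which is the usual behavior of a direct-mapped cache.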
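[Figure: instruction cache update path. IOPB_Address supplies the tag address and the
cache line. The tag address, together with the (Locked, Valid) bits 0,1, forms the data
written to the Tag BRAM, and IOPB_Data forms the data written to the Instruction BRAM.
Both BRAMs are addressed by the cache line; IOPB_Select and IOPB_XferAck drive the
write enables (WE).]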
Software
MSR Bit
Bit 26 in the MSR indicates whether or not the cache is enabled. The MFS and MTS
instructions are used to read and write to the MSR respectively.
The contents of the cache are preserved by default when the cache is disabled. The user
may overwrite the contents of the cache using the WIC instruction or using the hardware
debug logic of MicroBlaze.
WIC Instruction
The WIC instruction may be used to update the instruction cache from a software
program. The assembly instruction is
WIC Ra,Rb
Where Ra contains cache line, tag address, valid and lock bit, Rb contains the instruction
data.
Ra(31) is the lock bit, Ra(30) is the valid bit (valid when bit is set to 1), the rest of the Ra
contains the instruction address.
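The Ra operand layout can be sketched as a small packing helper. The function name is hypothetical, and the sketch assumes the big-endian bit numbering used in this guide, so Ra(31) is the least significant bit and Ra(30) the bit above it; the two low address bits are assumed to be unused for word-aligned instructions.

```python
def wic_ra(address, valid=True, lock=False):
    """Build the Ra operand for WIC from an instruction address.

    Ra(31), the least significant bit, is the lock bit and Ra(30) is the
    valid bit; the remaining bits carry the instruction address.
    """
    return (address & 0xFFFFFFFC) | (int(valid) << 1) | int(lock)
```

For instance, `wic_ra(0x80000010)` marks the entry for address 0x80000010 as valid and unlocked, giving 0x80000012.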
This instruction can only be used when the cache is disabled. The lock bit is described in
the Lock Bit section below. The
HW Debug Logic
The HW debug logic may be used to perform a similar operation as the WIC instruction.
Lock Bit
The lock bit can be used to permanently lock a code segment into the cache and therefore
guarantee the instruction execution time. Locking of the cacheline however may result in a
decrease in the number of cache hits. This is because there could be addresses that were not
cached as the cacheline is locked.
The use of instruction LMB in most cases would be a better choice for locking code
segments since the wait states for accessing the LMB are the same as for cache hits.
LMB Memory
Instruction LMB memory can be used even when the instruction cache is used. The LMB
address in that case has to be in the non-cacheable memory segment.
Data Cache
Overview
MicroBlaze may be used with an optional data cache for improved performance when
reading data that resides outside the LMB address range.
The data cache has the following features:
- Cache on and off controlled using a new bit in the MSR register
- Does not require special memory controllers. Will work with existing OPB peripherals
Cache Organization
When the data cache is used, the memory address space is split into two segments: a
cacheable segment and a non-cacheable segment. The cacheable area is determined by two
parameters, C_DCACHE_BASEADDR and C_DCACHE_HIGHADDR. All addresses
within this range correspond to the cacheable address space segment. All other addresses
are non-cacheable.
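[Figure 1-8: Cache Organization. The data address (bits 0 to 31) is divided into a tag
address and a cache line. The cache line addresses the Tag BRAM and the Data BRAM; the
stored tag is compared with the tag address and combined with the valid bit and
Load_Instruction to produce Cache_Hit, and the Data BRAM supplies the cache data.]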
All cacheable data addresses are further split into two segments - a cache line segment and
a tag address segment. The size of the two segments can be configured by the user. The
address bits between bit 0 and the first tag address bit are ignored in the cache. The size of
the cache line can be from 9 to 14 bits. This results in cache sizes ranging from 4
Kbytes to 64 Kbytes. There is no limit on the tag address size.
Cache Operation
When MicroBlaze executes a store instruction, the operation is performed as normal but if
the address is within the cacheable address segment, the data cache is updated with the
new data.
When MicroBlaze executes a load instruction, the address is first checked to see if the
address is within the cacheable area and secondly if the address is in the data cache. In that
case, the data is fetched from the data cache.
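[Figure: data cache update path. DOPB_Address supplies the tag address and the cache
line for a Cacheable_address. The tag address, together with the (Locked, Valid) bits 0,1,
forms the data written to the Tag BRAM, and DOPB_Data forms the data written to the
second BRAM (labeled Instruction BRAM in the figure). Both BRAMs are addressed by the
cache line; DOPB_Select, DOPB_XferAck, and DOPB_RNW drive the write enables (WE).]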
If the read data is in the cache, the cache will drive the ready signal (Cache_Hit) for
MicroBlaze and the data for the address. If the read data is not in the cache, the cache will
not drive the ready signal but will wait until the OPB fulfills the request.
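The store and load behavior described above can be summarized in a simplified model. This sketch uses hypothetical names and keys the cache directly by address rather than by cache line and tag. It also assumes, since the text does not say otherwise, that a load miss is simply served from memory without updating the cache.

```python
class DataCacheModel:
    """Stores always reach memory and also update the cache when cacheable."""

    def __init__(self, base, high):
        self.base = base     # plays the role of C_DCACHE_BASEADDR
        self.high = high     # plays the role of C_DCACHE_HIGHADDR
        self.lines = {}      # address -> cached data word

    def cacheable(self, addr):
        return self.base <= addr <= self.high

    def store(self, memory, addr, value):
        memory[addr] = value              # the store is performed as normal
        if self.cacheable(addr):
            self.lines[addr] = value      # and the cache is updated as well

    def load(self, memory, addr):
        """Return (value, cache_hit)."""
        if self.cacheable(addr) and addr in self.lines:
            return self.lines[addr], True
        return memory[addr], False        # request fulfilled over the OPB
```

Here `memory` is a plain dictionary standing in for OPB-attached memory; a load that follows a store to the same cacheable address is served from the cache.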
Software
MSR Bit
Bit 24 in the MSR indicates whether or not the cache is enabled. The MFS and MTS
instructions are used to read and write to the MSR respectively.
The contents of the cache are preserved by default when the cache is disabled. The user
may overwrite the contents of the cache using the WDC instruction or using the hardware
debug logic of MicroBlaze.
Note: The cache cannot be turned on/off from an interrupt handler routine, as changes
to the MSR are lost once the interrupt is handled (the MSR state is restored after interrupt
handling).
WDC Instruction
The WDC instruction may be used to update the data cache from a software program. The
assembly instruction is
WDC Ra,Rb
Where Ra contains cache line, tag address, valid and lock bit, Rb contains the data.
Ra(31) is the lock bit, Ra(30) is the valid bit (valid when bit is set to 1), the rest of Ra
contains the data address.
This instruction can only be used when the cache is disabled. The lock bit is described in
the Lock Bit section below. The
HW Debug Logic
The HW debug logic may be used to perform a similar operation as the WDC instruction.
Lock Bit
The lock bit can be used to permanently lock data into the cache and therefore
guarantee that this data is always in the cache. Locking of the cacheline however may
result in a decrease in the number of cache hits. This is because there could be addresses
that were not cached as the cacheline is locked.
The use of data LMB in most cases would be a better choice for locking data since the wait
states for accessing the LMB are the same as for cache hits.
LMB Memory
Data LMB memory can be used even when the data cache is used. The LMB address in that case
has to be in the non-cacheable memory segment.
FSL channels are dedicated uni-directional point-to-point data streaming interfaces. The
FSL interfaces on MicroBlaze are 32 bits wide. Further, the same FSL channels can be used
to transmit or receive either control or data words. A separate bit indicates whether the
transmitted (received) word is control or data information.
The blocking get instruction stalls the MicroBlaze pipeline until data becomes available in
the input FSL, fslN. Once the data is available, the instruction is completed in two clock
cycles. The get instruction is used for getting Data values. If a get instruction is used to read
a Control value (the control_in bit of the fslN is set), a FSL get error bit is set in the MSR (Bit
27).
The non-blocking get instruction does not stall the MicroBlaze pipeline whether or not
data is present on the input FSL, fslN. The instruction is completed in two clock cycles. If
the data is available, the carry bit (Bit 29) in the MSR is reset. If the instruction fails, the carry
bit in the MSR is set. Bit 0 of the MSR has the copy of the carry bit. Hence, a direct branch
on carry may be performed following the nget instruction. The nget instruction is also used
to read Data values. If a Control value is read, the FSL error bit (Bit 27 of MSR) is set.
The blocking control get instruction stalls the MicroBlaze pipeline until data becomes
available in the input FSL, fslN. Once the data is available, the instruction is completed in
two clock cycles. The cget instruction is used for reading Control values (the control_in bit
of the fslN is set). If the value read is a data value, the FSL error bit (Bit 27 of MSR) is set.
The non-blocking control get instruction does not stall the MicroBlaze pipeline whether or
not data is present on the input FSL, fslN. The instruction is completed in two clock cycles.
If the data is available, the carry bit (Bit 29) in the MSR is reset. If the instruction fails the
carry bit in the MSR is set. Bit 0 of the MSR has the copy of the carry bit. Hence, a direct
branch on carry may be performed following the ncget instruction. The ncget instruction is
also used to read Control values (the control_in bit of the fslN is set). If the value read is a
data value, the FSL error bit (Bit 27 of the MSR) is set.
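The MSR updates described for the non-blocking read instructions can be sketched as a small software model. This is an illustration only: the bit masks follow from the big-endian bit labeling used in this guide (bit 0 is the most significant bit), while the fsl_in_model type, the function name, and the choice to leave the error bit untouched when no mismatch occurs are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits are labeled big-endian: bit 0 is the MSB, bit 31 the LSB. */
#define MB_BIT(n)       (UINT32_C(1) << (31 - (n)))
#define MSR_CARRY       MB_BIT(29)  /* carry bit */
#define MSR_CARRY_COPY  MB_BIT(0)   /* copy of the carry bit */
#define MSR_FSL_ERROR   MB_BIT(27)  /* FSL error bit */

/* Hypothetical model of the head of an input FSL. */
typedef struct {
    bool exists;   /* a word is waiting on the link */
    bool control;  /* the waiting word is a control word */
} fsl_in_model;

/* Returns the MSR after a non-blocking read. want_control is false for
 * nget and true for ncget. Assumption: the error bit is left unchanged
 * when no mismatch occurs; the text only states when it is set. */
uint32_t model_nonblocking_get(uint32_t msr, fsl_in_model in, bool want_control)
{
    if (!in.exists) {
        /* Read failed: set the carry bit and its copy. */
        return msr | MSR_CARRY | MSR_CARRY_COPY;
    }
    /* Read succeeded: reset the carry bit and its copy. */
    msr &= ~(MSR_CARRY | MSR_CARRY_COPY);
    if (in.control != want_control) {
        /* Word type does not match the instruction: flag an FSL error. */
        msr |= MSR_FSL_ERROR;
    }
    return msr;
}
```

Software that polls a link would typically test the carry bit after each attempt and retry while it is set.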
The blocking put instruction stalls the MicroBlaze pipeline until data can be written to
the output FSL, fslN (data can be written when the full bit is not set). Once the data can be
written, the instruction is completed in two clock cycles. The put instruction is used for
writing Data values (the control_out bit of the fslN is reset).
The non-blocking put instruction does not stall the MicroBlaze pipeline whether or not
data can be written to the output FSL, fslN (data can be written when the full bit is not set).
The instruction is completed in two clock cycles. If the data write succeeds, the carry bit
(Bit 29) in the MSR is reset. If the data write fails, the carry bit in the MSR is set. Bit 0 of the
MSR has the copy of the carry bit. Hence, a direct branch on carry may be performed
following the nput instruction. The nput instruction is also used to write Data values (the
control_out bit of fslN is reset).
The blocking control put instruction stalls the MicroBlaze pipeline until data can be written to
the output FSL, fslN (data can be written when the full bit is not set). Once the data can be
written, the instruction is completed in two clock cycles. The cput instruction is used for
writing Control values (the control_out bit of the fslN is set).
The non-blocking control put instruction does not stall the MicroBlaze pipeline whether or not
data can be written to the output FSL, fslN (data can be written when the full bit is not set).
The instruction is completed in two clock cycles. If the data write succeeds, the carry bit
(Bit 29) in the MSR is reset. If the data write fails, the carry bit in the MSR is set. Bit 0 of the
MSR has the copy of the carry bit. Hence, a direct branch on carry may be performed
following the ncput instruction. The ncput instruction is also used to write Control values
(the control_out bit of fslN is set).
Debug Interface
MicroBlaze features a debug interface to support JTAG based software debugging tools
(commonly known as BDM or Background Debug Mode debuggers) like the Xilinx
Microprocessor Debug (XMD) tool. The debug interface is designed to be connected to the
Xilinx Microprocessor Debug Module (MDM) IP core, which interfaces with the JTAG port
of Xilinx FPGAs. Multiple MicroBlazes can be interfaced with a single MDM to enable
multiprocessor debugging.
Debugging Features
External processor control enables debug tools to stop, reset and single step
MicroBlaze
Read and write memory and all registers including PC and MSR
Chapter 2
Overview
The MicroBlaze core is organized as a Harvard architecture with separate bus interface
units for data accesses and instruction accesses. Each bus interface unit is further split into
a Local Memory Bus (LMB) and IBM's On-chip Peripheral Bus (OPB). The LMB provides
single-cycle access to on-chip dual-port block RAM. The OPB interface provides a
connection to both on- and off-chip peripherals and memory. Further, the MicroBlaze core
provides 8 input and 8 output interfaces to Fast Simplex Link (FSL) buses. The FSL buses
are uni-directional non-arbitrated dedicated communication channels.
Features
The MicroBlaze bus interfaces include the following features:
OPB V2.0 bus interface with byte-enable support (see IBM's 64-Bit On-Chip Peripheral
Bus, Architectural Specifications, Version 2.0)
LMB provides simple synchronous protocol for efficient block RAM transfers
LMB provides guaranteed performance of 125 MHz for local memory subsystem
Bus Configurations
The block diagram in Figure 2-1 depicts the MicroBlaze core with the bus interfaces
defined as follows:
DOPB: Data interface, On-chip Peripheral Bus
DLMB: Data interface, Local Memory Bus (BRAM only)
IOPB: Instruction interface, On-chip Peripheral Bus
ILMB: Instruction interface, Local Memory Bus (BRAM only)
MFSL0..MFSL7: Master data interface, Fast Simplex Link
SFSL0..SFSL7: Slave data interface, Fast Simplex Link
Core: Miscellaneous signals (Clock, Reset, Interrupt)
[Figure 2-1: MicroBlaze core block diagram. Blocks shown: instruction-side bus interface (IOPB, ILMB), data-side bus interface (DOPB, DLMB), Program Counter, Instruction Buffer, Instruction Decode, Register File (32 x 32b), Add/Sub, Shift/Logical, Multiply, MFSL0..7, and SFSL0..7. The accompanying diagrams show the bus interface combinations IOPB/ILMB/DOPB/DLMB used by each configuration.]
The optimal configuration for your application depends on your code size and data space
requirements, and on whether you require fast access to internal block RAM. The performance
implications and debug options for each configuration are shown in the following table:
Table 2-1: MicroBlaze Bus Configurations

Configuration          Core Fmax (MHz)   Debug available
IOPB+ILMB+DOPB+DLMB    110               SW/JTAG
IOPB+DOPB+DLMB         125               SW/JTAG
ILMB+DOPB+DLMB         125               SW/JTAG
IOPB+ILMB+DOPB         110               JTAG for ILMB memory (1); SW for IOPB memory
IOPB+DOPB              125               SW/JTAG
ILMB+DOPB              125               JTAG (1)

Note (1): ILMB memory can be debugged via a software resident monitor if the second port of the
dual-ported ILMB BRAM is connected to an OPB BRAM memory controller. See Figure 2-6 and
Figure 2-8. Also, all six of the above configurations can be used with a special FSL configuration. See
Figure 2-9.
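The pattern in Table 2-1 can be captured in a few lines of C. This is a sketch under stated assumptions: the mb_bus_config type and both function names are hypothetical, and the rule encoded here (the lower Core Fmax value applies when both instruction-side buses are present) is simply read off the six rows of the table.

```c
#include <assert.h>
#include <stdbool.h>

/* Bus interfaces present in a build. The fields mirror the C_I_OPB,
 * C_I_LMB, C_D_OPB, and C_D_LMB parameters. */
typedef struct {
    bool i_opb;
    bool i_lmb;
    bool d_opb;
    bool d_lmb;
} mb_bus_config;

/* True for the six configurations listed in Table 2-1: DOPB is always
 * present and at least one instruction-side bus is present. */
bool mb_config_is_listed(mb_bus_config c)
{
    return c.d_opb && (c.i_opb || c.i_lmb);
}

/* Core Fmax value as listed in Table 2-1, or 0 for a combination that
 * the table does not list. */
int mb_core_fmax(mb_bus_config c)
{
    if (!mb_config_is_listed(c)) {
        return 0;
    }
    /* Two instruction-side buses cost some clock rate. */
    return (c.i_opb && c.i_lmb) ? 110 : 125;
}
```

For example, a design with IOPB, DOPB, and DLMB maps to the second row of the table and yields 125.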
Configuration 1
[Figure: Configuration 1 (IOPB+ILMB+DOPB+DLMB). The diagram shows MicroBlaze with its IOPB, DOPB, ILMB, and DLMB interfaces; a dual-port block RAM on the LMBs; external memory controllers; an OPB-to-OPB bridge; and, on the data-side OPB, an interrupt controller, timer/counter and WDT, UART, and other OPB master, slave, or bridge.]
Purpose
Use this configuration when your application requires more instruction and data memory
than is available in the on-chip block RAM (BRAM). Critical sections of instruction and
data memory can be allocated to the faster ILMB BRAM to improve your application's
performance. Depending on how much data memory is required, the data-side memory
controller may not be present. The data-side OPB is also used for other peripherals such as
UARTs, timers, general purpose I/O, additional BRAM, and custom peripherals. The
OPB-to-OPB bridge is only required if the data-side OPB needs access to the instruction-side
OPB peripherals, such as for software-based debugging.
Typical Applications
MPEG Decoder
Communications Controller
Complex state machine for process control and other embedded applications
Characteristics
Because of the extra logic required to implement two buses per side, the maximum clock
rate of the CPU may be slightly less than configurations with one bus per side. This
configuration allows debugging of application code through either software-based
debugging (resident monitor debugging) or hardware-based JTAG debugging.
Configuration 2
[Figure: Configuration 2 (IOPB+DOPB+DLMB). The diagram shows MicroBlaze with its IOPB, DOPB, and DLMB interfaces; a block RAM on the data-side LMB; external memory controllers; an OPB-to-OPB bridge; and, on the data-side OPB, an interrupt controller, timer/counter and WDT, UART, and other OPB master, slave, or bridge.]
Purpose
Use this configuration when your application requires more instruction and data memory
than is available in the on-chip BRAM. In this configuration, all of the instruction memory
is resident in off-chip memory or on-chip memory on the instruction-side OPB. Depending
on how much data memory is required, the data-side memory controller may not be
present. The data-side OPB is also used for other peripherals such as UARTs, timers,
general purpose I/O, additional BRAM, and custom peripherals. The OPB-to-OPB bridge
is only required if the data-side OPB needs access to the instruction-side OPB peripherals,
such as for software-based debugging.
Typical Applications
MPEG Decoder
Communications Controller
Complex state machine for process control and other embedded applications
Characteristics
This configuration allows the CPU core to operate at the maximum clock rate because of
the simpler instruction-side bus structure. Instruction fetches on the OPB, however, are
slower than fetches from BRAM on the LMB. Overall processor performance is lower than
implementations using LMB unless a large percentage of code is run from the internal
instruction history buffer. This configuration allows debugging of application code
through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging.
Configuration 3
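[Figure: Configuration 3 (ILMB+DOPB+DLMB). The diagram shows MicroBlaze with its DOPB, ILMB, and DLMB interfaces; a dual-port block RAM on the instruction-side and data-side LMBs; an external memory controller; and, on the data-side OPB, an interrupt controller, timer/counter and WDT, UART, and other OPB master, slave, or bridge.]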
Purpose
Use this configuration when your application code fits into the on-chip BRAM, but more
memory may be required for data memory. Critical sections of data memory can be
allocated to the faster DLMB BRAM to improve your application's performance.
Depending on how much data memory is required, the data-side memory controller may
not be present. The data-side OPB is also used for other peripherals such as UARTs, timers,
general purpose I/O, additional BRAM, and custom peripherals.
Typical Applications
Data-intensive controllers
Characteristics
This configuration allows the CPU core to operate at the maximum clock rate because of
the simpler instruction-side bus structure. The instruction-side LMB provides two-cycle
pipelined read access from the BRAM for an effective access rate of one instruction per
clock. This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging.
Configuration 4
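[Figure: Configuration 4 (IOPB+ILMB+DOPB). The diagram shows MicroBlaze with its IOPB, DOPB, and ILMB interfaces; a block RAM on the ILMB with a BRAM memory controller; external memory controllers; an OPB-to-OPB bridge; and, on the data-side OPB, an interrupt controller, timer/counter and WDT, UART, and other OPB master, slave, or bridge.]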
Purpose
Use this configuration when your application requires more instruction and data memory
than is available in the on-chip BRAM. Critical sections of instruction memory can be
allocated to the faster ILMB BRAM to improve your application's performance. The
data-side OPB is used for one or more external memory controllers and other peripherals such
as UARTs, timers, general purpose I/O, additional BRAM, and custom peripherals. The
OPB-to-OPB bridge is only required if the data-side OPB needs access to the
instruction-side OPB peripherals, such as for software-based debugging.
Typical Applications
MPEG Decoder
Communications Controller
Complex state machine for process control and other embedded applications
Characteristics
Because of the extra logic required to implement two buses per side, the maximum clock
rate of the CPU may be slightly less than configurations with one bus per side. This
configuration allows debugging of application code through either software-based
debugging (resident monitor debugging) or hardware-based JTAG debugging. However,
software-based debugging of code in the ILMB BRAM can only be performed if a BRAM
memory controller is included on the D-side OPB bus to provide write access to the LMB
BRAM.
Configuration 5
[Figure: Configuration 5 (IOPB+DOPB). The diagram shows MicroBlaze with its IOPB and DOPB interfaces; external memory controllers; an OPB-to-OPB bridge; and, on the data-side OPB, an interrupt controller, timer/counter and WDT, UART, and other OPB master, slave, or bridge.]
Purpose
Use this configuration when your application requires external instruction and data
memory. In this configuration, all of the instruction and data memory is resident in off-chip
memory or on-chip memory on the OPB buses. The data-side OPB is used for one or more
external memory controllers and other peripherals such as UARTs, timers, general
purpose I/O, BRAM, and custom peripherals. The OPB-to-OPB bridge is only required if
the data-side OPB needs access to the instruction-side OPB peripherals, such as for
software-based debugging.
Typical Applications
MPEG Decoder
Communications Controller
Complex state machine for process control and other embedded applications
Characteristics
This configuration allows the CPU core to operate at the maximum clock rate because of
the simpler instruction-side bus structure. However, instruction fetches on the OPB are
slower than fetches from BRAM on the LMB. Overall processor performance is lower than
implementations using LMB unless a large percentage of code is run from the internal
instruction history buffer. This configuration allows debugging of application code
through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging.
Configuration 6
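[Figure: Configuration 6 (ILMB+DOPB). The diagram shows MicroBlaze with its DOPB and ILMB interfaces; a dual-port block RAM on the instruction-side LMB with a BRAM memory controller; an external memory controller; and, on the data-side OPB, an interrupt controller, timer/counter and WDT, UART, and other OPB master, slave, or bridge.]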
Purpose
Use this configuration when your application code fits into the on-chip ILMB BRAM, but
more memory may be required for data memory. The data-side OPB is used for one or
more external memory controllers and other peripherals such as UARTs, timers, general
purpose I/O, additional BRAM, and custom peripherals.
Typical Applications
Minimal controllers
Characteristics
This configuration allows the CPU core to operate at the maximum clock rate because of
the simpler instruction-side bus structure. The instruction-side LMB provides two-cycle
pipelined read access from the BRAM for an effective access rate of one instruction per
clock. This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging.
However, software-based debugging of code in the ILMB BRAM can only be performed if
a BRAM memory controller is included on the D-side OPB bus to provide write access to
the LMB BRAM.
FSL Configuration
Along with any of the configurations specified above, MicroBlaze can optionally include up to 8
FSL input interfaces and 8 FSL output interfaces.
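[Figure 2-9: FSL configuration diagram. It shows the MicroBlaze processor register file connected to the FSL input channels (IN FIFO, IN PORT) and the FSL output channels (OUT FIFO, OUT PORT).]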
Purpose
Use this configuration for transmitting data directly from the MicroBlaze core to other
peripherals or processors without using a shared bus. MicroBlaze contains several
instructions to read from the input FSLs and write to the output FSLs. The read and write
each consume two clock cycles. The number of FSLs in MicroBlaze can be configured by
using the C_NUM_FSL parameter.
Typical Applications
The FSLs are particularly useful for streaming data style applications. These include signal
processing, image processing, DSP and Network processing applications. The FSL
communication channels can also be used to interface with hardware accelerators that are
implemented on the reconfigurable fabric.
Characteristics
The CPU clock frequency is unaffected by the addition of FSLs to the MicroBlaze core. The
area of the MicroBlaze core increases slightly based on the number of FSL interfaces.
Word
Byte address        n         n+1       n+2       n+3
Byte label          0         1         2         3
Byte significance   MSByte                        LSByte
Bit label           0                             31
Bit significance    MSBit                         LSBit

Halfword
Byte address        n         n+1
Byte label          0         1
Byte significance   MSByte    LSByte
Bit label           0         15
Bit significance    MSBit     LSBit

Byte
Byte address        n
Byte label          0
Byte significance   MSByte
Bit label           0         7
Bit significance    MSBit     LSBit
Core I/O
The MicroBlaze core implements separate buses for instruction fetch and data access,
denoted the I side and D side buses, respectively. These buses are split into the following
two bus types:
OPB V2.0 compliant bus for OPB peripherals and memory controllers
Local Memory Bus used exclusively for high-speed access to internal block RAM
(BRAM).
All core I/O signals are listed in Table 2-2. Page numbers prefaced by OPB reference IBM's
64-Bit On-Chip Peripheral Bus, Architectural Specifications, Version 2.0.
The core interfaces shown in the following table are defined as follows:
Table 2-2: Core I/O signals

Signal                Interface   Description   Page
DM_ABus[0:31]         DOPB                      OPB-11
DM_BE[0:3]            DOPB                      OPB-16
DM_busLock            DOPB                      OPB-9
DM_DBus[0:31]         DOPB                      OPB-13
DM_request            DOPB                      OPB-8
DM_RNW                DOPB                      OPB-12
DM_select             DOPB                      OPB-12
DM_seqAddr            DOPB                      OPB-13
DOPB_DBus[0:31]       DOPB                      OPB-13
DOPB_errAck           DOPB                      OPB-15
DOPB_MGrant           DOPB                      OPB-9
DOPB_retry            DOPB                      OPB-10
DOPB_timeout          DOPB                      OPB-10
DOPB_xferAck          DOPB                      OPB-14
IM_ABus[0:31]         IOPB                      OPB-11
IM_BE[0:3]            IOPB                      OPB-16
IM_busLock            IOPB                      OPB-9
IM_DBus[0:31]         IOPB                      OPB-13
IM_request            IOPB                      OPB-8
IM_RNW                IOPB                      OPB-12
IM_select             IOPB                      OPB-12
IM_seqAddr            IOPB                      OPB-13
IOPB_DBus[0:31]       IOPB                      OPB-13
IOPB_errAck           IOPB                      OPB-15
IOPB_MGrant           IOPB                      OPB-9
IOPB_retry            IOPB                      OPB-10
IOPB_timeout          IOPB                      OPB-10
IOPB_xferAck          IOPB                      OPB-12
Data_Addr[0:31]       DLMB                      49
Byte_Enable[0:3]      DLMB                      49
Data_Write[0:31]      DLMB                      50
D_AS                  DLMB                      50
Read_Strobe           DLMB                      50
Write_Strobe          DLMB                      50
Data_Read[0:31]       DLMB                      50
DReady                DLMB                      50
Instr_Addr[0:31]      ILMB                      49
I_AS                  ILMB                      50
IFetch                ILMB                      50
Instr[0:31]           ILMB                      50
IReady                ILMB                      50
FSL0_M .. FSL7_M      MFSL
FSL0_S .. FSL7_S      SFSL
Interrupt             Core        Interrupt
Reset                 Core        Core reset
Clk                   Core        Clock
Debug_Rst             Core
Ext_BRK               Core
Ext_NM_BRK            Core
Dbg_...               Core
Bus Organization
OPB Bus Configuration
The MicroBlaze OPB interfaces are organized as byte-enable capable only masters. The
byte-enable architecture is an optional subset of the OPB V2.0 specification and is ideal for
low-overhead FPGA implementations such as MicroBlaze.
The OPB data bus interconnects are illustrated in Figure 2-11. The write data bus (from
masters and bridges) is separated from the read data bus (from slaves and bridges) to
break up the bus OR logic. In minimal cases this can completely eliminate the OR logic for
the read or write data buses. Optionally, you can "OR" together the read and write buses to
create the correct functionality for the OPB bus monitor. Note that the instruction-side OPB
contains a write data bus (tied to 0x00000000) and a RNW signal (tied to logic 1) so that its
interface remains consistent with the data-side OPB. These signals are constant and
generally are minimized in implementation.
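The OR-based bus combination described above can be modeled in software. The sketch below is an illustration only: it assumes that every inactive driver holds its data output at zero, so that OR-ing all drivers yields the value of the single active driver. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* OR-combine the values driven onto one OPB data bus. Assumption: every
 * inactive driver holds its output at zero, so the OR yields the value
 * of the single active driver. */
uint32_t opb_or_bus(const uint32_t *drivers, size_t count)
{
    assert(drivers != NULL || count == 0);
    uint32_t bus = 0;
    for (size_t i = 0; i < count; i++) {
        bus |= drivers[i];
    }
    return bus;
}

/* Combined data bus for a bus monitor: the OR of the separate write
 * and read data buses. */
uint32_t opb_monitor_dbus(uint32_t wr_dbus, uint32_t rd_dbus)
{
    return wr_dbus | rd_dbus;
}
```

With only one slave and no bridge on a bus, the driver list has a single entry and the OR reduces to a plain connection, which corresponds to the minimal case mentioned in the text.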
A multi-ported slave is used instead of a bridge in the example shown in Figure 2-12. This
could represent a memory controller with a connection to both the IOPB and the DOPB. In
this case, the bus multiplexing and prioritization must be done in the slave. The advantage
of this approach is that a separate I-to-D bridge and an OPB arbiter on the instruction side
are not required. The arbiter function must still exist in the slave device.
[Figure 2-11: OPB interconnection with a DOPB-to-IOPB bridge. The diagram shows the MicroBlaze data OPB interface (DM_* master signals) and instruction OPB interface (IM_* master signals), OPB Slave1 (Sl1_*) on the data-side OPB, OPB Slave2 (Sl2_*) on the instruction-side OPB, the DOPB-to-IOPB bridge (Br1_*), the D-side and I-side OPB arbiters, and the OR gates that form the DOPB_* and IOPB_* bus signals. The combined DOPB_DBus[0:31] and IOPB_DBus[0:31] signals are present for bus monitor functions.]
[Diagram: the MicroBlaze data OPB interface (DM_* master signals) and instruction OPB interface (IM_* master signals), OPB Slave1 (Sl1_*) on the data-side OPB, a multi-ported OPB Slave2 (Sl2_*) attached to both the data-side and instruction-side OPB, the D-side OPB arbiter, and the OR gates that form the DOPB_* and IOPB_* bus signals. The combined DOPB_DBus[0:31] and IOPB_DBus[0:31] signals are present for bus monitor functions.]
Figure 2-12: OPB Interconnection (with multi-ported slave and no bridge)
Signal             Data Interface     Instr. Interface   Type     Description
Addr[0:31]         Data_Addr[0:31]    Instr_Addr[0:31]   output   Address bus
Byte_Enable[0:3]   Byte_Enable[0:3]   not used           output   Byte enables
Data_Write[0:31]   Data_Write[0:31]   not used           output   Write data bus
AS                 D_AS               I_AS               output   Address strobe
Read_Strobe        Read_Strobe        IFetch             output   Read in progress
Write_Strobe       Write_Strobe       not used           output   Write in progress
Data_Read[0:31]    Data_Read[0:31]    Instr[0:31]        input    Read data bus
Ready              DReady             IReady             input    Transfer complete
Clk                Clk                Clk                input    Bus clock
Addr[0:31]
The address bus is an output from the core and indicates the memory address that is being
accessed by the current transfer. It is valid only when AS is high. In multicycle accesses
(accesses requiring more than one clock cycle to complete), Addr[0:31] is valid only in the
first clock cycle of the transfer.
Byte_Enable[0:3]
The byte enable signals are outputs from the core and indicate which byte lanes of the data
bus contain valid data. Byte_Enable[0:3] is valid only when AS is high. In multicycle
accesses (accesses requiring more than one clock cycle to complete), Byte_Enable[0:3] is
valid only in the first clock cycle of the transfer. Valid values for Byte_Enable[0:3] are
shown in the following table:
Table 2-4: Valid Values for Byte_Enable[0:3]

                   Byte Lanes Used
Byte_Enable[0:3]   Data[0:7]   Data[8:15]   Data[16:23]   Data[24:31]
0000
0001                                                      x
0010                                        x
0100                           x
1000               x
0011                                        x             x
1100               x           x
1111               x           x            x             x
Data_Write[0:31]
The write data bus is an output from the core and contains the data that is written to
memory. It becomes valid when AS is high and goes invalid in the clock cycle after Ready
is sampled high. Only the byte lanes specified by Byte_Enable[0:3] contain valid data.
AS
The address strobe is an output from the core and indicates the start of a transfer and
qualifies the address bus and the byte enables. It is high only in the first clock cycle of the
transfer, after which it goes low and remains low until the start of the next transfer.
Read_Strobe
The read strobe is an output from the core and indicates that a read transfer is in progress.
This signal goes high in the first clock cycle of the transfer, and remains high until the clock
cycle after Ready is sampled high. If a new read transfer is started in the clock cycle after
Ready is high, then Read_Strobe remains high.
Write_Strobe
The write strobe is an output from the core and indicates that a write transfer is in progress.
This signal goes high in the first clock cycle of the transfer, and remains high until the clock
cycle after Ready is sampled high. If a new write transfer is started in the clock cycle after
Ready is high, then Write_Strobe remains high.
Data_Read[0:31]
The read data bus is an input to the core and contains data read from memory.
Data_Read[0:31] is valid on the rising edge of the clock when Ready is high.
Ready
The Ready signal is an input to the core and indicates completion of the current transfer
and that the next transfer can begin in the following clock cycle. It is sampled on the rising
edge of the clock. For reads, this signal indicates the Data_Read[0:31] bus is valid, and for
writes it indicates that the Data_Write[0:31] bus has been written to local memory.
Clk
All operations on the LMB are synchronous to the MicroBlaze core clock.
[Timing diagrams. Each shows the signals Clk, Addr, Byte_Enable, Data_Write, AS, Read_Strobe, Write_Strobe, Data_Read, and Ready.]
Figure 2-13: LMB Generic Write Operation
Figure 2-14: LMB Generic Read Operation
Figure 2-15: LMB Back-to-Back Write Operation
Figure 2-16: LMB Single Cycle Back-to-Back Read Operation
Figure 2-17: Back-to-Back Mixed Read/Write Operation
MicroBlaze does not support transfers that are larger than the addressed device. These
types of transfers require dynamic bus sizing and conversion cycles that are not supported
by the MicroBlaze bus interface. Data steering for read cycles is shown in Table 2-5, and
data steering for write cycles is shown in Table 2-6.
Table 2-5: Read Data Steering (load to Register rD)

Address   Byte_Enable   Transfer   Register rD Data
[30:31]   [0:3]         Size       rD[0:7]     rD[8:15]    rD[16:23]   rD[24:31]
11        0001          byte                                           Byte3
10        0010          byte                                           Byte2
01        0100          byte                                           Byte1
00        1000          byte                                           Byte0
10        0011          halfword                           Byte2       Byte3
00        1100          halfword                           Byte0       Byte1
00        1111          word       Byte0       Byte1       Byte2       Byte3

Table 2-6: Write Data Steering (store from Register rD)

Address   Byte_Enable   Transfer   Write Data Bus Bytes
[30:31]   [0:3]         Size       Byte0       Byte1       Byte2       Byte3
11        0001          byte                                           rD[24:31]
10        0010          byte                               rD[24:31]
01        0100          byte                   rD[24:31]
00        1000          byte       rD[24:31]
10        0011          halfword                           rD[16:23]   rD[24:31]
00        1100          halfword   rD[16:23]   rD[24:31]
00        1111          word       rD[0:7]     rD[8:15]    rD[16:23]   rD[24:31]
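The steering rules in Table 2-5 and Table 2-6 can be expressed compactly in C. The sketch below is a software model, not the hardware implementation. It represents the 32-bit bus as a uint32_t whose most significant byte is Byte0, and it assumes that register bits and byte lanes not named in the tables are zero. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { XFER_BYTE = 1, XFER_HALFWORD = 2, XFER_WORD = 4 } xfer_size;

/* Byte_Enable[0:3] as a 4-bit value whose MSB is Byte_Enable[0] (Byte0,
 * the most significant lane). Returns 0 for a misaligned access. */
unsigned byte_enable_for(uint32_t addr, xfer_size size)
{
    unsigned low = addr & 3u;  /* Address[30:31] */
    switch (size) {
    case XFER_BYTE:     return 0x8u >> low;
    case XFER_HALFWORD: return (low & 1u) ? 0u : (0xCu >> low);
    case XFER_WORD:     return low ? 0u : 0xFu;
    }
    return 0u;
}

/* Read steering (Table 2-5): move the enabled lanes of the read bus into
 * the low-order end of rD. Assumption: the upper bits of rD are zero. */
uint32_t steer_read(uint32_t bus, uint32_t addr, xfer_size size)
{
    unsigned low = addr & 3u;
    assert(byte_enable_for(addr, size) != 0u);  /* aligned accesses only */
    switch (size) {
    case XFER_BYTE:     return (bus >> (8u * (3u - low))) & 0xFFu;
    case XFER_HALFWORD: return (bus >> (8u * (2u - low))) & 0xFFFFu;
    case XFER_WORD:     return bus;
    }
    return 0u;
}

/* Write steering (Table 2-6): place the low-order end of rD on the
 * enabled lanes. Assumption: lanes that are not enabled carry zero. */
uint32_t steer_write(uint32_t rd, uint32_t addr, xfer_size size)
{
    unsigned low = addr & 3u;
    assert(byte_enable_for(addr, size) != 0u);  /* aligned accesses only */
    switch (size) {
    case XFER_BYTE:     return (rd & 0xFFu) << (8u * (3u - low));
    case XFER_HALFWORD: return (rd & 0xFFFFu) << (8u * (2u - low));
    case XFER_WORD:     return rd;
    }
    return 0u;
}
```

Writing a value with steer_write() and reading it back with steer_read() at the same address and size returns the low-order byte or halfword of the original register value, which is a quick consistency check on the two tables.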
Note that other OPB masters may have more restrictive requirements for byte lane
placement than those allowed by MicroBlaze. OPB slave devices are typically attached
"left-justified" with byte devices attached to the most-significant byte lane, and halfword
devices attached to the most significant halfword lane. The MicroBlaze steering logic fully
supports this attachment method.
Table 2-7:

Signal Name       Description   VHDL Type          Direction
FSLn_M_Clk        Clock         std_logic          input
FSLn_M_Write                    std_logic          output
FSLn_M_Data                     std_logic_vector   output
FSLn_M_CONTROL                  std_logic          output
FSLn_M_FULL                     std_logic          input

Table 2-8:

Signal Name       Description   VHDL Type          Direction
FSLn_S_Clk        Clock         std_logic          input
FSLn_S_Read                     std_logic          output
FSLn_S_Data                     std_logic_vector   input
FSLn_S_Control                  std_logic          input
FSLn_S_Exists                   std_logic          input
Debug Interface
The debug interface on MicroBlaze is designed to work with the Xilinx Microprocessor
Debug Module (MDM) IP core, which interfaces with the JTAG port of Xilinx FPGAs. An
external software debug tool can control MicroBlaze using the MDM core and the debug
port on MicroBlaze. The MDM can support connections to multiple MicroBlaze debug
ports. The debug signals on MicroBlaze are listed in Table 2-9.

Table 2-9:

Signal Name       Description   VHDL Type          Direction
Dbg_Clk                         std_logic          input
Dbg_TDI                         std_logic          input
Dbg_TDO                         std_logic          output
Dbg_Reg_En                      std_logic          input
Dbg_Capture                     std_logic          input
Dbg_Update                      std_logic          input
Implementation
Parameterization
The following characteristics of MicroBlaze can be parameterized:
Barrel shifter
Number of FSL interfaces (same number for both input and output)
Interrupt port
Debug port
Instruction cache
Data cache
Table 2-10: MPD Parameters

Feature/Description                Parameter Name            Allowable Values                                      Default Value   VHDL Type
Target Family                      C_FAMILY                                                                        virtex2         string
Data Size                          C_DATA_SIZE               32                                                    32              integer
Instance Name                      C_INSTANCE                                                                      microblaze      string
                                   C_D_OPB                   0, 1                                                                  integer
                                   C_D_LMB                   0, 1                                                                  integer
                                   C_I_OPB                   0, 1                                                                  integer
                                   C_I_LMB                   0, 1                                                                  integer
Barrel Shifter                     C_USE_BARREL              0, 1                                                                  integer
Divide Unit                        C_USE_DIV                 0, 1                                                                  integer
                                   C_FSL_LINKS               0..8                                                                  integer
                                   C_FSL_DATA_SIZE           32                                                    32              integer
Level/Edge Interrupt               C_INTERRUPT_IS_EDGE       0, 1                                                                  integer
Negative/Positive Edge Interrupt   C_EDGE_IS_POSITIVE        0, 1                                                                  integer
                                   C_DEBUG_ENABLED           0, 1                                                                  integer
Number of hardware breakpoints     C_NUMBER_OF_PC_BRK        0-8                                                                   integer
                                   C_NUMBER_OF_RD_ADDR_BRK   0-4                                                                   integer
                                   C_NUMBER_OF_WR_ADDR_BRK   0-4                                                                   integer
Instruction cache                  C_USE_ICACHE              0, 1                                                                  integer
                                   C_ADDR_TAG_BITS           0-24                                                                  integer
                                   C_CACHE_BYTE_SIZE         512, 1024, 2048, 4096, 8192, 16384, 32768, 65536      8192            integer
                                   C_ICACHE_BASEADDR         X00000000 to XFFFFFFFF                                X00000000       std_logic_vector
                                   C_ICACHE_HIGHADDR         X00000000 to XFFFFFFFF                                X3FFFFFFF       std_logic_vector
                                   C_ALLOW_ICACHE_WR         0, 1                                                                  integer
Data cache                         C_USE_DCACHE              0, 1                                                                  integer
                                   C_DCACHE_ADDR_TAG         0-24                                                                  integer
                                   C_DCACHE_BYTE_SIZE        512, 1024, 2048, 4096, 8192, 16384, 32768, 65536      8192            integer
                                   C_DCACHE_BASEADDR         X00000000 to XFFFFFFFF                                X00000000       std_logic_vector
                                   C_DCACHE_HIGHADDR         X00000000 to XFFFFFFFF                                X3FFFFFFF       std_logic_vector
                                   C_ALLOW_DCACHE_WR         0, 1                                                                  integer
Chapter 3
MicroBlaze Endianness
This chapter describes big-endian and little-endian data objects and how to use little-endian
data with the big-endian MicroBlaze soft processor. This chapter includes the
following sections:
Definitions
VHDL Example
Definitions
Data are stored or retrieved in memory, in byte, half word, word, or double word units.
Endianness refers to the order in which data are stored and retrieved. Little-endian
specifies that the least significant byte is assigned the lowest byte address. Big-endian
specifies that the most significant byte is assigned the lowest byte address.
Note Endianness does not affect single byte data.
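The two byte orders defined above can be sketched in C. This is a minimal illustration, not MicroBlaze-specific code; the function names store_be32 and store_le32 are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Big-endian: the most significant byte goes to the lowest byte address. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Little-endian: the least significant byte goes to the lowest byte address. */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}
```

Storing the word 0x11121314 with store_be32 places 0x11 at address n, while store_le32 places 0x14 there.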
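Word data type:
Byte address      | n      | n+1 | n+2 | n+3
Byte label        | 0      | 1   | 2   | 3
Byte significance | MSByte |     |     | LSByte
Bit label         | 0      |     |     | 31
Bit significance  | MSBit  |     |     | LSBit

Half word data type:
Byte address      | n      | n+1
Byte label        | 0      | 1
Byte significance | MSByte | LSByte
Bit label         | 0      | 15
Bit significance  | MSBit  | LSBit

Byte data type:
Byte address      | n
Byte label        | 0
Byte significance | MSByte
Bit label         | 0 to 7
Bit significance  | MSBit to LSBit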
The following C language structure includes various scalars and character strings. The
comments indicate the value assumed to be in each structure element. These values show
how the bytes comprising each structure element are mapped into storage.
struct {
int a; /* 0x1112_1314 word */
long long b; /* 0x2122_2324_2526_2728 double word */
char *c; /* 0x3132_3334 word */
char d[7]; /* 'A','B','C','D','E','F','G' array of bytes */
short e; /* 0x5152 halfword */
int f; /* 0x6162_6364 word */
} s;
C structure mapping rules permit the use of padding (skipped bytes) to align scalars on
desirable boundaries. The structure mapping examples show each scalar aligned at its
natural boundary. This alignment introduces padding of four bytes between a and b, one
byte between d and e, and two bytes between e and f. The same amount of padding is
present in both big-endian and little-endian mappings.
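The padding amounts quoted above can be checked with a small C sketch that places each member at its natural boundary. It assumes each scalar is aligned to its own size (4 for int and the pointer, 8 for long long, 2 for short, 1 for char); the helper names align_up and place_member are hypothetical.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Place one member: returns its offset and advances *next past it. */
size_t place_member(size_t *next, size_t size, size_t align)
{
    size_t offset = align_up(*next, align);
    *next = offset + size;
    return offset;
}
```

Placing a, b, c, d, e, and f in order gives offsets 0x00, 0x08, 0x10, 0x14, 0x1C, and 0x20, which is consistent with four padding bytes after a, one after d, and two after e.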
Note For the MicroBlaze core, all operands in the ALU and GPRs, and all pipeline
instructions are big-endian.
The big-endian mapping of struct is shown in the following table. (The data is
highlighted in the structure mappings). Hexadecimal addresses are below the data stored
at the address. The contents of each byte, as defined in the structure, are shown as a
number (hexadecimal) or character (for the string elements).
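11   | 12   | 13   | 14   |      |      |      |
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07
21   | 22   | 23   | 24   | 25   | 26   | 27   | 28
0x08 | 0x09 | 0x0A | 0x0B | 0x0C | 0x0D | 0x0E | 0x0F
31   | 32   | 33   | 34   | 'A'  | 'B'  | 'C'  | 'D'
0x10 | 0x11 | 0x12 | 0x13 | 0x14 | 0x15 | 0x16 | 0x17
'E'  | 'F'  | 'G'  |      | 51   | 52   |      |
0x18 | 0x19 | 0x1A | 0x1B | 0x1C | 0x1D | 0x1E | 0x1F
61   | 62   | 63   | 64   |      |      |      |
0x20 | 0x21 | 0x22 | 0x23 | 0x24 | 0x25 | 0x26 | 0x27

Little-endian mapping:
14   | 13   | 12   | 11   |      |      |      |
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07
28   | 27   | 26   | 25   | 24   | 23   | 22   | 21
0x08 | 0x09 | 0x0A | 0x0B | 0x0C | 0x0D | 0x0E | 0x0F
34   | 33   | 32   | 31   | 'A'  | 'B'  | 'C'  | 'D'
0x10 | 0x11 | 0x12 | 0x13 | 0x14 | 0x15 | 0x16 | 0x17
'E'  | 'F'  | 'G'  |      | 52   | 51   |      |
0x18 | 0x19 | 0x1A | 0x1B | 0x1C | 0x1D | 0x1E | 0x1F
64   | 63   | 62   | 61   |      |      |      |
0x20 | 0x21 | 0x22 | 0x23 | 0x24 | 0x25 | 0x26 | 0x27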
VHDL Example
BRAM LMB Example
LMB uses big-endian byte addressing, while the BRAM uses little-endian byte addressing.
To translate data between the two busses, swap the data and address bytes.
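As a rough software model of the swap described above, the following C sketch reverses the byte order of a 32-bit data word and maps a byte lane index to its mirrored position. It is an assumption about what the translation amounts to at byte granularity, not a transcription of the VHDL; the names swap_bytes32 and swap_byte_lane are hypothetical.

```c
#include <stdint.h>

/* Reverse the four bytes of a word: byte 0 <-> byte 3, byte 1 <-> byte 2. */
uint32_t swap_bytes32(uint32_t w)
{
    return (w >> 24)
         | ((w >> 8) & 0x0000FF00u)
         | ((w << 8) & 0x00FF0000u)
         | (w << 24);
}

/* Map a byte lane (0..3) in one byte order to the matching lane in the other. */
unsigned swap_byte_lane(unsigned lane)
{
    return 3u - (lane & 3u);
}
```

Applying swap_bytes32 twice returns the original word, so the same function serves both directions.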
to 31);
to 31);
to 31);
to 3)
end Local_Memory;
architecture IMP of Local_Memory is
downto 0);
downto 0);
downto 0);
downto 0);
downto 0);
end loop;
end process Swap_BE_and_LE_order;
BRAM Instantiation
mem_dp_0_I : mem_dp_0
  port map (
    addra => addra,                  --[IN std_logic_VECTOR(9 downto 0)]
    addrb => addrb,                  --[IN std_logic_VECTOR(9 downto 0)]
    clka  => Clk,                    --[IN std_logic]
    clkb  => Clk,                    --[IN std_logic]
    dinb  => dinb(31 downto 24),     --[IN std_logic_VECTOR(7 downto 0)]
    douta => douta(31 downto 24),    --[OUT std_logic_VECTOR(7 downto 0)]
    doutb => doutb(31 downto 24),    --[OUT std_logic_VECTOR(7 downto 0)]
    web   => we(0));                 --[IN std_logic]
std_logic_vector(0 to 31);
std_logic_vector(0 to 3);
std_logic;
std_logic;
std_logic;
std_logic_vector(0 to 31);
OPB_BRAM_DBus    : out std_logic_vector(0 to 31);
OPB_BRAM_errAck  : out std_logic;
OPB_BRAM_retry   : out std_logic;
OPB_BRAM_toutSup : out std_logic;
OPB_BRAM_xferAck : out std_logic;
to
to
to
to
31);
3);
31);
31)
BRAM Instantiation
All_Brams : for I in 0 to C_NO_BRAMS-1 generate
  By_8 : if (C_NO_BRAMS = 4) generate
    RAMB16_S9_S9_I : RAMB16_S9_S9
      port map (
        DIA  => opb_DBUS_LE(((I+1)*8-1) downto I*8),        -- [in std_logic_vector (7 downto 0)]
        DIB  => bram_Write_Data_LE(((I+1)*8)-1 downto I*8), -- [in std_logic_vector (7 downto 0)]
        DIPA => null_1,      -- [in std_logic_vector (7 downto 0)]
        DIPB => null_1,      -- [in std_logic_vector (7 downto 0)]
        ENA  => '1',         -- [in std_ulogic]
        ENB  => '1',         -- [in std_ulogic]
        WEA  => opb_WE(I),   -- [in std_ulogic]
        WEB  => BRAM_WE(I),  -- [in std_ulogic]
        SSRA => '0',         -- [in std_ulogic]
        SSRB => '0',         -- [in std_ulogic]
        CLKA => OPB_Clk,     -- [in std_ulogic]
        CLKB => BRAM_Clk,    -- [in std_ulogic]
Chapter 4
Data Types
The data types used by MicroBlaze assembly programs are shown in Table 4-1. Data types
such as data8, data16, and data32 are used in place of the usual byte, halfword, and word.
Table 4-1: Data types in MicroBlaze assembly programs
MicroBlaze data types (for assembly programs) | Corresponding ANSI C data types | Size (bytes)
data8         | char        | 1
data16        | short       | 2
data32        | int         | 4
data32        | long int    | 4
data32        | enum        | 4
data16/data32 | pointer (a) | 2/4
a. Pointers to small data areas, which can be accessed by global pointers, are data16.
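Register | Type         | Purpose
R0       | Dedicated    | Value 0
R1       | Dedicated    | Stack Pointer
R2       | Dedicated    | Read-only small data area anchor
R3-R4    | Volatile     | Return Values
R5-R10   | Volatile     | Passing parameters/Temporaries
R11-R12  | Volatile     | Temporaries
R13      | Dedicated    | Read-write small data area anchor
R14      | Dedicated    | Return address for interrupts
R15      | Dedicated    | Return address for sub-routines
R16      | Dedicated    | Return address for traps
R17      | Dedicated    | Return address for exceptions
R18      | Dedicated    |
R19-R31  | Non-Volatile | Retained across function calls (saved by the callee)
RPC      | Special      | Program counter
RMSR     | Special      | Machine status register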
The architecture for MicroBlaze defines 32 general purpose registers (GPRs). These
registers are classified as volatile, non-volatile, and dedicated.
The volatile registers are used as temporaries and do not retain values across
function calls. Registers R3 through R12 are volatile; R3 and R4 are used for
returning values to the caller function, if any. Registers R5 through R10 are used for
passing parameters between sub-routines.
Registers R19 through R31 retain their contents across function calls and are therefore
termed non-volatile registers. The callee function is expected to save the non-volatile
registers that it uses. These are typically saved to the stack during
the prologue and then reloaded during the epilogue.
Certain registers are dedicated, and programmers are not expected to
use them for any other purpose.
Registers R14 through R17 store the return address from interrupts,
sub-routines, traps, and exceptions, in that order. Sub-routines are called using the
branch and link instruction, which saves the current Program Counter (PC) in
register R15.
Small data area pointers are used for accessing certain memory locations with a 16-bit
immediate value. These areas are discussed in the memory model section of
this document. The read-only small data area (SDA) anchor R2 is
used to access constants such as literals. The read-write SDA anchor R13
is used for accessing values in the small data read-write section.
Register R1 stores the value of the stack pointer and is updated on entry to and exit
from functions.
MicroBlaze has certain special registers such as a program counter (rpc) and machine
status register (rmsr). These registers are not mapped directly to the register file and
hence the usage of these registers is different from the general purpose registers. The
value from rmsr and rpc can be transferred to general purpose registers by using mts
and mfs instructions (For more details refer to the MicroBlaze Application Binary
Interface chapter).
Stack Convention
The stack conventions used by MicroBlaze are detailed in Figure 4-1.
The shaded area in Figure 4-1 denotes part of the caller function's stack frame, while the
unshaded area indicates the callee function's frame. The ABI conventions for the stack
frame define the protocol for passing parameters, preserving non-volatile register values,
and allocating space for the local variables of a function. Functions that contain calls to
other sub-routines are called non-leaf functions. Each non-leaf function has to create
a new stack frame area for its own use. When the program starts executing, the stack
pointer has its maximum value. As functions are called, the stack pointer is
decremented by the number of words required by each function for its stack frame. The
stack pointer of a caller function always has a higher value than that of the callee
function.
Figure 4-1: Stack Convention
High Address
Function Parameters for called sub-routine (Arg n .. Arg 1)
(Optional: Maximum number of arguments required for any called procedure from the current procedure.)
Old Stack Pointer
New Stack Pointer
Link Register
Low Address
Consider an example where Func1 calls Func2, which in turn calls Func3. The stack
representation at different instances is depicted in Figure 4-2. After the call from Func 1 to
Func 2, the value of the stack pointer (SP) is decremented. This value of SP is again
decremented to accommodate the stack frame for Func3. On return from Func 3 the value
of the stack pointer is increased to its original value in the function, Func 2.
Details of how the stack is maintained are shown in Figure 4-2.
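The stack pointer movement in the Func1, Func2, Func3 example can be modelled with a short C sketch. It assumes 4-byte words and uses made-up frame sizes; the function names mb_frame_enter and mb_frame_leave are hypothetical and do not correspond to any MicroBlaze tool or library.

```c
#include <stdint.h>

/* Entering a function: the stack grows downward, so SP decreases
 * by the frame size (given in 4-byte words). */
uint32_t mb_frame_enter(uint32_t sp, uint32_t frame_words)
{
    return sp - 4u * frame_words;
}

/* Leaving a function: SP returns to the caller's value. */
uint32_t mb_frame_leave(uint32_t sp, uint32_t frame_words)
{
    return sp + 4u * frame_words;
}
```

Calling mb_frame_enter once for Func2 and again for Func3, then mb_frame_leave for Func3, leaves SP at the value it had inside Func2, matching the sequence described above.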
Figure 4-2: Stack Frame
(Four stack snapshots drawn from High Memory to Low Memory: Func 1 only; Func 1 and Func 2; Func 1, Func 2, and Func 3; Func 1 and Func 2. SP marks the lowest frame in each snapshot.)
Calling Convention
The caller function passes parameters to the callee function either in registers (R5
through R10) or on its own stack frame. The callee uses the caller's stack area to store the
parameters passed to it.
Refer to Figure 4-2. The parameters for Func 2 are stored either in the registers R5 through
R10 or on the stack frame allocated for Func 1.
Memory Model
The memory model for MicroBlaze classifies the data into four different parts:
Data area
Comparatively large initialized variables are allocated to the data area, which can
either be accessed using the read-write SDA anchor R13 or using the absolute address,
depending on the command line option given to the compiler.
Literals or constants
Constants are placed into the read-only small data area and are accessed using the
read-only small data area anchor R2.
The compiler generates appropriate global pointers to act as base pointers. The actual
values of the SDA anchors are decided by the linker, in the final linking stages. For more
information on the various sections of the memory please refer to the Address Management
chapter. The compiler generates appropriate sections, depending on the command line
options. Please refer to the GNU Compiler Tools chapter for more information about these
options.
On            | Hardware jumps to | Software Labels
Start / Reset | 0x0               | _start
Exception     | 0x8               | _exception_handler
Interrupt     | 0x10              | _interrupt_handler
The code expected at these locations is as shown in Figure 4-3. In case of programs
compiled without the -xl-mode-xmdstub compiler option, the crt0.o initialization file is
passed by the mb-gcc compiler to the mb-ld linker for linking. This file sets the appropriate
addresses of the exception handlers.
In case of programs compiled with the -xl-mode-xmdstub compiler option, the crt1.o
initialization file is linked to the output program. This program has to be run with the
xmdstub already loaded in the memory at address location 0x0. Hence at run-time, the
initialization code in crt1.o writes the appropriate instructions to location 0x8 through 0x14
depending on the address of the exception and interrupt handlers.
Figure 4-3: Code for passing control to exception and interrupt handlers
0x00: bri _start1
0x04: nop
0x08: imm high bits of address (exception handler)
0x0c: bri _exception_handler
0x10: imm high bits of address (interrupt handler)
0x14: bri _interrupt_handler
MicroBlaze allows exception and interrupt handler routines to be located at any address
location addressable using 32 bits. The exception handler code starts with the label
_exception_handler, while the interrupt handler code starts with the label
_interrupt_handler.
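Reaching a handler at an arbitrary 32-bit address relies on an imm and bri pair: the imm instruction carries the high bits of the address. The C sketch below shows one way such a split could be computed, assuming the remaining low 16 bits are carried in the IMM field of the following bri instruction. The function names are hypothetical.

```c
#include <stdint.h>

/* High 16 bits of a handler address, as carried by the imm instruction. */
uint16_t mb_imm_high(uint32_t addr)
{
    return (uint16_t)(addr >> 16);
}

/* Low 16 bits, assumed to be carried in the IMM field of the following bri. */
uint16_t mb_imm_low(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* Recombine the two halves into the full 32-bit address. */
uint32_t mb_imm_join(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | (uint32_t)low;
}
```

For a handler at 0x80001234, the imm operand would be 0x8000 and the low half 0x1234 under these assumptions.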
In the current MicroBlaze system, there are dummy routines for interrupt or exception
handling, which you can change. In order to override these routines and link your
interrupt and exception handlers, you must define the interrupt handler code with an
attribute interrupt_handler. For more details about the use and syntax of the interrupt
handler attribute, please refer to the GNU Compiler Tools chapter.
Chapter 5
Notation
The symbols used throughout this document are defined in Table 1.
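Table 1: Symbol notation
Symbol  | Meaning
+       | Add
-       | Subtract
×       | Multiply
∨       | Bitwise logical OR
←       | Assignment
>>      | Right shift
<<      | Left shift
rx      | Register x
x[i]    | Bit i in register x
x[i:j]  | Bits i through j in register x
=       | Equal comparison
>       | Greater than comparison
>=      | Greater than or equal comparison
<       | Less than comparison
<=      | Less than or equal comparison
sext(x) | Sign-extend x
Mem(x)  | Memory location at address x
FSLx    | FSL interface x
LSW(x)  | Least Significant Word of x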
Formats
MicroBlaze uses two instruction formats: Type A and Type B.
Type A
Type A is used for register-register instructions. It contains the opcode, one destination and
two source registers.
Opcode (bits 0-5) | Destination Reg (bits 6-10) | Source Reg A (bits 11-15) | Source Reg B (bits 16-20) | 0 (bits 21-31)
Type B
Type B is used for register-immediate instructions. It contains the opcode, one destination
register, one source register, and a 16-bit source immediate value.
Opcode (bits 0-5) | Destination Reg (bits 6-10) | Source Reg A (bits 11-15) | Immediate Value (bits 16-31)
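The two formats can be illustrated with a field-extraction sketch in C. It assumes that bit 0 is the most significant bit of the 32-bit instruction word (consistent with the big-endian bit labeling used in this guide), a 6-bit opcode, and 5-bit register fields. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned opcode; /* bits 0-5 */
    unsigned rd;     /* bits 6-10, destination register */
    unsigned ra;     /* bits 11-15, source register A */
    unsigned rb;     /* bits 16-20, source register B (Type A only) */
    unsigned imm;    /* bits 16-31, immediate value (Type B only) */
} mb_fields_t;

/* Extract every field; the caller picks rb or imm depending on the format. */
mb_fields_t mb_decode(uint32_t word)
{
    mb_fields_t f;
    f.opcode = (word >> 26) & 0x3Fu;
    f.rd     = (word >> 21) & 0x1Fu;
    f.ra     = (word >> 16) & 0x1Fu;
    f.rb     = (word >> 11) & 0x1Fu;
    f.imm    = word & 0xFFFFu;
    return f;
}
```

Because bit 0 is the leftmost bit, the opcode occupies the top six bits of the word and the immediate occupies the bottom sixteen.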
Instructions
MicroBlaze instructions are described next. Instructions are listed in alphabetical order. For
each instruction Xilinx provides the mnemonic, encoding, a description of it, pseudocode
of its semantics, and a list of registers that it modifies.
add
Arithmetic Add
add rD, rA, rB      Add
addc rD, rA, rB
addk rD, rA, rB
addkc rD, rA, rB
Opcode (bits 0-5; K is bit 3, C is bit 4) | rD (bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
The sum of the contents of registers rA and rB is placed into register rD.
Bit 3 of the instruction (labeled as K in the figure) is set to a one for the mnemonic addk. Bit
4 of the instruction (labeled as C in the figure) is set to a one for the mnemonic addc. Both
bits are set to a one for the mnemonic addkc.
When an add instruction has bit 3 set (addk, addkc), the carry flag will Keep its previous
value regardless of the outcome of the execution of the instruction. If bit 3 is cleared (add,
addc), then the carry flag will be affected by the execution of the instruction.
When bit 4 of the instruction is set to a one (addc, addkc), the content of the carry flag
(MSR[C]) affects the execution of the instruction. When bit 4 is cleared (add, addk), the
content of the carry flag does not affect the execution of the instruction (providing a normal
addition).
Pseudocode
if C = 0 then
    (rD) ← (rA) + (rB)
else
    (rD) ← (rA) + (rB) + MSR[C]
if K = 0 then
    MSR[C] ← CarryOut
Registers Altered
rD
MSR[C]
Latency
1 cycle
Note
The C bit in the instruction opcode is not the same as the carry bit in the MSR register.
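The interaction of the K and C instruction bits with MSR[C] can be modelled in C as follows. This is a behavioural sketch of the pseudocode above, not processor source; the type add_result_t and function mb_add are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint32_t rd;    /* value written to rD */
    unsigned carry; /* value of MSR[C] after the instruction */
} add_result_t;

/* k and c are the K and C bits of the instruction; msr_c is MSR[C] on entry. */
add_result_t mb_add(uint32_t ra, uint32_t rb, unsigned k, unsigned c, unsigned msr_c)
{
    /* Use 64-bit arithmetic so the carry out of bit 31 is visible. */
    uint64_t sum = (uint64_t)ra + (uint64_t)rb + (c ? (uint64_t)(msr_c & 1u) : 0u);
    add_result_t r;
    r.rd = (uint32_t)sum;
    r.carry = k ? (msr_c & 1u) : (unsigned)(sum >> 32);
    return r;
}
```

With k = 0 and c = 0 the function behaves like add, with k = 1 like addk, with c = 1 like addc, and with both set like addkc.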
addi
addi      Add Immediate
addic
addik
addikc
Opcode (bits 0-5; K is bit 3, C is bit 4) | rD (bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
The sum of the contents of register rA and the value in the IMM field, sign-extended to 32
bits, is placed into register rD. Bit 3 of the instruction (labeled as K in the figure) is set to a
one for the mnemonic addik. Bit 4 of the instruction (labeled as C in the figure) is set to a
one for the mnemonic addic. Both bits are set to a one for the mnemonic addikc.
When an addi instruction has bit 3 set (addik, addikc), the carry flag will Keep its previous
value regardless of the outcome of the execution of the instruction. If bit 3 is cleared (addi,
addic), then the carry flag will be affected by the execution of the instruction.
When bit 4 of the instruction is set to a one (addic, addikc), the content of the carry flag
(MSR[C]) affects the execution of the instruction. When bit 4 is cleared (addi, addik), the
content of the carry flag does not affect the execution of the instruction (providing a normal
addition).
Pseudocode
if C = 0 then
    (rD) ← (rA) + sext(IMM)
else
    (rD) ← (rA) + sext(IMM) + MSR[C]
if K = 0 then
    MSR[C] ← CarryOut
Registers Altered
rD
MSR[C]
Latency
1 cycle
Notes
The C bit in the instruction opcode is not the same as the carry bit in the MSR register.
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
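The default sign extension of the IMM field can be sketched in C. The example covers only the default case, without a preceding imm instruction; the names sext16 and mb_addi_value are hypothetical.

```c
#include <stdint.h>

/* Sign-extend a 16-bit IMM field to 32 bits. */
uint32_t sext16(uint16_t imm)
{
    if (imm & 0x8000u) {
        return 0xFFFF0000u | (uint32_t)imm; /* negative: fill the upper half with ones */
    }
    return (uint32_t)imm;                   /* non-negative: upper half stays zero */
}

/* Result value of addi, ignoring the carry flag. */
uint32_t mb_addi_value(uint32_t ra, uint16_t imm)
{
    return ra + sext16(imm);
}
```

An IMM field of 0xFFFF therefore acts as -1, so adding it to 10 yields 9.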
and
Logical AND
and rD, rA, rB
Opcode (bits 0-5) | rD (bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
The contents of register rA are ANDed with the contents of register rB; the result is placed
into register rD.
Pseudocode
(rD) ← (rA) ∧ (rB)
Registers Altered
rD
Latency
1 cycle
andi
andi
Opcode (bits 0-5) | rD (bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
The contents of register rA are ANDed with the value of the IMM field, sign-extended to 32
bits; the result is placed into register rD.
Pseudocode
(rD) ← (rA) ∧ sext(IMM)
Registers Altered
rD
Latency
1 cycle
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
andn
andn rD, rA, rB
Opcode (bits 0-5) | rD (bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
The contents of register rA are ANDed with the logical complement of the contents of
register rB; the result is placed into register rD.
Pseudocode
(rD) ← (rA) ∧ ¬(rB)
Registers Altered
rD
Latency
1 cycle
andni
andni
Opcode (bits 0-5) | rD (bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
The IMM field is sign-extended to 32 bits. The contents of register rA are ANDed with the
logical complement of the extended IMM field; the result is placed into register rD.
Pseudocode
(rD) ← (rA) ∧ ¬(sext(IMM))
Registers Altered
rD
Latency
1 cycle
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
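Because the IMM field is sign-extended before it is complemented, an immediate with its top bit set also clears the upper half of rA. The C sketch below models the default case (no preceding imm instruction); the names sext16 and mb_andni are hypothetical.

```c
#include <stdint.h>

/* Sign-extend a 16-bit IMM field to 32 bits. */
uint32_t sext16(uint16_t imm)
{
    return (imm & 0x8000u) ? (0xFFFF0000u | (uint32_t)imm) : (uint32_t)imm;
}

/* andni: rA ANDed with the complement of the sign-extended immediate. */
uint32_t mb_andni(uint32_t ra, uint16_t imm)
{
    return ra & ~sext16(imm);
}
```

With IMM = 0x000F only the low four bits of rA are cleared, whereas IMM = 0x8000 extends to 0xFFFF8000 and clears bits in both halves of the word.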
beq
Branch if Equal
beq rA, rB      Branch if Equal
beqd rA, rB
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch if rA is equal to 0, to the instruction located in the offset value of rB. The target of
the branch will be the instruction at address PC + rB.
The mnemonic beqd will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA = 0 then
    PC ← PC + rB
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
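The branch decision and the latency figures above can be combined in a small C model. It computes only the next PC and the cycle count and does not execute the delay-slot instruction; the type mb_branch_t and function mb_beq are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint32_t next_pc; /* address of the next instruction after the branch resolves */
    int cycles;       /* latency as listed for this instruction */
} mb_branch_t;

/* d is the D bit: nonzero for beqd, zero for beq. */
mb_branch_t mb_beq(uint32_t pc, uint32_t ra, uint32_t rb, int d)
{
    mb_branch_t r;
    if (ra == 0) {
        r.next_pc = pc + rb;   /* taken: rB holds the offset from PC */
        r.cycles = d ? 2 : 3;
    } else {
        r.next_pc = pc + 4;    /* not taken: fall through */
        r.cycles = 1;
    }
    return r;
}
```

Unsigned wrap-around makes negative offsets work: an rB value of 0xFFFFFFF0 moves the PC back by 16 bytes.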
beqi
beqi rA, IMM
beqid rA, IMM
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
Branch if rA is equal to 0, to the instruction located in the offset value of IMM. The target
of the branch will be the instruction at address PC + IMM.
The mnemonic beqid will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA = 0 then
    PC ← PC + sext(IMM)
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
bge
bge rA, rB
bged rA, rB
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch if rA is greater or equal to 0, to the instruction located in the offset value of rB. The
target of the branch will be the instruction at address PC + rB.
The mnemonic bged will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA >= 0 then
    PC ← PC + rB
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
bgei
bgei rA, IMM
bgeid rA, IMM
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
Branch if rA is greater or equal to 0, to the instruction located in the offset value of IMM.
The target of the branch will be the instruction at address PC + IMM.
The mnemonic bgeid will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA >= 0 then
    PC ← PC + sext(IMM)
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
bgt
bgt rA, rB
bgtd rA, rB
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch if rA is greater than 0, to the instruction located in the offset value of rB. The target
of the branch will be the instruction at address PC + rB.
The mnemonic bgtd will set the D bit. The D bit determines whether there is a branch delay
slot or not. If the D bit is set, it means that there is a delay slot and the instruction following
the branch (i.e. in the branch delay slot) is allowed to complete execution before executing
the target instruction. If the D bit is not set, it means that there is no delay slot, so the
instruction to be executed after the branch is the target instruction.
Pseudocode
if rA > 0 then
    PC ← PC + rB
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
bgti
bgti rA, IMM
bgtid rA, IMM
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
Branch if rA is greater than 0, to the instruction located in the offset value of IMM. The
target of the branch will be the instruction at address PC + IMM.
The mnemonic bgtid will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA > 0 then
    PC ← PC + sext(IMM)
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
ble
ble rA, rB
bled rA, rB
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch if rA is less or equal to 0, to the instruction located in the offset value of rB. The
target of the branch will be the instruction at address PC + rB.
The mnemonic bled will set the D bit. The D bit determines whether there is a branch delay
slot or not. If the D bit is set, it means that there is a delay slot and the instruction following
the branch (i.e. in the branch delay slot) is allowed to complete execution before executing
the target instruction. If the D bit is not set, it means that there is no delay slot, so the
instruction to be executed after the branch is the target instruction.
Pseudocode
if rA <= 0 then
    PC ← PC + rB
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
blei
blei rA, IMM
bleid rA, IMM
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
Branch if rA is less or equal to 0, to the instruction located in the offset value of IMM. The
target of the branch will be the instruction at address PC + IMM.
The mnemonic bleid will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA <= 0 then
    PC ← PC + sext(IMM)
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
blt
blt rA, rB
bltd rA, rB
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch if rA is less than 0, to the instruction located in the offset value of rB. The target of
the branch will be the instruction at address PC + rB.
The mnemonic bltd will set the D bit. The D bit determines whether there is a branch delay
slot or not. If the D bit is set, it means that there is a delay slot and the instruction following
the branch (i.e. in the branch delay slot) is allowed to complete execution before executing
the target instruction. If the D bit is not set, it means that there is no delay slot, so the
instruction to be executed after the branch is the target instruction.
Pseudocode
if rA < 0 then
    PC ← PC + rB
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
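All six conditional branches compare rA against zero. The C sketch below evaluates each condition, assuming rA is interpreted as a signed two's complement value so that "less than 0" means the most significant bit is set. The enum mb_cond_t and function mb_cond_holds are hypothetical names.

```c
#include <stdint.h>

typedef enum { MB_EQ, MB_NE, MB_LT, MB_LE, MB_GT, MB_GE } mb_cond_t;

/* Returns 1 if the branch condition holds for the given rA value, else 0. */
int mb_cond_holds(mb_cond_t cond, uint32_t ra)
{
    int negative = (ra & 0x80000000u) != 0; /* sign bit of rA */
    int zero = (ra == 0);

    switch (cond) {
    case MB_EQ: return zero;
    case MB_NE: return !zero;
    case MB_LT: return negative;
    case MB_LE: return negative || zero;
    case MB_GT: return !negative && !zero;
    case MB_GE: return !negative;
    }
    return 0;
}
```

Under this assumption, an rA value of 0xFFFFFFFF satisfies blt and ble but not bge or bgt.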
blti
blti rA, IMM
bltid rA, IMM
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
Branch if rA is less than 0, to the instruction located in the offset value of IMM. The target
of the branch will be the instruction at address PC + IMM.
The mnemonic bltid will set the D bit. The D bit determines whether there is a branch delay
slot or not. If the D bit is set, it means that there is a delay slot and the instruction following
the branch (i.e. in the branch delay slot) is allowed to complete execution before executing
the target instruction. If the D bit is not set, it means that there is no delay slot, so the
instruction to be executed after the branch is the target instruction.
Pseudocode
if rA < 0 then
    PC ← PC + sext(IMM)
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
bne
bne rA, rB
bned rA, rB
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch if rA not equal to 0, to the instruction located in the offset value of rB. The target of
the branch will be the instruction at address PC + rB.
The mnemonic bned will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA ≠ 0 then
    PC ← PC + rB
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
bnei
bnei rA, IMM
bneid rA, IMM
Opcode (bits 0-5) | D (within bits 6-10) | rA (bits 11-15) | IMM (bits 16-31)
Description
Branch if rA not equal to 0, to the instruction located in the offset value of IMM. The target
of the branch will be the instruction at address PC + IMM.
The mnemonic bneid will set the D bit. The D bit determines whether there is a branch
delay slot or not. If the D bit is set, it means that there is a delay slot and the instruction
following the branch (i.e. in the branch delay slot) is allowed to complete execution before
executing the target instruction. If the D bit is not set, it means that there is no delay slot, so
the instruction to be executed after the branch is the target instruction.
Pseudocode
if rA ≠ 0 then
    PC ← PC + sext(IMM)
else
    PC ← PC + 4
if D = 1 then
    allow following instruction to complete execution
Registers Altered
PC
Latency
1 cycle (if branch is not taken)
2 cycles (if branch is taken and the D bit is set)
3 cycles (if branch is taken and the D bit is not set)
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
br
Unconditional Branch
br rB           Branch
bra rB          Branch Absolute
brd rB
brad rB
brld rD, rB
brald rD, rB
Opcode (bits 0-5) | rD (bits 6-10) | D A L (within bits 11-15) | rB (bits 16-20) | 0 (bits 21-31)
Description
Branch to the instruction located at address determined by rB.
The mnemonics brld and brald will set the L bit. If the L bit is set, linking will be
performed. The current value of PC will be stored in rD.
The mnemonics bra, brad and brald will set the A bit. If the A bit is set, it means that the
branch is to an absolute value and the target is the value in rB, otherwise, it is a relative
branch and the target will be PC + rB.
The mnemonics brd, brad, brld and brald will set the D bit. The D bit determines whether
there is a branch delay slot or not. If the D bit is set, it means that there is a delay slot and
the instruction following the branch (i.e. in the branch delay slot) is allowed to complete
execution before executing the target instruction. If the D bit is not set, it means that there
is no delay slot, so the instruction to be executed after the branch is the target instruction.
Pseudocode
if L = 1 then
    (rD) ← PC
if A = 1 then
    PC ← (rB)
else
    PC ← PC + (rB)
if D = 1 then
    allow following instruction to complete execution
Registers Altered
rD
PC
Latency
2 cycles (if the D bit is set) or 3 cycles (if the D bit is not set)
Note
The instructions brl and bral are not available.
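The effect of the L, A and D bits on the br family can be sketched as a small Python function. This is a simplified model under stated assumptions, not a description of the hardware: the name br_execute and its return convention are hypothetical, and the delay slot is only reported as a flag rather than simulated.

```python
MASK32 = 0xFFFFFFFF

def br_execute(pc, rb, d=0, a=0, l=0):
    """Model of the br pseudocode (hypothetical helper).

    Returns (next_pc, link_value, has_delay_slot); link_value is None
    when the L bit is clear, otherwise the PC value stored in rD.
    """
    link = pc if l else None
    target = rb if a else pc + rb   # absolute target or PC-relative target
    return target & MASK32, link, bool(d)
```

For example, brald corresponds to d=1, a=1, l=1, and plain br corresponds to all three bits clear.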
bri
bri IMM (Branch Immediate)
brai IMM
brid IMM
braid IMM
brlid rD, IMM
bralid rD, IMM
[Instruction format figure: rD (bits 6-10), D, A and L bits (from bit 11), IMM (bits 16-31)]
Description
Branch to the instruction located at address determined by IMM, sign-extended to 32 bits.
The mnemonics brlid and bralid will set the L bit. If the L bit is set, linking will be
performed. The current value of PC will be stored in rD.
The mnemonics brai, braid and bralid will set the A bit. If the A bit is set, it means that the
branch is to an absolute value and the target is the value in IMM, otherwise, it is a relative
branch and the target will be PC + IMM.
The mnemonics brid, braid, brlid and bralid will set the D bit. The D bit determines
whether there is a branch delay slot or not. If the D bit is set, it means that there is a delay
slot and the instruction following the branch (i.e. in the branch delay slot) is allowed to
complete execution before executing the target instruction. If the D bit is not set, it means
that there is no delay slot, so the instruction to be executed after the branch is the target
instruction.
Pseudocode
if L = 1 then
    (rD) ← PC
if A = 1 then
    PC ← sext(IMM)
else
    PC ← PC + sext(IMM)
if D = 1 then
    allow following instruction to complete execution
Registers Altered
rD
PC
Latency
2 cycles (if the D bit is set) or 3 cycles (if the D bit is not set)
Notes
The instructions brli and brali are not available.
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
brk
Break
brk rD, rB
[Instruction format figure: rD (bits 6-10), rB (bits 16-20)]
Description
Branch and link to the instruction located at address value in rB. The current value of PC
will be stored in rD. The BIP flag in the MSR will be set.
Pseudocode
(rD) ← PC
PC ← (rB)
MSR[BIP] ← 1
Registers Altered
rD
PC
MSR[BIP]
Latency
3 cycles
brki
Break Immediate
brki rD, IMM
[Instruction format figure: rD (bits 6-10), IMM (bits 16-31)]
Description
Branch and link to the instruction located at address value in IMM, sign-extended to 32
bits. The current value of PC will be stored in rD. The BIP flag in the MSR will be set.
Pseudocode
(rD) ← PC
PC ← sext(IMM)
MSR[BIP] ← 1
Registers Altered
rD
PC
MSR[BIP]
Latency
3 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
bs
Barrel Shift
bsrl rD, rA, rB
bsra rD, rA, rB
bsll rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20), S and T bits (from bit 21)]
Description
Shifts the contents of register rA by the amount specified in register rB and puts the result
in register rD.
The mnemonic bsll sets the S bit (Side bit). If the S bit is set, the barrel shift is done to the
left. The mnemonics bsrl and bsra clear the S bit and the shift is done to the right.
The mnemonic bsra will set the T bit (Type bit). If the T bit is set, the barrel shift performed
is Arithmetical. The mnemonics bsrl and bsll clear the T bit and the shift performed is
Logical.
Pseudocode
if S = 1 then
    (rD) ← (rA) << (rB)[27:31]
else
    if T = 1 then
        if ((rB)[27:31]) ≠ 0 then
            (rD)[0:(rB)[27:31]-1] ← (rA)[0]
            (rD)[(rB)[27:31]:31] ← (rA) >> (rB)[27:31]
        else
            (rD) ← (rA)
    else
        (rD) ← (rA) >> (rB)[27:31]
Registers Altered
rD
Latency
2 cycles
Note
These instructions are optional. To use them, MicroBlaze has to be configured to use barrel
shift instructions.
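The three barrel shift variants can be compared with a short Python model. This is a sketch under assumptions: the function name barrel_shift is hypothetical, registers are modelled as unsigned 32-bit integers, and (rB)[27:31] is taken to be the five least significant bits of rB because bit 31 is the least significant bit in this guide's numbering.

```python
MASK32 = 0xFFFFFFFF

def barrel_shift(ra, rb, s, t):
    """Model of the bs pseudocode (hypothetical helper).

    s=1 selects a left shift (bsll); s=0, t=1 an arithmetic right
    shift (bsra); s=0, t=0 a logical right shift (bsrl).
    """
    amount = rb & 0x1F              # (rB)[27:31]
    ra &= MASK32
    if s:
        return (ra << amount) & MASK32
    if t and (ra & 0x80000000):
        # Replicate the sign bit (rA)[0] into the vacated upper bits.
        fill = (MASK32 << (32 - amount)) & MASK32
        return (ra >> amount) | fill
    return ra >> amount
```

Only the shift amount in the low five bits of rB matters, so a value of 33 in rB shifts by one position.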
bsi
bsrli rD, rA, IMM
bsrai rD, rA, IMM
bslli rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), S and T bits (from bit 21), IMM (bits 27-31)]
Description
Shifts the contents of register rA by the amount specified by IMM and puts the result in
register rD.
The mnemonic bslli sets the S bit (Side bit). If the S bit is set, the barrel shift is done to the
left. The mnemonics bsrli and bsrai clear the S bit and the shift is done to the right.
The mnemonic bsrai will set the T bit (Type bit). If the T bit is set, the barrel shift performed
is Arithmetical. The mnemonics bsrli and bslli clear the T bit and the shift performed is
Logical.
Pseudocode
if S = 1 then
    (rD) ← (rA) << IMM
else
    if T = 1 then
        if IMM ≠ 0 then
            (rD)[0:IMM-1] ← (rA)[0]
            (rD)[IMM:31] ← (rA) >> IMM
        else
            (rD) ← (rA)
    else
        (rD) ← (rA) >> IMM
Registers Altered
rD
Latency
2 cycles
Notes
These are not Type B Instructions. A preceding imm instruction has no effect on them.
These instructions are optional. To use them, MicroBlaze has to be configured to use barrel
shift instructions.
cmp
Integer Compare
cmp rD, rA, rB
cmpu rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20), U bit]
Description
The contents of register rA are subtracted from the contents of register rB and the result is
placed into register rD.
The MSB of rD is adjusted to show the true relation between rA and rB. If the U bit is set,
rA and rB are treated as unsigned values. If the U bit is clear, rA and rB are treated as
signed values.
Pseudocode
if (rA) = (rB) then
    (rD) ← 0
else
    (rD)(MSB) ← (rA) > (rB)
Registers Altered
rD
Latency
1 cycle
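The difference between cmp and cmpu shows up only in how the MSB of rD is derived. The Python sketch below is one reading of the description above, not a verified model of the hardware: the names to_signed and cmp_model are hypothetical, and it assumes the lower 31 bits of rD hold the raw difference (rB) - (rA).

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer (hypothetical helper)."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def cmp_model(ra, rb, unsigned=False):
    """rD = (rB) - (rA), with the MSB replaced by the truth value of (rA) > (rB)."""
    ra &= MASK32
    rb &= MASK32
    diff = (rb - ra) & MASK32
    if unsigned:
        greater = ra > rb                          # cmpu: U bit set
    else:
        greater = to_signed(ra) > to_signed(rb)    # cmp: U bit clear
    return (diff & 0x7FFFFFFF) | (0x80000000 if greater else 0)
```

With rA = 1 and rB = 0xFFFFFFFF the signed comparison sets the MSB (1 > -1), while the unsigned comparison clears it.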
get
get rD, FSLx
nget rD, FSLx
cget rD, FSLx
ncget rD, FSLx
[Instruction format figure: rD (bits 6-10), FSLx (bits 29-31)]
Description
MicroBlaze will read from the FSLx interface and place the result in register rD.
The get instruction has four variants.
The blocking versions (when the n bit is 0) will stall MicroBlaze until the data from the FSL
interface is valid. The non-blocking versions will not stall MicroBlaze and will set carry to
0 if the data was valid and to 1 if the data was invalid.
The get and nget instructions expect the control bit from the FSL interface to be 0. If this
is not the case, the instruction will set MSR[FSL_Error] to 1. The cget and ncget
instructions expect the control bit from the FSL interface to be 1. If this is not the case, the
instruction will set MSR[FSL_Error] to 1.
Pseudocode
(rD) ← FSLx
if (n = 1) then
    MSR[Carry] ← not (FSLx Exists bit)
if ((FSLx Control bit) == c) then
    MSR[FSL_Error] ← 0
else
    MSR[FSL_Error] ← 1
Registers Altered
rD
MSR[FSL_Error]
MSR[Carry]
Latency
2 cycles if non-blocking or if data is valid at the FSL interface. For a blocking instruction,
MicroBlaze will stall until the data is valid.
Note
For nget and ncget, an rsubc instruction can be used to count down an index variable.
idiv
Integer Divide
idiv rD, rA, rB (divide rB by rA, signed)
idivu rD, rA, rB (divide rB by rA, unsigned)
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20), U bit]
Description
The contents of register rB are divided by the contents of register rA and the result is placed
into register rD.
If the U bit is set, rA and rB are treated as unsigned values. If the U bit is clear, rA and rB are
treated as signed values.
If the value of rA is 0, the divide_by_zero bit in the MSR will be set and the value in rD will be
0.
Pseudocode
if (rA) = 0 then
    (rD) ← 0
else
    (rD) ← (rB) / (rA)
Registers Altered
rD
MSR[Divide_By_Zero]
Latency
2 cycles if (rA) = 0, otherwise 34 cycles
Note
This instruction is only valid if MicroBlaze is configured to use a hardware divider.
imm
Immediate
imm IMM
[Instruction format figure: IMM (bits 16-31)]
Description
The instruction imm loads the IMM value into a temporary register. It also locks this value
so it can be used by the following instruction and form a 32-bit immediate value.
The instruction imm is used in conjunction with Type B instructions. Since Type B
instructions have only a 16-bit immediate value field, a 32-bit immediate value cannot be
used directly. However, 32-bit immediate values can be used in MicroBlaze. By default,
Type B Instructions will take the 16-bit IMM field value and sign extend it to 32 bits to use
as the immediate operand. This behavior can be overridden by preceding the Type B
instruction with an imm instruction. The imm instruction locks the 16-bit IMM value
temporarily for the next instruction. A Type B instruction that immediately follows the
imm instruction will then form a 32-bit immediate value from the 16-bit IMM value of the
imm instruction (upper 16 bits) and its own 16-bit immediate value field (lower 16 bits). If
no Type B instruction follows the imm instruction, the locked value is unlocked and
discarded.
Latency
1 cycle
Notes
The imm instruction and the Type B instruction following it are atomic, hence no interrupts
are allowed between them.
The assembler provided by Xilinx automatically detects the need for imm instructions.
When a 32-bit IMM value is specified in a Type B instruction, the assembler converts the
IMM value to a 16-bit one to assemble the instruction and inserts an imm instruction before
it in the executable file.
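The way a 32-bit immediate is assembled from an imm instruction and the following Type B instruction can be shown with a short Python sketch. The function names type_b_operand and split_immediate are hypothetical and exist only for illustration; the sketch models the operand value, not the locking mechanism or the assembler itself.

```python
def sext16(imm):
    """Sign-extend a 16-bit field to a 32-bit pattern (hypothetical helper)."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm

def type_b_operand(imm16, locked_imm=None):
    """32-bit operand seen by a Type B instruction.

    locked_imm is the IMM field of an immediately preceding imm
    instruction, or None when there is no such instruction.
    """
    if locked_imm is None:
        return sext16(imm16)                       # default: sign extension
    return ((locked_imm & 0xFFFF) << 16) | (imm16 & 0xFFFF)

def split_immediate(value32):
    """Split a 32-bit value into (imm field, Type B field), as an assembler might."""
    value32 &= 0xFFFFFFFF
    return value32 >> 16, value32 & 0xFFFF
```

Note that with a preceding imm the lower half is used as-is: 0x8000 preceded by imm 0 yields 0x00008000 rather than the sign-extended 0xFFFF8000.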
lbu
lbu rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Loads a byte (8 bits) from the memory location that results from adding the contents of
registers rA and rB. The data is placed in the least significant byte of register rD and the
other three bytes in rD are cleared.
Pseudocode
Addr ← (rA) + (rB)
(rD)[24:31] ← Mem(Addr)
(rD)[0:23] ← 0
Registers Altered
rD
Latency
2 cycles
lbui
lbui rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Loads a byte (8 bits) from the memory location that results from adding the contents of
register rA with the value in IMM, sign-extended to 32 bits. The data is placed in the least
significant byte of register rD and the other three bytes in rD are cleared.
Pseudocode
Addr ← (rA) + sext(IMM)
(rD)[24:31] ← Mem(Addr)
(rD)[0:23] ← 0
Registers Altered
rD
Latency
2 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
lhu
lhu rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Loads a halfword (16 bits) from the halfword aligned memory location that results from
adding the contents of registers rA and rB. The data is placed in the least significant
halfword of register rD and the most significant halfword in rD is cleared.
Pseudocode
Addr ← (rA) + (rB)
Addr[31] ← 0
(rD)[16:31] ← Mem(Addr)
(rD)[0:15] ← 0
Registers Altered
rD
Latency
2 cycles
lhui
lhui rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Loads a halfword (16 bits) from the halfword aligned memory location that results from
adding the contents of register rA and the value in IMM, sign-extended to 32 bits. The data
is placed in the least significant halfword of register rD and the most significant halfword
in rD is cleared.
Pseudocode
Addr ← (rA) + sext(IMM)
Addr[31] ← 0
(rD)[16:31] ← Mem(Addr)
(rD)[0:15] ← 0
Registers Altered
rD
Latency
2 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
lw
Load Word
lw rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Loads a word (32 bits) from the word aligned memory location that results from adding
the contents of registers rA and rB. The data is placed in register rD.
Pseudocode
Addr ← (rA) + (rB)
Addr[30:31] ← 00
(rD) ← Mem(Addr)
Registers Altered
rD
Latency
2 cycles
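The address alignment applied by the byte, halfword and word loads can be illustrated with a small Python model. This is a sketch under assumptions that are not stated in this section: memory is modelled as a Python bytes object, multi-byte values are assumed to be stored most significant byte first, and the names effective_address and load_unsigned are hypothetical.

```python
def effective_address(ra, operand, size):
    """Sum of rA and the second operand, with low bits forced to 0.

    size is 1 (lbu), 2 (lhu: Addr[31] cleared) or 4 (lw: Addr[30:31] cleared).
    """
    addr = (ra + operand) & 0xFFFFFFFF
    return addr & ~(size - 1) & 0xFFFFFFFF

def load_unsigned(mem, ra, operand, size):
    """Zero-extended load from a bytes-like memory (big-endian layout assumed)."""
    addr = effective_address(ra, operand, size)
    return int.from_bytes(mem[addr:addr + size], "big")
```

With a computed address of 7, a word load reads from address 4 and a halfword load from address 6, while a byte load reads address 7 itself.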
lwi
lwi rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Loads a word (32 bits) from the word aligned memory location that results from adding
the contents of register rA and the value IMM, sign-extended to 32 bits. The data is placed
in register rD.
Pseudocode
Addr ← (rA) + sext(IMM)
Addr[30:31] ← 00
(rD) ← Mem(Addr)
Registers Altered
rD
Latency
2 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
mfs
mfs rD, rS
[Instruction format figure: rD (bits 6-10), rS]
Description
Copies the contents of the special purpose register rS into register rD.
Pseudocode
(rD) ← (rS)
Registers Altered
rD
Latency
1 cycle
Note
To refer to special purpose registers in assembly language, use rpc for PC and rmsr for
MSR.
msrclr
msrclr rD, Imm
[Instruction format figure: rD (bits 6-10), Imm14 (bits 18-31)]
Description
Copies the contents of the special purpose register MSR into register rD.
Bit positions in the IMM value that are 1 are cleared in the MSR. Bit positions that are 0 in
the IMM value are left untouched.
Pseudocode
(rD) ← (MSR)
(MSR) ← (MSR) ∧ not (IMM)
Registers Altered
rD
MSR
Latency
1 cycle
Note
This instruction is only valid if C_USE_MSR_INSTR is set for MicroBlaze.
The immediate value must be less than 2^14. Only bits 18 to 31 of the MSR can be cleared.
This instruction only exists in version 2.10.a and above.
msrset
msrset rD, Imm
[Instruction format figure: rD (bits 6-10), Imm14 (bits 18-31)]
Description
Copies the contents of the special purpose register MSR into register rD.
Bit positions in the IMM value that are 1 are set in the MSR. Bit positions that are 0 in the
IMM value are left untouched.
Pseudocode
(rD) ← (MSR)
(MSR) ← (MSR) ∨ (IMM)
Registers Altered
rD
MSR
Latency
1 cycle
Note
This instruction is only valid if C_USE_MSR_INSTR is set for MicroBlaze.
The immediate value must be less than 2^14. Only bits 18 to 31 of the MSR can be set.
This instruction only exists in version 2.10.a and above.
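The read-then-modify behavior of msrclr and msrset can be summarized in a Python sketch. The function names and the tuple return value are hypothetical conventions chosen for illustration; masking the immediate to 14 bits is a modelling choice that mirrors the restriction to MSR bits 18 to 31, which are the 14 least significant bits in this guide's bit numbering.

```python
IMM14_MASK = 0x3FFF     # MSR bits 18 to 31 (the 14 least significant bits)
MASK32 = 0xFFFFFFFF

def msrclr(msr, imm):
    """Return (value copied to rD, new MSR) after clearing the selected bits."""
    imm &= IMM14_MASK
    return msr, msr & ~imm & MASK32

def msrset(msr, imm):
    """Return (value copied to rD, new MSR) after setting the selected bits."""
    imm &= IMM14_MASK
    return msr, (msr | imm) & MASK32
```

In both cases rD receives the MSR value from before the modification, which lets software restore the previous state later.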
mts
mts rS, rA
[Instruction format figure: rA (bits 11-15), rS]
Description
Copies the contents of register rA into the special purpose register rS.
Pseudocode
(rS) ← (rA)
Registers Altered
rS
Latency
1 cycle
Notes
You cannot write to the PC using the MTS instruction.
When writing to MSR using MTS, the value written will take effect one clock cycle after
executing the MTS instruction.
To refer to special purpose registers in assembly language, use rpc for PC and rmsr for
MSR.
mul
Multiply
mul rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Multiplies the contents of registers rA and rB and puts the result in register rD. This is a
32-bit by 32-bit multiplication that will produce a 64-bit result. The least significant word of
this value is placed in rD. The most significant word is discarded.
Pseudocode
(rD) ← LSW( (rA) × (rB) )
Registers Altered
rD
Latency
3 cycles
Note
This instruction is only valid if the target architecture has an embedded multiplier.
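Keeping only the least significant word of the product is equivalent to multiplying modulo 2^32, as the following Python sketch shows. The function name mul_lsw is hypothetical, and registers are modelled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF

def mul_lsw(ra, rb):
    """Least significant 32 bits of the 64-bit product (hypothetical helper)."""
    return ((ra & MASK32) * (rb & MASK32)) & MASK32
```

Because the upper word is discarded, the result is the same bit pattern whether the operands are read as signed or unsigned values; for example, 0xFFFFFFFF (that is, -1) times 2 gives 0xFFFFFFFE (that is, -2).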
muli
Multiply Immediate
muli rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Multiplies the contents of register rA by the value IMM, sign-extended to 32 bits, and
puts the result in register rD. This is a 32-bit by 32-bit multiplication that will produce a
64-bit result. The least significant word of this value is placed in rD. The most significant word
is discarded.
Pseudocode
(rD) ← LSW( (rA) × sext(IMM) )
Registers Altered
rD
Latency
3 cycles
Notes
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
This instruction is only valid if the target architecture has an embedded multiplier.
or
Logical OR
or rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
The contents of register rA are ORed with the contents of register rB; the result is placed
into register rD.
Pseudocode
(rD) ← (rA) ∨ (rB)
Registers Altered
rD
Latency
1 cycle
ori
ori rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
The contents of register rA are ORed with the IMM field, sign-extended to 32 bits; the
result is placed into register rD.
Pseudocode
(rD) ← (rA) ∨ sext(IMM)
Registers Altered
rD
Latency
1 cycle
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
put
put rA, FSLx
nput rA, FSLx
cput rA, FSLx
ncput rA, FSLx
[Instruction format figure: rA (bits 11-15), FSLx (bits 29-31)]
Description
MicroBlaze will write the value from register rA to the FSLx interface.
The put instruction has four variants.
The blocking versions (when n is 0) will stall MicroBlaze until there is space available in
the FSL interface. The non-blocking versions will not stall MicroBlaze and will set carry to
0 if space was available and to 1 if no space was available.
The put and nput instructions will set the control bit to the FSL interface to 0 and the cput
and ncput instructions will set the control bit to 1.
Pseudocode
(FSLx) ← (rA)
if (n = 1) then
    MSR[Carry] ← (FSLx Full bit)
(FSLx Control bit) ← C
Registers Altered
MSR[Carry]
Latency
2 cycles for non-blocking or if space is available on the FSL interface. For blocking,
MicroBlaze stalls until space is available on the FSL interface.
rsub
rsub rD, rA, rB (Subtract)
rsubc rD, rA, rB
rsubk rD, rA, rB
rsubkc rD, rA, rB
[Instruction format figure: K bit (bit 3), C bit (bit 4), rD (bits 6-10), rA (bits 11-15), rB (bits 16-20), bits 21-31 zero]
Description
The contents of register rA are subtracted from the contents of register rB and the result is
placed into register rD. Bit 3 of the instruction (labeled as K in the figure) is set to a one for
the mnemonic rsubk. Bit 4 of the instruction (labeled as C in the figure) is set to a one for
the mnemonic rsubc. Both bits are set to a one for the mnemonic rsubkc.
When an rsub instruction has bit 3 set (rsubk, rsubkc), the carry flag will Keep its previous
value regardless of the outcome of the execution of the instruction. If bit 3 is cleared (rsub,
rsubc), then the carry flag will be affected by the execution of the instruction.
When bit 4 of the instruction is set to a one (rsubc, rsubkc), the content of the carry flag
(MSR[C]) affects the execution of the instruction. When bit 4 is cleared (rsub, rsubk), the
content of the carry flag does not affect the execution of the instruction (providing a normal
subtraction).
Pseudocode
if C = 0 then
    (rD) ← (rB) + not (rA) + 1
else
    (rD) ← (rB) + not (rA) + MSR[C]
if K = 0 then
    MSR[C] ← CarryOut
Registers Altered
rD
MSR[C]
Latency
1 cycle
Notes
In subtractions, Carry = not (Borrow). When the Carry is set by a subtraction, it means that
there is no Borrow, and when the Carry is cleared, it means that there is a Borrow.
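The interaction of the K and C bits with the carry flag can be modelled in Python. This sketch assumes that the subtraction is carried out as an addition of the bitwise complement of rA, which is consistent with the note that Carry is the inverse of Borrow; the function name rsub_model and its return convention are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def rsub_model(ra, rb, carry_in, k=0, c=0):
    """Return (rD, MSR[C] after the instruction) for rsub, rsubc, rsubk and rsubkc."""
    cin = (carry_in & 1) if c else 1            # C bit: use MSR[C] instead of 1
    total = (rb & MASK32) + (~ra & MASK32) + cin
    carry_out = total >> 32                     # 1 means no borrow occurred
    new_carry = (carry_in & 1) if k else carry_out   # K bit: keep the old carry
    return total & MASK32, new_carry
```

Subtracting 3 from 10 produces 7 with the carry set (no borrow), while subtracting 10 from 3 produces 0xFFFFFFF9 with the carry cleared (borrow).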
rsubi
rsubi rD, rA, IMM (Subtract Immediate)
rsubic rD, rA, IMM
rsubik rD, rA, IMM
rsubikc rD, rA, IMM
[Instruction format figure: K bit (bit 3), C bit (bit 4), rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
The contents of register rA are subtracted from the value of IMM, sign-extended to 32 bits,
and the result is placed into register rD. Bit 3 of the instruction (labeled as K in the figure)
is set to a one for the mnemonic rsubik. Bit 4 of the instruction (labeled as C in the figure)
is set to a one for the mnemonic rsubic. Both bits are set to a one for the mnemonic rsubikc.
When an rsubi instruction has bit 3 set (rsubik, rsubikc), the carry flag will Keep its
previous value regardless of the outcome of the execution of the instruction. If bit 3 is
cleared (rsubi, rsubic), then the carry flag will be affected by the execution of the
instruction. When bit 4 of the instruction is set to a one (rsubic, rsubikc), the content of the
carry flag (MSR[C]) affects the execution of the instruction. When bit 4 is cleared (rsubi,
rsubik), the content of the carry flag does not affect the execution of the instruction
(providing a normal subtraction).
Pseudocode
if C = 0 then
    (rD) ← sext(IMM) + not (rA) + 1
else
    (rD) ← sext(IMM) + not (rA) + MSR[C]
if K = 0 then
    MSR[C] ← CarryOut
Registers Altered
rD
MSR[C]
Latency
1 cycle
Notes
In subtractions, Carry = not (Borrow). When the Carry is set by a subtraction, it means that
there is no Borrow, and when the Carry is cleared, it means that there is a Borrow.
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
rtbd
rtbd rA, IMM
[Instruction format figure: rA (bits 11-15), IMM (bits 16-31)]
Description
Return from break will branch to the location specified by the contents of rA plus the IMM
field, sign-extended to 32 bits. It will also enable breaks after execution by clearing the BIP
flag in the MSR.
This instruction always has a delay slot. The instruction following the RTBD is always
executed before the branch target. That delay slot instruction has breaks disabled.
Pseudocode
PC ← (rA) + sext(IMM)
allow following instruction to complete execution
MSR[BIP] ← 0
Registers Altered
PC
MSR[BIP]
Latency
2 cycles
rtid
rtid rA, IMM
[Instruction format figure: rA (bits 11-15), IMM (bits 16-31)]
Description
Return from interrupt will branch to the location specified by the contents of rA plus the
IMM field, sign-extended to 32 bits. It will also enable interrupts after execution.
This instruction always has a delay slot. The instruction following the RTID is always
executed before the branch target. That delay slot instruction has interrupts disabled.
Pseudocode
PC ← (rA) + sext(IMM)
allow following instruction to complete execution
MSR[IE] ← 1
Registers Altered
PC
MSR[IE]
Latency
2 cycles
rtsd
rtsd rA, IMM
[Instruction format figure: rA (bits 11-15), IMM (bits 16-31)]
Description
Return from subroutine will branch to the location specified by the contents of rA plus the
IMM field, sign-extended to 32 bits.
This instruction always has a delay slot. The instruction following the RTSD is always
executed before the branch target.
Pseudocode
PC ← (rA) + sext(IMM)
allow following instruction to complete execution
Registers Altered
PC
Latency
2 cycles
sb
Store Byte
sb rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Stores the contents of the least significant byte of register rD, into the memory location that
results from adding the contents of registers rA and rB.
Pseudocode
Addr ← (rA) + (rB)
Mem(Addr) ← (rD)[24:31]
Registers Altered
None
Latency
2 cycles
sbi
sbi rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Stores the contents of the least significant byte of register rD, into the memory location that
results from adding the contents of register rA and the value IMM, sign-extended to 32
bits.
Pseudocode
Addr ← (rA) + sext(IMM)
Mem(Addr) ← (rD)[24:31]
Registers Altered
None
Latency
2 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
sext16
sext16 rD, rA
[Instruction format figure: rD (bits 6-10), rA (bits 11-15)]
Description
This instruction sign-extends a halfword (16 bits) into a word (32 bits). Bit 16 in rA will be
copied into bits 0-15 of rD. Bits 16-31 in rA will be copied into bits 16-31 of rD.
Pseudocode
(rD)[0:15] ← (rA)[16]
(rD)[16:31] ← (rA)[16:31]
Registers Altered
rD
Latency
1 cycle
sext8
sext8 rD, rA
[Instruction format figure: rD (bits 6-10), rA (bits 11-15)]
Description
This instruction sign-extends a byte (8 bits) into a word (32 bits). Bit 24 in rA will be copied
into bits 0-23 of rD. Bits 24-31 in rA will be copied into bits 24-31 of rD.
Pseudocode
(rD)[0:23] ← (rA)[24]
(rD)[24:31] ← (rA)[24:31]
Registers Altered
rD
Latency
1 cycle
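Because this guide numbers bits from 0 (most significant) to 31 (least significant), bit 24 is the sign bit of the low byte and bit 16 is the sign bit of the low halfword. The Python sketch below illustrates sext8 and sext16 on 32-bit patterns; the function names are hypothetical models of the register-to-register instructions and ignore everything except the data path.

```python
def sext8_model(ra):
    """Copy bit 24 of rA (mask 0x80) into bits 0 to 23 of the result."""
    low = ra & 0xFF
    return (low | 0xFFFFFF00) if (low & 0x80) else low

def sext16_model(ra):
    """Copy bit 16 of rA (mask 0x8000) into bits 0 to 15 of the result."""
    low = ra & 0xFFFF
    return (low | 0xFFFF0000) if (low & 0x8000) else low
```

The upper bits of the source register do not influence the result: only the low byte or halfword is examined.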
sh
Store Halfword
sh rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Stores the contents of the least significant halfword of register rD, into the halfword
aligned memory location that results from adding the contents of registers rA and rB.
Pseudocode
Addr ← (rA) + (rB)
Addr[31] ← 0
Mem(Addr) ← (rD)[16:31]
Registers Altered
None
Latency
2 cycles
shi
shi rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Stores the contents of the least significant halfword of register rD, into the halfword
aligned memory location that results from adding the contents of register rA and the value
IMM, sign-extended to 32 bits.
Pseudocode
Addr ← (rA) + sext(IMM)
Addr[31] ← 0
Mem(Addr) ← (rD)[16:31]
Registers Altered
None
Latency
2 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
sra
sra rD, rA
[Instruction format figure: rD (bits 6-10), rA (bits 11-15)]
Description
Arithmetically shifts the contents of register rA one bit to the right and places the result in
rD. The most significant bit of rA (i.e. the sign bit) is placed in the most significant bit of rD.
The least significant bit coming out of the shift chain is placed in the Carry flag.
Pseudocode
(rD)[0] ← (rA)[0]
(rD)[1:31] ← (rA)[0:30]
MSR[C] ← (rA)[31]
Registers Altered
rD
MSR[C]
Latency
1 cycle
src
src rD, rA
[Instruction format figure: rD (bits 6-10), rA (bits 11-15)]
Description
Shifts the contents of register rA, one bit to the right, and places the result in rD. The Carry
flag is shifted in the shift chain and placed in the most significant bit of rD. The least
significant bit coming out of the shift chain is placed in the Carry flag.
Pseudocode
(rD)[0] ← MSR[C]
(rD)[1:31] ← (rA)[0:30]
MSR[C] ← (rA)[31]
Registers Altered
rD
MSR[C]
Latency
1 cycle
srl
srl rD, rA
[Instruction format figure: rD (bits 6-10), rA (bits 11-15)]
Description
Shifts logically the contents of register rA, one bit to the right, and places the result in rD.
A zero is shifted in the shift chain and placed in the most significant bit of rD. The least
significant bit coming out of the shift chain is placed in the Carry flag.
Pseudocode
(rD)[0] ← 0
(rD)[1:31] ← (rA)[0:30]
MSR[C] ← (rA)[31]
Registers Altered
rD
MSR[C]
Latency
1 cycle
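The single-bit shifts sra, src and srl differ only in the bit that enters at the most significant position. The Python sketch below puts the three side by side; the function name shift_right_one and the string selector are hypothetical conventions used for illustration.

```python
def shift_right_one(ra, carry_in, kind):
    """Return (rD, new MSR[C]) for kind in {"sra", "src", "srl"}."""
    ra &= 0xFFFFFFFF
    if kind == "sra":
        msb = ra >> 31          # (rA)[0], the sign bit, is replicated
    elif kind == "src":
        msb = carry_in & 1      # the old carry flag is shifted in
    elif kind == "srl":
        msb = 0                 # a zero is shifted in
    else:
        raise ValueError(kind)
    return (msb << 31) | (ra >> 1), ra & 1
```

In all three cases the bit shifted out, (rA)[31], becomes the new carry flag, so a multi-word right shift can be built from one srl or sra on the most significant word followed by src on the lower words.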
sw
Store Word
sw rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
Stores the contents of register rD, into the word aligned memory location that results from
adding the contents of registers rA and rB.
Pseudocode
Addr ← (rA) + (rB)
Addr[30:31] ← 00
Mem(Addr) ← (rD)[0:31]
Registers Altered
None
Latency
2 cycles
swi
swi rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
Stores the contents of register rD, into the word aligned memory location that results from
adding the contents of register rA and the value IMM, sign-extended to 32 bits.
Pseudocode
Addr ← (rA) + sext(IMM)
Addr[30:31] ← 00
Mem(Addr) ← (rD)[0:31]
Registers Altered
None
Latency
2 cycles
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.
wdc
wdc rA, rB
[Instruction format figure: rA (bits 11-15), rB (bits 16-20)]
Description
Write into the data cache tag and data memory. Register rB contains the new data. Register
rA contains the data address. Bit 30 in rA is the new valid bit and bit 31 is the new lock bit.
The instruction only works when the data cache has been disabled by clearing the Data
cache enable bit in the MSR register.
Pseudocode
(DCache Tag) ← (rA)
(DCache Data) ← (rB)
Registers Altered
None
Latency
1 cycle
wic
wic rA, rB
[Instruction format figure: rA (bits 11-15), rB (bits 16-20)]
Description
Write into the instruction cache tag and data memory. Register rB contains the new
instruction data. Register rA contains the instruction address. Bit 30 in rA is the new valid
bit and bit 31 is the new lock bit.
The instruction only works when the instruction cache has been disabled by clearing the
Instruction cache enable bit in the MSR register.
Pseudocode
(ICache Tag) ← (rA)
(ICache Data) ← (rB)
Registers Altered
None
Latency
1 cycle
xor
Logical Exclusive OR
xor rD, rA, rB
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), rB (bits 16-20)]
Description
The contents of register rA are XORed with the contents of register rB; the result is placed
into register rD.
Pseudocode
(rD) ← (rA) ⊕ (rB)
Registers Altered
rD
Latency
1 cycle
xori
xori rD, rA, IMM
[Instruction format figure: rD (bits 6-10), rA (bits 11-15), IMM (bits 16-31)]
Description
The IMM field is sign-extended to 32 bits. The contents of register rA are XORed with the
extended IMM field; the result is placed into register rD.
Pseudocode
(rD) ← (rA) ⊕ sext(IMM)
Registers Altered
rD
Latency
1 cycle
Note
By default, Type B Instructions will take the 16-bit IMM field value and sign extend it to 32
bits to use as the immediate operand. This behavior can be overridden by preceding the
Type B instruction with an imm instruction. See the imm instruction for details on using
32-bit immediate values.